Submissions will be evaluated on a private test set of individual surgical video files sampled at 1 FPS, using the grand-challenge automated Docker submission and evaluation system. The test data and labels are hidden from participants. The metric used for each category is given below:

Category 1 - Classification: F1 score averaged across all classes
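
As a rough illustration of the Category 1 metric, the sketch below computes a per-class F1 score and then takes the unweighted mean across classes. It is not the official evaluation code. It assumes labels are given as one binary indicator per class per sample, and that a class with no true or predicted positives scores 0.0. The function names `per_class_f1` and `average_f1` are hypothetical.

```python
from typing import List, Sequence


def per_class_f1(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> List[float]:
    """Return one F1 score per class from binary indicator rows (assumed format)."""
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no positives at all contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def average_f1(y_true: Sequence[Sequence[int]],
               y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores) / len(scores)


# Three samples, two classes (made-up toy data).
truth = [[1, 0], [1, 1], [0, 1]]
preds = [[1, 0], [0, 1], [0, 1]]
print(average_f1(truth, preds))
```

Because every class carries equal weight in the mean, rare classes influence the score as much as frequent ones.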

Category 2 - Localization and classification: COCO mAP@[0.5:0.05:0.95] (https://cocodataset.org/#detection-eval)
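
The notation [0.5:0.05:0.95] means that average precision is computed at ten IoU thresholds (0.50, 0.55, ..., 0.95) and then averaged. The sketch below only illustrates the threshold grid and the IoU computation behind it. It is not the full COCO evaluation, and it assumes boxes are given as (x_min, y_min, x_max, y_max) corners. The helper names are hypothetical.

```python
from typing import List, Sequence


def coco_iou_thresholds() -> List[float]:
    """IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two boxes in (x_min, y_min, x_max, y_max) form."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_met(iou: float) -> List[float]:
    """Thresholds at which a detection with this IoU could count as a match."""
    return [t for t in coco_iou_thresholds() if iou >= t]


# A predicted box covering the top half of a ground-truth box (toy values).
gt_box = (0.0, 0.0, 10.0, 10.0)
pred_box = (0.0, 0.0, 10.0, 5.0)
iou = box_iou(gt_box, pred_box)
print(iou, thresholds_met(iou))
```

A detection with a loose box can still count at the 0.50 threshold but fails the stricter ones, so this metric rewards tight localization as well as correct class labels.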