Tooth Detection — Model Comparison

Tooth Detection for Missing Tooth Identification (Paper Layout)

Classification metric	YOLOv11n	YOLOv11n-aug	YOLOv11s	YOLOv11s-aug	YOLOv8n	YOLOv8n-aug	YOLOv8s	YOLOv8s-aug
TP / FN / FP / TN	32/0/2/16	31/1/2/16	31/1/1/17	31/1/1/17	32/0/3/15	31/1/2/16	30/2/3/15	31/1/2/16
Accuracy	0.9600	0.9400	0.9600	0.9600	0.9400	0.9400	0.9000	0.9400
Sensitivity (Recall)	1.0000	0.9688	0.9688	0.9688	1.0000	0.9688	0.9375	0.9688
Specificity	0.8889	0.8889	0.9444	0.9444	0.8333	0.8889	0.8333	0.8889
Precision (PPV)	0.9412	0.9394	0.9688	0.9688	0.9143	0.9394	0.9091	0.9394
F1-Score	0.9697	0.9538	0.9688	0.9688	0.9552	0.9538	0.9231	0.9538
NPV	1.0000	0.9412	0.9444	0.9444	1.0000	0.9412	0.8824	0.9412
AUC-ROC	0.9844	0.9878	0.9931	0.9913	0.9410	0.9896	0.9253	0.9635

Classification metric	YOLOv11n	YOLOv11s	YOLOv8n	YOLOv8s
TP / FN / FP / TN	32/0/2/16	31/1/1/17	32/0/3/15	30/2/3/15
Accuracy	0.9600	0.9600	0.9400	0.9000
Sensitivity (Recall)	1.0000	0.9688	1.0000	0.9375
Specificity	0.8889	0.9444	0.8333	0.8333
Precision (PPV)	0.9412	0.9688	0.9143	0.9091
F1-Score	0.9697	0.9688	0.9552	0.9231
NPV	1.0000	0.9444	1.0000	0.8824
AUC-ROC	0.9844	0.9931	0.9410	0.9253

Classification metric	YOLOv11n-aug	YOLOv11s-aug	YOLOv8n-aug	YOLOv8s-aug
TP / FN / FP / TN	31/1/2/16	31/1/1/17	31/1/2/16	31/1/2/16
Accuracy	0.9400	0.9600	0.9400	0.9400
Sensitivity (Recall)	0.9688	0.9688	0.9688	0.9688
Specificity	0.8889	0.9444	0.8889	0.8889
Precision (PPV)	0.9394	0.9688	0.9394	0.9394
F1-Score	0.9538	0.9688	0.9538	0.9538
NPV	0.9412	0.9444	0.9412	0.9412
AUC-ROC	0.9878	0.9913	0.9896	0.9635
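
The derived rows in the classification tables above (Accuracy through NPV) follow directly from the TP / FN / FP / TN counts. The sketch below recomputes them for the four non-augmented models using the standard confusion-matrix definitions. The names `Confusion` and `classification_metrics` are illustrative and do not come from the original evaluation code. AUC-ROC is not included because it depends on per-sample scores, which cannot be recovered from the counts alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, in the same order as the table row."""
    tp: int
    fn: int
    fp: int
    tn: int


def classification_metrics(c: Confusion) -> dict[str, float]:
    """Standard binary-classification metrics, rounded to 4 decimals.

    Assumes every denominator is non-zero, which holds for the counts below.
    """
    total = c.tp + c.fn + c.fp + c.tn
    metrics = {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),   # recall
        "specificity": c.tn / (c.tn + c.fp),
        "precision": c.tp / (c.tp + c.fp),     # PPV
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
        "npv": c.tn / (c.tn + c.fn),
    }
    return {name: round(value, 4) for name, value in metrics.items()}


# Counts copied from the TP / FN / FP / TN row (non-augmented models).
counts = {
    "YOLOv11n": Confusion(32, 0, 2, 16),
    "YOLOv11s": Confusion(31, 1, 1, 17),
    "YOLOv8n": Confusion(32, 0, 3, 15),
    "YOLOv8s": Confusion(30, 2, 3, 15),
}

results = {model: classification_metrics(c) for model, c in counts.items()}
for model, m in results.items():
    print(model, m)
```

Each model's counts sum to 50 test cases (32 positive, 18 negative), so one additional false positive lowers specificity by 1/18 and one additional false negative lowers sensitivity by 1/32.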

Detection metric	YOLOv11n	YOLOv11n-aug	YOLOv11s	YOLOv11s-aug	YOLOv8n	YOLOv8n-aug	YOLOv8s	YOLOv8s-aug
Precision	0.933	0.951	0.912	0.995	0.940	0.996	0.966	1.000
Recall	0.951	0.955	0.955	0.864	0.955	0.909	0.864	0.908
mAP50	0.955	0.961	0.958	0.960	0.945	0.957	0.949	0.971
mAP50-95	0.687	0.678	0.751	0.735	0.657	0.684	0.694	0.716

Detection metric	YOLOv11n	YOLOv11s	YOLOv8n	YOLOv8s
Precision	0.933	0.912	0.940	0.966
Recall	0.951	0.955	0.955	0.864
mAP50	0.955	0.958	0.945	0.949
mAP50-95	0.687	0.751	0.657	0.694

Detection metric	YOLOv11n-aug	YOLOv11s-aug	YOLOv8n-aug	YOLOv8s-aug
Precision	0.951	0.995	0.996	1.000
Recall	0.955	0.864	0.909	0.908
mAP50	0.961	0.960	0.957	0.971
mAP50-95	0.678	0.735	0.684	0.716
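
The effect of offline augmentation on mAP50-95 can be read off by differencing the augmented and non-augmented rows above. The following is a minimal sketch of that arithmetic. The dictionary names `base`, `aug`, and `delta` are illustrative, and the values are copied from the tables rather than recomputed from model outputs.

```python
# mAP50-95 values copied from the detection tables above.
base = {"YOLOv11n": 0.687, "YOLOv11s": 0.751, "YOLOv8n": 0.657, "YOLOv8s": 0.694}
aug = {"YOLOv11n": 0.678, "YOLOv11s": 0.735, "YOLOv8n": 0.684, "YOLOv8s": 0.716}

# Positive delta: augmentation increased mAP50-95 for that architecture.
delta = {model: round(aug[model] - base[model], 3) for model in base}

for model, d in delta.items():
    print(f"{model}: {d:+.3f}")
```

On these values the change is positive for the two YOLOv8 models (+0.027 and +0.022) and negative for the two YOLOv11 models (-0.009 and -0.016).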

Figure 1. Classification metrics comparison across 4 architectures with offline augmentation (higher is better).

Figure 2. Classification metrics comparison across 4 architectures without offline augmentation (higher is better).

Figure 3a. Validation metrics — with augmentation.

Figure 3b. Validation metrics — without augmentation.

Figure 4a. Radar plot — test metrics with augmentation.

Figure 4b. Radar plot — test metrics without augmentation.