RF-DETR: Training a Custom Dataset and Debugging the Training Process
1. RF-DETR Overview

RF-DETR is a Transformer-based real-time object detection architecture developed by Roboflow. It is currently the state-of-the-art model on the RF100-VL benchmark and the first real-time model to exceed 60 mAP (0.50:0.95) on the COCO dataset. The project homepage is https://github.com/roboflow/rf-detr.

By parameter count, RF-DETR comes in three variants: base, base-2, and large. The base model has roughly 29M parameters and the large model roughly 128M.

2. Performance Comparison with YOLOv11 and YOLOv12

(The original post includes three charts here: rf-detr-base performance, and the COCO performance and parameter counts of YOLOv11 and YOLOv12. The numbers below are taken from those charts.)

On the COCO dataset, the RF-DETR base model reaches 53.3 mAP, close to YOLOv11-large (53.4 mAP) and slightly below YOLOv12-large (53.8 mAP), while its parameter count is slightly higher than the large variants of both YOLOv11 and YOLOv12.

3. Training a Custom Dataset

3.1 Environment Setup

Training requires Python 3.9 or newer and PyTorch 2.0 or newer. The official repository does not state an operating-system requirement; the environment used in this article is Ubuntu. The only package you need to install before training is rfdetr:

```shell
pip install rfdetr
```

3.2 Dataset Format

The dataset must be in COCO format: train, valid, and test folders, each containing the image files plus a single JSON annotation file (_annotations.coco.json). If your local dataset is in YOLO format, you can convert it with code along the following lines:

```python
import json
import os
import shutil

from tqdm import tqdm


def convert_yolo2coco():
    txt_path = "dataset/valid.txt"
    with open(txt_path, "r") as f:
        filepathList = f.readlines()

    categories = [
        {"id": 0, "name": "person", "supercategory": "none"},
        {"id": 1, "name": "car", "supercategory": "none"},
        {"id": 2, "name": "dog", "supercategory": "none"},
    ]

    # Build the COCO-format dictionary
    coco_format = {
        "info": {"description": "YOLO to COCO Converted Dataset"},
        "licenses": [],
        "images": [],
        "annotations": [],
        "categories": categories,
    }

    annotation_id = 0
    image_id = 0
    folderpath = txt_path.replace(".txt", "")
    output_json = f"{folderpath}/_annotations.coco.json"
    if not os.path.exists(folderpath):
        os.mkdir(folderpath)

    for filepath in tqdm(filepathList):
        imgname = filepath.split("/")[-1].replace("\n", "")
        shutil.copy(filepath.replace("\n", ""), f"{folderpath}/{imgname}")
        # All images are assumed to be 640x640 here
        coco_format["images"].append(
            {"id": image_id, "width": 640, "height": 640, "file_name": imgname}
        )

        labelpath = (
            filepath.replace("images", "labels")
            .replace(".jpg", ".txt")
            .replace("\n", "")
        )
        with open(labelpath, "r") as f:
            for line in f.readlines():
                parts = line.strip().split()
                class_id = int(parts[0])
                x_center, y_center, w, h = (
                    float(parts[1]),
                    float(parts[2]),
                    float(parts[3]),
                    float(parts[4]),
                )
                # YOLO stores normalized center/size; COCO wants absolute
                # top-left x, y plus width and height
                x = (x_center - w / 2) * 640
                y = (y_center - h / 2) * 640
                coco_format["annotations"].append(
                    {
                        "id": annotation_id,
                        "image_id": image_id,
                        "category_id": class_id,
                        "bbox": [x, y, w * 640, h * 640],
                        "area": w * 640 * h * 640,
                        "iscrowd": 0,
                        "segmentation": [],  # detection only; no segmentation needed
                    }
                )
                annotation_id += 1
        image_id += 1
        # break

    # Save as a COCO-format JSON file
    with open(output_json, "w") as f:
        json.dump(coco_format, f, indent=2)
    print(f"Conversion finished. Output file: {output_json}")
```

3.3 Debugging the Training Process

3.3.1 Training Code

The training code is as follows:

```python
from rfdetr import RFDETRBase

model = RFDETRBase(pretrain_weights="rf-detr-base-coco.pth")
# model = RFDETRBase(pretrain_weights="rf-detr-base-2.pth")
# model = RFDETRLarge(pretrain_weights="rf-detr-large.pth")

model.train(
    dataset_dir="dataset",
    epochs=10,
    batch_size=12,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="model",
)
```

3.3.2 Downloading the Pretrained Weights

The official code appears to support only fine-tuning from pretrained weights, so a checkpoint must be specified through the pretrain_weights argument. If pretrain_weights is left empty, the library tries to download the weights automatically; with an unreliable connection this download often fails and the code hangs for a long time with no output. The fix is to download the weights locally first. The official download links for the three models are:

https://storage.googleapis.com/rfdetr/rf-detr-base-coco.pth
https://storage.googleapis.com/rfdetr/rf-detr-base-2.pth
https://storage.googleapis.com/rfdetr/rf-detr-large.pth

Free CSDN mirrors:

https://download.csdn.net/download/weixin_46846685/90995591
https://download.csdn.net/download/weixin_46846685/90992340
3.3.3 Downloading the Transformer Backbone

Even with the pretrained weights in place, running the training code above may still fail with:

```
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't
find it in the cached files and it looks like facebook/dinov2-small is not the path
to a directory containing a file named config.json. Checkout your internet connection
or see how to run the library in offline mode at
'https://huggingface.co/docs/transformers/installation#offline-mode'
```

This error occurs because the Transformer backbone cannot be downloaded due to network issues. The fix is to download the model manually, create a local facebook/dinov2-small directory, and copy the model files into it. The official download page is:

https://huggingface.co/facebook/dinov2-small/tree/main

Free CSDN mirror:

https://download.csdn.net/download/weixin_46846685/90977658

After this, training should run normally.

3.3.4 Other Training Parameters

The remaining training parameters are listed in the table below.

| Parameter | Description |
| --- | --- |
| dataset_dir | Specifies the COCO-formatted dataset location with train, valid, and test folders, each containing _annotations.coco.json. Ensures the model can properly read and parse the data. |
| output_dir | Directory where training artifacts (checkpoints, logs, etc.) are saved. Important for experiment tracking and resuming training. |
| epochs | Number of full passes over the dataset. Increasing this can improve performance but extends total training time. |
| batch_size | Number of samples processed per iteration. Higher values require more GPU memory but can speed up training. Must be balanced with grad_accum_steps to maintain the intended total batch size. |
| grad_accum_steps | Accumulates gradients over multiple mini-batches, effectively raising the total batch size without requiring as much memory at once. Helps train on smaller GPUs at the cost of slightly more time per update. |
| lr | Learning rate for most parts of the model. Influences how quickly or cautiously the model adjusts its parameters. |
| lr_encoder | Learning rate specifically for the encoder portion of the model. Useful for fine-tuning encoder layers at a different pace. |
| resolution | Sets the input image dimensions. Higher values can improve accuracy but require more memory and can slow training. Must be divisible by 56. |
| weight_decay | Coefficient for L2 regularization. Helps prevent overfitting by penalizing large weights, often improving generalization. |
| device | Specifies the hardware (e.g. cpu or cuda) to run training on. A GPU speeds up training significantly. |
| use_ema | Enables an Exponential Moving Average of the weights, producing a smoothed checkpoint. Often improves final performance with slight overhead. |
| gradient_checkpointing | Re-computes parts of the forward pass during backpropagation to reduce memory usage. Lowers memory needs but increases training time. |
| checkpoint_interval | Frequency (in epochs) at which model checkpoints are saved. More frequent saves provide better coverage but consume more storage. |
| resume | Path to a saved checkpoint for continuing training. Restores both model weights and optimizer state. |
| tensorboard | Enables logging of training metrics to TensorBoard for monitoring progress and performance. |
| wandb | Activates logging to Weights & Biases, facilitating cloud-based experiment tracking and visualization. |
| project | Project name for Weights & Biases logging. Groups multiple runs under a single heading. |
| run | Run name for Weights & Biases logging, helping differentiate individual training sessions within a project. |
| early_stopping | Enables an early stopping callback that monitors mAP improvements to decide if training should be stopped. Helps avoid needless epochs when mAP plateaus. |
| early_stopping_patience | Number of consecutive epochs without mAP improvement before stopping. Prevents wasting resources on minimal gains. |
| early_stopping_min_delta | Minimum change in mAP to qualify as an improvement. Ensures that trivial gains don't reset the early stopping counter. |
| early_stopping_use_ema | Whether to track improvements using the EMA version of the model. Uses EMA metrics if available, otherwise falls back to regular mAP. |

If this article gave you some useful ideas, please give it a like.
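As the parameter table notes, batch_size and grad_accum_steps jointly determine the total batch size each optimizer update sees. A quick sketch of that relationship, using the values from the training call in section 3.3.1 (the helper function is mine, for illustration only):

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int) -> int:
    """Total number of samples contributing to each optimizer update."""
    return batch_size * grad_accum_steps


# The training call above uses batch_size=12 with grad_accum_steps=4:
print(effective_batch_size(12, 4))  # 48

# On a smaller GPU, halving batch_size while doubling grad_accum_steps
# keeps the effective batch size (and thus the optimization behavior) similar:
print(effective_batch_size(6, 8))  # 48
```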
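As an alternative to copying the facebook/dinov2-small files into a local directory (section 3.3.3), Hugging Face Transformers also supports an offline mode driven by environment variables once the model is already cached; this is the mechanism linked from the error message itself. I have not verified that it covers every loading path inside rfdetr, so treat this as an optional experiment:

```python
import os

# Must be set before `transformers` (or rfdetr) is imported.
os.environ["HF_HUB_OFFLINE"] = "1"        # never contact huggingface.co
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # use only locally cached files
```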