Category Discovery via Looped Deep Pseudo-Task Optimization Using a Large Scale Radiology Image Database

With the advent of large-scale medical databases, thousands of medical images have become available for research and clinical studies. However, most of these images have no accompanying descriptions or annotations, and recognizing what each image shows requires highly specialized skills and considerable time.

Generating semantic labels for large-scale radiology image datasets is a bottleneck in training the deep convolutional neural networks (CNNs) for image recognition that Computer Aided Detection (CAD) systems require. Traditional detection and classification problems in medical imaging need precise labels of disease features to train the system, yet few labeled large-scale medical image datasets are available. Crowd-sourcing and other conventional methods of acquiring semantic image labels are not options for medical images, because precise labeling traditionally demands a large amount of annotation time from well-trained medical professionals. Natural language processing (NLP) of radiology reports presents many challenges of its own.

Images stored in a picture archiving and communication system (PACS) may contain large amounts of medical information that could be extracted to improve computer diagnostic capabilities. Converting the medical records stored in the PACS into labels or tags is very challenging, but may yield significant advances in the development of Computer Aided Detection and Diagnostics.

Scientists at the National Institutes of Health Clinical Center (NIH-CC) have developed a novel technology that uses a looped deep pseudo-task optimization approach to discover image categories automatically and generate medical image labels. The system can be initialized by a CNN trained on radiology images and text-derived labels. The process is then repeated in a loop: better labels lead to better-trained CNN models, which in turn yield more effective deep image features and more meaningful clusters and labels, until the labels converge. Using this invention, medical images can be rapidly classified and annotated to allow for more effective use. The technology was initially developed using radiology images, but it can be applied to other types of images and may be applied to specific computer-assisted diagnostic applications in the future.
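
The loop described above (extract features, cluster them into pseudo-labels, retrain on those labels, repeat until the labels stop changing) can be illustrated with a small sketch. This is not the inventors' implementation. It makes several assumptions for illustration: a one-hidden-layer network written in NumPy stands in for the CNN, a randomly initialized encoder stands in for the CNN pre-trained on text-derived labels, plain k-means is the clustering step, and cluster purity between consecutive loops is the convergence check. All function names and parameters (looped_category_discovery, tol, n_hidden, and so on) are hypothetical.

```python
import numpy as np

def kmeans(features, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels

def purity(new_labels, old_labels, k):
    """Permutation-invariant agreement between two clusterings, in [0, 1]."""
    total = 0
    for j in range(k):
        old_in_cluster = old_labels[new_labels == j]
        if len(old_in_cluster) > 0:
            total += np.bincount(old_in_cluster, minlength=k).max()
    return total / len(new_labels)

def train_encoder(x, labels, k, w1, rng, epochs=50, lr=0.1):
    """Train a tiny softmax classifier on pseudo-labels; return updated encoder weights."""
    w2 = rng.normal(scale=0.1, size=(w1.shape[1], k))
    onehot = np.eye(k)[labels]
    for _ in range(epochs):
        hidden = np.tanh(x @ w1)                      # stand-in for deep image features
        logits = hidden @ w2
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad_logits = (probs - onehot) / len(x)       # cross-entropy gradient
        grad_w2 = hidden.T @ grad_logits
        grad_hidden = grad_logits @ w2.T * (1.0 - hidden ** 2)
        grad_w1 = x.T @ grad_hidden
        w1 = w1 - lr * grad_w1
        w2 = w2 - lr * grad_w2
    return w1

def looped_category_discovery(x, k, n_hidden=16, max_loops=10, tol=0.95, seed=0):
    """Alternate clustering and retraining until consecutive label sets agree."""
    rng = np.random.default_rng(seed)
    # Assumption: random initial encoder in place of a pre-trained CNN.
    w1 = rng.normal(scale=0.5, size=(x.shape[1], n_hidden))
    labels = kmeans(np.tanh(x @ w1), k, rng)
    history = []
    for _ in range(max_loops):
        w1 = train_encoder(x, labels, k, w1, rng)       # retrain on current pseudo-labels
        new_labels = kmeans(np.tanh(x @ w1), k, rng)    # re-cluster the updated features
        score = purity(new_labels, labels, k)
        history.append(score)
        labels = new_labels
        if score >= tol:                                # labels have stabilized
            break
    return labels, history
```

Calling looped_category_discovery(x, k=3) on an array x of shape (n_samples, n_features) returns the final pseudo-labels and the per-loop purity scores. In a realistic setting the number of clusters would itself need to be selected, and the encoder would be a CNN operating on images rather than a linear layer operating on feature vectors.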

Potential Commercial Applications:
  • Deep learning software for abnormality validation
  • Computer Aided Diagnostics (CAD)

Competitive Advantages:
  • Time-saving
  • First of its kind

Development Stage:

Pre-clinical (in vivo)


Inventors:

Le Lu

Ronald Summers

Xiaosong Wang

Intellectual Property:
Application No. 62/302,084
Application No. PCT/US2017/020185


Publications:
Wang X, et al. Unsupervised Category Discovery via Looped Deep Pseudo-Task Optimization Using a Large Scale Radiology Image Database.

Collaboration Opportunity:

Licensing and research collaboration

Licensing Contact:
John Hewes, Ph.D.
Phone: 240-276-5515

OTT Reference No: E-046-2016
Updated: Aug 15, 2018