LISS database: A Public Database of Common Imaging Signs of Lung Diseases
Abstract
Imaging signs in lung computed tomography (CT) images provide crucial information for the diagnosis of lung diseases. We refer to these common imaging signs of lung diseases as CISLs. It is therefore important to analyze and process CISLs automatically for the computer-aided diagnosis (CAD) of lung diseases, a problem that is attracting increasing attention. To promote the research and development of related techniques, a large public database of CISLs is needed. LISS is such a database: it contains 511 2D CISLs from 252 patients and 166 3D CISLs from 19 patients. All the 3D instances belong to the common category of ground glass opacity (GGO), while the 2D instances cover 9 common categories, including calcification, cavity & vacuolus, spiculation, lobulation, pleural dragging, bronchial mucus plugs, obstructive pneumonia, and air bronchogram. For each CT scan in the database, an experienced radiologist identified all the CISLs in it, marked the corresponding regions, and annotated their categories. All private information has been removed. The resulting LISS database can be used not only for CAD research but also for medical education and medical device administration.
Introduction
CT images play an important role in medicine. For example, they can be used to differentiate the nature of a mass (solid, mucinous, fatty, or vascular) and to detect slight changes in its internal structure and margin. Radiologists further extract image patterns of diseased tissue from CT images for diagnosis; these patterns are usually called CT imaging signs. There is currently no exact definition of CT imaging signs, and even the name has not been standardized. We adopt the empirical description given above (such patterns are also called CT features, CT findings, or CT patterns).
Research on the computer-aided detection of CT imaging signs is increasing every year. For example, many studies address the detection of ground glass opacity (GGO), and others address the detection of cavity, obstructive pneumonia, bronchial mucus plugs, and other signs.
The LISS database is a publicly available database of lung CT imaging signs. Its overall goal is to provide large-scale data support for research on CAD methods and to assist related medical education. The main uses of a well-established database of CT imaging signs are as follows.
1) The database and its management system will make computer-aided disease diagnosis more convenient. For example, a radiologist reading a CT scan can retrieve from the database lung CT images that contain imaging signs similar to those in the scan being read, as an aid to decision making.
2) For CAD researchers, the database will enable different techniques to be evaluated in a valid and meaningful manner. Furthermore, many researchers interested in lung medical images may not have enough images of their own, so a publicly available database of CT imaging signs will be welcomed by them.
3) Teaching with the database will be more efficient than teaching without it. The abundant visual information in the database will help students understand and remember the related knowledge more easily.
4) Drug administration authorities require a large database of CT imaging signs to validate CAD devices when deciding whether a device should be granted a license.
The LISS database approaches lung lesions from the perspective of CT imaging signs and contains 9 common CT imaging signs related to lung diseases. It contains 271 CT scans and the corresponding radiologist's annotations. To make the database publicly available, all private data contained in the CT images have been removed or replaced with provisioned values.
Data and data format
The LISS database contains 9 types of CT imaging signs of lung diseases. The numbers of CT scans and annotated ROIs corresponding to these CT imaging signs are summarized in the following Table.
The annotation information for CT scans is stored in plain text format. All the annotations for one type of CT imaging sign are stored in a single file named after that type. For example, GGO.txt records the annotations for GGO imaging signs. The first line of an annotation file is reserved for the annotation software and should be ignored by database users. Each of the remaining lines corresponds to one lesion region and has the format "PAx IMx num1 num2 num3 num4". For example, the line "PA18 IM17 307 333 321 348" indicates that the 17th slice of the 18th scan contains a lesion; the four numbers that follow are the coordinates of the top-left and bottom-right corners of the rectangular bounding box enclosing the lesion region. If a slice has several lesions, the annotation file contains one line for each of them.
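As an illustration of the line format described above, the following is a minimal Python sketch of an annotation parser. It is not part of the LISS distribution. The names LesionBox and parse_annotation_text are hypothetical, and the sketch assumes that the four numbers are ordered as x and y of the top-left corner followed by x and y of the bottom-right corner; the page does not state the axis order, so check it against the images before relying on it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LesionBox:
    """One annotated lesion region (hypothetical container, not a LISS type)."""
    scan_id: int      # the number after "PA"
    slice_index: int  # the number after "IM"
    x1: int           # ASSUMPTION: num1 is the x of the top-left corner
    y1: int           # ASSUMPTION: num2 is the y of the top-left corner
    x2: int           # ASSUMPTION: num3 is the x of the bottom-right corner
    y2: int           # ASSUMPTION: num4 is the y of the bottom-right corner


def parse_annotation_text(text: str) -> List[LesionBox]:
    """Parse the contents of one annotation file such as GGO.txt."""
    boxes = []
    # The first line is written for the annotation software and is skipped.
    for line in text.splitlines()[1:]:
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if (len(fields) != 6
                or not fields[0].startswith("PA")
                or not fields[1].startswith("IM")):
            raise ValueError(f"unexpected annotation line: {line!r}")
        scan_id = int(fields[0][2:])
        slice_index = int(fields[1][2:])
        x1, y1, x2, y2 = (int(value) for value in fields[2:])
        boxes.append(LesionBox(scan_id, slice_index, x1, y1, x2, y2))
    return boxes


# Usage with the example line from this page; the header line is a placeholder.
sample = "placeholder header line\nPA18 IM17 307 333 321 348\n"
boxes = parse_annotation_text(sample)
```

Because a slice with several lesions appears on several lines, the parser returns one LesionBox per line rather than one per slice; grouping by (scan_id, slice_index) is left to the caller.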
We do not distinguish between a training set and a test set. Researchers should state which data they used for training and which for testing, so that other researchers can reproduce their findings on the same data.
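One way to make a split easy to report is to derive it deterministically from the scan numbers and a fixed seed. The sketch below is only a suggestion under stated assumptions, not a split prescribed by the LISS maintainers: the function name split_scans, the 20% test fraction, and the seed value are all hypothetical choices. It splits at the scan level so that slices from one scan never appear in both sets.

```python
import random
from typing import Iterable, List, Tuple


def split_scans(scan_ids: Iterable[int],
                test_fraction: float = 0.2,
                seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_ids, test_ids), reproducible for a given seed."""
    ids = sorted(set(scan_ids))   # canonical order before shuffling
    rng = random.Random(seed)     # private generator, global state untouched
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    test_ids = sorted(ids[:n_test])
    train_ids = sorted(ids[n_test:])
    return train_ids, test_ids


# Usage: scan numbers would normally come from the "PAx" field of the
# annotation files; a small range is used here for illustration.
train_ids, test_ids = split_scans(range(1, 11))
```

Publishing the two lists of scan numbers alongside the results is a more robust record than the seed alone, because shuffling output is not guaranteed to be identical across language implementations.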
Rules
Collecting the CT scans and maintaining this website require considerable effort. We have decided to maintain this website as a public repository for CAD research on common CT imaging signs in order to promote cooperation. We do not want to create any obstacles to publishing methods that use the LISS database. At the same time, we ask anyone who downloads or uses the LISS database to abide by the rules below:
(1) Data downloaded from this site may be used only for scientific studies, for example to train or develop new algorithms for research. Data from the LISS database may not be used to train or develop algorithms used in commercial products.
(2) When the data and/or the results of algorithms on the data are used in scientific publications (journal publications, conference papers, technical reports, presentations at conferences and meetings), you must state the source of the data.
(3) Teams must notify the maintainers of this site about any publication that is (partly) based on the data on this site, in order for us to maintain a list of publications associated with the LISS database.
Download
You may download the LISS database from this link: http://isc.cs.bit.edu.cn/MLMR/LISSdownload.html. On the linked download page, the 'documents.rar' file contains the annotation information and related files; the 2D CT images are divided into 4 parts: Images_2D(0-100), Images_2D(100-200), Images_2D(200-300), Images_2D(300-363); and the 'Images_3D.rar' file contains the 3D CT images.
Note that you are not allowed to download these data if you do not agree to the above rules.
Citation
A publication about the LISS database is now available. The paper describes many aspects of the database that were previously explained on this page.
The paper can be downloaded from this link: http://ieeexplore.ieee.org/xpl/abstractAuthors.jsp?arnumber=6924794
When the LISS database and/or related content of the paper are used in scientific publications (journal publications, conference papers, technical reports, presentations at conferences and meetings), you should cite the following paper:
Guanghui Han, Xiabi Liu, Feifei Han, I Nyoman Tenaya Santika, Yanfeng Zhao, Xinming Zhao, and Chunwu Zhou, "The LISS - A Public Database of Common Imaging Signs of Lung Diseases for Computer-Aided Detection and Diagnosis Research and Medical Education," IEEE Trans. Biomedical Engineering, vol. 62, no. 2, pp. 648-656, Feb. 2015.
Contact
If anything about the database is unclear, or if you have additional questions, please e-mail Guanghui Han (hanguanghui@bit.edu.cn).
Address: Machine Learning and Multi-Media Retrieval Lab, Beijing Institute of Technology, Beijing 100081, China.
This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 81171407.