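Automated Generation of Synoptic Reports from Narrative Pathology Reports in University Malaya Medical Centre Using Natural Language Processing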

01 April 2022


Wee-Ming Tan, Kean-Hooi Teoh, Mogana Darshini Ganggayah, Nur Aishah Taib, Hana Salwani Zaini, and Sarinder Kaur Dhillon


Abstract

Pathology reports are a primary source of information for cancer registries. University Malaya Medical Centre (UMMC) is a tertiary hospital responsible for training pathologists, so narrative reporting is important there. However, unstructured free-text reports make information extraction tedious for clinical audits and data analysis-related research. This study aims to develop an automated natural language processing (NLP) algorithm that summarizes existing narrative breast pathology reports from UMMC into narrower, structured synoptic pathology reports, using a checklist-style report template to ease the creation of pathology reports. The rule-based NLP algorithm was developed in the R programming language using 593 pathology specimens from 174 patients provided by the Department of Pathology, UMMC. The pathologist provided specific keywords for the data elements to define the semantic rules of the algorithm. The system was evaluated by calculating precision, recall, and F1-score. The proposed NLP algorithm achieved a micro-F1 score of 99.50% and a macro-F1 score of 98.97% on 178 specimens with 25 data elements. This performance aligns with clinicians' needs and could improve communication between pathologists and clinicians. The study is significant because structured data are easily minable and could generate important insights.
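
The abstract reports both a micro-F1 and a macro-F1 score across data elements. As a minimal sketch of how the two averages differ, the Python example below computes both from per-element counts of true positives, false positives, and false negatives. It is not the authors' code (the study itself used R), and the element names and counts are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Extraction outcome counts for one data element."""
    tp: int  # values extracted correctly
    fp: int  # values extracted but wrong or spurious
    fn: int  # values present in the report but missed


def f1(tp: int, fp: int, fn: int) -> float:
    """F1-score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_element: dict[str, Counts]) -> float:
    """Average the per-element F1-scores, weighting every element equally."""
    scores = [f1(c.tp, c.fp, c.fn) for c in per_element.values()]
    return sum(scores) / len(scores)


def micro_f1(per_element: dict[str, Counts]) -> float:
    """Pool the counts over all elements first, then compute a single F1."""
    tp = sum(c.tp for c in per_element.values())
    fp = sum(c.fp for c in per_element.values())
    fn = sum(c.fn for c in per_element.values())
    return f1(tp, fp, fn)


# Hypothetical counts for two illustrative data elements (not study data).
example = {
    "tumour_size": Counts(tp=90, fp=5, fn=5),
    "histologic_grade": Counts(tp=8, fp=2, fn=0),
}

if __name__ == "__main__":
    print(f"micro-F1: {micro_f1(example):.4f}")
    print(f"macro-F1: {macro_f1(example):.4f}")
```

Because micro-averaging pools the counts, data elements that occur often dominate the score, whereas macro-averaging gives a rarely reported element the same weight as a common one. A macro-F1 slightly below the micro-F1, as in the abstract, is consistent with somewhat lower performance on some less frequent elements.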



Citation

Tan, W.-M., Teoh, K.-H., Ganggayah, M. D., Taib, N. A., Zaini, H. S., & Dhillon, S. K. (2022). Automated Generation of Synoptic Reports from Narrative Pathology Reports in University Malaya Medical Centre Using Natural Language Processing. Diagnostics, 12(4), 879. https://doi.org/10.3390/diagnostics12040879
