<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article article-type="abstract" dtd-version="1.0" xml:lang="en" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">CC</journal-id>
<journal-id journal-id-type="nlm-ta">Cardiol Croat</journal-id>
<journal-title-group>
<journal-title>Cardiologia Croatica</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Cardiol. Croat.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="ppub">1848-543X</issn>
<issn pub-type="epub">1848-5448</issn>
<publisher><publisher-name>Croatian Cardiac Society</publisher-name></publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">CC 2024 19_11-12_527</article-id>
<article-id pub-id-type="doi">10.15836/ccar2024.527</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Extended Abstract</subject></subj-group>
<subj-group subj-group-type="subheading"><subject>Registries and observational surveys</subject></subj-group>
</article-categories>
<title-group>
<article-title>Is machine learning the optimal tool for assessing outcomes in healthcare data? Insights from a pulmonary embolism cohort</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-3962-2774</contrib-id><name><surname>Pavlov</surname><given-names>Marin</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1">*</xref></contrib>
<contrib contrib-type="author"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-7828-4870</contrib-id><name><surname>Novak</surname><given-names>Andrej</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib>
<contrib contrib-type="author"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-6444-2674</contrib-id><name><surname>Manola</surname><given-names>&#x0160;ime</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib>
<contrib contrib-type="author"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-2637-9691</contrib-id><name><surname>Jurin</surname><given-names>Ivana</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib>
<aff id="aff1"><label>1</label><institution>Dubrava University Hospital</institution>, <addr-line>Zagreb</addr-line>, <country country="hr">Croatia</country></aff>
<aff id="aff2"><label>2</label><institution>University of Zagreb, Faculty of Science</institution>, <institution content-type="dept">Department of Physics</institution>, <addr-line>Zagreb</addr-line>, <country country="hr">Croatia</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>*</label>ADDRESS FOR CORRESPONDENCE: Marin Pavlov, Klini&#x010D;ka bolnica Dubrava, Avenija Gojka &#x0160;u&#x0161;ka 6, HR-10000 Zagreb, Croatia. / Phone: +385-99-2360-286 / E-mail: <email xlink:href="mailto:marin.pavlov@gmail.com">marin.pavlov@gmail.com</email></corresp></author-notes>
<pub-date date-type="pub" publication-format="electronic"><month>11</month><year>2024</year></pub-date>
<pub-date date-type="pub" publication-format="print"><month>11</month><year>2024</year></pub-date>
<volume>19</volume>
<issue>11-12</issue>
<fpage>527</fpage>
<lpage>527</lpage>
<history>
<date date-type="received"><day>13</day><month>10</month><year>2024</year></date>
<date date-type="accepted"><day>31</day><month>10</month><year>2024</year></date>
</history>
<permissions>
<copyright-statement>Croatian Cardiac Society</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Croatian Cardiac Society</copyright-holder>
</permissions>
<kwd-group kwd-group-type="author"><title>KEYWORDS: </title><kwd>machine learning</kwd><kwd>pulmonary embolism</kwd><kwd>outcomes</kwd></kwd-group>
</article-meta>
</front>
<body>
<p><bold>Goal:</bold> To rank predictors of outcome in a population of pulmonary embolism (PE) patients followed for more than one year, using contemporary machine learning models.</p>
<p><bold>Patients and Methods:</bold> Machine learning models (LightGBM variant of XGBoost) were used to analyse the outcome data of a PE cohort. Patients were recruited from November 2013 until November 2018 at two academic hospitals in a metropolitan area and followed up by telephone interview or hospital visit. The primary outcome was all-cause mortality. In all patients, the diagnosis of PE was established by computed tomography. Two models were generated in both the XGBoost and the frequentist analysis: (1) a model with 19 variables and (2) a model with 8 variables. Both models were recreated from previously published results (<xref ref-type="bibr" rid="r1"><italic>1</italic></xref>, <xref ref-type="bibr" rid="r2"><italic>2</italic></xref>).</p>
<p><bold>Results:</bold> The study population of 761 patients (57.4% female, aged 73 (61-81) years) has been described previously (<xref ref-type="bibr" rid="r1"><italic>1</italic></xref>, <xref ref-type="bibr" rid="r2"><italic>2</italic></xref>). Median follow-up was 675 days (114-1331). Death within follow-up occurred in 335 cases (44.0%). In the XGBoost algorithm, the Pulmonary Embolism Severity Index (PESI) score and body mass index (BMI) were the two strongest predictors of the primary outcome. For BMI, this is contrary to the results of frequentist statistical inference, in which BMI failed to enter the Cox proportional hazards model. Overall, the models were accurate, with areas under the curve of 0.840 and 0.864.</p>
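<p><bold>Illustrative sketch:</bold> The area under the curve reported above can be read as the probability that a model assigns a higher risk score to a randomly chosen patient who died than to a randomly chosen survivor. The following minimal Python sketch computes that quantity by pairwise comparison. The function name <monospace>roc_auc</monospace> and the toy labels and scores are hypothetical illustrations; they are not the study data and do not reproduce the authors' code.</p>
<preformat preformat-type="code">
```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT from the study): 1 = died during follow-up.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```
</preformat>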
<p><bold>Conclusion:</bold> In the XGBoost analysis, a machine learning framework better suited to handling non-linear data, outcome analysis yielded different results from frequentist statistical inference. Since non-linear and non-normally distributed data prevail in healthcare databases, machine learning models may provide deeper insight into the impact of variables on outcome.</p>
</body>
<back>
<ref-list>
<title>LITERATURE</title>
<ref id="r1"><label>1</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jurin</surname><given-names>I</given-names></name><name><surname>Pavlov</surname><given-names>M</given-names></name><name><surname>Manola</surname><given-names>S</given-names></name><name><surname>Letilovic</surname><given-names>T</given-names></name><name><surname>Hadzibegovic</surname><given-names>I</given-names></name></person-group>. <article-title>Long-term outcome in pulmonary embolism: Is it healthy to be lean?</article-title> <source>Eur J Intern Med</source>. <year>2023</year> July;<volume>113</volume>:<fpage>126</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejim.2023.04.017</pub-id><pub-id pub-id-type="pmid">37095015</pub-id></mixed-citation></ref>
<ref id="r2"><label>2</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jurin</surname><given-names>I</given-names></name><name><surname>Pavlov</surname><given-names>M</given-names></name><name><surname>Manola</surname><given-names>S</given-names></name><name><surname>Radonic</surname><given-names>V</given-names></name><name><surname>Hadzibegovic</surname><given-names>I</given-names></name></person-group>. <article-title>The lean paradox in pulmonary embolism: Beyond the estimated plasma volume?</article-title> <source>Eur J Intern Med</source>. <year>2023</year> August;<volume>114</volume>:<fpage>127</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejim.2023.05.029</pub-id><pub-id pub-id-type="pmid">37258382</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>
