[ https://issues.apache.org/jira/browse/DRILL-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arina Ielchiieva updated DRILL-7641:
------------------------------------
    Fix Version/s: 1.18.0

> Convert Excel Reader to Use Streaming Reader
> --------------------------------------------
>
>                 Key: DRILL-7641
>                 URL: https://issues.apache.org/jira/browse/DRILL-7641
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Text & CSV
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Major
>             Fix For: 1.18.0
>
>
> The current implementation of the Excel reader uses the Apache POI reader,
> which uses excessive amounts of memory. As a result, attempting to read large
> Excel files will cause out-of-memory errors.
> This PR converts the format plugin to use a streaming reader, still based on
> the POI library. The documentation for the streaming reader can be found
> here [1].
> All unit tests pass, and I tested the plugin with some large Excel files on my
> computer.
> [1]: [https://github.com/pjfanning/excel-streaming-reader]
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)