[ https://issues.apache.org/jira/browse/NIFI-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701968#comment-17701968 ]
David Handermann commented on NIFI-11167:
-----------------------------------------

[~dstiegli1] It sounds like a row with all null values should be returned from the Reader. In general, the Reader should provide a record-oriented representation of the Excel document.

Many of the properties in ConvertExcelToCSVProcessor are also specific to the CSV output format, so they would not apply to a new Excel Reader.

A property to select Excel Sheets to extract seems like something that should be supported. It may be better to implement that using a Regular Expression named something like Sheet Name Pattern, defaulted to {{.*}} to include all Sheets.

> Add Excel Record Reader
> -----------------------
>
>                 Key: NIFI-11167
>                 URL: https://issues.apache.org/jira/browse/NIFI-11167
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: David Handermann
>            Assignee: Daniel Stieglitz
>            Priority: Minor
>
> A new Excel Record Reader should be implemented to support reading XLSX
> spreadsheet rows as NiFi Records. This Reader will enable integration with
> various record-oriented components, obviating the need for the narrowly
> focused ConvertExcelToCSVProcessor. The initial version of the Excel Reader
> should not support the legacy binary XLS format.
> The ExcelReader should use a library that supports reading from a stream of
> rows to avoid consuming large amounts of heap memory during processing.
> The ExcelReader should support configurable properties to read selected
> sheets. With Excel supporting typed field values, some amount of field type
> mapping will be required. Additional input filtering properties should not be
> implemented as existing Processors like QueryRecord support a wide variety of
> filtering and projection use cases.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
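
For illustration, the Sheet Name Pattern idea in the comment above could look roughly like the following. This is a minimal sketch using only java.util.regex, not the NiFi implementation: the class SheetNameFilter, the method selectSheets, and the constant DEFAULT_SHEET_NAME_PATTERN are hypothetical names, and the sketch assumes the pattern must match the whole sheet name rather than a substring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi: keeps the sheets whose names match a pattern.
class SheetNameFilter {
    // Default suggested in the comment: ".*" matches every sheet name.
    static final String DEFAULT_SHEET_NAME_PATTERN = ".*";

    static List<String> selectSheets(List<String> sheetNames, String sheetNamePattern) {
        Pattern pattern = Pattern.compile(sheetNamePattern);
        List<String> selected = new ArrayList<>();
        for (String name : sheetNames) {
            // Assumption: matches() is used, so the entire name must match the pattern.
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> sheets = List.of("Sales 2022", "Sales 2023", "Notes");
        // The default pattern keeps all sheets; a narrower pattern keeps a subset.
        System.out.println(selectSheets(sheets, DEFAULT_SHEET_NAME_PATTERN));
        System.out.println(selectSheets(sheets, "Sales.*"));
    }
}
```

With full-name matching, a value such as Sales alone would select nothing from the sample list, whereas Sales.* selects both Sales sheets. Whether the real property should use full or partial matching is a design choice left open here.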