[ 
https://issues.apache.org/jira/browse/NIFI-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688050#comment-17688050
 ] 

Daniel Stieglitz commented on NIFI-10792:
-----------------------------------------

[~exceptionfactory] I am not sure how NIFI-11167 will help if we are to use the 
POI API. Currently, I believe ConvertExcelToCSVProcessor uses the XSSF Event 
API. Reading an Excel file larger than 10MB produces this stack trace:
{code:java}
        at org.apache.poi.util.IOUtils.throwRFE(IOUtils.java:599)
        at org.apache.poi.util.IOUtils.checkLength(IOUtils.java:276)
        at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:230)
        at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:203)
        at org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(ZipArchiveFakeEntry.java:82)
        at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:98)
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:132)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:319)
        at org.apache.nifi.processors.poi.ConvertExcelToCSVProcessor$1.process(ConvertExcelToCSVProcessor.java:240)
{code}

The starting point for using the XSSF Event API is 
{code:java}
org.apache.poi.openxml4j.opc.OPCPackage
{code}
which, as you can see above, is the first POI call in the stack trace 
(OPCPackage.open). Also, I am not sure how the SXSSF model would be pertinent, 
as it is used for creating an Excel document and we are trying to create a reader.
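
To illustrate what the stack trace seems to show (each zip entry is copied 
fully into a byte array and its length is checked against a cap while the 
package is opened), here is a minimal sketch using only java.util.zip. It does 
not call POI and is not POI's actual code: the class EntryBufferSketch, its 
method names, and the scaled-down sizes are all hypothetical. It also shows 
why a small compressed file might still trip the limit, on the assumption that 
the cap applies to the uncompressed entry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; none of these names exist in POI or NiFi.
public class EntryBufferSketch {

    // Copy a stream into memory, failing once more than maxLength bytes are read.
    public static byte[] toByteArrayCapped(InputStream in, int maxLength) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
            if (out.size() > maxLength) {
                throw new IllegalStateException("Entry exceeds cap of " + maxLength + " bytes");
            }
        }
        return out.toByteArray();
    }

    // Build a single-entry zip archive in memory.
    public static byte[] zipOf(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Return true only if every uncompressed entry fits under the cap.
    public static boolean canBufferAllEntries(byte[] zipBytes, int maxLength) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (zip.getNextEntry() != null) {
                try {
                    toByteArrayCapped(zip, maxLength);
                } catch (IllegalStateException tooLarge) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // 200,000 zero bytes compress to a tiny archive, like a sheet with repetitive XML.
        byte[] zip = zipOf("xl/worksheets/sheet1.xml", new byte[200_000]);
        System.out.println("Compressed size in bytes: " + zip.length);
        System.out.println("Fits under 100,000 cap: " + canBufferAllEntries(zip, 100_000));
        System.out.println("Fits under 1,000,000 cap: " + canBufferAllEntries(zip, 1_000_000));
    }
}
```

If the real behavior matches this sketch, the failure would depend on the 
uncompressed size of the largest entry rather than on the size of the .xlsx 
file itself.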





> ConvertExcelToCSVProcessor : Failed to convert file over 10MB 
> --------------------------------------------------------------
>
>                 Key: NIFI-10792
>                 URL: https://issues.apache.org/jira/browse/NIFI-10792
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core UI
>    Affects Versions: 1.17.0, 1.16.3, 1.18.0
>            Reporter: mayki
>            Priority: Critical
>              Labels: Excel, csv, processor
>             Fix For: 1.15.3
>
>         Attachments: ConvertExcelToCSVProcessor_1_18_0_with_POI_OLD.PNG, 
> ConvertExcelToCSVProcessor_1_19_1.PNG
>
>
> Hello all,
> It seems all versions after 1.15.3 introduce a failure in the processor 
> *ConvertExcelToCSVProcessor* with this error:
> {code:java}
> Tried to allocate an array of length 101,695,141, but the maximum length for 
> this record type is 100,000,000. If the file is not corrupt or large, please 
> open an issue on bugzilla to request increasing the maximum allowable size 
> for this record type. As a temporary workaround, consider setting a higher 
> override value with IOUtils.setByteArrayMaxOverride() {code}
> I have tested with 2 different NiFi instances: version 1.15.3 ==> works OK.
> Since upgrading to 1.16, 1.17, and 1.18 ==> the same processor *fails* with 
> files larger than 10MB.
> Could you help us correct this bug?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
