zhtk commented on PR #10058: URL: https://github.com/apache/nifi/pull/10058#issuecomment-3022501891
@dan-s1, at first glance, submitting a patch to POI seems like a good idea. After some thought, though, I have concerns about this approach.

What we would like to achieve in POI is not just support for copying rows, but memory-efficient copying of rows using StreamingReader. The problem is that the StreamingSheet object returned by StreamingReader doesn't implement all methods of the POI Sheet interface; in many cases it throws UnsupportedOperationException. As a result, a util method like this submitted to the POI repository would be very sensitive to refactorings and code simplifications, because code restricted to the streaming approach is more complicated than code that uses random memory access. We could easily end up in a situation where, after some refactoring or change in POI, we can't upgrade the POI dependency in NiFi because someone unknowingly broke the copy method. One way to mitigate this could be to add a test in POI, but that would introduce a circular dependency: Apache POI -> pjfanning.xlsx.StreamingReader -> Apache POI tests. Not a great pattern.

Regarding enabling shared strings in SXSSFWorkbook, I'm not sure how to approach this. On the one hand, enabling it may blow up memory usage again; on the other hand, breaking compatibility for some clients sounds bad. If only these clients were listed in the Javadoc.

By the way, can you briefly explain why the code doesn't work for HSSF? I know for sure that it won't help with memory usage, because the HSSF format requires keeping the whole file content in memory.
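
To make the fragility concern concrete, here is a minimal, self-contained sketch. It deliberately does not use POI or StreamingReader: ToySheet, InMemorySheet, ForwardOnlySheet, copyByIteration and copyByIndex are all hypothetical names invented for this illustration. The only assumption carried over from the discussion is that a streaming sheet supports forward iteration but throws UnsupportedOperationException on random access.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: none of these types are POI or StreamingReader classes.
class StreamingCopySketch {

    /** Hypothetical stand-in for a sheet abstraction with two access styles. */
    interface ToySheet extends Iterable<List<String>> {
        List<String> getRow(int index); // random access
    }

    /** Fully loaded sheet: iteration and random access both work. */
    static final class InMemorySheet implements ToySheet {
        private final List<List<String>> rows;

        InMemorySheet(List<List<String>> rows) {
            this.rows = rows;
        }

        @Override
        public List<String> getRow(int index) {
            return rows.get(index);
        }

        @Override
        public Iterator<List<String>> iterator() {
            return rows.iterator();
        }
    }

    /** Forward-only sheet: rows can be iterated, random access is refused. */
    static final class ForwardOnlySheet implements ToySheet {
        private final Iterator<List<String>> source;

        ForwardOnlySheet(Iterator<List<String>> source) {
            this.source = source;
        }

        @Override
        public List<String> getRow(int index) {
            throw new UnsupportedOperationException("random access is not available while streaming");
        }

        @Override
        public Iterator<List<String>> iterator() {
            return source;
        }
    }

    /** Streaming-safe copy: touches the sheet only through its iterator. */
    static List<List<String>> copyByIteration(ToySheet sheet) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : sheet) {
            out.add(new ArrayList<>(row));
        }
        return out;
    }

    /** A plausible "simplification" that silently depends on random access. */
    static List<List<String>> copyByIndex(ToySheet sheet, int rowCount) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            out.add(new ArrayList<>(sheet.getRow(i)));
        }
        return out;
    }
}
```

In this toy model both copy methods behave identically on InMemorySheet, so a test suite that only exercises in-memory sheets would not notice a switch from copyByIteration to copyByIndex. The difference shows up only when a ForwardOnlySheet is passed in, which is the kind of breakage that would be hard to guard against without a streaming implementation available to the tests.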
