Hello, I am trying to parallelize Excel processing and I am noticing some bizarre behavior: single-threaded processing is actually faster.
I am not doing anything fancy: I just open an XSSFWorkbook, fill in some values, run the formula calculations, and read the output. Single-threaded, the initial run takes a few seconds to complete (I assume because the JVM needs to load POI plus all the XML, schemas, etc.), but performance then improves and subsequent runs all take about 100-200 ms. The same logic executed in separate threads easily takes 5 seconds in each thread. So single-threaded processing of, say, 10 files takes about 4.5 seconds, while multithreaded processing easily takes 5-6 seconds.

No files are shared among threads, and each thread reads a different file, so file contention is out of the picture. Thread behavior also looks correct. The hotspots are in POIXMLDocument.load.

Any ideas as to why, or pointers to POI multithreading best practices, are greatly appreciated. Thank you very much in advance!
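To make the comparison concrete, here is a minimal sketch of the kind of timing harness described above. It is not the actual code: `TimingHarness`, `timeSequential`, `timeParallel`, and the stand-in task in `main` are all hypothetical names, and the stand-in task only burns CPU where the real per-file work (open the XSSFWorkbook, set values, evaluate formulas, read results) would go. It uses only the JDK, and it runs one untimed warm-up pass so that class loading is not counted in either measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical harness: times the same task run sequentially and on a thread pool.
class TimingHarness {

    // Runs the task over every input on the calling thread; returns elapsed nanoseconds.
    static <I, O> long timeSequential(List<I> inputs, Function<I, O> task) {
        long start = System.nanoTime();
        for (I input : inputs) {
            task.apply(input);
        }
        return System.nanoTime() - start;
    }

    // Submits one job per input to a fixed-size pool and waits for all of them;
    // returns elapsed nanoseconds (pool creation is not included in the timing).
    static <I, O> long timeParallel(List<I> inputs, Function<I, O> task, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                Callable<O> job = () -> task.apply(input);
                futures.add(pool.submit(job));
            }
            for (Future<O> future : futures) {
                future.get(); // propagates any exception thrown by the task
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: each input stands for one file; here it is just a number.
        List<Integer> inputs = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            inputs.add(i);
        }

        // Stand-in for the per-file POI work; replace with the real processing.
        Function<Integer, Long> task = n -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += (long) i * n;
            }
            return sum;
        };

        // Untimed warm-up pass so class loading and JIT compilation are not measured.
        timeSequential(inputs, task);

        long sequentialNanos = timeSequential(inputs, task);
        long parallelNanos = timeParallel(inputs, task, 4);

        System.out.println("sequential ms: " + sequentialNanos / 1_000_000);
        System.out.println("parallel ms:   " + parallelNanos / 1_000_000);
    }
}
```

The pool size of 4 is arbitrary. With the real POI work substituted for the stand-in task, a harness like this should show whether the slowdown persists once both variants have been warmed up in the same JVM.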
