An XSSF workbook is stored as XML. Given that XML is generally about 70-80% 
markup overhead relative to the data it carries, it is not surprising that 
binary spreadsheets (which can be optimized and have very little overhead) are 
more memory efficient. In addition, XML must be parsed, whereas binary documents 
can frequently be accessed directly through pointers and data structures. That 
gives the binary formats a performance edge, which can be significant. I'm not 
sure how Microsoft handles spreadsheets internally, but maybe they keep an 
internal binary format and write it out in whatever format is requested on 
save, rather than using an internal XML representation for an XML spreadsheet, 
which is what POI is doing.
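
To make the overhead point concrete, here is a minimal sketch that encodes the 
same grid of numbers once as tagged text and once as raw 8-byte doubles, then 
compares the byte counts. The <sheet>/<row>/<c>/<v> markup is a simplified 
stand-in made up for illustration, not the real XLSX schema, and OverheadSketch 
and its method names are hypothetical. It says nothing about how POI or Excel 
store data internally.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class OverheadSketch {

    // Encode a rows x cols grid of doubles as simplified, made-up cell markup.
    // Assumption: this is NOT the real XLSX schema, only enough tags to show
    // how much of the output is markup rather than data.
    static byte[] asXml(int rows, int cols) {
        StringBuilder sb = new StringBuilder("<sheet>");
        for (int r = 0; r < rows; r++) {
            sb.append("<row r=\"").append(r + 1).append("\">");
            for (int c = 0; c < cols; c++) {
                sb.append("<c><v>").append(r * cols + c + 0.5).append("</v></c>");
            }
            sb.append("</row>");
        }
        sb.append("</sheet>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Encode the same values as raw 8-byte doubles with no markup at all.
    static byte[] asBinary(int rows, int cols) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    out.writeDouble(r * cols + c + 0.5);
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        int rows = 1000, cols = 25;
        System.out.println("xml bytes:    " + asXml(rows, cols).length);
        System.out.println("binary bytes: " + asBinary(rows, cols).length);
    }
}
```

The exact ratio depends on the values and the tags chosen, so treat the output 
as an illustration rather than a measurement of real spreadsheet files.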

-----Original Message-----
From: Jack of Shadows [mailto:somerandomlo...@gmail.com] 
Sent: Monday, April 11, 2016 7:46 AM
To: POI Users List
Subject: Re: SSPerformanceTest: Is the FAQ still accurate?

XSSF is basically unusable. 25000 or 50000 rows isn't that many. Memory 
consumption is pretty high too.
That's really confusing: I wouldn't have been surprised if HSSF performed 
poorly, but it actually works better.
Oh well, I guess I'll have to use SXSSF instead.
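
The general trade-off between loading a whole XML document into memory and 
processing it as a stream can be sketched with the JDK's own XML APIs. This is 
only an illustration under stated assumptions, not POI code: the 
<sheet>/<row>/<c> markup is made up (it is not the real XLSX schema), and 
StreamingSketch and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;

class StreamingSketch {

    // Build a small made-up sheet document (assumption: not the XLSX schema).
    static String sampleXml(int rows, int cols) {
        StringBuilder sb = new StringBuilder("<sheet>");
        for (int r = 0; r < rows; r++) {
            sb.append("<row>");
            for (int c = 0; c < cols; c++) {
                sb.append("<c><v>").append(r * cols + c).append("</v></c>");
            }
            sb.append("</row>");
        }
        return sb.append("</sheet>").toString();
    }

    // Tree approach: the whole document is built in memory before counting.
    static int countCellsDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("c").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Streaming approach: handle one parse event at a time and keep nothing.
    static int countCellsStreaming(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int cells = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "c".equals(reader.getLocalName())) {
                    cells++;
                }
            }
            reader.close();
            return cells;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = sampleXml(1000, 25);
        System.out.println("DOM cells:       " + countCellsDom(xml));
        System.out.println("streaming cells: " + countCellsStreaming(xml));
    }
}
```

Both methods return the same count; the difference is that the tree version 
holds every node at once, while the streaming version only ever sees the 
current event.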

On Mon, Apr 11, 2016 at 12:04 AM, Dominik Stadler <dominik.stad...@gmx.at>
wrote:

> Hi,
>
> Not sure which exact machine spec the information in the FAQ is based 
> on, maybe there is something that can have quite a big influence on 
> runtime of this sample for XSSF, e.g. which actual JDK is used, 
> Linux/Windows, ... ?!
>
> I did a quick run of it across various versions of POI to see if we 
> degraded performance at some point, but for me it has always been 
> this way, i.e. HSSF very quick, SXSSF fairly quick (though very 
> slow in early releases) and XSSF quite a bit slower. Maybe we need to 
> adjust the FAQ entry some more here to set correct expectations?
>
> (Exact numbers here are not that relevant as I used my 6+ year old 
> laptop where I was doing other things at the same time, albeit no CPU 
> intensive things, JVM was Sun 6.0, Linux Ubuntu, 25000 rows, 25 cols)
>
>
> Elapsed time in seconds:
>
> Version                               HSSF  XSSF  SXSSF
> latest-2016-04-10                        2    15      5
> 2014-03-22 (the FAQ-Entry was added)     1    14      3
> 3.10                                     2    14      3
> 3.9                                      1    12      3
> 3.8                                      2    15      3
> initial checkin of SSPerformanceTest     1    14     47
>
>
> Dominik.
>
>
>
>
> On Sun, Apr 10, 2016 at 5:59 PM, Jack <somerandomlo...@gmail.com> wrote:
>
> > I'm having the exact same issue; I tracked down this message from 
> > StackOverflow.
> > I've tested read performance on an XLS and an XLSX file with identical 
> > content (around 75000 rows, 25 columns).
> > HSSF takes under 5 sec; XSSF takes 15-20 sec.
> >
> > Any idea what the issue with XSSF performance is?
> >
> >
> > On 15.02.2016 17:00, Drew Spencer wrote:
> >
> >> Mike DeHaan <mike <at> mikeandzoya.com> writes:
> >>
> >>> As a followup, a user has replied to my stack overflow post with 
> >>> some information that might be helpful in tracking this issue down. 
> >>> Here is the link to his post:
> >>>
> >>> http://stackoverflow.com/a/34266795/4471563
> >>>
> >>> I ran the same tests in my environments and came up with similar
> >>> numbers.
> >>>
> >>> -Mike DeHaan
> >>
> >> I have also asked the same question. Would love to get an answer to 
> >> this either way. My similar post on StackOverflow is here:
> >> http://stackoverflow.com/questions/34995058/apache-poi-much-quicker-using-hssf-than-xssf-what-next
> >>
> >> I received a good answer with the link to the streaming reader, 
> >> but unfortunately I don't think I can use it because my code runs 
> >> on app engine.
> >>
> >> Thanks to anyone that can help.
> >>
> >> Drew Spencer
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
> >> For additional commands, e-mail: user-h...@poi.apache.org
> >>
> >>
> >>
> >
> >
> >
>

