Ummm... Yes, I think I might have identified an issue with POI and a large number of strings, and I was looking at it partly in response to Mike's problem.
But I don't think the issue I found is the root problem. It might explain why large files generated in POI HSSF would not open correctly in Excel. In fact, I couldn't find any problem with the way POI handles things. At this point, I would say that what I have identified is just a difference between the way Excel writes a file with more than 1024 strings and the way the same file is written from POI.

I have tried reading a 3 MB Excel file which contains 65,000 unique strings and 130,000 BIFF records. Everything worked fine (if slowly, but 5 minutes instead of 5 hours). I have a 2 GHz Pentium laptop with 1 GB RAM. I did not increase the JVM heap size (so it was 128 MB).

I did see one thing which I don't understand. I was debugging the application in Eclipse, and many times during the load, the CPU utilization went down to nearly zero for several seconds at a time. But after 15 to 30 seconds, it would pick up again and run for another 15 to 30 seconds at 100%. Toward the end of the run (when HSSFSheet creation was nearly complete), the idle periods got longer. I am certain that the idle intervals I observed were when the JVM was garbage collecting. I don't understand why Windows showed 0% CPU utilization during this time.

-----Original Message-----
From: Danny Mui [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 24, 2005 2:27 PM
To: POI Users List
Subject: Re: HSSF cannot open files that contain many strings

Mike Z has identified an issue with HSSF handling a bunch of unique strings (dev list). Once that is taken care of, I have a suspicion your issue will be addressed as well.

Can you go into bugzilla and provide your excel file as a validation point as well? I can't find an existing bug with this issue so it would help facilitate testing once the coding is complete.
As for timeframe, I'll dedicate some time in April and May as I'll be trekking around Europe and need something to do while sipping coffee ;D

Mike Serra wrote:
>
> Hello again to the POI world,
> I have been having an ongoing problem with HSSF's ability to load an
> .xls file containing only strings. A 500kb file filled only with
> strings will not load, but it doesn't throw an exception or run out of
> RAM either. The process sits there taking up CPU time and slowly
> nibbling at system RAM, and the file might take hours to load (I
> haven't bothered to wait that long).
>
> In the past, I thought that POI was simply not able to load large files,
> but I have since discovered that it can load enormous files, as long as
> they contain only numeric data. The strings are the problem. I would
> be very grateful if anyone has an idea what causes this.
>
> Thank you,
> Mike S.
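
On the idle intervals mentioned in the reply at the top of this thread: one way to check whether they really line up with garbage collection is to compare wall-clock time against the time the JVM's collectors report. The following is only a rough sketch, not anything taken from POI. It assumes a Java 5 or later JVM (for `java.lang.management`), and the class name `GcProbe` and the string-building workload are made up for illustration; in a real test the workload would be replaced by the actual workbook load.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: compares elapsed time with reported GC time.
public class GcProbe {

    /** Sums the collection time (ms) reported by all collectors; collectors that report -1 are skipped. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Stand-in workload (an assumption): builds `count` unique strings, loosely like a large string table. */
    public static List<String> buildUniqueStrings(int count) {
        List<String> strings = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            strings.add("cell-value-" + i);
        }
        return strings;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        // Replace this call with the real file load when measuring an actual workbook.
        List<String> strings = buildUniqueStrings(65_000);

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long gcMs = totalGcMillis() - gcBefore;

        System.out.println("strings built: " + strings.size());
        System.out.println("elapsed ms:    " + elapsedMs);
        System.out.println("GC ms:         " + gcMs);
        System.out.println("max heap MB:   " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

If the reported GC time is only a small fraction of the elapsed time, the pauses probably have some other cause. Running the JVM with `-verbose:gc` gives a similar picture without any code changes, and the `max heap MB` line shows the heap limit the JVM is actually using.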
