I also needed this, to decide whether to use the SAX or the DOM API for a
file according to its size.
In the SAX model you scan the actual XML, so I unzipped a large xlsx, opened
the sheet XML and looked for header tags that could tell me the real size
(rows / cols) of the sheet without reading the whole xlsx.


I found that one of the first tags is the dimension tag (e.g. <dimension
ref="A1:BK28674"/>), which gives you the number of cols and rows. Sadly,
the SAX XML parser that you use (the one that implements SheetContentsHandler)
doesn't let you override a function that checks each tag (only cells,
row start and end, and headers... I don't know what headers are, but they
are not what I am looking for).
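
To show how a ref such as "A1:BK28674" turns into a row and column count, here is a minimal sketch in plain Java. The class and method names (DimensionRef, columnIndex, size) are made up for illustration and are not part of POI. Also note that the dimension element is optional, so a file written by another tool may omit it or report a stale range.

```java
final class DimensionRef {

    /** Converts column letters such as "BK" to a 1-based column index. */
    static int columnIndex(String letters) {
        int index = 0;
        for (char c : letters.toCharArray()) {
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    /** Returns {rows, cols} spanned by a ref such as "A1:BK28674". */
    static int[] size(String ref) {
        String[] corners = ref.split(":");
        String first = corners[0];
        // A single-cell ref like "A1" has no colon, so first == last.
        String last = corners[corners.length - 1];
        int rows = rowOf(last) - rowOf(first) + 1;
        int cols = columnIndex(lettersOf(last)) - columnIndex(lettersOf(first)) + 1;
        return new int[] {rows, cols};
    }

    // Keep only the upper-case column letters of a cell address.
    private static String lettersOf(String cell) {
        return cell.replaceAll("[^A-Z]", "");
    }

    // Keep only the digits (the row number) of a cell address.
    private static int rowOf(String cell) {
        return Integer.parseInt(cell.replaceAll("[^0-9]", ""));
    }
}
```

For the example tag above, DimensionRef.size("A1:BK28674") gives 28674 rows and 63 columns (BK = 2 * 26 + 11).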

So, you are left with 2 choices:
1. Open the xlsx zip programmatically, find the sheet XML, read it with a
SAX parser, pick up the above tag at the beginning and close the file.

2. Find the unzipped size of the xlsx and use that as a rough estimate.

I chose the latter;
here is the code:

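import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

private boolean isLargeExcel(String excelFile) {
        final long LARGE_EXCEL_THRESHOLD = 40L * 1024 * 1024; // 40MB, uncompressed
        long expandedSize = 0;
        // ZipFile reads the zip's central directory, so getSize() is known up front.
        // With ZipInputStream, getSize() can return -1 until the entry data is read.
        try (ZipFile zip = new ZipFile(excelFile)) {
                Enumeration<? extends ZipEntry> entries = zip.entries();
                while (entries.hasMoreElements()) {
                        long size = entries.nextElement().getSize();
                        if (size > 0) // skip entries of unknown size (-1)
                                expandedSize += size;
                }
        } catch (IOException x) {
                return false; // unreadable or not a zip: treat as not large
        }
        return expandedSize > LARGE_EXCEL_THRESHOLD;
}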
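
For completeness, the first option could look roughly like the sketch below. It is not what I used, and it makes some assumptions: the class name SheetDimensionReader and its methods are made up, the sheet entry name (typically something like "xl/worksheets/sheet1.xml") has to be supplied by the caller, and the sheet XML is assumed to use the default namespace with no prefix, so the tag shows up as plain "dimension". Parsing stops as soon as the dimension tag or the start of sheetData is seen, so the rows are never read.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SheetDimensionReader {

    /** Thrown by the handler to abort parsing once enough has been seen. */
    private static final class StopParsing extends SAXException {
        StopParsing() { super("stop parsing"); }
    }

    /** Returns the ref attribute of the dimension tag, or null if there is none. */
    static String readDimensionRef(InputStream sheetXml) throws Exception {
        final String[] ref = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) throws SAXException {
                if ("dimension".equals(qName)) {
                    ref[0] = atts.getValue("ref");
                    throw new StopParsing();
                }
                if ("sheetData".equals(qName)) {
                    // Rows start here, so no dimension tag was present.
                    throw new StopParsing();
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        } catch (StopParsing expected) {
            // Normal early exit, nothing to do.
        }
        return ref[0];
    }

    /** Opens the xlsx as a zip and reads the dimension ref of one sheet entry. */
    static String readDimensionRef(String xlsxPath, String sheetEntry) throws Exception {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(sheetEntry);
            if (entry == null) {
                return null; // no such sheet in this file
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return readDimensionRef(in);
            }
        }
    }
}
```

A null result means you have to fall back to something else, such as the unzipped-size estimate above.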



