Hi Justin,

Thanks for the pointer. I looked at replacing my block of code with
something that uses the StreamingParser, but ran into the same issue again:
I still need the whole FeatureCollection to be built, because I have to pass
it as input to FeatureJSON.writeFeatureCollection(). Building the
FeatureCollection with the StreamingParser costs as much memory as using the
plain Parser, since every feature still ends up held in memory at once.

My follow-up question is whether you know of a FeatureCollection
implementation that streams Features one at a time: something that takes a
StreamingParser instance and calls parser.parse() on each iterator.next()
call to fetch the next feature for the FeatureCollection object. I had a
quick look around the codebase but didn't see anything.
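
To make the idea concrete, here is a rough sketch of the iterator part of
what I have in mind. It is not GeoTools code: PullIterator is a name I made
up, and the Supplier is only a stand-in for a call such as
parser.parse(). I am assuming the source returns null once the stream is
exhausted, as your loop suggests, and that a Java version with
java.util.function.Supplier is available.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Hypothetical adapter (not part of GeoTools): exposes a pull-style source
 * as an Iterator while holding at most one element in memory at a time.
 */
final class PullIterator<T> implements Iterator<T> {

    // Stand-in for something like () -> (SimpleFeature) parser.parse().
    // Assumption: the source returns null when there is nothing left.
    private final Supplier<? extends T> source;

    private T lookahead;       // element fetched by hasNext() but not yet returned
    private boolean exhausted; // true once the source has returned null

    PullIterator(Supplier<? extends T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Pull one element ahead only when none is buffered already.
        if (lookahead == null && !exhausted) {
            lookahead = source.get();
            exhausted = (lookahead == null);
        }
        return lookahead != null;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = lookahead;
        lookahead = null; // drop the reference so the element can be collected
        return result;
    }
}
```

A FeatureCollection wrapper whose iterator is built this way would only ever
hold one feature at a time. Whether FeatureJSON.writeFeatureCollection()
needs anything beyond a single pass over the features (bounds or size, for
example) is something I have not verified.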

Thanks again for your help.

Kind regards,
Logan

On Fri, Nov 23, 2012 at 2:02 AM, Justin Deoliveira <[email protected]> wrote:

> Hi Martin,
>
> You can use the StreamingParser rather than just the Parser interface. The
> former will allow you to walk through all the features rather than read
> them all into memory at once. This does however require you to parse a bit
> differently. For instance:
>
>  //Parser parser = new Parser(configuration);
>  StreamingParser parser = new StreamingParser(configuration, in, SimpleFeature.class);
>  SimpleFeature next = null;
>  // parse() returns the next matching object, or null at the end of the stream
>  while ((next = (SimpleFeature) parser.parse()) != null) {
>     //do something with the next feature
>  }
>
> Hope that helps.
>
> -Justin
>
>
> On Wed, Nov 21, 2012 at 10:49 PM, Martin Logan <
> [email protected]> wrote:
>
>> I would like to know if you can suggest a better way of handling/parsing
>> a large GML2 document. At the moment, I'm using the sample code you have
>> provided in the
>> WFS_1_0_0_ParsingTest under gt-xsd-wfs package.
>>
>>         File tmp = File.createTempFile("geoserver-DescribeFeatureType", "xml");
>>         tmp.deleteOnExit();
>>
>>         InputStream in = getClass().getResourceAsStream("geoserver-DescribeFeatureType.xml");
>>         copy(in, tmp);
>>
>>         in = getClass().getResourceAsStream("geoserver-GetFeature-of-a-very-big-file.xml");
>>
>>         DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>>         dbf.setNamespaceAware(true);
>>         DocumentBuilder db = dbf.newDocumentBuilder();
>>         Document doc = db.parse(in);
>>
>>         String schemaLocation = doc.getDocumentElement().getAttributeNS(
>>                 "http://www.w3.org/2001/XMLSchema-instance", "schemaLocation");
>>         String absolutePath = DataUtilities.fileToURL(tmp).toExternalForm();
>>
>>         schemaLocation = schemaLocation.replaceAll("http://cite.opengeospatial.org/gmlsf .*",
>>                 "http://cite.opengeospatial.org/gmlsf " + absolutePath);
>>         doc.getDocumentElement().setAttributeNS("http://www.w3.org/2001/XMLSchema-instance",
>>                 "schemaLocation", schemaLocation);
>>
>>         tmp = File.createTempFile("geoserver-GetFeature", "xml");
>>         tmp.deleteOnExit();
>>
>>         Transformer tx = TransformerFactory.newInstance().newTransformer();
>>         tx.transform(new DOMSource(doc), new StreamResult(tmp));
>>
>>         in = new FileInputStream(tmp);
>>
>>         Parser parser = new Parser(configuration);
>>         FeatureCollectionType fc = (FeatureCollectionType) parser.parse(in);
>>
>>
>> After running the test against a GML2 document with more than 1000
>> features (with geometry included in the returned attribute), it throws an
>> OutOfMemoryError. Any hints on what I can replace the Transformer.transform
>> call with? I don't think increasing the heap space will be a scalable
>> solution as I might have to handle datasets that are even bigger than the
>> limit I set for the heap memory.
>>
>> Nov 20, 2012 12:53:01 PM org.apache.catalina.core.StandardWrapperValve invoke
>> SEVERE: Servlet.service() for servlet [appServlet] in context with path [/aurin-data-provider] threw exception [Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.util.Arrays.copyOfRange(Arrays.java:3209)
>>     at java.lang.String.<init>(String.java:215)
>>     at java.lang.StringBuffer.toString(StringBuffer.java:585)
>>     at org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
>>     at org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
>>     at org.apache.xerces.dom.DeferredTextImpl.synchronizeData(Unknown Source)
>>     at org.apache.xerces.dom.CharacterDataImpl.getNodeValue(Unknown Source)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:240)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:132)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:94)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:679)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:723)
>>     at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
>>
>>
>> Thanks,
>> Logan
>>
>> ------------------------------------------------------------------------------
>> Monitor your physical, virtual and cloud infrastructure from a single
>> web console. Get in-depth insight into apps, servers, databases, vmware,
>> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
>> Pricing starts from $795 for 25 servers or applications!
>> http://p.sf.net/sfu/zoho_dev2dev_nov
>> _______________________________________________
>> GeoTools-GT2-Users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
>>
>>
>
>
> --
> Justin Deoliveira
> OpenGeo - http://opengeo.org
> Enterprise support for open source geospatial.
>
>
