David Morana created CONNECTORS-927:
---------------------------------------

Summary: Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
Key: CONNECTORS-927
URL: https://issues.apache.org/jira/browse/CONNECTORS-927
Project: ManifoldCF
Issue Type: Bug
Reporter: David Morana

FYI:
Initially, large (>500MB) zip files in a Livelink repository would halt the crawl. Eventually these errors occurred on files of any size (see below).
Oracle originally suggested a workaround (set entityExpansionLimit=0), but it didn't work.
This is a known bug in JDK 5, 6, and 7.
We upgraded to JDK 8, which seems to have fixed the issue.
You can read about it here:
https://bugs.openjdk.java.net/browse/JDK-8028111
And here:
http://stackoverflow.com/questions/20482331/whats-causing-these-parseerror-exceptions-when-reading-off-an-aws-sqs-queue-in

{code}
2014-04-12 07:39:20,730 [Worker thread '21'] WARN org.apache.manifoldcf.ingest- Solr exception during indexing https://[redacted]/cs/llisapi.dll?func=ll&objID=2547652&objAction=download (500): parsing error
org.apache.solr.common.SolrException: parsing error
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:101)
    at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:325)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:949)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(XMLStreamReaderImpl.java:219)
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.<init>(XMLStreamReaderImpl.java:189)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(XMLInputFactoryImpl.java:277)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:155)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:99)
    ... 4 more
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
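
For reference, the workaround mentioned in the description (setting the entity expansion limit to 0, which the JDK treats as "no limit") might be applied along the lines of the sketch below. This is an illustration only: the class `EntityLimitSketch`, its methods, and the sample XML are hypothetical and are not part of ManifoldCF or SolrJ. `entityExpansionLimit` is the property name given in the report; `jdk.xml.entityExpansionLimit` is, to my knowledge, the name used by newer JDK releases, so the sketch sets both. As the report notes, this setting did not resolve the problem in this case; the upgrade to JDK 8 did.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not ManifoldCF or SolrJ code.
public class EntityLimitSketch {

    // Property name as given in the report.
    static final String LEGACY_LIMIT = "entityExpansionLimit";
    // Assumption: name used by newer JDK releases.
    static final String JDK_LIMIT = "jdk.xml.entityExpansionLimit";

    // Sets both properties to "0" (no limit). This must run before the
    // XMLInputFactory is created, because the limits are read at that point.
    static void disableEntityExpansionLimit() {
        System.setProperty(LEGACY_LIMIT, "0");
        System.setProperty(JDK_LIMIT, "0");
    }

    // Opens a StAX reader (the API that appears in the stack trace above)
    // and returns the local name of the root element, or null if none.
    static String readRootName(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamReader.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        disableEntityExpansionLimit();
        System.out.println(readRootName("<response><ok/></response>"));
    }
}
```

The same effect can usually be had without code changes by passing `-DentityExpansionLimit=0` on the JVM command line. Removing the limit also removes the protection against entity-expansion attacks, so it is only reasonable when the XML comes from a trusted source such as your own Solr server.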