Hello,

I'm indexing XML files with XPathEntityProcessor, and a few hundred documents out of 12 million are not processed.
When I tried to index just one of the failing documents on its own, it was not indexed either, so the problem is not related to the large number of documents. We tried running the XSLT transformation externally, capturing the transformed XML and indexing it in Solr, and that worked. So the document itself seems fine.

I looked at the document and it was large, so I commented out part of it. With that part removed, it was indexed in Solr with the XSL transform applied.

So I downloaded the DIH code and debugged the execution of these lines, which launch the XSL transformation, to see exactly what was happening:

    SimpleCharArrayReader caw = new SimpleCharArrayReader();
    xslTransformer.transform(new StreamSource(data), new StreamResult(caw));
    data = caw.getReader();

It turned out that caw was missing data, so the xslTransformer did not work correctly. Digging further into the TransformerImpl code, I can see the content of my XML file in a buffer, but somewhere something goes wrong that I don't understand (it is getting very tricky for me).

xslTransformer is an instance of com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.

Is there a way to change the XSLT transformer class, or is there a known size limit in this transformer that can be increased?

I have been working with Solr 4.2 and then Solr 4.6.

Thanks in advance.

Regards,
Jérôme Dupont
Bibliothèque Nationale de France
Département des Systèmes d'Information
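
To help narrow this down outside of DIH, the transformation step can be reproduced in a small standalone program. The following is only a minimal sketch, not the DIH code: the class name XsltCheck and its two helper methods are hypothetical, and the sample XML and stylesheet are placeholders for the real failing document and stylesheet. It prints which Transformer implementation JAXP picks by default and the size of the transformed output, so the result can be compared with what caw contains in the debugger.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of Solr or DIH.
class XsltCheck {

    // Applies the stylesheet to the XML using the JAXP default factory
    // and returns the whole transformed output as a String.
    public static String transform(String xml, String xsl) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    // Returns the class name of the Transformer that JAXP selects by default.
    public static String implementationName() throws Exception {
        return TransformerFactory.newInstance()
                                 .newTransformer()
                                 .getClass()
                                 .getName();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder inputs; substitute the failing document and the real stylesheet.
        String xml = "<doc><title>Hello</title></doc>";
        String xsl =
            "<xsl:stylesheet version=\"1.0\" "
          + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:value-of select=\"/doc/title\"/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        System.out.println("Transformer: " + implementationName());
        System.out.println("Output length: " + transform(xml, xsl).length());
    }
}
```

JAXP normally chooses the factory from the javax.xml.transform.TransformerFactory system property when it is set, so running the same check with that property pointing at another XSLT implementation on the classpath is one way to compare transformers. Whether DIH picks up that setting, and whether it resolves the missing output, is an assumption that would need to be verified.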