I have a large Solr response in XML format and would like to import it into
a new Solr collection. I'm able to use DIH with SolrEntityProcessor, but
only if I first truncate the file to a small subset of the records. I was
hoping that setting stream="true" would handle the full file, but I still
get an out-of-memory error, so I believe stream does not work with
SolrEntityProcessor. (I know the docs only mention the stream option for
XPathEntityProcessor, but I was hoping SolrEntityProcessor might have the
same capability.)

Before I open a Jira issue to request stream support for
SolrEntityProcessor in DIH, is there an alternative approach for importing
large files that are in the Solr results format?
Maybe a way to use XPath to get the values and a transformer to set the
field names? I'd like to avoid declaring the field names in dataConfig so
that I can reuse the process across data sets.
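
To make the idea concrete, here is a rough sketch of what I mean, done
outside DIH in Python rather than with an entity processor. It assumes the
file has the usual <response><result><doc>...</doc></result></response>
layout, takes each field name from the "name" attribute instead of a
declared list, and keeps all values as strings. The function name
iter_solr_docs is just something I made up for illustration, and sending
the documents on to the new collection is not shown.

```python
import io
import xml.etree.ElementTree as ET


def iter_solr_docs(source):
    """Yield one dict per top-level <doc>, reading the XML incrementally.

    `source` is a filename or a binary file-like object. Field names come
    from each element's "name" attribute, so nothing is declared up front.
    """
    stack = []  # currently open elements, so we know each <doc>'s parent
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
            continue
        stack.pop()  # elem has just been closed
        if elem.tag == "doc" and stack and stack[-1].tag == "result":
            doc = {}
            for field in elem:
                name = field.get("name")
                if name is None:
                    continue  # skip anything without a field name
                if field.tag == "arr":
                    # multi-valued field: collect the child values
                    doc[name] = [child.text or "" for child in field]
                else:
                    doc[name] = field.text or ""
            yield doc
            # drop the finished <doc> so memory use stays flat
            stack[-1].remove(elem)


if __name__ == "__main__":
    sample = (
        b'<response><result name="response" numFound="1" start="0">'
        b'<doc><str name="id">1</str>'
        b'<arr name="tags"><str>a</str><str>b</str></arr></doc>'
        b'</result></response>'
    )
    for d in iter_solr_docs(io.BytesIO(sample)):
        print(d)
```

Because each <doc> is removed from its parent once it has been yielded,
only one record should be held in memory at a time, which is the behaviour
I was hoping stream="true" would give me.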

Does anyone have ideas? Thanks.
