I fully agree. However, I am just curious to see the limits.
> On 18.09.2019 at 23:33, Erick Erickson <erickerick...@gmail.com> wrote:
>
> When it starts getting complex, I usually move to SolrJ. You say
> you're loading documents, so I assume Tika is in the mix too.
>
> Here's a blog on the topic so you can see how to get started...
>
> https://lucidworks.com/post/indexing-with-solrj/
>
> Best,
> Erick
>
>> On Wed, Sep 18, 2019 at 2:56 PM Jörn Franke <jornfra...@gmail.com> wrote:
>>
>> Hi,
>>
>> I load a set of documents. Based on these documents, some logic needs to be
>> applied to split them into chapters (this is done). One whole document is
>> loaded as a parent. The chapters of the whole document, plus metadata, should
>> be loaded as child documents of this parent.
>> I now want to collect information on how this can be done:
>> * Use a custom loader - this is possible and works.
>> * Use DIH, extract the chapters in a ScriptTransformer, and add them as
>> child documents there. However, the ScriptTransformer receives only a
>> HashMap as input, and while it works for transforming field values etc., it
>> does not seem possible to add child documents within the DIH
>> ScriptTransformer. I tried adding a Java array of SolrInputDocuments, but
>> this does not seem to work. In debug/verbose mode I can see that the
>> transformer does add them to the HashMap correctly, but they don't end up
>> in the document. Maybe this could somehow be done via nested entities?
>> * Use DIH + an UpdateProcessor (Script): there I get the SolrInputDocument
>> as a parameter, and it seems feasible to extract the chapters and add them
>> as child documents.
>>
>> Thank you.
>>
>> Best regards
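
For reference, whichever loader is used, the target shape is one parent document carrying its chapters as nested children. Below is a minimal sketch of that shape, assuming Solr's JSON update format with the `_childDocuments_` key. The class name `NestedDocSketch` and the field names `title_s`, `chapter_i` and `text_t` are hypothetical and only serve as illustration; this is not the custom loader mentioned in the thread, and it uses plain Java rather than SolrJ.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedDocSketch {

    /** Wraps a value in double quotes and escapes what JSON does not allow raw. */
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    // Remaining control characters become \\uXXXX escapes.
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /**
     * Builds a JSON array holding one parent document whose chapters are
     * listed under "_childDocuments_". Field names are hypothetical.
     */
    public static String toSolrJson(String parentId, String title, List<String> chapters) {
        StringBuilder sb = new StringBuilder();
        sb.append("[{\"id\":").append(quote(parentId))
          .append(",\"title_s\":").append(quote(title))
          .append(",\"_childDocuments_\":[");
        for (int i = 0; i < chapters.size(); i++) {
            if (i > 0) sb.append(',');
            // Child ids are derived from the parent id so they stay unique.
            sb.append("{\"id\":").append(quote(parentId + "-ch" + (i + 1)))
              .append(",\"chapter_i\":").append(i + 1)
              .append(",\"text_t\":").append(quote(chapters.get(i)))
              .append('}');
        }
        return sb.append("]}]").toString();
    }

    public static void main(String[] args) {
        List<String> chapters = new ArrayList<>();
        chapters.add("Text of chapter one");
        chapters.add("Text of chapter two");
        System.out.println(toSolrJson("doc1", "Whole document", chapters));
    }
}
```

The resulting string could be posted to a collection's update handler as JSON. With SolrJ, the same structure would presumably be built by calling `addChildDocument` on the parent `SolrInputDocument`.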