Thanks for all your advice and thoughts.

The "client" in our case is the Tomcats, or more precisely the webapps 
running in the Tomcats. These should serve HTTP requests.

I'd also like to note that it's the batch updates that, in my opinion, cause 
the load (CPU and memory, depending on the PDF) which I would like to take off 
the webapps, not the single-document insertions/updates.

But if I don't get a clean/stable "Solr-way-to-do-it" solution to this problem 
I will do the extraction in the webapps, as is 


-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Saturday, 13 September 2014 23:22
To: solr-user@lucene.apache.org
Subject: Re: SolrJ : fieldcontent from (multiple) file(s)

Alexandre:

Hmmm, if you're correct, that pretty much shoots SolrCell in the head too. You'd 
probably have to do something with a custom UpdateRequestProcessor in that 
case...

On Sat, Sep 13, 2014 at 2:06 PM, Alexandre Rafalovitch <arafa...@gmail.com> 
wrote:
> On 13 September 2014 17:03, Erick Erickson <erickerick...@gmail.com> wrote:
>> Which probably just means I don't understand your problem space in 
>> sufficient depth....
>
> I suspect this means the clients do not have access to the shared 
> drive with the files, but the Solr server does. A firewall in between 
> or some such.
>
> If I am right, that would make invoking DataImportHandler a bit 
> complicated as well, due to change of push to pull.
>
> Regards,
>    Alex.
>
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
