Hi,
You say this is a "Tika error". Is this Tika as a stand-alone service? I
do not recognize any ManifoldCF classes whatsoever in this thread dump.
If this is Tika, I suggest contacting the Tika team.
Karl
On Thu, Sep 30, 2021 at 3:02 AM Bisonti Mario wrote:
> Additional info.
>
> I
Hi,
We have reset throttling to 10 for AD and Solr (2 for the Windows
repository).
The job indexing all pptx to null output ran successfully (162733 documents).
The job indexing all pptx to Solr still fails; manifoldcf.log contains:
WARN 2013-11-07 14:34:06,502 (Worker thread '29') - JCIFS: Possibly
Hi Ronny,
The failure occurs because, for some documents, the time spent transferring
data to Solr exceeds the socket timeout you have set for the Solr
connection.
This is probably due to excessive load on the Solr instance. My suggestion
is to increase the socket timeout on your
Karl,
I don't know where you live but if you come to Belgium, stop in Brussels
for a good Belgian beer ;-)
In other words, setting the socket timeout to 2000 instead of 900 has
solved the problem.
It has indexed about 160,000 documents in 2 hours.
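
A minimal sketch of why raising the socket timeout helps, assuming nothing about ManifoldCF or Solr internals: a read on a socket fails once the peer takes longer to answer than the configured timeout, and succeeds when the timeout is raised above that delay. The function name `read_with_timeout` and the scaled-down delay values are hypothetical and stand in for the 900 and 2000 settings mentioned above.

```python
import socket
import threading
import time


def read_with_timeout(delay_s, timeout_s):
    """Return True if a peer that answers after delay_s is read within timeout_s."""
    # A connected pair of local sockets stands in for the client/server link.
    client, server = socket.socketpair()

    def slow_peer():
        # Simulate a loaded server that responds late.
        time.sleep(delay_s)
        try:
            server.sendall(b"ok")
        except OSError:
            pass  # the client side may already be gone

    worker = threading.Thread(target=slow_peer)
    worker.start()
    client.settimeout(timeout_s)
    try:
        client.recv(16)
        return True
    except socket.timeout:
        # The response took longer than the configured timeout.
        return False
    finally:
        worker.join()
        client.close()
        server.close()
```

With a slow peer, `read_with_timeout(0.3, 0.05)` returns `False`, while a timeout larger than the delay, as in `read_with_timeout(0.05, 1.0)`, returns `True`.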
On the other hand, the Manifold/Solr machine (all
Hi Ronny,
One minor thing: you only need to set throttling to 2 for the
Windows repository connection, not for AD or Solr.
As for how to debug this issue, first off you should be looking in the
manifoldcf.log file (or the equivalent). You should see WARN messages from
the shared file
Ok Karl, thanks for the tip and the quick response, we will do this and
come back with the result.
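
The log check suggested above can be sketched as a small filter that pulls WARN entries out of manifoldcf.log. This is only a sketch: the line layout is assumed from the single WARN line quoted earlier in this thread, and the names `WARN_LINE` and `warn_entries` are hypothetical.

```python
import re

# Layout assumed from the quoted line:
# WARN 2013-11-07 14:34:06,502 (Worker thread '29') - JCIFS: Possibly
WARN_LINE = re.compile(
    r"^\s*WARN\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\((?P<thread>[^)]*)\)\s+-\s+"
    r"(?P<msg>.*)$"
)


def warn_entries(lines):
    """Yield (timestamp, thread, message) for every line that looks like a WARN entry."""
    for line in lines:
        match = WARN_LINE.match(line.rstrip("\n"))
        if match:
            yield match.group("ts"), match.group("thread"), match.group("msg")
```

Passing an open file object for the log, for example `warn_entries(open("manifoldcf.log"))` with the path adjusted to the actual installation, yields one tuple per WARN line, which makes it easier to see which worker threads report problems and when.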
On Wed, Nov 6, 2013 at 9:28 PM, Karl Wright daddy...@gmail.com wrote:
Hi Ronny,
One minor thing: you only need to set throttling to 2 for the
Windows repository connection, not for AD