Re: Exclude files ~$*

2018-07-27 Thread Karl Wright
Can you view the job and include a screen shot of where this is displayed? Thanks. The exclusions are not regexps -- they are file specs. The file specs have special meanings for "*" (matches everything) and "?" (matches one character). You do not need to URL encode them. If you enable
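
For illustration only, the file-spec semantics described above ("*" matches everything, "?" matches one character, everything else literal) can be sketched in Python. This is not the connector's own matcher: the helper names `filespec_to_regex` and `matches` are hypothetical, and the sketch assumes the spec is compared against a bare file name.

```python
import re


def filespec_to_regex(spec: str) -> "re.Pattern[str]":
    """Translate a simple file spec into a compiled regex.

    Assumption: only "*" and "?" are special; every other character,
    including "~", "$" and "%", is taken literally.
    """
    parts = []
    for ch in spec:
        if ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(spec: str, name: str) -> bool:
    """Return True when the whole name matches the file spec."""
    return filespec_to_regex(spec).fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("~$report.docx", "report.docx"):
        print(name, matches("~$*", name))
```

Under these assumptions the unencoded spec `~$*` matches an Office lock file such as `~$report.docx`, while the URL-encoded form `%7E%24*` would only match names that literally begin with `%7E%24`.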

Re: Tika/POI bugs

2018-07-27 Thread Karl Wright
To solve your production problem I highly recommend limiting the size of the docs fed to Tika, for a start. But that is no guarantee, I understand. Out of memory problems are very hard to get good forensics for because they cause major disruptions to the running server. You could turn on a

Exclude files ~$*

2018-07-27 Thread msaunier
Hi Karl, In my JCIFS connector, I want to configure an exclude condition for file names starting with ~$*. I have added the condition, but it is not working. Do I need to add %7E%24* or a regex? Thanks, Maxence

RE: Tika/POI bugs

2018-07-27 Thread msaunier
Hi Karl, Okay. About the Out of Memory: this is the last day I can spend finding out where the error comes from. After that, I have to go into production to meet my deadlines. I hope to find time in the future to fix this problem on this server, otherwise I could not index

Tika/POI bugs

2018-07-27 Thread Karl Wright
Hi all, I've easily spent 40 hours over the last two weeks chasing down bugs in Apache Tika and POI. The two kinds I see are "ClassNotFound" (due to usage of the wrong ClassLoader), and "OutOfMemoryError" (not clear what it is due to yet). I don't have enough time to create tickets directly in

Re: Job stuck internal http error 500

2018-07-27 Thread Karl Wright
I am afraid you will need to open a Tika ticket, and be prepared to attach your file to it. Thanks, Karl On Fri, Jul 27, 2018 at 6:04 AM Bisonti Mario wrote: > It isn’t a memory problem, because bigger xls files (30MB) have been processed. > This xlsm file, with many colors etc., hangs

R: Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
It isn’t a memory problem, because bigger xls files (30MB) have been processed. This xlsm file, with many colors etc., hangs. I suppose it could be a Tika/Solr error, but I don’t know how to solve it ☹ Subject: R: Job stuck internal http error 500 Yes, I am using:

R: Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
Yes, I am using: /opt/manifoldcf/multiprocess-file-example-proprietary I set (sudo nano options.env.unix): -Xms2048m -Xmx2048m But I get the same error. I suspect it could be a Solr/Tika problem. What can I do? I restricted the scan to a single file and I get the same error. From: Karl

Re: Job stuck internal http error 500

2018-07-27 Thread Karl Wright
Although it is not clear what process you are talking about. If it is Solr, ask them. Karl On Fri, Jul 27, 2018, 5:36 AM Karl Wright wrote: > I am presuming you are using the examples. If so, edit the options file to grant more memory to your agents process by increasing the Xmx value. > Karl

Re: Job stuck internal http error 500

2018-07-27 Thread Karl Wright
I am presuming you are using the examples. If so, edit the options file to grant more memory to your agents process by increasing the Xmx value. Karl On Fri, Jul 27, 2018, 3:04 AM Bisonti Mario wrote: > Hello. > My job is stuck indexing a 38MB xlsx file. > What could I do to

Re: Solr connection, max connections and CPU

2018-07-27 Thread Bisonti Mario
Thanks a lot Karl!!! On 2018/07/26 13:28:47, Karl Wright wrote: > Hi Mario, > There is no connection between the number of CPUs and the number of output connections. You pick the maximum number of output connections based on the number of listening threads that you can use at the same

Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
Hello. My job is stuck indexing a 38MB xlsx file. What could I do to solve my problem? The error is below: 2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] o.e.j.s.HttpChannel /solr/core_share/update/extract java.lang.OutOfMemoryError at