I see that the job was processed, but with no documents in it.
10-11-2018 13:32:25.649   job end     1539153700219(G_IT_Area_condivisa_Mario_XLSM)   0   1
10-11-2018 13:32:14.211   job start   1539153700219(G_IT_Area_condivisa_Mario_XLSM)   0   1





Do I have to uncheck “Use the Extract Update Handler” on my Solr output 
connection?






From: Karl Wright <daddy...@gmail.com>
Sent: Thursday, October 11, 2018 13:36
To: user@manifoldcf.apache.org
Subject: Re: How to set Tika with ManifoldCF and Solr

Please have a look at your "Simple History" report to see why the documents 
aren't getting indexed.

Thanks,
Karl


On Thu, Oct 11, 2018 at 7:10 AM Bisonti Mario <mario.biso...@vimar.com> wrote:
Thanks Karl.
I tried, but it doesn’t index the documents.
It seems that it doesn’t see them.

Perhaps the problem is the “Ignore Tika exception” option, which I don’t know 
where to set in ManifoldCF?





From: Karl Wright <daddy...@gmail.com>
Sent: Thursday, October 11, 2018 12:24
To: user@manifoldcf.apache.org
Subject: Re: How to set Tika with ManifoldCF and Solr

Hi Mario,

(1) When you use the Tika server externally, you do not get the boilerpipe HTML 
extractor available for configuration and use.  That is because it's external 
now.
(2) In your Solr connection, you want to uncheck the box that says "use 
extracting update handler", and you want to change the output handler from 
"/update/extract" to just "/update".
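
As a rough illustration of what point (2) implies: with the extracting handler 
switched off, Solr's plain /update handler receives fields that were already 
extracted upstream, rather than a raw binary for Solr's own Tika to parse. The 
Python sketch below only builds such a request. The host and port 
(localhost:8983) and the field names "id" and "content" are assumptions, not 
taken from this thread; core_share is the core name that appears in the Solr 
log of the original message. The request ManifoldCF actually sends may differ.

```python
import json
import urllib.request


def build_update_request(doc, core="core_share",
                         base_url="http://localhost:8983/solr"):
    """Build (but do not send) a JSON POST to Solr's plain /update handler.

    base_url is an assumed default; adjust it to your installation.
    """
    url = f"{base_url}/{core}/update?commit=true"
    body = json.dumps([doc]).encode("utf-8")  # /update accepts a JSON array of docs
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document: the field names depend on your Solr schema.
doc = {
    "id": "file:/share/example.xlsm",
    "content": "text already extracted by the external Tika server",
}
req = build_update_request(doc)
print(req.get_method(), req.full_url)

# To send it to a running Solr instance:
#     urllib.request.urlopen(req)
```

Because the text arrives already extracted, this path does not go through 
/update/extract at all.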

Karl


On Thu, Oct 11, 2018 at 4:45 AM Bisonti Mario <mario.biso...@vimar.com> wrote:
Hello.
I would like to use a Tika server, started from the command line, with 
ManifoldCF, so that ManifoldCF processes documents with Tika through a 
transformation connector and then indexes them through the Solr output 
connector.

I started Tika server:
java -jar /opt/tika/tika-server-1.19.1.jar

Then I created a transformation connection with Tika server: localhost and 
Tika port: 998, and the connection works.
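
A quick way to check the standalone Tika server independently of ManifoldCF is 
to send it a file directly. The Python sketch below assumes the server listens 
on localhost at tika-server's default port 9998 (adjust the port if yours 
differs) and uses the server's PUT /tika endpoint to request plain text. The 
file path in the usage comment is hypothetical.

```python
import urllib.request


def build_tika_text_request(data, host="localhost", port=9998):
    """Build (but do not send) a PUT to tika-server's /tika endpoint.

    host and port are assumed defaults, not values confirmed by this thread.
    """
    return urllib.request.Request(
        f"http://{host}:{port}/tika",
        data=data,
        headers={
            "Accept": "text/plain",                      # ask for extracted plain text
            "Content-Type": "application/octet-stream",  # let Tika detect the real type
        },
        method="PUT",
    )


def extract_text(path, host="localhost", port=9998, timeout=30):
    """Send the file at `path` to a running Tika server and return its text."""
    with open(path, "rb") as f:
        req = build_tika_text_request(f.read(), host, port)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Usage, with the Tika server running (hypothetical path):
#     print(extract_text("/tmp/example.xlsm"))
```

If this returns text for a sample file, the server side is working and any 
remaining problem is more likely in the job or output connection settings.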

Then I created a job and, in the Connection tab, inserted the transformation 
I had just created before the Solr output.


Note that I don’t see the “Exception” and “Boilerplate” tabs.
Why is that?

Furthermore, when I start the job, I see that Solr hangs with this exception:
2018-10-11 10:03:47.268 WARN  (qtp1223240796-17) [   x:core_share] 
o.e.j.s.HttpChannel /solr/core_share/update/extract
java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
        at java.lang.Class.forName0(Native Method) ~[?:?]
        at java.lang.Class.forName(Class.java:374) ~[?:?]

In fact, I renamed the Tika .jar in the folder solr/contrib/extraction/lib to 
be sure that Solr doesn’t use Tika, because I would like ManifoldCF to use 
Tika, but it doesn’t work.

I suppose I have to configure Solr not to use Tika.

How do I do this?

I found 
https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/107708451/Data+Extraction+Tika+Embedded+in+Solr+Deactivation+Configuration
but I don’t have Datafari, so how can I deactivate Tika in a standard Solr 
configuration?

Thanks a lot

Mario
