Re: indexing rich data with solr 5.3.1 integrating in Ubuntu server

2016-01-26 Thread kostali hassan
The libraries are loaded, because Solr indexes .doc and .docx (MS Word)
files; it fails only for PDF files.
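One way to narrow this down, sketched here under assumptions, is to send a single PDF straight to the update/extract handler with extractOnly=true, which asks Solr Cell to return the Tika output instead of indexing it. The host, port, core name ("mycore") and file name are placeholders, not values from this thread.

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, core, extract_only=True):
    """Build the URL for Solr's extracting request handler."""
    params = {"wt": "json"}
    if extract_only:
        # extractOnly=true returns the extracted text instead of indexing it.
        params["extractOnly"] = "true"
    query = urllib.parse.urlencode(params)
    return "{}/{}/update/extract?{}".format(base_url.rstrip("/"), core, query)


def extract_pdf(base_url, core, pdf_path):
    """POST one PDF to Solr and return the raw response body as text."""
    with open(pdf_path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        build_extract_url(base_url, core),
        data=body,
        headers={"Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

Against a running server, a call such as extract_pdf("http://localhost:8983/solr", "mycore", "sample.pdf") would show whether any text comes back for the PDF; an empty or error response would point at the extraction step rather than at the index.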

2016-01-26 12:49 GMT+00:00 Emir Arnautovic :

> Hi,
> I would first check if external libraries are present and loaded. How do
> you start Solr? Try explicitly setting solr.install.dir or set absolute
> path to libs and see in logs if they are loaded.
>
> <lib dir="[...]" regex=".*\.jar" />
>
> Thanks,
> Emir
>
> On 25.01.2016 15:16, kostali hassan wrote:
>
>> http://stackoverflow.com/questions/34962280/solr-indexing-pdf-attachments-not-working-in-ubuntu
>>
>>
>> I have a problem integrating Solr on an Ubuntu server. Before using Solr
>> on the Ubuntu server, I tested it on my Mac, where it worked perfectly with
>> both the DIH request handler and update/extract: it indexed my PDF, DOC and
>> DOCX documents. After installing Solr on the Ubuntu server with the same
>> configuration files and libraries, I found that Solr does not index PDF
>> documents, and there are no errors or exceptions in the Solr log. I can
>> still search over the .doc and .docx documents.
>>
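>> Here are some parts of my solrconfig.xml:
>>
>> <lib dir="[...]" regex=".*\.jar" />
>> <lib dir="[...]" regex="solr-cell-\d.*\.jar" />
>>
>> <requestHandler name="[...]" startup="lazy"
>>     class="solr.extraction.ExtractingRequestHandler">
>>   <lst name="defaults">
>>     <str name="[...]">true</str>
>>     <str name="[...]">ignored_</str>
>>     <str name="[...]">_text_</str>
>>   </lst>
>> </requestHandler>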
>>
>> DIH config:
>>
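>> <requestHandler name="[...]"
>>     class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">tika.config.xml</str>
>>   </lst>
>> </requestHandler>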
>>
>> tika.config.xml
>>
>> <dataConfig>
>>   <dataSource [...] />
>>   <document>
>>     <entity [...] dataSource="null" rootEntity="false"
>>         baseDir="D:\Lucene\document"
>>         fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)"
>>         onError="skip"
>>         recursive="true">
>>       [...]
>>       <entity name="documentImport"
>>           dataSource="files"
>>           processor="TikaEntityProcessor"
>>           url="${files.fileAbsolutePath}"
>>           format="text">
>>         <field [...] name="title" meta="true"/>
>>         <field [...] name="content"/>
>>         <field [...] name="LastModifiedBy" meta="true"/>
>>       </entity>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
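The fileName attribute in the quoted data-import config is a regular expression, and its top-level alternation may not group the way it looks. The sketch below uses Python's re module rather than Java's regex engine, and it assumes the pattern is applied with "find anywhere in the name" semantics; both are assumptions, and GROUPED is a hypothetical alternative, not something proposed in the thread.

```python
import re

# fileName pattern exactly as quoted in the data-import config.
QUOTED = r".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)"
# Hypothetical alternative that groups the extensions and anchors the end.
GROUPED = r".*\.(doc|docx|pdf|ppt)$"


def selects(pattern, file_name, flags=0):
    """True if pattern is found anywhere in file_name (search semantics)."""
    return re.search(pattern, file_name, flags) is not None
```

Under these assumptions, selects(QUOTED, "report.pdf") is true, but so is selects(QUOTED, "pdf_notes.txt"), because only the first alternative carries the leading `.*\.`. With re.IGNORECASE, GROUPED selects "report.pdf" and "report.DOCX" but not "pdf_notes.txt". If the pattern had to match the whole name instead, QUOTED would reject "report.pdf", so the matching mode is worth confirming for the Solr version in use.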


Re: indexing rich data with solr 5.3.1 integrating in Ubuntu server

2016-01-26 Thread Emir Arnautovic

Hi,
I would first check if external libraries are present and loaded. How do 
you start Solr? Try explicitly setting solr.install.dir or set absolute 
path to libs and see in logs if they are loaded.

<lib dir="[...]" regex=".*\.jar" />

Thanks,
Emir

On 25.01.2016 15:16, kostali hassan wrote:


I have a problem integrating Solr on an Ubuntu server. Before using Solr
on the Ubuntu server, I tested it on my Mac, where it worked perfectly with
both the DIH request handler and update/extract: it indexed my PDF, DOC and
DOCX documents. After installing Solr on the Ubuntu server with the same
configuration files and libraries, I found that Solr does not index PDF
documents, and there are no errors or exceptions in the Solr log. I can
still search over the .doc and .docx documents.

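Here are some parts of my solrconfig.xml:

<lib dir="[...]" regex=".*\.jar" />
<lib dir="[...]" regex="solr-cell-\d.*\.jar" />

<requestHandler name="[...]" startup="lazy"
    class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="[...]">true</str>
    <str name="[...]">ignored_</str>
    <str name="[...]">_text_</str>
  </lst>
</requestHandler>

DIH config:

<requestHandler name="[...]"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">tika.config.xml</str>
  </lst>
</requestHandler>

tika.config.xml

<dataConfig>
  <dataSource [...] />
  <document>
    <entity [...] dataSource="null" rootEntity="false"
        baseDir="D:\Lucene\document"
        fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)"
        onError="skip"
        recursive="true">
      [...]
      <entity name="documentImport"
          dataSource="files"
          processor="TikaEntityProcessor"
          url="${files.fileAbsolutePath}"
          format="text">
        <field [...] name="title" meta="true"/>
        <field [...] name="content"/>
        <field [...] name="LastModifiedBy" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>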


--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
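The check suggested in the reply above, whether the external libraries are present, can be sketched as a small script. It lists the jars in a directory using the same kind of regex as the lib directive and reports which expected jar-name prefixes are absent. The directory to inspect (for example contrib/extraction/lib under the Solr install) and the prefixes "pdfbox", "fontbox" and "tika-parsers" are assumptions about a typical install, and treating the regex as a full-name match is also an assumption about how Solr applies it.

```python
import os
import re


def find_jars(lib_dir, pattern=r".*\.jar"):
    """Return the sorted file names in lib_dir whose full name matches pattern."""
    regex = re.compile(pattern)
    return sorted(
        name for name in os.listdir(lib_dir)
        if regex.fullmatch(name) and os.path.isfile(os.path.join(lib_dir, name))
    )


def missing_prefixes(jar_names, prefixes=("pdfbox", "fontbox", "tika-parsers")):
    """Return the prefixes for which no jar name starts with that prefix."""
    return [p for p in prefixes if not any(j.startswith(p) for j in jar_names)]
```

Running missing_prefixes(find_jars(path)) on both the Mac and the Ubuntu server and comparing the results would show whether a PDF-related jar was lost when the files were copied. It only proves presence on disk; whether Solr actually loaded the jars still has to be read from the startup log.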



indexing rich data with solr 5.3.1 integrating in Ubuntu server

2016-01-25 Thread kostali hassan

I have a problem integrating Solr on an Ubuntu server. Before using Solr
on the Ubuntu server, I tested it on my Mac, where it worked perfectly with
both the DIH request handler and update/extract: it indexed my PDF, DOC and
DOCX documents. After installing Solr on the Ubuntu server with the same
configuration files and libraries, I found that Solr does not index PDF
documents, and there are no errors or exceptions in the Solr log. I can
still search over the .doc and .docx documents.

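Here are some parts of my solrconfig.xml:

<lib dir="[...]" regex=".*\.jar" />
<lib dir="[...]" regex="solr-cell-\d.*\.jar" />

<requestHandler name="[...]" startup="lazy"
    class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="[...]">true</str>
    <str name="[...]">ignored_</str>
    <str name="[...]">_text_</str>
  </lst>
</requestHandler>

DIH config:

<requestHandler name="[...]"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">tika.config.xml</str>
  </lst>
</requestHandler>

tika.config.xml

<dataConfig>
  <dataSource [...] />
  <document>
    <entity [...] dataSource="null" rootEntity="false"
        baseDir="D:\Lucene\document"
        fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)"
        onError="skip"
        recursive="true">
      [...]
      <entity name="documentImport"
          dataSource="files"
          processor="TikaEntityProcessor"
          url="${files.fileAbsolutePath}"
          format="text">
        <field [...] name="title" meta="true"/>
        <field [...] name="content"/>
        <field [...] name="LastModifiedBy" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>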
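The post above reports no errors or exceptions in the Solr log. Since the inner import sets onError="skip", a failed extraction may surface only as a warning, if at all, so it can help to scan the log for anything that mentions the PDF stack. The sketch below is a plain keyword filter; the keyword list and the log location (commonly server/logs/solr.log in Solr 5.x) are assumptions, not details from this thread.

```python
KEYWORDS = ("pdfbox", "tika", "classnotfoundexception", "noclassdeffounderror")


def suspicious_lines(lines, keywords=KEYWORDS):
    """Yield (line_number, line) for lines mentioning any keyword, ignoring case."""
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            yield number, line.rstrip("\n")


def scan_log(path):
    """Scan a log file and return the matching lines as a list."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(suspicious_lines(handle))
```

An empty result from scan_log does not prove that extraction succeeded; it only confirms that nothing in the log names these components.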

indexing rich data with solr 5.3.1 integrating in Ubuntu server

2016-01-23 Thread kostali hassan

I have a problem integrating Solr on an Ubuntu server. Before using Solr
on the Ubuntu server, I tested it on my Mac, where it worked perfectly with
both the DIH request handler and update/extract: it indexed my PDF, DOC and
DOCX documents. After installing Solr on the Ubuntu server with the same
configuration files and libraries, I found that Solr does not index PDF
documents, and there are no errors or exceptions in the Solr log. I can
still search over the .doc and .docx documents.

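Here are some parts of my solrconfig.xml:

<lib dir="[...]" regex=".*\.jar" />
<lib dir="[...]" regex="solr-cell-\d.*\.jar" />

<requestHandler name="[...]" startup="lazy"
    class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="[...]">true</str>
    <str name="[...]">ignored_</str>
    <str name="[...]">_text_</str>
  </lst>
</requestHandler>

DIH config:

<requestHandler name="[...]"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">tika.config.xml</str>
  </lst>
</requestHandler>

tika.config.xml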