Re: Help indexing PDF files

2010-05-09 Thread Rivulet Enterprise Search

Use Rivulet Enterprise Search: http://sourceforge.net/projects/rivu/
It's very easy.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-indexing-PDF-files-tp783677p786781.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Help indexing PDF files

2010-05-07 Thread Markus Jelsma
Hi,

The wiki page [1] on this subject will get you started.

[1]: http://wiki.apache.org/solr/ExtractingRequestHandler

Cheers
 
-Original message-
From: Leonardo Azize Martins laz...@gmail.com
Sent: Fri 07-05-2010 15:37
To: solr-user@lucene.apache.org; 
Subject: Help indexing PDF files

Hi,

I am new to Solr.
I would like to index some PDF files.

How can I do this using the example schema from version 1.4.0?

Regards,
Leo


Re: Help indexing PDF files

2010-05-07 Thread Leonardo Azize Martins
I am using this page, but in my downloaded version there is no site
directory.

Thanks




RE: Re: Help indexing PDF files

2010-05-07 Thread Markus Jelsma
You don't need it; you can use any PDF file.
 



RE: Help indexing PDF files

2010-05-07 Thread caman

Take a look at the Tika library.

 



Re: Help indexing PDF files

2010-05-07 Thread Leonardo Azize Martins
I have Solr running on machine A.

On machine B I ran the command below:
curl "http://10.33.19.201:8983/solr/update/extract?extractOnly=true" --data-binary @VPSX_V1_R10.pdf

and I get the response:
java.lang.IllegalStateException: Form too large

What am I doing wrong?
Is this the right (or best) way to send PDF files to be indexed?

Regards,
Leo






Re: Help indexing PDF files

2010-05-07 Thread Leonardo Azize Martins
Hi,

Sorry, I am a newbie.

It works using these two commands:

 curl "http://10.33.19.201:8983/solr/update/extract?stream.file=C:\\temp\\VPSX_V1_R10.pdf&stream.contentType=application/pdf&literal.id=M4968\\C$\\temp\\VPSX_V1_R10.pdf&commit=true"

 curl 'http://10.33.19.201:8983/solr/update/extract?literal.id=doc1&commit=true' -F te...@vpsx_v1_r10.pdf

Thanks for all the help.
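
For reference, the upload-style request above can be wrapped in a small shell script. This is only a sketch, not something posted in the thread: the function names (extract_url, index_upload, index_raw), the SOLR_URL variable, and the form field name "myfile" are illustrative assumptions, and the document id is assumed to contain only URL-safe characters. The index_raw variant sends the file as the raw request body with an explicit Content-Type header. A likely cause of the earlier "Form too large" error, though nobody in the thread confirmed it, is that curl --data-binary without that header labels the body as application/x-www-form-urlencoded, which the servlet container then tries to parse as a form.

```shell
#!/bin/sh
# Base URL of the Solr instance (host and port taken from the thread;
# set SOLR_URL in the environment to point at a different server).
SOLR_URL="${SOLR_URL:-http://10.33.19.201:8983/solr}"

# Build the extract-handler URL for a given document id.
# Assumption: the id contains only URL-safe characters (no encoding is done).
extract_url() {
  printf '%s/update/extract?literal.id=%s&commit=true' "$SOLR_URL" "$1"
}

# Upload a local file as multipart form data.
# "myfile" is an arbitrary, illustrative form field name.
index_upload() {
  curl "$(extract_url "$1")" -F "myfile=@$2"
}

# Send the file as the raw request body with an explicit content type,
# so that it is not labeled as URL-encoded form data.
index_raw() {
  curl "$(extract_url "$1")" -H 'Content-Type: application/pdf' --data-binary "@$2"
}

# Example (not run automatically):
#   index_upload doc1 VPSX_V1_R10.pdf
```

Quoting the URL matters here: an unquoted & would be interpreted by the shell as "run in background" and cut the query string short.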



Going forward, what is the best way to index a Windows share?
Should I use stream.file or not?
Should I reindex all files every time, or check whether a file has changed
and reindex only if so?

Regards,
Leo
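
On the question of reindexing everything versus only changed files, one possible approach is to keep a timestamp file and submit just the PDFs modified since the last run. This is a sketch only; nobody in the thread proposed or confirmed it. The function names, the stamp-file convention, the form field name "myfile", and the use of the file path as literal.id are all assumptions made for illustration. The share is assumed to be mounted at a local path, and file paths are assumed not to contain newlines or characters that need URL encoding.

```shell
#!/bin/sh
# Base URL of the Solr instance (host and port taken from the thread).
SOLR_URL="${SOLR_URL:-http://10.33.19.201:8983/solr}"

# List PDF files under directory $1 that are newer than stamp file $2.
# If the stamp file does not exist yet, list every PDF (first run).
changed_pdfs() {
  if [ -f "$2" ]; then
    find "$1" -type f -name '*.pdf' -newer "$2"
  else
    find "$1" -type f -name '*.pdf'
  fi
}

# Post each changed PDF to the extract handler, then update the stamp.
# Hypothetical convention: the file path doubles as the document id.
index_changed() {
  dir=$1
  stamp=$2
  # Take the new timestamp before scanning, so that files modified while
  # this run is in progress are picked up again on the next run.
  new_stamp="$stamp.new"
  touch "$new_stamp"
  changed_pdfs "$dir" "$stamp" | while IFS= read -r f; do
    curl "$SOLR_URL/update/extract?literal.id=$f&commit=true" -F "myfile=@$f"
  done
  mv "$new_stamp" "$stamp"
}

# Example (not run automatically):
#   index_changed /mnt/share /var/tmp/solr-pdf.stamp
```

Comparing modification times is cheap but will miss files whose timestamps are preserved when copied onto the share; a content hash per file would be a stricter (and slower) alternative.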


