Re: Help indexing PDF files
Use Rivulet Enterprise Search: http://sourceforge.net/projects/rivu/ It's very easy. -- View this message in context: http://lucene.472066.n3.nabble.com/Help-indexing-PDF-files-tp783677p786781.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Help indexing PDF files
Hi, The wiki page [1] on this subject will get you started. [1]: http://wiki.apache.org/solr/ExtractingRequestHandler Cheers -Original message- From: Leonardo Azize Martins laz...@gmail.com Sent: Fri 07-05-2010 15:37 To: solr-user@lucene.apache.org; Subject: Help indexing PDF files Hi, I am new to Solr. I would like to index some PDF files. How can I do this using the example schema from version 1.4.0? Regards, Leo
Re: Help indexing PDF files
I am using this page, but in my downloaded version there is no site directory. Thanks
RE: Re: Help indexing PDF files
You don't need it, you can use any PDF file.
RE: Help indexing PDF files
Take a look at the Tika library.
Re: Help indexing PDF files
I have Solr on machine A. On machine B I ran the command below:
curl "http://10.33.19.201:8983/solr/update/extract?extractOnly=true" --data-binary @VPSX_V1_R10.pdf
and I got the response:
java.lang.IllegalStateException: Form too large
What am I doing wrong? Is this the right or best way to send PDF files to be indexed? Regards, Leo
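A possible explanation for the error above, offered as an assumption and not something confirmed in this thread: curl's --data-binary sends the body as application/x-www-form-urlencoded unless a Content-Type header says otherwise, so the servlet container may try to parse the whole PDF as form data and reject it as too large. The sketch below builds the same request in Python with an explicit application/pdf content type. The helper name build_extract_request is hypothetical; the host and path are the ones from the command above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(base_url, pdf_bytes, params):
    """Build (but do not send) a POST of raw PDF bytes to /update/extract.

    Hypothetical helper for illustration; it is not part of Solr or curl.
    """
    url = base_url + "?" + urlencode(params)
    # An explicit content type keeps the body from being treated as form data.
    return Request(url, data=pdf_bytes, headers={"Content-Type": "application/pdf"})


if __name__ == "__main__":
    req = build_extract_request(
        "http://10.33.19.201:8983/solr/update/extract",
        b"%PDF-1.4 placeholder bytes, not a real document",
        {"extractOnly": "true"},
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Content-type"))
```

To actually send it, pass the request to urllib.request.urlopen(req). With curl, the equivalent would be adding -H 'Content-Type: application/pdf' to the original command.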
Re: Help indexing PDF files
Hi, Sorry, I am a newbie. It works using these two commands:
curl "http://10.33.19.201:8983/solr/update/extract?stream.file=C:\\temp\\VPSX_V1_R10.pdf&stream.contentType=application/pdf&literal.id=M4968\\C$\\temp\\VPSX_V1_R10.pdf&commit=true"
curl 'http://10.33.19.201:8983/solr/update/extract?literal.id=doc1&commit=true' -F te...@vpsx_v1_r10.pdf
Thanks for all the help. Going forward, what is the best way to index a Windows share? Should I use stream.file or not? Should I index all files every time, or check whether a file has changed and, if so, reindex it? Regards, Leo
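One way to approach the "only reindex what changed" question above, as a rough sketch and not an answer given in this thread: remember each file's modification time and build a stream.file URL only for files whose time differs from the recorded one. The function names changed_files and extract_url, and the in-memory dict used for bookkeeping, are assumptions made for illustration. Note that stream.file is read by the Solr server itself, so the path must be visible from that machine.

```python
import os
import tempfile
from urllib.parse import urlencode


def changed_files(root, last_seen):
    """Yield (path, mtime) for files under root that are new or modified.

    last_seen maps path -> mtime from the previous run (hypothetical
    bookkeeping; persist it however suits you).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if last_seen.get(path) != mtime:
                yield path, mtime


def extract_url(base_url, path):
    """Build a stream.file URL like the first command above, with the
    parameters joined by '&' and percent-encoded."""
    params = {
        "stream.file": path,
        "stream.contentType": "application/pdf",
        "literal.id": path,  # using the path as the document id is an assumption
        "commit": "true",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "a.pdf"), "wb").close()
        seen = {}
        for path, mtime in changed_files(root, seen):
            print(extract_url("http://10.33.19.201:8983/solr/update/extract", path))
            seen[path] = mtime  # record only once the request would have succeeded
        # Second pass: no file has changed, so nothing is yielded.
        print(list(changed_files(root, seen)))
```

Comparing modification times is cheap but can miss edits that preserve the timestamp; hashing file contents would be a stricter alternative at higher cost.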