You don't seem to be too creative with your doc_id values (pdf_1, txt_1, ...), so they could just as well be generated automatically; perhaps you can use Solr 4's post.jar recursive option: http://wiki.apache.org/solr/ExtractingRequestHandler#SimplePostTool_.28post.jar.29
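Something along these lines should do it (untested sketch - the exact property names are on the wiki page above, and /opt/solr/documents is just the path from your own examples):

    # post.jar ships with the Solr 4 distribution in example/exampledocs
    cd /opt/solr/example/exampledocs

    # -Dauto=yes guesses content types and routes rich documents to /update/extract,
    # -Drecursive=yes walks sub-directories, -Dfiletypes limits which files are picked up
    java -Dauto=yes -Drecursive=yes \
         -Dfiletypes=pdf,docx,txt \
         -Durl=http://localhost:8983/solr/update \
         -jar post.jar /opt/solr/documents

One caveat: in auto mode post.jar, as far as I remember, fills literal.id with the file path rather than your pdf_1/txt_1 style values, so if your uniqueKey field is doc_id you would either adjust the schema or accept path-based ids.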
Otherwise, you need to correlate each ID with its source file somehow, so you would probably keep a file with ID and location fields and then use DataImportHandler with nested entities to read it; a rough sketch of such a data-config.xml is at the very bottom of this message, below the quoted text.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)


On Wed, Jul 17, 2013 at 12:15 PM, sodoo <first...@yahoo.com> wrote:
> Hi guys.
>
> I need to index a lot of PDF, DOC and TXT files.
> Right now I index them manually with the commands below.
>
> ######### PDF INDEX
> curl "http://localhost:8983/solr/update/extract?stream.file=/opt/solr/documents/test.pdf&literal.doc_id=pdf_1&commit=true"
>
> ######### TXT INDEX
> curl "http://localhost:8983/solr/update/extract?stream.file=/opt/solr/documents/test1.txt&literal.doc_id=txt_1&commit=true"
>
> ######### WORD DOC INDEX
> curl "http://localhost:8983/solr/update/extract?stream.file=/opt/solr/documents/test2.docx&literal.doc_id=doc_1&commit=true"
>
> But this is a bad solution, because I already have almost 100 PDF, 200 DOCX and 50 TXT files, and more documents are added every day.
>
> I need a better solution.
>
> Please assist and advise me on this.
>
> Thanks.
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-index-lot-of-pdf-doc-txt-tp4078651.html
> Sent from the Solr - User mailing list archive at Nabble.com.
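P.S. Here is roughly what the DataImportHandler setup could look like - a sketch only, not tested. It assumes a mapping file /opt/solr/documents/files.csv with one "id,path" pair per line (e.g. pdf_1,/opt/solr/documents/test.pdf), a content field in your schema for the extracted text, and the DIH extras jar (solr-dataimporthandler-extras) plus the Tika jars from contrib/extraction on the classpath:

    <!-- data-config.xml (sketch) -->
    <dataConfig>
      <dataSource name="map" type="FileDataSource" encoding="UTF-8"/>
      <dataSource name="bin" type="BinFileDataSource"/>
      <document>
        <!-- outer entity: one Solr document per line of the mapping file -->
        <entity name="row" dataSource="map" processor="LineEntityProcessor"
                url="/opt/solr/documents/files.csv"
                transformer="RegexTransformer">
          <!-- split "id,path" into the doc_id and location columns -->
          <field column="doc_id"   regex="^(.*?),.*$" sourceColName="rawLine"/>
          <field column="location" regex="^.*?,(.*)$" sourceColName="rawLine"/>
          <!-- nested entity: run Tika over the file the outer row points to -->
          <entity name="doc" dataSource="bin" processor="TikaEntityProcessor"
                  url="${row.location}" format="text">
            <field column="text" name="content"/>
          </entity>
        </entity>
      </document>
    </dataConfig>

Register a /dataimport handler in solrconfig.xml pointing at that file and kick it off with:

    curl "http://localhost:8983/solr/dataimport?command=full-import"

New documents then only need a new line in files.csv and a re-run of the full-import (add clean=false if you don't want to wipe what is already indexed).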