Re: Solr Indexing MAX FILE LIMIT
Maybe you can start by testing this with split -l and xargs :-) These are standard Unix toolkit approaches, and since you already use one of them (curl), you may be happy to use others too.

Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch

On Wed, Nov 14, 2012 at 11:33 PM, mitra mitra.re...@ornext.com wrote:
> Thank you Erick, I didn't know that we could write a Java class for it.
> Can you provide me with some info on how to do that?
> Thanks
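For readers on Windows, where split and xargs may not be at hand, the same line-based splitting can be sketched in Python. This is only an illustration, not something posted in the thread: the function name split_csv, the chunk file naming, and the choice to repeat the header line in every chunk are assumptions. Repeating the header matters if the chunks are posted to the CSV handler one by one, since a plain split -l leaves the header only in the first piece.

```python
import os


def split_csv(path, lines_per_chunk, out_dir):
    """Split a CSV file into numbered chunk files, repeating the header in each.

    Like `split -l`, this cuts on physical lines, so it assumes no quoted
    field contains an embedded newline.
    """
    chunk_paths = []
    out = None
    # newline="" keeps the original line endings untouched on read and write.
    with open(path, "r", encoding="utf-8", newline="") as src:
        header = src.readline()
        for i, line in enumerate(src):
            if i % lines_per_chunk == 0:
                # Start a new chunk file (hypothetical naming scheme).
                if out is not None:
                    out.close()
                name = "chunk_%05d.csv" % len(chunk_paths)
                chunk_path = os.path.join(out_dir, name)
                chunk_paths.append(chunk_path)
                out = open(chunk_path, "w", encoding="utf-8", newline="")
                out.write(header)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

Each returned path could then be passed to the curl command from the original post in place of D:\eighth.csv, one request per chunk.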
Re: Solr Indexing MAX FILE LIMIT
Thank you, Erick. I didn't know that we could write a Java class for it. Can you provide me with some info on how to do that?

Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Indexing-MAX-FILE-LIMIT-tp4019952p4020407.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Solr Indexing MAX FILE LIMIT
Hi - instead of trying to make the system ingest such large files, perhaps you can split the files into many small pieces.

-----Original message-----
From: mitra mitra.re...@ornext.com
Sent: Tue 13-Nov-2012 09:05
To: solr-user@lucene.apache.org
Subject: Solr Indexing MAX FILE LIMIT

Hello guys,

I'm using Apache Solr 3.6.1 on Tomcat 7 to index CSV files using curl on a Windows machine.

** My question is: what is the maximum CSV file size when doing an HTTP POST, or when using the following curl command?

    curl http://localhost:8080/solr/update/csv -F stream.file=D:\eighth.csv -F commit=true -F optimize=true -F encapsulate= -F keepEmpty=true

** My requirement is quite large, because we have to index CSV files ranging from 8 to 10 GB.

** What would be the optimum settings for index parameters like commit, for better performance on a machine with 8 GB of RAM?

Please guide me on it.

Thanks in advance
RE: Solr Indexing MAX FILE LIMIT
Thank you.

*** I understand that the default size limit for an HTTP POST in Tomcat is 2 MB. Can we change that somehow, so that I don't need to split the 10 GB CSV into 2 MB chunks?

    curl http://localhost:8080/solr/update/csv -F stream.file=D:\eighth.csv -F commit=true -F optimize=true -F encapsulate= -F keepEmpty=true

*** As I mentioned, I'm using the above command to post, rather than the format below:

    curl http://localhost:8080/solr/update/csv --data-binary @eighth.csv -H 'Content-type:text/plain; charset=utf-8'

*** My question: is the limit still applicable even when not using the above --data-binary format?
Re: Solr Indexing MAX FILE LIMIT
Have you considered writing a small SolrJ (or other client) program that processes the rows in your huge file and sends them to Solr in sensible chunks? That would give you much finer control over how the file is processed, how many docs are sent to Solr at a time, and what to do with errors. You could even run N simultaneous programs to increase throughput...

FWIW,
Erick

On Tue, Nov 13, 2012 at 3:42 AM, mitra mitra.re...@ornext.com wrote:
> I understand that the default size limit for an HTTP POST in Tomcat is 2 MB.
> Can we change that somehow, so that I don't need to split the 10 GB CSV
> into 2 MB chunks? [...]
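The "small client program" idea can be sketched without SolrJ as well. The Python below is a minimal illustration under stated assumptions, not Erick's code: it streams the file, groups rows into batches, and posts each batch as the request body with the same URL and Content-type used in the --data-binary curl command quoted in this thread. The names batches, post_csv and index_csv, the default batch size, and the "record the failure and keep going" error policy are all hypothetical choices.

```python
import itertools
import urllib.request

# URL taken from the curl commands in the thread; adjust for your setup.
SOLR_CSV_URL = "http://localhost:8080/solr/update/csv"


def batches(lines, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(lines)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def post_csv(body, url=SOLR_CSV_URL):
    """POST one CSV chunk (header plus rows) as the raw request body."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def index_csv(path, batch_size=10000, send=post_csv):
    """Send a CSV file in batches; return (rows_sent, failed_batches).

    Assumes one record per physical line (no embedded newlines in fields).
    """
    sent, failed = 0, []
    with open(path, "r", encoding="utf-8", newline="") as src:
        header = src.readline()
        for n, rows in enumerate(batches(src, batch_size)):
            try:
                # Every request carries the header so it can be parsed alone.
                send(header + "".join(rows))
                sent += len(rows)
            except Exception as exc:
                # Assumed policy: remember the failed batch and continue.
                failed.append((n, exc))
    return sent, failed
```

A call such as index_csv("eighth.csv", batch_size=5000) would return the number of rows sent and a list of (batch number, exception) pairs to retry or inspect. The sketch sends no commit parameter; a commit would be issued separately, for example once after the last batch.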