If you have an application, why are you sending XML documents to Solr? Can't you convert them to another format and send them in batches? Or, even if it stays XML, just bite the bullet and send it in 100-document batches, or in even smaller batches combined with the auto-commit settings I mentioned earlier.
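For example, a rough SolrJ sketch of that kind of batching from the application side (just a sketch, assuming SolrJ 4.x with HttpSolrServer; the URL, field names, and the loop over your dictionary entries are placeholders, adapt them to your schema and parsing code):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchIndexer {
        public static void main(String[] args) throws Exception {
            // Assumes a Solr 4.x core at this URL; change to match your setup.
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
            // Hypothetical loop; replace with iteration over your parsed dictionary entries.
            for (int i = 0; i < 100000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "entry-" + i);               // hypothetical field
                doc.addField("text", "dictionary entry " + i);  // hypothetical field
                batch.add(doc);

                if (batch.size() == 100) {   // send in 100-document batches
                    server.add(batch);        // each batch is one small HTTP request
                    batch.clear();            // keep the client-side buffer small
                }
            }
            if (!batch.isEmpty()) {
                server.add(batch);            // flush the last partial batch
            }
            server.commit();                  // one commit at the end, or rely on autoCommit
            server.shutdown();
        }
    }

That way no single request carries the whole 20MB file, and a hard autoCommit (or the explicit commit at the end) flushes the in-memory structures to disk as you go.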
Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, Apr 1, 2014 at 7:30 AM, Floyd Wu <floyd...@gmail.com> wrote:
> Hi Upayavira,
> Users don't hit Solr directly; they search documents through my application.
> The application is an entrance for users to upload documents, which are then
> indexed by Solr.
> The situation is that they upload plain text, something like a dictionary. You
> know, that dictionary is something big.
> I'm trying to figure out some good technique before I can split these XML
> files into small ones and stream them to Solr.
>
> Floyd
>
>
>
> 2014-04-01 2:55 GMT+08:00 Upayavira <u...@odoko.co.uk>:
>
>> Tell the user they can't have it!
>>
>> Or, write a small app that reads in their XML in one go and pushes it
>> to Solr in parts. Generally, I'd say letting a user hit Solr directly is
>> a bad thing - especially a user who doesn't know the details of how Solr
>> works.
>>
>> Upayavira
>>
>> On Mon, Mar 31, 2014, at 07:17 AM, Floyd Wu wrote:
>> > Hi Alex,
>> >
>> > Thanks for your response. Personally I don't want to feed these big XML
>> > files to Solr, but the users want it.
>> > I'll try your suggestions later.
>> >
>> > Many thanks.
>> >
>> > Floyd
>> >
>> >
>> >
>> > 2014-03-31 13:44 GMT+08:00 Alexandre Rafalovitch <arafa...@gmail.com>:
>> >
>> > > Without digging too deep into why exactly this is happening, here are
>> > > the general options:
>> > >
>> > > 0. Are you actually committing? Check the messages in the logs and see
>> > > if the records show up when you expect them to.
>> > > 1. Are you actually trying to feed a 20MB file to Solr? Maybe it's the
>> > > HTTP buffer that's blowing up? Try using stream.file instead (note the
>> > > security warning though): http://wiki.apache.org/solr/ContentStream
>> > > 2. Split the file into smaller ones and commit each one separately.
>> > > 3. Set a hard auto-commit in solrconfig.xml based on the number of
>> > > documents, to flush in-memory structures to disk.
>> > > 4. Switch to using DataImportHandler to pull from XML instead of pushing.
>> > > 5. Increase the amount of memory given to Solr (-X command line flags).
>> > >
>> > > Regards,
>> > >    Alex.
>> > >
>> > > Personal website: http://www.outerthoughts.com/
>> > > Current project: http://www.solr-start.com/ - Accelerating your Solr
>> > > proficiency
>> > >
>> > > On Mon, Mar 31, 2014 at 12:00 PM, Floyd Wu <floyd...@gmail.com> wrote:
>> > > > I have many plain-text XML files that I transform into Solr's XML
>> > > > format.
>> > > > But every time I send them to Solr, I hit an OOM exception.
>> > > > How do I configure Solr to "eat" these big XML files?
>> > > > Please point me to a way. Thanks
>> > > >
>> > > > floyd
>> > >
>>