Thanks for your kind reply. I tried both using SqlEntityProcessor and setting batchSize to -1, but didn't see any improvement. It would be helpful if I could see the DataImportHandler's log.
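For reference, batchSize is an attribute of the dataSource element in the DIH configuration. A minimal sketch of that setting follows; the driver class, URL, user, and password are placeholders for illustration, not my actual values:

```xml
<!-- Sketch only: driver, url, user and password are placeholder values -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/mydb"
            user="db_user"
            password="db_password"
            batchSize="-1"/>
```

As I understand it, batchSize="-1" makes JdbcDataSource pass Integer.MIN_VALUE as the JDBC fetch size, which the MySQL driver treats as a request to stream rows one at a time instead of buffering the whole result set.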
On Saturday, November 7, 2015, Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> LoL. Of course I meant SolrJ. I had to misspell the most important
> word of the hundreds I wrote in this thread :-)
>
> Thank you Erick for the correction.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 7 November 2015 at 19:18, Erick Erickson <erickerick...@gmail.com> wrote:
> > Alexandre, did you mean SolrJ?
> >
> > Here's a way to get started
> > https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
> >
> > Best,
> > Erick
> >
> > On Sat, Nov 7, 2015 at 2:22 PM, Alexandre Rafalovitch
> > <arafa...@gmail.com> wrote:
> >> Have you thought of just using Solr. Might be faster than troubleshooting
> >> DIH for complex scenarios.
> >> On 7 Nov 2015 3:39 pm, "Yangrui Guo" <guoyang...@gmail.com> wrote:
> >>
> >>> I found multiple strange things besides the slowness. I performed count(*)
> >>> in MySQL but only one-fifth of the records were imported. Also sometimes
> >>> dataimporthandler either doesn't import at all or only imports a portion
> >>> of the table. How can I debug the importer?
> >>>
> >>> On Saturday, November 7, 2015, Yangrui Guo <guoyang...@gmail.com> wrote:
> >>>
> >>> > I just realized that not everything was ok. Three child entities were not
> >>> > imported. Had set batchSize to -1 but again solr was stuck :(
> >>> >
> >>> > On Fri, Nov 6, 2015 at 3:11 PM, Yangrui Guo <guoyang...@gmail.com> wrote:
> >>> >
> >>> >> Thanks for the reply. I just removed CacheKeyLookUp and CachedKey and
> >>> >> used WHERE clause instead. Everything works fine now.
> >>> >>
> >>> >> Yangrui
> >>> >>
> >>> >>
> >>> >> On Friday, November 6, 2015, Shawn Heisey <apa...@elyograg.org> wrote:
> >>> >>
> >>> >>> On 11/6/2015 10:32 AM, Yangrui Guo wrote:
> >>> >>> > <entity name="movie_actress" transformer="RegexTransformer"
> >>> >>>
> >>> >>> There's a good chance that JDBC is trying to read the entire result set
> >>> >>> (all three million rows) into memory before sending any of that info to
> >>> >>> Solr.
> >>> >>>
> >>> >>> Set the batchSize to -1 for MySQL so that it will stream results to Solr
> >>> >>> as soon as they are available, and not wait for all of them. Here's
> >>> >>> more info on the situation, which frequently causes OutOfMemory problems
> >>> >>> for users:
> >>> >>>
> >>> >>> http://wiki.apache.org/solr/DataImportHandlerFaq?highlight=%28mysql%29|%28batchsize%29#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
> >>> >>>
> >>> >>> Thanks,
> >>> >>> Shawn