I'm not sure why you have this problem.  I use DIH 1.4.1 in production with 
JBoss 5 (based on Tomcat), seldom restart the JVMs, and haven't experienced 
anything like this.  As for the ThreadLocal warnings, I doubt these are 
causing a severe memory leak:  in 1.4.1, the DataImporter class has 2 
ThreadLocals:  one is an AtomicLong that holds a document count, and the other 
is a SimpleDateFormat.  Even if these "leaked", the application would have to 
re-create a very large number of them before you would notice.
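
To illustrate what the Tomcat warning refers to, here is a minimal sketch of a 
per-thread counter held in a ThreadLocal.  The class and method names 
(ThreadLocalSketch, countDocs) are hypothetical and this is not the actual 
DataImporter source; it only shows the general pattern, including the 
remove() call whose absence Tomcat complains about on shutdown.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThreadLocalSketch {
    // Hypothetical per-thread document counter (an assumption for illustration,
    // not the real DataImporter field). The anonymous subclass is why such keys
    // show up in logs with names like Outer$1 or Outer$2.
    private static final ThreadLocal<AtomicLong> DOC_COUNT = new ThreadLocal<AtomicLong>() {
        @Override
        protected AtomicLong initialValue() {
            return new AtomicLong();
        }
    };

    // Counts n documents on the calling thread and returns the total.
    static long countDocs(int n) {
        try {
            for (int i = 0; i < n; i++) {
                DOC_COUNT.get().incrementAndGet();
            }
            return DOC_COUNT.get().get();
        } finally {
            // Clearing the value lets a pooled container thread drop its
            // reference; skipping this is what triggers the warning.
            DOC_COUNT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(countDocs(3)); // prints 3
        System.out.println(countDocs(2)); // prints 2, since the value was removed
    }
}
```

Each leaked value here would be a single small object per thread, which is why 
a leak of this kind is unlikely to be noticeable on its own.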

If this happens again, take a thread dump and see where it is looping.  Then 
compare that against your DIH configuration and see if you (or someone on this 
list) can figure out why DIH might go into a loop.  Perhaps this is a bug that 
still needs to be fixed.
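
For a running Tomcat, the usual way to get a thread dump is the JDK's jstack 
tool (jstack <pid>) or sending the process a SIGQUIT (kill -3 <pid>).  As a 
rough sketch of what such a dump contains, the following prints the stack of 
every live thread in the JVM it runs in.  The class name ThreadDumpSketch is 
hypothetical, and this only sees its own JVM, so it is an illustration rather 
than a replacement for jstack.

```java
import java.util.Map;

public class ThreadDumpSketch {
    // Builds a plain-text dump of all live threads in the current JVM.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            // Frames are listed from the most recent call downwards.
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two or three dumps a few seconds apart and looking for a thread that 
stays in the same DIH or JDBC frames is usually enough to locate a loop.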

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Weigel, Christian [mailto:christian.wei...@sage.com] 
Sent: Tuesday, January 22, 2013 5:07 AM
To: solr-user@lucene.apache.org
Subject: Problems with DataImportHandler in SOLR 1.4.0

Hi,

We are experiencing problems with the DataImportHandler in Solr 1.4.0. We are 
reading datasets from MySQL and this worked fine before, but it seems the 
DataImportHandler ran into an infinite loop and is requesting millions of data 
rows from MySQL. This resulted in a high CPU load on our DB server.

Calling the DataImportHandler via curl with "abort" and then running a 
"full-import" solved the issue. Before I tried that, however, DIH seemed to 
get stuck in this infinite loop again every time Tomcat/Solr was restarted.
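
As a rough sketch, the same two requests can be sent from Java. The handler 
URL (http://localhost:8080/solr/dataimport) and the class name 
DihCommandSketch are assumptions; the actual host, port and handler path 
depend on the Tomcat setup and on how the handler is registered in 
solrconfig.xml.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class DihCommandSketch {
    // Builds the request URL for a DIH command such as "abort" or "full-import".
    static String commandUrl(String handlerUrl, String command) {
        return handlerUrl + "?command=" + command;
    }

    // Sends the command with a plain GET and returns the HTTP status code.
    static int send(String handlerUrl, String command) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(commandUrl(handlerUrl, command)).openConnection();
        conn.setRequestMethod("GET");
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real handler URL as the first argument.
        String handler = args.length > 0 ? args[0] : "http://localhost:8080/solr/dataimport";
        System.out.println("abort -> HTTP " + send(handler, "abort"));
        System.out.println("full-import -> HTTP " + send(handler, "full-import"));
    }
}
```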

I think these log messages are related to the problem:

SEVERE: The web application [/solr] created a ThreadLocal with key of type 
[org.apache.solr.handler.dataimport.DataImporter$2] (value 
[org.apache.solr.handler.dataimport.DataImporter$2@54b435c8]) and a value of 
type [java.util.concurrent.atomic.AtomicLong] (value [580601]) but failed to 
remove it when the web application was stopped. This is very likely to create a 
memory leak.

org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/solr] appears to have started a thread named 
[MySQL Statement Cancellation Timer] but has failed to stop it. This is very 
likely to create a memory leak.

Has anybody encountered similar behaviour before? Is there a way to prevent 
DIH from running into these issues?

Our Current Setup:

Solr 1.4.0
apache-solr-dataimporthandler-1.4.1-dev
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Apache Tomcat/6.0.35
MySQL 5.1.50

Kind Regards,

    Christian
