Gunnlaugur Thor Briem created SOLR-6644:
-------------------------------------------

             Summary: DataImportHandler holds on to each DB connection until the end
                 Key: SOLR-6644
                 URL: https://issues.apache.org/jira/browse/SOLR-6644
             Project: Solr
          Issue Type: Bug
          Components: contrib - DataImportHandler
    Affects Versions: 4.10.1, 4.7
            Reporter: Gunnlaugur Thor Briem


DataImportHandler with a JDBC data source opens one DB connection per entity, 
and then holds on to that DB connection, with an open transaction, after it has 
finished processing that entity, right until the whole DataImportHandler 
operation is finished.

So this can mean dozens of DB connections tied up for hours, unnecessarily, 
with each connection staying in "idle in transaction" state and holding (in 
PostgreSQL) an AccessShareLock on each relation it has looked at. Not ideal for 
production operations, of course.

Here are the connections from Solr to the DB when a large import has been 
running for a while:

{code}
 backend_start | xact_start | query_start |        state        | minutes idle 
---------------+------------+-------------+---------------------+--------------
 20:03:20      | 20:03:20   | 20:03:21    | idle in transaction |           32
 20:03:22      | 20:03:22   | 20:03:22    | idle in transaction |           32
 20:03:22      | 20:03:22   | 20:03:22    | idle in transaction |           32
 20:03:22      | 20:03:22   | 20:03:23    | idle in transaction |           32
 20:03:21      | 20:03:21   | 20:16:35    | idle in transaction |           19
 20:03:21      | 20:03:21   | 20:16:35    | idle in transaction |           19
 20:03:22      | 20:03:22   | 20:16:35    | idle in transaction |           19
 20:03:22      | 20:03:22   | 20:16:35    | idle in transaction |           19
 20:03:22      | 20:03:22   | 20:16:35    | idle in transaction |           19
 20:16:37      | 20:16:37   | 20:16:38    | idle in transaction |           19
 20:03:21      | 20:03:21   | 20:16:35    | idle in transaction |           19
 20:03:21      | 20:03:21   | 20:16:35    | idle in transaction |           19
 20:03:21      | 20:03:21   | 20:16:35    | idle in transaction |           19
 20:16:36      | 20:16:36   | 20:16:37    | idle in transaction |           19
 20:03:20      | 20:03:20   | 20:16:35    | idle in transaction |           19
 20:16:36      | 20:16:36   | 20:35:49    | idle in transaction |            0
 20:16:36      | 20:16:36   | 20:35:49    | idle in transaction |            0
 20:16:37      | 20:16:37   | 20:35:49    | idle in transaction |            0
 20:16:35      | 20:16:35   | 20:35:41    | idle in transaction |            0
 20:16:36      | 20:16:36   | 20:35:49    | idle in transaction |            0
 20:16:37      | 20:16:37   | 20:35:49    | active              |            0
{code}

Most of these haven't been touched for a long time, and will not be needed 
again (because DataImportHandler is done with that top-level entity). They 
should be released as soon as possible.
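
To make the difference concrete, here is a minimal sketch of the two connection 
lifecycles. It is not DataImportHandler code: it is a small Python model that 
uses in-memory SQLite connections as a stand-in for JDBC connections, and every 
name in it (ConnectionTracker, TrackedConnection, import_holding_connections, 
import_releasing_connections) is hypothetical. It only shows how many 
connections are open at once under each approach.

```python
import sqlite3
from contextlib import closing


class ConnectionTracker:
    """Hypothetical helper: hands out connections and counts how many are open."""

    def __init__(self):
        self.open_now = 0
        self.peak = 0

    def connect(self):
        self.open_now += 1
        self.peak = max(self.peak, self.open_now)
        return TrackedConnection(self)


class TrackedConnection:
    """Hypothetical wrapper around an in-memory SQLite connection (JDBC stand-in)."""

    def __init__(self, tracker):
        self._tracker = tracker
        self._conn = sqlite3.connect(":memory:")

    def execute(self, sql):
        return self._conn.execute(sql)

    def close(self):
        # Idempotent close so the open-connection counter stays accurate.
        if self._conn is not None:
            self._conn.close()
            self._conn = None
            self._tracker.open_now -= 1


def import_holding_connections(entities, tracker):
    """Models the reported behaviour: every connection stays open until the end."""
    held = []
    try:
        for _entity in entities:
            conn = tracker.connect()
            held.append(conn)
            conn.execute("SELECT 1").fetchall()  # stand-in for the entity query
    finally:
        for conn in held:
            conn.close()


def import_releasing_connections(entities, tracker):
    """Models the requested behaviour: release each connection once its entity is done."""
    for _entity in entities:
        with closing(tracker.connect()) as conn:
            conn.execute("SELECT 1").fetchall()  # stand-in for the entity query


if __name__ == "__main__":
    entities = ["entity_a", "entity_b", "entity_c"]
    for run in (import_holding_connections, import_releasing_connections):
        tracker = ConnectionTracker()
        run(entities, tracker)
        print(run.__name__, "peak open connections:", tracker.peak)
```

In this model the first function peaks at one open connection per entity, while 
the second never has more than one open at a time. A real fix in Solr would 
presumably also need to end the open transaction before releasing the 
connection; that part is not modelled here.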

Noticed in production in Solr 4.7.0, then reproduced in 4.10.1 (so probably 
also true of all versions in between).



