Re: DIH - Closing ResultSet in JdbcDataSource
I have found where a root entity has completed processing and added the logic to clear the entity's cache at that point (didn't change any of the logic for clearing all entity caches once the import has completed). I have also created an enhancement request found at https://issues.apache.org/jira/browse/SOLR-2313. On Tue, Jan 11, 2011 at 2:54 PM, Shane Perry thry...@gmail.com wrote: By placing some strategic debug messages, I have found that the JDBC connections are not being closed until all entity elements have been processed (in the entire config file). A simplified example would be: dataConfig dataSource name=ds1 driver=org.postgresql.Driver url=jdbc:postgresql://localhost:5432/db1 user=... password=... / dataSource name=ds2 driver=org.postgresql.Driver url=jdbc:postgresql://localhost:5432/db2 user=... password=... / document entity name=entity1 datasource=ds1 ... ... field list ... entity name=entity1a datasource=ds1 ... ... field list ... /entity /entity entity name=entity2 datasource=ds2 ... ... field list ... entity name=entity2a datasource=ds2 ... ... field list ... /entity /entity /document /dataConfig The behavior is: JDBC connection opened for entity1 and entity1a - Applicable queries run and ResultSet objects processed All open ResultSet and Statement objects closed for entity1 and entity1a JDBC connection opened for entity2 and entity2a - Applicable queries run and ResultSet objects processed All open ResultSet and Statement objects closed for entity2 and entity2a All JDBC connections (none are closed at this point) are closed. In my instance, I have some 95 unique entity elements (19 parents with 5 children each), resulting in 95 open JDBC connections. If I understand the process correctly, it should be safe to close the JDBC connection for a root entity (immediate children of document) and all descendant entity elements once the parent has been successfully completed. I have been digging around the code, but due to my unfamiliarity with the code, I'm not sure where this would occur. Is this a valid solution? It's looking like I should probably open a defect and I'm willing to do so along with submitting a patch, but need a little more direction on where the fix would best reside. Thanks, Shane On Mon, Jan 10, 2011 at 7:14 AM, Shane Perry thry...@gmail.com wrote: Gora, Thanks for the response. After taking another look, you are correct about the hasnext() closing the ResultSet object (1.4.1 as well as 1.4.0). I didn't recognize the case difference in the two function calls, so missed it. I'll keep looking into the original issue and reply if I find a cause/solution. Shane On Sat, Jan 8, 2011 at 4:04 AM, Gora Mohanty g...@mimirtech.com wrote: On Sat, Jan 8, 2011 at 1:10 AM, Shane Perry thry...@gmail.com wrote: Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. [...] Have not examined the idle in transaction issue that you mention, but the ResultSet object in a ResultSetIterator is closed in the private hasnext() method, when there are no more results, or if there is an exception. hasnext() is called by the public hasNext() method that should be used in iterating over the results, so I see no issue there. Regards, Gora P.S. This is from Solr 1.4.0 code, but I would not think that this part of the code would have changed.
Re: DIH - Closing ResultSet in JdbcDataSource
By placing some strategic debug messages, I have found that the JDBC connections are not being closed until all entity elements have been processed (in the entire config file). A simplified example would be: dataConfig dataSource name=ds1 driver=org.postgresql.Driver url=jdbc:postgresql://localhost:5432/db1 user=... password=... / dataSource name=ds2 driver=org.postgresql.Driver url=jdbc:postgresql://localhost:5432/db2 user=... password=... / document entity name=entity1 datasource=ds1 ... ... field list ... entity name=entity1a datasource=ds1 ... ... field list ... /entity /entity entity name=entity2 datasource=ds2 ... ... field list ... entity name=entity2a datasource=ds2 ... ... field list ... /entity /entity /document /dataConfig The behavior is: JDBC connection opened for entity1 and entity1a - Applicable queries run and ResultSet objects processed All open ResultSet and Statement objects closed for entity1 and entity1a JDBC connection opened for entity2 and entity2a - Applicable queries run and ResultSet objects processed All open ResultSet and Statement objects closed for entity2 and entity2a All JDBC connections (none are closed at this point) are closed. In my instance, I have some 95 unique entity elements (19 parents with 5 children each), resulting in 95 open JDBC connections. If I understand the process correctly, it should be safe to close the JDBC connection for a root entity (immediate children of document) and all descendant entity elements once the parent has been successfully completed. I have been digging around the code, but due to my unfamiliarity with the code, I'm not sure where this would occur. Is this a valid solution? It's looking like I should probably open a defect and I'm willing to do so along with submitting a patch, but need a little more direction on where the fix would best reside. Thanks, Shane On Mon, Jan 10, 2011 at 7:14 AM, Shane Perry thry...@gmail.com wrote: Gora, Thanks for the response. After taking another look, you are correct about the hasnext() closing the ResultSet object (1.4.1 as well as 1.4.0). I didn't recognize the case difference in the two function calls, so missed it. I'll keep looking into the original issue and reply if I find a cause/solution. Shane On Sat, Jan 8, 2011 at 4:04 AM, Gora Mohanty g...@mimirtech.com wrote: On Sat, Jan 8, 2011 at 1:10 AM, Shane Perry thry...@gmail.com wrote: Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. [...] Have not examined the idle in transaction issue that you mention, but the ResultSet object in a ResultSetIterator is closed in the private hasnext() method, when there are no more results, or if there is an exception. hasnext() is called by the public hasNext() method that should be used in iterating over the results, so I see no issue there. Regards, Gora P.S. This is from Solr 1.4.0 code, but I would not think that this part of the code would have changed.
Re: DIH - Closing ResultSet in JdbcDataSource
Gora, Thanks for the response. After taking another look, you are correct about the hasnext() closing the ResultSet object (1.4.1 as well as 1.4.0). I didn't recognize the case difference in the two function calls, so missed it. I'll keep looking into the original issue and reply if I find a cause/solution. Shane On Sat, Jan 8, 2011 at 4:04 AM, Gora Mohanty g...@mimirtech.com wrote: On Sat, Jan 8, 2011 at 1:10 AM, Shane Perry thry...@gmail.com wrote: Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. [...] Have not examined the idle in transaction issue that you mention, but the ResultSet object in a ResultSetIterator is closed in the private hasnext() method, when there are no more results, or if there is an exception. hasnext() is called by the public hasNext() method that should be used in iterating over the results, so I see no issue there. Regards, Gora P.S. This is from Solr 1.4.0 code, but I would not think that this part of the code would have changed.
Re: DIH - Closing ResultSet in JdbcDataSource
On Sat, Jan 8, 2011 at 1:10 AM, Shane Perry thry...@gmail.com wrote: Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. [...] Have not examined the idle in transaction issue that you mention, but the ResultSet object in a ResultSetIterator is closed in the private hasnext() method, when there are no more results, or if there is an exception. hasnext() is called by the public hasNext() method that should be used in iterating over the results, so I see no issue there. Regards, Gora P.S. This is from Solr 1.4.0 code, but I would not think that this part of the code would have changed.
DIH - Closing ResultSet in JdbcDataSource
Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. Am I off base here? I'm not sure what the overall implications are of the idle in transaction processes, but is there a way I can get around the issue without importing each entity manually? Any feedback would be greatly appreciated. Thanks in advance, Shane
Re: DIH - Closing ResultSet in JdbcDataSource
This is my configuration which seems to work just fine. ?xml version=1.0 encoding=utf-8 ? dataConfig dataSource type=JdbcDataSource name=DBImport driver=net.sourceforge.jtds.jdbc.Driver url=jdbc:jtds:sqlserver://localhost;databaseName=50_DEV;responseBuffering=adaptive; user=test password=test onError=skip/ document From there it's just a matter of running the select statement and mapping it against the correct fields in your index. Adam On Fri, Jan 7, 2011 at 2:40 PM, Shane Perry thry...@gmail.com wrote: Hi, I am in the process of migrating our system from Postgres 8.4 to Solr 1.4.1. Our system is fairly complex and as a result, I have had to define 19 base entities in the data-config.xml definition file. Each of these entities executes 5 queries. When doing a full-import, as each entity completes, the server hosting Postgres shows 5 idle in transaction for the entity. In digging through the code, I found that the JdbcDataSource wraps the ResultSet object in a custom ResultSetIterator object, leaving the ResultSet open. Walking through the code I can't find a close() call anywhere on the ResultSet. I believe this results in the idle in transaction processes. Am I off base here? I'm not sure what the overall implications are of the idle in transaction processes, but is there a way I can get around the issue without importing each entity manually? Any feedback would be greatly appreciated. Thanks in advance, Shane