On 2/7/2018 11:40 PM, Srinivas Kashyap wrote:
> We have configured a Solr index server on Tomcat and fetch data from a database
> to index it. We have implemented delta-query indexing based on modify_ts.
What version of Solr? Just as an FYI: Since version 5.0, running in
user-provided containers (like Tomcat) is not a supported configuration.
https://wiki.apache.org/solr/WhyNoWar
> In our data-config.xml we have a parent entity and 17 child entities. We have 18
> such Solr cores. When we call delta-import on a core, it executes 18 SQL queries
> against the database.
> Each delta-import opens a new session on the database. Although each login and
> logout takes only a split second, we are seeing millions of logins and logouts
> at the database.
> According to our DBA, logins and logouts are costly operations in terms of
> server resources.
> Is there a way to reduce the number of logins and logouts and have a
> persistent DB connection from Solr?
Directly, with a JDBC driver configured in the dataimport handler?
Probably not. But it looks like there may be a workaround -- setting up
a JNDI datasource in your servlet container, and letting that handle the
connection pooling for you.
http://lucene.472066.n3.nabble.com/how-to-configure-mysql-pool-connection-on-Solr-Server-tp4038974p4039040.html
It is likely that your container can set up connection pooling with most
JDBC drivers, not just MySQL.
The dataimport handler is a useful module, but it has limitations. If
you write your own indexing program that is fully aware of your source
data, you're likely to get better results.
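As a rough illustration of that idea, here is a minimal Python sketch of
an indexer that opens one database connection and reuses it for every
delta run, keyed on modify_ts. Everything in it is an assumption made
for the example: the item table, its columns, the DeltaIndexer class,
and the in-memory SQLite database standing in for the real one. Sending
the resulting documents to Solr is left out.

```python
import sqlite3


class DeltaIndexer:
    """Keeps one database connection open across many delta runs."""

    def __init__(self, conn):
        self.conn = conn      # opened once, reused for every run
        self.last_ts = 0      # highest modify_ts seen so far

    def fetch_changed(self):
        """Return documents for rows modified since the previous run."""
        rows = self.conn.execute(
            "SELECT id, name, modify_ts FROM item "
            "WHERE modify_ts > ? ORDER BY modify_ts",
            (self.last_ts,),
        ).fetchall()
        if rows:
            # Rows are ordered by modify_ts, so the last one is the newest.
            self.last_ts = rows[-1][2]
        return [{"id": r[0], "name": r[1]} for r in rows]


# Stand-in database (hypothetical schema); a real program would connect
# to the production database once at startup instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, modify_ts INTEGER)"
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)", [(1, "a", 10), (2, "b", 20)]
)

indexer = DeltaIndexer(conn)
first_run = indexer.fetch_changed()     # both rows are new
conn.execute("UPDATE item SET name = 'b2', modify_ts = 30 WHERE id = 2")
second_run = indexer.fetch_changed()    # only the updated row
```

Because the connection lives as long as the indexer object, the database
sees a single login no matter how many delta runs happen.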
Something else to consider -- sometimes by clever use of SQL JOINs, you
can put the information gathering done by child entities into the main
query of the parent entity. If you can do that and eliminate all your
child entities, then Solr will make exactly ONE query to your database
for any import operation, and you won't need to worry about reusing open
connections.
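To make the difference concrete, the Python sketch below compares the
two query patterns against an in-memory SQLite database. It is not DIH
itself. The parent and tag tables are hypothetical, the child query is
assumed to run once per parent row, and the grouping of joined rows
into documents is done in Python.

```python
import sqlite3

# Hypothetical schema: one parent table and one child table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (parent_id INTEGER, tag TEXT);
    INSERT INTO parent VALUES (1, 'first'), (2, 'second');
    INSERT INTO tag VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")


def with_child_entity(conn):
    """One parent query plus one child query per parent row."""
    docs, queries = [], 0
    parents = conn.execute(
        "SELECT id, title FROM parent ORDER BY id"
    ).fetchall()
    queries += 1
    for pid, title in parents:
        tags = [t for (t,) in conn.execute(
            "SELECT tag FROM tag WHERE parent_id = ? ORDER BY tag", (pid,)
        )]
        queries += 1
        docs.append({"id": pid, "title": title, "tags": tags})
    return docs, queries


def with_join(conn):
    """A single JOIN query; rows are grouped into documents in code."""
    docs = {}
    rows = conn.execute(
        "SELECT p.id, p.title, t.tag FROM parent p "
        "LEFT JOIN tag t ON t.parent_id = p.id ORDER BY p.id, t.tag"
    )
    for pid, title, tag in rows:
        doc = docs.setdefault(pid, {"id": pid, "title": title, "tags": []})
        if tag is not None:   # LEFT JOIN yields NULL for parents without tags
            doc["tags"].append(tag)
    return list(docs.values()), 1


child_docs, child_queries = with_child_entity(conn)
join_docs, join_queries = with_join(conn)
```

Both functions build the same documents, but the first issues a query
for every parent row on top of the parent query, while the second
issues one query in total.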
Thanks,
Shawn