[
https://issues.apache.org/jira/browse/SOLR-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298040#comment-16298040
]
Karl Zweimüller commented on SOLR-11772:
----------------------------------------
Hi Shawn!
No, I don't have a patch - I didn't analyze the DIH-Code. I found out by
monitoring the oracle db.
Changing the sql-statements to statements with bind-variables should be
jdbc-standard and not oracle-driver-specific. Therefore it should work with all
sql-databases. But I don't know if other databases would also benefit in
performance.
To set the cursor-sharing-parameter in the jdbc-url is not possible.
But to place the parameter in the after-logon-trigger of the db-user is a
working solution - but a workaround.This modified db-user should be used only
for Solr-DIH!
Rearranging the statements (using joins) would be possible, but I think would
raise the overall execution-time.
I don't have a problem when DIH won't change to jdbc-bind-variables, as I know
a workaround for us now.
Maybe this Issue can help others to solve Solr performance-issues with DIH and
oracle-db.
Charly
> Use JDBC-bind variables for DIH to improve performance with oracle db
> ---------------------------------------------------------------------
>
> Key: SOLR-11772
> URL: https://issues.apache.org/jira/browse/SOLR-11772
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - DataImportHandler
> Reporter: Karl Zweimüller
> Priority: Minor
>
> I just reduced the time for my full-import (solr 6.0.1) on an oracle-database
> for 1.4mio documents from 36 hours to 5 hours by setting the oracle
> session-parameter "CURSOR_SHARING=FORCE".
> Here I found one with the same problem:
> http://lucene.472066.n3.nabble.com/Optimizing-Dataimport-from-Oracle-cursor-sharing-changing-oracle-session-parameters-td4350601.html
> I have 1.4 mio documents and for every document i need 12 queries to collect
> sub-information for the actual document.
> This makes about 17mio sql-Statements to oracle for a full-import.
> As DIH doesn't use bind-variables
> (https://docs.oracle.com/cd/B28359_01/appdev.111/b28765/addfunc.htm#TDPJD210),
> every select looks "different" for oracle and a full parse (analyze
> statement, get optimal query-plan,..) has to be done 17mio times.
> By setting the session parameter "CURSOR_SHARING=FORCE", which can be done in
> an on_logon_trigger, oracle replaces all literals ins SQL with bind-variables
> and then can skip the hard-parse.
> This reduced my full-import-time from 36 hours to 5 hours. (With this you get
> only 13 different sql-statements compared to 17mio different statements
> before.
> As oracle states, that setting the CURSOR_SHARING=FORCE is only a workaround,
> it would be fine when DIH would use bind-variables for the variables.
> Charly
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]