[jira] Commented: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761289#action_12761289 ] Abdul Chaudhry commented on SOLR-1426:
--
NOTE: the last_index_time is broken with the perpetual patch. I hacked around this by changing the data-config.xml file so that the deltaQuery does something like this:

    WHERE updated_at > DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND)

This is needed because of the time discrepancy between the sleep and the writer's last_index_time. However, it looks like delta-import is broken in the latest build of Solr trunk, revision 820731. It looks like the lastIndexTime in the DataImporter is not populated after a delta, so if you use ${dataimporter.last_index_time} then the deltaQuery runs with the wrong time. I am going to wait until delta-import is fixed before I update the patch.

Allow delta-import to run continuously until aborted
---
Key: SOLR-1426
URL: https://issues.apache.org/jira/browse/SOLR-1426
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Abdul Chaudhry
Assignee: Noble Paul
Fix For: 1.5
Attachments: delta-import-perpetual.patch

Modify the delta-import so that it takes a perpetual flag that makes it run continuously until it's aborted.

    http://localhost:8985/solr/select/?command=delta-import&clean=false&qt=/dataimport&commit=true&perpetual=true

perpetual means the delta import will keep running and pause for a few seconds between query runs. The only way to stop the delta import will be to explicitly issue an abort, like so:

    http://localhost:8985/solr/tickets/select/?command=abort

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
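The 10-second buffer above can also be pictured on the client side. A minimal sketch in Java (a hypothetical helper, not part of the patch), mirroring what DATE_SUB(..., INTERVAL 10 SECOND) does to the last_index_time value, assuming DIH's usual "yyyy-MM-dd HH:mm:ss" timestamp format:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DeltaBuffer {
    // last_index_time is assumed to be in "yyyy-MM-dd HH:mm:ss" form,
    // as written to dataimport.properties.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Subtract a safety buffer, mirroring
    // DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND)
    static String buffered(String lastIndexTime, long seconds) {
        return LocalDateTime.parse(lastIndexTime, FMT).minusSeconds(seconds).format(FMT);
    }

    public static void main(String[] args) {
        System.out.println(buffered("2009-10-01 12:00:05", 10)); // prints 2009-10-01 11:59:55
    }
}
```

The buffer trades a few re-indexed rows for never missing a row updated during the sleep window.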
[jira] Commented: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761300#action_12761300 ] Abdul Chaudhry commented on SOLR-1426:
--
The SOLR-783 fix seems to force you to use the entity name with last_index_time. My fix for this was to change the deltaQuery like so:

    WHERE updated_at > DATE_SUB('${dataimporter.[name of entity].last_index_time}', INTERVAL 10 SECOND)
[jira] Commented: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755077#action_12755077 ] Abdul Chaudhry commented on SOLR-1426:
--
The perpetual option only makes sense for one command: the delta-import command. I could not see a compelling use case for using perpetual with any other command. The abort should stop any in-flight delta-import, which is the current behaviour with the patch. The sleep interval should be set using something like perpetual.delay and default to a reasonable value such as 3 secs.
[jira] Commented: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754816#action_12754816 ] Abdul Chaudhry commented on SOLR-1426:
--
You can run a crontab every minute, but I need near real-time changes mirrored from a set of tables in a database to a search index. You should be aware that Lucene 2.9 includes what it calls near real-time search capabilities, and if you include these in Solr 1.4 then the use case for delta-import will probably change from running every few hours or minutes (which is probably what you are used to right now) and quickly move to running every few seconds. In that case running a crontab every minute is too long to wait, and writing a script to call curl every few seconds will seem like an excessive use of system resources. So, in answer to your question, it probably is not a common use case now, but with Lucene 2.9 it will become a common one. Anyway, it's your call - take it or leave it.
[jira] Created: (SOLR-1426) Allow delta-import to run continuously until aborted
Allow delta-import to run continuously until aborted
---
Key: SOLR-1426
URL: https://issues.apache.org/jira/browse/SOLR-1426
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Abdul Chaudhry
Fix For: 1.4

Modify the delta-import so that it takes a perpetual flag that makes it run continuously until it's aborted.

    http://search-master.fansnap.com:8985/solr/tickets/select/?command=delta-import&clean=false&qt=/dataimport&commit=true&perpetual=true

perpetual means the delta import will keep running and pause for a few seconds between query runs. The only way to stop the delta import will be to explicitly issue an abort, like so:

    http://search-master.fansnap.com:8985/solr/tickets/select/?command=abort
[jira] Updated: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdul Chaudhry updated SOLR-1426:
-
Attachment: delta-import-perpetual.patch

Uploaded a patch that implements this feature. Ran all unit tests on my tree and they pass. The only thing I have hard-coded is the sleep interval, which is:

    Thread.sleep(3000)

This should probably be configurable.
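The perpetual loop itself is simple. A minimal sketch of the idea (the names runPerpetual, DeltaRunner, and the abort flag are illustrative, not the patch's actual code), with the sleep interval passed in rather than hard-coded:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PerpetualDelta {
    interface DeltaRunner { void runDelta(); }

    // Run delta-imports until an abort is requested, pausing delayMs between
    // runs. Returns how many delta runs completed.
    static int runPerpetual(DeltaRunner runner, AtomicBoolean abort, long delayMs)
            throws InterruptedException {
        int runs = 0;
        while (!abort.get()) {
            runner.runDelta();
            runs++;
            Thread.sleep(delayMs); // configurable, instead of a hard-coded Thread.sleep(3000)
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean abort = new AtomicBoolean(false);
        int[] count = {0};
        // Simulate an abort command arriving during the third delta run
        int runs = runPerpetual(() -> { if (++count[0] == 3) abort.set(true); }, abort, 1);
        System.out.println(runs); // prints 3
    }
}
```

Checking the abort flag at the top of each iteration is what lets an explicit abort command stop an otherwise endless loop.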
[jira] Updated: (SOLR-1426) Allow delta-import to run continuously until aborted
[ https://issues.apache.org/jira/browse/SOLR-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdul Chaudhry updated SOLR-1426:
-
Description:
Modify the delta-import so that it takes a perpetual flag that makes it run continuously until it's aborted.

    http://localhost:8985/solr/tickets/select/?command=delta-import&clean=false&qt=/dataimport&commit=true&perpetual=true

perpetual means the delta import will keep running and pause for a few seconds between query runs. The only way to stop the delta import will be to explicitly issue an abort, like so:

    http://localhost:8985/solr/tickets/select/?command=abort

was:
Modify the delta-import so that it takes a perpetual flag that makes it run continuously until it's aborted.

    http://search-master.fansnap.com:8985/solr/tickets/select/?command=delta-import&clean=false&qt=/dataimport&commit=true&perpetual=true

perpetual means the delta import will keep running and pause for a few seconds between query runs. The only way to stop the delta import will be to explicitly issue an abort, like so:

    http://search-master.fansnap.com:8985/solr/tickets/select/?command=abort
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751615#action_12751615 ] Abdul Chaudhry commented on SOLR-236:
-
If this helps you fix your unit tests: I fixed the unit tests by changing the CollapseFilter constructor that's used for testing to take a StringIndex, like so:

    -  CollapseFilter(int collapseMaxDocs, int collapseTreshold) {
    +  CollapseFilter(int collapseMaxDocs, int collapseTreshold, FieldCache.StringIndex index) {
    +    this.collapseIndex = index;

and then I changed the unit test cases to move values into a StringIndex in CollapseFilterTest, like so:

     public void testNormalCollapse_collapseThresholdOne() {
    -    collapseFilter = new CollapseFilter(Integer.MAX_VALUE, 1);
    +    String[] values = new String[]{"a", "b", "c"};
    +    int[] order = new int[]{0, 1, 0, 2, 1, 0, 1};
    +    FieldCache.StringIndex index = new FieldCache.StringIndex(order, values);
    +    int[] docIds = new int[]{1, 2, 0, 3, 4, 5, 6};
    +
    +    collapseFilter = new CollapseFilter(Integer.MAX_VALUE, 1, index);
    -    String[] values = new String[]{"a", "b", "a", "c", "b", "a", "b"};

Field collapsing
Key: SOLR-236
URL: https://issues.apache.org/jira/browse/SOLR-236
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Fix For: 1.5
Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called field collapsing, used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also duplicate detection: http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
- collapse.field to choose the field used to group results
- collapse.type normal (default value) or adjacent
- collapse.max to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling correction are welcome ;-)
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751243#action_12751243 ] Abdul Chaudhry commented on SOLR-236:
-
I have some ideas for performance improvements. I noticed that the code fetches the field cache twice: once for the collapse and then for the response object, assuming you asked for the info count in the response. That seems expensive, especially for real-time content. I think it's better to use FieldCache.StringIndex instead of returning a large string array, and keep it around for both the collapse and the response object. I changed the code so that I keep the cache around, like so:

    /**
     * Keep the field cached for the collapsed fields for the response object as well
     */
    private FieldCache.StringIndex collapseIndex;

When collapsing, you can get the current value using something like this and remove the code passing the array:

    int currentId = i.nextDoc();
    String currentValue = collapseIndex.lookup[collapseIndex.order[currentId]];

When building the response for the info count, you can reference the same cache, like so:

    if (collapseInfoCount) {
      resCount.add(collapseFieldType.indexedToReadable(
          collapseIndex.lookup[collapseIndex.order[id]]), count);
    }

I also added timing for the cache access, as it could be slow if you are doing a lot of updates. I have added code for displaying selected fields for the duplicates, but it's difficult to submit. I hope this gets committed, as it's hard to submit a patch when it's not in svn, and I cannot submit a patch to a patch to a patch .. you get the idea.
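The lookup/order indirection used above is easy to miss. A toy stand-in for Lucene's FieldCache.StringIndex (plain arrays here, not the real class) showing how one cached pair of arrays can serve both the collapse pass and the response building:

```java
public class StringIndexDemo {
    // Toy stand-in for FieldCache.StringIndex: order[doc] is an ordinal into
    // lookup, so the field value for any doc is one array indirection away,
    // with each distinct string stored only once.
    static String valueFor(int docId, int[] order, String[] lookup) {
        return lookup[order[docId]];
    }

    public static void main(String[] args) {
        String[] lookup = {"a", "b", "c"};   // distinct field values
        int[] order = {0, 1, 0, 2, 1, 0, 1}; // per-document ordinals
        System.out.println(valueFor(3, order, lookup)); // prints c
    }
}
```

Because both passes read the same two arrays, the field cache is populated once per reader instead of twice per request.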
[jira] Issue Comment Edited: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751243#action_12751243 ] Abdul Chaudhry edited comment on SOLR-236 at 9/3/09 5:56 PM:
-
I have some ideas for performance improvements. I noticed that the code fetches the field cache twice: once for the collapse and then for the response object, assuming you asked for the info count in the response. That seems expensive, especially for real-time content. I think it's better to use FieldCache.StringIndex instead of returning a large string array, and keep it around for both the collapse and the response object. I changed the code so that I keep the cache around, like so:

    /**
     * Keep the field cached for the collapsed fields for the response object as well
     */
    private FieldCache.StringIndex collapseIndex;

To get the index, use something like this instead of getting the string array for all docs:

    collapseIndex = FieldCache.DEFAULT.getStringIndex(searcher.getReader(), collapseField)

When collapsing, you can get the current value using something like this and remove the code passing the array:

    int currentId = i.nextDoc();
    String currentValue = collapseIndex.lookup[collapseIndex.order[currentId]];

When building the response for the info count, you can reference the same cache, like so:

    if (collapseInfoCount) {
      resCount.add(collapseFieldType.indexedToReadable(
          collapseIndex.lookup[collapseIndex.order[id]]), count);
    }

I also added timing for the cache access, as it could be slow if you are doing a lot of updates. I have added code for displaying selected fields for the duplicates, but it's difficult to submit. I hope this gets committed, as it's hard to submit a patch when it's not in svn, and I cannot submit a patch to a patch to a patch .. you get the idea.

was (Author: abdollar):
I have some ideas for performance improvements. I noticed that the code fetches the field cache twice: once for the collapse and then for the response object, assuming you asked for the info count in the response. That seems expensive, especially for real-time content. I think it's better to use FieldCache.StringIndex instead of returning a large string array, and keep it around for both the collapse and the response object. I changed the code so that I keep the cache around, like so:

    /**
     * Keep the field cached for the collapsed fields for the response object as well
     */
    private FieldCache.StringIndex collapseIndex;

When collapsing, you can get the current value using something like this and remove the code passing the array:

    int currentId = i.nextDoc();
    String currentValue = collapseIndex.lookup[collapseIndex.order[currentId]];

When building the response for the info count, you can reference the same cache, like so:

    if (collapseInfoCount) {
      resCount.add(collapseFieldType.indexedToReadable(
          collapseIndex.lookup[collapseIndex.order[id]]), count);
    }

I also added timing for the cache access, as it could be slow if you are doing a lot of updates. I have added code for displaying selected fields for the duplicates, but it's difficult to submit. I hope this gets committed, as it's hard to submit a patch when it's not in svn, and I cannot submit a patch to a patch to a patch .. you get the idea.
[jira] Created: (SOLR-1391) The XPath field in the XPathEntityResolver should use the resolver to replace possible tokens
The XPath field in the XPathEntityResolver should use the resolver to replace possible tokens
-
Key: SOLR-1391
URL: https://issues.apache.org/jira/browse/SOLR-1391
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Abdul Chaudhry

If you have a data-import configuration that nests an entity that includes an XPath with an XPathEntityProcessor - like so:

    <entity name="a" ...etc datasource="">
      <field column="xpath_value"/>
      <entity name="b" dataSource="filereader" processor="XPathEntityProcessor" ...etc >
        <field column="my_field" xpath="${a.xpath_value}"/>
      </entity>
    </entity>

this will fail with an error like so:

    Caused by: java.lang.RuntimeException: xpath must start with '/' : ${a.xpath_value}

We should allow the xpath to be replaced with the token from entity a.
[jira] Updated: (SOLR-1391) The XPath field in the XPathEntityResolver should use the resolver to replace possible tokens
[ https://issues.apache.org/jira/browse/SOLR-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdul Chaudhry updated SOLR-1391:
-
Attachment: xpath.patch

The fix is simple from what I can tell, and I have updated the patch. I just used resolver.replaceTokens on the xpath field.
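What replaceTokens does here can be illustrated with a toy version (a simplified stand-in, not DIH's actual VariableResolver): substitute ${...} tokens from already-resolved entity values before the xpath string is validated.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenDemo {
    static final Pattern TOKEN = Pattern.compile("\\$\\{([^}]+)\\}");

    // Simplified stand-in for resolver.replaceTokens: replace each ${name}
    // with its resolved value; unknown tokens are left untouched.
    static String replaceTokens(String template, Map<String, String> vars) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // ${a.xpath_value} now resolves before the "must start with '/'" check runs
        System.out.println(replaceTokens("${a.xpath_value}",
                Map.of("a.xpath_value", "/root/item/title"))); // prints /root/item/title
    }
}
```

Resolving first means the downstream validation sees a concrete XPath rather than the raw template.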
[jira] Commented: (SOLR-1262) DIH needs support for prepared statements
[ https://issues.apache.org/jira/browse/SOLR-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733138#action_12733138 ] Abdul Chaudhry commented on SOLR-1262:
--
I could try to use stored procedures instead of trying to get prepared statements to work in the DIH - however, that would mean most of the logic would need to live in SQL, and I hate SQL.

DIH needs support for prepared statements
--
Key: SOLR-1262
URL: https://issues.apache.org/jira/browse/SOLR-1262
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.3
Environment: linux mysql
Reporter: Abdul Chaudhry
Assignee: Noble Paul
Priority: Minor

During an indexing run we noticed that we were spending a lot of time creating and tearing down queries in MySQL. The queries we are using are complex and involve joins spanning multiple tables. We should support prepared statements in the data import handler via the data-config.xml file - for those databases that support prepared statements. We could add a new attribute to the <entity> entity in dataConfig - say, pquery or preparedQuery - and then pass the prepared statement and have values filled in by the actual queries for each row using a placeholder - like a ? or something else. I would probably start by hacking class JdbcDataSource to try a test, but was wondering if anyone had experienced this or had any suggestions, or if there is something in the works that I missed - I couldn't find any other bugs mentioning using prepared statements for performance.
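A minimal sketch of the prepare-once, execute-per-row idea using plain JDBC (the fetchDeltas method and the pquery value are illustrative assumptions, not JdbcDataSource's actual code):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PreparedDelta {
    // Count '?' placeholders so callers know how many parameters to bind.
    // (Ignores quoted literals; good enough for a sketch.)
    static int placeholderCount(String pquery) {
        int n = 0;
        for (char c : pquery.toCharArray()) if (c == '?') n++;
        return n;
    }

    // The statement is parsed by the driver once, then re-executed with new
    // bind values for each delta row, instead of building a fresh query string
    // and tearing it down every time.
    static void fetchDeltas(Connection conn, String pquery, Iterable<Object[]> paramRows) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(pquery)) {
            for (Object[] params : paramRows) {
                for (int i = 0; i < params.length; i++) ps.setObject(i + 1, params[i]);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) { /* map columns to a document here */ }
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(placeholderCount(
                "SELECT * FROM tickets WHERE updated_at > ? AND venue_id = ?")); // prints 2
    }
}
```

This is exactly the overhead the pquery/preparedQuery attribute would avoid: one parse on the server, many executions with different bind values.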
[jira] Created: (SOLR-1262) DIH needs support for prepared statements
DIH needs support for prepared statements
--
Key: SOLR-1262
URL: https://issues.apache.org/jira/browse/SOLR-1262
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.3
Environment: linux mysql
Reporter: Abdul Chaudhry
Priority: Critical

During an indexing run we noticed that we were spending a lot of time creating and tearing down queries in MySQL. The queries we are using are complex and involve joins spanning multiple tables. We should support prepared statements in the data import handler via the data-config.xml file - for those databases that support prepared statements. We could add a new attribute to the <entity> entity in dataConfig - say, pquery or preparedQuery - and then pass the prepared statement and have values filled in by the actual queries for each row using a placeholder - like a ? or something else. I would probably start by hacking class JdbcDataSource to try a test, but was wondering if anyone had experienced this or had any suggestions, or if there is something in the works that I missed - I couldn't find any other bugs mentioning using prepared statements for performance.