[jira] [Comment Edited] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken

Sudheer Prem (JIRA) Tue, 12 Mar 2013 10:35:07 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600078#comment-13600078
 ]


Sudheer Prem edited comment on SOLR-4561 at 3/12/13 5:06 PM:
-------------------------------------------------------------

I have a scenario where table A contains 5 million rows and table B contains 
more than a million rows. The join condition matches only a couple of thousand 
records. I had been using this feature in an earlier version of Solr. After 
this change, it suddenly takes the wrong join (the one that matches the first 
condition) and populates that value into all documents.

After debugging, this is my thinking on a fix:

This is happening because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Instead, the 
query can be initialized whenever it differs from the previous query. If the 
logic is changed that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
      String q = getQuery();
      initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
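    // Resolve the parameters first, then re-initialize only when the
    // resolved query differs from the one that was last run.
    String q = context.replaceTokens(getQuery());
    if (!q.equals(this.query)) {
      initQuery(q);
    }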
{code}

Initial testing shows that it seems to be working as expected.
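
To make the difference between the two guards concrete, here is a minimal, 
self-contained sketch. It is not the Solr code: ChildEntitySketch, sampleDb(), 
rowsForParent() and the loadedRows field are hypothetical names made up for 
illustration, a HashMap stands in for the database, and the null check on 
loadedRows plays the role of the rowIterator == null check. It only models the 
symptom described above, under those assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a child entity processor; NOT the Solr class. */
class ChildEntitySketch {
    // Fake "database": maps a fully resolved query string to its rows.
    private final Map<String, List<String>> fakeDb;
    // true = proposed guard (query changed), false = old guard (nothing loaded yet).
    private final boolean reinitOnQueryChange;
    private String query;            // last query that was actually run
    private List<String> loadedRows; // rows loaded by the last initQuery call

    ChildEntitySketch(Map<String, List<String>> fakeDb, boolean reinitOnQueryChange) {
        this.fakeDb = fakeDb;
        this.reinitOnQueryChange = reinitOnQueryChange;
    }

    // Runs the query against the fake database and remembers the result.
    private void initQuery(String q) {
        query = q;
        loadedRows = fakeDb.getOrDefault(q, Collections.<String>emptyList());
    }

    /** Returns the child rows seen for one parent row. */
    List<String> rowsForParent(String parentId) {
        // Stand-in for context.replaceTokens(getQuery()).
        String q = "select * from y where xid=" + parentId;
        boolean needsInit = reinitOnQueryChange
                ? !q.equals(query)    // proposed: re-run when the resolved query differs
                : loadedRows == null; // old: run only when nothing has been loaded
        if (needsInit) {
            initQuery(q);
        }
        return loadedRows;
    }

    /** Two parents, each with its own child row. */
    static Map<String, List<String>> sampleDb() {
        Map<String, List<String>> db = new HashMap<>();
        db.put("select * from y where xid=1", Collections.singletonList("y-for-1"));
        db.put("select * from y where xid=2", Collections.singletonList("y-for-2"));
        return db;
    }
}
```

With reinitOnQueryChange set to false, asking for parent 1 and then parent 2 
hands back the rows of parent 1 both times, which is the behaviour reported 
here. With it set to true, parent 2 gets its own rows. Note that this sketch 
re-runs the query for every distinct parameter value, so it says nothing about 
whether caching still helps.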

                
      was (Author: sudheerprem):
    I have a scenario where table A contains 5 million rows and table B 
contains more than a million rows. The join condition matches only a couple of 
thousand records. I had been using this feature in an earlier version of Solr. 
After this change, it suddenly takes the wrong join (the one that matches the 
first condition) and populates that value into all documents.

After debugging, this is my thinking on a fix:

This is happening because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Instead, the 
query should be initialized whenever it differs from the previous query. If 
the logic is changed that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
      String q = getQuery();
      initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
    String q = context.replaceTokens(getQuery());
    if(!q.equals(this.query)){
      initQuery(q);
    }
{code}

Initial testing shows that it seems to be working as expected.

                  
> CachedSqlEntityProcessor with parametarized query is broken
> -----------------------------------------------------------
>
>                 Key: SOLR-4561
>                 URL: https://issues.apache.org/jira/browse/SOLR-4561
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.1
>            Reporter: Sudheer Prem
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When child entities are created and the child entity is provided with a 
> parametrized query as below, 
> {code:xml} 
> <entity name="x" query="select * from x">
>     <entity name="y" query="select * from y where xid=${x.id}" 
> processor="CachedSqlEntityProcessor">
>     </entity>
> </entity>
> {code} 
> the Entity Processor always returns the result from the first query even 
> though the parameter has changed. This happens because the 
> EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
> after calling the DIHCacheSupport.getCacheData() method.
> This can be fixed by changing the else block in the getNext() method of 
> EntityProcessorBase from
> {code} 
> else  {
>       return cacheSupport.getCacheData(context, query, rowIterator);
>       
> }
> {code} 
> to the code mentioned below:
> {code} 
> else  {
>       Map<String,Object> cacheData = cacheSupport.getCacheData(context, 
> query, rowIterator);
>       query = null;
>       rowIterator = null;
>       return cacheData;
>     }
> {code}   
> Update: But then, the caching doesn't seem to be working...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
