[ 
https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656291#comment-13656291
 ] 

Mikhail Khludnev commented on SOLR-4799:
----------------------------------------

I want to review the functionality; here is the proposed config:

{code:xml}
<dataConfig>
        <document>
                <entity name="parent" processor="SqlEntityProcessor"
                        query="SELECT * FROM PARENT ORDER BY id">
                        <entity name="child_1" processor="OrderedChildrenEntityProcessor"
                                where="parent_id=parent.id"
                                query="SELECT * FROM CHILD_1 ORDER BY parent_id">
                        </entity>
                </entity>
        </document>
</dataConfig>
{code}

Do you like it?

The parent and child SQL queries can be sorted in different orders, which 
breaks the zipper join. OrderedChildrenEntityProcessor can enforce ascending 
order on the PK and FK keys (and throw an exception on a violation), or it 
could detect the order itself, though that complicates the code a little. 
What do you expect from the first code contribution?
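
To make the "enforce ASC order and throw an exception" option concrete, here 
is a minimal sketch of such a check. It is not existing DIH code: the class 
name AscendingOrderGuard is hypothetical, and it assumes the join keys are 
available as Comparable values from a plain Iterator.

```java
import java.util.Iterator;

/**
 * Hypothetical helper (not part of DIH): wraps a stream of join keys and
 * fails fast as soon as a key is smaller than the one before it.
 */
final class AscendingOrderGuard<K extends Comparable<K>> implements Iterator<K> {
    private final Iterator<K> delegate;
    private K previous; // last key handed out; null before the first call

    AscendingOrderGuard(Iterator<K> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public K next() {
        K current = delegate.next();
        // Equal keys are allowed: several children may share one parent_id.
        if (previous != null && previous.compareTo(current) > 0) {
            throw new IllegalStateException(
                "Keys are not in ascending order: " + current + " follows " + previous);
        }
        previous = current;
        return current;
    }
}
```

Detecting the order instead would mean comparing the first two distinct keys 
and then applying the same check in that direction, which is the extra 
complexity mentioned above.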

                
> SQLEntityProcessor for zipper join
> ----------------------------------
>
>                 Key: SOLR-4799
>                 URL: https://issues.apache.org/jira/browse/SOLR-4799
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>              Labels: dih
>
> DIH is mostly considered a playground tool, and real usages end up with 
> SolrJ. I want to contribute a few improvements targeting DIH performance.
> This one provides a performant approach for joining SQL entities with very 
> little memory, in contrast to 
> http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor  
> The idea is:
> * the parent table is explicitly ordered by its PK in SQL
> * the children table is explicitly ordered by the parent_id FK in SQL
> * the children entity processor joins the ordered result sets with a 'zipper' algorithm.
> Do you think it's worth contributing it to DIH?
> cc: [~goksron] [~jdyer]
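
The three steps in the description can be sketched as a plain merge join 
over two sorted streams. This is an illustration under stated assumptions, 
not the proposed patch: ZipperJoinSketch, Child and join are hypothetical 
names, keys are plain ints, and the result is collected into a map only so 
it can be inspected. A real entity processor would hand rows on one at a 
time and keep just the single pending child row in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ZipperJoinSketch {

    /** A child row reduced to its foreign key and one value (hypothetical shape). */
    static final class Child {
        final int parentId;
        final String value;

        Child(int parentId, String value) {
            this.parentId = parentId;
            this.value = value;
        }
    }

    /**
     * Joins parent ids with their children in a single pass.
     * Both inputs must already be sorted ascending by the join key.
     */
    static Map<Integer, List<String>> join(Iterator<Integer> parents,
                                           Iterator<Child> children) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        // The only child row buffered between parents.
        Child pending = children.hasNext() ? children.next() : null;
        while (parents.hasNext()) {
            int parentId = parents.next();
            List<String> matched = new ArrayList<>();
            // Skip orphaned children whose parent_id sorts before this parent.
            while (pending != null && pending.parentId < parentId) {
                pending = children.hasNext() ? children.next() : null;
            }
            // Collect every child that belongs to this parent.
            while (pending != null && pending.parentId == parentId) {
                matched.add(pending.value);
                pending = children.hasNext() ? children.next() : null;
            }
            result.put(parentId, matched);
        }
        return result;
    }
}
```

Neither input is rewound, so each result set is read exactly once. That is 
also why the approach depends on both queries using the same sort order.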

