[ https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019885#comment-14019885 ]
James Dyer commented on SOLR-4799:
----------------------------------
bq. At the same time, in nearly a year, no further improvements on DIH were
done as far as I know. So, perhaps this addition should be committed even if it
is not ideal.
I would say the exact opposite. There are not very many people maintaining DIH
code, and those of us who do are lazy about it. Therefore, let's not stuff
more big features in and make more code to maintain when there are no
maintainers. I have code here in JIRA that I've used in production for years
that I've been unwilling to commit just for this very reason.
I do see Flume as a great DIH replacement, but from the documentation I don't
see it having very great RDBMS support? I think a lot of DIH users are using
it to import data from an RDBMS into Solr.
> SQLEntityProcessor for zipper join
> ----------------------------------
>
> Key: SOLR-4799
> URL: https://issues.apache.org/jira/browse/SOLR-4799
> Project: Solr
> Issue Type: New Feature
> Components: contrib - DataImportHandler
> Reporter: Mikhail Khludnev
> Priority: Minor
> Labels: dih
> Attachments: SOLR-4799.patch
>
>
> DIH is mostly considered a playground tool, and real-world usage tends to end
> up with SolrJ. I want to contribute a few improvements targeting DIH performance.
> This one provides a performant approach to joining SQL entities with a very
> small memory footprint, in contrast to
> http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor
> The idea is:
> * the parent table is explicitly ordered by its PK in SQL
> * the children table is explicitly ordered by the parent_id FK in SQL
> * the children entity processor joins the ordered result sets using a ‘zipper’ algorithm.
> Do you think it’s worth contributing this to DIH?
> cc: [~goksron] [~jdyer]
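
For readers unfamiliar with the 'zipper' idea in the description above, here is a minimal sketch of a merge-style join over two pre-sorted row streams. It is not the code in SOLR-4799.patch. The function name zipper_join, the dict-based rows, and the column names "id" and "parent_id" are assumptions made for illustration only. It relies on both inputs already being sorted by the join key, as the description requires.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


def zipper_join(
    parents: Iterable[Row],
    children: Iterable[Row],
    parent_key: str = "id",          # hypothetical PK column name
    child_key: str = "parent_id",    # hypothetical FK column name
) -> Iterator[Tuple[Row, List[Row]]]:
    """Yield (parent, matching_children) pairs from two sorted streams.

    Assumes parents are sorted ascending by parent_key and children are
    sorted ascending by child_key. Only the children of the current
    parent are held in memory at any time.
    """
    child_iter = iter(children)
    pending = next(child_iter, None)  # one-row lookahead into the child stream

    for parent in parents:
        pk = parent[parent_key]

        # Skip orphaned children whose FK sorts before the current parent.
        while pending is not None and pending[child_key] < pk:
            pending = next(child_iter, None)

        # Collect every child that belongs to the current parent.
        matched: List[Row] = []
        while pending is not None and pending[child_key] == pk:
            matched.append(pending)
            pending = next(child_iter, None)

        yield parent, matched


if __name__ == "__main__":
    parent_rows = [{"id": 1}, {"id": 2}, {"id": 3}]
    child_rows = [
        {"parent_id": 1, "v": "a"},
        {"parent_id": 1, "v": "b"},
        {"parent_id": 3, "v": "c"},
    ]
    for parent, kids in zipper_join(parent_rows, child_rows):
        print(parent["id"], [k["v"] for k in kids])
```

Each stream is read exactly once and nothing is cached beyond the current parent's children, which is the memory advantage over a cache-based join. The trade-off is that both SQL queries must carry a matching ORDER BY.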