Hi Vijay,

We think it is definitely a better option than a custom approach built on an ETL tool. This handler is a reusable tool for synchronizing databases with Solr. In its current form it is stable; we haven't had any issues, even when indexing large databases. We recently indexed around 1.7 million documents with it (the database dump was around 1.6 GB and the index size was around 700 MB).

Just as a heads up, we're also adding RSS feeds as a data source. We'll be using it to index and search over user-generated content in our organization.

As for committing this to Solr, I guess we still need to do some cleanup and add test cases, but that won't take too long. Perhaps a committer can give his thoughts on that (Ryan?).

By all means, go ahead and play around with it. We have a wiki page at http://wiki.apache.org/solr/DataImportHandler which has documentation and a working example. The wiki also has a solr.war with the patch applied, so you don't need to worry about the patches.

If you are in doubt or run into any problem, please post it here and we'll gladly help you with it.

Thanks for your interest. Enjoy!

On Feb 18, 2008 11:25 PM, Vijay Rao <[EMAIL PROTECTED]> wrote:

> hi,
> We have a similar requirement in our organization. We are planning to use an
> ETL tool to synchronize our DB with Solr. Looks like this is a better
> approach.
> Is it production quality?
> When do you plan to commit this to Solr?
> Cheers
> Vijay
>
> On Feb 18, 2008 11:20 PM, Noble Paul (JIRA) <[EMAIL PROTECTED]> wrote:
>
> > [ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569979#action_12569979]
> >
> > Noble Paul commented on SOLR-469:
> > ---------------------------------
> >
> > Add a facility to delete documents from Solr index on the basis of a solr
> > query.
> > It is useful if you wish to expire the documents after a certain period of
> > time.
> >
> > > DB Import RequestHandler
> > > ------------------------
> > >
> > > Key: SOLR-469
> > > URL: https://issues.apache.org/jira/browse/SOLR-469
> > > Project: Solr
> > > Issue Type: New Feature
> > > Components: update
> > > Affects Versions: 1.3
> > > Reporter: Noble Paul
> > > Priority: Minor
> > > Fix For: 1.3
> > >
> > > Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
> > >
> > >
> > > We need a RequestHandler which can import data from a DB or other
> > > dataSources into the Solr index. Think of it as an advanced form of
> > > SqlUpload Plugin (SOLR-103).
> > > The way it works is as follows.
> > > * Provide a configuration file (xml) to the Handler which takes in
> > >   the necessary SQL queries and mappings to a solr schema
> > >   - It also takes in a properties file for the data source
> > >     configuration
> > > * Given the configuration it can also generate the solr schema.xml
> > > * It is registered as a RequestHandler which can take two commands
> > >   do-full-import, do-delta-import
> > >   - do-full-import - dumps all the data from the Database into
> > >     the index (based on the SQL query in configuration)
> > >   - do-delta-import - dumps all the data that has changed since
> > >     last import. (We assume a modified-timestamp column in tables)
> > > * It provides an admin page
> > >   - where we can schedule it to be run automatically at regular
> > >     intervals
> > >   - It shows the status of the Handler (idle, full-import,
> > >     delta-import)
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >

--
Regards,
Shalin Shekhar Mangar.
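
For readers who want a feel for the full-import versus delta-import distinction in the quoted issue description, here is a minimal sketch in Python using the standard sqlite3 module. It is not the handler's code: the table `item`, the column `last_modified`, and the function names are all hypothetical, and it only shows the row selection that a modified-timestamp column makes possible, not the actual posting of documents to Solr.

```python
import sqlite3


def rows_to_index(conn, last_import_time=None):
    """Select the rows that would be sent to the index.

    With last_import_time=None this behaves like a full import (every row).
    Otherwise it behaves like a delta import: only rows whose hypothetical
    last_modified column is newer than the previous import are returned.
    """
    if last_import_time is None:
        cur = conn.execute("SELECT id, title FROM item ORDER BY id")
    else:
        cur = conn.execute(
            "SELECT id, title FROM item WHERE last_modified > ? ORDER BY id",
            (last_import_time,),
        )
    return cur.fetchall()


def demo():
    # In-memory database with made-up sample rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT, last_modified TEXT)"
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?)",
        [
            (1, "first", "2008-02-01 10:00:00"),
            (2, "second", "2008-02-10 10:00:00"),
            (3, "third", "2008-02-18 10:00:00"),
        ],
    )
    full = rows_to_index(conn)
    # ISO-formatted timestamps compare correctly as plain strings.
    delta = rows_to_index(conn, "2008-02-05 00:00:00")
    conn.close()
    return full, delta
```

Calling `demo()` returns the full row set and the subset modified after the given timestamp. A real setup would also need to record the time of each successful import so the next delta run knows where to start.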