[ https://issues.apache.org/activemq/browse/CAMEL-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=56667#action_56667 ]
Ashwin Karpe edited comment on CAMEL-1472 at 12/28/09 6:47 AM:
---------------------------------------------------------------

Hi Claus, Jon & Hadrian,

I have created a new Apache Lucene Component & Query processor and have attached a patch along with a zip file containing the code for your review. I have also added the requisite unit tests and ensured that the code undergoes checkstyle validation.

The component works as follows:

Lucene Producer: Index Creation example
----------------------------------------------------------
    context.addRoutes(new RouteBuilder() {
        public void configure() {
            from("direct:start").
                to("lucene://stdQuotesIndex?analyzerRef=#stdAnalyzer&indexDir=#std&srcDir=#load_dir").
                to("mock:result");
        }
    });

where each URI parameter setting does the following:
- analyzerRef: can be any valid Lucene Analyzer implementation (StandardAnalyzer, WhitespaceAnalyzer, StopAnalyzer, etc.)
- srcDir: an optional directory location for loading Text or XML documents at endpoint or Lucene Index creation.

Since these settings cannot be passed directly into the URI, I pass them using the JNDI registry associated with the Default Component (example shown below).
Example: Providing values for the Lucene URI
--------------------------------------------------------------
    @Override
    protected JndiRegistry createRegistry() throws Exception {
        JndiRegistry registry = new JndiRegistry(createJndiContext());
        registry.bind("std", new File("target/stdindexDir"));
        registry.bind("load_dir", new File("src/test/resources/sources"));
        registry.bind("stdAnalyzer", new StandardAnalyzer(Version.LUCENE_CURRENT));
        return registry;
    }

I have also added a Query Processor that can run any query (including wildcards etc.) against a Lucene Document Index and presents the results in a schema-driven XML format (example provided below).

Example: Query Processor for Lucene called LuceneSearcher
-------------------------------------------------------------------------------------
    context.addRoutes(new RouteBuilder() {
        public void configure() {
            from("direct:start").
                setHeader("QUERY", constant("Rodney Dangerfield")).
                process(new LuceneSearcher("target/stdindexDir", analyzer, null, 20)).
                to("mock:searchResult");
        }
    });

Example: Search Results presentation Format
----------------------------------------------------------------
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <hits xmlns="http://camel.apache.org/lucene/SearchData">
        <numberOfHits>2</numberOfHits>
        <hit>
            <number>1</number>
            <hitLocation>15</hitLocation>
            <score>0.9453935</score>
            <data>I worked in a pet store and people kept asking how big I'd get. - Rodney Dangerfield</data>
        </hit>
        <hit>
            <number>2</number>
            <hitLocation>13</hitLocation>
            <score>0.8272193</score>
            <data>I tell ya when I was a kid, all I knew was rejection. My yo-yo, it never came back. - Rodney Dangerfield</data>
        </hit>
    </hits>

I used the latest version of Lucene (3.0) for the implementation, but this can be moved up easily over time since I have no hard restrictions on Lucene versions. The API sets could be different moving backwards, though. I have not verified this...
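As an aside on the search results format shown earlier in this comment: a consumer of that XML could read it with the JDK's own DOM parser. The following is only a minimal sketch, assuming the element names and namespace from the example output; the class name HitsReader and the method readHits are hypothetical and are not part of the attached patch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the patch: reads the <hits> XML shown above.
public class HitsReader {
    // Namespace taken from the example output in this comment.
    public static final String NS = "http://camel.apache.org/lucene/SearchData";

    // Returns one "score<TAB>data" string per <hit> element, in document order.
    public static List<String> readHits(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the namespace-based lookups below
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        List<String> rows = new ArrayList<>();
        NodeList hits = doc.getElementsByTagNameNS(NS, "hit");
        for (int i = 0; i < hits.getLength(); i++) {
            Element hit = (Element) hits.item(i);
            String score = hit.getElementsByTagNameNS(NS, "score").item(0).getTextContent();
            String data = hit.getElementsByTagNameNS(NS, "data").item(0).getTextContent();
            rows.add(score + "\t" + data);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample in the same shape as the example output.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
            + "<hits xmlns=\"" + NS + "\">"
            + "<numberOfHits>1</numberOfHits>"
            + "<hit><number>1</number><hitLocation>15</hitLocation>"
            + "<score>0.9453935</score><data>sample quote</data></hit>"
            + "</hits>";
        for (String row : readHits(xml)) {
            System.out.println(row);
        }
    }
}
```

In a route, the String passed to readHits would presumably be the message body that arrives at mock:searchResult, though that depends on how LuceneSearcher sets the body.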
Lucene has undergone a lot of change in each subsequent version, it seems :). The good news is that for the most part they offer backward compatibility for APIs.

Please find attached the patch as well as a zip file containing the code. Can you please review it and let me know what you think? I would be happy to update the documentation once I get your feedback, and am happy to make any needed changes.

Cheers,

Ashwin...

was (Author: akarpe):

Hi Claus, Jon & Hadrian,

I have created a new Apache Lucene Component & Query processor and have attached a patch along with a zip file containing the code for your review. I have also added the requisite unit tests and ensured that the code undergoes checkstyle validation.

The component works as follows:

<code>
context.addRoutes(new RouteBuilder() {
    public void configure() {
        from("direct:start").
            to("lucene://stdQuotesIndex?analyzerRef=#stdAnalyzer&indexDir=#std&srcDir=#load_dir").
            to("mock:result");
    }
});
</code>

where each URI parameter setting does the following:
- analyzerRef: can be any valid Lucene Analyzer implementation (StandardAnalyzer, WhitespaceAnalyzer, StopAnalyzer, etc.)
- srcDir: an optional directory location for loading Text or XML documents at endpoint or Lucene Index creation.

Since these settings cannot be passed directly into the URI, I pass them using the JNDI registry associated with the Default Component (example shown below).

<code>
@Override
protected JndiRegistry createRegistry() throws Exception {
    JndiRegistry registry = new JndiRegistry(createJndiContext());
    registry.bind("std", new File("target/stdindexDir"));
    registry.bind("load_dir", new File("src/test/resources/sources"));
    registry.bind("stdAnalyzer", new StandardAnalyzer(Version.LUCENE_CURRENT));
    return registry;
}
</code>

I used the latest version of Lucene (3.0) for the implementation, but this can be moved up easily over time since I have no hard restrictions on Lucene versions.
The API sets could be different moving backwards, though. I have not verified this... Lucene has undergone a lot of change in each subsequent version, it seems :).

Please find attached the patch as well as a zip file containing the code. Can you please review it and let me know what you think? I would be happy to update the documentation once I get your feedback, and am happy to make any needed changes.

Cheers,

Ashwin...

> Lucene Component
> ----------------
>
>                 Key: CAMEL-1472
>                 URL: https://issues.apache.org/activemq/browse/CAMEL-1472
>             Project: Apache Camel
>          Issue Type: New Feature
>            Reporter: Claus Ibsen
>            Assignee: Ashwin Karpe
>             Fix For: Future
>
>         Attachments: camel-lucene-20091227.patch, camel-lucene.zip
>
>
> We should add a new component for Apache Lucene integration

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.