Below is the link to my branch which contains the changes related to the format plugin.
https://github.com/rchallapalli/drill/tree/lucene/contrib/format-lucene

Any thoughts on how to handle contributions like this which still have some work to be done?

- Rahul

On Mon, Aug 3, 2015 at 12:21 PM, rahul challapalli <[email protected]> wrote:

> Thanks Jason.
>
> I want to look at the solr plugin and see where we can collaborate or if
> we already duplicated part of the effort.
>
> I still need to push a few commits. I will share the code once I get these
> changes pushed.
>
> - Rahul
>
> On Mon, Aug 3, 2015 at 11:31 AM, Jason Altekruse <[email protected]>
> wrote:
>
>> Hey Rahul,
>>
>> This is really cool! Thanks for all of the time you put into writing this,
>> I think we have a lot of available opportunities to reach new communities
>> with efforts like this.
>>
>> I noticed last week another contributor opened a JIRA for a solr plugin,
>> there might be a good opportunity for the two of you to join efforts, as I
>> believe he likely started working on a lucene reader as part of his solr
>> work.
>>
>> Would you like to post a link to your work on Github or another public host
>> of your code?
>>
>> https://issues.apache.org/jira/browse/DRILL-3585
>>
>> On Mon, Aug 3, 2015 at 2:29 AM, Stefán Baxter <[email protected]>
>> wrote:
>>
>> > Hi,
>> >
>> > I'm pretty new around here but I just wanted to tell you how much your work
>> > can benefit us. This is great!
>> >
>> > Look forward to trying it out.
>> >
>> > Regards,
>> > -Stefán
>> >
>> > On Mon, Aug 3, 2015 at 8:38 AM, rahul challapalli <[email protected]> wrote:
>> >
>> > > Hello Drillers,
>> > >
>> > > I have been working on a lucene format plugin. In its current state, the
>> > > below sample query successfully searches a lucene index and returns the
>> > > results.
>> > >
>> > > select path from dfs_test.`/search-index` where contents='maxItemsPerBlock'
>> > > and contents = 'BlockTreeTermsIndex'
>> > >
>> > >
>> > > *High Level Overview of Current Implementation:*
>> > >
>> > > *Parallelization:* A lucene segment is the lowest level of
>> > > parallelization.
>> > > *Filter Pushdown:* Currently the format plugin is designed to push the
>> > > complete filter into the scan.
>> > > *Filter Evaluation:* Each condition in the filter is treated as a lucene
>> > > TermQuery
>> > > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/TermQuery.html>
>> > > and multiple conditions are joined using a BooleanQuery
>> > > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/BooleanQuery.html>.
>> > > If we *do not* use a TermQuery, then we have to know the exact type of
>> > > Analyzer
>> > > <https://lucene.apache.org/core/5_2_1/core/org/apache/lucene/analysis/Analyzer.html>
>> > > to use with each field in the query.
>> > > Ex: 'contents' field might have been analyzed using a StandardAnalyzer
>> > > <https://lucene.apache.org/core/5_2_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html>
>> > > and the 'path' field might not have been analyzed at all.
>> > > If desired, support for raw lucene queries with a reserved word should be
>> > > easy to add.
>> > > Ex: select * from dfs.`search-index` where searchQuery =
>> > > "+contents:maxItemsPerBlock +path:/home/file.txt";
>> > > *Converting SqlFilter to Lucene Query:* Currently only "=" and "!="
>> > > operators are handled while converting a sql filter into a lucene query.
>> > > For indexed fields this might be sufficient to handle a good number of
>> > > cases. For non-indexed fields, operators like ">", "<", "like", etc. need
>> > > to be handled.
>> > > *FileSystems:* Currently the format plugin only works on a local
>> > > filesystem.
>> > >
>> > >
>> > > Though far from complete, I want to work with the community to get some
>> > > feedback and avoid any chance of duplication of work. Kindly let me know
>> > > your thoughts
>> > >
>> > > - Rahul
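
For readers following the thread, here is a rough sketch of the filter conversion described in the original message. It is not the plugin's code. The plugin is described as building TermQuery and BooleanQuery objects, whereas this sketch only produces the equivalent text in Lucene's classic query syntax, the same form as the raw searchQuery example. The names `to_lucene_query`, `escape_term` and `LUCENE_SPECIAL` are made up for illustration, and the conditions are assumed to be already ANDed together.

```python
# Hypothetical sketch: every name here is illustrative, not taken from the plugin.

# Characters the classic Lucene query syntax treats as special.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(value):
    """Backslash-escape special characters so the value is read as a literal term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in value)


def to_lucene_query(conditions):
    """Turn ANDed (field, op, value) conditions into a Lucene query string.

    "="  becomes a required clause   (+field:value, like Occur.MUST)
    "!=" becomes a prohibited clause (-field:value, like Occur.MUST_NOT)
    Any other operator is rejected, mirroring the "=" / "!=" limitation
    mentioned in the message.
    """
    clauses = []
    for field, op, value in conditions:
        if op == "=":
            prefix = "+"
        elif op == "!=":
            prefix = "-"
        else:
            raise ValueError(f"operator not supported in this sketch: {op}")
        clauses.append(f"{prefix}{field}:{escape_term(value)}")
    return " ".join(clauses)


# The WHERE clause of the sample query from the message.
query = to_lucene_query([
    ("contents", "=", "maxItemsPerBlock"),
    ("contents", "=", "BlockTreeTermsIndex"),
])
print(query)  # +contents:maxItemsPerBlock +contents:BlockTreeTermsIndex
```

Because each value is passed through as a single literal term, no Analyzer is involved, which is the trade-off the message describes for TermQuery: the value has to match the indexed term exactly.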
