Hi, I'm pretty new around here, but I just wanted to tell you how much your work can benefit us. This is great!
Look forward to trying it out.

Regards,
-Stefán

On Mon, Aug 3, 2015 at 8:38 AM, rahul challapalli <[email protected]> wrote:

> Hello Drillers,
>
> I have been working on a lucene format plugin. In its current state, the
> below sample query successfully searches a lucene index and returns the
> results.
>
> select path from dfs_test.`/search-index` where contents='maxItemsPerBlock'
> and contents = 'BlockTreeTermsIndex'
>
> *High Level Overview of Current Implementation:*
>
> *Parallelization:* A lucene segment is the lowest level of
> parallelization.
> *Filter Pushdown:* Currently the format plugin is designed to push the
> complete filter into the scan.
> *Filter Evaluation:* Each condition in the filter is treated as a lucene
> TermQuery
> <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/TermQuery.html>
> and multiple conditions are joined using a BooleanQuery
> <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/BooleanQuery.html>.
> If we *do not* use a TermQuery, then we have to know the exact type of
> Analyzer
> <https://lucene.apache.org/core/5_2_1/core/org/apache/lucene/analysis/Analyzer.html>
> to use with each field in the query.
> Ex: 'contents' field might have been analyzed using a StandardAnalyzer
> <https://lucene.apache.org/core/5_2_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html>
> and the 'path' field might not have been analyzed at all.
> If desired, support for raw lucene queries with a reserved word should be
> easy to add.
> Ex: select * from dfs.`search-index` where searchQuery =
> "+contents:maxItemsPerBlock +path:/home/file.txt";
> *Converting SqlFilter to Lucene Query:* Currently only "=" and "!="
> operators are handled while converting a sql filter into a lucene query.
> For indexed fields this might be sufficient to handle a good number of
> cases. For non-indexed fields, operators like ">", "<", "like", etc. need
> to be handled.
> *FileSystems:* Currently the format plugin only works on a local
> filesystem.
>
> Though far from complete, I want to work with the community to get some
> feedback and avoid any chance of duplication of work. Kindly let me know
> your thoughts
>
> - Rahul
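
For anyone following the thread, here is a minimal sketch of the filter conversion described in the quoted message. It is not the plugin's code and does not call Lucene; it only builds a raw query string of the kind shown in the searchQuery example. The assumptions are mine: each "=" condition becomes a required ("+") clause, each "!=" condition becomes a prohibited ("-") clause, and values need no escaping. The name `to_lucene_query` and the (field, operator, value) tuple layout are hypothetical, and Python is used only for brevity.

```python
from typing import Iterable, Tuple

# Assumed mapping from the two SQL operators the message says are handled
# to Lucene classic query-syntax prefixes: "+" = must match, "-" = must not.
_PREFIX = {"=": "+", "!=": "-"}


def to_lucene_query(conditions: Iterable[Tuple[str, str, str]]) -> str:
    """Join (field, operator, value) conditions into one raw query string.

    Hypothetical helper for illustration only. Values are assumed to
    contain no characters that need escaping in Lucene query syntax.
    """
    clauses = []
    for field, op, value in conditions:
        if op not in _PREFIX:
            # Operators such as ">", "<" and "like" are reported as not yet handled.
            raise ValueError(f"unsupported operator: {op!r}")
        clauses.append(f"{_PREFIX[op]}{field}:{value}")
    return " ".join(clauses)


if __name__ == "__main__":
    # The two conditions from the sample SQL query in the quoted message.
    print(to_lucene_query([
        ("contents", "=", "maxItemsPerBlock"),
        ("contents", "=", "BlockTreeTermsIndex"),
    ]))
```

Rejecting unknown operators with an error, rather than silently dropping them, matches the stated design of pushing the complete filter into the scan: a condition that cannot be converted should not be lost.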
