Re: Lucene Format Plugin

Jason Altekruse Mon, 03 Aug 2015 11:32:11 -0700

Hey Rahul,

This is really cool! Thanks for all of the time you put into writing this.
I think efforts like this give us a lot of opportunities to reach new
communities.


I noticed that last week another contributor opened a JIRA for a Solr
plugin (linked below). There might be a good opportunity for the two of you
to join efforts, as I believe he has likely started working on a Lucene
reader as part of his Solr work.

Would you like to post a link to your work on GitHub or another public code
host?

https://issues.apache.org/jira/browse/DRILL-3585

On Mon, Aug 3, 2015 at 2:29 AM, Stefán Baxter <[email protected]>
wrote:

> Hi,
>
> I'm pretty new around here but I just wanted to tell you how much your work
> can benefit us. This is great!
>
> Look forward to trying it out.
>
> Regards,
>  -Stefán
>
> On Mon, Aug 3, 2015 at 8:38 AM, rahul challapalli <
> [email protected]> wrote:
>
> > Hello Drillers,
> >
> > I have been working on a lucene format plugin. In its current state, the
> > below sample query successfully searches a lucene index and returns the
> > results.
> >
> > select path from dfs_test.`/search-index`
> > where contents = 'maxItemsPerBlock'
> > and contents = 'BlockTreeTermsIndex'
> >
> >
> >
> > *High Level Overview of Current Implementation:*
> >
> > *Parallelization:* A lucene segment is the lowest level of
> > parallelization.
> > *Filter Pushdown:* Currently the format plugin is designed to push the
> > complete filter into the scan.
> > *Filter Evaluation:* Each condition in the filter is treated as a lucene
> > TermQuery
> > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/TermQuery.html>
> > and multiple conditions are joined using a BooleanQuery
> > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/BooleanQuery.html>.
> > If we *do not* use a TermQuery, then we have to know the exact type of
> > Analyzer
> > <https://lucene.apache.org/core/5_2_1/core/org/apache/lucene/analysis/Analyzer.html>
> > to use with each field in the query.
> >     Ex: 'contents' field might have been analyzed using a StandardAnalyzer
> > <https://lucene.apache.org/core/5_2_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html>
> > and the 'path' field might not have been analyzed at all.
> > If desired, support for raw lucene queries with a reserved word should be
> > easy to add.
> >     Ex: select * from dfs.`search-index` where searchQuery =
> >     "+contents:maxItemsPerBlock +path:/home/file.txt";
> > *Converting SqlFilter to Lucene Query:* Currently only the "=" and "!="
> > operators are handled while converting a sql filter into a lucene query.
> > For indexed fields this might be sufficient to handle a good number of
> > cases. For non-indexed fields, operators such as ">", "<" and "like"
> > still need to be handled.
> > *FileSystems:* Currently the format plugin only works on a local
> > filesystem.
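
A minimal sketch of the "=" / "!=" conversion described above, in plain Java
with no Lucene dependency. It renders an AND-ed filter in the classic Lucene
query-parser syntax that the raw query example uses, where "+" marks a
required clause and "-" a prohibited one. The class name, the Condition type
and the method names are hypothetical and are not taken from the plugin,
which builds TermQuery and BooleanQuery objects rather than strings. The
backslash escaping of special characters is also an assumption of this
sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the plugin's actual code.
class FilterToLuceneSketch {

    // One SQL comparison, e.g. contents = 'maxItemsPerBlock'.
    static final class Condition {
        final String field;
        final String op;    // only "=" and "!=" are supported here
        final String value;

        Condition(String field, String op, String value) {
            this.field = field;
            this.op = op;
            this.value = value;
        }
    }

    // Characters the classic Lucene query parser treats as syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    // Backslash-escape every special character in a term.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // "=" becomes a required clause (+), "!=" a prohibited clause (-).
    // Any other operator is rejected, mirroring the current limitation.
    static String toQueryString(List<Condition> conditions) {
        List<String> clauses = new ArrayList<>();
        for (Condition c : conditions) {
            String prefix;
            if ("=".equals(c.op)) {
                prefix = "+";
            } else if ("!=".equals(c.op)) {
                prefix = "-";
            } else {
                throw new UnsupportedOperationException(
                        "operator not handled: " + c.op);
            }
            clauses.add(prefix + c.field + ":" + escape(c.value));
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        // The two conditions from the sample query in this thread.
        List<Condition> filter = Arrays.asList(
                new Condition("contents", "=", "maxItemsPerBlock"),
                new Condition("contents", "=", "BlockTreeTermsIndex"));
        System.out.println(toQueryString(filter));
    }
}
```

One thing to keep in mind with "!=": Lucene returns no hits for a query made
up only of prohibited clauses, so a filter containing nothing but "!="
conditions would need an additional match-all clause.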
> >
> >
> > Though this is far from complete, I want to work with the community to
> > get some feedback and avoid any duplication of work. Kindly let me know
> > your thoughts.
> >
> > - Rahul
> >
>
