Re: How to access DocValues inside a customized collector?

2018-09-21 Thread Lisheng Zhang
Thanks very much Uwe and Mikhail!

Your points are all well taken. So far it seems to work well; I will
test more to verify the details.

Lisheng



RE: How to access DocValues inside a customized collector?

2018-09-21 Thread Uwe Schindler
Hi,

In general your approach is right, but you have to do it correctly. It depends 
on the Collector subclass you are using. The simplest option is to subclass 
SimpleCollector: 
https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/SimpleCollector.html

There you have to override two methods:

doSetNextReader(LeafReaderContext context): Here you call *once* 
context.reader().getBinaryDocValues(String field) and save the result in a 
private, non-final member field of the collector (for example 
"actReaderdocValues").

In collect(docId) you can then call actReaderdocValues.advanceExact(docId) and 
retrieve the value. As collect is always called "in order", it's safe to use 
advanceExact().

Important: Don't fetch a new docvalues instance on each call before calling 
advanceExact()! That is only needed for out-of-order access. In combination 
with a collector (like above) you get maximum performance, as everything is 
done per leaf reader and in order.
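
The pattern described above can be modeled without Lucene on the classpath.
What follows is a toy sketch in plain Java, not Lucene code. Segment,
ForwardOnlyValues, ValueCollector, setNextSegment and iteratorsOpened are
hypothetical stand-ins for the leaf reader, the BinaryDocValues iterator, the
SimpleCollector subclass and doSetNextReader. It only illustrates why one
iterator per segment plus in-order advanceExact() calls is enough. Throwing on
a backwards move is a choice made for this toy, not a claim about Lucene's
behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, not a Lucene class.
class DocValuesCollectorSketch {

    // Stand-in for one index segment: values[i] belongs to local doc i, null = no value.
    static final class Segment {
        final String[] values;
        int iteratorsOpened = 0; // counts how often an iterator was requested

        Segment(String... values) { this.values = values; }

        // Plays the role of context.reader().getBinaryDocValues(field).
        ForwardOnlyValues openValues() {
            iteratorsOpened++;
            return new ForwardOnlyValues(values);
        }
    }

    // Stand-in for a doc-values iterator that may only move forward.
    static final class ForwardOnlyValues {
        private final String[] values;
        private int current = -1;

        ForwardOnlyValues(String[] values) { this.values = values; }

        // Positions on target and reports whether that doc has a value.
        boolean advanceExact(int target) {
            if (target < current) {
                throw new IllegalStateException("iterator cannot move backwards");
            }
            current = target;
            return values[target] != null;
        }

        // Only meaningful after advanceExact returned true.
        String value() { return values[current]; }
    }

    static final class ValueCollector {
        final List<String> collected = new ArrayList<>();
        private ForwardOnlyValues segmentValues; // non-final: replaced for each segment

        // Plays the role of doSetNextReader: fetch the iterator once per segment.
        void setNextSegment(Segment segment) {
            segmentValues = segment.openValues();
        }

        // Plays the role of collect(docId): reuse the iterator, docs arrive in order.
        void collect(int doc) {
            if (segmentValues.advanceExact(doc)) {
                collected.add(segmentValues.value());
            }
        }
    }

    // Drives the collector the way a searcher would: segment by segment, hits in order.
    static List<String> run(List<Segment> segments, int[][] hitsPerSegment) {
        ValueCollector collector = new ValueCollector();
        for (int i = 0; i < segments.size(); i++) {
            collector.setNextSegment(segments.get(i));
            for (int doc : hitsPerSegment[i]) {
                collector.collect(doc);
            }
        }
        return collector.collected;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
            new Segment("a", null, "c"),
            new Segment("x", "y"));
        int[][] hits = { {0, 1, 2}, {1} };
        System.out.println(run(segments, hits));
    }
}
```

In this model the collector asks each segment for an iterator exactly once, and
a hit without a value is simply skipped because advanceExact() returns false.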

Uwe

-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Lisheng Zhang 
> Sent: Friday, September 21, 2018 3:23 AM
> To: java-user@lucene.apache.org
> Subject: How to access DocValues inside a customized collector?
> 
> We need to use binary DocValues (added during indexing) in a customized
> collector. I first tested with the standard TopScoreDocCollector, and it
> seems that we need to:
> 
> LeafReaderContext => reader() => get binary iterator => advance to the
> correct location
> 
> Is this the correct way, or is there a better API? (Since we are already
> at that docId, it seems to me that the binary DocValues should be readily
> available.)
> 
> Also, is there a way to look directly at the indexed data? (Luke seems
> obsolete, and Marple does not work with Lucene 7.4.0 yet.)
> 
> Thanks very much for the help, Lisheng


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: How to access DocValues inside a customized collector?

2018-09-21 Thread Mikhail Khludnev
I'm not sure why you are looking for something better; this is already the
best API.
You can check the sample usage in
.FastTaxonomyFacetCounts.countAll(IndexReader). Also note
FastTaxonomyFacetCounts.count(List), where the DV iterator is
driven by the enclosing intersection.
SolrDocumentFetcher.decodeDVField(int, LeafReader, String) does
exactly this as well.

On Fri, Sep 21, 2018 at 4:23 AM Lisheng Zhang  wrote:

> We need to use binary DocValues (added during indexing) in a customized
> collector. I first tested with the standard TopScoreDocCollector, and it
> seems that we need to:
>
> LeafReaderContext => reader() => get binary iterator => advance to the
> correct location
>
> Is this the correct way, or is there a better API? (Since we are already
> at that docId, it seems to me that the binary DocValues should be readily
> available.)
>
> Also, is there a way to look directly at the indexed data? (Luke seems
> obsolete, and Marple does not work with Lucene 7.4.0 yet.)
>
> Thanks very much for the help, Lisheng
>


-- 
Sincerely yours
Mikhail Khludnev


Re: How to access DocValues inside a customized collector?

2018-09-20 Thread Lisheng Zhang
Erick: Thanks very much for the quick help. The Luke you referred to worked
well (I found that the binary DocValues were indexed correctly).

However, I am still not sure how to access DocValues efficiently in a
collector.

" The Terms component directly accesses the indexed data and can be used
to poke around in the indexed data. "

Could you elaborate a little, or point me roughly to source code where
DocValues are accessed inside a collector (Lucene or Solr source code would
be fine)?

Thanks again for the help!



Re: How to access DocValues inside a customized collector?

2018-09-20 Thread Erick Erickson
Which Luke are you using? I think this one is being maintained:
https://github.com/DmitryKey/luke

The Terms component directly accesses the indexed data and can be used
to poke around in the indexed data.

I'll skip the question about accessing DocValues, as I have to go back and
look it up every time.

On Thu, Sep 20, 2018 at 6:23 PM Lisheng Zhang  wrote:
>
> We need to use binary DocValues (added during indexing) in a customized
> collector. I first tested with the standard TopScoreDocCollector, and it
> seems that we need to:
>
> LeafReaderContext => reader() => get binary iterator => advance to the
> correct location
>
> Is this the correct way, or is there a better API? (Since we are already
> at that docId, it seems to me that the binary DocValues should be readily
> available.)
>
> Also, is there a way to look directly at the indexed data? (Luke seems
> obsolete, and Marple does not work with Lucene 7.4.0 yet.)
>
> Thanks very much for the help, Lisheng
