OK, so the most straightforward way to do that would be to change the signature 
to positions(boolean needsPayloads, boolean needsOffsets), I guess.  This is a 
new API so it's not breaking anything.  
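
As a rough sketch of what that might look like (every class name below is made up for illustration and is not a real Lucene type; only the positions(boolean needsPayloads, boolean needsOffsets) signature comes from the suggestion above). The idea is that a composite scorer just hands the two flags down to its sub-scorers, so each leaf can pull only the data it needs:

```java
// Hypothetical sketch only: these types stand in for the branch's Scorer and
// position iterator. They are NOT the real Lucene classes.
final class PositionsSketch {

    /** Stand-in for a position iterator; it just records what was requested. */
    static final class PositionIterator {
        final boolean payloads;
        final boolean offsets;

        PositionIterator(boolean payloads, boolean offsets) {
            this.payloads = payloads;
            this.offsets = offsets;
        }
    }

    /** Stand-in for Scorer, carrying the proposed two-flag signature. */
    interface Scorer {
        PositionIterator positions(boolean needsPayloads, boolean needsOffsets);
    }

    /** Leaf scorer: a real one would open its postings with only the requested data. */
    static final class SketchTermScorer implements Scorer {
        @Override
        public PositionIterator positions(boolean needsPayloads, boolean needsOffsets) {
            return new PositionIterator(needsPayloads, needsOffsets);
        }
    }

    /** Composite scorer: passes the flags to every sub-scorer unchanged. */
    static final class SketchConjunctionScorer implements Scorer {
        private final Scorer[] subScorers;

        SketchConjunctionScorer(Scorer... subScorers) {
            this.subScorers = subScorers;
        }

        @Override
        public PositionIterator positions(boolean needsPayloads, boolean needsOffsets) {
            boolean payloads = false;
            boolean offsets = false;
            for (Scorer sub : subScorers) {
                // Positions are still pulled lazily; the flags only say what to pull.
                PositionIterator it = sub.positions(needsPayloads, needsOffsets);
                payloads |= it.payloads;
                offsets |= it.offsets;
            }
            return new PositionIterator(payloads, offsets);
        }
    }

    public static void main(String[] args) {
        Scorer scorer = new SketchConjunctionScorer(new SketchTermScorer(), new SketchTermScorer());
        PositionIterator it = scorer.positions(true, false);
        System.out.println("payloads=" + it.payloads + " offsets=" + it.offsets);
    }
}
```

Two booleans keep the change small; if more options turn up later, a flags argument might be tidier, but that is a separate decision.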

It'll be tomorrow morning before I have a proper go at this now (Cambridge Beer 
Festival tonight…).  Is the mailing list the best place to discuss this, or is 
JIRA/IRC better?

On 23 May 2012, at 13:43, Simon Willnauer wrote:

> hey alan,
> 
> I added position iterator support to ConjunctionTermScorer and
> committed it to the branch. All tests that don't rely on payloads are
> passing in core. Previously we had to decide up front whether we needed
> positions; the current code can pull them lazily, which causes fewer
> changes to the Scorer API. I think we should keep it that way. The only
> problem is that we currently have no way to tell the iterators
> whether we need payloads or not. The same is true for offsets since
> they are now in the index. I think it would be good if you could
> tackle the payloads first and pass some info to the Scorer#positions()
> method so we can pull the right thing.
> 
> happy coding.
> 
> simon
> 
> On Wed, May 23, 2012 at 1:23 PM, Alan Woodward
> <[email protected]> wrote:
>> Sweet, thanks Simon.  I'll have a go at getting some failing tests passing 
>> to begin with.
>> 
>> On 23 May 2012, at 11:59, Simon Willnauer wrote:
>> 
>>> alan,
>>> 
>>> I merged the branch manually and created a new branch from it. It's
>>> here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
>>> The branch compiles, but there are lots of nocommits / todos.
>>> 
>>> if you have questions please ask I will help as much as I can
>>> 
>>> simon
>>> 
>>> On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
>>> <[email protected]> wrote:
>>>> Hey, I reckon I can have a decent go at getting the branch updated.  Is it 
>>>> best to work this out as a patch applying to trunk?  Any patch that merges 
>>>> in all the trunk changes to the branch is going to be absolutely massive…
>>>> 
>>>> On 17 May 2012, at 13:15, Simon Willnauer wrote:
>>>> 
>>>>> ok man. I will try to merge up the branch. I tell you this is going to
>>>>> be messy and it might not compile but I will make it reasonable so you
>>>>> can start.
>>>>> 
>>>>> simon
>>>>> 
>>>>> On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
>>>>> <[email protected]> wrote:
>>>>>> Sorry for vanishing for so long, life unexpectedly caught up with me...  
>>>>>> I'm going to have some time to look at this again next week though, if 
>>>>>> you're interested in picking it up again.
>>>>>> 
>>>>>> On 21 Mar 2012, at 09:02, Alan Woodward wrote:
>>>>>> 
>>>>>>> That would be great, thanks!  I had a go at merging it last night, but 
>>>>>>> there are a *lot* of changes that I haven't got my head round yet, so 
>>>>>>> it was getting pretty messy.
>>>>>>> 
>>>>>>> On 21 Mar 2012, at 08:49, Simon Willnauer wrote:
>>>>>>> 
>>>>>>>> Alan, if you want I can just merge the branch up next week and we
>>>>>>>> iterate from there?
>>>>>>>> 
>>>>>>>> simon
>>>>>>>> 
>>>>>>>> On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> Yep, the first challenge is always getting the old patch(es) to 
>>>>>>>>> apply.....
>>>>>>>>> 
>>>>>>>>> On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> Thanks for all the offers of help!  It looks as though most of the 
>>>>>>>>>> hard work has already been done, which is exactly where I like to 
>>>>>>>>>> pick up projects.  :-)
>>>>>>>>>> 
>>>>>>>>>> Maybe the best place to start would be for me to rebase the branch 
>>>>>>>>>> against trunk, and see what still fits?  I think there have been 
>>>>>>>>>> some fairly major changes in the internals since July last year.
>>>>>>>>>> 
>>>>>>>>>> On 19 Mar 2012, at 17:07, Mike Sokolov wrote:
>>>>>>>>>> 
>>>>>>>>>>> I posted a patch with a Collector somewhat similar to what you 
>>>>>>>>>>> described, Alan - it's attached to one of the sub-issues 
>>>>>>>>>>> https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a 
>>>>>>>>>>> fairly complete "alpha" state, but has seen no production use of 
>>>>>>>>>>> course, since it relies on the remainder of the unfinished work in 
>>>>>>>>>>> that branch.  It works by creating a TokenStream based on match 
>>>>>>>>>>> positions returned from the query and passing that to the existing 
>>>>>>>>>>> Highlighter.  Please feel free to get in touch if you decide to 
>>>>>>>>>>> look into that and have questions.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -Mike
>>>>>>>>>>> 
>>>>>>>>>>> On 03/19/2012 11:51 AM, Simon Willnauer wrote:
>>>>>>>>>>>> On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindler <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Have you marked that for GSOC? Would be a good idea!
>>>>>>>>>>>>> 
>>>>>>>>>>>> yes I did
>>>>>>>>>>>> 
>>>>>>>>>>>>> -----
>>>>>>>>>>>>> Uwe Schindler
>>>>>>>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>>>>>>>>>>>> http://www.thetaphi.de
>>>>>>>>>>>>> eMail: [email protected]
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Simon Willnauer [mailto:[email protected]]
>>>>>>>>>>>>>> Sent: Monday, March 19, 2012 4:43 PM
>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>> Subject: Re: Using term offsets for hit highlighting
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Alan, you made my day!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The branch is kind of outdated but I looked at it lately and I can
>>>>>>>>>>>>>> certainly help to get it up to speed. The feature in that branch is
>>>>>>>>>>>>>> quite a big one and it's in a very early stage. Still I want to
>>>>>>>>>>>>>> encourage you to take a look and work on it. I promise all my help
>>>>>>>>>>>>>> with the issues!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> let me know if you have questions!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> simon
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
>>>>>>>>>>>>>> <[email protected]>  wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Cool, thanks Robert.  I'll take a look at the JIRA ticket.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 19 Mar 2012, at 14:44, Robert Muir wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
>>>>>>>>>>>>>>>> <[email protected]>  wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> The project I'm currently working on requires the reporting of exact
>>>>>>>>>>>>>>>>> hit positions from some pretty hairy queries, not all of which are
>>>>>>>>>>>>>>>>> covered by the existing highlighter modules.  I'm working round this
>>>>>>>>>>>>>>>>> by translating everything into SpanQueries, and using the getSpans()
>>>>>>>>>>>>>>>>> method to locate hits (I've extended the Spans interface to make
>>>>>>>>>>>>>>>>> term offsets available - see
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LUCENE-3826).  This works for
>>>>>>>>>>>>>>>>> our use-case, but isn't terribly efficient, and obviously isn't
>>>>>>>>>>>>>>>>> applicable to non-Span queries.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I've seen a bit of chatter on the list about using term offsets to
>>>>>>>>>>>>>>>>> provide accurate highlighting in Lucene.  I'm going to have a couple
>>>>>>>>>>>>>>>>> of weeks free in April, and I thought I might have a go at
>>>>>>>>>>>>>>>>> implementing this.  Mainly I'm wondering if there's already been
>>>>>>>>>>>>>>>>> thoughts about how to do it.  My current thoughts are to somehow
>>>>>>>>>>>>>>>>> extend the Weight and Scorer interface to make term offsets
>>>>>>>>>>>>>>>>> available; to get highlights for a given set of documents, you'd
>>>>>>>>>>>>>>>>> essentially run the query again, with a filter on just the documents
>>>>>>>>>>>>>>>>> you want highlighted, and have a custom collector that gets the term
>>>>>>>>>>>>>>>>> offsets in place of the scores.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Alan, Simon started some initial work on
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LUCENE-2878
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Some work and prototypes were done in a branch, but it might be
>>>>>>>>>>>>>>>> lagging behind trunk a bit.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Additionally at the time it was first done, I think we didn't yet
>>>>>>>>>>>>>>>> support offsets in the postings lists.
>>>>>>>>>>>>>>>> We've since added this and several codecs support it.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> lucidimagination.com
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>> To unsubscribe, e-mail: [email protected] For
>>>>>>>>>>>>>>>> additional commands, e-mail: [email protected]