Hi,

We use payloads, but doing so means we can't use the whole Lucene API.
For example, we use them to run relation queries such as:

@quote(@speaker(obama) @discourse(health))

This searches for all documents that contain a quote by Obama talking about health. We encode linguistic information (standoff annotations) inside payloads and use a custom search API to query the index. I didn't find a convenient way to attach my code to the Lucene Query/Scorer/Weight API. As with SpanQuery, you have to rewrite the whole query stack. In short, if you want payloads to do more than boost a term, there's a good chance you'll need to rewrite a large part of the query stack.
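
To make the semantics of the query above concrete, here is a toy in-memory sketch. It is not our actual search code and does not use the Lucene API. The `Annotation` class, the `matchesQuote` method, and the assumption that the speaker and discourse annotations sit inside the quote span are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RelationQuerySketch {

    /** Hypothetical standoff annotation: a typed, valued span over token positions. */
    static final class Annotation {
        final String type;   // e.g. "quote", "speaker", "discourse"
        final String value;  // e.g. "obama", "health"; empty for a plain "quote"
        final int start;     // first token position, inclusive
        final int end;       // last token position, inclusive

        Annotation(String type, String value, int start, int end) {
            this.type = type;
            this.value = value;
            this.start = start;
            this.end = end;
        }

        /** True if the other span lies entirely within this one. */
        boolean contains(Annotation other) {
            return start <= other.start && other.end <= end;
        }
    }

    /**
     * True if some "quote" span contains both a matching "speaker"
     * annotation and a matching "discourse" annotation.
     */
    static boolean matchesQuote(List<Annotation> doc, String speaker, String topic) {
        for (Annotation quote : doc) {
            if (!quote.type.equals("quote")) {
                continue;
            }
            boolean hasSpeaker = false;
            boolean hasTopic = false;
            for (Annotation a : doc) {
                if (!quote.contains(a)) {
                    continue;
                }
                if (a.type.equals("speaker") && a.value.equals(speaker)) {
                    hasSpeaker = true;
                }
                if (a.type.equals("discourse") && a.value.equals(topic)) {
                    hasTopic = true;
                }
            }
            if (hasSpeaker && hasTopic) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Annotation> doc = new ArrayList<>();
        doc.add(new Annotation("quote", "", 10, 30));
        doc.add(new Annotation("speaker", "obama", 10, 11));
        doc.add(new Annotation("discourse", "health", 20, 22));
        System.out.println(matchesQuote(doc, "obama", "health"));
    }
}
```

In a real index the annotations would have to be decoded from the payloads while walking the postings, which is the part that does not plug easily into Query/Scorer/Weight.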


On 27/11/2012 16:59, Wu, Stephen T., Ph.D. wrote:
I think we're looking at doing something related.  I haven't explored the
Enums and don't know how to make a postings codec... But what is "flexible
indexing" in Lucene 4.0 if it's not the ability to make new postings codecs?

We're trying to incorporate attributes onto terms/spans in indexes.  We'd
also like to try out some interesting ways to score things that go beyond
just tokens.

We were considering using Attributes instead of Payloads, because it seems
like using Payloads ties you to a particular kind of scoring -- just a
weight on a token.  Can Payloads be used for more general scoring functions?
E.g., considering a span of text alongside multiple Payloads?

Does it make sense to move outside of Payloads here?

Thanks!

stephen




On 11/19/12 8:14 AM, "Michael McCandless" <luc...@mikemccandless.com> wrote:

A new postings format would be tricky because you have new attributes
you want to index.

The DocsAndPositionsEnum does have an attributes source, but this is
not well explored, and there are known problems (they can't be easily
merged in the composite reader case).

So that's why I suggested packing your information into a payload ...
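
A minimal sketch of that packing, assuming the fields of the TestPostingItem class quoted further down in this thread. The `PostingItemPayload` and `Item` names and the fixed big-endian layout are illustrative choices, not anything Lucene prescribes. The code uses only `java.nio` so it runs on its own.

```java
import java.nio.ByteBuffer;

public class PostingItemPayload {

    /** Mirrors the fields of the TestPostingItem class quoted in this thread. */
    static final class Item {
        final int termId;
        final long startOffset;
        final long endOffset;
        final float score;
        final int segId;
        final long timeStamp;

        Item(int termId, long startOffset, long endOffset,
             float score, int segId, long timeStamp) {
            this.termId = termId;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.score = score;
            this.segId = segId;
            this.timeStamp = timeStamp;
        }
    }

    /** Fixed layout: int + long + long + float + int + long = 36 bytes. */
    static final int BYTES = 4 + 8 + 8 + 4 + 4 + 8;

    /** Packs all fields into one byte array (big-endian, ByteBuffer's default). */
    static byte[] encode(Item item) {
        return ByteBuffer.allocate(BYTES)
                .putInt(item.termId)
                .putLong(item.startOffset)
                .putLong(item.endOffset)
                .putFloat(item.score)
                .putInt(item.segId)
                .putLong(item.timeStamp)
                .array();
    }

    /** Reads the fields back from a slice of a byte array, in the order they were written. */
    static Item decode(byte[] bytes, int offset, int length) {
        if (length < BYTES) {
            throw new IllegalArgumentException("payload too short: " + length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        return new Item(buf.getInt(), buf.getLong(), buf.getLong(),
                buf.getFloat(), buf.getInt(), buf.getLong());
    }

    public static void main(String[] args) {
        byte[] payload = encode(new Item(7, 100L, 105L, 0.5f, 3, 1353254400000L));
        Item back = decode(payload, 0, payload.length);
        System.out.println(payload.length + " bytes, termId=" + back.termId);
    }
}
```

As far as I recall, in Lucene 4.0 a payload is a BytesRef, so at index time the encoded array would be wrapped in a BytesRef and set on the token's PayloadAttribute, and at search time the (bytes, offset, length) triple of the BytesRef returned by DocsAndPositionsEnum.getPayload() would be passed to a decoder like the one above. Those calls are left out of the sketch to keep it free of dependencies.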

Mike McCandless

http://blog.mikemccandless.com

On Sun, Nov 18, 2012 at 8:33 PM, wgggfiy <wuqiu....@qq.com> wrote:
Thanks, Mike.
About the 3rd question: is "encode them all into the payload" better than
"a new postings format with the codec"?
I mean replacing the original posting item (position, startOffset, endOffset,
payload) with my own inverted item, such as
class TestPostingItem
{
    int termId;
    long startOffset;
    long endOffset;
    float score;
    int segId;
    long timeStamp;
}
?




--
View this message in context:
http://lucene.472066.n3.nabble.com/what-is-the-offsets-and-payload-in-DocsAndPositionsEnum-for-tp4020933p4020968.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org






--
David Causse
Spotter
http://www.spotter.com/

