Hi,
We use payloads, but we can't use the whole Lucene API.
For example, we use them for relation queries such as:
@quote(@speaker(obama) @discourse(health))
which searches for all documents that contain a quote by Obama talking about
health.
We encode linguistic information (standoff annotations) inside payloads
and use a custom search API to query the index.
I didn't find a convenient way to plug my code into Lucene's
Query/Scorer/Weight API. As with SpanQuery, you have to rewrite the whole
query stack.
In short, if you want payloads to do more than boost a term, chances are
that you'll need to rewrite a big part of the query stack.
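To give an idea of what such a relation query has to compute, here is a
minimal sketch in plain Java. It is not our actual code and does not use any
Lucene class: the Annotation type, its fields, and the containment rule (an
inner annotation must lie within the quote's span) are assumptions made only
for illustration. It takes the annotations of one document, as they might
look once decoded from payloads, and evaluates
@quote(@speaker(x) @discourse(y)).

```java
import java.util.ArrayList;
import java.util.List;

public final class RelationQuerySketch {

    // Hypothetical standoff annotation: a typed, valued span of token positions.
    public static final class Annotation {
        public final String type;
        public final String value;
        public final int start; // first token position, inclusive
        public final int end;   // last token position, inclusive

        public Annotation(String type, String value, int start, int end) {
            this.type = type;
            this.value = value;
            this.start = start;
            this.end = end;
        }

        // True if the other annotation's span lies entirely within this one.
        boolean contains(Annotation other) {
            return start <= other.start && other.end <= end;
        }
    }

    // True if 'outer' contains an annotation with the given type and value.
    static boolean hasInside(List<Annotation> all, Annotation outer,
                             String type, String value) {
        for (Annotation a : all) {
            if (a != outer && a.type.equals(type) && a.value.equals(value)
                    && outer.contains(a)) {
                return true;
            }
        }
        return false;
    }

    // Evaluates @quote(@speaker(speaker) @discourse(topic)) over one document
    // and returns the matching quote annotations.
    public static List<Annotation> quotesBy(List<Annotation> all,
                                            String speaker, String topic) {
        List<Annotation> hits = new ArrayList<>();
        for (Annotation q : all) {
            if (q.type.equals("quote")
                    && hasInside(all, q, "speaker", speaker)
                    && hasInside(all, q, "discourse", topic)) {
                hits.add(q);
            }
        }
        return hits;
    }
}
```

A document matches when quotesBy(annotations, "obama", "health") is not
empty. The hard part in practice is not this logic but driving it from the
postings and hooking it into scoring, which is where the stock
Query/Scorer/Weight classes stop helping.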
On 27/11/2012 16:59, Wu, Stephen T., Ph.D. wrote:
I think we're looking at doing something related. I haven't explored the
Enums and don't know how to make a postings codec... But what is "flexible
indexing" in Lucene 4.0 if it's not the ability to make new postings codecs?
We're trying to incorporate attributes onto terms/spans in indexes. We'd
also like to try out some interesting ways to score things that go beyond
just tokens.
We were considering using Attributes instead of Payloads, because it seems
like using Payloads ties you to a particular kind of scoring -- just a
weight on a token. Can Payloads be used for more general scoring functions?
E.g., considering a span of text alongside multiple Payloads?
Does it make sense to move outside of Payloads here?
Thanks!
stephen
On 11/19/12 8:14 AM, "Michael McCandless" <[email protected]> wrote:
A new postings format would be tricky because you have new attributes
you want to index.
The DocsAndPositionsEnum does have an attributes source, but this is
not well explored, and there are known problems (they can't be easily
merged in the composite reader case).
So that's why I suggested packing your information into a payload ...
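As a rough illustration of packing per-position information into a payload,
here is a minimal sketch in plain Java. The PostingPayload class is
hypothetical and not part of Lucene; its fields simply mirror the ones listed
in the question, and the fixed 36-byte layout is an assumption chosen for
simplicity. In Lucene 4.0 a payload is handed around as a BytesRef (bytes,
offset, length), which is why decode takes those three arguments.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not a Lucene class: packs the per-position fields
// from the question into one byte[] that could be stored as a payload.
public final class PostingPayload {
    // 4 (termId) + 8 (startOffset) + 8 (endOffset)
    // + 4 (score) + 4 (segId) + 8 (timeStamp) bytes
    public static final int SIZE = 36;

    public final int termId;
    public final long startOffset;
    public final long endOffset;
    public final float score;
    public final int segId;
    public final long timeStamp;

    public PostingPayload(int termId, long startOffset, long endOffset,
                          float score, int segId, long timeStamp) {
        this.termId = termId;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.score = score;
        this.segId = segId;
        this.timeStamp = timeStamp;
    }

    // Serializes the fields in a fixed order (big-endian, ByteBuffer default).
    public byte[] encode() {
        return ByteBuffer.allocate(SIZE)
                .putInt(termId)
                .putLong(startOffset)
                .putLong(endOffset)
                .putFloat(score)
                .putInt(segId)
                .putLong(timeStamp)
                .array();
    }

    // Reads the fields back in the same order from a slice of a byte array.
    public static PostingPayload decode(byte[] bytes, int offset, int length) {
        if (length != SIZE) {
            throw new IllegalArgumentException("unexpected payload length: " + length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        return new PostingPayload(buf.getInt(), buf.getLong(), buf.getLong(),
                buf.getFloat(), buf.getInt(), buf.getLong());
    }
}
```

At index time the array returned by encode() would be attached to the token
as its payload, and at search time the bytes read back for each position
would go through decode(). A fixed layout like this costs 36 bytes per
position, so a real implementation would probably want variable-length
encoding or would drop fields that Lucene already stores.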
Mike McCandless
http://blog.mikemccandless.com
On Sun, Nov 18, 2012 at 8:33 PM, wgggfiy <[email protected]> wrote:
thx, mike.
about the 3rd question, is "encode them all into the payload" better than
"a new postings format with the codec"?
I mean replacing the original posting item (position, startOffset, endOffset,
payload) with my own inverted item such as
class TestPostingItem
{
    int termId;
    long startOffset;
    long endOffset;
    float score;
    int segId;
    long timeStamp;
}
?
--
View this message in context:
http://lucene.472066.n3.nabble.com/what-is-the-offsets-and-payload-in-DocsAndPositionsEnum-for-tp4020933p4020968.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
--
David Causse
Spotter
http://www.spotter.com/