Hi Grant,
I certainly agree that it would be great if we could make some progress
and commit the payloads patch soon. I think it is quite independent from
FI. FI will introduce different posting formats (see Wiki:
http://wiki.apache.org/lucene-java/FlexibleIndexing). Payloads will be
part of some of those formats, but not all (i.e. per-position payloads
only make sense if positions are stored).
The only concern some people had was about the API the patch introduces.
It extends Token and TermPositions. Doug's argument was that if we
introduce new APIs now but want to change them with FI, then it will be
hard to support those APIs. I think that is a valid point, but at the
same time it slows down progress to have to plan ahead in too many
directions. That's why I'd vote for marking the new APIs as experimental
so that people can try them out at their own risk.
If we could agree on that approach then I'd go ahead and submit an
updated payloads patch in the next few days that applies cleanly to the
current trunk and contains the additional warnings in the javadocs.
With regard to FI and 662, however, I really believe we should split it up
and plan ahead (in a way I mentioned already), so that we have more
isolated patches. It is really great that we have 662 already (Nicolas,
thank you so much for your hard work, I hope you'll keep working with us
on FI!!). We'll probably use some of that code, and it will definitely
be helpful.
Michael
Grant Ingersoll wrote:
Hi Michael,
This is very good. I know 662 is different, just wasn't sure if
Nicolas' patch was meant to be applied after 662, b/c I know we had
discussed this before.
I do agree with you about planning this out, but I also know that
patches seem to motivate people the best and provide a certain
concreteness to it all. I mostly started asking questions on these
two issues b/c I wanted to spur some more discussion and see if we can
get people motivated to move on it.
I was hoping that I would be able to apply each patch to two different
checkouts so I could start seeing where the overlap is and how they
could fit together (I also admit I was procrastinating on my ApacheCon
talk...). In the new, flexible world, the payloads implementation
could be a separate implementation of the indexing or it could be part
of the core/existing file format implementation. Sometimes I just
need to get my hands on the code to get a real feel for what I feel is
the best way to do it.
I agree about the XML storage for Index information. We do that in
our in-house wrapper around Lucene, storing info about the language,
analyzer used, etc. We may also want a binary index-level storage
capability. I know most people just create a single document usually
to store binary info about the index, but a binary storage might be
good too.
Part of me says to apply the Payloads patch now, as it provides a lot
of bang for the buck and I think the FI is going to take a lot longer
to hash out. However, I know that it may pin us in or force us to
change things for FI. Ultimately, I would love to see both these
features for the next release, but that isn't a requirement. Also, on
FI, I would love to see two different implementations of whatever API
we choose before releasing it, as I always find two implementations of
an Interface really work out the API details.
-Grant