On Wednesday 20 December 2006 20:42, Michael Busch wrote:
> Doug Cutting wrote:
> > Michael,
> >
> > This sounds like very good work. The back-compatibility of this
> > approach is great. But we should also consider this in the broader
> > context of index-format flexibility.
> >
> > Three general approaches have been proposed. They are not exclusive.
> >
> > 1. Make the index format extensible by adding user-implementable
> > reader and writer interfaces for postings.
> >
> > 2. Add a richer set of standard index formats, including things like
> > compressed fields, no-positions, per-position weights, etc.
> >
> > 3. Provide hooks for including arbitrary binary data.
> >
> > Your proposal is of type (3). LUCENE-662 is a (1). Approaches of
> > type (2) are most friendly to non-Java implementations, since the
> > semantics of each variation are well-defined.
> >
> > I don't see a reason not to pursue all three, but in a coordinated
> > manner. In particular, we don't want to add a feature of type (3)
> > that would make it harder to add type (1) APIs. It would thus be best
> > if we had a rough specification of type (1) and type (2). A proposal
> > of type (2) is at:
> >
> > http://wiki.apache.org/jakarta-lucene/FlexibleIndexing
> >
> > But I'm not sure that we yet have any proposed designs for an
> > extensible posting API. (Is anyone aware of one?) This payload
> > proposal can probably be easily incorporated into such a design, but I
> > would have more confidence if we had one. I guess I should attempt one!
>
> Doug,
>
> thanks for your detailed response. I'm aware that the long-term goal is
> the flexible index format and I see the payloads patch only as a part of
> it. The patch focuses on extending the index data structures and on a
> possible payload encoding. It doesn't focus yet on a flexible API, it
> only offers the two mentioned low-level methods to add and retrieve byte
> arrays.
>
> I would love to work with you guys on the flexible index format and to
> combine my patch with your suggestions and the patch from Nicolas! I
> will look at your proposal and Nicolas' patch tomorrow (have to go now).
> I just attached my patch (LUCENE-755), so if you get a chance you could
> take a look at it.

I have just looked at it. It looks great :)
But I still don't understand why a new entry in the fieldinfo is needed.
The same is done for TermVector, and code like the following fails for no
obvious reason:

    Document doc = new Document();
    doc.add(new Field("f1", "v1", Store.YES, Index.TOKENIZED,
        TermVector.WITH_POSITIONS_OFFSETS));
    doc.add(new Field("f1", "v2", Store.YES, Index.TOKENIZED,
        TermVector.NO));

    RAMDirectory ram = new RAMDirectory();
    IndexWriter writer = new IndexWriter(ram, new StandardAnalyzer(), true);
    writer.addDocument(doc);
    writer.close();

Knowing a little bit about how Lucene works, I have an idea why this
fails, but can we avoid it?

Nicolas

-- 
Nicolas LALEVÉE
Solutions & Technologies
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com