[ 
https://issues.apache.org/jira/browse/LUCENE-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Lalevée updated LUCENE-662:
-----------------------------------

    Attachment: indexFormat.patch
                indexFormat-only.patch

Synchronized with the trunk, and therefore with the payload feature. This let 
me refactor the payload writing, which is in two places today, into a single 
class: it is now in the DefaultPostingWriter class.

Since my last update, nothing on the TODO list has been fixed. Furthermore, 
there is a regression in the new patch: ensureOpen() is not correctly handled 
for lazily loaded fields, and a test fails. This is because the FieldsReader 
no longer handles it in my patch. As the data structure can be customized, 
lazy loading is delegated to the FieldData created by the FieldsReader, so the 
two instances have to communicate about the closing of the streams. That is a 
new item on the TODO list.
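
To make the regression concrete, here is a minimal sketch of one way the two 
instances could communicate: the lazy field data keeps a reference to the 
reader that created it and asks it to check that it is still open before 
touching the stream. This is not the code of the patch. SketchFieldsReader, 
LazyFieldData, lazyField(), readString() and encode() are hypothetical names, 
a ByteBuffer stands in for the index stream, and IllegalStateException stands 
in for the exception Lucene throws on a closed reader.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the FieldsReader: it owns the stream and is the
// only object that knows whether the stream has been closed.
class SketchFieldsReader {
    private final ByteBuffer stored;
    private boolean closed = false;

    SketchFieldsReader(byte[] stored) {
        this.stored = ByteBuffer.wrap(stored);
    }

    // Helper for the sketch: a value is stored as a 4-byte length plus UTF-8 bytes.
    static byte[] encode(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + bytes.length).putInt(bytes.length).put(bytes).array();
    }

    void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("this FieldsReader is closed");
        }
    }

    // Creates the lazy field data without reading the value yet.
    LazyFieldData lazyField(int offset) {
        ensureOpen();
        return new LazyFieldData(this, offset);
    }

    // Reads the value stored at the given offset; refuses to read once closed.
    String readString(int offset) {
        ensureOpen();
        ByteBuffer view = stored.duplicate();
        view.position(offset);
        byte[] bytes = new byte[view.getInt()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    void close() {
        closed = true;
    }
}

// Hypothetical lazy field data: the actual read is deferred to the first call
// of stringValue(), and the owning reader decides whether it is still allowed.
class LazyFieldData {
    private final SketchFieldsReader owner;
    private final int offset;
    private String value; // cached after the first successful load

    LazyFieldData(SketchFieldsReader owner, int offset) {
        this.owner = owner;
        this.offset = offset;
    }

    String stringValue() {
        if (value == null) {
            value = owner.readString(offset);
        }
        return value;
    }
}
```

In this sketch a value that was loaded before close() stays readable, while a 
field that was never loaded fails with the reader's own exception.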

As discussed on java-dev, here is a lighter patch with only the index format 
handling, without the possibility to redefine how data and postings are 
stored/retrieved.


> Extendable writer and reader of field data
> ------------------------------------------
>
>                 Key: LUCENE-662
>                 URL: https://issues.apache.org/jira/browse/LUCENE-662
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Nicolas Lalevée
>            Priority: Minor
>         Attachments: entrytable.patch, generic-fieldIO-2.patch, 
> generic-fieldIO-3.patch, generic-fieldIO-4.patch, generic-fieldIO-5.patch, 
> generic-fieldIO.patch, indexFormat-only.patch, indexFormat.patch, 
> indexFormat.patch, indexFormat.patch
>
>
> As discussed on the dev mailing list, I have modified Lucene to allow 
> defining how the data of a field is written to and read from the index.
> Basically, I have introduced the notion of IndexFormat. It is in fact a 
> factory of FieldsWriter and FieldsReader. So the IndexReader, the IndexWriter 
> and the SegmentMerger use this factory instead of doing a "new 
> FieldsReader/Writer()".
> I have also introduced the notion of FieldData. It handles all the data of a 
> field, as well as writing it to and reading it from a stream. I have done it 
> this way because in the current design of Lucene, Fieldable is an interface, 
> so methods with protected or package visibility cannot be defined.
> A FieldsWriter just writes data into a stream via the FieldData of the field.
> A FieldsReader instantiates a FieldData depending on the field name. Then it 
> uses the field data to read the stream. And finally it instantiates a Field 
> with the field data.
> About compatibility, I think it is kept, as I have written a 
> DefaultIndexFormat that provides a DefaultFieldsWriter and a 
> DefaultFieldsReader. These implementations do exactly the job that is done 
> today.
> To achieve this modification, some classes and methods had to be changed from 
> private and/or final to public or protected.
> About the lazy fields, I have implemented them in a more general way in the 
> implementation of the abstract class FieldData, so it will be totally 
> transparent for the Lucene user who extends FieldData. The stream is 
> kept in the FieldData and used as soon as stringValue() (or something else) 
> is called. Implementing it this way allowed me to handle the recently 
> introduced LOAD_FOR_MERGE; it is just a lazy field data, and when read() is 
> called on this lazy field data, the saved input stream is directly copied to 
> the output stream.
> I have one last issue with this patch. The current design allows reading an 
> index in an old format and just doing a writer.addIndexes() into a new format. 
> With the new design, you cannot, because the writer will use the 
> FieldData.write provided by the reader.
> enjoy !

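The design described in the quoted issue can be sketched roughly as follows. 
This is only an illustration of the factory idea, not the patch itself: the 
class names IndexFormat, FieldData, FieldsWriter, FieldsReader and 
DefaultIndexFormat come from the description, but every method signature, the 
StringFieldData class and the use of a ByteBuffer in place of the index 
streams are assumptions made to keep the example self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Holds the data of one field and knows how to write and read itself.
abstract class FieldData {
    abstract void write(ByteBuffer out);
    abstract void read(ByteBuffer in);
    abstract String stringValue();
}

// Assumed default representation: a 4-byte length followed by UTF-8 bytes.
class StringFieldData extends FieldData {
    private String value;

    StringFieldData() {
    }

    StringFieldData(String value) {
        this.value = value;
    }

    void write(ByteBuffer out) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.putInt(bytes.length).put(bytes);
    }

    void read(ByteBuffer in) {
        byte[] bytes = new byte[in.getInt()];
        in.get(bytes);
        value = new String(bytes, StandardCharsets.UTF_8);
    }

    String stringValue() {
        return value;
    }
}

// The writer only delegates to the FieldData of the field.
class FieldsWriter {
    private final ByteBuffer out;

    FieldsWriter(ByteBuffer out) {
        this.out = out;
    }

    void addField(FieldData data) {
        data.write(out);
    }
}

// The reader picks a FieldData for the field name, then lets it read the stream.
abstract class FieldsReader {
    private final ByteBuffer in;

    FieldsReader(ByteBuffer in) {
        this.in = in;
    }

    protected abstract FieldData createFieldData(String fieldName);

    FieldData readField(String fieldName) {
        FieldData data = createFieldData(fieldName);
        data.read(in);
        return data;
    }
}

// The factory used in place of "new FieldsReader/Writer()".
interface IndexFormat {
    FieldsWriter createFieldsWriter(ByteBuffer out);
    FieldsReader createFieldsReader(ByteBuffer in);
}

class DefaultIndexFormat implements IndexFormat {
    public FieldsWriter createFieldsWriter(ByteBuffer out) {
        return new FieldsWriter(out);
    }

    public FieldsReader createFieldsReader(ByteBuffer in) {
        return new FieldsReader(in) {
            protected FieldData createFieldData(String fieldName) {
                return new StringFieldData(); // same type for every field name
            }
        };
    }
}
```

A custom format would implement IndexFormat and return a reader whose 
createFieldData() chooses a different FieldData subclass per field name.
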
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

