On Wed, Mar 25, 2009 at 12:39 PM, Marvin Humphrey
<[email protected]> wrote:
>> Eg, you will store new deletions against segment X with segment Y
>> (when X's new deletions got flushed at the same time that segment Y
>> was flushed). So, where will segment X's new delCount be recorded?
>
> In Segment Y's metadata.
>
> {
>    "deletions" : {
>       "files" : {
>          "seg_1" : {
>             "count" : "2",
>             "filename" : "seg_3/deletions-seg_1.bv"
>          },
>          "seg_2" : {
>             "count" : "1",
>             "filename" : "seg_3/deletions-seg_2.bv"
>          }
>       },
>       "format" : "1"
>    },
>    "segmeta" : {
>       "doc_count" : "0",
>       "field_names" : [
>          "",
>          "content"
>       ],
>       "format" : "1",
>       "name" : "seg_3"
>    }
> }
>
>> Also, what happens if I open a writer, do only deletes, and close? Do
>> you flush an empty (no added docs) segment Y simply to record the new
>> deletions?
>
> Yes. Note the "doc_count" of 0 under segmeta's key in the previous JSON
> sample.
OK
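
To check my understanding, here is a rough Python sketch of how a reader
might recover each segment's delCount from that metadata. The function name
latest_deletions is made up for illustration, and the rule that a newer
segment's deletions entry supersedes an older one for the same target is my
assumption, not something stated above.

```python
import json

# Metadata for seg_3, copied from the JSON sample above (values are strings there).
SEG_3_META = json.loads("""
{
  "deletions": {
    "files": {
      "seg_1": {"count": "2", "filename": "seg_3/deletions-seg_1.bv"},
      "seg_2": {"count": "1", "filename": "seg_3/deletions-seg_2.bv"}
    },
    "format": "1"
  },
  "segmeta": {
    "doc_count": "0",
    "field_names": ["", "content"],
    "format": "1",
    "name": "seg_3"
  }
}
""")


def latest_deletions(segment_metas):
    """Map each target segment name to (count, filename).

    Hypothetical helper. Assumes segment_metas is ordered oldest to newest
    and that a newer entry replaces an older one for the same target.
    """
    latest = {}
    for meta in segment_metas:
        files = meta.get("deletions", {}).get("files", {})
        for target, entry in files.items():
            # Counts are stored as strings in the sample, so convert them.
            latest[target] = (int(entry["count"]), entry["filename"])
    return latest


dels = latest_deletions([SEG_3_META])
```

With only seg_3's metadata as input, dels maps seg_1 and seg_2 to the counts
and filenames recorded under seg_3, even though seg_3 itself holds no docs.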
>> Will snapshot allow user-defined (opaque to Lucy) metadata to be recorded
>> inside it?
>
> Yes. ("User" in this case means the developer, not end-user, though an
> irresponsible dev could forward end-user data.) Custom DataWriter
> implementations are encouraged to store their metadata within the segmeta
> file, rather than write it themselves to custom files.
Yes, I've struggled to find the right term for "the developer using
Lucene/KS/Lucy". We normally call them users, but then we have to
differentiate them from the "end users".
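
On storing custom metadata within the segmeta file: going by the JSON sample,
where "deletions" sits next to "segmeta" as its own top-level key, I would
guess a custom writer adds a key of its own in the same way. A rough Python
sketch under that assumption follows. The helper add_component_metadata and
the "my_writer" key are hypothetical, not part of Lucy.

```python
import json


def add_component_metadata(segment_meta, key, metadata):
    """Return a copy of segment_meta with `metadata` stored under `key`.

    Hypothetical helper. Refuses to overwrite another component's entry.
    """
    if key in segment_meta:
        raise KeyError(f"metadata key already in use: {key}")
    merged = dict(segment_meta)  # shallow copy, the input is left unchanged
    merged[key] = metadata
    return merged


# Trimmed-down segment metadata, following the shape of the sample above.
base = {"segmeta": {"doc_count": "0", "format": "1", "name": "seg_3"}}

# "my_writer" stands in for whatever key a custom DataWriter would choose.
with_custom = add_component_metadata(
    base, "my_writer", {"format": "1", "note": "example"}
)

# Serialize everything into one JSON document, as one file per segment.
serialized = json.dumps(with_custom, indent=2, sort_keys=True)
```

Keeping each component under its own key means a reader that does not know
about "my_writer" can simply ignore it.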
Mike