Re: possible segment merge improvement?

Doron Cohen Thu, 01 Nov 2007 06:35:15 -0800

[EMAIL PROTECTED] wrote on 01/11/2007 16:10:27:

> > If we make this change to Lucene then for those apps that effectively
> > have a static field schema (because all docs always have matching
> > fields), we can get the same performance that KinoSearch always gets
> > during its merging of stored fields & term vectors.
>
> Does "all docs have matching fields" mean that the fields must be
> present (as well as identically typed) on each doc, or could they
> still be sparse?  If they can be sparse, how do you avoid
> renumbering???


Perhaps I interpreted this optimization proposal wrongly.

My understanding is that this is for stored fields data
in the field data (.fdt) file, where FieldNum might
need to be changed, in:

   DocFieldData --> FieldCount, <FieldNum, Bits, Value>^FieldCount

My reading of Robert's suggestion is that when we know that
the FieldInfos of the resulting segment is identical to the
FieldInfos of a certain (sub)segment being merged, then
there is no need to parse and rewrite the field data for all
docs of that (sub)segment; they can be written as is.
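
To make that reading concrete, here is a rough sketch of the decision.
It is not Lucene's actual merge code: the class and method names are
made up for illustration, and FieldInfos is modeled as a plain list of
field names in which a field's number is its index.

```java
import java.util.List;

// Hypothetical illustration only; not Lucene's SegmentMerger.
final class StoredFieldsMergeSketch {

    // Assumption: FieldInfos is modeled as an ordered list of names, so two
    // segments number their fields identically iff the lists are equal.
    static boolean canCopyRaw(List<String> subSegmentFields, List<String> mergedFields) {
        return subSegmentFields.equals(mergedFields);
    }

    // Slow path: translate each FieldNum of one doc from the sub-segment's
    // numbering to the merged segment's numbering, via the field name.
    static int[] remap(int[] docFieldNums, List<String> subSegmentFields, List<String> mergedFields) {
        int[] out = new int[docFieldNums.length];
        for (int i = 0; i < docFieldNums.length; i++) {
            String name = subSegmentFields.get(docFieldNums[i]);
            int merged = mergedFields.indexOf(name);
            if (merged < 0) {
                throw new IllegalStateException("field missing from merged FieldInfos: " + name);
            }
            out[i] = merged;
        }
        return out;
    }

    // Returns the FieldNums to write for one doc. On the fast path the input
    // array is returned untouched, standing in for "written as is".
    static int[] mergeDoc(int[] docFieldNums, List<String> subSegmentFields, List<String> mergedFields) {
        if (canCopyRaw(subSegmentFields, mergedFields)) {
            return docFieldNums;
        }
        return remap(docFieldNums, subSegmentFields, mergedFields);
    }
}
```

In this sketch the check is per segment, not per doc, so a doc that
stores only fields 0 and 2 still takes the fast path as long as the
two numberings agree. If that matches the proposal, sparse docs would
not force renumbering.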

Doron


