Hello,

--- Paul Elschot <[EMAIL PROTECTED]> wrote:
> On Tuesday 01 November 2005 08:51, Otis Gospodnetic wrote:
> > Hello,
> >
> > I spent most of today talking to some people about Lucene, and one of
> > them said how they would really like to have an "instantaneous index
> > merge", and how he is thinking he could achieve that by simply opening
> > segments file of one index, and adding segment names of the other
> > index/indices, plus adjusting the segment size (SegSize in
> > fileformats.html), thus creating a single (but unoptimized) index.
> >
> > Any reactions to that?
> >
> > I imagine this isn't quite that simple to implement, as one would have
> > to renumber all documents, in order to avoid having multiple documents
> > with the same document id.
> >
> > Can anyone think of any other problems with this approach, or perhaps
> > offer ideas for possible document renumbering?
>
> Document numbers within segments are determined dynamically in the
> index reader, so these should not be a problem. Each segment simply
> numbers its documents from zero.

Uh, and I always thought they were stored in the index. Aren't they
stored in the .fdx and .fdt files? And shouldn't they also be linked
from some place? I see a mention of document numbers in information
about the .frq.

> Iirc the segment names determine the order
> of the segments for an index reader.
>
> I think creating a new index by adding segments from an existing one
> should be fairly straightforward. Some care will be needed to avoid
> clashes in the segment names.

You mean ensuring that segment _x from index A doesn't clash with _x
from index B? Segment names are written only in the segments file, I
believe, so I think if I detect that _x is already taken, I could
simply rename it to something (e.g. _foo) that hasn't been taken yet,
and remember to use that segment name when writing the segments file.

> Also what should happen with
> the index from which the segments are taken? Should the shared
> segments be copied between indexes?
I can simply destroy the original index once I've created a fakely
merged one. I'm not sure what you mean by shared segments. If I have
two indices, A and B, then each of them will have its own set of
segments with no segments in common.

> It's possible to share segments between indexes when the file system
> allows files to be present in multiple directories.

Oh, are you saying that I could just leave segments where they are and
use something like symlinks to point to them from a new index? e.g.

A:
  <index files for A>
B:
  <index files for B>
C:
  <symlinks to index files for A>
  <symlinks to index files for B>
  <segments file with segment names for A and B>

?

Thanks,
Otis
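P.S. To make the symlink layout above a bit more concrete, here is a
rough Python sketch of the linking and renaming step only. It is not
Lucene code. It assumes that every per-segment file is named
<segment>.<extension> (e.g. _x.fdt, _x.frq) and that the segment name
is recorded nowhere except the segments file, which is my belief above
and not something I have verified. The function name link_segments and
the _m<N> replacement names are made up for illustration, and the
sketch deliberately does not write the segments file for C; that would
still have to be produced separately from the returned names.

```python
import os


def link_segments(index_dirs, segments_by_index, target_dir):
    """Symlink the per-segment files of several indexes into target_dir.

    index_dirs: index directories, in the order their segments should appear.
    segments_by_index: dict mapping each directory to its segment names.
    Returns the segment names as they appear in target_dir, in order.

    Hypothetical helper: assumes files are named "<segment>.<ext>" and
    that renaming the files is enough to rename a segment.
    """
    os.makedirs(target_dir, exist_ok=True)
    taken = set()   # segment names already used in target_dir
    merged = []     # final segment names, in order
    counter = 0     # source of made-up replacement names

    for src_dir in index_dirs:
        for seg in segments_by_index[src_dir]:
            new_name = seg
            # On a clash, pick a made-up name that is not taken yet.
            while new_name in taken:
                new_name = "_m%d" % counter
                counter += 1
            taken.add(new_name)

            for fname in sorted(os.listdir(src_dir)):
                base, dot, ext = fname.partition(".")
                # Skip files that do not belong to this segment, such as
                # the index-wide "segments" file.
                if not dot or base != seg:
                    continue
                source = os.path.abspath(os.path.join(src_dir, fname))
                link = os.path.join(target_dir, new_name + "." + ext)
                os.symlink(source, link)

            merged.append(new_name)

    return merged
```

With A holding segment _1 and B holding segments _1 and _2, the
returned list would carry _1 for A's segment, a made-up name for B's
clashing _1, and _2 unchanged. Those are the names a segments file for
C would then need to list.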
