Re: DocSet: BitDocSet or HashDocSet ?

2008-11-03 Thread Mike Klaas
On 28-Oct-08, at 5:36 AM, Jérôme Etévé wrote: Hi all, In my code, I'd like to keep a subset of my 14M docs, around 100k docs in size. What, in your opinion, is the best option in terms of speed and memory usage? My basic reasoning tells me the BitDocSet should be the fastest for

Re: DocSet: BitDocSet or HashDocSet ?

2008-10-29 Thread Chris Hostetter
: The doc of HashDocSet says it can be a better choice if there are few : docs in the set. What does 'few' mean in this context? It's relative to the total size of your index. If you have a million docs, but you are dealing with DocSets that are only going to contain 10 docs, then both the

DocSet: BitDocSet or HashDocSet ?

2008-10-28 Thread Jérôme Etévé
Hi all, In my code, I'd like to keep a subset of my 14M docs, around 100k docs in size. What, in your opinion, is the best option in terms of speed and memory usage? My basic reasoning tells me the BitDocSet should be the fastest for lookup, but takes ~ 14M * sizeof(int) in memory,

Re: DocSet: BitDocSet or HashDocSet ?

2008-10-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
BitDocSet does not take ~ 14M * sizeof(int) in memory; it takes at most 14M/8 bytes, ~= 1.75MB. On Tue, Oct 28, 2008 at 6:06 PM, Jérôme Etévé wrote: Hi all, In my code, I'd like to keep a subset of my 14M docs, around 100k docs in size. What is
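
For anyone checking the arithmetic in this thread, here is a rough back-of-envelope sketch in Python. It is not Solr code: the function names are made up for illustration, and the per-id figure assumes 4 bytes per stored doc id and ignores any hash-table overhead, so treat it as a lower bound for an id-based set rather than what HashDocSet actually allocates.

```python
# Back-of-envelope memory estimates only; not Solr's actual implementation.
# Function names and the 4-bytes-per-id figure are assumptions for illustration.

def bitset_bytes(max_doc: int) -> int:
    """One bit per document in the whole index, rounded up to whole bytes."""
    return (max_doc + 7) // 8


def int_ids_bytes(num_docs_in_set: int, bytes_per_int: int = 4) -> int:
    """Lower bound for storing each doc id in the set as an int."""
    return num_docs_in_set * bytes_per_int


max_doc = 14_000_000   # total docs in the index, from the original post
subset = 100_000       # docs in the set, from the original post

print("bit set over the index:", bitset_bytes(max_doc), "bytes")
print("ids for the 100k subset:", int_ids_bytes(subset), "bytes")
print("one int per index doc:", int_ids_bytes(max_doc), "bytes")
```

Under these assumptions the bit set costs 14M/8 bytes regardless of how many docs are in the set, while an id-based set grows with the set size, which is why "few" is relative to the index size.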