ah :) "with 3TB of ram (we have these running), int64 for >2^32 documents in a single index should not be a problem"
Maybe I am reasoning badly, but normally the size of the storage is not the size of the memory. I don't know Lucene in depth, but I would expect a Lucene index to be scanned block by block, not held entirely in memory. For this reason, in a previous post I mentioned the possibility of using an iterator instead of an array: an array loads all the results into memory, whereas an iterator loads a single document (or a fixed number of them) at each step. If you call loadAll(), there is a problem with memory.

2016-08-19 15:39 GMT+02:00, Glen Newton <glen.new...@gmail.com>:
> Making docid an int64 is a non-trivial undertaking, and this work needs to
> be compared against the use cases and how compelling they are.
>
> That said, in the lifetime of most software projects a decision is made to
> break backward compatibility to move the project forward.
> When/if moving to int64 happens, it will be one of these moments. It is not
> a Bad Thing (necessarily). :-)
>
> And for use cases, if I am running a commercial JVM on a 64 core machine
> with 3TB of ram (we have these running), int64 for >2^32 documents in a
> single index should not be a problem... :-)
>
> glen
>
> On Fri, Aug 19, 2016 at 4:43 AM, Adrien Grand <jpou...@gmail.com> wrote:
>
>> On Fri, Aug 19, 2016 at 03:32, Trejkaz <trej...@trypticon.org> wrote:
>>
>> > But hang on:
>> > * TopDocs#merge still returns a TopDocs.
>> > * TopDocs still uses an array of ScoreDoc.
>> > * ScoreDoc still uses an int doc ID.
>> >
>>
>> This is why ScoreDoc has a `shardId` so that you can know which index a
>> document comes from.
>>
>> I'm not saying we should not switch to long doc ids, but as outlined in
>> some other responses it would be a challenging change.
>>
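To make the array-versus-iterator point in my reply concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class BatchedHits and the methods loadAll and batches are hypothetical names used only for illustration, and the doc IDs are simple counters standing in for real hits. It only shows that a single array cannot hold more than about 2^31 entries, while an iterator handing out fixed-size batches keeps memory bounded by the batch size even with 64-bit doc IDs.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; none of these names exist in Lucene. */
class BatchedHits {

    /** Eager style: every hit is materialized in one array (assumed helper). */
    static long[] loadAll(long totalHits) {
        // A Java array is indexed by int, so it cannot hold more than ~2^31 entries.
        if (totalHits > Integer.MAX_VALUE - 8) {
            throw new IllegalArgumentException("too many hits for one array: " + totalHits);
        }
        long[] docIds = new long[(int) totalHits];
        for (int i = 0; i < docIds.length; i++) {
            docIds[i] = i; // stand-in for a real doc ID
        }
        return docIds;
    }

    /** Lazy style: hands out batches of at most batchSize doc IDs at a time. */
    static Iterator<long[]> batches(long totalHits, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return new Iterator<long[]>() {
            private long cursor = 0; // next doc ID to hand out

            @Override
            public boolean hasNext() {
                return cursor < totalHits;
            }

            @Override
            public long[] next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int size = (int) Math.min(batchSize, totalHits - cursor);
                long[] batch = new long[size];
                for (int i = 0; i < size; i++) {
                    batch[i] = cursor++;
                }
                return batch;
            }
        };
    }

    public static void main(String[] args) {
        long totalHits = 5_000_000_000L; // more than 2^32 documents
        Iterator<long[]> it = batches(totalHits, 1024);
        long[] first = it.next(); // only 1024 longs are allocated here
        System.out.println("first batch holds " + first.length + " doc IDs");
    }
}
```

With this shape, the caller decides how many documents are in memory at once; calling loadAll with the same 5 billion hits fails immediately instead.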