Greets,

Lucene, KinoSearch, and Ferret all use 32-bit document numbers. For present practical index sizes, that's enough.
Since we're going to optimize for 64-bit architectures, though, I think we ought to look forward and define document numbers as 64-bit signed integers. That way, we won't have to worry about changing things down the road to meet the needs of growing search clusters.

Memory space and disk space are concerns, but I think we get around most of that by guaranteeing that no individual segment can contain more than I32_MAX docs. That way, per-segment structures like document deletion maps can stay as arrays of i32_t.

Marvin Humphrey
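
A rough sketch of how that split might look in C, just to make the idea concrete. Every name here (Segment, seg_global_doc, seg_local_doc, seg_is_deleted) is made up for illustration, and it assumes things that aren't settled: doc numbers start at 0, each segment records the index-wide number of its first doc, and the deletion map is a sorted array of segment-local numbers. The standard int32_t and INT32_MAX stand in for i32_t and I32_MAX.

```c
#include <stdint.h>

/* Hypothetical per-segment cap; stands in for I32_MAX. */
#define SEG_MAX_DOCS INT32_MAX

/* Hypothetical segment record. Only doc_base needs 64 bits;
 * everything addressed within the segment stays 32-bit. */
typedef struct {
    int64_t        doc_base;     /* index-wide number of the segment's first doc */
    int32_t        doc_count;    /* docs in this segment, at most SEG_MAX_DOCS */
    const int32_t *deleted;      /* sorted segment-local numbers of deleted docs */
    int32_t        num_deleted;  /* length of the deleted array */
} Segment;

/* Segment-local (32-bit) doc number -> index-wide (64-bit) doc number. */
static int64_t
seg_global_doc(const Segment *seg, int32_t local_doc)
{
    return seg->doc_base + (int64_t)local_doc;
}

/* Index-wide doc number -> segment-local doc number.
 * Returns -1 if the doc does not live in this segment. */
static int32_t
seg_local_doc(const Segment *seg, int64_t global_doc)
{
    if (global_doc < seg->doc_base) {
        return -1;
    }
    int64_t offset = global_doc - seg->doc_base;
    if (offset >= (int64_t)seg->doc_count) {
        return -1;
    }
    return (int32_t)offset;  /* safe: offset < doc_count <= SEG_MAX_DOCS */
}

/* Binary search over the sorted 32-bit deletion array.
 * Returns 1 if local_doc is marked deleted, 0 otherwise. */
static int
seg_is_deleted(const Segment *seg, int32_t local_doc)
{
    int32_t lo = 0;
    int32_t hi = seg->num_deleted - 1;
    while (lo <= hi) {
        int32_t mid = lo + (hi - lo) / 2;
        if (seg->deleted[mid] == local_doc) {
            return 1;
        }
        if (seg->deleted[mid] < local_doc) {
            lo = mid + 1;
        }
        else {
            hi = mid - 1;
        }
    }
    return 0;
}
```

The point of the layout is that the 64-bit cost is paid once per segment (doc_base) rather than once per document, so the deletion array costs 4 bytes per entry no matter how large the index as a whole gets.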
