On Fri, Jan 19, 2018 at 19:05:06 -0500, Emilio G. Cota wrote: > > > > On Fri, 12 Jan 2018 19:32:10 +0800 > > > > Antonios Motakis <antonios.mota...@huawei.com> wrote: > > > Since inodes are not completely random, and we usually have a handful of > > > device IDs, > > > we get a much smaller number of entries to track in the hash table. > > > > > > So what this would give: > > > (1) Would be faster and take less memory than mapping the full > > > inode_nr,devi_id > > > tuple to unique QID paths > > > (2) Guaranteed not to run out of bits when inode numbers stay below > > > the lowest > > > 54 bits and we have less than 1024 devices. > > > (3) When we get beyond this this limit, there is a chance we run > > > out of bits to > > > allocate new QID paths, but we can detect this and refuse to serve the > > > offending > > > files instead of allowing a collision. > > > > > > We could tweak the prefix size to match the scenarios that we consider > > > more likely, > > > but I think close to 10-16 bits sounds reasonable enough. What do you > > > think? > > Assuming assumption (2) is very likely to be true, I'd suggest > dropping the intermediate hash table altogether, and simply refuse > to work with any files that do not meet (2). > > That said, the naive solution of having a large hash table with all entries > in it might be worth a shot.
hmm but that would still take a lot of memory. Given assumption (2), a good compromise would be the following, taking into account that the number of total gids is unlikely to reach even close to 2**64: - bit 63: 0/1 determines "fast" or "slow" encoding - 62-0: - fast (trivial) encoding: when assumption (2) is met - 62-53: device id (it fits because of (2)) - 52-0: inode (it fits because of (2)) - slow path: assumption (2) isn't met. Then, assign incremental IDs in the [0,2**63-1] range and track them in a hash table. Choosing 10 or whatever else bits for the device id is of course TBD, as Antonios you pointed out. Something like this will give you great performance and 0 memory overhead for the majority of cases if (2) indeed holds. Emilio