Fair enough, Vikas,

As I said, I was also proposing a Bloom filter.
At that stage I gave up, mainly because of a synchronization issue at startup.
Since the cache initialization runs in a separate thread, I had to take care
to handle that situation properly and keep the Bloom filter correctly updated.
With this new approach of limiting the number of vanity path entries, I think
the Bloom filter can come back into the solution :)
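
To make the synchronization point concrete, here is a minimal sketch of a
filter that the cache initialization thread could fill while lookups are
already running. It is only an illustration under assumptions: the class
name VanityPathBloomFilter, the sizes and the FNV-1a double-hashing scheme
are hypothetical, and this is not the BloomFilterUtils attached to
SLING-3290. It uses several hash positions per entry and guards the bit
set with plain synchronized methods.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical sketch of a thread-safe Bloom filter for vanity paths. */
final class VanityPathBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    VanityPathBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Called by the cache initialization thread for every vanity path found. */
    synchronized void add(String path) {
        byte[] data = path.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // odd step for double hashing
        for (int i = 0; i < numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    /** false means "definitely not a vanity path"; true means "maybe". */
    synchronized boolean mightContain(String path) {
        byte[] data = path.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false;
            }
        }
        return true;
    }

    /** 32-bit FNV-1a with a caller-supplied seed (an assumption of this sketch). */
    private static int fnv1a(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }
}
```

One caveat: while the initialization thread is still running, a "false"
answer only means the path has not been added yet, so lookups would have
to fall back to the normal resolution until the filter is complete.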

Thanks for the reminder.

regards

antonio

On Dec 5, 2014, at 6:02 PM, Vikas Saurabh <vikas.saur...@gmail.com> wrote:

>> We already took a Bloom filter into consideration for a related issue [2].
>> We decided that it is still not optimal, since it leads to content
>> duplication, and I would like to avoid that for now.
>> 
>> [2] https://issues.apache.org/jira/browse/SLING-3290
>> 
> 
> Well, imho, bloom filters won't duplicate content -- they'd just have
> bit-masks to tentatively mark existence of a value. Moreover, if we
> use guava's implementation (which I think sling doesn't want to do...
> if I am reading SLING-3290 correctly), then we can serialize them on
> clean shutdown to have practically no work done during startup. For
> crashes, we can probably live with re-creating the filter again.
> 
> About the BloomFilterUtils attached in SLING-3290, I think it's just
> using 1 hash function to create the mask. In general, a bloom filter
> implementation would use more hash functions, configured to give
> fewer false positives.
> 
> About caching actual data in RAM (and assuming sling would sit on top
> of Oak??) -- shouldn't caching of the most used nodes be a responsibility
> of the repository implementation? But that's probably a different
> discussion.
> 
> Thanks,
> Vikas
