[EMAIL PROTECTED] wrote:
Hi Gal,
I'm curious about the memory consumption of the cache and the speed of
retrieval of an item from the cache, when the cache has 100k domains in
it.
Slightly off-topic, but I hope this is relevant to the original reason
for creating this plugin...
There is a BSD-licensed library, based on finite automata, that
implements a large subset of regexps. It is reported to be scalable
and very fast (the benchmarks are certainly impressive):
http://www.brics.dk/~amoeller/automaton/
I suggest running some tests with 100k regexps to see if it survives.
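
For the memory and lookup-speed question, a rough baseline can be
measured with something like the sketch below. It assumes the cache
behaves like a plain hash set of domain strings, which may not match
what the plugin actually does. The class name, method names and the
synthetic "hostN.example.com" domains are all made up for illustration,
and it does not exercise the automaton library at all.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical probe, not part of Nutch or of the plugin under discussion.
public class DomainCacheProbe {

    // Build a set of n distinct synthetic domain names (assumed cache shape).
    public static Set<String> buildCache(int n) {
        Set<String> cache = new HashSet<>(n * 2);
        for (int i = 0; i < n; i++) {
            cache.add("host" + i + ".example.com");
        }
        return cache;
    }

    // Rough used-heap reading; System.gc() is only a hint to the JVM,
    // so treat the result as an estimate.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Run `probes` lookups and count the hits. Even-numbered probes use a
    // key of the form stored by buildCache; odd-numbered probes always miss.
    public static int countHits(Set<String> cache, int probes) {
        int hits = 0;
        for (int i = 0; i < probes; i++) {
            String key = (i % 2 == 0)
                    ? "host" + i + ".example.com"
                    : "missing" + i + ".example.org";
            if (cache.contains(key)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int n = 100_000;

        long before = usedHeapBytes();
        Set<String> cache = buildCache(n);
        long after = usedHeapBytes();

        long start = System.nanoTime();
        int hits = countHits(cache, n);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("entries:          " + cache.size());
        System.out.println("approx heap used: " + (after - before) + " bytes");
        System.out.println("lookups:          " + n + " (" + hits + " hits)");
        System.out.println("avg lookup time:  " + (elapsedNanos / n) + " ns");
    }
}
```

The numbers from a single run like this are noisy (no JIT warm-up, GC
timing is not controlled), so they are only useful as an order of
magnitude to compare against a regexp-based approach.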
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers