Users & Developers & Possible Contributors, 

Hi,

I recently hacked the code to compute facet counts from TermVector term
frequencies instead of the default out-of-the-box DocSet intersections. It
improves performance hundreds of times at the shopping engine
http://www.tokenizer.org (please check
http://issues.apache.org/jira/browse/SOLR-711). I feel the term "faceting"
(and the related architectural decision made for CNET several years ago) is
completely wrong. Default SOLR response times: 30-180 seconds; with
TermVector: 0.2 seconds (25 million documents, tokenized field). For a
non-tokenized field it also looks natural to use frequency calcs, but I
have not done it yet.
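
To illustrate the difference between the two counting strategies, here is a
minimal sketch. It is not the SOLR-711 patch: plain Java collections stand in
for Lucene term vectors and for Solr's DocSet, and every class and method name
below is hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: plain collections stand in for Lucene/Solr structures.
final class FacetCountSketch {

    // Intersection style: visit every distinct term in the field and
    // intersect its posting set with the set of matching documents.
    // Cost grows with the number of distinct terms in the whole index.
    static Map<String, Integer> countByIntersection(
            Map<String, Set<Integer>> postings, Set<Integer> matches) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            int n = 0;
            for (Integer doc : e.getValue()) {
                if (matches.contains(doc)) {
                    n++;
                }
            }
            if (n > 0) {
                counts.put(e.getKey(), n);
            }
        }
        return counts;
    }

    // Frequency style: visit only the matching documents and count the
    // distinct terms stored for each one (the stand-in for a term vector).
    // Cost grows with the size of the result set, not the vocabulary.
    static Map<String, Integer> countByTermVector(
            Map<Integer, Set<String>> termVectors, Set<Integer> matches) {
        Map<String, Integer> counts = new HashMap<>();
        for (Integer doc : matches) {
            Set<String> terms =
                    termVectors.getOrDefault(doc, Collections.<String>emptySet());
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Both methods return the same counts for consistent data; the sketch only shows
why the second one can be much cheaper when a tokenized field has a very large
number of distinct terms and the result set is comparatively small.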

Sorry... too busy with Liferay Portal contract assignments,
http://www.linkedin.com/in/liferay

Another possible performance improvement: create a safe, concurrent cache
for SOLR. You may check LingPipe, and also
http://issues.apache.org/jira/browse/SOLR-665 and
http://issues.apache.org/jira/browse/SOLR-667.
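
As a rough idea of what a "safe & concurrent" cache could look like, here is a
minimal sketch built on java.util.concurrent. It is not the code attached to
either JIRA issue; the class name, the size limit, and the crude
clear-when-full eviction are assumptions made only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical sketch of a cache whose read path takes no lock.
final class ConcurrentCacheSketch<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();
    private final int maxSize;

    ConcurrentCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns the cached value, computing it with the loader on a miss.
    V get(K key, Function<K, V> loader) {
        lookups.incrementAndGet();
        V value = map.get(key); // plain concurrent read, no synchronized block
        if (value != null) {
            hits.incrementAndGet();
            return value;
        }
        // Crude eviction (an assumption, not LRU or FIFO): drop everything
        // once the limit is reached. Under contention the map may briefly
        // hold slightly more than maxSize entries.
        if (map.size() >= maxSize) {
            map.clear();
        }
        // computeIfAbsent runs the loader at most once per absent key.
        return map.computeIfAbsent(key, loader);
    }

    long hits() {
        return hits.get();
    }

    long lookups() {
        return lookups.get();
    }

    int size() {
        return map.size();
    }
}
```

A real replacement would need a proper eviction policy and warming support;
the point here is only that readers never block each other.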

Lucene developers are doing a great job of removing synchronization in
several places too, such as the isDeleted() method call... it would be nice
to have an unsynchronized version of the API for read-only indexes.
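
A minimal sketch of that idea, assuming the set of deleted documents can be
snapshotted once when a read-only index is opened. This is not Lucene's API;
the class below is hypothetical and uses java.util.BitSet as a stand-in.

```java
import java.util.BitSet;

// Hypothetical sketch: an immutable snapshot of deleted-document flags.
final class ReadOnlyDeletedDocs {
    // Final field assigned in the constructor and never mutated afterwards,
    // so concurrent readers need no lock.
    private final BitSet deleted;

    ReadOnlyDeletedDocs(BitSet source) {
        // Defensive copy: later changes to the source are not visible here.
        this.deleted = (BitSet) source.clone();
    }

    // Unsynchronized read; safe only because the snapshot never changes.
    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }
}
```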


Thanks!




-- 
View this message in context: 
http://www.nabble.com/Contributions-Needed%3A-Faceting-Performance%2C-SOLR-Caching-tp20058987p20058987.html
Sent from the Solr - User mailing list archive at Nabble.com.
