Re: Building term frequency matrix over 6 million documents...

2014-01-24 Thread Marcio Napoli
Hi! I believe the approach below can help you. http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java Marcio http://numere.stela.org.br Go beyond Lucene™ features with Numere® 2014/1/24 Witdouck, Xavier > Hi all, > > We have over 6 m

Building term frequency matrix over 6 million documents...

2014-01-24 Thread Witdouck, Xavier
Hi all, We have over 6 million documents in our index, and would like to construct a term frequency matrix over all 6 million documents as quickly as possible. Each document has a numeric date field, so we would like to build a time series which contains values which are the sum of all frequen
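
For illustration only, here is a minimal sketch of the aggregation the question seems to describe, assuming the goal is a per-date sum of term frequencies (the message above is cut off, so this is a guess at the intent). It does not use the Lucene API: the `Doc` class, the `sumByDate` method, and the `yyyyMMdd` date encoding are all hypothetical stand-ins for whatever would actually be read from the index.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TermTimeSeriesSketch {

    /** Hypothetical stand-in for an indexed document: a numeric date plus per-term counts. */
    static final class Doc {
        final long date;                                   // assumed encoding: yyyyMMdd
        final Map<String, Integer> termFreqs = new HashMap<>();

        Doc(long date, String... tokens) {
            this.date = date;
            // Count how often each token occurs in this document.
            for (String token : tokens) {
                termFreqs.merge(token, 1, Integer::sum);
            }
        }
    }

    /** Sums term frequencies over all documents that share the same date value. */
    static TreeMap<Long, Map<String, Long>> sumByDate(List<Doc> docs) {
        TreeMap<Long, Map<String, Long>> series = new TreeMap<>();
        for (Doc doc : docs) {
            // One bucket per date; TreeMap keeps both dates and terms sorted.
            Map<String, Long> bucket =
                    series.computeIfAbsent(doc.date, d -> new TreeMap<>());
            for (Map.Entry<String, Integer> e : doc.termFreqs.entrySet()) {
                bucket.merge(e.getKey(), e.getValue().longValue(), Long::sum);
            }
        }
        return series;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc(20140123L, "lucene", "index", "lucene"),
                new Doc(20140123L, "index"),
                new Doc(20140124L, "lucene", "term"));

        // prints {20140123={index=2, lucene=2}, 20140124={lucene=1, term=1}}
        System.out.println(sumByDate(docs));
    }
}
```

With 6 million documents the per-document counts would not be held in memory like this; the sketch only shows the shape of the result (date, then term, then summed frequency), not how to read it efficiently from an index.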