Hello,

I want to use Solr as a search engine. My indexed data looks like this:
ID | TEXT | CREATION_DATE

The index grows by 500 000 rows per day.

My problem:
*INPUT:* a fixed set of tokens (max size 40) and a set of days
*RESULT:* how many rows (TEXT) contain the fixed set of tokens and were created
on day1, day2, ..., day20
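
For concreteness, the count described above could in principle be requested from Solr directly, without any aggregate table: one query that requires all tokens in TEXT, returns no documents (rows=0), and carries one facet.query per day. The following is only a sketch under assumptions: the class and method names are made up for illustration, the host and path in main are placeholders, the tokens are assumed to be plain terms that need no query escaping, and CREATION_DATE is assumed to be a Solr date field. Whether this performs well at 500 000 new rows per day has not been tested.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the parameter string for a "count only" request.
class TokenCountQuery {

    // tokens: the fixed input tokens (assumed to be plain terms, no escaping done here)
    // days:   ISO dates such as "2011-08-01", one facet.query is emitted per day
    static String build(List<String> tokens, List<String> days) {
        List<String> params = new ArrayList<>();
        // Every token must occur in TEXT.
        params.add(param("q", "TEXT:(" + String.join(" AND ", tokens) + ")"));
        // No documents are needed, only the counts.
        params.add(param("rows", "0"));
        params.add(param("facet", "true"));
        for (String day : days) {
            // One count per day, restricted to that day's CREATION_DATE range.
            params.add(param("facet.query",
                    "CREATION_DATE:[" + day + "T00:00:00Z TO " + day + "T23:59:59.999Z]"));
        }
        return String.join("&", params);
    }

    private static String param(String name, String value) {
        return name + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder URL; adjust host, port and core to the real installation.
        String base = "http://localhost:8983/solr/select?";
        System.out.println(base + build(List.of("foo", "bar"),
                                        List.of("2011-08-01", "2011-08-02")));
    }
}
```

Each facet.query in the response would then hold the number of rows that match all tokens on that day, so the response size depends on the number of days and not on the number of matching rows.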

I tried to build aggregates like:
*1. Solution*
DATE (days) | TOKEN_1 | TOKEN_2 | ... | TOKEN_40

where, for example:
TOKEN_3 - a string like "ID_1,ID_2,...,ID_N", where ID_* are the IDs of the rows that contain TOKEN_3

Then I can split TOKEN_* into a Set<Long>, and the size of the Set<Long> is the number of
distinct rows.
*PROBLEM:* The strings are too long to send, and they must be
split on the client side (the response data is too big).
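
The client-side step of this solution can be sketched roughly as follows. It assumes that the IDs are numeric and comma-separated, and that "rows containing the set of tokens" means rows containing every token, so the ID sets are intersected; the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical client-side helper for Solution 1.
class TokenIdSets {

    // Parses one TOKEN_* value such as "1,2,3" into a set of row IDs.
    static Set<Long> parseIds(String csv) {
        Set<Long> ids = new HashSet<>();
        if (csv == null || csv.trim().isEmpty()) {
            return ids;
        }
        for (String part : csv.split(",")) {
            ids.add(Long.parseLong(part.trim()));
        }
        return ids;
    }

    // Counts the distinct rows that contain every token by intersecting
    // the ID sets of all TOKEN_* columns returned for one day.
    static int countRowsWithAllTokens(List<String> tokenColumns) {
        if (tokenColumns.isEmpty()) {
            return 0;
        }
        Set<Long> common = parseIds(tokenColumns.get(0));
        for (String csv : tokenColumns.subList(1, tokenColumns.size())) {
            common.retainAll(parseIds(csv));
        }
        return common.size();
    }

    public static void main(String[] args) {
        System.out.println(countRowsWithAllTokens(List.of("1,2,3", "2,3,4", "3,2")));
    }
}
```

This makes the cost visible: every ID string has to travel to the client and be parsed there before a single number comes out.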

*2. Solution*
DATE (days) | TOKENS | COUNT

where TOKENS contains one combination of the input tokens.
For 3 tokens I have 7 combinations
For 5 tokens I have 31 combinations
For 10 tokens I have 1023 combinations
For 20 tokens I have 1048575 combinations
etc.
*PROBLEM:* Too many cases (combinations) with 40 tokens
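
The numbers above follow from counting the non-empty subsets of n tokens, which is 2^n - 1. A small sketch of that arithmetic (the class and method names are made up for illustration):

```java
// Hypothetical helper: number of non-empty token combinations for n tokens.
class CombinationCount {

    // 2^n - 1 non-empty subsets; a long holds the result for 0 <= n <= 62.
    static long nonEmptySubsets(int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n must be between 0 and 62");
        }
        return (1L << n) - 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 10, 20, 40}) {
            System.out.println(n + " tokens: " + nonEmptySubsets(n) + " combinations");
        }
    }
}
```

For 40 tokens this gives 1 099 511 627 775 combinations per day, which is why precomputing every combination is not feasible.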

Maybe Solution 1 would work if I could split the strings with some Solr
function (a custom function), or...?

Thanks for any ideas





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Count-rows-with-tokens-tp3274643p3274643.html
Sent from the Solr - User mailing list archive at Nabble.com.
