RE: Reply: Internals about "Too many values for UnInvertedField faceting on field xxx"

2014-05-26 Thread
Thanks a lot.

> There are only 256 byte arrays to hold all of the ord data, and the
> pointers into those arrays are only 24 bits long.  That gets you back
> to 32 bits, or 4GB of ord data max.  It's practically less since you
> only have to overflow one array before the exception is thrown.
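
The arithmetic in the quoted explanation can be checked with a small sketch. It relies only on the two numbers given in the quote (256 byte arrays, 24-bit pointers); the constant names are made up for illustration and are not taken from the Solr source.

```python
# Back-of-the-envelope check of the limit described in the quoted reply.
# Assumption: 256 byte arrays, each addressed by a 24-bit pointer.
NUM_ARRAYS = 256
POINTER_BITS = 24

# A 24-bit pointer can address 2**24 bytes within one array.
bytes_per_array = 2 ** POINTER_BITS

# Theoretical ceiling if every array were filled completely.
total_bytes = NUM_ARRAYS * bytes_per_array

print(f"per array: {bytes_per_array / 2**20:.0f} MiB")
print(f"total:     {total_bytes / 2**30:.0f} GiB")
```

Because a single array overflowing is enough to trigger the exception, the practical limit sits somewhere below the 4 GiB total, as the quote says.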

What does "ord data" mean here? Is it the term ID, the term-to-document
relation, or the document-to-term relation?






Reply: Internals about "Too many values for UnInvertedField faceting on field xxx"

2014-05-24 Thread
Thanks for your reply. I'll try it.

We're still interested in the real limitation behind "Too many values for
UnInvertedField faceting on field xxx".

Could anybody tell us some internals about "Too many values for
UnInvertedField faceting on field xxx"?

-----Original Message-----
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent: May 24, 2014 0:26
To: solr-user@lucene.apache.org
Subject: RE: Internals about "Too many values for UnInvertedField faceting on
field xxx"

张月祥 [zhan...@calis.edu.cn] wrote:
> Could anybody tell us some internals about "Too many values for 
> UnInvertedField faceting on field xxx" ?

I must admit I do not fully understand it in detail, but it is a known
problem with Field Cache (facet.method=fc) faceting. The remedy is to use
DocValues, which does not have the same limitation. This should also result
in lower heap usage. You will have to re-index everything though.

We have successfully used DocValues on an index with 400M documents and 300M
unique values on a single facet field.

- Toke Eskildsen




Internals about "Too many values for UnInvertedField faceting on field xxx"

2014-05-23 Thread
Could anybody tell us some internals about "Too many values for
UnInvertedField faceting on field xxx"?

 

We have two Solr servers.

Solr A:

128G RAM, 60M docs, 2,600 distinct terms in the field “code”; every term in
the field “code” has a fixed length of 6.

The total token count for the field “code” is 9 billion.

The total space used by the field “code” is 50 billion.

 

 

Solr B:

128G RAM, 140M docs, 1,600 distinct terms in the field “code”; every term in
the field “code” has a fixed length of 6.

The total token count for the field “code” is 18 billion.

The total space used by the field “code” is 90 billion.

 

 

When we run the facet query
“q=*:*&wt=xml&indent=true&facet=true&facet.field=code”,

Solr B is fine, but Solr A throws an exception with the message “Too many values
for UnInvertedField faceting on field code”.
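
For anyone who wants to reproduce the request, here is a minimal sketch that builds the same facet query, optionally adding the `facet.method` parameter discussed in this thread. The host, port, and core name in `BASE_URL` and the helper `facet_url` are hypothetical; only the query parameters come from the message.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; replace with your own Solr endpoint.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def facet_url(field, method=None):
    """Build the facet query from the message, optionally with facet.method."""
    params = [
        ("q", "*:*"),
        ("wt", "xml"),
        ("indent", "true"),
        ("facet", "true"),
        ("facet.field", field),
    ]
    if method is not None:
        # e.g. "fc" or "enum", the two values mentioned in this thread
        params.append(("facet.method", method))
    return BASE_URL + "?" + urlencode(params)


print(facet_url("code"))
print(facet_url("code", method="enum"))
```

Note that `urlencode` percent-encodes the `*:*` query, which Solr decodes back on the server side.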

 

We now suspect that the UnInvertedField limit is related to the number of
distinct terms in a field.
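
To put the two servers side by side, the sketch below takes the figures reported above and derives the average number of tokens per document. It is plain arithmetic on the numbers in this message and does not by itself show which quantity triggers the exception; the dictionary layout and the helper `tokens_per_doc` are illustrative only.

```python
# Figures as reported in this message for the field "code".
servers = {
    "Solr A": {"docs": 60_000_000, "distinct_terms": 2600, "tokens": 9_000_000_000},
    "Solr B": {"docs": 140_000_000, "distinct_terms": 1600, "tokens": 18_000_000_000},
}


def tokens_per_doc(stats):
    """Average number of "code" tokens per document."""
    return stats["tokens"] / stats["docs"]


for name, stats in servers.items():
    print(
        f"{name}: {stats['distinct_terms']} distinct terms, "
        f"{tokens_per_doc(stats):.1f} tokens per doc on average"
    )
```

Solr A has both more distinct terms and more tokens per document than Solr B, even though Solr B holds more documents and more tokens in total.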

 

Could anybody tell us some internals about this problem? We don't want to use
facet.method=enum because it is too slow.

 

Thanks!