[ 
https://issues.apache.org/jira/browse/SOLR-9789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15686350#comment-15686350
 ] 

Shawn Heisey commented on SOLR-9789:
------------------------------------

If the collections in a SolrCloud cluster running 5.5.1 were created with that 
version, clusterstate.json will have a size of zero.  The only way 
clusterstate.json can get this large is on a cloud that was upgraded from 4.x.  
If you set up a brand new collection, you will find that it is not in 
clusterstate.json.  Instead, there is a file named state.json in the 
collection's path within ZooKeeper.  This is the "v2" clusterstate that was 
new in Solr 5.0.

A correct fix for your situation would be to create brand new collections so 
each collection has a separate clusterstate, and have your code retrieve the 
individual state files.
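
As a rough illustration of "retrieve the individual state files", here is a 
minimal sketch that only builds the znode paths a client would read.  The 
class and method names are hypothetical, and it assumes the usual layout of 
/collections/<name>/state.json with no chroot prefix on the ZooKeeper 
connection.  It does not talk to ZooKeeper itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Solr): builds per-collection state paths. */
final class CollectionStatePaths {
    // Shared "v1" state file that holds collections carried over from 4.x.
    static final String SHARED_STATE = "/clusterstate.json";

    /** Path of the per-collection "v2" state file, assuming no chroot. */
    static String stateJsonPath(String collection) {
        if (collection == null || collection.isEmpty()) {
            throw new IllegalArgumentException("collection name is required");
        }
        return "/collections/" + collection + "/state.json";
    }

    /** One path per collection, in the order the names were given. */
    static List<String> stateJsonPaths(List<String> collections) {
        List<String> paths = new ArrayList<>();
        for (String name : collections) {
            paths.add(stateJsonPath(name));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Example collection names; substitute your own.
        for (String path : stateJsonPaths(Arrays.asList("books", "logs"))) {
            System.out.println(path);
        }
    }
}
```

Each of these znodes holds the state of one collection only, so it stays far 
smaller than a shared clusterstate.json covering every collection.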

Fixing the NPE in Solr so it returns an understandable error is likely to 
require more than a simple null check.  As written, the code will have no idea 
why the byte array is null.  The code should check for the existence of the 
file being requested, and if the ZK client can retrieve the size of the znode, 
then the error message could be even more specific.  If I can find some free 
time, I can look into making this change.  Fair warning -- the change is not 
likely to appear in version 5, but only in a new 6.x version.
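
To make the idea concrete, here is a sketch of how the error message could be 
chosen once the existence and size of the znode are known.  This is not the 
actual Solr change.  The class and method are hypothetical, and the caller is 
assumed to have already obtained the two inputs from the ZK client (for 
example from the Stat returned by an exists call).  The default limit of 
0xfffff bytes is ZooKeeper's documented default for jute.maxbuffer.

```java
/** Hypothetical sketch (not Solr code): explains why a znode read gave null. */
final class NullReadDiagnosis {
    // ZooKeeper's documented default for jute.maxbuffer, just under 1 MB.
    static final int DEFAULT_JUTE_MAXBUFFER = 0xfffff;

    /**
     * @param path       znode that was requested
     * @param exists     whether the znode exists (assumed known to the caller)
     * @param dataLength size of the znode's data in bytes (ignored if !exists)
     * @param maxBuffer  the jute.maxbuffer value in effect for the client
     */
    static String explain(String path, boolean exists, int dataLength, int maxBuffer) {
        if (!exists) {
            return "znode " + path + " does not exist";
        }
        if (dataLength > maxBuffer) {
            return "znode " + path + " holds " + dataLength
                + " bytes, which exceeds jute.maxbuffer (" + maxBuffer + ")";
        }
        if (dataLength == 0) {
            return "znode " + path + " exists but contains no data";
        }
        // Existence and size look fine, so the cause is something else.
        return "znode " + path + " returned no data for an unknown reason";
    }

    public static void main(String[] args) {
        // Example: a 2,000,000-byte clusterstate.json with the default limit.
        System.out.println(
            explain("/clusterstate.json", true, 2_000_000, DEFAULT_JUTE_MAXBUFFER));
    }
}
```

A plain null check would only reach the last branch, which is why it cannot 
tell the user what actually went wrong.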

This entire problem is caused by ZooKeeper allowing the *creation* of znodes 
larger than the "jute.maxbuffer" size limit, which ZooKeeper then refuses to 
*read*.  There is a ZooKeeper issue filed for this, but so far it has not been 
addressed.  I am having trouble locating it.


> ZkCLI throws NullPointerException if zkClient doesn't return data
> -----------------------------------------------------------------
>
>                 Key: SOLR-9789
>                 URL: https://issues.apache.org/jira/browse/SOLR-9789
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 5.5.1
>            Reporter: Gary Lee
>            Priority: Minor
>
> We ran into a situation where using ZkCLI to get the Solr clusterstate in 
> Zookeeper always returned a NPE. We eventually found that it was due to a 
> clusterstate being too large (over 1M Zookeeper node limit) so the 
> zkClient.getData call returned null, but it was confusing to instead throw an 
> NPE because ZkCLI assumes non-null byte data. Could a check be added to not 
> throw NPE, but report a warning instead?



