[ 
https://issues.apache.org/jira/browse/HELIX-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HELIX-594:
---------------------------------
    Description: 
This is a safety feature where Helix automatically detects GC and disconnects 
from the cluster. Unfortunately, in some cases it surfaces as an NPE. 

We should probably describe the reason for disabling in the instance config. 
Currently we just disable the node; we should add an attribute such as 
DISABLE_CAUSE:"TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG" or something 
like that.
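
A minimal sketch of the proposed behavior, for illustration only. It does not 
use the Helix API: the class DisconnectGuard, the ENABLED key, the threshold, 
and the time window are all hypothetical, and a plain map stands in for the 
instance config. Only the DISABLE_CAUSE attribute name and its message come 
from the proposal above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for the disconnect safety check. It counts ZK
 * disconnects inside a sliding time window and, once the limit is exceeded,
 * disables the node and records why in a map that plays the role of the
 * instance config.
 */
class DisconnectGuard {
    // Attribute name and message taken from the proposal in this issue.
    static final String DISABLE_CAUSE_KEY = "DISABLE_CAUSE";
    static final String DISABLE_CAUSE_MSG =
        "TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG";
    // Hypothetical key for the enabled flag; not the real Helix field name.
    static final String ENABLED_KEY = "ENABLED";

    private final int maxDisconnects;
    private final long windowMillis;
    private final Deque<Long> disconnectTimes = new ArrayDeque<>();
    private final Map<String, String> instanceConfig = new LinkedHashMap<>();

    DisconnectGuard(int maxDisconnects, long windowMillis) {
        this.maxDisconnects = maxDisconnects;
        this.windowMillis = windowMillis;
        instanceConfig.put(ENABLED_KEY, "true");
    }

    /** Records one disconnect; returns true while the node is still enabled. */
    boolean onDisconnect(long nowMillis) {
        disconnectTimes.addLast(nowMillis);
        // Drop disconnects that have fallen out of the sliding window.
        while (!disconnectTimes.isEmpty()
                && nowMillis - disconnectTimes.peekFirst() > windowMillis) {
            disconnectTimes.removeFirst();
        }
        if (disconnectTimes.size() > maxDisconnects) {
            // Disable the node and say why, instead of disabling silently.
            instanceConfig.put(ENABLED_KEY, "false");
            instanceConfig.put(DISABLE_CAUSE_KEY, DISABLE_CAUSE_MSG);
        }
        return Boolean.parseBoolean(instanceConfig.get(ENABLED_KEY));
    }

    /** Returns the stand-in instance config, including any recorded cause. */
    Map<String, String> config() {
        return instanceConfig;
    }
}
```

With the cause stored next to the enabled flag, an operator inspecting the 
instance config can see why the node was disabled without having to correlate 
it with an NPE in the logs.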


  was:
We always get the following errors on startup. (#1 looks like the leader 
elector for the controller.) Ours is a FULL_AUTO Helix configuration with an 
embedded controller.

1. org.apache.helix.manager.zk.ZkBaseDataAccessor.doCreate(ZkBaseDataAccessor.java:138)
Node already exists. path: /streamio/STATEMODELDEFS/STORAGE_DEFAULT_SM_SCHEMATA


2. org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:130) 
Skip processing callbacks for listener: 
org.apache.helix.messaging.handling.HelixTaskExecutor@1a9f9f09, path: 
/streamio/INSTANCES/datapipe11-sjc1-controller-/MESSAGES, expected types: 
[CALLBACK, FINALIZE] but was INIT


3. org.apache.helix.healthcheck.ParticipantHealthReportTask.stop(ParticipantHealthReportTask.java:67)
ParticipantHealthReportTimerTask already stopped
org.apache.helix.healthcheck.ParticipantHealthReportTask in stop at line 67


> Misleading NPE trying to reconnect, upon ZK Timeout
> ---------------------------------------------------
>
>                 Key: HELIX-594
>                 URL: https://issues.apache.org/jira/browse/HELIX-594
>             Project: Apache Helix
>          Issue Type: Improvement
>          Components: helix-core
>    Affects Versions: 0.6.5
>            Reporter: Vinoth Chandar
>            Priority: Minor
>             Fix For: master
>
>
> This is a safety feature where Helix automatically detects GC and 
> disconnects from the cluster. Unfortunately, in some cases it surfaces as 
> an NPE. 
> We should probably describe the reason for disabling in the instance config. 
> Currently we just disable the node; we should add an attribute such as 
> DISABLE_CAUSE:"TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG" or something 
> like that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
