Kushal Gautam created KARAF-6267:
------------------------------------
Summary: Group Join issue in Cellar
Key: KARAF-6267
URL: https://issues.apache.org/jira/browse/KARAF-6267
Project: Karaf
Issue Type: Bug
Components: cellar
Affects Versions: 4.2.0
Environment: Windows 10
Reporter: Kushal Gautam
I have two nodes (management and worker) and three groups (default,
management, and workers). Both nodes are new instances of Karaf with Cellar
installed, along with Camel and JDBC, but the bundles that use these
services are not installed yet. So it is a fresh Karaf installation.
The first node successfully joined the management group and left the
default group. I then tried to assign the second node to the workers group,
as shown in the dump below, but I repeatedly get an error.
karaf@root()> cluster:group-list
  | Group      | Members
--+------------+----------------------
  | management | 192.168.99.1:5701
x | default    | 192.168.99.1:5702(x)
karaf@root()> cluster:group-create workers
karaf@root()> cluster:group-list
  | Group      | Members
--+------------+----------------------
  | management | 192.168.99.1:5701
  | workers    |
x | default    | 192.168.99.1:5702(x)
karaf@root()> cluster:group-join workers
No result received within given timeout
karaf@root()> log:tail
13:29:42.862 INFO [hz.cellar.InvocationMonitorThread] [192.168.99.1]:5702 [cellar] [3.9.1] Invocations:1 timeouts:1 backup-timeouts:0
13:30:31.539 ERROR [pool-14-thread-26] Error while dispatching task
java.lang.NullPointerException: null
    at java.util.HashSet.<init>(HashSet.java:118) ~[?:?]
    at org.apache.karaf.cellar.hazelcast.HazelcastGroupManager.registerGroup(HazelcastGroupManager.java:467) ~[67:org.apache.karaf.cellar.hazelcast:4.1.2]
    at org.apache.karaf.cellar.core.control.ManageGroupCommandHandler.joinGroup(ManageGroupCommandHandler.java:91) ~[65:org.apache.karaf.cellar.core:4.1.2]
    at org.apache.karaf.cellar.core.control.ManageGroupCommandHandler.execute(ManageGroupCommandHandler.java:41) ~[65:org.apache.karaf.cellar.core:4.1.2]
    at org.apache.karaf.cellar.core.control.ManageGroupCommandHandler.execute(ManageGroupCommandHandler.java:27) ~[65:org.apache.karaf.cellar.core:4.1.2]
    at org.apache.karaf.cellar.core.command.CommandHandler.handle(CommandHandler.java:40) ~[65:org.apache.karaf.cellar.core:4.1.2]
    at org.apache.karaf.cellar.core.command.CommandHandler.handle(CommandHandler.java:28) ~[65:org.apache.karaf.cellar.core:4.1.2]
    at org.apache.karaf.cellar.core.event.EventDispatchTask.run(EventDispatchTask.java:67) [65:org.apache.karaf.cellar.core:4.1.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:?]
    at java.lang.Thread.run(Thread.java:745) [?:?]
karaf@root()> cluster:group-list
  | Group      | Members
--+------------+----------------------
  | management | 192.168.99.1:5701
  | workers    |
x | default    | 192.168.99.1:5702(x)
On the management node, I have set the producer and consumer state to OFF.
Apart from that, I have added "org.ops4j.pax.transx.tm.geronimo" to the
excluded config list in "org.apache.karaf.cellar.node" to avoid a
transaction lock issue.
It would be great if somebody could help me figure out the issue here. I
walked through the source code, but I could not see how exactly it ends up
with a null pointer.
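For reference, the top frame of the trace is the java.util.HashSet
constructor, and the HashSet copy constructor is documented to throw a
NullPointerException when the collection passed to it is null. The sketch
below only reproduces that JDK behaviour in isolation. It is not Cellar
code: the class NullSetCopyDemo, the map of groups, and the helper names are
hypothetical, and whether registerGroup actually receives a null set from a
lookup like this is an assumption, not something the trace confirms.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class NullSetCopyDemo {

    // Hypothetical stand-in for a lookup that can return null,
    // e.g. a map keyed by group name with no entry for "workers".
    public static Set<String> lookupMembers(Map<String, Set<String>> groups, String name) {
        return groups.get(name);
    }

    // Returns true if copying the given set with new HashSet<>(source) throws an NPE.
    public static boolean copyThrowsNpe(Set<String> source) {
        try {
            new HashSet<>(source);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Defensive variant: a missing (null) set is treated as an empty one.
    public static Set<String> copyOrEmpty(Set<String> source) {
        return source == null ? new HashSet<>() : new HashSet<>(source);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> groups = new HashMap<>();
        groups.put("management", new HashSet<>(Collections.singleton("192.168.99.1:5701")));

        // No entry was ever stored for "workers", so the lookup yields null.
        Set<String> workers = lookupMembers(groups, "workers");

        System.out.println(copyThrowsNpe(workers));      // prints true
        System.out.println(copyOrEmpty(workers).size()); // prints 0
    }
}
```

If something similar happens inside registerGroup, a null check before the
copy would avoid the exception, although it would not explain why the set
is missing in the first place.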