Re: [Neo4j] Advice on correct 2-instance setup is required

Dennis O Tue, 17 May 2016 08:37:48 -0700

Hi Yayati,

*>> as a 3 Machine quorum is required for making a HA cluster*


It looks like Neo4j does not require a third machine in order to function. 
This follows from "...When running Neo4j in HA mode there is always a single 
master and zero or more slaves...." at 
http://neo4j.com/docs/3.0.1/ha-architecture.html
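
As a rough illustration of the quorum point, here is a minimal Python sketch of plain majority arithmetic. It assumes master election needs a strict majority of cluster members; `majority` and `tolerated_failures` are hypothetical helper names, not part of any Neo4j API.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many members can be lost while a strict majority remains."""
    return cluster_size - majority(cluster_size)


# Compare cluster sizes 1, 2 and 3 (assumption: majority-based election).
for n in (1, 2, 3):
    print(f"size={n} majority={majority(n)} "
          f"tolerated_failures={tolerated_failures(n)}")
```

Under that assumption a 2-instance cluster can form, but it cannot lose either member and still hold a majority, whereas 3 instances can lose one.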

Dennis





On Tuesday, May 17, 2016 at 12:56:40 AM UTC-4, Yayati Sule wrote:
>
> Hi Dennis,
> I am facing a similar problem. In my case I have 2 machines as slaves 
> and 1 master, but I cannot get them up and running. In your case, maybe you 
> can try to set up a third instance and try to start the cluster, as a 3 
> Machine quorum is required for making a HA cluster.
>
> Regards,
> Yayati Sule
> Associate Data Scientist
> Innoplexus Consulting Services Pvt. Ltd.
> www.innoplexus.com
> Mob : +91-9527459407
>
> Landline: +91-20-66527300
>
>
> On Tue, May 17, 2016 at 10:22 AM, Dennis O <dennis....@gmail.com> wrote:
>
>> Hi,
>>
>> Please advise on the required configuration for a 2-instance "HA" setup 
>> (AWS, Neo4j Enterprise 3.0.1).
>>
>> Currently I have on both instances:
>> - dbms.mode=HA
>> - ha.initial_hosts=172.31.35.147:5001,172.31.33.173:5001
>> - ha.host.coordination is commented out
>> - ha.host.data is commented out
>> Ports 5001, 5002, 7474, and 6001 are open on both.
>>
>> Differences:
>> 1. One node has ha.server_id=1 (172.31.33.173), the other has ha.server_id=2
>> 2. The node with id=1 runs Debian 8.4, the node with id=2 runs CentOS 7
>>
>>
>> With this setup, the node with id=1 starts without problems and is elected 
>> master; the second one, however, fails.
>>
>> Some log extracts:
>>
>>
>> 2016-05-17 04:50:51.781+0000 INFO  [o.n.k.h.MasterClient214] MasterClient214 communication channel created towards /127.0.0.1:6001
>> 2016-05-17 04:50:51.790+0000 INFO  [o.n.k.h.c.SwitchToSlave] Copying store from master
>> 2016-05-17 04:50:51.791+0000 INFO  [o.n.k.h.MasterClient214] Thread[31, HA Mode switcher-1] Trying to open a new channel from /172.31.35.147:0 to /127.0.0.1:6001
>> 2016-05-17 04:50:51.791+0000 DEBUG [o.n.k.h.MasterClient214] MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
>> 2016-05-17 04:50:51.796+0000 INFO  [o.n.k.h.MasterClient214] MasterClient214[/127.0.0.1:6001] shutdown
>> 2016-05-17 04:50:51.796+0000 ERROR [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Error while trying to switch to slave MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
>> org.neo4j.com.ComException: MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
>> at org.neo4j.com.Client$2.create(Client.java:225)
>> at org.neo4j.com.Client$2.create(Client.java:202)
>> at org.neo4j.com.ResourcePool.acquire(ResourcePool.java:177)
>> at org.neo4j.com.Client.acquireChannelContext(Client.java:390)
>> at org.neo4j.com.Client.sendRequest(Client.java:296)
>> at org.neo4j.com.Client.sendRequest(Client.java:289)
>> at org.neo4j.kernel.ha.MasterClient210.copyStore(MasterClient210.java:311)
>> at org.neo4j.kernel.ha.cluster.SwitchToSlave$1.copyStore(SwitchToSlave.java:531)
>> at org.neo4j.com.storecopy.StoreCopyClient.copyStore(StoreCopyClient.java:191)
>> at org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMaster(SwitchToSlave.java:525)
>> at org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMasterIfNeeded(SwitchToSlave.java:348)
>> at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:272)
>> at org.neo4j.kernel.ha.cluster.modeswitch.HighAvailabilityModeSwitcher$1.run(HighAvailabilityModeSwitcher.java:348)
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:104)
>> Caused by: java.net.ConnectException: Connection refused
>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>> at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:148)
>> at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:104)
>> at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:78)
>> at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>> at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:41)
>> ... 4 more
>> 2016-05-17 04:50:51.797+0000 INFO  [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Attempting to switch to slave in 7s
>> 2016-05-17 04:50:58.799+0000 INFO  [o.n.k.i.f.CommunityFacadeFactory] No locking implementation specified, defaulting to 'forseti'
>> 2016-05-17 04:50:58.799+0000 INFO  [o.n.k.h.c.SwitchToSlave] ServerId 2, moving to slave for master ha://0.0.0.0:6001?serverId=1
>>
>>
>>
>> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @ 2016-05-17 04:30:57.534+0000]  ClusterState: discovery-[configurationTimeout]->discovery conversation-id:2/13# payload:ConfigurationTimeoutState{remainingPings=3}
>> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.h.HeartbeatState$1] [AsyncLog @ 2016-05-17 04:30:57.535+0000]  HeartbeatState: start-[reset_send_heartbeat]->start conversation-id:2/13#
>> 2016-05-17 04:30:57.538+0000 INFO  [o.n.c.c.NetworkSender] [AsyncLog @ 2016-05-17 04:30:57.537+0000]  Attempting to connect from /172.31.35.147:0 to /172.31.33.173:5001
>> 2016-05-17 04:30:57.540+0000 INFO  [o.n.c.c.NetworkSender] [AsyncLog @ 2016-05-17 04:30:57.540+0000]  Failed to connect to /172.31.33.173:5001 due to: java.net.ConnectException: Connection refused
>> 2016-05-17 04:30:57.540+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @ 2016-05-17 04:30:57.540+0000]  ClusterState: discovery-[configurationRequest]->discovery from:cluster://172.31.35.147:5001 conversation-id:2/13# payload:ConfigurationRequestState{joiningId=2, joiningUri=cluster://172.31.35.147:5001}
>> 2016-05-17 04:30:58.420+0000 INFO  [o.n.c.c.NetworkReceiver] [AsyncLog @ 2016-05-17 04:30:58.420+0000]  cluster://172.31.35.147:47188 disconnected from me at cluster://172.31.35.147:5001
>> 2016-05-17 04:30:58.420+0000 INFO  [o.n.c.c.NetworkReceiver] [AsyncLog @ 2016-05-17 04:30:58.420+0000]  cluster://172.31.35.147:47188 disconnected from me at cluster://172.31.35.147:5001
>> 2016-05-17 04:30:58.434+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Starting check pointing...
>> 2016-05-17 04:30:58.438+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Starting store flush...
>> 2016-05-17 04:30:58.443+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Store flush completed
>> 2016-05-17 04:30:58.443+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Starting appending check point entry into the tx log...
>> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Appending check point entry into the tx log completed
>> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]:  Check pointing completed
>> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [0]:  Starting log pruning.
>> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [0]:  Log pruning complete.
>> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics START ---
>> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] High Availability diagnostics
>> Member state:PENDING
>> State machines:
>>    AtomicBroadcastMessage:start
>>    AcceptorMessage:start
>>    ProposerMessage:start
>>    LearnerMessage:start
>>    HeartbeatMessage:start
>>    ElectionMessage:start
>>    SnapshotMessage:start
>>    ClusterMessage:discovery
>> Current timeouts:
>> join:configurationTimeout{conversation-id=2/13#, timeout-count=29, created-by=2}
>> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics END ---
>> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.h.f.HighlyAvailableFacadeFactory] Shutdown started
>>
>> etc. 
>>
>>
>> Any insights are highly appreciated!!
>>
>> Thank you!
>> Dennis
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to neo4j+un...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
