Thanks,
The subscription-redundancy is set to "1" and the region is used on 2 nodes
(there are more nodes, which are not related to it).
Yes, there is an exception, which I have yet to understand (and this exception
causes the closure of the CQ on this node, as well as an operation message
being sent to the other node to close it!):
caught exception while running:
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:51)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    at org.apache.geode.internal.cache.tier.sockets.Message.flushBuffer(Message.java:651)
    at org.apache.geode.internal.cache.tier.sockets.Message.sendBytes(Message.java:632)
    at org.apache.geode.internal.cache.tier.sockets.ChunkedMessage.sendChunk(ChunkedMessage.java:314)
    at org.apache.geode.internal.cache.tier.sockets.ChunkedMessage.sendChunk(ChunkedMessage.java:322)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.writeQueryResponseChunk(BaseCommand.java:756)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommandQuery.processQueryUsingParams(BaseCommandQuery.java:225)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommandQuery.processQuery(BaseCommandQuery.java:70)
    at org.apache.geode.internal.cache.tier.sockets.command.ExecuteCQ61.cmdExecute(ExecuteCQ61.java:179)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:147)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:783)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doOneMessage(ServerConnection.java:913)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1143)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$1$1.run(AcceptorImpl.java:546)
    at java.lang.Thread.run(Thread.java:745)
Could it be that the client disconnected from the node right after sending this
message? (The client itself continues to run normally...)
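For reference, a "Broken pipe" on a socket write generally means that the peer
had already closed its end of the connection by the time the data was written.
In the trace above the failing write is the query response chunk sent back for
ExecuteCQ61. The following is only a minimal sketch of that general socket
behaviour using plain java.net; it is not Geode code, and the class and method
names (BrokenPipeDemo, writeToClosedPeer) are made up for illustration. The
exact exception message (typically "Broken pipe" or "Connection reset") depends
on platform and timing.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo, not Geode code: a "server" keeps writing response chunks
// to a peer that has already closed its socket.
final class BrokenPipeDemo {

    /** Returns the message of the IOException raised by the write loop, or null if no write failed. */
    static String writeToClosedPeer() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket serverSide = server.accept()) {

            // The "client" goes away right after connecting.
            client.close();

            OutputStream out = serverSide.getOutputStream();
            byte[] chunk = new byte[64 * 1024];
            try {
                // The first write(s) may still succeed; once the peer's reset
                // has been received, a later write fails with an IOException.
                for (int i = 0; i < 1000; i++) {
                    out.write(chunk);
                    out.flush();
                    Thread.sleep(5);
                }
                return null;
            } catch (IOException e) {
                return String.valueOf(e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        } catch (IOException e) {
            // Failure while setting up or tearing down the sockets, not the write under test.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write failed with: " + writeToClosedPeer());
    }
}
```

Note that the writer only learns about the closed peer when it writes, so the
exception by itself does not say why or when the other side closed the
connection.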
The scenario is that after all nodes are initialized, I stop one server out
of 2. Sometimes, about 1 time out of 5, the CQ stops notifying the client after
this stop. Most of the time the CQ continues to run fine.
I am certain this is related to some timing issue, some registration which
fails, or something related to the filter profiles which are held in the
region...
Thanks
Roi
-----Original Message-----
From: Anilkumar Gingade [mailto:[email protected]]
Sent: Wednesday, August 16, 2017 1:41 AM
To: [email protected]
Subject: Re: continuous query internal mechanism questions
In Geode, high availability for subscription events is achieved by having
redundant event queues (HAQueues) on multiple servers; this is configured using
the redundancy level on the client connection. Based on the redundancy level,
the client registers CQs on multiple servers. During the subscription
(CQ) registration, it elects/assigns one of the servers to host the primary
HAQueue. The client keeps monitoring the redundancy level during node join or
failure, so that the redundancy level stays satisfied.
You can find more about HAQueues at
https://cwiki.apache.org/confluence/display/GEODE/HA+Client+Event+Queues
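The primary/backup arrangement described above can be pictured with a small toy
model. This is only a sketch under stated assumptions, not Geode code: the
class SubscriptionQueueModel and its methods are hypothetical, and the rule
"the first recruited server is primary" is an assumption made for brevity (the
real election is done by the client and servers). It models one primary queue
plus `redundancy` backups, promotion of a backup when the primary leaves, and
recruitment of a replacement when a server is available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (not Geode code): one primary queue plus `redundancy` backups.
final class SubscriptionQueueModel {
    private final int redundancy;
    private final List<String> available = new ArrayList<>(); // live servers without a queue
    private final List<String> hosting = new ArrayList<>();   // index 0 = primary, rest = backups

    SubscriptionQueueModel(int redundancy, List<String> servers) {
        this.redundancy = redundancy;
        available.addAll(servers);
        satisfyRedundancy();
    }

    // Recruit servers until primary + `redundancy` backups exist, or no server is left.
    private void satisfyRedundancy() {
        while (hosting.size() < redundancy + 1 && !available.isEmpty()) {
            hosting.add(available.remove(0));
        }
    }

    // A server left: drop its queue. If it was the primary, the next backup moves to index 0.
    void serverDown(String server) {
        available.remove(server);
        hosting.remove(server);
        satisfyRedundancy();
    }

    // A server joined: it may be recruited to restore the redundancy level.
    void serverUp(String server) {
        if (!available.contains(server) && !hosting.contains(server)) {
            available.add(server);
        }
        satisfyRedundancy();
    }

    String primary() {
        return hosting.isEmpty() ? null : hosting.get(0);
    }

    List<String> backups() {
        return hosting.size() <= 1
                ? new ArrayList<String>()
                : new ArrayList<>(hosting.subList(1, hosting.size()));
    }

    public static void main(String[] args) {
        SubscriptionQueueModel model =
                new SubscriptionQueueModel(1, Arrays.asList("server1", "server2"));
        System.out.println(model.primary() + " " + model.backups()); // server1 [server2]
        model.serverDown("server1");
        System.out.println(model.primary() + " " + model.backups()); // server2 []
    }
}
```

With a redundancy level of 1 and only 2 servers, stopping one server leaves a
primary with no backup until another server joins, which is the window the
model makes visible.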
I assume you have a 2-node cluster. What is your subscription redundancy level?
>> For some reason, sometimes there is a failure to complete the first
>> registration
Is there any log message or stack trace reporting the reason for the failure?
If it's a dev environment, you can run the client/server with debug/fine level
logging to see additional info.
Are you trying to stop your server while registering the CQs? Can you give
more detail about your test scenario?
-Anil.
On Tue, Aug 15, 2017 at 11:25 AM, Jason Huynh <[email protected]> wrote:
> I am not quite sure how the native client registers CQs. From my
> understanding, with the Java API there is only one message (the ExecuteCQ
> message) that is executed on the server side and then replicated to
> the other nodes through the profile (OperationMessage).
>
> It seems the extra ExecuteCQ message failing and then closing the cq
> might be putting the system in a weird state...
>
> On Tue, Aug 15, 2017 at 7:56 AM Roi Apelker <[email protected]>
> wrote:
>
> > Hi,
> >
> > I have been examining the continuous query registration mechanism
> > for quite some time. This is related to an issue that I have, where
> > sometimes a node crashes (1 node out of 2), and the other one does
> > not send CQ events. The CQ is registered on a partitioned region
> > which resides on these 2 nodes.
> >
> > I noticed the following behavior, and I wonder if anyone can comment
> > regarding it, if it is justified or not and what is the reason:
> >
> > 1. When the software using the client (native client) registers for
> > the CQ, a CQ command (ExecuteCQ61) is received on both servers.
> > -- is this normal behaviour? Does the client actually send this
> > command to both servers?
> >
> > 2. When this command is received by a server, and the CQ is
> > registered, another registration message is sent to the other node
> > via an OperationMessage (REGISTER_CQ)
> > -- it seems that normally the server can handle this situation, as
> > the second registration identifies the previous one and does not
> > affect it. But the question is: why do we need this 2nd registration
> > if there is a command sent to each server?
> >
> > 3. For some reason, sometimes there is a failure to complete the
> > first registration (executed by ExecuteCQ61), and this failure
> > causes the CQ to be closed, which is accompanied by a close
> > request to the other node.
> > -- I assume by now, since 2 registrations and one closure have
> > occurred on node 2, the CQ is still active and the client receives
> > notifications.
> >
> > 4. Sometimes, 1 out of 5, once node 1 crashes, I get a cleanup
> > operation, caused by the crash (via MemberCrashedEvent), and this
> > also closes the existing CQ, and in this case the CQ in node 2 does
> > not operate anymore and the client receives no notifications.
> > -- the fact is that the other 4 out of 5 times, I do not get this
> > cleanup by MemberCrashedEvent (maybe due to some other error), and the
> > CQ notifications are received normally.
> >
> > Can anyone clear things up for me? Any comment on any of the
> > statements above will be greatly appreciated.
> >
> > Thanks,
> >
> > Roi
> >
> >
> > -----Original Message-----
> > From: Roi Apelker
> > Sent: Wednesday, August 09, 2017 3:21 PM
> > To: [email protected]
> > Subject: RE: continuous query internal mechanism
> >
> > Thank you
> >
> > -----Original Message-----
> > From: Anilkumar Gingade [mailto:[email protected]]
> > Sent: Tuesday, August 08, 2017 9:55 PM
> > To: [email protected]
> > Subject: Re: continuous query internal mechanism
> >
> > By registered events, I meant events generated for interest
> > registration, "region.registerInterest(*)". CqEvents are for registered
> > CQs.
> >
> > -Anil.
> >
> >
> > On Tue, Aug 8, 2017 at 12:27 AM, Roi Apelker
> > <[email protected]>
> > wrote:
> >
> > > Thank you
> > >
> > > What is the difference between registered events and CQ events?
> > >
> > > -----Original Message-----
> > > From: Anilkumar Gingade [mailto:[email protected]]
> > > Sent: Monday, August 07, 2017 10:12 PM
> > > To: [email protected]
> > > Subject: Re: continuous query internal mechanism
> > >
> > > CQ processing on the server side is the same for all clients (Java, C++)...
> > >
> > > The subscription events are sent to the client as ClientUpdateMessage,
> > > which holds information about registered events and CQ events. The
> > > client processes this and updates/invokes the client-side
> > > cache/listeners with the respective event. Look into
> > > ClientUpdateMessageImpl and CacheClientUpdater (for client-side
> > > processing).
> > >
> > > -Anil.
> > >
> > >
> > >
> > >
> > > On Mon, Aug 7, 2017 at 11:01 AM, Roi Apelker
> > > <[email protected]>
> > > wrote:
> > >
> > > > Thanks,
> > > >
> > > > By the way, is there any difference in the behaviour of the
> > > > server, if the client that registered the CQ is a native (C++) client?
> > > >
> > > > I have been going over the classes and code for some time and
> > > > can't seem to find the actual location where a CQ
> > > > update/notification is sent...
> > > >
> > > > It's like CqEventImpl class is never even generated in this scenario.
> > > >
> > > > If anyone can help here I would be most grateful :-)
> > > >
> > > > Thanks
> > > >
> > > > Roi
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Anilkumar Gingade [mailto:[email protected]]
> > > > Sent: Monday, August 07, 2017 8:23 PM
> > > > To: [email protected]
> > > > Subject: Re: continuous query internal mechanism
> > > >
> > > > You can find those in CqServiceImpl.process*()...
> > > >
> > > > -Anil.
> > > >
> > > >
> > > > On Mon, Aug 7, 2017 at 9:14 AM, Roi Apelker
> > > > <[email protected]>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am trying to look into the code of the continuous query
> > > > > mechanism
> > > > > - where the GEODE server sends the notification back to the client.
> > > > >
> > > > > Can anyone point me to the central classes of continuous
> > > > > query, especially to the one that is responsible for the
> > > > > calculation of the new data and packing it as a message back to the
> > > > > client?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Roi
> > > > >
> > > > > This message and the information contained herein is
> > > > > proprietary and confidential and subject to the Amdocs policy
> > > > > statement,
> > > > >
> > > > > you may review at
> > > > > https://www.amdocs.com/about/email-disclaimer <
> > > > > https://www.amdocs.com/about/email-disclaimer>
> > > > >