Re: Enabling Auth between Zookeeper Servers

2020-02-11 Thread Rakesh Radhakrishnan
>java.io.IOException: No JAAS configuration section named 'Server'

I can see that you have enabled client-server authentication as well, and it
looks to me like the error is coming from that. Please share the complete
error logs so that we can trace it.
Have you configured a "Server" section along with the "QuorumServer" and
"QuorumClient" sections? If not, please add a "Server" section alongside the
others and try it out.
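
As a rough illustration of that check, here is a minimal Python sketch that
lists which JAAS sections are missing. It assumes the required names are
"Server" (taken from the exception) plus the two login contexts set in the
zoo.cfg quoted below (quorum.auth.server.loginContext=QuorumServer and
quorum.auth.learner.loginContext=QuorumLearner). The helper names
jaas_sections and missing_sections are made up for this example and are not
part of ZooKeeper.

```python
import re

def jaas_sections(text):
    """Return the set of top-level section names found in a JAAS config string."""
    return set(re.findall(r"^\s*([A-Za-z_][\w.]*)\s*\{", text, flags=re.MULTILINE))

def missing_sections(jaas_text, required):
    """Return the required section names that the JAAS text does not define."""
    present = jaas_sections(jaas_text)
    return sorted(name for name in required if name not in present)

# jaas.conf as posted in the original mail (quote markers removed).
JAAS_CONF = """
QuorumServer {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    user_zookeeper="test";
};

QuorumClient {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    username="zookeeper"
    password="test";
};
"""

# Assumption: "Server" from the exception, the other two from zoo.cfg.
REQUIRED = ["Server", "QuorumServer", "QuorumLearner"]

print(missing_sections(JAAS_CONF, REQUIRED))
```

With the posted jaas.conf this reports both "Server" and "QuorumLearner" as
missing: the file defines "QuorumClient" while zoo.cfg names "QuorumLearner",
which may be worth checking as well.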

Reference:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Client-Server+mutual+authentication

Thanks,
Rakesh

On Tue, Feb 11, 2020 at 7:26 AM Sebastian Schmitz <
sebastian.schm...@propellerhead.co.nz> wrote:

> Hello,
>
> I'm currently looking into enabling the Auth between Zookeeper-Servers
> and found this documentation:
>
>
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication
>
> However, when I use the config from the document (for Digest-MD5) I get
> this exception in Zookeeper 3.4.14 and also 3.5.6, which I tried because
> I thought using latest version could help:
> java.io.IOException: No JAAS configuration section named 'Server' was
> found in '/opt/zookeeper-cluster/zookeeper/conf/jaas.conf
>
> And of course that's right, because there's only QuorumServer and
> QuorumClient in the jaas.conf:
>
> jaas.conf:
> QuorumServer {
> org.apache.zookeeper.server.auth.DigestLoginModule required
> user_zookeeper="test";
> };
>
> QuorumClient {
> org.apache.zookeeper.server.auth.DigestLoginModule required
> username="zookeeper"
> password="test";
> };
>
> I also tried renaming the QuorumServer to just "Server". No change.
>
> My zoo.cfg:
> tickTime=2000
> initLimit=10
> syncLimit=5
> dataDir=/mnt/zk_data
> clientPort=2181
> dataLogDir=/mnt/zk_data_log
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=24
> quorum.auth.enableSasl=true
> quorum.auth.learnerRequireSasl=false
> quorum.auth.serverRequireSasl=false
> quorum.auth.learner.loginContext=QuorumLearner
> quorum.auth.server.loginContext=QuorumServer
> quorum.cnxn.threads.size=20
> authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
> secureClientPort=2281
> server.1=0.0.0.0:2888:3888
>
> Any idea what I could try? Or maybe there's some better document on how
> to achieve this?
>
> Thank you
>
> Sebastian
>


Re: default value for quorum.auth.kerberos.servicePrincipal

2019-12-17 Thread Rakesh Radhakrishnan
As the name says, the "quorum.auth.kerberos.servicePrincipal" property is
specific to Kerberos-based quorum authentication, so there is no need to set
anything if you are enabling DIGEST-MD5.

As mentioned earlier, its default value is "zkquorum/localhost", and it
will never be used if you configure/enable DIGEST-MD5.

Thanks,
Rakesh

On Mon, Dec 16, 2019 at 7:14 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> "quorum.auth.kerberos.servicePrincipal" this one
>
> On Sun, Dec 15, 2019, 9:33 PM Rakesh Radhakrishnan 
> wrote:
>
> > OK, got it.
> >
> > >>>> Even if i enable sasl but md5-diget what should be this property set
> > to,
> > Could you please name the specific property you are referring.
> >
> > Hope you are talking about "DIGEST-MD5" mechanism ? String[] mechs = {
> > "DIGEST-MD5" };
> >
> > Presently the execution flow is that, if there is
> > no subject.getPrincipals() in jaas config then it must not be GSSAPI and
> > fallback to check DIGEST-MD5 details in jaas config.
> > Whenever user want to enable DIGEST-MD5, they have to define the JAAS
> > configuration file with DIGEST-MD5 configs like below and there is no
> > default value for this mechanism.
> > QuorumServer {
> >    org.apache.zookeeper.server.auth.DigestLoginModule required
> >    user_test1="mypassword";
> > };
> >
> > QuorumLearner {
> >    org.apache.zookeeper.server.auth.DigestLoginModule required
> >    user_test2="mypassword";
> > };
> >
> > Populate DIGEST-MD5 user -> password map for the "QuorumServer",
> > "QuorumLearner" section.
> > Usernames are distinguished from other options by prefixing the username
> > with a "user_" prefix.
> >
> > Hope its clear to you.
> >
> > Thanks,
> > Rakesh
> >
> > On Fri, Dec 13, 2019 at 9:45 PM rammohan ganapavarapu <
> > rammohanga...@gmail.com> wrote:
> >
> > > Hi Rakesh,
> > >
> > > Right now i am not enabling sasl but i am trying to define all default
> > > properties and should be able to use them once sasl is enabled with
> > > override values. So my question is for digest auth do we even need this
> > > property? i remember seeing i don't set that property it was using the
> > > default value "zkquorum/localhost".
> > >
> > > Thanks,
> > > Ram
> > >
> > > On Thu, Dec 12, 2019 at 11:06 PM Rakesh Radhakrishnan <
> > rake...@apache.org>
> > > wrote:
> > >
> > > > Hi Ram,
> > > >
> > > > ZooKeeper Quorum authentication support two schemes, Kerberos or
> > > > DIGEST-MD5. User has to configure either Kerb or digest configuration
> > > > values. Both together not required.
> > > >
> > > > I'd recommend you to go through Kerberos, digest simulation unit test
> > > > cases where we have valid and invalid scenarios. Hope this would get
> > > > idea about the required configs.
> > > >
> > > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumDigestAuthTest.java
> > > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumKerberosHostBasedAuthTest.java
> > > >
> > > > Could you describe the issues that troubles you in setting up quorum
> > > > auth, if any.
> > > >
> > > > Thanks,
> > > > Rakesh
> > > >
> > > > On Fri, Dec 13, 2019 at 3:49 AM rammohan ganapavarapu <
> > > > rammohanga...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Even if i enable sasl but md5-diget what should be this property
> set
> > > to,
> > > > > this property only take effect for kerberos or for both?
> > > > >
> > > > > Ram
> > > > >
> > > > > On Fri, Dec 6, 2019 at 7:55 AM rammohan ganapavarapu <
> > > > > rammohanga...@gmail.com> wrote:
> > > > >
> > > > > > Mate,
> > > > > >
> > > > > > Thank you, I did search source code found the same, I am trying
> to
> > > > c

Re: default value for quorum.auth.kerberos.servicePrincipal

2019-12-15 Thread Rakesh Radhakrishnan
OK, got it.

>>>> Even if i enable sasl but md5-diget what should be this property set to,
Could you please name the specific property you are referring to?

I assume you are talking about the "DIGEST-MD5" mechanism? String[] mechs = {
"DIGEST-MD5" };

Presently the execution flow is as follows: if subject.getPrincipals() is
empty for the JAAS login, the mechanism cannot be GSSAPI, so the code falls
back to checking the DIGEST-MD5 details in the JAAS config.
Whenever a user wants to enable DIGEST-MD5, they have to define a JAAS
configuration file with DIGEST-MD5 entries like the ones below; there is no
default value for this mechanism.
QuorumServer {
   org.apache.zookeeper.server.auth.DigestLoginModule required
   user_test1="mypassword";
};

QuorumLearner {
   org.apache.zookeeper.server.auth.DigestLoginModule required
   user_test2="mypassword";
};

Populate the DIGEST-MD5 user -> password map in the "QuorumServer" and
"QuorumLearner" sections.
Usernames are distinguished from other options by the "user_" prefix on the
option name.
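
To make the two rules above concrete, here is a small Python sketch. It is
not the ZooKeeper code: choose_mechanism and digest_users are hypothetical
names, and the "debug" option is only there to show a non-user entry. It
mirrors the described flow (no principals means DIGEST-MD5 rather than
GSSAPI) and the "user_" prefix convention.

```python
def choose_mechanism(principals):
    """No principals on the subject means it cannot be GSSAPI, so use DIGEST-MD5."""
    return "GSSAPI" if principals else "DIGEST-MD5"

def digest_users(options):
    """Build a user -> password map from login module options named user_<name>."""
    prefix = "user_"
    return {
        key[len(prefix):]: value
        for key, value in options.items()
        if key.startswith(prefix)
    }

# Options of the QuorumServer section above; "debug" is a made-up extra option.
quorum_server_options = {"user_test1": "mypassword", "debug": "true"}

print(choose_mechanism([]))                 # digest-only JAAS login
print(digest_users(quorum_server_options))  # only the user_ entry is kept
```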

Hope it's clear to you.

Thanks,
Rakesh

On Fri, Dec 13, 2019 at 9:45 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi Rakesh,
>
> Right now i am not enabling sasl but i am trying to define all default
> properties and should be able to use them once sasl is enabled with
> override values. So my question is for digest auth do we even need this
> property? i remember seeing i don't set that property it was using the
> default value "zkquorum/localhost".
>
> Thanks,
> Ram
>
> On Thu, Dec 12, 2019 at 11:06 PM Rakesh Radhakrishnan 
> wrote:
>
> > Hi Ram,
> >
> > ZooKeeper Quorum authentication support two schemes, Kerberos or
> > DIGEST-MD5. User has to configure either Kerb or digest configuration
> > values. Both together not required.
> >
> > I'd recommend you to go through Kerberos, digest simulation unit test
> > cases where we have valid and invalid scenarios. Hope this would get idea
> > about the required configs.
> >
> > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumDigestAuthTest.java
> > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumKerberosHostBasedAuthTest.java
> >
> > Could you describe the issues that troubles you in setting up quorum
> > auth, if any.
> >
> > Thanks,
> > Rakesh
> >
> > On Fri, Dec 13, 2019 at 3:49 AM rammohan ganapavarapu <
> > rammohanga...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Even if i enable sasl but md5-diget what should be this property set
> to,
> > > this property only take effect for kerberos or for both?
> > >
> > > Ram
> > >
> > > On Fri, Dec 6, 2019 at 7:55 AM rammohan ganapavarapu <
> > > rammohanga...@gmail.com> wrote:
> > >
> > > > Mate,
> > > >
> > > > Thank you, I did search source code found the same, I am trying to
> > create
> > > > a zoo conf with all default properties.
> > > >
> > > > Ram
> > > >
> > > > On Fri, Dec 6, 2019, 2:44 AM Mate Szalay-Beko
> > > 
> > > > wrote:
> > > >
> > > >> Hi Ram,
> > > >>
> > > >> this parameter is needed to be defined when you want to enable
> secure
> > > >> authentication in the communication between ZooKeeper servers. In
> > > general,
> > > >> the 'principal' is a 'username' what you want your ZooKeeper servers
> > to
> > > >> use
> > > >> when they talk with each other. Ideally you have a central Kereros
> > > service
> > > >> somewhere where this principal is already registered.
> > > >> A kerberos principal is usually in the form of
> > > >> "user_or_service_name/host@realm" (some more explanation:
> > > >> https://ssimo.org/blog/id_016.html)
> > > >>
> > > >> According to the source code, the default value of
> > > >> quorum.auth.kerberos.servicePrincipal is "zkquorum/localhost". But I
> > > think
> > > >> if you don't enable the quorum SASL in ZooKeeper, then this property
> > > will
> > > >> never be actually used.
> > > >>
> > > >> Please see this page about SASL in ZooKeeper:
> > > >>
> > >
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+and+SASL
> > > >>
> > > >> I also found a Cloudera blogpost on the topic:
> > > >>
> > > >>
> > >
> >
> https://blog.cloudera.com/hardening-apache-zookeeper-security-sasl-quorum-peer-mutual-authentication-and-authorization/
> > > >>
> > > >> Cheers,
> > > >> Mate
> > > >>
> > > >>
> > > >> On Thu, Dec 5, 2019 at 11:50 PM rammohan ganapavarapu <
> > > >> rammohanga...@gmail.com> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > What is the default value for this property, if i don't  enable
> sasl
> > > >> and if
> > > >> > i don't define what will be the value?
> > > >> >
> > > >> > quorum.auth.kerberos.servicePrincipal
> > > >> >
> > > >> > Also what does this means "servicename/_HOST"
> > > >> >
> > > >> > Thanks,
> > > >> > Ram
> > > >> >
> > > >>
> > > >
> > >
> >
>


Re: default value for quorum.auth.kerberos.servicePrincipal

2019-12-12 Thread Rakesh Radhakrishnan
Hi Ram,

ZooKeeper quorum authentication supports two schemes, Kerberos or
DIGEST-MD5. The user has to configure either the Kerberos or the digest
configuration values; both together are not required.

I'd recommend going through the Kerberos and digest unit test cases, which
cover valid and invalid scenarios. I hope they give you an idea of the
required configs.

https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumDigestAuthTest.java
https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/auth/QuorumKerberosHostBasedAuthTest.java

Could you describe the issues, if any, that are troubling you in setting up
quorum auth?

Thanks,
Rakesh

On Fri, Dec 13, 2019 at 3:49 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi,
>
> Even if I enable SASL, but with DIGEST-MD5, what should this property be
> set to? Does this property only take effect for Kerberos, or for both?
>
> Ram
>
> On Fri, Dec 6, 2019 at 7:55 AM rammohan ganapavarapu <
> rammohanga...@gmail.com> wrote:
>
> > Mate,
> >
> > Thank you. I searched the source code and found the same. I am trying to
> > create a zoo conf with all default properties.
> >
> > Ram
> >
> > On Fri, Dec 6, 2019, 2:44 AM Mate Szalay-Beko wrote:
> >
> >> Hi Ram,
> >>
> >> this parameter needs to be defined when you want to enable secure
> >> authentication in the communication between ZooKeeper servers. In
> >> general, the 'principal' is the 'username' that you want your ZooKeeper
> >> servers to use when they talk to each other. Ideally you have a central
> >> Kerberos service somewhere in which this principal is already registered.
> >> A Kerberos principal is usually of the form
> >> "user_or_service_name/host@realm" (some more explanation:
> >> https://ssimo.org/blog/id_016.html)
> >>
> >> According to the source code, the default value of
> >> quorum.auth.kerberos.servicePrincipal is "zkquorum/localhost". But I
> >> think that if you don't enable quorum SASL in ZooKeeper, this property
> >> will never actually be used.
> >>
> >> Please see this page about SASL in ZooKeeper:
> >> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+and+SASL
> >>
> >> I also found a Cloudera blog post on the topic:
> >> https://blog.cloudera.com/hardening-apache-zookeeper-security-sasl-quorum-peer-mutual-authentication-and-authorization/
> >>
> >> Cheers,
> >> Mate
> >>
> >>
> >> On Thu, Dec 5, 2019 at 11:50 PM rammohan ganapavarapu <
> >> rammohanga...@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > What is the default value for this property? If I don't enable SASL
> >> > and I don't define it, what will the value be?
> >> >
> >> > quorum.auth.kerberos.servicePrincipal
> >> >
> >> > Also, what does "servicename/_HOST" mean?
> >> >
> >> > Thanks,
> >> > Ram
> >> >
> >>
> >
>


Re: Observer properties for SASL authentication in 3.4.13 version

2018-09-29 Thread Rakesh Radhakrishnan
OK, this looks to me like a common networking-related issue.

1) To confirm, can you remove the Observer type and simply try to join the zk
server to the quorum as a participant?

2) Can you also confirm that your hostname does not appear on the 127.0.0.1
line in /etc/hosts, i.e. that you have nothing like this:

   127.0.0.1   node203ea localhost localhost.localdomain localhost4 localhost4.localdomain4
   ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

http://ccl.cse.nd.edu/operations/condor/hostname.shtml
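
A quick way to run check 2) is sketched below in Python. It is only an
illustration: loopback_aliases is a hypothetical helper, it works on the
text of /etc/hosts rather than reading the file, and it treats every name
that does not start with "localhost" as suspicious.

```python
def loopback_aliases(hosts_text):
    """Return non-localhost names that the hosts text maps to a loopback address."""
    suspicious = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if address == "::1" or address.startswith("127."):
            suspicious.extend(n for n in names if not n.startswith("localhost"))
    return suspicious

# The example lines from this mail.
HOSTS = """\
127.0.0.1   node203ea localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
"""

print(loopback_aliases(HOSTS))
```

For the example lines this flags node203ea, the machine's own hostname bound
to the loopback address.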

On Fri, Sep 28, 2018 at 10:25 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Any thoughts on what could be the reason for observers not able to connect
> to followers/leader?
>
> Ram
>
> On Thu, Sep 27, 2018 at 1:00 PM rammohan ganapavarapu <
> rammohanga...@gmail.com> wrote:
>
>> In case you have not received my previous log files.
>>
>> On Tue, Sep 25, 2018 at 8:25 AM rammohan ganapavarapu <
>> rammohanga...@gmail.com> wrote:
>>
>>> Rakesh,
>>>
>>> Thank you. I have 3 followers and 3 observers in two different DCs. The
>>> followers came up fine with SASL, but for some reason the observers are
>>> not coming up and fail with the following error. I don't see any network
>>> issues; I was able to telnet to ports 2181 and 3888.
>>>
>>>
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@555] - Opening channel to server 1
>>> 2018-09-24 17:55:34,151 [myid:6] - WARN [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 1 at election address zk-server1/10.16.1.102:3888
>>> java.net.SocketTimeoutException: connect timed out
>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>     at java.net.Socket.connect(Socket.java:589)
>>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
>>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
>>>     at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:838)
>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
>>>
>>>
>>> server.1=zk-server1:2888:3888
>>> server.2=zk-server2:2888:3888
>>> server.3=zk-server3:2888:3888
>>> server.4=zk-server4:2888:3888:observer
>>> server.5=zk-server5:2888:3888:observer
>>> server.6=zk-server6:2888:3888:observer
>>> peerType=observer
>>>
>>> What could be the reason?
>>>
>>> Ram
>>>
>>> On Tue, Sep 25, 2018 at 12:12 AM Rakesh Radhakrishnan <
>>> rake...@apache.org> wrote:
>>>
>>>> Thanks Ram for the interest on this feature.
>>>>
>>>> Yes, user can enable SASL for Observer nodes as well. In general,
>>>> QuorumLearner will send authentication packet to peer QuorumServer.
>>>> Observer is a learner which follows the same quorum authentication protocol
>>>> and auth logic will work fine.
>>>>
>>>> FYI, hope you are referring below links for configurations,
>>>>
>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication
>>>>
>>>> https://blog.cloudera.com/blog/2017/01/hardening-apache-zookeeper-security-sasl-quorum-peer-mutual-authentication-and-authorization/
>>>>
>>>> Please let us know if you are facing any issues.
>>>>
>>>> Thanks,
>>>> Rakesh
>>>>
>>>> On Mon, Sep 24, 2018 at 8:31 AM rammohan ganapavarapu <
>>>> rammohanga...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Do we need to configure any thing on observer nodes for SASL
>>>>> authentication?
>>>>>
>>>>> tcpKeepAlive=true ( this is not for sasl but just asking )
>>>>>
>>>>> quorum.auth.enableSasl=true
>>>>> quorum.auth.learnerRequireSasl=true
>>>>> quorum.auth.serverRequireSasl=true
>>>>>
>>>>> What will happen if i set these properties on observers nodes as well ?
>>>>>
>>>>> Thanks,
>>>>> Ram
>>>>>
>>>>


Re: Observer properties for SASL authentication in 3.4.13 version

2018-09-25 Thread Rakesh Radhakrishnan
I'm in the IST time zone, which causes the delay :-)

Have you verified the zk cluster by starting all these servers without
"sasl" configured, just to rule out any errors in the quorum authentication
logic?

Could you give more details:

1) Are all Observers (4, 5, 6) unable to connect to any of the quorum
servers 1, 2, 3? It would be good if you could share the zk logs.
2) I hope you have checked that the "myid" file is correct on each server,
i.e. that each server has a distinct server id.
3) Do you have a firewall/security layer in place, and are there no issues
there? Make sure ports 2888/3888 are open.
4) I hope the /etc/hosts entries on all the nodes are fine.
5) Have you configured the SASL settings on the Observer nodes?
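
For checks 2) and 3), a small Python sketch like the following can list the
server ids, ports and roles declared in zoo.cfg, so that each id can be
compared with the node's "myid" file and each port with the firewall rules.
parse_servers is a hypothetical helper, it only understands the plain
server.N=host:port:port[:role] form used in this thread, and the
"participant" default for entries without a role is an assumption of the
sketch.

```python
def parse_servers(cfg_text):
    """Map server id -> (host, quorum_port, election_port, role) from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or not key.startswith("server."):
            continue  # skip peerType=... and other settings
        server_id = int(key[len("server."):])
        if server_id in servers:
            raise ValueError(f"duplicate server id {server_id}")
        host, quorum_port, election_port, *role = value.split(":")
        servers[server_id] = (
            host,
            int(quorum_port),
            int(election_port),
            role[0] if role else "participant",
        )
    return servers

# Server list as posted earlier in this thread.
ZOO_CFG = """\
server.1=zk-server1:2888:3888
server.2=zk-server2:2888:3888
server.3=zk-server3:2888:3888
server.4=zk-server4:2888:3888:observer
server.5=zk-server5:2888:3888:observer
server.6=zk-server6:2888:3888:observer
peerType=observer
"""

servers = parse_servers(ZOO_CFG)
observers = sorted(i for i, entry in servers.items() if entry[3] == "observer")
ports = sorted({port for entry in servers.values() for port in entry[1:3]})
print("observers:", observers)
print("ports to open between all nodes:", ports)
```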

Rakesh

On Wed, Sep 26, 2018 at 9:19 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Any help?
>
> On Tue, Sep 25, 2018 at 2:20 PM rammohan ganapavarapu <
> rammohanga...@gmail.com> wrote:
>
>> And the observer never joins the cluster; it keeps saying "Cannot open
>> channel to" in the logs.
>>
>> On Tue, Sep 25, 2018 at 8:25 AM rammohan ganapavarapu <
>> rammohanga...@gmail.com> wrote:
>>
>>> Rakesh,
>>>
>>> Thank you. I have 3 followers and 3 observers in two different DCs. The
>>> followers came up fine with SASL, but for some reason the observers are
>>> not coming up and fail with the following error. I don't see any network
>>> issues; I was able to telnet to ports 2181 and 3888.
>>>
>>>
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@620] - Queue size: 1
>>> 2018-09-24 17:55:34,145 [myid:6] - DEBUG [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@555] - Opening channel to server 1
>>> 2018-09-24 17:55:34,151 [myid:6] - WARN [QuorumPeer[myid=6]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 1 at election address zk-server1/10.16.1.102:3888
>>> java.net.SocketTimeoutException: connect timed out
>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>     at java.net.Socket.connect(Socket.java:589)
>>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
>>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
>>>     at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:838)
>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:957)
>>>
>>>
>>> server.1=zk-server1:2888:3888
>>> server.2=zk-server2:2888:3888
>>> server.3=zk-server3:2888:3888
>>> server.4=zk-server4:2888:3888:observer
>>> server.5=zk-server5:2888:3888:observer
>>> server.6=zk-server6:2888:3888:observer
>>> peerType=observer
>>>
>>> What could be the reason?
>>>
>>> Ram
>>>
>>> On Tue, Sep 25, 2018 at 12:12 AM Rakesh Radhakrishnan <
>>> rake...@apache.org> wrote:
>>>
>>>> Thanks Ram for the interest on this feature.
>>>>
>>>> Yes, user can enable SASL for Observer nodes as well. In general,
>>>> QuorumLearner will send authentication packet to peer QuorumServer.
>>>> Observer is a learner which follows the same quorum authentication protocol
>>>> and auth logic will work fine.
>>>>
>>>> FYI, hope you are referring below links for configurations,
>>>>
>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication
>>>>
>>>> https://blog.cloudera.com/blog/2017/01/hardening-apache-zookeeper-security-sasl-quorum-peer-mutual-authentication-and-authorization/
>>>>
>>>> Please let us know if you are facing any issues.
>>>>
>>>> Thanks,
>>>> Rakesh
>>>>
>>>> On Mon, Sep 24, 2018 at 8:31 AM rammohan ganapavarapu <
>>>> rammohanga...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Do we need to configure any thing on observer nodes for SASL
>>>>> authentication?
>>>>>
>>>>> tcpKeepAlive=true ( this is not for sasl but just asking )
>>>>>
>>>>> quorum.auth.enableSasl=true
>>>>> quorum.auth.learnerRequireSasl=true
>>>>> quorum.auth.serverRequireSasl=true
>>>>>
>>>>> What will happen if i set these properties on observers nodes as well ?
>>>>>
>>>>> Thanks,
>>>>> Ram
>>>>>
>>>>


Re: Observer properties for SASL authentication in 3.4.13 version

2018-09-25 Thread Rakesh Radhakrishnan
Thanks, Ram, for your interest in this feature.

Yes, users can enable SASL for Observer nodes as well. In general, the
QuorumLearner sends an authentication packet to the peer QuorumServer.
An Observer is a learner that follows the same quorum authentication
protocol, so the auth logic will work fine.

FYI, I hope you are referring to the links below for the configuration:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication
https://blog.cloudera.com/blog/2017/01/hardening-apache-zookeeper-security-sasl-quorum-peer-mutual-authentication-and-authorization/

Please let us know if you are facing any issues.

Thanks,
Rakesh

On Mon, Sep 24, 2018 at 8:31 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi,
>
> Do we need to configure anything on observer nodes for SASL
> authentication?
>
> tcpKeepAlive=true ( this is not for sasl but just asking )
>
> quorum.auth.enableSasl=true
> quorum.auth.learnerRequireSasl=true
> quorum.auth.serverRequireSasl=true
>
> What will happen if I set these properties on the observer nodes as well?
>
> Thanks,
> Ram
>


Re: New PMC Member: Michael Han

2017-06-27 Thread Rakesh Radhakrishnan
Congratulations Michael!

Keep up the great efforts!


Rakesh

On Tue, Jun 27, 2017 at 10:18 PM, Flavio Junqueira  wrote:

> I'm very happy to announce that the Apache ZooKeeper PMC has voted to
> invite Michael Han to join the PMC and Michael accepted. Michael has done
> outstanding work in the community over the recent past and we felt it was
> time for Michael to deepen his level of engagement by joining the PMC.
>
> Please join me in congratulating Michael for his achievement.
> Congratulations, Michael!
>
> -Flavio
>
>
>


Re: Version parameter passed to ZooKeeper.setACL

2017-06-22 Thread Rakesh Radhakrishnan
Agreed. Please feel free to raise a jira task for improving zkcli#setACL()
javadoc.

Rakesh

On Fri, Jun 23, 2017 at 8:32 AM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:

> One suggestion here.
>
> Could the Javadoc or the argument name be improved so that it does not
> mislead?
>
> i) public Stat setACL(final String path, List<ACL> acl, int aversion or
> aclVersion)
> ii) "should pass aclversion only", or something like this.
>
>
>
> --Brahma Reddy Battula
>
> -Original Message-
> From: Rakesh Radhakrishnan [mailto:rake...@apache.org]
> Sent: 23 June 2017 10:47
> To: user@zookeeper.apache.org
> Subject: Re: Version parameter passed to ZooKeeper.setACL
>
> Hi Arpit,
>
> Stat#aversion represents "the number of changes to the ACL of this znode."
> On calling the zkcli#setACL api, internally ZK server will increase the
> 'aversion' by one. If the given 'aversion' does not match the znode's
> aversion it will throw BadVersionException.
>
> While invoking the #setACL api, you should pass "Stat.aversion".
>
> Rakesh
>
> On Thu, Jun 22, 2017 at 11:34 PM, Arpit Agarwal <aagar...@hortonworks.com>
> wrote:
>
> > Greetings,
> >
> > For the ZooKeeper.setACL call:
> >
> > https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/ZooKeeper.java#L2368
> >
> >  public Stat setACL(final String path, List<ACL> acl, int version)
> >
> > Should version be set to Stat.version or Stat.aversion?
> >
> > Thanks,
> > Arpit
> >
> >
> >
>


Re: Version parameter passed to ZooKeeper.setACL

2017-06-22 Thread Rakesh Radhakrishnan
Hi Arpit,

Stat#aversion represents "the number of changes to the ACL of this znode."
On calling the zkcli#setACL API, the ZK server internally increases the
'aversion' by one. If the given 'aversion' does not match the znode's
aversion, it throws BadVersionException.

When invoking the #setACL API, you should pass "Stat.aversion".
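
The version check described above can be modelled with a few lines of
Python. This is an in-memory toy, not the ZooKeeper client: Znode,
BadVersionError and the ACL strings are stand-ins invented for the example,
and any other rules the real server applies are ignored.

```python
class BadVersionError(Exception):
    """Stand-in for ZooKeeper's BadVersionException."""

class Znode:
    def __init__(self, acl):
        self.acl = list(acl)
        self.version = 0   # data version, like Stat.version
        self.aversion = 0  # ACL version, like Stat.aversion

    def set_data(self):
        self.version += 1  # data changes do not touch aversion

    def set_acl(self, acl, expected_aversion):
        # The caller must present the znode's current ACL version.
        if expected_aversion != self.aversion:
            raise BadVersionError(
                f"expected aversion {expected_aversion}, znode has {self.aversion}"
            )
        self.acl = list(acl)
        self.aversion += 1  # every successful ACL change bumps aversion by one
        return self.aversion

node = Znode(["world:anyone:cdrwa"])
node.set_data()
node.set_data()                                       # version=2, aversion=0
print(node.set_acl(["world:anyone:r"], node.aversion))  # passing aversion works

try:
    node.set_acl(["world:anyone:cdrwa"], node.version)  # data version: wrong counter
except BadVersionError as err:
    print("rejected:", err)
```

Passing Stat.version only succeeds when it happens to equal Stat.aversion,
which is why the two counters should not be mixed up.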

Rakesh

On Thu, Jun 22, 2017 at 11:34 PM, Arpit Agarwal 
wrote:

> Greetings,
>
> For the ZooKeeper.setACL call:
>
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/ZooKeeper.java#L2368
>
>  public Stat setACL(final String path, List<ACL> acl, int version)
>
> Should version be set to Stat.version or Stat.aversion?
>
> Thanks,
> Arpit
>
>
>


Re: [ANNOUNCE] Apache ZooKeeper 3.4.10

2017-04-02 Thread Rakesh Radhakrishnan
>>>https://github.com/apache/bigtop/pull/192

This is awesome! Thank you, @Yakir Gibraltar, for the useful work.


Rakesh

On Sun, Apr 2, 2017 at 5:05 PM, yakirgb  wrote:

> I created PR to Bigtop, checked on CentOS 7, looks fine from my side.
> https://github.com/apache/bigtop/pull/192


Re: [ANNOUNCE] Apache ZooKeeper 3.4.10

2017-03-31 Thread Rakesh Radhakrishnan
I haven't tried installing ZK via Bigtop. Just a thought: is it possible to
change base = '3.4.6' to base = '3.4.10' in
https://github.com/apache/bigtop/blob/master/bigtop.bom#L116 and give it a try?

Rakesh

On Thu, Mar 30, 2017 at 11:23 PM, yakirgb  wrote:

> I executed: gradle zookeeper-rpm
> but it created rpm of version 3.4.6, how to install 3.4.10 via bigtop?
> https://github.com/apache/bigtop/blob/master/bigtop.bom#L116


[ANNOUNCE] Apache ZooKeeper 3.4.10

2017-03-30 Thread Rakesh Radhakrishnan
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.4.10

ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming,
configuration management, synchronization, and group services - in a
simple interface so you don't have to write them from scratch. You can
use it off-the-shelf to implement consensus, group management, leader
election, and presence protocols. And you can build on it for your
own, specific needs.

For ZooKeeper release details and downloads, visit:
http://zookeeper.apache.org/releases.html

ZooKeeper 3.4.10 Release Notes are at:
http://zookeeper.apache.org/doc/r3.4.10/releasenotes.html

We would like to thank the contributors that made the release possible.

Regards,

The ZooKeeper Team


Re: Sessions Expire due to Network partitioning in Zookeeper

2017-03-05 Thread Rakesh Radhakrishnan
Sorry, it was a busy weekend for me. I hope the following gives more
clarity.

(1) Client connected to a server and sends heartbeats periodically.
The client sends pings at 1/3 of the session timeout:
https://github.com/apache/zookeeper/blob/branch-3.4/src/c/src/zookeeper.c#L1675
In your case the session timeout is 45 secs, so the ping interval is
approx. 15 secs.

(2) Client connected to a server and the connected server becomes unavailable.
The client notices a missing ping response. Usually this takes about (ping
interval + a small grace period, used for the idle time calculation).
This leaves time to reconnect to a different server.

In your case, with a 45 secs timeout, the client took 15 secs to get the
disconnection event from server C.
Then it took 30 secs to hit the connection error from server B. This is very
suspicious; ideally the client should receive an error within 1/3 of the
timeout + a small grace period.
Is there any chance to check the status of the client machine during this
30 secs period? Do you think the kernel messages could help?

Below are the log lines related to session "0x35a926acae8" from the shared
client log.
>>> zookeeper_client: 2017-03-03
09:56:14,770:23078(0x7f9467786700):ZOO_INFO@check_events@1775: session
establishment complete on server [172.25.83.205:2181],
sessionId=0x35a926acae8, negotiated timeout=45000
>>> zookeeper_client: 2017-03-03
09:56:30,770:23078(0x7f9467786700):ZOO_DEBUG@zookeeper_process@2218: Got
ping response in 1 ms
>>> zookeeper_client: 2017-03-03
09:56:42,213:23078(0x7f9467786700):ZOO_ERROR@handle_socket_error_msg@1746:
Socket [172.25.83.205:2181] zk retcode=-4, errno=112(Host is down): failed
while receiving a server response
>>> zookeeper_client: 2017-03-03
09:57:12,220:23078(0x7f9467786700):ZOO_ERROR@handle_socket_error_msg@1666:
Socket [172.25.83.204:2181] zk retcode=-7, errno=110(Connection timed out):
connection to 172.25.83.204:2181 timed out (exceeded timeout by 6ms)
>>> zookeeper_client: 2017-03-03
09:57:12,221:23078(0x7f9467786700):ZOO_INFO@check_events@1728: initiated
connection to server [172.25.83.201:2181]
>>> zookeeper_client: 2017-03-03
09:57:12,223:23078(0x7f9467786700):ZOO_ERROR@handle_socket_error_msg@1764:
Socket [172.25.83.201:2181] zk retcode=-112, errno=116(Stale file handle):
sessionId=0x35a926acae8 has expired.
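
To make the gaps between these log lines easier to see, here is a minimal
sketch (an illustration only, not part of the ZooKeeper client) that parses
the timestamps copied from the entries above and prints the elapsed time
between consecutive events. The event labels are shorthand added for
readability and the helper names are hypothetical.

```python
from datetime import datetime

# Timestamps copied from the client log entries quoted above.
# The labels are shorthand for this sketch, not actual log text.
events = [
    ("session established on 172.25.83.205", "2017-03-03 09:56:14,770"),
    ("ping response received", "2017-03-03 09:56:30,770"),
    ("socket error on 172.25.83.205 (Host is down)", "2017-03-03 09:56:42,213"),
    ("connection to 172.25.83.204 timed out", "2017-03-03 09:57:12,220"),
    ("session expired on 172.25.83.201", "2017-03-03 09:57:12,223"),
]


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM:SS,mmm' timestamp as printed by the client."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def gaps(evts):
    """Return (label, seconds since the previous event) for each later event."""
    out = []
    for (_, prev), (label, cur) in zip(evts, evts[1:]):
        out.append((label, (parse(cur) - parse(prev)).total_seconds()))
    return out


for label, secs in gaps(events):
    print(f"{secs:7.3f}s  {label}")
```

The third gap, between the socket error on 172.25.83.205 and the connection
timeout on 172.25.83.204, comes out at roughly 30 secs, which is the period
questioned above.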

Rakesh

On Mon, Mar 6, 2017 at 10:11 AM, Tharindu Kumara <zonik.hatkum...@gmail.com>
wrote:

> Hi Rakesh,
>
> Any updates on this?
>
> Is this a bug or an expected situation?
>
> Thanks
>
>
>
> On Fri, Mar 3, 2017 at 10:49 AM, Tharindu Kumara <
> zonik.hatkum...@gmail.com>
> wrote:
>
> > Hi Rakesh,
> >
> > I have made a small mistake in the above email.
> > For client's session timeout I have used 2ms not 45000ms.
> > Sorry for the mistake.
> > Basically it waits for 28 seconds to connect to Server A. (14 seconds to
> > figure out disconnection and another 14 seconds to connect to Server B)
> >
> >
> >
> >
> > Then as soon as it connected to Server A, Server A sends the session
> > expiration.
> > Then I again did the same experiment with setting client's session
> > expiration time to 45000ms.
> > Then it took 15 secs for the client to receive the disconnection event.
> > From that point, the client spent another 30 secs trying to connect to
> > Server B.
> > After that, the client figured out that it cannot connect to Server B
> > and tried to initiate a connection to Server A.
> > Since 15 + 30 secs is 45, as soon as the client connected to Server A,
> > Server A sent the session expiration event.
> > What is the reason for that? It looks like the client spends too much
> > time trying to connect to Server B.
> >
> >
> > Here I attached the Server B's log (Leader) and client's log.
> > Server A = 172.25.83.201
> > Server B = 172.25.83.204
> > Server C = 172.25.83.205
> > And client's session id is 0x35a926acae8
> >
> > https://drive.google.com/file/d/0B9o9GJ_CoG1_NWRTQ3hmeGpZRktrc040MkZzTk5LTDdpMi1v/view?usp=sharing
> > https://drive.google.com/file/d/0B9o9GJ_CoG1_VHlKa092QzB6M3NPemRORk9Lc0w1a2N5ZDdz/view?usp=sharing
> >
> > Also, in the above post you mentioned a connection timeout.
> > Can you please explain a bit more?
> >
> > I see that when I try with both 45000ms and 2ms, it takes around 14 -
> > 15 seconds to figure out a disconnection from clients end.
> > Looks like it is a constant and client has (session timeout - connection
> > loss notified time) to find a new Server. Am I correct here?
> > You mentioned that client connection timeout is equal to
> > ("sessiontimeout/listed servers count").
> > Please explain a bit abo

Re: Sessions Expire due to Network partitioning in Zookeeper

2017-03-02 Thread Rakesh Radhakrishnan
>>> You mentioned that a client sends a ping every 1/3 the session timeout.
Yes, you are correct. To analyse your issue we also have to consider the
re-connection timeout, which is "sessiontimeout/listed servers count":
https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1292
https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1098

Coincidentally, in your example the heartbeat interval and the re-connection
interval are the same, because you have three servers.

>>>>> It looks like C3 has taken 14 seconds to determine the disconnected
>>>>> event and another 14 seconds to that it cannot connect to Server B
>>>>> (C3 is isolated from B).

With this info, the total elapsed time is 28 secs, which is less than the
45 sec session timeout. The client then has 17 secs (45 secs - 28 secs) to
re-establish a connection with server A, right? Could you please check
whether the client is connecting to A during this period?
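
The arithmetic above can be sketched as follows. This is only an
illustration of the rules as stated in this thread (a ping every 1/3 of the
session timeout, and a re-connection timeout of "sessiontimeout/listed
servers count"); the function names are hypothetical and the code is not
taken from the ZooKeeper client.

```python
def client_timeouts(session_timeout_ms, server_count):
    """Timeouts as described in this thread (illustration, not ZooKeeper source)."""
    return {
        # ping every 1/3 of the session timeout
        "ping_interval_ms": session_timeout_ms // 3,
        # re-connection timeout: session timeout / listed servers count
        "connect_timeout_ms": session_timeout_ms // server_count,
    }


def remaining_budget_ms(session_timeout_ms, detect_ms, failed_connect_ms):
    """Time left to reach another server before the session timeout elapses."""
    return session_timeout_ms - detect_ms - failed_connect_ms


# Values from the thread: 45 sec session timeout, three listed servers,
# 14 secs to detect the disconnect and 14 secs lost on server B.
print(client_timeouts(45000, 3))
print(remaining_budget_ms(45000, 14000, 14000))
```

With three listed servers both timeouts are 15000 ms, and the remaining
budget for reaching server A is 17000 ms.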

Rakesh



On Thu, Mar 2, 2017 at 6:58 PM, Tharindu Kumara <zonik.hatkum...@gmail.com>
wrote:

> Hi Rakesh,
>
> First of all thank you for the quick reply.
>
> >>>>> Actually, ZooKeeper client has retry mechanism.
> >>>>> Client sends a ping every 1/3 the session timeout (here, 3 is the no.
> of listed servers, A, B, C) and then looks for a response before another
> 1/3 elapses. This allows time to reconnect to a different server (and still
> maintain the session) if the connected server becomes unavailable.
>
> You mentioned that a client sends a ping every 1/3 the session timeout. And
> 3 is the no of listed servers.
>
> I doubt that, because I am using the C binding, and after inspecting the
> code it looks like 3 is a hard-coded value.
> No matter what the number of clients is, the zk client binding always
> sends a ping every 1/3 of the session timeout.
>
> Can you please clarify that for me?
>
> Here I used a tick of 3000ms and session expiration timeout of 45000ms.
>
> And please find the screenshot of the extracted client log output.
>
> https://anonimag.es/image/JT9htnL
>
> It looks like C3 has taken 14 seconds to determine the disconnected event
> and another 14 seconds to that it cannot connect to Server B(C3 is isolated
> from B).
>
>
>
> On Thu, Mar 2, 2017 at 4:08 PM, Rakesh Radhakrishnan <rake...@apache.org>
> wrote:
>
> > >>>> According to my understanding, it looks like, when a client trying
> to
> > >>>> connect to a server that it cannot connect due to a network
> > partitioning,
> > >>>> it uses a blocking call and it waits too much time trying to
> > >>>> connect to a server that it cannot communicate.
> >
> > Actually, ZooKeeper client has retry mechanism.
> > Client sends a ping every 1/3 the session timeout (here, 3 is the no. of
> > listed servers, A, B, C)
> > and then looks for a response before another 1/3 elapses. This allows
> time
> > to reconnect to a
> > different server (and still maintain the session) if the connected server
> > becomes unavailable.
> >
> > Could you grep the following log message in your client log and tell me
> how
> > much time C3 taken for the re-connection attempts.
> > "Client session timed out, have not heard from server in "
> >
> > C3 might have first attempted to reconnect to B and then A. Also, need to
> > check how much time C3 taken to detect connection failure from server C.
> >
> > Could you please share the zk client log to dig more.
> >
> > Rakesh
> >
> >
> > On Thu, Mar 2, 2017 at 11:04 AM, Tharindu Kumara <
> > zonik.hatkum...@gmail.com>
> > wrote:
> >
> > > 1) Could you tell me the status of Server C, is this lost connection to
> > the
> > >  > quorum and fails to join quorum continuously as B is the Leader
> ?
> > >
> > > Yes, B the leader. C Server is completely isolated from the Leader(B)
> > > and It cannot communicate with the Leader. C cannot continuously
> connect
> > to
> > > the
> > >
> > > Leader.
> > >
> > >
> > >  > 2) C3 is connected C. Please tell me the connection host string
> passed
> > > to
> > >  > this client. Does it contains all three servers info
> > "A:clientport,
> > >  >B:clientport, C:clientport" ?
> > >
> > > Yes, C3's connection string contains all three servers. ("A:clientport,
> > > B:clientport, C:cl

Re: Sessions Expire due to Network partitioning in Zookeeper

2017-03-02 Thread Rakesh Radhakrishnan
>>>> According to my understanding, it looks like, when a client trying to
>>>> connect to a server that it cannot connect due to a network
>>>> partitioning, it uses a blocking call and it waits too much time
>>>> trying to connect to a server that it cannot communicate.

Actually, the ZooKeeper client has a retry mechanism.
The client sends a ping every 1/3 of the session timeout (here, 3 is the no.
of listed servers, A, B, C) and then looks for a response before another 1/3
elapses. This allows time to reconnect to a different server (and still
maintain the session) if the connected server becomes unavailable.

Could you grep for the following log message in your client log and tell me
how much time C3 took for the re-connection attempts?
"Client session timed out, have not heard from server in "

C3 might have first attempted to reconnect to B and then A. Also, we need to
check how much time C3 took to detect the connection failure from server C.

Could you please share the zk client log so we can dig further?

Rakesh


On Thu, Mar 2, 2017 at 11:04 AM, Tharindu Kumara <zonik.hatkum...@gmail.com>
wrote:

>  > 1) Could you tell me the status of Server C, is this lost connection to
>  > the quorum and fails to join quorum continuously as B is the Leader ?
>
> Yes, B is the leader. Server C is completely isolated from the Leader (B)
> and cannot communicate with it. C continuously fails to connect to the
> Leader.
>
>
>  > 2) C3 is connected C. Please tell me the connection host string passed
>  > to this client. Does it contains all three servers info
>  > "A:clientport, B:clientport, C:clientport" ?
>
> Yes, C3's connection string contains all three servers. ("A:clientport,
> B:clientport, C:clientport")
>
>
>  > 3) Please check all three servers and client C3 logs to see any
>  >inconsistencies or exceptions.
>
> After looking at the logs, it seems that when server C is isolated from the
> Leader, a disconnect event fires to client C3. Then C3 spends too much time
> trying to connect to Server B (the Leader).
>
> But it cannot connect to server B, as we blocked the connection between
> Server C and Server B. Basically, C3 spends more than half of the session
> timeout trying to connect to Server B.
>
> Then, after figuring out that it cannot connect to Server B, C3 tries to
> connect to Server A, and it connects to Server A successfully. But this is
> too late, because the session has already expired by the time C3 connects.
>
> And this happens only sometimes. Because we specify all the servers in the
> client's connect string, sometimes after disconnecting from Server C, C3
> connects to Server A as the first attempt instead of trying Server B. In
> this case C3 connects to the quorum successfully before the session expires.
>
> According to my understanding, it looks like when a client is trying to
> connect to a server that it cannot reach due to a network partition, it
> uses a blocking call and waits too long trying to connect to a server that
> it cannot communicate with.
>
>
>
>  > 4) ZooKeeper version used in your testing ?
>
> I used zookeeper 3.4.9 (current stable release)
>
>
>
> On Thu, Mar 2, 2017 at 7:48 AM, Rakesh Radhakrishnan <rake...@apache.org>
> wrote:
>
> > Hi,
> >
> > Could you please give few more details,
> >
> > 1) Could you tell me the status of Server C, is this lost connection to
> the
> > quorum and fails to join quorum continuously as B is the Leader ?
> >
> > 2) C3 is connected C. Please tell me the connection host string passed to
> > this client. Does it contains all three servers info "A:clientport,
> > B:clientport, C:clientport" ?
> >
> > 3) Please check all three servers and client C3 logs to see any
> > inconsistencies or exceptions.
> >
> > 4) ZooKeeper version used in your testing ?
> >
> >
> > Rakesh
> >
> > On Wed, Mar 1, 2017 at 4:55 PM, Tharindu Kumara <
> zonik.hatkum...@gmail.com
> > >
> > wrote:
> >
> > > Recently, I carried out a test to find the behavior of clients when a
> > > client is partitioned from the ensemble.
> > >
> > > Here I used a ensemble of 3 zookeeper servers called A, B and C. And
> > quorum
> > > was set up like below.
> > >
> > > A - Follower
> > > B - Leader
> > > C - Follower​
> > >
> > > A  <---> B <---> C
> > >\/
> > >
> >

Re: Sessions Expire due to Network partitioning in Zookeeper

2017-03-01 Thread Rakesh Radhakrishnan
Hi,

Could you please give a few more details:

1) Could you tell me the status of Server C? Has it lost its connection to
the quorum, and does it continuously fail to join the quorum because B is
the Leader?

2) C3 is connected to C. Please tell me the connection host string passed to
this client. Does it contain all three servers, "A:clientport,
B:clientport, C:clientport"?

3) Please check the logs of all three servers and of client C3 for any
inconsistencies or exceptions.

4) Which ZooKeeper version did you use in your testing?


Rakesh

On Wed, Mar 1, 2017 at 4:55 PM, Tharindu Kumara 
wrote:

> Recently, I carried out a test to find the behavior of clients when a
> client is partitioned from the ensemble.
>
> Here I used an ensemble of 3 ZooKeeper servers called A, B and C. The
> quorum was set up as below.
>
> A - Follower
> B - Leader
> C - Follower
>
> A  <---> B <---> C
>  \_______________/
>
> And 3 clients are connected to the ensemble as below.
>
> C1 is connected to A
> C2 is connected to B
> C3 is connected to C
>
> I used iptables to remove the network link between B and C.
>
> command used: iptables -I INPUT -s 123.123.45.123 -j DROP
>
> After removing the link, the connections look like below.
>
> A  <---> B       C
>  \_______________/
>
> Simply, there is no way to communicate from B to C and vice versa.
>
> What I noticed is that the client connected to ZooKeeper server "C"
> could not reconnect to the ensemble, resulting in a session expiration.
>
> For this experiment I used a tickTime of 3000ms and a client session
> timeout of 45000ms. I also tested with different combinations.
>
> Can someone please explain the root cause of this behavior?
>


Re: Query: 3.5.x version as alpha

2017-02-22 Thread Rakesh Radhakrishnan
>>>> Can anyone confirm, is there any specific reason in the naming
>>>> convention in this release and to make it as "alpha".

As we know, the 3.5.x release line contains many critical features, such as
dynamic reconfig and SSL support. Naturally, the community is focusing on
stabilization: resolving blocker bugs and freezing public APIs
incrementally. The plan starts with a series of alpha releases that people
can run and test; once we have addressed all the blockers and feel
comfortable with the APIs and the remaining jiras, we switch to beta. Please
go through the following link, which gives an idea of the 3.5 release plans:
http://markmail.org/message/ymxliy2rrwjc2pmo

>>>> As per my understanding Alpha means its beta release and not
>>>> recommended to use in production, however as per below mail
>>>> it seems many customers using the alpha version, so there is no harm
>>>> to use this version.

As I mentioned in my previous mail, the 3.5.3 release is underway and
contains many fixes. You could try this upcoming version. In the meantime,
you could set up a test cluster using the latest available version, and
please feel free to report any issues.
It looks like lots of people are interested in seeing a stable 3.5 version;
if we get everyone to contribute more patches and code reviews, we should be
able to get there sooner. From the dev/user mailing list threads, I can see
that many companies are using the latest ZooKeeper release
(http://zookeeper.apache.org/doc/r3.5.2-alpha/) and reporting queries,
improvements, etc.:
http://zookeeper-user.578899.n2.nabble.com/Question-on-Release-3-5-x-td7581755.html

Probably other folks can share their thoughts.

Thanks,
Rakesh

On Wed, Feb 22, 2017 at 9:50 AM, Deepti Sharma S <
deepti.s.sha...@ericsson.com> wrote:

> Hello Team,
>
> Can anyone response on below query, this quite urgent for us.
>
>
>
> Ericsson
>
> DEEPTI SHARMA
> Senior Configuration Engineer
> ITIL 2011 Foundation Certified
> BICP, R
>
> Ericsson
> 3rd Floor, ASF Insignia - Block B Kings Canyon,
> Gwal Pahari, Gurgaon, Haryana 122 003, India
> Phone 0124-6243000
> deepti.s.sha...@ericsson.com
> www.ericsson.com
>
>
>
> Legal entity: EGI, registered office in Gwal Pahari, Haryana. This
> Communication is Confidential. We only send and receive email on the basis
> of the terms set out at www.ericsson.com/email_disclaimer
> -Original Message-
> From: Deepti Sharma S
> Sent: 20 February, 2017 2:50 PM
> To: user@zookeeper.apache.org
> Subject: RE: Query: 3.5.x version as alpha
>
> Hi Team,
>
> Can anyone confirm, is there any specific reason in the naming convention
> in this release and to make it as "alpha".
>
> As per my understanding Alpha means its beta release and not recommended
> to use in production, however as per below mail it seems many customers
> using the alpha version, so there is no harm to use this version.
>
> -Original Message-
> From: Rakesh Radhakrishnan [mailto:rake...@apache.org]
> Sent: 14 February, 2017 5:05 PM
> To: user@zookeeper.apache.org
> Subject: Re: Query: 3.5.x version as alpha
>
> Hi Deepti,
>
> Thanks for the interest in 3.5 features.
>
> AFAIK, there are many companies started using 3.5.2 latest alpha version.
> I believe with the community support, would be able to reach to a 3.5.3
> beta version soon. But, I don't have exact time frame of 3.5.3 releasing
> now, because that is a joint efforts of ZooKeeper community(devs, users
> etc).
> Probably, you could setup a test cluster with the available 3.5.2-alpha
> latest version and start analyzing/understanding the changes in this for
> smooth adoption to your eco system. Also, please feel free to report
> issues/queries to the dev/user mailing list. I hope some of us will help
> you.
>
> Release discussion thread:
> https://qnalist.com/questions/7887505/upcoming-3-4-3-5-releases
>
> Thanks,
> Rakesh
>
> On Tue, Feb 14, 2017 at 4:25 PM, Deepti Sharma S <
> deepti.s.sha...@ericsson.com> wrote:
>
> > Hello Team,
> >
> > We are using zookeeper version 3.4.3 which does not include the
> > encryption. As per our technical analysis the encryption support is
> > introduced in 3.5.0 versio

Re: Query: 3.5.x version as alpha

2017-02-14 Thread Rakesh Radhakrishnan
Hi Deepti,

Thanks for the interest in 3.5 features.

AFAIK, many companies have started using 3.5.2, the latest alpha version. I
believe that with community support we will be able to reach a 3.5.3 beta
version soon. But I don't have an exact time frame for the 3.5.3 release
now, because that is a joint effort of the ZooKeeper community (devs, users,
etc.). You could set up a test cluster with the available 3.5.2-alpha
version and start analyzing/understanding its changes for smooth adoption
into your ecosystem. Also, please feel free to report issues/queries to the
dev/user mailing list. I hope some of us will help you.

Release discussion thread:
https://qnalist.com/questions/7887505/upcoming-3-4-3-5-releases

Thanks,
Rakesh

On Tue, Feb 14, 2017 at 4:25 PM, Deepti Sharma S <
deepti.s.sha...@ericsson.com> wrote:

> Hello Team,
>
> We are using ZooKeeper version 3.4.3, which does not include
> encryption. As per our technical analysis, encryption support was
> introduced in version 3.5.0. However, the 3.5.x versions seem not to be
> stable, as per their "alpha" naming convention.
>
> Can you please confirm whether we can use the 3.5.x versions and whether
> there will be community support for them?
>


Re: CHANGES.txt?

2017-02-06 Thread Rakesh Radhakrishnan
Hi All,

I hope everyone agrees to remove the CHANGES.txt file from the ZK project
repository. I believe the ZOOKEEPER-2672 jira needs to be concluded (to
avoid confusion) before cutting the 3.4.10 release. I'm planning to remove
the CHANGES.txt file in two days' time, on 8 February 2017, 6:00 PM (PST),
if there is no objection. Thanks!

Thanks & Regards,
Rakesh

On Thu, Feb 2, 2017 at 9:06 PM, Rakesh Radhakrishnan <rake...@apache.org>
wrote:

> +1 for removing the CHANGE.txt file.
>
> I agree to rely on source control (github) revision logs instead of
> CHANGE.txt moving forward.
>
> Note: Jira issue ZOOKEEPER-2672 will be used to remove this file.
>
> Thanks,
> Rakesh
>
>
> On Sat, Dec 17, 2016 at 3:59 AM, Michael Han <h...@cloudera.com> wrote:
>
>> See https://www.mail-archive.com/dev@zookeeper.apache.org/msg37108.html
>>
>> I think there was no decision explicitly made but looks like every one was
>> OK to head towards the direction of removing CHANGE.TXT.
>>
>> On Fri, Dec 16, 2016 at 3:14 AM, Flavio Junqueira <f...@apache.org> wrote:
>>
>> > Could anyone remind me of what we decided for CHANGES.txt? Are we
>> supposed
>> > to add resolved jiras to it or are we going to rely on the git log?
>> >
>> > I have committed ZK-761 to master without the change to CHANGES.txt, but
>> > then I noticed that some commits are going in with it, so I did it to
>> the
>> > 3.5 branch. I want to know if I need to make it consistent or not.
>> >
>> > -Flavio
>>
>>
>>
>>
>> --
>> Cheers
>> Michael.
>>
>
>


Re: CHANGES.txt?

2017-02-02 Thread Rakesh Radhakrishnan
+1 for removing the CHANGES.txt file.

I agree to rely on source control (github) revision logs instead of
CHANGES.txt moving forward.

Note: Jira issue ZOOKEEPER-2672 will be used to remove this file.

Thanks,
Rakesh


On Sat, Dec 17, 2016 at 3:59 AM, Michael Han  wrote:

> See https://www.mail-archive.com/dev@zookeeper.apache.org/msg37108.html
>
> I think there was no decision explicitly made but looks like every one was
> OK to head towards the direction of removing CHANGE.TXT.
>
> On Fri, Dec 16, 2016 at 3:14 AM, Flavio Junqueira  wrote:
>
> > Could anyone remind me of what we decided for CHANGES.txt? Are we
> supposed
> > to add resolved jiras to it or are we going to rely on the git log?
> >
> > I have committed ZK-761 to master without the change to CHANGES.txt, but
> > then I noticed that some commits are going in with it, so I did it to the
> > 3.5 branch. I want to know if I need to make it consistent or not.
> >
> > -Flavio
>
>
>
>
> --
> Cheers
> Michael.
>


Re: security

2016-12-16 Thread Rakesh Radhakrishnan
I believe that with community support we will be able to reach a 3.5.x beta
version soon.
FYI, please refer to the release discussion thread:
https://qnalist.com/questions/7887505/upcoming-3-4-3-5-releases

Rakesh

On Fri, Dec 16, 2016 at 1:06 PM, FaXin Zhong <faxin.zh...@ericsson.com>
wrote:

> Hi,
>
> OK.  3.5.x are still alpha or being beta version, when will the formal
> stable version release, can you foresee?  Thanks.
>
> BRs/Faxin
>
> -Original Message-
> From: Michael Han [mailto:h...@cloudera.com]
> Sent: den 15 december 2016 19:48
> To: UserZooKeeper <user@zookeeper.apache.org>
> Subject: Re: security
>
> >> is there any plan to support SSL
> There is ZOOKEEPER-1000
> <https://issues.apache.org/jira/browse/ZOOKEEPER-1000>, but no one is
> actively pushing this.
>
> >>  Does zookeeper provide KDC HA as off-shelf support?
> HA of KDC is not part of ZooKeeper's responsibility. KDC has its own HA
> solutions (i.e. through master slave replication). The test report is a
> record of what's done for the purpose of testing, and is not a reference
> for a product deployment.
>
>
> On Thu, Dec 15, 2016 at 2:34 AM, FaXin Zhong <faxin.zh...@ericsson.com>
> wrote:
>
> > Hi,
> >
> > Many thanks for the info.  For the server-server communication, is
> > there any plan to support SSL as well?  We better have one security
> > approach for client and server.
> >
> > The test report mentions installing the KDC on sever 1, how to secure
> > the KDC HA? Does zookeeper provide KDC HA as off-shelf support?
> >
> > BRs/Faxin
> >
> > -Original Message-
> > From: Rakesh Radhakrishnan [mailto:rake...@apache.org]
> > Sent: den 14 december 2016 14:24
> > To: user@zookeeper.apache.org
> > Subject: Re: security
> >
> > Hi,
> >
> > Adding one more point to the above. Please refer the test report here,
> > https://goo.gl/qNR45M
> >
> > Both the issues mentioned in the report has been discussed.
> > Problem-1)  This has been taken care and corrected the document
> > Problem-2) This is a deployment mistake. Please go through the
> > analysis section and has to be taken care during deployment.
> >
> > Thanks,
> > Rakesh
> >
> > On Wed, Dec 14, 2016 at 6:41 PM, Rakesh Radhakrishnan
> > <rake...@apache.org>
> > wrote:
> >
> > > 1 => AFAIK, there are many companies adopted 3.5.x latest alpha
> > > version and no major issues reported so far. I hope beta release
> > > will be out soon at the first quarter of next year if there is no
> > > blockers/critical issues by anyone. IIUC, 3.5.3 release discussion
> > > is in progress. Probably, you can do a trial run and start
> > > analyzing/understanding the changes in 3.5.x latest version
> > (3.5.2-alpha) for smooth adoption to your eco system.
> > >
> > > 2 => Thanks for the interest on this feature. This work has been
> > > committed into the branch 3.4 recently(two weeks back) and planning
> > > 3.4.10 release asap including this feature. Again, the release
> > discussion is in progress.
> > > This feature has been tested by multiple folks and the test reports
> > > are available. Please go through the below links to understand more
> > > on
> > this.
> > > I'd really appreciate if you could test this feature and publish
> > feedback.
> > > Thanks! Please feel free to contact or discuss issues, some of us
> > > will help you. There are plans to forward port this feature to
> > > branch 3.5 via
> > > ZOOKEEPER-2639 task.
> > >
> > > https://qnalist.com/questions/7332914/test-plan-for-zk-1045-call-for-volunteers
> > > https://issues.apache.org/jira/secure/attachment/12834567/ZOOKEEPER-1045%20Test%20Plan.pdf
> > > - The problems mentioned in this test report is already taken care.
> > >
> > > Feature documentation is getting ready and draft version is
> > > available
> > here.
> > > https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+and+SASL+authentication
> > > Documentation review is going on.
> > >
> > > Regards,
> > > Rakesh
> > >
> > > On Wed, Dec 14, 2016 at 5:54 PM, FaXin Zhong
> > > <faxin.zh...@ericsson.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> Our product is using zookeeper. I have some security questions
> > >> about zookeeper as below.
> > >>
> > >>
> > >> 1.   We want to use ssl for the client-server communication,
> > >> zookeeper supports it since 3.5.1, while it's alpha version,  is it
> > >> OK to upgrade zookeeper to 3.5.1 or latest? We are currently using
> > >> 3.4.8 for customers.
> > >>
> > >>
> > >> 2.   Does zookeeper support server-server secure communication as
> > >> well?  Or any plan? I don't find it in zookeeper documents, but
> > >> found some JIRA stuff
> > >> "ZOOKEEPER-1045<https://issues.apache.org/jira/browse/ZOOKEEPER-1045>
> > >> covers server-server mutual authentication by SASL", what do
> > >> you think of it for commercial usage?
> > >>
> > >> Thanks a lot!
> > >>
> > >> BRs/Faxin
> > >>
> > >
> > >
> >
>
>
>
> --
> Cheers
> Michael.
>


Re: security

2016-12-14 Thread Rakesh Radhakrishnan
Hi,

Adding one more point to the above. Please refer to the test report here:
https://goo.gl/qNR45M

Both issues mentioned in the report have been discussed.
Problem-1) This has been taken care of and the document has been corrected.
Problem-2) This is a deployment mistake. Please go through the analysis
section; this has to be taken care of during deployment.

Thanks,
Rakesh

On Wed, Dec 14, 2016 at 6:41 PM, Rakesh Radhakrishnan <rake...@apache.org>
wrote:

> 1 => AFAIK, many companies have adopted the latest 3.5.x alpha version,
> and no major issues have been reported so far. I hope a beta release will
> be out in the first quarter of next year if nobody reports blocker/critical
> issues. IIUC, the 3.5.3 release discussion is in progress. You could do a
> trial run and start analyzing/understanding the changes in the latest
> 3.5.x version (3.5.2-alpha) for smooth adoption into your ecosystem.
>
> 2 => Thanks for the interest in this feature. This work was committed to
> branch 3.4 recently (two weeks back), and we are planning a 3.4.10 release
> including this feature asap. Again, the release discussion is in progress.
> This feature has been tested by multiple folks and the test reports are
> available. Please go through the links below to understand more about this.
> I'd really appreciate it if you could test this feature and publish
> feedback. Thanks! Please feel free to contact us or discuss issues; some of
> us will help you. There are plans to forward-port this feature to branch
> 3.5 via the ZOOKEEPER-2639 task.
>
> https://qnalist.com/questions/7332914/test-plan-for-zk-1045-call-for-volunteers
> https://issues.apache.org/jira/secure/attachment/12834567/ZOOKEEPER-1045%20Test%20Plan.pdf
> - The problems mentioned in this test report have already been taken care
> of.
>
> Feature documentation is getting ready and a draft version is available
> here:
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+and+SASL+authentication
> Documentation review is going on.
>
> Regards,
> Rakesh
>
> On Wed, Dec 14, 2016 at 5:54 PM, FaXin Zhong <faxin.zh...@ericsson.com>
> wrote:
>
>> Hi,
>>
>> Our product is using zookeeper. I have some security questions about
>> zookeeper as below.
>>
>>
>> 1.   We want to use ssl for the client-server communication,
>> zookeeper supports it since 3.5.1, while it's alpha version,  is it OK to
>> upgrade zookeeper to 3.5.1 or latest? We are currently using 3.4.8 for
>> customers.
>>
>>
>> 2.   Does zookeeper support server-server secure communication as
>> well?  Or any plan? I don't find it in zookeeper documents, but found some
>> JIRA stuff
>> "ZOOKEEPER-1045<https://issues.apache.org/jira/browse/ZOOKEEPER-1045>
>> covers server-server mutual authentication by SASL", what do
>> you think of it for commercial usage?
>>
>>
>> Thanks a lot!
>>
>> BRs/Faxin
>>
>
>


Re: zookeeper discnnected every minutes when i am getting bulk data

2016-12-02 Thread Rakesh Radhakrishnan
As Flavio mentioned in the previous mail, could you please check the
ZooKeeper server logs? IMHO, that will help us understand the status of the
cluster, and we may get hints to identify the problem.

Thanks for detailing your architecture. I will probably come back to this
after analyzing the ZK server logs.

-Rakesh-

On Fri, Dec 2, 2016 at 3:45 PM, AjeetSingh 
wrote:

> Hi,
> Please follow the below Arch for my setup.
>
>
> Logstash (Shipper)+Kafka/Zookeeper Cluster 3 nodes+Logstash
> (Indexer)+LB+ElasticSearch Cluster(8 nodes)+LB+KIbana
>
> This is my Arch for current setup but that would be scale in future but
> before going to scale need to resolve this issue.
>
> Please help me.
>
>
>
> Cheers,
> Ajeet S
>
>
> --
> View this message in context:
> http://zookeeper-user.578899.n2.nabble.com/zookeeper-discnnected-every-minutes-when-i-am-getting-bulk-data-tp7582817p7582822.html
> Sent from the zookeeper-user mailing list archive at Nabble.com.
>


Re: AvgRequestLatency metric always zero

2016-12-01 Thread Rakesh Radhakrishnan
Thanks Arshad for the good analysis.

>>>> How can we go about getting these fixes?

Please feel free to raise an improvement task in the ZooKeeper issue
tracker: https://issues.apache.org/jira/browse/ZOOKEEPER
Also, I'd appreciate it if you could submit a patch to fix it. You can refer
to the "how to contribute" page:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute

On Fri, Dec 2, 2016 at 3:25 AM, allen chan 
wrote:

> Arshad,
> How can we go about getting these fixes?
>
>
> On Thu, Dec 1, 2016 at 12:01 PM, Arshad Mohammad <
> arshad.mohamma...@gmail.com> wrote:
>
> > Hi Allen
> >
> > AvgRequestLatency 0 implies that the average time taken by the server
> > to serve a request is less than one millisecond. Maybe it is really 0,
> > or it may be 0.99 ms.
> >
> > This behaviour is not specific to versions 3.4.7 and 3.4.9; the same
> > behaviour is there in all versions.
> >
> > I find two reasons why AvgRequestLatency is 0 almost all the time:
> >
> > 1) Ping requests are counted the most:
> >
> > AvgRequestLatency is calculated as
> >
> > AvgRequestLatency=totalLatency/count
> >
> > Ping requests come very often and complete very fast; these requests add
> > nothing to totalLatency but add one to count.
> >
> > 2) Wrong data type is chosen to store AvgRequestLatency:
> >
> > AvgRequestLatency is calculated and stored as a long value instead of
> > a double value.
> >
> >
> >
> > In my opinion the ZooKeeper code should be modified to improve this
> > metric:
> >
> > i) Ping requests should be ignored while recording the statistics, or it
> > should at least be configurable whether to ignore them. If ping requests
> > are not counted, the other metrics will be more meaningful as well.
> >
> > ii)  AvgRequestLatency should be of double type.
> >
> >
> >
> > -Arshad
> >
> > On Thu, Dec 1, 2016 at 4:50 AM, allen chan  >
> > wrote:
> >
> > > Anyone seeing this issue? I am experiencing it on 3.4.7 and 3.4.9
> > >
> > > JMX metric name: AvgRequestLatency
> > > JMX Location:
> > > org.apache.ZooKeeperService:name0=ReplicatedServer_id#,
> > > name1=replica.#,name2=[Leader|Follower]:AvgRequestLatency
> > >
> > > It always has a value of zero. The MaxRequestLatency is non-zero and
> > > changes while the AvgRequestLatency always is zero.
> > >
> > > The zk_avg_latency metric in mntr is also zero. Is this metric not
> > > tracked anymore?
> > >
> > > Thanks
> > > --
> > > Allen Michael Chan
> > >
> >
>
>
>
> --
> Allen Michael Chan
>
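
The two causes Arshad describes above can be illustrated with a small
sketch. It is not the ZooKeeper code; the sample latencies are made up,
and Python integer division (//) stands in for Java long division.

```python
# Each sample is (latency_ms, is_ping). The numbers are invented for
# illustration: many fast pings plus a few slower real requests.
samples = [(0, True)] * 1000 + [(50, False)] * 10


def avg_as_long(samples):
    """Integer average; like Java long division, the fraction is dropped."""
    total = sum(latency for latency, _ in samples)
    return total // len(samples)


def avg_as_double(samples):
    """Floating-point average of the same samples."""
    total = sum(latency for latency, _ in samples)
    return total / len(samples)


def without_pings(samples):
    """Drop ping requests before computing statistics."""
    return [s for s in samples if not s[1]]


print(avg_as_long(samples))                 # pings dilute, then truncation
print(avg_as_double(samples))               # still diluted, but not zero
print(avg_as_long(without_pings(samples)))  # real requests only
```

With these made-up samples the total latency is 500 ms over 1010 requests,
so the integer average truncates to 0 even though the real requests took
50 ms each.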


Re: How to use Kerberose auth while using Zookeeper client API

2016-11-29 Thread Rakesh Radhakrishnan
Hi Xie Gang,

I hope the following links will help you,

https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zookeeper+and+SASL
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#sc_java_client_configuration
http://www.cloudera.com/documentation/cdh/5-1-x/CDH5-Security-Guide/cdh5sg_zookeeper_security.html

Thanks,
Rakesh

On Wed, Nov 30, 2016 at 11:18 AM, Xie Gang  wrote:

> Hello,
>
> I'm writing a program with the Zookeeper client API, but I need Kerberos
> auth to access the server. How do I enable it in the client program? Is
> there any guide? Thanks in advance.
>
> --
> Xie Gang
>


Re: How to determine if the node's type is PERSISTENT_SEQUENTIAL

2016-11-29 Thread Rakesh Radhakrishnan
Could you share more details about the problem you are facing while copying
nodes of type PERSISTENT_SEQUENTIAL? I'd prefer to fix those issues if
possible, rather than working around them.

Rakesh

On Tue, Nov 29, 2016 at 8:22 PM, Xie  wrote:

> Hello,
>
>
> I'm developing a tool to migrate the data from a ZK cluster, but found that
> copying nodes of type PERSISTENT_SEQUENTIAL could be a problem. So, I
> have to skip those nodes. But how can I get the node type with the API?
>
>
>
>
> Thanks,
> Gang


Re: Zookeeper upgrade

2016-11-10 Thread Rakesh Radhakrishnan
You could probably do a rolling restart. You have to do this manually, as
follows.


Prerequisite: Ensure all the servers in the cluster are running. For
example, take a three-node cluster A, B and C, where A & B are Followers
and C is the Leader. This ensures that if one of the servers (say A) is
stopped during the upgrade step, the other servers (B & C) can still
provide service to the ZooKeeper clients.

step-1) stop one server,
step-2) upgrade the software on it,
step-3) restart the server.
The upgrade is now complete on this server and you can proceed to the next,
one server at a time.

The restarted server will rejoin the quorum. The side effect is that,
since this server gets stopped, all the clients (sessions) connected to it
will be Disconnected and will then Reconnect to another server in the ZK
cluster.

When you do the above steps on the LEADER server, the quorum will be lost:
all the other running PARTICIPANTS will enter the quorum re-formation phase
and elect one among them as the new LEADER. Here also, all the clients
(sessions) connected to this cluster will be Disconnected and then
Reconnected safely. AFAIK, clients will get the Disconnected and then
Reconnected events twice, if you have a watcher registered.

I'd prefer to start the rolling upgrade with the OBSERVERs, then the
FOLLOWERs, and pick the LEADER server last. This way you get a chance to
monitor each newly upgraded server for any inconsistencies, with fewer
client Disconnections. If any issues are encountered, roll back.
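
The ordering and the prerequisite described above can be sketched as
follows. This is only an illustration under assumptions: the server names
and the role map are hypothetical, and in practice you would look up each
server's current role yourself before planning the order.

```python
# Upgrade observers first, then followers, and the leader last.
ROLE_ORDER = {"observer": 0, "follower": 1, "leader": 2}


def rolling_upgrade_order(servers):
    """servers maps server name -> role; returns names in upgrade order."""
    return sorted(servers, key=lambda name: (ROLE_ORDER[servers[name]], name))


def can_stop_one(participants_running, participants_total):
    """True if a majority of voting servers stays up after stopping one."""
    return (participants_running - 1) > participants_total // 2


# Hypothetical three-node cluster from the example: A, B followers, C leader.
cluster = {"A": "follower", "B": "follower", "C": "leader"}
print(rolling_upgrade_order(cluster))
print(can_stop_one(3, 3))  # all three running: safe to stop one
print(can_stop_one(2, 3))  # one already down: stopping another loses quorum
```

Observers do not vote, so only the voting participants are counted in
can_stop_one.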

Important note: Please try the rolling upgrade steps in your test cluster
first, and only then do it in the production cluster.

References:-
https://wiki.apache.org/hadoop/ZooKeeper/FAQ#A6
https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_ig_zookeeper_earlier_cdh5_upgrade.html

Thanks,
Rakesh

On Fri, Nov 11, 2016 at 9:22 AM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Hello All,
>
> I am using zookeeper 3.4.5 version in production and would like to upgrade
> to the latest version 3.4.9
>
> Can someone point me to the correct upgrade(with out having a downtime)
> path?.
>


Re: Adding and removing replicas?

2016-10-20 Thread Rakesh Radhakrishnan
Hi Steve,

I'd suggest you look at the latest ZooKeeper version, 3.5.2, and use its
dynamic reconfig feature. This will help you resize your cluster (add/remove
zk servers) without restarting the entire cluster.

Please refer to the following links to understand more about the dynamic
reconfig feature:
https://zookeeper.apache.org/doc/r3.5.2-alpha/zookeeperReconfig.html
http://www.slideshare.net/Hadoop_Summit/dynamic-reconfiguration-of-zookeeper

Regards,
Rakesh

On Thu, Oct 20, 2016 at 3:19 AM, Steve Newman  wrote:

> Apologies for a basic question, but I've been researching and haven't been
> able to find the answer online.
>
> What is the best way to add or remove replicas from a running ZooKeeper
> cluster, with minimal downtime? To add a replica, the naive answer would
> seem to be:
>
> 1. Prepare the new replica(s), i.e. install ZooKeeper and set up the
> configuration files.
> 2. Edit the configuration for all replicas (new and existing) to list the
> new replicas.
> 3. Restart all replicas. (Simultaneously? Or gradually, one at a time?)
>
> Is this the best way to do it? Step 3 seems scary in a production cluster.
> Also, will the new replicas smoothly pick up the existing data, or is it
> better to seed them with a snapshot somehow?
>
> Similarly, the naive answer for removing a replica would seem to be:
>
> 1. Halt the ZooKeeper process.
> 2. Edit the configuration for all other replicas to remove the replica
> that's going away.
> 3. Restart all remaining replicas (one at a time?).
>
> Again, is this the best approach?
>
> Thanks,
> Steve
>


Re: Zookeeper - Network partitioning behaviour

2016-10-06 Thread Rakesh Radhakrishnan
Hi Imesha Sudasingha,

For example, say we have five servers A, B, C, D and E that have formed a
quorum, and assume A is the Leader. Now assume a network partition happens
between A,B (the minority region) and C,D,E. As we know, 3 is the majority
needed to form a quorum. Since Leader A is in the minority region, the
existing quorum will be shut down and all the servers will automatically
move to the leader election phase. A & B will keep sending notifications to
each other but will never succeed in forming a quorum, because they are
fewer than 3. On the other side, C, D and E will exchange notifications
with each other and elect one of them as Leader.
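
The majority arithmetic in this example can be sketched as below. It is a
minimal illustration, assuming all five servers are voting participants
with equal weight (no observers, no weighted groups).

```python
def majority(ensemble_size):
    """Smallest number of voting servers that forms a quorum."""
    return ensemble_size // 2 + 1


def can_form_quorum(partition, ensemble_size):
    """True if the servers on one side of a partition can elect a leader."""
    return len(partition) >= majority(ensemble_size)


ensemble = ["A", "B", "C", "D", "E"]
minority_side = ["A", "B"]
majority_side = ["C", "D", "E"]

print(majority(len(ensemble)))
print(can_form_quorum(minority_side, len(ensemble)))
print(can_form_quorum(majority_side, len(ensemble)))
```

Only the C,D,E side reaches the majority of 3, which is why A and B keep
exchanging election notifications without ever electing a leader.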

>But will the partition with a minority will also continue to serve read
>requests until a write request comes to the leader in the minority
>partition?
After the partition, all the servers in the minority region will stop
serving and move to the leader election phase. All the client sessions
connected to these servers will be disconnected and will receive the
"KeeperState.Disconnected" event on their watchers, if any are registered.

However, ZooKeeper supports a read-only server mode. In this mode, a client
can connect to a read-only server even when that server is partitioned
from the quorum.

Reference: please read the section "Read Only Mode Server" in the Apache
documentation, https://zookeeper.apache.org/doc/r3.4.9/zookeeperAdmin.html


>In the above scenario, what will happen to the "watches" that have
>already been registered to a node in the minority partition?
Since the quorum is re-forming, all the client session watchers will
receive the "KeeperState.Disconnected" event. These client sessions will
start sending connection requests to all the quorum servers (A,B,C,D,E) to
re-establish the connection. In my example above, C,D,E re-form the quorum
successfully and the client sessions will reconnect to one of these servers
(I am assuming the zkclient has the C,D,E server host addresses
configured). Clients automatically reset their watches on a successful
session reconnect. There is a feature to disable this watch resetting;
please read the "zookeeper.disableAutoWatchReset" configuration section in
the following Apache doc link to understand more on this,
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html

Regards,
Rakesh

On Wed, Oct 5, 2016 at 6:39 PM, Imesha Sudasingha 
wrote:

> Hi all,
>
> When I was going through zookeeper, I noticed the following scenarios,
>
>    - Suppose we have a 5 member quorum operating as normal. In a network
>    partitioning, the leader and a member node go into one partition and
>    the other 3 nodes get into the other network partition. I know that
>    the side with the majority will elect a new leader and carry on to
>    serve requests. But will the partition with a minority also continue
>    to serve read requests until a write request comes to the leader in
>    the minority partition? How does zookeeper handle this occasion?
>    - In the above scenario, what will happen to the "watches" that have
>    already been registered to a node in the minority partition?
>
> Can you clarify the above scenarios?
> Thanks in advance!
>
> --
> *Imesha Sudasingha*
> Undergraduate of Department of Computer Science and  Engineering,
> University of Moratuwa.
> +94717086160
>


Re: [ANNOUNCE] Apache ZooKeeper 3.4.9

2016-09-05 Thread Rakesh Radhakrishnan
FYI, created ZOOKEEPER-2552 to revisit the release note and do the
corrective actions.

Rakesh


Re: [ANNOUNCE] Apache ZooKeeper 3.4.9

2016-09-05 Thread Rakesh Radhakrishnan
Thanks Edward for pointing this out. The fix version of these issues was
wrongly marked as "3.4.8"; all of them were created after the 3.4.8
release, and that has caused the trouble. I will raise a JIRA to correct
the release notes.

Regards,
Rakesh

On Mon, Sep 5, 2016 at 5:46 PM, Edward Ribeiro <edward.ribe...@gmail.com>
wrote:

> Hi,
>
> I have seen a couple of issues listed on
> http://zookeeper.apache.org/doc/r3.4.9/releasenotes.html that are
> either 'Open' or 'Patch available'. I discovered this by accident
> because I will start working on the open one today. ;)
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2512
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2391
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-2468
>
> Cheers,
> Edward
>
> On Mon, Sep 5, 2016 at 1:22 AM, Patrick Hunt <ph...@apache.org> wrote:
>
> > Kudos Rakesh on pushing this through. Thanks to everyone that
> > contributed.
> >
> > Patrick
> >
> > On Sun, Sep 4, 2016 at 1:57 AM, Flavio Junqueira <f...@apache.org> wrote:
> >
> > > Great to see it out! Thanks Rakesh and community.
> > >
> > > -Flavio
> > >
> > > > On 04 Sep 2016, at 05:32, Rakesh Radhakrishnan <rake...@apache.org>
> > > wrote:
> > > >
> > > > The Apache ZooKeeper team is proud to announce Apache ZooKeeper
> > > > version 3.4.9.
> > > >
> > > > ZooKeeper is a high-performance coordination service for distributed
> > > > applications. It exposes common services - such as naming,
> > > > configuration management, synchronization, and group services - in a
> > > > simple interface so you don't have to write them from scratch. You
> > > > can use it off-the-shelf to implement consensus, group management,
> > > > leader election, and presence protocols. And you can build on it for
> > > > your own, specific needs.
> > > >
> > > > For ZooKeeper release details and downloads, visit:
> > > > http://zookeeper.apache.org/releases.html
> > > >
> > > > ZooKeeper 3.4.9 Release Notes are at:
> > > > http://zookeeper.apache.org/doc/r3.4.9/releasenotes.html
> > > >
> > > > We would like to thank the contributors that made the release
> > > > possible.
> > > >
> > > > Regards,
> > > > The ZooKeeper Team
> > >
> > >
> >
>


[ANNOUNCE] Apache ZooKeeper 3.4.9

2016-09-03 Thread Rakesh Radhakrishnan
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.4.9.

ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming,
configuration management, synchronization, and group services - in a
simple interface so you don't have to write them from scratch. You can
use it off-the-shelf to implement consensus, group management, leader
election, and presence protocols. And you can build on it for your
own, specific needs.

For ZooKeeper release details and downloads, visit:
http://zookeeper.apache.org/releases.html

ZooKeeper 3.4.9 Release Notes are at:
http://zookeeper.apache.org/doc/r3.4.9/releasenotes.html

We would like to thank the contributors that made the release possible.

Regards,
The ZooKeeper Team


Re: zookeeper 3.4.6 listening on unknown random port

2016-08-31 Thread Rakesh Radhakrishnan
Hi Mazhar,

I suspect this is the random port opened for JMX remote monitoring. Did you
configure the JMXPORT environment variable to avoid a random port?

Rakesh

On Wed, Aug 31, 2016 at 4:06 PM, Mazhar Shaikh 
wrote:

> Hi All,
>
> I'm running 3 zookeeper processes and noticed that each of the 3 is
> listening on one unknown random port.
>
> Can someone please help me understand what this port is used for?
>
> This port is not handling any connections, and each of the 3 zookeepers
> has a different port number.
>
> Node1 -> has 1 zookeeper (z1)
> node2 -> has 2 zookeeper (z2 & z3)
>
> server.1=z1.zookeeper.com:2888:3888
> server.2=z2.zookeeper.com:2888:3888
> server.3=z3.zookeeper.com:2889:3889
>
> ** Node1 - z1 *
> Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
> Clients:
>  /192.168.2.116:53028[1](queued=0,recved=173,sent=173)
>  /192.168.2.27:38375[1](queued=0,recved=175,sent=175)
>  /192.168.2.28:42044[1](queued=0,recved=161,sent=161)
>  /192.168.2.116:55142[0](queued=0,recved=1,sent=0)
>  /192.168.2.216:50024[1](queued=0,recved=176,sent=176)
>
> Latency min/avg/max: 0/149/14357
> Received: 1765
> Sent: 1753
> Connections: 5
> Outstanding: 0
> Zxid: 0x116b8
> Mode: follower
> Node count: 216
>
> ** Node2 - z2 *
> Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
> Clients:
>  /192.168.2.17:45103[1](queued=0,recved=195,sent=195)
>  /192.168.2.18:38099[1](queued=0,recved=176,sent=176)
>  /192.168.2.116:60666[0](queued=0,recved=1,sent=0)
>
> Latency min/avg/max: 0/308/21523
> Received: 9680
> Sent: 9499
> Connections: 3
> Outstanding: 0
> Zxid: 0x116b8
> Mode: leader
> Node count: 216
>
> ** Node2 - z3 *
> Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
> Clients:
>  /192.168.2.116:47141[0](queued=0,recved=1,sent=0)
>
> Latency min/avg/max: 0/0/0
> Received: 13
> Sent: 12
> Connections: 1
> Outstanding: 0
> Zxid: 0x116b8
> Mode: follower
> Node count: 216
>
> ***
>
> z1 - ports
> # netstat -anp | grep 15643 | grep LISTEN
> tcp6   0  0 :::36895            :::*    LISTEN  15643/java
> tcp6   0  0 :::2181             :::*    LISTEN  15643/java
> tcp6   0  0 169.254.2.116:3888  :::*    LISTEN  15643/java
>
>
> z2 - ports
> # netstat -anp | grep 21235 | grep LISTEN
> tcp6   0  0 :::2181             :::*    LISTEN  21235/java
> tcp6   0  0 :::40037            :::*    LISTEN  21235/java
> tcp6   0  0 169.254.2.216:2888  :::*    LISTEN  21235/java
> tcp6   0  0 169.254.2.216:3888  :::*    LISTEN  21235/java
>
> z3 - ports
> root@sysctrl2:~# netstat -anp | grep 40497 | grep LISTEN
> tcp6   0  0 :::2182             :::*    LISTEN  40497/java
> tcp6   0  0 169.254.2.1:3889    :::*    LISTEN  40497/java
> tcp6   0  0 :::42455            :::*    LISTEN  40497/java
>
>
> Here "z1,z2,z3" are listening on "36895, 40037 & 42455" port respectively
> which has not been configured.
>
> Thanks.
>
> Regards,
> Mazhar Shaikh.
>


Re: [ANNOUNCE] Chris Nauroth joins the Apache ZooKeeper PMC

2016-08-07 Thread Rakesh Radhakrishnan
Congratulations, Chris!

Rakesh

On Sun, Aug 7, 2016 at 11:35 PM, Flavio Junqueira  wrote:

> In recognition of all his contributions to the project, the Apache
> ZooKeeper PMC has invited Chris Nauroth to join the PMC and he has
> accepted. I'd like to take the opportunity to thank Chris for his
> contributions and commitment to the project. Thank you and congratulations
> for joining the PMC, Chris!
>
> -Flavio


Re: Error Start ZK 3.5.1 a second time

2016-07-14 Thread Rakesh Radhakrishnan
Hi Curtis,

>>>but should I open a ticket?  Is this a bug?

I can see that ZOOKEEPER-2244 addresses the case you have mentioned, and it
is fixed in version 3.5.2.

The 3.5.2-alpha release will most probably happen soon; voting is in
progress. If you are interested, you can try testing your case using the
release candidate, which is available in the staging repo ->
https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.2-alpha/

Regards,
Rakesh

On Thu, Jul 14, 2016 at 9:06 PM, Cantrell, Curtis 
wrote:

> Ok.  I'm on a windows machine.   All I had to do was to open the zoo.cfg
> that had been written by zookeeper when the backup was created and turn the
> slashes around, and now the server comes up...
>
> dynamicConfigFile=D:/zookeeper1-3.5.1/conf/zoo.cfg.dynamic.1
>
> It looks like there is something that cannot handle a backslash, and
> zookeeper is writing the backslash itself.
>
> Thank you,
> All solved  but should I open a ticket?  Is this a bug?
>
> Thank you,
> Curtis
>
>
> -Original Message-
> From: Cantrell, Curtis [mailto:curtis.cantr...@bkfs.com]
> Sent: Thursday, July 14, 2016 11:06 AM
> To: user@zookeeper.apache.org
> Subject: RE: Error Start ZK 3.5.1 a second time
>
> It looks like a file separator issue when reading where the
> dynamicConfigFile is located from the zoo.cfg
>
> This is what is written in the zoo.cfg
>
>
>  dynamicConfigFile=D:\zookeeper3-3.5.1\conf\zoo.cfg.dynamic.1
>
> But this is the complaint of the FileNotFoundException on startup.
>
>Caused by: java.io.FileNotFoundException:
> D:zookeeper1-3.5.1confzoo.cfg.dynamic.1 (The system cannot find the
> file specified)
>
> Is there a file separator problem?
>
> Thank you,
> Curtis
>
>
>
> The information contained in this message is proprietary and/or
> confidential. If you are not the intended recipient, please: (i) delete the
> message and all copies; (ii) do not disclose, distribute or use the message
> in any manner; and (iii) notify the sender immediately. In addition, please
> be aware that any message addressed to our domain is subject to archiving
> and review by persons other than the intended recipient. Thank you.
>
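
The symptom Curtis reports above (backslashes vanishing from the path) is
what you would expect if zoo.cfg is loaded with java.util.Properties, which
I believe it is but have not confirmed against the 3.5.1 source. Properties
treats a backslash as an escape character and silently drops it before any
character that is not a recognized escape. The sketch below is a simplified
Python model of that rule only; load_property_value is a hypothetical
helper, and \uXXXX escapes and line continuations are not modeled.

```python
def load_property_value(raw):
    """Simplified model of backslash handling in a Java properties value."""
    known = {"t": "\t", "n": "\n", "r": "\r", "f": "\f"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Known escapes are translated; for anything else the backslash
            # is dropped and the following character is kept as-is.
            out.append(known.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


written = r"D:\zookeeper3-3.5.1\conf\zoo.cfg.dynamic.1"
print(load_property_value(written))

# Forward slashes are not escape characters, so this form survives intact.
print(load_property_value("D:/zookeeper3-3.5.1/conf/zoo.cfg.dynamic.1"))
```

Under this model the backslash form loses its separators, matching the
shape of the path in the FileNotFoundException, while the forward-slash
form is read back unchanged.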


Re: Observer doesn't write transactions?

2016-06-20 Thread Rakesh Radhakrishnan
Hey Gokul,

An Observer will write the transactions into the log file like the other
Follower servers. Could you share the server-side configuration? Also, can
you compare the Observer configuration with that of a Follower in your
cluster to see any differences?

Regards,
Rakesh

On Mon, Jun 20, 2016 at 3:43 PM, Gokul  wrote:

> Hi,
>
> I am using ZK version 3.4.5
>
> I was running a 3 ZK servers locally with an Observer and observed that
> Observer stores each transaction it gets from leader only in in-memory
> snapshot and doesn't write the transaction to transaction log in disk. Is
> this the expected behaviour or am I missing some config parameter?
>


Re: Apache ZooKeeper Meetup - Jan 27, Cloudera HQ

2016-01-21 Thread Rakesh Radhakrishnan
Thank you for organizing this, Flavio! I'm interested in attending remotely.

Best Regards,
Rakesh

On Wed, Jan 20, 2016 at 11:19 PM, Raúl Gutiérrez Segalés <
r...@itevenworks.net> wrote:

> Thanks for setting this up Flavio! See you all there!
>
> On 20 January 2016 at 08:35, Flavio Junqueira  wrote:
>
> > Hello!
> >
> > We are organizing a meetup in the Bay Area next week, and I'd love to see
> > everyone who is in the area there. Please check the event page and don't
> > forget to RSVP:
> >
> > https://www.eventbrite.com/e/apache-zookeeper-meetup-tickets-20906479844
> >
> > Also, it'd be great to have folks speaking about stuff around ZK. If
> > you're interested, let me know and I'll add you to the agenda.
> >
> > See you there!
> >
> > -Flavio
>


Re: zookeeper connection

2016-01-19 Thread Rakesh Radhakrishnan
To enable secure zkclient-zkserver communication, you need to set a number
of configuration properties on both the server and the client side. The
default zoo.cfg configuration file doesn't contain the auth related
settings. I hope the following links will help you understand more about
the configuration.

https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zookeeper+and+SASL
http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_zookeeper_security.html

Regards,
Rakesh

On Tue, Jan 19, 2016 at 1:59 PM, sam mohel <sammoh...@gmail.com> wrote:

> Thanks for replying . Sorry I have another email , how can I check that i'm
> not using a secure client-server communication , I just download zookeeper
> and set configuration in zoo.cfg
>
> And where can I find zookeeper.sasl.client ?
>
> Thanks for your time
>


Re: zookeeper connection

2016-01-19 Thread Rakesh Radhakrishnan
> and what does this statment means
> [INFO] Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will
> not attempt to authenticate using SASL (unknown error)

By default, the SASL client is enabled and the ZooKeeper client tries to use
SASL auth. It can be disabled by setting the system property
"zookeeper.sasl.client" to false. I assume you are not using secure
client-server communication; in that case it's not a problem and you can
ignore this message.

-Rakesh

On Tue, Jan 19, 2016 at 1:00 PM, researcher cs 
wrote:

> I'm new to zookeeper and storm , i'm using zookeeper connection in storm
> i started to run
> bin / .zkServer.sh start
> are this statements means everything is ok ?
>
> 2016-01-19 09:02:40 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2016-01-19 09:02:40 o.a.z.ZooKeeper [INFO] Initiating client connection,
> connectString=127.0.0.1:2181 sessionTimeout=2
> watcher=com.netflix.curator.ConnectionState@4188f17b
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Opening socket connection to
> server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using
> SASL (unknown error)
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Socket connection established
> to 127.0.0.1/127.0.0.1:2181, initiating session
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Session establishment complete
> on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x15258a7ee360010,
> negotiated timeout = 2
> 2016-01-19 09:02:41 b.s.zookeeper [INFO] Zookeeper state update:
> :connected:none
> 2016-01-19 09:02:41 o.a.z.ZooKeeper [INFO] Session: 0x15258a7ee360010
> closed
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2016-01-19 09:02:41 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2016-01-19 09:02:41 o.a.z.ZooKeeper [INFO] Initiating client connection,
> connectString=127.0.0.1:2181/storm sessionTimeout=2
> watcher=com.netflix.curator.ConnectionState@26eaaa40
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Opening socket connection to
> server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using
> SASL (unknown error)
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Socket connection established
> to 127.0.0.1/127.0.0.1:2181, initiating session
> 2016-01-19 09:02:41 o.a.z.ClientCnxn [INFO] Session establishment complete
> on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x15258a7ee360011,
> negotiated timeout = 2
> 2016-01-19 09:02:41 b.s.m.TransportFactory [INFO] Storm peer transport
> plugin:backtype.storm.messaging.netty.Context
>
>
>
> and what does this statment means
> [INFO] Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will
> not attempt to authenticate using SASL (unknown error)
>
> ?
> Should i fix it or not ?
>


Re: 3.4.8 release

2016-01-18 Thread Rakesh Radhakrishnan
Thank you Raul for the initiative.

It would be good if we could include ZOOKEEPER-2247 in this release; it
looks like the proposed patch solves the issue. It probably needs to undergo
a few more review cycles.

Thanks,
Rakesh

On Tue, Jan 19, 2016 at 6:30 AM, Raúl Gutiérrez Segalés  wrote:

> Hi,
>
> I'd like to prepare an RC for 3.4.8 soonish (hopefully, this week). For
> now, this is the only blocker I can see:
>
> ZOOKEEPER-2355: Ephemeral node is never deleted if follower fails while
> reading the proposal packet
>
> Given the magnitude of that bug, I think it should go out with this
> release.
>
> Any other JIRA that should be included?
>
>
> -rgs
>


Re: Cannot find version in repository

2016-01-03 Thread Rakesh Radhakrishnan
Hi Sam,

> but where can i find log of zookeeper ?

Please refer to the "log4j.properties" configuration file. By default it
uses the CONSOLE appender:
zookeeper.root.logger=INFO, CONSOLE

You can try changing this to use ROLLINGFILE.
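For reference, a minimal sketch of that change. It assumes the stock
conf/log4j.properties from the 3.4.x line, which already defines a
ROLLINGFILE appender; the log directory below is only a placeholder, so
adjust it to your setup.

```properties
# conf/log4j.properties (only the lines that change)
# Send the root logger to the rolling file appender instead of the console.
zookeeper.root.logger=INFO, ROLLINGFILE
# Placeholder directory; the stock default is "." (the working directory).
zookeeper.log.dir=/var/log/zookeeper
zookeeper.log.file=zookeeper.log
```

If you start the server through zkServer.sh, the script may pass its own
value for zookeeper.root.logger on the command line, so it is worth checking
zkEnv.sh as well if the file setting does not seem to take effect.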

-Rakesh

On Mon, Jan 4, 2016 at 3:55 AM, sam mohel  wrote:

> Thanks i found process with java for zookeeper but where can i find log of
> zookeeper ?
>
> On Sun, Jan 3, 2016 at 7:02 PM, Ted Yu  wrote:
>
> > Was it possible that QuorumPeerMain didn't start ?
> >
> > Please check the log.
> >
> > Also use 'ps aux' and search for processes using java.
> >
> > Cheers
> >
> > On Sun, Jan 3, 2016 at 8:52 AM, sam mohel  wrote:
> >
> > > thanks for your time , but it is an example not by process
> > > the result with me is just
> > >
> > > 1410 Jps
> > >
> > >
> > > but the example illustarte that i should got like this
> > > 1935 nimbus
> > > 1092 ElasticSearch
> > >  1982 core
> > >  1529 QuorumPeerMain
> > >
> > >
> > > BUT with different numbers
> > >
> > > On Sun, Jan 3, 2016 at 6:49 PM, Ted Yu  wrote:
> > >
> > > > I assumed you downloaded 3.4.6 from:
> > > > https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/
> > > >
> > > > > If you ran zkServer.sh from where 3.4.6 was stored locally, the
> > > > > version would be 3.4.6
> > > > >
> > > > > Alternatively, doing 'ps aux | grep 1529' (the pid for
> > > > > QuorumPeerMain) would show you the path.
> > > >
> > > > Cheers
> > > >
> > > > On Sun, Jan 3, 2016 at 8:34 AM, sam mohel 
> wrote:
> > > >
> > > > > > Thanks, I have now installed 3.4.6 and deleted 3.4.7, but how can
> > > > > > I find out the version of the running zookeeper?
> > > > > >
> > > > > > I mean, I now use ./zkServer.sh start
> > > > > > and zookeeper runs; I want to know its version to make sure
> > > > > > that it's 3.4.6.
> > > > >
> > > > > i noticed when i used jps after ran zookeeper i got
> > > > >
> > > > > 1410 Jps   only
> > > > >
> > > > > which it's supposed i got like
> > > > >
> > > > > 2129 Jps
> > > > >  991 NettyServer
> > > > >  1935 nimbus
> > > > >  1092 ElasticSearch
> > > > >  1982 core
> > > > >  1529 QuorumPeerMain
> > > > >
> > > > >
> > > > > On Sun, Jan 3, 2016 at 6:18 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > > From https://github.com/apache/storm/blob/0.9.x-branch/pom.xml ,
> > > > > > > 3.4.6 is used.
> > > > > > >
> > > > > > > If you build Storm locally, it would download artifacts for 3.4.6
> > > > > > > into local maven repository.
> > > > > > >
> > > > > > > What I meant in previous email was that 3.4.7 was not stable -
> > > > > > > see ZOOKEEPER-2347
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Sun, Jan 3, 2016 at 7:43 AM, sam mohel 
> > > wrote:
> > > > > >
> > > > > > > > I'm sorry, I didn't understand what you meant by the link. I'm
> > > > > > > > just asking: Storm's library folder uses version 3.3.3, and I
> > > > > > > > installed 3.4.7 (or whatever version), but the repository
> > > > > > > > folder still shows 3.3.3, so I need to add the version I
> > > > > > > > installed.
> > > > > > >
> > > > > > > On Sun, Jan 3, 2016 at 5:23 PM, Ted Yu 
> > > wrote:
> > > > > > >
> > > > > > > > > Please see this thread:
> > > > > > > > >
> > > > > > > > > http://search-hadoop.com/m/JhBoa1A4TJl1A62qq=Re+Whither+ZK+3+4+6+
> > > > > > > > >
> > > > > > > > > Suggest switching to 3.4.6 before 3.4.8 comes out.
> > > > > > > >
> > > > > > > > FYI
> > > > > > > >
> > > > > > > > On Sun, Jan 3, 2016 at 5:23 AM, sam mohel <
> sammoh...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://stackoverflow.com/questions/34574276/cannot-find-version-in-repository#
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm new to ubuntu also in zookeeper, I installed
> > storm-0.9.0.1
> > > > with
> > > > > > > > > zookeeper-3.4.7
> > > > > > > > >
> > > > > > > > > But when i checked this path
> > > > > > > > >
> > > > > > > > > .m2/repository/org/apache/zookeeper/
> > > > > > > > >
> > > > > > > > > I found zookeeper version 3.3.3 I need to add my version
> > 3.4.7
> > > > with
> > > > > > all
> > > > > > > > > files it need How can i add it ?
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Thread dump looks OK?

2015-10-23 Thread Rakesh Radhakrishnan
Hi John,

In your thread dump, the EventThread is waiting for a watch notification or
an operation response from the ZooKeeper server. The event thread keeps all
notifications from the ZooKeeper server in a queue
(EventThread#waitingEvents) and blocks until an item is available. So I
don't think this thread dump shows a problem.
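To illustrate, here is a small JDK-only sketch (not ZooKeeper code; the
class, method and thread names are made up for the example) of a consumer
thread blocked in LinkedBlockingQueue.take() on an empty queue. An idle
consumer like this parks and shows up as WAITING, which is the same shape
as the stack trace you posted.

```java
import java.util.concurrent.LinkedBlockingQueue;

final class IdleEventThreadSketch {

    /** Starts a consumer that blocks in take() on an empty queue and returns the state it settles in. */
    static Thread.State stateOfIdleConsumer() {
        // Stand-in for EventThread#waitingEvents; nothing is ever enqueued here.
        final LinkedBlockingQueue<Object> waitingEvents = new LinkedBlockingQueue<>();

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    waitingEvents.take(); // parks until an event arrives
                }
            } catch (InterruptedException e) {
                // interrupted by the caller below: just exit
            }
        }, "sketch-EventThread");
        consumer.setDaemon(true);
        consumer.start();

        // Give the consumer up to five seconds to park inside take().
        long deadline = System.currentTimeMillis() + 5000;
        while (consumer.getState() != Thread.State.WAITING
                && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        Thread.State observed = consumer.getState();
        consumer.interrupt(); // let the sketch thread finish
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("idle consumer state: " + stateOfIdleConsumer());
    }
}
```

A thread in this state is idle, not stuck: it consumes no CPU and wakes up
as soon as something is put on the queue.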

Are you facing any issue in your cluster?

Regards,
Rakesh


On Sat, Oct 24, 2015 at 6:34 AM, John Lindwall 
wrote:

> We use zookeeper 3.4.6
>
> Our webapp running in a tomcat server suffered a condition where user
> sessions seemed to "freeze". We got a single thread dump at that time. I am
> curious about a particular stacktrace that is seen in the thread dump,
> related to zookeeper.
>
> Does the following stacktrace cause any alarm or concern?  Thanks!
>
> "ajp-bio-8012-exec-561-EventThread" daemon prio=3 tid=0x000105a46000
> nid=0x11fa waiting on condition [0xfffd967ff000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x11fd9110> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
>
>


Re: Reasonable config for ZK

2015-10-08 Thread Rakesh Radhakrishnan
Hi Vikrant,

> I am wondering what should be a reasonable tick time (and other critical
> config) I should keep for the same? right now it is 500ms and I do see some
> connection loss occasionally.

Do you see any disturbances in the ZooKeeper quorum, i.e. any exceptions in
the communication between the followers and the leader?

If you are seeing connection loss exceptions only on the client side and
there are no exceptions in the quorum's server-to-server communication, then
you could check the following on your client:

1) the session timeout value (please share the configured session timeout)
2) any GC pauses, or anything else that could delay the ZooKeeper
client-to-server heartbeats.
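On point 1, note that the server does not necessarily grant the timeout the
client asks for. As a rough sketch, assuming minSessionTimeout and
maxSessionTimeout are left at their defaults (2 x tickTime and 20 x
tickTime), the negotiated value is the requested one clamped into that
range. The class and method names below are only for illustration.

```java
final class SessionTimeoutSketch {

    /** Clamps the requested timeout into [2 * tickTime, 20 * tickTime] (the default bounds). */
    static int negotiatedTimeoutMs(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        int tickTimeMs = 500; // the tick time mentioned in this thread
        for (int requested : new int[] {500, 5000, 30000}) {
            System.out.println("requested " + requested + " ms -> negotiated "
                    + negotiatedTimeoutMs(requested, tickTimeMs) + " ms");
        }
    }
}
```

With tickTime=500 this means a client cannot get a session timeout above
10 seconds unless maxSessionTimeout is raised explicitly, so it is worth
comparing what your client requests with what the server log reports as the
negotiated timeout.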

Also, would be helpful if you can share the logs.

-Rakesh

On Fri, Oct 9, 2015 at 4:09 AM, vikrant singh 
wrote:

> Hello All,
> I have a 3 server cluster running on virtual machines with 4 core Intel
> processor with 25gb RAM.
> MaxLatency on servers are around 200-500 ms. Another similar cluster with
> more ephemeral nodes (and watches) has MaxLatency of around 2 secs.
>
> I am wondering what should be a reasonable tick time (and other critical
> config) I should keep for the same? right now it is 500ms and I do see some
> connection loss occasionally.
>
> If it helps, I use curator for creating these connection and most of them
> are ephemeral.
> Thanks,
> Vikrant
>


Re: 3-server Zab cluster

2015-10-01 Thread Rakesh Radhakrishnan
Hi Ibrahim,

The example below is taken from your earlier mail thread.

> 1. leader  (L)  sends a proposal p with zxid =10 to F1 and F2.
> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
> crashes before receiving P10. L has not received any ACKs

My thoughts on the above scenario:

In your case, the zk client sees a successful response from F1. Then assume
F2 rejoins the quorum first and L becomes the leader again. The newly formed
quorum will not have the zxid=10 transaction. That makes the cluster
inconsistent, doesn't it?

Apart from the above case, I don't see any other problems with a 3-node
cluster. The data loss case above can be ruled out by assuming that more
than the tolerated number of server failures may affect cluster
consistency and result in data loss. But I feel this optimization would
hit more such cases if we scale the cluster beyond 3 servers. For now, I'm
not thinking in that direction, as your case is limited to a 3-node cluster.
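To make the cluster-size point concrete, here is a minimal sketch of the
majority arithmetic. It is not ZooKeeper's own quorum code; the class and
method names are mine.

```java
final class MajorityQuorumSketch {

    /** Smallest number of servers that forms a strict majority. */
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** The leader counts itself, so it needs this many follower ACKs to commit. */
    static int followerAcksNeeded(int ensembleSize) {
        return quorumSize(ensembleSize) - 1;
    }

    /** Number of server failures the ensemble can tolerate while keeping a majority. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5}) {
            System.out.println(n + " servers: quorum=" + quorumSize(n)
                    + ", follower ACKs needed=" + followerAcksNeeded(n)
                    + ", tolerated failures=" + toleratedFailures(n));
        }
    }
}
```

With 3 servers a single follower ACK plus the leader is already a majority,
which is what the proposed optimization relies on; with 5 servers one ACK is
not enough. The scenario above has two servers (F1 and F2) crashing, which
is more than the one failure a 3-server ensemble tolerates.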

Regards,
Rakesh


On Tue, Sep 29, 2015 at 2:28 PM, Ibrahim El-sanosi (PGR) <
i.s.el-san...@newcastle.ac.uk> wrote:

> Yes Alex, in my post I mentioned that this (small) optimization can only
> work with 3-servers cluster.
>
> Who could confirm the optimization can work?
>
> Ibrahim
>
> -Original Message-
> From: Alexander Shraer [mailto:shra...@gmail.com]
> Sent: Tuesday, September 29, 2015 12:11 AM
> To: user@zookeeper.apache.org
> Subject: Re: 3-server Zab cluster
>
> I'm not 100% sure whether operations that were pending on the leader are
> sent out during sync when this leader loses quorum and is re-elected. If
> so, then maybe you're right. But in any case, this would not work for 5 or
> more servers...
>
> On Mon, Sep 28, 2015 at 3:51 PM, Ibrahim El-sanosi (PGR) <
> i.s.el-san...@newcastle.ac.uk> wrote:
>
> > Thank you Alex for replying.
> >
> > When you said "the leader gets re-elected and the operation is
> > truncated from logs at other servers", I thought the new leader would
> > sync its log with the other followers (synchronization phase), so that
> > the operation would be committed by the new quorum. Let me lay out the
> > scenario as steps:
> >
> > 1. leader  (L)  sends a proposal p with zxid =10 to F1 and F2.
> > 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
> > crashes before receiving P10. L has not received any ACKs
> >
> > Possible solution  (1)
> > The leader will move to the LOOKING phase as there is no quorum supporting
> > its leadership. Now assume F2 wakes up. F2 forms a quorum with L (the
> > previous leader), and L becomes the new leader again as it has the latest
> > zxid (10) in its log. L syncs its state with F2; as a result L, F1 (before
> > crashing) and F2 commit P10.  Is that correct?
> >
> > Possible solution  (2)
> > The leader will move to the LOOKING phase as there is no quorum supporting
> > its leadership. Now assume F1 (with zxid = 10 committed) wakes up. I
> > am not sure who should be the leader (F1 with zxid = 10 committed, or L,
> > the previous leader, with zxid = 10 logged). I think F1 becomes the new
> > leader as it has zxid = 10 committed. F1 forms a quorum with L (the
> > previous leader) and becomes the new leader as it has the latest zxid (10).
> > F1 (the new leader) syncs its state with L (the previous leader, now a
> > follower); as a result zxid 10 is committed by the new quorum.  Is that
> > correct?
> >
> > What do you think?
> >
> > Ibrahim
> >
> >
> >
> >
> >
> > -Original Message-
> > From: Alexander Shraer [mailto:shra...@gmail.com]
> > Sent: Monday, September 28, 2015 07:27 PM
> > To: user@zookeeper.apache.org
> > Cc: d...@zookeeper.apache.org
> > Subject: Re: 3-server Zab cluster
> >
> > Committing locally when sending an ACK at a server would lead to loss
> > of consistency - it is possible that this is the only server that
> > acks, e.g., this server is temporarily disconnected from the leader,
> > the leader gets re-elected and the operation is truncated from logs at
> > other servers. It's OK to ACK it, but it's not OK to commit, since that
> > exposes it to users as a committed operation that they can see.
> >
> > On Mon, Sep 28, 2015 at 4:19 AM, Ibrahim El-sanosi (PGR) <
> > i.s.el-san...@newcastle.ac.uk> wrote:
> >
> > > In Zab, assume we have a cluster consisting of 3 servers. To deliver a
> > > write request, it must run 3 communication steps: proposal,
> > > acknowledgement and commit.
> > > As Zab uses reliable FIFO, it is possible to remove the commit round.
> > > As soon as a follower receives a proposal, it logs it, sends an ACK and
> > > commits locally. Upon receiving an ACK from any follower, the leader
> > > commits the proposal locally; no COMMIT message needs to be sent to
> > > followers. In this case, all servers commit a proposal in two
> > > round-trips, which reduces latency, particularly at followers.
> > >
> > > Note that this optimization can only work in 3-servers cluster
> > > (follower reaches a majority as soon as it acks).
> > > Does anyone see any problems with such 

Re: [ANNOUNCE] New committer: Chris Nauroth

2015-09-28 Thread Rakesh Radhakrishnan
Welcome Chris, thanks for all your great work and congrats!

-Rakesh

On Mon, Sep 28, 2015 at 8:11 PM, Flavio Junqueira  wrote:

> The Apache ZooKeeper PMC is pleased to announce that Chris Nauroth has
> accepted to become a committer. Chris has been a great contributor and very
> active in the community.
>
> Congrats, Chris!
>
> -Flavio


Re: Purge zookeeper data best method?

2015-06-20 Thread Rakesh Radhakrishnan
Hi Tim,

 (1) Using PurgeTxnLog
This is a utility that ships with ZooKeeper. An administrator can schedule it
as a cron job on the ZooKeeper server machine.

 (2) Setting zoo.cfg with:
This enables automatic purging of snapshots and the corresponding
transaction logs.

You can read this section for more detail:
http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup

You can achieve the desired result with either option. Internally, both
do the cleanup through the PurgeTxnLog utility. The only difference is that
(1) is scheduled externally and (2) runs automatically.

IMHO, you can go for automatic purging, so that admins don't need to
maintain a cron job.
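To give a feel for what "retain count" means, here is a simplified sketch of
the selection step. It is not the actual PurgeTxnLog implementation: it only
assumes that snapshot files are named snapshot.<zxid in hex>, keeps the
newest N, and reports the rest as candidates. The real utility also has to
keep the transaction logs that the retained snapshots depend on. All names
below are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SnapshotRetentionSketch {

    /** Parses the hex zxid suffix of a name such as "snapshot.1a". */
    static long zxidOf(String fileName) {
        return Long.parseLong(fileName.substring(fileName.indexOf('.') + 1), 16);
    }

    /** Returns the snapshot names that fall outside the newest retainCount, newest first. */
    static List<String> purgeCandidates(List<String> snapshots, int retainCount) {
        List<String> sorted = new ArrayList<>(snapshots);
        Comparator<String> oldestFirst = Comparator.comparingLong(SnapshotRetentionSketch::zxidOf);
        sorted.sort(oldestFirst.reversed());
        if (sorted.size() <= retainCount) {
            return new ArrayList<>();
        }
        return new ArrayList<>(sorted.subList(retainCount, sorted.size()));
    }

    public static void main(String[] args) {
        List<String> snapshots = Arrays.asList(
                "snapshot.5", "snapshot.1a", "snapshot.2f",
                "snapshot.c", "snapshot.40", "snapshot.33");
        // Same retain count as autopurge.snapRetainCount=5 in the question.
        System.out.println("purge candidates: " + purgeCandidates(snapshots, 5));
    }
}
```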

Best Regards,
Rakesh

On Sat, Jun 20, 2015 at 6:17 PM, Tim Molter tim.mol...@gmail.com wrote:

 I found two methods for purging zookeeper data

 1) Using PurgeTxnLog

 2) Setting zoo.cfg with:

 # Enable regular purging of old data and transaction logs every 24 hours
 autopurge.purgeInterval=24
 autopurge.snapRetainCount=5

 Do both do the same thing? Which one should I use?



Re: Zookeeper port 3181

2015-04-22 Thread Rakesh Radhakrishnan
Hi Eyal,

ZooKeeper and BookKeeper are two different services.

Are you trying to set up only a ZooKeeper cluster? Can you share the
configuration and the command you use to start the servers?

Regards,
Rakesh

On Wed, Apr 22, 2015 at 1:17 PM, Eyal Bar eyal@kenshoo.com wrote:

 Hi,

 The bookkeeper service listens on port 3181 as I read from the
 documentation (
 http://zookeeper.apache.org/doc/r3.3.6/bookkeeperStarted.html).
 The question is why is this port only open on 1 out of 3 installed
 zookeeper servers. Is there always one live bookkeeper out of the 3
 zookeeper?

 CDH5 = Cloudera Hadoop version 5 not Cassandra.

 Best,
 On Apr 21, 2015 5:57 PM, Flavio Junqueira fpjunque...@yahoo.com.invalid
 
 wrote:

  I'm confused, you refer to bookie, is it about bookkeeper?
 
  Also, you may want to ask Cloudera folks directly about their
  distribution, you're likely to get a better answer.
 
  -Flavio
 
  -Original Message-
  From: Eyal Bar eyal@kenshoo.com
  Sent: ‎4/‎21/‎2015 2:27 PM
  To: user@zookeeper.apache.org user@zookeeper.apache.org
  Subject: Zookeeper port 3181
 
  Hi,
 
  I have CDH5 installed with an HA configuration, and part of this
  installation are 3 Zookeeper servers.
 
  I have noticed that port 3181, which the bookie uses to listen for
  connection requests from clients, is open on only 1 out of the 3
  installed Zookeeper servers.
 
  Do any of you know why port 3181 isn't open on *all* 3 Zookeepers?
 
  Thanks,
 
  --
  *[ Eyal Bar ]*
  MySQL and Cassandra Database Administrator - Infrastructure Team  //
  *Kenshoo*
  *Office* +972 (3) 746-6500 x473 // *Mobile* +972 (52) 458-6100
  *eyal@kenshoo.com eyal@kenshoo.com*
   eyal.has...@kenshoo.com* eyal.has...@kenshoo.com*
  ___
  *www.Kenshoo.com* http://kenshoo.com/
 
  --
  This e-mail, as well as any attached document, may contain material which
  is confidential and privileged and may include trademark, copyright and
  other intellectual property rights that are proprietary to Kenshoo Ltd,
   its subsidiaries or affiliates (Kenshoo). This e-mail and its
  attachments may be read, copied and used only by the addressee for the
  purpose(s) for which it was disclosed herein. If you have received it in
  error, please destroy the message and any attachment, and contact us
  immediately. If you are not the intended recipient, be aware that any
  review, reliance, disclosure, copying, distribution or use of the
 contents
  of this message without Kenshoo's express permission is strictly
  prohibited.
 




Re: Heartbeats not being received / responded to?

2015-01-21 Thread Rakesh Radhakrishnan
Hi,

Frustratingly, this behavior is very inconsistent.  In fact, after 5+
failures in a row on our UX lead's machine, now I just heard from him that
he tried again and it worked.  Could it be that there is some kind of state
on Zookeeper that needs to timeout and disappear before things will work?


In the meantime, could you look at this JIRA:
https://issues.apache.org/jira/browse/ZOOKEEPER-1871
I feel it is along similar lines. If so, can you try the patch and see
whether it passes consistently?


-Rakesh



On Thu, Jan 22, 2015 at 12:55 AM, Camille Fournier cami...@apache.org
wrote:

 Do you have the logs from the zk server? What type of machines are you
 running this on? Which version of ZK?

 Thanks,
 C

 On Wed, Jan 21, 2015 at 2:00 PM, Ian Rose ianr...@fullstory.com wrote:

  Hi all -
 
  Reviving an old thread here.  We have been running Zookeeper for a few
  months now and this problem has continued to dog us.  This is entirely a
  localhost problem (when zookeeper and the client app are both running on
  a developer laptop) - thankfully everything has been solid on our
  production servers.  Nonetheless this causes major headaches in our
  development process.  For example, I just now debugged a problem our UX
  lead was having getting our dev stack up and running on his local
  machine; the problem was that our startup script was failing when using
  zkCli.sh to create a /solr node in ZK.  This would not be a problem if it
  was a rare/spurious error, but he is getting this error every single time.
 
  For reference, here is the complete stack trace (which appears after
  several seconds - the failure is not immediate):
 
  Exception in thread "main"
  org.apache.zookeeper.KeeperException$ConnectionLossException:
  KeeperErrorCode = ConnectionLoss for /solr
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
  at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:695)
  at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:588)
  at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:360)
  at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
  at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
 
  Frustratingly, this behavior is very inconsistent.  In fact, after 5+
  failures in a row on our UX lead's machine, now I just heard from him
  that he tried again and it worked.  Could it be that there is some kind
  of state on Zookeeper that needs to timeout and disappear before things
  will work?
 
  More concretely, the next time this happens, what kind of info should I
  capture to make this more debuggable?
 
  Many thanks,
  Ian
 
 
 
  On Mon, Sep 29, 2014 at 4:51 PM, Ian Rose ianr...@fullstory.com wrote:
 
   Well unfortunately although I am able to repro this problem with a
   completely new, clean ZK install on my laptop, I *cannot* repro with
   the same on my coworker's laptop.  So unfortunately I am forced to
   conclude that there is something strange going on locally.
  
   Thanks for the help anyhow!
  
   - Ian
  
  
   On Mon, Sep 29, 2014 at 9:40 AM, Camille Fournier cami...@apache.org
   wrote:
  
   No that is not expected. Odd that you get disconnected once and then
   reconnect fine. Does the same thing happen in your kazoo clients, one
   disconnect but then the second connect is ok?
   Which version of ZK are you running? Are you running this with some
   sort of auth or password to the zk server?
  
   Thanks,
   C
  
   On Mon, Sep 29, 2014 at 9:24 AM, Ian Rose ianr...@fullstory.com
  wrote:
  
    Perhaps I'm simply misunderstanding what the expected behavior would
    be. Why would my client be disconnected?  Does zookeeper drop idle
    clients? And note that this isn't a spurious disconnect; my client is
    *always* dropped at that time.

    On Friday zkCli seemed to be working just fine for me, but now I am
    getting disconnections similar to my kazoo-based client.

    Here is a session showing `ls /' failing.  This behavior is
    reproducible for me currently.
   
$ ./bin/zkCli.sh
 Connecting to localhost:2181
 Welcome to ZooKeeper!
 JLine support is enabled
 [zk: localhost:2181(CONNECTING) 0]
 WATCHER::
 WatchedEvent state:SyncConnected type:None path:null
 [zk: localhost:2181(CONNECTED) 0]
 [zk: localhost:2181(CONNECTED) 0] ls /
 WATCHER::
 WatchedEvent state:Disconnected type:None path:null
 Exception in thread main
 org.apache.zookeeper.KeeperException$ConnectionLossException:
 KeeperErrorCode = ConnectionLoss for /
 at
   org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
 at
   org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at 

Re: Apache BookKeeper TLP!

2014-11-23 Thread Rakesh Radhakrishnan
Great news! Congrats to all the contributors and users for the continuous
effort.


Best Regards,
Rakesh

On Sun, Nov 23, 2014 at 7:11 PM, Jiannan Wang jian...@yahoo-inc.com wrote:

 Awesome, wonderful news! Hope the BookKeeper project could have deeper
 impact to the industry and go further.

 Regards,
 Jiannan

 On 11/23/14, 6:26 PM, Flavio Junqueira fpjunque...@yahoo.com.INVALID
 wrote:

 I just wanted to share that the board approved our proposal to make
 Apache BookKeeper a top-level project. I'm thrilled with the news and I
 want to thank everyone for the support, especially the ZooKeeper
 community for hosting the project. The BookKeeper community will start
 working on the transition right away.
 
 -Flavio




Re: BookKeeper vs. Bookkeeper

2014-10-25 Thread Rakesh Radhakrishnan
+1 for BookKeeper :)

HDFS has a contrib module named BKJM (BookKeeperJournalManager), so I think
a bunch of people are already familiar with the BK acronym.

-Rakesh

On Thu, Oct 23, 2014 at 10:17 PM, Flavio Junqueira fpjunque...@yahoo.com
wrote:

 +1 for BookKeeper. I like it because it makes clear where the acronym BK
 comes from, and it creates a link to ZooKeeper, a system we rely on. It is
 actually not uncommon that project names violate capitalization or grammar
 rules in general, so it really doesn't bother me. Not to mention that we
 have been using it this way for quite a long time.

 -Flavio


   On Thursday, October 23, 2014 10:01 AM, Ivan Kelly iv...@apache.org
 wrote:



 Hi folks,

 Now that we are going top level, we have a chance to fix something
 that has been bugging me since I started working on the project: the
 superfluous capitalization of K in the name. The word 'bookkeeper'
 is a single word. The norms of capitalization, in written language
 and also in CamelCase, dictate that only the first letter of a word
 should ever be capitalized. What's more, since people are familiar
 with this, they lean towards doing this, only remembering to add the K
 some of the time. So we end up in the following situation.

 ivank@trainmoney-ll ~/src/bookkeeper Thu Oct 23 10:45:31 [0 jobs] [hist 10099]
 $ find . -name '*.java' -exec grep Bookkeeper {} \; | wc -l
 763
 ivank@trainmoney-ll ~/src/bookkeeper Thu Oct 23 10:45:36 [0 jobs] [hist 10100]
 $ find . -name '*.java' -exec grep BookKeeper {} \; | wc -l
 682

 In the code, Bookkeeper is actually more common.

 The word bookkeeper is hard enough to type, with 3 consecutive double
 letters, without having a random capitalization in the middle.

 Does this annoy anyone else? Should we kill the K?

 Regards
 Ivan





Re: Latency in asynchronous mode

2014-10-23 Thread Rakesh Radhakrishnan
Hi Ibrahim,

For the async tests, could you share details such as:

* number of clients
* number of threads
* size of the data stored in each znode

Also, it would be good to monitor:

1) JVM stats (one way is through JMX) such as heap usage and GC activity, to
see whether the latency spikes correspond to GC activity.

2) Disk statistics. Since you suspect fsync, I think iostat would help;
for example, $ iostat -d -x 2 10 reports extended per-device statistics,
including disk latency.

3) CPU usage, through the top or sar unix commands. I haven't used sar, but
as far as I can see it gives more detail, such as the percentage of time the
CPU is idle while a process waits for block I/O.
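On the queueing explanation Alexander gives in the quoted reply, Little's
law gives a rough feel for the effect: average time in the system is about
the number of requests in flight divided by the throughput. The sketch below
is only back-of-the-envelope arithmetic; the throughput and in-flight
numbers are hypothetical, not measurements from your cluster.

```java
final class QueueingLatencySketch {

    /** Little's law: average latency (ms) = requests in flight / throughput (requests per second). */
    static double avgLatencyMs(int inFlight, double throughputPerSec) {
        return inFlight * 1000.0 / throughputPerSec;
    }

    public static void main(String[] args) {
        double throughputPerSec = 2000.0; // hypothetical sustained write rate
        // One synchronous client keeps a single request outstanding.
        System.out.println("1 in flight:   " + avgLatencyMs(1, throughputPerSec) + " ms");
        // An asynchronous client can keep hundreds outstanding at once.
        System.out.println("700 in flight: " + avgLatencyMs(700, throughputPerSec) + " ms");
    }
}
```

So a high average latency in async mode does not by itself mean that each
write is slow on disk; it can simply reflect how many requests are queued
ahead of each one. Knowing the number of outstanding requests per client
would help tell the two apart.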


-Rakesh


On Thu, Oct 23, 2014 at 6:44 PM, Alexander Shraer shra...@gmail.com wrote:

 Maybe it's due to queueing at the leader in asynchronous mode. If your
 experiment has one client in sync mode, the leader has just one op in
 the queue at a time.
 On Oct 23, 2014 1:57 PM, Ibrahim i.s.el-san...@newcastle.ac.uk wrote:

  Hi folks,
 
  I am testing ZooKeeper latency in asynchronous mode. I am sending update
  (write) requests to a ZooKeeper cluster that consists of 5 physical
  ZooKeeper servers.
 
  When I run the stat command I get high latency like:
  Latency min/avg/max: 7/339/392
  Latency min/avg/max: 1/371/627
  Latency min/avg/max: 1/371/627
  Latency min/avg/max: 1/364/674
  I guess such high latency corresponds to fsync (batched requests). But I
  would appreciate it if someone could help me and explain this behaviour.
 
  However, testing ZooKeeper in synchronous mode gives me reasonable
  results like:
  Latency min/avg/max: 6/24/55
  Latency min/avg/max: 7/22/61
  Latency min/avg/max: 7/30/65
 
  Note that the latency is measured in milliseconds.
 
  I look forward to hearing from you.
 
  Ibrahim
 
 
 
 
 
 
 
  --
  View this message in context:
 
 http://zookeeper-user.578899.n2.nabble.com/Latency-in-asynchronous-mode-tp7580446.html
  Sent from the zookeeper-user mailing list archive at Nabble.com.
 



Re: [VOTE] Taking Bookkeeper Subproject to a Top Level Project

2014-10-19 Thread Rakesh Radhakrishnan
+1, Great!


Best Regards,
Rakesh

On Fri, Oct 17, 2014 at 11:58 PM, Michi Mutsuzaki mi...@cs.stanford.edu
wrote:

 +1

 On Fri, Oct 17, 2014 at 8:55 AM, Greg Asta greg.a...@omnigon.com wrote:
  +1.
 
  -Greg
 
  -Original Message-
  From: Flavio Junqueira [mailto:fpjunque...@yahoo.com.INVALID]
  Sent: Friday, October 17, 2014 9:35 AM
  To: bookkeeper-user@zookeeper.apache.org;
 bookkeeper-...@zookeeper.apache.org; zookeeper dev list; zookeeper user
 list
  Subject: Re: [VOTE] Taking Bookkeeper Subproject to a Top Level Project
 
  +1, it is about time to go TLP.
 
  -Flavio
 
 
  On Friday, October 17, 2014 2:33 PM, Ivan Kelly iv...@apache.org
 wrote:
 
 
 
 
 Hi folks,
 
 I'd like to run a community vote on converting the Bookkeeper
 subproject of Apache Zookeeper into an Apache top level project.
 
 To refresh you on the status of the subproject currently.
 - We've made 7 releases since becoming a subproject
 - We have production use cases in 3 major companies (that we know of,
 there may be more)
 - We have active committers in 4 major companies
 
 Going top level is an opportunity to increase our visibility which will
 help attract more committers and use cases.
 
 We will be roughly following the incubator process, though it will be
 slightly different as we're a subproject and not in incubator itself.
 We still need to work out the details but in any case, a community vote
 is the first step.
 
 The vote will be a lazy consensus vote. Zookeeper PMC members have
 veto. The vote will be open until Wednesday 22nd October, 18:00 GMT.
 
 Regards
 Ivan
 
 
 



Re: Log dir locked after ZK shutdown

2014-09-02 Thread Rakesh Radhakrishnan
Hi Stevo Slavic,

I assume you are experimenting with a standalone server. Please see
ZooKeeperServerMain#runFromConfig() for how a standalone server is
started and shut down.  You can also refer to ZooKeeperServerMainTest for
standalone tests.

Yes, ZooKeeperServer#shutdown does not release all resources; it only
clears the database. One reason I can see: when quorum is lost, before
entering the next leader election phase, the server only clears the data
structures maintained in its database by calling #shutdown. It keeps the
I/O resources open, as re-establishing them is costly.
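The effect you hit on Windows can be illustrated with plain JDK code: a file
that still has an open handle cannot be deleted there, while Linux and Mac
allow it. The sketch below is not ZooKeeper code (the class name is made up,
and the directory layout just mirrors the one in your mail); it shows the
order that works on every platform, i.e. close the handle first and delete
afterwards. In your test the equivalent of the close step is what you found
by experiment: closing the server's database (presumably via
ZKDatabase#close()) after shutdown.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CloseBeforeDeleteSketch {

    /** Writes a stand-in log file, closes it, then removes the whole directory tree. */
    static boolean writeCloseAndDelete() {
        try {
            Path dataDir = Files.createTempDirectory("zookeeper-");
            Path versionDir = Files.createDirectory(dataDir.resolve("version-2"));
            Path logFile = versionDir.resolve("log.1");

            // try-with-resources releases the file handle before any delete is attempted.
            try (OutputStream out = Files.newOutputStream(logFile)) {
                out.write(new byte[] {1, 2, 3});
            }

            Files.delete(logFile);
            Files.delete(versionDir);
            Files.delete(dataDir);
            return Files.notExists(dataDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("data directory removed: " + writeCloseAndDelete());
    }
}
```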

Regards,
Rakesh


On Tue, Sep 2, 2014 at 10:06 PM, Stevo Slavić ssla...@gmail.com wrote:

 Found out through experiment - after shutdown I also had to close the ZK
 server database.

 I wonder why this isn't part of the server shutdown? ZooKeeperServer
 shutdown currently (ZK 3.4.6) calls only clear on the database.

 Kind regards,
 Stevo Slavic.


 On Tue, Sep 2, 2014 at 5:19 PM, Stevo Slavić ssla...@gmail.com wrote:

  Hello ZooKeeper community,
 
  In ZK (3.4.6) related tests, if I try to delete the ZK data directory
  immediately after calling shutdown on the server connection factory, I
  get an IOException that a file could not be deleted, but only on Windows.
  On Linux and Mac the same code works well.
 
  I'm creating a temporary directory using Java 7
  Files.createTempDirectory("zookeeper-")
  resulting in directory similar to the following:
 
  C:\Users\Foo\AppData\Local\Temp\zookeeper-4563523978878660799\
 
  Then ZooKeeper is configured to use this as data directory.
 
  To delete it I'm using FileUtils.deleteDirectory from commons-io (2.4)
  It walks the directory tree cleaning it up. It fails deleting log file
 
 
 
 C:\Users\Foo\AppData\Local\Temp\zookeeper-4563523978878660799\version-2\log.1
 
  Is there something else I have to call to make sure
  ZooKeeperServer/ServerCnxnFactory is stopped and released all locks?
  If not, is this a known bug/feature?
 
 
  Kind regards,
  Stevo Slavic.
 



Re: Zookeeper for storing huge centralize data and Max possible No of zNodes - Please advise

2014-08-15 Thread Rakesh Radhakrishnan
Adding one more: you can also explore Apache BookKeeper for storing millions
of K/V entries. Since it uses the filesystem to store the ledgers and their
entries, data size won't be a constraint. But it doesn't have watch
notifications.

Regards,
Rakesh




On Fri, Aug 15, 2014 at 10:18 PM, Alvaro Gareppe agare...@gmail.com wrote:

 It doesn't seem to be the best use of zookeeper. Maybe redis is a better
 tool for you, or some NoSQL db like Cassandra or mongo is going to fit a
 little bit more.




 On Fri, Aug 15, 2014 at 3:41 AM, saurabh jain sauravma...@gmail.com
 wrote:

  Hi Folks,
 
  I am planning to use ZooKeeper znodes to store my key value data.
  Name of the znode will be my key and data present inside it will be my
  value.
 
 
  The problem is I can have millions of key/value pairs.
 
  Is ZooKeeper recommended for solving this type of problem? I read that
  ZooKeeper is not an actual file system but a splitted file system, and that
  it should be used only as a distributed coordination service.
 
  My requirement is something like this that I need a global place to store
  these key values so that all the jobs can access it and even if some job
  create a new znode then my other jobs can see these changes with the help
  of watcher.
 
  Would you guys recommend using ZooKeeper for above problem statement ?
 
  I have also read in one of the mail archives about the limitation on the
  max no of znodes. If this limitation still exists then maybe this solution
  won't work in my case.
 
 
 
 http://zookeeper-user.578899.n2.nabble.com/Question-regarding-the-maximum-number-of-ZNODES-a-zookeeper-td6979604.html
 
  Please advise
 
  Many Thanks
  Saurabh
 



 --
 Ing. Alvaro Gareppe
 agare...@gmail.com



Re: Best way to bundle zk in our product

2014-06-26 Thread Rakesh Radhakrishnan
 And can I connect to ZK using zkCli if I do embedded mode.

Yes, you can connect to the ZK server using zkCli. There is no
difference between embedded mode and a separate JVM.


-Rakesh


On Thu, Jun 26, 2014 at 8:12 PM, Lahiru Gunathilake glah...@gmail.com
wrote:

 HI Rakesh,

 Thanks for your response. If ZK calls System.exit() my application will
 immediately crash I guess. And can I connect to ZK using zkCli if I do
 embedded mode?

 I believe answer is NO...

 I have a feeling cleanest way is to ask users to download ZK and start or
 bundle ZK jars with all scripts and always start as a separate JVM.

 Regards
 Lahiru


 On Thu, Jun 26, 2014 at 10:20 AM, Rakesh R rake...@huawei.com wrote:

  Hi Lahiru,
 
  I had embedded ZooKeeper server without any issues but not in the
  production cluster. I feel you can use it for the day-to-day development
  phase. It's being used in ZK unit tests, please refer to
  org.apache.zookeeper.test.QuorumUtil or
  org.apache.zookeeper.server.quorum.QuorumPeerTestBase.
 
  I've noticed a few cases that may be useful to you.
 
  1) ZK server code has System.exit() which may affect your service.
  2) Better to redirect ZK logs to a separate log file which will help in
  debugging issues independently.
  3) Observe the network traffic and GC, this may affect ZK server
  communications and resulting in failures.
  4) In general, it would be difficult to restart your service or ZK server
  without affecting each other.
 
 
  -Rakesh
 
  -Original Message-
  From: Lahiru Gunathilake [mailto:glah...@gmail.com]
  Sent: 26 June 2014 18:50
  To: user@zookeeper.apache.org
  Subject: Best way to bundle zk in our product
 
  Hi All,
 
  With all the community help I was able to integrate ZK to Apache
  Airavata[1] to achieve fault-tolerance and it was a very interesting
  experience to work with ZK. It works as it explains without any issue.
 
  Now I have an issue with how to bundle and ship it. Currently what I have
  asked the community to do is start a ZK instance and then run our services.
  Personally I like that approach; it's much cleaner, and in production we can
  cluster both Airavata and ZK. But out of curiosity I want to know whether
  there is a better way to bundle it, like an embedded ZK which is stable
  enough for day-to-day development.
 
  [1]
  Regards
  Lahiru
 
  --
  System Analyst Programmer
  PTI Lab
  Indiana University
 



 --
 System Analyst Programmer
 PTI Lab
 Indiana University



Re: renaming a znode

2014-06-17 Thread Rakesh Radhakrishnan
ZK is designed to be highly performant. Probably you can test after
finalizing your client side algo.

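Also, you can try the multi API to group operations and send them as a single
request to the server. It helps to reduce client-to-server interactions.
For example:
// ops are applied atomically; the ACL and empty data are placeholders
List<Op> ops = new ArrayList<Op>();
// version -1 matches any version of the znode
ops.add(Op.delete("/writer/my_current1", -1));
ops.add(Op.create("/writer/my_current2", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT));
zk.multi(ops);

AFAIK there are no discussions related to renaming a znode.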

Regards,
Rakesh



On Tue, Jun 17, 2014 at 2:10 PM, Mudit Verma mudit.f2004...@gmail.com
wrote:

 Thanks Rakesh. I will have to do something like that, but I am quite
 concerned about the performance issues.

 BTW, are you aware if this renaming functionality is planned for future
 releases?

 Thanks
 Mudit

 On 16 Jun 2014, at 04:30 pm, Rakesh R rake...@huawei.com wrote:

  Hey Mudit,
 
  Renaming is not possible.
 
  Secondly, do you mean all these entries are at same level, under
 /map_current znode?
  I'd prefer multiple levels instead of having too many direct children of
 a single znode. ZOOKEEPER-1162 is one such example case study of too many
 children.
 
  /map_current/entry1
  /map_current/entry2
  ...
  /map_current/entryn
 
 
  I think, JZ is suggesting the following way:
 
  Step1) Get all the children of /map_current   // first op(s): the number
  of ops depends on the number of levels
 
  Step2) Prepare the list of transactions and submit it to the server   // second op
    - create /map_frozen/entry1 ... create /map_frozen/entryn. Ops
      should be ordered from parent to child.
    - delete /map_current/entry1 ... delete /map_current/entryn. Delete
      in reverse order, from child to parent.
 
  Multi send these ops as a single request, but if we have millions of ops
 in a single req it can make the request too heavy. I haven't tested though.
 
 
 
 
  I'm thinking an alternate approach to avoid the bulk delete and
 creation. Does this work for you?
 
  /writer/my_current1   - Clients will always look here and get the
 'current writer' node. Now he can create entries as follows.
 
  /my_current1/my_entry1 ... /my_current1/my_entryn
 
  Now, to freeze /my_current1, just delete my_current1 from
  /writer/my_current1 and create a new writer znode /writer/my_current2.
  Now clients will see /writer/my_current2 and write entries to this.
 
 
  Regards,
  Rakesh
 
  -Original Message-
  From: Mudit Verma [mailto:mudit.f2004...@gmail.com]
  Sent: 16 June 2014 19:40
  To: Jordan Zimmerman
  Cc: Camille Fournier; user@zookeeper.apache.org
  Subject: Re: renaming a znode
 
  I just realised that we can not even delete a parent node, if it has
 children.  :(
 
 
  On 16 Jun 2014, at 03:43 pm, Mudit Verma mudit.f2004...@gmail.com
 wrote:
 
  problem is, it is going to be a very very costly operation (using multi
 transactions). A map may contain millions of entries. I first need to get
 the data of all these entries, delete them and create them again under
 different parent name.
 
  If we have a rename option, all I need to do is just rename the parent
 znode.
 
  Thanks
  Mudit
  On 16 Jun 2014, at 03:42 pm, Jordan Zimmerman 
 jor...@jordanzimmerman.com wrote:
 
  Yeah
 
 
  From: Camille Fournier cami...@apache.org
  Reply: user@zookeeper.apache.org user@zookeeper.apache.org
  Date: June 16, 2014 at 8:42:19 AM
  To: bookkeeper-u...@zookeeper.apache.org user@zookeeper.apache.org
  Cc: Mudit Verma mudit.f2004...@gmail.com
  Subject:  Re: renaming a znode
 
  Just to clarify you mean the multi API?
  C
  On Jun 16, 2014 9:40 AM, Jordan Zimmerman
  jor...@jordanzimmerman.com
  wrote:
 
  You could use the transaction api to create a new node and delete
  the old node.
 
  -JZ
 
 
  From: Mudit Verma mudit.f2004...@gmail.com
  Reply: user@zookeeper.apache.org user@zookeeper.apache.org
  Date: June 16, 2014 at 8:38:11 AM
  To: user@zookeeper.apache.org user@zookeeper.apache.org
  Subject: renaming a znode
 
  Hello People,
 
  Sorry for asking many questions these days. :)
 
  I am wondering if it is possible to rename a znode? I am building
  a distributed map on top of zookeeper for special needs. From time
  to time, I need to freeze the map without restricting write access
 to the map.
 
  I plan to do it by maintaining two maps:
 
  map_current
  map_frozen
 
  all the map entries are maintained as separate children znodes
  where key is the name of the child node and value is the value
  stored on the child node ..
  for example /map_current/entry1(kv)
  /map_current/entry2(kv)
 
 
  Now at some point of time, I need to iterate the map while still
  allowing write access by other clients. While I iterate, I don't
  want other clients to see these entries. Once I process map_frozen
  entries I will delete them (I don't need them anymore) by just
 deleting the parent node.
 
  I plan to rename existing map from map_current to map_frozen and
  

Re: Distributed Applocation using Zookeeper

2014-06-07 Thread Rakesh Radhakrishnan
Hi Kapil,

ZooKeeper server guarantees a total order of messages, and it also
guarantees a total order of proposals. It has an internal mechanism for the
total ordering, which uses a transaction id (zxid) for the requests. As far
as I know, each transaction is independent of the others. Could you give more
details about your use cases and the way you are going to use ZooKeeper? It
would help to understand more.

 So why locks are needed in first place? It will be great if some one
can help here.

I didn't fully get your point about locking. Are you talking about the
distributed locking recipe?

Regards,
Rakesh


On Thu, Jun 5, 2014 at 11:13 PM, Kapil kapildeshpande.1...@gmail.com
wrote:

 Hi,
 I need to design distributed application using zookeeper. This is the first
 time I am using Zookeeper so I am little confused with its usage. I have
 read that the Zab protocol ensures serializability when it comes to multiple
 updates, but I am unable to understand: if that is the case, then it will
 automatically allow a lock-free implementation. So why are locks needed in
 the first place? It will be great if someone can help here.

 Thanks




 --
 View this message in context:
 http://zookeeper-user.578899.n2.nabble.com/Distributed-Applocation-using-Zookeeper-tp7579952.html
 Sent from the zookeeper-user mailing list archive at Nabble.com.



Re: Distributed Applocation using Zookeeper

2014-06-07 Thread Rakesh Radhakrishnan
I assume you will be keeping each IP address as a znode in ZooKeeper. Also,
if you don't have any costly operation before storing the znode in
ZooKeeper, there is an optimistic way. When a request comes, pick one of
the available IPs from the pool and try to create its IP znode in ZooKeeper.
If two clients try to create the same znode in ZK simultaneously, one will get
KeeperException.NodeExistsException. That client then picks another available
IP optimistically and tries the create again.
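
The optimistic loop described above can be sketched as follows. This is a minimal in-memory illustration, not ZooKeeper client code: NodeStore, InMemoryNodeStore, IpAllocator and the /ips/ path prefix are all hypothetical names, and a false return from createIfAbsent stands in for the case where a real create call would throw KeeperException.NodeExistsException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical abstraction over "create znode"; names are illustrative only.
interface NodeStore {
    // Returns true if the node was created, false if it already existed
    // (the case where ZooKeeper would throw NodeExistsException).
    boolean createIfAbsent(String path);
}

// In-memory stand-in so the sketch runs without a ZooKeeper ensemble.
final class InMemoryNodeStore implements NodeStore {
    private final ConcurrentMap<String, Boolean> nodes = new ConcurrentHashMap<>();

    @Override
    public boolean createIfAbsent(String path) {
        // putIfAbsent is atomic: exactly one concurrent caller sees null.
        return nodes.putIfAbsent(path, Boolean.TRUE) == null;
    }
}

final class IpAllocator {
    private final NodeStore store;
    private final List<String> pool;

    IpAllocator(NodeStore store, List<String> pool) {
        this.store = store;
        this.pool = pool;
    }

    // Tries each candidate in turn; the first successful create wins.
    // Returns empty when every address in the pool is already taken.
    Optional<String> allocate() {
        for (String ip : pool) {
            if (store.createIfAbsent("/ips/" + ip)) {
                return Optional.of(ip);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        IpAllocator allocator = new IpAllocator(
                new InMemoryNodeStore(), Arrays.asList("10.0.0.1", "10.0.0.2"));
        System.out.println(allocator.allocate());
        System.out.println(allocator.allocate());
        System.out.println(allocator.allocate());
    }
}
```

No lock is held at any point: the atomic create is the only coordination step, and a loser simply moves on to the next candidate.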

Regards,
Rakesh


On Sat, Jun 7, 2014 at 11:33 PM, Kapil kapildeshpande.1...@gmail.com
wrote:

 Thanks Rakesh for mail,
 My functionality is as follows.
 I am going to use zookeeper to allocate IP addresses. So it will allocate
 an IP address from a given pool of available IP addresses each time a request
 comes. Once an IP address is allocated I am going to store it in a znode to
 maintain a record, so that I can know dynamically each time which IP
 addresses are available. So each time while allocating an IP address I have
 some concurrency mechanism to do this.
 So I was only confused that if updates are A-Sequential why do we need
 locks. But I think I now understand that since my update operation is
 divided into a number of individual atomic updates I will have to implement
 locking.

 Thanks




 --
 View this message in context:
 http://zookeeper-user.578899.n2.nabble.com/Distributed-Applocation-using-Zookeeper-tp7579952p7579966.html
 Sent from the zookeeper-user mailing list archive at Nabble.com.



Re: Distributed Applocation using Zookeeper

2014-06-07 Thread Rakesh Radhakrishnan
Then you can try the optimistic way: randomly pick one IP address, create
the znode, and handle KeeperException.NodeExistsException gracefully.
This keeps the concurrency mechanism on the client side simple. I'm only
showing one approach; as I said earlier, it depends on the complexity of your
client side algo, so you can take a call :)

Regards,
Rakesh


On Sun, Jun 8, 2014 at 12:13 AM, Kapil kapildeshpande.1...@gmail.com
wrote:

 Yes exactly, I am storing each IP address as nodes.



 --
 View this message in context:
 http://zookeeper-user.578899.n2.nabble.com/Distributed-Applocation-using-Zookeeper-tp7579952p7579968.html
 Sent from the zookeeper-user mailing list archive at Nabble.com.



Re: Asynchronous API's and monotonicity

2014-06-06 Thread Rakesh Radhakrishnan
Hi Mudit,

Thanks Flavio for pointing to BookKeeper. Yes, this is another option you can
explore in your free time :)


I'd like to introduce it briefly to you, please have a look if you are
interested.

Documentation available at
http://zookeeper.apache.org/bookkeeper/docs/r4.2.2/

Basic terminology:-

 - servers are called bookies,
 - logs are called ledgers,
 - and each unit of a log (aka record) is a ledger entry

A bookie server uses the filesystem to store the ledgers and their entries
(since it uses the filesystem, message size won't be a constraint, I guess).
It also uses ZK for storing the ledger metadata.

Basic operations:-
1) Open a BookKeeper client.
2) Create a ledger -
Here it will internally generate a unique id for this ledger. Like ZooKeeper
sequential znodes, it internally maintains a sequence to generate the ids.
Upon creating a ledger, the BookKeeper client writes metadata about the
ledger to ZK.
3) Write to the ledger - The user can add entries (user data) to the ledger.
BK guarantees a single writer.
4) After writing, close the ledger.

Assume the user has created four ledgers; the ledger ids then look like L1,
L2, L3, L4. When the user creates a new ledger, the id is incremented
to L5.

Ledger metadata in ZK:
/ledgers/L1
/ledgers/L2
/ledgers/L3
/ledgers/L4
/ledgers/L5

Now, using these sequential ledger znodes present in ZK, one can write the
logic of a distributed queue.
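
As a rough illustration of that queue logic, the consumer side boils down to picking the child with the lowest sequence number. The sketch below is stdlib-only and assumes names of the form shown above ("L" followed by a number); LedgerQueue, ledgerId and head are hypothetical names, and listing the children of /ledgers is left out.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LedgerQueue {
    // Parses the numeric part of a ledger znode name such as "L12".
    static long ledgerId(String name) {
        return Long.parseLong(name.substring(1));
    }

    // The head of the queue is the ledger with the lowest id.
    // Returns empty when there are no children.
    static Optional<String> head(List<String> children) {
        return children.stream()
                .min(Comparator.comparingLong(LedgerQueue::ledgerId));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("L10", "L2", "L5");
        System.out.println(head(children).orElse("(empty)"));
    }
}
```

Comparing the parsed numbers rather than the raw strings matters here, because "L10" sorts before "L2" lexicographically.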

If you have any queries, feel free to ping us; we are happy to help :)
You can reach us at bookkeeper-...@zookeeper.apache.org or the user mailing list.

Regards,
Rakesh


On Sat, Jun 7, 2014 at 2:32 AM, Flavio Junqueira 
fpjunque...@yahoo.com.invalid wrote:

 You may want to have a look at BookKeeper.

 -Flavio

 On 05 Jun 2014, at 16:32, Mudit Verma mudit.f2004...@gmail.com wrote:

  Hi Zookeeper Users,
 
  Lately, I have been working on a research project where I want to use
 zookeeper as a distributed logging service.
 
  I want to build a queue on top of zookeeper (also provided in recipes).
 
  What for:
  Intention is to insert some operations performed by different clients in
 a distributed queue, and process them lazily at some later point of time.
  And I want some ordering between these operations.
 
  Setup:
  5 physical  zookeeper servers
 
  The problem is:
  In my current setup, I am observing a latency of about 13 ms per enqueue
 operation (using synchronous create APIs with sequential flags). I want to
 significantly reduce this time. The other way could be to use asynchronous
 zookeeper calls  but I am not sure what can be the side effects. Would it
 still be monotonous when used with SEQUENTIAL flag?
 
  For example, a client X created a SEQUENTIAL node Z1 at time t1 using
 async create, same client created another SEQUENTIAL node Z2 at time t2
 where t2 > t1. Would the monotonic number associated with Z1 be lesser than
 that of Z2?
 
  Your help is much appreciated.
 
  Thanks
  Mudit
 




Re: Asynchronous API's and monotonicity

2014-06-05 Thread Rakesh Radhakrishnan
Hi Mudit,

A client has a consistent order and dispatches its requests sequentially.
Basically it is FIFO order: all requests from a given client are executed in
the order that they were sent by the client. In your case, the Z1 request
will be processed first and then the Z2 request. Of course, Z1's
sequential number will be less than Z2's sequential number.

But this behaviour may not hold if we perform the operations through
different clients. Network delays or other factors may cause requests from
different clients to be processed in a different order than they were issued.
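
The per-client FIFO guarantee can be pictured with a toy model. This is not ZooKeeper client code: FifoSequenceModel is a hypothetical class in which a single-threaded executor plays the role of one client session whose requests are processed in the order they were submitted, and an in-memory counter plays the role of the sequence number. The ten-digit zero-padded suffix mirrors the format ZooKeeper uses for sequential znodes; the /queue/item- prefix is an assumption.

```java
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model: one single-threaded executor stands in for one client session,
// so submitted requests run strictly in submission (FIFO) order.
final class FifoSequenceModel {
    private final ExecutorService session = Executors.newSingleThreadExecutor();
    private final AtomicInteger counter = new AtomicInteger();

    // Asynchronous "create with SEQUENTIAL flag": returns immediately and
    // completes later with the generated name.
    CompletableFuture<String> createSequentialAsync(String prefix) {
        return CompletableFuture.supplyAsync(
                () -> String.format(Locale.ROOT, "%s%010d",
                        prefix, counter.getAndIncrement()),
                session);
    }

    void close() {
        session.shutdown();
    }

    // Issues two async creates back to back and returns both names.
    static String[] runDemo() {
        FifoSequenceModel model = new FifoSequenceModel();
        try {
            CompletableFuture<String> z1 = model.createSequentialAsync("/queue/item-");
            CompletableFuture<String> z2 = model.createSequentialAsync("/queue/item-");
            return new String[] { z1.join(), z2.join() };
        } finally {
            model.close();
        }
    }

    public static void main(String[] args) {
        String[] names = runDemo();
        System.out.println(names[0]);
        System.out.println(names[1]);
    }
}
```

With two separate sessions (two executors sharing the counter) the model no longer fixes which request gets the lower number, which is the cross-client caveat above.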

Regards,
Rakesh



On Thu, Jun 5, 2014 at 9:02 PM, Mudit Verma mudit.f2004...@gmail.com
wrote:

 Hi Zookeeper Users,

 Lately, I have been working on a research project where I want to use
 zookeeper as a distributed logging service.

 I want to build a queue on top of zookeeper (also provided in recipes).

 What for:
 Intention is to insert some operations performed by different clients in a
 distributed queue, and process them lazily at some later point of time.
  And I want some ordering between these operations.

 Setup:
 5 physical  zookeeper servers

 The problem is:
 In my current setup, I am observing a latency of about 13 ms per enqueue
 operation (using synchronous create APIs with sequential flags). I want to
 significantly reduce this time. The other way could be to use asynchronous
 zookeeper calls  but I am not sure what can be the side effects. Would it
 still be monotonous when used with SEQUENTIAL flag?

 For example, a client X created a SEQUENTIAL node Z1 at time t1 using
 async create, same client created another SEQUENTIAL node Z2 at time t2
 where t2 > t1. Would the monotonic number associated with Z1 be lesser than
 that of Z2?

 Your help is much appreciated.

 Thanks
 Mudit




Re: Command to find the leader

2014-04-25 Thread Rakesh Radhakrishnan
Hi Kamani,

Please see the four letter words; I hope they will help you.
http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_zkCommands
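
For instance, the "stat" four letter word reports a "Mode:" line (leader, follower or standalone) for the server it is sent to. Below is a minimal stdlib-only sketch that sends the command over a plain socket and extracts that line. FourLetterWord, send and parseMode are hypothetical names, and the host, port and 3 second timeouts are assumptions; run it against each server in the ensemble to see which one answers "leader".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class FourLetterWord {
    // Sends a four letter word (for example "stat") to the client port and
    // returns the whole reply; the server closes the connection when done.
    static String send(String host, int port, String command) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 3000);
            socket.setSoTimeout(3000);
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Extracts the value of the "Mode:" line; returns null if it is absent.
    static String parseMode(String statReply) {
        for (String line : statReply.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                return trimmed.substring("Mode:".length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost"; // assumed host
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " mode = "
                + parseMode(send(host, port, "stat")));
    }
}
```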


Another way is to use JMX. ZK exposes a set of JMX beans which also contain
server statistics.
http://zookeeper.apache.org/doc/r3.4.6/zookeeperJMX.html#ch_reference


-Rakesh




On Fri, Apr 25, 2014 at 4:41 PM, kamani prakashkam...@gmail.com wrote:

 I believe there are specific commands to find who is the leader among a set
 of servers. Can someone share those commands? I am using zookeeper 3.4.6 as
 of now.



 --
 View this message in context:
 http://zookeeper-user.578899.n2.nabble.com/Command-to-find-the-leader-tp7579801.html
 Sent from the zookeeper-user mailing list archive at Nabble.com.