Hi Karl,

Could the ZooKeeper connection reset messages be caused by the system running
close to its memory limit? The machine has 12G of RAM, and I can see about
11.5G in use while a job is running.
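
The reason I ask: my session timeout is 4000 ms, which as far as I understand
is the smallest value the server will grant with tickTime=2000 (ZooKeeper's
documented default bounds are 2x and 20x tickTime). So if memory pressure makes
a JVM stall for a few seconds, I suppose a session could time out. A rough
sketch of that arithmetic follows; the function names are my own and not a
ZooKeeper API, and the expiry check is a simplification:

```python
# Values taken from my zoo.cfg and ManifoldCF properties.
TICK_TIME_MS = 2000          # tickTime
REQUESTED_TIMEOUT_MS = 4000  # org.apache.manifoldcf.zookeeper.sessiontimeout


def negotiated_timeout(requested_ms, tick_ms):
    """Clamp the requested timeout to the default server bounds.

    Assumes minSessionTimeout/maxSessionTimeout are left at their
    defaults of 2 * tickTime and 20 * tickTime.
    """
    lowest = 2 * tick_ms
    highest = 20 * tick_ms
    return max(lowest, min(requested_ms, highest))


def pause_risks_expiry(pause_ms, timeout_ms):
    """Simplified check: a stall at least as long as the timeout
    means the server hears nothing from the client in time."""
    return pause_ms >= timeout_ms


timeout = negotiated_timeout(REQUESTED_TIMEOUT_MS, TICK_TIME_MS)
print(timeout, pause_risks_expiry(5000, timeout))
```

If that reasoning is right, asking for a larger session timeout would give more
room for pauses, but I have not verified this.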


Should I allocate a fixed amount of memory to the ZooKeeper nodes, and if so,
is there a rule of thumb for how much?
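
To make the question concrete, this is the sort of budget check I have in mind.
The process names and heap sizes are placeholders, not my actual settings, and
the 25% headroom for the OS and non-heap JVM memory is only a guess on my part:

```python
TOTAL_RAM_MB = 12 * 1024  # 12G per machine

# Hypothetical -Xmx values (in MB) for the JVMs on one machine.
heaps_mb = {
    "zookeeper-1": 512,
    "zookeeper-2": 512,
    "zookeeper-3": 512,
    "tomcat-manifoldcf": 4096,
    "manifoldcf-agents": 2048,
}


def fits_in_ram(heaps, total_mb, headroom=0.25):
    """Return True if the summed max heaps leave the given
    fraction of RAM free for the OS and non-heap memory."""
    budget_mb = total_mb * (1 - headroom)
    return sum(heaps.values()) <= budget_mb


print(fits_in_ram(heaps_mb, TOTAL_RAM_MB))
```

The idea is simply to keep the sum of the maximum heaps well below physical RAM
so that no JVM ends up swapping.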

Regards.

On Mon, Sep 15, 2014 at 7:16 PM, Karl Wright <daddy...@gmail.com> wrote:

> Hi Lalit,
>
> Looks like this is the result of a Tomcat shutdown, and is probably a race
> condition bug in ZooKeeper:
>
>
> http://mail-archives.apache.org/mod_mbox/tomcat-users/201306.mbox/%3cbay174-w32b2284bedae503e9d22d3a8...@phx.gbl%3E
>
> Karl
>
>
> On Mon, Sep 15, 2014 at 9:41 AM, lalit jangra <lalit.j.jan...@gmail.com>
> wrote:
>
>> Hi Karl,
>>
>> Along with this, I can see the errors below in Tomcat's catalina.out.
>>
>> Sep 15, 2014 1:06:14 PM org.apache.catalina.loader.WebappClassLoader
>> loadClass
>>
>> INFO: Illegal access: this web application instance has been stopped
>> already.  Could not load org.apache.zookeeper.server.ZooTrace.  The
>> eventual following stack trace is caused by an error thrown for debugging
>> purposes as well as to attempt to terminate the thread which caused the
>> illegal access, and has no functional impact.
>>
>> java.lang.IllegalStateException
>>
>>         at
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1612)
>>
>>         at
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
>>
>>         at
>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1115)
>>
>>
>>
>> [http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2183)] ERROR
>> org.apache.zookeeper.ClientCnxn - from http-bio-80-exec-1-SendThread(
>> iwdc2preecma04.iwater.ie:2183)
>>
>> java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooTrace
>>
>>         at
>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1115)
>>
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.zookeeper.server.ZooTrace
>>
>>         at
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
>>
>>         at
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
>>
>>         ... 1 more
>>
>> [http-bio-80-exec-1-SendThread(iwdc2preecma04.iwater.ie:2182)] ERROR
>> org.apache.zookeeper.ClientCnxn - from http-bio-80-exec-1-SendThread(
>> iwdc2preecma04.iwater.ie:2182)
>>
>> Sep 15, 2014 1:06:14 PM org.apache.coyote.AbstractProtocol destroy
>>
>> INFO: Destroying ProtocolHandler ["http-bio-80"]
>>
>> java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooTrace
>>
>> Regards.
>>
>> On Mon, Sep 15, 2014 at 7:05 PM, lalit jangra <lalit.j.jan...@gmail.com>
>> wrote:
>>
>>> Thanks Karl,
>>>
>>> Crawling is very slow and takes a long time, which is a bit frustrating.
>>> Since I have multiple high-volume jobs running in parallel, this is a real
>>> problem for me.
>>>
>>> I have also raised it on the ZooKeeper forums at
>>> http://zookeeper-user.578899.n2.nabble.com/Getting-errors-in-zookeeper-logs-td7580260.html
>>> but I am still waiting for a reply.
>>>
>>> Regards.
>>>
>>> On Mon, Sep 15, 2014 at 6:51 PM, Karl Wright <daddy...@gmail.com> wrote:
>>>
>>>> Hi Lalit,
>>>>
>>>> When MCF cannot reach zookeeper, MCF crawls will pause until the
>>>> zookeeper connections are reestablished.  Then the crawls should resume.
>>>> This should *not* abort your crawls, but it will make them very slow.
>>>>
>>>> I am not a zookeeper expert, so I would post on their message boards to
>>>> see if there is any adjustment that can be made to zookeeper parameters
>>>> that would improve zookeeper behavior when you have a flaky network.
>>>> However, since the obvious solution is to fix your network, they may not
>>>> have a code solution for you.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Mon, Sep 15, 2014 at 9:15 AM, lalit jangra <lalit.j.jan...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Karl,
>>>>>
>>>>> Ideally, resetting connections should be taken care of by ZooKeeper
>>>>> itself, as I can see connections being re-established later in the logs.
>>>>>
>>>>> Apart from resolving the network issue, can you suggest any way to
>>>>> overcome this? My crawls keep failing again and again. Is there anything
>>>>> I can change in the config files, etc.?
>>>>>
>>>>> Regards.
>>>>>
>>>>>
>>>>> On Mon, Sep 15, 2014 at 6:39 PM, Karl Wright <daddy...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Lalit,
>>>>>>
>>>>>> Zookeeper will keep working, but you should understand that you are
>>>>>> dropping connections to your zookeeper members for unknown reasons, which
>>>>>> is causing your crawl to stall when it happens.  This argues that perhaps
>>>>>> you have some network flakiness of some kind.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 15, 2014 at 8:59 AM, lalit jangra <
>>>>>> lalit.j.jan...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am running a cluster of two Apache ManifoldCF nodes on two separate
>>>>>>> machines, each of which hosts 3 ZooKeeper instances (6 instances in
>>>>>>> total in the cluster). When I start the ManifoldCF agents, I see the
>>>>>>> warnings below during startup.
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Unable to read additional data from
>>>>>>> server sessionid 0x0, likely server has closed socket, closing socket
>>>>>>> connection and attempting reconnect
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>>>>>> authenticate using SASL (unknown error)
>>>>>>>
>>>>>>>
>>>>>>> I can also see the errors below in the logs while the agents are running.
>>>>>>>
>>>>>>> [http-bio-80-exec-2] INFO org.apache.zookeeper.ZooKeeper -
>>>>>>> Initiating client connection,
>>>>>>> connectString=iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183
>>>>>>> sessionTimeout=4000
>>>>>>> watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@51d83fd7
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
>>>>>>> authenticate using SASL (unknown error)
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, initiating session
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] WARN
>>>>>>> org.apache.zookeeper.ClientCnxn - Session 0x0 for server
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2182, unexpected error,
>>>>>>> closing socket connection and attempting reconnect
>>>>>>>
>>>>>>> java.io.IOException: Connection reset by peer
>>>>>>>
>>>>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>>>>>
>>>>>>>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>>>>>
>>>>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>>>>>>>
>>>>>>>         at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>>>>>>>
>>>>>>>         at
>>>>>>> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>>>>>>>
>>>>>>>         at
>>>>>>> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>>>>>>>
>>>>>>>         at
>>>>>>> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>>>>>>
>>>>>>>         at
>>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183. Will not attempt to
>>>>>>> authenticate using SASL (unknown error)
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>>>>>>> iwdc2preecma04.iwater.ie/10.231.72.25:2183, initiating session
>>>>>>>
>>>>>>> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
>>>>>>> org.apache.zookeeper.ClientCnxn - Session establishment complete on
>>>>>>> server iwdc2preecma04.iwater.ie/10.231.72.25:2183, sessionid =
>>>>>>> 0x6487851bd330078, negotiated timeout = 4000
>>>>>>>
>>>>>>>
>>>>>>> Below are the configurations for (1) the ZooKeeper nodes and (2) the
>>>>>>> ZooKeeper settings on the MCF nodes.
>>>>>>>
>>>>>>>
>>>>>>> *zoo.cfg :  Same for all six zookeeper nodes.*
>>>>>>>
>>>>>>>
>>>>>>> # The number of milliseconds of each tick
>>>>>>>
>>>>>>> tickTime=2000
>>>>>>>
>>>>>>> dataDir=/app/IW/zookeeper/data/data.1
>>>>>>>
>>>>>>> dataLogDir=/app/IW/zookeeper/logs/log.1
>>>>>>>
>>>>>>> clientPort=2181
>>>>>>>
>>>>>>> server.1=iwdc1preecma03:2888:3888
>>>>>>>
>>>>>>> server.2=iwdc1preecma03:2889:3889
>>>>>>>
>>>>>>> server.3=iwdc1preecma03:2890:3890
>>>>>>>
>>>>>>> server.4=iwdc2preecma04:2891:3891
>>>>>>>
>>>>>>> server.5=iwdc2preecma04:2892:3892
>>>>>>>
>>>>>>> server.6=iwdc2preecma04:2893:3893
>>>>>>>
>>>>>>> # The number of ticks that the initial
>>>>>>>
>>>>>>> # synchronization phase can take
>>>>>>>
>>>>>>> initLimit=10
>>>>>>>
>>>>>>> # The number of ticks that can pass between
>>>>>>>
>>>>>>> # sending a request and getting an acknowledgement
>>>>>>>
>>>>>>> syncLimit=5
>>>>>>>
>>>>>>> # the directory where the snapshot is stored.
>>>>>>>
>>>>>>> # do not use /tmp for storage, /tmp here is just
>>>>>>>
>>>>>>> # example sakes.
>>>>>>>
>>>>>>> #dataDir=/tmp/zookeeper
>>>>>>>
>>>>>>> # the port at which the clients will connect
>>>>>>>
>>>>>>> #clientPort=2181
>>>>>>>
>>>>>>> # the maximum number of client connections.
>>>>>>>
>>>>>>> # increase this if you need to handle more clients
>>>>>>>
>>>>>>> #maxClientCnxns=60
>>>>>>>
>>>>>>> #
>>>>>>>
>>>>>>> # Be sure to read the maintenance section of the
>>>>>>>
>>>>>>> # administrator guide before turning on autopurge.
>>>>>>>
>>>>>>> #
>>>>>>>
>>>>>>> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
>>>>>>>
>>>>>>> #
>>>>>>>
>>>>>>> # The number of snapshots to retain in dataDir
>>>>>>>
>>>>>>> autopurge.snapRetainCount=3
>>>>>>>
>>>>>>> # Purge task interval in hours
>>>>>>>
>>>>>>> # Set to "0" to disable auto purge feature
>>>>>>>
>>>>>>> autopurge.purgeInterval=1
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *ManifoldCF configurations : same for both ManifoldCF nodes.*
>>>>>>>
>>>>>>>
>>>>>>> <property name="org.apache.manifoldcf.lockmanagerclass"
>>>>>>> value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>>>>>>>
>>>>>>>   <property name="org.apache.manifoldcf.zookeeper.connectstring"
>>>>>>> value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>>>>>>>
>>>>>>> <property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
>>>>>>> value="4000"/>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *I want to know whether, given the above warnings/errors, ZooKeeper will
>>>>>>> stop working, or whether it will keep working and these messages are
>>>>>>> harmless. I ask because my ManifoldCF jobs are stuck while I see these
>>>>>>> errors.*
>>>>>>>
>>>>>>> Please suggest.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Lalit.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Lalit.
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Lalit.
>>>
>>
>>
>>
>> --
>> Regards,
>> Lalit.
>>
>
>


-- 
Regards,
Lalit.
