Lock example

2010-09-08 Thread Tim Robertson
Hi all,

I am new to ZK and am using the queue and lock examples that come with
ZooKeeper, but I have run into ZOOKEEPER-645 with the lock.
I have several JVMs, each keeping a long-running ZK client, and the
first JVM (and hence client) does not respect the locks obtained by
subsequent clients: the first client always manages to get the lock
even if another client holds it.

Before I start digging, I thought I'd ask if anyone has a simple lock
implementation they might share?  My need is simply to lock a URL to
indicate that it is being worked on, so that I don't hammer my
endpoints with multiple clients.

Thanks for any advice,
Tim


Re: Lock example

2010-09-13 Thread Tim Robertson
Thanks Mahadev,

It's good to have this confirmation, as this is what I ended up doing.
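
For the archives, what I ended up with is essentially the ephemeral
create/delete Mahadev describes below. A minimal sketch (the path layout
and error handling are simplified, not our production code):

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

/** One ephemeral znode per URL: a simple try-lock, no queueing or fairness. */
public class SimpleLock {

    private final ZooKeeper zk;

    public SimpleLock(ZooKeeper zk) {
        this.zk = zk;
    }

    /** Returns true if we got the lock, false if another client holds it. */
    public boolean tryLock(String lockPath)
            throws KeeperException, InterruptedException {
        try {
            // EPHEMERAL: the znode (and so the lock) vanishes when our
            // session dies, so a crashed client cannot hold it forever.
            zk.create(lockPath, new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            return true;
        } catch (KeeperException.NodeExistsException e) {
            return false; // someone else is working on this URL
        }
    }

    /** Release by deleting the znode; version -1 matches any version. */
    public void unlock(String lockPath)
            throws KeeperException, InterruptedException {
        zk.delete(lockPath, -1);
    }
}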


On Mon, Sep 13, 2010 at 11:33 PM, Mahadev Konar  wrote:
> Hi Tim,
>  The lock recipe you mention is supposed to avoid her affect and prevent 
> starvation (though it has bugs :)).
>  Are you looking for something like that or just a simple lock and unlock 
> that doesn't have to worry abt the above issues.
> If that's the case then just doing an ephemeral create and delete should give 
> you your lock and unlock recipes.
>
>
> Thanks
> mahadev
>
>
> On 9/8/10 9:58 PM, "Tim Robertson"  wrote:
>
> Hi all,
>
> I am new to ZK and am using the queue and lock examples that come with
> ZooKeeper, but I have run into ZOOKEEPER-645 with the lock.
> I have several JVMs, each keeping a long-running ZK client, and the
> first JVM (and hence client) does not respect the locks obtained by
> subsequent clients: the first client always manages to get the lock
> even if another client holds it.
>
> Before I start digging, I thought I'd ask if anyone has a simple lock
> implementation they might share?  My need is simply to lock a URL to
> indicate that it is being worked on, so that I don't hammer my
> endpoints with multiple clients.
>
> Thanks for any advice,
> Tim
>
>


Expiring session... timeout of 600000ms exceeded

2010-09-21 Thread Tim Robertson
Hi all,

I am seeing a lot of my clients being kicked out after the 10-minute
negotiated session timeout is exceeded.
My clients are each a JVM (around 100 running on one machine) doing web
crawling of specific endpoints and handling the response XML, so they
do wait around for 3-4 minutes on HTTP timeouts, but certainly not 10
minutes.
I am just prototyping right now on a 2 x quad-core Mac Pro with 12GB of
memory; the 100 child processes only get -Xmx64m each, and I don't see
my machine being exhausted.

Do my clients need to do anything to initiate keep-alive heartbeats, or
should this be automatic (I thought the tickTime would dictate this)?

# my conf is:
tickTime=2000
dataDir=/Volumes/Data/zookeeper
clientPort=2181
maxClientCnxns=1
minSessionTimeout=4000
maxSessionTimeout=80
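
FWIW, each client connects with essentially the following (simplified;
the connect string and watcher are placeholders, and the session is
otherwise left entirely to the client library):

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class CrawlerClient {
    public static void main(String[] args) throws Exception {
        // We request a 10-minute session; my understanding is the server
        // clamps this into [minSessionTimeout, maxSessionTimeout].
        ZooKeeper zk = new ZooKeeper("localhost:2181", 600000, new Watcher() {
            public void process(WatchedEvent event) {
                System.out.println("ZK event: " + event); // just logging for now
            }
        });
        // ... crawling work happens here; the library's own thread is
        // supposed to send the heartbeats in the background.
    }
}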

Thanks for any pointers to this newbie,
Tim


Re: Expiring session... timeout of 600000ms exceeded

2010-09-21 Thread Tim Robertson
Thanks Ted,

> To answer your last question first, no you don't have to do anything
> explicit to keep the ZK connection alive.  It is maintained by a dedicated
> thread.  You do have to keep your java program responsive and ZK problems
> like this almost always indicate that you have a problem with your program
> checking out for extended periods of time.
>
> My strong guess is that you have something evil happening with your java
> process that is actually causing this delay.
>
> Since you have tiny memory, it probably isn't GC.  Since you have a bunch of
> processes, swap and process wakeup delays seem plausible.  What is the load
> average on your box?

CPU spikes when responses come in, but mostly it's IO wait on the
endpoints (timeout of 3 minutes).  I suspect HttpClient 4 is dropping
into a retry mechanism, though I have not investigated this yet.
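
If the retries do turn out to be the culprit I'll switch them off;
untested, but against HttpClient 4.x I believe it is something like:

import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;

public class NoRetryClient {
    public static DefaultHttpClient create() {
        DefaultHttpClient client = new DefaultHttpClient();
        // 0 retries, and never replay requests that were already sent, so
        // a slow endpoint fails after one 3-minute timeout instead of
        // silently multiplying it.
        client.setHttpRequestRetryHandler(
                new DefaultHttpRequestRetryHandler(0, false));
        return client;
    }
}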

> On the topic of your application, why are you using processes instead of
> threads?  With threads, you can get your memory overhead down to tens of
> kilobytes as opposed to tens of megabytes.

I am just prototyping scaling out to many processes, potentially across
multiple machines.  Our live crawler runs in a single JVM, but some of
these crawls take 4-6 weeks, so long-running jobs block others, and I
was looking at alternatives.  The live crawler also uses DOM-based XML
parsing and so hits memory limits; SAX would address this (rough sketch
below).  We also want to be able to deploy patches to the crawlers
without interrupting those long-running jobs if possible.
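
To be concrete about the SAX switch, something along these lines (the
"record" element name is made up for illustration):

import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class RecordStreamer {
    /** Streams the response instead of building a DOM, so memory stays
     *  flat no matter how large the page of records is. */
    public static void parse(InputStream response) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(response, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qName, Attributes attrs) {
                if ("record".equals(qName)) { // hypothetical element name
                    // handle one record as it streams past, then discard it
                }
            }
        });
    }
}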

> Also, why not use something like Bixo so you don't have to prototype a
> threaded crawler?

It is not a web crawler but more of a custom web service client that
issues queries for pages of data, where a second query is assembled
based on the response to the first.  These are biodiversity
domain-specific protocols (DiGIR, TAPIR and BioCASe), which are closer
to SOAP-style request/response.  I'll look at Bixo.

Thanks again,
Tim

>
> On Tue, Sep 21, 2010 at 8:24 AM, Tim Robertson 
> wrote:
>
>> Hi all,
>>
>> I am seeing a lot of my clients being kicked out after the 10-minute
>> negotiated session timeout is exceeded.
>> My clients are each a JVM (around 100 running on one machine) doing web
>> crawling of specific endpoints and handling the response XML, so they
>> do wait around for 3-4 minutes on HTTP timeouts, but certainly not 10
>> minutes.
>> I am just prototyping right now on a 2 x quad-core Mac Pro with 12GB of
>> memory; the 100 child processes only get -Xmx64m each, and I don't see
>> my machine being exhausted.
>>
>> Do my clients need to do anything to initiate keep-alive heartbeats, or
>> should this be automatic (I thought the tickTime would dictate this)?
>>
>> # my conf is:
>> tickTime=2000
>> dataDir=/Volumes/Data/zookeeper
>> clientPort=2181
>> maxClientCnxns=1
>> minSessionTimeout=4000
>> maxSessionTimeout=80
>>
>> Thanks for any pointers to this newbie,
>> Tim
>>
>


Setting the heap size

2010-10-28 Thread Tim Robertson
Hi all,

We are setting up a small 13-node Hadoop cluster running 1 HDFS master,
9 HBase region servers and 3 MapReduce nodes, and are just installing
ZooKeeper to perform the HBase coordination and to manage a few simple
process locks for other tasks we run.

Could someone please advise what kind of heap we should give our single
ZK node, and also (ahem) how one actually sets this?  It's not
immediately obvious in the docs or config.

Thanks,
Tim


Re: Setting the heap size

2010-10-29 Thread Tim Robertson
Great - thanks Patrick!
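
P.S. For the archives: setting the JVMFLAGS environment variable before
starting the server worked for me. A sketch (the heap size here is just
an example value, not a recommendation):

export JVMFLAGS="-Xmx512m"
./zkServer.sh start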


On Thu, Oct 28, 2010 at 6:13 PM, Patrick Hunt  wrote:
> Tim, one other thing you might want to be aware of:
> http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_supervision
>
> Patrick
>
> On Thu, Oct 28, 2010 at 9:11 AM, Patrick Hunt  wrote:
>> On Thu, Oct 28, 2010 at 2:52 AM, Tim Robertson
>>  wrote:
>>> We are setting up a small 13-node Hadoop cluster running 1 HDFS master,
>>> 9 HBase region servers and 3 MapReduce nodes, and are just installing
>>> ZooKeeper to perform the HBase coordination and to manage a few simple
>>> process locks for other tasks we run.
>>>
>>> Could someone please advise what kind of heap we should give our single
>>> ZK node, and also (ahem) how one actually sets this?  It's not
>>> immediately obvious in the docs or config.
>>
>> The amount of heap necessary will depend on the application(s) using
>> ZK; how you configure the heap also depends on what packaging you are
>> using to start ZK.
>>
>> Are you using zkServer.sh from our distribution? If so then you
>> probably want to set the JVMFLAGS env variable. We pass this through to
>> the JVM; see -Xmx in the man page
>> (http://www.manpagez.com/man/1/java/)
>>
>> Given this is HBase (which I'm reasonably familiar with), the default
>> heap should be fine. However, you might want to check with the HBase
>> team on that.
>>
>> I'd also encourage you to file a JIRA for the (lack of) doc issue you
>> highlighted: https://issues.apache.org/jira/browse/ZOOKEEPER
>>
>> Regards,
>>
>> Patrick
>>
>