Re: Concurrency control

2015-10-03 Thread Naganarasimha Garla
n general, users specifying resources for containers itself is a difficult >>>> task. >>>> And it might not be right to expect that the admin will do it for each >>>> application in the queue either. Basically governing will be difficult if >>>>

Re: Concurrency control

2015-10-02 Thread Laxman Ch
nd it might not be right to expect that the admin will do it for each >>> application in the queue either. Basically governing will be difficult if >>> it's not enforced from queue/scheduler side. >>> >>> + Naga >>> >>> -- >>> *From:* Laxman Ch [la

Re: Concurrency control

2015-10-01 Thread Harsh J
>> + Naga >> >> ------ >> *From:* Laxman Ch [laxman@gmail.com] >> *Sent:* Tuesday, September 29, 2015 16:52 >> >> *To:* user@hadoop.apache.org >> *Subject:* Re: Concurrency control >> >> IMO, it's better to have

Re: Concurrency control

2015-10-01 Thread Laxman Ch
rced from queue/scheduler side. > > + Naga > > -- > *From:* Laxman Ch [laxman@gmail.com] > *Sent:* Tuesday, September 29, 2015 16:52 > > *To:* user@hadoop.apache.org > *Subject:* Re: Concurrency control > > IMO, it's better to have an application level configuration t

Re: Concurrency control

2015-09-29 Thread Laxman Ch
Bouncing this thread again. Any other thoughts please? On 17 September 2015 at 23:21, Laxman Ch wrote: > No Naga. That won't help. > > I am running two applications (app1 - 100 vcores, app2 - 100 vcores) with > same user which runs in same queue (capacity=100vcores). In

RE: Concurrency control

2015-09-29 Thread Naganarasimha G R (Naga)
o allow a single app to acquire more resources. Thoughts ? + Naga From: Rohith Sharma K S [rohithsharm...@huawei.com] Sent: Tuesday, September 29, 2015 14:07 To: user@hadoop.apache.org Subject: RE: Concurrency control Hi Laxman, In Hadoop-2.8(Not released yet)

Re: Concurrency control

2015-09-29 Thread Namikaze Minato
ga > > > > > From: Rohith Sharma K S [rohithsharm...@huawei.com] > Sent: Tuesday, September 29, 2015 14:07 > To: user@hadoop.apache.org > Subject: RE: Concurrency control > > Hi Laxman, > > > > In Hadoop-2.8(Not r

RE: Concurrency control

2015-09-29 Thread Naganarasimha G R (Naga)
: Tuesday, September 29, 2015 16:03 To: user@hadoop.apache.org Subject: Re: Concurrency control Thanks Rohit, Naga and Lloyd for the responses. > I think Laxman should also tell us more about which application type he is > running. We run mr jobs mostly with default core/memory allocat

Re: Concurrency control

2015-09-29 Thread Laxman Ch
p-limit-factor" : The multiple of > > the queue capacity which can be configured to allow a single app to > acquire > > more resources. Thoughts ? > > > > + Naga > > > > > > > > > > From: Rohith Sharma K S [rohithsharm...@huawei.com]

Re: Concurrency control

2015-09-29 Thread Laxman Ch
- > *From:* Laxman Ch [laxman@gmail.com] > *Sent:* Tuesday, September 29, 2015 16:03 > > *To:* user@hadoop.apache.org > *Subject:* Re: Concurrency control > > Thanks Rohit, Naga and Lloyd for the responses. > > > I think Laxman should also tell us more about wh

RE: Concurrency control

2015-09-29 Thread Naganarasimha G R (Naga)
from queue/scheduler side. + Naga From: Laxman Ch [laxman@gmail.com] Sent: Tuesday, September 29, 2015 16:52 To: user@hadoop.apache.org Subject: Re: Concurrency control IMO, it's better to have an application level configuration than to have a scheduler/queue

Fwd: Concurrency control

2015-09-17 Thread Laxman Ch
Hi, In YARN, do we have any way to control the amount of resources (vcores, memory) used by an application SIMULTANEOUSLY? - In my cluster, I noticed some large and long-running mr-apps occupying all the slots of the queue and blocking other apps from getting started. - I'm using Capacity schedulers

Re: Concurrency control

2015-09-17 Thread Naganarasimha Garla
Hi Laxman, Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator" is configured to DominantResourceCalculator then cpu and memory can be controlled. Please kindly refer further to the official documentation http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html But may

Re: Concurrency control

2015-09-17 Thread Laxman Ch
Yes. I'm already using cgroups. Cgroups help in controlling the resources at container level. But my requirement is more about controlling the concurrent resource usage of an application at whole cluster level. And yes, we do configure queues properly. But, that won't help. For example, I have

Re: Concurrency control

2015-09-17 Thread Laxman Ch
No Naga. That won't help. I am running two applications (app1 - 100 vcores, app2 - 100 vcores) with same user which runs in same queue (capacity=100vcores). In this scenario, if app1 triggers first, occupies all the slots and runs long, then app2 will starve longer. Let me reiterate my problem

Re: Concurrency control

2015-09-17 Thread Naganarasimha Garla
Hi Laxman, For the example you have stated, maybe we can do the following things: 1. Create/modify the queue with capacity and max cap set such that it's equivalent to 100 vcores. So, as there is no elasticity, a given application will not use resources beyond the configured capacity 2.

Any issue with large concurrency due to single active instance of YARN Resource Manager?

2014-09-02 Thread bo yang
Hi Guys, I am thinking how many concurrent jobs a single Resource Manager might be able to manage? Following is my understanding, please correct me if I am wrong. Let's say if we have 1000 concurrent jobs running. Resource Manager will have 1000 records in memory to manage these jobs. And it

Re: Any issue with large concurrency due to single active instance of YARN Resource Manager?

2014-09-02 Thread Zhijie Shen
Hi Bo, RM doesn't create an individual thread for each running app. The app life cycle management is event driven. There's a dispatcher, which runs on one thread to handle the events for all apps. Zhijie On Mon, Sep 1, 2014 at 11:39 PM, bo yang bobyan...@gmail.com wrote: Hi Guys, I am

Re: Any issue with large concurrency due to single active instance of YARN Resource Manager?

2014-09-02 Thread bo yang
Hi Zhijie, That is great to know. Thanks! So there seems to be not much limit on supporting large concurrency. To move this question further, what might be the max number of concurrent jobs which one Resource Manager could support? Are there any numbers from your experience? Thanks, Bo On Tue

Re: Any issue with large concurrency due to single active instance of YARN Resource Manager?

2014-09-02 Thread Zhijie Shen
concurrency. To move this question further, what might be the max number of concurrent jobs which one Resource Manager could support? Are there any numbers from your experience? Thanks, Bo On Tue, Sep 2, 2014 at 12:10 AM, Zhijie Shen zs...@hortonworks.com wrote: Hi Bo, RM doesn't create

Oozie apparent concurrency deadlocking

2012-11-15 Thread Kartashov, Andy
Guys, Have struggled for the last four days with this and still cannot find an answer even after hours of searching the web. I tried oozie workflow to execute my consecutive sqoop jobs in parallel. I use forking that executes 9 sqoop-action-nodes. I had no problem executing the job on a

RE: Oozie apparent concurrency deadlocking

2012-11-15 Thread Kartashov, Andy
@hadoop.apache.org; 'cdh-u...@cloudera.org' Subject: Oozie apparent concurrency deadlocking Guys, Have struggled for the last four days with this and still cannot find an answer even after hours of searching the web. I tried oozie workflow to execute my consecutive sqoop jobs in parallel. I use

Re: concurrency

2012-10-12 Thread Koert Kuipers
). We also have map-red queries that read from the entire dataset (/data/*). My worry here is concurrency. It will happen that a query job runs while a loader job is adding a new partition at the same time. Is there a risk that the query could read incomplete or corrupt files

concurrency

2012-10-12 Thread Koert Kuipers
(so they write to new sub-directories). We also have map-red queries that read from the entire dataset (/data/*). My worry here is concurrency. It will happen that a query job runs while a loader job is adding a new partition at the same time. Is there a risk that the query could read incomplete

Re: concurrency

2012-10-12 Thread Harsh J
that use map-red jobs to add new partitions to this data set at a regular interval (so they write to new sub-directories). We also have map-red queries that read from the entire dataset (/data/*). My worry here is concurrency. It will happen that a query job runs while a loader job is adding

Re: concurrency

2012-10-12 Thread J. Rottinghuis
have loaders that use map-red jobs to add new partitions to this data set at a regular interval (so they write to new sub-directories). We also have map-red queries that read from the entire dataset (/data/*). My worry here is concurrency. It will happen that a query job runs while a loader

Re: Concurrency control

2012-07-18 Thread Harsh J
/write concurrency over HDFS files? Cause HDFS files do not allow concurrent writers (one active lease per file), AFAICT. On Wed, Jul 18, 2012 at 9:09 PM, saubhagya dey saubhagya@gmail.com wrote: how do i manage concurrency in hadoop like we do in teradata. We need to have a read and write

Re: Concurrency control

2012-07-18 Thread Michael Segel
, the HBase coprocessors (new from Apache HBase 0.92 onwards) provide you an ability to do that too. If your question is indeed specific to HBase, please ask it in a more clarified form on the u...@hbase.apache.org lists. If not HBase, do you mean read/write concurrency over HDFS files? Cause HDFS

RE: Concurrency control

2012-07-18 Thread saubhagya....@gmail.com
So do I need to think that concurrency can be controlled in hive? If so then please illustrate. The problem that I am facing is that in the cluster, when two different users are accessing the same table, and one is writing into it and the other is reading, how is this handled

concurrency in exporting HBase contents

2010-01-22 Thread Ted Yu
Hi, Suppose during export there is an ongoing write operation to the HBase table I am exporting, which snapshot does export use? Is there any special action I should take? Thanks

Re: concurrency in exporting HBase contents

2010-01-22 Thread Jean-Daniel Cryans
Which kind of export are you talking about? A MapReduce or a distcp? In any case, it is very probable that your import will miss some writes unless you block them. In 0.21 this will be a lot easier using multi datacenter replication along with the ability to replay logs from one cluster to

RE: Files does not exist error: concurrency control on hive queries...

2009-09-11 Thread Ashish Thusoo
: Files does not exist error: concurrency control on hive queries... Zookeeper sounds like a decent alternative, though it would add a new dependency for deployment. Maybe we could open a jira for it first to track this issue? Thanks, Eva. On 9/9/09 2:49 PM, Prasad Chakka pcha...@facebook.com

Re: Files does not exist error: concurrency control on hive queries...

2009-09-11 Thread Eva Tse
From: Eva Tse [e...@netflix.com] Sent: Wednesday, September 09, 2009 10:45 PM To: hive-user@hadoop.apache.org Subject: Re: Files does not exist error: concurrency control on hive queries... Zookeeper sounds like a decent alternative, though it would add

Re: Files does not exist error: concurrency control on hive queries...

2009-09-10 Thread Eva Tse
but there have to be periodic cleanups (when clients die without releasing locks) etc which is hacky so less preferable. Another option is to point a ZooKeeper cluster to Hive and ask Hive to use it for locking. So those who are not concerned about concurrency control, don’t have to install ZooKeeper

Re: Files does not exist error: concurrency control on hive queries...

2009-09-10 Thread Edward Capriolo
is logical place to do locking but there have to be periodic cleanups (when clients die without releasing locks) etc which is hacky so less preferable. Another option is to point a ZooKeeper cluster to Hive and ask Hive to use it for locking. So those who are not concerned about concurrency

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Eva Tse
Prasad, We believe the problem is that one of the queries is doing an ‘insert overwrite ... select from’ which actually is deleting and merging the small files. The other query somehow couldn’t find those files that it thought it had seen before and failed. So, it looks like a concurrency issue

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Prasad Chakka
Reply-To: hive-user@hadoop.apache.org Date: Wed, 9 Sep 2009 10:19:24 -0700 To: hive-user@hadoop.apache.org Subject: Re: Files does not exist error: concurrency control on hive queries... Prasad, We believe the problem is that one of the queries is doing an ‘insert overwrite ... select from’ which

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Cliff Resnick
...@netflix.com *Reply-To: *hive-user@hadoop.apache.org *Date: *Wed, 9 Sep 2009 10:19:24 -0700 *To: *hive-user@hadoop.apache.org *Subject: *Re: Files does not exist error: concurrency control on hive queries... Prasad, We believe the problem is that one of the queries is doing an ‘insert overwrite

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Eva Tse
@hadoop.apache.org Subject: Re: Files does not exist error: concurrency control on hive queries... Prasad, We believe the problem is that one of the queries is doing an ‘insert overwrite ... select from’ which actually is deleting and merging the small files. The other query somehow couldn’t find

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Prasad Chakka
...@netflix.com Reply-To: hive-user@hadoop.apache.org Date: Wed, 9 Sep 2009 12:36:11 -0700 To: hive-user@hadoop.apache.org, Dhruba Borthakur dhr...@facebook.com Subject: Re: Files does not exist error: concurrency control on hive queries... Hi Prasad, Are you implying the expected behavior for these queries

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Eva Tse
Regardless of whether the user uses a HiveServer, looks like the logical place to do locking or concurrency control would be at the metastore DB. This is actually one big advantage of Hive. The r/w lock or access control can be achieved by a DB row with lock count for each partition, etc

Re: Files does not exist error: concurrency control on hive queries...

2009-09-09 Thread Prasad Chakka
about concurrency control, don’t have to install ZooKeeper but others can. ZooKeeper provides leases so there won’t be any problem of hanging locks and it will be easier for admins to clean it up. I suppose it depends on whoever wants to take this task up :) Prasad