Re: Maps split size

2012-10-28 Thread Mark Olimpiati
Well, when I said I found a solution, this link was one of them :). Even though I set dfs.block.size = mapred.min.split.size = mapred.max.split.size = 14MB, the job is still running maps with 64MB! I don't see what else I can change :( Thanks, Mark On Fri, Oct 26, 2012 at 2:23 PM, Bertrand
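
For context on the three properties named in this message, here is a minimal sketch of the split-size arithmetic. It assumes the rules used by Hadoop's FileInputFormat in the 1.x line: the new API takes max(minSize, min(maxSize, blockSize)), while the old mapred API takes max(minSize, min(goalSize, blockSize)), where goalSize is the total input size divided by the requested number of map tasks. The function names are hypothetical. The block size in both rules is the one recorded for the existing file, not the current dfs.block.size setting, which only applies to files written afterwards. This is an illustration, not a diagnosis of the job in this thread.

```python
MB = 1024 * 1024

def new_api_split_size(block_size, min_size, max_size):
    # Assumed new-API rule: clamp the file's block size between min and max.
    return max(min_size, min(max_size, block_size))

def old_api_split_size(total_size, requested_maps, min_size, block_size):
    # Assumed old-API rule: the max split size setting is not consulted;
    # a goal size derived from the requested map count is used instead.
    goal_size = total_size // max(requested_maps, 1)
    return max(min_size, min(goal_size, block_size))

# A file stored with 64MB blocks; min and max split size both set to 14MB.
print(new_api_split_size(64 * MB, 14 * MB, 14 * MB) // MB)
# Same file, 1GB of input, 2 requested map tasks, old-API rule.
print(old_api_split_size(1024 * MB, 2, 14 * MB, 64 * MB) // MB)
```

Under these assumptions the first rule yields 14MB splits and the second yields 64MB splits, so which API the input format uses, and which block size the input files were written with, both matter.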

ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread lei liu
I think these methods should be idempotent: repeated calls by the same client should be harmless. Thanks, LiuLei

Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread Ted Dunning
Create cannot be idempotent because of the problem of watches and sequential files. Similarly, mkdirs, rename and delete cannot generally be idempotent. In particular applications, you might find it is OK to treat them as such, but there are definitely applications where they are not idempotent.

Re: Cloudera Certified Developer for Apache Hadoop (CCDH)

2012-10-28 Thread Terry Healy
I just completed the Cloudera Developer course last week and would highly recommend it. I have not taken the test yet, but the instructor will point out many topics that are included in the test. For resources, be sure to make use of the Cloudera University,

RE: Cluster wide atomic operations

2012-10-28 Thread David Parks
I need a unique permanent ID assigned to each new item encountered, which has a constraint that it is in the range of, let's say for simple discussion, one to one million. I suppose I could assign a range of usable IDs to each reduce task (where IDs are assigned) and keep those organized somehow
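
The idea of handing each reduce task its own range of usable IDs can be sketched as follows. This is a minimal illustration, assuming the task count is known up front and that each task is identified by a zero-based index; the function name and parameters are hypothetical and not part of any Hadoop API. It also says nothing about how unused IDs in one range would be reclaimed.

```python
def id_range_for_task(task_index, num_tasks, lo=1, hi=1_000_000):
    """Return the contiguous block of IDs reserved for one task.

    The blocks for task_index 0..num_tasks-1 are disjoint and together
    cover lo..hi exactly, so no coordination is needed while assigning.
    """
    if not 0 <= task_index < num_tasks:
        raise ValueError("task_index out of range")
    total = hi - lo + 1
    base, extra = divmod(total, num_tasks)
    # The first `extra` tasks each take one additional ID.
    start = lo + task_index * base + min(task_index, extra)
    size = base + (1 if task_index < extra else 0)
    return range(start, start + size)

# Each task draws IDs only from its own block.
for task in range(3):
    block = id_range_for_task(task, 3)
    print(task, block[0], block[-1])
```

The trade-off is that a task which sees many new items can exhaust its block while other blocks sit mostly empty, which matters when the overall range is as small as one million.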

Re: Cluster wide atomic operations

2012-10-28 Thread Michael Katzenellenbogen
Twitter's Snowflake may provide you with some inspiration: https://github.com/twitter/snowflake -Michael On Oct 28, 2012, at 9:16 PM, David Parks davidpark...@yahoo.com wrote: I need a unique permanent ID assigned to each new item encountered, which has a constraint that it is in the range of,

Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread lei liu
Thanks Ted for your reply. What is the problem of watches and sequential files? If you can describe it in detail, I can better understand the problem. 2012/10/29 Ted Dunning tdunn...@maprtech.com Create cannot be idempotent because of the problem of watches and sequential files.

Re: Cluster wide atomic operations

2012-10-28 Thread Ted Dunning
On Sun, Oct 28, 2012 at 9:15 PM, David Parks davidpark...@yahoo.com wrote: I need a unique permanent ID assigned to each new item encountered, which has a constraint that it is in the range of, let’s say for simple discussion, one to one million. Having such a limited range may require that you

Re: ClientProtocol create、mkdirs 、rename and delete methods are not Idempotent

2012-10-28 Thread Ted Dunning
Create cannot be idempotent with sequential files. Doing the same create twice creates two different files. On Sun, Oct 28, 2012 at 10:25 PM, lei liu liulei...@gmail.com wrote: Thanks Ted for your reply. What is the problem of watches and sequential files? If you can describe in
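
The point that the same create issued twice yields two different files can be shown with a toy in-memory model. This is a sketch under stated assumptions: ToyNamespace and its sequential flag are hypothetical names invented for illustration, and the code does not reproduce the HDFS ClientProtocol or any other real API.

```python
class ToyNamespace:
    """Hypothetical in-memory namespace, for illustration only."""

    def __init__(self):
        self.entries = set()
        self._counter = 0

    def create(self, path, sequential=False):
        if sequential:
            # Each call appends the next counter value to the requested path.
            name = f"{path}{self._counter:010d}"
            self._counter += 1
        else:
            name = path
            if name in self.entries:
                raise FileExistsError(name)
        self.entries.add(name)
        return name

ns = ToyNamespace()
first = ns.create("/job-", sequential=True)
# A client that never saw the first reply retries the identical request.
second = ns.create("/job-", sequential=True)
print(first, second, len(ns.entries))
```

In this model the retry leaves two entries behind, so replaying the call is not harmless. The non-sequential case differs too: the first call succeeds and the replay raises, so the client sees a different result for the same request.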