SecretKey in MapReduce

2012-09-20 Thread Pedro Sá da Costa
Hi, Hadoop 1.0.2 uses a SecretKey, but I don't understand its purpose. Can anyone explain what the SecretKey is for? Is this secret key shared between the JobTracker, the TaskTrackers, and the map and reduce tasks? -- Best regards,

RE: unsubscribe

2012-09-20 Thread sathyavageeswaran
granted From: shanmukhan battinapati [mailto:shanmukha...@gmail.com] Sent: 20 September 2012 15:14 To: user@hadoop.apache.org Subject: unsubscribe Thanks Regards Shanmukhan.B On Wed, Sep 19, 2012 at 8:18 AM, 黄 山 thuhuang...@gmail.com wrote: unsubscribe

unsubscribe

2012-09-20 Thread 王成
Sent from my iPhone

Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread Jason Yang
Hi all, I would like to know whether all the intermediate output with the same key goes to the same reducer. If it does, what happens when the mapper generates only two keys but the job runs 3 reducers? If not, how could I do some processing over all the

Re: Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread Hemanth Yamijala
Hi, Yes. By contract, all intermediate output with the same key goes to the same reducer. In your example, suppose that of the two keys generated by the mapper, one goes to reducer 1 and the other goes to reducer 2; reducer 3 will then have no records to process and will end without producing any
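The contract described above follows from how keys are assigned to reducers. As a minimal sketch (plain Java with no Hadoop dependency; the class and method names are hypothetical), the code below reproduces the arithmetic used by Hadoop's default HashPartitioner: the key's hash code, masked to be non-negative, modulo the number of reduce tasks. Because the result depends only on the key and the reducer count, every record with the same key lands on the same reducer, and a job with fewer distinct keys than reducers necessarily leaves some reducers without input.

```java
// Hypothetical illustration class; not part of Hadoop.
class HashPartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // mask off the sign bit, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        // Only two distinct keys, but three reducers (values are made up).
        String[] keys = {"apple", "banana", "apple", "banana", "apple"};
        int numReducers = 3;
        for (String key : keys) {
            // Every occurrence of a key prints the same reducer index.
            System.out.println(key + " -> reducer " + partitionFor(key, numReducers));
        }
    }
}
```

With two distinct keys, at most two of the three reducer indices can appear in the output; the remaining reducer runs but receives no records.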

Job failed with large volume of small data: java.io.EOFException

2012-09-20 Thread Jason Yang
Hi all, I have encountered a weird problem: I have an MR job which always fails if there is a large number of input files (e.g. 400 input files), but always succeeds if there are only a few input files (e.g. 20 input files). In this job, the map phase would read all the input files and

any way to set hadoop daemons periodicity?

2012-09-20 Thread George Kousiouris
Hi all, I have noticed that HDFS sometimes takes too long to act upon a specific change in the configuration, e.g. the replication factor. While the under/over-replicated state of a file is immediately detected, the corrective action may take some time, even if the file sizes are very

Re: Job failed with large volume of small data: java.io.EOFException

2012-09-20 Thread Bejoy Ks
Hi Jason, Are you seeing any errors in your data node logs? Specifically something like 'xceivers count exceeded'. In that case you may need to bump up the value of dfs.datanode.max.xcievers to a higher value. If not, it is possible that you are crossing the upper limit of open files on your Linux boxes

Re: IBM big insights distribution

2012-09-20 Thread Tom Deutsch
Exactly Ted, and just trying to be respectful of this being an Apache mailing list. Snarky behavior devalues everyone's efforts, so trying to head that off as it doesn't have any place on a community list or forum. From: Ted Dunning tdunn...@maprtech.com To: user@hadoop.apache.org,

Re: IBM big insights distribution

2012-09-20 Thread John McPherson
Just clarifying, because of the reference to mainframe: IBM BigInsights runs on clusters of x86 servers. John From: Michael Segel michael_se...@hotmail.com To: user@hadoop.apache.org Cc: serge.blazhiyevs...@nice.com Date: 09/20/2012 05:25 AM Subject: Re: IBM big

Re: IBM big insights distribution

2012-09-20 Thread Andy Isaacson
On Thu, Sep 20, 2012 at 5:24 AM, Michael Segel michael_se...@hotmail.com wrote: Why is it that when anyone asks a question about IBM Tom wants to take it off line? To be fair, most vendors tend to redirect distro-specific discussion to non-apache.org forums. Cloudera has a cdh-user list, MapR

Re: Will all the intermediate output with the same key go to the same reducer?

2012-09-20 Thread feng lu
Hi, "If not, how could I do some processing over the all data, like counting?" Maybe you can refer to the TeraSort example in Hadoop. It uses a partitioner that splits text keys into roughly equal partitions in a globally sorted order. On Thu, Sep 20, 2012 at 9:28 PM, Hemanth Yamijala
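The idea behind the partitioner mentioned in this reply can be sketched as range partitioning: a sorted list of split points divides the key space, and each key goes to the partition whose range contains it, so reducer 0 holds the smallest keys and the last reducer the largest. The sketch below is plain Java under stated assumptions, not TeraSort's actual code; the class name, method name, and split points are hypothetical, and a real job would derive its split points by sampling the input.

```java
import java.util.Arrays;

// Hypothetical illustration class; not the TeraSort implementation.
class RangePartitionSketch {

    // splitPoints must be sorted ascending; n split points give n + 1 partitions.
    // Keys below the first split point go to partition 0, keys at or above
    // the last split point go to partition n.
    static int partitionFor(String key, String[] splitPoints) {
        int idx = Arrays.binarySearch(splitPoints, key);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return idx >= 0 ? idx + 1 : -(idx + 1);
    }

    public static void main(String[] args) {
        // Made-up split points for three partitions.
        String[] splitPoints = {"h", "p"};
        for (String key : new String[] {"apple", "hadoop", "mapper", "zebra"}) {
            System.out.println(key + " -> partition " + partitionFor(key, splitPoints));
        }
    }
}
```

Because the assignment is monotone in the key, concatenating the reducers' sorted outputs in partition order gives a globally sorted result, which is the property the reply refers to.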