RE: Shuffle phase replication factor

2013-05-23 Thread John Lilley
Thanks, John From: erlv5...@gmail.com [mailto:erlv5...@gmail.com] On Behalf Of Kun Ling Sent: Wednesday, May 22, 2013 7:50 PM To: user Subject: Re: Shuffle phase replication factor Hi John, 1. for the number of simultaneous connection limitations. You can configure this using

Re: Shuffle phase replication factor

2013-05-23 Thread Sandy Ryza
Thanks, John From: erlv5...@gmail.com [mailto:erlv5...@gmail.com] On Behalf Of Kun Ling Sent: Wednesday, May 22, 2013 7:50 PM To: user Subject: Re: Shuffle phase replication factor Hi John, 1. for the number

RE: Shuffle phase replication factor

2013-05-22 Thread John Lilley
[mailto:k...@123.org] Sent: Tuesday, May 21, 2013 12:59 PM To: user@hadoop.apache.org Subject: Re: Shuffle phase replication factor The map output doesn't get written to HDFS. The map task writes its output to its local disk; the reduce tasks then pull the data over HTTP for further processing. Am

RE: Shuffle phase replication factor

2013-05-22 Thread John Lilley
enough to allow a server-side to disconnect at any time to free up slots and the client-side will retry the request? Thanks john From: Shahab Yunus [mailto:shahab.yu...@gmail.com] Sent: Wednesday, May 22, 2013 8:38 AM To: user@hadoop.apache.org Subject: Re: Shuffle phase replication factor

Re: Shuffle phase replication factor

2013-05-22 Thread Rahul Bhattacharjee
-side will retry the request? Thanks john From: Shahab Yunus [mailto:shahab.yu...@gmail.com] Sent: Wednesday, May 22, 2013 8:38 AM To: user@hadoop.apache.org Subject: Re: Shuffle phase replication factor As mentioned by Bertrand, Hadoop, The Definitive

RE: Shuffle phase replication factor

2013-05-22 Thread John Lilley
to the pending/failing connection attempts that exceed the limit? Thanks! john From: Rahul Bhattacharjee [mailto:rahul.rec@gmail.com] Sent: Wednesday, May 22, 2013 8:52 AM To: user@hadoop.apache.org Subject: Re: Shuffle phase replication factor There are properties/configuration to control

Re: Shuffle phase replication factor

2013-05-22 Thread Kun Ling
/failing connection attempts that exceed the limit? Thanks! john From: Rahul Bhattacharjee [mailto:rahul.rec@gmail.com] Sent: Wednesday, May 22, 2013 8:52 AM To: user@hadoop.apache.org Subject: Re: Shuffle phase replication factor

Re: Shuffle phase replication factor

2013-05-21 Thread Kai Voigt
The map output doesn't get written to HDFS. The map task writes its output to its local disk; the reduce tasks then pull the data over HTTP for further processing. On 21.05.2013 at 19:57, John Lilley john.lil...@redpoint.net wrote: When MapReduce enters “shuffle” to partition the tuples,

Re: Shuffle phase replication factor

2013-05-21 Thread Ian Wrigley
Intermediate data is written to local disk, not to HDFS. Ian. On May 21, 2013, at 1:57 PM, John Lilley john.lil...@redpoint.net wrote: When MapReduce enters “shuffle” to partition the tuples, I am assuming that it writes intermediate data to HDFS. What replication factor is used for those
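
The point made in the replies above (map output is partitioned onto the mapper's local disk and fetched by reducers, so no HDFS replication factor applies) can be illustrated with a small toy sketch. This is not Hadoop code: the function names, the file naming, and the CRC-based partitioner are all assumptions made for illustration, and a plain local file read stands in for the HTTP fetch a real reducer would perform.

```python
import os
import tempfile
import zlib
from collections import defaultdict


def partition_for(key, num_reducers):
    # Hypothetical partitioner: a deterministic hash, so the same key
    # always lands in the same reducer partition.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def write_map_output(records, num_reducers, local_dir):
    """Write one sorted file per reducer partition on local disk.

    Nothing is replicated: there is exactly one copy of each partition file.
    """
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition_for(key, num_reducers)].append((key, value))

    paths = {}
    for part in range(num_reducers):
        path = os.path.join(local_dir, "part-%d.out" % part)
        with open(path, "w", encoding="utf-8") as f:
            for key, value in sorted(buckets[part]):
                f.write("%s\t%s\n" % (key, value))
        paths[part] = path
    return paths


def fetch_partition(paths, part):
    # Stand-in for the reducer side: a real reducer would pull this
    # partition from the mapper's node over HTTP.
    with open(paths[part], encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t")) for line in f]


if __name__ == "__main__":
    local_dir = tempfile.mkdtemp()
    paths = write_map_output([("a", "1"), ("b", "2"), ("a", "3")], 2, local_dir)
    for part in sorted(paths):
        print(part, fetch_partition(paths, part))
```

Because only a single local copy of each partition exists in this model, a lost mapper node means the map task has to be re-run rather than the data being read from a replica.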