[jira] Updated: (MAPREDUCE-1823) Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode

2010-11-05 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1823:
--

Status: Open  (was: Patch Available)

> Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
> -
>
> Key: MAPREDUCE-1823
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1823.txt
>
>
> RaidNode makes many calls to HarFileSystem.getFileStatus. This method 
> fetches information from the DataNode, so it is slow and has become the bottleneck of 
> the RaidNode. It would be nice if we could make this more efficient.
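One direction, shown only as a rough sketch (not the attached patch; the class and field names below are invented): memoize the FileStatus results for one traversal pass so repeated lookups of the same parity path hit the underlying (Har)FileSystem only once.

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: caches getFileStatus results for one traversal pass.
public class CachedFileStatus {
  private final FileSystem fs;
  private final Map<String, FileStatus> cache = new HashMap<String, FileStatus>();

  public CachedFileStatus(FileSystem fs) {
    this.fs = fs;
  }

  public synchronized FileStatus getFileStatus(Path p) throws IOException {
    FileStatus stat = cache.get(p.toString());
    if (stat == null) {
      stat = fs.getFileStatus(p);   // remote lookup happens only on a cache miss
      cache.put(p.toString(), stat);
    }
    return stat;
  }
}
{code}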

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1831) Delete the co-located replicas when raiding file

2010-11-05 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1831:
--

Status: Open  (was: Patch Available)

> Delete the co-located replicas when raiding file
> 
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> In raid, it is good to have the blocks on the same stripe located on 
> different machines.
> This way, when one machine is down, it does not break two blocks on the stripe.
> By doing this, we can decrease the block error probability in raid from 
> O(p^3) to O(p^4), which can be a huge improvement (where p is the replica 
> missing probability).
> One way to do this is to add a new BlockPlacementPolicy that 
> deletes the replicas that are co-located.
> Then, when raiding the file, we can make the remaining replicas live on 
> different machines.
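To illustrate only the co-location check (simplified types, not the proposed BlockPlacementPolicy): given the hosts that hold each block of a stripe, the replicas on a host that already carries another block of the same stripe are the deletion candidates.

{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: block and host identities are plain Strings here.
public class ColocatedReplicas {
  /** stripeHosts.get(i) = hosts holding replicas of block i of one stripe. */
  public static List<String> candidatesForDeletion(List<List<String>> stripeHosts) {
    Set<String> seen = new HashSet<String>();
    List<String> candidates = new ArrayList<String>();
    for (int block = 0; block < stripeHosts.size(); block++) {
      for (String host : stripeHosts.get(block)) {
        if (!seen.add(host)) {
          // this host already holds a replica of another block in the stripe
          candidates.add("block " + block + " @ " + host);
        }
      }
    }
    return candidates;
  }
}
{code}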

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1706) Log RAID recoveries on HDFS

2010-11-05 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1706:
--

Status: Patch Available  (was: Open)

> Log RAID recoveries on HDFS
> ---
>
> Key: MAPREDUCE-1706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Reporter: Rodrigo Schmidt
>Assignee: Scott Chen
> Attachments: MAPREDUCE-1706.txt
>
>
> It would be good to have a way to centralize all the recovery logs, since 
> recovery can be executed by any HDFS client. The best place to store this 
> information is HDFS itself.
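As a rough sketch of the idea (not the attached patch; the directory and record format below are invented), each client that runs a recovery could drop a small log file under a shared HDFS directory:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRecoveryLog {
  private static final Path LOG_DIR = new Path("/raid/recoverylogs"); // made-up location

  public static void logRecovery(Configuration conf, Path recoveredFile)
      throws IOException {
    FileSystem fs = LOG_DIR.getFileSystem(conf);
    // one small file per recovery, so we do not depend on HDFS append
    Path logFile = new Path(LOG_DIR,
        System.currentTimeMillis() + "-" + recoveredFile.getName());
    FSDataOutputStream out = fs.create(logFile);
    try {
      out.writeBytes("recovered " + recoveredFile + "\n");
    } finally {
      out.close();
    }
  }
}
{code}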

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1706) Log RAID recoveries on HDFS

2010-11-05 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1706:
--

Attachment: MAPREDUCE-1706.txt

> Log RAID recoveries on HDFS
> ---
>
> Key: MAPREDUCE-1706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Reporter: Rodrigo Schmidt
>Assignee: Scott Chen
> Attachments: MAPREDUCE-1706.txt
>
>
> It would be good to have a way to centralize all the recovery logs, since 
> recovery can be executed by any HDFS client. The best place to store this 
> information is HDFS itself.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again

2010-11-05 Thread Junjie Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Liang updated MAPREDUCE-1901:


Attachment: 1901.PATCH

This patch depends on HADOOP-7022 for a small tweak to the MD5Hash function, 
where we keep track of the file size when we calculate the hash of a file. It 
is also a (combined) fix for MAPREDUCE-1902.
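For reference, the signature idea boils down to something like the plain-JDK sketch below (not the HADOOP-7022 change itself): the MD5 digest of the file contents combined with the file length.

{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ContentSignature {
  public static String of(File file) throws IOException, NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    InputStream in = new FileInputStream(file);
    try {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) > 0) {
        md5.update(buf, 0, n);
      }
    } finally {
      in.close();
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md5.digest()) {
      hex.append(String.format("%02x", b));
    }
    // the file length becomes part of the cache key, as described above
    return hex.toString() + "-" + file.length();
  }
}
{code}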

> Jobs should not submit the same jar files over and over again
> -
>
> Key: MAPREDUCE-1901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Joydeep Sen Sarma
> Attachments: 1901.PATCH, 1901.PATCH
>
>
> Currently each Hadoop job uploads the required resources 
> (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in 
> executing this job would then download these resources into local disk.
> In an environment where most of the users are using a standard set of jars 
> and files (because they are using a framework like Hive/Pig) - the same jars 
> keep getting uploaded and downloaded repeatedly. The overhead of this 
> protocol (primarily in terms of end-user latency) is significant when:
> - the jobs are small (and conversely - large in number)
> - Namenode is under load (meaning hdfs latencies are high and made worse, in 
> part, by this protocol)
> Hadoop should provide a way for jobs in a cooperative environment to not 
> submit the same files over and over again. Identifying and caching execution 
> resources by a content signature (md5/sha) would be a good alternative to 
> have available.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-2167) Faster directory traversal for raid node

2010-11-05 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-2167:
---

Attachment: MAPREDUCE-2167.2.patch

I'm now using a semaphore to track the active threads. The logic is much 
simpler.
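Roughly what the semaphore buys (toy sketch, not the patch itself; names and the thread count are invented): a fixed number of permits caps the concurrent scan tasks, and re-acquiring all permits tells the caller the traversal has drained.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class BoundedScanPool {
  private static final int MAX_THREADS = 4;        // illustrative value
  private final Semaphore permits = new Semaphore(MAX_THREADS);
  private final ExecutorService pool = Executors.newFixedThreadPool(MAX_THREADS);

  public void submit(final Runnable scanTask) throws InterruptedException {
    permits.acquire();                             // blocks while all workers are busy
    pool.execute(new Runnable() {
      public void run() {
        try {
          scanTask.run();
        } finally {
          permits.release();                       // this worker is idle again
        }
      }
    });
  }

  /** Returns once no scan task is active (assuming no concurrent submits). */
  public void awaitIdle() throws InterruptedException {
    permits.acquire(MAX_THREADS);
    permits.release(MAX_THREADS);
  }
}
{code}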

> Faster directory traversal for raid node
> 
>
> Key: MAPREDUCE-2167
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2167
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Reporter: Ramkumar Vadali
>Assignee: Ramkumar Vadali
> Attachments: MAPREDUCE-2167.2.patch, MAPREDUCE-2167.patch
>
>
> The RaidNode currently iterates over the directory structure to figure out 
> which files to RAID. With millions of files, this can take a long time - 
> especially if some files are already RAIDed and the RaidNode needs to look at 
> parity files / parity file HARs to determine if the file needs to be RAIDed.
> The directory traversal is encapsulated inside the class DirectoryTraversal, 
> which examines one file at a time, using the caller's thread.
> My proposal is to make this multi-threaded as follows:
>  * use a pool of threads inside DirectoryTraversal
>  * The caller's thread is used to retrieve directories, and each new 
> directory is assigned to a thread in the pool. The worker thread examines all 
> the files in the directory.
>  * If there are sub-directories, they are added back as work items to the pool.
> Comments?
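A minimal sketch of that structure (names invented, and much simplified compared to the real DirectoryTraversal): the caller's thread hands out directories level by level, worker threads list one directory each, pass files to a callback, and report sub-directories back as work items for the next round.

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ParallelTraversalSketch {
  public interface FileVisitor { void visit(FileStatus file); }

  public static void traverse(final FileSystem fs, Path root,
      final FileVisitor visitor, int numThreads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Path> dirs = new ArrayList<Path>();
      dirs.add(root);
      while (!dirs.isEmpty()) {
        List<Future<List<Path>>> results = new ArrayList<Future<List<Path>>>();
        for (final Path dir : dirs) {
          results.add(pool.submit(new Callable<List<Path>>() {
            public List<Path> call() throws IOException {
              List<Path> subDirs = new ArrayList<Path>();
              for (FileStatus stat : fs.listStatus(dir)) {
                if (stat.isDir()) {
                  subDirs.add(stat.getPath()); // becomes a work item for the next round
                } else {
                  visitor.visit(stat);         // files are examined in the worker thread
                }
              }
              return subDirs;
            }
          }));
        }
        List<Path> next = new ArrayList<Path>();
        for (Future<List<Path>> f : results) {
          next.addAll(f.get());                // caller's thread collects sub-directories
        }
        dirs = next;
      }
    } finally {
      pool.shutdown();
    }
  }
}
{code}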

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2167) Faster directory traversal for raid node

2010-11-05 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928815#action_12928815
 ] 

Scott Chen commented on MAPREDUCE-2167:
---

We can use a BlockingQueue to make the patch simpler.
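For illustration, a BlockingQueue would make the hand-off between the caller's thread and the workers look roughly like this (simplified sketch with a poison-pill shutdown, not a review of the actual patch):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.apache.hadoop.fs.Path;

public class WorkQueueSketch {
  // sentinel Path used as a poison pill to stop the workers
  private static final Path DONE = new Path("/__done__");
  private final BlockingQueue<Path> queue = new LinkedBlockingQueue<Path>();

  public interface DirectoryProcessor { void process(Path dir); }

  /** Caller's thread: enqueue directories as they are discovered. */
  public void add(Path dir) throws InterruptedException {
    queue.put(dir);
  }

  /** Caller's thread: tell numWorkers workers that no more work is coming. */
  public void finish(int numWorkers) throws InterruptedException {
    for (int i = 0; i < numWorkers; i++) {
      queue.put(DONE);
    }
  }

  /** Worker thread body: take() blocks until a directory (or the pill) arrives. */
  public void workerLoop(DirectoryProcessor processor) throws InterruptedException {
    while (true) {
      Path dir = queue.take();
      if (dir.equals(DONE)) {
        return;
      }
      processor.process(dir);
    }
  }
}
{code}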

> Faster directory traversal for raid node
> 
>
> Key: MAPREDUCE-2167
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2167
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Reporter: Ramkumar Vadali
>Assignee: Ramkumar Vadali
> Attachments: MAPREDUCE-2167.patch
>
>
> The RaidNode currently iterates over the directory structure to figure out 
> which files to RAID. With millions of files, this can take a long time - 
> especially if some files are already RAIDed and the RaidNode needs to look at 
> parity files / parity file HARs to determine if the file needs to be RAIDed.
> The directory traversal is encapsulated inside the class DirectoryTraversal, 
> which examines one file at a time, using the caller's thread.
> My proposal is to make this multi-threaded as follows:
>  * use a pool of threads inside DirectoryTraversal
>  * The caller's thread is used to retrieve directories, and each new 
> directory is assigned to a thread in the pool. The worker thread examines all 
> the files in the directory.
>  * If there are sub-directories, they are added back as work items to the pool.
> Comments?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2172) test-patch.properties contains incorrect/version-dependent values of OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS

2010-11-05 Thread Patrick Kling (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928756#action_12928756
 ] 

Patrick Kling commented on MAPREDUCE-2172:
--

I think part of the problem is that the exact number of warnings depends on the 
version of findbugs used. Is there a specific version that every developer 
should be using?

> test-patch.properties contains incorrect/version-dependent values of 
> OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS
> --
>
> Key: MAPREDUCE-2172
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2172
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: FindBugs 1.3.4
>Reporter: Patrick Kling
>
> Running ant test-patch with an empty patch yields 25 findbugs warning and 3 
> release audit warnings (rather than the 0 findbugs warnings and 1 release 
> audit warning specified in test-patch.properties):
> {code}
> [exec] -1 overall.  
> [exec] 
> [exec] +1 @author.  The patch does not contain any @author tags.
> [exec] 
> [exec] -1 tests included.  The patch doesn't appear to include any new or 
> modified tests.
> [exec] Please justify why no new tests are needed for 
> this patch.
> [exec] Also please list what manual steps were 
> performed to verify this patch.
> [exec] 
> [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
> [exec] 
> [exec] +1 javac.  The applied patch does not increase the total number of 
> javac compiler warnings.
> [exec] 
> [exec] -1 findbugs.  The patch appears to introduce 25 new Findbugs 
> warnings.
> [exec] 
> [exec] -1 release audit.  The applied patch generated 3 release audit 
> warnings (more than the trunk's current 1 warnings).
> [exec] 
> [exec] +1 system test framework.  The patch passed system test framework 
> compile.
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2170) Send out last-minute load averages in TaskTrackerStatus

2010-11-05 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928752#action_12928752
 ] 

Scott Chen commented on MAPREDUCE-2170:
---

Hey Harsh, I think it is OK to cast it to a float.

> Send out last-minute load averages in TaskTrackerStatus
> ---
>
> Key: MAPREDUCE-2170
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2170
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Affects Versions: 0.22.0
> Environment: GNU/Linux
>Reporter: Harsh J Chouraria
>Assignee: Harsh J Chouraria
>Priority: Minor
> Fix For: 0.22.0
>
>   Original Estimate: 0.33h
>  Remaining Estimate: 0.33h
>
> Load averages could be useful in scheduling. This patch looks to extend the 
> existing Linux resource plugin (via the /proc/loadavg file) to transmit the 
> load average of the last minute via the TaskTrackerStatus.
> Patch is up for review, with test cases added, at: 
> https://reviews.apache.org/r/20/
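The /proc/loadavg parsing amounts to something like the snippet below (illustration only; the actual change lives in the Linux resource calculator plugin and is on the review link above). The first field is the one-minute load average, and the float cast discussed in the comment loses nothing meaningful.

{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class LoadAvgSketch {
  /** Returns the one-minute load average from /proc/loadavg. */
  public static float oneMinuteLoad() throws IOException {
    BufferedReader in = new BufferedReader(new FileReader("/proc/loadavg"));
    try {
      // typical contents: "0.42 0.37 0.31 1/123 4567"
      String[] fields = in.readLine().trim().split("\\s+");
      return (float) Double.parseDouble(fields[0]);
    } finally {
      in.close();
    }
  }
}
{code}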

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2172) test-patch.properties contains incorrect/version-dependent values of OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS

2010-11-05 Thread Patrick Kling (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928742#action_12928742
 ] 

Patrick Kling commented on MAPREDUCE-2172:
--

The same issue exists in hadoop-common, where test-patch on a clean checkout 
yields 2 new findbugs warnings.

> test-patch.properties contains incorrect/version-dependent values of 
> OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS
> --
>
> Key: MAPREDUCE-2172
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2172
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: FindBugs 1.3.4
>Reporter: Patrick Kling
>
> Running ant test-patch with an empty patch yields 25 findbugs warning and 3 
> release audit warnings (rather than the 0 findbugs warnings and 1 release 
> audit warning specified in test-patch.properties):
> {code}
> [exec] -1 overall.  
> [exec] 
> [exec] +1 @author.  The patch does not contain any @author tags.
> [exec] 
> [exec] -1 tests included.  The patch doesn't appear to include any new or 
> modified tests.
> [exec] Please justify why no new tests are needed for 
> this patch.
> [exec] Also please list what manual steps were 
> performed to verify this patch.
> [exec] 
> [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
> [exec] 
> [exec] +1 javac.  The applied patch does not increase the total number of 
> javac compiler warnings.
> [exec] 
> [exec] -1 findbugs.  The patch appears to introduce 25 new Findbugs 
> warnings.
> [exec] 
> [exec] -1 release audit.  The applied patch generated 3 release audit 
> warnings (more than the trunk's current 1 warnings).
> [exec] 
> [exec] +1 system test framework.  The patch passed system test framework 
> compile.
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2172) test-patch.properties contains incorrect/version-dependent values of OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS

2010-11-05 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928739#action_12928739
 ] 

Nigel Daley commented on MAPREDUCE-2172:


Giri and I worked on getting Common pre-commit testing back in shape, and that 
is now working (as of last night). I suspect MR trunk has drifted from clean 
javadoc, findbugs, and release audit warnings since its MR pre-commit testing 
was last working. We'll move on to cleaning this and HDFS up soon.

> test-patch.properties contains incorrect/version-dependent values of 
> OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS
> --
>
> Key: MAPREDUCE-2172
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2172
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: FindBugs 1.3.4
>Reporter: Patrick Kling
>
> Running ant test-patch with an empty patch yields 25 findbugs warning and 3 
> release audit warnings (rather than the 0 findbugs warnings and 1 release 
> audit warning specified in test-patch.properties):
> {code}
> [exec] -1 overall.  
> [exec] 
> [exec] +1 @author.  The patch does not contain any @author tags.
> [exec] 
> [exec] -1 tests included.  The patch doesn't appear to include any new or 
> modified tests.
> [exec] Please justify why no new tests are needed for 
> this patch.
> [exec] Also please list what manual steps were 
> performed to verify this patch.
> [exec] 
> [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
> [exec] 
> [exec] +1 javac.  The applied patch does not increase the total number of 
> javac compiler warnings.
> [exec] 
> [exec] -1 findbugs.  The patch appears to introduce 25 new Findbugs 
> warnings.
> [exec] 
> [exec] -1 release audit.  The applied patch generated 3 release audit 
> warnings (more than the trunk's current 1 warnings).
> [exec] 
> [exec] +1 system test framework.  The patch passed system test framework 
> compile.
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-2172) test-patch.properties contains incorrect/version-dependent values of OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS

2010-11-05 Thread Ramkumar Vadali (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928728#action_12928728
 ] 

Ramkumar Vadali commented on MAPREDUCE-2172:


@Nigel, both Patrick and I see ant test-patch fail on a clean checkout. We 
think this is related to HADOOP-7008. Is there some configuration we should 
change before running ant test-patch?

> test-patch.properties contains incorrect/version-dependent values of 
> OK_FINDBUGS_WARNINGS and OK_RELEASEAUDIT_WARNINGS
> --
>
> Key: MAPREDUCE-2172
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2172
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: FindBugs 1.3.4
>Reporter: Patrick Kling
>
> Running ant test-patch with an empty patch yields 25 findbugs warning and 3 
> release audit warnings (rather than the 0 findbugs warnings and 1 release 
> audit warning specified in test-patch.properties):
> {code}
> [exec] -1 overall.  
> [exec] 
> [exec] +1 @author.  The patch does not contain any @author tags.
> [exec] 
> [exec] -1 tests included.  The patch doesn't appear to include any new or 
> modified tests.
> [exec] Please justify why no new tests are needed for 
> this patch.
> [exec] Also please list what manual steps were 
> performed to verify this patch.
> [exec] 
> [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
> [exec] 
> [exec] +1 javac.  The applied patch does not increase the total number of 
> javac compiler warnings.
> [exec] 
> [exec] -1 findbugs.  The patch appears to introduce 25 new Findbugs 
> warnings.
> [exec] 
> [exec] -1 release audit.  The applied patch generated 3 release audit 
> warnings (more than the trunk's current 1 warnings).
> [exec] 
> [exec] +1 system test framework.  The patch passed system test framework 
> compile.
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-2176) ant test-patch failing on a clean checkout

2010-11-05 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali resolved MAPREDUCE-2176.


Resolution: Duplicate

Dup of MAPREDUCE-2172

> ant test-patch failing on a clean checkout
> --
>
> Key: MAPREDUCE-2176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2176
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ramkumar Vadali
>
> ant test-patch fails for a dummy patch on CHANGES.txt:
> {code}
>  [exec] 
>  [exec] -1 overall.  
>  [exec] 
>  [exec] +1 @author.  The patch does not contain any @author tags.
>  [exec] 
>  [exec] -1 tests included.  The patch doesn't appear to include any 
> new or modified tests.
>  [exec] Please justify why no new tests are 
> needed for this patch.
>  [exec] Also please list what manual steps were 
> performed to verify this patch.
>  [exec] 
>  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
>  [exec] 
>  [exec] +1 javac.  The applied patch does not increase the total 
> number of javac compiler warnings.
>  [exec] 
>  [exec] -1 findbugs.  The patch appears to introduce 13 new Findbugs 
> warnings.
>  [exec] 
>  [exec] -1 release audit.  The applied patch generated 3 release 
> audit warnings (more than the trunk's current 1 warnings).
>  [exec] 
>  [exec] +1 system test framework.  The patch passed system test 
> framework compile.
>  [exec] 
>  [exec] 
>  [exec] 
>  [exec] 
>  [exec] 
> ==
>  [exec] 
> ==
>  [exec] Finished build.
>  [exec] 
> ==
>  [exec] 
> ==
>  [exec] 
>  [exec] 
> BUILD FAILED
> /data/users/rvadali/apache/hadoop-mapred-trunk/build.xml:1740: exec returned: 
> 3
> Total time: 13 minutes 14 seconds
> Test results are in /tmp/rvadali.hadoopQA
> [rvad...@dev502 hadoop-mapred-trunk]$ svn st 
> ?  build-fi
> ?  SecurityAuth.audit
> ?  lib/jdiff/hadoop-mapred_0.22.0-SNAPSHOT.xml
> M  CHANGES.txt
> X  src/test/bin
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1861) Raid should rearrange the replicas while raiding

2010-11-05 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1861:
--

Status: Open  (was: Patch Available)

> Raid should rearrange the replicas while raiding
> 
>
> Key: MAPREDUCE-1861
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1861
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1861.txt, MAPREDUCE-1861.txt
>
>
> Raided files introduce extra dependencies among the blocks on the same stripe.
> Therefore we need a new way to place the blocks.
> It is desirable that a raided file satisfies the following two conditions:
> a. Replicas on the same stripe should be on different machines (or racks)
> b. Replicas of the same block should be on different racks
> MAPREDUCE-1831 will try to delete the replicas that are on the same stripe and the 
> same machine (a).
> But in the meantime, it will try to maintain the number of distinct racks of 
> each block (b).
> We cannot satisfy (a) and (b) at the same time with the current logic in 
> BlockPlacementPolicyDefault.chooseTarget().
> One choice we have is to change BlockPlacementPolicyDefault.chooseTarget().
> However, the current placement is in general good for all files, including the 
> unraided ones.
> It is not clear to us that we can make it good for both raided and unraided 
> files.
> So we propose this idea: when raiding the file, we create one more 
> off-rack replica (so replication=4 now).
> Then we delete two replicas using the policy in MAPREDUCE-1831 
> (replication=2 now).
> This way we can rearrange the replicas to satisfy (a) and (b) at the same 
> time.
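From the client side, the bump-then-trim sequence would look roughly like this, assuming the MAPREDUCE-1831 deletion policy is active on the NameNode (sketch only; waiting for re-replication between the two steps is omitted):

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicaRearrangeSketch {
  public static void rearrange(FileSystem fs, Path raidedFile) throws IOException {
    // step 1: request an extra replica so a copy lands on another rack (3 -> 4)
    fs.setReplication(raidedFile, (short) 4);

    // (a real implementation would wait here until replication reaches 4)

    // step 2: drop to 2; the excess-replica deletion policy from MAPREDUCE-1831
    // is expected to remove the co-located copies first, leaving replicas on
    // distinct machines and racks
    fs.setReplication(raidedFile, (short) 2);
  }
}
{code}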

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.