[jira] Created: (MAPREDUCE-1975) gridmix shows unnecessary InterruptedException
gridmix shows unnecessary InterruptedException -- Key: MAPREDUCE-1975 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1975 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.22.0 The following InterruptedException is seen when gridmix is run, even though the run completes successfully: 10/06/24 20:43:03 INFO gridmix.ReplayJobFactory: START REPLAY @ 11331037109 10/06/24 20:43:03 ERROR gridmix.Statistics: Statistics interrupt while waiting for polling null java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2066) at org.apache.hadoop.mapred.gridmix.Statistics$StatCollector.run(Statistics.java:190) 10/06/24 20:43:03 INFO gridmix.Gridmix: Exiting... -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1686) ClassNotFoundException for custom format classes provided in libjars
[ https://issues.apache.org/jira/browse/MAPREDUCE-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1686: --- Status: Open (was: Patch Available) Thanks Paul for the patch. Some comments on the patch: * Please remove printStackTrace() calls in catch blocks in StreamUtil. Since StreamUtil.goodClassOrNull is used to find whether the passed mapper/reducer value is class or command, we don't want to print the stacktrace. Also, see MAPREDUCE-571. * The testcase does not pass even after the fix because the path given for the jar is never built. For example, see the testjar directory in src/test/mapred/testjar and how it is built. * In the testcase, loadLibJar() and assert associated with it, seems unnecessary. > ClassNotFoundException for custom format classes provided in libjars > > > Key: MAPREDUCE-1686 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1686 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1 >Reporter: Paul Burkhardt >Priority: Minor > Attachments: HADOOP-1686.patch > > > The StreamUtil::goodClassOrNull method assumes user-provided classes have > package names and if not, they are part of the Hadoop Streaming package. For > example, using custom InputFormat or OutputFormat classes without package > names will fail with a ClassNotFound exception which is not indicative given > the classes are provided in the libjars option. Admittedly, most Java > packages should have a package name so this should rarely come up. 
> Possible resolution options: > 1) modify the error message to include the actual classname that was > attempted in the goodClassOrNull method > 2) call the Configuration::getClassByName method first and if class not found > check for default package name and try the call again > {code} > public static Class goodClassOrNull(Configuration conf, String className, > String defaultPackage) { > Class clazz = null; > try { > clazz = conf.getClassByName(className); > } catch (ClassNotFoundException cnf) { > } > if (clazz == null) { > if (className.indexOf('.') == -1 && defaultPackage != null) { > className = defaultPackage + "." + className; > try { > clazz = conf.getClassByName(className); > } catch (ClassNotFoundException cnf) { > } > } > } > return clazz; > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
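Resolution option 2 above can be sketched outside Hadoop. In this minimal sketch, `Class.forName` stands in for `Configuration::getClassByName`, and the default-package fallback only fires when the plain lookup fails; the class `ClassResolver` and its sample arguments are illustrative, not part of the Streaming code.

```java
// Sketch of the lookup order proposed in option 2: try the class name as
// given first, and only fall back to the default package when the plain
// name is unqualified and cannot be loaded. Class.forName stands in for
// Configuration.getClassByName here.
public class ClassResolver {
    public static Class<?> goodClassOrNull(String className, String defaultPackage) {
        Class<?> clazz = tryLoad(className);
        if (clazz == null && className.indexOf('.') == -1 && defaultPackage != null) {
            // Unqualified name not found: retry with the default package prefix.
            clazz = tryLoad(defaultPackage + "." + className);
        }
        return clazz;
    }

    private static Class<?> tryLoad(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException cnf) {
            return null; // callers treat null as "not a class name"
        }
    }

    public static void main(String[] args) {
        // Plain "ArrayList" fails, so the default package "java.util" is tried.
        System.out.println(goodClassOrNull("ArrayList", "java.util"));
        // A fully qualified name is resolved directly, no fallback needed.
        System.out.println(goodClassOrNull("java.lang.String", null));
        System.out.println(goodClassOrNull("NoSuchClass", null));
    }
}
```

With this ordering, a custom class shipped in libjars resolves on the first attempt, and the fallback is only a convenience for the built-in Streaming classes.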
[jira] Commented: (MAPREDUCE-1968) Deprecate GridMix v1
[ https://issues.apache.org/jira/browse/MAPREDUCE-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893054#action_12893054 ] Ranjit Mathew commented on MAPREDUCE-1968: -- Yes, that's the ultimate goal. However, GridMix v3 doesn't yet cover CPU/memory-load simulation and does not come out-of-the-box with benchmarking work-loads that can mimic those from GridMix v2. It doesn't cover compression and pipes either. > Deprecate GridMix v1 > > > Key: MAPREDUCE-1968 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1968 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: contrib/gridmix >Reporter: Ranjit Mathew > > GridMix v2 in "src/benchmarks/gridmix2" obsoletes GridMix v1 in > "src/benchmarks/gridmix". > The latter should be deprecated and then removed to reduce the clutter in the > source-tree. > One way of doing this is shown by the "hadoop" script from 0.20.xx that has > been deprecated > in favour of "mapred", for example. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1882) Use Jsch instead of Shell.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893048#action_12893048 ] Konstantin Boudnik commented on MAPREDUCE-1882: --- And for the trunk? > Use Jsch instead of Shell.java > --- > > Key: MAPREDUCE-1882 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1882 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: test > Environment: herriot framework >Reporter: Balaji Rajagopalan >Assignee: Iyappan Srinivasan > Attachments: 1882-ydist-security-patch.txt, RemoteExecution.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > In herriot (Hadoop system test case development) we often find ourselves resorting to the habit of ssh-ing to a remote node, executing a shell command, and coming back. It would be wiser to use Jsch instead of doing this through Shell.java (Hadoop code), since Jsch provides a nice Java abstraction. This JIRA will only be closed after we import Jsch into the Hadoop build system and also fix all the existing test cases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1890) Create automated test scenarios for decommissioning of task trackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893049#action_12893049 ] Konstantin Boudnik commented on MAPREDUCE-1890: --- trunk patch, please > Create automated test scenarios for decommissioning of task trackers > > > Key: MAPREDUCE-1890 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1890 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Reporter: Iyappan Srinivasan > Attachments: 1890-ydist-security-patch.txt, TestDecomissioning.patch > > > Test scenarios : > 1) Put a healthy slave task tracker in the dfs.exclude file. > 2) As a valid user, decommission a node in the cluster by issuing the > command "hadoop mradmin -refreshNodes" > 3) Make sure that the node is decommissioned. > 4) Now take the task tracker out of the file. > 5) As a valid user, again issue the command "hadoop mradmin -refreshNodes" > 6) Make sure that the node is not in the decommission list. > 7) Bring back that node. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1963) [Herriot] TaskMemoryManager should log process-tree's status while killing tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893047#action_12893047 ] Konstantin Boudnik commented on MAPREDUCE-1963: --- Please provide a patch for the trunk as well. > [Herriot] TaskMemoryManager should log process-tree's status while killing > tasks > > > Key: MAPREDUCE-1963 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1963 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1963-ydist-security.patch > > > 1. Execute a streaming job which will increase memory usage beyond configured > memory limits during the map phase. TaskMemoryManager should log the map > task's process-tree's status just before killing the task. > 2. Execute a streaming job which will increase memory usage beyond configured > memory limits during the reduce phase. TaskMemoryManager should log the > reduce task's process-tree's status just before killing the task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1958) using delegation token over hftp for long running clients (part of hdfs 1296)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1958: -- Attachment: MAPREDUCE-1958-1.patch ran tests. all passed. > using delegation token over hftp for long running clients (part of hdfs 1296) > - > > Key: MAPREDUCE-1958 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1958 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1958-1.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1974) FairScheduler can preempt the same task many times
FairScheduler can preempt the same task many times -- Key: MAPREDUCE-1974 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1974 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Scott Chen Assignee: Scott Chen In FairScheduler.preemptTasks(), tasks are collected from JobInProgress.runningMapCache. But tasks repeat multiple times in JobInProgress.runningMapCache (on rack, node and cluster). This makes FairScheduler preempt the same task many times. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
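A minimal illustration of the fix direction described above: collecting candidates into a set removes the duplicates that arise when the same task appears under the node, rack, and cluster keys. The types here are simplified stand-ins for the real JobInProgress/FairScheduler structures, not the actual Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-in for the duplicate-candidate problem: the same task id
// is cached under node, rack, and cluster keys, so flattening the cache into
// a plain list would name it three times. De-duplicating by task id before
// preempting ensures each task is considered at most once.
public class PreemptDedup {
    public static List<String> candidates(Map<String, List<String>> runningMapCache) {
        Set<String> unique = new LinkedHashSet<>(); // keeps first-seen order
        for (List<String> tasks : runningMapCache.values()) {
            unique.addAll(tasks);
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        Map<String, List<String>> cache = new LinkedHashMap<>();
        cache.put("node1", Arrays.asList("task_0001_m_000001"));
        cache.put("rack1", Arrays.asList("task_0001_m_000001"));
        cache.put("cluster", Arrays.asList("task_0001_m_000001"));
        System.out.println(candidates(cache)); // the task appears once
    }
}
```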
[jira] Updated: (MAPREDUCE-1973) Optimize input split creation
[ https://issues.apache.org/jira/browse/MAPREDUCE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Burkhardt updated MAPREDUCE-1973: -- Attachment: HADOOP-1973.patch Please review. > Optimize input split creation > - > > Key: MAPREDUCE-1973 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1973 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.20.1, 0.20.2 > Environment: Intel Nehalem cluster running Red Hat. >Reporter: Paul Burkhardt >Priority: Minor > Attachments: HADOOP-1973.patch > > > The input split returns the locations that host the file blocks in the split. > The locations are determined by the getBlockLocations method of the > filesystem client which requires a remote connection to the filesystem (i.e. > HDFS). The remote connection is made for each file in the entire input split. > For jobs with many input files the network connections dominate the cost of > writing the input split file. > A job requests a listing of the input files from the remote filesystem and > creates a FileStatus object as a handle for each file in the listing. The > FileStatus object can be imbued with the necessary host information on the > remote end and passed to the client-side in the bulk return of the listing > request. A getHosts method of the FileStatus would then return the locations > for the blocks comprising that file and eliminate the need for another trip > to the remote filesystem. > The INodeFile maintains the blocks for a file and is an obvious choice to be > the originator for the locations of that file. It is also available to the > FSDirectory which first creates the listing of FileStatus objects. We propose > that the block locations be generated by the INodeFile to instantiate the > FileStatus object during the getListing request. > Our tests demonstrated a factor of 2000 speedup for approximately 60,000 > input files. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1973) Optimize input split creation
Optimize input split creation - Key: MAPREDUCE-1973 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1973 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.2, 0.20.1 Environment: Intel Nehalem cluster running Red Hat. Reporter: Paul Burkhardt Priority: Minor The input split returns the locations that host the file blocks in the split. The locations are determined by the getBlockLocations method of the filesystem client which requires a remote connection to the filesystem (i.e. HDFS). The remote connection is made for each file in the entire input split. For jobs with many input files the network connections dominate the cost of writing the input split file. A job requests a listing of the input files from the remote filesystem and creates a FileStatus object as a handle for each file in the listing. The FileStatus object can be imbued with the necessary host information on the remote end and passed to the client-side in the bulk return of the listing request. A getHosts method of the FileStatus would then return the locations for the blocks comprising that file and eliminate the need for another trip to the remote filesystem. The INodeFile maintains the blocks for a file and is an obvious choice to be the originator for the locations of that file. It is also available to the FSDirectory which first creates the listing of FileStatus objects. We propose that the block locations be generated by the INodeFile to instantiate the FileStatus object during the getListing request. Our tests demonstrated a factor of 2000 speedup for approximately 60,000 input files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1968) Deprecate GridMix v1
[ https://issues.apache.org/jira/browse/MAPREDUCE-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892973#action_12892973 ] Owen O'Malley commented on MAPREDUCE-1968: -- I think we should remove both gridmix v1 and v2. It is pretty clear that v3 is the best so far. > Deprecate GridMix v1 > > > Key: MAPREDUCE-1968 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1968 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: contrib/gridmix >Reporter: Ranjit Mathew > > GridMix v2 in "src/benchmarks/gridmix2" obsoletes GridMix v1 in > "src/benchmarks/gridmix". > The latter should be deprecated and then removed to reduce the clutter in the > source-tree. > One way of doing this is shown by the "hadoop" script from 0.20.xx that has > been deprecated > in favour of "mapred", for example. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892971#action_12892971 ] Owen O'Malley commented on MAPREDUCE-1288: -- {quote} (2) introduce the concept of group sharing of distributed cache files so as to avoid repetitive downloads for group shared files also. This may be a complex solution after all. {quote} This would be quite complex to get right. In particular, it is difficult to determine which group should have access. If we want to improve it, I'd suggest that we use hardlinks to give each user access to a single copy of the file.. Of course you need to ensure that they do in fact have read access to the original file. *smile* > DistributedCache localizes only once per cache URI > -- > > Key: MAPREDUCE-1288 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, security, tasktracker >Affects Versions: 0.21.0 >Reporter: Devaraj Das >Priority: Critical > Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch, > MR-1288-bp20-3.patch > > > As part of the file localization the distributed cache localizer creates a > copy of the file in the corresponding user's private directory. The > localization in DistributedCache assumes the key as the URI of the cachefile > and if it already exists in the map, the localization is not done again. This > means that another user cannot access the same distributed cache file. We > should change the key to include the username so that localization is done > for every user. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
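The hardlink suggestion above can be sketched with java.nio.file. All paths here are hypothetical temp-directory stand-ins for the tracker's cache layout, and a real implementation would first verify the user's read access to the original file, as the comment notes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the hardlink approach: keep one localized copy of the cache
// file on disk and give each user a hardlink into their private directory.
// Paths are illustrative; a real tracker would check that the user can
// read the original cache file before creating the link.
public class CacheHardLink {
    public static void main(String[] args) throws IOException {
        Path sharedCopy = Files.createTempFile("cachefile", ".jar"); // the single localized copy
        Path userDir = Files.createTempDirectory("user-private");    // a user's private dir
        Path userLink = userDir.resolve("cachefile.jar");

        Files.createLink(userLink, sharedCopy); // both names refer to the same inode

        // Same content is visible under both paths, with no second copy of
        // the data on disk.
        System.out.println(Files.isSameFile(sharedCopy, userLink));
    }
}
```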
[jira] Resolved: (MAPREDUCE-1154) Large-scale, automated test framework for Map-Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Boudnik resolved MAPREDUCE-1154. --- Resolution: Duplicate This has been addressed as HADOOP-6332 and derived work. > Large-scale, automated test framework for Map-Reduce > --- > > Key: MAPREDUCE-1154 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1154 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: test >Reporter: Arun C Murthy > Attachments: testing.patch > > > HADOOP-6332 proposes a large-scale, automated, junit-based test-framework for > Hadoop. > This jira is meant to track relevant work to Map-Reduce. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1686) ClassNotFoundException for custom format classes provided in libjars
[ https://issues.apache.org/jira/browse/MAPREDUCE-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892952#action_12892952 ] Paul Burkhardt commented on MAPREDUCE-1686: --- Patch has been created and submitted. > ClassNotFoundException for custom format classes provided in libjars > > > Key: MAPREDUCE-1686 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1686 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1 >Reporter: Paul Burkhardt >Priority: Minor > Attachments: HADOOP-1686.patch > > > The StreamUtil::goodClassOrNull method assumes user-provided classes have > package names and if not, they are part of the Hadoop Streaming package. For > example, using custom InputFormat or OutputFormat classes without package > names will fail with a ClassNotFound exception which is not indicative given > the classes are provided in the libjars option. Admittedly, most Java > packages should have a package name so this should rarely come up. > Possible resolution options: > 1) modify the error message to include the actual classname that was > attempted in the goodClassOrNull method > 2) call the Configuration::getClassByName method first and if class not found > check for default package name and try the call again > {code} > public static Class goodClassOrNull(Configuration conf, String className, > String defaultPackage) { > Class clazz = null; > try { > clazz = conf.getClassByName(className); > } catch (ClassNotFoundException cnf) { > } > if (clazz == null) { > if (className.indexOf('.') == -1 && defaultPackage != null) { > className = defaultPackage + "." + className; > try { > clazz = conf.getClassByName(className); > } catch (ClassNotFoundException cnf) { > } > } > } > return clazz; > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892948#action_12892948 ] Devaraj Das commented on MAPREDUCE-1288: bq. Devaraj, this corner case is exactly what Hemanth was trying to explain earlier on this ticket, starting with comment #4 above Yeah.. i realized that.. That's the reason i stuck to this jira rather than opening a new one :-) bq. As for the approach, we have two options: (1) (this seems to be what the patch is doing) for group shared files, localize them separately for each user. This is a simple solution, but sacrifices the optimization ( may not be too bad?) Yes, I am going with this for now. If needed (after we deploy this patch on our clusters and observe), we can look at proposal (2) in your comment.. > DistributedCache localizes only once per cache URI > -- > > Key: MAPREDUCE-1288 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, security, tasktracker >Affects Versions: 0.21.0 >Reporter: Devaraj Das >Priority: Critical > Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch, > MR-1288-bp20-3.patch > > > As part of the file localization the distributed cache localizer creates a > copy of the file in the corresponding user's private directory. The > localization in DistributedCache assumes the key as the URI of the cachefile > and if it already exists in the map, the localization is not done again. This > means that another user cannot access the same distributed cache file. We > should change the key to include the username so that localization is done > for every user. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-1288: --- Attachment: MR-1288-bp20-3.patch Patch addressing Owen's comments. > DistributedCache localizes only once per cache URI > -- > > Key: MAPREDUCE-1288 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, security, tasktracker >Affects Versions: 0.21.0 >Reporter: Devaraj Das >Priority: Critical > Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch, > MR-1288-bp20-3.patch > > > As part of the file localization the distributed cache localizer creates a > copy of the file in the corresponding user's private directory. The > localization in DistributedCache assumes the key as the URI of the cachefile > and if it already exists in the map, the localization is not done again. This > means that another user cannot access the same distributed cache file. We > should change the key to include the username so that localization is done > for every user. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892916#action_12892916 ] Owen O'Malley commented on MAPREDUCE-1288: -- It looks good. I'd suggest: 1. change DistributedCache.releaseCache to pass in the current user to TrackerDistributedCacheManager.releaseCache rather than creating a new method. 2. it looks like the constructor for CacheFile can easily throw IOException instead of putting it in a RuntimeException. > DistributedCache localizes only once per cache URI > -- > > Key: MAPREDUCE-1288 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, security, tasktracker >Affects Versions: 0.21.0 >Reporter: Devaraj Das >Priority: Critical > Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch > > > As part of the file localization the distributed cache localizer creates a > copy of the file in the corresponding user's private directory. The > localization in DistributedCache assumes the key as the URI of the cachefile > and if it already exists in the map, the localization is not done again. This > means that another user cannot access the same distributed cache file. We > should change the key to include the username so that localization is done > for every user. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1969) Allow raid to use Reed-Solomon erasure codes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1969: -- Description: Currently raid uses one parity block per stripe which corrects one missing block on one stripe. Using Reed-Solomon code, we can add any number of parity blocks to tolerate more missing blocks. This way we can get a good file corrupt probability even if we set the replication to 1. Here are some simple comparisons: 1. No raid, replication = 3: File corruption probability = O(p^3), Storage space = 3x 2. Single parity raid with stripe size = 10, replication = 2: File corruption probability = O(p^4), Storage space = 2.2x 3. Reed-Solomon raid with parity size = 4 and stripe size = 10, replication = 1: File corruption probability = O(p^5), Storage space = 1.4x where p is the missing block probability. Reed-Solomon code can save lots of space without compromising the corruption probability. To achieve this, we need some changes to raid: 1. Add a block placement policy that knows about raid logic and do not put blocks on the same stripe on the same node. 2. Add an automatic block fixing mechanism. The block fixing will replace the replication of under replicated blocks. 3. Allow raid to use general erasure code. It is now hard coded using Xor. 4. Add a Reed-Solomon code implementation We are planing to use it on the older data only. Because setting replication = 1 hurts the data locality. was: Currently raid uses one parity block per stripe which corrects one missing block on one stripe. Using Reed-Solomon code, we can add any number of parity blocks to tolerate more missing blocks. This way we can get a good file corrupt probability even if we set the replication to 1. Here are some simple comparisons: 1. No raid, replication = 3: File corruption probability = O(p^3), Storage space = 3x 2. 
Signal parity raid with stripe size = 10, replication = 2: File corruption probability = O(p^4), Storage space = 2.2x 3. Reed-Solomon raid with parity size = 4 and stripe size = 10, replication = 1: File corruption probability = O(p^5), Storage space = 1.4x where p is the missing block probability. Reed-Solomon code can save lots of space without compromising the corruption probability. To achieve this, we need some changes to raid: 1. Add a block placement policy that knows about raid logic and do not put blocks on the same stripe on the same node. 2. Add an automatic block fixing mechanism. The block fixing will replace the replication of under replicated blocks. 3. Allow raid to use general erasure code. It is now hard coded using Xor. 4. Add a Reed-Solomon code implementation We are planing to use it on the older data only. Because setting replication = 1 hurts the data locality. > Allow raid to use Reed-Solomon erasure codes > > > Key: MAPREDUCE-1969 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1969 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: contrib/raid >Reporter: Scott Chen > Fix For: 0.22.0 > > > Currently raid uses one parity block per stripe which corrects one missing > block on one stripe. > Using Reed-Solomon code, we can add any number of parity blocks to tolerate > more missing blocks. > This way we can get a good file corrupt probability even if we set the > replication to 1. > Here are some simple comparisons: > 1. No raid, replication = 3: > File corruption probability = O(p^3), Storage space = 3x > 2. Single parity raid with stripe size = 10, replication = 2: > File corruption probability = O(p^4), Storage space = 2.2x > 3. Reed-Solomon raid with parity size = 4 and stripe size = 10, replication = > 1: > File corruption probability = O(p^5), Storage space = 1.4x > where p is the missing block probability. > Reed-Solomon code can save lots of space without compromising the corruption > probability. 
> To achieve this, we need some changes to raid: > 1. Add a block placement policy that knows about raid logic and does not put > blocks on the same stripe on the same node. > 2. Add an automatic block fixing mechanism. The block fixing will replace the > replication of under-replicated blocks. > 3. Allow raid to use a general erasure code. It is now hard-coded to use Xor. > 4. Add a Reed-Solomon code implementation. > We are planning to use it on the older data only, > because setting replication = 1 hurts the data locality. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
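The O(p^k) claims in the description can be checked numerically. With replication 1, a stripe of k data blocks plus m parity blocks is unrecoverable when more than m of its k+m blocks are missing, so the exact loss probability is a binomial tail whose leading term is C(k+m, m+1) p^(m+1). The sketch below is an independence-assuming model for that tail, not raid code.

```java
// Numeric check of the corruption-probability claims: with replication 1,
// a stripe of k data + m parity blocks is lost when more than m of the
// k + m blocks are missing, each independently with probability p:
//   P(loss) = sum_{i=m+1}^{k+m} C(k+m, i) p^i (1-p)^(k+m-i) = O(p^(m+1)).
public class StripeLoss {
    static double choose(int n, int r) {
        double c = 1.0;
        for (int i = 0; i < r; i++) c = c * (n - i) / (i + 1);
        return c;
    }

    static double lossProb(int k, int m, double p) {
        int n = k + m;
        double sum = 0.0;
        for (int i = m + 1; i <= n; i++) {
            sum += choose(n, i) * Math.pow(p, i) * Math.pow(1 - p, n - i);
        }
        return sum;
    }

    public static void main(String[] args) {
        double p = 1e-3; // per-block missing probability (illustrative value)
        // Reed-Solomon raid: stripe size 10, parity 4 -> leading term
        // C(14,5) p^5, i.e. O(p^5) as stated in the description.
        System.out.printf("RS(k=10,m=4): exact %.3e, leading term %.3e%n",
                lossProb(10, 4, p), choose(14, 5) * Math.pow(p, 5));
        // Single parity (k=10, m=1) at replication 1 would only be O(p^2);
        // the O(p^4) figure in the description relies on replication 2.
        System.out.printf("parity-1(k=10,m=1): exact %.3e%n", lossProb(10, 1, p));
    }
}
```

For small p the exact tail and the leading term agree closely, which is why the description can compare schemes by their leading exponents alone.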
[jira] Created: (MAPREDUCE-1972) TestUserLogCleanup test can't clean up the toBeDeleted
TestUserLogCleanup test can't clean up the toBeDeleted - Key: MAPREDUCE-1972 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1972 Project: Hadoop Map/Reduce Issue Type: Bug Environment: http://hudson.zones.apache.org/hudson/view/Mapreduce/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/300 Reporter: Giridharan Kesavan All the hudson patch test builds are failing as the Mapreduce-Patch-h4.grid.sp2.yahoo.net/trunk/build/test/logs/userlogs has a folder created by the following test which doesn't seem to have read permission. Running org.apache.hadoop.mapred.TestUserLogCleanup [exec] [junit] 2010-07-14 22:24:54,027 INFO mapred.UserLogCleaner (UserLogCleaner.java:markJobLogsForDeletion(174)) - Adding job_test_0001 for user-log deletion with retainTimeStamp:720 ... [exec] [junit] 2010-07-14 22:24:54,373 WARN util.MRAsyncDiskService (MRAsyncDiskService.java:run(214)) - Failure in deletion of toBeDeleted/2010-07-14_22-24-54.372_6 on /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h4.grid.sp2.yahoo.net/trunk/build/test/logs/userlogs with original name job_20100714203911410_0002 [exec] [junit] 2010-07-14 22:24:54,374 WARN util.MRAsyncDiskService (MRAsyncDiskService.java:run(214)) - Failure in deletion of toBeDeleted/2010-07-14_22-24-54.373_7 on /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h4.grid.sp2.yahoo.net/trunk/build/test/logs/userlogs with original name job_test_0003 [exec] [junit] 2010-07-14 22:24:54,391 WARN util.MRAsyncDiskService (MRAsyncDiskService.java:run(214)) - Failure in deletion of toBeDeleted/2010-07-14_22-24-54.372_6 on /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h4.grid.sp2.yahoo.net/trunk/build/test/logs/userlogs with original name /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h4.grid.sp2.yahoo.net/trunk/build/test/logs/userlogs/toBeDeleted/2010-07-14_22-24-54.372_6 [exec] [junit] 2010-07-14 22:24:54,405 INFO mapred.UserLogCleaner (UserLogCleaner.java:markJobLogsForDeletion(174)) - Adding job_test_0001 for user-log deletion 
with retainTimeStamp:720 .. [exec] [junit] 2010-07-14 22:24:54,441 WARN util.MRAsyncDiskService (MRAsyncDiskService.java:run(214)) - Failure in deletion of toBeDeleted/2010-07-14_22-2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1964) Running hi Ram jobs when TTs are blacklisted
[ https://issues.apache.org/jira/browse/MAPREDUCE-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1964: -- Attachment: hiRam_bList_y20_1.patch Implemented Vinay's comments. > Running hi Ram jobs when TTs are blacklisted > > > Key: MAPREDUCE-1964 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1964 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Balaji Rajagopalan > Attachments: hiRam_bList_y20.patch, hiRam_bList_y20_1.patch > > > More slots are getting reserved for HiRAM job tasks than required. > Blacklist more than 25% of TTs across the job. Run a high-RAM job. No > java.lang.RuntimeException should be displayed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1971) herriot automation system test case for verification of bug fix to jobhistory
[ https://issues.apache.org/jira/browse/MAPREDUCE-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1971: -- Attachment: concurrent_exp_y20_1.patch Implemented Iyappan's comment. > herriot automation system test case for verification of bug fix to jobhistory > - > > Key: MAPREDUCE-1971 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1971 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Environment: herriot >Reporter: Balaji Rajagopalan > Attachments: concurrent_exp_y20.patch, concurrent_exp_y20_1.patch > > > Run a few jobs and check the job history page. Job history information > should be displayed properly. > Analyze a running job. The values shown in the page should be correct. > Concurrently access jobs in the job history page. No exception should be thrown. > The developed Herriot test case accesses the job tracker directly; the JSP page access does the same.
[jira] Commented: (MAPREDUCE-1971) herriot automation system test case for verification of bug fix to jobhistory
[ https://issues.apache.org/jira/browse/MAPREDUCE-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892726#action_12892726 ] Iyappan Srinivasan commented on MAPREDUCE-1971: --- Remove the extra comment "//{" in runSleepJob. Code looks good otherwise.
[jira] Commented: (MAPREDUCE-1288) DistributedCache localizes only once per cache URI
[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892724#action_12892724 ] Vinod K V commented on MAPREDUCE-1288: -- Devaraj, this corner case is exactly what Hemanth was trying to explain earlier on this ticket, starting with comment #4 above :) As for the approach, we have two options: (1) (this seems to be what the patch is doing) for group-shared files, localize them separately for each user. This is a simple solution, but sacrifices the optimization (which may not be too bad?). (2) Introduce the concept of group sharing of distributed cache files so as to avoid repetitive downloads for group-shared files as well. This may be a complex solution after all. > DistributedCache localizes only once per cache URI > -- > > Key: MAPREDUCE-1288 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, security, tasktracker >Affects Versions: 0.21.0 >Reporter: Devaraj Das >Priority: Critical > Attachments: MR-1288-bp20-1.patch, MR-1288-bp20-2.patch > > > As part of file localization, the distributed cache localizer creates a > copy of the file in the corresponding user's private directory. The > localization in DistributedCache uses the URI of the cache file as the key, > and if the key already exists in the map, the localization is not done again. This > means that another user cannot access the same distributed cache file. We > should change the key to include the username so that localization is done > for every user.
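Option (1) above, localizing shared files separately for each user, amounts to making the username part of the localization-map key, as the issue description suggests. A minimal sketch of that keying scheme follows; the class and method names are hypothetical, not Hadoop's actual DistributedCache API, and the "download" is faked with a path string.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (assumed names): key the localization cache on (URI, user),
// so the same cache file is localized once per user instead of once globally.
public class PerUserCacheKey {
    private final Map<String, String> localized = new HashMap<>();

    private static String key(String cacheUri, String user) {
        return user + "#" + cacheUri; // username included in the key
    }

    /** Returns the localized path, "downloading" at most once per (URI, user). */
    public String localize(String cacheUri, String user) {
        return localized.computeIfAbsent(key(cacheUri, user),
                k -> "/local/" + user + "/" + Integer.toHexString(cacheUri.hashCode()));
    }
}
```

With the URI alone as the key (the bug), the second user would be handed the first user's private copy, which the permission model forbids; with the compound key, each user gets an own localization, at the cost of the shared-download optimization Vinod notes.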
[jira] Updated: (MAPREDUCE-1971) herriot automation system test case for verification of bug fix to jobhistory
[ https://issues.apache.org/jira/browse/MAPREDUCE-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1971: -- Attachment: concurrent_exp_y20.patch First patch for y20.
[jira] Created: (MAPREDUCE-1971) herriot automation system test case for verification of bug fix to jobhistory
herriot automation system test case for verification of bug fix to jobhistory - Key: MAPREDUCE-1971 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1971 Project: Hadoop Map/Reduce Issue Type: New Feature Environment: herriot Reporter: Balaji Rajagopalan Run a few jobs and check the job history page. Job history information should be displayed properly. Analyze a running job. The values shown in the page should be correct. Concurrently access jobs in the job history page. No exception should be thrown. The developed Herriot test case accesses the job tracker directly; the JSP page access does the same.
[jira] Updated: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated MAPREDUCE-1834: -- Status: Patch Available (was: Open) > TestSimulatorDeterministicReplay times out on trunk > -- > > Key: MAPREDUCE-1834 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1834 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0 >Reporter: Amareshwari Sriramadasu >Assignee: Hong Tang > Attachments: MAPREDUCE-1834.patch, mr-1834-20100727.patch, > TestSimulatorDeterministicReplay.log > > > TestSimulatorDeterministicReplay times out on trunk. > See the Hudson patch build: > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/216/testReport/org.apache.hadoop.mapred/TestSimulatorDeterministicReplay/testMain/
[jira] Updated: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated MAPREDUCE-1834: -- Status: Open (was: Patch Available)
[jira] Updated: (MAPREDUCE-1253) Making Mumak work with Capacity-Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1253: - Priority: Major (was: Minor) > Making Mumak work with Capacity-Scheduler > - > > Key: MAPREDUCE-1253 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1253 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Anirban Dasgupta >Assignee: Anirban Dasgupta > Attachments: MAPREDUCE-1253-20100406.patch, > MAPREDUCE-1253-20100726-2.patch > > Original Estimate: 672h > Remaining Estimate: 672h > > In order to make the capacity-scheduler work in the mumak simulation > environment, we have to replace the job-initialization threads of the > capacity scheduler with classes that perform event-based initialization. We > propose to use aspectj to disable the threads of the JobInitializationPoller > class used by the Capacity Scheduler, and then perform the corresponding > initialization tasks through a simulation job-initialization class that > receives periodic wake-up calls from the simulator engine.
[jira] Commented: (MAPREDUCE-1253) Making Mumak work with Capacity-Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892690#action_12892690 ] Hong Tang commented on MAPREDUCE-1253: -- All mumak unit tests passed on my local machine after applying patch mr-1834-20100727.patch from MAPREDUCE-1834.
[jira] Updated: (MAPREDUCE-1253) Making Mumak work with Capacity-Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1253: - Attachment: MAPREDUCE-1253-20100726-2.patch
[jira] Commented: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892684#action_12892684 ] Hong Tang commented on MAPREDUCE-1834: -- It is a bit tricky to intercept the constructor of ConcurrentHashMap because there is no ConcurrentLinkedHashMap. I implemented a fake concurrent hash map which does not support concurrency (OK because mumak runs as a single thread) and uses a LinkedHashMap for internal storage.
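The idea in the comment above can be illustrated with a sketch: a map that exposes the ConcurrentMap interface but is backed by a LinkedHashMap, so iteration order is deterministic (insertion order) and replay becomes reproducible. The class below is a hypothetical reconstruction, not the actual patch; as the comment says, it provides no real concurrency and is only safe where a single thread touches the map, as in mumak.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;

// Sketch (assumed name): ConcurrentMap facade over a LinkedHashMap.
// Deterministic iteration order; NOT thread-safe.
public class FakeConcurrentMap<K, V> extends AbstractMap<K, V>
        implements ConcurrentMap<K, V> {
    private final LinkedHashMap<K, V> backing = new LinkedHashMap<>();

    @Override public Set<Entry<K, V>> entrySet() { return backing.entrySet(); }
    @Override public V put(K k, V v) { return backing.put(k, v); }

    // The four ConcurrentMap operations, implemented without any locking.
    @Override public V putIfAbsent(K k, V v) {
        return backing.containsKey(k) ? backing.get(k) : backing.put(k, v);
    }
    @Override public boolean remove(Object k, Object v) {
        if (backing.containsKey(k) && Objects.equals(backing.get(k), v)) {
            backing.remove(k);
            return true;
        }
        return false;
    }
    @Override public boolean replace(K k, V oldV, V newV) {
        if (backing.containsKey(k) && Objects.equals(backing.get(k), oldV)) {
            backing.put(k, newV);
            return true;
        }
        return false;
    }
    @Override public V replace(K k, V v) {
        return backing.containsKey(k) ? backing.put(k, v) : null;
    }
}
```

Because ConcurrentHashMap makes no iteration-order guarantee, two otherwise identical simulator runs can visit entries in different orders; substituting an insertion-ordered backing map is one way to remove that source of nondeterminism.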
[jira] Updated: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1834: - Status: Patch Available (was: Open)
[jira] Updated: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1834: - Attachment: mr-1834-20100727.patch
[jira] Assigned: (MAPREDUCE-1834) TestSimulatorDeterministicReplay times out on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang reassigned MAPREDUCE-1834: Assignee: Hong Tang