[jira] [Commented] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254647#comment-13254647 ]

Amar Kamat commented on MAPREDUCE-4149:
---
+1 for the branch-1 patch.

Rumen fails to parse certain counter strings
Key: MAPREDUCE-4149
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 4149.branch-1.v1.patch, 4149.branch-1.v2.patch, 4149.patch

If a counter name contains { or }, Rumen is not able to parse it and throws ParseException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253264#comment-13253264 ]

Amar Kamat commented on MAPREDUCE-4149:
---
Ravi, let 'UserCounterMapper' extend 'IdentityMapper'. That way, you can set our special counters in UserCounterMapper's map() API and then invoke IdentityMapper's map() API. Thoughts?

Rumen fails to parse certain counter strings
Key: MAPREDUCE-4149
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 4149.branch-1.v1.patch, 4149.patch

If a counter name contains { or }, Rumen is not able to parse it and throws ParseException.
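Editorial illustration for MAPREDUCE-4149: a counter name containing { or } breaks parsing presumably because braces also act as delimiters in the serialized counter string. A minimal sketch of one common remedy, backslash-escaping delimiter characters when writing and reversing that when reading, is shown below. The class CounterNameEscaper, its method names, and the exact set of special characters are assumptions made for this example; they are not Rumen's or Hadoop's API, and the attached patches may solve the problem differently.

```java
// Hypothetical helper, not part of Rumen or Hadoop.
final class CounterNameEscaper {
    // Assumed delimiter set for the illustration, plus the escape character itself.
    private static final String SPECIAL = "{}()[]\\";

    /** Prefixes every special character with a backslash. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Reverses escape(): a backslash means "take the next character literally". */
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                c = escaped.charAt(++i);
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With a scheme like this, a parser only treats an unescaped brace as a delimiter, so a name such as my{counter} survives a write/read round trip.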
[jira] [Commented] (MAPREDUCE-4087) [Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242252#comment-13242252 ]

Amar Kamat commented on MAPREDUCE-4087:
---
Looks good to me. +1

[Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases
Key: MAPREDUCE-4087
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4087
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 4087.patch

In the map() method of the GenerateDistCacheData job of Gridmix, val.setSize() is called every time, based on the bytes to be written to a distributed cache file. When data is then written to the next distributed cache file in the same map task, the size of the random data generated in each iteration can become small, depending on the particular case. This can make dist cache data generation slow.
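Editorial illustration for MAPREDUCE-4087: the slowdown is easiest to see with numbers. The sketch below assumes one plausible reading of the description, namely that the buffer size is shrunk to the bytes remaining for the current file and is not restored before the next file starts. The class ChunkSizing, the constant MAX_CHUNK, and both method names are hypothetical; this is not the Gridmix code and only counts write iterations.

```java
// Hypothetical model of the write loop; it counts iterations instead of writing data.
final class ChunkSizing {
    static final int MAX_CHUNK = 1024; // assumed buffer capacity in bytes

    /** Assumed buggy behaviour: the chunk size only shrinks and carries over to the next file. */
    static int writesWithStickySize(long[] fileSizes) {
        int chunk = MAX_CHUNK;
        int writes = 0;
        for (long size : fileSizes) {
            long remaining = size;
            while (remaining > 0) {
                chunk = (int) Math.min(chunk, remaining); // shrinks, never restored
                remaining -= chunk;
                writes++;
            }
        }
        return writes;
    }

    /** Alternative: recompute the chunk from the full capacity on every iteration. */
    static int writesWithResetSize(long[] fileSizes) {
        int writes = 0;
        for (long size : fileSizes) {
            long remaining = size;
            while (remaining > 0) {
                int chunk = (int) Math.min(MAX_CHUNK, remaining);
                remaining -= chunk;
                writes++;
            }
        }
        return writes;
    }
}
```

For two files of 1030 and 4096 bytes, the sticky variant ends the first file with a 6-byte chunk and then needs 683 iterations for the second file (685 in total), while the reset variant needs 6 iterations in total.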
[jira] [Commented] (MAPREDUCE-3953) Gridmix throws NPE and does not simulate a job if the trace contains null taskStatus for a task
[ https://issues.apache.org/jira/browse/MAPREDUCE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231844#comment-13231844 ]

Amar Kamat commented on MAPREDUCE-3953:
---
+1. Patch looks good to me. Not sure why the patch failed. Ravi, can you kindly update the test-patch and junit test results?

Gridmix throws NPE and does not simulate a job if the trace contains null taskStatus for a task
Key: MAPREDUCE-3953
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3953
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3953.v1.patch

In a trace file, if a succeeded job contains a failed task, then that task's taskStatus will be null. This causes an NPE in Gridmix, and Gridmix then ignores such jobs for simulation. The job could succeed even with failed tasks if the job submitter in the original cluster configured that job to tolerate failures using mapreduce.map.failures.maxpercent and mapreduce.reduce.failures.maxpercent.
[jira] [Commented] (MAPREDUCE-4013) Reduce task gets stuck when a M/R job is configured to tolerate failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230047#comment-13230047 ]

Amar Kamat commented on MAPREDUCE-4013:
---
BTW, the job has some failed map tasks.

Reduce task gets stuck when a M/R job is configured to tolerate failures
Key: MAPREDUCE-4013
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4013
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Amar Kamat
Priority: Blocker
Labels: shuffle
Fix For: 0.24.0

When a M/R job is configured to run with some tolerance to task failures (via mapreduce.map.failures.maxpercent), then the reduce task of that job gets stuck in the shuffle phase.
[jira] [Commented] (MAPREDUCE-3829) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
[ https://issues.apache.org/jira/browse/MAPREDUCE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227421#comment-13227421 ]

Amar Kamat commented on MAPREDUCE-3829:
---
Looks good to me. +1

[Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
Key: MAPREDUCE-3829
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3829
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3829.v0.patch, 3829.v1.3.patch, 3829.v1.patch, 3829.v2.patch

Instead of throwing exception messages on to the console, Gridmix should give better error message when input-data directory already exists and -generate option is given.
[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1300#comment-1300 ]

Amar Kamat commented on MAPREDUCE-2722:
---
+1. Looks good to me.

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
Key: MAPREDUCE-2722
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 2722.v1.patch, 2722.v2.1.patch, MR2722.patch

When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix.
[jira] [Commented] (MAPREDUCE-3829) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
[ https://issues.apache.org/jira/browse/MAPREDUCE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220692#comment-13220692 ]

Amar Kamat commented on MAPREDUCE-3829:
---
Thanks Ravi for the patch. Few comments:
# It would be nice to move the FileSystem, size and path checks to the writeInputData() API. This way you can test this API and the current fix via JUnit tests.
# Add JUnit tests to test
## writeInputData() w.r.t zero-data size, missing input dir
## 777 permissions on io-path.

[Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
Key: MAPREDUCE-3829
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3829
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3829.v0.patch, 3829.v1.patch

Instead of throwing exception messages on to the console, Gridmix should give better error message when input-data directory already exists and -generate option is given.
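Editorial illustration for MAPREDUCE-3829: the review comment asks for the size and path checks to live in one testable place. A rough sketch of that idea is given below, using java.io.File as a stand-in for Hadoop's FileSystem so that it runs without a cluster. The class InputDataCheck, the method validate(), the "input" subdirectory name, and the message texts are all assumptions made for the example and are not the code from the attached patches.

```java
import java.io.File;

// Hypothetical pre-flight check; real Gridmix code would go through Hadoop's FileSystem API.
final class InputDataCheck {
    /**
     * Returns null when data generation may proceed, otherwise a readable
     * error message that can be printed instead of a stack trace.
     */
    static String validate(File ioDir, long genBytes) {
        if (genBytes <= 0) {
            return "Invalid data size " + genBytes + ": -generate needs a positive size.";
        }
        // Assumption for the sketch: generated input lives in an "input" subdirectory.
        File inputDir = new File(ioDir, "input");
        if (inputDir.exists()) {
            return "Input directory " + inputDir + " already exists;"
                + " remove it or run without -generate.";
        }
        return null;
    }
}
```

Because validate() has no side effects, a JUnit test can cover the zero-size case and the existing-directory case by calling it against a temporary directory.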
[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220701#comment-13220701 ]

Amar Kamat commented on MAPREDUCE-2722:
---
Ravi, compression-emulation is a feature having 3 parts
# Input compression emulation
# Intermediate compression emulation
# Output compression emulation

Intermediate and output compression emulation happen only when the compression-emulation feature is turned on and the job's config has those parameters set. For input compression, Gridmix relies on 'mapred.input.dir'. Input compression emulation will be attempted only if there are compressed input files. Scale the input-data-size field only if input-compression-emulation is desired.

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
Key: MAPREDUCE-2722
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 2722.v1.patch, MR2722.patch

When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix.
[jira] [Commented] (MAPREDUCE-3926) No information of unfinished map task in Job History, if all attempts of another map task fail.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217228#comment-13217228 ]

Amar Kamat commented on MAPREDUCE-3926:
---
Mitesh, I guess adding this to 0.20.205 might involve a lot of changes. Also, the JT has no information about the running tasks, i.e. they could in fact be RUNNING, KILLED, FAILED, PENDING etc. Note that this can happen for SUCCESSFUL jobs too. The job can still complete/finish while the speculative tasks are running. In such cases, there is no information about the speculative tasks logged in the job history. This can surely be fixed in trunk.

No information of unfinished map task in Job History, if all attempts of another map task fail.
Key: MAPREDUCE-3926
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3926
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.205.0
Reporter: Mitesh Singh Jat
Priority: Minor

No information of unfinished map task in Job History, if all attempts of another map task fail. For example,
1. The first map task's first attempt m_00_0 was making progress.
2. The second map task failed 4 times, before completion of the first map task attempt.
3. Hence, a job cleanup task was launched and completed, before completion of the first map task attempt.
4. After the job cleanup task, runningMapCache is cleaned
{noformat}
completedTask() - jobComplete() - garbageCollect() - this.runningMapCache = null;
               |- retireMap() - if (runningMapCache == null) Running cache for maps missing!! Job details are missing.
{noformat}
5. Hence, the error "Running cache for maps missing!! Job details are missing." is raised (from retireMap(), which is called after jobComplete()), and no further information is added to Job History. Therefore, the first map task's information is missing from the Job History page.

I have created a sample streaming MR job to reproduce this issue.
{code:title=mapper.sh}
#!/bin/bash
read line
if [[ $line == sleep ]]
then
  for i in 1 2 3
  do
    echo Sleeping 2
    sleep 5
  done
  exit 0
else
  echo Exiting 2
  exit -1
fi
{code}
Input file: in1.txt is for the long running map task (here first map task)
{code:title=/user/mitesh/input/in1.txt}
sleep
{code}
Input file: in2.txt is for the failing map task (here second map task)
{code:title=/user/mitesh/input/in2.txt}
exit
{code}
Running the sample streaming MR job.
{noformat}
$ hadoop fs -rmr -skipTrash xyz
$ hadoop fs -jar $HADOOP_INSTALL/hadoop-streaming.jar -Dmapred.map.max.attempts=2 -Dmapred.min.split.size=7 -Dmapred.map.tasks=2 -mapper mapper.sh -file mapper.sh -reducer NONE -input /user/mitesh/input/in1.txt -input /user/mitesh/input/in2.txt -output xyz
{noformat}
Job History web UI
{noformat}
Hadoop Job job_201201310454_542302 on History Viewer
User: mitesh
JobName: streamjob7439640883203077520.jar
JobConf: hdfs://nn:port/user/mitesh/.staging/job_201201310454_542302/job.xml
Job-ACLs:
  mapreduce.job.acl-view-job: No users are allowed
  mapreduce.job.acl-modify-job: No users are allowed
Submitted At: 27-Feb-2012 12:56:02
Launched At: 27-Feb-2012 12:56:11 (8sec)
Finished At: 27-Feb-2012 12:56:31 (20sec)
Status: FAILED
Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201201310454_542302_m_01
Analyse This Job

Kind     Total Tasks(successful+failed+killed)  Successful tasks  Failed tasks  Killed tasks  Start Time            Finish Time
Setup    1                                      1                 0             0             27-Feb-2012 12:56:12  27-Feb-2012 12:56:16 (4sec)
Map      2                                      0                 2             0             27-Feb-2012 12:56:16  27-Feb-2012 12:56:26 (10sec)
Reduce   0                                      0                 0             0
Cleanup  1                                      1                 0             0             27-Feb-2012 12:56:26  27-Feb-2012 12:56:31 (4sec)
{noformat}
The above shows only 2 failed tasks (both belonging to the second map task). The task tracker of the first map task can be found only from the JT logs.
[jira] [Commented] (MAPREDUCE-3925) [Gridmix] Gridmix stress mode should be queue aware
[ https://issues.apache.org/jira/browse/MAPREDUCE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217043#comment-13217043 ]

Amar Kamat commented on MAPREDUCE-3925:
---
One way of doing this would be to pre-process the trace and break it down into 'n' sub-traces, i.e. 1 trace per queue. Similar to how the stress mode stresses the entire cluster, Gridmix can now run 'n' stress threads (one thread per queue) such that each of them will consume a sub-trace and stress the corresponding queue. This way, if Gridmix can make sure that each of the queues is sufficiently loaded, then we can safely assume that the cluster is sufficiently busy.

Note-1: If the queue-level sub-trace is not capable of stressing the queue (due to small jobs or few jobs), then other stress threads should make sure they overload their queues sufficiently to use up the free slots.
Note-2: For a single queue, Gridmix should default to the current behavior.

Thoughts?

[Gridmix] Gridmix stress mode should be queue aware
Key: MAPREDUCE-3925
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3925
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Labels: gridmix, multi-queue, stress
Fix For: 0.24.0

Currently, the Gridmix stress mode submits jobs in the same order as seen in the trace. When Gridmix is configured to run with multiple queues, the stress mode might end up queuing a lot of jobs in a single queue without really stressing the entire cluster. The goal is to make sure that each queue is loaded, thus keeping the entire cluster busy.
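Editorial illustration for MAPREDUCE-3925: the pre-processing step proposed in the comment, breaking one trace into one sub-trace per queue, could look roughly like the sketch below. QueueSplitter and TraceJob are hypothetical names introduced for this example; the real trace entries come from Rumen and carry far more fields. Each resulting sub-trace keeps the original submission order, and one stress thread per map entry would then consume it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "one sub-trace per queue" pre-processing step.
final class QueueSplitter {
    /** Stand-in for a trace entry; only the fields the sketch needs. */
    static final class TraceJob {
        final String id;
        final String queue;

        TraceJob(String id, String queue) {
            this.id = id;
            this.queue = queue;
        }
    }

    /**
     * Groups jobs by queue name. LinkedHashMap keeps queues in first-seen
     * order, and each list keeps the jobs in their original trace order.
     */
    static Map<String, List<TraceJob>> splitByQueue(List<TraceJob> trace) {
        Map<String, List<TraceJob>> subTraces = new LinkedHashMap<>();
        for (TraceJob job : trace) {
            subTraces.computeIfAbsent(job.queue, q -> new ArrayList<>()).add(job);
        }
        return subTraces;
    }
}
```

With a single queue the map has one entry, which matches Note-2: the behavior degenerates to the current single-stream stress mode.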
[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214532#comment-13214532 ]

Amar Kamat commented on MAPREDUCE-2722:
---
Changes look good to me. +1. Is it possible to add a JUnit?

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
Key: MAPREDUCE-2722
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: MR2722.patch

When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix.
[jira] [Commented] (MAPREDUCE-3829) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
[ https://issues.apache.org/jira/browse/MAPREDUCE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214551#comment-13214551 ]

Amar Kamat commented on MAPREDUCE-3829:
---
Ravi, should we reuse the 'STARTUP_FAILED_ERROR' in DistributedCacheEmulator? LOG statements should point to the real cause of the error. Let's try to keep all the error codes in one place, i.e. Gridmix.java. Other changes look good to me.

[Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
Key: MAPREDUCE-3829
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3829
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3829.v0.patch

Instead of throwing exception messages on to the console, Gridmix should give better error message when input-data directory already exists and -generate option is given.
[jira] [Commented] (MAPREDUCE-3757) Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts
[ https://issues.apache.org/jira/browse/MAPREDUCE-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209271#comment-13209271 ]

Amar Kamat commented on MAPREDUCE-3757:
---
Looks good to me. +1.

Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts
Key: MAPREDUCE-3757
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3757
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3757.v0.patch, 3757.v1.1.patch, 3757.v1.patch

Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts when it is adjusting the attempt-start-time and attempt-finish-time. This is leading to wrong values which are greater than the attempt-finish-time in trace file.
[jira] [Commented] (MAPREDUCE-3757) Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts
[ https://issues.apache.org/jira/browse/MAPREDUCE-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204365#comment-13204365 ]

Amar Kamat commented on MAPREDUCE-3757:
---
Looks good to me. +1

Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts
Key: MAPREDUCE-3757
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3757
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3757.v0.patch

Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts when it is adjusting the attempt-start-time and attempt-finish-time. This is leading to wrong values which are greater than the attempt-finish-time in trace file.
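Editorial illustration for MAPREDUCE-3757: the invariant at stake is that a reduce attempt's four timestamps stay ordered (start, shuffleFinished, sortFinished, finish) after folding. A minimal sketch, assuming the adjustment is a linear mapping of the form (t - origin) * scale, applies that same mapping to all four fields. The class ReduceAttemptTimes, the method adjust(), and the linear form are assumptions for this example; Rumen's Folder may compute its adjustment differently.

```java
// Hypothetical value class; not Rumen's LoggedTaskAttempt.
final class ReduceAttemptTimes {
    final long start;
    final long shuffleFinished;
    final long sortFinished;
    final long finish;

    ReduceAttemptTimes(long start, long shuffleFinished, long sortFinished, long finish) {
        this.start = start;
        this.shuffleFinished = shuffleFinished;
        this.sortFinished = sortFinished;
        this.finish = finish;
    }

    /**
     * Applies the same mapping to every timestamp. Because the mapping is
     * monotonic for scale > 0, the original ordering of the fields is kept.
     */
    ReduceAttemptTimes adjust(long origin, double scale) {
        return new ReduceAttemptTimes(
            map(start, origin, scale),
            map(shuffleFinished, origin, scale),
            map(sortFinished, origin, scale),
            map(finish, origin, scale));
    }

    private static long map(long t, long origin, double scale) {
        return Math.round((t - origin) * scale);
    }
}
```

Adjusting only start and finish while leaving the two intermediate fields untouched is exactly what produces shuffleFinished and sortFinished values greater than the attempt finish time.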
[jira] [Commented] (MAPREDUCE-3806) [Gridmix] TestGridmixSubmission fails due to incorrect version of jackson
[ https://issues.apache.org/jira/browse/MAPREDUCE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203315#comment-13203315 ]

Amar Kamat commented on MAPREDUCE-3806:
---
I found an interesting link on this issue. Read http://stackoverflow.com/questions/6537287/jersey-and-jackson-maven-dependency-issues. It seems like Ravi replaced all the jackson-1.7.1 jars with jackson-1.8.8 in the {{build/ivy/lib/gridmix/test}} folder and found that this test passes. Actually, he copied jackson-1.8.8 as jackson1.7.1.

[Gridmix] TestGridmixSubmission fails due to incorrect version of jackson
Key: MAPREDUCE-3806
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3806
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Labels: error, gridmix, junit
Fix For: 0.24.0

{{TestGridmixSubmission}} fails with the following error
{code}
org.codehaus.jackson.type.JavaType.isMapLikeType()Z
java.lang.NoSuchMethodError: org.codehaus.jackson.type.JavaType.isMapLikeType()Z
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createDeserializer(StdDeserializerProvider.java:374)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCache2(StdDeserializerProvider.java:307)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCacheValueDeserializer(StdDeserializerProvider.java:287)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findValueDeserializer(StdDeserializerProvider.java:136)
	at org.codehaus.jackson.map.deser.StdDeserializer.findDeserializer(StdDeserializer.java:551)
	at org.codehaus.jackson.map.deser.BeanDeserializer.resolve(BeanDeserializer.java:268)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._resolveDeserializer(StdDeserializerProvider.java:404)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCache2(StdDeserializerProvider.java:349)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCacheValueDeserializer(StdDeserializerProvider.java:287)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findValueDeserializer(StdDeserializerProvider.java:136)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findTypedValueDeserializer(StdDeserializerProvider.java:157)
	at org.codehaus.jackson.map.ObjectMapper._findRootDeserializer(ObjectMapper.java:2468)
	at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2383)
	at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1094)
	at org.apache.hadoop.tools.rumen.JsonObjectMapperParser.getNext(JsonObjectMapperParser.java:84)
	at org.apache.hadoop.tools.rumen.ZombieJobProducer.getNextJob(ZombieJobProducer.java:117)
	at org.apache.hadoop.tools.rumen.ZombieJobProducer.getNextJob(ZombieJobProducer.java:29)
	at org.apache.hadoop.mapred.gridmix.TestGridmixSubmission.testTraceReader(TestGridmixSubmission.java:440)
{code}
[jira] [Commented] (MAPREDUCE-3806) [Gridmix] TestGridmixSubmission fails due to incorrect version of jackson
[ https://issues.apache.org/jira/browse/MAPREDUCE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200317#comment-13200317 ]

Amar Kamat commented on MAPREDUCE-3806:
---
Gridmix depends on {{jackson 1.8}}. This dependency was introduced by MAPREDUCE-778. It seems like another version of jackson (probably jackson 1.7) is added to the classpath and is taking precedence. A quick search for this jackson version points to the {{hadoop-yarn}} folder. But I don't see any mention of jackson in {{hadoop-yarn}}.

[Gridmix] TestGridmixSubmission fails due to incorrect version of jackson
Key: MAPREDUCE-3806
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3806
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Affects Versions: 0.24.0
Reporter: Amar Kamat
Labels: error, gridmix, junit
Fix For: 0.24.0

{{TestGridmixSubmission}} fails with the following error
{code}
org.codehaus.jackson.type.JavaType.isMapLikeType()Z
java.lang.NoSuchMethodError: org.codehaus.jackson.type.JavaType.isMapLikeType()Z
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createDeserializer(StdDeserializerProvider.java:374)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCache2(StdDeserializerProvider.java:307)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCacheValueDeserializer(StdDeserializerProvider.java:287)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findValueDeserializer(StdDeserializerProvider.java:136)
	at org.codehaus.jackson.map.deser.StdDeserializer.findDeserializer(StdDeserializer.java:551)
	at org.codehaus.jackson.map.deser.BeanDeserializer.resolve(BeanDeserializer.java:268)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._resolveDeserializer(StdDeserializerProvider.java:404)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCache2(StdDeserializerProvider.java:349)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider._createAndCacheValueDeserializer(StdDeserializerProvider.java:287)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findValueDeserializer(StdDeserializerProvider.java:136)
	at org.codehaus.jackson.map.deser.StdDeserializerProvider.findTypedValueDeserializer(StdDeserializerProvider.java:157)
	at org.codehaus.jackson.map.ObjectMapper._findRootDeserializer(ObjectMapper.java:2468)
	at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2383)
	at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1094)
	at org.apache.hadoop.tools.rumen.JsonObjectMapperParser.getNext(JsonObjectMapperParser.java:84)
	at org.apache.hadoop.tools.rumen.ZombieJobProducer.getNextJob(ZombieJobProducer.java:117)
	at org.apache.hadoop.tools.rumen.ZombieJobProducer.getNextJob(ZombieJobProducer.java:29)
	at org.apache.hadoop.mapred.gridmix.TestGridmixSubmission.testTraceReader(TestGridmixSubmission.java:440)
{code}
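Editorial illustration for MAPREDUCE-3806: a NoSuchMethodError like the one above usually means an older copy of a class won the classpath race. One generic way to find out which jar a class was actually loaded from is to ask its ProtectionDomain for the CodeSource, as sketched below. The class WhichJar and the method locate() are hypothetical names for this example; only standard JDK calls are used, and nothing here is specific to Gridmix or jackson.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper built only on standard JDK APIs.
final class WhichJar {
    /** Returns the location a class was loaded from, or a short explanation if unavailable. */
    static String locate(String className) {
        try {
            // initialize=false: we only want to locate the class, not run its static initializers.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }
}
```

Calling WhichJar.locate("org.codehaus.jackson.type.JavaType") from inside the failing test's classpath would show which jackson jar takes precedence there.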
[jira] [Commented] (MAPREDUCE-3620) GrimdMix Stats at the end of GridMix are not reported correctly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198574#comment-13198574 ]

Amar Kamat commented on MAPREDUCE-3620:
---
The issue occurs due to lost jobs. In case of lost jobs, Gridmix statistics is not informed, resulting in missing events.

GrimdMix Stats at the end of GridMix are not reported correctly
Key: MAPREDUCE-3620
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3620
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Amar Kamat

Courtesy [~vinaythota]
{quote}
Job trace contains 1205 jobs and Gridmix start processing 1200 jobs after processing. However, after completion of gridmix run, execution summary details, it showed 1196 jobs are processed and remaining 4 jobs are missing. One log shows 1196 jobs processed and another log shows 1200 jobs are processed.
{quote}
[jira] [Commented] (MAPREDUCE-3597) Provide a way to access other info of history file from Rumentool
[ https://issues.apache.org/jira/browse/MAPREDUCE-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196816#comment-13196816 ]

Amar Kamat commented on MAPREDUCE-3597:
---
+1. Looks good to me.

Provide a way to access other info of history file from Rumentool
Key: MAPREDUCE-3597
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3597
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tools/rumen
Affects Versions: 0.24.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Fix For: 0.24.0
Attachments: 3597.branch-0.23.patch, 3597.branch-1.v1.patch, 3597.branch-1.v2.1.patch, 3597.branch-1.v2.2.patch, 3597.branch-1.v2.patch, 3597.v0.patch, 3597.v1.patch, MAPREDUCE-3597_branch-0.23.patch

As the trace file generated by Rumen TraceBuilder is skipping some of the info like job counters, task counters, etc. we need a way to access other info available in history file which is not dumped to trace file. This is useful for components which want to parse history files and get info. These components can directly use/leverage Rumen's parsing of history files across hadoop releases and get history info in a consistent way for further analysis/processing.
[jira] [Commented] (MAPREDUCE-2784) [Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13196003#comment-13196003 ] Amar Kamat commented on MAPREDUCE-2784: --- Committed this to branch-0.23 too. [Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode. --- Key: MAPREDUCE-2784 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2784 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Amar Kamat Assignee: Amar Kamat Labels: gridmix, junit Fix For: 0.24.0 Attachments: mapreduce-2784-v1.3.patch, mapreduce-2784-v1.4.patch, mapreduce-2784-v1.5.patch, mapreduce-2784-v1.6.1.patch TestGridmixSummary fails with NPE when run in debug mode. JobFactory tries to access the _createReaderThread()_ API of JobStoryProducer which returns null in TestGridmixSummary's FakeJobStoryProducer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3481) [Gridmix] Improve STRESS mode locking
[ https://issues.apache.org/jira/browse/MAPREDUCE-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13196005#comment-13196005 ] Amar Kamat commented on MAPREDUCE-3481: --- Committed this to branch-0.23 too. [Gridmix] Improve STRESS mode locking - Key: MAPREDUCE-3481 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3481 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: gridmix, locking, stress-mode Fix For: 0.24.0 Attachments: MAPREDUCE-3481-v1.6.patch, MAPREDUCE-3481-v1.7.patch Gridmix STRESS mode code doesn't sufficiently load the cluster due to improper locking.
[jira] [Commented] (MAPREDUCE-3597) Provide a way to access other info of history file from Rumentool
[ https://issues.apache.org/jira/browse/MAPREDUCE-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189633#comment-13189633 ] Amar Kamat commented on MAPREDUCE-3597: --- A few questions: 1. You have added a new {{getQueueName()}} API to the {{JobSubmittedEvent}} class. Can you add a test case or some validation lines for this newly added API? 2. Is there a test case that tests the map/reduce job-level counters? As I understand it, these counters were added recently and are not available in the test logs. Is it possible to run an MR job (or reuse logs from other test scenarios) for the same? 3. Is it possible for the conf entries (e.g. queue name etc.) to be null? Would it be safer to check for null before setting the field in Parsed/LoggedTask from the conf? Provide a way to access other info of history file from Rumentool - Key: MAPREDUCE-3597 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3597 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tools/rumen Affects Versions: 0.24.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.24.0 Attachments: 3597.branch-1.v1.patch, 3597.branch-1.v2.patch, 3597.v0.patch, 3597.v1.patch As the trace file generated by Rumen TraceBuilder is skipping some of the info like job counters, task counters, etc. we need a way to access other info available in history file which is not dumped to trace file. This is useful for components which want to parse history files and get info. These components can directly use/leverage Rumen's parsing of history files across hadoop releases and get history info in a consistent way for further analysis/processing.
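The null check raised in question 3 of the comment above could look roughly like the following. This is only a minimal sketch under stated assumptions: a plain `java.util.Properties` stands in for the job conf, and `QueueNameGuard`, `JobRecord`, `applyQueueName` and the `"default"` fallback are hypothetical names chosen for illustration, not Rumen's actual classes or behaviour.

```java
import java.util.Properties;

public class QueueNameGuard {

    /** Hypothetical stand-in for a parsed job record; not Rumen's real class. */
    static final class JobRecord {
        private String queueName = "default"; // assumed fallback, for illustration only

        void setQueueName(String queueName) {
            this.queueName = queueName;
        }

        String getQueueName() {
            return queueName;
        }
    }

    /** Copies the conf entry into the record only when the entry is present. */
    static void applyQueueName(Properties conf, String key, JobRecord record) {
        String value = conf.getProperty(key); // null when the key is absent
        if (value != null) {
            record.setQueueName(value);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        JobRecord record = new JobRecord();

        // The key name is an assumption used only for this demonstration.
        applyQueueName(conf, "mapred.job.queue.name", record);
        System.out.println(record.getQueueName()); // fallback is kept

        conf.setProperty("mapred.job.queue.name", "research");
        applyQueueName(conf, "mapred.job.queue.name", record);
        System.out.println(record.getQueueName()); // value taken from the conf
    }
}
```

With a guard like this, a history file whose conf lacks the entry leaves the field at its previous value instead of overwriting it with null.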
[jira] [Commented] (MAPREDUCE-3597) Provide a way to access other info of history file from Rumentool
[ https://issues.apache.org/jira/browse/MAPREDUCE-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185576#comment-13185576 ] Amar Kamat commented on MAPREDUCE-3597: --- The patch looks good to me. It seems that branch-1 Rumen is aware of pre and post 21 changes. We need to be sure of the implications. Provide a way to access other info of history file from Rumentool - Key: MAPREDUCE-3597 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3597 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tools/rumen Affects Versions: 0.24.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.24.0 Attachments: 3597.branch-1.v1.patch, 3597.v0.patch, 3597.v1.patch As the trace file generated by Rumen TraceBuilder is skipping some of the info like job counters, task counters, etc. we need a way to access other info available in history file which is not dumped to trace file. This is useful for components which want to parse history files and get info. These components can directly use/leverage Rumen's parsing of history files across hadoop releases and get history info in a consistent way for further analysis/processing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3543) Mavenize Gridmix.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181094#comment-13181094 ] Amar Kamat commented on MAPREDUCE-3543: --- Alejandro, Rumen was designed to be a contract that MapReduce provides for tools and users depending on MapReduce JobHistory. Rumen should be packaged with Hadoop MapReduce. Rumen might need access to internal MapReduce classes. Currently, only Gridmix uses Rumen but in future Mumak will also be using it. Mavenize Gridmix. - Key: MAPREDUCE-3543 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3543 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Mahadev konar Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3543v1.patch, MAPREDUCE-3543v1.sh Gridmix codebase still resides in src/contrib and needs to be compiled via ant. We should move it to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179256#comment-13179256 ] Amar Kamat commented on MAPREDUCE-3476: --- Vinod, I see some sub-tickets being opened for optimizing YARN. Can you kindly link them to this JIRA? Optimize YARN API calls --- Key: MAPREDUCE-3476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Vinod Kumar Vavilapalli Priority: Critical Several YARN API calls are taking inordinately long. This might be a performance blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179257#comment-13179257 ] Amar Kamat commented on MAPREDUCE-3462: --- Fixing contrib tests to respect {{src/test/mapred-site.xml}} can be addressed later. I will commit this patch for now. Job submission failing in JUnit tests - Key: MAPREDUCE-3462 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Ravi Prakash Priority: Blocker Labels: junit, test Attachments: 3462.trunk.patch, MAPREDUCE-3462.branch-0.23.patch When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and TestCompressionEmulationUtils), I see job submission failing with the following error: {noformat} java.lang.IllegalStateException: Variable substitution depth too large: 20 ${fs.default.name} at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551) at org.apache.hadoop.conf.Configuration.get(Configuration.java:569) at org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020) at org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176) at org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190) at org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150) at 
org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425) at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380) at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3597) Provide a way to access other info of history file from Rumentool
[ https://issues.apache.org/jira/browse/MAPREDUCE-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179309#comment-13179309 ] Amar Kamat commented on MAPREDUCE-3597: --- Ravi, is it possible to port this to branch-1? Provide a way to access other info of history file from Rumentool - Key: MAPREDUCE-3597 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3597 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tools/rumen Affects Versions: 0.24.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.24.0 Attachments: 3597.v0.patch, 3597.v1.patch As the trace file generated by Rumen TraceBuilder is skipping some of the info like job counters, task counters, etc. we need a way to access other info available in history file which is not dumped to trace file. This is useful for components which want to parse history files and get info. These components can directly use/leverage Rumen's parsing of history files across hadoop releases and get history info in a consistent way for further analysis/processing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3597) Provide a way to access other info of history file from Rumentool
[ https://issues.apache.org/jira/browse/MAPREDUCE-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175445#comment-13175445 ] Amar Kamat commented on MAPREDUCE-3597: --- {{test-patch}} passed on my local box. Rumen and Gridmix tests passed. Provide a way to access other info of history file from Rumentool - Key: MAPREDUCE-3597 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3597 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tools/rumen Affects Versions: 0.24.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.24.0 Attachments: 3597.v0.patch, 3597.v1.patch As the trace file generated by Rumen TraceBuilder is skipping some of the info like job counters, task counters, etc. we need a way to access other info available in history file which is not dumped to trace file. This is useful for components which want to parse history files and get info. These components can directly use/leverage Rumen's parsing of history files across hadoop releases and get history info in a consistent way for further analysis/processing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175250#comment-13175250 ] Amar Kamat commented on MAPREDUCE-3462: --- Tested the fix on {{TestCompressionEmulationUtils}} and the test passed. I was wondering if it makes sense to add this to mapred-site.xml either at the top level (i.e. {{conf/mapred-site.xml}}) or just for tests (i.e. {{src/test/mapred-site.xml}}). I tried setting this property in {{src/test/mapred-site.xml}} but the test still failed. Somehow, we should make sure that the contrib tests load the {{src/test/mapred-site.xml}}. Job submission failing in JUnit tests - Key: MAPREDUCE-3462 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Ravi Prakash Priority: Blocker Labels: junit, test Attachments: MAPREDUCE-3462.branch-0.23.patch When I run JUnit tests (e.g. 
TestDistCacheEmulation, TestSleepJob and TestCompressionEmulationUtils), I see job submission failing with the following error: {noformat} java.lang.IllegalStateException: Variable substitution depth too large: 20 ${fs.default.name} at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551) at org.apache.hadoop.conf.Configuration.get(Configuration.java:569) at org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020) at org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176) at org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190) at org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150) at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425) at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380) at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311) {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175259#comment-13175259 ] Amar Kamat commented on MAPREDUCE-3462: --- I think setting {{mapreduce.job.hdfs-servers}} to an empty string in {{src/java/mapred-default.xml}} should take care of the failures. Thoughts? Job submission failing in JUnit tests - Key: MAPREDUCE-3462 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Ravi Prakash Priority: Blocker Labels: junit, test Attachments: MAPREDUCE-3462.branch-0.23.patch When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and TestCompressionEmulationUtils), I see job submission failing with the following error: {noformat} java.lang.IllegalStateException: Variable substitution depth too large: 20 ${fs.default.name} at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551) at org.apache.hadoop.conf.Configuration.get(Configuration.java:569) at org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020) at org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159) at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176) at org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190) at org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150) at 
org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425) at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380) at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313) at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152) at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
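The "Variable substitution depth too large: 20" error in the trace above comes from a bounded expansion of `${...}` references in configuration values. The sketch below is a simplified illustration of that general mechanism, not Hadoop's `Configuration` code: the class `SubstitutionDepthSketch`, its `substitute` method and the self-referential property in `main` are assumptions made for the example, and the sketch does not claim to reproduce the exact cause behind MAPREDUCE-3462. It does show why a value with no `${...}` in it, such as the empty string suggested in the comment, can never hit the limit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionDepthSketch {

    // Matches ${name}; the name may not contain '}', '$' or whitespace.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$\\s]+)\\}");

    // Upper bound on expansion rounds (20, as in the error message above).
    private static final int MAX_SUBST = 20;

    /** Expands ${name} references from props, giving up after MAX_SUBST rounds. */
    static String substitute(Map<String, String> props, String expr) {
        String eval = expr;
        for (int round = 0; round < MAX_SUBST; round++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to expand
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown variable: leave the text as it is
            }
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("fs.default.name", "${fs.default.name}"); // hypothetical self-reference

        System.out.println("[" + substitute(props, "") + "]"); // empty value: nothing to expand
        try {
            substitute(props, "${fs.default.name}");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Any chain of references that does not bottom out within the round limit fails in the same way, so the sketch applies equally to indirect cycles.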
[jira] [Commented] (MAPREDUCE-2517) Porting Gridmix v3 system tests into trunk branch.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175291#comment-13175291 ] Amar Kamat commented on MAPREDUCE-2517: --- Committed the backported patch to Hadoop branch-1.1 (0.20.206). Thanks Vinay! Porting Gridmix v3 system tests into trunk branch. -- Key: MAPREDUCE-2517 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2517 Project: Hadoop Map/Reduce Issue Type: Task Components: contrib/gridmix Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Fix For: 0.23.0 Attachments: MAPREDUCE-2517-h20-v1.0.patch, MAPREDUCE-2517-v2.patch, MAPREDUCE-2517-v3.patch, MAPREDUCE-2517-v4.patch, MAPREDUCE-2517.patch Porting of Gridmix v3 system tests into trunk branch.
[jira] [Commented] (MAPREDUCE-3349) No rack-name logged in JobHistory for unsuccessful tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173794#comment-13173794 ] Amar Kamat commented on MAPREDUCE-3349: --- Sid, does it make sense to port MAPREDUCE-778 to 0.23? It's an important feature and affects only Rumen. No rack-name logged in JobHistory for unsuccessful tasks Key: MAPREDUCE-3349 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3349 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Devaraj K Priority: Blocker Labels: hostname, rackname, rumen, unsuccessful Fix For: 0.23.1 Attachments: MAPREDUCE-3349-v1.11.patch, MAPREDUCE-3349-v1.4.patch, MAPREDUCE-3349-v1.6.patch, MAPREDUCE-3349.patch Found this while running jobs on a cluster with [~Karams]. This is because TaskAttemptUnsuccessfulCompletionEvent history record doesn't have a rack field.
[jira] [Commented] (MAPREDUCE-778) [Rumen] Need a standalone JobHistory log anonymizer
[ https://issues.apache.org/jira/browse/MAPREDUCE-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170753#comment-13170753 ] Amar Kamat commented on MAPREDUCE-778: -- test-patch passed on my local box. {noformat} +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 44 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. {noformat} All the Rumen JUnit tests except 'TestRumenJobTraces' passed. 'TestRumenJobTraces' failed due to MAPREDUCE-3462. 12 out of 16 Gridmix JUnit tests passed. 4 Gridmix tests failed due to MAPREDUCE-3462 and MAPREDUCE-3168. [Rumen] Need a standalone JobHistory log anonymizer --- Key: MAPREDUCE-778 URL: https://issues.apache.org/jira/browse/MAPREDUCE-778 Project: Hadoop Map/Reduce Issue Type: New Feature Components: tools/rumen Affects Versions: 0.24.0 Reporter: Hong Tang Assignee: Amar Kamat Labels: anonymization, rumen Fix For: 0.24.0 Attachments: anonymizer.patch, anonymizer.py, mapreduce-778-v1.14-12.patch, mapreduce-778-v1.14-14.patch, mapreduce-778-v1.2-2.patch, same.py Job history logs contain a rich set of information that can help understand and characterize cluster workload and individual job execution. Examples of work that parses or utilizes job history include HADOOP-3585, MAPREDUCE-534, HDFS-459, MAPREDUCE-728, and MAPREDUCE-776. Some of the parsing tools developed in previous work already contains a component to anonymize the logs. 
It would be nice to combine these efforts and have a common standalone tool that anonymizes job history logs and preserves much of the structure of the files, so that existing tools built on top of job history logs continue to work with no modification.
[jira] [Commented] (MAPREDUCE-778) [Rumen] Need a standalone JobHistory log anonymizer
[ https://issues.apache.org/jira/browse/MAPREDUCE-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169263#comment-13169263 ] Amar Kamat commented on MAPREDUCE-778: -- I ran test-patch on my local box for MapReduce and it passed. {noformat} +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 44 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. {noformat} All the Rumen JUnit tests except 'TestRumenJobTraces' passed. 'TestRumenJobTraces' failed due to MAPREDUCE-3462. 11 out of 16 Gridmix JUnit tests passed. 5 Gridmix tests failed due to MAPREDUCE-3462 and MAPREDUCE-3168. [Rumen] Need a standalone JobHistory log anonymizer --- Key: MAPREDUCE-778 URL: https://issues.apache.org/jira/browse/MAPREDUCE-778 Project: Hadoop Map/Reduce Issue Type: New Feature Components: tools/rumen Affects Versions: 0.24.0 Reporter: Hong Tang Assignee: Amar Kamat Labels: anonymization, rumen Fix For: 0.24.0 Attachments: anonymizer.patch, anonymizer.py, mapreduce-778-v1.14-12.patch, mapreduce-778-v1.2-2.patch, same.py Job history logs contain a rich set of information that can help understand and characterize cluster workload and individual job execution. Examples of work that parses or utilizes job history include HADOOP-3585, MAPREDUCE-534, HDFS-459, MAPREDUCE-728, and MAPREDUCE-776. Some of the parsing tools developed in previous work already contains a component to anonymize the logs. 
It would be nice to combine these efforts and have a common standalone tool that anonymizes job history logs and preserves much of the structure of the files, so that existing tools built on top of job history logs continue to work with no modification.
[jira] [Commented] (MAPREDUCE-2950) [Gridmix] TestUserResolve fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169157#comment-13169157 ] Amar Kamat commented on MAPREDUCE-2950: --- Committed this to the 0.23 branch. [Gridmix] TestUserResolve fails in trunk Key: MAPREDUCE-2950 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2950 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Ravi Gummadi Labels: gridmix, junit, test-user-resolve Fix For: 0.24.0 Attachments: MR2950.patch TestUserResolve fails in trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13159037#comment-13159037 ] Amar Kamat commented on MAPREDUCE-3476: --- Thanks Ravi for opening a JIRA. Optimize YARN API calls --- Key: MAPREDUCE-3476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Priority: Critical Several YARN API calls are taking inordinately long. This might be a performance blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3168) [Gridmix] TestCompressionEmulationUtils fails after MR-3158
[ https://issues.apache.org/jira/browse/MAPREDUCE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156600#comment-13156600 ] Amar Kamat commented on MAPREDUCE-3168: --- This test is now failing because of MAPREDUCE-3462. [Gridmix] TestCompressionEmulationUtils fails after MR-3158 --- Key: MAPREDUCE-3168 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3168 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: compression-emulation, gridmix, local-job-runner Fix For: 0.24.0 TestCompressionEmulationUtils fails after MAPREDUCE-3158 as it uses local job-runner to run jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1042) rumen should be able to output compressed trace files
[ https://issues.apache.org/jira/browse/MAPREDUCE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156448#comment-13156448 ] Amar Kamat commented on MAPREDUCE-1042: --- Rumen supports compressed output files. If the output filename contains a recognized extension (e.g. .gzip, .zip etc), Rumen will recognize that and generate a compressed output file. rumen should be able to output compressed trace files - Key: MAPREDUCE-1042 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1042 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tools/rumen Reporter: Dick King Assignee: Dick King Fix For: 0.22.0 rumen is used primarily to create job trace files which are then processed by other tools. These trace files can exceed 100 gigabytes. However, gzip compression normally achieves 15:1 compression on these traces. I would like to modify rumen so it can output compressed files directly, rather than outputting unwieldy uncompressed files and letting me compress it later. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
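The behaviour described in the comment above, choosing compression from the output file's extension, can be illustrated with a small stand-alone sketch. It uses only the JDK's `GZIPOutputStream` and a hypothetical `.gz` suffix check; the class `ExtensionAwareOutput` and its methods are assumptions for illustration and are not Rumen's code, which relies on Hadoop's own codec handling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ExtensionAwareOutput {

    /** Assumed rule for this sketch: a ".gz" suffix means gzip, anything else is plain. */
    static boolean isGzip(Path path) {
        return path.getFileName().toString().endsWith(".gz");
    }

    /** Opens the file for writing, wrapping it in gzip when the name asks for it. */
    static OutputStream open(Path path) throws IOException {
        OutputStream raw = Files.newOutputStream(path);
        return isGzip(path) ? new GZIPOutputStream(raw) : raw;
    }

    /** Opens the file for reading with the matching decompression. */
    static InputStream openForRead(Path path) throws IOException {
        InputStream raw = Files.newInputStream(path);
        return isGzip(path) ? new GZIPInputStream(raw) : raw;
    }

    /** Writes content to a temp file with the given name and reads it back. */
    static String roundTrip(String fileName, String content) {
        try {
            Path file = Files.createTempDirectory("trace-sketch").resolve(fileName);
            try (OutputStream out = open(file)) {
                out.write(content.getBytes(StandardCharsets.UTF_8));
            }
            try (InputStream in = openForRead(file)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String trace = "{\"jobID\":\"job_0001\"}"; // made-up sample content
        System.out.println(roundTrip("trace.json.gz", trace)); // written compressed
        System.out.println(roundTrip("trace.json", trace));    // written as plain text
    }
}
```

The caller only picks a file name; the decision to compress is made in one place, which is the convenience the comment describes.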
[jira] [Commented] (MAPREDUCE-3454) [Gridmix] TestDistCacheEmulation is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154971#comment-13154971 ] Amar Kamat commented on MAPREDUCE-3454: --- I see a lot of tests with such issues. A simple search for 'createDummyMapTaskAttemptContext(' gives the following result.
{noformat}
hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestDistCacheEmulation.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRSequenceFileAsBinaryInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRSequenceFileAsTextInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestMRSequenceFileInputFilter.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/input/TestNLineInputFormat.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/join/TestJoinProperties.java
hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/lib/output/TestMRSequenceFileAsBinaryOutputFormat.java
{noformat}
[Gridmix] TestDistCacheEmulation is broken -- Key: MAPREDUCE-3454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3454 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: gridmix Fix For: 0.23.1 TestDistCacheEmulation is broken as 'MapReduceTestUtil' no longer exists.
[jira] [Commented] (MAPREDUCE-3454) [Gridmix] TestDistCacheEmulation is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154976#comment-13154976 ] Amar Kamat commented on MAPREDUCE-3454: --- Hitesh, I assume that you are aware of the code changes required to fix this JIRA. Can you kindly take this up? [Gridmix] TestDistCacheEmulation is broken -- Key: MAPREDUCE-3454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3454 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: gridmix Fix For: 0.23.1 TestDistCacheEmulation is broken as 'MapReduceTestUtil' no longer exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3412) 'ant docs' is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154993#comment-13154993 ] Amar Kamat commented on MAPREDUCE-3412: --- I committed this to branch-0.23 too. 'ant docs' is broken Key: MAPREDUCE-3412 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3412 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: docs Fix For: 0.24.0 Attachments: mapreduce-3412-v1.0.patch 'ant docs' no longer works.
[jira] [Commented] (MAPREDUCE-3008) [Gridmix] Improve cumulative CPU usage emulation for short running tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155029#comment-13155029 ] Amar Kamat commented on MAPREDUCE-3008: --- I just committed the backported patch to branch-0.20-security. Thanks Vinay! [Gridmix] Improve cumulative CPU usage emulation for short running tasks Key: MAPREDUCE-3008 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3008 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: contrib/gridmix Affects Versions: 0.24.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: cpu-emulation, gridmix Fix For: 0.24.0 Attachments: mapreduce-2591-v1.4.2.patch, mapreduce-2591-v1.7-0.20-security.patch, mapreduce-2591-v1.7.patch CPU emulation in Gridmix fails to meet the expected target if the map has no data to sort/spill/merge. There are 2 major reasons for this: 1. The map task ends immediately after the map phase. The map progress is 67% when the map phase ends. 2. Currently, the sort (comparator) doesn't emulate CPU. If the map is short lived, the CPU emulation thread (spawned from the map task in cleanup) doesn't get a chance to emulate.
[jira] [Commented] (MAPREDUCE-3349) No rack-name logged in JobHistory for unsuccessful tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13151899#comment-13151899 ] Amar Kamat commented on MAPREDUCE-3349: --- Kindly make sure that Rumen's {{TopologyBuilder}} is also fixed in this JIRA. {{TopologyBuilder}} expects hostname in the form '/rack/host' and uses {{TaskAttemptFinishedEvent}} and {{TaskAttemptUnsuccessfulCompletionEvent}} for the same. It also uses split-location from {{TaskStartedEvent}}. Since the hostnames and racknames are now separately logged, we need to make sure that {{TopologyBuilder}} is aware of that. No rack-name logged in JobHistory for unsuccessful tasks Key: MAPREDUCE-3349 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3349 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Devaraj K Priority: Blocker Fix For: 0.23.1 Attachments: MAPREDUCE-3349.patch Found this while running jobs on a cluster with [~Karams]. This is because TaskAttemptUnsuccessfulCompletionEvent history record doesn't have a rack field. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
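To make the '/rack/host' convention mentioned above concrete, here is a small sketch of how such a combined name could be split into its rack and host parts. It is an illustration only and not the code of {{TopologyBuilder}}: the class `NodeNameSketch` and its `parse` method are hypothetical, and treating a name without a slash as "host only" is an assumption of this example.

```java
// Hypothetical sketch: split a combined "/rack/host" name into rack and host.
final class NodeNameSketch {
  final String rack; // null when the name carries no rack component
  final String host;

  private NodeNameSketch(String rack, String host) {
    this.rack = rack;
    this.host = host;
  }

  /**
   * Parses names such as "/rack1/host1". Everything before the last '/'
   * is taken as the rack, everything after it as the host. A name without
   * any '/' is treated as a bare host (an assumption of this sketch).
   */
  static NodeNameSketch parse(String name) {
    if (name == null || name.isEmpty()) {
      throw new IllegalArgumentException("empty node name");
    }
    int slash = name.lastIndexOf('/');
    String rack = slash <= 0 ? null : name.substring(0, slash);
    String host = name.substring(slash + 1);
    if (host.isEmpty()) {
      throw new IllegalArgumentException("missing host in: " + name);
    }
    return new NodeNameSketch(rack, host);
  }

  public static void main(String[] args) {
    NodeNameSketch n = parse("/rack1/host1");
    System.out.println("rack=" + n.rack + " host=" + n.host);
  }
}
```

Once hostnames and racknames are logged as separate fields, a consumer no longer needs this kind of splitting, but it must still cope with older traces that only carry the combined form.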
[jira] [Commented] (MAPREDUCE-3412) 'ant docs' is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13151403#comment-13151403 ] Amar Kamat commented on MAPREDUCE-3412: --- It seems like there is a stale reference to capacity-scheduler.html in {{hadoop-mapreduce-project/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml}}. 'ant docs' is broken Key: MAPREDUCE-3412 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3412 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0 Reporter: Amar Kamat Labels: docs Fix For: 0.24.0 'ant docs' no longer works.
[jira] [Commented] (MAPREDUCE-2733) Gridmix v3 cpu emulation system tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146921#comment-13146921 ] Amar Kamat commented on MAPREDUCE-2733: --- Committed this to branch-0.23 too. Gridmix v3 cpu emulation system tests. -- Key: MAPREDUCE-2733 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2733 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Fix For: 0.24.0 Attachments: MAPREDUCE-2733.patch, MAPREDUCE-2733.v2.patch, MAPREDUCE-2733.v3.patch, MAPREDUCE-2733.v4.patch 1. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 2. Enable CPU emulation with custom resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 3. Disable CPU emulation and run Gridmix v3 with a trace file that contains the CPU resource usage details. 4. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that doesn't contain the CPU resource usage details.
[jira] [Commented] (MAPREDUCE-3346) Rumen LoggedTaskAttempt getHostName call returns hostname as null
[ https://issues.apache.org/jira/browse/MAPREDUCE-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146074#comment-13146074 ] Amar Kamat commented on MAPREDUCE-3346: --- The corner case should be fixed in MAPREDUCE-1976. I will commit this patch. Rumen LoggedTaskAttempt getHostName call returns hostname as null -- Key: MAPREDUCE-3346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3346 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Karam Singh Assignee: Amar Kamat Priority: Blocker After MAPREDUCE-3035 and MAPREDUCE-3317, the MRV2 job history contains hostName and rackName. When the Rumen trace builder is run on the job history, the generated trace contains the hostname in the form hostName : /rackname/hostname, but getHostName for LoggedTaskAttempt returns the hostname as null. It seems that TraceBuilder is setting hostName properly but JobTraceReader is not able to read it.
[jira] [Commented] (MAPREDUCE-3317) Rumen TraceBuilder is emiting null as hostname
[ https://issues.apache.org/jira/browse/MAPREDUCE-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141150#comment-13141150 ] Amar Kamat commented on MAPREDUCE-3317: --- What should be the expected output if either hostname or rackname is null? We should make sure that Rumen's output is consistent and should match the previous version. Rumen TraceBuilder is emiting null as hostname -- Key: MAPREDUCE-3317 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3317 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 3317.patch Trace generated by Rumen TraceBuilder contains null as hostname even though hostName and rackName are seen in history file. This is after MAPREDUCE-3035. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3157) Rumen TraceBuilder is skipping analyzing 0.20 history files
[ https://issues.apache.org/jira/browse/MAPREDUCE-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140244#comment-13140244 ] Amar Kamat commented on MAPREDUCE-3157: --- Committed this patch to branch-0.23 too. Rumen TraceBuilder is skipping analyzing 0.20 history files --- Key: MAPREDUCE-3157 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3157 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: MR3157.patch Rumen TraceBuilder is assuming the Pre21 history file name format to be JTIdentifier_jobId_something. But it can be jobId_something also as it is now in latest 0.20.x version. This also needs to be understood by TraceBuilder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
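The two file name shapes described above (JTIdentifier_jobId_something and jobId_something) can be handled by looking for the job ID wherever it occurs instead of assuming a fixed prefix. The sketch below illustrates that idea with a plain regular expression. It is not the patch attached to this issue: the class `HistoryFileNameSketch`, the assumed job ID shape `job_<digits>_<digits>`, and the sample file names are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: find the job ID in a history file name, with or
// without a leading JobTracker identifier.
final class HistoryFileNameSketch {

  // Assumed job ID shape: "job_<digits>_<digits>", delimited by '_' or the
  // start/end of the file name.
  private static final Pattern JOB_ID =
      Pattern.compile("(?:^|_)(job_\\d+_\\d+)(?:_|$)");

  /** Returns the first job ID found in the file name, if any. */
  static Optional<String> jobId(String fileName) {
    Matcher m = JOB_ID.matcher(fileName);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  public static void main(String[] args) {
    // Made-up names in the two shapes: JTIdentifier_jobId_something and jobId_something.
    System.out.println(jobId("jthost_1319214405771_job_201110211000_0002_root_wordcount"));
    System.out.println(jobId("job_201110211000_0002_root_wordcount"));
    System.out.println(jobId("README.txt"));
  }
}
```

A regular expression like this still hard-codes knowledge of the file name layout, which is the reason the discussion on this issue leans towards the JobHistory filename-parsing APIs instead.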
[jira] [Commented] (MAPREDUCE-3166) Make Rumen use job history api instead of relying on current history file name format
[ https://issues.apache.org/jira/browse/MAPREDUCE-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140246#comment-13140246 ] Amar Kamat commented on MAPREDUCE-3166: --- Committed this to branch-0.23. Make Rumen use job history api instead of relying on current history file name format - Key: MAPREDUCE-3166 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3166 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: MR3166.patch Rumen should not depend on the regular expression of job history file name format and should use the newly added api like isValidJobHistoryFileName(), getJobIDFromHistoryFilePath(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3241) (Rumen)TraceBuilder throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140293#comment-13140293 ] Amar Kamat commented on MAPREDUCE-3241: --- bq. Need in branch-0.23 too? Already committed to branch 0.23. (Rumen)TraceBuilder throws IllegalArgumentException --- Key: MAPREDUCE-3241 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3241 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0 Reporter: Devaraj K Assignee: Amar Kamat Fix For: 0.24.0 Attachments: mapreduce-3241-v1.1.patch When we run the TraceBuilder, we get this exception. Output of the TraceBuilder doesn't contain the map and reduce task information.
{code}
2011-10-21 22:07:17,268 WARN rumen.TraceBuilder (TraceBuilder.java:run(272)) - TraceBuilder got an error while processing the [possibly virtual] file job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist within Path hdfs://10.18.52.57:9000/user/root/null/history/done_intermediate/root/job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist
java.lang.IllegalArgumentException: JobBuilder.process(HistoryEvent): unknown event type
at org.apache.hadoop.tools.rumen.JobBuilder.process(JobBuilder.java:165)
at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:304)
at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:258)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:185)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
{code}
[jira] [Commented] (MAPREDUCE-3241) (Rumen)TraceBuilder throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133850#comment-13133850 ] Amar Kamat commented on MAPREDUCE-3241: --- Thanks for the explanation Devaraj. I am not sure why AM related events should surface in a MapReduce job's history. Requesting Ravi to look into this. (Rumen)TraceBuilder throws IllegalArgumentException --- Key: MAPREDUCE-3241 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3241 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0 Reporter: Devaraj K Assignee: Devaraj K When we run the TraceBuilder, we get this exception. Output of the TraceBuilder doesn't contain the map and reduce task information.
{code}
2011-10-21 22:07:17,268 WARN rumen.TraceBuilder (TraceBuilder.java:run(272)) - TraceBuilder got an error while processing the [possibly virtual] file job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist within Path hdfs://10.18.52.57:9000/user/root/null/history/done_intermediate/root/job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist
java.lang.IllegalArgumentException: JobBuilder.process(HistoryEvent): unknown event type
at org.apache.hadoop.tools.rumen.JobBuilder.process(JobBuilder.java:165)
at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:304)
at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:258)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:185)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
{code}
[jira] [Commented] (MAPREDUCE-3241) (Rumen)TraceBuilder throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133274#comment-13133274 ] Amar Kamat commented on MAPREDUCE-3241: --- Devaraj, Can you tell us how to reproduce this issue? What is the version of the JobHistory files? (Rumen)TraceBuilder throws IllegalArgumentException --- Key: MAPREDUCE-3241 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3241 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0 Reporter: Devaraj K Assignee: Devaraj K When we run the TraceBuilder, we get this exception. Output of the TraceBuilder doesn't contain the map and reduce task information.
{code}
2011-10-21 22:07:17,268 WARN rumen.TraceBuilder (TraceBuilder.java:run(272)) - TraceBuilder got an error while processing the [possibly virtual] file job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist within Path hdfs://10.18.52.57:9000/user/root/null/history/done_intermediate/root/job_1319214405771_0002-1319214846458-root-word+count-1319214871038-1-1-SUCCEEDED.jhist
java.lang.IllegalArgumentException: JobBuilder.process(HistoryEvent): unknown event type
at org.apache.hadoop.tools.rumen.JobBuilder.process(JobBuilder.java:165)
at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:304)
at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:258)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:185)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:189)
{code}
[jira] [Commented] (MAPREDUCE-3118) Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129672#comment-13129672 ] Amar Kamat commented on MAPREDUCE-3118: --- Folks, We ran the wordcount and Gridmix3 from Hadoop 0.20.205 on a Hadoop cluster running a MR-3118 patched version of 0.20.205. All the jobs ran fine. We didn't see any compatibility issues. I will commit this patch. Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch - Key: MAPREDUCE-3118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3118 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix, tools/rumen Affects Versions: 0.20.206.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.20.206.0 Attachments: gridmix_rumen_backports.v2.4.patch, gridmix_rumen_backports.v2.5.patch, gridmix_rumen_backports.v2.6.patch Backporting all the features and bugfixes that went into gridmix and rumen of trunk to hadoop 0.20 security branch. This will enable using all these gridmix features and run gridmix/rumen on the history logs of 0.20 security branch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2733) Gridmix v3 cpu emulation system tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126462#comment-13126462 ] Amar Kamat commented on MAPREDUCE-2733: --- +1. Kindly share the output of test-patch, ant-tests and system tests. Gridmix v3 cpu emulation system tests. -- Key: MAPREDUCE-2733 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2733 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Attachments: MAPREDUCE-2733.patch, MAPREDUCE-2733.v2.patch, MAPREDUCE-2733.v3.patch 1. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 2. Enable CPU emulation with custom resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 3. Disable CPU emulation and run Gridmix v3 with a trace file that contains the CPU resource usage details. 4. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that doesn't contain the CPU resource usage details.
[jira] [Commented] (MAPREDUCE-3035) MR V2 jobhistory does not contain rack information
[ https://issues.apache.org/jira/browse/MAPREDUCE-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126471#comment-13126471 ] Amar Kamat commented on MAPREDUCE-3035: --- W.r.t. Rumen, {{LoggedTaskAttempt}} should now support a {{setNodeName(RackName, HostName)}} API. Also, in {{JobBuilder}} (e.g. see line-532), calls to {{LoggedTaskAttempt.setHostName()}} should be modified to {{LoggedTaskAttempt.setNodeName(RackName, HostName)}}. Hopefully this will help you guys get started. MR V2 jobhistory does not contain rack information -- Key: MAPREDUCE-3035 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3035 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Karam Singh Assignee: chackaravarthy Priority: Blocker Fix For: 0.23.0 Attachments: MAPREDUCE-3035.patch When topology.node.switch.mapping.impl is set to enable rack-locality resolution via the topology script, from the RM web-UI, we can see the rack information for each node. Running a job also reveals the information about rack-local map tasks launched at end of job completion on the client side. But the hostname field for attempts in the JobHistory does not contain this rack information. In case of hadoop-0.20 security or MRV1, the hostname field of job history does contain rackid/hostname whereas in MRV2, the hostname field only contains the hostIP. Thus this is a regression.
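As a rough illustration of what a {{setNodeName(RackName, HostName)}} style call might compute, the sketch below joins a rack name and a host name into the combined '/rack/host' form. It is only a sketch under assumptions: the class `NodeNameComposer`, the method `nodeName`, and in particular the policy for missing values (no host gives null, no rack gives the bare host) are choices made for this example and are not taken from Rumen.

```java
// Hypothetical sketch: build a combined "/rack/host" name from separate fields.
final class NodeNameComposer {

  /**
   * Joins rack and host into "/rack/host".
   * Policy assumed for this sketch only:
   *  - no host name: return null (nothing useful to report)
   *  - no rack name: return the host name alone
   */
  static String nodeName(String rackName, String hostName) {
    if (hostName == null || hostName.isEmpty()) {
      return null;
    }
    if (rackName == null || rackName.isEmpty()) {
      return hostName;
    }
    // Make sure the rack part starts with exactly one leading slash.
    String rack = rackName.startsWith("/") ? rackName : "/" + rackName;
    return rack + "/" + hostName;
  }

  public static void main(String[] args) {
    System.out.println(nodeName("/rack1", "host1"));
    System.out.println(nodeName(null, "host1"));
  }
}
```

Whatever policy is finally chosen for missing rack or host values, keeping it in one place like this makes it easier to keep trace output consistent across versions.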
[jira] [Commented] (MAPREDUCE-2733) Gridmix v3 cpu emulation system tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126341#comment-13126341 ] Amar Kamat commented on MAPREDUCE-2733: --- The implementation looks fine. There are a few typos in the javadocs and log statements. Gridmix v3 cpu emulation system tests. -- Key: MAPREDUCE-2733 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2733 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Attachments: MAPREDUCE-2733.patch 1. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 2. Enable CPU emulation with custom resource usage interval and run Gridmix v3 with a trace file that contains the CPU resource usage details. 3. Disable CPU emulation and run Gridmix v3 with a trace file that contains the CPU resource usage details. 4. Enable CPU emulation with default resource usage interval and run Gridmix v3 with a trace file that doesn't contain the CPU resource usage details.
[jira] [Commented] (MAPREDUCE-3118) Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125580#comment-13125580 ] Amar Kamat commented on MAPREDUCE-3118: --- Folks, This patch adds a _getProgress()_ API to public classes and interfaces like: 1. Progress.java 2. TaskInputOutputContext.java 3. Reporter.java This API provides the task's current progress and is immensely useful for Gridmix. We tend to believe that this might get classified as a backward incompatible change. We want to be sure about the incompatibility side of the story and make a call accordingly. Kindly let us know your thoughts/comments regarding the same. Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch - Key: MAPREDUCE-3118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3118 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix, tools/rumen Affects Versions: 0.20.206.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.20.206.0 Attachments: gridmix_rumen_backports.v2.4.patch, gridmix_rumen_backports.v2.5.patch, gridmix_rumen_backports.v2.6.patch Backporting all the features and bugfixes that went into gridmix and rumen of trunk to hadoop 0.20 security branch. This will enable using all these gridmix features and run gridmix/rumen on the history logs of 0.20 security branch.
[jira] [Commented] (MAPREDUCE-3166) Make Rumen use job history api instead of relying on current history file name format
[ https://issues.apache.org/jira/browse/MAPREDUCE-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125607#comment-13125607 ] Amar Kamat commented on MAPREDUCE-3166: --- +1. Looks good to me. Make Rumen use job history api instead of relying on current history file name format - Key: MAPREDUCE-3166 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3166 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: MR3166.patch Rumen should not depend on the regular expression of job history file name format and should use the newly added api like isValidJobHistoryFileName(), getJobIDFromHistoryFilePath(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3157) Rumen TraceBuilder is skipping analyzing 0.20 history files
[ https://issues.apache.org/jira/browse/MAPREDUCE-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124694#comment-13124694 ] Amar Kamat commented on MAPREDUCE-3157: --- Ravi, Instead of parsing the current job-history files using JobHistory.CONF_FILENAME_REGEX or JobHistory.JOBHISTORY_FILENAME_REGEX, should we use the JobHistory's filename parsing APIs like getJobIDFromHistoryFilePath()? Rumen TraceBuilder is skipping analyzing 0.20 history files --- Key: MAPREDUCE-3157 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3157 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: MR3157.patch Rumen TraceBuilder is assuming the Pre21 history file name format to be JTIdentifier_jobId_something. But it can be jobId_something also as it is now in latest 0.20.x version. This also needs to be understood by TraceBuilder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3157) Rumen TraceBuilder is skipping analyzing 0.20 history files
[ https://issues.apache.org/jira/browse/MAPREDUCE-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124695#comment-13124695 ]

Amar Kamat commented on MAPREDUCE-3157:
---
Other than my previous comment, the patch looks good to me.
[jira] [Commented] (MAPREDUCE-3118) Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122577#comment-13122577 ]

Amar Kamat commented on MAPREDUCE-3118:
---
I see some MRJobConfig fields referenced in DistributedCacheEmulator's javadoc. Can these be changed to refer to their new locations, maybe in JobContext or DistributedCache? The rest of the patch looks good to me.

@Ravi: Can you upload the test-patch and ant-test status along with the final patch?

Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch
---
Key: MAPREDUCE-3118
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3118
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: contrib/gridmix, tools/rumen
Affects Versions: 0.20.206.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Fix For: 0.20.206.0
Attachments: gridmix_rumen_backports.v2.4.patch, gridmix_rumen_backports.v2.5.patch

Backport all the features and bug fixes that went into Gridmix and Rumen on trunk to the Hadoop 0.20 security branch. This will enable using all these Gridmix features and running Gridmix/Rumen on the history logs of the 0.20 security branch.
[jira] [Commented] (MAPREDUCE-3118) Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121694#comment-13121694 ]

Amar Kamat commented on MAPREDUCE-3118:
---
@Matt: I have taken up the review of the latest patch.

@All: Vinay has tested this patch on a test cluster, and it seems that all the latest Gridmix features work fine. Vinay also backported the system tests he wrote for trunk to make sure that all the features are tested thoroughly.
[jira] [Commented] (MAPREDUCE-2777) Backport MAPREDUCE-220 to Hadoop 20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116565#comment-13116565 ]

Amar Kamat commented on MAPREDUCE-2777:
---
I also committed this to 0.20-205.

Backport MAPREDUCE-220 to Hadoop 20 security branch
---
Key: MAPREDUCE-2777
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2777
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 0.20.205.0
Reporter: Jonathan Eagles
Assignee: Amar Kamat
Fix For: 0.20.206.0
Attachments: mapreduce-2777-v1.3.patch
[jira] [Commented] (MAPREDUCE-2777) Backport MAPREDUCE-220 to Hadoop 20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116947#comment-13116947 ]

Amar Kamat commented on MAPREDUCE-2777:
---
Matt,
The 2 failures are due to:
1. TestTrackerDistributedCacheManager: This error seems unrelated to this patch. This patch doesn't change anything in DistributedCache. I checked the stack trace and found NPEs. I will look into it further.
2. TestTTMemoryReporting: This test case was deleted and rewritten as part of this patch. We only need to delete the stray file.

I uploaded the patch only after test-patch and ant-tests passed. I am re-running them to make sure nothing changed after I generated the last patch.

Backport MAPREDUCE-220 to Hadoop 20 security branch
---
Key: MAPREDUCE-2777
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2777
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 0.20.205.0
Reporter: Jonathan Eagles
Assignee: Amar Kamat
Attachments: mapreduce-2777-v1.3.patch
[jira] [Commented] (MAPREDUCE-2777) Backport MAPREDUCE-220 to Hadoop 20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116954#comment-13116954 ]

Amar Kamat commented on MAPREDUCE-2777:
---
test-patch output:
{noformat}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 21 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.8) warnings.
{noformat}
[jira] [Commented] (MAPREDUCE-2777) Backport MAPREDUCE-220 to Hadoop 20 security branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116958#comment-13116958 ]

Amar Kamat commented on MAPREDUCE-2777:
---
I re-ran TestTrackerDistributedCacheManager with and without my patch and it passed. I suggest we commit this patch to 0.20.205.