[jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up

2015-05-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524211#comment-14524211
 ] 

Ray Chiang commented on MAPREDUCE-6321:
---

I'd suggest running again with the fix from YARN-2990 and seeing if the times 
go down.  Release 2.7.0 should have the fix.

 Map tasks take a lot of time to start up
 

 Key: MAPREDUCE-6321
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0
Reporter: Rajat Jain
Priority: Critical
  Labels: performance

 I have noticed repeatedly that map tasks take a lot of time to start up on 
 YARN clusters. This is not the scheduling part; it is after the actual 
 container containing the Map task has been launched. Take, for example, the sample 
 log from a mapper of a Pi job that I launched. The command I used to launch 
 the Pi job was:
 {code}
 hadoop jar 
 /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
 {code}
 This is a sample log from one of the mappers, which took 14 seconds to 
 complete. If you look at the logs, most of the time taken by this job is 
 during start up. I notice that most mappers take anywhere between 7 
 and 15 seconds to start up and have seen this behavior consistently across 
 mapreduce jobs. This really affects the performance of short-running mappers.
 I run a hadoop2 / yarn cluster of 4-5 m1.xlarge nodes, and the 
 mapper memory is always specified as 2048m and so on.
 Log:
 {code}
 2015-04-18 06:48:34,081 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
 hadoop-metrics2.properties
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
 started
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Executing with tokens:
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
 mapreduce.job, Service: job_1429338752209_0059, Ident: 
 (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Sleeping for 0ms before retrying again. Got null now.
 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 mapreduce.cluster.local.dir for child: 
 /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
 2015-04-18 06:48:36,706 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:37,387 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:39,388 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
 Instead, use dfs.metrics.session-id
 2015-04-18 06:48:39,448 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:41,060 INFO [main] 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to 
 org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up 
 progress from Task
 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using 
 ResourceCalculatorProcessTree : [ ]
 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: 
 hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: 
 (EQUATOR) 0 kvi 234881020(939524080)
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 mapreduce.task.io.sort.mb: 896
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft 
 limit at 657666880
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 bufstart = 0; bufvoid = 939524096
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart 
 = 234881020; length = 58720256
 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map 
 output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Starting flush of map output
 2015-04-18 

[jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up

2015-05-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524212#comment-14524212
 ] 

Ray Chiang commented on MAPREDUCE-6321:
---

Oh, assuming you're running FairScheduler.

 Map tasks take a lot of time to start up
 

 Key: MAPREDUCE-6321
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0
Reporter: Rajat Jain
Priority: Critical
  Labels: performance

 I have noticed repeatedly that map tasks take a lot of time to start up on 
 YARN clusters. This is not the scheduling part; it is after the actual 
 container containing the Map task has been launched. Take, for example, the sample 
 log from a mapper of a Pi job that I launched. The command I used to launch 
 the Pi job was:
 {code}
 hadoop jar 
 /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
 {code}
 This is a sample log from one of the mappers, which took 14 seconds to 
 complete. If you look at the logs, most of the time taken by this job is 
 during start up. I notice that most mappers take anywhere between 7 
 and 15 seconds to start up and have seen this behavior consistently across 
 mapreduce jobs. This really affects the performance of short-running mappers.
 I run a hadoop2 / yarn cluster of 4-5 m1.xlarge nodes, and the 
 mapper memory is always specified as 2048m and so on.
 Log:
 {code}
 2015-04-18 06:48:34,081 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
 hadoop-metrics2.properties
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
 started
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Executing with tokens:
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
 mapreduce.job, Service: job_1429338752209_0059, Ident: 
 (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Sleeping for 0ms before retrying again. Got null now.
 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 mapreduce.cluster.local.dir for child: 
 /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
 2015-04-18 06:48:36,706 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:37,387 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:39,388 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
 Instead, use dfs.metrics.session-id
 2015-04-18 06:48:39,448 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:41,060 INFO [main] 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to 
 org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up 
 progress from Task
 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using 
 ResourceCalculatorProcessTree : [ ]
 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: 
 hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: 
 (EQUATOR) 0 kvi 234881020(939524080)
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 mapreduce.task.io.sort.mb: 896
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft 
 limit at 657666880
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 bufstart = 0; bufvoid = 939524096
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart 
 = 234881020; length = 58720256
 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map 
 output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Starting flush of map output
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Spilling map output
 

[jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up

2015-05-01 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524254#comment-14524254
 ] 

Ray Chiang commented on MAPREDUCE-6321:
---

Task startup time includes scheduling determination delays, which is what 
YARN-2990 fixes.  Localization and JVM startup are usually a noticeable chunk 
of the remaining time.

 Map tasks take a lot of time to start up
 

 Key: MAPREDUCE-6321
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0
Reporter: Rajat Jain
Priority: Critical
  Labels: performance

 I have noticed repeatedly that map tasks take a lot of time to start up on 
 YARN clusters. This is not the scheduling part; it is after the actual 
 container containing the Map task has been launched. Take, for example, the sample 
 log from a mapper of a Pi job that I launched. The command I used to launch 
 the Pi job was:
 {code}
 hadoop jar 
 /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
 {code}
 This is a sample log from one of the mappers, which took 14 seconds to 
 complete. If you look at the logs, most of the time taken by this job is 
 during start up. I notice that most mappers take anywhere between 7 
 and 15 seconds to start up and have seen this behavior consistently across 
 mapreduce jobs. This really affects the performance of short-running mappers.
 I run a hadoop2 / yarn cluster of 4-5 m1.xlarge nodes, and the 
 mapper memory is always specified as 2048m and so on.
 Log:
 {code}
 2015-04-18 06:48:34,081 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
 hadoop-metrics2.properties
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
 started
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Executing with tokens:
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
 mapreduce.job, Service: job_1429338752209_0059, Ident: 
 (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Sleeping for 0ms before retrying again. Got null now.
 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 mapreduce.cluster.local.dir for child: 
 /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
 2015-04-18 06:48:36,706 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:37,387 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:39,388 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
 Instead, use dfs.metrics.session-id
 2015-04-18 06:48:39,448 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:41,060 INFO [main] 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to 
 org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up 
 progress from Task
 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using 
 ResourceCalculatorProcessTree : [ ]
 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: 
 hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: 
 (EQUATOR) 0 kvi 234881020(939524080)
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 mapreduce.task.io.sort.mb: 896
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft 
 limit at 657666880
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 bufstart = 0; bufvoid = 939524096
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart 
 = 234881020; length = 58720256
 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map 
 output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 

[jira] [Commented] (MAPREDUCE-5649) Reduce cannot use more than 2G memory for the final merge

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524514#comment-14524514
 ] 

Hadoop QA commented on MAPREDUCE-5649:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 46s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 29s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 54s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests |   1m 36s | Tests passed in 
hadoop-mapreduce-client-core. |
| | |  38m  4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729903/MAPREDUCE-5649.002.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| whitespace | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5487/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-core test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5487/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5487/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5487/console |


This message was automatically generated.

 Reduce cannot use more than 2G memory  for the final merge
 --

 Key: MAPREDUCE-5649
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5649
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: stanley shi
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5649.001.patch, MAPREDUCE-5649.002.patch


 In the org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.java file, in 
 the finalMerge method: 
  int maxInMemReduce = (int)Math.min(
 Runtime.getRuntime().maxMemory() * maxRedPer, Integer.MAX_VALUE);
  
 This means that no matter how much memory the user has, the reducer will not retain 
 more than 2 GB of data in memory before the reduce phase starts.
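 To make the 2 GB ceiling concrete, here is a minimal, hedged sketch (not the 
 MergeManagerImpl code; the variable names and the 8 GB heap are made up): 
 casting the result of Math.min(..., Integer.MAX_VALUE) to int can never 
 express more than ~2 GB, while a long-based computation keeps the full value.
 {code}
 // Illustrative only; shows the clamping effect of the int cast.
 public class MergeCapDemo {
   public static void main(String[] args) {
     double maxRedPer = 1.0;               // hypothetical buffer percent
     long heap = 8L * 1024 * 1024 * 1024;  // pretend maxMemory() is 8 GB

     // Current pattern: the int cast clamps the result at Integer.MAX_VALUE (~2 GB).
     int maxInMemReduce = (int) Math.min(heap * maxRedPer, Integer.MAX_VALUE);

     // A long-based variant would preserve the full value on large heaps.
     long maxInMemReduceLong = (long) (heap * maxRedPer);

     System.out.println(maxInMemReduce);      // 2147483647
     System.out.println(maxInMemReduceLong);  // 8589934592
   }
 }
 {code}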



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5377) JobID is not displayed truly by hadoop job -history command

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524641#comment-14524641
 ] 

Hadoop QA commented on MAPREDUCE-5377:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12591055/MAPREDUCE-5377.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5520/console |


This message was automatically generated.

 JobID is not displayed truly by hadoop job -history command
 -

 Key: MAPREDUCE-5377
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5377
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.2.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5377.patch


 The JobID output by the hadoop job -history command is a wrong string.
 {quote}
 [hadoop@hadoop hadoop]$ hadoop job -history terasort
 Hadoop job: 0001_1374260789919_hadoop
 =
 Job tracker host name: job
 job tracker start time: Tue May 18 15:39:51 PDT 1976
 User: hadoop
 JobName: TeraSort
 JobConf: 
 hdfs://hadoop:8020/hadoop/mapred/staging/hadoop/.staging/job_201307191206_0001/job.xml
 Submitted At: 19-7-2013 12:06:29
 Launched At: 19-7-2013 12:06:30 (0sec)
 Finished At: 19-7-2013 12:06:44 (14sec)
 Status: SUCCESS
 {quote}
 In this example, it should show job_201307191206_0001 after Hadoop job:, but it 
 shows 0001_1374260789919_hadoop. In addition, the job tracker host name and 
 job tracker start time are invalid.
 This problem can be solved by fixing the setting of jobId in HistoryViewer(). In 
 addition, the JobTracker information in HistoryViewer should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5403) MR changes to accommodate yarn.application.classpath being moved to the server-side

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524643#comment-14524643
 ] 

Hadoop QA commented on MAPREDUCE-5403:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12594253/MAPREDUCE-5403-2.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5521/console |


This message was automatically generated.

 MR changes to accommodate yarn.application.classpath being moved to the 
 server-side
 ---

 Key: MAPREDUCE-5403
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5403-1.patch, MAPREDUCE-5403-2.patch, 
 MAPREDUCE-5403.patch


 yarn.application.classpath is a confusing property because it is used by 
 MapReduce and not YARN, and MapReduce already has 
 mapreduce.application.classpath, which provides the same functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4065) Add .proto files to built tarball

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524851#comment-14524851
 ] 

Hadoop QA commented on MAPREDUCE-4065:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12650714/MAPREDUCE-4065.1.patch 
|
| Optional Tests | javadoc javac unit |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5554/console |


This message was automatically generated.

 Add .proto files to built tarball
 -

 Key: MAPREDUCE-4065
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2, 2.4.0
Reporter: Ralph H Castain
Assignee: Tsuyoshi Ozawa
 Attachments: MAPREDUCE-4065.1.patch


 Please add the .proto files to the built tarball so that users can build 3rd 
 party tools that use protocol buffers without having to do an svn checkout of 
 the source code.
 Sorry I don't know more about Maven, or I would provide a patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5817) mappers get rescheduled on node transition even after all reducers are completed

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524854#comment-14524854
 ] 

Hadoop QA commented on MAPREDUCE-5817:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12638107/mapreduce-5817.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5557/console |


This message was automatically generated.

 mappers get rescheduled on node transition even after all reducers are 
 completed
 

 Key: MAPREDUCE-5817
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: mapreduce-5817.patch


 We're seeing a behavior where a job runs long after all reducers were already 
 finished. We found that the job was rescheduling and running a number of 
 mappers beyond the point of reducer completion. In one situation, the job ran 
 for some 9 more hours after all reducers completed!
 This happens because whenever a node transition (to an unusable state) comes 
 into the app master, it just reschedules all mappers that already ran on the 
 node in all cases.
 Therefore, any node transition has the potential to extend the job's run time. 
 Once this window opens, another node transition can prolong it, and this can 
 happen indefinitely in theory.
 If there is some instability in the pool (unhealthy, etc.) for a duration, 
 then any big job is severely vulnerable to this problem.
 If all reducers have been completed, JobImpl.actOnUnusableNode() should not 
 reschedule mapper tasks. If all reducers are completed, the mapper outputs 
 are no longer needed, and there is no need to reschedule mapper tasks as they 
 would not be consumed anyway.
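 A minimal sketch of the guard being proposed (everything except the idea of 
 checking reducer completion is illustrative; the real JobImpl.actOnUnusableNode 
 operates on YARN node and task-attempt objects):
 {code}
 // Hypothetical, simplified model of the proposed check.
 public class UnusableNodeGuard {
   private final int totalReduces;
   private int completedReduces;

   public UnusableNodeGuard(int totalReduces) {
     this.totalReduces = totalReduces;
   }

   public void reduceCompleted() {
     completedReduces++;
   }

   /** Map attempts on a lost node only need rescheduling while reducers still run. */
   public boolean shouldRescheduleMaps() {
     // Once all reducers are done, the map outputs are no longer needed,
     // so a node transition should not trigger mapper re-execution.
     return !(totalReduces > 0 && completedReduces >= totalReduces);
   }
 }
 {code}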



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5362) clean up POM dependencies

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524853#comment-14524853
 ] 

Hadoop QA commented on MAPREDUCE-5362:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12640169/mr-5362-0.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5556/console |


This message was automatically generated.

 clean up POM dependencies
 -

 Key: MAPREDUCE-5362
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5362
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: MAPREDUCE-5362.patch, mr-5362-0.patch


 Intermediate 'pom' modules define dependencies inherited by leaf modules.
 This is causing issues in intellij IDE.
 We should normalize the leaf modules as in common, hdfs and tools, where all 
 dependencies are defined in each leaf module and the intermediate 'pom' 
 modules do not define any dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524864#comment-14524864
 ] 

Hadoop QA commented on MAPREDUCE-5981:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12656504/MAPREDUCE-5981.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5558/console |


This message was automatically generated.

 Log levels of certain MR logs can be changed to DEBUG
 -

 Key: MAPREDUCE-5981
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: MAPREDUCE-5981.patch


 The following MapReduce logs can be changed to the DEBUG log level.
 1. In 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost (Fetcher.java : 
 313), the second log does not need to be at INFO level. It can be moved 
 to DEBUG, as a WARN log is printed anyway if verifyReply fails.
   SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
   LOG.info("for url=" + msgToEncode + " sent hash and received reply");
 2. Thread-related info need not be printed in logs at INFO level. The two 
 logs below can be moved to DEBUG.
 a) In 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java
  : 381), the log below can be changed to DEBUG
    LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() +
      " to " + Thread.currentThread().getName());
 b) In 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleSchedulerImpl.java
  : 411), the log below can be changed to DEBUG
   LOG.info("assigned " + includedMaps + " of " + totalSize + " to " +
     host + " to " + Thread.currentThread().getName());
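 For reference, a hedged sketch of the usual pattern when demoting such 
 statements to DEBUG (commons-logging style; the class and message below are 
 examples, not the actual patch):
 {code}
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;

 public class DebugLogExample {
   private static final Log LOG = LogFactory.getLog(DebugLogExample.class);

   void assignHost(String host, int knownMapOutputs) {
     // Guarding the call skips the string concatenation entirely at INFO level.
     if (LOG.isDebugEnabled()) {
       LOG.debug("Assigning " + host + " with " + knownMapOutputs
           + " to " + Thread.currentThread().getName());
     }
   }
 }
 {code}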
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6345) Documentation fix for when CRLA is enabled for MRAppMaster logs

2015-05-01 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6345:
-
   Resolution: Fixed
Fix Version/s: 2.8.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks [~ragarwal] for the contribution! Committed to trunk and branch-2.

 Documentation fix for when CRLA is enabled for MRAppMaster logs
 ---

 Key: MAPREDUCE-6345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6345
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Trivial
 Fix For: 2.8.0

 Attachments: MAPREDUCE-6345.patch


 CRLA is enabled for the ApplicationMaster when both 
 yarn.app.mapreduce.am.container.log.limit.kb (not 
 mapreduce.task.userlog.limit.kb) and 
 yarn.app.mapreduce.am.container.log.backups are greater than zero.
 This was changed in MAPREDUCE-5773.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524614#comment-14524614
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12533607/MAPREDUCE-4346_rev4.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5512/console |


This message was automatically generated.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch


 The current implementation for JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents unneeded overhead, especially for clients that are only 
 interested in jobs in specific state(s). 
 It is beneficial to include a refined version where only jobs having specific 
 statuses are returned and retired jobs are optional to include. 
 I'll be uploading an initial patch momentarily.
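 Until such an API exists, a client has to filter the full listing itself; a 
 hedged sketch of that client-side filtering follows (the helper below is 
 illustrative, not the proposed JobTracker/JobClient method):
 {code}
 import java.util.ArrayList;
 import java.util.List;
 import org.apache.hadoop.mapred.JobStatus;

 public class JobStatusFilter {
   /** Keep only jobs whose run state is in the wanted set, e.g. JobStatus.RUNNING. */
   public static List<JobStatus> filterByState(JobStatus[] all, int... wantedStates) {
     List<JobStatus> result = new ArrayList<JobStatus>();
     for (JobStatus status : all) {
       for (int wanted : wantedStates) {
         if (status.getRunState() == wanted) {
           result.add(status);
           break;
         }
       }
     }
     return result;
   }
 }
 {code}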



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4271) Make TestCapacityScheduler more robust with non-Sun JDK

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524626#comment-14524626
 ] 

Hadoop QA commented on MAPREDUCE-4271:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12567098/MAPREDUCE-4271-branch1-v2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5514/console |


This message was automatically generated.

 Make TestCapacityScheduler more robust with non-Sun JDK
 ---

 Key: MAPREDUCE-4271
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4271
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
  Labels: alt-jdk, capacity
 Attachments: MAPREDUCE-4271-branch1-v2.patch, 
 mapreduce-4271-branch-1.patch, test-afterepatch.result, 
 test-beforepatch.result, test-patch.result


 The capacity scheduler queue is initialized with a HashMap, the values of 
 which are later added to a list (a queue for assigning tasks). 
 TestCapacityScheduler depends on the order of the list and is hence not 
 portable across JDKs.
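 As a general illustration of the portability problem (not the scheduler code 
 itself): a HashMap gives no iteration-order guarantee, so any list built from 
 it can differ between JDKs, whereas a sorted or insertion-ordered map makes 
 the resulting order deterministic.
 {code}
 import java.util.HashMap;
 import java.util.Map;
 import java.util.TreeMap;

 public class MapOrderDemo {
   public static void main(String[] args) {
     Map<String, Integer> unordered = new HashMap<String, Integer>();
     unordered.put("queueB", 2);
     unordered.put("queueA", 1);
     // Iteration order of a HashMap is JDK-dependent; asserting on it is fragile.
     System.out.println(unordered.keySet());

     // A sorted map yields the same order on any JDK.
     Map<String, Integer> ordered = new TreeMap<String, Integer>(unordered);
     System.out.println(ordered.keySet()); // [queueA, queueB]
   }
 }
 {code}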



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5188) error when verify FileType of RS_SOURCE in getCompanionBlocks in BlockPlacementPolicyRaid.java

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524616#comment-14524616
 ] 

Hadoop QA commented on MAPREDUCE-5188:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12580811/MAPREDUCE-5188.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5513/console |


This message was automatically generated.

 error when verify FileType of RS_SOURCE in getCompanionBlocks  in 
 BlockPlacementPolicyRaid.java
 ---

 Key: MAPREDUCE-5188
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5188
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Affects Versions: 2.0.2-alpha
Reporter: junjin
Assignee: junjin
Priority: Critical
  Labels: contrib/raid
 Fix For: 2.0.2-alpha

 Attachments: MAPREDUCE-5188.patch


 There is an error when verifying the FileType of RS_SOURCE in getCompanionBlocks in 
 BlockPlacementPolicyRaid.java:
 xorParityLength in line #379 needs to be changed to rsParityLength, since it is 
 used for verifying the RS_SOURCE type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6030) In mr-jobhistory-daemon.sh, some env variables are not affected by mapred-env.sh

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524874#comment-14524874
 ] 

Hadoop QA commented on MAPREDUCE-6030:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12660804/MAPREDUCE-6030.patch |
| Optional Tests | shellcheck |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5560/console |


This message was automatically generated.

 In mr-jobhistory-daemon.sh, some env variables are not affected by 
 mapred-env.sh
 

 Key: MAPREDUCE-6030
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6030
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.4.1
Reporter: Youngjoon Kim
Assignee: Youngjoon Kim
Priority: Minor
 Attachments: MAPREDUCE-6030.patch


 In mr-jobhistory-daemon.sh, some env variables are exported before sourcing 
 mapred-env.sh, so these variables don't use values defined in mapred-env.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6020) Too many threads blocking on the global JobTracker lock from getJobCounters, optimize getJobCounters to release global JobTracker lock before access the per job cou

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524871#comment-14524871
 ] 

Hadoop QA commented on MAPREDUCE-6020:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12659032/MAPREDUCE-6020.branch1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5559/console |


This message was automatically generated.

 Too many threads blocking on the global JobTracker lock from getJobCounters, 
 optimize getJobCounters to release global JobTracker lock before access the 
 per job counter in JobInProgress
 -

 Key: MAPREDUCE-6020
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6020
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.10
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-6020.branch1.patch


 Too many threads block on the global JobTracker lock in getJobCounters. We should 
 optimize getJobCounters to release the global JobTracker lock before accessing the 
 per-job counters in JobInProgress. Many JobClients may call 
 getJobCounters on the JobTracker at the same time, and the current code locks the 
 JobTracker, blocking all of these threads while counters are fetched from 
 JobInProgress. It is better to release the JobTracker lock when getting counters from 
 JobInProgress (job.getCounters(counters)), so that all the threads can run in parallel 
 when accessing their own job's counters.
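 A hedged sketch of the locking change being described (the class below is a 
 simplified stand-in, not the real JobTracker/JobInProgress code): hold the 
 global lock only for the job lookup and read the per-job counters outside it.
 {code}
 import java.util.HashMap;
 import java.util.Map;

 public class CountersService {
   interface Job { String getCounters(); }   // stand-in for JobInProgress

   private final Object globalLock = new Object();
   private final Map<String, Job> jobs = new HashMap<String, Job>();

   public String getJobCounters(String jobId) {
     Job job;
     synchronized (globalLock) {   // global lock held only for the lookup
       job = jobs.get(jobId);
     }
     if (job == null) {
       return null;
     }
     return job.getCounters();     // per-job work proceeds without the global lock
   }
 }
 {code}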



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up

2015-05-01 Thread Rajat Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524221#comment-14524221
 ] 

Rajat Jain commented on MAPREDUCE-6321:
---

Yes, we run FairScheduler. However, this is not related to FairScheduler since 
this slowness is during map task startup.

 Map tasks take a lot of time to start up
 

 Key: MAPREDUCE-6321
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0
Reporter: Rajat Jain
Priority: Critical
  Labels: performance

 I have noticed repeatedly that map tasks take a lot of time to start up on 
 YARN clusters. This is not the scheduling part; it is after the actual 
 container containing the Map task has been launched. Take, for example, the sample 
 log from a mapper of a Pi job that I launched. The command I used to launch 
 the Pi job was:
 {code}
 hadoop jar 
 /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
 {code}
 This is a sample log from one of the mappers, which took 14 seconds to 
 complete. If you look at the logs, most of the time taken by this job is 
 during start up. I notice that most mappers take anywhere between 7 
 and 15 seconds to start up and have seen this behavior consistently across 
 mapreduce jobs. This really affects the performance of short-running mappers.
 I run a hadoop2 / yarn cluster of 4-5 m1.xlarge nodes, and the 
 mapper memory is always specified as 2048m and so on.
 Log:
 {code}
 2015-04-18 06:48:34,081 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
 hadoop-metrics2.properties
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
 started
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Executing with tokens:
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
 mapreduce.job, Service: job_1429338752209_0059, Ident: 
 (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Sleeping for 0ms before retrying again. Got null now.
 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 mapreduce.cluster.local.dir for child: 
 /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
 2015-04-18 06:48:36,706 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:37,387 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:39,388 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
 Instead, use dfs.metrics.session-id
 2015-04-18 06:48:39,448 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:41,060 INFO [main] 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to 
 org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up 
 progress from Task
 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using 
 ResourceCalculatorProcessTree : [ ]
 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: 
 hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: 
 (EQUATOR) 0 kvi 234881020(939524080)
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 mapreduce.task.io.sort.mb: 896
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft 
 limit at 657666880
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 bufstart = 0; bufvoid = 939524096
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart 
 = 234881020; length = 58720256
 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map 
 output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Starting flush of map output
 2015-04-18 

[jira] [Commented] (MAPREDUCE-5097) Job.addArchiveToClassPath is ignored when running job with LocalJobRunner

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524608#comment-14524608
 ] 

Hadoop QA commented on MAPREDUCE-5097:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12575117/MAPREDUCE-5097-ugly-test.patch
 |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5510/console |


This message was automatically generated.

 Job.addArchiveToClassPath is ignored when running job with LocalJobRunner
 -

 Key: MAPREDUCE-5097
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5097
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Attachments: MAPREDUCE-5097-ugly-test.patch, MAPREDUCE-5097.patch


 Using an external dependency jar in an MR job: adding it to the job classpath via 
 Job.addArchiveToClassPath(...) doesn't work when running with LocalJobRunner 
 (i.e. in a unit test). This makes it harder to unit-test such jobs (with 
 third-party runtime dependencies).
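 For context, a hedged sketch of the usage pattern that works on a real cluster 
 but is reported to be ignored under LocalJobRunner (the path and job name are 
 placeholders):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.Job;

 public class ArchiveClasspathExample {
   public static Job buildJob(Configuration conf) throws Exception {
     Job job = Job.getInstance(conf, "archive-classpath-example");
     // Ask the framework to put this archive on the task classpath.
     // With LocalJobRunner (mapreduce.framework.name=local) the archive is
     // reportedly not picked up, which is what this issue describes.
     job.addArchiveToClassPath(new Path("/libs/third-party-dep.jar"));
     return job;
   }
 }
 {code}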



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4330) TaskAttemptCompletedEventTransition invalidates previously successful attempt without checking if the newly completed attempt is successful

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524612#comment-14524612
 ] 

Hadoop QA commented on MAPREDUCE-4330:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12578792/MAPREDUCE-4330-20130415.1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5511/console |


This message was automatically generated.

 TaskAttemptCompletedEventTransition invalidates previously successful attempt 
 without checking if the newly completed attempt is successful
 ---

 Key: MAPREDUCE-4330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
Reporter: Bikas Saha
Assignee: Omkar Vinit Joshi
 Attachments: MAPREDUCE-4330-20130415.1.patch, 
 MAPREDUCE-4330-20130415.patch, MAPREDUCE-4330-21032013.1.patch, 
 MAPREDUCE-4330-21032013.patch


 The previously completed attempt is removed from 
 successAttemptCompletionEventNoMap and marked OBSOLETE.
 After that, if the newly completed attempt is successful then it is added to 
 the successAttemptCompletionEventNoMap. 
 This seems wrong because the newly completed attempt could have failed, in which 
 case there is no need to invalidate the successful attempt.
 One error case would be a speculative attempt completing as 
 killed/failed after the successful version has completed.
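 A hedged, simplified model of the ordering issue (names and types are 
 illustrative; the real transition works with task-attempt completion events): 
 the previously recorded success should only be replaced once the new attempt 
 is known to have succeeded.
 {code}
 public class AttemptCompletionModel {
   public enum Status { SUCCEEDED, FAILED, KILLED }

   private String successfulAttemptId;  // stand-in for the success-event map entry

   public void onAttemptCompleted(String attemptId, Status status) {
     if (status == Status.SUCCEEDED) {
       // Only now is it safe to obsolete the previous success and record the new one.
       successfulAttemptId = attemptId;
     }
     // A failed or killed attempt (e.g. a late speculative attempt) leaves the
     // previously successful attempt untouched.
   }

   public String getSuccessfulAttemptId() {
     return successfulAttemptId;
   }
 }
 {code}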



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5611) CombineFileInputFormat only requests a single location per split when more could be optimal

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524808#comment-14524808
 ] 

Hadoop QA commented on MAPREDUCE-5611:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12613866/CombineFileInputFormat-trunk.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5545/console |


This message was automatically generated.

 CombineFileInputFormat only requests a single location per split when more 
 could be optimal
 ---

 Key: MAPREDUCE-5611
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5611
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Chandra Prakash Bhagtani
Assignee: Chandra Prakash Bhagtani
 Attachments: CombineFileInputFormat-trunk.patch


 I have come across an issue with CombineFileInputFormat. I ran a 
 hive query on approx. 1.2 GB of data with CombineHiveInputFormat, which internally 
 uses CombineFileInputFormat. My cluster size is 9 datanodes and 
 max.split.size is 256 MB.
 When I ran this query with replication factor 9, hive consistently created 
 all 6 tasks as rack-local, and with replication factor 3 it created 5 rack-local 
 and 1 data-local task. 
 When the replication factor is 9 (equal to the cluster size), all the tasks should 
 be data-local, as each datanode contains all the replicas of the input data, 
 but that is not happening, i.e. all the tasks are rack-local. 
 When I dug into the CombineFileInputFormat.java code in the getMoreSplits method, I 
 found the issue in the following snippet (especially in the case of a higher 
 replication factor):
 {code:title=CombineFileInputFormat.java|borderStyle=solid}
 for (Iterator<Map.Entry<String,
      List<OneBlockInfo>>> iter = nodeToBlocks.entrySet().iterator();
      iter.hasNext();) {
   Map.Entry<String, List<OneBlockInfo>> one = iter.next();
   nodes.add(one.getKey());
   List<OneBlockInfo> blocksInNode = one.getValue();
   // for each block, copy it into validBlocks. Delete it from
   // blockToNodes so that the same block does not appear in
   // two different splits.
   for (OneBlockInfo oneblock : blocksInNode) {
     if (blockToNodes.containsKey(oneblock)) {
       validBlocks.add(oneblock);
       blockToNodes.remove(oneblock);
       curSplitSize += oneblock.length;
       // if the accumulated split size exceeds the maximum, then
       // create this split.
       if (maxSize != 0 && curSplitSize >= maxSize) {
         // create an input split and add it to the splits array
         addCreatedSplit(splits, nodes, validBlocks);
         curSplitSize = 0;
         validBlocks.clear();
       }
     }
   }
 {code}
 The first node in the nodeToBlocks map has all the replicas of the input file, so the 
 above code creates 6 splits, each with only one location. Now if the JT doesn't 
 schedule these tasks on that node, all the tasks will be rack-local, even 
 though all the other datanodes have all the other replicas.
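 As a hedged illustration (not the patch itself): a CombineFileSplit carries an 
 array of locations, and listing every node that holds the blocks, rather than 
 only the first map entry, is what gives the scheduler room to run the task 
 data-locally. The hostnames and path below are placeholders.
 {code}
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.lib.input.CombineFileSplit;

 public class SplitLocationExample {
   public static CombineFileSplit buildSplit() {
     Path[] files   = { new Path("/data/part-00000") };  // placeholder input
     long[] starts  = { 0L };
     long[] lengths = { 256L * 1024 * 1024 };
     // With a high replication factor, several nodes hold the same blocks;
     // passing all of them keeps the task eligible for data-local scheduling.
     String[] locations = { "node1", "node2", "node3" };
     return new CombineFileSplit(files, starts, lengths, locations);
   }
 }
 {code}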



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524798#comment-14524798
 ] 

Hadoop QA commented on MAPREDUCE-5621:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12613541/MAPREDUCE-5621.patch |
| Optional Tests | shellcheck |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5544/console |


This message was automatically generated.

 mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
 

 Key: MAPREDUCE-5621
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 3.0.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
Priority: Minor
 Attachments: MAPREDUCE-5621.patch


 mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log 
 directory.
 These commands are always executed, whether or not the directory already exists. In 
 addition, they are executed not only when starting the daemon but also when stopping it.
 An 'if' guard should be added, as in hadoop-daemon.sh and yarn-daemon.sh, to control this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3486) All jobs of all queues will be returned, whether a particular queueName is specified or not

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524823#comment-14524823
 ] 

Hadoop QA commented on MAPREDUCE-3486:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12505621/MAPREDUCE-3486.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5547/console |


This message was automatically generated.

 All jobs of all queues will be returned, whether a particular queueName is 
 specified or not
 ---

 Key: MAPREDUCE-3486
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.3, 1.3.0, 1.2.2
Reporter: XieXianshan
Assignee: XieXianshan
Priority: Minor
 Attachments: MAPREDUCE-3486.patch


 JobTracker.getJobsFromQueue(queueName) will return all jobs of all queues 
 on the JobTracker even though I specify a queueName. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6350) JobHistory doesn't support fully-functional search

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524240#comment-14524240
 ] 

Hadoop QA commented on MAPREDUCE-6350:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 29s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  5s | The applied patch generated  3 
new checkstyle issues (total was 15, now 17). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 58s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests |   9m 20s | Tests passed in 
hadoop-mapreduce-client-app. |
| {color:green}+1{color} | mapreduce tests |   0m 45s | Tests passed in 
hadoop-mapreduce-client-common. |
| | |  47m 22s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12623739/YARN-1614.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d3d019c |
| checkstyle |  
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-app test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
 |
| hadoop-mapreduce-client-common test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5486/console |


This message was automatically generated.

 JobHistory doesn't support fully-functional search
 --

 Key: MAPREDUCE-6350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-1614.v1.patch, YARN-1614.v2.patch


 The job history server only outputs the first 50 characters of the job names 
 in the web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-2393) No total min share limitation of all pools

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524589#comment-14524589
 ] 

Hadoop QA commented on MAPREDUCE-2393:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12490803/MAPREDUCE-2393.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5503/console |


This message was automatically generated.

 No total min share limitation of all pools
 --

 Key: MAPREDUCE-2393
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2393
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 0.21.0
Reporter: Denny Ye
  Labels: fair, scheduler
 Attachments: MAPREDUCE-2393.patch


 Hi, there is no limit on the total min share of all pools relative to the 
 cluster's total slots. A user can define an arbitrary min share for each pool. 
 The fair scheduler design document describes such a limit, but the code does 
 not enforce it. This can be critical for slot distribution: one pool can hold 
 all of the cluster's slots to meet a min share that greatly exceeds the 
 cluster total. In that case the min shares should be scaled down 
 proportionally.
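 A minimal sketch of that proportional scale-down (illustrative only; the 
 method and variable names below are not from the actual fair scheduler code):
 {code}
 // If the configured min shares exceed the cluster's total slots,
 // scale each pool's min share down by the same factor.
 static int[] scaleMinShares(int[] configuredMinShares, int totalSlots) {
   long sum = 0;
   for (int share : configuredMinShares) {
     sum += share;
   }
   if (sum == 0 || sum <= totalSlots) {
     return configuredMinShares;  // nothing to scale
   }
   int[] scaled = new int[configuredMinShares.length];
   for (int i = 0; i < configuredMinShares.length; i++) {
     // each pool keeps the same fraction of the (smaller) total
     scaled[i] = (int) ((long) configuredMinShares[i] * totalSlots / sum);
   }
   return scaled;
 }
 {code}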



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4261) MRAppMaster throws NPE while stopping RMContainerAllocator service

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524597#comment-14524597
 ] 

Hadoop QA commented on MAPREDUCE-4261:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12528092/MAPREDUCE-4261.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5504/console |


This message was automatically generated.

 MRAppMaster throws NPE while stopping RMContainerAllocator service
 --

 Key: MAPREDUCE-4261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.1-alpha, 2.0.2-alpha
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4261.patch


 {code:xml}
 2012-05-16 18:55:54,222 INFO [Thread-1] 
 org.apache.hadoop.yarn.service.CompositeService: Error stopping 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter.stop(MRAppMaster.java:716)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1036)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
 2012-05-16 18:55:54,222 INFO [Thread-1] 
 org.apache.hadoop.yarn.service.CompositeService: Error stopping 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getStat(RMContainerAllocator.java:521)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.stop(RMContainerAllocator.java:227)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.stop(MRAppMaster.java:668)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1036)
   at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524602#comment-14524602
 ] 

Hadoop QA commented on MAPREDUCE-4469:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12565910/MAPREDUCE-4469_rev5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5506/console |


This message was automatically generated.

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch, 
 MAPREDUCE-4469_rev3.patch, MAPREDUCE-4469_rev4.patch, 
 MAPREDUCE-4469_rev5.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot seconds by 8%)
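 A sketch of the flag idea mentioned above. The property name used here is 
 hypothetical (invented purely for illustration), and the factory call assumes 
 the branch-1 ResourceCalculatorPlugin API:
 {code}
 // Hypothetical switch: skip the /proc-scanning plugin entirely when disabled.
 boolean useResourceCalc =
     conf.getBoolean("mapreduce.task.resource-calculator.enabled", true);
 ResourceCalculatorPlugin calc = useResourceCalc
     ? ResourceCalculatorPlugin.getResourceCalculatorPlugin(null, conf)
     : null;  // a null plugin means no per-task resource polling
 {code}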



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4998) backport MAPREDUCE-3376: Old mapred API combiner uses NULL reporter to branch-1

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524604#comment-14524604
 ] 

Hadoop QA commented on MAPREDUCE-4998:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12568797/MAPREDUCE-4998-branch-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5508/console |


This message was automatically generated.

 backport MAPREDUCE-3376: Old mapred API combiner uses NULL reporter to 
 branch-1
 ---

 Key: MAPREDUCE-4998
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4998
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Reporter: Jim Donofrio
Priority: Minor
 Attachments: MAPREDUCE-4998-branch-1.patch


 http://s.apache.org/eI9
 backport MAPREDUCE-3376: Old mapred API combiner uses NULL reporter to 
 branch-1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4882) Error in estimating the length of the output file in Spill Phase

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524601#comment-14524601
 ] 

Hadoop QA commented on MAPREDUCE-4882:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12566626/MAPREDUCE-4882.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5507/console |


This message was automatically generated.

 Error in estimating the length of the output file in Spill Phase
 

 Key: MAPREDUCE-4882
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4882
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 1.0.3
 Environment: Any Environment
Reporter: Lijie Xu
Assignee: Jerry Chen
  Labels: patch
 Attachments: MAPREDUCE-4882.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 The sortAndSpill() method in MapTask.java has an error in estimating the 
 length of the output file: the long size should be (bufvoid - bufstart) + 
 bufend, not (bufvoid - bufend) + bufstart, when bufend < bufstart.
 Here is the original code in MapTask.java.
  private void sortAndSpill() throws IOException, ClassNotFoundException,
InterruptedException {
   //approximate the length of the output file to be the length of the
   //buffer + header lengths for the partitions
   long size = (bufend >= bufstart
   ? bufend - bufstart
   : (bufvoid - bufend) + bufstart) +
   partitions * APPROX_HEADER_LENGTH;
   FSDataOutputStream out = null;
 --
 I had a test on TeraSort. A snippet from mapper's log is as follows:
 MapTask: Spilling map output: record full = true
 MapTask: bufstart = 157286200; bufend = 10485460; bufvoid = 199229440
 MapTask: kvstart = 262142; kvend = 131069; length = 655360
 MapTask: Finished spill 3
 In this case, Spill Bytes should be (199229440 - 157286200) + 10485460 = 
 52428700 (about 52 MB), because the number of spilled records is 524287 and 
 each record costs 100 B.
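 For comparison, a sketch of the corrected estimate described above (same 
 field names as the quoted MapTask.java snippet; this reflects the reporter's 
 reasoning, not necessarily the committed patch):
 {code}
 // When the buffer has wrapped (bufend < bufstart), the spilled region is
 // (bufvoid - bufstart) + bufend, i.e. the tail of the buffer plus the head.
 long size = (bufend >= bufstart
     ? bufend - bufstart
     : (bufvoid - bufstart) + bufend) +
     partitions * APPROX_HEADER_LENGTH;
 {code}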



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4917) multiple BlockFixer should be supported in order to improve scalability and reduce too much work on single BlockFixer

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524598#comment-14524598
 ] 

Hadoop QA commented on MAPREDUCE-4917:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12563471/MAPREDUCE-4917.2.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5505/console |


This message was automatically generated.

 multiple BlockFixer should be supported in order to improve scalability and 
 reduce too much work on single BlockFixer
 -

 Key: MAPREDUCE-4917
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4917
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/raid
Affects Versions: 0.22.0
Reporter: Jun Jin
Assignee: Jun Jin
  Labels: patch
 Fix For: 0.22.0

 Attachments: MAPREDUCE-4917.1.patch, MAPREDUCE-4917.2.patch

   Original Estimate: 672h
  Remaining Estimate: 672h

 The current implementation can only run a single BlockFixer, since the fsck 
 (in RaidDFSUtil.getCorruptFiles) only checks the whole DFS file system. If 
 multiple BlockFixers are launched, they will all do the same work and try to 
 fix the same files.
 The change/fix will mainly be in BlockFixer.java and 
 RaidDFSUtil.getCorruptFiles(), to enable fsck to check the different paths 
 defined in separate Raid.xml files for a single RaidNode/BlockFixer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524607#comment-14524607
 ] 

Hadoop QA commented on MAPREDUCE-4956:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12574452/MAPREDUCE-4956_3.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5509/console |


This message was automatically generated.

 The Additional JH Info Should Be Exposed
 

 Key: MAPREDUCE-4956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4956
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4956_1.patch, MAPREDUCE-4956_2.patch, 
 MAPREDUCE-4956_3.patch


 In MAPREDUCE-4838, additional info was added to the job history. This info is 
 worth exposing, at least via the UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524728#comment-14524728
 ] 

Hadoop QA commented on MAPREDUCE-4980:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  1s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12611165/MAPREDUCE-4980--n8.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5533/console |


This message was automatically generated.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi Ozawa
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, 
 MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980--n7.patch, 
 MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, MAPREDUCE-4980.1.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel test execution feature. By 
 using it, the tests can run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5403) MR changes to accommodate yarn.application.classpath being moved to the server-side

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524698#comment-14524698
 ] 

Hadoop QA commented on MAPREDUCE-5403:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12594253/MAPREDUCE-5403-2.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5527/console |


This message was automatically generated.

 MR changes to accommodate yarn.application.classpath being moved to the 
 server-side
 ---

 Key: MAPREDUCE-5403
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5403-1.patch, MAPREDUCE-5403-2.patch, 
 MAPREDUCE-5403.patch


 yarn.application.classpath is a confusing property because it is used by 
 MapReduce and not YARN, and MapReduce already has 
 mapreduce.application.classpath, which provides the same functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5377) JobID is not displayed truly by hadoop job -history command

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524709#comment-14524709
 ] 

Hadoop QA commented on MAPREDUCE-5377:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12591055/MAPREDUCE-5377.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5531/console |


This message was automatically generated.

 JobID is not displayed truly by hadoop job -history command
 -

 Key: MAPREDUCE-5377
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5377
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.2.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5377.patch


 The JobID output by the hadoop job -history command is a wrong string.
 {quote}
 [hadoop@hadoop hadoop]$ hadoop job -history terasort
 Hadoop job: 0001_1374260789919_hadoop
 =
 Job tracker host name: job
 job tracker start time: Tue May 18 15:39:51 PDT 1976
 User: hadoop
 JobName: TeraSort
 JobConf: 
 hdfs://hadoop:8020/hadoop/mapred/staging/hadoop/.staging/job_201307191206_0001/job.xml
 Submitted At: 19-7-2013 12:06:29
 Launched At: 19-7-2013 12:06:30 (0sec)
 Finished At: 19-7-2013 12:06:44 (14sec)
 Status: SUCCESS
 {quote}
 In this example, it should show job_201307191206_0001 after Hadoop job:, but 
 it shows 0001_1374260789919_hadoop. In addition, the Job tracker host name 
 and job tracker start time are invalid.
 This problem can be solved by fixing how the jobId is set in HistoryViewer. 
 In addition, the JobTracker information in HistoryViewer should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5150) Backport 2009 terasort (MAPREDUCE-639) to branch-1

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524707#comment-14524707
 ] 

Hadoop QA commented on MAPREDUCE-5150:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12578622/MAPREDUCE-5150-branch-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5530/console |


This message was automatically generated.

 Backport 2009 terasort (MAPREDUCE-639) to branch-1
 --

 Key: MAPREDUCE-5150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 1.2.0
Reporter: Gera Shegalov
Priority: Minor
 Attachments: MAPREDUCE-5150-branch-1.patch


 Users evaluate the performance of Hadoop clusters using benchmarks such as 
 TeraSort. However, the terasort version in branch-1 is outdated: it works on 
 a teragen dataset that cannot exceed 4 billion unique keys, and it does not 
 have the fast non-sampling partitioner SimplePartitioner either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3936) Clients should not enforce counter limits

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524711#comment-14524711
 ] 

Hadoop QA commented on MAPREDUCE-3936:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12544972/MAPREDUCE-3936.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5532/console |


This message was automatically generated.

 Clients should not enforce counter limits 
 --

 Key: MAPREDUCE-3936
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3936
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3936.patch, MAPREDUCE-3936.patch


 The code for enforcing counter limits (from MAPREDUCE-1943) creates a static 
 JobConf instance to load the limits, which may throw an exception if the 
 client limit is set to be lower than the limit on the cluster (perhaps 
 because the cluster limit was raised from the default).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (MAPREDUCE-5365) Set mapreduce.job.classloader to true by default

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524704#comment-14524704
 ] 

Hadoop QA commented on MAPREDUCE-5365:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12590345/MAPREDUCE-5365.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5529/console |


This message was automatically generated.

 Set mapreduce.job.classloader to true by default
 

 Key: MAPREDUCE-5365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5365
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5365.patch


 MAPREDUCE-1700 introduced the mapreduce.job.classloader option, which uses a 
 custom classloader to separate system classes from user classes. It seems 
 like there are only rare cases when a user would not want this on, and that 
 it should be enabled by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524703#comment-14524703
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12533607/MAPREDUCE-4346_rev4.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5528/console |


This message was automatically generated.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch


 The current implementation of JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents unneeded overhead, especially for clients only interested in jobs 
 in specific state(s). 
 It would be beneficial to add a refined version where only jobs with specific 
 statuses are returned and including retired jobs is optional. 
 I'll be uploading an initial patch momentarily.
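 A rough sketch of such a filter (illustrative only; it assumes the branch-1 
 JobStatus#getRunState() and JobStatus#isRetired() accessors and is not the 
 attached patch):
 {code}
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Set;
 import org.apache.hadoop.mapred.JobStatus;

 public class JobStatusFilter {
   // Keep only jobs whose run state was requested, optionally skipping
   // retired jobs.
   public static List<JobStatus> filter(JobStatus[] all,
       Set<Integer> wantedRunStates, boolean includeRetired) {
     List<JobStatus> result = new ArrayList<JobStatus>();
     for (JobStatus status : all) {
       if (!includeRetired && status.isRetired()) {
         continue;
       }
       if (wantedRunStates.contains(status.getRunState())) {
         result.add(status);
       }
     }
     return result;
   }
 }
 {code}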



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3807) JobTracker needs fix similar to HDFS-94

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524693#comment-14524693
 ] 

Hadoop QA commented on MAPREDUCE-3807:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12515105/MAPREDUCE-3807.patch |
| Optional Tests |  |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5526/console |


This message was automatically generated.

 JobTracker needs fix similar to HDFS-94
 ---

 Key: MAPREDUCE-3807
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3807
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Harsh J
  Labels: newbie
 Attachments: MAPREDUCE-3807.patch


 1.0 JobTracker's jobtracker.jsp page currently shows:
 {code}
 <h2>Cluster Summary (Heap Size is <%= 
 StringUtils.byteDesc(Runtime.getRuntime().totalMemory()) %>/<%= 
 StringUtils.byteDesc(Runtime.getRuntime().maxMemory()) %>)</h2>
 {code}
 It could use the same improvement as HDFS-94 to reflect live heap usage more 
 accurately.
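 A sketch of what an HDFS-94-style summary could look like (plain Java for 
 clarity; the real change would live in jobtracker.jsp, and StringUtils is 
 org.apache.hadoop.util.StringUtils as in the snippet above):
 {code}
 import org.apache.hadoop.util.StringUtils;

 public class HeapSummary {
   // Report used heap in addition to committed and max, instead of only
   // totalMemory()/maxMemory().
   public static String clusterSummary() {
     Runtime rt = Runtime.getRuntime();
     long used = rt.totalMemory() - rt.freeMemory();
     return "Cluster Summary (Heap Size is "
         + StringUtils.byteDesc(used) + " used / "
         + StringUtils.byteDesc(rt.totalMemory()) + " committed / "
         + StringUtils.byteDesc(rt.maxMemory()) + " max)";
   }
 }
 {code}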



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6345) Documentation fix for when CRLA is enabled for MRAppMaster logs

2015-05-01 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6345:
-
Summary: Documentation fix for when CRLA is enabled for MRAppMaster logs  
(was: Documentation fix for when CRLA is enabled for MR AppMaster logs)

 Documentation fix for when CRLA is enabled for MRAppMaster logs
 ---

 Key: MAPREDUCE-6345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6345
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Trivial
 Attachments: MAPREDUCE-6345.patch


 CRLA is enabled for the ApplicationMaster when both 
 yarn.app.mapreduce.am.container.log.limit.kb (not 
 mapreduce.task.userlog.limit.kb) and 
 yarn.app.mapreduce.am.container.log.backups are greater than zero.
 This was changed in MAPREDUCE-5773.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6345) Documentation fix for when CRLA is enabled for MR AppMaster logs

2015-05-01 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6345:
-
Summary: Documentation fix for when CRLA is enabled for MR AppMaster logs  
(was: Documentation fix for when CRLA is enabled for MR App Master's logs)

 Documentation fix for when CRLA is enabled for MR AppMaster logs
 

 Key: MAPREDUCE-6345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6345
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Trivial
 Attachments: MAPREDUCE-6345.patch


 CRLA is enabled for the ApplicationMaster when both 
 yarn.app.mapreduce.am.container.log.limit.kb (not 
 mapreduce.task.userlog.limit.kb) and 
 yarn.app.mapreduce.am.container.log.backups are greater than zero.
 This was changed in MAPREDUCE-5773.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6345) Documentation fix for when CRLA is enabled for MRAppMaster logs

2015-05-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524453#comment-14524453
 ] 

Hudson commented on MAPREDUCE-6345:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #7716 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7716/])
MAPREDUCE-6345. Documentation fix for when CRLA is enabled for MRAppMaster 
logs. (Rohit Agarwal via gera) (gera: rev 
f1a152cc0adc071277c80637ea6f5faa0bf06a1a)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml


 Documentation fix for when CRLA is enabled for MRAppMaster logs
 ---

 Key: MAPREDUCE-6345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6345
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Trivial
 Attachments: MAPREDUCE-6345.patch


 CRLA is enabled for the ApplicationMaster when both 
 yarn.app.mapreduce.am.container.log.limit.kb (not 
 mapreduce.task.userlog.limit.kb) and 
 yarn.app.mapreduce.am.container.log.backups are greater than zero.
 This was changed in MAPREDUCE-5773.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4136) Hadoop streaming might succeed even though reducer fails

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524554#comment-14524554
 ] 

Hadoop QA commented on MAPREDUCE-4136:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12522230/mapreduce-4136.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5493/console |


This message was automatically generated.

 Hadoop streaming might succeed even though reducer fails
 -

 Key: MAPREDUCE-4136
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4136
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.20.205.0
Reporter: Wouter de Bie
 Attachments: mapreduce-4136.patch


 Hadoop streaming can succeed even though the reducer has failed. This 
 happens when Hadoop calls {{PipeReducer.close()}} but, in the meantime, the 
 reducer has failed and the process has died. When {{clientOut_.flush()}} 
 throws an {{IOException}} in {{PipeMapRed.mapRedFinish()}}, this exception is 
 caught but only logged. The exit status of the child process is never checked 
 and the task is marked as successful.
 I've attached a patch that seems to fix it for us.
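 The shape of the fix, as a sketch only (not the attached patch; the child 
 process handle name is illustrative):
 {code}
 // In PipeMapRed.mapRedFinish(): instead of only logging the flush failure,
 // check the streaming child's exit status and fail the task if it is non-zero.
 try {
   clientOut_.flush();
 } catch (IOException io) {
   LOG.warn("Flushing streaming output failed", io);
   try {
     int exitCode = childProcess.waitFor();  // childProcess: the reducer's Process
     if (exitCode != 0) {
       throw new RuntimeException(
           "Streaming child exited with code " + exitCode, io);
     }
   } catch (InterruptedException ie) {
     Thread.currentThread().interrupt();
   }
 }
 {code}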



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3876) vertica query, sql command not properly ended

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524547#comment-14524547
 ] 

Hadoop QA commented on MAPREDUCE-3876:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12516880/HADOOP-oracleDriver-src.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5492/console |


This message was automatically generated.

 vertica query, sql command not properly ended
 -

 Key: MAPREDUCE-3876
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3876
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 1.0.0
 Environment: Red Hat 5.5
 Oracle 11
Reporter: Joseph Doss
  Labels: hadoop, newbie, patch
 Attachments: HADOOP-oracleDriver-src.patch


 When running a test script, we're getting a java IO exception thrown.
 This test works on hadoop-0.20.0 but not on hadoop-1.0.0.
 Fri Feb 17 11:36:40 EST 2012
 Running processes with name syncGL.sh: 0
 LIB_JARS: 
 /home/hadoop/verticasync/lib/vertica_4.1.14_jdk_5.jar,/home/hadoop/verticasync/lib/mail.jar,/home/hadoop/verticasync/lib/jdbc14.jar
 VERTICA_SYNC_JAR: /home/hadoop/verticasync/lib/vertica-sync.jar
 PROPERTIES_FILE: 
 /home/hadoop/verticasync/config/ssp-vertica-sync-gl.properties
 Starting Vertica data sync - GL - process
 Warning: $HADOOP_HOME is deprecated.
 12/02/17 11:36:43 INFO mapred.JobClient: Running job: job_201202171122_0001
 12/02/17 11:36:44 INFO mapred.JobClient:  map 0% reduce 0%
 12/02/17 11:36:56 INFO mapred.JobClient: Task Id : 
 attempt_201202171122_0001_m_00_0, Status : FAILED
 java.io.IOException: ORA-00933: SQL command not properly ended
   at 
 org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:197)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 12/02/17 11:36:57 INFO mapred.JobClient: Task Id : 
 attempt_201202171122_0001_m_01_0, Status : FAILED
 java.io.IOException: ORA-00933: SQL command not properly ended
   at 
 org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:197)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4506) EofException / 'connection reset by peer' while copying map output

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524557#comment-14524557
 ] 

Hadoop QA commented on MAPREDUCE-4506:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12538889/ReduceTask.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5494/console |


This message was automatically generated.

 EofException / 'connection reset by peer' while copying map output 
 ---

 Key: MAPREDUCE-4506
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4506
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.3
 Environment: Ubuntu Linux 12.04 LTS, 64-bit, Java 6 update 33
Reporter: Piotr Kołaczkowski
Priority: Minor
 Attachments: RamManager.patch, ReduceTask.patch


 When running complex mapreduce jobs with many mappers and reducers (e.g. 8 
 mappers and 8 reducers on an 8-core machine), the following exceptions 
 sometimes pop up in the logs during the shuffle phase:
 {noformat}
 WARN [570516323@qtp-2060060479-164] 2012-07-19 02:50:21,229 TaskTracker.java 
 (line 3894) getMapOutput(attempt_201207161621_0217_m_71_0,0) failed :
 org.mortbay.jetty.EofException
 at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
 at 
 org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:568)
 at 
 org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1005)
 at 
 org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:648)
 at 
 org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:579)
 at 
 org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3872)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
 at 
 org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
 at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
 at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
 at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
 at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
 at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at org.mortbay.jetty.Server.handle(Server.java:326)
 at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
 at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
 at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
 at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Caused by: java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcher.write0(Native Method)
 at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
 at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
 at sun.nio.ch.IOUtil.write(IOUtil.java:43)
 at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
 at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:169)
 at 
 org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
 at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:721)
 {noformat}
 The problem looks like a network problem at first; however, it turns out 
 that hadoop shuffleInMemory sometimes deliberately closes map-output-copy 
 connections just to reopen them a few milliseconds later, because of 
 temporary 

[jira] [Commented] (MAPREDUCE-3882) fix some compile warnings of hadoop-mapreduce-examples

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524546#comment-14524546
 ] 

Hadoop QA commented on MAPREDUCE-3882:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12515084/mapreduce-3882.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5491/console |


This message was automatically generated.

 fix some compile warnings of hadoop-mapreduce-examples
 --

 Key: MAPREDUCE-3882
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3882
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
 Environment: Windows 7
Reporter: Changming Sun
Priority: Minor
 Attachments: mapreduce-3882.patch

   Original Estimate: 2m
  Remaining Estimate: 2m

 fix some compile warnings of hadoop-mapreduce-examples



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4308) Remove excessive split log messages

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524565#comment-14524565
 ] 

Hadoop QA commented on MAPREDUCE-4308:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12530811/mapreduce-4308-branch-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5497/console |


This message was automatically generated.

 Remove excessive split log messages
 ---

 Key: MAPREDUCE-4308
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4308
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 1.0.3
Reporter: Kihwal Lee
 Attachments: mapreduce-4308-branch-1.patch


 Job tracker currently prints out information on every split.
 {noformat}
 2012-05-20 00:06:01,985 INFO org.apache.hadoop.mapred.JobInProgress: 
 tip:task_201205100740_1745_m_00 has split on node:/192.168.0.1
 /my.totally.madeup.host.com
 {noformat}
 I looked at one cluster and these messages were taking up more than 30% of 
 the JT log. If jobs have a large number of maps, it can be worse. I think it 
 is reasonable to lower the log level of this statement from INFO to DEBUG.
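 The change itself is small; a sketch (variable names are illustrative):
 {code}
 // Emit the per-split message only when DEBUG logging is enabled.
 if (LOG.isDebugEnabled()) {
   LOG.debug("tip:" + tipId + " has split on node:" + node);
 }
 {code}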



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-1290) DBOutputFormat does not support rewriteBatchedStatements when using MySQL jdbc drivers

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524559#comment-14524559
 ] 

Hadoop QA commented on MAPREDUCE-1290:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12542003/MapReduce-1290-trunk.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5495/console |


This message was automatically generated.

 DBOutputFormat does not support rewriteBatchedStatements when using MySQL 
 jdbc drivers
 --

 Key: MAPREDUCE-1290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Joe Crobak
  Labels: DBOutoutFormat, patch
 Attachments: MAPREDUCE-1290.patch, MapReduce-1290-trunk.patch


 The DBOutputFormat adds a semicolon to the end of the INSERT statement that 
 it uses to save fields to the database. Semicolons are typically used in 
 command-line programs but are not needed when using the JDBC API. In this 
 case, the stray semicolon breaks rewriteBatchedStatements support. See 
 http://forums.mysql.com/read.php?39,271526,271526#msg-271526 for an example.
 In my use case, rewriteBatchedStatements is very useful because it increases 
 the speed of inserts and reduces memory consumption.
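 A simplified sketch of the point being made (not the attached patch; the real 
 DBOutputFormat#constructQuery also lists the column names):
 {code}
 // Build the INSERT without a trailing ';' so the MySQL driver can rewrite
 // batched statements into a single multi-row INSERT.
 public String constructQuery(String table, String[] fieldNames) {
   StringBuilder query = new StringBuilder("INSERT INTO ").append(table);
   query.append(" VALUES (");
   for (int i = 0; i < fieldNames.length; i++) {
     query.append("?");
     if (i != fieldNames.length - 1) {
       query.append(",");
     }
   }
   query.append(")");  // note: no trailing semicolon
   return query.toString();
 }
 {code}
 On the driver side, batched rewriting is enabled through the JDBC URL, e.g. 
 jdbc:mysql://host/db?rewriteBatchedStatements=true.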



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4639) CombineFileInputFormat#getSplits should throw IOException when input paths contain a directory

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524564#comment-14524564
 ] 

Hadoop QA commented on MAPREDUCE-4639:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12544000/MAPREDUCE-4639.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5496/console |


This message was automatically generated.

 CombineFileInputFormat#getSplits should throw IOException when input paths 
 contain a directory
 --

 Key: MAPREDUCE-4639
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4639
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Reporter: Jim Donofrio
Priority: Minor
 Attachments: MAPREDUCE-4639.patch


 FileInputFormat#getSplits throws an IOException when the input paths contain 
 a directory. CombineFileInputFormat should do the same; otherwise the job 
 will not fail until the record reader is initialized, when FileSystem#open 
 will say that the directory does not exist.
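 A sketch of the proposed check (not the attached patch), mirroring what 
 FileInputFormat#getSplits already does:
 {code}
 // Inside CombineFileInputFormat#getSplits: fail fast if any input path
 // resolves to a directory rather than a file.
 for (FileStatus stat : listStatus(job)) {
   if (stat.isDirectory()) {
     throw new IOException("Not a file: " + stat.getPath());
   }
 }
 {code}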



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4473) tasktracker rank on machines.jsp?type=active

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524567#comment-14524567
 ] 

Hadoop QA commented on MAPREDUCE-4473:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12537657/MAPREDUCE-4473.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5498/console |


This message was automatically generated.

 tasktracker rank on machines.jsp?type=active
 

 Key: MAPREDUCE-4473
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4473
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 0.20.2, 0.21.0, 0.22.0, 0.23.0, 0.23.1, 1.0.0, 1.0.1, 
 1.0.2, 1.0.3
Reporter: jian fan
Priority: Minor
  Labels: tasktracker
 Attachments: MAPREDUCE-4473.patch


 Sometimes we need a simple way to judge which tasktracker is down from the 
 machines.jsp?type=active page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4293) Rumen TraceBuilder gets NPE some times

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524545#comment-14524545
 ] 

Hadoop QA commented on MAPREDUCE-4293:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12530340/4293.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5490/console |


This message was automatically generated.

 Rumen TraceBuilder gets NPE some times
 --

 Key: MAPREDUCE-4293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4293
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Attachments: 4293.patch


 Rumen TraceBuilder's JobBuilder.processTaskFailedEvent throws NPE if 
 failedDueToAttempt is not available in history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524766#comment-14524766
 ] 

Hadoop QA commented on MAPREDUCE-5748:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12635637/0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5540/console |


This message was automatically generated.

 Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
 

 Key: MAPREDUCE-5748
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: 
 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch


 Starting around line 510:
 {code}
   ChannelFuture lastMap = null;
   for (String mapId : mapIds) {
 ...
   }
   lastMap.addListener(metrics);
   lastMap.addListener(ChannelFutureListener.CLOSE);
 {code}
 If mapIds is empty, lastMap remains null, leading to an NPE in the 
 addListener() call.
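 A minimal sketch of the guard (how an empty request should otherwise be 
 answered is a separate question):
 {code}
 ChannelFuture lastMap = null;
 for (String mapId : mapIds) {
   // ... existing per-map shuffle logic ...
 }
 // Only attach the listeners if at least one map output was written.
 if (lastMap != null) {
   lastMap.addListener(metrics);
   lastMap.addListener(ChannelFutureListener.CLOSE);
 }
 {code}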



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524778#comment-14524778
 ] 

Hadoop QA commented on MAPREDUCE-5621:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12613541/MAPREDUCE-5621.patch |
| Optional Tests | shellcheck |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5542/console |


This message was automatically generated.

 mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
 

 Key: MAPREDUCE-5621
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 3.0.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
Priority: Minor
 Attachments: MAPREDUCE-5621.patch


 mr-jobhistory-daemon.sh executes the mkdir and chown commands to prepare the 
 log output directory.
 These commands are always executed, whether or not the directory already 
 exists, and they run not only when starting the daemon but also when stopping 
 it.
 The script should add an if check, as hadoop-daemon.sh and yarn-daemon.sh do, 
 to control this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3876) vertica query, sql command not properly ended

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524539#comment-14524539
 ] 

Hadoop QA commented on MAPREDUCE-3876:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12516880/HADOOP-oracleDriver-src.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5489/console |


This message was automatically generated.

 vertica query, sql command not properly ended
 -

 Key: MAPREDUCE-3876
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3876
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 1.0.0
 Environment: Red Hat 5.5
 Oracle 11
Reporter: Joseph Doss
  Labels: hadoop, newbie, patch
 Attachments: HADOOP-oracleDriver-src.patch


 When running a test script, we're getting a java IO exception thrown.
 This test works on hadoop-0.20.0 but not on hadoop-1.0.0.
 Fri Feb 17 11:36:40 EST 2012
 Running processes with name syncGL.sh: 0
 LIB_JARS: 
 /home/hadoop/verticasync/lib/vertica_4.1.14_jdk_5.jar,/home/hadoop/verticasync/lib/mail.jar,/home/hadoop/verticasync/lib/jdbc14.jar
 VERTICA_SYNC_JAR: /home/hadoop/verticasync/lib/vertica-sync.jar
 PROPERTIES_FILE: 
 /home/hadoop/verticasync/config/ssp-vertica-sync-gl.properties
 Starting Vertica data sync - GL - process
 Warning: $HADOOP_HOME is deprecated.
 12/02/17 11:36:43 INFO mapred.JobClient: Running job: job_201202171122_0001
 12/02/17 11:36:44 INFO mapred.JobClient:  map 0% reduce 0%
 12/02/17 11:36:56 INFO mapred.JobClient: Task Id : 
 attempt_201202171122_0001_m_00_0, Status : FAILED
 java.io.IOException: ORA-00933: SQL command not properly ended
   at 
 org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 12/02/17 11:36:57 INFO mapred.JobClient: Task Id : 
 attempt_201202171122_0001_m_01_0, Status : FAILED
 java.io.IOException: ORA-00933: SQL command not properly ended
   at 
 org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3882) fix some compile warnings of hadoop-mapreduce-examples

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524537#comment-14524537
 ] 

Hadoop QA commented on MAPREDUCE-3882:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12515084/mapreduce-3882.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5488/console |


This message was automatically generated.

 fix some compile warnings of hadoop-mapreduce-examples
 --

 Key: MAPREDUCE-3882
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3882
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
 Environment: Windows 7
Reporter: Changming Sun
Priority: Minor
 Attachments: mapreduce-3882.patch

   Original Estimate: 2m
  Remaining Estimate: 2m

 fix some compile warnings of hadoop-mapreduce-examples



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4271) Make TestCapacityScheduler more robust with non-Sun JDK

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524663#comment-14524663
 ] 

Hadoop QA commented on MAPREDUCE-4271:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12567098/MAPREDUCE-4271-branch1-v2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5522/console |


This message was automatically generated.

 Make TestCapacityScheduler more robust with non-Sun JDK
 ---

 Key: MAPREDUCE-4271
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4271
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
  Labels: alt-jdk, capacity
 Attachments: MAPREDUCE-4271-branch1-v2.patch, 
 mapreduce-4271-branch-1.patch, test-afterepatch.result, 
 test-beforepatch.result, test-patch.result


 The capacity scheduler queue is initialized with a HashMap, the values of 
 which are later added to a list (a queue for assigning tasks). 
 TestCapacityScheduler depends on the order of that list and is therefore not 
 portable across JDKs.
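 As a hedged, self-contained illustration of the portability problem (not the 
 attached patch): HashMap iteration order is a JDK implementation detail, while 
 a sorted view is deterministic, which is the kind of ordering a portable test 
 can rely on.
 {code}
 import java.util.HashMap;
 import java.util.Map;
 import java.util.TreeMap;

 public class IterationOrderDemo {
   public static void main(String[] args) {
     Map<String, String> queues = new HashMap<String, String>();
     queues.put("queueB", "job_2");
     queues.put("queueA", "job_1");
     // The order of keySet() here depends on the JDK's HashMap implementation.
     System.out.println("HashMap order (JDK-dependent): " + queues.keySet());
     // A TreeMap (or an explicit sort) yields the same order on every JDK.
     System.out.println("TreeMap order (deterministic): "
         + new TreeMap<String, String>(queues).keySet());
   }
 }
 {code}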



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5490) MapReduce doesn't set the environment variable for children processes

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524759#comment-14524759
 ] 

Hadoop QA commented on MAPREDUCE-5490:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12629589/MAPREDUCE-5490.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5539/console |


This message was automatically generated.

 MapReduce doesn't set the environment variable for children processes
 -

 Key: MAPREDUCE-5490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: MAPREDUCE-5490.patch, mr-5490.patch, mr-5490.patch


 Currently, MapReduce uses the command line argument to pass the classpath to 
 the child. This breaks if the process forks a child that needs the same 
 classpath. Such a case happens in Hive when it uses map-side joins. I propose 
 that we make MapReduce in branch-1 use the CLASSPATH environment variable 
 like YARN does.
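 A minimal sketch of the proposed direction (not the branch-1 patch itself), 
 showing how a launcher can hand the classpath to a child through the CLASSPATH 
 environment variable so that any grandchildren the child forks inherit it too; 
 the child main class name here is a placeholder:
 {code}
 import java.io.IOException;

 public class LaunchWithClasspathEnv {
   public static void main(String[] args) throws IOException, InterruptedException {
     // Hypothetical child main class, used only for illustration.
     ProcessBuilder pb = new ProcessBuilder("java", "SomeChildMain");
     // Export the classpath via the environment instead of the command line.
     pb.environment().put("CLASSPATH", System.getProperty("java.class.path"));
     pb.inheritIO();
     Process child = pb.start();
     System.exit(child.waitFor());
   }
 }
 {code}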



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5704) Optimize nextJobId in JobTracker

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524744#comment-14524744
 ] 

Hadoop QA commented on MAPREDUCE-5704:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12621052/MAPREDUCE-5704.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5535/console |


This message was automatically generated.

 Optimize nextJobId in JobTracker
 

 Key: MAPREDUCE-5704
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5704
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, mrv1
Affects Versions: 1.2.1
Reporter: JamesLi
Assignee: JamesLi
 Attachments: MAPREDUCE-5704.patch


 When the jobtracker starts, nextJobId starts at 1. If we have already run 3000 
 jobs, then restart the jobtracker and run a new job, we cannot see the new job 
 on jobtracker:5030/jobhistory.jsp unless we click the "get more results" button.
 In jobhistory_jsp.java, the array SCAN_SIZES controls how many jobs are 
 displayed on jobhistory.jsp.
 I made a small change: when the jobtracker starts, it finds the biggest id under 
 the history done directory, and jobs start with maxId+1, or with 1 if no job 
 files can be found.
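 A rough sketch of the described change (not the attached patch); the history 
 file-name pattern used here is an assumption for illustration only:
 {code}
 import java.io.File;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 public class NextJobIdFromHistory {
   // Assumed layout: history file names contain "job_<timestamp>_<serial>".
   private static final Pattern JOB_ID = Pattern.compile("job_\\d+_(\\d+)");

   public static int nextJobId(File historyDoneDir) {
     int maxId = 0;
     File[] files = historyDoneDir.listFiles();
     if (files != null) {
       for (File f : files) {
         Matcher m = JOB_ID.matcher(f.getName());
         if (m.find()) {
           maxId = Math.max(maxId, Integer.parseInt(m.group(1)));
         }
       }
     }
     return maxId + 1; // starts at 1 when no job files can be found
   }
 }
 {code}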



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4502) Node-level aggregation with combining the result of maps

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524738#comment-14524738
 ] 

Hadoop QA commented on MAPREDUCE-4502:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  1s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12592783/MAPREDUCE-4502.10.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5534/console |


This message was automatically generated.

 Node-level aggregation with combining the result of maps
 

 Key: MAPREDUCE-4502
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Affects Versions: 3.0.0
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
 Attachments: MAPREDUCE-4502.1.patch, MAPREDUCE-4502.10.patch, 
 MAPREDUCE-4502.2.patch, MAPREDUCE-4502.3.patch, MAPREDUCE-4502.4.patch, 
 MAPREDUCE-4502.5.patch, MAPREDUCE-4502.6.patch, MAPREDUCE-4502.7.patch, 
 MAPREDUCE-4502.8.patch, MAPREDUCE-4502.8.patch, MAPREDUCE-4502.9.patch, 
 MAPREDUCE-4502.9.patch, MAPREDUCE-4525-pof.diff, design_v2.pdf, 
 design_v3.pdf, speculative_draft.pdf


 The shuffle cost is expensive in Hadoop despite the existence of the combiner, 
 because the scope of combining is limited to a single MapTask.
 To solve this problem, a good approach is to aggregate the results of maps per 
 node/rack by launching a combiner there.
 This JIRA is to implement the multi-level aggregation infrastructure, including 
 combining per container (MAPREDUCE-3902 is related) and coordinating containers 
 through the application master, without breaking the fault tolerance of jobs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4883) Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524751#comment-14524751
 ] 

Hadoop QA commented on MAPREDUCE-4883:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12566621/MAPREDUCE-4883.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5537/console |


This message was automatically generated.

 Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM
 --

 Key: MAPREDUCE-4883
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4883
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.2, 1.0.3
 Environment: Especially for 64bit JVM
Reporter: Lijie Xu
Assignee: Jerry Chen
  Labels: patch
 Attachments: MAPREDUCE-4883.patch

   Original Estimate: 12h
  Remaining Estimate: 12h

 In hadoop-0.20.2, hadoop-1.0.3 or other versions, reducer's shuffle buffer 
 size cannot exceed 2048MB (i.e., Integer.MAX_VALUE). This is reasonable for 
 32bit JVM.
 But for 64bit JVM, although reducer's JVM size can be set more than 2048MB 
 (e.g., mapred.child.java.opts=-Xmx4000m), the heap size used for shuffle 
 buffer is at most 2048MB * maxInMemCopyUse (default 0.7) not 4000MB * 
 maxInMemCopyUse. 
 So the pointed piece of code in ReduceTask.java needs modification for 64bit 
 JVM.
 ---
   private final long maxSize;
   private final long maxSingleShuffleLimit;
   
   private long size = 0;
   
   private Object dataAvailable = new Object();
   private long fullSize = 0;
   private int numPendingRequests = 0;
   private int numRequiredMapOutputs = 0;
   private int numClosed = 0;
   private boolean closed = false;
   
   public ShuffleRamManager(Configuration conf) throws IOException {
     final float maxInMemCopyUse =
       conf.getFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
     if (maxInMemCopyUse > 1.0 || maxInMemCopyUse < 0.0) {
       throw new IOException("mapred.job.shuffle.input.buffer.percent" +
                             maxInMemCopyUse);
     }
     // Allow unit tests to fix Runtime memory
 --> maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
 -->     (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
 -->   * maxInMemCopyUse);
     maxSingleShuffleLimit = (long)(maxSize * 
                                    MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
     LOG.info("ShuffleRamManager: MemoryLimit=" + maxSize +
              ", MaxSingleShuffleLimit=" + maxSingleShuffleLimit);
   }
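 As a hedged sketch of the kind of change the description asks for (not the 
 attached patch): computing the limit as a long avoids clamping a 64-bit heap 
 larger than 2 GB down to Integer.MAX_VALUE.
 {code}
 public class ShuffleBufferSizing {
   public static long maxShuffleBytes(float maxInMemCopyUse) {
     long heapMax = Runtime.getRuntime().maxMemory(); // full -Xmx as a long
     return (long) (heapMax * maxInMemCopyUse);       // no truncation to int
   }
 }
 {code}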



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3486) All jobs of all queues will be returned, whether a particular queueName is specified or not

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524757#comment-14524757
 ] 

Hadoop QA commented on MAPREDUCE-3486:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12505621/MAPREDUCE-3486.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5538/console |


This message was automatically generated.

 All jobs of all queues will be returned, whether a particular queueName is 
 specified or not
 ---

 Key: MAPREDUCE-3486
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.3, 1.3.0, 1.2.2
Reporter: XieXianshan
Assignee: XieXianshan
Priority: Minor
 Attachments: MAPREDUCE-3486.patch


 JobTracker.getJobsFromQueue(queueName) will return all jobs of all queues 
 known to the jobtracker even though I specify a queueName. 
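 A self-contained sketch of the expected behaviour (not the attached patch); the 
 Job record below is a stand-in for the JobTracker's internal job type:
 {code}
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.List;

 public class JobsByQueue {
   static class Job {
     final String id, queue;
     Job(String id, String queue) { this.id = id; this.queue = queue; }
   }

   static List<Job> jobsFromQueue(List<Job> allJobs, String queueName) {
     List<Job> matching = new ArrayList<Job>();
     for (Job job : allJobs) {
       if (job.queue.equals(queueName)) { // filter instead of returning everything
         matching.add(job);
       }
     }
     return matching;
   }

   public static void main(String[] args) {
     List<Job> all = Arrays.asList(new Job("job_1", "default"), new Job("job_2", "etl"));
     System.out.println(jobsFromQueue(all, "etl").size()); // prints 1, not 2
   }
 }
 {code}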



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5611) CombineFileInputFormat only requests a single location per split when more could be optimal

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524745#comment-14524745
 ] 

Hadoop QA commented on MAPREDUCE-5611:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12613866/CombineFileInputFormat-trunk.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5536/console |


This message was automatically generated.

 CombineFileInputFormat only requests a single location per split when more 
 could be optimal
 ---

 Key: MAPREDUCE-5611
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5611
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Chandra Prakash Bhagtani
Assignee: Chandra Prakash Bhagtani
 Attachments: CombineFileInputFormat-trunk.patch


 I have come across an issue with CombineFileInputFormat. I ran a hive query on 
 approximately 1.2 GB of data with CombineHiveInputFormat, which internally uses 
 CombineFileInputFormat. My cluster size is 9 datanodes and max.split.size is 
 256 MB.
 When I ran this query with replication factor 9, hive consistently creates all 
 6 tasks as rack-local, and with replication factor 3 it creates 5 rack-local 
 and 1 data-local task. 
 When the replication factor is 9 (equal to the cluster size), all the tasks 
 should be data-local, as each datanode contains all the replicas of the input 
 data, but that is not happening, i.e. all the tasks are rack-local. 
 When I dug into the CombineFileInputFormat.java code in the getMoreSplits 
 method, I found the issue in the following snippet (especially in the case of 
 a higher replication factor):
 {code:title=CombineFileInputFormat.java|borderStyle=solid}
 for (Iterator<Map.Entry<String,
      List<OneBlockInfo>>> iter = nodeToBlocks.entrySet().iterator();
      iter.hasNext();) {
   Map.Entry<String, List<OneBlockInfo>> one = iter.next();
   nodes.add(one.getKey());
   List<OneBlockInfo> blocksInNode = one.getValue();
   // for each block, copy it into validBlocks. Delete it from
   // blockToNodes so that the same block does not appear in
   // two different splits.
   for (OneBlockInfo oneblock : blocksInNode) {
     if (blockToNodes.containsKey(oneblock)) {
       validBlocks.add(oneblock);
       blockToNodes.remove(oneblock);
       curSplitSize += oneblock.length;
       // if the accumulated split size exceeds the maximum, then
       // create this split.
       if (maxSize != 0 && curSplitSize >= maxSize) {
         // create an input split and add it to the splits array
         addCreatedSplit(splits, nodes, validBlocks);
         curSplitSize = 0;
         validBlocks.clear();
       }
     }
   }
 {code}
 The first node in the nodeToBlocks map has all the replicas of the input file, 
 so the above code creates 6 splits, all with only one location. Now if the JT 
 doesn't schedule these tasks on that node, all the tasks will be rack-local, 
 even though all the other datanodes also hold all the other replicas.
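 A hedged sketch of the idea (not the attached patch itself), reusing the names 
 from the snippet above: when a split is created, record every node that still 
 hosts one of its blocks instead of only the node currently being iterated, so 
 the scheduler can run the task data-locally on any replica holder.
 {code}
 // Collect every replica holder of the blocks that make up this split.
 Set<String> splitLocations = new HashSet<String>();
 for (OneBlockInfo block : validBlocks) {
   splitLocations.addAll(Arrays.asList(block.hosts));
 }
 // Hand the scheduler the full location set, not just the single current node.
 addCreatedSplit(splits, splitLocations, validBlocks);
 {code}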



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524769#comment-14524769
 ] 

Hadoop QA commented on MAPREDUCE-4957:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12566460/MAPREDUCE-4957.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5541/console |


This message was automatically generated.

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor
 Attachments: MAPREDUCE-4957.patch, MAPREDUCE-4957.patch


 Running on a single node with mapreduce.framework.name set to local, I get the 
 following error:
 java.io.FileNotFoundException: File does not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
  
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  
 at java.lang.reflect.Method.invoke(Method.java:597) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Job Submission failed with exception 'java.io.FileNotFoundException(File does 
 not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5799) add default value of MR_AM_ADMIN_USER_ENV

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524194#comment-14524194
 ] 

Hadoop QA commented on MAPREDUCE-5799:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 56s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 43s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 31s | The applied patch generated  1 
new checkstyle issues (total was 26, now 26). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 41s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | mapreduce tests |  98m 52s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| | | 134m 55s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.mapred.TestMiniMRClientCluster |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729831/MAPREDUCE-5799.002.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3393461 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5485/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-jobclient.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5485/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5485/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5485/console |


This message was automatically generated.

 add default value of MR_AM_ADMIN_USER_ENV
 -

 Key: MAPREDUCE-5799
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5799
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: Liyin Liang
Assignee: Rajesh Kartha
 Attachments: MAPREDUCE-5799-1.diff, MAPREDUCE-5799.002.patch, 
 MAPREDUCE-5799.diff


 Submit a 1 map + 1 reduce sleep job with the following config:
 {code}
 <property>
   <name>mapreduce.map.output.compress</name>
   <value>true</value>
 </property>
 <property>
   <name>mapreduce.map.output.compress.codec</name>
   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
 </property>
 <property>
   <name>mapreduce.job.ubertask.enable</name>
   <value>true</value>
 </property>
 {code}
 And the LinuxContainerExecutor is enabled on the NodeManager.
 This job will fail with the following error:
 {code}
 2014-03-18 21:28:20,153 FATAL [uber-SubtaskRunner] 
 org.apache.hadoop.mapred.LocalContainerLauncher: Error running local 
 (uberized) 'child' : java.lang.UnsatisfiedLinkError: 
 org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
 at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native 
 Method)
 at 
 org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
 at 
 org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:132)
 at 
 org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
 at 
 org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
 at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:115)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1583)
 at 
 org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
 at 
 org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
 at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1990)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:774)

[jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up

2015-05-01 Thread Rajat Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524222#comment-14524222
 ] 

Rajat Jain commented on MAPREDUCE-6321:
---

Yes, we run FairScheduler. However, this is not related to FairScheduler since 
this slowness is during map task startup.

 Map tasks take a lot of time to start up
 

 Key: MAPREDUCE-6321
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 2.6.0
Reporter: Rajat Jain
Priority: Critical
  Labels: performance

 I have noticed repeatedly that the map tasks take a lot of time to startup on 
 YARN clusters. This is not the scheduling part, this is after the actual 
 container is launched containing the Map task. Take for example, the sample 
 log from a mapper of a Pi job that I launched. The command I used to launch 
 the Pi job was:
 {code}
 hadoop jar 
 /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
 {code}
 This is the sample job from one of the mappers which took 14 seconds to 
 complete. If you notice from the logs, most of the time taken by this job is 
 during the start up. I notice that the most mappers take anywhere between 7 
 to 15 seconds during start up and have seen this behavior consistent across 
 mapreduce jobs. This really affects the performance of short running mappers.
 I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the 
 mapper memory is always specified as 2048m and so on.
 Log:
 {code}
 2015-04-18 06:48:34,081 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
 hadoop-metrics2.properties
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2015-04-18 06:48:34,637 INFO [main] 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
 started
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Executing with tokens:
 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
 mapreduce.job, Service: job_1429338752209_0059, Ident: 
 (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 Sleeping for 0ms before retrying again. Got null now.
 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: 
 mapreduce.cluster.local.dir for child: 
 /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
 2015-04-18 06:48:36,706 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:37,387 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:39,388 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
 Instead, use dfs.metrics.session-id
 2015-04-18 06:48:39,448 INFO [main] 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 2015-04-18 06:48:41,060 INFO [main] 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to 
 org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up 
 progress from Task
 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using 
 ResourceCalculatorProcessTree : [ ]
 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: 
 hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: 
 (EQUATOR) 0 kvi 234881020(939524080)
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 mapreduce.task.io.sort.mb: 896
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft 
 limit at 657666880
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: 
 bufstart = 0; bufvoid = 939524096
 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart 
 = 234881020; length = 58720256
 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map 
 output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Starting flush of map output
 2015-04-18 

[jira] [Commented] (MAPREDUCE-2058) FairScheduler:NullPointerException in web interface when JobTracker not initialized

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524587#comment-14524587
 ] 

Hadoop QA commented on MAPREDUCE-2058:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12555264/MAPREDUCE-2058-branch-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5502/console |


This message was automatically generated.

 FairScheduler:NullPointerException in web interface when JobTracker not 
 initialized
 ---

 Key: MAPREDUCE-2058
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2058
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 0.22.0, 1.0.4
Reporter: Dan Adkins
 Attachments: MAPREDUCE-2058-branch-1.patch, MAPREDUCE-2058.patch


 When I contact the jobtracker web interface prior to the job tracker being 
 fully initialized (say, if hdfs is still in safe mode), I get the following 
 error:
 10/09/09 18:06:02 ERROR mortbay.log: /jobtracker.jsp
 java.lang.NullPointerException
 at 
 org.apache.hadoop.mapred.FairScheduler.getJobs(FairScheduler.java:909)
 at 
 org.apache.hadoop.mapred.JobTracker.getJobsFromQueue(JobTracker.java:4357)
 at 
 org.apache.hadoop.mapred.JobTracker.getQueueInfoArray(JobTracker.java:4334)
 at 
 org.apache.hadoop.mapred.JobTracker.getRootQueues(JobTracker.java:4295)
 at 
 org.apache.hadoop.mapred.jobtracker_jsp.generateSummaryTable(jobtracker_jsp.java:44)
 at 
 org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:176)
 at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
 at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124)
 at 
 org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:857)
 at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
 at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
 at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
 at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at org.mortbay.jetty.Server.handle(Server.java:324)
 at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)   
  at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
 at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
 at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4482) Backport MR sort plugin(MAPREDUCE-2454) to Hadoop 1.2

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524581#comment-14524581
 ] 

Hadoop QA commented on MAPREDUCE-4482:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12546055/mapreduce-4482-release-1.1.0-rc4.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5500/console |


This message was automatically generated.

 Backport MR sort plugin(MAPREDUCE-2454) to Hadoop 1.2
 -

 Key: MAPREDUCE-4482
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4482
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1
Affects Versions: 1.2.0
Reporter: Mariappan Asokan
Assignee: Mariappan Asokan
 Attachments: HadoopSortPlugin.pdf, 
 mapreduce-4482-release-1.1.0-rc4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3881) building fail under Windows

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524582#comment-14524582
 ] 

Hadoop QA commented on MAPREDUCE-3881:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12515081/pom.xml.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5501/console |


This message was automatically generated.

 building fail under Windows
 ---

 Key: MAPREDUCE-3881
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3881
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
 Environment: D:\os\hadoopcommon>mvn --version
 Apache Maven 3.0.4 (r1232337; 2012-01-17 16:44:56+0800)
 Maven home: C:\portable\maven\bin\..
 Java version: 1.7.0_02, vendor: Oracle Corporation
 Java home: C:\Program Files (x86)\Java\jdk1.7.0_02\jre
 Default locale: zh_CN, platform encoding: GBK
 OS name: windows 7, version: 6.1, arch: x86, family: windows
Reporter: Changming Sun
Priority: Minor
 Attachments: pom.xml.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common\pom.xml is not 
 portable.
  <execution>
    <id>generate-version</id>
    <phase>generate-sources</phase>
    <configuration>
      <executable>scripts/saveVersion.sh</executable>
      <arguments>
        <argument>${project.version}</argument>
        <argument>${project.build.directory}</argument>
      </arguments>
    </configuration>
    <goals>
      <goal>exec</goal>
    </goals>
  </execution>
 When I built it under Windows, I got the following error:
 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec 
 (generate-version) on project hadoop-yarn-common: Command execution failed. 
 Cannot run program "scripts\saveVersion.sh" (in directory 
 "D:\os\hadoopcommon\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common"): 
 CreateProcess error=2, ? - [Help 1]
 We should modify it like this (copied from 
 hadoop-common-project\hadoop-common\pom.xml):
 <configuration>
   <target>
     <mkdir dir="${project.build.directory}/generated-sources/java"/>
     <exec executable="sh">
       <arg
         line="${basedir}/dev-support/saveVersion.sh ${project.version} ${project.build.directory}/generated-sources/java"/>
     </exec>
   </target>
 </configuration>
   </execution>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-695) MiniMRCluster while shutting down should not wait for currently running jobs to finish

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524578#comment-14524578
 ] 

Hadoop QA commented on MAPREDUCE-695:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12412383/mapreduce-695.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5499/console |


This message was automatically generated.

 MiniMRCluster while shutting down should not wait for currently running jobs 
 to finish
 --

 Key: MAPREDUCE-695
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-695
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 1.0.3
Reporter: Sreekanth Ramakrishnan
Priority: Minor
 Attachments: mapreduce-695.patch


 Currently in {{org.apache.hadoop.mapred.MiniMRCluster.shutdown()}} we do a 
 {{waitTaskTrackers()}} which can cause {{MiniMRCluster}} to hang indefinitely 
 when used in conjunction with Controlled jobs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4273) Make CombineFileInputFormat split result JDK independent

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524684#comment-14524684
 ] 

Hadoop QA commented on MAPREDUCE-4273:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12567099/MAPREDUCE-4273-branch1-v2.patch
 |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5525/console |


This message was automatically generated.

 Make CombineFileInputFormat split result JDK independent
 

 Key: MAPREDUCE-4273
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4273
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
 Attachments: MAPREDUCE-4273-branch1-v2.patch, 
 mapreduce-4273-branch-1.patch, mapreduce-4273-branch-2.patch, 
 mapreduce-4273.patch


 The split result of CombineFileInputFormat depends on the iteration order of 
 the nodeToBlocks and rackToBlocks hash maps, which makes the result depend on 
 the HashMap implementation and hence on the JDK.
 This is manifested as TestCombineFileInputFormat failures on alternative JDKs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4330) TaskAttemptCompletedEventTransition invalidates previously successful attempt without checking if the newly completed attempt is successful

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524674#comment-14524674
 ] 

Hadoop QA commented on MAPREDUCE-4330:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12578792/MAPREDUCE-4330-20130415.1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5523/console |


This message was automatically generated.

 TaskAttemptCompletedEventTransition invalidates previously successful attempt 
 without checking if the newly completed attempt is successful
 ---

 Key: MAPREDUCE-4330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
Reporter: Bikas Saha
Assignee: Omkar Vinit Joshi
 Attachments: MAPREDUCE-4330-20130415.1.patch, 
 MAPREDUCE-4330-20130415.patch, MAPREDUCE-4330-21032013.1.patch, 
 MAPREDUCE-4330-21032013.patch


 The previously completed attempt is removed from 
 successAttemptCompletionEventNoMap and marked OBSOLETE.
 After that, if the newly completed attempt is successful then it is added to 
 the successAttemptCompletionEventNoMap. 
 This seems wrong because the newly completed attempt could be failed and thus 
 there is no need to invalidate the successful attempt.
 One error case would be when a speculative attempt completes with 
 killed/failed after the successful version has completed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5188) error when verify FileType of RS_SOURCE in getCompanionBlocks in BlockPlacementPolicyRaid.java

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524683#comment-14524683
 ] 

Hadoop QA commented on MAPREDUCE-5188:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12580811/MAPREDUCE-5188.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5524/console |


This message was automatically generated.

 error when verify FileType of RS_SOURCE in getCompanionBlocks  in 
 BlockPlacementPolicyRaid.java
 ---

 Key: MAPREDUCE-5188
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5188
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Affects Versions: 2.0.2-alpha
Reporter: junjin
Assignee: junjin
Priority: Critical
  Labels: contrib/raid
 Fix For: 2.0.2-alpha

 Attachments: MAPREDUCE-5188.patch


 There is an error when verifying the FileType of RS_SOURCE in 
 getCompanionBlocks in BlockPlacementPolicyRaid.java.
 xorParityLength on line #379 needs to be changed to rsParityLength, since that 
 code verifies the RS_SOURCE type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5748) Potential null pointer dereference in ShuffleHandler#Shuffle#messageReceived()

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524825#comment-14524825
 ] 

Hadoop QA commented on MAPREDUCE-5748:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12635637/0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5549/console |


This message was automatically generated.

 Potential null pointer dereference in ShuffleHandler#Shuffle#messageReceived()
 

 Key: MAPREDUCE-5748
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: 
 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch


 Starting around line 510:
 {code}
   ChannelFuture lastMap = null;
   for (String mapId : mapIds) {
 ...
   }
   lastMap.addListener(metrics);
   lastMap.addListener(ChannelFutureListener.CLOSE);
 {code}
 If mapIds is empty, lastMap would remain null, leading to NPE in 
 addListener() call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5907) Improve getSplits() performance for fs implementations that can utilize performance gains from recursive listing

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524821#comment-14524821
 ] 

Hadoop QA commented on MAPREDUCE-5907:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12648040/MAPREDUCE-5907-3.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5546/console |


This message was automatically generated.

 Improve getSplits() performance for fs implementations that can utilize 
 performance gains from recursive listing
 

 Key: MAPREDUCE-5907
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5907
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.4.0
Reporter: Sumit Kumar
Assignee: Sumit Kumar
 Attachments: MAPREDUCE-5907-2.patch, MAPREDUCE-5907-3.patch, 
 MAPREDUCE-5907.patch


 FileInputFormat (both the mapreduce and mapred implementations) uses recursive 
 listing while calculating splits, but it does so level by level. To discover 
 the files under /foo/bar it first lists /foo/bar to get the immediate children, 
 then makes the same call on each of those children to discover their immediate 
 children, and so on. This doesn't scale well for object-store-based fs 
 implementations like s3 and swift, because every listStatus call ends up being 
 a webservice call to the backend. In cases where a large number of files are 
 considered for input, this makes the getSplits() call slow. 
 This patch adds a new set of recursive list APIs that gives the fs 
 implementations an opportunity to optimize. The behavior remains the same for 
 other implementations (a default implementation is provided, so other 
 filesystems don't have to implement anything new). For object-store-based fs 
 implementations, however, it is a simple change to pass the recursive flag as 
 true (as shown in the patch) and improve listing performance.
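 For reference, a hedged sketch of what a recursive listing looks like through 
 the existing public FileSystem API (not the patch itself), which is the kind of 
 call an object-store implementation can satisfy with far fewer backend requests 
 than level-by-level listStatus calls:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.LocatedFileStatus;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.RemoteIterator;

 public class RecursiveListingDemo {
   public static void main(String[] args) throws Exception {
     Path root = new Path(args[0]);
     FileSystem fs = root.getFileSystem(new Configuration());
     // recursive = true: one logical call instead of one listStatus per directory
     RemoteIterator<LocatedFileStatus> files = fs.listFiles(root, true);
     while (files.hasNext()) {
       System.out.println(files.next().getPath());
     }
   }
 }
 {code}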



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4883) Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524846#comment-14524846
 ] 

Hadoop QA commented on MAPREDUCE-4883:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12566621/MAPREDUCE-4883.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5552/console |


This message was automatically generated.

 Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM
 --

 Key: MAPREDUCE-4883
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4883
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.2, 1.0.3
 Environment: Especially for 64bit JVM
Reporter: Lijie Xu
Assignee: Jerry Chen
  Labels: patch
 Attachments: MAPREDUCE-4883.patch

   Original Estimate: 12h
  Remaining Estimate: 12h

 In hadoop-0.20.2, hadoop-1.0.3 or other versions, reducer's shuffle buffer 
 size cannot exceed 2048MB (i.e., Integer.MAX_VALUE). This is reasonable for 
 32bit JVM.
 But for 64bit JVM, although reducer's JVM size can be set more than 2048MB 
 (e.g., mapred.child.java.opts=-Xmx4000m), the heap size used for shuffle 
 buffer is at most 2048MB * maxInMemCopyUse (default 0.7) not 4000MB * 
 maxInMemCopyUse. 
 So the pointed piece of code in ReduceTask.java needs modification for 64bit 
 JVM.
 ---
   private final long maxSize;
   private final long maxSingleShuffleLimit;
   
   private long size = 0;
   
   private Object dataAvailable = new Object();
   private long fullSize = 0;
   private int numPendingRequests = 0;
   private int numRequiredMapOutputs = 0;
   private int numClosed = 0;
   private boolean closed = false;
   
   public ShuffleRamManager(Configuration conf) throws IOException {
     final float maxInMemCopyUse =
       conf.getFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
     if (maxInMemCopyUse > 1.0 || maxInMemCopyUse < 0.0) {
       throw new IOException("mapred.job.shuffle.input.buffer.percent" +
                             maxInMemCopyUse);
     }
     // Allow unit tests to fix Runtime memory
 --> maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
 -->     (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
 -->   * maxInMemCopyUse);
     maxSingleShuffleLimit = (long)(maxSize * 
                                    MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
     LOG.info("ShuffleRamManager: MemoryLimit=" + maxSize +
              ", MaxSingleShuffleLimit=" + maxSingleShuffleLimit);
   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524852#comment-14524852
 ] 

Hadoop QA commented on MAPREDUCE-5889:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12647036/MAPREDUCE-5889.3.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build//console |


This message was automatically generated.

 Deprecate FileInputFormat.setInputPaths(Job, String) and 
 FileInputFormat.addInputPaths(Job, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5889.3.patch, MAPREDUCE-5889.patch, 
 MAPREDUCE-5889.patch


 {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
 {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail 
 to parse commaSeparatedPaths if a comma is included in the file path. (e.g. 
 Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(Job 
 job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
 instead.
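 A short example of the Path-based overloads the description recommends; because 
 no comma-separated parsing is involved, a literal comma in a file name survives:
 {code}
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.Job;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

 public class InputPathsExample {
   public static void main(String[] args) throws Exception {
     Job job = Job.getInstance();
     // The whole string is one Path; the commas are part of the file name.
     FileInputFormat.setInputPaths(job, new Path("/path/file,with,comma"));
     FileInputFormat.addInputPath(job, new Path("/other/input"));
   }
 }
 {code}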



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5704) Optimize nextJobId in JobTracker

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524824#comment-14524824
 ] 

Hadoop QA commented on MAPREDUCE-5704:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  1s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12621052/MAPREDUCE-5704.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5548/console |


This message was automatically generated.

 Optimize nextJobId in JobTracker
 

 Key: MAPREDUCE-5704
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5704
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, mrv1
Affects Versions: 1.2.1
Reporter: JamesLi
Assignee: JamesLi
 Attachments: MAPREDUCE-5704.patch


 When the JobTracker starts, nextJobId starts at 1. If we have already run 3000
 jobs, then restart the JobTracker and run a new job, we cannot see this new
 job on jobtracker:5030/jobhistory.jsp unless we click the get more results
 button.
 In jobhistory_jsp.java, the array SCAN_SIZES controls how many jobs are
 displayed on jobhistory.jsp.
 I made a small change: when the JobTracker starts, find the biggest id under
 the history done directory; jobs will start with maxId+1, or 1 if no job files
 can be found.
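
 A rough sketch of that idea (the file-name pattern and helper are illustrative assumptions, not the attached patch): scan the history done directory for the largest job sequence number at startup and begin numbering above it.

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MaxJobIdScan {
  // Assumed naming convention: history file names embed job_<timestamp>_<sequence>.
  private static final Pattern JOB_ID = Pattern.compile("job_\\d+_(\\d+)");

  static int firstJobId(Configuration conf, Path historyDoneDir) throws Exception {
    FileSystem fs = historyDoneDir.getFileSystem(conf);
    int maxId = 0;
    for (FileStatus f : fs.listStatus(historyDoneDir)) {
      Matcher m = JOB_ID.matcher(f.getPath().getName());
      if (m.find()) {
        maxId = Math.max(maxId, Integer.parseInt(m.group(1)));
      }
    }
    return maxId + 1;   // 1 when no job files are found
  }
}
{code}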



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524850#comment-14524850
 ] 

Hadoop QA commented on MAPREDUCE-4711:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12548038/MAPREDUCE-4711.branch-0.23.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5553/console |


This message was automatically generated.

 Append time elapsed since job-start-time for finished tasks
 ---

 Key: MAPREDUCE-4711
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3
Reporter: Ravi Prakash
 Attachments: MAPREDUCE-4711.branch-0.23.patch


 In 0.20.x/1.x, the analyze job link gave this information
 bq. The last Map task task_sometask finished at (relative to the Job launch 
 time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
 The time it took for the last task to finish needs to be calculated mentally 
 in 0.23. I believe we should print it next to the finish time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and mapreduce.framework.name is local

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524834#comment-14524834
 ] 

Hadoop QA commented on MAPREDUCE-4957:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12566460/MAPREDUCE-4957.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5550/console |


This message was automatically generated.

 Throw FileNotFoundException when running in single node and 
 mapreduce.framework.name is local
 ---

 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor
 Attachments: MAPREDUCE-4957.patch, MAPREDUCE-4957.patch


 Running on a single node with mapreduce.framework.name set to local, I get the
 following error:
 java.io.FileNotFoundException: File does not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  
 at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
  
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
  
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:396) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
  
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
 at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  
 at java.lang.reflect.Method.invoke(Method.java:597) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Job Submission failed with exception 'java.io.FileNotFoundException(File does 
 not exist: 
 /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5490) MapReduce doesn't set the environment variable for children processes

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524843#comment-14524843
 ] 

Hadoop QA commented on MAPREDUCE-5490:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12629589/MAPREDUCE-5490.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5551/console |


This message was automatically generated.

 MapReduce doesn't set the environment variable for children processes
 -

 Key: MAPREDUCE-5490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: MAPREDUCE-5490.patch, mr-5490.patch, mr-5490.patch


 Currently, MapReduce uses the command line argument to pass the classpath to 
 the child. This breaks if the process forks a child that needs the same 
 classpath. Such a case happens in Hive when it uses map-side joins. I propose 
 that we make MapReduce in branch-1 use the CLASSPATH environment variable 
 like YARN does.
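
 A generic sketch of the mechanism being proposed (class names and paths are hypothetical): pass the classpath through the child's environment rather than on its command line, so a grandchild process that re-reads CLASSPATH still sees it.

{code}
import java.io.IOException;
import java.util.Arrays;

public class ClasspathEnvLaunch {
  public static void main(String[] args) throws IOException {
    String childClasspath = "/hypothetical/job.jar:/hypothetical/lib/*";
    ProcessBuilder pb = new ProcessBuilder(Arrays.asList("java", "org.example.ChildMain"));
    // CLASSPATH in the environment instead of a -classpath argument.
    pb.environment().put("CLASSPATH", childClasspath);
    pb.inheritIO().start();
  }
}
{code}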



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5649) Reduce cannot use more than 2G memory for the final merge

2015-05-01 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-5649:
-
Attachment: MAPREDUCE-5649.002.patch

Thanks for review, [~jlowe]! 002.patch 

 Reduce cannot use more than 2G memory  for the final merge
 --

 Key: MAPREDUCE-5649
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5649
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: stanley shi
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5649.001.patch, MAPREDUCE-5649.002.patch


 In the org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.java file, in 
 the finalMerge method: 
  int maxInMemReduce = (int)Math.min(
 Runtime.getRuntime().maxMemory() * maxRedPer, Integer.MAX_VALUE);
  
 This means that no matter how much memory the user has, the reducer will not
 retain more than 2G of data in memory before the reduce phase starts.
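
 A minimal sketch of the limitation and of the intent of the fix (the percentage is hypothetical, and this is not the attached patch): computing the limit as a long removes the Integer.MAX_VALUE ceiling.

{code}
public class FinalMergeLimit {
  public static void main(String[] args) {
    double maxRedPer = 0.90;   // hypothetical mapreduce.reduce.input.buffer.percent
    long heap = Runtime.getRuntime().maxMemory();
    // Current code: the int cast clamps the retained bytes to at most ~2 GiB.
    int asInt = (int) Math.min(heap * maxRedPer, Integer.MAX_VALUE);
    // Intent of the fix: keep the limit as a long so large heaps are usable.
    long asLong = (long) (heap * maxRedPer);
    System.out.println("int-limited=" + asInt + ", long=" + asLong);
  }
}
{code}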



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4273) Make CombineFileInputFormat split result JDK independent

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524629#comment-14524629
 ] 

Hadoop QA commented on MAPREDUCE-4273:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12567099/MAPREDUCE-4273-branch1-v2.patch
 |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5516/console |


This message was automatically generated.

 Make CombineFileInputFormat split result JDK independent
 

 Key: MAPREDUCE-4273
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4273
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
 Attachments: MAPREDUCE-4273-branch1-v2.patch, 
 mapreduce-4273-branch-1.patch, mapreduce-4273-branch-2.patch, 
 mapreduce-4273.patch


 The split result of CombineFileInputFormat depends on the iteration order of
 the nodeToBlocks and rackToBlocks hash maps, which makes the result dependent
 on the HashMap implementation and hence on the JDK.
 This is manifested as TestCombineFileInputFormat failures on alternative JDKs.
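
 A minimal illustration of the underlying JDK dependence (map contents are made up): HashMap iteration order is unspecified, so any split assignment that walks such a map can vary between JDKs, whereas a sorted or insertion-ordered map makes the walk deterministic.

{code}
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DeterministicIteration {
  public static void main(String[] args) {
    Map<String, List<String>> nodeToBlocks = new TreeMap<>();   // sorted, JDK-independent order
    nodeToBlocks.put("node2", Arrays.asList("blk_2"));
    nodeToBlocks.put("node1", Arrays.asList("blk_1"));
    for (Map.Entry<String, List<String>> e : nodeToBlocks.entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue());   // always node1 first
    }
  }
}
{code}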



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3936) Clients should not enforce counter limits

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524627#comment-14524627
 ] 

Hadoop QA commented on MAPREDUCE-3936:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12544972/MAPREDUCE-3936.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5515/console |


This message was automatically generated.

 Clients should not enforce counter limits 
 --

 Key: MAPREDUCE-3936
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3936
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3936.patch, MAPREDUCE-3936.patch


 The code for enforcing counter limits (from MAPREDUCE-1943) creates a static 
 JobConf instance to load the limits, which may throw an exception if the 
 client limit is set to be lower than the limit on the cluster (perhaps 
 because the cluster limit was raised from the default).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5150) Backport 2009 terasort (MAPREDUCE-639) to branch-1

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524633#comment-14524633
 ] 

Hadoop QA commented on MAPREDUCE-5150:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12578622/MAPREDUCE-5150-branch-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5518/console |


This message was automatically generated.

 Backport 2009 terasort (MAPREDUCE-639) to branch-1
 --

 Key: MAPREDUCE-5150
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5150
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 1.2.0
Reporter: Gera Shegalov
Priority: Minor
 Attachments: MAPREDUCE-5150-branch-1.patch


 Users evaluate the performance of Hadoop clusters using different benchmarks
 such as TeraSort. However, the terasort version in branch-1 is outdated. It
 works on a teragen dataset that cannot exceed 4 billion unique keys, and it
 does not have the fast non-sampling partitioner SimplePartitioner either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-3807) JobTracker needs fix similar to HDFS-94

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524630#comment-14524630
 ] 

Hadoop QA commented on MAPREDUCE-3807:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12515105/MAPREDUCE-3807.patch |
| Optional Tests |  |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5517/console |


This message was automatically generated.

 JobTracker needs fix similar to HDFS-94
 ---

 Key: MAPREDUCE-3807
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3807
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Harsh J
  Labels: newbie
 Attachments: MAPREDUCE-3807.patch


 1.0 JobTracker's jobtracker.jsp page currently shows:
 {code}
 <h2>Cluster Summary (Heap Size is <%=
 StringUtils.byteDesc(Runtime.getRuntime().totalMemory()) %>/<%=
 StringUtils.byteDesc(Runtime.getRuntime().maxMemory()) %>)</h2>
 {code}
 It could use an improvement similar to HDFS-94 to reflect live heap usage more
 accurately.
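
 A sketch of the HDFS-94 style of reporting being asked for (the format string is illustrative, not the eventual patch): show live used heap alongside the maximum and committed sizes.

{code}
public class HeapSummary {
  public static void main(String[] args) {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();   // live usage
    long committed = rt.totalMemory();
    long max = rt.maxMemory();
    System.out.printf("Heap Size is %d MB / %d MB (Committed: %d MB)%n",
        used >> 20, max >> 20, committed >> 20);
  }
}
{code}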



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5365) Set mapreduce.job.classloader to true by default

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524636#comment-14524636
 ] 

Hadoop QA commented on MAPREDUCE-5365:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12590345/MAPREDUCE-5365.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5519/console |


This message was automatically generated.

 Set mapreduce.job.classloader to true by default
 

 Key: MAPREDUCE-5365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5365
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5365.patch


 MAPREDUCE-1700 introduced the mapreduce.job.classloader option, which uses a
 custom classloader to separate system classes from user classes. It seems
 like there are only rare cases when a user would not want this on, and that
 it should be enabled by default.
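
 Until the default changes, the option can be enabled per job; a minimal sketch (the job name is arbitrary):

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class EnableJobClassloader {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    conf.setBoolean("mapreduce.job.classloader", true);   // isolate user classes from system classes
    Job job = Job.getInstance(conf, "classloader-isolated-job");
    System.out.println(job.getConfiguration().getBoolean("mapreduce.job.classloader", false));
  }
}
{code}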



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5907) Improve getSplits() performance for fs implementations that can utilize performance gains from recursive listing

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524788#comment-14524788
 ] 

Hadoop QA commented on MAPREDUCE-5907:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12648040/MAPREDUCE-5907-3.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5543/console |


This message was automatically generated.

 Improve getSplits() performance for fs implementations that can utilize 
 performance gains from recursive listing
 

 Key: MAPREDUCE-5907
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5907
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.4.0
Reporter: Sumit Kumar
Assignee: Sumit Kumar
 Attachments: MAPREDUCE-5907-2.patch, MAPREDUCE-5907-3.patch, 
 MAPREDUCE-5907.patch


 FileInputFormat (both the mapreduce and mapred implementations) uses recursive
 listing while calculating splits. However, it does this listing level by
 level: to discover files under /foo/bar it first lists /foo/bar to get the
 immediate children, then makes the same call on each of those children to
 discover their immediate children, and so on. This doesn't scale well for
 object-store-based fs implementations like s3 and swift, because every
 listStatus call ends up being a webservice call to the backend. When a large
 number of files are considered for input, this makes the getSplits() call
 slow.
 This patch adds a new set of recursive list APIs that gives fs implementations
 an opportunity to optimize. The behavior remains the same for other
 implementations (a default implementation is provided, so they don't have to
 implement anything new). For object-store-based fs implementations, however,
 it is a simple change to pass the recursive flag as true (as shown in the
 patch) to improve listing performance.
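
 For comparison, the existing recursive iterator in FileSystem already expresses the whole listing as a single call, which an object-store implementation can answer with far fewer remote requests; a minimal sketch (the input directory is hypothetical):

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class RecursiveListing {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path root = new Path("/foo/bar");   // hypothetical input directory
    FileSystem fs = root.getFileSystem(conf);
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(root, true);   // recursive = true
    while (it.hasNext()) {
      System.out.println(it.next().getPath());
    }
  }
}
{code}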



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5929) YARNRunner.java, path for jobJarPath not set correctly

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524955#comment-14524955
 ] 

Hadoop QA commented on MAPREDUCE-5929:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12668704/MAPREDUCE-5929.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5575/console |


This message was automatically generated.

 YARNRunner.java, path for jobJarPath not set correctly
 --

 Key: MAPREDUCE-5929
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5929
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Chao Tian
Assignee: Rahul Palamuttam
  Labels: newbie, patch
 Attachments: MAPREDUCE-5929.patch


 In YARNRunner.java, line 357,
 Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR));
 This causes the job.jar path to be missing the scheme, host and port number on
 distributed file systems other than HDFS.
 If we compare line 357 with line 344, job.xml is actually set there as
  
 Path jobConfPath = new Path(jobSubmitDir, MRJobConfig.JOB_CONF_FILE);
 It appears jobSubmitDir is missing on line 357, which causes this problem.
 In HDFS, the additional qualify step corrects this problem, but not on
 other generic distributed file systems.
 The proposed change is to replace line 357 with
 Path jobJarPath = new Path(jobConf.get(jobSubmitDir,MRJobConfig.JAR));
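
 One way to obtain a fully qualified jar path, sketched under the assumption that mapreduce.job.jar (MRJobConfig.JAR) has been set; makeQualified adds the scheme and authority of the target filesystem, which the plain new Path(jobConf.get(MRJobConfig.JAR)) lacks:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QualifyJobJar {
  static Path qualifiedJobJar(Configuration jobConf) throws IOException {
    Path jar = new Path(jobConf.get("mapreduce.job.jar"));   // assumes the jar was set
    FileSystem fs = jar.getFileSystem(jobConf);
    return fs.makeQualified(jar);   // adds scheme, host and port
  }
}
{code}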
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524938#comment-14524938
 ] 

Hadoop QA commented on MAPREDUCE-5392:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12669757/MAPREDUCE-5392.5.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5573/console |


This message was automatically generated.

 mapred job -history all command throws IndexOutOfBoundsException
 --

 Key: MAPREDUCE-5392
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5392
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0, 2.0.5-alpha, 2.2.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
 Attachments: MAPREDUCE-5392.2.patch, MAPREDUCE-5392.3.patch, 
 MAPREDUCE-5392.4.patch, MAPREDUCE-5392.5.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch


 When I use the all option of the mapred job -history command, the following
 exception is displayed and the command does not work.
 {code}
 Exception in thread main java.lang.StringIndexOutOfBoundsException: String 
 index out of range: -3
 at java.lang.String.substring(String.java:1875)
 at 
 org.apache.hadoop.mapreduce.util.HostUtil.convertTrackerNameToHostName(HostUtil.java:49)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.getTaskLogsUrl(HistoryViewer.java:459)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.printAllTaskAttempts(HistoryViewer.java:235)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.print(HistoryViewer.java:117)
 at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:472)
 at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1233)
 {code}
 This is because a node name recorded in the history file is not prefixed with
 tracker_. The patch therefore makes it possible to read the history file even
 if a node name does not have the tracker_ prefix.
 In addition, it fixes the URL of the displayed task log.
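
 A sketch of the tolerant parsing the description calls for (not the attached patch): strip an optional tracker_ prefix instead of assuming it is always present.

{code}
public class TrackerNames {
  static String toHostName(String trackerName) {
    int colon = trackerName.indexOf(':');
    String host = (colon == -1) ? trackerName : trackerName.substring(0, colon);
    // Only strip the prefix when it is actually there.
    return host.startsWith("tracker_") ? host.substring("tracker_".length()) : host;
  }

  public static void main(String[] args) {
    System.out.println(toHostName("tracker_host1.example.com:44331"));   // host1.example.com
    System.out.println(toHostName("host1.example.com:44331"));           // host1.example.com
  }
}
{code}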



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4216) Make MultipleOutputs generic to support non-file output formats

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525072#comment-14525072
 ] 

Hadoop QA commented on MAPREDUCE-4216:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 33s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 46s | The applied patch generated  3 
new checkstyle issues (total was 67, now 70). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 4  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests |   1m 35s | Tests passed in 
hadoop-mapreduce-client-core. |
| | |  37m 46s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12525460/MAPREDUCE-4216.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| javadoc | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-core test log | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5581/console |


This message was automatically generated.

 Make MultipleOutputs generic to support non-file output formats
 ---

 Key: MAPREDUCE-4216
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4216
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 1.0.2
Reporter: Robbie Strickland
  Labels: Output
 Attachments: MAPREDUCE-4216.patch


 The current MultipleOutputs implementation is tied to FileOutputFormat in 
 such a way that it is not extensible to other types of output. It should be 
 made more generic, such as with an interface that can be implemented for 
 different outputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5362) clean up POM dependencies

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524911#comment-14524911
 ] 

Hadoop QA commented on MAPREDUCE-5362:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12640169/mr-5362-0.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5567/console |


This message was automatically generated.

 clean up POM dependencies
 -

 Key: MAPREDUCE-5362
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5362
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: MAPREDUCE-5362.patch, mr-5362-0.patch


 Intermediate 'pom' modules define dependencies inherited by leaf modules.
 This is causing issues in the IntelliJ IDE.
 We should normalize the leaf modules as in common, hdfs and tools, where all
 dependencies are defined in each leaf module and the intermediate 'pom'
 modules do not define any dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6030) In mr-jobhistory-daemon.sh, some env variables are not affected by mapred-env.sh

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524907#comment-14524907
 ] 

Hadoop QA commented on MAPREDUCE-6030:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12660804/MAPREDUCE-6030.patch |
| Optional Tests | shellcheck |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5564/console |


This message was automatically generated.

 In mr-jobhistory-daemon.sh, some env variables are not affected by 
 mapred-env.sh
 

 Key: MAPREDUCE-6030
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6030
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.4.1
Reporter: Youngjoon Kim
Assignee: Youngjoon Kim
Priority: Minor
 Attachments: MAPREDUCE-6030.patch


 In mr-jobhistory-daemon.sh, some env variables are exported before sourcing 
 mapred-env.sh, so these variables don't use values defined in mapred-env.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6038) A boolean may be set error in the Word Count v2.0 in MapReduce Tutorial

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524881#comment-14524881
 ] 

Hadoop QA commented on MAPREDUCE-6038:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12665076/MAPREDUCE-6038.1.patch 
|
| Optional Tests | site |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5561/console |


This message was automatically generated.

 A boolean may be set error in the Word Count v2.0 in MapReduce Tutorial
 ---

 Key: MAPREDUCE-6038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: java version 1.8.0_11 HotSpot 64-bit
Reporter: Pei Ma
Assignee: Tsuyoshi Ozawa
Priority: Minor
 Attachments: MAPREDUCE-6038.1.patch


 As a beginner, when I was learning the basics of MR, I found that I
 couldn't run WordCount2 using the command bin/hadoop jar wc.jar
 WordCount2 /user/joe/wordcount/input /user/joe/wordcount/output from the
 Tutorial. The VM threw a NullPointerException at line 47. At line
 45, the default value returned by conf.getBoolean is true. That is to say,
 when wordcount.skip.patterns is not set, WordCount2 will still go on to
 execute getCacheFiles(). Then patternsURIs gets a null value. When the
 -skip option is not given, wordcount.skip.patterns will not be set,
 and a NullPointerException comes out.
 In short, the block after the if-statement at line 45 shouldn't be executed
 when the -skip option is absent from the command. Maybe line 45 should
 read if (conf.getBoolean("wordcount.skip.patterns", false)) {
 . Just change the boolean default.
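
 A tiny self-contained check of the behavior being described: with a default of true, an unset wordcount.skip.patterns still takes the skip branch, while a default of false does not.

{code}
import org.apache.hadoop.conf.Configuration;

public class SkipPatternsDefault {
  public static void main(String[] args) {
    Configuration conf = new Configuration();   // property left unset, as when -skip is not given
    System.out.println(conf.getBoolean("wordcount.skip.patterns", true));    // true  -> NPE path
    System.out.println(conf.getBoolean("wordcount.skip.patterns", false));   // false -> branch skipped
  }
}
{code}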



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524885#comment-14524885
 ] 

Hadoop QA commented on MAPREDUCE-5889:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12647036/MAPREDUCE-5889.3.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5562/console |


This message was automatically generated.

 Deprecate FileInputFormat.setInputPaths(Job, String) and 
 FileInputFormat.addInputPaths(Job, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5889.3.patch, MAPREDUCE-5889.patch, 
 MAPREDUCE-5889.patch


 {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
 {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail 
 to parse commaSeparatedPaths if a comma is included in the file path. (e.g. 
 Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(Job 
 job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524969#comment-14524969
 ] 

Hadoop QA commented on MAPREDUCE-5969:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12673831/MAPREDUCE-5969.branch1.1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5577/console |


This message was automatically generated.

 Private non-Archive Files' size add twice in Distributed Cache directory size 
 calculation.
 --

 Key: MAPREDUCE-5969
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-5969.branch1.1.patch, 
 MAPREDUCE-5969.branch1.patch


 Private non-Archive Files' size is added twice in the Distributed Cache
 directory size calculation. The Private non-Archive Files list is passed in by
 the -files command line option. The Distributed Cache directory size is used
 to check whether the total cache file size exceeds the cache size limitation;
 the default cache size limitation is 10G.
 I added logging in addCacheInfoUpdate and setSize in
 TrackerDistributedCacheManager.java.
 I used the following command to test:
 hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files 
 hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar
  /tmp/zxu/test_in/ /tmp/zxu/test_out
 to add two files into the distributed cache: WordCount.java and wordcount.jar.
 The WordCount.java file size is 2395 bytes and the wordcount.jar file size is
 3865 bytes. The total should be 6260.
 The log shows these file sizes are added twice:
 once before download to the local node and a second time after download to the
 local node, so the total file count becomes 4 instead of 2:
 addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local
 addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local
 addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local
 In the code, for Private non-Archive File, the first time we add file size is 
 at 
 getLocalCache:
 {code}
 if (!isArchive) {
   //for private archives, the lengths come over RPC from the 
   //JobLocalizer since the JobLocalizer is the one who expands
   //archives and gets the total length
   lcacheStatus.size = fileStatus.getLen();
   LOG.info("getLocalCache: " + localizedPath + " size = "
     + lcacheStatus.size);
   // Increase the size and sub directory count of the cache
   // from baseDirSize and baseDirNumberSubDir.
   baseDirManager.addCacheInfoUpdate(lcacheStatus);
 }
 {code}
 The second time we add file size is at 
 setSize:
 {code}
   synchronized (status) {
 status.size = size;
 baseDirManager.addCacheInfoUpdate(status);
   }
 {code}
 The fix is to not add the file size for Private non-Archive Files again after
 download (downloadCacheObject).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524978#comment-14524978
 ] 

Hadoop QA commented on MAPREDUCE-4818:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12675155/MAPREDUCE-4818.v5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5578/console |


This message was automatically generated.

 Easier identification of tasks that timeout during localization
 ---

 Key: MAPREDUCE-4818
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 0.23.3, 2.0.3-alpha
Reporter: Jason Lowe
Assignee: Siqi Li
  Labels: usability
 Attachments: MAPREDUCE-4818.v1.patch, MAPREDUCE-4818.v2.patch, 
 MAPREDUCE-4818.v3.patch, MAPREDUCE-4818.v4.patch, MAPREDUCE-4818.v5.patch


 When a task is taking too long to localize and is killed by the AM due to 
 task timeout, the job UI/history is not very helpful.  The attempt simply 
 lists a diagnostic stating it was killed due to timeout, but there are no 
 logs for the attempt since it never actually got started.  There are log 
 messages on the NM that show the container never made it past localization by 
 the time it was killed, but users often do not have access to those logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524913#comment-14524913
 ] 

Hadoop QA commented on MAPREDUCE-4711:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12548038/MAPREDUCE-4711.branch-0.23.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5569/console |


This message was automatically generated.

 Append time elapsed since job-start-time for finished tasks
 ---

 Key: MAPREDUCE-4711
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3
Reporter: Ravi Prakash
 Attachments: MAPREDUCE-4711.branch-0.23.patch


 In 0.20.x/1.x, the analyze job link gave this information
 bq. The last Map task task_sometask finished at (relative to the Job launch 
 time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
 The time it took for the last task to finish needs to be calculated mentally 
 in 0.23. I believe we should print it next to the finish time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525015#comment-14525015
 ] 

Hadoop QA commented on MAPREDUCE-4818:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12675155/MAPREDUCE-4818.v5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5579/console |


This message was automatically generated.

 Easier identification of tasks that timeout during localization
 ---

 Key: MAPREDUCE-4818
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 0.23.3, 2.0.3-alpha
Reporter: Jason Lowe
Assignee: Siqi Li
  Labels: usability
 Attachments: MAPREDUCE-4818.v1.patch, MAPREDUCE-4818.v2.patch, 
 MAPREDUCE-4818.v3.patch, MAPREDUCE-4818.v4.patch, MAPREDUCE-4818.v5.patch


 When a task is taking too long to localize and is killed by the AM due to 
 task timeout, the job UI/history is not very helpful.  The attempt simply 
 lists a diagnostic stating it was killed due to timeout, but there are no 
 logs for the attempt since it never actually got started.  There are log 
 messages on the NM that show the container never made it past localization by 
 the time it was killed, but users often do not have access to those logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5392) mapred job -history all command throws IndexOutOfBoundsException

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524958#comment-14524958
 ] 

Hadoop QA commented on MAPREDUCE-5392:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12669757/MAPREDUCE-5392.5.patch 
|
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5576/console |


This message was automatically generated.

 mapred job -history all command throws IndexOutOfBoundsException
 --

 Key: MAPREDUCE-5392
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5392
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0, 2.0.5-alpha, 2.2.0
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
 Attachments: MAPREDUCE-5392.2.patch, MAPREDUCE-5392.3.patch, 
 MAPREDUCE-5392.4.patch, MAPREDUCE-5392.5.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, 
 MAPREDUCE-5392.patch, MAPREDUCE-5392.patch


 When I use the all option of the mapred job -history command, the following
 exception is displayed and the command does not work.
 {code}
 Exception in thread main java.lang.StringIndexOutOfBoundsException: String 
 index out of range: -3
 at java.lang.String.substring(String.java:1875)
 at 
 org.apache.hadoop.mapreduce.util.HostUtil.convertTrackerNameToHostName(HostUtil.java:49)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.getTaskLogsUrl(HistoryViewer.java:459)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.printAllTaskAttempts(HistoryViewer.java:235)
 at 
 org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.print(HistoryViewer.java:117)
 at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:472)
 at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1233)
 {code}
 This is because a node name recorded in the history file is not prefixed with
 tracker_. The patch therefore makes it possible to read the history file even
 if a node name does not have the tracker_ prefix.
 In addition, it fixes the URL of the displayed task log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6040) distcp should automatically use /.reserved/raw when run by the superuser

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524949#comment-14524949
 ] 

Hadoop QA commented on MAPREDUCE-6040:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12669095/MAPREDUCE-6040.002.patch
 |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5574/console |


This message was automatically generated.

 distcp should automatically use /.reserved/raw when run by the superuser
 

 Key: MAPREDUCE-6040
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6040
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Charles Lamb
 Attachments: HDFS-6134-Distcp-cp-UseCasesTable2.pdf, 
 MAPREDUCE-6040.001.patch, MAPREDUCE-6040.002.patch


 On HDFS-6134, [~sanjay.radia] asked for distcp to automatically prepend 
 /.reserved/raw if the distcp is being performed by the superuser and 
 /.reserved/raw is supported by both the source and destination filesystems. 
 This behavior only occurs if none of the src and target pathnames are 
 /.reserved/raw.
 The -disablereservedraw flag can be used to disable this option.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524912#comment-14524912
 ] 

Hadoop QA commented on MAPREDUCE-5981:
--

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12656504/MAPREDUCE-5981.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5568/console |


This message was automatically generated.

 Log levels of certain MR logs can be changed to DEBUG
 -

 Key: MAPREDUCE-5981
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: MAPREDUCE-5981.patch


 The following MapReduce logs can be changed to the DEBUG log level.
 1. In
 org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost (Fetcher.java :
 313), the second log is not required to be at INFO level. It can be moved
 to DEBUG since a WARN log is printed anyway if verifyReply fails.
   SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
   LOG.info("for url=" + msgToEncode + " sent hash and received reply");
 2. Thread-related info need not be printed in logs at INFO level. The two
 logs below can be moved to DEBUG.
 a) In 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java
  : 381), below log can be changed to DEBUG
    LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() +
 " to " + Thread.currentThread().getName());
 b) In 
 org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleSchedulerImpl.java
  : 411), below log can be changed to DEBUG
  LOG.info("assigned " + includedMaps + " of " + totalSize + " to " +
  host + " to " + Thread.currentThread().getName());
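
 A sketch of the suggested change for case 2a (illustrative, not the attached patch): log at DEBUG and guard the call so the message string is not even built unless debug logging is enabled.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class DebugLevelSketch {
  private static final Log LOG = LogFactory.getLog(DebugLevelSketch.class);

  static void reportAssignment(String host, int knownMapOutputs) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Assigning " + host + " with " + knownMapOutputs +
          " to " + Thread.currentThread().getName());
    }
  }

  public static void main(String[] args) {
    reportAssignment("host1.example.com", 42);
  }
}
{code}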
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

