[jira] [Updated] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-28 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-5657:
--

Assignee: Andrew Purtell
  Status: Patch Available  (was: Open)

 [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
 -

                 Key: MAPREDUCE-5657
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5657
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.3.0
            Reporter: Andrew Purtell
            Assignee: Andrew Purtell
            Priority: Minor
         Attachments: 5657-branch-2.patch, 5657-trunk.patch


 Javadoc is stricter by default in JDK8 and will error out on malformed or 
 illegal tags found in doc comments. Although tagged as JDK8, all of the 
 required changes are generic Javadoc cleanups.
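
As a rough illustration of the kind of cleanup this refers to, the sketch below shows a doc comment written so that the stricter checks have nothing to complain about. The class and method names are hypothetical and are not taken from the attached patches; the rejected patterns listed in the line comment are common examples, not a list of what the patch actually changes.

```java
// Hypothetical class used only to illustrate doc-comment cleanups;
// it is not part of Hadoop or of the attached patches.
class JavadocCleanupSketch {

  // Patterns that the stricter JDK 8 Javadoc checks commonly reject:
  //   - a bare '<' or '>' in prose, e.g. "true if a < b"
  //   - a self-closing paragraph tag, e.g. <p/>
  //   - a @param tag naming a parameter that does not exist
  //   - a @return tag on a void method

  /**
   * Returns whether {@code a < b}.
   *
   * <p>The comparison uses plain integer ordering.
   *
   * @param a the first value
   * @param b the second value
   * @return {@code true} if {@code a} is strictly less than {@code b}
   */
  static boolean lessThan(int a, int b) {
    return a < b;
  }
}
```

Wrapping the expression in `{@code ...}` keeps the `<` from being parsed as HTML, and every `@param` tag matches a real parameter name.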



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5657) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834714#comment-13834714
 ] 

Hadoop QA commented on MAPREDUCE-5657:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616125/5657-branch-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-distcp 
hadoop-tools/hadoop-extras hadoop-tools/hadoop-gridmix:

  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4233//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4233//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5611) CombineFileInputFormat creates more rack-local tasks due to less split location info.

2013-11-28 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835199#comment-13835199
 ] 

Rajesh Balamohan commented on MAPREDUCE-5611:
-


Thanks Chandra.  This is a good perf patch.  Here are the data locality numbers 
which can be useful to analyze the perf improvement.


Without Patch:
Job Counters
  Launched map tasks      0   0   335
  Data-local map tasks    0   0   179
  Rack-local map tasks    0   0   81

With Patch:
Job Counters
  Launched map tasks      0   0   335
  Data-local map tasks    0   0   279
  Rack-local map tasks    0   0   47

The data locality improves a lot with this patch in Hive queries.  
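
To put the counters in proportion, a small sketch like the following computes each locality class as a share of launched map tasks. The class and method names are hypothetical; only the counter values come from the runs above.

```java
// Hypothetical helper for expressing locality counters as percentages.
class LocalityShare {

  /** Percentage of launched map tasks that fall in one locality class. */
  static double percent(long tasks, long launched) {
    return 100.0 * tasks / launched;
  }

  public static void main(String[] args) {
    long launched = 335; // launched map tasks in both runs
    System.out.printf("data-local, without patch: %.1f%%%n", percent(179, launched));
    System.out.printf("data-local, with patch:    %.1f%%%n", percent(279, launched));
    System.out.printf("rack-local, without patch: %.1f%%%n", percent(81, launched));
    System.out.printf("rack-local, with patch:    %.1f%%%n", percent(47, launched));
  }
}
```

With the counters above, the data-local share comes to roughly 53% without the patch and roughly 83% with it.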


 CombineFileInputFormat creates more rack-local tasks due to less split 
 location info.
 -

                 Key: MAPREDUCE-5611
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5611
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: trunk
            Reporter: Chandra Prakash Bhagtani
            Assignee: Chandra Prakash Bhagtani
             Fix For: trunk

         Attachments: CombineFileInputFormat-trunk.patch


 I have come across an issue with CombineFileInputFormat. I ran a Hive query 
 on approximately 1.2 GB of data with CombineHiveInputFormat, which internally 
 uses CombineFileInputFormat. My cluster has 9 datanodes and 
 max.split.size is 256 MB.
 When I ran this query with replication factor 9, Hive consistently created 
 6 tasks, all of them rack-local; with replication factor 3 it created 
 5 rack-local tasks and 1 data-local task. 
 When the replication factor is 9 (equal to the cluster size), all the tasks 
 should be data-local, because every datanode holds a replica of all the input 
 data, but that is not happening: all the tasks are rack-local. 
 When I dug into the getMoreSplits method in CombineFileInputFormat.java, I 
 found the issue in the following snippet (especially in the case of a higher 
 replication factor):
 {code:title=CombineFileInputFormat.java|borderStyle=solid}
 for (Iterator<Map.Entry<String,
      List<OneBlockInfo>>> iter = nodeToBlocks.entrySet().iterator();
      iter.hasNext();) {
   Map.Entry<String, List<OneBlockInfo>> one = iter.next();
   nodes.add(one.getKey());
   List<OneBlockInfo> blocksInNode = one.getValue();
   // for each block, copy it into validBlocks. Delete it from
   // blockToNodes so that the same block does not appear in
   // two different splits.
   for (OneBlockInfo oneblock : blocksInNode) {
     if (blockToNodes.containsKey(oneblock)) {
       validBlocks.add(oneblock);
       blockToNodes.remove(oneblock);
       curSplitSize += oneblock.length;
       // if the accumulated split size exceeds the maximum, then
       // create this split.
       if (maxSize != 0 && curSplitSize >= maxSize) {
         // create an input split and add it to the splits array
         addCreatedSplit(splits, nodes, validBlocks);
         curSplitSize = 0;
         validBlocks.clear();
       }
     }
   }
   // (the rest of the outer loop is not shown in this excerpt)
 {code}
 The first node in the nodeToBlocks map holds a replica of every block of the 
 input file, so the above code creates 6 splits, each with only one location. 
 If the JT does not schedule these tasks on that node, all the tasks will be 
 rack-local, even though every other datanode also holds replicas of the same 
 blocks.
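
The behaviour described above can be reproduced with a simplified, self-contained model of the loop. This is a sketch under stated assumptions, not the Hadoop implementation: the class `SplitLocationSketch`, its methods, the host names, and the layout of 12 blocks of 128 MB (picked only so that six 256 MB splits result) are all hypothetical, and the handling of leftover blocks is a simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of the node-first loop; not the Hadoop source.
class SplitLocationSketch {

  /** A split: the block ids it covers and the hosts recorded as its locations. */
  static final class Split {
    final List<Integer> blocks;
    final List<String> locations;

    Split(List<Integer> blocks, List<String> locations) {
      // Copy both lists because the caller reuses and clears them.
      this.blocks = new ArrayList<>(blocks);
      this.locations = new ArrayList<>(locations);
    }
  }

  /** Builds a cluster in which every node holds a replica of every block. */
  static Map<String, List<Integer>> fullyReplicated(int nodeCount, int blockCount) {
    Map<String, List<Integer>> nodeToBlocks = new LinkedHashMap<>();
    for (int n = 1; n <= nodeCount; n++) {
      List<Integer> blocks = new ArrayList<>();
      for (int b = 0; b < blockCount; b++) {
        blocks.add(b);
      }
      nodeToBlocks.put("node" + n, blocks);
    }
    return nodeToBlocks;
  }

  /** Walks nodes in map order; a block goes to the first node that lists it. */
  static List<Split> buildSplits(Map<String, List<Integer>> nodeToBlocks,
                                 long blockSize, long maxSize) {
    // Stand-in for blockToNodes: the blocks not yet placed in any split.
    Set<Integer> unassigned = new LinkedHashSet<>();
    for (List<Integer> blocks : nodeToBlocks.values()) {
      unassigned.addAll(blocks);
    }
    List<Split> splits = new ArrayList<>();
    for (Map.Entry<String, List<Integer>> one : nodeToBlocks.entrySet()) {
      List<String> nodes = new ArrayList<>();
      nodes.add(one.getKey()); // the only location the split will ever record
      List<Integer> validBlocks = new ArrayList<>();
      long curSplitSize = 0;
      for (Integer block : one.getValue()) {
        if (unassigned.remove(block)) {
          validBlocks.add(block);
          curSplitSize += blockSize;
          if (maxSize != 0 && curSplitSize >= maxSize) {
            splits.add(new Split(validBlocks, nodes));
            curSplitSize = 0;
            validBlocks.clear();
          }
        }
      }
      // Simplification: a partial remainder becomes one smaller split on this node.
      if (!validBlocks.isEmpty()) {
        splits.add(new Split(validBlocks, nodes));
      }
    }
    return splits;
  }

  public static void main(String[] args) {
    // 9 nodes, 12 blocks of 128 MB each, maximum split size 256 MB.
    List<Split> splits = buildSplits(fullyReplicated(9, 12), 128, 256);
    for (Split s : splits) {
      System.out.println(s.blocks + " -> " + s.locations);
    }
  }
}
```

In this model the first node consumes every block before any other node is visited, so each split lists that single host even though all nine nodes could serve it locally.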





[jira] [Commented] (MAPREDUCE-5611) CombineFileInputFormat creates more rack-local tasks due to less split location info.

2013-11-28 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835200#comment-13835200
 ] 

Rajesh Balamohan commented on MAPREDUCE-5611:
-

Just wanted to add the response times as well

Without Patch : 289 seconds
With Patch: 219 seconds

This testing was carried out with Hive 0.10.



[jira] [Created] (MAPREDUCE-5658) JHS API and other changes to support tags

2013-11-28 Thread Karthik Kambatla (JIRA)

Karthik Kambatla created MAPREDUCE-5658:
---

             Summary: JHS API and other changes to support tags
                 Key: MAPREDUCE-5658
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5658
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobhistoryserver
    Affects Versions: 2.2.0
            Reporter: Karthik Kambatla


YARN-1399 introduces support for adding tags to an application. The JHS should 
use this and support querying MR jobs with particular tags set.


