[jira] [Updated] (MAPREDUCE-5593) Cleanup code for AssignMapsWithLocality() in RMContainerAllocator

2013-11-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-5593:
--

Attachment: MAPREDUCE-5593-v4.patch

Sync patch with recent trunk branch.

> Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
> -
>
> Key: MAPREDUCE-5593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Attachments: MAPREDUCE-5593-v3.patch, MAPREDUCE-5593-v4.patch, 
> YARN-1163-v1.patch, YARN-1163-v2.patch
>
>
> In RMContainerAllocator, AssignMapsWithLocality() is an important method that 
> assigns map tasks to allocated containers according to different levels of 
> locality (dataLocal, rackLocal, etc.). However, the method mixes separate 
> code paths for each type of locality even though they share a lot of similar 
> behaviour. This makes it hard to maintain and hard to extend with other 
> locality types, so the code needs to be cleaned up.
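
One way to picture the cleanup described above is a single assignment loop that runs once per locality level, with only the match predicate differing between levels. The sketch below is an illustration under that assumption. It is not the attached patch, and every name in it (LocalityAssignSketch, Locality, MapTask, Container, matches, assign, demo) is hypothetical rather than taken from RMContainerAllocator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are taken from RMContainerAllocator. */
public class LocalityAssignSketch {

  /** Locality levels, ordered from most to least preferred. */
  enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  /** A pending map task plus the hosts and racks that hold its input. */
  static final class MapTask {
    final String id;
    final List<String> hosts;
    final List<String> racks;

    MapTask(String id, List<String> hosts, List<String> racks) {
      this.id = id;
      this.hosts = hosts;
      this.racks = racks;
    }
  }

  /** An allocated container and where it runs. */
  static final class Container {
    final String id;
    final String host;
    final String rack;

    Container(String id, String host, String rack) {
      this.id = id;
      this.host = host;
      this.rack = rack;
    }
  }

  /** The only level-specific logic: does this container satisfy the level? */
  static boolean matches(Locality level, MapTask task, Container c) {
    switch (level) {
      case NODE_LOCAL: return task.hosts.contains(c.host);
      case RACK_LOCAL: return task.racks.contains(c.rack);
      default: return true; // OFF_SWITCH accepts any container
    }
  }

  /** Shared loop: for each level in order, pair free containers with pending tasks. */
  static Map<String, String> assign(List<MapTask> tasks, List<Container> containers) {
    List<MapTask> pending = new ArrayList<>(tasks);
    List<Container> free = new ArrayList<>(containers);
    Map<String, String> assignment = new LinkedHashMap<>(); // container id -> "task@LEVEL"
    for (Locality level : Locality.values()) {
      Iterator<Container> ci = free.iterator();
      while (ci.hasNext()) {
        Container c = ci.next();
        Iterator<MapTask> ti = pending.iterator();
        while (ti.hasNext()) {
          MapTask t = ti.next();
          if (matches(level, t, c)) {
            assignment.put(c.id, t.id + "@" + level);
            ti.remove(); // task is no longer pending
            ci.remove(); // container is no longer free
            break;
          }
        }
      }
    }
    return assignment;
  }

  /** Small fixed scenario with one match at each level. */
  static Map<String, String> demo() {
    List<MapTask> tasks = Arrays.asList(
        new MapTask("t1", Arrays.asList("h1"), Arrays.asList("r1")),
        new MapTask("t2", Arrays.asList("h2"), Arrays.asList("r2")),
        new MapTask("t3", Arrays.asList("h9"), Arrays.asList("r9")));
    List<Container> containers = Arrays.asList(
        new Container("c1", "h2", "r2"),
        new Container("c2", "h5", "r1"),
        new Container("c3", "h7", "r7"));
    return assign(tasks, containers);
  }
}
```

With this shape, adding another locality type would mean adding one enum constant and one case in matches(), while the loop itself stays unchanged.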



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5600) ConcurrentModificationException on /tasktracker.jsp

2013-11-11 Thread Benoit Sigoure (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819858#comment-13819858
 ] 

Benoit Sigoure commented on MAPREDUCE-5600:
---

It's a race condition, so it's not deterministic.  But it's fairly easy to 
reproduce: any moderately busy TaskTracker will run into this exception if you 
keep requesting {{/tasktracker.jsp}}.
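
For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch of why iterating a TreeMap while it is being modified throws, and of one common mitigation (copying the values under the same lock that writers use). It assumes the TaskTracker iterates a shared TreeMap without adequate synchronization, which is consistent with the stack trace but is not confirmed in this thread. The class and method names (TreeMapSnapshotSketch, runningJobs, snapshotTasks) are hypothetical and are not TaskTracker code. The write happens on the same thread here so that the failure is deterministic.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; this is not TaskTracker code. */
public class TreeMapSnapshotSketch {

  private final Map<String, String> runningJobs = new TreeMap<>();

  /** Writers take the object lock before changing the map. */
  synchronized void put(String jobId, String task) {
    runningJobs.put(jobId, task);
  }

  synchronized int size() {
    return runningJobs.size();
  }

  /** Readers copy the values under the same lock, then iterate the copy. */
  synchronized List<String> snapshotTasks() {
    return new ArrayList<>(runningJobs.values());
  }

  /** Unsafe pattern: a structural change during iteration trips the fail-fast iterator. */
  static boolean iterationFailsFast() {
    Map<String, String> jobs = new TreeMap<>();
    jobs.put("job_1", "task_a");
    jobs.put("job_2", "task_b");
    try {
      for (Map.Entry<String, String> e : jobs.entrySet()) {
        jobs.put("job_3", "task_c"); // stands in for a concurrent writer
      }
      return false;
    } catch (ConcurrentModificationException expected) {
      return true;
    }
  }

  /** Safer pattern: writes during iteration do not affect the snapshot. */
  static int snapshotSurvivesWrites() {
    TreeMapSnapshotSketch tracker = new TreeMapSnapshotSketch();
    tracker.put("job_1", "task_a");
    tracker.put("job_2", "task_b");
    List<String> snapshot = tracker.snapshotTasks();
    for (String task : snapshot) {
      tracker.put("job_new_" + task, task); // the map changes, the snapshot does not
    }
    return snapshot.size();
  }
}
```

The snapshot approach trades a short copy under the lock for lock-free iteration afterwards. Whether that trade-off suits the JSP code path is a judgment for whoever fixes the issue.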

> ConcurrentModificationException on /tasktracker.jsp
> ---
>
> Key: MAPREDUCE-5600
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5600
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Found this in the MR1 bundled with 
> hadoop-2.0.0-mr1-cdh4.1.3 (which I think is based on some 0.20 version).
>Reporter: Benoit Sigoure
>
> If you request {{/tasktracker.jsp}} frequently on a TaskTracker that's busy, 
> every once in a while you'll get this:
> {code}
> 2013-10-29 13:25:55,524 ERROR org.mortbay.log: /tasktracker.jsp
> java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1117)
> at java.util.TreeMap$EntryIterator.next(TreeMap.java:1153)
> at java.util.TreeMap$EntryIterator.next(TreeMap.java:1148)
> at org.apache.hadoop.mapred.TaskTracker.getTasksFromRunningJobs(TaskTracker.java:3991)
> at org.apache.hadoop.mapred.tasktracker_jsp._jspService(tasktracker_jsp.java:98)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1056)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> {code}





[jira] [Commented] (MAPREDUCE-5600) ConcurrentModificationException on /tasktracker.jsp

2013-11-11 Thread Kousuke Saruta (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819851#comment-13819851
 ] 

Kousuke Saruta commented on MAPREDUCE-5600:
---

Hi Benoit,
Can you reproduce that?

> ConcurrentModificationException on /tasktracker.jsp
> ---
>
> Key: MAPREDUCE-5600
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5600
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Found this in the MR1 bundled with 
> hadoop-2.0.0-mr1-cdh4.1.3 (which I think is based on some 0.20 version).
>Reporter: Benoit Sigoure
>
> If you request {{/tasktracker.jsp}} frequently on a TaskTracker that's busy, 
> every once in a while you'll get this:
> {code}
> 2013-10-29 13:25:55,524 ERROR org.mortbay.log: /tasktracker.jsp
> java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1117)
> at java.util.TreeMap$EntryIterator.next(TreeMap.java:1153)
> at java.util.TreeMap$EntryIterator.next(TreeMap.java:1148)
> at org.apache.hadoop.mapred.TaskTracker.getTasksFromRunningJobs(TaskTracker.java:3991)
> at org.apache.hadoop.mapred.tasktracker_jsp._jspService(tasktracker_jsp.java:98)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1056)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> {code}





[jira] [Commented] (MAPREDUCE-1176) FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819799#comment-13819799
 ] 

Hudson commented on MAPREDUCE-1176:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #4717 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4717/])
MAPREDUCE-1176. FixedLengthInputFormat and FixedLengthRecordReader (Mariappan 
Asokan and BitsOfInfo via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1540931)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FixedLengthInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FixedLengthRecordReader.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FixedLengthInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FixedLengthRecordReader.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestFixedLengthInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFixedLengthInputFormat.java


> FixedLengthInputFormat and FixedLengthRecordReader
> --
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed length input format classes does not support compressed 
> files. 
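
The split-size adjustment described above comes down to rounding the computed split size down to a whole multiple of the record length. The sketch below shows only that arithmetic, in plain Java with no Hadoop dependency. The class and method names (FixedLengthSplitSketch, alignedSplitSize) are hypothetical and do not come from the patch.

```java
/** Hypothetical sketch of the split-size arithmetic; not the patch itself. */
public class FixedLengthSplitSketch {

  /**
   * Rounds a computed split size down to a whole number of fixed-length records,
   * so that every split starts and ends on a record boundary.
   */
  static long alignedSplitSize(long computedSplitSize, int recordLength) {
    if (recordLength <= 0) {
      throw new IllegalArgumentException("record length must be positive");
    }
    if (computedSplitSize < 0) {
      throw new IllegalArgumentException("split size must not be negative");
    }
    // Integer division of non-negative longs already rounds down,
    // so it plays the role of Math.floor in the description.
    return (computedSplitSize / recordLength) * recordLength;
  }
}
```

For example, a 64 MiB computed split (67108864 bytes) with 100-byte records becomes 67108800 bytes, which is 671088 whole records. If the computed split size is smaller than one record, this arithmetic yields 0, so a real implementation would need to handle that case separately.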





[jira] [Created] (MAPREDUCE-5619) Separate out configuration loading from QueueManager in the Fair Scheduler

2013-11-11 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5619:
-

 Summary: Separate out configuration loading from QueueManager in 
the Fair Scheduler
 Key: MAPREDUCE-5619
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5619
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza









[jira] [Commented] (MAPREDUCE-5612) Document TaskAttemptCompletionEventStatuses

2013-11-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819785#comment-13819785
 ] 

Sandy Ryza commented on MAPREDUCE-5612:
---

Thanks for clarifying, Jason.

> Document TaskAttemptCompletionEventStatuses
> ---
>
> Key: MAPREDUCE-5612
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5612
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Priority: Minor
>  Labels: newbie
>
> What's the difference between FAILED and TIPFAILED?  What is OBSOLETE?





[jira] [Updated] (MAPREDUCE-1176) FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-1176:
--

   Resolution: Fixed
Fix Version/s: 2.3.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I just committed this.  Thanks [~masokan] and [~bitsofinfo]!

> FixedLengthInputFormat and FixedLengthRecordReader
> --
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed length input format classes does not support compressed 
> files. 





[jira] [Updated] (MAPREDUCE-1176) FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-1176:
--

Summary: FixedLengthInputFormat and FixedLengthRecordReader  (was: 
Contribution: FixedLengthInputFormat and FixedLengthRecordReader)

> FixedLengthInputFormat and FixedLengthRecordReader
> --
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed length input format classes does not support compressed 
> files. 





[jira] [Updated] (MAPREDUCE-5605) Memory-centric MapReduce aiming to solve the I/O bottleneck

2013-11-11 Thread Ming Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Chen updated MAPREDUCE-5605:
-

Tags: memory-centric multi-thread optimization task  (was: memory-centric 
muluti-thread optimization task)

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> ---
>
> Key: MAPREDUCE-5605
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 1.0.1
> Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
>Reporter: Ming Chen
>Assignee: Ming Chen
> Fix For: 1.0.1
>
> Attachments: MAPREDUCE-5605-v1.patch, 
> hadoop-core-1.0.1-mammoth-0.9.0.jar
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O 
> devices. So the idea is to maximize the usage of memory to solve the problem 
> of I/O bottleneck. We developed a multi-threaded task execution engine, which 
> runs in a single JVM on a node. In the execution engine, we have implemented 
> the algorithm of memory scheduling to realize global memory management, based 
> on which we further developed techniques such as sequential disk access and 
> multi-cache, and solved the problem of full garbage collection in the JVM. 
> The benchmark results show that it can get impressive improvement in typical 
> cases. When a system is relatively short of memory (e.g., HPC, small- and 
> medium-size enterprises), the improvement will be even more impressive.





[jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-11-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819721#comment-13819721
 ] 

Sangjin Lee commented on MAPREDUCE-5186:


Thanks for patching this, Jason!

> mapreduce.job.max.split.locations causes some splits created by 
> CombineFileInputFormat to fail
> --
>
> Key: MAPREDUCE-5186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 2.0.4-alpha, 2.2.0
>Reporter: Sangjin Lee
>Assignee: Robert Parker
>Priority: Critical
> Fix For: 3.0.0, 2.3.0
>
> Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
> MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch
>
>
> CombineFileInputFormat can easily create splits that can come from many 
> different locations (during the last pass of creating "global" splits). 
> However, we observe that this often runs afoul of the 
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any 
> decent size cluster, CombineFileInputFormat creates splits that are well 
> above this limit.
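
To make the failure mode concrete, the sketch below contrasts two policies for a split that lists more hosts than the configured limit: a strict check that rejects the split, as the description reports, and a lenient one that keeps only the first hosts up to the limit. The lenient policy is shown as an assumption about one possible remedy, not as the committed change. All names here (SplitLocationLimitSketch, checkStrict, truncate, strictRejects, hosts) are hypothetical and are not JobSplitWriter code.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch; this is not JobSplitWriter code. */
public class SplitLocationLimitSketch {

  /** Default for mapreduce.job.max.split.locations as stated in the issue. */
  static final int DEFAULT_MAX_LOCATIONS = 10;

  /** Strict policy: a split with too many locations makes job submission fail. */
  static String[] checkStrict(String[] locations, int max) throws IOException {
    if (locations.length > max) {
      throw new IOException(
          "Split has " + locations.length + " locations, limit is " + max);
    }
    return locations;
  }

  /** Lenient policy (an assumption): keep the first max locations, drop the rest. */
  static String[] truncate(String[] locations, int max) {
    return locations.length > max ? Arrays.copyOf(locations, max) : locations;
  }

  /** Convenience wrapper that reports whether the strict check rejects the split. */
  static boolean strictRejects(String[] locations, int max) {
    try {
      checkStrict(locations, max);
      return false;
    } catch (IOException rejected) {
      return true;
    }
  }

  /** Builds n made-up host names to stand in for a combined split's locations. */
  static String[] hosts(int n) {
    String[] result = new String[n];
    for (int i = 0; i < n; i++) {
      result[i] = "host" + i;
    }
    return result;
  }
}
```

Split locations are scheduling hints, so dropping some of them costs locality information but not correctness. That is the reasoning behind sketching the lenient variant.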





[jira] [Commented] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819596#comment-13819596
 ] 

Hadoop QA commented on MAPREDUCE-5481:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12613242/MAPREDUCE-5481.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
  org.apache.hadoop.mapred.TestJobCleanup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4189//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4189//console

This message is automatically generated.

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console





[jira] [Commented] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819476#comment-13819476
 ] 

Sandy Ryza commented on MAPREDUCE-5481:
---

I've seen the test failures from the previous build on other JIRAs and don't 
think they're related.

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console





[jira] [Updated] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-11 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5481:
---

Attachment: MAPREDUCE-5481.patch

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console





[jira] [Updated] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-11 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5481:
---

Status: Patch Available  (was: Open)

Kicking the build.

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console





[jira] [Updated] (MAPREDUCE-5481) Uber job reducers hang waiting to shuffle map outputs

2013-11-11 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5481:
---

Status: Open  (was: Patch Available)

> Uber job reducers hang waiting to shuffle map outputs
> -
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: MAPREDUCE-5481.patch, syslog
>
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console





[jira] [Commented] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819299#comment-13819299
 ] 

Sandy Ryza commented on MAPREDUCE-1176:
---

The test failures are unrelated - we're seeing them on other JIRAs as well.

+1.  Will commit this later today or tomorrow unless anybody has additional 
concerns.

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed length input format classes does not support compressed 
> files. 





[jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-11-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819284#comment-13819284
 ] 

Hudson commented on MAPREDUCE-5186:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #4713 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4713/])
MAPREDUCE-5186. mapreduce.job.max.split.locations causes some splits created by 
CombineFileInputFormat to fail. Contributed by Robert Parker and Jason Lowe 
(jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1540813)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/split/JobSplitWriter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/split
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/split/TestJobSplitWriter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestBlockLimits.java


> mapreduce.job.max.split.locations causes some splits created by 
> CombineFileInputFormat to fail
> --
>
> Key: MAPREDUCE-5186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 2.0.4-alpha, 2.2.0
>Reporter: Sangjin Lee
>Assignee: Robert Parker
>Priority: Critical
> Fix For: 3.0.0, 2.3.0
>
> Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
> MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch
>
>
> CombineFileInputFormat can easily create splits that can come from many 
> different locations (during the last pass of creating "global" splits). 
> However, we observe that this often runs afoul of the 
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any 
> decent-sized cluster, CombineFileInputFormat creates splits that are well 
> above this limit.
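
To make the failure mode concrete, here is a small Java sketch of a location-count check against the limit of 10 quoted in the description. It is an illustration only: the class and method names are hypothetical, it does not reproduce JobSplitWriter, and the lenient variant that trims the list is just one possible way to tolerate oversized splits, not a statement of what the committed patch does.

```java
import java.util.Arrays;

/** Hypothetical sketch; does not reproduce JobSplitWriter. */
final class SplitLocationLimitSketch {

    // Default quoted in the issue description.
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    /** Strict behaviour: reject a split that lists too many hosts. */
    static void checkLocations(String[] locations, int maxLocations) {
        if (locations.length > maxLocations) {
            throw new IllegalStateException("too many split locations: "
                + locations.length + " > " + maxLocations);
        }
    }

    /** Lenient alternative (an assumption): keep only the first hosts. */
    static String[] truncateLocations(String[] locations, int maxLocations) {
        return locations.length <= maxLocations
            ? locations
            : Arrays.copyOf(locations, maxLocations);
    }
}
```

A combined split drawn from 25 hosts fails the strict check with the default limit, while the lenient variant keeps the first 10 hosts and lets the split through.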





[jira] [Updated] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

2013-11-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5186:
--

   Resolution: Fixed
Fix Version/s: 2.3.0
               3.0.0
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Thanks Rob for the contribution, and thanks Sangjin and Daryn for reviews.  I 
committed this to trunk and branch-2.

> mapreduce.job.max.split.locations causes some splits created by 
> CombineFileInputFormat to fail
> --
>
> Key: MAPREDUCE-5186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 2.0.4-alpha, 2.2.0
>Reporter: Sangjin Lee
>Assignee: Robert Parker
>Priority: Critical
> Fix For: 3.0.0, 2.3.0
>
> Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, 
> MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch
>
>
> CombineFileInputFormat can easily create splits that can come from many 
> different locations (during the last pass of creating "global" splits). 
> However, we observe that this often runs afoul of the 
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any 
> decent-sized cluster, CombineFileInputFormat creates splits that are well 
> above this limit.





[jira] [Commented] (MAPREDUCE-5610) TestSleepJob fails in jdk7

2013-11-11 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819253#comment-13819253
 ] 

Jonathan Eagles commented on MAPREDUCE-5610:


Existing tests running with jdk7 already demonstrate the failure. When Apache 
builds switch to jdk7, this will be sufficient.

> TestSleepJob fails in jdk7
> --
>
> Key: MAPREDUCE-5610
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5610
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: MAPREDUCE-5610.patch
>
>
> In jdk7, test methods in a class do not run in file order, but rather in 
> random order. TestSleepJob hosts are not initialized, and a 
> NullPointerException is thrown, unless testRandomLocation runs first.
> This is easily seen by running the tests individually:
> org.apache.hadoop.mapred.gridmix.TestSleepJob#testStressSubmit
> org.apache.hadoop.mapred.gridmix.TestSleepJob#testReplaySubmit
> org.apache.hadoop.mapred.gridmix.TestSleepJob#testSerialSubmit
> org.apache.hadoop.mapred.gridmix.TestSleepJob#testMapTasksOnlySleepJobs
> org.apache.hadoop.mapred.gridmix.TestSleepJob#testRandomLocation
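
One common way to remove this kind of order dependence is to move the shared initialization out of a single test and into a setup step that every test goes through. The sketch below shows the idea in plain Java without JUnit; it is not the attached patch, and the class name, method bodies, and host names are hypothetical.

```java
/** Hypothetical sketch of an order-independent shared fixture. */
final class OrderIndependentFixtureSketch {

    // Shared fixture; null until the first test touches it.
    private static String[] hosts;

    /** Idempotent setup, playing the role of a JUnit setup method. */
    static synchronized String[] hosts() {
        if (hosts == null) {
            hosts = new String[] {"host1", "host2", "host3"};
        }
        return hosts;
    }

    // Each "test" obtains the fixture through hosts(), so no test relies
    // on another one having run first.
    static int testStressSubmit() {
        return hosts().length;
    }

    static int testRandomLocation() {
        return hosts().length;
    }
}
```

With this shape, calling testStressSubmit() before testRandomLocation() no longer dereferences an uninitialized array.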





[jira] [Commented] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819148#comment-13819148
 ] 

Hadoop QA commented on MAPREDUCE-1176:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12613169/mapreduce-1176_v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapred.TestJobCleanup

  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

org.apache.hadoop.mapreduce.v2.TestUberAM

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4188//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4188//console

This message is automatically generated.

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed 
> files.
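
As a rough picture of what reading such a file involves, the following Java sketch slices a byte array into fixed-length records and strips the space padding. It is a simplified stand-in for a record reader, not the FixedLengthRecordReader from the patch: the class and method names are hypothetical, the data is assumed to be ASCII, and stripping the padding is a choice made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the FixedLengthRecordReader from the patch. */
final class FixedLengthRecordsSketch {

    /** Cuts data into records of recordLength bytes each. */
    static List<String> readRecords(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("record length must be positive");
        }
        if (data.length % recordLength != 0) {
            // There are no delimiters, so a trailing partial record cannot
            // be interpreted.
            throw new IllegalArgumentException("data ends in a partial record");
        }
        List<String> records = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += recordLength) {
            String raw = new String(data, offset, recordLength,
                StandardCharsets.US_ASCII);
            records.add(stripTrailingSpaces(raw));
        }
        return records;
    }

    /** Removes the space padding at the end of a record. */
    static String stripTrailingSpaces(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }
}
```

With a record length of 5, the 15-byte input "ab   cde  fghij" is read as the three records "ab", "cde", and "fghij".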





[jira] [Updated] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-1176:


Status: Patch Available  (was: In Progress)

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed 
> files.





[jira] [Updated] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-1176:


Affects Version/s:     (was: 2.0.5-alpha)
                       (was: 2.1.0-beta)
                   2.2.0
           Status: In Progress  (was: Patch Available)

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.2.0
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed 
> files.





[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-11-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819132#comment-13819132
 ] 

Chris Nauroth commented on MAPREDUCE-5351:
--

Hi, [~jandyfish].  Thank you for reporting this and posting a patch.

Have you seen the related issue MAPREDUCE-5508?  I believe that fixes the bug 
you saw.  The MAPREDUCE-5508 patch has already been reviewed and committed.  We 
tested it at scale (hundreds of nodes/thousands of job executions) and saw no 
occurrences of the leak.

> JobTracker memory leak caused by CleanupQueue reopening FileSystem
> --
>
> Key: MAPREDUCE-5351
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1.1.2
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>Priority: Critical
> Fix For: 1-win, 1.2.1
>
> Attachments: JobInProgress_JobHistory.patch, MAPREDUCE-5351-1.patch, 
> MAPREDUCE-5351-2.patch, MAPREDUCE-5351-addendum-1.patch, 
> MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch
>
>
> When a job is completed, closeAllForUGI is called to close all the cached 
> FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
> this occurs and call FileSystem.get() to delete the staging directory, adding 
> a FileSystem to the cache that will never be closed.
> People on the user-list have reported this causing their JobTrackers to OOME 
> every two weeks.
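
The leak described above can be imitated with a toy cache in Java. This is a simplified model, not Hadoop's FileSystem cache: the class and method names are hypothetical, and it assumes each job uses its own cache key, so that an entry re-created after the close call is never looked up or closed again.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a per-owner handle cache; not Hadoop's FileSystem cache. */
final class FsCacheLeakSketch {

    private final Map<String, Object> cache = new HashMap<>();

    /** Returns the cached handle for owner, creating it on first use. */
    Object get(String owner) {
        Object handle = cache.get(owner);
        if (handle == null) {
            handle = new Object();
            cache.put(owner, handle);
        }
        return handle;
    }

    /** Drops every handle cached for owner. */
    void closeAllFor(String owner) {
        cache.remove(owner);
    }

    int size() {
        return cache.size();
    }

    /** Runs the given number of jobs and returns how many handles remain. */
    static int simulate(int jobs, boolean cleanupRunsAfterClose) {
        FsCacheLeakSketch fsCache = new FsCacheLeakSketch();
        for (int i = 0; i < jobs; i++) {
            String owner = "job-" + i; // assumption: one cache key per job
            fsCache.get(owner);        // the job itself uses the handle
            if (cleanupRunsAfterClose) {
                fsCache.closeAllFor(owner);
                fsCache.get(owner);    // late cleanup re-creates the entry
            } else {
                fsCache.get(owner);    // cleanup reuses the cached entry
                fsCache.closeAllFor(owner);
            }
        }
        return fsCache.size();
    }
}
```

In this model, simulate(1000, true) leaves 1000 entries behind while simulate(1000, false) leaves none, which is the kind of steady growth that eventually exhausts memory.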





[jira] [Commented] (MAPREDUCE-5612) Document TaskAttemptCompletionEventStatuses

2013-11-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819074#comment-13819074
 ] 

Jason Lowe commented on MAPREDUCE-5612:
---

As I understand it, FAILED indicates an attempt failed but there are attempts 
remaining.  TIPFAILED indicates an attempt failed and no further attempts will 
be made (i.e., the task reached max attempts).  When a reducer receives a 
TIPFAILED event it gives up trying to shuffle data from that map task.

OBSOLETE is used to override a previously successful event status.  For 
example, a map attempt runs on node A and a SUCCEEDED event is sent.  Later the 
task is retroactively failed due to excessive fetch failures during the shuffle 
phase.  When that retroactive attempt failure occurs, an OBSOLETE event is sent 
for that map attempt to indicate the prior event is no longer valid.
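
The distinctions above can be summarized as a small Java sketch of how a reducer might react to each status. It models only the four statuses discussed in this comment; the enum and method here are hypothetical stand-ins written for illustration, not Hadoop's own types, and the action names are invented.

```java
/** Hypothetical stand-in types; not Hadoop's own event classes. */
final class CompletionEventSketch {

    /** The statuses discussed in the comment above. */
    enum Status { SUCCEEDED, FAILED, TIPFAILED, OBSOLETE }

    /** Invented names for what a reducer does next. */
    enum ReducerAction {
        FETCH_OUTPUT, WAIT_FOR_RETRY, GIVE_UP_ON_TASK, DISCARD_PRIOR_SUCCESS
    }

    static ReducerAction react(Status status) {
        switch (status) {
            case SUCCEEDED:
                // The map output is available to shuffle.
                return ReducerAction.FETCH_OUTPUT;
            case FAILED:
                // This attempt failed, but further attempts remain.
                return ReducerAction.WAIT_FOR_RETRY;
            case TIPFAILED:
                // No further attempts will be made for this task.
                return ReducerAction.GIVE_UP_ON_TASK;
            case OBSOLETE:
                // An earlier SUCCEEDED event is no longer valid.
                return ReducerAction.DISCARD_PRIOR_SUCCESS;
            default:
                throw new IllegalArgumentException("unknown status: " + status);
        }
    }
}
```

The map attempt on node A in the example would first produce FETCH_OUTPUT and later, once it is retroactively failed, DISCARD_PRIOR_SUCCESS.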

> Document TaskAttemptCompletionEventStatuses
> ---
>
> Key: MAPREDUCE-5612
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5612
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Priority: Minor
>  Labels: newbie
>
> What's the difference between FAILED and TIPFAILED?  What is OBSOLETE?





[jira] [Commented] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819061#comment-13819061
 ] 

Mariappan Asokan commented on MAPREDUCE-1176:
-

Hi Sandy,
   Thanks for reviewing the patch.  I have followed all your suggestions and 
uploaded a new patch.  Please review it.  By the way, the seed for the random 
number generator is already logged.  Am I missing something?  Please let me 
know.

-- Asokan

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.1.0-beta, 2.0.5-alpha
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed 
> files.





[jira] [Updated] (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader

2013-11-11 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-1176:


Attachment: mapreduce-1176_v3.patch

> Contribution: FixedLengthInputFormat and FixedLengthRecordReader
> 
>
> Key: MAPREDUCE-1176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.1.0-beta, 2.0.5-alpha
> Environment: Any
>Reporter: BitsOfInfo
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, 
> MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch, mapreduce-1176_v1.patch, 
> mapreduce-1176_v2.patch, mapreduce-1176_v3.patch
>
>
> Hello,
> I would like to contribute the following two classes for incorporation into 
> the mapreduce.lib.input package. These two classes can be used when you need 
> to read data from files containing fixed length (fixed width) records. Such 
> files have no CR/LF (or any combination thereof), no delimiters etc, but each 
> record is a fixed length, and extra data is padded with spaces. The data is 
> one gigantic line within a file.
> Two classes are provided: FixedLengthInputFormat and its corresponding 
> FixedLengthRecordReader. When creating a job that specifies this input 
> format, the job must have the 
> "mapreduce.input.fixedlengthinputformat.record.length" property set as follows:
> myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]);
> OR
> myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]);
> This input format overrides computeSplitSize() in order to ensure that 
> InputSplits do not contain any partial records since with fixed records there 
> is no way to determine where a record begins if that were to occur. Each 
> InputSplit passed to the FixedLengthRecordReader will start at the beginning 
> of a record, and the last byte in the InputSplit will be the last byte of a 
> record. The override of computeSplitSize() delegates to FileInputFormat's 
> compute method, and then adjusts the returned split size by doing the 
> following: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) 
> * fixedRecordLength)
> This suite of fixed-length input format classes does not support compressed 
> files.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5618) Make applicationType pluggable

2013-11-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818783#comment-13818783
 ] 

Karthik Kambatla commented on MAPREDUCE-5618:
-

A way around this might be to keep track of all the application types set for 
MR jobs in that cluster, and use these when fetching information from YARN. 
But that seems like overkill.

> Make applicationType pluggable
> --
>
> Key: MAPREDUCE-5618
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5618
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>
> MR AM sets the applicationType to be "MAPREDUCE". Downstream projects like 
> Pig, Hive, Oozie might want to set this to a different value for their 
> error-handling, query-tracking etc. Making this pluggable should help this 
> cause.





[jira] [Commented] (MAPREDUCE-5618) Make applicationType pluggable

2013-11-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818782#comment-13818782
 ] 

Karthik Kambatla commented on MAPREDUCE-5618:
-

Looking at the source, this would affect the behavior of the Cluster and CLI 
getJobs/getAllJobs methods. These methods won't return MR jobs started by 
Oozie, Pig, or Hive. This would be an incompatible change.
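
The compatibility concern can be illustrated with a small Java sketch that filters applications by type. It is a toy model, not the Cluster or YARN client API: the class, the App record, and every type name other than "MAPREDUCE" are hypothetical. Filtering on the single type "MAPREDUCE" misses jobs submitted under another type, while filtering on a tracked set of types finds them all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of type-based job listing; not the YARN client API. */
final class ApplicationTypeFilterSketch {

    /** Minimal application record: an id and its applicationType. */
    static final class App {
        final String id;
        final String type;

        App(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /** Returns the ids of all applications whose type is in types. */
    static List<String> idsOfTypes(List<App> apps, Set<String> types) {
        List<String> ids = new ArrayList<>();
        for (App app : apps) {
            if (types.contains(app.type)) {
                ids.add(app.id);
            }
        }
        return ids;
    }
}
```

Given three MR jobs typed "MAPREDUCE", "OOZIE", and "PIG", a filter on "MAPREDUCE" alone returns one id, and a filter on all three types returns three.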

> Make applicationType pluggable
> --
>
> Key: MAPREDUCE-5618
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5618
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>
> MR AM sets the applicationType to be "MAPREDUCE". Downstream projects like 
> Pig, Hive, Oozie might want to set this to a different value for their 
> error-handling, query-tracking etc. Making this pluggable should help this 
> cause.


