[jira] [Updated] (MAPREDUCE-2781) mr279 RM application finishtime not set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-2781:
-------------------------------------
    Attachment: MAPREDUCE-2781-v2.patch

Patch with various small fixes and unit tests.

> mr279 RM application finishtime not set
> ---------------------------------------
>
> Key: MAPREDUCE-2781
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2781
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Thomas Graves
> Assignee: Thomas Graves
> Priority: Minor
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2781-v2.patch, finishtime.patch
>
> The RM Application finishTime isn't being set. It looks like it got lost in the RM refactor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2781) mr279 RM application finishtime not set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-2781:
-------------------------------------
    Fix Version/s: 0.23.0
    Affects Version/s: 0.23.0
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-2788) LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
[ https://issues.apache.org/jira/browse/MAPREDUCE-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Radwan updated MAPREDUCE-2788:
------------------------------------
    Attachment: MAPREDUCE-2788.patch

The attached patch avoids the described crash and returns Resources.none() in case memory == 0.

> LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
> -----------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2788
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2788
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Reporter: Ahmed Radwan
> Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-2788.patch
>
> The assignContainer() method in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue can cause the scheduler to crash if the ResourceRequest capability memory == 0 (divide by zero).
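The guard described in the patch comment can be illustrated with a minimal, self-contained sketch. The class and method names here are hypothetical stand-ins, not the actual LeafQueue code:

```java
// Hypothetical sketch of the zero-memory guard; not the real LeafQueue code.
class ContainerMath {
    /**
     * How many containers of the requested size fit into the available
     * memory. Without the guard, requestMemory == 0 would raise an
     * ArithmeticException (divide by zero) and crash the scheduler.
     */
    static int availableContainers(int availableMemory, int requestMemory) {
        if (requestMemory <= 0) {
            return 0; // analogous to returning Resources.none()
        }
        return availableMemory / requestMemory;
    }

    public static void main(String[] args) {
        System.out.println(availableContainers(4096, 1024)); // 4
        System.out.println(availableContainers(4096, 0));    // 0, no crash
    }
}
```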
[jira] [Created] (MAPREDUCE-2788) LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0
-----------------------------------------------------------------------------------------

Key: MAPREDUCE-2788
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2788
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

The assignContainer() method in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue can cause the scheduler to crash if the ResourceRequest capability memory == 0 (divide by zero).
[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service
[ https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081301#comment-13081301 ]

Jitendra Nath Pandey commented on MAPREDUCE-2780:
-------------------------------------------------
Daryn and I discussed it at length; the conclusions were:
1. It is better to treat the service as opaque, as in the current Token API.
2. A convenience method to set the service using an InetSocketAddress is useful.

Therefore we add a method in SecurityUtil that takes a token and an InetSocketAddress, constructs a service, and sets it in the token. The Token API remains unchanged.

> Standardize the value of token service
> --------------------------------------
>
> Key: MAPREDUCE-2780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2780
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2780-2.patch, MAPREDUCE-2780.patch
>
> The token's service field must (currently) be set to "ip:port". All the producers of a token are independently building the service string. This should be done via a common method to reduce the chance of error, and to facilitate the field value being easily changed in the (near) future.
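The "ip:port" service format the producers must agree on can be sketched as a standalone helper. This only illustrates the string format discussed above; the real convenience method lives in org.apache.hadoop.security.SecurityUtil, and the class name here is hypothetical:

```java
import java.net.InetSocketAddress;

// Illustrative sketch of the "ip:port" token-service convention; not the
// actual SecurityUtil implementation.
class TokenServiceSketch {
    static String buildService(InetSocketAddress addr) {
        // getHostAddress() yields the textual IP, matching the "ip:port"
        // format the token selectors expect.
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // A literal IP avoids any DNS lookup in this sketch.
        InetSocketAddress addr = new InetSocketAddress("10.1.2.3", 8020);
        System.out.println(buildService(addr)); // 10.1.2.3:8020
    }
}
```

Centralizing the format in one helper is what lets producers and selectors stay in sync if the format ever changes.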
[jira] [Commented] (MAPREDUCE-2676) MR-279: JobHistory Job page needs reformatted
[ https://issues.apache.org/jira/browse/MAPREDUCE-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081164#comment-13081164 ]

Luke Lu commented on MAPREDUCE-2676:
------------------------------------
Since you're creating jobhistory-specific pages, can you create unit tests similar to TestAMWebApp? The simplest page test (one that ensures the structural integrity of the page) is a one-liner, e.g.:
{code}
WebAppTests.testPage(HsView.class, AppContext.class, new TestHistoryContext());
{code}

> MR-279: JobHistory Job page needs reformatted
> ---------------------------------------------
>
> Key: MAPREDUCE-2676
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2676
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
> Fix For: 0.23.0
>
> Attachments: MR-2676-v1.patch
>
> The Job page, the Maps page, and the Reduces page for the job history server need to be reformatted.
> The Job Overview needs to add in the User, a link to the Job Conf, and the Job ACLs.
> It also needs Submitted at, Launched at, and Finished at, depending on how they relate to Started and Elapsed.
> In the attempts table we need to remove the New and Running columns.
> In the tasks table we need to remove the Progress, Pending, and Running columns and add in a Failed count column.
> We also need to investigate what it would take to add in setup and cleanup statistics. Perhaps these should more generally be Application Master statistics and links.
> The Maps page and Reduces page should have the Progress column removed.
[jira] [Resolved] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved MAPREDUCE-2729.
--------------------------------------
    Resolution: Fixed

Sorry, some weird issue with my patch download. I just committed this. Thanks Sherry!

> Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
> Reporter: Sherry Chen
> Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
> In the capacity scheduler, the number of users in a queue needing slots is calculated based on whether users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check of whether a reduce is ready to schedule (i.e. whether the job has completed enough map tasks) when we increment the number of users in a queue needing reduce slots.
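The check described in the issue boils down to a readiness predicate: reducers count toward "pending tasks" only once the job has completed enough maps. A minimal sketch, with hypothetical names rather than the actual CapacityScheduler code:

```java
// Hypothetical readiness predicate; not the real capacity-scheduler code.
class ReduceReadiness {
    /**
     * A job's reducers should count as "pending" (and its user as needing
     * reduce slots) only once enough of its maps have completed.
     */
    static boolean scheduleReduces(int completedMaps, int requiredCompletedMaps) {
        return completedMaps >= requiredCompletedMaps;
    }

    public static void main(String[] args) {
        System.out.println(scheduleReduces(3, 10));  // false: too early, don't count the user
        System.out.println(scheduleReduces(10, 10)); // true: reducers are now truly pending
    }
}
```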
[jira] [Updated] (MAPREDUCE-2787) MR-279: Performance improvement in running Uber MapTasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Radwan updated MAPREDUCE-2787:
------------------------------------
    Attachment: MAPREDUCE-2787.patch

The attached patch fixes the described issue by creating the FileSystem and Configuration only once for all task attempts. All mapreduce unit tests ran successfully.

> MR-279: Performance improvement in running Uber MapTasks
> --------------------------------------------------------
>
> Key: MAPREDUCE-2787
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2787
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Reporter: Ahmed Radwan
> Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-2787.patch
>
> The runUberMapTasks() method in org.apache.hadoop.mapred.UberTask obtains the local FileSystem and local job configuration for every task attempt. This has a negative performance impact.
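The "create once, reuse across attempts" pattern the patch comment describes can be sketched with a simple memoizing wrapper. This is an illustration only, with a string standing in for the expensive FileSystem/Configuration objects; it is not the actual UberTask code:

```java
import java.util.function.Supplier;

// Sketch of the optimization: pay the construction cost once, then hand the
// same instance to every task attempt. Not the actual UberTask code.
class Memoized<T> implements Supplier<T> {
    private final Supplier<T> factory;
    private T value; // created lazily, at most once

    Memoized(Supplier<T> factory) { this.factory = factory; }

    public synchronized T get() {
        if (value == null) {
            value = factory.get(); // expensive construction happens only here
        }
        return value;
    }

    public static void main(String[] args) {
        int[] constructions = {0};
        Memoized<String> conf = new Memoized<>(() -> {
            constructions[0]++;           // count expensive constructions
            return "local-configuration"; // stand-in for the real object
        });
        // Simulate several task attempts sharing the cached instance.
        for (int attempt = 0; attempt < 5; attempt++) {
            conf.get();
        }
        System.out.println(constructions[0]); // 1: constructed once, not 5 times
    }
}
```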
[jira] [Updated] (MAPREDUCE-2780) Standardize the value of token service
[ https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-2780:
-----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service
[ https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081142#comment-13081142 ]

Daryn Sharp commented on MAPREDUCE-2780:
----------------------------------------
bq. It forces to set ip/host/port in the service.

Per the title of this jira, that is the goal. :) Every token producer is required to use the same service field format to comply with the format needed by the token's selector. Currently every token producer has a copy-n-paste chunk of the selector's code to construct the service format. If the selector and the producer get out of sync, there's a big problem.

bq. If we later decide to store something else in the service, for example a uri with scheme, we will have to change it again.

Please elaborate? This appears to be a non-sequitur, since token producers that choose to use a URI (someday), for instance, will require a change in either case.

Here is the evolution from the original code of:
{code}
token.setService(new Text(addr.getAddress().getHostAddress() + ":" + addr.getPort()));
selector.selectToken(new Text(addr.getAddress().getHostAddress() + ":" + addr.getPort()), tokens);
{code}

I think this is your suggestion? It's incrementally better, but it continues to require a copy-n-paste in each token producer, and every token producer continues to have intimate knowledge of the service format.
{code}
token.setService(SecurityUtil.buildDTAuthority(addr));
selector.selectToken(SecurityUtil.buildDTAuthority(addr), tokens);
{code}

The patch applies another layer of abstraction. The format is privatized to the token, instead of being publicly diffused over all the tokens in hadoop.
{code}
token.setService(addr);
selector.selectToken(Token.createService(addr), tokens);
// I removed this due to your earlier concerns in the parent jira
// selector.selectToken(addr, tokens);
{code}

Do you believe this is a persuasive case for the patch?
[jira] [Created] (MAPREDUCE-2787) MR-279: Performance improvement in running Uber MapTasks
MR-279: Performance improvement in running Uber MapTasks
--------------------------------------------------------

Key: MAPREDUCE-2787
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2787
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

The runUberMapTasks() method in org.apache.hadoop.mapred.UberTask obtains the local FileSystem and local job configuration for every task attempt. This has a negative performance impact.
[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081122#comment-13081122 ]

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------
Thomas, it doesn't make sense to port this to trunk - please don't bother, unless you want to look at this vis-a-vis MAPREDUCE-279.
[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081118#comment-13081118 ]

Thomas Graves commented on MAPREDUCE-2729:
------------------------------------------
The patch is for the branch-0.20-security branch. I will look at putting it on trunk.
[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service
[ https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081091#comment-13081091 ]

Jitendra Nath Pandey commented on MAPREDUCE-2780:
-------------------------------------------------
I see only one issue with changing the setService API to take an InetSocketAddress: it forces to set ip/host/port in the service. If we later decide to store something else in the service, for example a uri with scheme, we will have to change it again. I would recommend keeping the setService API unchanged; however, a utility method in SecurityUtil to construct a DT service from a socket address is good.
[jira] [Commented] (MAPREDUCE-2364) Shouldn't hold lock on rjob while localizing resources.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081086#comment-13081086 ]

Devaraj Das commented on MAPREDUCE-2364:
----------------------------------------
Subroto, I see a significant difference between the patches attached to MAPREDUCE-2209 and the last one here. I'll need to look at the details, but if you have time could you please take a look at the patch attached here and see if it makes sense (given that this patch predates the patch on MAPREDUCE-2209; I am sorry that I didn't look at the patch here earlier).

> Shouldn't hold lock on rjob while localizing resources.
> --------------------------------------------------------
>
> Key: MAPREDUCE-2364
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2364
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Affects Versions: 0.20.203.0
> Reporter: Owen O'Malley
> Assignee: Devaraj Das
> Fix For: 0.20.204.0
>
> Attachments: MAPREDUCE-2364.patch, no-lock-localize-branch-0.20-security.patch, no-lock-localize-trunk.patch
>
> There is a deadlock while localizing resources on the TaskTracker.
[jira] [Commented] (MAPREDUCE-2209) TaskTracker's heartbeat hang for several minutes when copying large job.jar from HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081084#comment-13081084 ]

Arun C Murthy commented on MAPREDUCE-2209:
------------------------------------------
Subroto/Liyin - In general the community is moving forward with a complete re-architecture of Hadoop MapReduce from 0.23 onwards: see MAPREDUCE-279 for more details. Thus, you might be better off focussing your efforts either on hadoop-0.20.203 and beyond (the classic mapreduce) or, better yet, helping out with MAPREDUCE-279 (the new mapreduce). Given your valuable experience with 'classic' mapreduce, your help on MAPREDUCE-279 would be very welcome. Thanks!

> TaskTracker's heartbeat hang for several minutes when copying large job.jar from HDFS
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2209
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2209
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.23.0
> Environment: hadoop version: 0.19.1
> Reporter: Liyin Liang
> Priority: Blocker
>
> Attachments: 2209-1.diff, MAPREDUCE-2209.patch
>
> If a job's jar file is very large, e.g. 200m+, the TaskTracker's heartbeat hangs for several minutes when localizing the job. The jstack of the related threads is as follows:
> {code:borderStyle=solid}
> "TaskLauncher for task" daemon prio=10 tid=0x002b05ee5000 nid=0x1adf runnable [0x42e56000]
>    java.lang.Thread.State: RUNNABLE
>     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>     - locked <0x002afc892ec8> (a sun.nio.ch.Util$1)
>     - locked <0x002afc892eb0> (a java.util.Collections$UnmodifiableSet)
>     - locked <0x002afc8927d8> (a sun.nio.ch.EPollSelectorImpl)
>     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>     at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>     - locked <0x002afce26158> (a java.io.BufferedInputStream)
>     at java.io.DataInputStream.readShort(DataInputStream.java:295)
>     at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1304)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1556)
>     - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
>     - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>     at java.io.DataInputStream.read(DataInputStream.java:83)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:209)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
>     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:824)
>     - locked <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
>     at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1745)
>     at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:103)
>     at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1710)
>
> "Map-events fetcher for all reduce tasks on tracker_r01a08025:localhost/127.0.0.1:50050" daemon prio=10 tid=0x002b05ef8000 nid=0x1ada waiting for monitor entry [0x42d55000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:582)
>     - waiting to lock <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
>     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:617)
>     - locked <0x002a9eefe1f8> (a java.util.TreeMap)
>
> "IPC Server handler 2 on
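The jstack above shows the slow job.jar copy running while holding the RunningJob monitor, which blocks the map-events fetcher (and ultimately the heartbeat). The general fix pattern is to do the long-running copy outside the lock and take the lock only briefly to check and publish state. A minimal sketch with hypothetical names, not the actual TaskTracker code (a production version would also guard against two threads localizing the same job concurrently):

```java
// Sketch of "don't hold the lock during the expensive copy"; names are
// illustrative, not the real TaskTracker code.
class JobLocalizer {
    private final Object rjobLock = new Object(); // stand-in for the RunningJob monitor
    private boolean localized = false;

    void localizeJob(Runnable expensiveCopy) {
        synchronized (rjobLock) {
            if (localized) {
                return; // job already localized; nothing to do
            }
        }
        expensiveCopy.run(); // the slow HDFS copy runs WITHOUT holding the lock
        synchronized (rjobLock) {
            localized = true; // brief critical section to publish the result
        }
    }

    public static void main(String[] args) {
        JobLocalizer localizer = new JobLocalizer();
        int[] copies = {0};
        localizer.localizeJob(() -> copies[0]++);
        localizer.localizeJob(() -> copies[0]++); // second call is a no-op
        System.out.println(copies[0]); // 1
    }
}
```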
[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-2729:
-------------------------------------
    Status: Open  (was: Patch Available)

Sherry, the patch doesn't apply clean - can you please re-generate it? Thanks.
[jira] [Updated] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Plamen Jeliazkov updated MAPREDUCE-2786:
----------------------------------------
    Attachment: MAPREDUCE-2786.patch

This is my work done so far. I'd like to move the codec into the mapper constructors, but I have not been able to do so successfully because the CompressionOutputStream relies on the OutputStream within each mapper.

> TestDFSIO should also test compression reading/writing from command-line.
> --------------------------------------------------------------------------
>
> Key: MAPREDUCE-2786
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks
> Reporter: Plamen Jeliazkov
> Priority: Minor
> Labels: newbie
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-2786.patch
>
> Original Estimate: 36h
> Remaining Estimate: 36h
>
> After running into trouble dealing with the config files, I thought it might be easier to simply alter the code of TestDFSIO to accept any compression codec and allow testing for compression via a command-line argument, instead of having to change the config file every time. Something like "-compression" would do.
[jira] [Created] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.
TestDFSIO should also test compression reading/writing from command-line.
--------------------------------------------------------------------------

Key: MAPREDUCE-2786
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: benchmarks
Reporter: Plamen Jeliazkov
Priority: Minor
Fix For: 0.22.0

After running into trouble dealing with the config files, I thought it might be easier to simply alter the code of TestDFSIO to accept any compression codec and allow testing for compression via a command-line argument, instead of having to change the config file every time. Something like "-compression" would do.
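The "-compression" flag proposed above amounts to a small piece of argument handling: map an optional flag to a codec class name instead of editing the config file. The flag name comes from the issue; the parsing below is an illustrative sketch, not TestDFSIO's actual code:

```java
// Hypothetical sketch of parsing an optional "-compression <codec>" flag.
class CompressionArg {
    /** Returns the codec class name following "-compression", or null if absent. */
    static String parseCodec(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-compression".equals(args[i])) {
                return args[i + 1]; // e.g. a CompressionCodec class name
            }
        }
        return null; // no flag: run uncompressed, as before
    }

    public static void main(String[] args) {
        String[] cli = {"-write", "-compression", "org.apache.hadoop.io.compress.GzipCodec"};
        System.out.println(parseCodec(cli)); // org.apache.hadoop.io.compress.GzipCodec
    }
}
```

The benchmark would then instantiate the named codec reflectively (as Hadoop does for configured codecs) or fall back to raw streams when the result is null.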
[jira] [Commented] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081062#comment-13081062 ]

Plamen Jeliazkov commented on MAPREDUCE-2627:
---------------------------------------------
Yes, it was due to this patch being submitted. We have reverted it and are waiting for a new POM file to be generated. Please read the related issue thread for more.

> guava-r09 JAR file needs to be added to mapreduce.
> --------------------------------------------------
>
> Key: MAPREDUCE-2627
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: build
> Affects Versions: 0.22.0
> Reporter: Plamen Jeliazkov
> Priority: Blocker
> Fix For: 0.22.0
>
> Attachments: patch.txt
>
> Need to add the guava-r09.jar file into the "mapreduce/build/ivy/lib/Hadoop/common" directory; it is missing from the build.
[jira] [Commented] (MAPREDUCE-377) Add serialization for Protocol Buffers
[ https://issues.apache.org/jira/browse/MAPREDUCE-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081031#comment-13081031 ]

Pere Ferrera Bertran commented on MAPREDUCE-377:
------------------------------------------------
PB integration with Hadoop is now possible by using Protostuff (http://code.google.com/p/protostuff/), by calling ProtobufIOUtil.writeDelimitedTo() and ProtobufIOUtil.mergeDelimitedFrom(). These methods avoid the problem of consuming too many bytes from the stream.

> Add serialization for Protocol Buffers
> --------------------------------------
>
> Key: MAPREDUCE-377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-377
> Project: Hadoop Map/Reduce
> Issue Type: Wish
> Reporter: Tom White
> Assignee: Alex Loddengaard
>
> Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, hadoop-3788-v3.patch, protobuf-java-2.0.1.jar, protobuf-java-2.0.2.jar
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding data in a compact binary format. This issue is to write a ProtocolBuffersSerialization to support using Protocol Buffers types in MapReduce programs, including an example program. This should probably go into contrib.
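The delimited read/write mentioned in the comment works by length-prefixing each message, so the reader consumes exactly one message's bytes and leaves the rest of the stream untouched. This stdlib sketch shows only the framing idea with a fixed-width prefix; Protostuff/protobuf use a varint prefix plus real message serialization:

```java
import java.nio.ByteBuffer;

// Illustration of length-delimited framing; not the Protostuff/protobuf code.
class DelimitedFraming {
    static void writeDelimited(ByteBuffer out, byte[] msg) {
        out.putInt(msg.length); // length prefix tells the reader where to stop
        out.put(msg);
    }

    static byte[] readDelimited(ByteBuffer in) {
        int len = in.getInt();
        byte[] msg = new byte[len];
        in.get(msg); // consumes exactly len bytes, no over-reading
        return msg;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeDelimited(buf, "first".getBytes());
        writeDelimited(buf, "second".getBytes());
        buf.flip(); // switch from writing to reading
        System.out.println(new String(readDelimited(buf))); // first
        System.out.println(new String(readDelimited(buf))); // second
    }
}
```

Without the prefix, a reader parsing a raw protobuf message from a buffered stream cannot know where one message ends, which is exactly the "too many bytes consumed" problem the comment refers to.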
[jira] [Created] (MAPREDUCE-2785) MiniMR cluster thread crashes if no hadoop log dir set
MiniMR cluster thread crashes if no hadoop log dir set -- Key: MAPREDUCE-2785 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2785 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.203.0 Reporter: Steve Loughran Priority: Minor I'm marking this as minor as it is most obvious in the MiniMRCluster, but the root cause is in the JT. If you instantiate a MiniMRCluster without setting {{hadoop.job.history.location}} in the configuration and with the system property {{hadoop.log.dir}} unset, then JobHistory throws an NPE. In production, that would be picked up as a failure to start the JT. In the MiniMRCluster, all it does is crash the JT thread, which isn't noticed by the MiniMRCluster. You see the logged error, but the tests will just time out waiting for things to come up:
2011/08/08 17:46:26:427 CEST [ERROR][Thread-44] org.apache.hadoop.mapred.MiniMRCluster - Job tracker crashed
java.lang.NullPointerException
	at java.io.File.<init>(File.java:222)
	at org.apache.hadoop.mapred.JobHistory.initLogDir(JobHistory.java:531)
	at org.apache.hadoop.mapred.JobHistory.init(JobHistory.java:499)
	at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2316)
	at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2313)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2313)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2171)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
	at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:114)
	at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:112)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner.run(MiniMRCluster.java:112)
	at java.lang.Thread.run(Thread.java:662)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
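The frame at java.io.File.&lt;init&gt; points at a File being built from an unset system property. A stand-alone sketch of that failure mode (this is illustrative, not the actual JobHistory.initLogDir code), plus the property guard a test harness can apply before starting the cluster:

```java
import java.io.File;

public class LogDirNpeSketch {
    public static void main(String[] args) {
        // With hadoop.job.history.location unset, the history dir falls back
        // to the hadoop.log.dir system property, which may also be unset...
        String logDir = System.getProperty("hadoop.log.dir");
        try {
            // ...and java.io.File's one-argument constructor rejects a null
            // pathname with NullPointerException.
            new File(logDir);
        } catch (NullPointerException e) {
            System.out.println("NPE: hadoop.log.dir is unset");
        }

        // Guard a test can apply before instantiating MiniMRCluster:
        if (System.getProperty("hadoop.log.dir") == null) {
            System.setProperty("hadoop.log.dir",
                    System.getProperty("java.io.tmpdir"));
        }
        System.out.println("hadoop.log.dir = "
                + System.getProperty("hadoop.log.dir"));
    }
}
```

Setting the property this way only masks the symptom in tests; the JT-side fix would be to fail startup cleanly instead of letting the NPE kill the thread.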
[jira] [Created] (MAPREDUCE-2784) [Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode.
[Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode. --- Key: MAPREDUCE-2784 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2784 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Amar Kamat Assignee: Amar Kamat TestGridmixSummary fails with an NPE when run in debug mode. JobFactory tries to access the _createReaderThread()_ API of JobStoryProducer, which returns null in TestGridmixSummary's FakeJobStoryProducer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
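The failure pattern is an overridable hook whose test fake returns null while the caller dereferences the result unconditionally. A hypothetical stand-alone sketch (class names invented to mirror the shape of JobStoryProducer/FakeJobStoryProducer, not the actual Gridmix code):

```java
public class NullHookSketch {
    // Base class with a reader-thread hook, like JobStoryProducer's
    // createReaderThread().
    static abstract class Producer {
        abstract Thread createReaderThread();
    }

    // Test fake that declines to supply a real reader thread.
    static class FakeProducer extends Producer {
        @Override
        Thread createReaderThread() {
            return null;
        }
    }

    public static void main(String[] args) {
        Producer p = new FakeProducer();
        Thread reader = p.createReaderThread();
        // Calling reader.start() unconditionally reproduces the NPE; either
        // a null check here or a no-op thread in the fake avoids it.
        if (reader != null) {
            reader.start();
        } else {
            System.out.println("reader thread is null; skipping start()");
        }
    }
}
```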
[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite
[ https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080917#comment-13080917 ] Amareshwari Sriramadasu commented on MAPREDUCE-2765: First of all, the code needs to go into a contrib project, so you need to regenerate the patch putting the code in contrib. The build environment also needs changes. Will this be blocked on the mavenization of MapReduce? Overall, the design looks fine. Here are some comments on the code:
* CopyMapper:
** {noformat} if (targetFS.exists(targetFinalPath) && targetFS.isFile(targetFinalPath)) { overWrite = true; // When target is an existing file, overwrite it. } {noformat} The target file is overwritten irrespective of the overwrite configuration? Why?
* Dynamic\*
** DynamicInputChunk is not public?
** DynamicInputFormat creates FileSplits with zero length. Should the split instead be created with the chunk size as its size?
** DynamicRecordReader has commented-out code. It should be removed.
* CopyCommitter:
** Atomic commit should not delete the final directory. It should throw an error if the directory exists, even before starting the job.
** deleteMissing() counts files which do not exist at both the source and target paths as deleted entries.
** Preserving status for the root folder does not happen at all? Can you check?
** If I'm not wrong, preserveFileAttributes() preserves attributes only for directories. Can we rename the method accordingly?
** The methods deleteMissing(), preserveFileAttributes() etc. need more documentation.
** Deleting attempt temp files happens in each attempt. Why are we deleting again in the Committer? The Committer should just delete the work path.
General comment: all public classes and public methods need javadoc. I haven't looked at the test cases yet. 
> DistCp Rewrite > -- > > Key: MAPREDUCE-2765 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: distcp >Affects Versions: 0.20.203.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: distcpv2.20.203.patch > > > This is a slightly modified version of the DistCp rewrite that Yahoo uses in > production today. The rewrite was ground-up, with specific focus on: > 1. improved startup time (postponing as much work as possible to the MR job) > 2. support for multiple copy-strategies > 3. new features (e.g. -atomic, -async, -bandwidth.) > 4. improved programmatic use > Some effort has gone into refactoring what used to be achieved by a single > large (1.7 KLOC) source file, into a design that (hopefully) reads better too. > The proposed DistCpV2 preserves command-line-compatibility with the old > version, and should be a drop-in replacement. > New to v2: > 1. Copy-strategies and the DynamicInputFormat: > A copy-strategy determines the policy by which source-file-paths are > distributed between map-tasks. (These boil down to the choice of the > input-format.) > If no strategy is explicitly specified on the command-line, the policy > chosen is "uniform size", where v2 behaves identically to old-DistCp. (The > number of bytes transferred by each map-task is roughly equal, at a per-file > granularity.) > Alternatively, v2 ships with a "dynamic" copy-strategy (in the > DynamicInputFormat). This policy acknowledges that > (a) dividing files based only on file-size might not be an > even distribution (E.g. if some datanodes are slower than others, or if some > files are skipped.) > (b) a "static" association of a source-path to a map increases > the likelihood of long-tails during copy. > The "dynamic" strategy divides the list-of-source-paths into a number > (> nMaps) of smaller parts. 
When each map completes its current list of > paths, it picks up a new list to process, if available. So if a map-task is > stuck on a slow (and not necessarily large) file, other maps can pick up the > slack. The thinner the file-list is sliced, the greater the parallelism (and > the lower the chances of long-tails). Within reason, of course: the number of > these short-lived list-files is capped at an overridable maximum. > Internal benchmarks against source/target clusters with some slow(ish) > datanodes have indicated significant performance gains when using the > dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps. > Please note that the DynamicInputFormat might prove useful outside of > DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also > note that the copy-strategies have no bearing on the CopyMapper.map() > implementation. > > 2. Improved startup-time and programmatic use: > When the old-DistCp runs with -update, and cre
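The dynamic strategy described above is essentially work-pickup over chunked file lists. A simplified stand-alone sketch (a thread pool stands in for map tasks and an in-memory queue for DistCpV2's short-lived list-files; `chunksPerMap` is an invented knob for the "thinner slicing" parameter):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class DynamicStrategySketch {

    // Slice the source list into more chunks than there are maps, then let
    // each "map" pull chunks until the queue drains.
    static int runCopy(int nMaps, List<String> sourcePaths) throws InterruptedException {
        int chunksPerMap = 4; // thinner slices => more parallelism, within reason
        int nChunks = nMaps * chunksPerMap;
        int per = (sourcePaths.size() + nChunks - 1) / nChunks;

        LinkedBlockingQueue<List<String>> chunks = new LinkedBlockingQueue<>();
        for (int i = 0; i < sourcePaths.size(); i += per) {
            chunks.add(sourcePaths.subList(i, Math.min(i + per, sourcePaths.size())));
        }

        // A map stuck on one slow file delays only its current chunk; idle
        // maps keep draining the queue, which limits long-tails.
        ConcurrentLinkedQueue<String> copied = new ConcurrentLinkedQueue<>();
        ExecutorService maps = Executors.newFixedThreadPool(nMaps);
        for (int m = 0; m < nMaps; m++) {
            maps.submit(() -> {
                List<String> chunk;
                while ((chunk = chunks.poll()) != null) {
                    copied.addAll(chunk); // stand-in for the actual file copy
                }
            });
        }
        maps.shutdown();
        maps.awaitTermination(1, TimeUnit.MINUTES);
        return copied.size();
    }

    public static void main(String[] args) throws Exception {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            paths.add("/src/file-" + i);
        }
        System.out.println("copied " + runCopy(3, paths) + " files");
    }
}
```

The real DynamicInputFormat persists the chunks as files so that separate map-task JVMs can claim them; the queue here only models the claim-next-chunk behavior.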
[jira] [Commented] (MAPREDUCE-1834) TestSimulatorDeterministicReplay timesout on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080804#comment-13080804 ] arunkumar commented on MAPREDUCE-1834: -- More info: from an svn checkout, in HADOOP_HOME/mapreduce/src/contrib/mumak/src/test/org/apache/hadoop/mapred/ there are no DeterministicCollectionAspects.aj and FakeConcurrentHashMap.java files. > TestSimulatorDeterministicReplay timesout on trunk > -- > > Key: MAPREDUCE-1834 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1834 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0 >Reporter: Amareshwari Sriramadasu >Assignee: Hong Tang > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1834.patch, > TestSimulatorDeterministicReplay.log, mr-1834-20100727.patch, > mr-1834-20100729.patch, mr-1834-20100802.patch > > > TestSimulatorDeterministicReplay times out on trunk. > See hudson patch build > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/216/testReport/org.apache.hadoop.mapred/TestSimulatorDeterministicReplay/testMain/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira