[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control
[ https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746749#action_12746749 ]

Hemanth Yamijala commented on MAPREDUCE-856:

Looked at the patch. I have a few comments:
- Make Localizer an instance class; in general that's a more flexible design, and the localizer needs to maintain state anyway.
- I would recommend that initializeUserDirs be passed the taskcontroller instead of the tasktracker, as the localizer does not need the entire tasktracker interface, at least for now.
- In HADOOP-4491, if the user directory cannot be created on any disk, we were failing localization. I think that's a useful feature to have.
- Synchronization w.r.t. user localization needs to be looked at.
-- It is possible right now that, while user localization is in progress for a user, another task for the same user gets launched before the localization completes.
-- Also, the object on which we are locking: is it guaranteed to be a unique instance for every user?
- A race condition exists between creation and deletion of user directories. Say a job requires a user dir and has not yet localized files (and consequently hasn't acquired the synchronization lock). If deletion starts at that time, it could delete the user dir.
- Also, I think it would be good to check for cleaning up user directories at a much slower pace, as the checks involve some costly operations.
- I think JobConf.setUserAndGroupNamesForJob need not be static. Also, it would be nice to document that this is mainly used in test cases.
- User directory can be 570. So also distributed cache directory (no need even for setuid, right ?)
- The changes in MAPREDUCE-871 need to be synced up in this patch as well.
- Some tests like TestTaskControllerSetup are disabled. Can you please enable them again?
- Permission checks are needed for the user directory, and for the jobcache and archive directories.
- Test cases should also confirm that directory paths in localized distributed cache paths are being set to the right permissions.
- Can we use testManagerFlow to have templates that can be overridden by the LinuxTaskController test class?

Localized files from DistributedCache should have right access-control
--
Key: MAPREDUCE-856
URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
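On the question raised in the comment above, whether the lock object is guaranteed to be a unique instance for every user: one common way to get exactly one lock per user name is a concurrent map keyed by user. The following is only a minimal sketch under that assumption, not code from the patch; `UserLocks`, `lockFor` and `runLocalized` are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of the MAPREDUCE-856 patch.
class UserLocks {
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

  // Returns the same lock object for every call with an equal user name.
  Object lockFor(String user) {
    return locks.computeIfAbsent(user, u -> new Object());
  }

  // Runs the action while holding that user's lock, so that two actions
  // for the same user (for example localization and directory deletion)
  // cannot interleave.
  void runLocalized(String user, Runnable action) {
    synchronized (lockFor(user)) {
      action.run();
    }
  }
}
```

Keying the map by the user name, rather than locking on a per-task object, means that equal names always resolve to the same monitor. Both directory creation and directory deletion would have to go through the same lock for the race described above to be closed.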
[jira] Commented: (MAPREDUCE-909) Shell$ExitCodeException while killing/failing a task.
[ https://issues.apache.org/jira/browse/MAPREDUCE-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746759#action_12746759 ]

Suman Sehgal commented on MAPREDUCE-909:

This exception is suppressed in 0.21, whereas it should also appear in the 0.21 logs when killing or failing a task.

Shell$ExitCodeException while killing/failing a task.
-
Key: MAPREDUCE-909
URL: https://issues.apache.org/jira/browse/MAPREDUCE-909
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Suman Sehgal
Priority: Minor

Encountered Shell$ExitCodeException in TT logs while killing/failing a job on 0.20.1

Stack Trace:
=
2009-08-22 16:37:05,867 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200908200732_0541_m_03_1
2009-08-22 16:37:06,030 WARN org.apache.hadoop.mapred.LinuxTaskController: Exception thrown while launching task JVM : org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:245)
    at org.apache.hadoop.util.Shell.run(Shell.java:172)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:365)
    at org.apache.hadoop.mapred.LinuxTaskController.launchTaskJVM(LinuxTaskController.java:156)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:397)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:386)
2009-08-22 16:37:06,030 WARN org.apache.hadoop.mapred.LinuxTaskController: Exit code from task is : 143
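The exit code 143 in the log above is consistent with the task JVM having been terminated by SIGTERM: shells report a process killed by signal N as 128 + N, and SIGTERM is signal 15. That reading is an inference from the convention, not something the report states. A small sketch of the arithmetic, with `ExitCodes` as a hypothetical helper name:

```java
// Hypothetical helper for reading exit codes such as the 143 logged above.
class ExitCodes {
  // Returns the signal number if the exit code follows the shell's
  // "128 + signal" convention, or -1 if it does not.
  static int signalFromExitCode(int exitCode) {
    return (exitCode > 128 && exitCode < 256) ? exitCode - 128 : -1;
  }
}
```

With this helper, `ExitCodes.signalFromExitCode(143)` yields 15, which matches a kill of the task rather than a failure inside it.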
[jira] Updated: (MAPREDUCE-909) Shell$ExitCodeException while killing/failing a task.
[ https://issues.apache.org/jira/browse/MAPREDUCE-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suman Sehgal updated MAPREDUCE-909: --- Priority: Trivial (was: Minor) Shell$ExitCodeException while killing/failing a task. - Key: MAPREDUCE-909 URL: https://issues.apache.org/jira/browse/MAPREDUCE-909 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Suman Sehgal Priority: Trivial Encountered Shell$ExitCodeException in TT logs while killing/failing a job on 0.20.1 Stack Trace: = 2009-08-22 16:37:05,867 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200908200732_0541_m_03_1 2009-08-22 16:37:06,030 WARN org.apache.hadoop.mapred.LinuxTaskController: Exception thrown while launching task JVM : org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:245) at org.apache.hadoop.util.Shell.run(Shell.java:172) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:365) at org.apache.hadoop.mapred.LinuxTaskController.launchTaskJVM(LinuxTaskController.java:156) at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:397) at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:386) 2009-08-22 16:37:06,030 WARN org.apache.hadoop.mapred.LinuxTaskController: Exit code from task is : 143 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.
[ https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746763#action_12746763 ]

rahul k singh commented on MAPREDUCE-861:
-
A small correction to the above: getLeafQueues is actually getLeafQueueNames().

Some more clarification on the names of queues:

1. The name of a queue would be parent.child.grandChild. For example:
{code:xml}
<queue>
  <name>q</name>
  <queue>
    <name>p</name>
  </queue>
</queue>
{code}
In the above example there are 2 queues: q is a root-level queue and p is a child of q.
The name of queue q would be q.
The name of queue p would be q.p.

We would always use this fully qualified name in the implementation.
Users cannot give a queue a name like queue-name.queue-name, as "." is used as the separator.

Modify queue configuration format and parsing to support a hierarchy of queues.
---
Key: MAPREDUCE-861
URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh

MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the configuration related to queues.
The current format for defining a queue and its properties is as follows: mapred.queue.queue-name.property-name. For example, mapred.queue.queue-name.acl-submit-job. The reason for using this verbose format was to be able to reuse the Configuration parser in Hadoop. However, administrators currently using the queue configuration have already indicated a very strong desire for a more manageable format. Since this becomes more unwieldy with hierarchical queues, this may be a good time to introduce a new format for representing queue configuration.
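The naming rule described in the comment above (parent.child.grandChild, with "." reserved as the separator) can be sketched as follows. This is an illustration only; `QueueNode` and its methods are hypothetical and are not the classes in the patch, and the separator is the one proposed in this particular comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a queue hierarchy; not the classes in the patch.
class QueueNode {
  // Separator as proposed in the comment above.
  static final String SEPARATOR = ".";

  final String name;
  final List<QueueNode> children = new ArrayList<>();

  QueueNode(String name) {
    // The separator is reserved, so a simple queue name may not contain it.
    if (name.isEmpty() || name.contains(SEPARATOR)) {
      throw new IllegalArgumentException("Invalid queue name: " + name);
    }
    this.name = name;
  }

  // Adds a child queue and returns this node to allow chaining.
  QueueNode add(QueueNode child) {
    children.add(child);
    return this;
  }

  // Collects the fully qualified names of all leaf queues under this node.
  List<String> leafQueueNames() {
    List<String> out = new ArrayList<>();
    collect(this, name, out);
    return out;
  }

  private static void collect(QueueNode node, String qualified, List<String> out) {
    if (node.children.isEmpty()) {
      out.add(qualified);
      return;
    }
    for (QueueNode child : node.children) {
      collect(child, qualified + SEPARATOR + child.name, out);
    }
  }
}
```

For the two-queue example in the comment, `new QueueNode("q").add(new QueueNode("p")).leafQueueNames()` returns a single entry, `q.p`, and a name such as `a.b` is rejected at construction time.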
[jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.
[ https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746771#action_12746771 ] rahul k singh commented on MAPREDUCE-861: - After discussing locally regarding separator , there was an agreement over : . Modify queue configuration format and parsing to support a hierarchy of queues. --- Key: MAPREDUCE-861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Hemanth Yamijala Assignee: rahul k singh MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the configuration related to queues. The current format for defining a queue and its properties is as follows: mapred.queue.queue-name.property-name. For e.g. mapred.queue.queue-name.acl-submit-job. The reason for using this verbose format was to be able to reuse the Configuration parser in Hadoop. However, administrators currently using the queue configuration have already indicated a very strong desire for a more manageable format. Since, this becomes more unwieldy with hierarchical queues, the time may be good to introduce a new format for representing queue configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors
[ https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-430:
-
Attachment: MAPREDUCE-430-v1.11.patch

Attaching a patch that does what was last discussed. This is what the patch does:
- tasktracker now provides fatalError() to report fatal errors from the child
- Child/ReduceTask/MapTask now catch Throwable and invoke umbilical.fatalError(). If this fails, then System.exit() is invoked.

Result of test-patch
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 6 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

Running ant-tests.

Task stuck in cleanup with OutOfMemoryErrors
Key: MAPREDUCE-430
URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
Fix For: 0.20.1
Attachments: MAPREDUCE-430-v1.11.patch, MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch

Observed a task with OutOfMemory error, stuck in cleanup.
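The control flow described in the patch summary above (catch Throwable, report through umbilical.fatalError(), and exit if the report itself fails) might look roughly like the sketch below. It is not the patch's code: `FatalErrorReporter` and `ChildRunner` are hypothetical stand-ins, the exit action is passed in as a parameter so the flow can be exercised without terminating the JVM, and the exit status 1 is an assumption.

```java
import java.util.function.IntConsumer;

// Hypothetical stand-in for the umbilical; the real protocol has more methods.
interface FatalErrorReporter {
  void fatalError(String taskId, String message) throws Exception;
}

// Hypothetical wrapper around a task body; not the Child class from the patch.
class ChildRunner {
  // Runs the task. On any Throwable, reports it to the parent; if the report
  // itself fails, falls back to the exit hook (System::exit in a real child).
  static void run(String taskId, Runnable task, FatalErrorReporter umbilical,
                  IntConsumer exit) {
    try {
      task.run();
    } catch (Throwable t) {
      try {
        umbilical.fatalError(taskId, String.valueOf(t));
      } catch (Throwable reportFailure) {
        // Assumed exit status; the summary above does not give one.
        exit.accept(1);
      }
    }
  }
}
```

In a real child process the last argument would be `System::exit`. Catching Throwable rather than Exception is what lets an OutOfMemoryError reach the reporting path instead of leaving the task stuck.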
[jira] Resolved: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2
[ https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das resolved MAPREDUCE-767. --- Resolution: Fixed I committed to trunk a fix to handle -debug (that was missed in the earlier patch). I committed the patch for 0.20 to the 0.20 branch. Thanks, Amar! to remove mapreduce dependency on commons-cli2 -- Key: MAPREDUCE-767 URL: https://issues.apache.org/jira/browse/MAPREDUCE-767 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming Affects Versions: 0.20.1 Reporter: Giridharan Kesavan Assignee: Amar Kamat Fix For: 0.20.1 Attachments: MAPREDUCE-767-v1.1.patch, MAPREDUCE-767-v1.2.patch, MAPREDUCE-767-v1.3-branch-0.20.patch, MAPREDUCE-767-v1.3.patch mapreduce, streaming and eclipse plugin depends on common-cli2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746780#action_12746780 ] Hemanth Yamijala commented on MAPREDUCE-768: bq. Because Config can pull in JVM properties, you do need to do the expansion on the host that is using the configuration. The current scope of this JIRA is to do the dump on the host that is using the configuration. Hence, this is covered in HADOOP-6184. bq. It seems sensible to make this a general purpose Tools option,, print my config to stdout, so that anyone using any tool can see the values bq. It's also handy to be able to ask a remote service endpoint for their config -any node, master or slave, should be able to serve up the config to someone it trusts. Which introduces one small problem -only users with admin rights should be allowed to see the configurations, in case they contain passwords or other sensitive topics. These two are good points and I think we should do them as incremental work. I recommend we think about it filing another JIRA for the same after this goes in. Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-807) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up.
[ https://issues.apache.org/jira/browse/MAPREDUCE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-807: - Attachment: MAPREDUCE-807-v1.6-branch-0.20.patch Attaching a patch for branch 0.20. Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up. -- Key: MAPREDUCE-807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-807 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Priority: Blocker Attachments: MAPRED-807-v1.1.patch, MAPRED-807-v1.2.patch, MAPRED-807-v1.3.patch, MAPRED-807-v1.4.patch, MAPRED-807-v1.6.patch, MAPREDUCE-807-v1.6-branch-0.20.patch With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] V.V.Chaitanya Krishna updated MAPREDUCE-768: Attachment: MAPREDUCE-768-3.patch The patch is not compatible with the recent updates in mapreduce. Uploading patch with this issue resolved. Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746786#action_12746786 ] Hemanth Yamijala commented on MAPREDUCE-768: I think we need a new patch, because the one on the jira currently is not applying. But I briefly looked at the patch, and can think of a few minor comments: - I think JobTracker.dumpConfiguration should not take JobConf as a parameter. It should create one inside the call. - Similarly, QueueManager.dumpConfiguration should also not take a JobConf. Further, it should not load the default resources, because otherwise, the JobTracker's configuration would get dumped twice. Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746801#action_12746801 ] Tom White commented on MAPREDUCE-370: - Could the counter name be based on the named output, rather than the base filename? bq. if user doesn't give unique name for the output file, there are chances that output will be garbled. This is true, but like MultipleOutputFormat it would be up to the application to give unique names to the output files. Most users would use the simpler form that takes a named output and lets MultipleOutputs construct the output filename {{{namedOutput}-(m|r)-{part-number}}}, but this change I'm proposing would allow advanced users to control the precise filename of the outputs. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
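The simpler form mentioned in the comment above, where the output filename is built as {namedOutput}-(m|r)-{part-number}, can be illustrated with a small helper. This is a sketch only: `OutputNames` is a hypothetical name and is not part of the MultipleOutputs API, and the five-digit zero padding of the part number is an assumption.

```java
import java.util.Locale;

// Hypothetical helper illustrating the naming pattern; not the MultipleOutputs API.
class OutputNames {
  // Builds "<namedOutput>-m-<part>" for map tasks and "<namedOutput>-r-<part>"
  // for reduce tasks. The five-digit zero padding is an assumption.
  static String fileName(String namedOutput, boolean isMap, int partition) {
    return String.format(Locale.ROOT, "%s-%s-%05d",
        namedOutput, isMap ? "m" : "r", partition);
  }
}
```

Because the task type and part number are appended automatically, two tasks writing the same named output never collide. Once an application supplies its own base filename, that guarantee becomes the application's responsibility, which is the uniqueness concern quoted above.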
[jira] Updated: (MAPREDUCE-807) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up.
[ https://issues.apache.org/jira/browse/MAPREDUCE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-807: - Attachment: MAPREDUCE-807-v1.7-branch-0.20.patch MAPRED-807-v1.7.patch Attaching new patches after Devaraj's offline comments. Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up. -- Key: MAPREDUCE-807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-807 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Priority: Blocker Attachments: MAPRED-807-v1.1.patch, MAPRED-807-v1.2.patch, MAPRED-807-v1.3.patch, MAPRED-807-v1.4.patch, MAPRED-807-v1.6.patch, MAPRED-807-v1.7.patch, MAPREDUCE-807-v1.6-branch-0.20.patch, MAPREDUCE-807-v1.7-branch-0.20.patch With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746811#action_12746811 ] Amareshwari Sriramadasu commented on MAPREDUCE-370: --- bq. Could the counter name be based on the named output, rather than the base filename? Possible. But counters will be maintained only for named outputs. bq. but this change I'm proposing would allow advanced users to control the precise filename of the outputs. I think these users can override FileOutputFormat.getDefaultWorkFile to control the precise filename. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-370: -- Status: Patch Available (was: Open) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746816#action_12746816 ] Tom White commented on MAPREDUCE-370: - bq. I think these users can override FileOutputFormat.getDefaultWorkFile to control the precise filename. This is true. So to have complete control over the output filename you would call the write method with a base output path of the name you want (possibly using the key and value to construct it). You would then override FileOutputFormat.getDefaultWorkFile() to omit the {m,r}-n suffix. We could make this slightly easier in the future perhaps (by putting it in the MultipleOutputs API, for example), but I think the current approach is reasonable. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-807) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up.
[ https://issues.apache.org/jira/browse/MAPREDUCE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das resolved MAPREDUCE-807. --- Resolution: Fixed Fix Version/s: 0.20.1 I just committed this. Thanks, Amar! Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up. -- Key: MAPREDUCE-807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-807 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Priority: Blocker Fix For: 0.20.1 Attachments: MAPRED-807-v1.1.patch, MAPRED-807-v1.2.patch, MAPRED-807-v1.3.patch, MAPRED-807-v1.4.patch, MAPRED-807-v1.6.patch, MAPRED-807-v1.7.patch, MAPREDUCE-807-v1.6-branch-0.20.patch, MAPREDUCE-807-v1.7-branch-0.20.patch With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-370: -- Attachment: patch-370-2.txt Patch changing checkTokenName() and checkbaseOutputPath() to be private. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control
[ https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746822#action_12746822 ] Hemanth Yamijala commented on MAPREDUCE-856: bq. User directory can be 570. So also distributed cache directory (no need even for setuid, right ?) I meant setgid.. However, that may be required, as we realized in an internal discussion. Localized files from DistributedCache should have right access-control -- Key: MAPREDUCE-856 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: tasktracker Reporter: Arun C Murthy Assignee: Vinod K V Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746824#action_12746824 ]

Tom White commented on MAPREDUCE-476:
-
Sorry Philip, but I've just noticed that the testFileSystemOtherThanDefault() test from TestDistributedCache (introduced in HADOOP-5635) got missed during the move to TestTrackerDistributedCacheManager.

extend DistributedCache to work locally (LocalJobRunner)
Key: MAPREDUCE-476
URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: sam rash
Assignee: Philip Zeyliger
Priority: Minor
Attachments: HADOOP-2914-v1-full.patch, HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt, MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch, MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch, MAPREDUCE-476-v8.patch, MAPREDUCE-476.patch, v6-to-v7.patch

The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
Ideally, LocalJobRunner would take care of populating the JobConf and copying remote files to the local file system (http, assume hdfs = default fs = local fs when doing local development.
[jira] Updated: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes
[ https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-798: Resolution: Fixed Fix Version/s: 0.21.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I've just committed this. Thanks Aaron! MRUnit should be able to test a succession of MapReduce passes -- Key: MAPREDUCE-798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-798 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Aaron Kimball Assignee: Aaron Kimball Fix For: 0.21.0 Attachments: MAPREDUCE-798.2.patch, MAPREDUCE-798.3.patch, MAPREDUCE-798.patch MRUnit can currently test that the inputs to a given (mapper, reducer) job produce certain outputs at the end of the reducer. It would be good to support more end-to-end tests of a series of MapReduce jobs that form a longer pipeline surrounding some data. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-807) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up.
[ https://issues.apache.org/jira/browse/MAPREDUCE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-807: - Release Note: The JobTracker tries to delete the mapred.system.dir when it is starting up (with the job recovery disabled). The fix provided by this jira is that JobTracker will fail (bail out) with AccessControlException if it fails to delete files/directories in mapred.system.dir due to access control issues. Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up. -- Key: MAPREDUCE-807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-807 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Priority: Blocker Fix For: 0.20.1 Attachments: MAPRED-807-v1.1.patch, MAPRED-807-v1.2.patch, MAPRED-807-v1.3.patch, MAPRED-807-v1.4.patch, MAPRED-807-v1.6.patch, MAPRED-807-v1.7.patch, MAPREDUCE-807-v1.6-branch-0.20.patch, MAPREDUCE-807-v1.7-branch-0.20.patch With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-807) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up.
[ https://issues.apache.org/jira/browse/MAPREDUCE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-807: - Description: With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). The JobTracker admin has to manually cleanup the mapred.system.dir if this happens. (was: With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException).) Stray user files in mapred.system.dir with permissions other than 777 can prevent the jobtracker from starting up. -- Key: MAPREDUCE-807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-807 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Priority: Blocker Fix For: 0.20.1 Attachments: MAPRED-807-v1.1.patch, MAPRED-807-v1.2.patch, MAPRED-807-v1.3.patch, MAPRED-807-v1.4.patch, MAPRED-807-v1.6.patch, MAPRED-807-v1.7.patch, MAPREDUCE-807-v1.6-branch-0.20.patch, MAPREDUCE-807-v1.7-branch-0.20.patch With restart disabled, the jobtracker does a _rm -rf_ of the mapred.system.dir. If the mapred.system.dir contains user files with permissions other than 777 then the jobtracker gets stuck in a loop trying to delete the mapred.system.dir (and each time failing with AccessControlException). The JobTracker admin has to manually cleanup the mapred.system.dir if this happens. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-318) Refactor reduce shuffle code
[ https://issues.apache.org/jira/browse/MAPREDUCE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jothi Padmanabhan updated MAPREDUCE-318: Attachment: mapred-318-24Aug.patch New patch with review comments incorporated. Also fixed some findbugs warnings. Refactor reduce shuffle code Key: MAPREDUCE-318 URL: https://issues.apache.org/jira/browse/MAPREDUCE-318 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HADOOP-5233_api.patch, HADOOP-5233_part0.patch, mapred-318-14Aug.patch, mapred-318-20Aug.patch, mapred-318-24Aug.patch, mapred-318-common.patch The reduce shuffle code has become very complex and entangled. I think we should move it out of ReduceTask and into a separate package (org.apache.hadoop.mapred.task.reduce). Details to follow. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors
[ https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-430: - Attachment: MAPREDUCE-430-v1.12-branch-0.20.patch MAPREDUCE-430-v1.12.patch Attaching a new patch for review. Result of test-patch [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 6 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Task stuck in cleanup with OutOfMemoryErrors Key: MAPREDUCE-430 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amareshwari Sriramadasu Assignee: Amar Kamat Fix For: 0.20.1 Attachments: MAPREDUCE-430-v1.11.patch, MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch Observed a task with OutOfMemory error, stuck in cleanup.
[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746861#action_12746861 ] Steve Loughran commented on MAPREDUCE-679: -- TldLocationsCache is a cache for globally defined taglibs: http://tomcat.apache.org/tomcat-5.5-doc/jasper/docs/api/org/apache/jasper/compiler/TldLocationsCache.html The source is here: http://svn.apache.org/repos/asf/tomcat/tc6.0.x/trunk/java/org/apache/jasper/compiler/TldLocationsCache.java Looking at the source, the message comes from {{{processWebDotXml()}}}; it doesn't do any harm, except that it doesn't bother parsing any web.xml-defined content if web.xml is nowhere to be found. It's a warning, not an error. There is a servlet context property, org.apache.catalina.deploy.alt_dd, which can be used to identify an alternate deployment descriptor, but I have no idea how to set that from command-line jspc. Recommendation: ignore the warning. XML-based metrics as JSP servlet for JobTracker --- Key: MAPREDUCE-679 URL: https://issues.apache.org/jira/browse/MAPREDUCE-679 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: example-jobtracker-completed-job.xml, example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, MAPREDUCE-679.3.patch, MAPREDUCE-679.patch In HADOOP-4559, a general REST API for reporting metrics was proposed but work seems to have stalled. In the interim, we have a simple XML translation of the existing JobTracker status page which provides the same metrics (including the tables of running/completed/failed jobs) as the human-readable page. This is a relatively lightweight addition to provide some machine-understandable metrics reporting.
[jira] Commented: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors
[ https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746878#action_12746878 ] Amar Kamat commented on MAPREDUCE-430: -- All tests (core + contrib) passed except TestReduceFetch, which timed out. Task stuck in cleanup with OutOfMemoryErrors Key: MAPREDUCE-430 URL: https://issues.apache.org/jira/browse/MAPREDUCE-430 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amareshwari Sriramadasu Assignee: Amar Kamat Fix For: 0.20.1 Attachments: MAPREDUCE-430-v1.11.patch, MAPREDUCE-430-v1.12-branch-0.20.patch, MAPREDUCE-430-v1.12.patch, MAPREDUCE-430-v1.6-branch-0.20.patch, MAPREDUCE-430-v1.6.patch, MAPREDUCE-430-v1.7.patch, MAPREDUCE-430-v1.8.patch Observed a task with OutOfMemory error, stuck in cleanup.
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746887#action_12746887 ] Hadoop QA commented on MAPREDUCE-370: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417469/patch-370-2.txt against trunk revision 807123. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/508/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/508/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/508/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/508/console This message is automatically generated. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-699) Several streaming test cases seem to be failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746929#action_12746929 ] Nigel Daley commented on MAPREDUCE-699: --- I just blew away the Hudson workspace for this patch build to see if that fixes it. Several streaming test cases seem to be failing --- Key: MAPREDUCE-699 URL: https://issues.apache.org/jira/browse/MAPREDUCE-699 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Jothi Padmanabhan
ant test is failing several streaming tests with the following error
Error Message
java.lang.NullPointerException
    at org.apache.commons.cli.GnuParser.flatten(GnuParser.java:110)
    at org.apache.commons.cli.Parser.parse(Parser.java:143)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:374)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:138)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1314)
    at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:414)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:278)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:119)
    at org.apache.hadoop.streaming.TestMultipleCachefiles.testMultipleCachefiles(TestMultipleCachefiles.java:68)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:79)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
Stacktrace
junit.framework.AssertionFailedError: java.lang.NullPointerException
    at org.apache.commons.cli.GnuParser.flatten(GnuParser.java:110)
    at org.apache.commons.cli.Parser.parse(Parser.java:143)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:374)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:138)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1314)
    at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:414)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:278)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:119)
    at org.apache.hadoop.streaming.TestMultipleCachefiles.testMultipleCachefiles(TestMultipleCachefiles.java:68)
    at org.apache.hadoop.streaming.TestMultipleCachefiles.failTrace(TestMultipleCachefiles.java:141)
    at org.apache.hadoop.streaming.TestMultipleCachefiles.testMultipleCachefiles(TestMultipleCachefiles.java:133)
The following are links to two such failures:
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/337/testReport/
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/336/testReport/
[jira] Updated: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Zeyliger updated MAPREDUCE-476: -- Status: Patch Available (was: Open) extend DistributedCache to work locally (LocalJobRunner) Key: MAPREDUCE-476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-476 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: sam rash Assignee: Philip Zeyliger Priority: Minor Attachments: HADOOP-2914-v1-full.patch, HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt, MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch, MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch, MAPREDUCE-476-v8.patch, MAPREDUCE-476-v9.patch, MAPREDUCE-476.patch, v6-to-v7.patch The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html Ideally, LocalJobRunner would take care of populating the JobConf and copying remote files to the local file system (http, assume hdfs = default fs = local fs when doing local development.
[jira] Updated: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Zeyliger updated MAPREDUCE-476: -- Attachment: MAPREDUCE-476-v9.patch Well-spotted, Tom. I've restored the missing test. extend DistributedCache to work locally (LocalJobRunner) Key: MAPREDUCE-476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-476 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: sam rash Assignee: Philip Zeyliger Priority: Minor Attachments: HADOOP-2914-v1-full.patch, HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt, MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch, MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch, MAPREDUCE-476-v8.patch, MAPREDUCE-476-v9.patch, MAPREDUCE-476.patch, v6-to-v7.patch The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html Ideally, LocalJobRunner would take care of populating the JobConf and copying remote files to the local file system (http, assume hdfs = default fs = local fs when doing local development.
[jira] Updated: (MAPREDUCE-895) FileSystem::ListStatus will now throw FileNotFoundException, MapRed needs updated
[ https://issues.apache.org/jira/browse/MAPREDUCE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated MAPREDUCE-895: -- Release Note: The semantics for dealing with non-existent paths passed to FileSystem::listStatus() were updated and solidified in HADOOP-6201 and HDFS-538. Existing code within MapReduce that relied on the previous behavior of some FileSystem implementations of returning null has been updated to catch or propagate a FileNotFoundException, per the method's contract. Adding release note. FileSystem::ListStatus will now throw FileNotFoundException, MapRed needs updated - Key: MAPREDUCE-895 URL: https://issues.apache.org/jira/browse/MAPREDUCE-895 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.21.0 Attachments: MAPREDUCE-895.patch HADOOP-6201 (and HDFS-538) determined that the semantics of FileSystem::ListStatus are not correct and that the actual file system classes vary in their implementations, with some throwing an exception and some returning null. Fixing this will require adjusting code that calls this method.
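The caller-side change described in the MAPREDUCE-895 release note can be sketched roughly as follows. This is not the patch and does not call Hadoop: java.nio stands in for FileSystem, and the names ListingSketch, listStatus and listOrEmpty are hypothetical, chosen only for illustration. The sketch assumes the updated contract (a missing path raises FileNotFoundException rather than yielding null) and shows a caller that catches it.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; not Hadoop's FileSystem API.
class ListingSketch {
    // Mimics the updated contract: a non-existent path raises
    // FileNotFoundException instead of returning null.
    static List<Path> listStatus(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            throw new FileNotFoundException("Path does not exist: " + dir);
        }
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                entries.add(p);
            }
        }
        return entries;
    }

    // A caller that previously tested the result for null and now
    // treats a missing directory as "nothing to process".
    static List<Path> listOrEmpty(Path dir) throws IOException {
        try {
            return listStatus(dir);
        } catch (FileNotFoundException e) {
            return new ArrayList<>();
        }
    }
}
```

Whether a given caller should catch the exception, as listOrEmpty does, or let it propagate depends on whether a missing path is an expected condition at that call site; the release note mentions both options.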
[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746997#action_12746997 ] Aaron Kimball commented on MAPREDUCE-679: - Good enough. Is this ready to be committed? XML-based metrics as JSP servlet for JobTracker --- Key: MAPREDUCE-679 URL: https://issues.apache.org/jira/browse/MAPREDUCE-679 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: example-jobtracker-completed-job.xml, example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, MAPREDUCE-679.3.patch, MAPREDUCE-679.patch In HADOOP-4559, a general REST API for reporting metrics was proposed but work seems to have stalled. In the interim, we have a simple XML translation of the existing JobTracker status page which provides the same metrics (including the tables of running/completed/failed jobs) as the human-readable page. This is a relatively lightweight addition to provide some machine-understandable metrics reporting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-910) MRUnit should support counters
MRUnit should support counters -- Key: MAPREDUCE-910 URL: https://issues.apache.org/jira/browse/MAPREDUCE-910 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-910) MRUnit should support counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-910: Attachment: MAPREDUCE-910.patch Attaching patch which provides this functionality. All TestDriver implementations have a getCounters() method which returns the counters used by that test. The user can then verify that the actual counts meet their expected values. MRUnit should support counters -- Key: MAPREDUCE-910 URL: https://issues.apache.org/jira/browse/MAPREDUCE-910 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-910.patch incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
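As a rough illustration of the idea in MAPREDUCE-910 (not the attached patch, and not MRUnit's actual classes), a mock reporter could record increments in a map so that a test can read the totals back afterwards. The class name MockCounterReporter and the getCounter method are hypothetical; only incrCounter is named in the issue, and its signature here is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mock; illustrates recording counters instead of ignoring them.
class MockCounterReporter {
    private final Map<String, Long> counters = new HashMap<>();

    // Records the increment rather than discarding it, as a stub would.
    void incrCounter(String group, String name, long amount) {
        counters.merge(group + ":" + name, amount, Long::sum);
    }

    // Returns 0 for a counter that was never incremented.
    long getCounter(String group, String name) {
        return counters.getOrDefault(group + ":" + name, 0L);
    }
}
```

A test would drive the mapper or reducer with this reporter and then compare getCounter(...) against the expected value, which is the kind of check the getCounters() method mentioned above is meant to allow.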
[jira] Updated: (MAPREDUCE-910) MRUnit should support counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-910: Status: Patch Available (was: Open) MRUnit should support counters -- Key: MAPREDUCE-910 URL: https://issues.apache.org/jira/browse/MAPREDUCE-910 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-910.patch incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure
[ https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-901: -- Attachment: 901_1.patch Attaching a patch for review. I am still testing the patch. Also, a little bit of cleanup is required, especially w.r.t. naming variables/fields in the classes. I will do that in a follow-up patch. Some points on the approach: 1) Defined a class TaskMetrics that has methods for updating the counters defined in o.a.h.mapreduce.TaskCounter.java. It also provides a utility method to update framework Counters that aren't defined in TaskCounter.java. Examples of such counters are the counters that the framework defines in the counter group FileSystemCounters. For the TaskCounter counters, the RPC is optimized. For the framework counters like the FileSystemCounters, RPC uses the Counters serialization. 2) The above is serialized out as part of the TaskStatus object in the heartbeats. 3) In TaskInProgress.java, the TIP's Counters is updated with the above counters obtained in the heartbeat. Would really appreciate a review on this one. And yes, this looks like a good thing to have for the jiras MAPREDUCE-220 and MAPREDUCE-718. Move Framework Counters into a TaskMetric structure --- Key: MAPREDUCE-901 URL: https://issues.apache.org/jira/browse/MAPREDUCE-901 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.21.0 Reporter: Owen O'Malley Assignee: Devaraj Das Fix For: 0.21.0 Attachments: 901_1.patch I think we should move all of the Counters that the framework updates into a single class called TaskMetrics. TaskMetrics would have specific fields for each of the metrics like input records, input bytes, output records, etc. It would both reduce the serialized size of the heartbeats (by shrinking the Counters down to just the user's counters) and decrease the latency for updates to the JobTracker (since Counters are sent at most 1/minute instead of 1/heartbeat).
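To make the "specific fields for each of the metrics" idea concrete, here is a minimal sketch. It is not the class in 901_1.patch: the name TaskMetricsSketch, the three fields and the roundTrip helper are assumptions for illustration. The write/readFields pair only mirrors the shape of Hadoop's Writable interface using plain java.io, so the snippet has no Hadoop dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration: fixed fields instead of named counters.
class TaskMetricsSketch {
    long inputRecords;
    long inputBytes;
    long outputRecords;

    // Only the values are written, in a fixed order; no counter names
    // or group names travel with them.
    void write(DataOutput out) throws IOException {
        out.writeLong(inputRecords);
        out.writeLong(inputBytes);
        out.writeLong(outputRecords);
    }

    // Reads the values back in the same fixed order.
    void readFields(DataInput in) throws IOException {
        inputRecords = in.readLong();
        inputBytes = in.readLong();
        outputRecords = in.readLong();
    }

    // Serializes to a byte array and back, as a heartbeat would.
    static TaskMetricsSketch roundTrip(TaskMetricsSketch m) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        m.write(new DataOutputStream(bytes));
        TaskMetricsSketch copy = new TaskMetricsSketch();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }
}
```

With three long fields this sketch always serializes to 24 bytes, which is the kind of saving over name-keyed counters that the issue description is after.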
[jira] Updated: (MAPREDUCE-775) Add input/output formatters for Vertica clustered ADBMS.
[ https://issues.apache.org/jira/browse/MAPREDUCE-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-775: --- Status: Patch Available (was: Open) Add input/output formatters for Vertica clustered ADBMS. Key: MAPREDUCE-775 URL: https://issues.apache.org/jira/browse/MAPREDUCE-775 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Omer Trajman Fix For: 0.21.0 Attachments: MAPREDUCE-775.patch Add native support for Vertica as an input or output format taking advantage of parallel read and write properties of the DBMS. On the input side allow for parametrized queries (a la prepared statements) and create a split for each combination of parameters. Also support the parameter list to be generated from a sql statement. For example - return metrics for all dimensions that meet criteria X with one input split for each dimension. Divide the read among any number of hosts in the Vertica cluster. On the output side, support Vertica streaming load to any number of hosts in the Vertica cluster. Output may be to a different cluster than input. Also includes Input and Output formatters that support streaming interface. Code has been tested and run on live systems under 19 and 20. Patch for 21 with new API will be ready end of this week. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-775) Add input/output formatters for Vertica clustered ADBMS.
[ https://issues.apache.org/jira/browse/MAPREDUCE-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-775: --- Status: Open (was: Patch Available) Fixing issues with new patch. I seem to have replaced the original instead of adding a .N.patch - sorry for the confusion. Add input/output formatters for Vertica clustered ADBMS. Key: MAPREDUCE-775 URL: https://issues.apache.org/jira/browse/MAPREDUCE-775 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Omer Trajman Fix For: 0.21.0 Attachments: MAPREDUCE-775.patch Add native support for Vertica as an input or output format taking advantage of parallel read and write properties of the DBMS. On the input side allow for parametrized queries (a la prepared statements) and create a split for each combination of parameters. Also support the parameter list to be generated from a sql statement. For example - return metrics for all dimensions that meet criteria X with one input split for each dimension. Divide the read among any number of hosts in the Vertica cluster. On the output side, support Vertica streaming load to any number of hosts in the Vertica cluster. Output may be to a different cluster than input. Also includes Input and Output formatters that support streaming interface. Code has been tested and run on live systems under 19 and 20. Patch for 21 with new API will be ready end of this week. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747092#action_12747092 ] Philip Zeyliger commented on MAPREDUCE-476: --- Failing test is org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount. I think that's failing all over, not just here. -- Philip extend DistributedCache to work locally (LocalJobRunner) Key: MAPREDUCE-476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-476 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: sam rash Assignee: Philip Zeyliger Priority: Minor Attachments: HADOOP-2914-v1-full.patch, HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt, MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch, MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch, MAPREDUCE-476-v8.patch, MAPREDUCE-476-v9.patch, MAPREDUCE-476.patch, v6-to-v7.patch The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html Ideally, LocalJobRunner would take care of populating the JobConf and copying remote files to the local file system (http, assume hdfs = default fs = local fs when doing local development.
[jira] Updated: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-906: Status: Patch Available (was: Open) Updated Sqoop documentation --- Key: MAPREDUCE-906 URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/sqoop Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-906.patch Here's the latest documentation for Sqoop, in both user-guide and manpage form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-875) Make DBRecordReader execute queries lazily
[ https://issues.apache.org/jira/browse/MAPREDUCE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-875: Status: Open (was: Patch Available) Make DBRecordReader execute queries lazily -- Key: MAPREDUCE-875 URL: https://issues.apache.org/jira/browse/MAPREDUCE-875 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-875.2.patch, MAPREDUCE-875.patch DBInputFormat's DBRecordReader executes the user's SQL query in the constructor. If the query is long-running, this can cause task timeout. The user is unable to spawn a background thread (e.g., in a MapRunnable) to inform Hadoop of on-going progress. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
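The lazy-execution idea in MAPREDUCE-875 can be sketched in isolation as below. This is not DBRecordReader and not the attached patch: LazyRecordReader, isStarted and the Supplier-based "query" are hypothetical stand-ins, used only to show the query being deferred from the constructor to the first read.

```java
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical illustration of deferring a query until the first read.
class LazyRecordReader<T> {
    private final Supplier<Iterator<T>> query;
    private Iterator<T> results; // stays null until the first call to next()

    LazyRecordReader(Supplier<Iterator<T>> query) {
        this.query = query; // the query is only stored here, not executed
    }

    boolean isStarted() {
        return results != null;
    }

    // Runs the query on first use, then returns the next record,
    // or null once the results are exhausted.
    T next() {
        if (results == null) {
            results = query.get();
        }
        return results.hasNext() ? results.next() : null;
    }
}
```

Because construction returns immediately, the caller has a chance to set up whatever progress reporting it needs before the long-running query starts.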
[jira] Updated: (MAPREDUCE-875) Make DBRecordReader execute queries lazily
[ https://issues.apache.org/jira/browse/MAPREDUCE-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-875: Status: Patch Available (was: Open) Recycling patch again.. seems to have been dropped from the queue. Make DBRecordReader execute queries lazily -- Key: MAPREDUCE-875 URL: https://issues.apache.org/jira/browse/MAPREDUCE-875 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-875.2.patch, MAPREDUCE-875.patch DBInputFormat's DBRecordReader executes the user's SQL query in the constructor. If the query is long-running, this can cause task timeout. The user is unable to spawn a background thread (e.g., in a MapRunnable) to inform Hadoop of on-going progress. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure
[ https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747133#action_12747133 ] Arun C Murthy commented on MAPREDUCE-901: - Hmm... at the risk of sounding completely lame, I can't seem to find the definition of TaskMetrics or TaskCounters - did you forget to include that in the patch? From the description it seems like TaskMetrics is related to Counters; maybe I should wait to see the patch - anyway, I was hoping TaskMetrics would be a Writable and wouldn't be related to Counters at all. Move Framework Counters into a TaskMetric structure --- Key: MAPREDUCE-901 URL: https://issues.apache.org/jira/browse/MAPREDUCE-901 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.21.0 Reporter: Owen O'Malley Assignee: Devaraj Das Fix For: 0.21.0 Attachments: 901_1.patch I think we should move all of the Counters that the framework updates into a single class called TaskMetrics. TaskMetrics would have specific fields for each of the metrics like input records, input bytes, output records, etc. It would both reduce the serialized size of the heartbeats (by shrinking the Counters down to just the user's counters) and decrease the latency for updates to the JobTracker (since Counters are sent at most 1/minute instead of 1/heartbeat).
[jira] Commented: (MAPREDUCE-910) MRUnit should support counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747162#action_12747162 ]

Hadoop QA commented on MAPREDUCE-910:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417505/MAPREDUCE-910.patch
against trunk revision 807165.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 13 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/510/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/510/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/510/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/510/console

This message is automatically generated.

MRUnit should support counters

Key: MAPREDUCE-910
URL: https://issues.apache.org/jira/browse/MAPREDUCE-910
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Attachments: MAPREDUCE-910.patch

incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-910) MRUnit should support counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747165#action_12747165 ]

Aaron Kimball commented on MAPREDUCE-910:

Unrelated test failure.

MRUnit should support counters

Key: MAPREDUCE-910
URL: https://issues.apache.org/jira/browse/MAPREDUCE-910
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Attachments: MAPREDUCE-910.patch

incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters.
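
As an illustration of what MAPREDUCE-910 asks for, here is a minimal sketch of a mock reporter whose incrCounter() records increments instead of discarding them. The class name CountingMockReporter and the getCounter() accessor are hypothetical and are not taken from the attached patch; only the (group, name, amount) shape of incrCounter() follows the Hadoop Reporter convention.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mock reporter; not the actual MRUnit class. */
class CountingMockReporter {
    // group name -> (counter name -> accumulated value)
    private final Map<String, Map<String, Long>> counters = new HashMap<>();

    /** Adds amount to the named counter, creating it on first use. */
    void incrCounter(String group, String name, long amount) {
        counters.computeIfAbsent(group, g -> new HashMap<>())
                .merge(name, amount, Long::sum);
    }

    /** Returns the accumulated value, or 0 if the counter was never touched. */
    long getCounter(String group, String name) {
        Map<String, Long> byName = counters.get(group);
        if (byName == null) {
            return 0L;
        }
        return byName.getOrDefault(name, 0L);
    }
}
```

A test could then run a mapper against such a reporter and assert on getCounter() afterwards, which is not possible while incrCounter() is a no-op stub.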
[jira] Updated: (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure
[ https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated MAPREDUCE-901:

Attachment: 901_1.patch

That was my bad. *sigh* Attached is the correct patch. The TaskMetrics has a Counters field, but that's mostly to take care of counters related to the FileSystemCounters, which depend on the FileSystem in use, etc.

Move Framework Counters into a TaskMetric structure

Key: MAPREDUCE-901
URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: task
Affects Versions: 0.21.0
Reporter: Owen O'Malley
Assignee: Devaraj Das
Fix For: 0.21.0
Attachments: 901_1.patch, 901_1.patch

I think we should move all of the Counters that the framework updates into a single class called TaskMetrics. TaskMetrics would have specific fields for each of the metrics like input records, input bytes, output records, etc. It would both reduce the serialized size of the heartbeats (by shrinking the Counters down to just the user's counters) and decrease the latency for updates to the JobTracker (since Counters are sent at most 1/minute instead of 1/heartbeat).
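
The structure proposed in MAPREDUCE-901 can be pictured roughly as follows. This is only a sketch: the class name TaskMetricsSketch, its field names, and the plain map used for user counters are assumptions made for illustration and do not come from 901_1.patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of fixed metric fields kept apart from user counters. */
class TaskMetricsSketch {
    // Framework metrics named in the issue get dedicated primitive fields,
    // so no group or counter names need to be serialized for them.
    long inputRecords;
    long inputBytes;
    long outputRecords;

    // Only user-defined counters remain in a name-keyed structure.
    final Map<String, Long> userCounters = new HashMap<>();

    /** Accounts for one input record of the given size. */
    void recordInput(long bytes) {
        inputRecords++;
        inputBytes += bytes;
    }

    /** Accounts for one output record. */
    void recordOutput() {
        outputRecords++;
    }

    /** Adds amount to a user-defined counter. */
    void incrUserCounter(String name, long amount) {
        userCounters.merge(name, amount, Long::sum);
    }
}
```

Because the framework metrics are plain fields, they could be sent on every heartbeat cheaply, while the name-keyed user counters could be sent less often, which is the trade-off the issue description outlines.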
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747204#action_12747204 ]

Hemanth Yamijala commented on MAPREDUCE-768:

The Javadocs are out of sync for both APIs, JobTracker.dumpConfiguration and QueueManager.dumpConfiguration. Other than that, +1.

Configuration information should generate dump in a standard format.

Key: MAPREDUCE-768
URL: https://issues.apache.org/jira/browse/MAPREDUCE-768
Project: Hadoop Map/Reduce
Issue Type: New Feature
Reporter: rahul k singh
Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768.patch

We need to generate the configuration dump in a standard format.
[jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747207#action_12747207 ]

Hemanth Yamijala commented on MAPREDUCE-824:

This is getting better. I do have some more feedback:
- updateStatsOnRunningJob, addRunningJob, removeRunningJob, removeWaitingJob: make these private.
- The ASF license header should come first in the source file.
- Replace sortJobQueues with an inline method.
- QueueHierarchyBuilder is creating a new instance of the CapacityTaskScheduler, which is unnecessary.
- The static builder instance also seems unnecessary.
- In QueueHierarchyBuilder, when checking for the separator char, the IllegalArgumentException must show the queue name which failed the check.
- Discuss: back dependency between QueueHierarchyBuilder and Scheduler. Can this be avoided?
- AbstractQueue does not override equals, while hashCode is overridden. Also, the toString API was previously printing other information. I'd only asked for the name of the queue to be prepended to it, not for the other information to be removed.
- It is a little confusing that the number of slots being asserted after task assignment does not include the currently scheduled task. I recommend moving the asserts before assignment.
- Root should always be set up only in a certain way. I would recommend a single static instance of root, which is always obtained from the capacity scheduler, even in tests.
- In testMaxCapacity, rt.update in tests should send in the capacity of the clusters to be in sync.
- getTaskDataView() need not be in TaskSchedulingContext. Since it is static, it can be called directly from other classes like the scheduler, passing the type.
- AbstractQueue.addChildren should be addChild.

Some of the earlier comments have not been addressed:
- APIs in JobQueuesManager and JobQueue can still be folded.
- mapTSI and reduceTSI member variables of JobQueue are not needed.
- AbstractQueue.getChildren is still public.
- getCapacity() should never return max capacity. It should always return the current capacity or limit, whichever is smaller.

Support a hierarchy of queues in the capacity scheduler

Key: MAPREDUCE-824
URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: contrib/capacity-sched
Reporter: Hemanth Yamijala
Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, HADOOP-824-3.patch, HADOOP-824-4.patch, HADOOP-824-5.patch

Currently in Capacity Scheduler, cluster capacity is divided among the queues based on the queue capacity. These queues typically represent an organization and the capacity of the queue represents the capacity the organization is entitled to. Most organizations are large and need to divide their capacity among sub-organizations they have. Or they may want to divide the capacity based on a category or type of jobs they run. This JIRA covers the requirements and other details to provide the above feature.
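
Two of the review points on MAPREDUCE-824 (equals must accompany hashCode, and getCapacity() should return the smaller of current capacity and limit) can be sketched as below. QueueSketch and its fields are hypothetical, and treating the queue name as the queue's identity is an assumption made only for this example; the patch may define identity differently.

```java
import java.util.Objects;

/** Hypothetical queue class; not the AbstractQueue from the patch. */
class QueueSketch {
    private final String name;
    private final int currentCapacity;
    private final int capacityLimit;

    QueueSketch(String name, int currentCapacity, int capacityLimit) {
        this.name = Objects.requireNonNull(name, "name");
        this.currentCapacity = currentCapacity;
        this.capacityLimit = capacityLimit;
    }

    /** Returns the current capacity or the limit, whichever is smaller. */
    int getCapacity() {
        return Math.min(currentCapacity, capacityLimit);
    }

    // equals and hashCode are overridden together and use the same field,
    // so equal queues always produce equal hash codes.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof QueueSketch)) {
            return false;
        }
        return name.equals(((QueueSketch) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    /** Prepends the queue name while keeping the other information. */
    @Override
    public String toString() {
        return name + " [capacity=" + getCapacity() + "]";
    }
}
```

Overriding only hashCode leaves equals at reference identity, so two queue objects with the same name would hash alike yet never compare equal, which is surprising in hash-based collections.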
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747209#action_12747209 ]

Amareshwari Sriramadasu commented on MAPREDUCE-370:

-1 core tests. Due to test failure TestRecoveryManager (MAPREDUCE-880)

Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

Key: MAPREDUCE-370
URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Fix For: 0.21.0
Attachments: patch-370-1.txt, patch-370-2.txt, patch-370.txt
[jira] Updated: (MAPREDUCE-777) A method for finding and tracking jobs from the new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated MAPREDUCE-777:

Status: Open (was: Patch Available)

I'm not happy with this patch. I need to go through it in more depth, but:
1. The setters mostly look right, although some of them are missing the assertion that the job is in the setup phase.
2. The getters should move to JobContext.
3. I think JobClient is a bad name for the job browser. Something like JobBrowser is probably clearer.

A method for finding and tracking jobs from the new API

Key: MAPREDUCE-777
URL: https://issues.apache.org/jira/browse/MAPREDUCE-777
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: client
Reporter: Owen O'Malley
Assignee: Amareshwari Sriramadasu
Fix For: 0.21.0
Attachments: patch-777-1.txt, patch-777-2.txt, patch-777.txt

We need to create a replacement interface for the JobClient API in the new interface. In particular, the user needs to be able to query and track jobs that were launched by other processes.
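
The first review point on MAPREDUCE-777, that every setter should assert the job is still in its setup phase, might look like the following sketch. JobSketch, its State enum, and ensureSetupPhase() are hypothetical names chosen for illustration and are not taken from the patch.

```java
/** Hypothetical job class showing a setup-phase guard on setters. */
class JobSketch {
    enum State { SETUP, RUNNING }

    private State state = State.SETUP;
    private String jobName = "";

    /** Rejects configuration changes once the job has left the setup phase. */
    private void ensureSetupPhase() {
        if (state != State.SETUP) {
            throw new IllegalStateException(
                "Job is in state " + state + ", expected " + State.SETUP);
        }
    }

    /** Setter guarded by the setup-phase check. */
    void setJobName(String name) {
        ensureSetupPhase();
        this.jobName = name;
    }

    String getJobName() {
        return jobName;
    }

    /** Moves the job out of the setup phase; later setter calls will fail. */
    void submit() {
        ensureSetupPhase();
        state = State.RUNNING;
    }
}
```

Routing every setter through one private guard makes a missing check easy to spot in review, which is the kind of omission the comment points out.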
[jira] Commented: (MAPREDUCE-775) Add input/output formatters for Vertica clustered ADBMS.
[ https://issues.apache.org/jira/browse/MAPREDUCE-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12747222#action_12747222 ]

Hadoop QA commented on MAPREDUCE-775:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12415514/MAPREDUCE-775.patch
against trunk revision 807165.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 16 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/511/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/511/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/511/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/511/console

This message is automatically generated.

Add input/output formatters for Vertica clustered ADBMS.

Key: MAPREDUCE-775
URL: https://issues.apache.org/jira/browse/MAPREDUCE-775
Project: Hadoop Map/Reduce
Issue Type: New Feature
Reporter: Omer Trajman
Fix For: 0.21.0
Attachments: MAPREDUCE-775.patch

Add native support for Vertica as an input or output format taking advantage of parallel read and write properties of the DBMS.

On the input side allow for parametrized queries (a la prepared statements) and create a split for each combination of parameters. Also support the parameter list to be generated from a sql statement. For example - return metrics for all dimensions that meet criteria X with one input split for each dimension. Divide the read among any number of hosts in the Vertica cluster.

On the output side, support Vertica streaming load to any number of hosts in the Vertica cluster. Output may be to a different cluster than input. Also includes Input and Output formatters that support streaming interface.

Code has been tested and run on live systems under 19 and 20. Patch for 21 with new API will be ready end of this week.
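
The input-side idea in MAPREDUCE-775, one split per set of query parameters, can be sketched as follows. QuerySplitSketch and splitsFor() are hypothetical names, the example reads "combination of parameters" as one bound parameter list per split, and nothing here reflects the actual classes in MAPREDUCE-775.patch or talks to a database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical split: one parametrized query plus the values bound to it. */
class QuerySplitSketch {
    final String query;
    final List<Object> params;

    QuerySplitSketch(String query, List<Object> params) {
        this.query = query;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.params = Collections.unmodifiableList(new ArrayList<>(params));
    }

    /** Creates one split for each parameter list, in the order given. */
    static List<QuerySplitSketch> splitsFor(String query, List<List<Object>> parameterSets) {
        List<QuerySplitSketch> splits = new ArrayList<>();
        for (List<Object> params : parameterSets) {
            splits.add(new QuerySplitSketch(query, params));
        }
        return splits;
    }
}
```

In the issue's own example, the parameter sets would be the dimensions meeting criteria X, so each dimension gets its own input split and the reads can be spread across hosts.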