[jira] Updated: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
[ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-802: - Attachment: eventmodel-1.patch Attaching the patch which makes changes in the event model as described in the [comment|https://issues.apache.org/jira/browse/MAPREDUCE-802?focusedCommentId=12738226&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12738226] I have introduced {{JobSchedulingInfoIndex}} for removal based on the old {{JobSchedulingInfo}}, as I believe the job updates happen under the {{JobTracker}} lock. Simplify the job updated event notification between Jobtracker and schedulers - Key: MAPREDUCE-802 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: eventmodel-1.patch HADOOP-4053 and HADOOP-4149 added events to take care of updates to the state / property of a job like the run state / priority of a job notified to the scheduler. We've seen some issues with this framework, such as the following: - Events are not raised correctly at all places. If a new code path is added to kill a job, raising events is missed out. - Events are raised with incorrect event data. For example, the start time value is typically missed out. The resulting contract break between jobtracker and schedulers has led to problems in the capacity scheduler where jobs remain stuck in the queue without ever being removed, and so on. It has proven complicated to get this right in the framework, and fixes have typically still left dangling cases. Or new code paths introduce new bugs. This JIRA is about trying to simplify the interaction model so that it is more robust and works well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
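As a sketch of the direction this issue argues for, the following illustrative listener model routes every job-state change through a single notification method, so no code path can "miss" raising an event or raise one with incomplete data. The class and interface names here are stand-ins for illustration, not the actual Hadoop interfaces:

```java
import java.util.ArrayList;
import java.util.List;

public class JobEventSketch {
    enum ChangeType { RUN_STATE, PRIORITY, START_TIME }

    // Carries the job id and what changed; a real event would carry the new value too.
    static class JobChangeEvent {
        final String jobId;
        final ChangeType type;
        JobChangeEvent(String jobId, ChangeType type) {
            this.jobId = jobId;
            this.type = type;
        }
    }

    interface JobListener {
        void jobUpdated(JobChangeEvent event);
    }

    static class Tracker {
        private final List<JobListener> listeners = new ArrayList<>();

        void addListener(JobListener l) { listeners.add(l); }

        // Single choke point: every code path that mutates job state calls this,
        // instead of raising ad-hoc events at each call site.
        void updateJob(String jobId, ChangeType type) {
            JobChangeEvent e = new JobChangeEvent(jobId, type);
            for (JobListener l : listeners) l.jobUpdated(e);
        }
    }

    public static void main(String[] args) {
        Tracker t = new Tracker();
        List<String> seen = new ArrayList<>();
        t.addListener(e -> seen.add(e.jobId + ":" + e.type));
        t.updateJob("job_0001", ChangeType.PRIORITY);
        System.out.println(seen); // [job_0001:PRIORITY]
    }
}
```

The design point is that a scheduler subscribing to the single `updateJob` path cannot fall out of sync with the tracker, which is exactly the contract break described in the report.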
[jira] Created: (MAPREDUCE-832) Too many WARN messages about deprecated memory config variables in JobTracker log
Too many WARN messages about deprecated memory config variables in JobTracker log -- Key: MAPREDUCE-832 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh When a user submits a mapred job using the old memory config variable (mapred.task.maxvmem), the following message appears too many times in the JobTracker logs: [ WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no longer used instead use mapred.job.map.memory.mb and mapred.job.reduce.memory.mb ] -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
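A common way to tame this kind of log spam, shown here as a sketch rather than the actual JobConf fix, is to remember which deprecated keys have already triggered a warning and log each one only once per process:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class WarnOnce {
    // Keys we have already warned about in this JVM.
    private static final Set<String> warned = ConcurrentHashMap.newKeySet();

    // Returns true if a warning was emitted, false if it was suppressed.
    static boolean warnDeprecated(String oldKey, String newKeys) {
        if (warned.add(oldKey)) {
            System.err.println("WARN The variable " + oldKey
                + " is no longer used; instead use " + newKeys);
            return true;
        }
        return false; // already warned once, stay quiet
    }

    public static void main(String[] args) {
        System.out.println(warnDeprecated("mapred.task.maxvmem",
            "mapred.job.map.memory.mb and mapred.job.reduce.memory.mb")); // true
        System.out.println(warnDeprecated("mapred.task.maxvmem",
            "mapred.job.map.memory.mb and mapred.job.reduce.memory.mb")); // false
    }
}
```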
[jira] Created: (MAPREDUCE-833) Jobclient does not print any warning message when an old memory config variable is used with the -D option from the command line
Jobclient does not print any warning message when an old memory config variable is used with the -D option from the command line -- Key: MAPREDUCE-833 URL: https://issues.apache.org/jira/browse/MAPREDUCE-833 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-834) When the TaskTracker config uses old memory management values, its memory monitoring is disabled.
When the TaskTracker config uses old memory management values, its memory monitoring is disabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh TaskTracker memory config values:
mapred.tasktracker.vmem.reserved=8589934592
mapred.task.default.maxvmem=2147483648
mapred.task.limit.maxvmem=4294967296
mapred.tasktracker.pmem.reserved=2147483648
The TaskTracker starts with:
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used
2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name
2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777
2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat reopened MAPREDUCE-796: -- Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace : = java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303) at org.apache.hadoop.mapred.Child.main(Child.java:170) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat reassigned MAPREDUCE-796: Assignee: Amar Kamat Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace : = java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303) at org.apache.hadoop.mapred.Child.main(Child.java:170) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-796: - Attachment: MAPREDUCE-796-v1.0.patch Attaching a simple fix. Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat Attachments: MAPREDUCE-796-v1.0.patch ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace : = java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303) at org.apache.hadoop.mapred.Child.main(Child.java:170) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
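The cast at MultithreadedMapper.java:149 assumes any Throwable captured from a worker thread is a RuntimeException; an OutOfMemoryError is an Error, so the cast itself throws ClassCastException. A sketch of the usual rethrow pattern (illustrative of the idea, not necessarily identical to the committed patch):

```java
public class RethrowSketch {
    // Rethrow a captured Throwable without assuming its concrete type.
    static void rethrow(Throwable t) {
        if (t instanceof Error) {
            throw (Error) t;               // OutOfMemoryError lands here
        } else if (t instanceof RuntimeException) {
            throw (RuntimeException) t;    // the only case the old blind cast handled
        } else {
            throw new RuntimeException(t); // wrap checked throwables
        }
    }

    public static void main(String[] args) {
        try {
            rethrow(new OutOfMemoryError("simulated"));
        } catch (Error e) {
            System.out.println("rethrown as " + e.getClass().getSimpleName());
        }
    }
}
```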
[jira] Updated: (MAPREDUCE-832) Too many WARN messages about deprecated memory config variables in JobTracker log
[ https://issues.apache.org/jira/browse/MAPREDUCE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-832: -- Summary: Too many WARN messages about deprecated memory config variables in JobTracker log (was: Too man y WARN messages about deprecated memorty config variables in JobTacker log) Too many WARN messages about deprecated memory config variables in JobTracker log - Key: MAPREDUCE-832 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Karam Singh When a user submits a mapred job using the old memory config variable (mapred.task.maxvmem), the following message appears too many times in the JobTracker logs: [ WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no longer used instead use mapred.job.map.memory.mb and mapred.job.reduce.memory.mb ] -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740493#action_12740493 ] Amareshwari Sriramadasu commented on MAPREDUCE-796: --- +1 Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat Attachments: MAPREDUCE-796-v1.0.patch ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace : = java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303) at org.apache.hadoop.mapred.Child.main(Child.java:170) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2
[ https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740496#action_12740496 ] Amar Kamat commented on MAPREDUCE-767: -- Tested this patch with examples mentioned in [streaming docs|http://hadoop.apache.org/common/docs/r0.20.0/streaming.html]. All cases seem to pass. Doing further testing. to remove mapreduce dependency on commons-cli2 -- Key: MAPREDUCE-767 URL: https://issues.apache.org/jira/browse/MAPREDUCE-767 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming Reporter: Giridharan Kesavan Assignee: Amar Kamat Attachments: MAPREDUCE-767-v1.1.patch mapreduce, streaming and eclipse plugin depends on common-cli2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics
[ https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-779: - Status: Patch Available (was: Open) Add node health failures into JobTrackerStatistics -- Key: MAPREDUCE-779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-779 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch, mapreduce-779-3.patch, mapreduce-779-4.patch Add the node health failure counts into {{JobTrackerStatistics}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-814) Move completed Job history files to HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sharad Agarwal updated MAPREDUCE-814: - Attachment: 814_v5.patch Incorporated Devaraj's offline comments. Minimized the jobtracker init changes. Passing filesystem handle in JobHistory#getJobHistoryFileName Move completed Job history files to HDFS Key: MAPREDUCE-814 URL: https://issues.apache.org/jira/browse/MAPREDUCE-814 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Reporter: Sharad Agarwal Assignee: Sharad Agarwal Attachments: 814_v1.patch, 814_v2.patch, 814_v3.patch, 814_v4.patch, 814_v5.patch Currently completed job history files remain on the jobtracker node. Having the files available on HDFS will enable clients to access these files more easily. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-370: -- Attachment: patch-370.txt Attaching an early patch. The patch does the following: 1. Adds an api in org.apache.hadoop.mapreduce.lib.output.FileOutputFormat to get a RecordWriter by taking the filename. The current api does not support passing a filename. 2. Adds org.apache.hadoop.mapreduce.lib.output.MultipleOutputs with the following api:
{code}
public class MultipleOutputs<KEYOUT, VALUEOUT> {

  public MultipleOutputs(TaskInputOutputContext context);

  // Adds a named output for the job.
  public static void addNamedOutput(Job job, String namedOutput,
      Class<? extends FileOutputFormat> outputFormatClass,
      Class<?> keyClass, Class<?> valueClass);

  // Enables counters for named outputs
  public static void setCountersEnabled(Job job, boolean enabled);

  // Write to a named output.
  // Writes to an output file name that depends on key, value, context and namedOutput;
  // gets the record writer from the output format added for the named output.
  public <K, V> void write(String namedOutput, K key, V value)
      throws IOException, InterruptedException;

  // Writes to an output file name that depends on key, value and context;
  // gets the record writer from the job's output format.
  // The job's output format should be a FileOutputFormat.
  public void write(KEYOUT key, VALUEOUT value)
      throws IOException, InterruptedException;

  protected <K, V> String generateOutputName(K key, V value,
      TaskAttemptContext context, String name);

  protected <K, V> K generateActualKey(K key, V value);

  protected <K, V> V generateActualValue(K key, V value);
}
{code}
Users can add namedOutputs and the corresponding OutputFormat and output key/value types using addNamedOutput. The generateOutputName api can be overridden by the user to give the final output name, which gives the user complete control over the output name. Generating a unique file name, once the user gives this name, can be done in the framework itself, as done in the patch. This lets the available counter feature count the number of records written to each output name. The same method can be used to plug in the functionality of multiNamedOutputs. I have illustrated the use of the api in the added test case. 3. Deprecates org.apache.hadoop.mapred.lib.Multiple*Output* Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Attachments: patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
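The overridable-naming idea above can be sketched outside Hadoop. This is a hypothetical, simplified stand-in (MiniMultipleOutputs, the "-m-00000" suffix and the counter bookkeeping are all illustrative, not the real API): the user-overridable generateOutputName controls the file name, the framework appends a unique per-task suffix, and records are counted per output name:

```java
import java.util.HashMap;
import java.util.Map;

public class NamingSketch {
    static class MiniMultipleOutputs<K, V> {
        private final Map<String, Long> counters = new HashMap<>();

        // Default naming; users override this to control the final output name.
        protected String generateOutputName(K key, V value) {
            return String.valueOf(key);
        }

        // Returns the full file name the record would be written to.
        public String write(K key, V value) {
            String name = generateOutputName(key, value);
            counters.merge(name, 1L, Long::sum); // per-name record counter
            return name + "-m-00000";            // framework adds the unique suffix
        }

        public long recordsWritten(String name) {
            return counters.getOrDefault(name, 0L);
        }
    }

    public static void main(String[] args) {
        MiniMultipleOutputs<String, Integer> out = new MiniMultipleOutputs<>();
        System.out.println(out.write("errors", 1)); // errors-m-00000
        out.write("errors", 2);
        System.out.println(out.recordsWritten("errors")); // 2
    }
}
```

Because the framework, not the user, appends the unique suffix, it always knows the logical output name and can keep the per-name counters the comment above argues for.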
[jira] Resolved: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das resolved MAPREDUCE-796. --- Resolution: Fixed Fix Version/s: 0.20.1 Hadoop Flags: [Reviewed] I just committed this. Thanks, Amar! Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat Fix For: 0.20.1 Attachments: MAPREDUCE-796-v1.0.patch ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace : = java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303) at org.apache.hadoop.mapred.Child.main(Child.java:170) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-750) Extensible ConnManager factory API
[ https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740511#action_12740511 ] Hadoop QA commented on MAPREDUCE-750: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12415690/MAPREDUCE-750.2.patch against trunk revision 801517. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/451/console This message is automatically generated. Extensible ConnManager factory API -- Key: MAPREDUCE-750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-750 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/sqoop Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.patch Sqoop uses the ConnFactory class to instantiate a ConnManager implementation based on the connect string and other arguments supplied by the user. This allows per-database logic to be encapsulated in different ConnManager instances, and dynamically chosen based on which database the user is actually importing from. But adding new ConnManager implementations requires modifying the source of a common ConnFactory class. An indirection layer should be used to delegate instantiation to a number of factory implementations which can be specified in the static configuration or at runtime. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
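The indirection layer described in the issue can be sketched as follows (illustrative names, not Sqoop's actual classes): instead of one ConnFactory hard-coding every manager, a chain of pluggable factories is consulted until one accepts the connect string, so supporting a new database means adding a factory rather than editing common code:

```java
import java.util.Arrays;
import java.util.List;

public class ManagerFactorySketch {
    interface ConnManager { String name(); }

    interface ManagerFactory {
        // Return a manager for this connect string, or null to pass.
        ConnManager accept(String connectString);
    }

    static class MySqlFactory implements ManagerFactory {
        public ConnManager accept(String cs) {
            if (cs.startsWith("jdbc:mysql:")) {
                return () -> "mysql";
            }
            return null; // let the next factory in the chain try
        }
    }

    // Delegates instantiation to the configured factories in order.
    static ConnManager getManager(String connectString, List<ManagerFactory> factories) {
        for (ManagerFactory f : factories) {
            ConnManager m = f.accept(connectString);
            if (m != null) return m;
        }
        throw new IllegalArgumentException("No manager for " + connectString);
    }

    public static void main(String[] args) {
        ConnManager m = getManager("jdbc:mysql://host/db",
            Arrays.asList(new MySqlFactory()));
        System.out.println(m.name()); // mysql
    }
}
```

In the real system the factory list would come from static configuration or runtime registration, which is the extensibility point the issue asks for.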
[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740509#action_12740509 ] Amareshwari Sriramadasu commented on MAPREDUCE-370: --- bq. To achieve this, I think we could port MultipleOutputs, and change the semantics of getCollector() in the multi name case, so that the multi name is the full name of the name of the output file. This method is typically invoked in the reduce() method, where the key and value are available, and can be used to form the name. Tom, are you saying that we should not have a protected method to generateOutputName(), which could be overridden to give the functionality. If so, we should have a way to find out whether it is namedOutput (i meant multiNamedOutputs) or an arbitrary name, to know which output format should be used for writing. We should have something like : {code} public K,V void write(String namedOutput, String outputPath, K key, V value) throws IOException, InterruptedException; public K,V void write(String outputPath, K key, V value) throws IOException, InterruptedException; {code} bq. Applications that want to add a unique suffix can call FileOutputFormat#getUniqueFile() themselves. This should be done by the framework to support counters as explained earlier. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api. --- Key: MAPREDUCE-370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Attachments: patch-370.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-478) separate jvm param for mapper and reducer
[ https://issues.apache.org/jira/browse/MAPREDUCE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740513#action_12740513 ] Hadoop QA commented on MAPREDUCE-478: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12415805/MAPREDUCE-478_1_20090806_yhadoop20.patch against trunk revision 801954. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 19 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/452/console This message is automatically generated. separate jvm param for mapper and reducer - Key: MAPREDUCE-478 URL: https://issues.apache.org/jira/browse/MAPREDUCE-478 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Minor Fix For: 0.21.0 Attachments: HADOOP-5684_0_20090420.patch, MAPREDUCE-478_0_20090804.patch, MAPREDUCE-478_0_20090804_yhadoop20.patch, MAPREDUCE-478_1_20090806.patch, MAPREDUCE-478_1_20090806_yhadoop20.patch Memory footprint of mapper and reducer can differ. It would be nice if we can pass different jvm param (mapred.child.java.opts) for mappers and reducers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
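For context, the improvement would let a job size mapper and reducer JVMs independently instead of sharing one mapred.child.java.opts. A hypothetical configuration fragment; the property names below are an assumption based on the patch naming and may differ from what is finally committed:

```xml
<!-- Illustrative only: split JVM options per task type -->
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```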
[jira] Updated: (MAPREDUCE-375) Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-375: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Amareshwari! Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api. -- Key: MAPREDUCE-375 URL: https://issues.apache.org/jira/browse/MAPREDUCE-375 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-375-1.txt, patch-375-2.txt, patch-375.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-757) JobConf will not be deleted from the logs folder if job retires from finalizeJob()
[ https://issues.apache.org/jira/browse/MAPREDUCE-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-757: - Attachment: MAPREDUCE-757-v2.0-branch-0.20.patch Attaching a patch for branch 0.20. JobConf will not be deleted from the logs folder if job retires from finalizeJob() -- Key: MAPREDUCE-757 URL: https://issues.apache.org/jira/browse/MAPREDUCE-757 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-757-v1.0.patch, MAPREDUCE-757-v2.0-branch-0.20.patch, MAPREDUCE-757-v2.0.patch MAPREDUCE-130 fixed the case where the job is retired from the retire jobs thread. But jobs can also retire when the num-job-per-user limit is exceeded. In such cases the conf file will not be deleted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-835) hadoop-mapred examples, test and tools jar files are not being packaged when ant binary or bin-package is run
hadoop-mapred examples, test and tools jar files are not being packaged when ant binary or bin-package is run Key: MAPREDUCE-835 URL: https://issues.apache.org/jira/browse/MAPREDUCE-835 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Karam Singh On the mapreduce trunk, when running the ant binary or ant bin-package commands, hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar and hadoop-mapred-tools-0.21.0-dev.jar are not included in the tar or the build/hadoop-mapred-0.21.0-dev package directory, although they are present under the build directory. For ant tar and ant package they are packaged correctly into the build/hadoop-mapred-0.21.0-dev directory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-836) Examples of hadoop pipes and python are not packaged even when -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands.
Examples of hadoop pipes and python are not packaged even when -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands. - Key: MAPREDUCE-836 URL: https://issues.apache.org/jira/browse/MAPREDUCE-836 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1, 0.21.0 Reporter: Karam Singh Examples of hadoop pipes and python are not packaged even when the -Dcompile.native=yes -Dcompile.c++=yes options are used while running ant package or tar or similar commands. The pipes examples are compiled and copied under build/c++-examples but are not being packaged. The same is the case with the python examples. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-805) Deadlock in Jobtracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-805: - Attachment: MAPREDUCE-805-v1.7.patch Attaching a patch incorporating Devaraj's offline comments. Result of test-patch: [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 21 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Deadlock in Jobtracker -- Key: MAPREDUCE-805 URL: https://issues.apache.org/jira/browse/MAPREDUCE-805 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Michael Tamm Attachments: MAPREDUCE-805-v1.1.patch, MAPREDUCE-805-v1.2.patch, MAPREDUCE-805-v1.3.patch, MAPREDUCE-805-v1.6.patch, MAPREDUCE-805-v1.7.patch We are running a hadoop cluster (version 0.20.0) and have detected the following deadlock on our jobtracker:
{code}
IPC Server handler 51 on 9001:
  at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:943)
  - waiting to lock 0x7f2b6fb46130 (a org.apache.hadoop.mapred.JobInProgress)
  at org.apache.hadoop.mapred.JobTracker.getJobCounters(JobTracker.java:3102)
  - locked 0x7f2b5f026000 (a org.apache.hadoop.mapred.JobTracker)
  at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

pool-1-thread-2:
  at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2017)
  - waiting to lock 0x7f2b5f026000 (a org.apache.hadoop.mapred.JobTracker)
  at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2483)
  - locked 0x7f2b6fb46130 (a org.apache.hadoop.mapred.JobInProgress)
  at org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2152)
  - locked 0x7f2b6fb46130 (a org.apache.hadoop.mapred.JobInProgress)
  at org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:2169)
  - locked 0x7f2b6fb46130 (a org.apache.hadoop.mapred.JobInProgress)
  at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2245)
  - locked 0x7f2b6fb46130 (a org.apache.hadoop.mapred.JobInProgress)
  at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:86)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:619)
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
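The two stacks show a classic lock-ordering deadlock: the IPC handler locks the JobTracker and then waits for the JobInProgress, while the init thread holds the JobInProgress and waits for the JobTracker. A minimal illustration (not Hadoop code; the locks are stand-ins) of the standard remedy, a single agreed acquisition order:

```java
public class LockOrderSketch {
    private final Object trackerLock = new Object(); // stands in for JobTracker
    private final Object jobLock = new Object();     // stands in for JobInProgress

    // Corresponds to JobTracker.getJobCounters -> JobInProgress.getCounters:
    // tracker lock first, then job lock.
    String getJobCounters() {
        synchronized (trackerLock) {
            synchronized (jobLock) {
                return "counters";
            }
        }
    }

    // The deadlocking path (JobInProgress.fail -> garbageCollect ->
    // JobTracker.finalizeJob) took jobLock first. Reordered to match the
    // other path, the two can no longer deadlock.
    String finalizeJob() {
        synchronized (trackerLock) {
            synchronized (jobLock) {
                return "finalized";
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrderSketch jt = new LockOrderSketch();
        Thread a = new Thread(() -> System.out.println(jt.getJobCounters()));
        Thread b = new Thread(() -> System.out.println(jt.finalizeJob()));
        a.start(); b.start();
        a.join(); b.join(); // completes; the inverted ordering could hang here
    }
}
```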
[jira] Commented: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace
[ https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740584#action_12740584 ] Hadoop QA commented on MAPREDUCE-479: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12415747/MAPREDUCE-479-4.patch against trunk revision 801959. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/console This message is automatically generated. 
Add reduce ID to shuffle clienttrace Key: MAPREDUCE-479 URL: https://issues.apache.org/jira/browse/MAPREDUCE-479 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.21.0 Reporter: Jiaqi Tan Assignee: Jiaqi Tan Priority: Minor Fix For: 0.21.0 Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, MAPREDUCE-479-2.patch, MAPREDUCE-479-3.patch, MAPREDUCE-479-4.patch, MAPREDUCE-479.patch Current clienttrace messages from shuffles note only the destination map ID but not the source reduce ID. Having both source and destination ID of each shuffle enables full tracing of execution.
[jira] Commented: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes
[ https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740609#action_12740609 ] Aaron Kimball commented on MAPREDUCE-798: test failures are in streaming MRUnit should be able to test a succession of MapReduce passes -- Key: MAPREDUCE-798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-798 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-798.2.patch, MAPREDUCE-798.patch MRUnit can currently test that the inputs to a given (mapper, reducer) job produce certain outputs at the end of the reducer. It would be good to support more end-to-end tests of a series of MapReduce jobs that form a longer pipeline surrounding some data.
[jira] Commented: (MAPREDUCE-814) Move completed Job history files to HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740613#action_12740613 ] Sharad Agarwal commented on MAPREDUCE-814: test patch and ant test passed. Move completed Job history files to HDFS Key: MAPREDUCE-814 URL: https://issues.apache.org/jira/browse/MAPREDUCE-814 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Reporter: Sharad Agarwal Assignee: Sharad Agarwal Attachments: 814_v1.patch, 814_v2.patch, 814_v3.patch, 814_v4.patch, 814_v5.patch Currently completed job history files remain on the jobtracker node. Having the files available on HDFS will enable clients to access these files more easily.
[jira] Commented: (MAPREDUCE-796) Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner
[ https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740658#action_12740658 ] Hudson commented on MAPREDUCE-796: Integrated in Hadoop-Mapreduce-trunk #41 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/41/]). Fixes a ClassCastException in an exception log in MultiThreadedMapRunner. Contributed by Amar Kamat. Encountered ClassCastException on tasktracker while running wordcount with MultithreadedMapRunner --- Key: MAPREDUCE-796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.1 Reporter: Suman Sehgal Assignee: Amar Kamat Fix For: 0.20.1 Attachments: MAPREDUCE-796-v1.0.patch ClassCastException for OutOfMemoryError is encountered on tasktracker while running wordcount example with MultithreadedMapRunner. Stack trace:
{noformat}
java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException
        at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
{noformat}
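The ClassCastException above comes from blindly casting a caught Throwable to RuntimeException, which breaks when the Throwable is an Error such as OutOfMemoryError. A minimal sketch of the unsafe pattern and a safer rethrow; the class and method names are illustrative, not the actual MultithreadedMapper code:

```java
// Illustrative helper, not Hadoop code: shows why casting an arbitrary
// Throwable to RuntimeException fails for Errors, and a safe alternative.
public class RethrowSketch {
    // Unsafe: if th is an Error (e.g. OutOfMemoryError), the cast itself
    // throws ClassCastException and the original failure is lost.
    static void rethrowUnsafe(Throwable th) {
        throw (RuntimeException) th;
    }

    // Safer: rethrow Errors and RuntimeExceptions as-is, wrap everything else.
    static void rethrowSafe(Throwable th) {
        if (th instanceof Error) {
            throw (Error) th;
        } else if (th instanceof RuntimeException) {
            throw (RuntimeException) th;
        } else {
            throw new RuntimeException(th);
        }
    }
}
```

With the safe variant, the OutOfMemoryError itself propagates instead of being masked by a ClassCastException in the exception-handling path.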
[jira] Commented: (MAPREDUCE-375) Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740657#action_12740657 ] Hudson commented on MAPREDUCE-375: Integrated in Hadoop-Mapreduce-trunk #41 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/41/]). Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api. Contributed by Amareshwari Sriramadasu. Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api. -- Key: MAPREDUCE-375 URL: https://issues.apache.org/jira/browse/MAPREDUCE-375 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-375-1.txt, patch-375-2.txt, patch-375.txt
[jira] Commented: (MAPREDUCE-837) harchive fail when output directory has URI with default port of 8020
[ https://issues.apache.org/jira/browse/MAPREDUCE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740692#action_12740692 ] Koji Noguchi commented on MAPREDUCE-837:

hadoop archive -archiveName abc.har /user/knoguchi/abc hdfs://mynamenode:8020/user/knoguchi

In 0.18, the job fails with
{noformat}
09/08/07 19:41:57 INFO mapred.JobClient: Task Id : attempt_200908071938_0001_m_00_2, Status : FAILED
Failed to rename output with the exception: java.io.IOException: Can not get the relative path:
  base = hdfs://mynamenode:8020/user/knoguchi/abc.har/_temporary/_attempt_200908071938_0001_m_00_2
  child = hdfs://mynamenode/user/knoguchi/abc.har/_temporary/_attempt_200908071938_0001_m_00_2/part-0
        at org.apache.hadoop.mapred.Task.getFinalPath(Task.java:590)
        at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:603)
        at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:621)
        at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:565)
        at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2616)
{noformat}
In 0.20, it logs the above warning but the job succeeds with an empty output directory, which is worse. I'll create a separate Jira for the 0.20 job-succeeding part.

harchive fail when output directory has URI with default port of 8020 - Key: MAPREDUCE-837 URL: https://issues.apache.org/jira/browse/MAPREDUCE-837 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.1 Reporter: Koji Noguchi Priority: Minor % hadoop archive -archiveName abc.har /user/knoguchi/abc hdfs://mynamenode:8020/user/knoguchi doesn't work on 0.18 nor 0.20
[jira] Created: (MAPREDUCE-838) Task succeeds even when committer.commitTask fails with IOException
Task succeeds even when committer.commitTask fails with IOException --- Key: MAPREDUCE-838 URL: https://issues.apache.org/jira/browse/MAPREDUCE-838 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.20.1 Reporter: Koji Noguchi In MAPREDUCE-837, the job succeeded with empty output even though all the tasks were throwing IOException from committer.commitTask.
{noformat}
2009-08-07 17:51:47,458 INFO org.apache.hadoop.mapred.TaskRunner: Task attempt_200907301448_8771_r_00_0 is allowed to commit now
2009-08-07 17:51:47,466 WARN org.apache.hadoop.mapred.TaskRunner: Failure committing: java.io.IOException: Can not get the relative path: \
  base = hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0 \
  child = hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
        at org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
        at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
        at org.apache.hadoop.mapred.Task.commit(Task.java:768)
        at org.apache.hadoop.mapred.Task.done(Task.java:692)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-08-07 17:51:47,468 WARN org.apache.hadoop.mapred.TaskRunner: Failure asking whether task can commit: java.io.IOException: \
  Can not get the relative path: base = hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0 \
  child = hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
        at org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
        at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
        at org.apache.hadoop.mapred.Task.commit(Task.java:768)
        at org.apache.hadoop.mapred.Task.done(Task.java:692)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-08-07 17:51:47,469 INFO org.apache.hadoop.mapred.TaskRunner: Task attempt_200907301448_8771_r_00_0 is allowed to commit now
2009-08-07 17:51:47,472 INFO org.apache.hadoop.mapred.TaskRunner: Task 'attempt_200907301448_8771_r_00_0' done.
{noformat}
[jira] Commented: (MAPREDUCE-837) harchive fail when output directory has URI with default port of 8020
[ https://issues.apache.org/jira/browse/MAPREDUCE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740697#action_12740697 ] Koji Noguchi commented on MAPREDUCE-837: bq. I'll create a separate Jira for the 0.20 job succeeding part. Created MAPREDUCE-838 harchive fail when output directory has URI with default port of 8020 - Key: MAPREDUCE-837 URL: https://issues.apache.org/jira/browse/MAPREDUCE-837 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.1 Reporter: Koji Noguchi Priority: Minor % hadoop archive -archiveName abc.har /user/knoguchi/abc hdfs://mynamenode:8020/user/knoguchi doesn't work on 0.18 nor 0.20
[jira] Created: (MAPREDUCE-839) unit test TestMiniMRChildTask fails on mac os-x
unit test TestMiniMRChildTask fails on mac os-x --- Key: MAPREDUCE-839 URL: https://issues.apache.org/jira/browse/MAPREDUCE-839 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hong Tang Priority: Minor The unit test TestMiniMRChildTask fails on Mac OS-X (10.5.8)
[jira] Commented: (MAPREDUCE-825) JobClient completion poll interval of 5s causes slow tests in local mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740726#action_12740726 ] Hadoop QA commented on MAPREDUCE-825:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12415772/MAPREDUCE-825.2.patch against trunk revision 801959.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/console

JobClient completion poll interval of 5s causes slow tests in local mode Key: MAPREDUCE-825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-825 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Priority: Minor Attachments: completion-poll-interval.patch, MAPREDUCE-825.2.patch The JobClient.NetworkedJob.waitForCompletion() method polls for job completion every 5 seconds. When running a set of short tests in pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted time. When bandwidth is not scarce, setting the poll interval to 100 ms results in a 4x speedup in some tests. This interval should be parametrized to allow users to control the interval for testing purposes.
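The parametrization MAPREDUCE-825 asks for can be sketched as reading the interval from configuration, falling back to the current 5-second behavior when unset. The helper and the idea of a string-valued property are illustrative assumptions here, not the key the actual patch introduces:

```java
// Hypothetical helper, not the actual JobClient code: parse a configured
// poll interval (milliseconds), falling back to the historical 5 s default
// when the value is missing or invalid.
public class PollIntervalSketch {
    static final int DEFAULT_POLL_INTERVAL_MS = 5000;

    static int pollIntervalMs(String configured) {
        if (configured == null) return DEFAULT_POLL_INTERVAL_MS;
        try {
            int v = Integer.parseInt(configured.trim());
            // guard against zero/negative values that would busy-spin
            return v > 0 ? v : DEFAULT_POLL_INTERVAL_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_POLL_INTERVAL_MS;
        }
    }
}
```

A test would then set the value to something like 100 ms to get the speedup described above, while production jobs keep the default.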
[jira] Commented: (MAPREDUCE-839) unit test TestMiniMRChildTask fails on mac os-x
[ https://issues.apache.org/jira/browse/MAPREDUCE-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740727#action_12740727 ] Hong Tang commented on MAPREDUCE-839:

The problem was discovered on Mac OS-X, but the root causes could also affect non-Mac platforms:

Line 66: assertEquals(tmp, new Path(System.getProperty("java.io.tmpdir")).makeQualified(localFs).toString());
expected = file:/[private/]tmp/hadoop-htang/map..., actual = file:/[]tmp/hadoop-htang/map
Root cause: on Mac OS-X, /tmp is a symlink to /private/tmp. The test would probably also fail on other Unix systems where /tmp is a symlink.

Line 160: assertTrue("LD doesn't contain pwd", System.getenv("LD_LIBRARY_PATH").contains(pwd));
Root cause: the environment variable for the dynamic library search path on Mac OS-X is DYLD_LIBRARY_PATH instead of LD_LIBRARY_PATH.

unit test TestMiniMRChildTask fails on mac os-x --- Key: MAPREDUCE-839 URL: https://issues.apache.org/jira/browse/MAPREDUCE-839 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hong Tang Priority: Minor The unit test TestMiniMRChildTask fails on Mac OS-X (10.5.8)
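The /tmp symlink failure above is the classic mismatch between a raw path string and a canonicalized one. A small illustrative helper showing how comparing canonical paths on both sides sidesteps the symlink (an assumption about how the assertion could be made symlink-safe, not the actual fix):

```java
import java.io.File;
import java.io.IOException;

// Illustrative helper, not the TestMiniMRChildTask fix: canonicalize both
// paths before comparing, so /tmp and /private/tmp (a symlink target on
// Mac OS X) compare equal, as do paths containing "." or "..".
public class CanonicalPathSketch {
    static boolean samePath(String a, String b) throws IOException {
        return new File(a).getCanonicalPath()
                .equals(new File(b).getCanonicalPath());
    }
}
```

Comparing `getCanonicalPath()` results resolves symlinks and relative segments, so the assertion no longer depends on whether the OS routes /tmp through a symlink.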
[jira] Commented: (MAPREDUCE-825) JobClient completion poll interval of 5s causes slow tests in local mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740743#action_12740743 ] Aaron Kimball commented on MAPREDUCE-825: Failures are in streaming only. JobClient completion poll interval of 5s causes slow tests in local mode Key: MAPREDUCE-825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-825 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Priority: Minor Attachments: completion-poll-interval.patch, MAPREDUCE-825.2.patch The JobClient.NetworkedJob.waitForCompletion() method polls for job completion every 5 seconds. When running a set of short tests in pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted time. When bandwidth is not scarce, setting the poll interval to 100 ms results in a 4x speedup in some tests. This interval should be parametrized to allow users to control the interval for testing purposes.
[jira] Created: (MAPREDUCE-840) DBInputFormat leaves open transaction
DBInputFormat leaves open transaction - Key: MAPREDUCE-840 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Priority: Minor DBInputFormat.getSplits() does not connection.commit() after the COUNT query. This can leave an open transaction against the database which interferes with other connections to the same table.
[jira] Updated: (MAPREDUCE-840) DBInputFormat leaves open transaction
[ https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-840: Attachment: MAPREDUCE-840.patch Attaching trivial patch for this issue. No new tests because I've only seen this issue manifest when interacting with postgresql. I've verified that with this fix in place, it works with postgresql. The TestDBJob unit test also works. DBInputFormat leaves open transaction - Key: MAPREDUCE-840 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Priority: Minor Attachments: MAPREDUCE-840.patch DBInputFormat.getSplits() does not connection.commit() after the COUNT query. This can leave an open transaction against the database which interferes with other connections to the same table.
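The shape of the fix described above can be sketched as committing right after reading the COUNT result, so the transaction the query opened is not left pending. The class and helper below are illustrative, not the actual DBInputFormat.getSplits() code:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hedged sketch, not Hadoop code: run the COUNT query, then commit so the
// transaction it opened does not stay open against the table.
public class CountWithCommit {
    static long countRows(Connection conn, String table) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next();
            long count = rs.getLong(1);
            conn.commit();  // the missing call: end the COUNT's transaction
            return count;
        }
    }
}
```

On databases like PostgreSQL, where a plain SELECT still runs inside a transaction when auto-commit is off, the commit (or equivalently closing the connection promptly) is what releases the snapshot the COUNT was holding.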
[jira] Updated: (MAPREDUCE-840) DBInputFormat leaves open transaction
[ https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-840: Status: Patch Available (was: Open) DBInputFormat leaves open transaction - Key: MAPREDUCE-840 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Priority: Minor Attachments: MAPREDUCE-840.patch DBInputFormat.getSplits() does not connection.commit() after the COUNT query. This can leave an open transaction against the database which interferes with other connections to the same table.
[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API
[ https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-750: Status: Patch Available (was: Open) Extensible ConnManager factory API -- Key: MAPREDUCE-750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-750 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/sqoop Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, MAPREDUCE-750.patch Sqoop uses the ConnFactory class to instantiate a ConnManager implementation based on the connect string and other arguments supplied by the user. This allows per-database logic to be encapsulated in different ConnManager instances, and dynamically chosen based on which database the user is actually importing from. But adding new ConnManager implementations requires modifying the source of a common ConnFactory class. An indirection layer should be used to delegate instantiation to a number of factory implementations which can be specified in the static configuration or at runtime.
[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API
[ https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-750: Attachment: MAPREDUCE-750.3.patch New patch resync'd with trunk Extensible ConnManager factory API -- Key: MAPREDUCE-750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-750 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/sqoop Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, MAPREDUCE-750.patch Sqoop uses the ConnFactory class to instantiate a ConnManager implementation based on the connect string and other arguments supplied by the user. This allows per-database logic to be encapsulated in different ConnManager instances, and dynamically chosen based on which database the user is actually importing from. But adding new ConnManager implementations requires modifying the source of a common ConnFactory class. An indirection layer should be used to delegate instantiation to a number of factory implementations which can be specified in the static configuration or at runtime.
[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API
[ https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-750: Status: Open (was: Patch Available) Extensible ConnManager factory API -- Key: MAPREDUCE-750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-750 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/sqoop Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, MAPREDUCE-750.patch Sqoop uses the ConnFactory class to instantiate a ConnManager implementation based on the connect string and other arguments supplied by the user. This allows per-database logic to be encapsulated in different ConnManager instances, and dynamically chosen based on which database the user is actually importing from. But adding new ConnManager implementations requires modifying the source of a common ConnFactory class. An indirection layer should be used to delegate instantiation to a number of factory implementations which can be specified in the static configuration or at runtime.
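The indirection layer described in MAPREDUCE-750 can be sketched as an ordered list of registered factories, each of which may accept or decline a connect string. All names below are illustrative, not Sqoop's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an extensible factory: instead of one class that
// hard-codes every manager, factories are registered (e.g. from config)
// and consulted in order until one accepts the connect string.
public class ConnFactorySketch {
    interface ManagerFactory {
        // returns a manager name for this connect string, or null to pass
        String accept(String connectString);
    }

    private final List<ManagerFactory> factories = new ArrayList<>();

    void register(ManagerFactory f) {
        factories.add(f);
    }

    String getManager(String connectString) {
        for (ManagerFactory f : factories) {
            String m = f.accept(connectString);
            if (m != null) return m;
        }
        throw new IllegalArgumentException("No manager for: " + connectString);
    }
}
```

With this shape, supporting a new database means registering one more factory rather than editing a shared ConnFactory class; a catch-all JDBC factory can sit last in the list as the fallback.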
[jira] Commented: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run
[ https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740808#action_12740808 ] Hadoop QA commented on MAPREDUCE-799:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12414378/MAPREDUCE-799.patch against trunk revision 801959.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/console

Some of MRUnit's self-tests were not being run -- Key: MAPREDUCE-799 URL: https://issues.apache.org/jira/browse/MAPREDUCE-799 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-799.patch Due to method naming issues, some test cases were not being executed.
[jira] Commented: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run
[ https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740816#action_12740816 ] Aaron Kimball commented on MAPREDUCE-799: contrib failures are just streaming. Some of MRUnit's self-tests were not being run -- Key: MAPREDUCE-799 URL: https://issues.apache.org/jira/browse/MAPREDUCE-799 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-799.patch Due to method naming issues, some test cases were not being executed.
[jira] Commented: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740837#action_12740837 ] Todd Lipcon commented on MAPREDUCE-64: Hi Arun, Have you guys worked on this at all already? I'm interested in playing around with rewriting part of the map-side sort to get rid of this tunable. Like you said, for a lot of applications the default values are *way* off. 350K records in 95MB = 271 bytes average record size, which is larger than the records of probably the majority of jobs we see in practice. If you have already worked on this I don't want to duplicate your effort, but if not, I think it would be a good step towards better average performance without expert tuning. Map-side sort is hampered by io.sort.record.percent --- Key: MAPREDUCE-64 URL: https://issues.apache.org/jira/browse/MAPREDUCE-64 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Arun C Murthy Currently io.sort.record.percent is a fairly obscure, per-job configurable, expert-level parameter which controls how much accounting space is available for records in the map-side sort buffer (io.sort.mb). Typical values for io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store ~350,000 records in the buffer before necessitating a sort/combine/spill. However, for many applications which deal with small records, e.g. the world-famous wordcount and its family, this implies we can only use 5-10% of io.sort.mb, i.e. 5-10M, before we spill in spite of having _much_ more memory available in the sort buffer. Wordcount, for example, results in ~12 spills (given an HDFS block size of 64M). The presence of a combiner exacerbates the problem by piling on serialization/deserialization of records too... Sure, jobs can configure io.sort.record.percent, but it's tedious and obscure; we really can do better by getting the framework to automagically pick it, using all available memory (up to io.sort.mb) for either the data or the accounting.
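The record counts quoted above can be checked with a little arithmetic, assuming 16 bytes of accounting space per record (a commonly cited figure for the map-side sort's accounting arrays; treat it as an assumption here): 5% of a 100 MB buffer is 5 MB of accounting, or about 327,680 records, roughly the ~350,000 mentioned.

```java
// Back-of-the-envelope check of the io.sort.record.percent numbers,
// assuming (not verified against the source) 16 bytes of accounting
// metadata per record in the map-side sort buffer.
public class SortBufferMath {
    static long maxRecords(int ioSortMb, double recordPercent, int bytesPerRecord) {
        // accounting space = io.sort.mb * io.sort.record.percent
        long accountingBytes = (long) (ioSortMb * 1024L * 1024L * recordPercent);
        return accountingBytes / bytesPerRecord;
    }
}
```

At ~327k records the remaining ~95 MB of data space allows roughly 300 bytes per record before the accounting side fills first; anything smaller spills long before the data buffer is full, which is exactly the wordcount-style pathology Arun describes.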