[jira] [Created] (MAPREDUCE-2614) Allow append arbitrary text at the end of generated query in DBOutputFormat class
Allow append arbitrary text at the end of generated query in DBOutputFormat class

Key: MAPREDUCE-2614
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2614
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Jarek Jarcec Cecho
Priority: Minor

It would be wonderful if the DBOutputFormat class allowed arbitrary text to be appended to the end of the generated query. This would be useful, for example, with a MySQL database to specify the ON DUPLICATE KEY UPDATE ... part of the query.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2187) map tasks timeout during sorting
[ https://issues.apache.org/jira/browse/MAPREDUCE-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-2187:

Attachment: MAPREDUCE-2187-branch-MR-279.patch

map tasks timeout during sorting

Key: MAPREDUCE-2187
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2187
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Gianmarco De Francisci Morales
Assignee: Anupam Seth
Attachments: MAPREDUCE-2187-20-security.patch, MAPREDUCE-2187-22.patch, MAPREDUCE-2187-branch-MR-279.patch, MAPREDUCE-2187-trunk.patch

During the execution of a large job, the map tasks time out:

{code}
INFO mapred.JobClient: Task Id : attempt_201010290414_60974_m_57_1, Status : FAILED
Task attempt_201010290414_60974_m_57_1 failed to report status for 609 seconds. Killing!
{code}

The mapper itself has already finished; according to the logs, the timeout occurs during the merge-sort phase. The intermediate data generated by the map task is quite large, which I think is the cause. The logs show that the merge sort had been running for 10 minutes when the task was killed. I think mapred.Merger should call Reporter.progress() somewhere.
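The suggestion in the report above (have the merge call a progress hook while it runs) can be illustrated with a small stand-alone sketch. This is not the code from any of the attached patches: `ProgressReportingMerge`, `ProgressCallback` and the `reportEvery` parameter are hypothetical names, and the callback is only a stand-in for whatever `Reporter.progress()` does in the framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Sketch only: a k-way merge that signals liveness while it works. */
final class ProgressReportingMerge {

    /** Hypothetical stand-in for the framework's progress hook. */
    interface ProgressCallback {
        void progress();
    }

    /**
     * Merges already-sorted runs into one sorted list and invokes the
     * callback after every reportEvery merged records (reportEvery > 0).
     */
    static List<Integer> merge(List<List<Integer>> sortedRuns,
                               int reportEvery,
                               ProgressCallback callback) {
        // Each heap entry is {value, run index, position within that run}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int run = 0; run < sortedRuns.size(); run++) {
            if (!sortedRuns.get(run).isEmpty()) {
                heap.add(new int[] {sortedRuns.get(run).get(0), run, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged.add(head[0]);
            List<Integer> run = sortedRuns.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
            // Report periodically so a long merge is not mistaken for a hang.
            if (merged.size() % reportEvery == 0) {
                callback.progress();
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<List<Integer>> runs =
                List.of(List.of(1, 4, 7), List.of(2, 5), List.of(3, 6));
        List<Integer> merged = merge(runs, 2, () -> calls[0]++);
        System.out.println(merged + ", progress calls: " + calls[0]);
    }
}
```

Reporting every N records rather than on every record keeps the overhead of the callback small; the right interval for a real merger is a tuning question this sketch does not try to answer.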
[jira] [Updated] (MAPREDUCE-2614) Allow append arbitrary text at the end of generated query in DBOutputFormat class
[ https://issues.apache.org/jira/browse/MAPREDUCE-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated MAPREDUCE-2614:

Release Note: Add new configuration property for DBConfiguration and its propagation to DBOutputFormat
Status: Patch Available (was: Open)

I've added a new configuration property to DBConfiguration called OUTPUT_QUERY_APPEND_PROPERTY. If this property is not empty, then its content is appended to the end of the query generated in class DBOutputFormat (constructQuery method). I've also fixed both tests around these two classes.

Allow append arbitrary text at the end of generated query in DBOutputFormat class

Key: MAPREDUCE-2614
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2614
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Jarek Jarcec Cecho
Priority: Minor

It would be wonderful if the DBOutputFormat class allowed arbitrary text to be appended to the end of the generated query. This would be useful, for example, with a MySQL database to specify the ON DUPLICATE KEY UPDATE ... part of the query.
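For readers without the patch at hand, the behaviour described above might look roughly like the following. This is a stand-alone sketch, not the attached patch and not the actual DBOutputFormat source: the class `QueryAppendSketch`, the three-argument `constructQuery` signature, the exact INSERT layout, and the table and column names are all assumptions made for illustration.

```java
/** Sketch only: an INSERT generator with an optional trailing suffix. */
final class QueryAppendSketch {

    /**
     * Builds a parameterised INSERT statement and, if appendText is
     * non-empty, adds it after the VALUES clause.
     */
    static String constructQuery(String table, String[] fieldNames, String appendText) {
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table);
        query.append(" (").append(String.join(",", fieldNames)).append(")");
        query.append(" VALUES (");
        for (int i = 0; i < fieldNames.length; i++) {
            query.append(i == 0 ? "?" : ",?"); // one placeholder per column
        }
        query.append(")");
        // An unset or empty suffix leaves the generated query unchanged.
        if (appendText != null && !appendText.isEmpty()) {
            query.append(" ").append(appendText);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table and columns, MySQL-style upsert suffix.
        System.out.println(constructQuery(
                "visits",
                new String[] {"url", "hits"},
                "ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits)"));
    }
}
```

Because the suffix is appended verbatim, it has to come from trusted job configuration; nothing in this sketch validates or escapes it.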
[jira] [Updated] (MAPREDUCE-2614) Allow append arbitrary text at the end of generated query in DBOutputFormat class
[ https://issues.apache.org/jira/browse/MAPREDUCE-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated MAPREDUCE-2614:

Attachment: MAPREDUCE-2614.patch

Allow append arbitrary text at the end of generated query in DBOutputFormat class

Key: MAPREDUCE-2614
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2614
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Jarek Jarcec Cecho
Priority: Minor
Attachments: MAPREDUCE-2614.patch

It would be wonderful if the DBOutputFormat class allowed arbitrary text to be appended to the end of the generated query. This would be useful, for example, with a MySQL database to specify the ON DUPLICATE KEY UPDATE ... part of the query.
[jira] [Commented] (MAPREDUCE-2614) Allow append arbitrary text at the end of generated query in DBOutputFormat class
[ https://issues.apache.org/jira/browse/MAPREDUCE-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053346#comment-13053346 ]

Hadoop QA commented on MAPREDUCE-2614:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12483446/MAPREDUCE-2614.patch
against trunk revision 1138301.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
    org.apache.hadoop.cli.TestMRCLI
    org.apache.hadoop.fs.TestFileSystem
    org.apache.hadoop.mapreduce.lib.db.TestDBJob
-1 contrib tests. The patch failed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/411//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/411//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/411//console

This message is automatically generated.

Allow append arbitrary text at the end of generated query in DBOutputFormat class

Key: MAPREDUCE-2614
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2614
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Jarek Jarcec Cecho
Priority: Minor
Attachments: MAPREDUCE-2614.patch

It would be wonderful if the DBOutputFormat class allowed arbitrary text to be appended to the end of the generated query. This would be useful, for example, with a MySQL database to specify the ON DUPLICATE KEY UPDATE ... part of the query.
[jira] [Created] (MAPREDUCE-2615) MR 279: KillJob should go through AM whenever possible
MR 279: KillJob should go through AM whenever possible

Key: MAPREDUCE-2615
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2615
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Fix For: 0.23.0

KillJob currently goes directly to the RM - which effectively causes the AM and tasks to be killed via a signal. History information is not recorded in this case.
[jira] [Updated] (MAPREDUCE-2615) MR 279: KillJob should go through AM whenever possible
[ https://issues.apache.org/jira/browse/MAPREDUCE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-2615:

Attachment: MR2615.patch

Kill goes via the AM if it's available. Setting #maps, reduces and finishTime in JobSummaryLog in case of Killed jobs.

MR 279: KillJob should go through AM whenever possible

Key: MAPREDUCE-2615
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2615
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Fix For: 0.23.0
Attachments: MR2615.patch

KillJob currently goes directly to the RM - which effectively causes the AM and tasks to be killed via a signal. History information is not recorded in this case.
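The routing rule stated in the comment above ("via the AM if it's available") can be pictured with a very small sketch. It assumes that the fallback when no AM is reachable is the existing direct request to the RM; the `KillJobRouting` class, the `KillTarget` interface and the job id are hypothetical and do not correspond to MR-279 classes.

```java
import java.util.Optional;

/** Sketch only: prefer the application master, fall back to the resource manager. */
final class KillJobRouting {

    /** Hypothetical endpoint that can be asked to kill a job. */
    interface KillTarget {
        String kill(String jobId);
    }

    /**
     * Sends the kill to the AM when one is available, so the AM can shut
     * the job down itself; otherwise sends it to the RM (assumed fallback).
     */
    static String killJob(String jobId, Optional<KillTarget> appMaster, KillTarget resourceManager) {
        if (appMaster.isPresent()) {
            return appMaster.get().kill(jobId);
        }
        return resourceManager.kill(jobId);
    }

    public static void main(String[] args) {
        KillTarget am = id -> "AM handled kill of " + id;
        KillTarget rm = id -> "RM handled kill of " + id;
        System.out.println(killJob("job_0001", Optional.of(am), rm));
        System.out.println(killJob("job_0001", Optional.empty(), rm));
    }
}
```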
[jira] [Assigned] (MAPREDUCE-2560) Support specification of codecs by name
[ https://issues.apache.org/jira/browse/MAPREDUCE-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers reassigned MAPREDUCE-2560:

Assignee: Arun Ramakrishnan (was: Anthony Urso)

Support specification of codecs by name

Key: MAPREDUCE-2560
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2560
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Tom White
Assignee: Arun Ramakrishnan
Labels: newbie

By changing the code to take advantage of HADOOP-7323, it will be possible to specify compression codecs in configuration by name (e.g. 'gzip'), not only by classname, although that will still be supported, of course (e.g. 'org.apache.hadoop.io.compress.GzipCodec').
[jira] [Assigned] (MAPREDUCE-2560) Support specification of codecs by name
[ https://issues.apache.org/jira/browse/MAPREDUCE-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anthony Urso reassigned MAPREDUCE-2560:

Assignee: Anthony Urso

Support specification of codecs by name

Key: MAPREDUCE-2560
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2560
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Tom White
Assignee: Anthony Urso
Labels: newbie

By changing the code to take advantage of HADOOP-7323, it will be possible to specify compression codecs in configuration by name (e.g. 'gzip'), not only by classname, although that will still be supported, of course (e.g. 'org.apache.hadoop.io.compress.GzipCodec').
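The idea in the issue description (accept a short name such as 'gzip' while still accepting a full class name) amounts to a lookup with a pass-through default. The sketch below is only an illustration of that idea: `CodecNameResolver` and its alias table are hypothetical, the case-insensitive match is an assumption, and the real name resolution is whatever HADOOP-7323 provides.

```java
import java.util.Locale;
import java.util.Map;

/** Sketch only: map a short codec name to a class name, else pass through. */
final class CodecNameResolver {

    // Hypothetical alias table containing only the example from the issue.
    private static final Map<String, String> ALIASES =
            Map.of("gzip", "org.apache.hadoop.io.compress.GzipCodec");

    /**
     * Returns the class name for a known short name (matched without regard
     * to case, an assumption), or the input unchanged so that existing
     * configurations that already use class names keep working.
     */
    static String resolve(String nameOrClass) {
        String className = ALIASES.get(nameOrClass.toLowerCase(Locale.ROOT));
        return className != null ? className : nameOrClass;
    }

    public static void main(String[] args) {
        System.out.println(resolve("gzip"));
        System.out.println(resolve("org.apache.hadoop.io.compress.GzipCodec"));
    }
}
```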
[jira] [Resolved] (MAPREDUCE-2615) MR 279: KillJob should go through AM whenever possible
[ https://issues.apache.org/jira/browse/MAPREDUCE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Lu resolved MAPREDUCE-2615.

Tags: mrv2
Resolution: Fixed
Hadoop Flags: [Reviewed]

+1. Committed to MR-279. Thanks Sidd!

MR 279: KillJob should go through AM whenever possible

Key: MAPREDUCE-2615
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2615
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Fix For: 0.23.0
Attachments: MR2615.patch

KillJob currently goes directly to the RM - which effectively causes the AM and tasks to be killed via a signal. History information is not recorded in this case.
[jira] [Commented] (MAPREDUCE-2602) Allow setting of end-of-record delimiter for TextInputFormat (for the old API)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053561#comment-13053561 ]

Ahmed Radwan commented on MAPREDUCE-2602:

The Hadoop-QA test failures above are not related to the submitted patch.

Allow setting of end-of-record delimiter for TextInputFormat (for the old API)

Key: MAPREDUCE-2602
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2602
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Attachments: MAPREDUCE-2602.patch

Since there are users who are still using the old MR API, it will be useful to modify the org.apache.hadoop.mapred.LineRecordReader and org.apache.hadoop.mapred.TextInputFormat to be able to use custom (user-specified) end-of-record delimiters. This will make use of the LineReader improvement introduced in HADOOP-7096 that enables the LineReader to break lines at user-specified delimiters.

Note: MAPREDUCE-2254 already added this improvement to the new API (but not the old API).
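What "breaking lines at user-specified delimiters" means for record boundaries can be shown on an in-memory string. The sketch below is not the LineReader or LineRecordReader code and ignores everything that makes the real problem hard (streams, buffers, input splits); `DelimitedRecords`, its method and the sample delimiter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: split text into records at a custom end-of-record delimiter. */
final class DelimitedRecords {

    /** Returns the records in data, each ending at delimiter (not included). */
    static List<String> split(String data, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = data.indexOf(delimiter, start)) >= 0) {
            records.add(data.substring(start, end));
            start = end + delimiter.length(); // skip past the delimiter itself
        }
        if (start < data.length()) {
            records.add(data.substring(start)); // final record with no trailing delimiter
        }
        return records;
    }

    public static void main(String[] args) {
        // A multi-character delimiter; each record may itself contain newlines.
        System.out.println(split("first\nrecord|~|second|~|third", "|~|"));
    }
}
```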
[jira] [Created] (MAPREDUCE-2616) [Gridmix] Input data compression emulation might not work as expected with data reuse
[Gridmix] Input data compression emulation might not work as expected with data reuse

Key: MAPREDUCE-2616
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2616
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Amar Kamat
Fix For: 0.23.0

Currently, all the Gridmix input data files are located at gridmix-io-dir/input (gridmix-io-dir is expected as a CLI parameter). When compression emulation is enabled, Gridmix will check for compressed files (based on suffixes) in the input folder. Gridmix will bail out if there are no compressed input files. If the input folder contains a mix of compressed and uncompressed input files, then Gridmix might end up using uncompressed files, resulting in no emulation.
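To make the failure mode concrete: a check that only asks "is there at least one compressed file?" passes on a mixed folder, yet the job may still read the uncompressed files. One possible way around that, shown here purely as a sketch and not as the fix adopted for this issue, is to select only the suffix-matched files. `CompressedInputFilter`, the suffix list and the file names are assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Sketch only: keep just the suffix-matched (compressed) input files. */
final class CompressedInputFilter {

    // Assumed suffix list, for illustration only.
    private static final List<String> COMPRESSED_SUFFIXES = List.of(".gz");

    static boolean isCompressed(String fileName) {
        return COMPRESSED_SUFFIXES.stream().anyMatch(fileName::endsWith);
    }

    /**
     * Returns only the compressed inputs, and fails if there are none,
     * so uncompressed files in a mixed folder are never picked up.
     */
    static List<String> selectInputs(List<String> inputFiles) {
        List<String> compressed = inputFiles.stream()
                .filter(CompressedInputFilter::isCompressed)
                .collect(Collectors.toList());
        if (compressed.isEmpty()) {
            throw new IllegalStateException("no compressed input files found");
        }
        return compressed;
    }

    public static void main(String[] args) {
        // Hypothetical contents of a mixed gridmix-io-dir/input folder.
        System.out.println(selectInputs(List.of("part-0.gz", "part-1", "part-2.gz")));
    }
}
```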