[jira] Commented: (MAPREDUCE-1267) Fix typo in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786350#action_12786350 ]

Hadoop QA commented on MAPREDUCE-1267:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426992/mapreduce-1267.txt
against trunk revision 887135.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                       Please justify why no new tests are needed for this patch.
                       Also please list what manual steps were performed to verify this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/168/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/168/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/168/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/168/console

This message is automatically generated.
> Fix typo in mapred-default.xml
> ------------------------------
>
>                 Key: MAPREDUCE-1267
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: mapreduce-1267.txt
>
>
> There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of
> mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial
> patch to fix.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
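The fix amounts to renaming the misspelled property. A sketch of how the corrected entry would look in mapred-default.xml; the value and description shown here are illustrative placeholders, not taken from the actual patch:

```xml
<!-- Corrected spelling: "progressmonitor", not "progerssmonitor".
     Value and description below are illustrative, not from the patch. -->
<property>
  <name>mapreduce.client.progressmonitor.pollinterval</name>
  <value>1000</value>
  <description>The interval, in milliseconds, at which the client polls
  the JobTracker for job status updates.</description>
</property>
```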
[jira] Commented: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5
[ https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786344#action_12786344 ]

Hadoop QA commented on MAPREDUCE-1097:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12425767/MAPREDUCE-1097.patch
against trunk revision 887135.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 9 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    -1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/293/console

This message is automatically generated.
> Changes/fixes to support Vertica 3.5
> ------------------------------------
>
>                 Key: MAPREDUCE-1097
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>         Environment: Hadoop 0.21.0 pre-release and Vertica 3.5
>            Reporter: Omer Trajman
>            Assignee: Omer Trajman
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-1097.patch
>
>
> Vertica 3.5 includes three changes that the formatters should handle:
> 1) deploy_design function that handles much of the logic in the optimize method. This improvement uses deploy_design if the server version supports it instead of orchestrating in the formatter function.
> 2) truncate table instead of recreating the table
> 3) numeric, decimal, money, number types (all the same path)
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786342#action_12786342 ]

Todd Lipcon commented on MAPREDUCE-1114:
----------------------------------------

Ivy already caches the resolves done in the same run, in theory, but there are a lot of "different" resolves, I think? The gain here *is* from caching between runs, as you surmised.

> Speed up ivy resolution in builds with clever caching
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-1114
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: mapreduce-1114.txt, mapreduce-1114.txt, mapreduce-1114.txt
>
>
> An awful lot of time is spent in the ivy:resolve parts of the build, even when all of the dependencies have been fetched and cached. Profiling showed this was in XML parsing. I have a sort-of-ugly hack which speeds up incremental compiles (and more importantly "ant test") significantly using some ant macros to cache the resolved classpaths.
[jira] Commented: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words
[ https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786324#action_12786324 ]

Hadoop QA commented on MAPREDUCE-1174:
--------------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426970/MAPREDUCE-1174.2.patch
against trunk revision 887135.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/167/console

This message is automatically generated.
> Sqoop improperly handles table/column names which are reserved sql words
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1174
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/sqoop
>            Reporter: Aaron Kimball
>            Assignee: zhiyong zhang
>
>         Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch
>
>
> In some databases it is legal to name tables and columns with terms that overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such cases, the database allows you to escape the table and column names. We should always escape table and column names when possible.
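The escaping technique the description calls for can be sketched as follows. This is a concept sketch, not code from the patch; the function name is invented, and the MySQL-style backquote default is an assumption (the quote character is database-specific, so Sqoop would need to choose it per connection):

```python
def escape_identifier(name, quote="`"):
    """Wrap a table or column name in identifier quotes, doubling any
    embedded quote characters so the name cannot break out of the quoting.
    Backquotes are MySQL's convention; other databases use double quotes
    or brackets, so the character must be chosen per database."""
    return quote + name.replace(quote, quote * 2) + quote

# A reserved word such as TABLE becomes safe to splice into generated SQL:
#   SELECT `create`, `table` FROM `order` ...
```

Escaping unconditionally (not just for known reserved words) is the simpler policy, since reserved-word lists differ across databases and versions.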
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786322#action_12786322 ]

Chris Douglas commented on MAPREDUCE-1114:
------------------------------------------

I thought the bulk of the problem was re-resolving these properties during the same run. Is that mistaken? The current proposal also works across runs, which could be helpful, but again: maintaining the build is already a pain. Adding a cache to a bad idea is a well established software engineering practice, but I'd favor either fixing our use of ivy or replacing it if middling performance requires this.

> Speed up ivy resolution in builds with clever caching
> -----------------------------------------------------
[jira] Updated: (MAPREDUCE-1262) Eclipse Plugin does not build for Hadoop 0.20.1
[ https://issues.apache.org/jira/browse/MAPREDUCE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1262:
-------------------------------------

    Status: Open  (was: Patch Available)

The patch causes the build to [fail|http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/console], specifically:

{noformat}
     [exec] compile:
     [exec]      [echo] contrib: eclipse-plugin
     [exec]     [javac] Compiling 45 source files to /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h6.grid.sp2.yahoo.net/trunk/build/contrib/eclipse-plugin/classes
     [exec]     [javac] /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h6.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin/src/java/org/apache/hadoop/eclipse/launch/HadoopApplicationLaunchShortcut.java:35: cannot find symbol
     [exec]     [javac] symbol  : class JavaApplicationLaunchShortcut
     [exec]     [javac] location: package org.eclipse.jdt.debug.ui.launchConfigurations
     [exec]     [javac] import org.eclipse.jdt.debug.ui.launchConfigurations.JavaApplicationLaunchShortcut;
{noformat}

> Eclipse Plugin does not build for Hadoop 0.20.1
> -----------------------------------------------
>
>                 Key: MAPREDUCE-1262
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1262
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0
>         Environment: SLES 10, Mac OS/X 10.5.8
>            Reporter: Stephen Watt
>             Fix For: 0.20.1, 0.20.2, 0.21.0, 0.22.0
>
>         Attachments: hadoop-0.20.1-eclipse-plugin.jar, HADOOP-6360.patch
>
>
> When trying to run the build script for the Eclipse Plugin in src/contrib/eclipse-plugin there are several errors a user receives. The first error is that eclipse.home is not set. This is easily remedied by adding a value for eclipse.home in the build.properties file in the eclipse-plugin directory.
> The script then states it cannot compile org.apache.hadoop.eclipse.launch.HadoopApplicationLaunchShortcut because it cannot resolve JavaApplicationLaunchShortcut on line 35:
>     import org.eclipse.jdt.internal.debug.ui.launcher.JavaApplicationLaunchShortcut;
> and fails.
> I believe this is because there is no jar in eclipse.home/plugins that has this class in that package. I did, however, find it in org.eclipse.jdt.debug.ui.launchConfigurations.JavaApplicationLaunchShortcut, which was inside org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar in the plugins dir of Eclipse 3.5.
> Changing the import in the source to the latter allows the build to complete successfully. The M/R Perspective opens and works on my SLES 10 Linux environment but not on my MacBook Pro. Both are running Eclipse 3.5.
> To users wanting to do the same: I built this inside Eclipse. To do that I added org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar and hadoop-0.20.1-core.jar to the ant runtime configuration classpath. I also had to set the version value=0.20.1 in build.properties. You will also need to copy hadoop-0.20.1-core.jar to hadoop.home/build and commons-cli-1.2.jar to hadoop.home/build/ivy/lib/Hadoop/common.
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786309#action_12786309 ]

Todd Lipcon commented on MAPREDUCE-1114:
----------------------------------------

When the classpath is resolved, it's written out to a text file named for that variable. Then when it needs to be resolved again, if that file exists, it's loaded rather than re-resolving.

> Speed up ivy resolution in builds with clever caching
> -----------------------------------------------------
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786305#action_12786305 ]

Chris Douglas commented on MAPREDUCE-1114:
------------------------------------------

Then I'm missing something. What is being "cached"?

> Speed up ivy resolution in builds with clever caching
> -----------------------------------------------------
[jira] Commented: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
[ https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786303#action_12786303 ]

YangLai commented on MAPREDUCE-956:
-----------------------------------

I have a scenario where the output of the shuffle phase is exactly what I want, so the sort and reduce phases are unnecessary for me and cause a lot of overhead. I don't know how to get the output of the shuffle phase in Hadoop 0.19.1 or 0.20.1. Maybe the sort phase should be optional to developers.

> Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>
>
> For the progress calculations and displaying on the UI, shuffle, in its current form, is decomposed into three phases (copy/sort/reduce). Actually, the sort phase is no longer applicable. I think we should just reduce the number of phases to two and assign 50% weightage to each of the copy and reduce phases. Thoughts?
[jira] Commented: (MAPREDUCE-1155) Streaming tests swallow exceptions
[ https://issues.apache.org/jira/browse/MAPREDUCE-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786298#action_12786298 ]

Chris Douglas commented on MAPREDUCE-1155:
------------------------------------------

bq. We may as well fix the broken tests now when there's a patch that applies and passes, and worry about style separately.

Only some of the tests this is updating are "broken." Most just use an odd and unnecessary idiom. Style is what this is repairing.

> Streaming tests swallow exceptions
> ----------------------------------
>
>                 Key: MAPREDUCE-1155
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1155
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>
>         Attachments: mapreduce-1155.txt
>
>
> Many of the streaming tests (including TestMultipleArchiveFiles) catch exceptions and print their stack trace rather than failing the job. This means that tests do not fail even when the job fails.
[jira] Commented: (MAPREDUCE-1262) Eclipse Plugin does not build for Hadoop 0.20.1
[ https://issues.apache.org/jira/browse/MAPREDUCE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786285#action_12786285 ]

Hadoop QA commented on MAPREDUCE-1262:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426690/HADOOP-6360.patch
against trunk revision 887135.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                       Please justify why no new tests are needed for this patch.
                       Also please list what manual steps were performed to verify this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    -1 javac. The patch appears to cause tar ant target to fail.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    -1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/292/console

This message is automatically generated.
> Eclipse Plugin does not build for Hadoop 0.20.1
> -----------------------------------------------
[jira] Commented: (MAPREDUCE-744) Support in DistributedCache to share cache files with other users after HADOOP-4493
[ https://issues.apache.org/jira/browse/MAPREDUCE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786279#action_12786279 ]

Devaraj Das commented on MAPREDUCE-744:
---------------------------------------

I should add that "public" means world-readable. The entire hierarchy of the cache file path is checked for that (starting from the leaf filename to "/").

> Support in DistributedCache to share cache files with other users after HADOOP-4493
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-744
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-744
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: tasktracker
>            Reporter: Vinod K V
>
>         Attachments: 744-early.patch
>
>
> HADOOP-4493 aims to completely privatize the files distributed to TT via DistributedCache. This jira issue focuses on sharing some/all of these files with all other users.
[jira] Updated: (MAPREDUCE-744) Support in DistributedCache to share cache files with other users after HADOOP-4493
[ https://issues.apache.org/jira/browse/MAPREDUCE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated MAPREDUCE-744:
----------------------------------

    Attachment: 744-early.patch

Attaching a preliminary patch for review. What's done there is that the cache files are checked at the client side for public/private access, and that information (booleans: true for public, false for private) is passed in the configuration. The TaskTrackers look at the configuration for each file during localization and, if the file was public, localize it to a common space. If not, the file is localized to the user's private directory. Testcase is not there yet.

> Support in DistributedCache to share cache files with other users after HADOOP-4493
> -----------------------------------------------------------------------------------
[jira] Commented: (MAPREDUCE-1050) Introduce a mock object testing framework
[ https://issues.apache.org/jira/browse/MAPREDUCE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786273#action_12786273 ]

Konstantin Boudnik commented on MAPREDUCE-1050:
-----------------------------------------------

Well, still fails on my BSD machine. The message is

{noformat}
TestLostTaskTracker
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.791 sec

------------- Standard Output ---------------
2009-12-04 16:43:13,581 INFO  mapred.JobTracker (JobTracker.java:<init>(1334)) - Starting jobtracker with owner as cos and supergroup as supergroup
2009-12-04 16:43:13,587 INFO  mapred.JobTracker (JobTracker.java:initializeTaskMemoryRelatedConfig(4086)) - Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2009-12-04 16:43:13,590 INFO  util.HostsFileReader (HostsFileReader.java:refresh(81)) - Refreshing hosts (include/exclude) list
2009-12-04 16:43:13,607 INFO  mapred.QueueConfigurationParser (QueueConfigurationParser.java:parseResource(170)) - Bad conf file: top-level element not <queues>
---------------------------------------------

Testcase: testLostTaskTrackerCalledAfterExpiryTime took 0.763 sec
    Caused an ERROR
No queues defined
java.lang.RuntimeException: No queues defined
    at org.apache.hadoop.mapred.QueueConfigurationParser.parseResource(QueueConfigurationParser.java:171)
    at org.apache.hadoop.mapred.QueueConfigurationParser.loadResource(QueueConfigurationParser.java:163)
    at org.apache.hadoop.mapred.QueueConfigurationParser.<init>(QueueConfigurationParser.java:92)
    at org.apache.hadoop.mapred.QueueManager.getQueueConfigurationParser(QueueManager.java:126)
    at org.apache.hadoop.mapred.QueueManager.<init>(QueueManager.java:146)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1376)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1325)
    at org.apache.hadoop.mapred.TestLostTaskTracker.setUp(TestLostTaskTracker.java:58)
{noformat}

The other problem: TestLostTaskTracker is a JUnit v3 test (it extends TestCase, etc.).
Please make it a JUnit v4 test (like the other two tests are).

> Introduce a mock object testing framework
> -----------------------------------------
>
>                 Key: MAPREDUCE-1050
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1050
>             Project: Hadoop Map/Reduce
>          Issue Type: Test
>          Components: test
>            Reporter: Tom White
>            Assignee: Tom White
>
>         Attachments: MAPREDUCE-1050.patch, MAPREDUCE-1050.patch, MAPREDUCE-1050.patch, MAPREDUCE-1050.patch, MAPREDUCE-1050.patch
>
>
> Using mock objects in unit tests can improve code quality (see e.g. http://www.mockobjects.com/). Hadoop would benefit from having a mock object framework for developers to write unit tests with. Doing so will allow a wider range of failure conditions to be tested and the tests will run faster.
[jira] Updated: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common
[ https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-1209:
-----------------------------------

        Fix Version/s: 0.22.0
    Affects Version/s: 0.22.0
                       0.21.0
               Status: Patch Available  (was: Open)

> Move common specific part of the test TestReflectionUtils out of mapred into common
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1209
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1209
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Vinod K V
>            Assignee: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: mapreduce-1209.txt
>
>
> As commented by Tom here (https://issues.apache.org/jira/browse/HADOOP-6230?focusedCommentId=12751058&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12751058), TestReflectionUtils has a single test testSetConf() to test backward compatibility of ReflectionUtils for JobConfigurable objects. TestReflectionUtils can be split into two tests - one in common and one in mapred - this single test may reside in mapred till the mapred package is removed.
[jira] Updated: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common
[ https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-1209:
-----------------------------------

    Attachment: mapreduce-1209.txt

Patch to remove the common stuff from this test. Also fixed line endings to unix while I was at it.

> Move common specific part of the test TestReflectionUtils out of mapred into common
> -----------------------------------------------------------------------------------
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786268#action_12786268 ]

Todd Lipcon commented on MAPREDUCE-1114:
----------------------------------------

bq. Aren't the classpaths named? Would there be a way to short-circuit the resolution if it created/checked for a file mapped to that path?

That is exactly what this patch does...

> Speed up ivy resolution in builds with clever caching
> -----------------------------------------------------
[jira] Assigned: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common
[ https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned MAPREDUCE-1209: -- Assignee: Todd Lipcon > Move common specific part of the test TestReflectionUtils out of mapred into > common > --- > > Key: MAPREDUCE-1209 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1209 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: test >Reporter: Vinod K V >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0 > > > As commented by Tom here > (https://issues.apache.org/jira/browse/HADOOP-6230?focusedCommentId=12751058&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12751058), > TestReflectionUtils has a single test testSetConf() to test backward > compatibility of ReflectionUtils for JobConfigurable objects. > TestReflectionUtils can be split into two tests - one in common and one in > mapred - this single test may reside in mapred till the mapred package is > removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786265#action_12786265 ] Chris Douglas commented on MAPREDUCE-1114: -- bq. Comparing the 15 second payoff to the full build time isn't particular important to me. For me, the ability to quickly iterate on code while recompiling and rerunning unit tests is the big payoff As a vi user, I got that. I haven't argued that the long build times are unimportant, but that a hack introducing a custom caching layer for classpaths is not, in my mind, a justifiable tradeoff in complexity. Maintaining black magic in the build is tedious and avoidable. bq. the slowness is actually in the resolve task which generates the various classpath properties in ant Aren't the classpaths named? Would there be a way to short-circuit the resolution if it created/checked for a file mapped to that path? bq. My most common development cycle is to run a single unit test. For Avro this takes just a few seconds, and I'm willing to wait without finding a new task to work on. As a workaround: depending on how often I'm running it, adding a {{main}} to the unit test is sometimes worthwhile. > Speed up ivy resolution in builds with clever caching > - > > Key: MAPREDUCE-1114 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1114.txt, mapreduce-1114.txt, > mapreduce-1114.txt > > > An awful lot of time is spent in the ivy:resolve parts of the build, even > when all of the dependencies have been fetched and cached. Profiling showed > this was in XML parsing. I have a sort-of-ugly hack which speeds up > incremental compiles (and more importantly "ant test") significantly using > some ant macros to cache the resolved classpaths. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786261#action_12786261 ] Todd Lipcon commented on MAPREDUCE-1254: Curious why the XML reading doesn't fail for an empty file. Emptiness is not valid XML, right? > job.xml should add crc check in tasktracker and sub jvm. > > > Key: MAPREDUCE-1254 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: task, tasktracker >Affects Versions: 0.22.0 >Reporter: ZhuGuanyin > > Currently job.xml in the tasktracker and sub-jvm is written to local disk > through ChecksumFileSystem and already has crc checksum information, but the > job.xml file is loaded without a crc check. This can cause a mapred job to > finish successfully but with wrong data because of a disk error. Example: the > tasktracker and sub task jvm would load the default configuration if they > don't successfully load job.xml, which may replace the mapper with > IdentityMapper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
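The fix being discussed, verifying the checksum when loading job.xml instead of only writing one, can be sketched in a few lines. This is an illustrative sketch using Python's `zlib.crc32` and a sibling `.crc` file, not the Hadoop ChecksumFileSystem API; the function names are hypothetical:

```python
# Hypothetical sketch of "check the crc on read": a truncated or empty
# job.xml no longer loads silently as an (empty) default configuration.
import zlib


def write_with_crc(path, data):
    """Write data and a sibling .crc file holding its crc32."""
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".crc", "w") as f:
        f.write(str(zlib.crc32(data)))


def read_with_crc(path):
    """Read data back, refusing it if the stored crc does not match."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".crc") as f:
        expected = int(f.read())
    if zlib.crc32(data) != expected:
        raise IOError("checksum mismatch for %s" % path)
    return data
```

In the scenario from the description, a disk error that empties job.xml would make `read_with_crc` raise instead of letting the task fall back to IdentityMapper.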
[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786259#action_12786259 ] Zheng Shao commented on MAPREDUCE-1254: --- Got it. It seems like a good idea to read and check the checksum. Will you upload a patch including a simple test case? > job.xml should add crc check in tasktracker and sub jvm. > > > Key: MAPREDUCE-1254 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: task, tasktracker >Affects Versions: 0.22.0 >Reporter: ZhuGuanyin > > Currently job.xml in the tasktracker and sub-jvm is written to local disk > through ChecksumFileSystem and already has crc checksum information, but the > job.xml file is loaded without a crc check. This can cause a mapred job to > finish successfully but with wrong data because of a disk error. Example: the > tasktracker and sub task jvm would load the default configuration if they > don't successfully load job.xml, which may replace the mapper with > IdentityMapper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786256#action_12786256 ] Todd Lipcon commented on MAPREDUCE-1241: bq. -1 core tests. The patch failed core unit tests. Failed org.apache.hadoop.mapred.TestMiniMRWithDFS.testWithDFSWithDefaultPort which is different than the failure in the last build, and entirely unrelated. > JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt, mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-576) writing to status reporter before consuming standard input causes task failure.
[ https://issues.apache.org/jira/browse/MAPREDUCE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved MAPREDUCE-576. --- Resolution: Duplicate > writing to status reporter before consuming standard input causes task > failure. > --- > > Key: MAPREDUCE-576 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-576 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1 > Environment: amazon ec2 instance created with the given scripts > (fedora, small) >Reporter: Karl Anderson >Assignee: Todd Lipcon > > A Hadoop Streaming task which writes a status reporter line before consuming > input causes the task to fail. Writing after consuming input does not fail. > I caused this failure using a Python reducer and writing a > "reporter:status:foo\n" line to stderr. Didn't try writing anything else. > The reducer script which fails:
> #!/usr/bin/env python
> import sys
> if __name__ == "__main__":
>     sys.stderr.write('reporter:status:foo\n')
>     sys.stderr.flush()
>     for line in sys.stdin:
>         print line
> The reducer script which succeeds:
> #!/usr/bin/env python
> import sys
> if __name__ == "__main__":
>     for line in sys.stdin:
>         sys.stderr.write('reporter:status:foo\n')
>         sys.stderr.flush()
>         print line
> The hadoop invocation which I used:
> hadoop jar /usr/local/hadoop-0.18.1/contrib/streaming/hadoop-0.18.1-streaming.jar -mapper cat -reducer ./reducer_foo.py -input vectors -output clusters_1 -jobconf mapred.map.tasks=512 -jobconf mapred.reduce.tasks=512 -file ./reducer_foo.py
> This is on a 64 node hadoop-ec2 cluster.
> One of the errors listed on the failures page (they all appear to be the > same): > java.io.IOException: subprocess exited successfully > R/W/S=1/0/0 in:0=1/41 [rec/s] out:0=0/41 [rec/s] > minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null > HOST=null > USER=root > HADOOP_USER=null > last Hadoop input: |null| > last tool output: |null| > Date: Mon Oct 20 19:13:38 EDT 2008 > MROutput/MRErrThread failed:java.lang.NullPointerException > at > org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.setStatus(PipeMapRed.java:497) > at > org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:429) > at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:103) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) > The stderr log for a failed task: > Exception in thread "Timer thread for monitoring mapred" > java.lang.NullPointerException > at > org.apache.hadoop.metrics.ganglia.GangliaContext.xdr_string(GangliaContext.java:195) > at > org.apache.hadoop.metrics.ganglia.GangliaContext.emitMetric(GangliaContext.java:138) > at > org.apache.hadoop.metrics.ganglia.GangliaContext.emitRecord(GangliaContext.java:123) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.emitRecords(AbstractMetricsContext.java:304) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:290) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249) > at java.util.TimerThread.mainLoop(Timer.java:512) > at java.util.TimerThread.run(Timer.java:462) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786249#action_12786249 ] Hadoop QA commented on MAPREDUCE-1241: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426955/mapreduce-1241.txt against trunk revision 887135. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/166/console This message is automatically generated. 
> JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt, mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-576) writing to status reporter before consuming standard input causes task failure.
[ https://issues.apache.org/jira/browse/MAPREDUCE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned MAPREDUCE-576: - Assignee: Todd Lipcon > writing to status reporter before consuming standard input causes task > failure. > --- > > Key: MAPREDUCE-576 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-576 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1 > Environment: amazon ec2 instance created with the given scripts > (fedora, small) >Reporter: Karl Anderson >Assignee: Todd Lipcon > > A Hadoop Streaming task which writes a status reporter line before consuming > input causes the task to fail. Writing after consuming input does not fail. > I caused this failure using a Python reducer and writing a > "reporter:status:foo\n" line to stderr. Didn't try writing anything else. > The reducer script which fails:
> #!/usr/bin/env python
> import sys
> if __name__ == "__main__":
>     sys.stderr.write('reporter:status:foo\n')
>     sys.stderr.flush()
>     for line in sys.stdin:
>         print line
> The reducer script which succeeds:
> #!/usr/bin/env python
> import sys
> if __name__ == "__main__":
>     for line in sys.stdin:
>         sys.stderr.write('reporter:status:foo\n')
>         sys.stderr.flush()
>         print line
> The hadoop invocation which I used:
> hadoop jar /usr/local/hadoop-0.18.1/contrib/streaming/hadoop-0.18.1-streaming.jar -mapper cat -reducer ./reducer_foo.py -input vectors -output clusters_1 -jobconf mapred.map.tasks=512 -jobconf mapred.reduce.tasks=512 -file ./reducer_foo.py
> This is on a 64 node hadoop-ec2 cluster.
> One of the errors listed on the failures page (they all appear to be the > same): > java.io.IOException: subprocess exited successfully > R/W/S=1/0/0 in:0=1/41 [rec/s] out:0=0/41 [rec/s] > minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null > HOST=null > USER=root > HADOOP_USER=null > last Hadoop input: |null| > last tool output: |null| > Date: Mon Oct 20 19:13:38 EDT 2008 > MROutput/MRErrThread failed:java.lang.NullPointerException > at > org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.setStatus(PipeMapRed.java:497) > at > org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:429) > at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:103) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) > The stderr log for a failed task: > Exception in thread "Timer thread for monitoring mapred" > java.lang.NullPointerException > at > org.apache.hadoop.metrics.ganglia.GangliaContext.xdr_string(GangliaContext.java:195) > at > org.apache.hadoop.metrics.ganglia.GangliaContext.emitMetric(GangliaContext.java:138) > at > org.apache.hadoop.metrics.ganglia.GangliaContext.emitRecord(GangliaContext.java:123) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.emitRecords(AbstractMetricsContext.java:304) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:290) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50) > at > org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249) > at java.util.TimerThread.mainLoop(Timer.java:512) > at java.util.TimerThread.run(Timer.java:462) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786234#action_12786234 ] Todd Lipcon commented on MAPREDUCE-1114: Doug: the slowness is actually in the resolve task which generates the various classpath properties in ant. Without caching those properties to disk, there's no way to get around running ivy that I can think of. This patch essentially persists them to disk between runs, since the majority of the time they don't change. > Speed up ivy resolution in builds with clever caching > - > > Key: MAPREDUCE-1114 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1114.txt, mapreduce-1114.txt, > mapreduce-1114.txt > > > An awful lot of time is spent in the ivy:resolve parts of the build, even > when all of the dependencies have been fetched and cached. Profiling showed > this was in XML parsing. I have a sort-of-ugly hack which speeds up > incremental compiles (and more importantly "ant test") significantly using > some ant macros to cache the resolved classpaths. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
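The idea Todd describes, persisting the resolved classpath properties to disk between runs because they rarely change, can be sketched roughly as follows. The actual patch does this with ant macros around ivy:resolve; the function names and the mtime-based staleness check here are assumptions for illustration only:

```python
# Hypothetical sketch of classpath caching: reuse a persisted classpath
# while ivy.xml is unchanged, otherwise run the slow resolve and save it.
import os


def cached_classpath(cache_file, ivy_file, resolve):
    """Return the classpath from cache_file if it is at least as new as
    ivy_file; otherwise call resolve() (stands in for ivy:resolve) and
    persist its result for the next build."""
    if (os.path.exists(cache_file)
            and os.path.getmtime(cache_file) >= os.path.getmtime(ivy_file)):
        with open(cache_file) as f:
            return f.read()
    classpath = resolve()  # the expensive XML-parsing resolution step
    with open(cache_file, "w") as f:
        f.write(classpath)
    return classpath
```

On an incremental build the cache file is fresh, so the resolve step is skipped entirely, which is where the "ant test" speedup comes from.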
[jira] Created: (MAPREDUCE-1268) Update streaming tests to JUnit 4 style
Update streaming tests to JUnit 4 style --- Key: MAPREDUCE-1268 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1268 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming, test Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Suggested by Chris in MAPREDUCE-1155 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1155) Streaming tests swallow exceptions
[ https://issues.apache.org/jira/browse/MAPREDUCE-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786232#action_12786232 ] Todd Lipcon commented on MAPREDUCE-1155: Chris: mind if we do that in a separate JIRA? I opened MAPREDUCE-1268. We may as well fix the broken tests now when there's a patch that applies and passes, and worry about style separately. > Streaming tests swallow exceptions > -- > > Key: MAPREDUCE-1155 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1155 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1, 0.21.0, 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1155.txt > > > Many of the streaming tests (including TestMultipleArchiveFiles) catch > exceptions and print their stack trace rather than failing the job. This > means that tests do not fail even when the job fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1267) Fix typo in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-1267: --- Status: Patch Available (was: Open) > Fix typo in mapred-default.xml > -- > > Key: MAPREDUCE-1267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0, 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1267.txt > > > There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of > mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial > patch to fix. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1267) Fix typo in mapred-default.xml
Fix typo in mapred-default.xml -- Key: MAPREDUCE-1267 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.21.0, 0.22.0 Attachments: mapreduce-1267.txt There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial patch to fix. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1267) Fix typo in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-1267: --- Attachment: mapreduce-1267.txt Should be committed to both 0.21 and trunk > Fix typo in mapred-default.xml > -- > > Key: MAPREDUCE-1267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1267 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0, 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1267.txt > > > There's a typo of mapreduce.client.progerssmonitor.pollinterval instead of > mapreduce.client.progressmonitor.pollinterval in mapred-default. Trivial > patch to fix. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786225#action_12786225 ] Doug Cutting commented on MAPREDUCE-1114: - > I look at this as a 60% speedup in my development cycle rather than a few % > speedup in the full build. I agree with this logic. My most common development cycle is to run a single unit test. For Avro this takes just a few seconds, and I'm willing to wait without finding a new task to work on. With Hadoop this takes long enough that I switch to doing something else, lose my context, etc. Improving this significantly will significantly improve many developers' productivity. I wonder if we can simply check if build/ivy/lib/Hadoop-Hdfs/{common,test} exist, and, if they do, assume they're up-to-date, and only run Ivy otherwise. Might that be simpler? > Speed up ivy resolution in builds with clever caching > - > > Key: MAPREDUCE-1114 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1114.txt, mapreduce-1114.txt, > mapreduce-1114.txt > > > An awful lot of time is spent in the ivy:resolve parts of the build, even > when all of the dependencies have been fetched and cached. Profiling showed > this was in XML parsing. I have a sort-of-ugly hack which speeds up > incremental compiles (and more importantly "ant test") significantly using > some ant macros to cache the resolved classpaths. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters
[ https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786219#action_12786219 ] Todd Lipcon commented on MAPREDUCE-1266: Well, actually, in trunk there's mapreduce.jobtracker.heartbeats.in.second which sets the individual trackers such that that number of heartbeats arrive every second. The default is 100, which would be a 10ms interval for a pseudo-distributed cluster, which is silly. So there's a minimum as well, hardcoded. Here's the relevant code:
{code}
int heartbeatInterval = Math.max(
    (int)(1000 * HEARTBEATS_SCALING_FACTOR *
        Math.ceil((double)clusterSize / NUM_HEARTBEATS_IN_SECOND)),
    HEARTBEAT_INTERVAL_MIN);
{code}
HEARTBEAT_INTERVAL_MIN is hard coded to 3 seconds in MRConstants.java. Maybe I'm misunderstanding your question - are you in support of lowering the minimum and just asking why make it undocumented-configurable instead of hardcoded? I was offering the undocumented configuration option just in case someone had an argument against this change. If everyone's for it, happy to just change the constant. > Allow heartbeat interval smaller than 3 seconds for tiny clusters > - > > Key: MAPREDUCE-1266 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobtracker, task, tasktracker >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Priority: Minor > > For small clusters, the heartbeat interval has a large effect on job latency. > This is especially true on pseudo-distributed or other "tiny" (<5 nodes) > clusters. It's not a big deal for production, but new users would have a > happier first experience if Hadoop seemed snappier. > I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps > 0.5 seconds (but have it governed by an undocumented config parameter in case > people don't like this change). 
The cluster size-based ramp up of interval > will maintain the current scalable behavior for large clusters with no > negative effect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
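The quoted formula can be transliterated into a small sketch to show why the hardcoded floor is what tiny clusters actually see. The constant values below are the defaults mentioned in the comment (100 heartbeats per second, a 3-second minimum); the scaling factor of 1.0 is an assumption:

```python
# Sketch of the trunk heartbeat-interval computation quoted above.
# Names mirror the Java constants; HEARTBEATS_SCALING_FACTOR = 1.0 is assumed.
import math

HEARTBEATS_SCALING_FACTOR = 1.0
NUM_HEARTBEATS_IN_SECOND = 100      # default cited in the comment
HEARTBEAT_INTERVAL_MIN_MS = 3000    # hardcoded 3 s in MRConstants.java


def heartbeat_interval_ms(cluster_size):
    scaled = int(1000 * HEARTBEATS_SCALING_FACTOR *
                 math.ceil(cluster_size / NUM_HEARTBEATS_IN_SECOND))
    return max(scaled, HEARTBEAT_INTERVAL_MIN_MS)
```

For a 1-node pseudo-distributed cluster the scaled term is only 1000 ms, so the 3000 ms floor dominates; the floor stops mattering once the cluster passes 300 nodes, which is why lowering the minimum affects only tiny clusters.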
[jira] Updated: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5
[ https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-1097: Fix Version/s: 0.21.0 Status: Patch Available (was: Open) > Changes/fixes to support Vertica 3.5 > > > Key: MAPREDUCE-1097 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.5 >Reporter: Omer Trajman >Assignee: Omer Trajman >Priority: Minor > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1097.patch > > > Vertica 3.5 includes three changes that the formatters should handle: > 1) deploy_design function that handles much of the logic in the optimize > method. This improvement uses deploy_design if the server version supports > it instead of orchestrating in the formatter function. > 2) truncate table instead of recreating the table > 3) numeric, decimal, money, number types (all the same path) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1257) Ability to grab the number of spills
[ https://issues.apache.org/jira/browse/MAPREDUCE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786216#action_12786216 ] Todd Lipcon commented on MAPREDUCE-1257: Chris: I don't feel strongly about this. I like it for the exact reason you mentioned - makes it easier to tune io.sort.record.percent (or at least see at a glance whether such tuning could help). My plan was to backport it into our distribution for 20, where a backport of MAPREDUCE-64 is pretty unlikely since that change is much riskier. If no one else wants this, happy to resolve as wontfix. Would be interested to hear from the original reporter, though, before doing so. > Ability to grab the number of spills > > > Key: MAPREDUCE-1257 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1257 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Affects Versions: 0.22.0 >Reporter: Sriranjan Manjunath >Assignee: Todd Lipcon > Fix For: 0.22.0 > > Attachments: mapreduce-1257.txt > > > The counters should have information about the number of spills in addition > to the number of spill records. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5
[ https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-1097: Status: Open (was: Patch Available) wrong target > Changes/fixes to support Vertica 3.5 > > > Key: MAPREDUCE-1097 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.5 >Reporter: Omer Trajman >Assignee: Omer Trajman >Priority: Minor > Attachments: MAPREDUCE-1097.patch > > > Vertica 3.5 includes three changes that the formatters should handle: > 1) deploy_design function that handles much of the logic in the optimize > method. This improvement uses deploy_design if the server version supports > it instead of orchestrating in the formatter function. > 2) truncate table instead of recreating the table > 3) numeric, decimal, money, number types (all the same path) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters
[ https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786213#action_12786213 ] Allen Wittenauer commented on MAPREDUCE-1266: - I'm probably being forgetful, but... we have: a) heartbeat interval b) minimum heartbeat interval such that a > b, always. If someone doesn't like b, does it matter? Wouldn't they just tune a? I guess I'm asking: why make b configurable at all? > Allow heartbeat interval smaller than 3 seconds for tiny clusters > - > > Key: MAPREDUCE-1266 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobtracker, task, tasktracker >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Priority: Minor > > For small clusters, the heartbeat interval has a large effect on job latency. > This is especially true on pseudo-distributed or other "tiny" (<5 nodes) > clusters. It's not a big deal for production, but new users would have a > happier first experience if Hadoop seemed snappier. > I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps > 0.5 seconds (but have it governed by an undocumented config parameter in case > people don't like this change). The cluster size-based ramp up of interval > will maintain the current scalable behavior for large clusters with no > negative effect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786211#action_12786211 ] Todd Lipcon commented on MAPREDUCE-1114: bq. I don't think the 15 second payoff justifies the maintenance cost of a custom caching layer for ivy. Comparing the 15 second payoff to the full build time isn't particularly important to me. For me, the ability to quickly iterate on code while recompiling and rerunning unit tests is the big payoff - so I look at this as a 60% speedup in my development cycle rather than a few % speedup in the full build. I may be in the minority, though, as I don't use eclipse or any other fancy IDE that does incremental compilation. Anyone else care to chime in? > Speed up ivy resolution in builds with clever caching > - > > Key: MAPREDUCE-1114 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1114.txt, mapreduce-1114.txt, > mapreduce-1114.txt > > > An awful lot of time is spent in the ivy:resolve parts of the build, even > when all of the dependencies have been fetched and cached. Profiling showed > this was in XML parsing. I have a sort-of-ugly hack which speeds up > incremental compiles (and more importantly "ant test") significantly using > some ant macros to cache the resolved classpaths. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters
Allow heartbeat interval smaller than 3 seconds for tiny clusters - Key: MAPREDUCE-1266 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker, task, tasktracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Minor For small clusters, the heartbeat interval has a large effect on job latency. This is especially true on pseudo-distributed or other "tiny" (<5 nodes) clusters. It's not a big deal for production, but new users would have a happier first experience if Hadoop seemed snappier. I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 0.5 seconds (but have it governed by an undocumented config parameter in case people don't like this change). The cluster size-based ramp up of interval will maintain the current scalable behavior for large clusters with no negative effect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786201#action_12786201 ] Hadoop QA commented on MAPREDUCE-1265: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426933/MAPREDUCE-1265-v2.patch against trunk revision 887135. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/291/console This message is automatically generated. 
> Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When task attempt receive an error, TaskInProgress will log the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine for example, a node has been blacklisted in > the past few days. > We have to use the task attempt id to find the TT. This is not very > convenient. > It will be nice if we can also log the tasktracker which causes this error. > This way we can just grep the hostname to quickly find all the relevant error > message. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-1244) eclipse-plugin fails with missing dependencies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved MAPREDUCE-1244. -- Resolution: Fixed Fix Version/s: 0.21.0 Hadoop Flags: [Reviewed] We need to apply this fix to 0.21 also. > eclipse-plugin fails with missing dependencies > -- > > Key: MAPREDUCE-1244 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1244 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Affects Versions: 0.22.0 >Reporter: Giridharan Kesavan >Assignee: Giridharan Kesavan > Fix For: 0.21.0, 0.22.0 > > Attachments: mapred-1244.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (MAPREDUCE-1244) eclipse-plugin fails with missing dependencies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reopened MAPREDUCE-1244: -- We need to apply this to 0.21 also. > eclipse-plugin fails with missing dependencies > -- > > Key: MAPREDUCE-1244 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1244 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Affects Versions: 0.22.0 >Reporter: Giridharan Kesavan >Assignee: Giridharan Kesavan > Fix For: 0.21.0, 0.22.0 > > Attachments: mapred-1244.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words
[ https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1174: - Attachment: MAPREDUCE-1174.2.patch Freshly cut patch. > Sqoop improperly handles table/column names which are reserved sql words > > > Key: MAPREDUCE-1174 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch > > > In some databases it is legal to name tables and columns with terms that > overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such > cases, the database allows you to escape the table and column names. We > should always escape table and column names when possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words
[ https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1174: - Assignee: zhiyong zhang (was: Aaron Kimball) Status: Patch Available (was: Open) > Sqoop improperly handles table/column names which are reserved sql words > > > Key: MAPREDUCE-1174 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: zhiyong zhang > Attachments: MAPREDUCE-1174.2.patch, MAPREDUCE-1174.patch > > > In some databases it is legal to name tables and columns with terms that > overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such > cases, the database allows you to escape the table and column names. We > should always escape table and column names when possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
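One common way to escape identifiers that collide with SQL reserved keywords, as the MAPREDUCE-1174 description suggests, is to wrap them in the database's identifier quote character. The sketch below uses the SQL-standard double quote and is purely illustrative; the actual Sqoop patch may use the per-database quoting rules instead (e.g. backticks for MySQL), and the method name here is hypothetical.

```java
// Hypothetical sketch of identifier escaping so reserved words like
// CREATE or TABLE are safe in generated SQL. Uses the SQL-standard
// double-quote delimiter; embedded quotes are doubled.
public class IdentifierEscape {
    public static String escape(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // A column named after a reserved word becomes legal when quoted.
        System.out.println("SELECT " + escape("CREATE") + " FROM " + escape("table"));
    }
}
```

Always escaping (rather than maintaining a reserved-word list per database) is the simpler policy the issue description advocates, since quoting an ordinary identifier is harmless.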
[jira] Commented: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786163#action_12786163 ] Hadoop QA commented on MAPREDUCE-1230: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425750/MAPREDUCE-1230.patch against trunk revision 887135. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/165/console This message is automatically generated. 
> Vertica streaming adapter doesn't handle nulls in all cases > --- > > Key: MAPREDUCE-1230 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+ >Reporter: Omer Trajman >Assignee: Omer Trajman > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1230.patch > > > Test user reported that Vertica adapter throws an npe when retrieving null > values for certain types (binary, numeric both reported). There is no > special case handling when serializing nulls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1262) Eclipse Plugin does not build for Hadoop 0.20.1
[ https://issues.apache.org/jira/browse/MAPREDUCE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Watt updated MAPREDUCE-1262: Fix Version/s: 0.22.0 0.21.0 0.20.2 Affects Version/s: 0.22.0 0.21.0 0.20.2 Status: Patch Available (was: Open) Moving to "Patch Available" - There is no supporting test as this resolves an issue with the build. The Build itself is the test. > Eclipse Plugin does not build for Hadoop 0.20.1 > --- > > Key: MAPREDUCE-1262 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1262 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 > Environment: SLES 10, Mac OS/X 10.5.8 >Reporter: Stephen Watt > Fix For: 0.20.2, 0.21.0, 0.22.0, 0.20.1 > > Attachments: hadoop-0.20.1-eclipse-plugin.jar, HADOOP-6360.patch > > > When trying to run the build script for the Eclipse Plugin in > src/contrib/eclipse-plugin there are several errors a user receives. The > first error is that the eclipse.home is not set. This is easily remedied by > adding a value for eclipse.home in the build.properties file in the > eclipse-plugin directory. > The script then states it cannot compile > org.apache.hadoop.eclipse.launch.HadoopApplicationLaunchShortcut because it > cannot resolve JavaApplicationLaunchShortcut on line 35: > import > org.eclipse.jdt.internal.debug.ui.launcher.JavaApplicationLaunchShortcut; > and fails > I believe this is because there is no jar in the eclipse.home/plugins that > has this class in that package. I did however find it in > org.eclipse.jdt.debug.ui.launchConfigurations.JavaApplicationLaunchShortcut > which was inside in org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar in the > plugins dir of Eclipse 3.5 > Changing the import in the class in the source to the latter allows the build > to complete successfully. The M/R Perspective opens and works on my SLES 10 > Linux environment but not on my Macbook Pro. Both are running Eclipse 3.5. 
> To users wanting to do the same, I built this inside Eclipse. To do that I > added org.eclipse.jdt.debug.ui_3.4.1.v20090811_r351.jar and > hadoop-0.20.1-core.jar to the ant runtime configuration classpath. I also had > to set the version value=0.20.1 in the build.properties. You will also need > to copy hadoop-0.20.1-core.jar to hadoop.home/build and commons-cli-1.2.jar > to hadoop.home/build/ivy/lib/Hadoop/common. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-1241: --- Status: Open (was: Patch Available) > JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt, mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-1241: --- Attachment: mapreduce-1241.txt Adds license to mapred-queues-default.xml. Since we're now treating them as separate files, I also got rid of all of the documentation-y comments from -default. > JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt, mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-1241: --- Status: Patch Available (was: Open) > JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt, mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
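The fallback behavior MAPREDUCE-1241 asks for can be sketched as below: if the queue configuration file is absent (as in an old configuration directory), assume a single default queue instead of dereferencing a missing resource and throwing a NullPointerException. The class and method names are illustrative, not the actual JobTracker code.

```java
import java.io.File;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of graceful handling of a missing mapred-queues.xml:
// fall back to one "default" queue rather than crashing at startup.
public class QueueConfigSketch {
    public static List<String> loadQueues(File queuesXml) {
        if (queuesXml == null || !queuesXml.exists()) {
            // Old config dirs have no mapred-queues.xml; assume "default".
            return Collections.singletonList("default");
        }
        // Parsing of the real XML file is omitted from this sketch.
        throw new UnsupportedOperationException("XML parsing not sketched");
    }

    public static void main(String[] args) {
        System.out.println(loadQueues(new File("/nonexistent/mapred-queues.xml")));
    }
}
```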
[jira] Commented: (MAPREDUCE-177) Hadoop performance degrades significantly as more and more jobs complete
[ https://issues.apache.org/jira/browse/MAPREDUCE-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786104#action_12786104 ] Allen Wittenauer commented on MAPREDUCE-177: What is the latest status of this patch? It doesn't appear to be committed or, heck, even resolved as to how the fix is going to be applied. > Hadoop performance degrades significantly as more and more jobs complete > > > Key: MAPREDUCE-177 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-177 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Runping Qi >Assignee: Ioannis Koltsidas >Priority: Critical > Attachments: HADOOP-4766-v1.patch, HADOOP-4766-v2.10.patch, > HADOOP-4766-v2.4.patch, HADOOP-4766-v2.6.patch, HADOOP-4766-v2.7-0.18.patch, > HADOOP-4766-v2.7-0.19.patch, HADOOP-4766-v2.7.patch, > HADOOP-4766-v2.8-0.18.patch, HADOOP-4766-v2.8-0.19.patch, > HADOOP-4766-v2.8.patch, HADOOP-4766-v3.4-0.19.patch, map_scheduling_rate.txt > > > When I ran the gridmix 2 benchmark load on a fresh cluster of 500 nodes with > hadoop trunk, > the gridmix load, consisting of 202 map/reduce jobs of various sizes, > completed in 32 minutes. > Then I ran the same set of the jobs on the same cluster, they completed in 43 > minutes. > When I ran them the third time, it took (almost) forever --- the job tracker > became non-responsive. > The job tracker's heap size was set to 2GB. > The cluster is configured to keep up to 500 jobs in memory. > The job tracker kept one cpu busy all the time. Looks like it was due to GC. > I believe releases 0.18 and 0.19 have similar behavior. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-181) Secure job submission
[ https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786085#action_12786085 ] Devaraj Das commented on MAPREDUCE-181: --- On the failing tests, failure of TestGridmixSubmission is a known issue. The other two tests don't fail on my local machine.. > Secure job submission > -- > > Key: MAPREDUCE-181 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-181 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amar Kamat >Assignee: Devaraj Das > Fix For: 0.22.0 > > Attachments: 181-1.patch, 181-2.patch, 181-3.patch, 181-3.patch, > 181-4.patch, hadoop-3578-branch-20-example-2.patch, > hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, > HADOOP-3578-v2.7.patch, MAPRED-181-v3.32.patch, MAPRED-181-v3.8.patch > > > Currently the jobclient accesses the {{mapred.system.dir}} to add job > details. Hence the {{mapred.system.dir}} has the permissions of > {{rwx-wx-wx}}. This could be a security loophole where the job files might > get overwritten/tampered after the job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-967) TaskTracker does not need to fully unjar job jars
[ https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786077#action_12786077 ] Tom White commented on MAPREDUCE-967: - +1 This looks good to me. bq. One question for reviewer: the constant for the new configuration key is in JobContext, whereas the default is in JobConf. I was following some other examples from the code, but it seems a little bit messy here. Where are the right places to add new configuration parameters that work in both APIs? The key should certainly go in JobContext, but where the default is located is less clear. Defaults tend to be defined in the class that they are used, which is JobConf in this case. However, JobConf is deprecated and will disappear, although it may still be used by the implementation (i.e. not be a part of the public API), in which case what you have done is fine. > TaskTracker does not need to fully unjar job jars > - > > Key: MAPREDUCE-967 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-967 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: tasktracker >Affects Versions: 0.21.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt, > mapreduce-967.txt, mapreduce-967.txt > > > In practice we have seen some users submitting job jars that consist of > 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning > up after them has a significant cost (both in wall clock and in unnecessary > heavy disk utilization). This cost can be easily avoided -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
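The core idea behind MAPREDUCE-967 — serve classes straight out of the job jar instead of unpacking 10,000+ class files onto disk — can be sketched with a plain `URLClassLoader`, which resolves classes from a jar on demand without extracting anything. This is an illustrative sketch, not the actual patch; the real change presumably still has to extract any resources tasks expect to find on disk.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch: load classes from a job jar via a classloader
// rather than unjarring it into mapred.local.dir. Nothing is written
// to disk; the jar is read lazily as classes are requested.
public class JarClassLoaderSketch {
    // Returns the loaded class's name, or null on failure. The jar path
    // here is illustrative; a missing jar is fine for classes that the
    // parent loader can resolve (normal parent-first delegation).
    public static String nameOf(String jarUrl, String className) {
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { new URL(jarUrl) },
                                    JarClassLoaderSketch.class.getClassLoader())) {
            return loader.loadClass(className).getName();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf("file:///tmp/job.jar", "java.lang.String"));
    }
}
```

The wall-clock and disk-utilization savings the description mentions come from skipping both the unpack at task start and the recursive delete at cleanup.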
[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1265: -- Description: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine for example, a node has been blacklisted in the past few days. We have to use the task attempt id to find the TT. This is not very convenient. It will be nice if we can also log the tasktracker which causes this error. This way we can just grep the hostname to quickly find all the relevant error message. was: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine for example, a blacklisted node. We have to use the task attempt id to find the TT. This is not very convenient. It will be nice if we can also log the tasktracker which cauces this error. This way we can just grep the hostname to quickly find all the relevant error message. 
> Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When task attempt receive an error, TaskInProgress will log the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine for example, a node has been blacklisted in > the past few days. > We have to use the task attempt id to find the TT. This is not very > convenient. > It will be nice if we can also log the tasktracker which causes this error. > This way we can just grep the hostname to quickly find all the relevant error > message. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1265: -- Description: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine for example, a blacklisted node. We have to use the task attempt id to find the TT. This is not very convenient. It will be nice if we can also log the tasktracker which cauces this error. This way we can just grep the hostname to quickly find all the relevant error message. was: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine for example, a blacklisted node. We have to use the task attempt id to find these information. This is not very convenient. It will be nice if we can also log the tasktracker which cauces this error. This way we can just grep the hostname to quickly find all the relevant error message. 
> Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When task attempt receive an error, TaskInProgress will log the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine for example, a blacklisted node. > We have to use the task attempt id to find the TT. This is not very > convenient. > It will be nice if we can also log the tasktracker which cauces this error. > This way we can just grep the hostname to quickly find all the relevant error > message. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1265: -- Fix Version/s: 0.22.0 Affects Version/s: 0.22.0 Status: Patch Available (was: Open) > Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When task attempt receive an error, TaskInProgress will log the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine for example, a blacklisted node. > We have to use the task attempt id to find these information. This is not > very convenient. > It will be nice if we can also log the tasktracker which cauces this error. > This way we can just grep the hostname to quickly find all the relevant error > message. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786066#action_12786066 ] Scott Chen commented on MAPREDUCE-1265: --- I just realized that the job id is part of the task attempt id, so we can easily obtain that; we only need to log the tasktracker name here. Here is the log after the change: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1 *on tracker_m01.aaa.com*: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0 *on tracker_m02.aaa.com*: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! > Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 0.22.0 >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When task attempt receive an error, TaskInProgress will log the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine for example, a blacklisted node. > We have to use the task attempt id to find these information. This is not > very convenient. > It will be nice if we can also log the tasktracker which cauces this error. > This way we can just grep the hostname to quickly find all the relevant error > message. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
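The log-line shape Scott shows above can be sketched as a simple formatting change: append the tasktracker name after the attempt id, so a plain grep on the hostname finds every error from that node. The method name below is illustrative, not the actual TaskInProgress code.

```java
// Hypothetical sketch of the MAPREDUCE-1265 log format: include the
// tasktracker name so errors are grep-able by hostname.
public class ErrorLogSketch {
    public static String errorLine(String attemptId, String trackerName, String diag) {
        return "Error from " + attemptId + " on " + trackerName + ": " + diag;
    }

    public static void main(String[] args) {
        // e.g. grep "tracker_m01.aaa.com" jobtracker.log finds this line.
        System.out.println(errorLine("attempt_2009__r_09_1",
                                     "tracker_m01.aaa.com",
                                     "java.lang.OutOfMemoryError: Java heap space"));
    }
}
```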
[jira] Updated: (MAPREDUCE-1265) Include tasktracker name in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1265: -- Description: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine for example, a blacklisted node. We have to use the task attempt id to find these information. This is not very convenient. It will be nice if we can also log the tasktracker which cauces this error. This way we can just grep the hostname to quickly find all the relevant error message. was: When task attempt receive an error, TaskInProgress will log the task attempt id and diagnosis string in the JobTracker log. Ex: 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: Java heap space 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 failed to report status for 601 seconds. Killing! When we want to debug a machine or a job. We have to use the task attempt id to find these information. It will be much more convenient if we can just log them together. This way we can just grep the jobId or hostname to quickly find all the relevant error message. 
Summary: Include tasktracker name in the task attempt error log (was: Include jobId and hostname in the task attempt error log) > Include tasktracker name in the task attempt error log > -- > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When a task attempt receives an error, TaskInProgress logs the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine, for example a blacklisted node, > we have to use the task attempt id to find this information. This is not > very convenient. > It would be nice if we could also log the tasktracker that causes this error. > This way we can just grep the hostname to quickly find all the relevant error > messages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1265) Include jobId and hostname in the task attempt error log
[ https://issues.apache.org/jira/browse/MAPREDUCE-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1265: -- Attachment: MAPREDUCE-1265-v2.patch > Include jobId and hostname in the task attempt error log > > > Key: MAPREDUCE-1265 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1265 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Scott Chen >Assignee: Scott Chen >Priority: Trivial > Attachments: MAPREDUCE-1265-v2.patch, MAPREDUCE-1265.patch > > > When a task attempt receives an error, TaskInProgress logs the task attempt > id and diagnosis string in the JobTracker log. > Ex: > 2009-xx-xx 23:50:45,994 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__r_09_1: Error: java.lang.OutOfMemoryError: > Java heap space > 2009-xx-xx 22:53:53,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error > from attempt_2009__m_000478_0: Task attempt_2009__m_000478_0 > failed to report status for 601 seconds. Killing! > When we want to debug a machine or a job, we have to use the task attempt id > to find this information. > It will be much more convenient if we can just log them together. > This way we can just grep the jobId or hostname to quickly find all the > relevant error messages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786057#action_12786057 ] Owen O'Malley commented on MAPREDUCE-1241: -- I can see not having the Apache license in the template files, but I think we should have it in the default files. Please add it to the new one. > JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
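The behaviour the issue asks for is a guard at configuration-load time: a missing mapred-queues.xml should yield a single default queue rather than a NullPointerException. A sketch of that fallback, with illustrative names rather than the actual JobTracker code:

```java
import java.io.File;

// If the queues config file is absent (e.g. an old conf directory),
// assume one "default" queue instead of crashing.
class QueueConfigSketch {
    static String[] loadQueueNames(File queuesXml) {
        if (queuesXml == null || !queuesXml.exists()) {
            return new String[] { "default" };  // graceful fallback
        }
        return parseQueueNames(queuesXml);      // normal path
    }

    // Stand-in for real XML parsing of mapred-queues.xml.
    static String[] parseQueueNames(File f) {
        return new String[] { "default" };
    }
}
```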
[jira] Commented: (MAPREDUCE-1263) Hudson doesn't run MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786053#action_12786053 ] Tom White commented on MAPREDUCE-1263: -- I would like to see these tests renamed to be benchmarks and put in a separate tree, since "tests" to me suggest unit tests. This would make them more findable too. (This wouldn't prevent bitrot, though.) > Hudson doesn't run MapredTestDriver > --- > > Key: MAPREDUCE-1263 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1263 > Project: Hadoop Map/Reduce > Issue Type: Test >Reporter: Eli Collins > > It doesn't look like > src/test/mapred/org/apache/hadoop/test/MapredTestDriver.java is being run by > Hudson. There are no results for MRReliabilityTest, DFSCIOTest or other tests > that live under src/test/mapred/org/apache/hadoop that don't have file names > beginning with Test (ie not picked up by junit). Intentional? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called
[ https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786045#action_12786045 ] Hudson commented on MAPREDUCE-1152: --- Integrated in Hadoop-Mapreduce-trunk #164 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/]) . Distinguish between failed and killed tasks in JobTrackerInstrumentation. Contributed by Sharad Agarwal > JobTrackerInstrumentation.killed{Map/Reduce} is never called > > > Key: MAPREDUCE-1152 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Sharad Agarwal > Fix For: 0.22.0 > > Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch > > > The JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of > MAPREDUCE-1103 are not captured -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1260) Update Eclipse configuration to match changes to Ivy configuration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786041#action_12786041 ] Hudson commented on MAPREDUCE-1260: --- Integrated in Hadoop-Mapreduce-trunk #164 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/]) > Update Eclipse configuration to match changes to Ivy configuration > -- > > Key: MAPREDUCE-1260 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1260 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Affects Versions: 0.22.0 >Reporter: Edwin Chan > Fix For: 0.22.0 > > Attachments: mapReduceClasspath.patch > > > The .eclipse_templates/.classpath file doesn't match the Ivy configuration, > so I've updated it to match. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1249) mapreduce.reduce.shuffle.read.timeout's default value should be 3 minutes, in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786043#action_12786043 ] Hudson commented on MAPREDUCE-1249: --- Integrated in Hadoop-Mapreduce-trunk #164 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/]) . Update config default value for socket read timeout to match code default. Contributed by Amareshwari Sriramadasu > mapreduce.reduce.shuffle.read.timeout's default value should be 3 minutes, in > mapred-default.xml > > > Key: MAPREDUCE-1249 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1249 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: task >Affects Versions: 0.21.0 >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu >Priority: Blocker > Fix For: 0.21.0 > > Attachments: patch-1249-1.txt, patch-1249.txt > > > mapreduce.reduce.shuffle.read.timeout has a value of 30,000 (30 seconds) in > mapred-default.xml, whereas the default value in Fetcher code is 3 minutes. > It should be 3 minutes by default, as it was in pre MAPREDUCE-353. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
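Since 3 minutes is 180000 ms, aligning the shipped default with the Fetcher code default would make the mapred-default.xml entry read roughly:

```xml
<property>
  <name>mapreduce.reduce.shuffle.read.timeout</name>
  <!-- 3 minutes, in milliseconds, matching the code default in Fetcher -->
  <value>180000</value>
</property>
```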
[jira] Commented: (MAPREDUCE-1161) NotificationTestCase should not lock current thread
[ https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786042#action_12786042 ] Hudson commented on MAPREDUCE-1161: --- Integrated in Hadoop-Mapreduce-trunk #164 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/]) . Remove ineffective synchronization in NotificationTestCase. Contributed by Owen O'Malley > NotificationTestCase should not lock current thread > --- > > Key: MAPREDUCE-1161 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 0.21.0 > > Attachments: mr-1161.patch > > > There are 3 instances where NotificationTestCase locks > Thread.currentThread() and calls sleep on it. There is also > a method stdPrintln that doesn't do anything. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1119) When tasks fail to report status, show task's stack dump before killing
[ https://issues.apache.org/jira/browse/MAPREDUCE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786044#action_12786044 ] Hudson commented on MAPREDUCE-1119: --- Integrated in Hadoop-Mapreduce-trunk #164 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/164/]) . When tasks fail to report status, show task's stack dump before killing. Contributed by Aaron Kimball. > When tasks fail to report status, show task's stack dump before killing > > > Key: MAPREDUCE-1119 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1119 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1119.2.patch, MAPREDUCE-1119.3.patch, > MAPREDUCE-1119.4.patch, MAPREDUCE-1119.5.patch, MAPREDUCE-1119.6.patch, > MAPREDUCE-1119.patch > > > When the TT kills tasks that haven't reported status, it should somehow > gather a stack dump for the task. This could be done either by sending a > SIGQUIT (so the dump ends up in stdout) or perhaps something like JDI to > gather the stack directly from Java. This may be somewhat tricky since the > child may be running as another user (so the SIGQUIT would have to go through > LinuxTaskController). This feature would make debugging these kinds of > failures much easier, especially if we could somehow get it into the > TaskDiagnostic message -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785974#action_12785974 ] Hadoop QA commented on MAPREDUCE-118: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426883/patch-118.txt against trunk revision 887135. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/290/console This message is automatically generated. > Job.getJobID() will always return null > -- > > Key: MAPREDUCE-118 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Amar Kamat >Priority: Blocker > Fix For: 0.20.2 > > Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt > > > JobContext is used for a read-only view of job's info. Hence all the readonly > fields in JobContext are set in the constructor. 
Job extends JobContext. When > a Job is created, the jobid is not known and hence there is no way to set the JobID > once the Job is created. The JobID is obtained only when the JobClient queries the > jobTracker for a job-id, which happens later, i.e. upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
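The structural problem Amar describes, an id fixed at construction time before the JobTracker has assigned one, can be reduced to a small sketch (deliberately simplified class shapes, not Hadoop's real code):

```java
// JobContext is a read-only view: its fields are set once, in the constructor.
class JobContextSketch {
    private final String jobId;
    JobContextSketch(String jobId) { this.jobId = jobId; }
    String getJobID() { return jobId; }
}

// Job extends that read-only view, but the id is unknown when a Job is
// constructed: it only exists after submission, when the JobTracker assigns
// one, and there is no setter to fill it in later. Hence getJobID() can
// only ever return null.
class JobSketch extends JobContextSketch {
    JobSketch() { super(null); }
}
```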
[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785954#action_12785954 ] Hadoop QA commented on MAPREDUCE-372: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426877/patch-372-2.txt against trunk revision 887135. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/164/console This message is automatically generated. > Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api. > --- > > Key: MAPREDUCE-372 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-372 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.21.0 > > Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, > patch-372-1.txt, patch-372-2.txt, patch-372.txt > > -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-896) Users can set non-writable permissions on temporary files for TT and can abuse disk usage.
[ https://issues.apache.org/jira/browse/MAPREDUCE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-896: --- Attachment: y896.v1.patch Attaching the Y! 20 patch (review comments incorporated). Making changes to the trunk patch. Please review the Y! 20 patch and provide your comments. > Users can set non-writable permissions on temporary files for TT and can > abuse disk usage. > -- > > Key: MAPREDUCE-896 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-896 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.21.0 >Reporter: Vinod K V >Assignee: Ravi Gummadi > Fix For: 0.21.0 > > Attachments: MR-896.patch, MR-896.v1.patch, y896.v1.patch > > > As of now, irrespective of the TaskController in use, the TT itself does a full > delete on local files created by itself or job tasks. This step, depending > upon the TT's umask and the permissions set on files by the user, e.g. in > job-work/task-work or child.tmp directories, may or may not complete > successfully. This leaves an opportunity for disk space abuse, either > accidental or intentional, by the TT/users. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-896) Users can set non-writable permissions on temporary files for TT and can abuse disk usage.
[ https://issues.apache.org/jira/browse/MAPREDUCE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785948#action_12785948 ] Ravi Gummadi commented on MAPREDUCE-896: The new TaskController command is ENABLE_TASK_FOR_CLEANUP. There is a change in JVMManager where the workdir for the last task was being deleted inline, but now we delete it asynchronously. This should be fine. The change in setupWorkDir fixes the issue of trying to delete workDir, which is the current working dir. Only the contents of workDir are deleted, leaving workDir empty. A testcase is added to validate this cleanup of workDir. Removing check_group, as this wouldn't work if the user changes the group of workDir. createFileAndSetPermissions sets a=rx for subDir and the file in subDir so that no one can delete them without doing a chmod. I am fine with the other comments. > Users can set non-writable permissions on temporary files for TT and can > abuse disk usage. > -- > > Key: MAPREDUCE-896 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-896 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.21.0 >Reporter: Vinod K V >Assignee: Ravi Gummadi > Fix For: 0.21.0 > > Attachments: MR-896.patch, MR-896.v1.patch > > > As of now, irrespective of the TaskController in use, the TT itself does a full > delete on local files created by itself or job tasks. This step, depending > upon the TT's umask and the permissions set on files by the user, e.g. in > job-work/task-work or child.tmp directories, may or may not complete > successfully. This leaves an opportunity for disk space abuse, either > accidental or intentional, by the TT/users. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
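The a=rx trick mentioned for the test helper works because, on POSIX filesystems, deleting a file requires write permission on its parent directory. An illustrative sketch of the idea (names and layout are assumptions, not the actual test code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Makes a directory whose contents cannot be unlinked until someone
// chmods the directory back: a=rx strips the write bit on the parent.
class PermSketch {
    static Path makeStubborn(Path dir) throws IOException {
        Path sub = Files.createDirectories(dir.resolve("subDir"));
        Files.createFile(sub.resolve("file"));
        Files.setPosixFilePermissions(sub,
                PosixFilePermissions.fromString("r-xr-xr-x"));  // a=rx
        return sub;
    }
}
```

This is exactly the situation a user could set up deliberately to defeat the TT's cleanup, which is why the cleanup path must be able to chmod before deleting.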
[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-1230: Status: Patch Available (was: Open) > Vertica streaming adapter doesn't handle nulls in all cases > --- > > Key: MAPREDUCE-1230 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+ >Reporter: Omer Trajman >Assignee: Omer Trajman > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1230.patch > > > Test user reported that Vertica adapter throws an npe when retrieving null > values for certain types (binary, numeric both reported). There is no > special case handling when serializing nulls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-1230: Status: Open (was: Patch Available) > Vertica streaming adapter doesn't handle nulls in all cases > --- > > Key: MAPREDUCE-1230 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+ >Reporter: Omer Trajman >Assignee: Omer Trajman > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1230.patch > > > Test user reported that Vertica adapter throws an npe when retrieving null > values for certain types (binary, numeric both reported). There is no > special case handling when serializing nulls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1230) Vertica streaming adapter doesn't handle nulls in all cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omer Trajman updated MAPREDUCE-1230: Fix Version/s: 0.21.0 > Vertica streaming adapter doesn't handle nulls in all cases > --- > > Key: MAPREDUCE-1230 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1230 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 > Environment: Hadoop 0.21.0 pre-release and Vertica 3.0+ >Reporter: Omer Trajman >Assignee: Omer Trajman > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1230.patch > > > Test user reported that Vertica adapter throws an npe when retrieving null > values for certain types (binary, numeric both reported). There is no > special case handling when serializing nulls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1264) Error Recovery failed, task will continue but run forever as new data only comes in very very slowly
[ https://issues.apache.org/jira/browse/MAPREDUCE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thibaut updated MAPREDUCE-1264: --- Description: Hi, Sometimes some of my jobs (it normally happens in the reducers, on a random basis) will not finish and will run forever. I have to manually fail the task so that it will be restarted and finish. The error log on the node is full of entries like: java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) The error entries all refer to the same data block. 
Unfortunately, the reduce function still seems to be called in the reducer with valid data (although very very slowly), so the task will never be killed and restarted and will take forever to run! If I kill the task, the job will finish without any problems. I experienced the same problem under version 0.20.0 as well. Thanks, Thibaut was: Hi, Sometimes some of my jobs (it normally happens in the reducers, on a random basis) will not finish and will run forever. I have to manually fail the task so that it will be restarted and finish. The error log on the node is full of entries like: java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) java.io.IOException: Error Recovery for block blk_-8036012205502614140_21582139 failed because recovery from primary datanode 192.168.0.3:50011 failed 6 times. Pipeline was 192.168.0.3:50011. Aborting... 
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2582) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239) The error entries all refer to the same data block. Unfortunately, the reduce function still seems to be called in the reducer with valid data (although very very slowly), so the task will never be killed and restarted and will take forever to run! I experienced the same problem under version 0.20.0 as well. Thanks, Thibaut Fix Version/s: 0.20.2 > Error Recovery failed, task will continue but run forever as new data only > comes in very very slowly > > > Key: MAPREDUCE-1264 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1264 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Thibaut >
[jira] Commented: (MAPREDUCE-1185) URL to JT webconsole for running job and job history should be the same
[ https://issues.apache.org/jira/browse/MAPREDUCE-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785911#action_12785911 ] Arun C Murthy commented on MAPREDUCE-1185: -- +1 > URL to JT webconsole for running job and job history should be the same > --- > > Key: MAPREDUCE-1185 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1185 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobtracker >Reporter: Sharad Agarwal >Assignee: Sharad Agarwal > Attachments: 1185_v1.patch, 1185_v2.patch, 1185_v3.patch, > 1185_v4.patch, 1185_v5.patch, 1185_v6.patch, 1185_v7.patch, > patch-1185-1-ydist.txt, patch-1185-2-ydist.txt, patch-1185-3-ydist.txt, > patch-1185-ydist.txt > > > The tracking url for running jobs and the jobs which are retired is > different. This creates problem for clients which caches the job running url > because soon it becomes invalid when job is retired. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-181) Secure job submission
[ https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785905#action_12785905 ] Hadoop QA commented on MAPREDUCE-181: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426876/181-4.patch against trunk revision 887096. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 78 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/289/console This message is automatically generated. 
> Secure job submission > -- > > Key: MAPREDUCE-181 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-181 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amar Kamat >Assignee: Devaraj Das > Fix For: 0.22.0 > > Attachments: 181-1.patch, 181-2.patch, 181-3.patch, 181-3.patch, > 181-4.patch, hadoop-3578-branch-20-example-2.patch, > hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, > HADOOP-3578-v2.7.patch, MAPRED-181-v3.32.patch, MAPRED-181-v3.8.patch > > > Currently the jobclient accesses the {{mapred.system.dir}} to add job > details. Hence the {{mapred.system.dir}} has the permissions of > {{rwx-wx-wx}}. This could be a security loophole where the job files might > get overwritten/tampered after the job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1082) Command line UI for queues' information is broken with hierarchical queues.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785899#action_12785899 ] Hemanth Yamijala commented on MAPREDUCE-1082: - Looking close. Some final comments: - We are assuming the job statuses cannot be null in QueueInfo. I think we should check this in setJobStatuses. If it is null, we can set an empty array. - The test case should call APIs like setRootQueues. getQueue is not passing through the code path change you made in JobTracker.getQueueInfoArray > Command line UI for queues' information is broken with hierarchical queues. > --- > > Key: MAPREDUCE-1082 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1082 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client, jobtracker >Affects Versions: 0.21.0 >Reporter: Vinod K V >Assignee: V.V.Chaitanya Krishna >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1082-1.txt, MAPREDUCE-1082-2.patch, > MAPREDUCE-1082-3.patch > > > When the command "./bin/mapred --config ~/tmp/conf/ queue -list" is run, it > just hangs. I can see the following in the JT logs: > {code} > 2009-10-08 13:19:26,762 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 1 on 5 caught: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.QueueInfo.write(QueueInfo.java:217) > at org.apache.hadoop.mapreduce.QueueInfo.write(QueueInfo.java:223) > at > org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:159) > at > org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:126) > at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:70) > at org.apache.hadoop.ipc.Server.setupResponse(Server.java:1074) > at org.apache.hadoop.ipc.Server.access$2400(Server.java:77) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:983) > {code} > Same is the case with "./bin/mapred --config ~/tmp/conf/ queue -info > " -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
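The null-guard suggested in the review above can be sketched as follows; the class and field names are illustrative stand-ins for brevity, not the actual QueueInfo source:

```java
// Hypothetical stand-in for QueueInfo: setJobStatuses substitutes an
// empty array for null, so Writable serialization never sees a null field.
class QueueInfoSketch {
    private String[] jobStatuses = new String[0];

    void setJobStatuses(String[] statuses) {
        // Null-guard suggested in the review comment.
        this.jobStatuses = (statuses == null) ? new String[0] : statuses;
    }

    int jobCount() {
        return jobStatuses.length;
    }
}
```

With this guard in place, serializing the queue info cannot hit the NullPointerException seen in QueueInfo.write.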
[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785897#action_12785897 ] Sharad Agarwal commented on MAPREDUCE-372: -- Looked at the ChainBlockingQueue part of the code. Some comments: 1. Can we avoid the casting in Chain#stopAllThreads? One way could be to override interrupt() in MapRunner and ReduceRunner. Also interruptAllThreads would be a better name IMO. 2. I think instead of interrupting the runners and then calling interrupt on both readers and writers, it would be simpler to directly interrupt all the blocking queues. > Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api. > --- > > Key: MAPREDUCE-372 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-372 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.21.0 > > Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, > patch-372-1.txt, patch-372-2.txt, patch-372.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
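The first suggestion above, replacing the cast in Chain#stopAllThreads with an overridable method on the runners, might look roughly like this; all class names are hypothetical simplifications of the Chain internals, not the real patch:

```java
// Hypothetical simplification of the Chain runner classes: a common
// supertype exposes interruptWork(), so interruptAllThreads needs no cast.
abstract class RunnerSketch extends Thread {
    public abstract void interruptWork();
}

class MapRunnerSketch extends RunnerSketch {
    volatile boolean workInterrupted = false;

    @Override
    public void interruptWork() {
        workInterrupted = true;  // a real runner would also poke its queues
        interrupt();
    }
}

class ChainSketch {
    // Suggested interruptAllThreads: no instanceof checks, no casting.
    static void interruptAllThreads(RunnerSketch... runners) {
        for (RunnerSketch r : runners) {
            r.interruptWork();
        }
    }
}
```

Each concrete runner decides for itself which blocking resources to wake, which is what makes the cast unnecessary.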
[jira] Commented: (MAPREDUCE-1161) NotificationTestCase should not lock current thread
[ https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785877#action_12785877 ] Hudson commented on MAPREDUCE-1161: --- Integrated in Hadoop-Mapreduce-trunk-Commit #144 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/144/]) . Remove ineffective synchronization in NotificationTestCase. Contributed by Owen O'Malley > NotificationTestCase should not lock current thread > --- > > Key: MAPREDUCE-1161 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 0.21.0 > > Attachments: mr-1161.patch > > > There are 3 instances where NotificationTestCase locks > Thread.currentThread() and calls sleep on it. There is also > a method stdPrintln that doesn't do anything. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
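The ineffective idiom being removed can be illustrated in isolation; this is a self-contained sketch, not the NotificationTestCase code itself. Synchronizing on Thread.currentThread() guards nothing, because each thread locks only its own Thread object and no other thread ever contends for it:

```java
// Self-contained sketch: synchronized (Thread.currentThread()) adds no
// safety over a plain sleep, since no other thread ever takes that lock.
class SleepSketch {
    // The idiom removed from the test case (illustrative).
    static boolean sleepWithIneffectiveLock(long ms) {
        synchronized (Thread.currentThread()) {  // lock nobody contends on
            return pause(ms);
        }
    }

    // Behaviorally equivalent, without the misleading lock.
    static boolean sleepPlain(long ms) {
        return pause(ms);
    }

    private static boolean pause(long ms) {
        try {
            Thread.sleep(ms);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // restore interrupt status
            return false;
        }
    }
}
```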
[jira] Commented: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called
[ https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785878#action_12785878 ] Hudson commented on MAPREDUCE-1152: --- Integrated in Hadoop-Mapreduce-trunk-Commit #144 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/144/]) . Distinguish between failed and killed tasks in JobTrackerInstrumentation. Contributed by Sharad Agarwal > JobTrackerInstrumentation.killed{Map/Reduce} is never called > > > Key: MAPREDUCE-1152 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Sharad Agarwal > Fix For: 0.22.0 > > Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch > > > JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of > MAPREDUCE-1103 is not captured -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1075) getQueue(String queue) in JobTracker would return NPE for invalid queue name
[ https://issues.apache.org/jira/browse/MAPREDUCE-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785874#action_12785874 ] Hemanth Yamijala commented on MAPREDUCE-1075: - In an offline discussion with Vinod, we concluded that there is no provision to marshal exceptions in Hadoop's RPC right now. Hence, we are deciding in favor of returning null in the queue APIs. With this context, I looked at the new patch. One minor nit: I would suggest testing the API JobClient.getQueueInfo instead of Cluster.getQueue, as it covers more of the changed code path. Can you please make this change and run the patch through Hudson so I can commit once it passes? > getQueue(String queue) in JobTracker would return NPE for invalid queue name > > > Key: MAPREDUCE-1075 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1075 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: V.V.Chaitanya Krishna >Assignee: V.V.Chaitanya Krishna > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1075-1.patch, MAPREDUCE-1075-2.patch, > MAPREDUCE-1075-3.patch, MAPREDUCE-1075-4.patch, MAPREDUCE-1075-5.patch, > MAPREDUCE-1075-6.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1155) Streaming tests swallow exceptions
[ https://issues.apache.org/jira/browse/MAPREDUCE-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1155: - Status: Open (was: Patch Available) This is cleaner than the {{failTrace}} idiom in most of the tests. Since you're tightening these up, would you mind updating the tests to use JUnit4 annotations instead of extending TestCase? Other than that, this looks fine > Streaming tests swallow exceptions > -- > > Key: MAPREDUCE-1155 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1155 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.1, 0.21.0, 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1155.txt > > > Many of the streaming tests (including TestMultipleArchiveFiles) catch > exceptions and print their stack trace rather than failing the job. This > means that tests do not fail even when the job fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1241) JobTracker should not crash when mapred-queues.xml does not exist
[ https://issues.apache.org/jira/browse/MAPREDUCE-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785860#action_12785860 ] Hadoop QA commented on MAPREDUCE-1241: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426838/mapreduce-1241.txt against trunk revision 887061. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 160 release audit warnings (more than the trunk's current 159 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/163/console This message is automatically generated. 
> JobTracker should not crash when mapred-queues.xml does not exist > - > > Key: MAPREDUCE-1241 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1241 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: mapreduce-1241.txt > > > Currently, if you bring up the JobTracker on an old configuration directory, > it gets a NullPointerException looking for the mapred-queues.xml file. It > should just assume a default queue and continue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1161) NotificationTestCase should not lock current thread
[ https://issues.apache.org/jira/browse/MAPREDUCE-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1161: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Owen! > NotificationTestCase should not lock current thread > --- > > Key: MAPREDUCE-1161 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1161 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 0.21.0 > > Attachments: mr-1161.patch > > > There are 3 instances where NotificationTestCase locks > Thread.currentThread() and calls sleep on it. There is also > a method stdPrintln that doesn't do anything. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1114) Speed up ivy resolution in builds with clever caching
[ https://issues.apache.org/jira/browse/MAPREDUCE-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1114: - Status: Open (was: Patch Available) The patch is stale. The long build times are a problem and ivy's a big part of that, but I agree with your assessment: this is a hack. I don't think the 15 second payoff justifies the maintenance cost of a custom caching layer for ivy. > Speed up ivy resolution in builds with clever caching > - > > Key: MAPREDUCE-1114 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1114 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Minor > Attachments: mapreduce-1114.txt, mapreduce-1114.txt, > mapreduce-1114.txt > > > An awful lot of time is spent in the ivy:resolve parts of the build, even > when all of the dependencies have been fetched and cached. Profiling showed > this was in XML parsing. I have a sort-of-ugly hack which speeds up > incremental compiles (and more importantly "ant test") significantly using > some ant macros to cache the resolved classpaths. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1254) job.xml should add crc check in tasktracker and sub jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785838#action_12785838 ] ZhuGuanyin commented on MAPREDUCE-1254: --- Because the local inexpensive disks are not reliable: we once found a non-zero file had become zero length, but the OS kernel log had no warning; only some minutes later did the kernel log report the disk failures. During that time, read operations returned success without throwing any IOException. In the current implementation, an IOException is thrown if job.xml is missing, but corruption or truncation of the configuration file cannot be detected. > job.xml should add crc check in tasktracker and sub jvm. > > > Key: MAPREDUCE-1254 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1254 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: task, tasktracker >Affects Versions: 0.22.0 >Reporter: ZhuGuanyin > > Currently job.xml in the tasktracker and sub-jvm is written to local disk through > ChecksumFileSystem and already has crc checksum information, but the > job.xml file is loaded without a crc check. This can cause a mapred job to finish > successfully but with wrong data because of a disk error. Example: the > tasktracker and sub task jvm would load the default configuration if they > don't successfully load job.xml, which may replace the mapper with > IdentityMapper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
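The kind of check being proposed can be sketched with plain java.util.zip.CRC32; this is a standalone illustration rather than Hadoop's ChecksumFileSystem API, showing why a verified checksum catches a silently truncated configuration file:

```java
import java.util.zip.CRC32;

// Standalone illustration (plain java.util.zip.CRC32, not Hadoop's
// ChecksumFileSystem): bytes that no longer match the checksum recorded
// at write time, e.g. a silently truncated job.xml, fail verification.
class CrcCheckSketch {
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // True only if the bytes still match the recorded checksum.
    static boolean verify(byte[] data, long expected) {
        return checksum(data) == expected;
    }
}
```

A loader that calls verify before parsing would reject the zero-length file instead of silently falling back to the default configuration.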
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injection framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-1-withoutsvnexternals.patch mapreduce-1084-1.patch Attaching the patch implementing fault injection in the mapreduce project. There are two patches: one with svn externals and one without. The svn-externals patch, when applied over a workspace, does not create the appropriate folder structure with links even though the property and folder are added into version control. > Implementing aspects development and fault injection framework for MapReduce > > > Key: MAPREDUCE-1084 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build, test >Reporter: Konstantin Boudnik >Assignee: Sreekanth Ramakrishnan > Attachments: mapreduce-1084-1-withoutsvnexternals.patch, > mapreduce-1084-1.patch > > > Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of > injection framework for MapReduce. > After HADOOP-6204 is in place this particular modification should be very > trivial and would take importing (via svn:external) of src/test/build and > some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-118: -- Status: Patch Available (was: Open) > Job.getJobID() will always return null > -- > > Key: MAPREDUCE-118 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Amar Kamat >Priority: Blocker > Fix For: 0.20.2 > > Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt > > > JobContext is used for a read-only view of job's info. Hence all the readonly > fields in JobContext are set in the constructor. Job extends JobContext. When > a Job is created, jobid is not known and hence there is no way to set JobID > once Job is created. JobID is obtained only when the JobClient queries the > jobTracker for a job-id., which happens later i.e upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1084) Implementing aspects development and fault injection framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan reassigned MAPREDUCE-1084: - Assignee: Sreekanth Ramakrishnan > Implementing aspects development and fault injection framework for MapReduce > > > Key: MAPREDUCE-1084 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: build, test >Reporter: Konstantin Boudnik >Assignee: Sreekanth Ramakrishnan > > Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of > injection framework for MapReduce. > After HADOOP-6204 is in place this particular modification should be very > trivial and would take importing (via svn:external) of src/test/build and > some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-118: -- Attachment: patch-118.txt patch-118-0.21.txt Patch for branch 0.21 and trunk, renaming getID to getJobID so that it overrides the method in JobContext. > Job.getJobID() will always return null > -- > > Key: MAPREDUCE-118 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Amar Kamat >Priority: Blocker > Fix For: 0.20.2 > > Attachments: patch-118-0.20.txt, patch-118-0.21.txt, patch-118.txt > > > JobContext is used for a read-only view of job's info. Hence all the readonly > fields in JobContext are set in the constructor. Job extends JobContext. When > a Job is created, jobid is not known and hence there is no way to set JobID > once Job is created. JobID is obtained only when the JobClient queries the > jobTracker for a job-id, which happens later, i.e. upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
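A minimal sketch of why the rename matters; the classes below are hypothetical stand-ins for JobContext and Job, and the job id string is made up:

```java
// Hypothetical stand-ins for JobContext and Job; the id string is made up.
class JobContextSketch {
    public String getJobID() {
        return null;  // the id is unknown when the context is constructed
    }
}

class JobSketch extends JobContextSketch {
    private String id;

    void submit() {
        id = "job_sketch_0001";  // assigned only at submission time
    }

    @Override  // renamed from getID() so it overrides the JobContext method
    public String getJobID() {
        return id;
    }
}
```

With a differently named getID(), callers holding the supertype reference kept seeing the unset null id; after the rename, dynamic dispatch returns the submitted id.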
[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-118: -- Attachment: patch-118-0.20.txt Patch for branch 0.20 > Job.getJobID() will always return null > -- > > Key: MAPREDUCE-118 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Amar Kamat >Priority: Blocker > Fix For: 0.20.2 > > Attachments: patch-118-0.20.txt > > > JobContext is used for a read-only view of job's info. Hence all the readonly > fields in JobContext are set in the constructor. Job extends JobContext. When > a Job is created, jobid is not known and hence there is no way to set JobID > once Job is created. JobID is obtained only when the JobClient queries the > jobTracker for a job-id., which happens later i.e upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1257) Ability to grab the number of spills
[ https://issues.apache.org/jira/browse/MAPREDUCE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785822#action_12785822 ] Hadoop QA commented on MAPREDUCE-1257: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426868/mapreduce-1257.txt against trunk revision 887061. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/288/console This message is automatically generated. > Ability to grab the number of spills > > > Key: MAPREDUCE-1257 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1257 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Affects Versions: 0.22.0 >Reporter: Sriranjan Manjunath >Assignee: Todd Lipcon > Fix For: 0.22.0 > > Attachments: mapreduce-1257.txt > > > The counters should have information about the number of spills in addition > to the number of spill records. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-1234) getJobID() returns null on org.apache.hadoop.mapreduce.Job after job was submitted
[ https://issues.apache.org/jira/browse/MAPREDUCE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu resolved MAPREDUCE-1234. Resolution: Duplicate Duplicate of MAPREDUCE-118 > getJobID() returns null on org.apache.hadoop.mapreduce.Job after job was > submitted > -- > > Key: MAPREDUCE-1234 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1234 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 0.20.1 > Environment: Run on Win XP, but will propably occur on any system >Reporter: Thomas Kathmann >Priority: Minor > Original Estimate: 0.5h > Remaining Estimate: 0.5h > > After an instance of org.apache.hadoop.mapreduce.Job is submitted via > submit() the method getJobID() returns null. > The code of the submit() method should include something like: > setJobID(info.getJobID()); > after > info = jobClient.submitJobInternal(conf); -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-118: -- Priority: Blocker (was: Major) Affects Version/s: 0.20.1 Fix Version/s: 0.20.2 > Job.getJobID() will always return null > -- > > Key: MAPREDUCE-118 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.20.1 >Reporter: Amar Kamat >Priority: Blocker > Fix For: 0.20.2 > > > JobContext is used for a read-only view of job's info. Hence all the readonly > fields in JobContext are set in the constructor. Job extends JobContext. When > a Job is created, jobid is not known and hence there is no way to set JobID > once Job is created. JobID is obtained only when the JobClient queries the > jobTracker for a job-id., which happens later i.e upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1174) Sqoop improperly handles table/column names which are reserved sql words
[ https://issues.apache.org/jira/browse/MAPREDUCE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1174: - Status: Open (was: Patch Available) Unfortunately, the patch has gone stale. Could you regenerate it? > Sqoop improperly handles table/column names which are reserved sql words > > > Key: MAPREDUCE-1174 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1174 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-1174.patch > > > In some databases it is legal to name tables and columns with terms that > overlap SQL reserved keywords (e.g., {{CREATE}}, {{table}}, etc.). In such > cases, the database allows you to escape the table and column names. We > should always escape table and column names when possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
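Identifier escaping of the kind described might be sketched as below; the method name and the quote-doubling rule shown are illustrative, and the correct quote character varies per database (double quote in standard SQL, backquote in MySQL):

```java
// Illustrative identifier escaping: wrap a table or column name in the
// DBMS quote character so reserved words like CREATE or TABLE are legal,
// doubling any embedded quote character.
class IdentifierEscapeSketch {
    static String escape(String identifier, char quote) {
        StringBuilder sb = new StringBuilder().append(quote);
        for (char c : identifier.toCharArray()) {
            if (c == quote) {
                sb.append(quote);  // double embedded quote characters
            }
            sb.append(c);
        }
        return sb.append(quote).toString();
    }
}
```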
[jira] Updated: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-372: -- Attachment: patch-372-2.txt Patch with review comments incorporated. > Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api. > --- > > Key: MAPREDUCE-372 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-372 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.21.0 > > Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, > patch-372-1.txt, patch-372-2.txt, patch-372.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-372: -- Status: Patch Available (was: Open) > Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api. > --- > > Key: MAPREDUCE-372 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-372 > Project: Hadoop Map/Reduce > Issue Type: Sub-task >Reporter: Amareshwari Sriramadasu >Assignee: Amareshwari Sriramadasu > Fix For: 0.21.0 > > Attachments: mapred-372.patch, mapred-372.patch, mapred-372.patch, > patch-372-1.txt, patch-372-2.txt, patch-372.txt > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1152) JobTrackerInstrumentation.killed{Map/Reduce} is never called
[ https://issues.apache.org/jira/browse/MAPREDUCE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1152: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1 I committed this. Thanks, Sharad! > JobTrackerInstrumentation.killed{Map/Reduce} is never called > > > Key: MAPREDUCE-1152 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1152 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Sharad Agarwal > Fix For: 0.22.0 > > Attachments: 1152.patch, 1152.patch, 1152_v2.patch, 1152_v3.patch > > > JobTrackerInstrumentation.killed{Map/Reduce} metrics added as part of > MAPREDUCE-1103 is not captured -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.