[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665723#comment-13665723 ]

Mostafa Elhemali commented on MAPREDUCE-5224:

Looks good to me. Some small comments, mostly about the unit test:
# You don't have to catch(Exception) to call tearDown() or to print the stack trace. tearDown() is always called automatically, and the test framework displays the exception. Please remove the try..catch.
# Use assertTrue(cond) instead of assertEquals(cond, true).
# Even though this is a patch for branch-1-win, since it's a new test you should consider writing it in JUnit 4. See [http://wiki.apache.org/hadoop/HowToDevelopUnitTests] for guidance.

Other than that, +1 on my side.

JobTracker should allow the system directory to be in non-default FS

Key: MAPREDUCE-5224
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
Fix For: 1-win
Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, MAPREDUCE-5224.patch

JobTracker today expects the system directory to be in the default file system:

    if (fs == null) {
      fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
        public FileSystem run() throws IOException {
          return FileSystem.get(conf);
        }});
    }
    ...
    public String getSystemDir() {
      Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system"));
      return fs.makeQualified(sysDir).toString();
    }

In a cloud environment like Azure, the default file system is set to ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
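The behaviour requested in MAPREDUCE-5224 can be illustrated without Hadoop on the classpath. The sketch below is a plain-Java stand-in, not the attached patch: the class name SystemDirSketch, the method qualify, and the asv://example/ and hdfs://namenode:8020 URIs are all hypothetical. It only shows the rule "use the default file system when the configured directory has no scheme, otherwise honour the scheme it names". In Hadoop itself the analogous step would presumably be to obtain the file system from the path (for example with Path.getFileSystem(conf)) rather than from FileSystem.get(conf); whether the patches do exactly that is not shown in this thread.

```java
import java.net.URI;

/** Hypothetical stand-in; not part of Hadoop. */
final class SystemDirSketch {
    private SystemDirSketch() {}

    /**
     * Qualifies a configured directory. A directory that already names a
     * file system (has a URI scheme) is returned unchanged; a bare path is
     * resolved against the default file system URI.
     */
    static URI qualify(String configuredDir, URI defaultFs) {
        URI dir = URI.create(configuredDir);
        if (dir.getScheme() != null) {
            return dir; // explicit file system, e.g. hdfs://namenode:8020/...
        }
        return defaultFs.resolve(dir); // fall back to the default file system
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("asv://example/"); // assumed default FS URI
        System.out.println(qualify("/tmp/hadoop/mapred/system", defaultFs));
        System.out.println(qualify("hdfs://namenode:8020/mapred/system", defaultFs));
    }
}
```

With a rule like this, a deployment could keep ASV as the default file system and still point mapred.system.dir at a fully qualified DFS location.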
[jira] [Created] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
Mostafa Elhemali created MAPREDUCE-4840:

Summary: Delete dead code and deprecate public API related to skipping bad records
Key: MAPREDUCE-4840
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: trunk
Reporter: Mostafa Elhemali
Priority: Minor

It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right?

Dead code I'm talking about:
1. Task class: skipping, skipRanges, writeSkipRecs
2. MapTask class: SkippingRecordReader inner class
3. ReduceTask class: SkippingReduceValuesIterator inner class
4. Tests: TestBadRecords

Public API:
1. SkipBadRecords class
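For readers unfamiliar with what "deprecate the public API" would involve in Java, here is a minimal sketch. LegacySkipOptions and isSkippingEnabled are hypothetical names used only for illustration; this is not the SkipBadRecords class and not the attached patch. The idea is to keep the type so existing callers still compile, while the annotation and Javadoc tag make the compiler and the documentation warn about it.

```java
/**
 * Hypothetical stand-in for a public options class whose feature no
 * longer works.
 *
 * @deprecated The feature behind these options is unsupported; the
 *             settings have no effect.
 */
@Deprecated
final class LegacySkipOptions {
    private LegacySkipOptions() {}

    /** Kept for source compatibility; always reports the feature as off. */
    static boolean isSkippingEnabled() {
        return false;
    }
}
```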
[jira] [Updated] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
[ https://issues.apache.org/jira/browse/MAPREDUCE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Elhemali updated MAPREDUCE-4840:

Attachment: MAPREDUCE-4840.patch

Patch attached. Disclaimer: the code compiles fine, but I didn't fully test it, since I wrote this on a Windows box and trunk isn't really good with Windows these days.

Delete dead code and deprecate public API related to skipping bad records

Key: MAPREDUCE-4840
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: trunk
Reporter: Mostafa Elhemali
Priority: Minor
Fix For: trunk
Attachments: MAPREDUCE-4840.patch

It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right?

Dead code I'm talking about:
1. Task class: skipping, skipRanges, writeSkipRecs
2. MapTask class: SkippingRecordReader inner class
3. ReduceTask class: SkippingReduceValuesIterator inner class
4. Tests: TestBadRecords

Public API:
1. SkipBadRecords class
[jira] [Updated] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
[ https://issues.apache.org/jira/browse/MAPREDUCE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Elhemali updated MAPREDUCE-4840:

Fix Version/s: trunk
Status: Patch Available (was: Open)

Delete dead code and deprecate public API related to skipping bad records

Key: MAPREDUCE-4840
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: trunk
Reporter: Mostafa Elhemali
Priority: Minor
Fix For: trunk
Attachments: MAPREDUCE-4840.patch

It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right?

Dead code I'm talking about:
1. Task class: skipping, skipRanges, writeSkipRecs
2. MapTask class: SkippingRecordReader inner class
3. ReduceTask class: SkippingReduceValuesIterator inner class
4. Tests: TestBadRecords

Public API:
1. SkipBadRecords class
[jira] [Updated] (MAPREDUCE-3115) OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Elhemali updated MAPREDUCE-3115:

Attachment: MAPREDUCE-3115.2.patch

OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance.

Key: MAPREDUCE-3115
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3115
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.23.0, 1.0.0
Environment: NA
Reporter: Bhallamudi Venkata Siva Kamesh
Attachments: MAPREDUCE-3115.2.patch, MAPREDUCE-3115.patch

When we set the property *mapred.map.multithreadedrunner.class* to MultithreadedMapper using MultithreadedMapper.setMapperClass(), it simply throws IllegalArgumentException. But when we set the same property through the job's conf object, using job.getConfiguration().setClass("mapred.map.multithreadedrunner.class", MultithreadedMapper.class, Mapper.class), it throws OOM.
[jira] [Commented] (MAPREDUCE-3115) OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508367#comment-13508367 ]

Mostafa Elhemali commented on MAPREDUCE-3115:

Good find! I've taken the liberty of updating the patch to work against trunk. Not sure if I'm allowed to +1 when I've uploaded my own patch, but +1 from me.

OOM When the value for the property mapred.map.multithreadedrunner.class is set to MultithreadedMapper instance.

Key: MAPREDUCE-3115
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3115
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.23.0, 1.0.0
Environment: NA
Reporter: Bhallamudi Venkata Siva Kamesh
Attachments: MAPREDUCE-3115.2.patch, MAPREDUCE-3115.patch

When we set the property *mapred.map.multithreadedrunner.class* to MultithreadedMapper using MultithreadedMapper.setMapperClass(), it simply throws IllegalArgumentException. But when we set the same property through the job's conf object, using job.getConfiguration().setClass("mapred.map.multithreadedrunner.class", MultithreadedMapper.class, Mapper.class), it throws OOM.
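The description contrasts two ways of setting the same property: the setter validates the class and fails fast, while writing the raw configuration key bypasses that validation. A plausible reading is that a multithreaded runner configured to run itself keeps nesting runners until memory is exhausted. The sketch below uses hypothetical stand-in classes (Mapper, MultithreadedRunner, RecursiveRunnerGuard) rather than the Hadoop types, and is not taken from the attached patches. It shows the kind of guard that could be applied at the point where the class is read back, so that both paths end in the same IllegalArgumentException.

```java
/** Hypothetical stand-ins; not the Hadoop classes. */
final class RecursiveRunnerGuard {
    private RecursiveRunnerGuard() {}

    /** Stand-in for an ordinary user mapper. */
    static class Mapper {}

    /** Stand-in for a runner that executes mappers on several threads. */
    static class MultithreadedRunner extends Mapper {}

    /**
     * Rejects a mapper class that is itself the multithreaded runner (or a
     * subclass of it), since the runner would otherwise try to run itself.
     */
    static void checkMapperClass(Class<? extends Mapper> cls) {
        if (MultithreadedRunner.class.isAssignableFrom(cls)) {
            throw new IllegalArgumentException(
                "Can't have recursive multithreaded runner instances.");
        }
    }
}
```

Calling checkMapperClass both in the setter and wherever the configured class is loaded would make the raw-configuration path fail as early as the setter path does.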
[jira] [Commented] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
[ https://issues.apache.org/jira/browse/MAPREDUCE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508384#comment-13508384 ]

Mostafa Elhemali commented on MAPREDUCE-4840:

Ah OK, looks like I was the one confused. I don't believe it works, though as mentioned above I can't really test because of unrelated Windows problems. There's this code in TaskAttemptListenerImpl though:
{code}
@Override
public void reportNextRecordRange(TaskAttemptID taskAttemptID, Range range)
    throws IOException {
  // This is used when the feature of skipping records is enabled.

  // This call exists as a hadoop mapreduce legacy wherein all changes in
  // counters/progress/phase/output-size are reported through statusUpdate()
  // call but not the next record range information.
  throw new IOException("Not yet implemented.");
}
{code}
So I guess the right thing to do is fix the implementation? Not sure if there's a JIRA tracking that.

Delete dead code and deprecate public API related to skipping bad records

Key: MAPREDUCE-4840
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Mostafa Elhemali
Priority: Minor
Attachments: MAPREDUCE-4840.patch

It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right?

Dead code I'm talking about:
1. Task class: skipping, skipRanges, writeSkipRecs
2. MapTask class: SkippingRecordReader inner class
3. ReduceTask class: SkippingReduceValuesIterator inner class
4. Tests: TestBadRecords

Public API:
1. SkipBadRecords class
[jira] [Commented] (MAPREDUCE-4840) Delete dead code and deprecate public API related to skipping bad records
[ https://issues.apache.org/jira/browse/MAPREDUCE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508386#comment-13508386 ]

Mostafa Elhemali commented on MAPREDUCE-4840:

Note that the test for it (TestBadRecords) was disabled in MAPREDUCE-3582.

Delete dead code and deprecate public API related to skipping bad records

Key: MAPREDUCE-4840
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4840
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Mostafa Elhemali
Priority: Minor
Attachments: MAPREDUCE-4840.patch

It looks like the decision was made in MAPREDUCE-1932 to remove support for skipping bad records rather than fix it (it doesn't work right now in trunk). If that's the case, then we should probably delete all the dead code related to it and deprecate the public APIs for it, right?

Dead code I'm talking about:
1. Task class: skipping, skipRanges, writeSkipRecs
2. MapTask class: SkippingRecordReader inner class
3. ReduceTask class: SkippingReduceValuesIterator inner class
4. Tests: TestBadRecords

Public API:
1. SkipBadRecords class
[jira] [Commented] (MAPREDUCE-2393) No total min share limitation of all pools
[ https://issues.apache.org/jira/browse/MAPREDUCE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508389#comment-13508389 ]

Mostafa Elhemali commented on MAPREDUCE-2393:

Can you please add a test for this case?

No total min share limitation of all pools

Key: MAPREDUCE-2393
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2393
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 0.21.0
Reporter: Denny Ye
Labels: fair, scheduler
Attachments: MAPREDUCE-2393.patch

Hi, there is no limit on the total min share of all pools relative to the cluster's total share. A user can define an arbitrary min share for each pool. The fair scheduler design document describes such a limit, but no code enforces it. This may be critical for slot distribution: one pool can hold all the cluster's slots to meet its min share when that min share is much greater than the cluster's total slots. If that happens, we should scale the min shares down proportionally.
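The proportional scale-down proposed in the description can be sketched as follows. This is an illustration under assumptions, not the attached patch and not fair scheduler code: MinShareScaler and scale are hypothetical names, and shares are modelled as plain doubles. When the configured min shares sum to more than the cluster has, each one is multiplied by totalSlots / sum; otherwise they are left alone.

```java
/** Hypothetical helper; not part of the fair scheduler. */
final class MinShareScaler {
    private MinShareScaler() {}

    /**
     * Returns the min shares unchanged when they fit within totalSlots,
     * otherwise scales every share by the same factor so that the scaled
     * shares sum to totalSlots.
     */
    static double[] scale(double[] minShares, double totalSlots) {
        double sum = 0;
        for (double share : minShares) {
            sum += share;
        }
        double[] scaled = minShares.clone();
        if (sum <= totalSlots) {
            return scaled; // nothing to do: the cluster can satisfy every pool
        }
        double factor = totalSlots / sum;
        for (int i = 0; i < scaled.length; i++) {
            scaled[i] = minShares[i] * factor;
        }
        return scaled;
    }
}
```

For example, with 100 slots and configured min shares of 300 and 100, the factor is 100 / 400 = 0.25, giving scaled min shares of 75 and 25. A real scheduler would also need a rounding rule for whole slots, which this sketch leaves out.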
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508424#comment-13508424 ]

Mostafa Elhemali commented on MAPREDUCE-2632:

A few comments:
1. Stylistic nit-picks: else should be on the same line as the closing brace, with a space after it, and part=0 needs more spacing.
2. Can you put in a comment (and a test) that the partitioner can be null if there's only one reducer? One of the nice things about MAPREDUCE-1287 is that it doesn't even construct the partitioner if it's not needed, so we should help client code do that as well.

Avoid calling the partitioner when the numReduceTasks is 1.

Key: MAPREDUCE-2632
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Affects Versions: 0.23.0
Reporter: Ravi Teja Ch N V
Assignee: Ravi Teja Ch N V
Attachments: MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch

We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
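The two ideas in this thread, skipping the partitioner call and not even constructing the partitioner when there is a single reducer, can be sketched together. KeyPartitioner and PartitionChooser are hypothetical names chosen for this illustration; they are not the Hadoop Partitioner API and not the code in the attached patches.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a partitioner interface. */
interface KeyPartitioner<K> {
    int getPartition(K key, int numPartitions);
}

/** Hypothetical helper that only builds a partitioner when it is needed. */
final class PartitionChooser<K> {
    private final int numReduceTasks;
    // Null when there is at most one reducer: every record goes to partition 0.
    private final KeyPartitioner<K> partitioner;

    PartitionChooser(int numReduceTasks, Supplier<KeyPartitioner<K>> factory) {
        this.numReduceTasks = numReduceTasks;
        // The factory is not invoked at all in the single-reducer case.
        this.partitioner = numReduceTasks > 1 ? factory.get() : null;
    }

    int partitionFor(K key) {
        if (partitioner == null) {
            return 0;
        }
        return partitioner.getPartition(key, numReduceTasks);
    }
}
```

Passing a factory rather than an instance is what lets the caller avoid construction cost; the null field is exactly the case the review asks to document and test.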
[jira] [Updated] (MAPREDUCE-3540) saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)
[ https://issues.apache.org/jira/browse/MAPREDUCE-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Elhemali updated MAPREDUCE-3540:

Attachment: MAPREDUCE-3540.Nov12.patch

Thanks, Trevor, for updating the patch. I still needed to trim newlines from the whoami output in my environment for the build to work, so I'm including an updated patch with that fix intact (and unnecessary whitespace changes removed).

saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)

Key: MAPREDUCE-3540
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3540
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: build
Affects Versions: 0.24.0, trunk
Reporter: Alejandro Abdelnur
Fix For: 0.24.0
Attachments: MAPREDUCE-3540-121001.patch, MAPREDUCE-3540.Nov12.patch, MAPREDUCE-3540.patch

{code}
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec (generate-version) on project hadoop-yarn-common: Command execution failed. Cannot run program "scripts\saveVersion.sh" (in directory "C:\cygwin\home\tucu\src\hadoop\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common"): CreateProcess error=2, The system cannot find the file specified -> [Help 1]
[ERROR]
{code}