[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005262#comment-15005262 ]

Hadoop QA commented on MAPREDUCE-5889:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| +1 | mvninstall | 3m 4s | trunk passed |
| +1 | compile | 4m 29s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 4m 44s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 57s | trunk passed |
| +1 | mvnsite | 2m 14s | trunk passed |
| +1 | mvneclipse | 1m 25s | trunk passed |
| -1 | findbugs | 0m 24s | hadoop-tools/hadoop-datajoin in trunk has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 28s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 40s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 2m 7s | the patch passed |
| +1 | compile | 4m 17s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 4m 18s | the patch passed |
| +1 | compile | 4m 7s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 4m 7s | the patch passed |
| +1 | checkstyle | 0m 56s | the patch passed |
| +1 | mvnsite | 2m 13s | the patch passed |
| +1 | mvneclipse | 1m 26s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 4m 27s | the patch passed |
| +1 | javadoc | 1m 28s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 37s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 1m 37s | hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_60. |
| -1 | unit | 103m 6s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_60. |
| +1 | unit | 0m 37s | hadoop-mapreduce-examples in the patch passed with JDK v1.8.0_60. |
| +1 | unit | 0m 27s | hadoop-datajoin in the patch passed with JDK v1.8.0_60. |
| +1 | unit | 14m 18s | hadoop-gridmix in the patch passed with JDK v1.8.0_60. |
| +1 | unit | 6m 3s | hadoop-streaming in the patch passed with JDK v1.8.0_60. |
| +1 | unit | 2m 32s | hadoop-mapreduce-client-core in the patch passed with JDK v1.7.0_79. |
[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Yiqun updated MAPREDUCE-6548:

Status: Patch Available (was: Open)

I have attached an initial patch. It adds a user check and a time check to the job submission methods {{Job#submit}} and {{Job#waitForCompletion}}, and introduces three new configuration keys:

* MAPREDUCE_LIMIT_EXECUTED_ENABLED: whether the limit-executed function is enabled.
* MAPREDUCE_LIMIT_EXECUTED_USERS: the users allowed to execute jobs in the cluster.
* MAPREDUCE_LIMIT_EXECUTED_HOURS: the hours during which jobs can be executed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
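The user check and time check described in the comment above could look roughly like the following. This is only a minimal sketch and not the code in the attached patch: the class name `SubmissionGate`, the method `maySubmit`, and the comma-separated value format are all hypothetical. It also assumes one possible reading of the three keys, namely that during the configured hours only the listed users may submit, and that outside those hours anyone may.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hadoop or of MAPREDUCE-6548.001.patch.
class SubmissionGate {
    private final boolean enabled;       // stands in for MAPREDUCE_LIMIT_EXECUTED_ENABLED
    private final Set<String> users;     // stands in for MAPREDUCE_LIMIT_EXECUTED_USERS
    private final Set<Integer> hours;    // stands in for MAPREDUCE_LIMIT_EXECUTED_HOURS

    // Assumed format: comma-separated user names and comma-separated hours (0-23).
    SubmissionGate(boolean enabled, String usersCsv, String hoursCsv) {
        this.enabled = enabled;
        this.users = Arrays.stream(usersCsv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
        this.hours = Arrays.stream(hoursCsv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(Integer::valueOf)
                .collect(Collectors.toSet());
    }

    // Returns true if the given user may submit a job at the given hour of day.
    boolean maySubmit(String user, int hourOfDay) {
        if (!enabled || !hours.contains(hourOfDay)) {
            return true;                 // feature off, or outside the restricted window
        }
        return users.contains(user);     // inside the window: listed users only
    }

    public static void main(String[] args) {
        // Assumed example values: user "alice" owns the important job, window is hours 0 through 8.
        SubmissionGate gate = new SubmissionGate(true, "alice", "0,1,2,3,4,5,6,7,8");
        System.out.println(gate.maySubmit("alice", 3));
        System.out.println(gate.maySubmit("bob", 3));
        System.out.println(gate.maySubmit("bob", 12));
    }
}
```

A check of this kind runs on the client side at submission time, so it would only restrict users who submit through the patched methods.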
[jira] [Created] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours
Lin Yiqun created MAPREDUCE-6548:

Summary: Jobs executed can be configurated with specific users and time hours
Key: MAPREDUCE-6548
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6548
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: job submission
Reporter: Lin Yiqun
Assignee: Lin Yiqun

In recent Hadoop versions, the system places no limit on which users can execute jobs unless ACLs are configured. I also find that the ACL is only checked in IPC and is not applied at job submission. This does not cover the following case: I have a very important job that I plan to run between 0 and 9 o'clock. To let this job finish quickly, I do not want other users' jobs to run during those hours, so that I can see the result the next morning. So maybe we can let jobs be executed only by specific users during specific hours.
[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Yiqun updated MAPREDUCE-6548:

Description: In recent hadoop versions, the system has no limitation for users to execute their jobs if you don't configurate ACL. And I find that the ACL is only called in IPC, isn't operated in job submissions. And this condition can't satisfied with this case that I have a very important job, and I am prepared to execute this job in 0 to 9 o'clock. In order to let this job executed quickly, I am not allowed other user's job to execute in these time. So I can see the result in tomorrow morning. So may be we can let jobs executed with specific users in specific time hours.

(was: In recent hadoop versions, the system has no limitation for users to execute their jobs if you don't configurate ACL. And I find that the ACL is only called in IPC, isn't operated in job submissions. And this condition can't satisfied with this case that I have a very important job, and I am prepared to execute this job in 0 to 9 o'clock. In order to let this job executed quickli, I am not allowed other users job to execute in these time. So I can see the result in tomorrow morning. So may be we can let jobs executed with specific users in specific time hours.)
[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Yiqun updated MAPREDUCE-6548:

Attachment: MAPREDUCE-6548.001.patch
[jira] [Commented] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005483#comment-15005483 ]

Bikas Saha commented on MAPREDUCE-6548:

Does the admission control added to YARN (YARN-1051) help this scenario? The important job could be guaranteed its capacity between 0 and 9am using admission control.
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated MAPREDUCE-6549:

Attachment: MAPREDUCE-6549-1.patch

Attaching a patch that removes the attempt to read the last incomplete record of an input and changes the tests to cover a more generic, imperfect scenario. I'll add more tests if review deems it necessary. As far as I am aware, we should drop an incomplete record at the end of the input. With this patch that now happens, and the correct number of records comes up in the middle of the input (where previously there were duplicates).
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Cote updated MAPREDUCE-6549:

Status: Patch Available (was: Open)

[~zxu], could you review this?
[jira] [Created] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
Dustin Cote created MAPREDUCE-6549:

Summary: multibyte delimiters with LineRecordReader cause duplicate records
Key: MAPREDUCE-6549
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Dustin Cote
Assignee: Dustin Cote

LineRecordReader currently produces duplicate records under certain scenarios such as:

1) input string: "abc+++def++ghi++"
   delimiter string: "+++"
   test passes with all sizes of the split
2) input string: "abc++def+++ghi++"
   delimiter string: "+++"
   test fails with a split size of 4
3) input string: "abc+++def++ghi++"
   delimiter string: "++"
   test fails with a split size of 5
4) input string: "abc+++defg++hij++"
   delimiter string: "++"
   test fails with a split size of 4
5) input string: "abc++def+++ghi++"
   delimiter string: "++"
   test fails with a split size of 9
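One way to judge scenarios like the ones listed above is to compare the reader's output against a reference that ignores split boundaries entirely. The sketch below is such a reference and makes no use of Hadoop. The class `DelimiterOracle` and its method `splitRecords` are hypothetical names, and it assumes the delimiter is matched leftmost-first over the whole input. Whether a trailing record with no closing delimiter should be kept is left as a parameter, since the issue does not settle that here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reference splitter, independent of Hadoop's LineRecordReader.
class DelimiterOracle {

    // Splits the whole input on a (possibly multibyte) delimiter, scanning left to right.
    // keepTrailing controls whether text after the last delimiter counts as a record.
    static List<String> splitRecords(String input, String delimiter, boolean keepTrailing) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int hit = input.indexOf(delimiter, pos);   // leftmost match at or after pos
            if (hit < 0) {
                break;
            }
            records.add(input.substring(pos, hit));
            pos = hit + delimiter.length();            // resume after the delimiter
        }
        if (keepTrailing && pos < input.length()) {
            records.add(input.substring(pos));         // incomplete record at end of input
        }
        return records;
    }

    public static void main(String[] args) {
        // Inputs and delimiters taken from the scenarios in the issue description.
        String[][] cases = {
            {"abc+++def++ghi++", "+++"},
            {"abc++def+++ghi++", "+++"},
            {"abc+++def++ghi++", "++"},
            {"abc+++defg++hij++", "++"},
            {"abc++def+++ghi++", "++"},
        };
        for (String[] c : cases) {
            System.out.println(c[0] + " / " + c[1] + " -> " + splitRecords(c[0], c[1], false));
        }
    }
}
```

A split-based reader is consistent with this reference if, for every split size, the records from all splits taken together match the reference list with no duplicates and no omissions.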