[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2015-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005262#comment-15005262
 ] 

Hadoop QA commented on MAPREDUCE-5889:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 29s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 44s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 25s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s {color} | {color:red} hadoop-tools/hadoop-datajoin in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 17s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 7s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 6s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s {color} | {color:green} hadoop-mapreduce-examples in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-datajoin in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 18s {color} | {color:green} hadoop-gridmix in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 3s {color} | {color:green} hadoop-streaming in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 32s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.7.0_79.

[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours

2015-11-14 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated MAPREDUCE-6548:
-
Status: Patch Available  (was: Open)

I have attached an initial patch. It adds a user check and a time check to the 
job submission methods {{Job#submit}} and {{Job#waitForCompletion}}, and adds 
three new config keys:
* MAPREDUCE_LIMIT_EXECUTED_ENABLED: whether the limit-executed function is enabled.
* MAPREDUCE_LIMIT_EXECUTED_USERS: the users allowed to execute jobs in the cluster.
* MAPREDUCE_LIMIT_EXECUTED_HOURS: the hours in which jobs can be executed.
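
For readers who want to see the shape of such a check, here is a minimal sketch in Python rather than the patch's Java. The three key names come from the comment above; everything else is an assumption made for illustration: the value formats (a comma-separated user list, a "start-end" hour range with an exclusive end), the reading that only listed users may submit during the configured hours, and the helper name may_submit. It is not the code in MAPREDUCE-6548.001.patch.

```python
from datetime import datetime

# Hypothetical values. The key names are from the comment above; the value
# formats (comma-separated users, "start-end" hour range) are assumptions.
conf = {
    "MAPREDUCE_LIMIT_EXECUTED_ENABLED": "true",
    "MAPREDUCE_LIMIT_EXECUTED_USERS": "alice,bob",
    "MAPREDUCE_LIMIT_EXECUTED_HOURS": "0-9",
}


def may_submit(conf, user, now):
    """Return True if `user` may submit a job at time `now` (hypothetical helper)."""
    if conf.get("MAPREDUCE_LIMIT_EXECUTED_ENABLED", "false").lower() != "true":
        return True  # feature switched off: no restriction
    start, end = (int(h) for h in conf["MAPREDUCE_LIMIT_EXECUTED_HOURS"].split("-"))
    if not (start <= now.hour < end):
        return True  # outside the reserved hours anyone may submit
    allowed = {u.strip() for u in conf["MAPREDUCE_LIMIT_EXECUTED_USERS"].split(",")}
    return user in allowed  # inside the reserved hours only listed users may submit
```

With these values, may_submit(conf, "carol", datetime(2015, 11, 14, 3, 0)) is rejected, while the same call at 12:00 is accepted.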


> Jobs executed can be configurated with specific users and time hours
> 
>
> Key: MAPREDUCE-6548
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6548
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: job submission
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>
> In recent Hadoop versions, the system places no limit on which users can 
> execute jobs unless ACLs are configured. I also find that the ACL is only 
> checked in IPC, not in job submission. This does not cover the case where I 
> have a very important job that I plan to run between 0 and 9 o'clock. To let 
> this job finish quickly, I do not want other users' jobs to run during that 
> time, so that I can see the result the next morning. So maybe we can let jobs 
> be executed only by specific users during specific hours.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours

2015-11-14 Thread Lin Yiqun (JIRA)
Lin Yiqun created MAPREDUCE-6548:


 Summary: Jobs executed can be configurated with specific users and 
time hours
 Key: MAPREDUCE-6548
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6548
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: job submission
Reporter: Lin Yiqun
Assignee: Lin Yiqun


In recent hadoop versions,the system has no limitation for users to execute 
their jobs if you don't configurate ACL.And I find that the ACL is only called 
in IPC, isn't operated in job submissions.And this condition can't satisfied 
with this case that I have a very important job, and I am prepared to execute 
this job in 0 to 9 o'clock.In order to let this job executed quickli, I am not 
allowed other users job to execute in these time. So I can see the result in 
tomorrow morning.So may be we can let jobs executed with specific users in 
specific time hours.





[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours

2015-11-14 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated MAPREDUCE-6548:
-
Description: In recent hadoop versions,the system has no limitation for 
users to execute their jobs if you don't configurate ACL.And I find that the 
ACL is only called in IPC, isn't operated in job submissions.And this condition 
can't satisfied with this case that I have a very important job, and I am 
prepared to execute this job in 0 to 9 o'clock.In order to let this job 
executed quickly, I am not allowed other user's job to execute in these time. 
So I can see the result in tomorrow morning.So may be we can let jobs executed 
with specific users in specific time hours.  (was: In recent hadoop 
versions,the system has no limitation for users to execute their jobs if you 
don't configurate ACL.And I find that the ACL is only called in IPC, isn't 
operated in job submissions.And this condition can't satisfied with this case 
that I have a very important job, and I am prepared to execute this job in 0 to 
9 o'clock.In order to let this job executed quickli, I am not allowed other 
users job to execute in these time. So I can see the result in tomorrow 
morning.So may be we can let jobs executed with specific users in specific time 
hours.)






[jira] [Updated] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours

2015-11-14 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated MAPREDUCE-6548:
-
Attachment: MAPREDUCE-6548.001.patch






[jira] [Commented] (MAPREDUCE-6548) Jobs executed can be configurated with specific users and time hours

2015-11-14 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005483#comment-15005483
 ] 

Bikas Saha commented on MAPREDUCE-6548:
---

Does the admission control added to YARN (YARN-1051) help in this scenario? The 
important job could be guaranteed its capacity between 0 and 9 am using 
admission control.






[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-14 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6549:
---
Attachment: MAPREDUCE-6549-1.patch

Attaching a patch that removes the attempt to read the last incomplete record 
of an input and changes the tests to cover a more generic, imperfect scenario. 
I'll add more tests if review deems it necessary. As far as I am aware, we 
should drop an incomplete record at the end of the input. With this patch that 
now happens, and the correct number of records comes up in the middle of the 
input (where previously there were duplicates).
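
The expectation described above (every complete record comes up exactly once whatever the split size, and an incomplete trailing record is dropped) can be written down as a small reference model. This is a Python sketch of that invariant, not of LineRecordReader: it scans from offset 0, which a real per-split reader cannot do, and it assumes delimiters are matched left to right without overlap. The function names are made up for this example.

```python
def record_spans(data, delim):
    """Scan left to right and return (start, end) offsets of each complete record.

    A record counts as complete only if a delimiter follows it; trailing text
    with no delimiter after it is treated as incomplete and dropped.
    """
    spans, pos = [], 0
    while True:
        hit = data.find(delim, pos)
        if hit < 0:
            return spans
        spans.append((pos, hit))
        pos = hit + len(delim)


def read_split(data, delim, start, length):
    """Records owned by the split [start, start + length): those that begin in it."""
    end = start + length
    return [data[s:e] for s, e in record_spans(data, delim) if start <= s < end]


def read_all(data, delim, split_size):
    """Concatenate the records produced by every split of the given size."""
    out = []
    for start in range(0, len(data), split_size):
        out.extend(read_split(data, delim, start, split_size))
    return out
```

Because each record start falls in exactly one split, read_all returns the same list for every split size, which is the property a fixed reader is expected to match.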

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Dustin Cote
> Attachments: MAPREDUCE-6549-1.patch
>
>
> LineRecordReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 3) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 4) input string: "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 5) input string: "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 





[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-14 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated MAPREDUCE-6549:
---
Status: Patch Available  (was: Open)

[~zxu], could you review this?






[jira] [Created] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-14 Thread Dustin Cote (JIRA)
Dustin Cote created MAPREDUCE-6549:
--

 Summary: multibyte delimiters with LineRecordReader cause 
duplicate records
 Key: MAPREDUCE-6549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Dustin Cote
Assignee: Dustin Cote


LineRecordReader currently produces duplicate records under certain scenarios 
such as:

1) input string: "abc+++def++ghi++" 
delimiter string: "+++" 
test passes with all sizes of the split 
2) input string: "abc++def+++ghi++" 
delimiter string: "+++" 
test fails with a split size of 4 
3) input string: "abc+++def++ghi++" 
delimiter string: "++" 
test fails with a split size of 5 
4) input string: "abc+++defg++hij++" 
delimiter string: "++" 
test fails with a split size of 4 
5) input string: "abc++def+++ghi++" 
delimiter string: "++" 
test fails with a split size of 9 
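
The scenarios above can be kept as data, together with a helper that flags duplicates in whatever a reader emits. This is a rough Python sketch: the inputs, delimiters and split sizes are copied from the report, while the helper names and the use of str.split as the reference (left-to-right, non-overlapping matching) are assumptions for illustration. The text after the last delimiter is returned separately so the sketch does not decide how a trailing incomplete record should be handled.

```python
from collections import Counter

# (input string, delimiter string, split size reported as failing, or None)
SCENARIOS = [
    ("abc+++def++ghi++", "+++", None),
    ("abc++def+++ghi++", "+++", 4),
    ("abc+++def++ghi++", "++", 5),
    ("abc+++defg++hij++", "++", 4),
    ("abc++def+++ghi++", "++", 9),
]


def reference(data, delim):
    """Return (complete records, trailing text after the last delimiter)."""
    *records, tail = data.split(delim)
    return records, tail


def duplicated(emitted, data, delim):
    """Records that were emitted more often than the reference contains them."""
    expected, _ = reference(data, delim)
    extra = Counter(emitted) - Counter(expected)
    return sorted(extra.elements())


if __name__ == "__main__":
    for data, delim, split_size in SCENARIOS:
        print(repr(data), repr(delim), split_size, reference(data, delim))
```

Passing the records a reader produced for a given split size to duplicated() returns an empty list when nothing was read twice.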


