[jira] [Commented] (HADOOP-15495) Upgrade common-langs version to 3.7 in hadoop-common-project and hadoop-tools

2018-06-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518080#comment-16518080
 ] 

genericqa commented on HADOOP-15495:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 35 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
36s{color} | {color:red} integration-test in trunk failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster 
hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
20s{color} | {color:red} integration-test in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
10s{color} | {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 29m 21s{color} 
| {color:red} root generated 25 new + 1563 unchanged - 0 fixed = 1588 total 
(was 1563) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
21s{color} | {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster 
hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
29s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
25s{color} | {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 

[jira] [Created] (HADOOP-15548) Randomize local dirs

2018-06-20 Thread Jim Brennan (JIRA)
Jim Brennan created HADOOP-15548:


 Summary: Randomize local dirs
 Key: HADOOP-15548
 URL: https://issues.apache.org/jira/browse/HADOOP-15548
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Jim Brennan
Assignee: Jim Brennan


Shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching a container. Some 
applications will process these in exactly the same way in every container 
(e.g. round-robin), which can cause disks to get unnecessarily overloaded (e.g. 
one output file always written to the first entry specified in the environment 
variable).

There are two paths for local dir allocation, depending on whether the size is 
unknown or known.  The unknown-size path already uses a random algorithm.  The 
known-size path initializes with a random starting point, and then goes 
round-robin after that: when selecting a dir, it increments the last-used index 
by one and then checks sequentially until it finds a dir that satisfies the 
request.  The proposal is to increment by a random value between 1 and 
num_dirs - 1, and then check sequentially from there.  This should result in a 
more random selection in all cases.
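The known-size path described above can be sketched in plain Java. This is a simplified stand-in, not Hadoop's actual LocalDirAllocator: the class name, the free-space list, and the hasSpace check are all illustrative.

```java
import java.util.List;
import java.util.Random;

public class RandomizedDirSelector {
    private final Random rng = new Random();
    private int lastUsed = -1;

    /**
     * Pick a directory with enough free space. Instead of advancing the
     * round-robin pointer by 1, jump ahead by a random offset in
     * [1, numDirs - 1] so containers launched together do not all
     * start hammering the same disk.
     */
    public int select(List<Long> freeSpace, long needed) {
        int numDirs = freeSpace.size();
        if (lastUsed < 0) {
            lastUsed = rng.nextInt(numDirs);          // random starting point
        }
        int jump = 1 + rng.nextInt(Math.max(1, numDirs - 1));
        for (int i = 0; i < numDirs; i++) {
            int candidate = (lastUsed + jump + i) % numDirs;
            if (freeSpace.get(candidate) >= needed) { // dir satisfies the request
                lastUsed = candidate;
                return candidate;
            }
        }
        return -1; // no dir has enough space
    }
}
```

Because the scan still visits every dir sequentially after the random jump, a request that any dir can satisfy always succeeds; only the starting position is randomized.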



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15495) Upgrade common-langs version to 3.7 in hadoop-common-project and hadoop-tools

2018-06-20 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518232#comment-16518232
 ] 

Takanobu Asanuma commented on HADOOP-15495:
---

Sorry, the last patch doesn't fix the whitespace warnings. The next patch will 
cover it.

The failure of {{TestCLI}} seems to be related. I will investigate it.

> Upgrade common-langs version to 3.7 in hadoop-common-project and hadoop-tools
> -
>
> Key: HADOOP-15495
> URL: https://issues.apache.org/jira/browse/HADOOP-15495
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Attachments: HADOOP-15495.1.patch, HADOOP-15495.2.patch
>
>
> commons-lang 2.6 is widely used. Let's upgrade to 3.7.
> This jira is separated from HADOOP-10783.






[jira] [Updated] (HADOOP-15542) S3AFileSystem - FileAlreadyExistsException when prefix is a file and part of a directory tree

2018-06-20 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15542:

Priority: Minor  (was: Blocker)

> S3AFileSystem - FileAlreadyExistsException when prefix is a file and part of 
> a directory tree
> -
>
> Key: HADOOP-15542
> URL: https://issues.apache.org/jira/browse/HADOOP-15542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.5, 3.1.0
>Reporter: t oo
>Priority: Minor
>
> We are running Apache Spark jobs with aws-java-sdk-1.7.4.jar  
> hadoop-aws-2.7.5.jar to write parquet files to an S3 bucket. We have the key 
> 's3://mybucket/d1/d2/d3/d4/d5/d6/d7' in s3 (d7 being a text file). We also 
> have keys 
> 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180615/a.parquet' 
> (a.parquet being a file)
> When we run a spark job to write b.parquet file under 
> 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180616/' (ie would like 
> to have 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180616/b.parquet' 
> get created in s3) we get the below error
>  
>  
> org.apache.hadoop.fs.FileAlreadyExistsException: Can't make directory for 
> path 's3a://mybucket/d1/d2/d3/d4/d5/d6/d7' since it is a file.
> at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:861)
> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1881)
>  






[jira] [Updated] (HADOOP-15542) S3AFileSystem - FileAlreadyExistsException when prefix is a file and part of a directory tree

2018-06-20 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15542:

Issue Type: Improvement  (was: Bug)

> S3AFileSystem - FileAlreadyExistsException when prefix is a file and part of 
> a directory tree
> -
>
> Key: HADOOP-15542
> URL: https://issues.apache.org/jira/browse/HADOOP-15542
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.5, 3.1.0
>Reporter: t oo
>Priority: Minor
>






[jira] [Created] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)
Todd Lipcon created HADOOP-15549:


 Summary: Upgrade to commons-configuration 2.1 regresses task CPU 
consumption
 Key: HADOOP-15549
 URL: https://issues.apache.org/jira/browse/HADOOP-15549
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 3.0.2
Reporter: Todd Lipcon
Assignee: Todd Lipcon


HADOOP-13660 upgraded from commons-configuration 1.x to 2.x. 
commons-configuration is used when parsing the metrics configuration properties 
file. The new builder API used in the new version apparently makes use of a 
bunch of very bloated reflection and classloading nonsense to achieve the same 
goal, and this results in a regression of >100ms of CPU time as measured by a 
program which simply initializes DefaultMetricsSystem.

This isn't a big deal for long-running daemons, but for MR tasks which might 
only run a few seconds on poorly-tuned jobs, this can be noticeable.
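The attached patch isn't reproduced here, but the shape of the fix — loading the metrics properties file directly rather than through commons-configuration's reflective builder chain — can be sketched with plain java.util.Properties. This is an assumption about the approach, and it gives up commons-configuration extras such as variable interpolation and includes; the class and key names below are illustrative.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class DirectMetricsConfigLoader {
    /**
     * Load a metrics-style properties stream with java.util.Properties.
     * This avoids the reflection- and classloading-heavy builder machinery
     * that the commons-configuration 2.x API sets up for each load.
     */
    public static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }

    public static void main(String[] args) throws IOException {
        String conf = "*.period=10\nnamenode.sink.file.filename=nn-metrics.out\n";
        Properties p = load(new StringReader(conf));
        System.out.println(p.getProperty("*.period")); // prints "10"
    }
}
```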






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518375#comment-16518375
 ] 

Todd Lipcon commented on HADOOP-15549:
--

I ran a simple program which just calls DefaultMetricsSystem.initialize against 
Hadoop 2.8.2 compared to 3.0.0 dist tarballs:

*2.8.2:*

{code}
683.416696  task-clock (msec) #1.793 CPUs utilized  
  ( +-  2.32% )
 1,790  context-switches  #0.003 M/sec  
  ( +-  1.07% )
54  cpu-migrations#0.080 K/sec  
  ( +- 17.64% )
13,688  page-faults   #0.020 M/sec  
  ( +-  0.54% )
 2,216,866,739  cycles#3.244 GHz
  ( +-  1.62% )
 2,299,332,469  instructions  #1.04  insn per cycle 
  ( +-  1.21% )
   431,487,977  branches  #  631.369 M/sec  
  ( +-  1.17% )
19,346,551  branch-misses #4.48% of all branches
  ( +-  1.07% )

   0.381138028 seconds time elapsed 
 ( +-  2.52% )
{code}

*3.0.0:*

{code}
924.881803  task-clock (msec) #1.902 CPUs utilized  
  ( +-  2.05% )
 1,962  context-switches  #0.002 M/sec  
  ( +-  0.73% )
44  cpu-migrations#0.047 K/sec  
  ( +- 11.15% )
20,593  page-faults   #0.022 M/sec  
  ( +-  0.55% )
 3,042,371,457  cycles#3.289 GHz
  ( +-  1.67% )
 3,165,586,053  instructions  #1.04  insn per cycle 
  ( +-  1.41% )
   592,945,118  branches  #  641.104 M/sec  
  ( +-  1.36% )
25,735,278  branch-misses #4.34% of all branches
  ( +-  1.30% )

   0.486354791 seconds time elapsed 
 ( +-  2.04% )
{code}

Not all of the regression is due to the metrics system initialization, but with 
a small patch that avoids the "builder" APIs, I can recover some of the 
regression.

{code}
885.276567  task-clock (msec) #2.009 CPUs utilized  
  ( +-  1.45% )
 1,608  context-switches  #0.002 M/sec  
  ( +-  2.02% )
48  cpu-migrations#0.055 K/sec  
  ( +- 12.98% )
18,949  page-faults   #0.021 M/sec  
  ( +-  0.88% )
 2,908,533,684  cycles#3.285 GHz
  ( +-  0.46% )
 3,045,577,520  instructions  #1.05  insn per cycle 
  ( +-  0.66% )
   566,661,963  branches  #  640.096 M/sec  
  ( +-  0.67% )
24,309,912  branch-misses #4.29% of all branches
  ( +-  0.77% )

   0.440731241 seconds time elapsed 
 ( +-  2.98% )
{code}

It also loads fewer classes (1651 vs 1768) by eliminating usage of 'beanutil' 
and a bunch of ancillary classes in commons-configuration.

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x. 
> commons-configuration is used when parsing the metrics configuration 
> properties file. The new builder API used in the new version apparently makes 
> use of a bunch of very bloated reflection and classloading nonsense to 
> achieve the same goal, and this results in a regression of >100ms of CPU time 
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might 
> only run a few seconds on poorly-tuned jobs, this can be noticeable.






[jira] [Updated] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15549:
-
Attachment: hadoop-15549.txt

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>






[jira] [Updated] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15549:
-
Status: Patch Available  (was: Open)

Re-measured on trunk instead of 3.0.0, and also used 'taskset -c 1' to bind 
everything to one core to try to reduce variance:

2.8.2:
{code}
535.052533  task-clock (msec) #0.918 CPUs utilized  
  ( +-  0.88% )
 3,055  context-switches  #0.006 M/sec  
  ( +-  1.02% )
 1  cpu-migrations#0.002 K/sec  

12,644  page-faults   #0.024 M/sec  
  ( +-  0.06% )
 1,953,309,627  cycles#3.651 GHz
  ( +-  0.16% )
 2,221,327,797  instructions  #1.14  insn per cycle 
  ( +-  0.17% )
   417,919,978  branches  #  781.082 M/sec  
  ( +-  0.17% )
18,726,810  branch-misses #4.48% of all branches
  ( +-  0.19% )

   0.582855783 seconds time elapsed 
 ( +-  1.70% )
{code}

3.2 without patch:
{code}
751.038338  task-clock (msec) #0.927 CPUs utilized  
  ( +-  0.43% )
 3,646  context-switches  #0.005 M/sec  
  ( +-  2.26% )
 1  cpu-migrations#0.001 K/sec  
  ( +- 25.00% )
19,233  page-faults   #0.026 M/sec  
  ( +-  0.42% )
 2,735,218,817  cycles#3.642 GHz
  ( +-  0.62% )
 3,218,012,767  instructions  #1.18  insn per cycle 
  ( +-  0.54% )
   604,477,739  branches  #  804.856 M/sec  
  ( +-  0.48% )
25,664,033  branch-misses #4.25% of all branches
  ( +-  0.55% )

   0.810230298 seconds time elapsed 
 ( +-  0.98% )
{code}

3.2 with patch:
{code}
679.940626  task-clock (msec) #0.918 CPUs utilized  
  ( +-  1.14% )
 3,302  context-switches  #0.005 M/sec  
  ( +-  1.16% )
 1  cpu-migrations#0.001 K/sec  

16,819  page-faults   #0.025 M/sec  
  ( +-  0.06% )
 2,375,283,537  cycles#3.493 GHz
  ( +-  0.33% )
 2,722,724,476  instructions  #1.15  insn per cycle 
  ( +-  0.27% )
   511,944,028  branches  #  752.925 M/sec  
  ( +-  0.24% )
21,981,131  branch-misses #4.29% of all branches
  ( +-  0.33% )

   0.740316578 seconds time elapsed 
 ( +-  0.96% )
{code}

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518426#comment-16518426
 ] 

Steve Loughran commented on HADOOP-15549:
-

Is this also why we get those messages in the s3a & wasb connectors about 
initialization of metrics?

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>






[jira] [Updated] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15549:

Target Version/s: 3.1.1  (was: 3.1.0)

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>






[jira] [Commented] (HADOOP-14918) Remove the Local Dynamo DB test option

2018-06-20 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518453#comment-16518453
 ] 

Sean Mackrory commented on HADOOP-14918:


Fair enough - I had missed the earlier explanation of that key. There's a 
general problem with tests that delete tables, because a couple of others have 
to do the same. Some of them manage the entire life cycle and thus just create 
their own random name (as Gabor did too) - which is probably the best we can 
do, since the table namespace is inconsistent, so you can't configure one table 
and use it reliably across multiple tests. But I've seen tests fail because the 
table was recently deleted - we need to fix that at some point. Given the 
relative importance of this particular test, its own config is fine. +1 - will 
commit soon.
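The per-test random-table-name pattern mentioned above can be as simple as the following sketch; the prefix and class name are illustrative, not from the actual patch:

```java
import java.util.UUID;

public class TestTableNames {
    /**
     * Generate a unique DynamoDB table name so concurrent test runs never
     * collide and a recently-deleted table name is never reused while the
     * deletion is still propagating.
     */
    public static String randomTableName(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }
}
```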

> Remove the Local Dynamo DB test option
> --
>
> Key: HADOOP-14918
> URL: https://issues.apache.org/jira/browse/HADOOP-14918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HADOOP-14918-001.patch, HADOOP-14918-002.patch, 
> HADOOP-14918-003.patch, HADOOP-14918-004.patch, HADOOP-14918.005.patch, 
> HADOOP-14918.006.patch
>
>
> I'm going to propose cutting out the localdynamo test option for s3guard:
> * the local DDB JAR is unmaintained and lags the SDK we work with... 
> eventually there will be differences in API.
> * as the local DynamoDB is unshaded, it complicates classpath setup for the 
> build. Remove it and there's no need to worry about versions of anything 
> other than the shaded AWS SDK.
> * it complicates test runs: now we need to test against both localdynamo 
> *and* real dynamo.
> * but we can't ignore real dynamo, because that's the one which matters.
> While the local option promises to reduce test costs, really, it's just 
> adding complexity. If you are testing with s3guard, you need a real table to 
> test against. And with the exception of those people testing s3a against 
> non-AWS, consistent endpoints, everyone should be testing with S3Guard.
> -Straightforward to remove.-






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518472#comment-16518472
 ] 

Todd Lipcon commented on HADOOP-15549:
--

I also noticed there were some weird error logs, but didn't investigate those. 
I'm just coming at this from a perf-regression angle (trying to figure out the 
various reasons why a sleep job with 1ms tasks regressed noticeably between 
Hadoop 2 and Hadoop 3).

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>






[jira] [Created] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread Todd Lipcon (JIRA)
Todd Lipcon created HADOOP-15550:


 Summary: Avoid static initialization of ObjectMappers
 Key: HADOOP-15550
 URL: https://issues.apache.org/jira/browse/HADOOP-15550
 Project: Hadoop Common
  Issue Type: Bug
  Components: performance
Affects Versions: 3.2.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Various classes statically initialize an ObjectMapper READER instance. This 
ends up doing a bunch of class-loading of Jackson libraries that can add up to 
a fair amount of CPU, even if the reader ends up not being used. This is 
particularly the case with WebHdfsFileSystem, which is class-loaded by a 
serviceloader even when unused in a particular job. We should lazy-init these 
members instead of doing so as a static class member.
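The lazy-init approach proposed here can be sketched with the initialization-on-demand holder idiom, which the JVM guarantees is both lazy and thread-safe. ExpensiveReader below is a stand-in for the Jackson ObjectReader (not on the classpath of this sketch); the class names are illustrative.

```java
public class LazyReaderExample {
    /** Stand-in for a Jackson ObjectReader; its constructor represents the
     *  class-loading and setup cost we want to defer. */
    static class ExpensiveReader {
        ExpensiveReader() { /* heavy class-loading would happen here */ }
        String read(String json) { return json.trim(); }
    }

    /** Holder idiom: the JVM initializes this nested class (and so
     *  constructs READER) only on first access, so jobs that never call
     *  read() pay none of the cost. */
    private static class ReaderHolder {
        static final ExpensiveReader READER = new ExpensiveReader();
    }

    public static String read(String json) {
        return ReaderHolder.READER.read(json);
    }
}
```

By contrast, a `static final ExpensiveReader READER = ...` field directly on the outer class would be constructed as soon as the outer class loads, which is exactly the behavior the issue describes for WebHdfsFileSystem under the serviceloader.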






[jira] [Updated] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15550:
-
Attachment: hadoop-15550.txt

> Avoid static initialization of ObjectMappers
> 
>
> Key: HADOOP-15550
> URL: https://issues.apache.org/jira/browse/HADOOP-15550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-15550.txt
>
>
> Various classes statically initialize an ObjectMapper READER instance. This 
> ends up doing a bunch of class-loading of Jackson libraries that can add up 
> to a fair amount of CPU, even if the reader ends up not being used. This is 
> particularly the case with WebHdfsFileSystem, which is class-loaded by a 
> serviceloader even when unused in a particular job. We should lazy-init these 
> members instead of doing so as a static class member.






[jira] [Updated] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15550:
-
Status: Patch Available  (was: Open)

> Avoid static initialization of ObjectMappers
> 
>
> Key: HADOOP-15550
> URL: https://issues.apache.org/jira/browse/HADOOP-15550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-15550.txt
>
>






[jira] [Commented] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518508#comment-16518508
 ] 

Todd Lipcon commented on HADOOP-15550:
--

Benchmarked with a simple program that does {{new Path("/").getFileSystem(new 
Configuration());}}

The attached patch avoids loading about 400 classes, and saves some measurable 
CPU:

{code}
without patch (2219 classes loaded):

   1378.393961  task-clock (msec) #1.959 CPUs utilized  
  ( +-  0.57% )
 2,076  context-switches  #0.002 M/sec  
  ( +-  0.62% )
45  cpu-migrations#0.033 K/sec  
  ( +-  3.75% )
30,529  page-faults   #0.022 M/sec  
  ( +-  0.24% )
 4,540,069,263  cycles#3.294 GHz
  ( +-  0.92% )
 5,282,002,987  instructions  #1.16  insn per cycle 
  ( +-  0.93% )
   991,080,821  branches  #  719.011 M/sec  
  ( +-  0.90% )
40,313,544  branch-misses #4.07% of all branches
  ( +-  0.67% )

   0.703624736 seconds time elapsed 
 ( +-  0.72% )


with patch (1821 classes loaded):

   1269.949263  task-clock (msec) #2.082 CPUs utilized  
  ( +-  1.11% )
 2,008  context-switches  #0.002 M/sec  
  ( +-  0.76% )
51  cpu-migrations#0.040 K/sec  
  ( +-  8.14% )
25,034  page-faults   #0.020 M/sec  
  ( +-  0.26% )
 4,157,369,649  cycles#3.274 GHz
  ( +-  0.78% )
 4,674,086,838  instructions  #1.12  insn per cycle 
  ( +-  0.42% )
   870,359,803  branches  #  685.350 M/sec  
  ( +-  0.41% )
36,028,258  branch-misses #4.14% of all branches
  ( +-  0.44% )

   0.610038881 seconds time elapsed 
 ( +-  1.54% )
{code}

> Avoid static initialization of ObjectMappers
> 
>
> Key: HADOOP-15550
> URL: https://issues.apache.org/jira/browse/HADOOP-15550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-15550.txt
>
>
> Various classes statically initialize an ObjectMapper READER instance. This 
> ends up doing a bunch of class-loading of Jackson libraries that can add up 
> to a fair amount of CPU, even if the reader ends up not being used. This is 
> particularly the case with WebHdfsFileSystem, which is class-loaded by a 
> serviceloader even when unused in a particular job. We should lazy-init these 
> members instead of doing so as a static class member.






[jira] [Created] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Todd Lipcon (JIRA)
Todd Lipcon created HADOOP-15551:


 Summary: Avoid use of Java8 streams in Configuration.addTags
 Key: HADOOP-15551
 URL: https://issues.apache.org/jira/browse/HADOOP-15551
 Project: Hadoop Common
  Issue Type: Improvement
  Components: performance
Affects Versions: 3.2
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
mechanism. When profiling a simple program that uses Configuration, I found 
that addTags was taking tens of millis of CPU to do very little work the first 
time it's called, accounting for ~8% of total profiler samples in my program.

{code}
[9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
[9] 3.71% 208 self: 0.00% 0 
java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
{code}

I don't know much about the implementation details of the Streams stuff, but it 
seems it's probably meant more for cases with very large arrays or somesuch. 
Switching to a normal Set.addAll() call eliminates this from the profile.
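The switch described above can be illustrated with a short sketch of the general pattern; this is not the actual Configuration.addTags code, and the tag values are hypothetical. Collections.addAll copies the array into the set without pulling in the stream machinery whose first-use MethodHandle linkage cost shows up in the profile.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class AddTagsDemo {
    public static void main(String[] args) {
        // Hypothetical tag values; the real ones come from Configuration properties.
        String[] tags = {"REQUIRED", "PERFORMANCE", "SECURITY"};

        // Stream-based version: the first call pays invokedynamic/MethodHandle
        // linkage costs (the linkCallSite frames in the profile).
        Set<String> viaStream = new HashSet<>();
        Arrays.stream(tags).forEach(viaStream::add);

        // Plain-collection version: same result, no stream machinery.
        Set<String> viaAddAll = new HashSet<>();
        Collections.addAll(viaAddAll, tags);

        System.out.println(viaStream.equals(viaAddAll)); // prints "true"
    }
}
```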






[jira] [Updated] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15551:
-
Attachment: hadoop-15551.txt

> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.






[jira] [Commented] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518526#comment-16518526
 ] 

Todd Lipcon commented on HADOOP-15551:
--

Perf results of a simple Java program that instantiates and uses a single 
Configuration
{code}
without patch:

   1220.803922  task-clock (msec) #2.075 CPUs utilized  
  ( +-  0.97% )
 2,038  context-switches  #0.002 M/sec  
  ( +-  0.52% )
39  cpu-migrations#0.032 K/sec  
  ( +-  5.07% )
22,468  page-faults   #0.018 M/sec  
  ( +-  0.41% )
 3,992,441,054  cycles#3.270 GHz
  ( +-  0.78% )
 4,458,310,856  instructions  #1.12  insn per cycle 
  ( +-  0.71% )
   833,135,256  branches  #  682.448 M/sec  
  ( +-  0.70% )
34,458,171  branch-misses #4.14% of all branches
  ( +-  0.76% )

   0.588308028 seconds time elapsed 
 ( +-  1.80% )

with patch:

   1158.420617  task-clock (msec) #2.106 CPUs utilized  
  ( +-  0.80% )
 1,998  context-switches  #0.002 M/sec  
  ( +-  0.93% )
40  cpu-migrations#0.035 K/sec  
  ( +-  9.65% )
22,025  page-faults   #0.019 M/sec  
  ( +-  0.45% )
 3,957,999,054  cycles#3.417 GHz
  ( +-  0.89% )
 4,468,617,304  instructions  #1.13  insn per cycle 
  ( +-  0.71% )
   834,835,030  branches  #  720.667 M/sec  
  ( +-  0.72% )
34,494,708  branch-misses #4.13% of all branches
  ( +-  0.67% )

   0.550146256 seconds time elapsed 
 ( +-  0.92% )
{code}

(i.e., this trivial change saves ~50 ms of CPU)

> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.






[jira] [Updated] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15551:
-
Status: Patch Available  (was: Open)

> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518552#comment-16518552
 ] 

genericqa commented on HADOOP-15549:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
3s{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 33m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
49s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15549 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928506/hadoop-15549.txt |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f85ee0007c5d 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9a9e969 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14796/testReport/ |
| Max. process+thread count | 1365 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-c

[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518583#comment-16518583
 ] 

Todd Lipcon commented on HADOOP-15549:
--

bq. The patch doesn't appear to include any new or modified tests. Please 
justify why no new tests are needed for this patch. Also please list what 
manual steps were performed to verify this patch.

Existing tests cover this code path -- they just exercise a different set of 
APIs. I'm not 100% sure the behavior didn't change, but if it did, the affected 
functionality probably isn't commonly used (and likely reverts to the Hadoop 2 
behavior).

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x. 
> commons-configuration is used when parsing the metrics configuration 
> properties file. The new builder API used in the new version apparently makes 
> use of a bunch of very bloated reflection and classloading nonsense to 
> achieve the same goal, and this results in a regression of >100ms of CPU time 
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might 
> only run a few seconds on poorly-tuned jobs, this can be noticeable.






[jira] [Created] (HADOOP-15552) Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools

2018-06-20 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created HADOOP-15552:
-

 Summary: Continue HADOOP-14296 - Move logging APIs over to slf4j 
in hadoop-tools
 Key: HADOOP-15552
 URL: https://issues.apache.org/jira/browse/HADOOP-15552
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola


Some classes in hadoop-tools were not moved to slf4j, e.g. 
AliyunOSSInputStream.java, HadoopArchiveLogs.java, and 
HadoopArchiveLogsRunner.java.






[jira] [Commented] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518630#comment-16518630
 ] 

genericqa commented on HADOOP-15550:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
26s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
48s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m  
5s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15550 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928522/hadoop-15550.txt |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  

[jira] [Commented] (HADOOP-15552) Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools

2018-06-20 Thread Ian Pickering (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518643#comment-16518643
 ] 

Ian Pickering commented on HADOOP-15552:


Thanks [~giovanni.fumarola] for opening the issue. Do you mind if I work on it?

> Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools
> ---
>
> Key: HADOOP-15552
> URL: https://issues.apache.org/jira/browse/HADOOP-15552
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
>
> Some classes in hadoop-tools were not moved to slf4j, e.g. 
> AliyunOSSInputStream.java, HadoopArchiveLogs.java, and 
> HadoopArchiveLogsRunner.java.






[jira] [Commented] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518660#comment-16518660
 ] 

genericqa commented on HADOOP-15551:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
37s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15551 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928523/hadoop-15551.txt |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7cd58cb13317 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d6ee429 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14798/testReport/ |
| Max. process+thread count | 1519 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-c

[jira] [Commented] (HADOOP-15552) Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools

2018-06-20 Thread Ian Pickering (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518667#comment-16518667
 ] 

Ian Pickering commented on HADOOP-15552:


I found this bit in ITestNativeAzureFileSystemClientLogging.java:
{code:java}
/**
 * Test to validate Azure storage client side logging. Tests works only when
 * testing with Live Azure storage because Emulator does not have support for
 * client-side logging.
 *
 * Important:  Do not attempt to move off commons-logging.
 * The tests will fail.
 */
{code}
So it seems like a complete migration isn't immediately doable without somehow 
migrating the commons-logging parts that interact tightly with Log4J2.

> Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools
> ---
>
> Key: HADOOP-15552
> URL: https://issues.apache.org/jira/browse/HADOOP-15552
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
>
> Some classes in hadoop-tools were not moved to slf4j, e.g. 
> AliyunOSSInputStream.java, HadoopArchiveLogs.java, and 
> HadoopArchiveLogsRunner.java.






[jira] [Comment Edited] (HADOOP-15552) Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools

2018-06-20 Thread Ian Pickering (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518667#comment-16518667
 ] 

Ian Pickering edited comment on HADOOP-15552 at 6/20/18 10:24 PM:
--

I found this bit in ITestNativeAzureFileSystemClientLogging.java:
{code:java}
/**
 * Test to validate Azure storage client side logging. Tests works only when
 * testing with Live Azure storage because Emulator does not have support for
 * client-side logging.
 *
 * Important:  Do not attempt to move off commons-logging.
 * The tests will fail.
 */
{code}
So it seems like a complete migration isn't immediately doable without somehow 
migrating the commons-logging parts that interact tightly with Log4J.


was (Author: iapicker):
I found this bit in ITestNativeAzureFileSystemClientLogging.java:
{code:java}
/**
 * Test to validate Azure storage client side logging. Tests works only when
 * testing with Live Azure storage because Emulator does not have support for
 * client-side logging.
 *
 * Important:  Do not attempt to move off commons-logging.
 * The tests will fail.
 */
{code}
So it seems like a complete migration isn't immediately doable without somehow 
migrating the commons-logging parts that interact tightly with Log4J2.

> Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools
> ---
>
> Key: HADOOP-15552
> URL: https://issues.apache.org/jira/browse/HADOOP-15552
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
>
> Some classes in Hadoop-tools were not moved to slf4j 
> e.g. AliyunOSSInputStream.java, HadoopArchiveLogs.java, 
> HadoopArchiveLogsRunner.java






[jira] [Commented] (HADOOP-15552) Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools

2018-06-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518688#comment-16518688
 ] 

Steve Loughran commented on HADOOP-15552:
-

The Azure stuff is special: leave it alone. It's explicitly capturing the logs and 
looking for the results.

It shouldn't matter that the tests are lower-level in their logging; moving 
the production code off commons-logging should be the goal.

> Continue HADOOP-14296 - Move logging APIs over to slf4j in hadoop-tools
> ---
>
> Key: HADOOP-15552
> URL: https://issues.apache.org/jira/browse/HADOOP-15552
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
>
> Some classes in Hadoop-tools were not moved to slf4j 
> e.g. AliyunOSSInputStream.java, HadoopArchiveLogs.java, 
> HadoopArchiveLogsRunner.java






[jira] [Updated] (HADOOP-14918) Remove the Local Dynamo DB test option

2018-06-20 Thread Sean Mackrory (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14918:
---
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

> Remove the Local Dynamo DB test option
> --
>
> Key: HADOOP-14918
> URL: https://issues.apache.org/jira/browse/HADOOP-14918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-14918-001.patch, HADOOP-14918-002.patch, 
> HADOOP-14918-003.patch, HADOOP-14918-004.patch, HADOOP-14918.005.patch, 
> HADOOP-14918.006.patch
>
>
> I'm going to propose cutting out the localdynamo test option for s3guard:
> * the local DDB JAR is unmaintained and lags the SDK we work with; eventually 
> there'll be differences in the API.
> * as the local Dynamo DB is unshaded, it complicates classpath setup for the 
> build. Remove it and there's no need to worry about versions of anything 
> other than the shaded AWS SDK.
> * it complicates test runs: now we need to test against both localdynamo *and* 
> real Dynamo.
> * but we can't ignore real Dynamo, because that's the one which matters.
> While the local option promises to reduce test costs, really it's just 
> adding complexity. If you are testing with s3guard, you need a real 
> table to test against. And with the exception of those people testing s3a 
> against non-AWS consistent endpoints, everyone should be testing with 
> S3Guard.
> -Straightforward to remove.-






[jira] [Commented] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518689#comment-16518689
 ] 

Steve Loughran commented on HADOOP-15551:
-

LGTM
+1

> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.
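The change described in the quoted report (stream-based tag splitting replaced by a plain `addAll`) can be sketched roughly as below. This is a minimal stand-in, not the actual `Configuration` code; the class and field names are illustrative:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AddTagsSketch {
  // Illustrative stand-in for Configuration's internal tag registry.
  static final Set<String> TAGS = new HashSet<>();

  // Stream-based version: the first call pays a one-off
  // invokedynamic/MethodHandle linkage cost (the linkCallSite frames
  // seen in the profile).
  static void addTagsStream(String tagStr) {
    Arrays.stream(tagStr.split(",")).forEach(TAGS::add);
  }

  // Plain version: same result, no MethodHandle machinery involved.
  static void addTagsPlain(String tagStr) {
    TAGS.addAll(Arrays.asList(tagStr.split(",")));
  }

  public static void main(String[] args) {
    addTagsPlain("HDFS,NAMENODE,SECURITY");
    System.out.println(TAGS.contains("NAMENODE")); // prints true
  }
}
```

The two methods are functionally equivalent for this workload; the difference is purely in first-call JIT/linkage overhead, which is why the stream version shows up in a profile of a short-lived program.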






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518697#comment-16518697
 ] 

Steve Loughran commented on HADOOP-15549:
-

LGTM, though I think we should wait for a review from someone who knows the code

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x. 
> commons-configuration is used when parsing the metrics configuration 
> properties file. The new builder API used in the new version apparently makes 
> use of a bunch of very bloated reflection and classloading nonsense to 
> achieve the same goal, and this results in a regression of >100ms of CPU time 
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might 
> only run a few seconds on poorly-tuned jobs, this can be noticeable.






[jira] [Commented] (HADOOP-14918) Remove the Local Dynamo DB test option

2018-06-20 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518703#comment-16518703
 ] 

Hudson commented on HADOOP-14918:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14456 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14456/])
HADOOP-14918. Remove the Local Dynamo DB test option. Contributed by 
(mackrorysd: rev b089a06793d94d42b7da1b7566e366ceb748e081)
* (edit) hadoop-tools/hadoop-aws/pom.xml
* (edit) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
* (edit) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/staging/StagingTestBase.java
* (edit) 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBMetadataStore.java
* (edit) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
* (edit) hadoop-project/pom.xml
* (delete) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java
* (edit) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/AbstractS3ATestBase.java
* (delete) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java
* (edit) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java
* (edit) 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java


> Remove the Local Dynamo DB test option
> --
>
> Key: HADOOP-14918
> URL: https://issues.apache.org/jira/browse/HADOOP-14918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-14918-001.patch, HADOOP-14918-002.patch, 
> HADOOP-14918-003.patch, HADOOP-14918-004.patch, HADOOP-14918.005.patch, 
> HADOOP-14918.006.patch
>
>
> I'm going to propose cutting out the localdynamo test option for s3guard:
> * the local DDB JAR is unmaintained and lags the SDK we work with; eventually 
> there'll be differences in the API.
> * as the local Dynamo DB is unshaded, it complicates classpath setup for the 
> build. Remove it and there's no need to worry about versions of anything 
> other than the shaded AWS SDK.
> * it complicates test runs: now we need to test against both localdynamo *and* 
> real Dynamo.
> * but we can't ignore real Dynamo, because that's the one which matters.
> While the local option promises to reduce test costs, really it's just 
> adding complexity. If you are testing with s3guard, you need a real 
> table to test against. And with the exception of those people testing s3a 
> against non-AWS consistent endpoints, everyone should be testing with 
> S3Guard.
> -Straightforward to remove.-






[jira] [Updated] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-15551:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, pushed to trunk.

> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.






[jira] [Commented] (HADOOP-15549) Upgrade to commons-configuration 2.1 regresses task CPU consumption

2018-06-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518729#comment-16518729
 ] 

Todd Lipcon commented on HADOOP-15549:
--

[~mackrorysd] mind taking a look?

> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> ---
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hadoop-15549.txt
>
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x. 
> commons-configuration is used when parsing the metrics configuration 
> properties file. The new builder API used in the new version apparently makes 
> use of a bunch of very bloated reflection and classloading nonsense to 
> achieve the same goal, and this results in a regression of >100ms of CPU time 
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might 
> only run a few seconds on poorly-tuned jobs, this can be noticeable.






[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2018-06-20 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518730#comment-16518730
 ] 

Vinod Kumar Vavilapalli commented on HADOOP-14163:
--

Got pinged about this offline.

Thanks for keeping at it, [~elek]!

I think there are two road-blocks here:
 (1) Is the mechanism with which the website is built good enough - mvn-site / 
hugo etc.?
 (2) Is the new website good enough?

For (1), I just think we need more committer attention and rapid feedback on 
this Jira so we can get it in.

For (2), how about we do it a different way in the interest of progress?
 - We create a hadoop.apache.org/new-site/ where this new site goes.
 - We then modify the existing web-site to say that there is a new 
site/experience that folks can click on a link and navigate to.
 - As the new website matures and gets feedback & fixes, we finally pull the 
plug at a later point when we think we are good to go.

Thoughts?

> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, 
> HADOOP-14163-003.zip, HADOOP-14163.004.patch, HADOOP-14163.005.patch, 
> HADOOP-14163.006.patch, HADOOP-14163.007.patch, HADOOP-14163.008.tar.gz, 
> HADOOP-14163.009.patch, HADOOP-14163.009.tar.gz, hadoop-site.tar.gz, 
> hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is to find a solution to migrate the old site to a modern static 
> site generator using a more contemporary theme.
> Goals: 
>  * existing links should work (or at least redirected)
>  * It should be easy to add more content required by a release automatically 
> (most probably with creating separated markdown files)






[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2018-06-20 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518733#comment-16518733
 ] 

genericqa commented on HADOOP-14163:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HADOOP-14163 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14163 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12910915/HADOOP-14163.009.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14799/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, 
> HADOOP-14163-003.zip, HADOOP-14163.004.patch, HADOOP-14163.005.patch, 
> HADOOP-14163.006.patch, HADOOP-14163.007.patch, HADOOP-14163.008.tar.gz, 
> HADOOP-14163.009.patch, HADOOP-14163.009.tar.gz, hadoop-site.tar.gz, 
> hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is to find a solution to migrate the old site to a modern static 
> site generator using a more contemporary theme.
> Goals: 
>  * existing links should work (or at least redirected)
>  * It should be easy to add more content required by a release automatically 
> (most probably with creating separated markdown files)






[jira] [Commented] (HADOOP-15551) Avoid use of Java8 streams in Configuration.addTags

2018-06-20 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518735#comment-16518735
 ] 

Hudson commented on HADOOP-15551:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14458 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14458/])
HADOOP-15551. Avoid use of Arrays.stream in Configuration.addTags (todd: rev 
43541a18907d2303b708ae27a9a2cb5df891da4f)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java


> Avoid use of Java8 streams in Configuration.addTags
> ---
>
> Key: HADOOP-15551
> URL: https://issues.apache.org/jira/browse/HADOOP-15551
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 3.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: hadoop-15551.txt
>
>
> Configuration.addTags oddly uses Arrays.stream instead of a more conventional 
> mechanism. When profiling a simple program that uses Configuration, I found 
> that addTags was taking tens of millis of CPU to do very little work the 
> first time it's called, accounting for ~8% of total profiler samples in my 
> program.
> {code}
> [9] 4.52% 253 self: 0.00% 0 java/lang/invoke/MethodHandleNatives.linkCallSite
> [9] 3.71% 208 self: 0.00% 0 
> java/lang/invoke/MethodHandleNatives.linkMethodHandleConstant
> {code}
> I don't know much about the implementation details of the Streams stuff, but 
> it seems it's probably meant more for cases with very large arrays or 
> somesuch. Switching to a normal Set.addAll() call eliminates this from the 
> profile.






[jira] [Commented] (HADOOP-15074) SequenceFile#Writer flush does not update the length of the written file.

2018-06-20 Thread Harish Jaiprakash (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518779#comment-16518779
 ] 

Harish Jaiprakash commented on HADOOP-15074:


[~arpitagarwal], SequenceFile#Writer depends upon the FSDataOutputStream object 
returned by FileSystem.create/append, which does not expose 
hsync(EnumSet syncFlags). The DFSOutputStream gets wrapped into an 
FSDataOutputStream, so it's not possible to fix this in SequenceFile.

This bug makes it hard to implement a producer/consumer pattern using sequence 
files; we are a bit stuck on this. When would the length get persisted if hsync 
is never called with UPDATE_LENGTH? Is there a periodic update of the length, 
or an update when a block is full and written, or only when close is called?

> SequenceFile#Writer flush does not update the length of the written file.
> -
>
> Key: HADOOP-15074
> URL: https://issues.apache.org/jira/browse/HADOOP-15074
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>
> SequenceFile#Writer flush does not update the length of the file. This 
> happens because as part of the flush, {{UPDATE_LENGTH}} flag is not passed to 
> the DFSOutputStream#hsync.






[jira] [Commented] (HADOOP-15550) Avoid static initialization of ObjectMappers

2018-06-20 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518965#comment-16518965
 ] 

Xiao Chen commented on HADOOP-15550:


Thanks for the work here Todd.

Looks pretty good to me. Just curious, what's the rationale for choosing what to 
change on this jira? Should we also fix HttpExceptionUtils?

> Avoid static initialization of ObjectMappers
> 
>
> Key: HADOOP-15550
> URL: https://issues.apache.org/jira/browse/HADOOP-15550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-15550.txt
>
>
> Various classes statically initialize an ObjectMapper READER instance. This 
> ends up doing a bunch of class-loading of Jackson libraries that can add up 
> to a fair amount of CPU, even if the reader ends up not being used. This is 
> particularly the case with WebHdfsFileSystem, which is class-loaded by a 
> serviceloader even when unused in a particular job. We should lazy-init these 
> members instead of doing so as a static class member.
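The lazy initialization proposed above can be sketched with the initialization-on-demand holder idiom. A trivial stand-in class replaces Jackson's ObjectMapper here, since the point is *when* class initialization (and its class-loading cost) happens, not what the reader does:

```java
public class LazyReader {
  // Stand-in for an expensive-to-construct object such as an ObjectMapper,
  // whose construction drags in a lot of class-loading.
  static class Mapper {
    String read(String s) { return s.trim(); }
  }

  // The JVM initializes Holder only on first access, so merely class-loading
  // LazyReader (e.g. via a ServiceLoader scan) constructs nothing.
  private static final class Holder {
    static final Mapper READER = new Mapper();
  }

  // Thread-safe lazy accessor: initialization runs under the JVM's
  // class-initialization lock, exactly once.
  static Mapper reader() {
    return Holder.READER;
  }

  public static void main(String[] args) {
    System.out.println(LazyReader.reader().read("  webhdfs  ")); // prints webhdfs
  }
}
```

Compared with a `private static final Mapper READER = new Mapper();` field on the outer class, callers that never invoke `reader()` pay nothing, which is exactly the WebHdfsFileSystem-loaded-but-unused case described above.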


