[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=487157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487157
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 21/Sep/20 18:20
Start Date: 21/Sep/20 18:20
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-696286408


   Closing this; best to rebuild as a new PR atop trunk.
   
   For the next PR I plan to split the work into
   * hadoop-common
   * hadoop-aws
   
   This is trickier than I'd like, but it means we can get the common one 
reviewed and merged early.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487157)
Time Spent: 1h 20m  (was: 1h 10m)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Applications like to collect statistics on specific operations, 
> by collecting exactly those operations done during the execution of FS API 
> calls by their individual worker threads, and returning these to their job 
> driver.
> * S3A has a statistics API for some streams, but it's a non-standard one; 
> Impala can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread, 
> they don't aggregate properly
> Proposed
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context 
> stats, and how to actually implement it.
> ThreadLocal isn't enough because the helper threads need to update the 
> thread-local value of the instigator.
> My initial PoC doesn't address that issue, but it shows what I'm thinking of.
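
The ThreadLocal limitation mentioned in the description can be illustrated with a small sketch. Everything in it (the class name, the counter, the 1024-byte increment) is hypothetical and is not part of the proposed API or the PoC; it only shows why a helper thread cannot see the instigating thread's thread-local value, and how handing over a shared counter object avoids that.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical demo; none of these names come from Hadoop. */
final class ThreadLocalStatsDemo {

  /** Per-thread counter, as a naive ThreadLocal-based design would use. */
  static final ThreadLocal<AtomicLong> BYTES_READ =
      ThreadLocal.withInitial(AtomicLong::new);

  /** The helper updates its own thread-local; the caller's copy is untouched. */
  static long readViaThreadLocal(ExecutorService pool) throws Exception {
    pool.submit(() -> { BYTES_READ.get().addAndGet(1024); }).get();
    return BYTES_READ.get().get();
  }

  /** The helper updates a counter object handed over by the caller. */
  static long readViaExplicitContext(ExecutorService pool) throws Exception {
    AtomicLong callerCounter = new AtomicLong();
    pool.submit(() -> { callerCounter.addAndGet(1024); }).get();
    return callerCounter.get();
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      System.out.println("thread-local view: " + readViaThreadLocal(pool));
      System.out.println("explicit context:  " + readViaExplicitContext(pool));
    } finally {
      pool.shutdown();
    }
  }
}
```

Called from a thread other than the pool's worker, the first method returns 0 and the second returns 1024: the work happened in both cases, but only the explicitly shared counter reports it back to the instigator.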



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=487158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487158
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 21/Sep/20 18:20
Start Date: 21/Sep/20 18:20
Worklog Time Spent: 10m 
  Work Description: steveloughran closed pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069


   





Issue Time Tracking
---

Worklog Id: (was: 487158)
Time Spent: 1.5h  (was: 1h 20m)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=481138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481138
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 09/Sep/20 23:22
Start Date: 09/Sep/20 23:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-689874828


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  3s |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 39 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 50s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 21s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  16m 42s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 14s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 39s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 52s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 13s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   4m 45s |  trunk passed  |
   | -0 :warning: |  patch  |   1m 35s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 57s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 46s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |  18m 46s |  root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 0 new + 2061 unchanged - 1 fixed = 2061 total (was 2062)  |
   | +1 :green_heart: |  compile  |  16m 54s |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |  16m 54s |  root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 1955 unchanged - 1 fixed = 1955 total (was 1956)  |
   | -0 :warning: |  checkstyle  |   2m 45s |  root: The patch generated 18 new + 266 unchanged - 26 fixed = 284 total (was 292)  |
   | +1 :green_heart: |  mvnsite  |   3m 15s |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s |  The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  14m 11s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javadoc  |   1m 37s |  hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  hadoop-mapreduce-client-core in the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 0 unchanged - 4 fixed = 0 total (was 4)  |
   | -1 :x: |  findbugs  |   2m 21s |  hadoop-common-project/hadoop-common generated 9 new + 0 unchanged - 0 fixed = 9 total (was 0)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 36s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  

[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479637
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 07/Sep/20 13:23
Start Date: 07/Sep/20 13:23
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-688323653


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  3s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  2s |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 38 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 16s |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 50s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m 32s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m  1s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 54s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  4s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 26s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m  9s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   4m 41s |  trunk passed  |
   | -0 :warning: |  patch  |   1m 28s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 55s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  9s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javac  |  20m  9s |  root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 2061 unchanged - 1 fixed = 2063 total (was 2062)  |
   | +1 :green_heart: |  compile  |  17m 37s |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  javac  |  17m 37s |  root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 2 new + 1955 unchanged - 1 fixed = 1957 total (was 1956)  |
   | -0 :warning: |  checkstyle  |   2m 56s |  root: The patch generated 16 new + 266 unchanged - 26 fixed = 282 total (was 292)  |
   | +1 :green_heart: |  mvnsite  |   2m 53s |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s |  The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  15m 34s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javadoc  |   1m 27s |  hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  hadoop-mapreduce-client-core in the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 0 unchanged - 4 fixed = 0 total (was 4)  |
   | -1 :x: |  findbugs  |   2m 21s |  hadoop-common-project/hadoop-common generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 58s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   6m 56s |  

[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479180
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 04/Sep/20 16:17
Start Date: 04/Sep/20 16:17
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-687247574


   OK, despite my force push losing @jimmy-zuber-amzn's comments, I agree with 
the points about thread safety. In my head I'd imagined that we'd build that 
implementation map once and then iterate over it, but I can see benefits in 
supporting dynamic addition of new values to both the snapshot and the dynamic 
ones.
   
   Snapshot: add an entry to the map
   Dynamic: add new atomic long etc. entries to the appropriate map
   
   This would let us create a minimal snapshot then pass it around, and as it 
was passed around it would collect values, *without you needing to define up 
front all stats to collect*. The work here needs to be lined up for that, with 
iterators of maps being resilient to new values being added.
   
   For the dynamic stuff -> ConcurrentHashMap.
   For Snapshot, it's trickier as they need to be Java serializable, so that 
Spark etc. can forward them around. There I will have to do one of 
   
   * Mark the maps all as transient, and then in read/write data actually save 
then restore the data as TreeMaps (or just arrays of entries)
   * Make accessors to the iterators synchronized and do a snapshot of the 
iterator. I think that will actually be the easiest approach... I just need to 
make sure the operations to update the maps are also synchronized
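
The two ideas in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, method names, and the statistic key are hypothetical and are not the classes from the pull request. The dynamic side keeps `AtomicLong` values in a `ConcurrentHashMap` so that new keys can appear at any time; the snapshot side follows the second option, with synchronized accessors that hand out copies and synchronized updates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical dynamic statistics: live counters that can gain new keys. */
final class DynamicCounters {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  /** Adds the key on first use, so statistics need not be declared up front. */
  long increment(String key, long delta) {
    return counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
  }

  /** Copies current values; ConcurrentHashMap iteration tolerates new entries. */
  Map<String, Long> values() {
    Map<String, Long> copy = new TreeMap<>();
    counters.forEach((k, v) -> copy.put(k, v.get()));
    return copy;
  }
}

/** Hypothetical Java-serializable snapshot guarded by synchronized methods. */
final class CounterSnapshot implements Serializable {
  private static final long serialVersionUID = 1L;
  private final TreeMap<String, Long> counters = new TreeMap<>();

  /** Merges in values, creating entries for keys not seen before. */
  synchronized void aggregate(Map<String, Long> source) {
    source.forEach((k, v) -> counters.merge(k, v, Long::sum));
  }

  /** Returns a copy, so callers never iterate over the live map. */
  synchronized Map<String, Long> counters() {
    return new TreeMap<>(counters);
  }

  /** Serializes under the same lock, so no update lands mid-write. */
  private synchronized void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
  }
}

/** Hypothetical helper showing a Java serialization round trip. */
final class SnapshotDemo {
  static CounterSnapshot roundTrip(CounterSnapshot snapshot) throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(snapshot);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (CounterSnapshot) in.readObject();
    }
  }
}
```

A snapshot built this way can start empty, be passed around, and pick up whatever keys each source reports, which is the "no need to define all stats up front" behaviour described above. The first option (transient maps restored in custom read/write methods) is not shown here.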





Issue Time Tracking
---

Worklog Id: (was: 479180)
Time Spent: 50m  (was: 40m)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>



[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479174
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 04/Sep/20 16:08
Start Date: 04/Sep/20 16:08
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2069:
URL: https://github.com/apache/hadoop/pull/2069#discussion_r483717630



##
File path: 
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/iostatistics.md
##
@@ -0,0 +1,432 @@
+
+
+# Statistic collection with the IOStatistics API
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+```
+
+The `IOStatistics` API is intended to provide statistics on individual IO
+classes, such as input and output streams, *in a standard way which
+applications can query*.
+
+Many filesystem-related classes have implemented statistics gathering
+and provided private/unstable ways to query this, but as they were
+not common across implementations it was unsafe for applications
+to reference these values. Example: `S3AInputStream` and its statistics
+API. This is used in internal tests, but cannot be used downstream in
+applications such as Apache Hive or Apache HBase.
+
+The IOStatistics API is intended to:
+
+1. Be instance specific, rather than shared across multiple instances
+   of a class, or thread local.
+1. Be public and stable enough to be used by applications.
+1. Be easy to use in applications written in Java, Scala, and, via libhdfs, C/C++.
+1. Have foundational interfaces and classes in the `hadoop-common` JAR.
+
+## Core Model
+
+Any class *may* implement `IOStatisticsSource` in order to
+provide statistics.
+
+Wrapper I/O Classes such as `FSDataInputStream` and `FSDataOutputStream` *should*
+implement the interface and forward it to the wrapped class, if it also
+implements it, and return `null` if it does not.
+
+`IOStatisticsSource` implementations of `getIOStatistics()` return an
+instance of `IOStatistics` enumerating the statistics of that specific
+instance.
+
+The `IOStatistics` Interface exports five kinds of statistic:
+
+
+| Category | Type | Description |
+|----------|------|-------------|
+| `counter` | `long` | a counter which may increase in value; SHOULD BE >= 0 |
+| `gauge` | `long` | an arbitrary value which can go down as well as up; SHOULD BE >= 0 |
+| `minimum` | `long` | a minimum value; MAY BE negative |
+| `maximum` | `long` | a maximum value; MAY BE negative |
+| `meanStatistic` | `MeanStatistic` | an arithmetic mean and sample size; mean MAY BE negative |
+
+Four are simple `long` values, varying in how they are likely to
+change and how they are aggregated.
+
+
+ Aggregation of Statistic Values
+
+For the different statistic category, the result of `aggregate(x, y)` is
+
+| Category | Aggregation |
+|----------|-------------|
+| `counter`| `min(0, x) + min(0, y)` |

Review comment:
   yeah -fixed
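
For context on this exchange: the `counter` row as quoted clamps with `min(0, ...)`, which can never produce a positive result, while the table earlier in the same file says counters should be >= 0. The sketch below assumes the fix was to clamp with `max` instead; that is an inference from the quoted text, not something this thread states. The class and method names are hypothetical.

```java
/** Hypothetical helper; not the class from the pull request. */
final class CounterAggregation {
  private CounterAggregation() { }

  /** Treats negative (invalid) counter values as 0 before summing. */
  static long aggregateCounters(long x, long y) {
    return Math.max(0, x) + Math.max(0, y);
  }

  /** The formula as quoted in the diff, kept to show why it was flagged. */
  static long aggregateCountersAsQuoted(long x, long y) {
    return Math.min(0, x) + Math.min(0, y);
  }
}
```

With inputs 3 and 4, the quoted formula yields 0 whereas the clamped sum yields 7; with -1 and 4, the clamped sum ignores the invalid negative value and yields 4.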







Issue Time Tracking
---

Worklog Id: (was: 479174)
Time Spent: 40m  (was: 0.5h)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479172
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 04/Sep/20 16:05
Start Date: 04/Sep/20 16:05
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2069:
URL: https://github.com/apache/hadoop/pull/2069#discussion_r483715827



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/CounterIOStatisticsBuilder.java
##
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+/**
+ * Builder of the CounterIOStatistics class.
+ */
+public interface CounterIOStatisticsBuilder {

Review comment:
   yeah, it's obsolete. Removed







Issue Time Tracking
---

Worklog Id: (was: 479172)
Time Spent: 0.5h  (was: 20m)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>



[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479171
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 04/Sep/20 16:00
Start Date: 04/Sep/20 16:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-680308475









Issue Time Tracking
---

Worklog Id: (was: 479171)
Time Spent: 20m  (was: 10m)

> Add public IOStatistics API; S3A to support
> ---
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>



[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API; S3A to support

2020-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=479170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479170
 ]

ASF GitHub Bot logged work on HADOOP-16830:
---

Author: ASF GitHub Bot
Created on: 04/Sep/20 15:59
Start Date: 04/Sep/20 15:59
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2069:
URL: https://github.com/apache/hadoop/pull/2069#issuecomment-681067758


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  3s |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 38 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 16s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  4s |  trunk passed  |
   | +1 :green_heart: |  compile  |  26m 25s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  22m 17s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 43s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 43s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 13s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 53s |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 16s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 28s |  trunk passed  |
   | -0 :warning: |  patch  |   1m 39s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 20s |  the patch passed  |
   | +1 :green_heart: |  compile  |  25m 23s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javac  |  25m 23s |  root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 3 new + 2054 unchanged - 1 fixed = 2057 total (was 2055)  |
   | +1 :green_heart: |  compile  |  21m 38s |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  javac  |  21m 38s |  root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 3 new + 1947 unchanged - 1 fixed = 1950 total (was 1948)  |
   | -0 :warning: |  checkstyle  |   3m 51s |  root: The patch generated 19 new + 258 unchanged - 26 fixed = 277 total (was 284)  |
   | -1 :x: |  mvnsite  |   0m 58s |  hadoop-mapreduce-client-core in the patch failed.  |
   | -1 :x: |  whitespace  |   0m  0s |  The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  2s |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 19s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  javadoc  |   1m 33s |  hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  hadoop-mapreduce-client-core in the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 0 unchanged - 4 fixed = 0 total (was 4)  |
   | -1 :x: |  findbugs  |   2m 25s |  hadoop-common-project/hadoop-common generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 37s |  hadoop-common in the patch passed.  |
   | +1