[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-16 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824741#comment-15824741
 ] 

Ai Deng commented on HADOOP-13453:
--

Hello [~ste...@apache.org] actually, I think as all the metrics (in this story) 
are send by the new implementation of s3Guard, maybe it's better to separate 
this new metrics code with S3AInstrumentation. The currently S3AInstrumentation 
has already 800 lines. We can more easily to disable the s3 guard metrics if we 
separate the two. But I'm not sure how much codes are reusable in 
S3AInstrumentation for the new metrics.

Also I don't find any tests for S3AInstrumentation, how we test these metrics 
system in Hadoop?

Sorry for the basic question, i'm really new for work on Hadoop code base.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825927#comment-15825927
 ] 

Steve Loughran commented on HADOOP-13453:
-

They're going to have to go into that file because those are the metrics 
published by the S3A filesystem when deployed, returned by S3AStorageStatistics 
in a call to {{S3AFileSystem.getStorageStatistics(), and printed in 
{{S3AFileSystem.toString()}}. We could choose whether to add the specific 
metrics to every S3a FS instance; that's something to consider. Listing the 
values but returning 0 for all gauges and counters is the most consistent.

Don't worry about the class length: if you look at it in detail, there's two 
nested classes + support methods explicitly for output/input streams...you 
don't need to go there. The rest of the code is fairly simple

# add new values to org.apache.hadoop.fs.s3a.Statistic; prefix {{s3guard_}}
# In {{S3AInstrumentation}}, add counters to the array {{COUNTERS_TO_CREATE}}; 
gauges to {{GAUGES_TO_CREATE}}
# Pass in an instance of the instrumentation down to S3Guard
# have the code call incrementCounter and increment/decrementGauge as 
appropriate
# I'd like a simple counter of {{s3guard_enabled}} and 
{{s3guard_authoritative}}, which will be 0 when there's no s3guard running, 1 
when the respective booleans are up. Why? Remote visibility

You make a good point, "where are the tests?". The answer is: the metrics can 
be used to test the internal state of the S3 classes, therefore become 
implicitly tested there. 

Take a look at {{ITestS3ADirectoryPerformance}} for a key example of this: our 
test cases use the counters of the various HTTP operations as the means to 
verify that API calls work as expected. (note that s3guard, by reducing these, 
has complicated the tests)

That is, you verify the counters work by asserting that they change as you make 
operations to the DFS. see: 
http://steveloughran.blogspot.co.uk/2016/04/distributed-testing-making-use-of.html
 for more of my thinking here

bq. Sorry for the basic question, i'm really new for work on Hadoop code base.

happy to explain my reasoning. We've all started off staring at a vast amount 
of code that we don't understand; there are still big bits of Hadoop that I 
don't go near.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-18 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829055#comment-15829055
 ] 

Ai Deng commented on HADOOP-13453:
--

[~steve_l] Thank you very much for the explication, that's very helpful. 

I have 2 questions for the moment, for sure there are more to come.

# I see 2 pattern to change the counter value in S3AInstrumentation, have a 
proper method like fileCreated() or pass one Statistic to the generic method 
incrementCounter(), it is for a reason we keep both? Looks like you suggest to 
use the second approach.

# I can't find any usage of S3AFileSystem.getStorageStatistics() in the 
project, what is the main propose of this statistics? it's for use outside of 
Hadoop?  I don't need pass an instance of storageStatistics to S3Guard? In 
S3AFileSystem, we always increment the both.
{code:java}
  protected void incrementStatistic(Statistic statistic, long count) {
instrumentation.incrementCounter(statistic, count);
storageStatistics.incrementCounter(statistic, count);
  }
{code}

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-23 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834359#comment-15834359
 ] 

Steve Loughran commented on HADOOP-13453:
-

Hi, don't worry about asking questions, we'll do our best to get you 
contributing code —it benefits all of us if you are adding code to Hadoop.

The split between low level increment named counter and more elegant "event 
with internal counters?". The event ones are cleaner, as they stop the rest of 
the code having to know exactly which counters/gauges to use. Consider the 
elegant ones the best approach, and the direct invocation us being lazy.

The S3aInstrumentation class also has a set of explicit named counters 
"filesDeleted" as well as lots of ones that are only listed in the arrays 
{{GAUGES_TO_CREATE}} and {{COUNTERS_TO_CREATE}}. That's evolution over time; I 
got bored of having to name and register lots of fields, and realised I could 
do it from the arrays, at the cost of a hash lookup on every increment.

Outside the S3a class itself, i've tried to have external inner classes to do 
the counting, with the results merged in at the end (example: the input and 
output streams), with the inner classes using simple long values, rather than 
atomics. Why? Eliminates any delays during increments, and lets us override the 
toString() values for input/output streams with dumps of the values (go on, try 
it!). We can have many input/output streams per FS instance, so the risk of 
contention for atomic int/log values is potentially quite high.

I think for s3guard we could add a new inner class passed in to each s3guard 
instance; it would export the various methods for events that s3guard could 
raise, such as {{tableCreated()}}, {{tableDeleted()}} —these can directly 
increment the atomic counters in the instrumentation, as we'd only have a 1:1 
map of S3aFS instance and a s3guard store instance.

Regarding access the statistics, that's hooked up to 
{{FileSystem.getStorageStatistics()}}, which is intended to provide the storage 
stats for any FS; s3a and HDFS share common statistic names for the common 
statistics. The latest versions of Tez do collect the statistics of jobs, and 
so give you the aggregate statistics across your entire query. Until now, only 
{{Filesystem.getStatistics()}} has been used, which returns a fixed set of 
values (bytes read/written, etc). Spark still only collects those; it'd take 
some migration to hadoop 2.8+ to pick up the new data. Until then, it's 
something we can use in tests.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: comm

[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-26 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839563#comment-15839563
 ] 

Ai Deng commented on HADOOP-13453:
--

Steve, thank you for sharing these knowledge and thought. It's a good idea to 
having a inner class for S3guard metrics.

I have started a little with all your help, but I will be on holiday for next 
two weeks (back to China for the new year). I really hope I can resolve this 
ticket (could work more quick on this after the holiday), but if the timing is 
not match to the plan of Hadoop13345, please affect this ticket to someone 
else, so we can finish in time.

I will try to catch up with you in China. 

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839599#comment-15839599
 ] 

Steve Loughran commented on HADOOP-13453:
-

—don't worry about being on holiday for the next few weeks; a fair few people 
have gone off to enjoy themselves. Take a break from your emails and enjoy 
yourself!

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-20 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874958#comment-15874958
 ] 

Aaron Fabbri commented on HADOOP-13453:
---

I will comment as I see new places in the code that could use metrics:

- HADOOP-13904 (if it gets committed) in DynamoDBMetadataStore#retryBackoff()


> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-20 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875164#comment-15875164
 ] 

Ai Deng commented on HADOOP-13453:
--

[~fabbri] Thanks.
[~ste...@apache.org] I just made a simple change. Could you please check that? 
(wip patch) Just make sure I'm on the right way to doing things. Thanks.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453.wip-01.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-22 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878886#comment-15878886
 ] 

Steve Loughran commented on HADOOP-13453:
-

thanks, I've hit the submit patch button, but jenkins will fail as the patch 
isn't going to apply

the way Hadoop builds work is you need to include the branch name if   its not 
trunk, here
{code}
HADOOP-13453-HADOOP-13345-001.patch
{code}

Try that with the existing code and you should have the machines review the 
patch (which we rely on to do the basic checks).

For object store tests we also require the submitter to declare which s3 
infrastructure they tested against —because Jenkins doesn't run those tests. 
Here's the details: 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md

Patch-wise, base design looks good: you've added a new statistic and passing 
the instrumentation down to add it.

We just need to think of all the statistics to collect.

Thanks for doing this.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453.wip-01.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878896#comment-15878896
 ] 

Hadoop QA commented on HADOOP-13453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} HADOOP-13453 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-13453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853617/HADOOP-13453.wip-01.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11689/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453.wip-01.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884392#comment-15884392
 ] 

Hadoop QA commented on HADOOP-13453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 5s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
22s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12854675/HADOOP-13453-HADOOP-13345-001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 929c7a818845 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 95e0143 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11717/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11717/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADO

[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884401#comment-15884401
 ] 

Hadoop QA commented on HADOOP-13453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 1s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
22s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12854676/HADOOP-13453-HADOOP-13345-002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 72388b8af098 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 95e0143 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11718/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11718/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADO

[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-25 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884420#comment-15884420
 ] 

Ai Deng commented on HADOOP-13453:
--

Hi [~ste...@apache.org] Thank you for the information. I will keep adding more 
metrics. (still on the S3Guard class level)

Regards of the metrics mentioned in the document (I have copied it to the Jira 
ticket), as we consider the metadata in the store is "fresher", and use it 
first, we don't do any recheck(for the inconsistent between S3 and metadata 
store) right? So the metrics "S3GuardRechecksNthPercentileLatency", 
"S3GuardRechecksNumOps", "S3GuardInconsistencies" will not need any more.

For the Jenkins build, I can't find the "Submit" button on the Jira, it because 
my user permission or I miss something?

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-02-25 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884425#comment-15884425
 ] 

Ai Deng commented on HADOOP-13453:
--

Looks like the Jenkins run automatically for the patch. Will modify the 
existing S3Guard tests scenarios to test the metrics added. 

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-03-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905231#comment-15905231
 ] 

Steve Loughran commented on HADOOP-13453:
-

I'm afraid HADOOP-13914 has just broken the patch, which means, sadly, you get 
to do the merge. Let's get this in *before* anything else traumatic comes in, 
so other patches get to suffer next time.

I like what you've done measuring latency as well as counts. I think we could 
actually do this more broadly. I think the timing counting should be in a 
finally() clause though, so timings for failures get included too. (side issue: 
count success and failures separately? with different timings?)

I would like to think about how we could avoiding having to pass the 
instrumentation around all the time. Ideally, we could just pass it in as a 
constructor to the metadata store. Alternatively, that store could collect 
metrics and we could wire it up, but I don't see an easy way to do that in 
Hadoop metrics (compared to Coda Hale's). The easiest would be just to pass in 
the S3AInstrumentation (or an inner class) down, but currently the metastore 
interface is not specific to S3A only.

If we add an interface for metadata store instrumentation, then 
S3AInstrumentation can implement it in an inner class, and S3AFS can pass it 
down during initialization. Th's would let the metastore do all it wants, with 
well defined strings, of course.

What do people think?


> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-03-19 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931609#comment-15931609
 ] 

Ai Deng commented on HADOOP-13453:
--

Hi [~ste...@apache.org], I have added a new patch following your suggestion. If 
it is ok, we can discuss the metrics we want to add?

I come out this list of operation and latency metrics for this ticket, can you 
check if I miss anything? Thank you.

Status:
S3GUARD_METADATASTORE_ENABLED
S3GUARD_METADATASTORE_IS_AUTHORITATIVE
Operations:
S3GUARD_METADATASTORE_INITIALIZATION
S3GUARD_METADATASTORE_DELETE_PATH
S3GUARD_METADATASTORE_DELETE_PATH_LATENCY
S3GUARD_METADATASTORE_DELETE_SUBTREE_PATCH
S3GUARD_METADATASTORE_GET_PATH
S3GUARD_METADATASTORE_GET_PATH_LATENCY
S3GUARD_METADATASTORE_GET_CHILDREN_PATH
S3GUARD_METADATASTORE_GET_CHILDREN_PATH_LATENCY
S3GUARD_METADATASTORE_MOVE_PATH
S3GUARD_METADATASTORE_PUT_PATH
S3GUARD_METADATASTORE_PUT_PATH_LATENCY
S3GUARD_METADATASTORE_CLOSE
S3GUARD_METADATASTORE_DESTORY
>From S3Guard:
S3GUARD_METADATASTORE_MERGE_DIRECTORY
For the failures:
S3GUARD_METADATASTORE_DELETE_FAILURE
S3GUARD_METADATASTORE_GET_FAILURE
S3GUARD_METADATASTORE_PUT_FAILURE
Etc:
S3GUARD_METADATASTORE_PUT_RETRY_TIMES

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-03-19 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931611#comment-15931611
 ] 

Ai Deng commented on HADOOP-13453:
--

I think maybe measure the number of path has been operated (put, get … ) in 
MetaStore could be interesting. The end user can see how big their S3 file 
system has been managed in S3Guard.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-03-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931613#comment-15931613
 ] 

Hadoop QA commented on HADOOP-13453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
34s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 19 
new + 25 unchanged - 0 fixed = 44 total (was 25) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 18m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859449/HADOOP-13453-HADOOP-13345-003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bd705d5c15d7 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / b54e1b2 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/ji

[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-04-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966617#comment-15966617
 ] 

Steve Loughran commented on HADOOP-13453:
-

This is on my review todo list, don't think I've forgotten! 

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-04-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1596#comment-1596
 ] 

Steve Loughran commented on HADOOP-13453:
-

I see your point about quantile config: don't know what to do there. 

* be careful with those imports,  S3AInstrumentation had all its imports 
expanded and moved around. Import sections are one of the main merge-conflict 
areas, so its critical to keep changes there to a minimum. We normally turn off 
any IDE automatic features to avoid this.
* moved off the split of separate interface and impl for the instrumentation; 
everything is closely couple enough we don't need to abstract things away.

I'm doing a patch with my changes; if everything is happy and I can run a full 
integration test suite (HADOOP-14216 has broken this), then I'll +1 it; once it 
is in we can expand the metrics

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967964#comment-15967964
 ] 

Hadoop QA commented on HADOOP-13453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 0s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
30s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 1 
new + 38 unchanged - 0 fixed = 39 total (was 38) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:612578f |
| JIRA Issue | HADOOP-13453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12863312/HADOOP-13453-HADOOP-13345-005.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 40fb9ba1fa78 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / af8250a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jir

[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-04-14 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969596#comment-15969596
 ] 

Ai Deng commented on HADOOP-13453:
--

Ok, will test the path05 when I back from holiday. Happy easter everyone!

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch, 
> HADOOP-13453-HADOOP-13345-004.patch, HADOOP-13453-HADOOP-13345-005.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-05-03 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994745#comment-15994745
 ] 

Ai Deng commented on HADOOP-13453:
--

Hi [~steve_l]  happy with patch5, we should push it to the branch and start to 
add more metrics?

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch, 
> HADOOP-13453-HADOOP-13345-004.patch, HADOOP-13453-HADOOP-13345-005.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-05-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995403#comment-15995403
 ] 

Steve Loughran commented on HADOOP-13453:
-

+1
yeah. I'll do it now, one last test run to see all is well

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch, 
> HADOOP-13453-HADOOP-13345-004.patch, HADOOP-13453-HADOOP-13345-005.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-05-03 Thread Ai Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995790#comment-15995790
 ] 

Ai Deng commented on HADOOP-13453:
--

Cool, I listed my suggestion for the metrics in previous comment, what is your 
thoughts? Let's decide the list first.

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch, 
> HADOOP-13453-HADOOP-13345-004.patch, HADOOP-13453-HADOOP-13345-005.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-05-04 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996530#comment-15996530
 ] 

Steve Loughran commented on HADOOP-13453:
-

Why not take that list, create a new JIRA off HADOOP-13345 "add more s3guard 
metrics" and suggest those as the start.

One interesting one to see if we could detect would be mismatches between 
s3guard and the underlying object store: if we can observe inconsistencies 
(how?) then that should be measured. The S3mper blog posts looks at how netflix 
detected consistency issues in S3 that way

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch, HADOOP-13453-HADOOP-13345-003.patch, 
> HADOOP-13453-HADOOP-13345-004.patch, HADOOP-13453-HADOOP-13345-005.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent
> in rechecks attempting to achieve consistency. Repeated for multiple 
> percentile values
> of N.  This metric is an indicator of the additional latency cost of running 
> S3A with
> S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) ­​ Number of times a consistency
> recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) ­​ Percentile time 
> spent in
> operations against the consistent store, including both write operations 
> during file system
> mutations and read operations during file system consistency checks. Repeated 
> for
> multiple percentile values of N. This metric is an indicator of latency to 
> the consistent
> store implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) ­​ Number of operations
> against the consistent store, including both write operations during file 
> system mutations
> and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) ­​ Number of failures
> during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) ­​ Number of timeouts
> during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) ­ C​ ount of times S3Guard 
> failed to
> achieve consistency, even after exhausting all rechecks. A high count may 
> indicate
> unexpected out­of­band modification of the S3 bucket contents, such as by an 
> external
> tool that does not make corresponding updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13453) S3Guard: Instrument new functionality with Hadoop metrics.

2017-01-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15805499#comment-15805499
 ] 

Steve Loughran commented on HADOOP-13453:
-

# these should all go into org.apache.hadoop.fs.s3a.Statistic & Instrumentation
# tracking total & ongoing dynamoDB request rates could be useful, as it will 
help identify when you've over/under provisioned your DDB.
# include stats on detected inconsistencies
# include in {{S3AFileSystem.toString}}.

 

> S3Guard: Instrument new functionality with Hadoop metrics.
> --
>
> Key: HADOOP-13453
> URL: https://issues.apache.org/jira/browse/HADOOP-13453
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Ai Deng
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org