Re: [DISCUSS] HADOOP-9122 Add power mock library for writing better unit tests

2017-10-27 Thread Chris Douglas
Sorry, took a while to get back to this.

In this example, the win with PowerMock is the capability to set the
private static field, so it can be final outside the unit test (in the
version ultimately committed, there's a @VisibleForTesting static
method to overwrite the static field between tests)? I'm not familiar
with how this class is used, but if it's an injected @Singleton, then
what is the virtue of making the field static at all? Would it be
possible to create a new instance for each test, if this were a member
field? Using reflection to set internal state, even if it's a library
with a good API, creates brittle coupling between unit tests and
implementations... this library sounds like syntactic sugar for the
brittle anti-patterns that Steve cited as the bane of mock tests,
generally.
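
To make the member-field alternative concrete, here is a minimal sketch. It
is not the YARN-7202 code: ApiService, Client, and the status codes are
hypothetical names made up for illustration. The collaborator is a final
member passed to the constructor, so each test builds a fresh instance with
a failing stub and no reflection or static overwrite is needed.

```java
import java.io.IOException;

public class InjectionSketch {
    /** Hypothetical stand-in for the client that the REST layer calls. */
    interface Client {
        void submit(String name) throws IOException;
    }

    /** Hypothetical REST-style service; the client is a final member, not a static field. */
    static final class ApiService {
        private final Client client;

        ApiService(Client client) {
            this.client = client;
        }

        /** Maps the client's outcome to an HTTP-like status code (codes are assumptions). */
        int create(String name) {
            try {
                client.submit(name);
                return 202;
            } catch (IllegalArgumentException e) {
                return 400;
            } catch (IOException e) {
                return 500;
            }
        }
    }

    public static void main(String[] args) {
        // Each case builds its own instance, so no state leaks between tests.
        ApiService ok = new ApiService(name -> { });
        ApiService bad = new ApiService(name -> {
            throw new IllegalArgumentException("simulated bad spec");
        });
        ApiService down = new ApiService(name -> {
            throw new IOException("simulated backend failure");
        });

        System.out.println("success -> " + ok.create("app"));
        System.out.println("bad input -> " + bad.create("app"));
        System.out.println("backend failure -> " + down.create("app"));
    }
}
```

The negative cases are driven entirely through the constructor argument,
which is the property the static field takes away.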

Nonetheless, if you find that the existing tooling is insufficient,
then please feel free to include it. -C

On Mon, Oct 2, 2017 at 4:55 PM, Eric Yang wrote:
> Chris,
>
> Here is a patch that uses PowerMock:
> https://issues.apache.org/jira/secure/attachment/12889144/YARN-7202.yarn-native-services.002.patch
> It was written to verify that when ServiceClient interacts with Hadoop and
> throws one of the exception types declared by the ServiceClient API, the
> REST API layer handles the error code correctly. It helps to simulate
> internal errors and safeguard the API against them. It seems like a
> useful way to avoid setting up a full MiniYarnCluster, submitting a job,
> and generating actual failure situations in the backend.
>
> It looks like a useful way to test negative cases. The full exercise of
> the positive case is written in another test, TestYarnNativeServices, in the
> hadoop-yarn-services-api project.
> Without the ability to inject faults into the system, it is harder to test
> negative cases. However, I found it difficult to attempt this in the Hadoop
> code base. Suggestions?
>
> Regards,
> Eric
>
> On 10/2/17, 3:09 PM, "Chris Douglas" wrote:
>
> Eric/Steve-
>
> Please pick a test- any test- and demonstrate why Powermock would
> improve- by any metric- testing in Hadoop. -C
>
>
>
> On Mon, Oct 2, 2017 at 2:12 PM, Eric Yang wrote:
> > Mocking provides tool chains to run a simulation for a piece of code. It
> > helps to prevent null pointer exceptions and reduce unexpected runtime
> > exceptions. When a piece of code is finished with a well-defined unit test,
> > the test gives great insight into the author's intention and reasoning in
> > writing the code. However, everyone looks at code from a different
> > perspective, and it is often easier to rewrite the code than to modify and
> > update the tests. The shortcoming of writing new code is that there is always
> > a danger of losing the existing purpose or a workaround buried deep in the
> > code. On the other hand, if a test program is filled with several pages of
> > initialization code and overrides, it is hard to get the context of the test
> > case and easy to lose its original meaning. Hence, there are drawbacks to
> > using either mocks or full integration tests.
> >
> > Initially I was in favor of using PowerMock, to give users the ability
> > to unit test a class and reduce external interference. However, I
> > quickly came to the realization that Hadoop's use of protocol buffer
> > serialization and of Java reflection serialization differ in ways that
> > prevent PowerMock from working for certain Hadoop classes.
> >
> > Hadoop unit tests are written to cover more than one class, and
> > frequently a mini-cluster is spawned to test 5-10 lines of code. Any simple
> > API test triggers initialization of a large portion of the Hadoop code. The
> > Hadoop code base would require too much effort to work with PowerMock.
> > Programs outside of Hadoop can use a PowerMock annotation to prevent mocking
> > Hadoop classes, such as: @PowerMockIgnore({"javax.management.*",
> > "javax.xml.*", "org.w3c.*", "org.apache.hadoop.*", "com.sun.*"}). However,
> > when working in the Hadoop code base, this technique is not practical because
> > every class in Hadoop is prefixed with org.apache.hadoop. It would be heavy
> > upkeep to maintain the list of package prefixes that cannot work with
> > PowerMock reflection.
> > Hence, I rest my case for re-opening this issue.
> >
> > Regards,
> > Eric
> >
> > From: Steve Loughran
> > Date: Sunday, October 1, 2017 at 12:36 PM
> > To: Eric Yang
> > Cc: Andrew Wang, Chris Douglas, "common-dev@hadoop.apache.org"
> > Subject: Re: [DISCUSS] HADOOP-9122 Add power mock library for writing better unit tests
> >
> >
> > On 29 Sep 2017, at 22:46, Eric Yang wrote:
> >
> > Hi Chris and 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-10-27 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/570/

[Oct 25, 2017 2:40:33 PM] (aajisaka) HADOOP-14979. Upgrade 
maven-dependency-plugin to 3.0.2. Contributed by
[Oct 25, 2017 3:08:22 PM] (aw) HADOOP-14977. Xenial dockerfile needs ant in 
main build for findbugs
[Oct 25, 2017 5:54:40 PM] (manojpec) HDFS-12544. SnapshotDiff - support diff 
generation on any snapshot root
[Oct 25, 2017 9:11:30 PM] (xiao) HADOOP-14957. ReconfigurationTaskStatus is 
exposing guava Optional in
[Oct 25, 2017 9:24:22 PM] (arp) HDFS-12579. JournalNodeSyncer should use 
fromUrl field of
[Oct 25, 2017 10:07:50 PM] (subru) YARN-4827. Document configuration of 
ReservationSystem for
[Oct 25, 2017 10:51:27 PM] (subru) HADOOP-14840. Tool to estimate resource 
requirements of an application
[Oct 26, 2017 5:25:10 PM] (rkanter) YARN-7358. TestZKConfigurationStore and 
TestLeveldbConfigurationStore
[Oct 26, 2017 7:10:14 PM] (subu) YARN-5516. Add REST API for supporting 
recurring reservations. (Sean Po
[Oct 26, 2017 10:50:14 PM] (rkanter) YARN-7320. Duplicate LiteralByteStrings in
[Oct 27, 2017 12:47:32 AM] (rkanter) YARN-7262. Add a hierarchy into the 
ZKRMStateStore for delegation token
[Oct 27, 2017 2:13:58 AM] (junping_du) Update CHANGES.md and RELEASENOTES for 
2.8.2 release.
[Oct 27, 2017 2:15:35 AM] (junping_du) Set jdiff stable version to 2.8.2.
[Oct 27, 2017 2:30:48 AM] (junping_du) Add several jdiff xml files for 2.8.2 
release.
[Oct 27, 2017 3:15:19 AM] (wangda) YARN-7307. Allow client/AM update supported 
resource types via YARN
[Oct 27, 2017 9:45:03 AM] (stevel) MAPREDUCE-6977 Move logging APIs over to 
slf4j in
[Oct 27, 2017 2:43:54 PM] (arp) HDFS-9914. Fix configurable WebhDFS 
connect/read timeout. Contributed by
[Oct 27, 2017 3:23:57 PM] (sunilg) YARN-7375. Possible NPE in  RMWebapp when HA 
is enabled and the active
[Oct 27, 2017 5:16:38 PM] (rohithsharmaks) YARN-7289. Application lifetime does 
not work with FairScheduler.




-1 overall


The following subsystems voted -1:
asflicense findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-tools/hadoop-resourceestimator 
   Dead store to jobHistory in 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.getHistoryResourceSkyline(String,
 String) At 
ResourceEstimatorService.java:org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.getHistoryResourceSkyline(String,
 String) At ResourceEstimatorService.java:[line 196] 
   Incorrect lazy initialization and update of static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.skylineStore
 in new org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() 
At ResourceEstimatorService.java:of static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.skylineStore
 in new org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() 
At ResourceEstimatorService.java:[lines 78-82] 
   Write to static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.config 
from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:[line 80] 
   Write to static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.gson from 
instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:[line 106] 
   Write to static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.logParser 
from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:[line 86] 
   Write to static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.rleType 
from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:[line 108] 
   Write to static field 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService.skylineStore
 from instance method new 
org.apache.hadoop.resourceestimator.service.ResourceEstimatorService() At 
ResourceEstimatorService.java:from 

[jira] [Created] (HADOOP-14991) Add missing figures to Resource Estimator tool

2017-10-27 Thread Subru Krishnan (JIRA)
Subru Krishnan created HADOOP-14991:
---

 Summary: Add missing figures to Resource Estimator tool
 Key: HADOOP-14991
 URL: https://issues.apache.org/jira/browse/HADOOP-14991
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Subru Krishnan
Assignee: Rui Li


The figures in the documentation for the Resource Estimator tool are missing in 
HADOOP-14840. This jira tracks adding them. 





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14990) Clean up jdiff xml files added for 2.8.2 release

2017-10-27 Thread Subru Krishnan (JIRA)
Subru Krishnan created HADOOP-14990:
---

 Summary: Clean up jdiff xml files added for 2.8.2 release
 Key: HADOOP-14990
 URL: https://issues.apache.org/jira/browse/HADOOP-14990
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Subru Krishnan
Assignee: Junping Du
Priority: Critical


The jdiff xml files for the 2.8.2 release have been committed to trunk (sha id 
a25b5aa0cf5189247fa38e7b0a188d568eba1b6c). Do we still need them here (what about 
branch-2?)? If so, we have to add the license headers. 






Access to Confluence Wiki

2017-10-27 Thread Hanisha Koneru
Hi,

Can I please get access to the Confluence Hadoop Wiki? My Confluence id is 
“hanishakoneru”.

Thanks,
Hanisha



[jira] [Created] (HADOOP-14989) Multiple metrics2 sinks (incl JMX) result in inconsistent Mutable(Stat|Rate) values

2017-10-27 Thread Erik Krogen (JIRA)
Erik Krogen created HADOOP-14989:


 Summary: Multiple metrics2 sinks (incl JMX) result in inconsistent 
Mutable(Stat|Rate) values
 Key: HADOOP-14989
 URL: https://issues.apache.org/jira/browse/HADOOP-14989
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.6.5
Reporter: Erik Krogen
Priority: Critical


While doing some digging in the metrics2 system recently, we noticed that the 
way {{MutableStat}} values are collected (and thus {{MutableRate}}, since it is 
based on {{MutableStat}}) means that each sink configured (including JMX) 
only receives a portion of the average information.

{{MutableStat}}, to compute its average value, maintains a total value since 
last snapshot, as well as operation count since last snapshot. Upon 
snapshotting, the average is calculated as (total / opCount) and placed into a 
gauge metric, and total / operation count are cleared. So the average value 
represents the average since the last snapshot. If only a single sink ever 
snapshots, this would result in the expected behavior that the value is the 
average over the reporting period. However, if multiple sinks are configured, 
or if the JMX cache is refreshed, this is another snapshot operation. So, for 
example, if you have a FileSink configured at a 60 second interval and your JMX 
cache refreshes itself 1 second before the FileSink period fires, the values 
emitted to your FileSink only represent averages _over the last one second_.

A few ways to solve this issue:
* From an operator perspective, ensure only one sink is configured. This is not 
realistic given that the JMX cache exhibits the same behavior.
* Make {{MutableRate}} manage its own average refresh, similar to 
{{MutableQuantiles}}, which has a refresh thread and saves a snapshot of the 
last quantile values that it will serve up until the next refresh. Given how 
many {{MutableRate}} metrics there are, a thread per metric is not really 
feasible, but could be done on e.g. a per-source basis. This has some 
downsides: if multiple sinks are configured with different periods, what is the 
right refresh period for the {{MutableRate}}? 
* Make {{MutableRate}} emit two counters, one for total and one for operation 
count, rather than an average gauge and an operation count counter. The average 
could then be calculated downstream from this information. This is cumbersome 
for operators and not backwards compatible. To improve on both of those 
downsides, we could have it keep the current behavior but _additionally_ emit 
the total as a counter. The snapshotted average is probably sufficient in the 
common case (we've been using it for years), and when more guaranteed accuracy 
is required, the average could be derived from the total and operation count.
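
The snapshot interaction and the last option can be illustrated with a small 
standalone sketch. This is a simplified model written for this description, not 
the metrics2 code: {{ResettingStat}}, {{CounterStat}} and {{averageBetween}} 
are hypothetical names, and synchronization is omitted. It replays the example 
above: 59 operations of 10 ms, a JMX cache refresh, one operation of 1000 ms, 
then the FileSink snapshot.

```java
public class SnapshotAverageSketch {
    /** Simplified model of the current behavior: snapshot() reports the average and clears state. */
    static final class ResettingStat {
        private long total;
        private long count;
        private double lastAverage;

        void add(long value) {
            total += value;
            count++;
        }

        /** Every caller (a sink or a JMX cache refresh) ends the current averaging interval. */
        double snapshot() {
            if (count > 0) {
                lastAverage = (double) total / count;
                total = 0;
                count = 0;
            }
            return lastAverage;
        }
    }

    /** Model of the proposal: total and operation count are counters that are never cleared. */
    static final class CounterStat {
        private long total;
        private long count;

        void add(long value) {
            total += value;
            count++;
        }

        long total() {
            return total;
        }

        long count() {
            return count;
        }
    }

    /** Average between two counter readings, as a downstream consumer could derive it. */
    static double averageBetween(long total0, long count0, long total1, long count1) {
        long ops = count1 - count0;
        return ops == 0 ? 0.0 : (double) (total1 - total0) / ops;
    }

    public static void main(String[] args) {
        ResettingStat resetting = new ResettingStat();
        CounterStat counters = new CounterStat();

        // FileSink reads the counters at the start of its 60 second period.
        long total0 = counters.total();
        long count0 = counters.count();

        for (int i = 0; i < 59; i++) {
            resetting.add(10);
            counters.add(10);
        }
        double seenByJmx = resetting.snapshot(); // JMX cache refresh clears the interval

        resetting.add(1000);
        counters.add(1000);
        double seenByFileSink = resetting.snapshot(); // covers only the last operation

        double derived = averageBetween(total0, count0, counters.total(), counters.count());

        System.out.println("JMX refresh sees average: " + seenByJmx);
        System.out.println("FileSink sees average: " + seenByFileSink);
        System.out.println("Average derived from counters: " + derived);
    }
}
```

In this model the FileSink value reflects only the single operation after the 
JMX refresh, while the value derived from the two counters covers the whole 
period regardless of how many other readers took snapshots in between.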

Open to suggestions & input here.


