Re: Unnecessary uses of final on local variables

2019-06-19 Thread Kirk Lund
Thanks for the very thoughtful and well-worded reply, Bill. I would
certainly welcome further discussion on how and where to consistently use
the final keyword (if that's desirable), especially within the context of
reviewing pull requests.

On Wed, Jun 19, 2019 at 11:23 AM Bill Burcham  wrote:

> -1 to the proposal as I understand it:
>
> Premise:
>
> a. Because the "final" modifier on local variables serves no function other
> than to inform compiler optimization and…
>
> b. because the compiler can tell if a variable is "effectively final"
> without an explicit "final" modifier in the source code
>
> Conclusion:
>
> • The "final" modifier is just "noise" in the source code, so…
>
> Ramifications:
>
> 1. Do not use the "final" modifier on local variables.
>
> 2. By extension, it would be not just ok, but in fact a positive change, to
> _remove_ the "final" keyword from local variables, because it is just
> noise.
>
> 3. I believe the proposal should be interpreted to apply to method
> parameters as well—not just local variables. (Local variables includes loop
> variables when using for-each syntax—those can potentially be declared
> "final" just like any other local variable).
>
>
> I am thumbs-down on the proposal, first, because I believe the premise of
> the proposal is flawed. If the only value of the "final" modifier on
> storage was compile-time optimization then my view might be different. The
> fact is that the "final" modifier on a variable (or for-each loop variable)
> or method parameter declaration expresses the developer's intention that
> the value not be modified after initialization. Any violation of that
> intention will be flagged and rejected by the compiler. As such, the
> "final" modifier on variable and method parameter declarations is part of
> our arsenal of Java language constructs, such as types (including generic
> types), that let us express our intentions to the compiler.
>
> Lurking in the proposal is a danger (ramification (2)): that some of us
> will think it's not only ok, but an _improvement_ to actively _remove_
> final modifiers from local variable declarations, parameter declarations
> and for-each loop variable declarations. This lurking danger is a clear and
> present threat which I feel requires urgent consensus.
>
> I, for one, put "final" where I mean to put "final". I shouldn't have to
> monitor every single PR to ensure my finals aren't removed as noise—any
> more than I should have to monitor PRs to ensure folks aren't turning
> List<T> back into (pre-generic) List.
>
> Can we at least agree to this:
>
> • when we encounter a final local variable, method parameter, or for-each
> loop variable declaration, we agree to _not_ remove the final modifier
> unless we have to
>
> And by "have to", I mean, we actually need to mutate the variable to make
> the code work. Any aesthetic concern about "final" being "noise" would not
> count here.
>
> -Bill
>


Unnecessary uses of final on local variables

2019-06-13 Thread Kirk Lund
According to Effective Java, 3rd Edition, all local variables that are
assigned only once are implicitly treated as final by the JVM (and thus
receive the same JVM optimizations as a final variable) without requiring use
of the keyword. So this becomes an unnecessary use of the keyword final, which
really just adds noise to the code (one more word on many lines), much like
unnecessary uses of this. on references to class fields. The only context
where it's still valuable to use final is on class fields and constants,
where it provides concurrency guarantees.
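For example, a once-assigned local can be captured by a lambda with no final
keyword, which is only legal because the compiler already treats it as
effectively final (a minimal sketch; the class and method names below are
illustrative, not from Geode):

```java
import java.util.function.Supplier;

public class EffectivelyFinalDemo {
    static String shout() {
        String greeting = "hello"; // assigned exactly once: effectively final
        // Capturing the variable in a lambda compiles without the keyword,
        // precisely because the compiler already treats it as final.
        Supplier<String> supplier = () -> greeting.toUpperCase();
        // Reassigning greeting anywhere in this method would break the
        // lambda capture above at compile time, e.g.: greeting = "changed";
        return supplier.get();
    }

    public static void main(String[] args) {
        System.out.println(shout()); // prints HELLO
    }
}
```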

It may be useful to temporarily mark a local variable as final while
performing a refactoring. See Martin Fowler's book for various refactorings
in which he does this.

There really is no value in retaining the keyword on a local variable any
longer, so I think we should avoid using it in the same way we avoid
unnecessary uses of this. on references to class fields. It's all just more
noise in the code.


Re: [PROPOSAL] Instrumenting Geode Code

2019-06-11 Thread Kirk Lund
+1 Looks good

On Mon, Jun 10, 2019 at 5:13 PM Aaron Lindsey  wrote:

> +1
>
> I like this approach compared to the previous proposals because it's
> simpler (doesn't require a custom registry) and makes it more
> straightforward to replace stats with Micrometer meters in the future.
>
> - Aaron
>
>
> On Fri, Jun 7, 2019 at 4:58 PM Dale Emery  wrote:
>
>> Proposal for instrumenting Geode
>>
>>
>> *Goals*
>>
>> - Allow routing measurements to statistics, Micrometer meters, or both as
>> appropriate, to allow publishing each measurement to the appropriate places.
>>
>> - Minimize the intrusiveness of new and existing metrics instrumentation.
>>
>> - Minimize the amount of code that must know how to route measurements to
>> stats and meters.
>>
>> - Maintain Geode performance.
>>
>> - Not preclude any options for deprecating and removing Geode’s
>> statistics mechanism in the future. And take steps to make deprecating and
>> removing the existing statistics mechanism easier.
>>
>>
>> *Proposal*
>>
>> - Continue to use Geode’s existing style of instrumentation class (e.g.
>> CachePerfStats) to instrument code.
>>
>> - Enhance the instrumentation classes to create and register meters when
>> appropriate, and to route each measurement to the appropriate stats and
>> meters.
>>
>> - When we want to route a given measurement to both a statistic and a
>> meter, use the "legacy meter" wrapper mechanism (described below).
>>
>> - Incrementally improve the instrumentation classes to focus more on
>> reporting domain events (e.g. regionAdded()) and less on reporting
>> measurements (e.g. incRegions()).
>>
>>
>> *Legacy Meter Wrappers*
>>
>> To route a given measurement to both a Micrometer meter and a Geode
>> statistic, do the following in the instrumentation class's constructor:
>>
>> - Create and register a normal Micrometer meter in the cache's meter
>> registry.
>>
>> - Wrap the registered meter in a custom "legacy meter" that reads and
>> writes the stat, and also writes to the registered meter.
>>
>> In the instrumentation methods (e.g. endPut()):
>>
>> - Use the legacy meter to report measurements.
>>
>> I've attached two diagrams below to show how the wrapper mechanism works
>> for "counter" style stats and meters. The first is a class diagram, showing
>> how the parts relate in general. The second is a sequence diagram that
>> shows in some detail how the parts interact to route a given measurement to
>> both a meter and a stat.
>>
>> If you want even more details, Jake Barrett has created a nice "proof of
>> concept" implementation:
>> https://github.com/apache/geode/compare/develop...pivotal-jbarrett:wip/LegacyMeterBuilder
>>
>>
>>
>> Please let us know if you have questions, comments, or concerns.
>>
>>
>> Cheers,
>> Dale
>>
>> —
>> Dale Emery
>> dem...@pivotal.io
>>
>>
>>
>>
>>
>>
>>
>>
>>
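The legacy meter wrapper described above can be sketched roughly as follows.
The interfaces and class names here are simplified stand-ins, not Geode's
statistics API or Micrometer's actual API:

```java
// Simplified stand-ins for a Geode statistic and a Micrometer counter; these
// interface and class names are illustrative, not the real APIs.
interface StatCounter {
    void inc();
    long get();
}

interface MeterCounter {
    void increment();
    double count();
}

class SimpleStat implements StatCounter {
    private long value;
    public void inc() { value++; }
    public long get() { return value; }
}

class SimpleMeter implements MeterCounter {
    private double value;
    public void increment() { value++; }
    public double count() { return value; }
}

// The "legacy meter" wrapper: a single increment fans out to both the stat
// and the registered meter, so instrumentation methods touch one object.
class LegacyCounter implements MeterCounter {
    private final StatCounter stat;
    private final MeterCounter meter;

    LegacyCounter(StatCounter stat, MeterCounter meter) {
        this.stat = stat;
        this.meter = meter;
    }

    @Override
    public void increment() {
        stat.inc();        // keep the existing statistic up to date
        meter.increment(); // and route the same measurement to the meter
    }

    @Override
    public double count() {
        return stat.get(); // reads are served by the stat
    }
}
```

Under this sketch, an instrumentation class such as CachePerfStats would
build the wrapper in its constructor and call only increment() from methods
like endPut().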


Please review PR #3650

2019-06-06 Thread Kirk Lund
Reminder: Please review PR #3650 https://github.com/apache/geode/pull/3650.
Cleanup commits are separated from the fix commits to help facilitate
review.

GEODE-6183: Fix isAttachAPIFound and launcher tests

Thanks,
Kirk


Re: what is the best way to update a geode pull request

2019-06-04 Thread Kirk Lund
+1 for doing whatever facilitates reviewing and is best for the PR (which
may vary by PR or even reviewers)
-1 to disallowing or even strongly discouraging force pushing in a PR (go
ahead and merge or rebase as each person prefers but don't force that
preference on others)

I prefer to separate cleanup/refactor commits from behavior change commits
but I accept that it's on a per-case or per-author basis. I care more about
cleaning up code and adding unit tests than making the reviewing easier.

On Fri, May 31, 2019 at 5:32 PM Aaron Lindsey  wrote:

> +1 to updating the PR template. I've noticed that few people actually
> follow it and sometimes they just remove it altogether.
> +1 to pushing PR changes as separate commits. This makes PR review easier.
>
> Sometimes it's helpful to me as a reviewer for the initial PR to be split
> into multiple commits as well. Also, I like Robert's suggestion of merge
> instead of rebase when the PR is out-of-date with develop.
>
> - Aaron
>
>
> On Fri, May 31, 2019 at 2:50 PM Jacob Barrett  wrote:
>
> >
> >
> > > On May 31, 2019, at 2:40 PM, Udo Kohlmeyer  wrote:
> > >
> > > If we are concerned about the single line that can break the product,
> > then our testing has failed us on so many levels, that there is no hope.
> >
> > Sorry, I used a hyperbolic statement about looking for 1 line out of
> 1000.
> > The point was “formatting” or “cleanup” style commits are better left
> > separate because looking for the real change in that sea of change is
> hard.
> >
> > > But looking forward to see how long one can sustain the "factor ->
> > commit -> make changes required to fulfill JIRA -> commit -> manual
> merge"…
> >
> > It’s only a problem if you are cleaning up lots of code. Not a bad problem
> > to have and the future looks brighter each time.
> >
> > > Also, who reviews the refactor, because even that can introduce
> > unintentional bugs... same effort as single commit. In single commit, if
> > the refactor has made code cleaner, clearer and simpler, maybe 1 commit
> is
> > easier to follow.
> >
> > I think there is a distinction between a refactor and cleanup. Consider
> > the time we decide to reformat all the code, that was a cleanup. Now as
> we
> > are going through the code and IJ reports every other line as a static
> > analyzer warning, fixing that is a cleanup. All these cleanups have been
> > reviewed like any other PR. The point being made was that they are done in
> a
> > way that allows the reviewer to review the clean and the change
> > independently.
> >
> > A refactor would be a complete reorganization of code and should
> > have the tests, reviews, etc. that go with it.
> >
> > Regardless, all are reviewed.
> >
> > -Jake
> >
> >
>


Re: [DISCUSS] require reviews before merging a PR

2019-06-04 Thread Kirk Lund
I'm -1 for requiring N reviews before merging a commit.

Overall, I support Lazy Consensus. If I post a PR that fixes the flakiness
in a test, the precheckin jobs prove it, and it sits there for 2 weeks
without reviews, then I favor merging it in at that point without any
reviews. I'm not going to chase people around or spam the dev list over and
over asking for reviews. Nothing in the Apache Way says you have to do
reviews before committing -- some projects prefer "commit then review"
instead of "review then commit". You can always look at the code someone
changed and you can always change it further or revert it.

I think if we don't trust our committers then we have a bigger systemic
problem that becoming more strict about PR reviews is not going to fix.

Overall, I also favor pairing/mobbing over reviews. Without being there
during the work, a reviewer lacks the context to understand why it was done
the way it was done.

If we cannot establish or maintain trust in committers, then I think we
should remove committer status from everyone and start over as a project,
proposing and accepting one committer at a time.

Instead of constraints on reviews, I would prefer to establish new criteria
for coding such as:
1) all classes touched in a PR must have a unit test created if none exists
2) all code touched in a PR must have unit test coverage (and possibly
integration test coverage) specific to the changes
3) all new classes must have full unit test coverage
4) all code touched in a PR must follow clean code principles (which would
obviously need defining on the wiki)

Then it becomes the responsibility of the author(s) and committer(s) of
that PR to ensure that the code and the PR follows the project's criteria
for code quality and test coverage. It also becomes easier to measure the
PRs of a non-committer to determine if we think they would make a good
committer (for example, do they adhere to clean code quality and unit
testing with mocks? -- along with any other criteria).

On Thu, May 30, 2019 at 3:51 PM Owen Nichols  wrote:

> It seems common for Geode PRs to get merged with only a single green
> checkmark in GitHub.
>
> According to https://www.apache.org/foundation/voting.html we should not
> be merging PRs with fewer than 3 green checkmarks.
>
> Consensus is a fundamental value in doing things The Apache Way.  A single
> +1 is not consensus.  Since we’re currently discussing what it takes to
> become a committer and what standards a committer is expected to uphold, it
> seems like a good time to review this policy.
>
> GitHub can be configured to require N reviews before a commit can be
> merged.  Should we enable this feature?
>
> -Owen
> VOTES ON CODE MODIFICATION <
> https://www.apache.org/foundation/voting.html#votes-on-code-modification>
> For code-modification votes, +1 votes are in favour of the proposal, but
> -1 votes are vetos 
> and kill the proposal dead until all vetoers withdraw their -1 votes.
>
> Unless a vote has been declared as using lazy consensus <
> https://www.apache.org/foundation/voting.html#LazyConsensus> , three +1
> votes are required for a code-modification proposal to pass.
>
> Whole numbers are recommended for this type of vote, as the opinion being
> expressed is Boolean: 'I approve/do not approve of this change.'
>
>
> CONSENSUS GAUGING THROUGH SILENCE <
> https://www.apache.org/foundation/voting.html#LazyConsensus>
> An alternative to voting that is sometimes used to measure the
> acceptability of something is the concept of lazy consensus <
> https://www.apache.org/foundation/glossary.html#LazyConsensus>.
>
> Lazy consensus is simply an announcement of 'silence gives assent.’ When
> someone wants to determine the sense of the community this way, it might do
> so with a mail message such as:
> "The patch below fixes GEODE-12345; if no-one objects within three days,
> I'll assume lazy consensus and commit it."
>
> Lazy consensus cannot be used on projects that enforce a
> review-then-commit <
> https://www.apache.org/foundation/glossary.html#ReviewThenCommit> policy.
>
>
>


Re: [DISCUSS] Remove exception.getMessage() error handling

2019-05-28 Thread Kirk Lund
I favor providing a detailed log message that includes the Throwable as an
argument.

I have seen code like the following in Geode:

} catch (Exception exception) {
  logger.error(exception.getMessage());
}

...but I consider that to be a naive anti-pattern for informing the user
that a problem occurred. For one thing, getMessage() is not guaranteed to
have a non-null value even for Exceptions thrown by the JDK. For example,
NullPointerExceptions thrown by the JDK usually (if not always) have a null
message and the result in the Geode log is:

null

...which is clearly not useful and is very confusing. It actually just
looks like a bug.

I think any error-handling that logs an exception should include a log
message with at least some basic context about what was attempted that
resulted in the exception:

  logger.error("Attempt to perform operation foo resulted in {}",
exception);

The above will then use exception.toString() for the {} placeholder, which
provides much better detail to the user about what problem actually occurred.
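The difference is easy to demonstrate: a JDK exception constructed without a
message returns null from getMessage(), while toString() at least names the
exception class. A minimal sketch (class and method names here are
illustrative):

```java
public class ExceptionLoggingDemo {
    // Contrast the two ways of turning an exception into a log message.
    static String viaGetMessage(Throwable t) {
        return String.valueOf(t.getMessage()); // may be the literal "null"
    }

    static String viaToString(Throwable t) {
        return t.toString(); // always includes the exception class name
    }

    public static void main(String[] args) {
        Throwable npe = new NullPointerException(); // constructed with no message
        System.out.println(viaGetMessage(npe)); // prints: null
        System.out.println(viaToString(npe));   // prints: java.lang.NullPointerException
    }
}
```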

On Tue, May 28, 2019 at 10:50 AM Bruce Schuchardt 
wrote:

> In some ways I agree with Owen - it's always better to have more
> information about a problem.  I can recall seeing blank error messages
> or "null" on many occasions.
>
> In other ways I agree with Dan and Anthony that this needs to not be a
> global search, but I think that's mostly because our alert
> system is intertwined with our logging system.  An operator wouldn't
> know what do make of "UnknownHostException" in an alert but someone
> doing forensic analysis might want to see that exception name in a log
> file.
>
> On 5/28/19 10:03 AM, Anthony Baker wrote:
> > In the example you provided, I don’t agree that adding the exception
> class name creates a better user experience.
> >
> > Anthony
> >
> >
> >> On May 25, 2019, at 6:39 PM, Owen Nichols  wrote:
> >>
> >> Here’s an example of a message that was logged before Jack’s change:
> >>
> >> l192.168.99.1: nodename nor servname provided, or not known
> >>
> >> Here’s what it will look like now with .toString() instead of
> .getMessage():
> >>
> >> java.net.UnknownHostException: l192.168.99.1: nodename nor servname
> provided, or not known
> >>
> >>
> >> I cannot think of a single good reason to ever print an exception’s
> message stripped of all context.  As Jack noted, many exceptions don’t even
> have a message; the key information is the class of the exception itself.
> Even if you aren’t catching Exception but rather some very specific
> subclass, code should never make assumptions about the text of an exception
> (see PR#3616  for a recent
> example).
> >>
> >>
> >> @jinmei There is no built-in method in java to capture the entire
> stacktrace as a string.  Exception::toString() just concatenates the
> exception class name and message (i.e. the first line only of a full
> stacktrace)
> >>
> >>
> >>> On May 24, 2019, at 3:37 PM, Dan Smith  wrote:
> >>>
> >>> I think the right thing to do in those 100+ cases may be different in
> each
> >>> case, a blanket search and replace might not be the best idea.
> >>>
> >>> -Dan
> >>>
> >>>
> >>>
> >>> On Fri, May 24, 2019 at 3:05 PM Jinmei Liao  wrote:
> >>>
>  does exception.toString() print out the entire stack trace? If so, I
> don't
>  think we should use that to replace exception.getMessage().
> 
>  On Fri, May 24, 2019 at 1:18 PM Jack Weissburg  >
>  wrote:
> 
> > Hi All,
> >
> > Owen and I investigated a strange error message caused because Geode
> > previously handled an error exception with exception.getMessage()
> instead
> > of
> > exception.toString().
> >
> > The issue is that the exception message from exception.getMessage()
> could
> > be unhelpful or empty. Instead, we should return the whole exception
> via
> > exception.toString().
> >
> > exception.getMessage()is used widely throughout the Geode code base
> (100+
> > times) and could be prone to cause more issues similar to
> > https://issues.apache.org/jira/browse/GEODE-6796
> > I would like to change the remaining instances of
>  exception.getMessage()to
> > avoid future issues of this variety.
> >
> > Thoughts? Comments?
> >
> > Best,
> > Jack Weissburg
> >
> 
>  --
>  Cheers
> 
>  Jinmei
> 
>
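As Owen notes above, there is no single built-in method that returns the
whole stack trace as a String, but the standard printStackTrace(PrintWriter)
overload gets there in a couple of lines. A sketch (the class name is
illustrative):

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceToString {
    static String fullStackTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw)); // writes class, message, and every frame
        return sw.toString();
    }

    public static void main(String[] args) {
        // The first line matches toString(); the lines after it are the frames.
        System.out.println(fullStackTrace(new IllegalStateException("boom")));
    }
}
```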


Re: Remove mavenLocal from Geode gradle files

2019-05-08 Thread Kirk Lund
The following easily reproduces this problem for me:

1) Checkout geode head revision of develop.
2) Create a maven project with these dependencies:



<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.13-beta-3</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.assertj</groupId>
    <artifactId>assertj-core</artifactId>
    <version>3.12.2</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>pl.pragmatists</groupId>
    <artifactId>JUnitParams</artifactId>
    <version>1.1.1</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>2.11.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.geode</groupId>
    <artifactId>geode-core</artifactId>
    <version>1.9.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.geode</groupId>
    <artifactId>geode-junit</artifactId>
    <version>1.9.0</version>
    <scope>test</scope>
  </dependency>
</dependencies>



3) Delete your .m2 directory
4) Install maven 3.6.1 (the very latest version)
5) Use mvn to build the maven project (this will create a .m2 directory)

$ mvn package

6) Go to your geode checkout and build it

The above 100% reproduces the following every time. If I delete either
mavenLocal or ~/.m2 then geode builds. I do not have an installation of
gradle. My ~/.gradle/gradle.properties can either be deleted or present.
I'm using JDK 1.8.0_202.

$ ./gradlew build -x test -x pmdMain -x javadoc

/Users/klund/dev/geode3 [554]$ ./gradlew build -x
test -x pmdMain -x javadoc

To honour the JVM settings for this build a new JVM will be forked. Please
consider using the daemon:
https://docs.gradle.org/5.4/userguide/gradle_daemon.html.

Daemon will be stopped at the end of the build stopping after processing

*> Task :geode-assembly:compileDistributedTestJava* FAILED

*> Task :extensions:geode-modules-session:compileIntegrationTestJava* FAILED

*> Task :geode-core:compileIntegrationTestJava* FAILED


FAILURE: Build completed with 3 failures.


1: Task failed with an exception.

---

* What went wrong:

Execution failed for task ':geode-assembly:compileDistributedTestJava'.

> Could not resolve all files for configuration
':geode-assembly:distributedTestCompileClasspath'.

   > Could not find log4j-core-tests.jar
(org.apache.logging.log4j:log4j-core:2.11.1).

 Searched in the following locations:


file:/Users/klund/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-tests.jar

   > Could not find log4j-core-test-sources.jar
(org.apache.logging.log4j:log4j-core:2.11.1).

 Searched in the following locations:


file:/Users/klund/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-test-sources.jar


* Try:

Run with *--stacktrace* option to get the stack trace. Run with *--info* or
*--debug* option to get more log output. Run with *--scan* to get full
insights.

==


2: Task failed with an exception.

---

* What went wrong:

Execution failed for task
':extensions:geode-modules-session:compileIntegrationTestJava'.

> Could not resolve all files for configuration
':extensions:geode-modules-session:integrationTestCompileClasspath'.

   > Could not find jetty-http-tests.jar
(org.eclipse.jetty:jetty-http:9.4.12.v20180830).

 Searched in the following locations:


file:/Users/klund/.m2/repository/org/eclipse/jetty/jetty-http/9.4.12.v20180830/jetty-http-9.4.12.v20180830-tests.jar


* Try:

Run with *--stacktrace* option to get the stack trace. Run with *--info* or
*--debug* option to get more log output. Run with *--scan* to get full
insights.

==


3: Task failed with an exception.

---

* What went wrong:

Execution failed for task ':geode-core:compileIntegrationTestJava'.

> Could not resolve all files for configuration
':geode-core:integrationTestCompileClasspath'.

   > Could not find log4j-core-tests.jar
(org.apache.logging.log4j:log4j-core:2.11.1).

 Searched in the following locations:


file:/Users/klund/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-tests.jar

   > Could not find log4j-core-test-sources.jar
(org.apache.logging.log4j:log4j-core:2.11.1).

 Searched in the following locations:


file:/Users/klund/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-test-sources.jar


* Try:

Run with *--stacktrace* option to get the stack trace. Run with *--info* or
*--debug* option to get more log output. Run with *--scan* to get full
insights.

==


* Get more help at *https://help.gradle.org <https://help.gradle.org>*


Deprecated Gradle features were used in this build, making it incompatible
with Gradle 6.0.

Use '--warning-mode all' to show the individual deprecation warnings.

See
https://docs.gradle.org/5.4/userguide/command_line_interface.html#sec:command_line_warnings


*BUILD FAILED* in 57s

389 actionable tasks: 46 executed, 86 from cache, 257 up-to-date


Re: Remove mavenLocal from Geode gradle files

2019-05-08 Thread Kirk Lund
I'll assign the bug to you so you can decide if you want to close it or try
to fix it...

On Wed, May 8, 2019 at 1:26 PM Kirk Lund  wrote:

> Other people including Barry have run into it as well.
>
> No, I'm not using --offline, but I will try --refresh-dependencies. The
> only thing *weird* that I'm doing is building a composite build for one
> of my checkouts.
>
> I'm sure I can find a way to fix it for myself if you believe we need to
> use mavenLocal.
>
> Thanks,
> Kirk
>
> On Wed, May 8, 2019 at 12:41 PM Patrick Rhomberg 
> wrote:
>
>> It is strange to me that your build is *only* looking in Maven-local.
>> You're not building with --offline, are you?  Does running with
>> --refresh-dependencies resolve this issue when you have it?
>>
>> Is anything fiddling with your ~/.m2/repository without updating the
>> corresponding maven xml entries?  It seems to me like your local Maven xml
>> records believes it should have an artifact, but then the expected file
>> isn't present on disk.
>>
>> I don't think the forum thread you found is still relevant.  That thread
>> points to GRADLE-2709, which has been fixed since Gradle 1.9.
>>
>> All that said, if you don't want to use your local ~/.m2, that's your
>> business.  If that folder doesn't exist, the presence of mavenLocal()
>> shouldn't
>> have any effect.  But some of us are making use of it.
>>
>> On Wed, May 8, 2019 at 12:24 PM Robert Houghton 
>> wrote:
>>
>> > Jake, my understanding is that benchmarks does not need geode to be
>> able to
>> > pull resources from Maven local. Benchmarks is an external wrapper that
>> > executes geode commands, right?
>> >
>> > On Wed, May 8, 2019, 12:12 Jacob Barrett  wrote:
>> >
>> > > Maven local is necessary for some of our other build processes like
>> > > benchmarks.
>> > >
>> > > Is there no other way to correct this issue. I have never run into
>> this
>> > > issue.
>> > >
>> > > -jake
>> > >
>> > >
>> > > > On May 8, 2019, at 10:13 AM, Kirk Lund  wrote:
>> > > >
>> > > > I'd like to remove mavenLocal from the Geode gradle files.
>> > > >
>> > > > GEODE-6753: Use of mavenLocal in gradle may cause build to fail with
>> > > > missing tests dependencies
>> > > > https://issues.apache.org/jira/browse/GEODE-6753
>> > > >
>> > > > Thanks,
>> > > > Kirk
>> > >
>> >
>>
>


Re: Remove mavenLocal from Geode gradle files

2019-05-08 Thread Kirk Lund
Other people including Barry have run into it as well.

No, I'm not using --offline, but I will try --refresh-dependencies. The
only thing *weird* that I'm doing is building a composite build for one of
my checkouts.

I'm sure I can find a way to fix it for myself if you believe we need to
use mavenLocal.

Thanks,
Kirk

On Wed, May 8, 2019 at 12:41 PM Patrick Rhomberg 
wrote:

> It is strange to me that your build is *only* looking in Maven-local.
> You're not building with --offline, are you?  Does running with
> --refresh-dependencies resolve this issue when you have it?
>
> Is anything fiddling with your ~/.m2/repository without updating the
> corresponding maven xml entries?  It seems to me like your local Maven xml
> records believes it should have an artifact, but then the expected file
> isn't present on disk.
>
> I don't think the forum thread you found is still relevant.  That thread
> points to GRADLE-2709, which has been fixed since Gradle 1.9.
>
> All that said, if you don't want to use your local ~/.m2, that's your
> business.  If that folder doesn't exist, the presence of mavenLocal()
> shouldn't
> have any effect.  But some of us are making use of it.
>
> On Wed, May 8, 2019 at 12:24 PM Robert Houghton 
> wrote:
>
> > Jake, my understanding is that benchmarks does not need geode to be able
> to
> > pull resources from Maven local. Benchmarks is an external wrapper that
> > executes geode commands, right?
> >
> > On Wed, May 8, 2019, 12:12 Jacob Barrett  wrote:
> >
> > > Maven local is necessary for some of our other build processes like
> > > benchmarks.
> > >
> > > Is there no other way to correct this issue. I have never run into this
> > > issue.
> > >
> > > -jake
> > >
> > >
> > > > On May 8, 2019, at 10:13 AM, Kirk Lund  wrote:
> > > >
> > > > I'd like to remove mavenLocal from the Geode gradle files.
> > > >
> > > > GEODE-6753: Use of mavenLocal in gradle may cause build to fail with
> > > > missing tests dependencies
> > > > https://issues.apache.org/jira/browse/GEODE-6753
> > > >
> > > > Thanks,
> > > > Kirk
> > >
> >
>


Remove mavenLocal from Geode gradle files

2019-05-08 Thread Kirk Lund
I'd like to remove mavenLocal from the Geode gradle files.

GEODE-6753: Use of mavenLocal in gradle may cause build to fail with
missing tests dependencies
https://issues.apache.org/jira/browse/GEODE-6753

Thanks,
Kirk


How to run PulseAuthorizationTest

2019-04-22 Thread Kirk Lund
Is there a way to run a geode-pulse uiTest on the command-line without it
failing due to "webdriver.chrome.driver system property"? I would expect
the gradle build to be setting this property.

$ ./gradlew geode-pulse:uiTest --tests PulseAuthorizationTest

...

> Task :geode-pulse:uiTest

org.apache.geode.tools.pulse.tests.ui.PulseAuthorizationTest >
authenticatedUserWithNoClusterReadPermissionShouldGetAccessDeniedPage FAILED
java.lang.IllegalStateException: The path to the driver executable must
be set by the webdriver.chrome.driver system property; for more
information, see https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver.
The latest version can be downloaded from
http://chromedriver.storage.googleapis.com/index.html

org.apache.geode.tools.pulse.tests.ui.PulseAuthorizationTest >
authenticatedUserWithClusterReadButNoDataReadPermissionShouldSeeClusterDetailsButGetAccessDeniedPageForDataBrowser
FAILED
java.lang.IllegalStateException: The path to the driver executable must
be set by the webdriver.chrome.driver system property; for more
information, see https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver.
The latest version can be downloaded from
http://chromedriver.storage.googleapis.com/index.html

2 tests completed, 2 failed

> Task :geode-pulse:uiTest FAILED


Re: Request for bulk transition permission in Geode Jira

2019-04-22 Thread Kirk Lund
I think I gave you permissions for this. Let me know if it doesn't work for
you. Thanks!

On Mon, Apr 22, 2019 at 7:16 AM Owen Nichols  wrote:

> As per Geode release manager instructions, I will need to bulk-transition
> resolved issues to completed (once we have voted to release a 1.9.0 RC).
> Please grant me Jira 'bulk-transition’ permissions.
>
> Thanks,
> -Owen


Re: Fixing Awaitility await().untilAsserted(new WaitCriterion

2019-04-11 Thread Kirk Lund
I didn't realize anyone had added to the WaitCriterion interface. That
explains why we don't have constant failures caused by this! Unfortunately,
that means I didn't find a potential cause of flakiness in one of my tests.

On Thu, Apr 11, 2019 at 12:20 PM Dan Smith  wrote:

> WaitCriterion extends ThrowingRunnable. So this pattern should still work
> as WaitCriterion did before. But just using an assertion inside of a lambda
> is the better option.
>
> -Dan
>
> On Thu, Apr 11, 2019 at 12:02 PM Kirk Lund  wrote:
>
> > Just a quick heads up... I'm seeing an Awaitility usage pattern that is
> > broken and does nothing. Specifically, it's any uses of dunit
> WaitCriterion
> > with untilAsserted:
> >
> >   GeodeAwaitility.*await().untilAsserted(new WaitCriterion*() {
> >
> > @Override
> > public boolean done() {
> >   return region.isDestroyed();
> > }
> >
> > @Override
> > public String description() {
> >   return "Region was not destroyed : " + region.isDestroyed();
> > }
> >   });
> >
> > The above is broken. It will not await anything. Invoking "new
> > WaitCriterion()" will not throw anything so untilAsserted returns
> > immediately. This may cause some flaky tests to be flakier because we're
> no
> > longer waiting for the WaitCriterion done to be true.
> >
> > To fix this, just change it to use await().until with a lambda containing
> > the contents of done():
> >
> >   GeodeAwaitility.*await().until(() -> *region.isDestroyed());
> >
>


Fixing Awaitility await().untilAsserted(new WaitCriterion

2019-04-11 Thread Kirk Lund
Just a quick heads up... I'm seeing an Awaitility usage pattern that is
broken and does nothing. Specifically, it's any uses of dunit WaitCriterion
with untilAsserted:

  GeodeAwaitility.*await().untilAsserted(new WaitCriterion*() {

@Override
public boolean done() {
  return region.isDestroyed();
}

@Override
public String description() {
  return "Region was not destroyed : " + region.isDestroyed();
}
  });

The above is broken. It will not await anything. Invoking "new
WaitCriterion()" will not throw anything so untilAsserted returns
immediately. This may cause some flaky tests to be flakier because we're no
longer waiting for the WaitCriterion done to be true.

To fix this, just change it to use await().until with a lambda containing
the contents of done():

  GeodeAwaitility.await().until(() -> region.isDestroyed());
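To see why handing untilAsserted a WaitCriterion object returns immediately, here is a minimal pure-Java sketch of the retry semantics. This is an illustration only, not Awaitility's actual implementation (which also enforces poll intervals and a timeout): a runnable that never throws passes on the first attempt, while a lambda that throws until the condition holds actually waits.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AwaitSketch {
    interface ThrowingRunnable { void run() throws Throwable; }

    // Simplified retry loop: keeps invoking the runnable until it stops
    // throwing. (Real Awaitility also sleeps between polls and fails
    // after a timeout.)
    static int untilAsserted(ThrowingRunnable assertion) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                assertion.run();
                return attempts; // no exception -> condition satisfied
            } catch (Throwable t) {
                // assertion failed -> poll again
            }
        }
    }

    public static void main(String[] args) {
        // Broken pattern: the runnable never throws (done() is never
        // consulted by the retry loop), so the "wait" ends immediately.
        int broken = untilAsserted(() -> { /* never throws */ });

        // Fixed pattern: the lambda throws until the condition holds,
        // so the loop keeps polling.
        AtomicInteger polls = new AtomicInteger();
        int fixed = untilAsserted(() -> {
            if (polls.incrementAndGet() < 3) {
                throw new AssertionError("region not destroyed yet");
            }
        });

        System.out.println(broken); // 1
        System.out.println(fixed);  // 3
    }
}
```

The same fix applies whether you use until with a boolean lambda or untilAsserted with an asserting lambda; the key is that the condition must be evaluated on every poll.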


Re: Jira Permission Request

2019-04-10 Thread Kirk Lund
Done! You should have permissions now. Thanks!

On Wed, Apr 10, 2019 at 3:18 PM Aaron Lindsey  wrote:

> Sorry, I forgot to add that my username is aaronlindsey
>
> On Wed, Apr 10, 2019 at 3:17 PM Aaron Lindsey  wrote:
>
> > Hi,
> >
> > I would like to request permission to create and edit tickets and assign
> > them to myself on the following JIRA:
> > https://issues.apache.org/jira/
> >
> > Thanks,
> > Aaron Lindsey
> >
>
>
> --
> - Aaron
>


Re: Permission request

2019-04-10 Thread Kirk Lund
Done! You should have edit permissions now. Thanks!

On Wed, Apr 10, 2019 at 3:15 PM Aaron Lindsey  wrote:

> Hi,
>
> I would like to be able to edit the wiki here:
> https://cwiki.apache.org/confluence/display/GEODE
>
> My username is aaronlindsey.
>
> Thanks,
> Aaron Lindsey
>


Re: Jira & wiki permissions

2019-04-10 Thread Kirk Lund
Done! You should have permissions to edit both. Thanks!

On Wed, Apr 10, 2019 at 3:13 PM Alberto Bustamante Reyes
 wrote:

> Hi Geode community,
>
>
> I would like to have permissions to edit the wiki and assign me tickets.
> Could someone help me with this?
>
> My username is alberto.bustamante.reyes
>
>
> Thanks in advance,
>
>
> Alberto
>
>
>


Re: Request wiki edit permission

2019-04-10 Thread Kirk Lund
Done! If there's anything you find that doesn't work for you, let me know.
Thanks!

On Wed, Apr 10, 2019 at 11:35 AM Owen Nichols  wrote:

> Hi, my Apache LDAP username is onichols.
>
> I would like to request permission to edit
> https://cwiki.apache.org/confluence/display/GEODE <
> https://cwiki.apache.org/confluence/display/GEODE> wiki pages.


Re: [DISCUSS] Move or remove org.apache.geode.admin

2019-04-05 Thread Kirk Lund
> >>> of documentation, annotation and availability of the symbols. If the
> >>> symbols are available but marked deprecated and not documented anymore
> >>> then it is not API.
> >>>
> >>> Yes, under your interpretation it would require that availability be
> >>> removed at the major release too, so that a consumer that ignores
> >>> deprecation warnings when compiling with the new version gets a
> >>> compilation or runtime error later upon updating a minor.
> >>>
> >>> I think that error is well deserved. We do a disservice to our
> customers
> >>> by allowing them to continue to limp along on deprecated, unmaintained
> and
> >>> untested APIs because we missed an arbitrary window to remove the
> symbols.
> >>>
> >>> If however the community agrees with your interpretation then we must
> make
> >>> sure a ticket is created for the next major with a list of deprecated
> items
> >>> and updated when new items are deprecated so that the task is
> performed at
> >>> the major release. A chore needs to be undertaken to remove all
> deprecated
> >>> APIs at the start of each major development cycle. The short of it is
> we
> >>> can’t miss this arbitrarily limiting window.
> >>>
> >>> -Jake
> >>>
> >>>
> >>>> On Apr 3, 2019, at 3:58 PM, Alexander Murmann 
> >>> wrote:
> >>>>
> >>>> Would we announce that we aren't following semver anymore because
> bumping
> >>>> the major version is somehow too expensive?
> >>>>
> >>>>> On Wed, Apr 3, 2019 at 3:29 PM Kirk Lund  wrote:
> >>>>>
> >>>>> 1) +1 YES. If we continue to *not* have at least one major release
> per
> >>>>> year, then I 100% support the removal of deprecated APIs and features
> >>> in a
> >>>>> minor release such as 1.10. If we decide to have at least one major
> >>> release
> >>>>> per year, then I'd be willing to revisit this and consider not
> allowing
> >>>>> removal in a minor release. I do *not* support perpetually deferring
> >>> major
> >>>>> releases to avoid facing the removal of deprecated APIs -- that's
> just
> >>>>> ridiculous as it keeps our maintenance costs much higher than they
> need
> >>> to
> >>>>> be.
> >>>>>
> >>>>> 2) +1 to remove anything in 1.10, or any other minor release, that
> was
> >>>>> already deprecated in Geode 1.0
> >>>>>
> >>>>>> On Tue, Apr 2, 2019 at 5:05 PM Dan Smith  wrote:
> >>>>>>
> >>>>>> Hi devs,
> >>>>>>
> >>>>>> The org.apache.geode.admin package has been deprecated for about 7
> >>> years
> >>>>>> [1].
> >>>>>>
> >>>>>> I'd like to remove it, or at least move it to a separate module. I
> >>>>> actually
> >>>>>> started some work to move it to a separate module [2]. However in
> the
> >>>>>> course of this process, I've found that this module has extremely
> >>> little
> >>>>>> test coverage, so I'm not confident in the move. For example, it
> has a
> >>>>>> whole JMX manager feature (different than our current JMX manager)
> >>> which
> >>>>>> does not appear to have any tests. This JMX manager feature won't
> work
> >>> as
> >>>>>> shipped unless you find and add some mx4j jars to your classpath.
> >>>>>>
> >>>>>> I think the best course of action is probably to remove it entirely.
> >>>>>> However, this does bring up a couple of questions:
> >>>>>>
> >>>>>> 1) Is it ok in general to remove deprecated API in a non X.0 release
> >>> (eg
> >>>>>> geode 1.10 instead of geode 2.0)?
> >>>>>>
> >>>>>> 2) How about in this case, when this feature has been deprecated
> for so
> >>>>>> long, and may or may not work?
> >>>>>>
> >>>>>> -Dan
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> https://geode.apache.org/releases/latest/javadoc/org/apache/geode/admin/package-summary.html
> >>>>>> [2] Draft PR for moving the org.apache.geode.admin to a separate
> >>> module -
> >>>>>> https://github.com/apache/geode/pull/3392
> >>>>>>
> >>>>>
> >>>
> >
>


Re: [DISCUSS] Move or remove org.apache.geode.admin

2019-04-04 Thread Kirk Lund
released
> >> despite the availability of deprecated symbols from previous major. Just
> >> because I can access symbols does not make it API. API is the
> combination
> >> of documentation, annotation and availability of the symbols. If the
> >> symbols are available but marked deprecated and not documented anymore
> >> then it is not API.
> >>
> >> Yes, under your interpretation it would require that availability be
> >> removed at the major release too, so that a consumer that ignores
> >> deprecation warnings when compiling with the new version gets a
> >> compilation or runtime error later upon updating a minor.
> >>
> >> I think that error is well deserved. We do a disservice to our customers
> >> by allowing them to continue to limp along on deprecated, unmaintained
> and
> >> untested APIs because we missed an arbitrary window to remove the
> symbols.
> >>
> >> If however the community agrees with your interpretation then we must
> make
> >> sure a ticket is created for the next major with a list of deprecated
> items
> >> and updated when new items are deprecated so that the task is performed
> at
> >> the major release. A chore needs to be undertaken to remove all
> deprecated
> >> APIs at the start of each major development cycle. The short of it is we
> >> can’t miss this arbitrarily limiting window.
> >>
> >> -Jake
> >>
> >>
> >>> On Apr 3, 2019, at 3:58 PM, Alexander Murmann 
> >> wrote:
> >>>
> >>> Would we announce that we aren't following semver anymore because
> bumping
> >>> the major version is somehow too expensive?
> >>>
> >>>> On Wed, Apr 3, 2019 at 3:29 PM Kirk Lund  wrote:
> >>>>
> >>>> 1) +1 YES. If we continue to *not* have at least one major release per
> >>>> year, then I 100% support the removal of deprecated APIs and features
> >> in a
> >>>> minor release such as 1.10. If we decide to have at least one major
> >> release
> >>>> per year, then I'd be willing to revisit this and consider not
> allowing
> >>>> removal in a minor release. I do *not* support perpetually deferring
> >> major
> >>>> releases to avoid facing the removal of deprecated APIs -- that's just
> >>>> ridiculous as it keeps our maintenance costs much higher than they
> need
> >> to
> >>>> be.
> >>>>
> >>>> 2) +1 to remove anything in 1.10, or any other minor release, that was
> >>>> already deprecated in Geode 1.0
> >>>>
> >>>>> On Tue, Apr 2, 2019 at 5:05 PM Dan Smith  wrote:
> >>>>>
> >>>>> Hi devs,
> >>>>>
> >>>>> The org.apache.geode.admin package has been deprecated for about 7
> >> years
> >>>>> [1].
> >>>>>
> >>>>> I'd like to remove it, or at least move it to a separate module. I
> >>>> actually
> >>>>> started some work to move it to a separate module [2]. However in the
> >>>>> course of this process, I've found that this module has extremely
> >> little
> >>>>> test coverage, so I'm not confident in the move. For example, it has
> a
> >>>>> whole JMX manager feature (different than our current JMX manager)
> >> which
> >>>>> does not appear to have any tests. This JMX manager feature won't
> work
> >> as
> >>>>> shipped unless you find and add some mx4j jars to your classpath.
> >>>>>
> >>>>> I think the best course of action is probably to remove it entirely.
> >>>>> However, this does bring up a couple of questions:
> >>>>>
> >>>>> 1) Is it ok in general to remove deprecated API in a non X.0 release
> >> (eg
> >>>>> geode 1.10 instead of geode 2.0)?
> >>>>>
> >>>>> 2) How about in this case, when this feature has been deprecated for
> so
> >>>>> long, and may or may not work?
> >>>>>
> >>>>> -Dan
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>>
> >>>>
> >>
> https://geode.apache.org/releases/latest/javadoc/org/apache/geode/admin/package-summary.html
> >>>>> [2] Draft PR for moving the org.apache.geode.admin to a separate
> >> module -
> >>>>> https://github.com/apache/geode/pull/3392
> >>>>>
> >>>>
> >>
>
>


Re: [DISCUSS] Move or remove org.apache.geode.admin

2019-04-03 Thread Kirk Lund
1) +1 YES. If we continue to *not* have at least one major release per
year, then I 100% support the removal of deprecated APIs and features in a
minor release such as 1.10. If we decide to have at least one major release
per year, then I'd be willing to revisit this and consider not allowing
removal in a minor release. I do *not* support perpetually deferring major
releases to avoid facing the removal of deprecated APIs -- that's just
ridiculous as it keeps our maintenance costs much higher than they need to
be.

2) +1 to remove anything in 1.10, or any other minor release, that was
already deprecated in Geode 1.0

On Tue, Apr 2, 2019 at 5:05 PM Dan Smith  wrote:

> Hi devs,
>
> The org.apache.geode.admin package has been deprecated for about 7 years
> [1].
>
> I'd like to remove it, or at least move it to a separate module. I actually
> started some work to move it to a separate module [2]. However in the
> course of this process, I've found that this module has extremely little
> test coverage, so I'm not confident in the move. For example, it has a
> whole JMX manager feature (different than our current JMX manager) which
> does not appear to have any tests. This JMX manager feature won't work as
> shipped unless you find and add some mx4j jars to your classpath.
>
> I think the best course of action is probably to remove it entirely.
> However, this does bring up a couple of questions:
>
> 1) Is it ok in general to remove deprecated API in a non X.0 release (eg
> geode 1.10 instead of geode 2.0)?
>
> 2) How about in this case, when this feature has been deprecated for so
> long, and may or may not work?
>
> -Dan
>
> [1]
>
> https://geode.apache.org/releases/latest/javadoc/org/apache/geode/admin/package-summary.html
> [2] Draft PR for moving the org.apache.geode.admin to a separate module -
> https://github.com/apache/geode/pull/3392
>
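Jake's point upthread, that an API is the combination of documentation, annotation, and availability of the symbols, maps onto Java's standard deprecation mechanics. A minimal sketch with a hypothetical class (the names here are invented for illustration; this is not Geode's actual admin API):

```java
/** Hypothetical admin-style facade, for illustration only. */
public class LegacyAdmin {

    /**
     * Connects to the legacy admin system.
     *
     * @deprecated Deprecated since 1.0; callers should migrate to the
     *             newer management API. The symbol stays available
     *             until it is removed in some later release.
     */
    @Deprecated
    public void connect() {
        // Still callable: the compiler only emits a deprecation
        // warning at each call site. Actually removing the method is
        // the source- and binary-incompatible step under debate.
    }
}
```

Removing connect() outright would break any consumer still compiling or linking against it, which is why the thread debates whether that removal belongs only in a major (2.0) release under semver.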


Re: Weird intermittent build error caused by spotlessJava

2019-03-26 Thread Kirk Lund
Are you sure it's not necessary to do a clean build when switching between
different branches? Especially branches that are far enough apart in
revisions that their gradle build files are different? Sometimes two
branches, such as for two different releases, are significantly different
in the gradle files.

On Tue, Mar 26, 2019 at 2:49 PM Patrick Rhomberg 
wrote:

> As Dan mentioned, the few times I've seen this has been a result of running
> clean in the same build set as spotless.  While that "shouldn't" be an
> issue, it seems to be the underlying cause.
>
> I encourage you to not clean every time you build.  We've done a lot of
> work lately to improve the correctness of our build, so incrementalization
> and task caching should (no scare quotes this time) be the go-to, rather
> than always cleaning before you build.  If you have a specific reason you
> think you need to clean before building, that's something we should fix.
> As the common developer use-case, we have ./gradlew dev to do both compile
> / assembly and spotlessApply.
>
> As a side note, both of your configurations could also be achieved via
> command-line arguments --no-daemon and --no-parallel respectively, if you
> don't want that behavior to be the default as specified in your local
> ~/.gradle/gradle.properties.
>
> On Tue, Mar 26, 2019 at 11:46 AM Kirk Lund  wrote:
>
> > I actually have two lines in my gradle.properties. The combo seems to
> make
> > spotless behave when it gets into the broken state I mentioned before:
> >
> > org.gradle.daemon=false
> > org.gradle.parallel=false
> >
> > This does make it slower but slower is better than broken.
> >
> > On Tue, Mar 26, 2019 at 11:38 AM Kirk Lund  wrote:
> >
> > > I just had this occur again. And I found another solution, albeit not a
> > > great one, but if you disable org.gradle.daemon that also clears up
> this
> > > problem. You can add this to ~/.gradle/gradle.properties:
> > >
> > > org.gradle.daemon=false
> > >
> > > Just to clarify, I'm on a Mac using gradlew in geode (I don't have my
> own
> > > version of gradle installed) and oracle jdk1.8.0_202.
> > >
> > > On Tue, Mar 26, 2019 at 10:48 AM Kirk Lund  wrote:
> > >
> > >> Few more details that might help others...
> > >>
> > >> Each Geode module fails in the same way but obviously on a different
> > >> .java file, and I have not altered these .java files -- they do not
> > appear
> > >> to have any format issues involving imports.
> > >>
> > >> Once gradle is in this state, switching branches from a feature branch
> > to
> > >> develop (which has no local changes) does not clear it up either.
> > >>
> > >> > Task :geode-junit:spotlessJava FAILED
> > >> Step 'removeUnusedImports' found problem in
> > >>
> >
> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
> > >> null
> > >> java.lang.reflect.InvocationTargetException
> > >>
> > >> On Tue, Mar 26, 2019 at 10:43 AM Kirk Lund  wrote:
> > >>
> > >>> Intermittent build error caused by spotlessJava. My checkout of Geode
> > >>> intermittently gets into a state that then fails to build. I've been
> > seeing
> > >>> this come and go for the last month or two. I'm not sure what puts it
> > into
> > >>> this state but executing:
> > >>>
> > >>> $ ./gradlew clean build -x test
> > >>>
> > >>> ...will repeatedly fail in junit:spotlessJava with a
> > >>> java.lang.reflect.InvocationTargetException and a root cause of
> > >>> java.lang.NoClassDefFoundError:
> > >>> org/openjdk/tools/javac/comp/InferenceContext.
> > >>>
> > >>> Stopping the gradle daemon(s) does not clear the problem up. Cleaning
> > >>> does not clear it up. Changing JDK version does not clear it up. But
> I
> > just
> > >>> discovered this morning that if I attempt to execute spotless
> directly
> > that
> > >>> finally clears it up:
> > >>>
> > >>> $ ./gradlew spotlessApply
> > >>> or
> > >>> $ ./gradlew spotlessCheck
> > >>>
> > >>> After either spotlessApply or spotlessCheck, the build will stop
> > failing
> > >>> when I execute:
> > >>>
> > >>> $ ./gradlew clean build -x test
> > >>>
> >

Re: Weird intermittent build error caused by spotlessJava

2019-03-26 Thread Kirk Lund
I actually have two lines in my gradle.properties. The combo seems to make
spotless behave when it gets into the broken state I mentioned before:

org.gradle.daemon=false
org.gradle.parallel=false

This does make it slower but slower is better than broken.

On Tue, Mar 26, 2019 at 11:38 AM Kirk Lund  wrote:

> I just had this occur again. And I found another solution, albeit not a
> great one, but if you disable org.gradle.daemon that also clears up this
> problem. You can add this to ~/.gradle/gradle.properties:
>
> org.gradle.daemon=false
>
> Just to clarify, I'm on a Mac using gradlew in geode (I don't have my own
> version of gradle installed) and oracle jdk1.8.0_202.
>
> On Tue, Mar 26, 2019 at 10:48 AM Kirk Lund  wrote:
>
>> Few more details that might help others...
>>
>> Each Geode module fails in the same way but obviously on a different
>> .java file, and I have not altered these .java files -- they do not appear
>> to have any format issues involving imports.
>>
>> Once gradle is in this state, switching branches from a feature branch to
>> develop (which has no local changes) does not clear it up either.
>>
>> > Task :geode-junit:spotlessJava FAILED
>> Step 'removeUnusedImports' found problem in
>> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
>> null
>> java.lang.reflect.InvocationTargetException
>>
>> On Tue, Mar 26, 2019 at 10:43 AM Kirk Lund  wrote:
>>
>>> Intermittent build error caused by spotlessJava. My checkout of Geode
>>> intermittently gets into a state that then fails to build. I've been seeing
>>> this come and go for the last month or two. I'm not sure what puts it into
>>> this state but executing:
>>>
>>> $ ./gradlew clean build -x test
>>>
>>> ...will repeatedly fail in junit:spotlessJava with a
>>> java.lang.reflect.InvocationTargetException and a root cause of
>>> java.lang.NoClassDefFoundError:
>>> org/openjdk/tools/javac/comp/InferenceContext.
>>>
>>> Stopping the gradle daemon(s) does not clear the problem up. Cleaning
>>> does not clear it up. Changing JDK version does not clear it up. But I just
>>> discovered this morning that if I attempt to execute spotless directly that
>>> finally clears it up:
>>>
>>> $ ./gradlew spotlessApply
>>> or
>>> $ ./gradlew spotlessCheck
>>>
>>> After either spotlessApply or spotlessCheck, the build will stop failing
>>> when I execute:
>>>
>>> $ ./gradlew clean build -x test
>>>
>>> The stack thrown by spotlessJava and printed by gradle when it's stuck
>>> in this state is below...
>>>
>>> > Task :geode-junit:spotlessJava FAILED
>>> Step 'removeUnusedImports' found problem in
>>> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
>>> null
>>> java.lang.reflect.InvocationTargetException
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at
>>> com.diffplug.spotless.java.GoogleJavaFormatStep$State.lambda$createRemoveUnusedImportsOnly$1(GoogleJavaFormatStep.java:153)
>>> at
>>> com.diffplug.spotless.FormatterStepImpl$Standard.format(FormatterStepImpl.java:78)
>>> at
>>> com.diffplug.spotless.FormatterStep$Strict.format(FormatterStep.java:76)
>>> at com.diffplug.spotless.Formatter.compute(Formatter.java:230)
>>> at com.diffplug.spotless.Formatter.isClean(Formatter.java:167)
>>> at
>>> com.diffplug.gradle.spotless.SpotlessTask.check(SpotlessTask.java:263)
>>> at
>>> com.diffplug.gradle.spotless.SpotlessTask.performAction(SpotlessTask.java:205)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at
>>> org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:103)
>>> at
>>> org.gradle.api.internal.project.taskfactory.In

Re: Weird intermittent build error caused by spotlessJava

2019-03-26 Thread Kirk Lund
I just had this occur again. And I found another solution, albeit not a
great one, but if you disable org.gradle.daemon that also clears up this
problem. You can add this to ~/.gradle/gradle.properties:

org.gradle.daemon=false

Just to clarify, I'm on a Mac using gradlew in geode (I don't have my own
version of gradle installed) and oracle jdk1.8.0_202.

On Tue, Mar 26, 2019 at 10:48 AM Kirk Lund  wrote:

> Few more details that might help others...
>
> Each Geode module fails in the same way but obviously on a different .java
> file, and I have not altered these .java files -- they do not appear to
> have any format issues involving imports.
>
> Once gradle is in this state, switching branches from a feature branch to
> develop (which has no local changes) does not clear it up either.
>
> > Task :geode-junit:spotlessJava FAILED
> Step 'removeUnusedImports' found problem in
> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
> null
> java.lang.reflect.InvocationTargetException
>
> On Tue, Mar 26, 2019 at 10:43 AM Kirk Lund  wrote:
>
>> Intermittent build error caused by spotlessJava. My checkout of Geode
>> intermittently gets into a state that then fails to build. I've been seeing
>> this come and go for the last month or two. I'm not sure what puts it into
>> this state but executing:
>>
>> $ ./gradlew clean build -x test
>>
>> ...will repeatedly fail in junit:spotlessJava with a
>> java.lang.reflect.InvocationTargetException and a root cause of
>> java.lang.NoClassDefFoundError:
>> org/openjdk/tools/javac/comp/InferenceContext.
>>
>> Stopping the gradle daemon(s) does not clear the problem up. Cleaning
>> does not clear it up. Changing JDK version does not clear it up. But I just
>> discovered this morning that if I attempt to execute spotless directly that
>> finally clears it up:
>>
>> $ ./gradlew spotlessApply
>> or
>> $ ./gradlew spotlessCheck
>>
>> After either spotlessApply or spotlessCheck, the build will stop failing
>> when I execute:
>>
>> $ ./gradlew clean build -x test
>>
>> The stack thrown by spotlessJava and printed by gradle when it's stuck in
>> this state is below...
>>
>> > Task :geode-junit:spotlessJava FAILED
>> Step 'removeUnusedImports' found problem in
>> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
>> null
>> java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> com.diffplug.spotless.java.GoogleJavaFormatStep$State.lambda$createRemoveUnusedImportsOnly$1(GoogleJavaFormatStep.java:153)
>> at
>> com.diffplug.spotless.FormatterStepImpl$Standard.format(FormatterStepImpl.java:78)
>> at
>> com.diffplug.spotless.FormatterStep$Strict.format(FormatterStep.java:76)
>> at com.diffplug.spotless.Formatter.compute(Formatter.java:230)
>> at com.diffplug.spotless.Formatter.isClean(Formatter.java:167)
>> at
>> com.diffplug.gradle.spotless.SpotlessTask.check(SpotlessTask.java:263)
>> at
>> com.diffplug.gradle.spotless.SpotlessTask.performAction(SpotlessTask.java:205)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:103)
>> at
>> org.gradle.api.internal.project.taskfactory.IncrementalTaskAction.doExecute(IncrementalTaskAction.java:73)
>> at
>> org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:41)
>> at
>> org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:28)
>> at
>> org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$4.run(ExecuteActionsTaskExecuter.java:338)
>> at
>> org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:402)
>> at
>> org.gradle.internal.operati

Re: Weird intermittent build error caused by spotlessJava

2019-03-26 Thread Kirk Lund
Few more details that might help others...

Each Geode module fails in the same way but obviously on a different .java
file, and I have not altered these .java files -- they do not appear to
have any format issues involving imports.

Once gradle is in this state, switching branches from a feature branch to
develop (which has no local changes) does not clear it up either.

> Task :geode-junit:spotlessJava FAILED
Step 'removeUnusedImports' found problem in
'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
null
java.lang.reflect.InvocationTargetException

On Tue, Mar 26, 2019 at 10:43 AM Kirk Lund  wrote:

> Intermittent build error caused by spotlessJava. My checkout of Geode
> intermittently gets into a state that then fails to build. I've been seeing
> this come and go for the last month or two. I'm not sure what puts it into
> this state but executing:
>
> $ ./gradlew clean build -x test
>
> ...will repeatedly fail in junit:spotlessJava with a
> java.lang.reflect.InvocationTargetException and a root cause of
> java.lang.NoClassDefFoundError:
> org/openjdk/tools/javac/comp/InferenceContext.
>
> Stopping the gradle daemon(s) does not clear the problem up. Cleaning does
> not clear it up. Changing JDK version does not clear it up. But I just
> discovered this morning that if I attempt to execute spotless directly that
> finally clears it up:
>
> $ ./gradlew spotlessApply
> or
> $ ./gradlew spotlessCheck
>
> After either spotlessApply or spotlessCheck, the build will stop failing
> when I execute:
>
> $ ./gradlew clean build -x test
>
> The stack thrown by spotlessJava and printed by gradle when it's stuck in
> this state is below...
>
> > Task :geode-junit:spotlessJava FAILED
> Step 'removeUnusedImports' found problem in
> 'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
> null
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> com.diffplug.spotless.java.GoogleJavaFormatStep$State.lambda$createRemoveUnusedImportsOnly$1(GoogleJavaFormatStep.java:153)
> at
> com.diffplug.spotless.FormatterStepImpl$Standard.format(FormatterStepImpl.java:78)
> at
> com.diffplug.spotless.FormatterStep$Strict.format(FormatterStep.java:76)
> at com.diffplug.spotless.Formatter.compute(Formatter.java:230)
> at com.diffplug.spotless.Formatter.isClean(Formatter.java:167)
> at
> com.diffplug.gradle.spotless.SpotlessTask.check(SpotlessTask.java:263)
> at
> com.diffplug.gradle.spotless.SpotlessTask.performAction(SpotlessTask.java:205)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:103)
> at
> org.gradle.api.internal.project.taskfactory.IncrementalTaskAction.doExecute(IncrementalTaskAction.java:73)
> at
> org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:41)
> at
> org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:28)
> at
> org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$4.run(ExecuteActionsTaskExecuter.java:338)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:402)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:394)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor$1.execute(DefaultBuildOperationExecutor.java:165)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:250)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:158)
> at
> org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:92)
> at
> org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
> 

Weird intermittent build error caused by spotlessJava

2019-03-26 Thread Kirk Lund
Intermittent build error caused by spotlessJava. My checkout of Geode
intermittently gets into a state that then fails to build. I've been seeing
this come and go for the last month or two. I'm not sure what puts it into
this state but executing:

$ ./gradlew clean build -x test

...will repeatedly fail in junit:spotlessJava with a
java.lang.reflect.InvocationTargetException and a root cause of
java.lang.NoClassDefFoundError:
org/openjdk/tools/javac/comp/InferenceContext.

Stopping the gradle daemon(s) does not clear the problem up. Cleaning does
not clear it up. Changing JDK version does not clear it up. But I just
discovered this morning that if I attempt to execute spotless directly that
finally clears it up:

$ ./gradlew spotlessApply
or
$ ./gradlew spotlessCheck

After either spotlessApply or spotlessCheck, the build will stop failing
when I execute:

$ ./gradlew clean build -x test

The stack thrown by spotlessJava and printed by gradle when it's stuck in
this state is below...

> Task :geode-junit:spotlessJava FAILED
Step 'removeUnusedImports' found problem in
'geode-junit/src/main/java/org/apache/geode/test/junit/rules/gfsh/DebuggableCommand.java':
null
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.diffplug.spotless.java.GoogleJavaFormatStep$State.lambda$createRemoveUnusedImportsOnly$1(GoogleJavaFormatStep.java:153)
    at com.diffplug.spotless.FormatterStepImpl$Standard.format(FormatterStepImpl.java:78)
    at com.diffplug.spotless.FormatterStep$Strict.format(FormatterStep.java:76)
    at com.diffplug.spotless.Formatter.compute(Formatter.java:230)
    at com.diffplug.spotless.Formatter.isClean(Formatter.java:167)
    at com.diffplug.gradle.spotless.SpotlessTask.check(SpotlessTask.java:263)
    at com.diffplug.gradle.spotless.SpotlessTask.performAction(SpotlessTask.java:205)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:103)
    at org.gradle.api.internal.project.taskfactory.IncrementalTaskAction.doExecute(IncrementalTaskAction.java:73)
    at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:41)
    at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:28)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$4.run(ExecuteActionsTaskExecuter.java:338)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:402)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:394)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$1.execute(DefaultBuildOperationExecutor.java:165)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:250)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:158)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:92)
    at org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeAction(ExecuteActionsTaskExecuter.java:327)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:312)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.access$200(ExecuteActionsTaskExecuter.java:75)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$TaskExecution.execute(ExecuteActionsTaskExecuter.java:158)
    at org.gradle.internal.execution.impl.steps.ExecuteStep.execute(ExecuteStep.java:46)
    at org.gradle.internal.execution.impl.steps.CancelExecutionStep.execute(CancelExecutionStep.java:34)
    at org.gradle.internal.execution.impl.steps.TimeoutStep.executeWithoutTimeout(TimeoutStep.java:69)
    at org.gradle.internal.execution.impl.steps.TimeoutStep.execute(TimeoutStep.java:49)
    at org.gradle.internal.execution.impl.steps.CatchExceptionStep.execute(CatchExceptionStep.java:34)
at

Review of GEODE-6295: Add Micrometer-based metrics system #3277

2019-03-11 Thread Kirk Lund
Please review GEODE-6295: Add Micrometer-based metrics system #3277 if you
have time:
https://github.com/apache/geode/pull/3277

Thanks,
Kirk


Re: Review request for GEODE-6295: Add Micrometer-based metrics system #3277

2019-03-11 Thread Kirk Lund
Please review our PR. Thanks!

On Fri, Mar 8, 2019 at 4:09 PM Kirk Lund  wrote:

> Please review GEODE-6295: Add Micrometer-based metrics system #3277 and
> provide any feedback you have in the PR or feel free to reply to this email
> thread.
>
> https://github.com/apache/geode/pull/3277
>
> Thanks,
> Kirk
>
>


Review request for GEODE-6295: Add Micrometer-based metrics system #3277

2019-03-08 Thread Kirk Lund
Please review GEODE-6295: Add Micrometer-based metrics system #3277 and
provide any feedback you have in the PR or feel free to reply to this email
thread.

https://github.com/apache/geode/pull/3277

Thanks,
Kirk


Request PR review for #3263

2019-03-06 Thread Kirk Lund
Anyone have time to review a PR for me? It fixes up another DUnit rule so
it works with adding and bouncing VMs.

https://github.com/apache/geode/pull/3263

Thanks,
Kirk


Re: Bug Numbers and Trac Numbers in comments

2019-02-21 Thread Kirk Lund
I didn't write this test, but I did rename it from Bug42226Test to
PersistentPartitionHangsDuringRestartRegressionTest and do some additional
cleaning to bring it up to minimal standards.

Please feel free to file a PR that removes all Jira and Trac ticket #s from
the codebase (or parts of it).

On Wed, Feb 20, 2019 at 11:28 AM Alexander Murmann 
wrote:

> I think it's important that we enable everyone in the community to be
> equally successful and on top of that do NOT rely on non-Apache resources.
> If we find value in Trac numbers, we should either find a way to make them
> accessible to everyone or update the comment so that there is no value in
> knowing the number.
>
> In general, I avoid references in the code to anything outside the code
> base as much as possible. Anything that's checked into the repo gets
> versioned together and I don't have to worry about aligning versions of
> what's in the code with potential versions of the outside resource. So my
> vote is for augmenting the comment. Maybe we can do even better and make
> the code or test self-explanatory enough to need even fewer comments?
>
> On Wed, Feb 20, 2019 at 10:23 AM Kirk Lund  wrote:
>
> > Well, the problem is that different people disagree on what's
> "meaningful"
> > in this context. For example:
> >
> > See PersistentPartitionHangsDuringRestartRegressionTest.java
> >
> >   /**
> >    * RegressionTest for bug 42226.
> >    * 1. Member A has the bucket
> >    * 2. Member B starts creating the bucket. It tells member A that it hosts the bucket
> >    * 3. Member A crashes
> >    * 4. Member B destroys the bucket and throws a partition offline exception, because it
> >    *    wasn't able to complete initialization.
> >    * 5. Member A recovers, and gets stuck waiting for member B.
> >    *
> >    * TRAC 42226: recycled VM hangs during re-start while waiting for Partition to come
> >    * online (after Controller VM sees unexpected PartitionOffLineException while doing ops)
> >    */
> >   @Test
> >   public void doesNotWaitForPreviousInstanceOfOnlineServer() throws Exception {
> >
> > I personally would NOT remove the references to "42226" from the above,
> and
> > here's why:
> > 1) The javadoc clearly states the what and how of this bug
> > 2) The 2nd paragraph includes the full summary from the old TRAC ticket
> > 3) Quite a few people working on Geode do have access to TRAC which means
> > anyone in the community can request us to go dig up further history about
> > this bug to share
> >
> > If we systematically delete all TRAC #s from javadocs and comments then
> > community developers will lose that opportunity. While it's true that
> most
> > of the time most developers will probably never need or want that
> history,
> > I would not say it's never going to be valuable.
> >
> > When I'm working on code that references old bugs (especially regression
> > tests like the one above), I frequently pull up TRAC and read about the
> > bug. I also tend to pull up the old pre-Apache git repo and read through
> > old commit messages. Maybe I'm the only one who does this.
> >
> > Despite my point of view, I don't feel very strongly about it. If you
> guys
> > really think that the TRAC #s distract or confuse you too much and you
> > can't foresee the need to ask someone to pull up history on a ticket,
> then
> > you should submit a PR to "remove all TRAC #s" from the code base. I
> won't
> > disapprove it -- I can always look at older revisions of files like
> > PersistentPartitionHangsDuringRestartRegressionTest to find the old TRAC
> #.
> > I previously submitted PRs to rename all test classes and test methods
> from
> > names like test42226 to meaningful names so I support the overall intent.
> >
> > On Tue, Feb 19, 2019 at 5:21 PM Jacob Barrett 
> wrote:
> >
> > > Comments that don’t provide meaningful context beyond what is already
> > > expressed in the code should be removed. A number to a system that the
> > > general public can’t access is not meaningful. Delete or replace with
> > > meaningful comment.
> > >
> > > -jake
> > >
> > >
> > > > On Feb 19, 2019, at 1:41 PM, Michael Oleske 
> > wrote:
> > > >
> > > > Hey Geode Dev Friends!
> > > >
> > > > I was reviewing a PR (this one
> > https://github.com/apache/geode/pull/3197
> > > )
> > > > and made a note that maybe we

Re: Bug Numbers and Trac Numbers in comments

2019-02-20 Thread Kirk Lund
Well, the problem is that different people disagree on what's "meaningful"
in this context. For example:

See PersistentPartitionHangsDuringRestartRegressionTest.java

  /**
   * RegressionTest for bug 42226.
   * 1. Member A has the bucket
   * 2. Member B starts creating the bucket. It tells member A that it hosts the bucket
   * 3. Member A crashes
   * 4. Member B destroys the bucket and throws a partition offline exception, because it
   *    wasn't able to complete initialization.
   * 5. Member A recovers, and gets stuck waiting for member B.
   *
   * TRAC 42226: recycled VM hangs during re-start while waiting for Partition to come
   * online (after Controller VM sees unexpected PartitionOffLineException while doing ops)
   */
  @Test
  public void doesNotWaitForPreviousInstanceOfOnlineServer() throws Exception {

I personally would NOT remove the references to "42226" from the above, and
here's why:
1) The javadoc clearly states the what and how of this bug
2) The 2nd paragraph includes the full summary from the old TRAC ticket
3) Quite a few people working on Geode do have access to TRAC which means
anyone in the community can request us to go dig up further history about
this bug to share

If we systematically delete all TRAC #s from javadocs and comments then
community developers will lose that opportunity. While it's true that most
of the time most developers will probably never need or want that history,
I would not say it's never going to be valuable.

When I'm working on code that references old bugs (especially regression
tests like the one above), I frequently pull up TRAC and read about the
bug. I also tend to pull up the old pre-Apache git repo and read through
old commit messages. Maybe I'm the only one who does this.

Despite my point of view, I don't feel very strongly about it. If you guys
really think that the TRAC #s distract or confuse you too much and you
can't foresee the need to ask someone to pull up history on a ticket, then
you should submit a PR to "remove all TRAC #s" from the code base. I won't
disapprove it -- I can always look at older revisions of files like
PersistentPartitionHangsDuringRestartRegressionTest to find the old TRAC #.
I previously submitted PRs to rename all test classes and test methods from
names like test42226 to meaningful names so I support the overall intent.

On Tue, Feb 19, 2019 at 5:21 PM Jacob Barrett  wrote:

> Comments that don’t provide meaningful context beyond what is already
> expressed in the code should be removed. A number to a system that the
> general public can’t access is not meaningful. Delete or replace with
> meaningful comment.
>
> -jake
>
>
> > On Feb 19, 2019, at 1:41 PM, Michael Oleske  wrote:
> >
> > Hey Geode Dev Friends!
> >
> > I was reviewing a PR (this one https://github.com/apache/geode/pull/3197
> )
> > and made a note that maybe we should remove comments that make references
> > to bug and trac numbers that people cannot reach (like me for one).  Kirk
> > mentioned that some people (like him) have access to those bugs and have
> > proven helpful for some history.  So there is this balance between noise
> > (people who cannot access those old issues) and getting context (people
> who
> > can access those issues).
> >
> > So I guess my point is to start a discussion on what a path forward might
> > be (if any)?  My opinion is that they are noise and we should remove
> them.
> > If someone has access to the original issue, then making sure there is a
> > test case covering it should be done.  Then it makes even more sense to
> me
> > to remove the comment.
> >
> > -michael
>


StopServerWithSecurityAcceptanceTest fails in precheckin

2019-02-14 Thread Kirk Lund
org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest > cannotStopServerAsClusterReaderOverHttp FAILED
org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
    at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at org.apache.geode.test.junit.rules.gfsh.GfshExecution.awaitTermination(GfshExecution.java:125)
    at org.apache.geode.test.junit.rules.gfsh.GfshRule.execute(GfshRule.java:125)
    at org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:133)
    at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.startCluster(StopServerWithSecurityAcceptanceTest.java:110)
    at org.apache.geode.management.internal.cli.commands.StopServerWithSecurityAcceptanceTest.cannotStopServerAsClusterReaderOverHttp(StopServerWithSecurityAcceptanceTest.java:96)


Precheckin flakiness

2019-02-13 Thread Kirk Lund
Despite our attempts to make precheckin free of flaky failures, I'm still
seeing at least one flaky failure (that does not reproduce and does not
correlate to my PR changes) in every single precheckin launched for my PRs.

For example, the latest one (
https://concourse.apachegeode-ci.info/builds/37714) is:

org.apache.geode.internal.cache.ha.HAGIIDUnitTest > testGIIRegionQueue FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in log4j at line 2703

java.net.ConnectException: Connection refused (Connection refused)

I've searched Jira and there is no open ticket for this test. Are we
expected to file new Jira tickets for every single failure I see in
precheckin? And if so, are other developers filing tickets like I am? Or do
we go back to ignoring all of these flaky failures (I don't really consider
this a good option)?

I'm worried that I'm wasting a lot of time filing all of these Jira tickets
if no one is ever going to work on them and then they get bulk closed by
someone 12 months from now...


Should Geode stats conform to backwards compatibility constraints?

2019-02-13 Thread Kirk Lund
Quite a few Geode stats are currently defined as IntCounters, which very
easily wrap past Integer.MAX_VALUE. Some users wanted these stats not to
wrap around to negative values, so my team filed the following two tickets
and changed them to LongCounters on the develop branch:

GEODE-6345: StatSamplerStats jvmPauses stat may wrap to negative value
https://issues.apache.org/jira/browse/GEODE-6345

GEODE-6334: CachePerfStats operation count stats may wrap to negative values
https://issues.apache.org/jira/browse/GEODE-6334
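For context, the wrap those tickets describe is plain two's-complement
overflow: an int counter that passes Integer.MAX_VALUE lands at
Integer.MIN_VALUE, while a long has ample headroom. A minimal, Geode-free
sketch (the class name is mine, purely illustrative):

```java
public class CounterWrapDemo {
  public static void main(String[] args) {
    // An IntCounter that has reached Integer.MAX_VALUE...
    int intCounter = Integer.MAX_VALUE;
    intCounter++; // ...silently wraps on the next increment
    System.out.println(intCounter); // -2147483648, i.e. Integer.MIN_VALUE

    // The same counter widened to long keeps counting past the int limit
    long longCounter = Integer.MAX_VALUE;
    longCounter++;
    System.out.println(longCounter); // 2147483648
  }
}
```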

Some people are now concerned that this may break backwards compatibility
of the API for existing users.

There are two potential ways for a user to *experience* this as an API
change:

1) MemberMXBean in JMX

-  int getTotalNetSearchCompleted();
+  long getTotalNetSearchCompleted();

As you can see in GEODE-6334, we changed quite a few stats from integer to
long in CachePerfStats. The only one directly exposed as an attribute on
MemberMXBean is TotalNetSearchCompleted.

Users will find that this API method changed.
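One mitigation worth noting: a client that reads the attribute generically
(for example via MBeanServer.getAttribute, which returns Object) rather than
through a typed MemberMXBean proxy compiled against the old signature can
treat the value as a Number and tolerate the int-to-long widening. A
standalone sketch of just that widening-tolerant cast (class, helper, and
values are all mine, not Geode API):

```java
public class JmxWideningDemo {
  // Hypothetical helper: handles the attribute value whether the server
  // reports it as an Integer (Geode 1.8) or a Long (develop/1.9).
  static long asLong(Object attributeValue) {
    return ((Number) attributeValue).longValue();
  }

  public static void main(String[] args) {
    Object fromOldServer = Integer.valueOf(42); // int attribute, pre-change
    Object fromNewServer = Long.valueOf(42L);   // long attribute, post-change
    System.out.println(asLong(fromOldServer)); // 42
    System.out.println(asLong(fromNewServer)); // 42
  }
}
```

A typed proxy, by contrast, fails exactly as described above, because the
method signature itself changed.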

2) Statistics API with undocumented strings

If we assume that users might know or discover that we have a statistics
text id of "cachePerfStats" with an integer stat of name "puts" then they
could use our Statistics API to get the value:

 1:  CacheFactory cacheFactory = new CacheFactory();
 2:  Cache cache = cacheFactory.create();
 3:
 4:  Region region = cache.createRegionFactory(PARTITION).create("region");
 5:
 6:  Statistics[] statistics =
         cache.getDistributedSystem().findStatisticsByTextId("cachePerfStats");
 7:  int oldPuts = statistics[0].getInt("puts");
 8:
 9:  region.put("some", "value");
10:
11:  int newPuts = statistics[0].getInt("puts");
12:
13:  assertThat(newPuts).isEqualTo(oldPuts + 1);

The above works in Geode 1.8 and earlier but will begin to throw the
following in Geode 1.9 (when develop branch is released):

java.lang.IllegalArgumentException: The statistic puts with id 23 is of type
long and it was expected to be an int.
    at org.apache.geode.internal.statistics.StatisticDescriptorImpl.checkInt(StatisticDescriptorImpl.java:324)
    at org.apache.geode.internal.statistics.StatisticsImpl.getIntId(StatisticsImpl.java:577)
    at org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:274)
    at org.apache.geode.internal.statistics.StatisticsImpl.getInt(StatisticsImpl.java:269)
    at com.user.example.Example.example(Example.java:7)

In order to avoid the above exception, a user would have to change the code
on lines 7 and 11 to use *getLong* instead of *getInt*.

Should Geode stats be considered a form of API contract and should they
conform to backwards compatibility constraints?

Should we change these from LongCounters back to IntCounters?

Someone did suggest that we could change them back to IntCounters and then
change our statistics internals to skip to zero anytime a stat attempts to
increment above MAX_VALUE, however, I think that if we decide that stats
cannot be changed from IntCounters to LongCounters then we should not make
this change either.

Thanks for any input or opinions,
Kirk


Re: geode-junit and geode-dunit classpath problems

2019-02-08 Thread Kirk Lund
Alright, this one is figured out! I think most of these issues are other
3rd party dependencies that aren't in geode-junit or geode-dunit because
the tests that run and require those dependencies are over in geode-core. I
believe I can handle this sort of thing one at a time... 
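For the archives, the shape of the one-at-a-time fix is to declare the
missing third-party dependency directly in the module whose tests need it.
A hypothetical addition to geode-junit/build.gradle for the Jackson case
(coordinates shown without versions, assuming the project's usual
dependency-management plugin supplies them):

```groovy
dependencies {
  compile('com.jayway.jsonpath:json-path')

  // json-path only uses Jackson when it is on the classpath, so tests that
  // exercise UseJacksonForJsonPathRule must pull it in explicitly:
  testCompile('com.fasterxml.jackson.core:jackson-databind')
}
```

Tests in geode-core never hit this because geode-core already depends on
jackson-databind for other reasons.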

On Fri, Feb 8, 2019 at 2:01 PM Kirk Lund  wrote:

> We have a src/main class called UseJacksonForJsonPathRule which uses
> classes from this dependency in geode-junit/build.gradle:
>
>   compile('com.jayway.jsonpath:json-path')
>
> I'm trying to write a unit test in src/test called
> UseJacksonForJsonPathRuleTest but it fails with the following
> NoClassDefFoundErrors. These are the same classes used by any test using
> UseJacksonForJsonPathRule. geode-dunit's
> DistributedUseJacksonForJsonPathRule extends UseJacksonForJsonPathRule and
> we have one test in geode-core that uses the distributed version of the
> rule and it passes without NoClassDefFoundErrors.
>
> Anyone know why or what needs to change in geode-junit/build.gradle?
>
> Also, I seem to keep hitting this ever since we exploded geode-junit and
> geode-dunit. It's as if the classpaths aren't really correct for
> geode-junit and geode-dunit, such that when I start writing unit tests or
> integration tests within those two modules, they just don't work unless I
> move the test to geode-core -- that's the only module in which I can
> actually run these back filled unit tests. Given that it seems to be a
> systemic problem with geode-junit and geode-dunit, can we do something to
> fix this systemically instead of every single time I write a new test?
>
> java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/ObjectMapper
>
> at
> com.jayway.jsonpath.spi.json.JacksonJsonProvider.(JacksonJsonProvider.java:32)
> at
> org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule$1.(UseJacksonForJsonPathRule.java:65)
> at
> org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule.before(UseJacksonForJsonPathRule.java:63)
> at
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource.access$000(SerializableExternalResource.java:24)
> at
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:36)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.ClassNotFoundException:
> com.fasterxml.jackson.databind.ObjectMapper
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 20 more
>
>
> java.lang.NoClassDefFoundError: Could not initialize class
> com.jayway.jsonpath.spi.json.JacksonJsonProvider
>
> at
> org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule$1.(UseJacksonForJsonPathRule.java:65)
> at
> org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule.before(UseJacksonForJsonPathRule.java:63)
> at
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource.access$000(SerializableExternalResource.java:24)
> at
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:36)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.Pa

geode-junit and geode-dunit classpath problems

2019-02-08 Thread Kirk Lund
We have a src/main class called UseJacksonForJsonPathRule which uses
classes from this dependency in geode-junit/build.gradle:

  compile('com.jayway.jsonpath:json-path')

I'm trying to write a unit test in src/test called
UseJacksonForJsonPathRuleTest but it fails with the following
NoClassDefFoundErrors. These are the same classes used by any test using
UseJacksonForJsonPathRule. geode-dunit's
DistributedUseJacksonForJsonPathRule extends UseJacksonForJsonPathRule and
we have one test in geode-core that uses the distributed version of the
rule and it passes without NoClassDefFoundErrors.

Anyone know why or what needs to change in geode-junit/build.gradle?

Also, I seem to keep hitting this ever since we exploded geode-junit and
geode-dunit. It's as if the classpaths aren't really correct for
geode-junit and geode-dunit, such that when I start writing unit tests or
integration tests within those two modules, they just don't work unless I
move the test to geode-core -- that's the only module in which I can
actually run these back filled unit tests. Given that it seems to be a
systemic problem with geode-junit and geode-dunit, can we do something to
fix this systemically instead of every single time I write a new test?

java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/ObjectMapper

    at com.jayway.jsonpath.spi.json.JacksonJsonProvider.<init>(JacksonJsonProvider.java:32)
    at org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule$1.<init>(UseJacksonForJsonPathRule.java:65)
    at org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule.before(UseJacksonForJsonPathRule.java:63)
    at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource.access$000(SerializableExternalResource.java:24)
    at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:36)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.ObjectMapper
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 20 more


java.lang.NoClassDefFoundError: Could not initialize class com.jayway.jsonpath.spi.json.JacksonJsonProvider

    at org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule$1.<init>(UseJacksonForJsonPathRule.java:65)
    at org.apache.geode.test.junit.rules.UseJacksonForJsonPathRule.before(UseJacksonForJsonPathRule.java:63)
    at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource.access$000(SerializableExternalResource.java:24)
    at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:36)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)


Re: Very red CI -> Hold merges, please

2019-02-08 Thread Kirk Lund
If you put quotes around "WANRollingUpgradeNewSenderProcessOldEvent" in
Jira, then the search finds no hits. Without the quotes it results in two
tickets now including the one you worked on. Jira must be searching for
partial string matches without quotes -- so some sequence of words in the
name of that test must be in your ticket. I'll be using quotes from now on.

On Thu, Feb 7, 2019 at 10:50 PM Xiaojian Zhou  wrote:

> WANRollingUpgradeNewSenderProcessOldEvent is not related with GEODE-3967.
> I wonder why search guided us to GEODE-3967.
>
> Regards
> Gester
>
> On Thu, Feb 7, 2019 at 8:34 PM Owen Nichols  wrote:
>
> > Pipeline is back to green now.  Thank you to everyone who stepped up to
> > get things back on track.
> >
> > If you had PR checks fail this week, please re-trigger them (by making an
> > empty commit).
> >
> > > On Feb 7, 2019, at 4:20 PM, Alexander Murmann 
> > wrote:
> > >
> > > Bruce, would it make sense to for now revert the suspect change to the
> > > test? At that point we should be back to full green and we all can
> > without
> > > a doubt go back to our usual flow of merging to develop.
> > >
> > > Thoughts?
> > >
> > > On Thu, Feb 7, 2019 at 2:37 PM Kirk Lund  wrote:
> > >
> > >> Hmm, and that was another false search hit in Jira! Searching for
> > >> WANRollingUpgradeNewSenderProcessOldEvent in Jira brings up GEODE-3967
> > >> which apparently does NOT involve that test. So, maybe we found
> another
> > >> flaky test.
> > >>
> > >> Jira search seems to not work very well.
> > >>
> > >> On Thu, Feb 7, 2019 at 2:24 PM Kirk Lund  wrote:
> > >>
> > >>> The UpgradeTest failures on your latest commit for this PR are
> > >>> WANRollingUpgradeNewSenderProcessOldEvent which seems to be a
> > >> reoccurrence
> > >>> of [GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967). I
> > >>> recommend having Gester take a look at these failures. He marked
> > >>> [GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967) as
> > >>> resolved on Jan 9th.
> > >>>
> > >>> On Thu, Feb 7, 2019 at 12:37 PM Jens Deppe 
> wrote:
> > >>>
> > >>>> No worries. I think I have a better fix now. At least the builds are
> > >>>> moving
> > >>>> again.
> > >>>>
> > >>>> On Thu, Feb 7, 2019 at 12:11 PM Kirk Lund  wrote:
> > >>>>
> > >>>>> Sorry, go ahead and revert the commit and reopen the PR.
> > >>>>>
> > >>>>> On Thu, Feb 7, 2019 at 11:36 AM Jens Deppe 
> > wrote:
> > >>>>>
> > >>>>>> I was still working on a fix...
> > >>>>>>
> > >>>>>> On Thu, Feb 7, 2019 at 11:31 AM Kirk Lund 
> wrote:
> > >>>>>>
> > >>>>>>> I merged it in.
> > >>>>>>>
> > >>>>>>> On Thu, Feb 7, 2019 at 11:28 AM Kirk Lund 
> > >> wrote:
> > >>>>>>>
> > >>>>>>>> I think we should go ahead and merge in
> > >>>>>>>> https://github.com/apache/geode/pull/3172 since it resolves the
> > >>>>>>>> GfshConsoleModeUnitTest UnitTest failures.
> > >>>>>>>>
> > >>>>>>>> On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag 
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> FYI, I have just merged a ci timeout fix to increase the
> > >> timeout
> > >>>> for
> > >>>>>>>>> geode-benchmarks to 4h. This does not influence any geode
> > >>>> modules.
> > >>>>>>>>>
> > >>>>>>>>> Regards
> > >>>>>>>>> Naba
> > >>>>>>>>>
> > >>>>>>>>> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann <
> > >>>>> amurm...@apache.org
> > >>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi folks,
> > >>>>>>>>>>
> > >>>>>>>>>> Our CI has been very red for ~24 hours
> > >>>>>>>>>> <
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
> > >>>>>>>>>>> .
> > >>>>>>>>>> It looks like a substantial new issue was introduced.
> > >>>>>>>>>>
> > >>>>>>>>>> Can we hold off on merging new changes to the develop branch
> > >>>> till
> > >>>>>> this
> > >>>>>>>>>> issue is resolved?
> > >>>>>>>>>>
> > >>>>>>>>>> Thank you all!
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
>


Re: Adding integrationTest src set to geode-dunit

2019-02-08 Thread Kirk Lund
Yep, I'll ping you later this afternoon to see if you're free. Thanks!

On Thu, Feb 7, 2019 at 4:11 PM Robert Houghton  wrote:

> Can I work with you on this tomorrow?
>
> On Thu, Feb 7, 2019, 15:09 Kirk Lund  wrote:
>
> > The usual geode src sets like integrationTest don't already exist in some
> > modules such as geode-dunit.
> >
> > I'm trying to write a new IntegrationTest but simply creating the
> > directories and placing a new .java file in it doesn't seem to work.
> >
> >
> >
> geode-dunit/src/integrationTest/java/org/apache/geode/test/junit/rules/DiskDirRuleIntegrationTest.java
> >
> > I've looked at geode-dunit/build.gradle and it's not clear what to add
> > there if anything. The only src sets currently in geode-dunit are:
> >
> > geode-dunit/src/distributedTest
> > geode-dunit/src/main
> > geode-dunit/src/test
> >
> > How do I add integrationTest so that it a) compiles, b) is added to my IJ
> > project, and c) actually runs in CI and precheckin?
> >
>


Adding integrationTest src set to geode-dunit

2019-02-07 Thread Kirk Lund
The usual geode src sets like integrationTest don't already exist in some
modules such as geode-dunit.

I'm trying to write a new IntegrationTest but simply creating the
directories and placing a new .java file in them doesn't seem to work.

geode-dunit/src/integrationTest/java/org/apache/geode/test/junit/rules/DiskDirRuleIntegrationTest.java

I've looked at geode-dunit/build.gradle and it's not clear what to add
there if anything. The only src sets currently in geode-dunit are:

geode-dunit/src/distributedTest
geode-dunit/src/main
geode-dunit/src/test

How do I add integrationTest so that it a) compiles, b) is added to my IJ
project, and c) actually runs in CI and precheckin?
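
For reference, the generic Gradle syntax for declaring an extra test source set looks like the sketch below. This is a hedged guess only: Geode wires its test source sets through shared build plugins, so the actual change needed in geode-dunit/build.gradle may be different (or just a one-line plugin application).

```groovy
// Illustrative only: plain Gradle wiring for an integrationTest source set.
// Geode's real configuration lives in its shared build plugins, so treat
// every name here as an assumption.
sourceSets {
  integrationTest {
    java.srcDir 'src/integrationTest/java'
    resources.srcDir 'src/integrationTest/resources'
    compileClasspath += sourceSets.main.output + configurations.testRuntimeClasspath
    runtimeClasspath += output + compileClasspath
  }
}

task integrationTest(type: Test) {
  description = 'Runs the integration tests.'
  testClassesDirs = sourceSets.integrationTest.output.classesDirs
  classpath = sourceSets.integrationTest.runtimeClasspath
}
```

With that in place, IntelliJ picks up the new source set on the next Gradle sync; getting the task to run in CI/precheckin would still require adding it to the pipeline's task list.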


Re: Very red CI -> Hold merges, please

2019-02-07 Thread Kirk Lund
Hmm, and that was another false search hit in Jira! Searching for
WANRollingUpgradeNewSenderProcessOldEvent in Jira brings up GEODE-3967
which apparently does NOT involve that test. So, maybe we found another
flaky test.

Jira search seems to not work very well.

On Thu, Feb 7, 2019 at 2:24 PM Kirk Lund  wrote:

> The UpgradeTest failures on your latest commit for this PR are
> WANRollingUpgradeNewSenderProcessOldEvent which seems to be a recurrence
> of [GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967). I
> recommend having Gester take a look at these failures. He marked
> [GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967) as
> resolved on Jan 9th.
>
> On Thu, Feb 7, 2019 at 12:37 PM Jens Deppe  wrote:
>
>> No worries. I think I have a better fix now. At least the builds are
>> moving
>> again.
>>
>> On Thu, Feb 7, 2019 at 12:11 PM Kirk Lund  wrote:
>>
>> > Sorry, go ahead and revert the commit and reopen the PR.
>> >
>> > On Thu, Feb 7, 2019 at 11:36 AM Jens Deppe  wrote:
>> >
>> > > I was still working on a fix...
>> > >
>> > > On Thu, Feb 7, 2019 at 11:31 AM Kirk Lund  wrote:
>> > >
>> > > > I merged it in.
>> > > >
>> > > > On Thu, Feb 7, 2019 at 11:28 AM Kirk Lund  wrote:
>> > > >
>> > > > > I think we should go ahead and merge in
>> > > > > https://github.com/apache/geode/pull/3172 since it resolves the
>> > > > > GfshConsoleModeUnitTest UnitTest failures.
>> > > > >
>> > > > > On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag 
>> wrote:
>> > > > >
>> > > > >> FYI, I have just merged a ci timeout fix to increase the timeout
>> for
>> > > > >> geode-benchmarks to 4h. This does not influence any geode
>> modules.
>> > > > >>
>> > > > >> Regards
>> > > > >> Naba
>> > > > >>
>> > > > >> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann <
>> > amurm...@apache.org
>> > > >
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Hi folks,
>> > > > >> >
>> > > > >> > Our CI is very red since ~24 hours
>> > > > >> > <
>> > > > >> >
>> > > > >>
>> > > >
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
>> > > > >> > >.
>> > > > >> > It looks like a substantial new issue was introduced.
>> > > > >> >
>> > > > >> > Can we hold off on merging new changes to the develop branch
>> till
>> > > this
>> > > > >> > issue is resolved?
>> > > > >> >
>> > > > >> > Thank you all!
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>


Re: Very red CI -> Hold merges, please

2019-02-07 Thread Kirk Lund
The UpgradeTest failures on your latest commit for this PR are
WANRollingUpgradeNewSenderProcessOldEvent which seems to be a recurrence
of [GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967). I
recommend having Gester take a look at these failures. He marked
[GEODE-3967](https://issues.apache.org/jira/browse/GEODE-3967) as resolved
on Jan 9th.

On Thu, Feb 7, 2019 at 12:37 PM Jens Deppe  wrote:

> No worries. I think I have a better fix now. At least the builds are moving
> again.
>
> On Thu, Feb 7, 2019 at 12:11 PM Kirk Lund  wrote:
>
> > Sorry, go ahead and revert the commit and reopen the PR.
> >
> > On Thu, Feb 7, 2019 at 11:36 AM Jens Deppe  wrote:
> >
> > > I was still working on a fix...
> > >
> > > On Thu, Feb 7, 2019 at 11:31 AM Kirk Lund  wrote:
> > >
> > > > I merged it in.
> > > >
> > > > On Thu, Feb 7, 2019 at 11:28 AM Kirk Lund  wrote:
> > > >
> > > > > I think we should go ahead and merge in
> > > > > https://github.com/apache/geode/pull/3172 since it resolves the
> > > > > GfshConsoleModeUnitTest UnitTest failures.
> > > > >
> > > > > On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag 
> wrote:
> > > > >
> > > > >> FYI, I have just merged a ci timeout fix to increase the timeout
> for
> > > > >> geode-benchmarks to 4h. This does not influence any geode modules.
> > > > >>
> > > > >> Regards
> > > > >> Naba
> > > > >>
> > > > >> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann <
> > amurm...@apache.org
> > > >
> > > > >> wrote:
> > > > >>
> > > > >> > Hi folks,
> > > > >> >
> > > > >> > Our CI is very red since ~24 hours
> > > > >> > <
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
> > > > >> > >.
> > > > >> > It looks like a substantial new issue was introduced.
> > > > >> >
> > > > >> > Can we hold off on merging new changes to the develop branch
> till
> > > this
> > > > >> > issue is resolved?
> > > > >> >
> > > > >> > Thank you all!
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>


Re: Very red CI -> Hold merges, please

2019-02-07 Thread Kirk Lund
Sorry, go ahead and revert the commit and reopen the PR.

On Thu, Feb 7, 2019 at 11:36 AM Jens Deppe  wrote:

> I was still working on a fix...
>
> On Thu, Feb 7, 2019 at 11:31 AM Kirk Lund  wrote:
>
> > I merged it in.
> >
> > On Thu, Feb 7, 2019 at 11:28 AM Kirk Lund  wrote:
> >
> > > I think we should go ahead and merge in
> > > https://github.com/apache/geode/pull/3172 since it resolves the
> > > GfshConsoleModeUnitTest UnitTest failures.
> > >
> > > On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag  wrote:
> > >
> > >> FYI, I have just merged a ci timeout fix to increase the timeout for
> > >> geode-benchmarks to 4h. This does not influence any geode modules.
> > >>
> > >> Regards
> > >> Naba
> > >>
> > >> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann  >
> > >> wrote:
> > >>
> > >> > Hi folks,
> > >> >
> > >> > Our CI is very red since ~24 hours
> > >> > <
> > >> >
> > >>
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
> > >> > >.
> > >> > It looks like a substantial new issue was introduced.
> > >> >
> > >> > Can we hold off on merging new changes to the develop branch till
> this
> > >> > issue is resolved?
> > >> >
> > >> > Thank you all!
> > >> >
> > >>
> > >
> >
>


Re: Very red CI -> Hold merges, please

2019-02-07 Thread Kirk Lund
I merged it in.

On Thu, Feb 7, 2019 at 11:28 AM Kirk Lund  wrote:

> I think we should go ahead and merge in
> https://github.com/apache/geode/pull/3172 since it resolves the
> GfshConsoleModeUnitTest UnitTest failures.
>
> On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag  wrote:
>
>> FYI, I have just merged a ci timeout fix to increase the timeout for
>> geode-benchmarks to 4h. This does not influence any geode modules.
>>
>> Regards
>> Naba
>>
>> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann 
>> wrote:
>>
>> > Hi folks,
>> >
>> > Our CI is very red since ~24 hours
>> > <
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
>> > >.
>> > It looks like a substantial new issue was introduced.
>> >
>> > Can we hold off on merging new changes to the develop branch till this
>> > issue is resolved?
>> >
>> > Thank you all!
>> >
>>
>


Re: Very red CI -> Hold merges, please

2019-02-07 Thread Kirk Lund
I think we should go ahead and merge in
https://github.com/apache/geode/pull/3172 since it resolves the
GfshConsoleModeUnitTest UnitTest failures.

On Thu, Feb 7, 2019 at 9:57 AM Nabarun Nag  wrote:

> FYI, I have just merged a ci timeout fix to increase the timeout for
> geode-benchmarks to 4h. This does not influence any geode modules.
>
> Regards
> Naba
>
> On Thu, Feb 7, 2019 at 9:32 AM Alexander Murmann 
> wrote:
>
> > Hi folks,
> >
> > Our CI is very red since ~24 hours
> > <
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/372
> > >.
> > It looks like a substantial new issue was introduced.
> >
> > Can we hold off on merging new changes to the develop branch till this
> > issue is resolved?
> >
> > Thank you all!
> >
>


GfshConsoleModeUnitTest is failing in UnitTest

2019-02-06 Thread Kirk Lund
Did someone break unit tests?

I have 3 PRs with unit tests failing in GfshConsoleModeUnitTest:

org.apache.geode.management.internal.cli.shell.GfshConsoleModeUnitTest >
consoleModeShouldRedirectOnlyJDKLoggers FAILED
java.lang.AssertionError:
Expecting:
  <"org.apache.geode.management.internal.cli.LogWrapper">
not to end with:
  <"LogWrapper">
at
org.apache.geode.management.internal.cli.shell.GfshConsoleModeUnitTest.consoleModeShouldRedirectOnlyJDKLoggers(GfshConsoleModeUnitTest.java:54)

Does anyone know which commit broke this test? Can we please revert that
commit?


Re: ExecutorServiceRuleDumpThreads fails intermittently

2019-02-05 Thread Kirk Lund
This test has been deleted because it was really just testing ThreadInfo
which is a JDK class. It was originally added as more of an example for
ExecutorServiceRule.

If you see ExecutorServiceRuleDumpThreadsTest fail in a precheckin, please
ignore it. Rebasing on develop will remove the test.

On Tue, Feb 5, 2019 at 11:55 AM Kirk Lund  wrote:

> ExecutorServiceRuleDumpThreads.showsThreadBlockedByOtherThread fails
> intermittently in precheckin.
>
> UnitTestOpenJDK11 failure:
> https://concourse.apachegeode-ci.info/builds/35684
>
> This is one I recently created, so I'll fix it. Not sure why it passed
> StressNewTest and now fails...
>
> org.awaitility.core.ConditionTimeoutException: Assertion condition defined
> as a lambda expression in
> org.apache.geode.test.junit.rules.ExecutorServiceRuleDumpThreadsTest
> Expecting:
>  <"">
> to contain:
>  <"-  locked java.lang.Object@50c92111">  within 300 seconds.
> at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
> at
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
> at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
> at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:902)
> at
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at
> org.apache.geode.test.junit.rules.ExecutorServiceRuleDumpThreadsTest.showsThreadBlockedByOtherThread(ExecutorServiceRuleDumpThreadsTest.java:385)
> at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:566)
> at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
> at
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
> at
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
> at
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
> at
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
> at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:566)
> at
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> at
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> at
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
> at
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
> at
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
> at jdk.internal.reflect.NativeMethod

ExecutorServiceRuleDumpThreads fails intermittently

2019-02-05 Thread Kirk Lund
ExecutorServiceRuleDumpThreads.showsThreadBlockedByOtherThread fails
intermittently in precheckin.

UnitTestOpenJDK11 failure:
https://concourse.apachegeode-ci.info/builds/35684

This is one I recently created, so I'll fix it. Not sure why it passed
StressNewTest and now fails...

org.awaitility.core.ConditionTimeoutException: Assertion condition defined
as a lambda expression in
org.apache.geode.test.junit.rules.ExecutorServiceRuleDumpThreadsTest
Expecting:
 <"">
to contain:
 <"-  locked java.lang.Object@50c92111">  within 300 seconds.
at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:902)
at
org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
at
org.apache.geode.test.junit.rules.ExecutorServiceRuleDumpThreadsTest.showsThreadBlockedByOtherThread(ExecutorServiceRuleDumpThreadsTest.java:385)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:175)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:157)
at
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at

LocatorUDPSecurityDUnitTest fails intermittently

2019-02-05 Thread Kirk Lund
LocatorUDPSecurityDUnitTest failed in one of my PR precheckins. It does not
reproduce on my branch and I cannot find any open Jira tickets for it.

DistributedTestOpenJDK11 failure:
https://concourse.apachegeode-ci.info/builds/35457

I'm not sure who to assign this to. Any suggestions or volunteers?

GEODE-6363:
LocatorUDPSecurityDUnitTest.testMultipleLocatorsRestartingAtSameTimeWithMissingServers
fails intermittently in precheckin

https://issues.apache.org/jira/browse/GEODE-6363

org.apache.geode.test.dunit.RMIException: While invoking
org.apache.geode.test.dunit.NamedRunnable.run in VM 1 running on Host
55ed9c035e2b with 6 VMs
at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:551)
at org.apache.geode.test.dunit.VM.invoke(VM.java:365)
at
org.apache.geode.distributed.LocatorDUnitTest.testMultipleLocatorsRestartingAtSameTimeWithMissingServers(LocatorDUnitTest.java:1540)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at
org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:61)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:175)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:157)
at
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at

Review request for PR #3123

2019-02-04 Thread Kirk Lund
Does anyone have time to review PR #3123 for me?

GEODE-6301: Use ThreadInfo.toString in ExecutorServiceRule.dumpThreads
https://github.com/apache/geode/pull/3123

This PR changes ExecutorServiceRule.dumpThreads to use
java.lang.management.ThreadInfo.toString() which includes lock monitors and
synchronizers.

The only usage of this so far is the regression test that I attached
to GEODE-6255. When the test reproduces the hang, it uses
ExecutorServiceRule.dumpThreads() in the assertion failure output to show
the stacks of the two threads that are deadlocked.
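
For anyone unfamiliar with the JDK API involved: ThreadInfo.toString() formats a thread's name, state, stack, locked monitors, and locked ownable synchronizers in one call. Here is a minimal standalone sketch of a dumpThreads-style helper built on it (illustrative names only, not the actual ExecutorServiceRule code):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

  /** Formats every live thread, including locked monitors and synchronizers. */
  public static String dumpThreads() {
    ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
    StringBuilder dump = new StringBuilder();
    // lockedMonitors=true, lockedSynchronizers=true pulls in the lock details
    for (ThreadInfo info : threadMXBean.dumpAllThreads(true, true)) {
      dump.append(info); // ThreadInfo.toString() does all the formatting
    }
    return dump.toString();
  }

  public static void main(String[] args) {
    System.out.println(dumpThreads());
  }
}
```

One caveat worth knowing: ThreadInfo.toString() truncates each stack to a fixed maximum number of frames, so very deep stacks come out shortened in the dump.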

Thanks,
Kirk


Re: [Proposal] Adding Micrometer to Apache Geode

2019-02-04 Thread Kirk Lund
+1 to add Micrometer in the described way and I'll add my approval to the
PR as well

On Fri, Feb 1, 2019 at 4:16 PM Dale Emery  wrote:

> Hello all,
>
> I've created a PR to add Micrometer to Geode:
> https://github.com/apache/geode/pull/3153
>
> I invite your review.
>
> The Micrometer system includes a rich suite of Meter types for
> instrumenting code, and several styles of meter registries for
> maintaining the meters.  It also includes (as separate jars) additional
> registry types for publishing metrics to a variety of external
> monitoring systems (e.g. Prometheus, JMX, and Influx).
>
> See http://micrometer.io/ for details about Micrometer,
>
> This PR adds a Micrometer-based metrics system to Geode.
>
> On startup, Geode configures a "primary" meter registry. Developers use
> the primary registry to create meters (gauges, counters, timers, and
> others) to instrument Geode and client code.
>
> Internally, the meter registry is a "composite" registry, which allows
> attaching any number of "downstream" registries. Each downstream
> registry will typically publish the metrics to an external monitoring
> system, or provide an endpoint where an external monitoring system can
> "scrape" the metrics.
>
> In Geode, when a meter is added or removed from the primary registry,
> the primary registry adds or removes a corresponding meter in each
> downstream registry.  When a meter in the primary registry is updated
> (e.g. incremented), the primary registry forwards each operation to the
> corresponding meter in each downstream registry.
>
> A MetricsCollector provides access to the primary registry, and includes
> methods for adding and removing downstream registries.
>
> The MetricsCollector itself is currently available via a method on
> InternalDistributedSystem.  Eventually we plan to  make the
> MetricsCollector available through DistributedSystem, or some other
> non-internal class or interface.
>
> The key components of this change are:
> - MetricsCollector (interface) allows access to the primary registry,
>   and allows adding and removing downstream registries.
> - CompositeMetricsCollector is the concrete implementation of
>   MetricsCollector, and creates the internal composite registry.
> - InternalDistributedSystem creates the MetricsCollector and makes it
>   available via a new getMetricsCollector() method.
> - The Micrometer system for instrumenting and maintaining metrics (added
>   to Geode as a dependency).
>
> Cheers,
> Dale
>
> > On Jan 15, 2019, at 9:37 AM, Mark Hanson  wrote:
> >
> > Hello All,
> >
> > I would like to propose that we incorporate Micrometer into Geode to
> allow us to collect statistics and make them more easily available to
> external services should someone chose to implement support for that.
> >
> > In some basic testing, it does not appear to have a significant impact
> on performance to hook this in. I think that in the long run it will really
> improve the visibility of the system's stats in real-time… This will also
> allow some extensibility for people who want to hook into their tracking
> infrastructure, such as New Relic or Data Dog, though we are not putting
> that in.
> >
> >
> > What do people think?
> >
> > Thanks,
> > Mark
> >
> >
>
>
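
The composite-forwarding behavior Dale describes (every update on the primary registry mirrored into each downstream registry) can be modeled in a few lines of plain Java. This is an illustrative model of the pattern only, not Micrometer's actual CompositeMeterRegistry API:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.LongAdder;

public class CompositeRegistrySketch {

  /** A trivial registry holding named counters. */
  public static class Registry {
    final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    public void increment(String name) {
      counters.computeIfAbsent(name, n -> new LongAdder()).increment();
    }

    public long value(String name) {
      LongAdder adder = counters.get(name);
      return adder == null ? 0 : adder.sum();
    }
  }

  /** Primary registry: forwards every operation to each downstream registry. */
  public static class CompositeRegistry extends Registry {
    final List<Registry> downstream = new CopyOnWriteArrayList<>();

    public void addDownstream(Registry registry) {
      downstream.add(registry);
    }

    @Override
    public void increment(String name) {
      super.increment(name);
      for (Registry registry : downstream) {
        registry.increment(name); // mirror the update downstream
      }
    }
  }

  public static void main(String[] args) {
    CompositeRegistry primary = new CompositeRegistry();
    Registry publishing = new Registry(); // stands in for e.g. a Prometheus registry
    primary.addDownstream(publishing);
    primary.increment("cache.gets");
    primary.increment("cache.gets");
    System.out.println(primary.value("cache.gets") + " " + publishing.value("cache.gets")); // 2 2
  }
}
```

Micrometer's real composite registry does the same fan-out for every meter type (gauges, timers, and so on), and also creates/removes the corresponding downstream meters as meters are registered and removed on the primary.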


ImportClusterConfigTest flaky failure

2019-01-30 Thread Kirk Lund
I just filed https://issues.apache.org/jira/browse/GEODE-6343 against
ImportClusterConfigTest.importWouldNotShutDownServer because it failed in a
precheckin. I'm not sure who to assign this failure to. Any suggestions or
volunteers?

Failure:
https://concourse.apachegeode-ci.info/builds/34064

Stack:
org.apache.geode.management.internal.cli.commands.ImportClusterConfigTest >
importWouldNotShutDownServer FAILED
org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
at
jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at
org.apache.geode.test.junit.rules.gfsh.GfshExecution.awaitTermination(GfshExecution.java:125)
at
org.apache.geode.test.junit.rules.gfsh.GfshRule.execute(GfshRule.java:125)
at
org.apache.geode.test.junit.rules.gfsh.GfshScript.execute(GfshScript.java:133)
at
org.apache.geode.management.internal.cli.commands.ImportClusterConfigTest.importWouldNotShutDownServer(ImportClusterConfigTest.java:41)


PR review request

2019-01-24 Thread Kirk Lund
Anyone want to review my PR, please?

GEODE-6313: Make ControllableProcess fail if status is only whitespace #3112
https://github.com/apache/geode/pull/3112

Thanks,
Kirk


Re: dunit hang in ClusterConfigLocatorRestartDUnitTest

2019-01-23 Thread Kirk Lund
/Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.1/Native
Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.1
/NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.1
/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@11.0.1/Method.java:566)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:175)
at
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:157)
at
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at
org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.1
/ThreadPoolExecutor.java:1128)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.1
/ThreadPoolExecutor.java:628)
at
org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
at java.lang.Thread.run(java.base@11.0.1/Thread.java:834)

   Locked ownable synchronizers:
- <0xd0884428> (a
java.util.concurrent.ThreadPoolExecutor$Worker)

On Wed, Jan 23, 2019 at 10:02 AM Kirk Lund  wrote:

> I hit a dunit hang in one of my precheckin runs.
>
> The only test mentioned in callstacks/dunit-hangs.txt is
> ClusterConfigLocatorRestartDUnitTest.
>
> I see some Pooled Message Processor threads that might be hung waiting for
> the same
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject on
> OverflowQueueWithDMStat.
>
> ClusterConfigurationLoader and FunctionStreamingResultCollector might be
> involved.
>
> Here's the link if someone working on cluster config wants to download the
> tgz and look through the callstacks:
> https://concourse.apachegeode-ci.info/builds/31696
>
> "RMI TCP Connection(1)-172.17.0.14" #34 daemon prio=5 os_prio=0
> cpu=1485.20ms elapsed=4864.19s tid=0x7f6950001800 nid=0x213 waiting on
> condition  [0x7f696b5f3000]
>java.lang.Thread.State: TIMED_WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11.0.1/Native Method)
> - parking to wait for  <0xed7bf538> (a
> java.util.concurrent.CountDownLatch$Sync)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.1
> /LockSupport.java:234)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(java.base@11.0.1
> /AbstractQueuedSynchronizer.java:1079)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@11.0.1
> /AbstractQueuedSynchronizer.java:1369)
> at java.util.concurrent.CountDownLatch.await(java.base@11.0.1
> /CountDownLatch.java:278)
> at
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:61)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:714)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:785)
> at
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:762)
> at
> org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.getResult(FunctionStreamingResultCollector.java:142)
> at
> org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromOneLocator(ClusterConfigurationLoader.java:313)
> at
> org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromLocators(ClusterConfigurationLoader.java:282)
> at
> org.apache.geode.intern

dunit hang in ClusterConfigLocatorRestartDUnitTest

2019-01-23 Thread Kirk Lund
I hit a dunit hang in one of my precheckin runs.

The only test mentioned in callstacks/dunit-hangs.txt is
ClusterConfigLocatorRestartDUnitTest.

I see some Pooled Message Processor threads that might be hung waiting for
the same
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject on
OverflowQueueWithDMStat.

ClusterConfigurationLoader and FunctionStreamingResultCollector might be
involved.

Here's the link if someone working on cluster config wants to download the
tgz and look through the callstacks:
https://concourse.apachegeode-ci.info/builds/31696

"RMI TCP Connection(1)-172.17.0.14" #34 daemon prio=5 os_prio=0
cpu=1485.20ms elapsed=4864.19s tid=0x7f6950001800 nid=0x213 waiting on
condition  [0x7f696b5f3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.1/Native Method)
- parking to wait for  <0xed7bf538> (a
java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.1
/LockSupport.java:234)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(java.base@11.0.1
/AbstractQueuedSynchronizer.java:1079)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@11.0.1
/AbstractQueuedSynchronizer.java:1369)
at java.util.concurrent.CountDownLatch.await(java.base@11.0.1
/CountDownLatch.java:278)
at
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:61)
at
org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:714)
at
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:785)
at
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:762)
at
org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.getResult(FunctionStreamingResultCollector.java:142)
at
org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromOneLocator(ClusterConfigurationLoader.java:313)
at
org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromLocators(ClusterConfigurationLoader.java:282)
at
org.apache.geode.internal.cache.GemFireCacheImpl.requestSharedConfiguration(GemFireCacheImpl.java:1074)
at
org.apache.geode.internal.cache.GemFireCacheImpl.(GemFireCacheImpl.java:859)
- locked <0xed7bf7f8> (a java.lang.Class for
org.apache.geode.internal.cache.GemFireCacheImpl)
at
org.apache.geode.internal.cache.GemFireCacheImpl.basicCreate(GemFireCacheImpl.java:796)
- locked <0xed7bf7f8> (a java.lang.Class for
org.apache.geode.internal.cache.GemFireCacheImpl)
at
org.apache.geode.internal.cache.GemFireCacheImpl.create(GemFireCacheImpl.java:785)
at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:176)
- locked <0xed6005b0> (a java.lang.Class for
org.apache.geode.cache.CacheFactory)
at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:223)
- locked <0xed6005b0> (a java.lang.Class for
org.apache.geode.cache.CacheFactory)
at
org.apache.geode.test.junit.rules.ServerStarterRule.startServer(ServerStarterRule.java:174)
at
org.apache.geode.test.junit.rules.ServerStarterRule.before(ServerStarterRule.java:80)
at
org.apache.geode.test.dunit.rules.ClusterStartupRule.lambda$startServerVM$729766c4$1(ClusterStartupRule.java:248)
at
org.apache.geode.test.dunit.rules.ClusterStartupRule$$Lambda$131/0x0008401c0840.call(Unknown
Source)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.1/Native
Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.1
/NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.1
/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@11.0.1/Method.java:566)
at
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
at
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:69)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.1/Native
Method)
at
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.1
/NativeMethodAccessorImpl.java:62)
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.1
/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@11.0.1/Method.java:566)
at sun.rmi.server.UnicastServerRef.dispatch(java.rmi@11.0.1
/UnicastServerRef.java:359)
at sun.rmi.transport.Transport$1.run(java.rmi@11.0.1
/Transport.java:200)
at 

Re: Gradle build broken?

2019-01-23 Thread Kirk Lund
I saw the same problem a few weeks ago. I ended up deleting the directories
in my .m2 repository and rebuilding. That seemed to fix it.

The cause seems to have something to do with that log4j core tests jar, but
I'm not sure why our build would be looking for the corresponding sources
jar. If anyone knows what would cause this, let me know!
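For anyone hitting the same failure, the workaround described above can be scripted roughly like this (a hedged sketch only -- the artifact path comes from the error message in this thread; adjust the version and repository location for your machine):

```shell
# Sketch of the workaround: delete the cached log4j-core 2.11.1 artifacts
# from the local Maven repository so the next Gradle build re-resolves them.
# The path below is taken from the build error; it is an assumption that
# your repository lives at the default ~/.m2 location.
log4j_cache="${HOME}/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1"
if [ -d "$log4j_cache" ]; then
  rm -rf "$log4j_cache"
  echo "removed $log4j_cache"
else
  echo "nothing cached at $log4j_cache"
fi
```

After removing the directory, re-run `./gradlew dev` and Gradle should download fresh copies of the jars.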

On Tue, Jan 22, 2019 at 5:49 PM Galen O'Sullivan 
wrote:

> I'm getting the following failure building Geode on the latest develop.
> I've tried `rm -r .gradle ~/.gradle`, to no avail. Any ideas?
>
> Thanks,
> Galen
>
> ---
>
> ./gradlew dev
> > Task :geode-core:compileIntegrationTestJava FAILED
>
> FAILURE: Build failed with an exception.
>
> * What went wrong:
> Execution failed for task ':geode-core:compileIntegrationTestJava'.
> > Could not resolve all files for configuration
> ':geode-core:integrationTestCompileClasspath'.
>> Could not find log4j-core-tests.jar
> (org.apache.logging.log4j:log4j-core:2.11.1).
>  Searched in the following locations:
>
>
> file:/Users/gosullivan/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-tests.jar
>> Could not find log4j-core-test-sources.jar
> (org.apache.logging.log4j:log4j-core:2.11.1).
>  Searched in the following locations:
>
>
> file:/Users/gosullivan/.m2/repository/org/apache/logging/log4j/log4j-core/2.11.1/log4j-core-2.11.1-test-sources.jar
>
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or
> --debug option to get more log output. Run with --scan to get full
> insights.
>
> * Get more help at
> https://help.gradle.org
>
> Deprecated Gradle features were used in this build, making it incompatible
> with Gradle 6.0.
> Use '--warning-mode all' to show the individual deprecation warnings.
> See
>
> https://docs.gradle.org/5.0/userguide/command_line_interface.html#sec:command_line_warnings
>
> BUILD FAILED in 24s
>


Re: Creating the default disk store after persistent region

2019-01-18 Thread Kirk Lund
ateRegion(GemFireCacheImpl.java:2956)
at
org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2944)
at org.apache.geode.cache.RegionFactory.create(RegionFactory.java:755)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.createRegionWithDefaultDiskStore(CreateDestroyRegionRegressionTest.java:92)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$0(CreateDestroyRegionRegressionTest.java:80)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$2/13648335.run(Unknown
Source)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

On Fri, Jan 18, 2019 at 2:24 PM Kirk Lund  wrote:

> And my description doesn't match the threads again... Just ignore the
> message.
>
> Look at the threads if you care.
>
> On Fri, Jan 18, 2019 at 2:18 PM Kirk Lund  wrote:
>
>> Have there been any changes within the last year involving the following?
>> Is anyone familiar with getOrCreateDefaultDiskStore and when it's invoked?
>>
>> When Region creation for a persistent Region does not specify a disk
>> store, the code calls getOrCreateDefaultDiskStore. It then proceeds to
>> create the default disk store.
>>
>> In December, this began to cause a deadlock (
>> https://issues.apache.org/jira/browse/GEODE-6255) in a persistent region
>> test that has two threads:
>>
>> * Thread-1 is invoking Cache.close()
>> * Thread-2 is invoking Region creation for a persistent Region that will
>> use the default Disk Store which has not yet been created
>>
>> Thread-1 (close) is acquiring locks in this order: synchronization on
>> GemFireCacheImpl.rootRegions, ManagementListener.writeLock
>>
>> Thread-2 (create region) is acquiring locks in this order:
>> a) (for Region) synchronization on GemFireCacheImpl.rootRegions,
>> ManagementListener.readLock
>> b) (for Disk Store) ManagementListener.readLock, synchronization on
>> GemFireCacheImpl.rootRegions
>>
>> Step (b) is what causes the deadlock and this only occurs if the default
>> disk store needs to be created for the newly created region.
>>
>> ManagementListener is creating JMX mbeans for whatever component was
>> just created.
>>
>> I filed a ticket for the deadlock:
>> https://issues.apache.org/jira/browse/GEODE-6255 (not sure I used
>> "thread-1" and "thread-2" consistently between this email and the ticket --
>> I may have flipped them around).
>>
>> I think creating the default disk store before creating the region might
>> be the only easy way to fix the bug. My pair already tried changing
>> ManagementListener to use a dedicated thread (or thread pool). We also
>> tried removing the ReadWriteLock to see what it's actually protecting and
>> the failures are more complicated than creating the default disk store
>> before creating the region.
>>
>> "thread-1":
>> at
>> org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577)
>> - waiting to lock <0x000773583c28> (a java.util.HashMap)
>> at
>> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333)
>> at
>> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755)
>> at
>> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255)
>> at
>> org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242)
>> at
>> org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430)
>> at
>> org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73)
>> at
>> org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260)
>> at
>> org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388)
>> at
>> org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239)
>> - locked <0x00077361b900> (a java.util.HashMap)
>> at
>> org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737)
>> at
>> org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119)
>> at
>> org.apache.geode.distributed.internal.InternalDistr

Re: Creating the default disk store after persistent region

2019-01-18 Thread Kirk Lund
And my description doesn't match the threads again... Just ignore the
message.

Look at the threads if you care.

On Fri, Jan 18, 2019 at 2:18 PM Kirk Lund  wrote:

> Have there been any changes within the last year involving the following?
> Is anyone familiar with getOrCreateDefaultDiskStore and when it's invoked?
>
> When Region creation for a persistent Region does not specify a disk
> store, the code calls getOrCreateDefaultDiskStore. It then proceeds to
> create the default disk store.
>
> In December, this began to cause a deadlock (
> https://issues.apache.org/jira/browse/GEODE-6255) in a persistent region
> test that has two threads:
>
> * Thread-1 is invoking Cache.close()
> * Thread-2 is invoking Region creation for a persistent Region that will
> use the default Disk Store which has not yet been created
>
> Thread-1 (close) is acquiring locks in this order: synchronization on
> GemFireCacheImpl.rootRegions, ManagementListener.writeLock
>
> Thread-2 (create region) is acquiring locks in this order:
> a) (for Region) synchronization on GemFireCacheImpl.rootRegions,
> ManagementListener.readLock
> b) (for Disk Store) ManagementListener.readLock, synchronization on
> GemFireCacheImpl.rootRegions
>
> Step (b) is what causes the deadlock and this only occurs if the default
> disk store needs to be created for the newly created region.
>
> ManagementListener is creating JMX mbeans for whatever component was
> just created.
>
> I filed a ticket for the deadlock:
> https://issues.apache.org/jira/browse/GEODE-6255 (not sure I used
> "thread-1" and "thread-2" consistently between this email and the ticket --
> I may have flipped them around).
>
> I think creating the default disk store before creating the region might
> be the only easy way to fix the bug. My pair already tried changing
> ManagementListener to use a dedicated thread (or thread pool). We also
> tried removing the ReadWriteLock to see what it's actually protecting and
> the failures are more complicated than creating the default disk store
> before creating the region.
>
> "thread-1":
> at
> org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577)
> - waiting to lock <0x000773583c28> (a java.util.HashMap)
> at
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333)
> at
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755)
> at
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255)
> at
> org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242)
> at
> org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430)
> at
> org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73)
> at
> org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260)
> at
> org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388)
> at
> org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239)
> - locked <0x00077361b900> (a java.util.HashMap)
> at
> org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737)
> at
> org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119)
> at
> org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
> at
> org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
> at
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2127)
> - locked <0x0006c010d508> (a java.lang.Class for
> org.apache.geode.internal.cache.GemFireCacheImpl)
> at
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1966)
> at
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1956)
> at
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.closeCache(CreateDestroyRegionRegressionTest.java:119)
> at
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$1(CreateDestroyRegionRegressionTest.java:93)
> at
> org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$3/1456208737.run(Unknown
> Source)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecut

Creating the default disk store after persistent region

2019-01-18 Thread Kirk Lund
Have there been any changes within the last year involving the following?
Is anyone familiar with getOrCreateDefaultDiskStore and when it's invoked?

When Region creation for a persistent Region does not specify a disk store,
the code calls getOrCreateDefaultDiskStore. It then proceeds to create the
default disk store.

In December, this began to cause a deadlock (
https://issues.apache.org/jira/browse/GEODE-6255) in a persistent region
test that has two threads:

* Thread-1 is invoking Cache.close()
* Thread-2 is invoking Region creation for a persistent Region that will
use the default Disk Store which has not yet been created

Thread-1 (close) is acquiring locks in this order: synchronization on
GemFireCacheImpl.rootRegions, ManagementListener.writeLock

Thread-2 (create region) is acquiring locks in this order:
a) (for Region) synchronization on GemFireCacheImpl.rootRegions,
ManagementListener.readLock
b) (for Disk Store) ManagementListener.readLock, synchronization on
GemFireCacheImpl.rootRegions

Step (b) is what causes the deadlock and this only occurs if the default
disk store needs to be created for the newly created region.

ManagementListener is creating JMX mbeans for whatever component was
just created.

I filed a ticket for the deadlock:
https://issues.apache.org/jira/browse/GEODE-6255 (not sure I used
"thread-1" and "thread-2" consistently between this email and the ticket --
I may have flipped them around).

I think creating the default disk store before creating the region might be
the only easy way to fix the bug. My pair already tried changing
ManagementListener to use a dedicated thread (or thread pool). We also
tried removing the ReadWriteLock to see what it's actually protecting and
the failures are more complicated than creating the default disk store
before creating the region.
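The lock orders in (a) and (b) above are the classic two-lock deadlock: each thread holds one lock while waiting for the other. A minimal sketch of the usual fix -- pick one global acquisition order and use it everywhere. The field names below are illustrative stand-ins for GemFireCacheImpl.rootRegions and the ManagementListener lock, not Geode's actual code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrderSketch {
    // Stand-in for the GemFireCacheImpl.rootRegions monitor (assumption).
    private static final Object rootRegions = new Object();
    // Stand-in for the ManagementListener ReadWriteLock (assumption).
    private static final ReentrantReadWriteLock listenerLock = new ReentrantReadWriteLock();

    // Deadlock-prone pattern: thread-1 takes rootRegions then writeLock,
    // while thread-2 takes readLock then rootRegions. The fix shown here is
    // to always acquire rootRegions first, then the listener lock, so the
    // disk-store path (b) matches the region path (a).
    static void createDefaultDiskStoreSafely() {
        synchronized (rootRegions) {            // global order: rootRegions first
            listenerLock.readLock().lock();     // ...then the listener lock
            try {
                // create the default disk store, fire JMX notifications, etc.
            } finally {
                listenerLock.readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        createDefaultDiskStoreSafely();
        System.out.println("completed with consistent lock order");
    }
}
```

Creating the default disk store before the region, as suggested above, sidesteps the problem differently: the conflicting order (b) is never attempted while another thread can hold the region path's locks.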

"thread-1":
at
org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577)
- waiting to lock <0x000773583c28> (a java.util.HashMap)
at
org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333)
at
org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755)
at
org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255)
at
org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242)
at
org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430)
at
org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73)
at
org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260)
at
org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388)
at
org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239)
- locked <0x00077361b900> (a java.util.HashMap)
at
org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737)
at
org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119)
at
org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
at
org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
at
org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2127)
- locked <0x0006c010d508> (a java.lang.Class for
org.apache.geode.internal.cache.GemFireCacheImpl)
at
org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1966)
at
org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1956)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.closeCache(CreateDestroyRegionRegressionTest.java:119)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$1(CreateDestroyRegionRegressionTest.java:93)
at
org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$3/1456208737.run(Unknown
Source)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

"thread-2":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0007735ff8e0> (a
java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at

Re: Loner changes its membership port when starting an acceptor

2019-01-16 Thread Kirk Lund
Ah! I forgot about that (trying to do too many things at once). I'll file a
ticket to fix it up. Thanks!

On Wed, Jan 16, 2019 at 3:47 PM Bruce Schuchardt 
wrote:

> Hmm, okay but I was looking at this code in MBeanJMXAdapter:
>
> public static String getMemberNameOrId(DistributedMember member) {
>   if (member.getName() != null && !member.getName().equals("")) {
>     return makeCompliantName(member.getName());
>   }
>   return makeCompliantName(member.getId());
> }
>
> So if there isn't a name set in the distribution configuration it will
> use the toString of the member ID.  I think that's what's happening to
> you.  The getName() method in InternalDistributedMember doesn't return a
> string like you've described.  It returns whatever was configured as the
> "name" in the distribution configuration properties.
>
> On 1/16/19 1:27 PM, Kirk Lund wrote:
> > It's not using toString(). It's just using DistributedMember.getName().
> > This is implemented by InternalDistributedMember.getName() which
> delegates
> > to NetMember.getName(). Are you saying we should add a new method to
> > DistributedMember instead of using getName()?
> >
> > The mbeans are categorized as a type of Member or Distributed. Member means
> > the mbean represents something about a member, while the other type means
> > that the mbean represents something about the DistributedSystem/Cluster
> > (aggregating all Members together in some way). When the type is Member,
> > the ObjectName contains:
> >
> > "type=Member,member={0}"
> >
> > MBeanJMXAdapter creates the ObjectNames by filling in that parameter with
> > DistributedMember.getName(). We originally documented that the
> ObjectNames
> > would contain the DistributedMember.getName(), but we could change that
> by
> > adding a new getMBeanName() method or something like that. I think we
> > originally thought it would be nice for the User if they could simply
> refer
> > to DistributedMember.getName() as being the same value as we use in the
> JMX
> > ObjectName.
> >
> > On Tue, Jan 15, 2019 at 11:28 AM Bruce Schuchardt <
> bschucha...@pivotal.io>
> > wrote:
> >
> >> Yeah, let's fix this, but let's not require the toString() of an object
> >> to never change.  Let's add another method that returns a Bean
> >> identifier and is documented to never change.
> >>
> >> On 1/15/19 9:45 AM, Kirk Lund wrote:
> >>> Sorry about the confusion. I meant that the change of membership port
> >>> results in DistributedMember returning a different string from its
> >>> getName() method.
> >>>
> >>> We discovered this because the JMX layer has some error handling that
> >>> results in suppressing this failure, so the failure was being hidden.
> We
> >>> cleaned up the error handling and saw quite a few failures in
> precheckin
> >>> caused by this. I figured it was more correct to fix the underlying
> >> problem
> >>> rather than restore the suppression of this bug.
> >>>
> >>> On Tue, Jan 15, 2019 at 7:47 AM Bruce Schuchardt <
> bschucha...@pivotal.io
> >>>
> >>> wrote:
> >>>
> >>>> Actually the formatting code would go in InternalDistributedMember.
> The
> >>>> JMX code already has a special method for handling that class.  I was
> >>>> thrown off by the reference to a non-existent getName() method in
> >>>> LonerDistributionManager.
> >>>>
> >>>> On 1/15/19 7:34 AM, Bruce Schuchardt wrote:
> >>>>> I think we could solve this by either moving the ID formatting code
> to
> >>>>> the DistributionManager implementations & having
> >>>>> LonerDistributionManager omit the port number or modify the
> >>>>> client/server handshake to not install a port number when connecting
> >>>>> to a remote GatewayReceiver.  I guess the latter wouldn't work if the
> >>>>> bean is in a real client cache.
> >>>>>
> >>>>> The installation of a port number was implemented to prevent
> duplicate
> >>>>> membership IDs in client caches when clients are being spun up in
> >>>>> parallel on the same machine.  However there is also a "unique ID" in
> >>>>> LonerDistributionManager that was supposed to address this problem
> but
> >>>>> apparently didn't.
> >>>>>
> >>>>> On 1/14/19 4:45 P

Re: Loner changes its membership port when starting an acceptor

2019-01-16 Thread Kirk Lund
It's not using toString(). It's just using DistributedMember.getName().
This is implemented by InternalDistributedMember.getName() which delegates
to NetMember.getName(). Are you saying we should add a new method to
DistributedMember instead of using getName()?

The mbeans are categorized as a type of Member or Distributed. Member means
the mbean represents something about a member, while the other type means
that the mbean represents something about the DistributedSystem/Cluster
(aggregating all Members together in some way). When the type is Member,
the ObjectName contains:

"type=Member,member={0}"

MBeanJMXAdapter creates the ObjectNames by filling in that parameter with
DistributedMember.getName(). We originally documented that the ObjectNames
would contain the DistributedMember.getName(), but we could change that by
adding a new getMBeanName() method or something like that. I think we
originally thought it would be nice for the User if they could simply refer
to DistributedMember.getName() as being the same value as we use in the JMX
ObjectName.
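The fragility being described can be seen with a small standalone sketch of the "type=Member,member={0}" pattern. The domain string and quoting below are assumptions for illustration (Geode's makeCompliantName does its own character mangling); the point is only that an ObjectName keyed on a name that embeds the membership port changes whenever the port does:

```java
import java.text.MessageFormat;
import javax.management.ObjectName;

public class MemberObjectNameSketch {
    // Illustrative only: mirrors the "type=Member,member={0}" key-property
    // pattern described above. The "GemFire" domain and ObjectName.quote
    // usage are assumptions, not Geode's actual MBeanJMXAdapter code.
    static ObjectName memberMBeanName(String memberName) throws Exception {
        String keyProps = MessageFormat.format("type=Member,member={0}",
                ObjectName.quote(memberName));
        return new ObjectName("GemFire:" + keyProps);
    }

    public static void main(String[] args) throws Exception {
        // If getName() falls back to the member ID, and the ID embeds the
        // membership port, the ObjectName changes when the port changes --
        // so MBeans registered under the old name can no longer be found.
        System.out.println(memberMBeanName("server1(loner):41000"));
        System.out.println(memberMBeanName("server1(loner):52341"));
    }
}
```

This is why a dedicated, documented-stable identifier (Bruce's suggestion) or a name set explicitly in the distribution configuration avoids the breakage.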

On Tue, Jan 15, 2019 at 11:28 AM Bruce Schuchardt 
wrote:

> Yeah, let's fix this, but let's not require the toString() of an object
> to never change.  Let's add another method that returns a Bean
> identifier and is documented to never change.
>
> On 1/15/19 9:45 AM, Kirk Lund wrote:
> > Sorry about the confusion. I meant that the change of membership port
> > results in DistributedMember returning a different string from its
> > getName() method.
> >
> > We discovered this because the JMX layer has some error handling that
> > results in suppressing this failure, so the failure was being hidden. We
> > cleaned up the error handling and saw quite a few failures in precheckin
> > caused by this. I figured it was more correct to fix the underlying
> problem
> > rather than restore the suppression of this bug.
> >
> > On Tue, Jan 15, 2019 at 7:47 AM Bruce Schuchardt  >
> > wrote:
> >
> >> Actually the formatting code would go in InternalDistributedMember.  The
> >> JMX code already has a special method for handling that class.  I was
> >> thrown off by the reference to a non-existent getName() method in
> >> LonerDistributionManager.
> >>
> >> On 1/15/19 7:34 AM, Bruce Schuchardt wrote:
> >>> I think we could solve this by either moving the ID formatting code to
> >>> the DistributionManager implementations & having
> >>> LonerDistributionManager omit the port number or modify the
> >>> client/server handshake to not install a port number when connecting
> >>> to a remote GatewayReceiver.  I guess the latter wouldn't work if the
> >>> bean is in a real client cache.
> >>>
> >>> The installation of a port number was implemented to prevent duplicate
> >>> membership IDs in client caches when clients are being spun up in
> >>> parallel on the same machine.  However there is also a "unique ID" in
> >>> LonerDistributionManager that was supposed to address this problem but
> >>> apparently didn't.
> >>>
> >>> On 1/14/19 4:45 PM, Kirk Lund wrote:
> >>>> So I was stepping through some WAN tests in IJ debugger (on develop
> >>>> with no
> >>>> changes) and discovered that any MXBeans that are created before
> >>>> starting a
> >>>> server port (either using CacheServer or GatewayReceiver) are broken
> and
> >>>> fail to be updated after that -- the ObjectNames include the
> >>>> DistributedMember.getName(). Turns out some JMX code is eating an NPE
> >>>> that's caused because the LonerDistributionManager changes its
> >>>> membership
> >>>> port when an acceptor endpoint is started up.
> >>>>
> >>>> Below is the method in LonerDistributionManager (with some other
> >>>> issues as
> >>>> well) that does this updating. We either need to make a lot of
> >>>> changes to
> >>>> the JMX code to fix this or we need to make one small change to
> >>>> LonerDistributionManager (ie, to delete this method). Question: do we
> >>>> really need the DistributedMember of a Loner to change its getName()
> >>>> which
> >>>> includes the membership port that changed?
> >>>>
> >>>> /**
> >>>>  * update the loner port with an integer that may be more unique than
> >>>>  * the default port (zero). This updates the ID in place and establishes
> >>>>  * new default settings for the manufacture of new IDs.
> >>>>  *
> >>>>  * @param newPort the new port to use
> >>>>  */
> >>>> public void updateLonerPort(int newPort) {
> >>>>   this.logger.config(
> >>>>       String.format("Updating membership port.  Port changed from %s to %s.  ID is now %s",
> >>>>           new Object[] {this.lonerPort, newPort, getId()}));
> >>>>   this.lonerPort = newPort;
> >>>>   this.getId().setPort(this.lonerPort);
> >>>> }
> >>>>
>


Re: Loner changes its membership port when starting an acceptor

2019-01-15 Thread Kirk Lund
Sorry about the confusion. I meant that the change of membership port
results in DistributedMember returning a different string from its
getName() method.

We discovered this because the JMX layer has some error handling that
results in suppressing this failure, so the failure was being hidden. We
cleaned up the error handling and saw quite a few failures in precheckin
caused by this. I figured it was more correct to fix the underlying problem
rather than restore the suppression of this bug.

On Tue, Jan 15, 2019 at 7:47 AM Bruce Schuchardt 
wrote:

> Actually the formatting code would go in InternalDistributedMember.  The
> JMX code already has a special method for handling that class.  I was
> thrown off by the reference to a non-existent getName() method in
> LonerDistributionManager.
>
> On 1/15/19 7:34 AM, Bruce Schuchardt wrote:
> > I think we could solve this by either moving the ID formatting code to
> > the DistributionManager implementations & having
> > LonerDistributionManager omit the port number or modify the
> > client/server handshake to not install a port number when connecting
> > to a remote GatewayReceiver.  I guess the latter wouldn't work if the
> > bean is in a real client cache.
> >
> > The installation of a port number was implemented to prevent duplicate
> > membership IDs in client caches when clients are being spun up in
> > parallel on the same machine.  However there is also a "unique ID" in
> > LonerDistributionManager that was supposed to address this problem but
> > apparently didn't.
> >
> > On 1/14/19 4:45 PM, Kirk Lund wrote:
> >> So I was stepping through some WAN tests in IJ debugger (on develop
> >> with no
> >> changes) and discovered that any MXBeans that are created before
> >> starting a
> >> server port (either using CacheServer or GatewayReceiver) are broken and
> >> fail to be updated after that -- the ObjectNames include the
> >> DistributedMember.getName(). Turns out some JMX code is eating an NPE
> >> that's caused because the LonerDistributionManager changes its
> >> membership
> >> port when an acceptor endpoint is started up.
> >>
> >> Below is the method in LonerDistributionManager (with some other
> >> issues as
> >> well) that does this updating. We either need to make a lot of
> >> changes to
> >> the JMX code to fix this or we need to make one small change to
> >> LonerDistributionManager (ie, to delete this method). Question: do we
> >> really need the DistributedMember of a Loner to change its getName()
> >> which
> >> includes the membership port that changed?
> >>
> >>/**
> >> * update the loner port with an integer that may be more unique
> >> than the
> >> default port (zero).
> >> * This updates the ID in place and establishes new default
> >> settings for
> >> the manufacture of new
> >> * IDs.
> >> *
> >> * @param newPort the new port to use
> >> */
> >>public void updateLonerPort(int newPort) {
> >>  this.logger.config(
> >>  String.format("Updating membership port.  Port changed from
> >> %s to
> >> %s.  ID is now %s",
> >>  new Object[] {this.lonerPort, newPort, getId()}));
> >>  this.lonerPort = newPort;
> >>  *this.getId().setPort(this.lonerPort);*
> >>}
> >>
>


Loner changes its membership port when starting an acceptor

2019-01-14 Thread Kirk Lund
So I was stepping through some WAN tests in IJ debugger (on develop with no
changes) and discovered that any MXBeans that are created before starting a
server port (either using CacheServer or GatewayReceiver) are broken and
fail to be updated after that -- the ObjectNames include the
DistributedMember.getName(). Turns out some JMX code is eating an NPE
that's caused because the LonerDistributionManager changes its membership
port when an acceptor endpoint is started up.

Below is the method in LonerDistributionManager (with some other issues as
well) that does this updating. We either need to make a lot of changes to
the JMX code to fix this or we need to make one small change to
LonerDistributionManager (ie, to delete this method). Question: do we
really need the DistributedMember of a Loner to change its getName() which
includes the membership port that changed?

  /**
   * Update the loner port with an integer that may be more unique than the
   * default port (zero). This updates the ID in place and establishes new
   * default settings for the manufacture of new IDs.
   *
   * @param newPort the new port to use
   */
  public void updateLonerPort(int newPort) {
    this.logger.config(
        String.format("Updating membership port.  Port changed from %s to %s.  ID is now %s",
            new Object[] {this.lonerPort, newPort, getId()}));
    this.lonerPort = newPort;
    this.getId().setPort(this.lonerPort);  // <-- the in-place mutation in question
  }
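[Editor's note] The knock-on effect on JMX can be illustrated with a small,
self-contained sketch. The class and attribute names below are hypothetical
stand-ins, not Geode's actual types: an ObjectName minted from a member ID
that embeds the port no longer matches a name regenerated after the port is
mutated in place, so updates to the registered MXBean silently miss.

```java
import javax.management.ObjectName;

public class StaleObjectNameDemo {
  // Hypothetical stand-in for a member ID whose port can change in place.
  static class MemberId {
    private int port;
    MemberId(int port) { this.port = port; }
    void setPort(int port) { this.port = port; }
    // The name embeds the port, mirroring the getName() behavior described above.
    String getName() { return "member-host-" + port; }
  }

  // Build a JMX ObjectName from the member's current name.
  static ObjectName nameFor(MemberId id) throws Exception {
    return new ObjectName("GemFire:type=Member,member=" + id.getName());
  }

  public static void main(String[] args) throws Exception {
    MemberId id = new MemberId(0);          // loner starts with default port 0
    ObjectName registered = nameFor(id);    // MXBean registered under this name

    id.setPort(40404);                      // acceptor starts; port mutated in place
    ObjectName regenerated = nameFor(id);   // later lookups regenerate the name

    // The regenerated name no longer equals the registered one,
    // so lookups and updates against the registered bean fail.
    System.out.println(registered.equals(regenerated)); // false
  }
}
```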


StatSampler no longer logs fatal message for JDK-8207200

2019-01-08 Thread Kirk Lund
I just committed a change to develop which catches the
IllegalArgumentException caused by JDK-8207200 when the StatSampler samples
JVM memory usage using MemoryMXBean.

commit 1143af997292df0f6d480084ffd2103fa2c17bbf (origin/develop, origin/HEAD)
Author: Kirk Lund 
Date:   Mon Jan 7 16:43:34 2019 -0800

GEODE-6253: Handle JDK-8207200 gracefully in VM stats

The above change prevents the following, so please don't add any more
IgnoredExceptions to dunit tests for this:

[fatal 2019/01/04 00:15:19.050 UTC  tid=67] committed = 538968064 should be < max = 536870912
java.lang.IllegalArgumentException: committed = 538968064 should be < max = 536870912
        at java.management/java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
        at java.management/sun.management.MemoryImpl.getMemoryUsage0(Native Method)
        at java.management/sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
        at org.apache.geode.internal.stats50.VMStats50.refresh(VMStats50.java:624)
        at org.apache.geode.internal.statistics.HostStatSampler.sampleSpecialStats(HostStatSampler.java:562)
        at org.apache.geode.internal.statistics.HostStatSampler.run(HostStatSampler.java:235)
        at java.base/java.lang.Thread.run(Thread.java:834)
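[Editor's note] A minimal sketch of the defensive pattern described above --
not the actual GEODE-6253 patch, just an assumed illustration: catch the
IllegalArgumentException thrown by the MemoryUsage constructor and fall back
to the last good sample.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class SafeHeapSampler {
  // Last sample that the JVM reported without error; starts as an empty usage.
  private MemoryUsage lastGood = new MemoryUsage(0, 0, 0, -1);

  /**
   * Returns the current heap usage, or the last good sample if the JVM
   * momentarily reports committed > max (JDK-8207200), which makes the
   * MemoryUsage constructor throw IllegalArgumentException.
   */
  public MemoryUsage sampleHeap() {
    try {
      lastGood = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    } catch (IllegalArgumentException e) {
      // committed > max race hit; keep the previous value instead of failing the sampler.
    }
    return lastGood;
  }
}
```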

Thanks,
Kirk


Re: [DISCUSS] Disable merge for failing pull requests

2018-12-26 Thread Kirk Lund
I'm changing my vote to -1 for disallowing merge until precheck passes.

The reason is that it's rare for me to see a 100% clean precheckin for any
of my PRs. There seems to always be some failure unrelated to my PR. For
example PR #3042 just hit GEODE-6008. If we make the change to disable the
merge button, then my PR could potentially be blocked indefinitely.

After we get it more consistently GREEN, I would be willing to change my
vote to +1.

On Fri, Dec 21, 2018 at 10:36 AM Kirk Lund  wrote:

> I was responding to Udo's comment:
>
> "Could one not configure the button that the owner of the PR cannot merge
> the PR?"
>
> I'm +1 for disallowing merge until precheck passes.
> I'm -1 for disallowing the owner of the PR to merge it.
>
> On Fri, Dec 21, 2018 at 9:28 AM Helena Bales  wrote:
>
>> Kirk, this change would not require you to get someone to merge it. It
>> would just require that your PR pass CI before it can be merged.
>>
>> On Thu, Dec 20, 2018 at 2:38 PM Kirk Lund  wrote:
>>
>> > I have enough trouble just getting other developers to review my PR. I
>> > don't want to have to struggle to find someone to merge it for me, too.
>> >
>> > On Mon, Nov 19, 2018 at 4:09 PM Udo Kohlmeyer  wrote:
>> >
>> > > I don't believe "name and shame" is a hammer we should wield, but if we
>> > > have to use it... use it wisely
>> > >
>> > > Could one not configure the button that the owner of the PR cannot
>> merge
>> > > the PR?
>> > >
>> > > --Udo
>> > >
>> > >
>> > > On 11/19/18 16:03, Dan Smith wrote:
>> > > > Closing the loop on this thread - I don't feel like there was enough
>> > > > agreement to go forwards with disabling the merge button, so I'm
>> going
>> > to
>> > > > drop this for now.
>> > > >
>> > > > I would like to see everyone make sure that they only merge green
>> PRs.
>> > > > Maybe we can go with a name and shame approach? As in, we shouldn't
>> see
>> > > any
>> > > > new PRs show up in this query:
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/geode/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Amerged+status%3Afailure
>> > > >
>> > > > -Dan
>> > > >
>> > > > On Tue, Nov 13, 2018 at 10:19 AM Ryan McMahon 
>> > > wrote:
>> > > >
>> > > >> +1 I like this idea, but I recognize that it will be a challenge
>> when
>> > > there
>> > > >> is still some flakiness to the pipeline.  I think we'd need clear
>> > > >> guidelines on what to do if your PR fails due to something
>> seemingly
>> > > >> unrelated.  For instance, we ran into GEODE-5943 (flaky
>> > > EvictionDUnitTest)
>> > > >> in our last PR, and saw that there was already an open ticket for
>> the
>> > > >> flakiness (we even reverted our change and reproduced to be
>> sure).  So
>> > > we
>> > > >> triggered another PR pipeline and it passed the second time.  Is
>> > > rerunning
>> > > >> the pipeline again sufficient in this case?  Or should we have
>> stopped
>> > > what
>> > > >> we were doing and took up GEODE-5943, assuming it wasn't assigned
>> to
>> > > >> someone?  If it was already assigned to someone, do we wait until
>> that
>> > > bug
>> > > >> is fixed before we run through the PR pipeline again?
>> > > >>
>> > > >> I think if anything this will help us treat bugs that are causing a
>> > red
>> > > >> pipeline as critical, and I think that is a good thing.  So I'm
>> still
>> > +1
>> > > >> despite the flakiness.  Just curious what people think about how we
>> > > should
>> > > >> handle cases where there is a known failure and it is indeed
>> unrelated
>> > > to
>> > > >> our PR.
>> > > >>
>> > > >> Ryan
>> > > >>
>> > > >>
>> > > >> On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao 
>> > wrote:
>> > > >>
>> > > >>> Just to clarify, that flaky EvictionDUnitTest is old flaky. The
>> PR to
>> > > >>> refactor the test passed all checks, even the repeatTest as we

PR review requests

2018-12-26 Thread Kirk Lund
If anyone is free to review a couple PRs, please take a look at mine:

GEODE-3205: Cleanup and reenable DiskSpaceLimitIntegrationTest
https://github.com/apache/geode/pull/3035
- basically, the FILE_SIZE_LIMIT_BYTES was too small, resulting in rolling
too many
- added lots more info in assertion failures and other changes to cleanup
the test

GEODE-6143: Import methods from Mockito instead of PowerMockito
https://github.com/apache/geode/pull/2978
- imported when from Mockito instead of PowerMockito
- removed unnecessary uses of final keyword and changed test to more
closely follow junit coding conventions

Thanks,
Kirk


Re: Re: [PROPOSAL] use default value for validate-serializable-objects in dunit

2018-12-21 Thread Kirk Lund
Thanks for the input Bruce. I'm going to close the PR for now and do some
more thinking about how we test it.

On Fri, Dec 21, 2018 at 3:26 PM Bruce Schuchardt 
wrote:

> I agree with running tests with the default settings but I do not agree
> with this change.
>
> I think we need to enable this serialization validation by default.
> Otherwise servers and clients are exposed to serialization exploits.  We
> did not enable validation by default when the serialization filter was
> implemented, but I think that was a mistake.  Who wants to be exposed to
> someone running arbitrary code in their servers or clients due to
> exploitable flaws in java serialization?  You should have to opt in to
> that.
>
> With validation turned off by default in our tests we have no assurance
> that someone hasn't introduced a java-serializable class in Geode that
> isn't white-listed by the filter.  If that happens a user enabling
> validation would then hit errors when trying to use Geode because the
> filter would reject instances of that class. Imagine one server sending
> a Function to another server that replied with an Enum value that wasn't
> whitelisted.  The message would be rejected and the thread issuing the
> function would hang.
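[Editor's note] The white-listing Bruce describes -- rejecting
deserialization of any class not on an allow-list -- can be sketched at the
JDK level with java.io.ObjectInputFilter (Java 9+). This is an illustrative
stand-in, not Geode's actual validate-serializable-objects implementation:

```java
import java.io.*;

public class AllowListFilterDemo {
  // Serialize an object to bytes.
  static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
      oos.writeObject(o);
    }
    return bos.toByteArray();
  }

  // Deserialize with an allow-list filter: only java.lang and java.util
  // classes are accepted; the trailing "!*" rejects everything else.
  static Object deserializeFiltered(byte[] bytes) throws IOException, ClassNotFoundException {
    try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      ois.setObjectInputFilter(
          ObjectInputFilter.Config.createFilter("java.lang.*;java.util.*;!*"));
      return ois.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // An allow-listed class round-trips fine.
    System.out.println(deserializeFiltered(serialize("hello")));

    // A class outside the allow-list is rejected before instantiation.
    try {
      deserializeFiltered(serialize(new java.math.BigDecimal("1")));
    } catch (InvalidClassException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```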
>
>
>
> On 12/21/18 2:42 PM, Kirk Lund wrote:
> > 
> >
> > I filed GEODE-6202: DUnit should not enable VALIDATE_SERIALIZABLE_OBJECTS
> > by default
> >
> https://issues.apache.org/jira/browse/GEODE-6202
> >
> > And submitted PR #3023
> > https://github.com/apache/geode/pull/3023
> >
> > Please review and/or discuss further if needed.
> >
> > Thanks,
> > Kirk
> >
> > On Thu, Mar 15, 2018 at 12:00 PM Bruce Schuchardt <
> bschucha...@pivotal.io>
> > wrote:
> >
> >> +0.5 I think we can turn this off (back to the default) now since the
> >> AnalyzeSerializables tests for other modules have been created. These
> >> tests ensure that serializable objects are properly white-listed or
> >> excluded and are able to be serialized/deserialized.
> >>
> >> Excluded classes are not tested, though, so if you add an excluded class
> >> and are wrong about whether it is sent over the wire you will break
> >> Geode. Having serialization validation enabled in dunit tests protects
> >> against this if you write tests that use the excluded class.
> >>
> >> We also have non-default values for cluster configuration in dunit runs.
> >>
> >>
> >> On 3/13/18 10:03 AM, Kirk Lund wrote:
> >>> I want to propose using the default value for
> >> validate-serializable-object
> >>> in dunit tests instead of forcing it on for all dunit tests. I'm
> >>> sympathetic to the reason why this was done: ensure that all existing
> >> code
> >>> and future code will function properly with this feature enabled.
> >>> Unfortunately running all dunit tests with it turned on is not a
> >>> good way
> >>> to achieve this.
> >>>
> >>> Here are my reasons:
> >>>
> >>> 1) All tests should start out with the same defaults that Users have.
> If
> >> we
> >>> don't do this, we are going to goof up sometime and release something
> >> that
> >>> only works with this feature turned on or worsen the User experience of
> >>> Geode in some way.
> >>>
> >>> 2) All tests should have sovereign control over their own
> configuration.
> >> We
> >>> should strive towards being able to look at a test and determine its
> >> config
> >>> at a glance without having to dig through the framework or other
> classes
> >>> for hidden configuration. We continue to improve dunit in this area
> >>> but I
> >>> think adding to the problem is going in the wrong direction.
> >>>
> >>> 3) It sets a bad precedent. Do we follow this approach once or keep
> >> adding
> >>> additional non-default features when we need to test them too? Next one
> >> is
> >>> GEODE-4769 "Serialize region entry before putting in local cache" which
> >>> will be disabled by default in the next Geode release and yet by
> turning
> >> it
> >>> on by default for all of precheckin we were able to find lots of
> >>> problems
> >

Re: [PROPOSAL] use default value for validate-serializable-objects in dunit

2018-12-21 Thread Kirk Lund


I filed GEODE-6202: DUnit should not enable VALIDATE_SERIALIZABLE_OBJECTS
by default
https://issues.apache.org/jira/browse/GEODE-6202

And submitted PR #3023
https://github.com/apache/geode/pull/3023

Please review and/or discuss further if needed.

Thanks,
Kirk

On Thu, Mar 15, 2018 at 12:00 PM Bruce Schuchardt 
wrote:

> +0.5  I think we can turn this off (back to the default) now since the
> AnalyzeSerializables tests for other modules have been created.  These
> tests ensure that serializable objects are properly white-listed or
> excluded and are able to be serialized/deserialized.
>
> Excluded classes are not tested, though, so if you add an excluded class
> and are wrong about whether it is sent over the wire you will break
> Geode.  Having serialization validation enabled in dunit tests protects
> against this if you write tests that use the excluded class.
>
> We also have non-default values for cluster configuration in dunit runs.
>
>
> On 3/13/18 10:03 AM, Kirk Lund wrote:
> > I want to propose using the default value for
> validate-serializable-object
> > in dunit tests instead of forcing it on for all dunit tests. I'm
> > sympathetic to the reason why this was done: ensure that all existing
> code
> > and future code will function properly with this feature enabled.
> > Unfortunately running all dunit tests with it turned on is not a good way
> > to achieve this.
> >
> > Here are my reasons:
> >
> > 1) All tests should start out with the same defaults that Users have. If
> we
> > don't do this, we are going to goof up sometime and release something
> that
> > only works with this feature turned on or worsen the User experience of
> > Geode in some way.
> >
> > 2) All tests should have sovereign control over their own configuration.
> We
> > should strive towards being able to look at a test and determine its
> config
> > at a glance without having to dig through the framework or other classes
> > for hidden configuration. We continue to improve dunit in this area but I
> > think adding to the problem is going in the wrong direction.
> >
> > 3) It sets a bad precedent. Do we follow this approach once or keep
> adding
> > additional non-default features when we need to test them too? Next one
> is
> > GEODE-4769 "Serialize region entry before putting in local cache" which
> > will be disabled by default in the next Geode release and yet by turning
> it
> > on by default for all of precheckin we were able to find lots of problems
> > in both the product code and test code.
> >
> > 4) This is already starting to cause confusion for developers thinking
> its
> > actually a product default or expecting it to be enabled in other
> > (non-dunit) tests.
> >
> > Alternatives for test coverage:
> >
> > There really are no reasonable shortcuts for end-to-end test coverage of
> > any feature. We need to write new tests or identify existing tests to
> > subclass with the feature enabled.
> >
> > 1) Subclass specific tests to turn validate-serializable-object on for
> that
> > test case. Examples of this include a) dunit tests that execute Region
> > tests with OffHeap enabled, b) dunit tests that flip on HTTP over GFSH,
> c)
> > dunit tests that run with SSL or additional security enabled.
> >
> > 2) Write new tests that cover all features with
> validate-serializable-object
> > enabled.
> >
> > 3) Rerun all of dunit with and without the option. This doesn't sound
> very
> > reasonable to me, but it's the closest to what we really want or need.
> >
> > Any other ideas or suggestions other than forcing all dunit tests to run
> > with a non-default value?
> >
> > Thanks,
> > Kirk
> >
>
>


Reminder: do NOT @Ignore tests that flicker

2018-12-21 Thread Kirk Lund
Just a reminder: please do not @Ignore tests that flicker.

I was pairing on some statistics changes this week and ran across a
disabled test in DiskSpaceLimitIntegrationTest:

  @Test
  @Ignore("Fails frequently in docker container.")
  public void aboveZeroDeletesPreviousFiles() throws Exception {

Looking at the git info, I can see it was committed for GEODE-5494 on July
27, 2018 (pretty recently). Part of the contract for getting rid of the
FlakyTest category was to stop marking flaky tests and fix them all.

GEODE-5494: "Flaky test DiskSpaceLimitIntegrationTest >
aboveZeroDeletesPreviousFiles"

There's a couple problems with this:

1) GEODE-5494 is a duplicate of GEODE-3205
2) The test method was disabled with @Ignore and then forgotten about (it's
still disabled in Dec)

This is exactly the sort of thing that I said we should never do if we got
rid of running FlakyTests in quarantine. Now we have a test that never runs
(in quarantine or in CI).

Hope we don't have very many more of these...


Re: [DISCUSS] Disable merge for failing pull requests

2018-12-21 Thread Kirk Lund
I was responding to Udo's comment:

"Could one not configure the button that the owner of the PR cannot merge
the PR?"

I'm +1 for disallowing merge until precheck passes.
I'm -1 for disallowing the owner of the PR to merge it.

On Fri, Dec 21, 2018 at 9:28 AM Helena Bales  wrote:

> Kirk, this change would not require you to get someone to merge it. It
> would just require that your PR pass CI before it can be merged.
>
> On Thu, Dec 20, 2018 at 2:38 PM Kirk Lund  wrote:
>
> > I have enough trouble just getting other developers to review my PR. I
> > don't want to have to struggle to find someone to merge it for me, too.
> >
> > On Mon, Nov 19, 2018 at 4:09 PM Udo Kohlmeyer  wrote:
> >
> > > I don't believe "name and shame" is a hammer we should wield, but if we
> > > have to use it... use it wisely
> > >
> > > Could one not configure the button that the owner of the PR cannot
> merge
> > > the PR?
> > >
> > > --Udo
> > >
> > >
> > > On 11/19/18 16:03, Dan Smith wrote:
> > > > Closing the loop on this thread - I don't feel like there was enough
> > > > agreement to go forwards with disabling the merge button, so I'm
> going
> > to
> > > > drop this for now.
> > > >
> > > > I would like to see everyone make sure that they only merge green
> PRs.
> > > > Maybe we can go with a name and shame approach? As in, we shouldn't
> see
> > > any
> > > > new PRs show up in this query:
> > > >
> > > >
> > >
> >
> https://github.com/apache/geode/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Amerged+status%3Afailure
> > > >
> > > > -Dan
> > > >
> > > > On Tue, Nov 13, 2018 at 10:19 AM Ryan McMahon 
> > > wrote:
> > > >
> > > >> +1 I like this idea, but I recognize that it will be a challenge
> when
> > > there
> > > >> is still some flakiness to the pipeline.  I think we'd need clear
> > > >> guidelines on what to do if your PR fails due to something seemingly
> > > >> unrelated.  For instance, we ran into GEODE-5943 (flaky
> > > EvictionDUnitTest)
> > > >> in our last PR, and saw that there was already an open ticket for
> the
> > > >> flakiness (we even reverted our change and reproduced to be sure).
> So
> > > we
> > > >> triggered another PR pipeline and it passed the second time.  Is
> > > rerunning
> > > >> the pipeline again sufficient in this case?  Or should we have
> stopped
> > > what
> > > >> we were doing and took up GEODE-5943, assuming it wasn't assigned to
> > > >> someone?  If it was already assigned to someone, do we wait until
> that
> > > bug
> > > >> is fixed before we run through the PR pipeline again?
> > > >>
> > > >> I think if anything this will help us treat bugs that are causing a
> > red
> > > >> pipeline as critical, and I think that is a good thing.  So I'm
> still
> > +1
> > > >> despite the flakiness.  Just curious what people think about how we
> > > should
> > > >> handle cases where there is a known failure and it is indeed
> unrelated
> > > to
> > > >> our PR.
> > > >>
> > > >> Ryan
> > > >>
> > > >>
> > > >> On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao 
> > wrote:
> > > >>
> > > >>> Just to clarify, that flaky EvictionDUnitTest is old flaky. The PR
> to
> > > >>> refactor the test passed all checks, even the repeatTest as well. I
> > > had a
> > > >>> closed PR that just touched the "un-refactored" EvictionDUnitTest,
> it
> > > >>> wouldn't even pass the repeatTest at all.
> > > >>>
> > > >>> On Mon, Nov 12, 2018 at 2:04 PM Dan Smith 
> wrote:
> > > >>>
> > > >>>> To be clear, I don't think we have an issue of untrustworthy
> > > committers
> > > >>>> pushing code they know is broken to develop.
> > > >>>>
> > > >>>> The problem is that it is all to easy to look at a PR with some
> > > >> failures
> > > >>>> and *assume* your code didn't cause the failures and merge it
> > anyway.
> > > I
> > > >>>> think we should all be at least rerunning the tests and not
> me

Re: Javadoc errors in org/apache/geode/cache/configuration/RegionAttributesType.java

2018-12-20 Thread Kirk Lund
Looks good. Thanks! I added my approval.

On Thu, Dec 20, 2018 at 2:20 PM Aditya Anchuri  wrote:

> PR to fix: https://github.com/apache/geode/pull/3030
>
> On Thu, Dec 20, 2018 at 1:58 PM Aditya Anchuri 
> wrote:
>
> > These classes were removed recently, but it looks like we forgot to clean
> > up the javadocs. Will make a PR to fix ASAP.
> >
> > -Aditya
> >
> > On Thu, Dec 20, 2018 at 1:57 PM Kirk Lund  wrote:
> >
> >> Just a reminder that any classes not behind internal packages need to
> have
> >> functional Javadocs.
> >>
> >> We have Javadoc errors involving
> >> org/apache/geode/cache/configuration/RegionAttributesType.java referring
> >> to
> >> RegionTimeToLive which I can't find when I grep.
> >>
> >> > Task :geode-assembly:docs
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:479:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.RegionTimeToLive
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:490:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.RegionTimeToLive
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:501:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.RegionIdleTime
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:512:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.RegionIdleTime
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:523:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.EntryTimeToLive
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:534:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.EntryTimeToLive
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:545:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.EntryIdleTime
> >>
> >>
> /Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:556:
> >> warning - Tag @link: reference not found:
> >> RegionAttributesType.EntryIdleTime
> >> 8 warnings
> >>
> >
>


Re: [DISCUSS] Disable merge for failing pull requests

2018-12-20 Thread Kirk Lund
I have enough trouble just getting other developers to review my PR. I
don't want to have to struggle to find someone to merge it for me, too.

On Mon, Nov 19, 2018 at 4:09 PM Udo Kohlmeyer  wrote:

> I don't believe "name and shame" is a hammer we should wield, but if we
> have to use it... use it wisely
>
> Could one not configure the button that the owner of the PR cannot merge
> the PR?
>
> --Udo
>
>
> On 11/19/18 16:03, Dan Smith wrote:
> > Closing the loop on this thread - I don't feel like there was enough
> > agreement to go forwards with disabling the merge button, so I'm going to
> > drop this for now.
> >
> > I would like to see everyone make sure that they only merge green PRs.
> > Maybe we can go with a name and shame approach? As in, we shouldn't see
> any
> > new PRs show up in this query:
> >
> >
> https://github.com/apache/geode/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Amerged+status%3Afailure
> >
> > -Dan
> >
> > On Tue, Nov 13, 2018 at 10:19 AM Ryan McMahon 
> wrote:
> >
> >> +1 I like this idea, but I recognize that it will be a challenge when
> there
> >> is still some flakiness to the pipeline.  I think we'd need clear
> >> guidelines on what to do if your PR fails due to something seemingly
> >> unrelated.  For instance, we ran into GEODE-5943 (flaky
> EvictionDUnitTest)
> >> in our last PR, and saw that there was already an open ticket for the
> >> flakiness (we even reverted our change and reproduced to be sure).  So
> we
> >> triggered another PR pipeline and it passed the second time.  Is
> rerunning
> >> the pipeline again sufficient in this case?  Or should we have stopped
> what
> >> we were doing and took up GEODE-5943, assuming it wasn't assigned to
> >> someone?  If it was already assigned to someone, do we wait until that
> bug
> >> is fixed before we run through the PR pipeline again?
> >>
> >> I think if anything this will help us treat bugs that are causing a red
> >> pipeline as critical, and I think that is a good thing.  So I'm still +1
> >> despite the flakiness.  Just curious what people think about how we
> should
> >> handle cases where there is a known failure and it is indeed unrelated
> to
> >> our PR.
> >>
> >> Ryan
> >>
> >>
> >> On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao  wrote:
> >>
> >>> Just to clarify, that flaky EvictionDUnitTest is old flaky. The PR to
> >>> refactor the test passed all checks, even the repeatTest as well. I
> had a
> >>> closed PR that just touched the "un-refactored" EvictionDUnitTest, it
> >>> wouldn't even pass the repeatTest at all.
> >>>
> >>> On Mon, Nov 12, 2018 at 2:04 PM Dan Smith  wrote:
> >>>
>  To be clear, I don't think we have an issue of untrustworthy
> committers
>  pushing code they know is broken to develop.
> 
>  The problem is that it is all to easy to look at a PR with some
> >> failures
>  and *assume* your code didn't cause the failures and merge it anyway.
> I
>  think we should all be at least rerunning the tests and not merging
> the
> >>> PR
>  until everything passes. If the merge button is greyed out, that might
> >>> help
>  communicate that standard to everyone.
> 
>  Looking at the OpenJDK 8 metrics, it looks to me like most of the
> >> issues
>  are recently introduced (builds 81-84 and the EvictionDUnitTest), not
> >> old
>  flaky tests. So I think we were a little more disciplined always
> >>> listening
>  to what the checks are telling us we would have less noise in the long
> >>> run.
> 
> 
> >>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-metrics/jobs/GeodeDistributedTestOpenJDK8Metrics/builds/23
>  -Dan
> 
>  On Mon, Nov 12, 2018 at 11:21 AM Udo Kohlmeyer 
> wrote:
> 
> > 0
> >
> > Patrick does make a point. The committers on the project have been
> >>> voted
> > in as "responsible, trustworthy and best of breed" and rights and
> > privileges according to those beliefs have been bestowed.
> >
> > We should live those words and trust our committers. In the end.. If
> > there is a "rotten" apple in the mix this should be addressed, maybe
> >>> not
> > as public flogging, but with stern communications.
> >
> > On the other side, I've also seen the model where the submitter of PR
> >>> is
> > not allowed to merge + commit their own PR's. That befalls to another
> > committer to complete this task, avoiding the whole, "I'll just
> >> quickly
> > commit without due diligence".
> >
> > --Udo
> >
> >
> > On 11/12/18 10:23, Patrick Rhomberg wrote:
> >> -1
> >>
> >> I really don't think this needs to be codified.  If people are
> >>> merging
> > red
> >> PRs, that is a failing as a responsible developer.  Defensive
>  programming
> >> is all well and good, but this seems like it's a bit beyond the
> >> pale
> >>> in
> >> that regard.  I foresee it making the correction of a red main
> >>> pipeline
> 

Javadoc errors in org/apache/geode/cache/configuration/RegionAttributesType.java

2018-12-20 Thread Kirk Lund
Just a reminder that any classes not behind internal packages need to have
functional Javadocs.

We have Javadoc errors involving
org/apache/geode/cache/configuration/RegionAttributesType.java referring to
RegionTimeToLive which I can't find when I grep.

> Task :geode-assembly:docs
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:479:
warning - Tag @link: reference not found:
RegionAttributesType.RegionTimeToLive
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:490:
warning - Tag @link: reference not found:
RegionAttributesType.RegionTimeToLive
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:501:
warning - Tag @link: reference not found:
RegionAttributesType.RegionIdleTime
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:512:
warning - Tag @link: reference not found:
RegionAttributesType.RegionIdleTime
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:523:
warning - Tag @link: reference not found:
RegionAttributesType.EntryTimeToLive
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:534:
warning - Tag @link: reference not found:
RegionAttributesType.EntryTimeToLive
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:545:
warning - Tag @link: reference not found:
RegionAttributesType.EntryIdleTime
/Users/klund/dev/gemfire1/open/geode-core/src/main/java/org/apache/geode/cache/configuration/RegionAttributesType.java:556:
warning - Tag @link: reference not found:
RegionAttributesType.EntryIdleTime
8 warnings
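Those warnings mean the {@link} targets do not resolve to any class the javadoc tool can see. A minimal sketch of a reference that does resolve (all class and member names here are hypothetical, not Geode's):

```java
/** Illustration only; all names here are hypothetical. */
class AttributesExample {

  /**
   * A Javadoc link resolves only against types the tool can find: this one
   * works because {@link AttributesExample.ExpirationSpec} is a real nested
   * class. Linking to a nested class that does not exist under that name
   * yields the "reference not found" warnings shown above.
   */
  ExpirationSpec entryTimeToLive = new ExpirationSpec();

  /** The nested class the link above points at. */
  static class ExpirationSpec {
  }

  public static void main(String[] args) {
    // prints ExpirationSpec
    System.out.println(new AttributesExample().entryTimeToLive.getClass().getSimpleName());
  }
}
```

Running `javadoc` against this file produces no warnings; renaming the nested class without updating the link reproduces the "reference not found" warning.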


'io.spring.dependency-management' not found.

2018-12-19 Thread Kirk Lund
My other IJ project is now broken after letting IJ update. Anyone know
anything about geode-dependency-management.gradle or why it would cause:

Caused by: org.gradle.api.plugins.UnknownPluginException: Plugin with id
'io.spring.dependency-management' not found.

I tried refreshing from Gradle (even though I enabled auto-import) and
rebuilding, but the problem persists. I know I complain a lot about Geode's
Gradle but the reason I do so is because it breaks nearly every day and you
have no idea how frustrating that is when you're trying to actually advance
Geode's code base instead of spending endless hours fighting with Gradle
and IJ. I know I'm not the only one, because I helped another developer
hitting the same crazy problems yesterday and it literally took about 6
hours for him.

Full stack:

Caused by: org.gradle.api.plugins.UnknownPluginException: Plugin with id
'io.spring.dependency-management' not found.
at
org.gradle.api.internal.plugins.DefaultPluginManager.apply(DefaultPluginManager.java:128)
at
org.gradle.api.internal.plugins.DefaultObjectConfigurationAction.applyType(DefaultObjectConfigurationAction.java:120)
at
org.gradle.api.internal.plugins.DefaultObjectConfigurationAction.access$200(DefaultObjectConfigurationAction.java:38)
at
org.gradle.api.internal.plugins.DefaultObjectConfigurationAction$3.run(DefaultObjectConfigurationAction.java:86)
at
org.gradle.api.internal.plugins.DefaultObjectConfigurationAction.execute(DefaultObjectConfigurationAction.java:143)
at org.gradle.groovy.scripts.DefaultScript.apply(DefaultScript.java:133)
at org.gradle.api.Script$apply.callCurrent(Unknown Source)
at
geode_dependency_management_4fsb42s5gr195cwy27iq0u2n9.run(/Users/klund/dev/gemfire1/open/gradle/geode-dependency-management.gradle:21)
at
org.gradle.groovy.scripts.internal.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:90)
... 181 more


Re: Two copies of ExtendsFunctionAdapter.java

2018-12-17 Thread Kirk Lund
You're right. IntelliJ shouldn't be compiling these classes. However, the
real issue here is that our use of Gradle is so horribly unnatural that it
ends up confusing IntelliJ (and other tools). We should strive to NOT
confuse other tools by doing stupid stuff in Gradle.

You've seen my instructions for configuring IJ and setting up a project (or
I assume you have). I'm still following that for every Geode project I
set up. Sometimes, one of these projects "misbehaves" and gets so confused
by Geode's crazy Gradle build that it goes belly-up. The result is we are
wasting the time of lots of developers because of this kind of nonsense.

So, I ask again: do we really need two copies of the same file?

On Mon, Dec 17, 2018 at 11:49 AM Galen O'Sullivan 
wrote:

> I don't understand our build or IntelliJ that well, but it seems weird to
> me that these classes would be getting built at all since they're in the
> resources section rather than java. I don't see compiled versions of these
> classes in my geode directory. Perhaps it's an IntelliJ configuration
> issue?
>
> Galen
>
>
> On Mon, Dec 17, 2018 at 11:23 AM Kirk Lund  wrote:
>
> > IntelliJ just started failing to compile because we have two copies of
> > ExtendsFunctionAdapter.java. Apparently, IJ was happy enough to ignore
> > these duplicates for a month or so, but it's now fed up and will no
> longer
> > tolerate the duplication so it's failing with:
> >
> > Error:(21, 8) java: duplicate class:
> > org.apache.geode.management.internal.deployment.ExtendsFunctionAdapter
> >
> > This file is in geode-core/src/distributedTest/resources and
> > geode-core/src/integrationTest/resources:
> >
> > 1)
> >
> >
> ./geode-core/src/distributedTest/resources/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java
> > 2)
> >
> >
> ./geode-core/src/integrationTest/resources/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java
> >
> > Apparently we have two tests that load these java files as resources:
> >
> > 1)
> >
> >
> geode-core/src/distributedTest/java/org/apache/geode/management/internal/cli/commands/DeployCommandFunctionRegistrationDUnitTest.java:83:
> >
> >
> >
> "/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java");
> > 2.a)
> >
> >
> geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:62:
> >   File sourceFileOne = loadTestResource("ExtendsFunctionAdapter.java");
> > 2.b)
> >
> >
> geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:73:
> >   File sourceFileOne =
> > loadTestResource("AbstractExtendsFunctionAdapter.java");
> > 2.c)
> >
> >
> geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:74:
> >   File sourceFileTwo =
> > loadTestResource("ConcreteExtendsAbstractExtendsFunctionAdapter.java");
> >
> > Do we really need to have two copies of this file in our codebase?
> >
> > PS, here's the last commit to touch these two files:
> >
> > commit 65c79841b65d7bd9ffa3c50fa73d4d3857dced58
> >
> > Author: Jacob Barrett 
> >
> > Date:   Fri Aug 10 15:49:22 2018 -0700
> >
> >
> >  GEODE-5530: Removes test dependency from other test source sets
> > (#2294)
> >
> >
> >
> > Moves common sources to geode-dunit or geode-junit.
> >
> >
> >
> > Co-authored-by: Finn Sutherland 
> >
> > Thanks,
> > Kirk
> >
>


[DISCUSS] Replace @TestingOnly with @VisibleForTesting

2018-12-17 Thread Kirk Lund
We have a custom annotation in geode-common called @TestingOnly:

geode-common/src/main/java/org/apache/geode/annotations/TestingOnly.java

This annotation was created while pairing with Michael Feathers and the
intention was to annotate non-private constructors or methods that have a
widened visibility scope to facilitate testing.

Some developers, however, have interpreted it as meaning that the
constructor or method cannot be used in the main src code and can only be
used from test src code.

I'd like to propose deleting @TestingOnly and switching to
@VisibleForTesting, which is defined in Guava (already a
Geode dependency). The name of the Guava annotation is less ambiguous, and
it has already been adopted by additional projects, including AssertJ,
which we use extensively.

The javadocs on VisibleForTesting explain its usage very clearly:

/**
 * Annotates a program element that exists, or is more widely visible than
 * otherwise necessary, only for use in test code.
 *
 * @author Johannes Henkel
 */
@GwtCompatible
public @interface VisibleForTesting {
}
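As a concrete sketch of the intended usage: the annotation marks a constructor whose visibility was widened solely so tests can inject a value. The class and field names are hypothetical, and a local stand-in annotation is declared so the example compiles without Guava on the classpath:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-in for com.google.common.annotations.VisibleForTesting so this
// sketch compiles without Guava on the classpath.
@Documented
@Retention(RetentionPolicy.SOURCE)
@interface VisibleForTesting {
}

class CacheStatistics {
  private final long startTime;

  CacheStatistics() {
    this(System.currentTimeMillis());
  }

  // Package-private only so tests can inject a fixed clock value;
  // production code should use the no-arg constructor.
  @VisibleForTesting
  CacheStatistics(long startTime) {
    this.startTime = startTime;
  }

  long getStartTime() {
    return startTime;
  }
}

class VisibleForTestingExample {
  public static void main(String[] args) {
    CacheStatistics stats = new CacheStatistics(42L);
    System.out.println(stats.getStartTime()); // prints 42
  }
}
```

The annotation carries no runtime behavior; it only documents intent to readers and to static-analysis tooling.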


Two copies of ExtendsFunctionAdapter.java

2018-12-17 Thread Kirk Lund
IntelliJ just started failing to compile because we have two copies of
ExtendsFunctionAdapter.java. Apparently, IJ was happy enough to ignore
these duplicates for a month or so, but it's now fed up and will no longer
tolerate the duplication so it's failing with:

Error:(21, 8) java: duplicate class:
org.apache.geode.management.internal.deployment.ExtendsFunctionAdapter

This file is in geode-core/src/distributedTest/resources and
geode-core/src/integrationTest/resources:

1)
./geode-core/src/distributedTest/resources/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java
2)
./geode-core/src/integrationTest/resources/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java

Apparently we have two tests that load these java files as resources:

1)
geode-core/src/distributedTest/java/org/apache/geode/management/internal/cli/commands/DeployCommandFunctionRegistrationDUnitTest.java:83:

"/org/apache/geode/management/internal/deployment/ExtendsFunctionAdapter.java");
2.a)
geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:62:
  File sourceFileOne = loadTestResource("ExtendsFunctionAdapter.java");
2.b)
geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:73:
  File sourceFileOne =
loadTestResource("AbstractExtendsFunctionAdapter.java");
2.c)
geode-core/src/integrationTest/java/org/apache/geode/management/internal/deployment/FunctionScannerTest.java:74:
  File sourceFileTwo =
loadTestResource("ConcreteExtendsAbstractExtendsFunctionAdapter.java");

Do we really need to have two copies of this file in our codebase?

PS, here's the last commit to touch these two files:

commit 65c79841b65d7bd9ffa3c50fa73d4d3857dced58

Author: Jacob Barrett 

Date:   Fri Aug 10 15:49:22 2018 -0700


 GEODE-5530: Removes test dependency from other test source sets (#2294)



Moves common sources to geode-dunit or geode-junit.



Co-authored-by: Finn Sutherland 

Thanks,
Kirk


Working on Geode with IntelliJ 2018.3

2018-12-13 Thread Kirk Lund
IntelliJ 2018.3.1 downloads sources jars for the jars of old geode versions
that the geode-old-versions module refers to. The Javadoc {@link class}
editor support in IJ 2018.3.1 then seems to randomly reference some of the
Geode classes in those old geode jars instead of local .java files in the
project.

My recommendation is to stick to IntelliJ 2018.2.x until someone using
2018.3.x reports back to the dev list that the above problem is fixed. Or
you can use 2018.3.1 and just realize you won't have the IDE support for
editing Javadocs that you may have been used to.

IntelliJ also has a Gradle option that I think you should leave alone if
you're working on Geode. Do *NOT* disable this:

*Preferences -> Build, Execution, Deployment -> Build Tools -> Gradle ->
"Store generated project files externally"*

Disabling the above results in classpath errors when trying to run dunit
tests in IntelliJ.

If anyone has more information on any of the above (especially how to fix
or workaround it) then please share!


Re: geode-core dependencies for maven pom

2018-12-06 Thread Kirk Lund
...warning gets logged and Geode JMX will be broken.

On Thu, Dec 6, 2018 at 10:27 AM Jacob Barrett  wrote:

> It should be in the POM as optional.
>
> > On Dec 6, 2018, at 10:17 AM, Kirk Lund  wrote:
> >
> > Is the presence of "ext.optional = true" for spring-shell in
> > geode-core/build.gradle the reason it's missing from the geode-core maven
> > pom?
> >
> >  compile('org.springframework.shell:spring-shell:' +
> > project.'spring-shell.version') {
> >exclude module: 'aopalliance'
> >exclude module: 'asm'
> >exclude module: 'cglib'
> >exclude module: 'guava'
> >exclude module: 'spring-aop'
> >exclude module: 'spring-context-support'
> >    exclude module: 'spring-core'
> >ext.optional = true
> >  }
> >
> >
> >> On Thu, Dec 6, 2018 at 10:12 AM Kirk Lund  wrote:
> >>
> >> The geode-core maven pom is missing "spring-shell" -- where can I go to
> >> add in "spring-shell" as a required dependency? That's the code or
> gradle
> >> or xml that I want to look at.
> >>
> >>> On Thu, Dec 6, 2018 at 9:57 AM Jacob Barrett 
> wrote:
> >>>
> >>> That file is only for testing that the contents are as expected. The
> POM
> >>> is generated from dependencies and transitive dependencies specified
> in the
> >>> various Gradle files.
> >>>
> >>>> On Dec 6, 2018, at 9:55 AM, Kirk Lund  wrote:
> >>>>
> >>>> Is this the only file controlling and testing the dependencies for the
> >>>> geode-core maven pom that we publish?
> >>>>
> >>>> geode-core/src/test/resources/expected-pom.xml
> >>>>
> >>>>> On Thu, Dec 6, 2018 at 9:52 AM Kirk Lund  wrote:
> >>>>>
> >>>>> Can someone please point me at the right place to review and alter
> >>>>> the dependencies for geode-core that are being published for its
> maven
> >>> pom?
> >>>>>
> >>>>> Also, are there any tests involving the dependencies of the
> geode-core
> >>>>> maven pom?
> >>>>>
> >>>
> >>
>


Re: geode-core dependencies for maven pom

2018-12-06 Thread Kirk Lund
Is the presence of "ext.optional = true" for spring-shell in
geode-core/build.gradle the reason it's missing from the geode-core maven
pom?

  compile('org.springframework.shell:spring-shell:' +
project.'spring-shell.version') {
exclude module: 'aopalliance'
exclude module: 'asm'
exclude module: 'cglib'
exclude module: 'guava'
exclude module: 'spring-aop'
exclude module: 'spring-context-support'
exclude module: 'spring-core'
ext.optional = true
  }


On Thu, Dec 6, 2018 at 10:12 AM Kirk Lund  wrote:

> The geode-core maven pom is missing "spring-shell" -- where can I go to
> add in "spring-shell" as a required dependency? That's the code or gradle
> or xml that I want to look at.
>
> On Thu, Dec 6, 2018 at 9:57 AM Jacob Barrett  wrote:
>
>> That file is only for testing that the contents are as expected. The POM
>> is generated from dependencies and transitive dependencies specified in the
>> various Gradle files.
>>
>> > On Dec 6, 2018, at 9:55 AM, Kirk Lund  wrote:
>> >
>> > Is this the only file controlling and testing the dependencies for the
>> > geode-core maven pom that we publish?
>> >
>> > geode-core/src/test/resources/expected-pom.xml
>> >
>> >> On Thu, Dec 6, 2018 at 9:52 AM Kirk Lund  wrote:
>> >>
>> >> Can someone please point me at the right place to review and alter
>> >> the dependencies for geode-core that are being published for its maven
>> pom?
>> >>
>> >> Also, are there any tests involving the dependencies of the geode-core
>> >> maven pom?
>> >>
>>
>


Re: geode-core dependencies for maven pom

2018-12-06 Thread Kirk Lund
The geode-core maven pom is missing "spring-shell" -- where can I go to add
in "spring-shell" as a required dependency? That's the code or gradle or
xml that I want to look at.

On Thu, Dec 6, 2018 at 9:57 AM Jacob Barrett  wrote:

> That file is only for testing that the contents are as expected. The POM
> is generated from dependencies and transitive dependencies specified in the
> various Gradle files.
>
> > On Dec 6, 2018, at 9:55 AM, Kirk Lund  wrote:
> >
> > Is this the only file controlling and testing the dependencies for the
> > geode-core maven pom that we publish?
> >
> > geode-core/src/test/resources/expected-pom.xml
> >
> >> On Thu, Dec 6, 2018 at 9:52 AM Kirk Lund  wrote:
> >>
> >> Can someone please point me at the right place to review and alter
> >> the dependencies for geode-core that are being published for its maven
> pom?
> >>
> >> Also, are there any tests involving the dependencies of the geode-core
> >> maven pom?
> >>
>


Re: geode-core dependencies for maven pom

2018-12-06 Thread Kirk Lund
Is this the only file controlling and testing the dependencies for the
geode-core maven pom that we publish?

geode-core/src/test/resources/expected-pom.xml

On Thu, Dec 6, 2018 at 9:52 AM Kirk Lund  wrote:

> Can someone please point me at the right place to review and alter
> the dependencies for geode-core that are being published for its maven pom?
>
> Also, are there any tests involving the dependencies of the geode-core
> maven pom?
>


geode-core dependencies for maven pom

2018-12-06 Thread Kirk Lund
Can someone please point me at the right place to review and alter
the dependencies for geode-core that are being published for its maven pom?

Also, are there any tests involving the dependencies of the geode-core
maven pom?


Re: PowerMock and mock ClassLoader

2018-12-04 Thread Kirk Lund
I added one comment to https://issues.apache.org/jira/browse/GEODE-6143
which lists every test file using PowerMockRunner. These will be the ones
that take some work to remove PowerMock.

I also added another comment which lists every test file importing
something from PowerMockito but not actually using PowerMockRunner. I think
these are all accidentally importing from PowerMockito instead of Mockito.
These should be very quick and easy to fix.

I didn't want to file sub-tasks for each test, so feel free to add a
comment saying that you're working on a test; or, if you really want a
sub-task, file one and assign it to yourself. I recommend
tackling one PowerMockRunner test at a time and submitting a PR per test.

I'll start things off by fixing FastLoggerWithDefaultConfigIntegrationTest
since I wrote the test and accidentally imported a method from PowerMockito
instead of Mockito.

On Tue, Dec 4, 2018 at 3:50 PM Helena Bales  wrote:

> Once we have refactored tests currently using PowerMock, it might be
> prudent to introduce a spotless rule to prohibit the reintroduction of
> PowerMock.
>
> On Tue, Dec 4, 2018 at 3:32 PM Ryan McMahon 
> wrote:
>
> > I am interested in contributing to this effort.  I removed PowerMock
> usage
> > from one set of tests (GEODE-6052), and at that time I took a quick
> glance
> > at other usages.  I’ll assign GEODE-6143 to myself and see about removing
> > the remaining usages.
> >
> > Ryan
> >
> > On Tue, Dec 4, 2018 at 3:08 PM Kirk Lund  wrote:
> >
> > > I filed GEODE-6143: PowerMock should not be used in Geode tests.
> > >
> > > We need everyone to stop using PowerMock in new tests. If anyone sees a
> > PR
> > > attempting to use PowerMock please let the contributor know about
> > > GEODE-6143. The alternative is to refactor product code such that
> > > dependencies are passed into the constructor instead of reaching out to
> > > singletons and to avoid using static methods or static final fields for
> > > anything that would make testing more difficult.
> > >
> > > On Tue, Dec 4, 2018 at 11:20 AM Dan Smith  wrote:
> > >
> > > > +1 to removing PowerMock. Any situation that needs PowerMock needs
> > > > refactoring more.
> > > >
> > > > -Dan
> > > >
> > > > On Tue, Dec 4, 2018 at 10:27 AM Kirk Lund  wrote:
> > > >
> > > > > Anyone have any ideas which unit test is using PowerMock and is
> > > > injecting a
> > > > > mock ClassLoader? This keeps failing in my precheckin runs. I think
> > we
> > > > need
> > > > > to a) remove all uses of PowerMock and b) forbid its use going
> > forward.
> > > > >
> > > > > 2018-12-04 18:11:36,258 Distributed system shutdown hook ERROR
> Could
> > > not
> > > > > reconfigure JMX java.lang.LinkageError: loader constraint
> violation:
> > > > loader
> > > > > (instance of org/powermock/core/classloader/MockClassLoader)
> > previously
> > > > > initiated loading for a different type with name
> > > > > "javax/management/MBeanServer"
> > > > > at java.lang.ClassLoader.defineClass1(Native Method)
> > > > > at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
> > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:337)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.logging.log4j.core.jmx.Server.unregi

Re: Using gradlew --tests runs test class twice

2018-12-04 Thread Kirk Lund
Oops, you're right.

On Tue, Dec 4, 2018 at 11:03 AM Jacob Barrett  wrote:

> I don’t see how your output indicates the test is running twice. Is there
> more output missing from the clip? Try adding '--info' to your Gradle command
> to validate.
>
>
> > On Dec 4, 2018, at 10:53 AM, Kirk Lund  wrote:
> >
> > So, based on the output that I see from gradle when I execute this:
> >
> > $ ./gradlew geode-core:integrationTest --tests
> AnalyzeSerializablesJUnitTest
> >
> > ...gradle apparently runs the test class twice. I say this because this
> is
> > my output:
> >
> > 94% EXECUTING [1m 21s]
> >
> > geode-core:integrationTest > 3 tests completed
> >
> > geode-core:integrationTest > Executing test
> > org.apache.geode.codeAnalysis.AnalyzeSerializablesJUnitTest
> >
> >
> > After running it once and reporting that it completed, it then executes
> the
> > entire test class a 2nd time.
> >
> >
> > Anyone know enough gradle to fix our files to prevent this?
>
>


Using gradlew --tests runs test class twice

2018-12-04 Thread Kirk Lund
So, based on the output that I see from gradle when I execute this:

$ ./gradlew geode-core:integrationTest --tests AnalyzeSerializablesJUnitTest

...gradle apparently runs the test class twice. I say this because this is
my output:

94% EXECUTING [1m 21s]

geode-core:integrationTest > 3 tests completed

geode-core:integrationTest > Executing test
org.apache.geode.codeAnalysis.AnalyzeSerializablesJUnitTest


After running it once and reporting that it completed, it then executes the
entire test class a 2nd time.


Anyone know enough gradle to fix our files to prevent this?


PowerMock and mock ClassLoader

2018-12-04 Thread Kirk Lund
Anyone have any ideas which unit test is using PowerMock and is injecting a
mock ClassLoader? This keeps failing in my precheckin runs. I think we need
to a) remove all uses of PowerMock and b) forbid its use going forward.

2018-12-04 18:11:36,258 Distributed system shutdown hook ERROR Could not
reconfigure JMX java.lang.LinkageError: loader constraint violation: loader
(instance of org/powermock/core/classloader/MockClassLoader) previously
initiated loading for a different type with name
"javax/management/MBeanServer"
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at
org.powermock.core.classloader.MockClassLoader.loadUnmockedClass(MockClassLoader.java:262)
at
org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:206)
at
org.powermock.core.classloader.DeferSupportingClassLoader.loadClass1(DeferSupportingClassLoader.java:89)
at
org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:79)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at
org.apache.logging.log4j.core.jmx.Server.unregisterAllMatching(Server.java:337)
at
org.apache.logging.log4j.core.jmx.Server.unregisterLoggerContext(Server.java:261)
at
org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:165)
at
org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:141)
at
org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:558)
at
org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:619)
at
org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:636)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231)
at
org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:243)
at
org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:174)
at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:661)
at
org.apache.geode.internal.logging.LogService.getLogger(LogService.java:64)
at
org.apache.geode.internal.tcp.ConnectionTable.(ConnectionTable.java:61)
at
org.apache.geode.distributed.DistributedSystem.setThreadsSocketPolicy(DistributedSystem.java:263)
at
org.apache.geode.distributed.internal.InternalDistributedSystem.lambda$static$0(InternalDistributedSystem.java:2317)
at java.lang.Thread.run(Thread.java:748)
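The alternative suggested above (pass dependencies into the constructor instead of reaching out to singletons or statics) removes the need for PowerMock entirely, because a plain fake can stand in for the collaborator. All names in this sketch are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator interface; in real code this might wrap a
// singleton-backed service such as an MBeanServer.
interface MetricsRegistry {
  void register(String name);
}

// Instead of calling a static singleton (which pushes tests toward
// PowerMock), the dependency is passed in through the constructor.
class RegionMonitor {
  private final MetricsRegistry registry;

  RegionMonitor(MetricsRegistry registry) {
    this.registry = registry;
  }

  void monitor(String regionName) {
    registry.register("region." + regionName);
  }
}

class ConstructorInjectionExample {
  public static void main(String[] args) {
    // A hand-rolled fake is enough; no bytecode tricks or mock ClassLoader.
    List<String> recorded = new ArrayList<>();
    RegionMonitor monitor = new RegionMonitor(recorded::add);
    monitor.monitor("customers");
    System.out.println(recorded); // prints [region.customers]
  }
}
```

With this shape, Mockito (or even no mocking library at all) covers the test; PowerMock's class-loader manipulation, and the LinkageError above, never come into play.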


Re: Third Iteration of Java Modules Support

2018-12-03 Thread Kirk Lund
Moving internal.logging to geode-logging would involve moving some
internal.logging classes to org.apache.geode.logging. I think that would
help with the logging portion of what you're doing.

On Mon, Dec 3, 2018 at 4:30 PM Kirk Lund  wrote:

> I'm iteratively cleaning up geode's internal logging (specifically our
> dependency on log4j core) so that log4j core can be optional and a user
> could switch it out for logback or do further customization of our logging.
>
> Even without the modules work you're doing, moving our logging internals
> to a new geode-logging module would make my work easier so that I could
> hide the Geode log4j2 appenders behind a ServiceLoader -- would that make
> your work easier or more difficult?
>
> On Mon, Dec 3, 2018 at 1:31 PM Jacob Barrett  wrote:
>
>> All,
>>
>> I’ll start with yuck! I took a stab at adding module-info.java files to
>> core and cq and it isn’t pretty. So all the work done in the second
>> iteration, internal package refactoring, etc., also has to be done for real
>> modules. To achieve this we basically have to compile the sources in a Java
>> 9+ compiler using the new '--release X' argument, where primary sources get
>> '--release 8' and module-info.java gets '--release 9'. If only it was that
>> easy though, as the compiler validates the module-info.java when compiling,
>> so all the packages and classes mentioned in it must exist in the source
>> input to javac. Blah blah smoke and mirrors later it is doable, such that a
>> jar is produced with Java 8 compatible (binary and API) classes and a Java
>> 9+ module info file.
>>
>> When all is done here is the module file for the core, cq and an
>> application.
>>
>> geode-core:
>> module org.apache.geode.core {
>>   exports org.apache.geode;
>>   exports org.apache.geode.cache;
>>   exports org.apache.geode.cache.client;
>>   exports org.apache.geode.cache.query;
>>   exports org.apache.geode.pdx;
>>
>>   requires org.apache.geode.common;
>>   requires org.apache.geode.json;
>>
>>   requires java.desktop;
>>   requires java.naming;
>>   requires java.management;
>>   requires java.rmi;
>>   requires java.sql;
>>
>>   requires org.apache.logging.log4j;
>>   requires org.apache.logging.log4j.core;
>>   requires antlr;
>>
>>   opens org.apache.geode.internal.logging.log4j to
>> org.apache.logging.log4j.core;
>>   opens org.apache.geode.cache.query.internal.parse to antlr;
>>
>>   uses org.apache.geode.distributed.internal.DistributedSystemService;
>>   uses org.apache.geode.cache.query.internal.cq.spi.CqServiceFactory;
>>   uses org.apache.geode.internal.cache.CacheService;
>>
>>   // TODO Internal exports are bad
>>   exports org.apache.geode.cache.client.internal to org.apache.geode.cq;
>>   exports org.apache.geode.cache.query.internal to org.apache.geode.cq;
>>   exports org.apache.geode.cache.query.internal.cq to org.apache.geode.cq;
>>   exports org.apache.geode.cache.query.internal.cq.spi to
>> org.apache.geode.cq;
>>   exports org.apache.geode.distributed.internal to org.apache.geode.cq;
>>   exports org.apache.geode.internal to org.apache.geode.cq;
>>   exports org.apache.geode.internal.cache to org.apache.geode.cq;
>>   exports org.apache.geode.internal.cache.tier.sockets to
>> org.apache.geode.cq;
>>   exports org.apache.geode.internal.logging to org.apache.geode.cq;
>>   exports org.apache.geode.internal.statistics to org.apache.geode.cq;
>> }
>>
>> geode-cq:
>> module org.apache.geode.cq {
>>   requires org.apache.geode.core;
>>
>>   provides org.apache.geode.cache.query.internal.cq.spi.CqServiceFactory
>>   with org.apache.geode.cq.internal.cache.query.CqServiceFactoryImpl;
>>   provides org.apache.geode.distributed.internal.DistributedSystemService
>>   with org.apache.geode.cq.internal.CQDistributedSystemService;
>>
>>   requires org.apache.logging.log4j;
>> }
>>
>> application:
>> module com.example.java11example {
>>   exports com.example;
>>
>>   // for PDX auto serialization
>>   opens com.example to org.apache.geode.core;
>>
>>   requires org.apache.geode.core;
>>   requires org.apache.geode.cq;
>> }
>>
>> Thoughts?
>>
>> -Jake
>>
>>


Re: Third Iteration of Java Modules Support

2018-12-03 Thread Kirk Lund
I'm iteratively cleaning up geode's internal logging (specifically our
dependency on log4j core) so that log4j core can be optional and a user
could switch it out for logback or do further customization of our logging.

Even without the modules work you're doing, moving our logging internals to
a new geode-logging module would make my work easier so that I could hide
the Geode log4j2 appenders behind a ServiceLoader -- would that make your
work easier or more difficult?

On Mon, Dec 3, 2018 at 1:31 PM Jacob Barrett  wrote:

> All,
>
> I’ll start with yuck! I took a stab at adding module-info.java files to
> core and cq and it isn’t pretty. So all the work done in the second
> iteration, internal package refactoring, etc., also has to be done for real
> modules. To achieve this we basically have to compile the sources in a Java
> 9+ compiler using the new '--release X' argument, where primary sources get
> '--release 8' and module-info.java gets '--release 9'. If only it was that
> easy though, as the compiler validates the module-info.java when compiling,
> so all the packages and classes mentioned in it must exist in the source
> input to javac. Blah blah smoke and mirrors later it is doable, such that a
> jar is produced with Java 8 compatible (binary and API) classes and a Java
> 9+ module info file.
>
> When all is done here is the module file for the core, cq and an
> application.
>
> geode-core:
> module org.apache.geode.core {
>   exports org.apache.geode;
>   exports org.apache.geode.cache;
>   exports org.apache.geode.cache.client;
>   exports org.apache.geode.cache.query;
>   exports org.apache.geode.pdx;
>
>   requires org.apache.geode.common;
>   requires org.apache.geode.json;
>
>   requires java.desktop;
>   requires java.naming;
>   requires java.management;
>   requires java.rmi;
>   requires java.sql;
>
>   requires org.apache.logging.log4j;
>   requires org.apache.logging.log4j.core;
>   requires antlr;
>
>   opens org.apache.geode.internal.logging.log4j to
> org.apache.logging.log4j.core;
>   opens org.apache.geode.cache.query.internal.parse to antlr;
>
>   uses org.apache.geode.distributed.internal.DistributedSystemService;
>   uses org.apache.geode.cache.query.internal.cq.spi.CqServiceFactory;
>   uses org.apache.geode.internal.cache.CacheService;
>
>   // TODO Internal exports are bad
>   exports org.apache.geode.cache.client.internal to org.apache.geode.cq;
>   exports org.apache.geode.cache.query.internal to org.apache.geode.cq;
>   exports org.apache.geode.cache.query.internal.cq to org.apache.geode.cq;
>   exports org.apache.geode.cache.query.internal.cq.spi to
> org.apache.geode.cq;
>   exports org.apache.geode.distributed.internal to org.apache.geode.cq;
>   exports org.apache.geode.internal to org.apache.geode.cq;
>   exports org.apache.geode.internal.cache to org.apache.geode.cq;
>   exports org.apache.geode.internal.cache.tier.sockets to
> org.apache.geode.cq;
>   exports org.apache.geode.internal.logging to org.apache.geode.cq;
>   exports org.apache.geode.internal.statistics to org.apache.geode.cq;
> }
>
> geode-cq:
> module org.apache.geode.cq {
>   requires org.apache.geode.core;
>
>   provides org.apache.geode.cache.query.internal.cq.spi.CqServiceFactory
>   with org.apache.geode.cq.internal.cache.query.CqServiceFactoryImpl;
>   provides org.apache.geode.distributed.internal.DistributedSystemService
>   with org.apache.geode.cq.internal.CQDistributedSystemService;
>
>   requires org.apache.logging.log4j;
> }
>
> application:
> module com.example.java11example {
>   exports com.example;
>
>   // for PDX auto serialization
>   opens com.example to org.apache.geode.core;
>
>   requires org.apache.geode.core;
>   requires org.apache.geode.cq;
> }
>
> Thoughts?
>
> -Jake
>
>


SampleHandler interface and handling of statistics samples

2018-11-26 Thread Kirk Lund
SampleHandler is in the org.apache.geode.internal.statistics package.

The SampleHandler interface was originally introduced to allow multiple
handlers to receive notification of a statistics sample. Before this
interface, the StatSampler used a StatArchiveWriter directly to write to
the .gfs (Geode Statistics archive file).

The StatSampler takes a sample of all statistics (every second by default)
and then invokes every registered SampleHandler. The interface defines 4
methods:

*1) sampled* -- invoked whenever there's a sample taken -- provides a List
of ResourceInstances with latest values
*2) allocatedResourceType* -- invoked when a new ResourceType has been
defined -- this may be caused by enabling a feature, by adding a new
instance of a feature, or by the user defining a new ResourceType via the
Statistics API
*3) allocatedResourceInstance* -- invoked when a new ResourceInstance of a
ResourceType is created
*4) destroyedResourceInstance* -- invoked when a ResourceInstance is
destroyed

A ResourceType is basically a definition of a bucket of key/value stats.
Within Geode, each ResourceType is typically a class such as
DistributionStats. A ResourceInstance is an instance of that with actual
values. One ResourceType may have many ResourceInstances.

The format of a .gfs file closely follows those 4 methods. ResourceType
definition is written to the .gfs file before any sample can contain a
ResourceInstance of that type. A ResourceInstance allocated record is also
written to the .gfs file before any sample may contain any values for that
instance. A ResourceInstance destroyed record is written indicating the end
of lifecycle for a particular instance.

There are two main implementations of SampleHandler:

*A) StatArchiveHandler* -- mechanism that writes binary data to the .gfs
(Geode Statistics File)
*B) StatMonitorHandler* -- mechanism that notifies objects that are clients
of the statistics monitoring API (which is what updates the metrics
attributes on Geode mbeans)

The purpose of this design is to allow a developer to build a new
implementation of SampleHandler that writes out to another format or
defines some sort of custom consumer of Statistics samples. For example,
you could define a MicrometerHandler which adapts Geode stats into
Micrometer, or you could define an entirely new custom file format of stats.

By default, the StatSampler samples the statistics every second, and there
is only one StatSampler thread, so stat sampling is sensitive to the
performance of every registered SampleHandler. For example, you might not
want to define a handler that writes to an RDBMS over the network.

Performance problems in a SampleHandler could cause gaps in stats values,
delay updates to the Geode mbeans used by monitoring tools, and mislead
someone analyzing Geode artifacts into thinking that a stop-the-world GC
pause has occurred -- the time between stat samples is generally assumed to
be fairly consistent unless a serious GC pause occurs.
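
To make the handler lifecycle above concrete, here is a self-contained sketch of a trivial counting handler. The SampleHandler interface shape mirrors the four methods described above, but the stub ResourceType/ResourceInstance classes and the simplified signatures are stand-ins for the real types in org.apache.geode.internal.statistics, not the actual API:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SampleHandlerSketch {

  // Stubs standing in for the real Geode types; the actual classes live in
  // org.apache.geode.internal.statistics and carry much more state.
  static class ResourceType {
    final String name;
    ResourceType(String name) { this.name = name; }
  }

  static class ResourceInstance {
    final ResourceType type;
    ResourceInstance(ResourceType type) { this.type = type; }
  }

  // The four-callback shape described above (signatures simplified).
  interface SampleHandler {
    void sampled(long nanosTimeStamp, List<ResourceInstance> resourceInstances);
    void allocatedResourceType(ResourceType resourceType);
    void allocatedResourceInstance(ResourceInstance resourceInstance);
    void destroyedResourceInstance(ResourceInstance resourceInstance);
  }

  // A trivial handler that just counts. A real handler (say, a Micrometer
  // adapter) must stay cheap: it runs on the single StatSampler thread.
  static class CountingHandler implements SampleHandler {
    final AtomicLong samples = new AtomicLong();
    final AtomicLong liveInstances = new AtomicLong();

    @Override
    public void sampled(long ts, List<ResourceInstance> instances) {
      samples.incrementAndGet();
    }

    @Override
    public void allocatedResourceType(ResourceType type) {
      // a type is always announced before any instance of it appears
    }

    @Override
    public void allocatedResourceInstance(ResourceInstance instance) {
      liveInstances.incrementAndGet();
    }

    @Override
    public void destroyedResourceInstance(ResourceInstance instance) {
      liveInstances.decrementAndGet();
    }
  }

  public static void main(String[] args) {
    CountingHandler handler = new CountingHandler();
    ResourceType type = new ResourceType("DistributionStats");
    ResourceInstance instance = new ResourceInstance(type);

    // Same order the .gfs records follow: type, instance, samples, destroy.
    handler.allocatedResourceType(type);
    handler.allocatedResourceInstance(instance);
    handler.sampled(System.nanoTime(), List.of(instance));
    handler.destroyedResourceInstance(instance);

    // prints: samples=1 live=0
    System.out.println("samples=" + handler.samples.get()
        + " live=" + handler.liveInstances.get());
  }
}
```

The counting is deliberately cheap; anything slow in these callbacks stalls the sampler thread for every other registered handler.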


Re: [DISCUSS] Geode packages mustn't span Jigsaw modules

2018-11-26 Thread Kirk Lund
One problem about moving around internal classes is that Geode uses
(proprietary and Java-based) serialization on the wire instead of defining
a wire-format that isn't dependent on class names/structures/packages.

I for one would love to move to a real wire-format with a well-defined
protocol but that's probably a separate project in its own right.

Are you really convinced that we could move internal classes around without
breaking rolling upgrades, client-server backwards compatibility and
diskstore contents?

On Mon, Nov 19, 2018 at 5:21 PM Dan Smith  wrote:

> I think we can actually fix most of these issues without geode-2.0. Most of
> these are in internal packages, which means we can change the package of
> these classes without breaking users. The only concerning one is
> org.apache.geode.cache.util, which is a public package. However, the
> AutoBalancer is actually still marked @Experimental, so we could
> technically move that as well. Or maybe deprecate the old one and its
> associated jar, and create a new jar with a reasonable package. JDK11 users
> could just avoid the old jar.
>
> I have been wondering for a while if we should just fold geode-cq and
> geode-wan back into geode-core. These are not really properly separated
> out, they are very tangled with core. The above package overlap kinda shows
> that as well. But maybe going through the effort of renaming the packages
> to make them java 11 modules would help improve the code anyway!
>
> -Dan
>
>
>
>
> On Mon, Nov 19, 2018 at 5:03 PM Owen Nichols  wrote:
>
> > To package Geode as Java 11 Jigsaw module(s), a major hurdle is the
> > requirement that packages cannot be split across modules.
> >
> > If we simply map existing Geode jars to Modules, we have about 10 split
> > packages (see table below).  Any restructuring would potentially have to
> > wait until Geode 2.0.
> >
> > A workaround would be to map existing packages into a small number of new
> > modules (e.g. geode-api and geode-internal).  Existing jars would remain
> > as-is.  Users making the transition to Java 11 w/ Jigsaw already need to
> > switch from CLASSPATH to MODULEPATH  so the inconvenience of a different
> > naming scheme for Geode-as-modules might be acceptable (however, once
> > module naming and organization is established, we may be stuck with it
> for
> > a long time).
> >
> > Thoughts?
> >
> > Is it even possible to rearrange all classes in each package below into a
> > single jar?  Is doing so desirable enough to defer Java 11 support until
> a
> > yet-unplanned Geode 2.0, or perhaps to drive such a release?
> >
> > Or, is it satisfactory to fatten Geode releases to include one set of
> jars
> > for CLASSPATH use, plus a different set of jars for MODULEPATH use?
> >
> >
> > Package: Jar(s)
> >
> > org.apache.geode.cache.client.internal
> >   geode-core-1.8.0.jar, geode-cq-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.cache.client.internal.locator.wan
> >   geode-core-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.cache.query.internal.cq
> >   geode-core-1.8.0.jar, geode-cq-1.8.0.jar
> > org.apache.geode.cache.util
> >   geode-core-1.8.0.jar, geode-rebalancer-1.8.0.jar
> > org.apache.geode.internal
> >   geode-connectors-1.8.0.jar, geode-core-1.8.0.jar, geode-cq-1.8.0.jar,
> >   geode-lucene-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.internal.cache.tier.sockets.command
> >   geode-core-1.8.0.jar, geode-cq-1.8.0.jar
> > org.apache.geode.internal.cache.wan
> >   geode-core-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.internal.cache.wan.parallel
> >   geode-core-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.internal.cache.wan.serial
> >   geode-core-1.8.0.jar, geode-wan-1.8.0.jar
> > org.apache.geode.internal.protocol.protobuf.v1
> >   geode-protobuf-1.8.0.jar, geode-protobuf-messages-1.8.0.jar
>


Re: Using SystemOutRule or SystemErrRule in Geode tests

2018-11-26 Thread Kirk Lund
The following unit test will pass repeatedly both in your IDE (ex: run
until failure) and in stressNewTest:

public class SystemOutRuleTest {

  @Rule
  public SystemOutRule systemOutRule = new SystemOutRule().enableLog();

  @Test
  public void passesRepeatedly() {
    System.out.println("hello");

    assertThat(systemOutRule.getLog()).contains("hello");
  }
}

But the following test will fail after the first time it is executed
(ex: run until failure in IntelliJ):

public class SystemOutRuleTest {

  @Rule
  public SystemOutRule systemOutRule = new SystemOutRule().enableLog();

  @Test
  public void passesFirstTime() {
    LogManager.getLogger().info("hello");

    assertThat(systemOutRule.getLog()).contains("hello");
  }
}

If you need to verify that a class logs a specific message under some
circumstances, then you can extract that test from the Unit Test and move
it to an Integration Test. You do *not* have to annotate SystemOutRule
with @ClassRule, but I recommend doing so, especially if you're using other
Rules that do complicated setUp involving Geode classes:

public class CacheLoggingIntegrationTest {

  private InternalCache cache;

  @Rule
  public SystemOutRule systemOutRule = new SystemOutRule().enableLog();

  @Before
  public void setUp() {
    cache = (InternalCache) new CacheFactory().set(LOCATORS, "").create();
  }

  @After
  public void tearDown() {
    cache.close();
  }

  @Test
  public void passesFirstTime() {
    cache.getLogger().info("hello");

    assertThat(systemOutRule.getLog()).contains("hello");
  }
}

If you change the SystemOutRule in CacheLoggingIntegrationTest to a
static @ClassRule:

  @ClassRule
  public static SystemOutRule systemOutRule = new SystemOutRule().enableLog();

...then it will repeatedly pass if you run-until-failure in IntelliJ but
only because IJ ends up running it as one test class. It will still fail
when run repeatedly by stressNewTest or if a previous test class caused
Log4J2 to start up within the same JVM.

On Mon, Nov 26, 2018 at 1:46 PM Kirk Lund  wrote:

> Log4J and Logback both capture a reference to System.out and/or System.err
> and then use that reference thereafter. This means that any manipulation of
> System.out or System.err -- using System.setOut(PrintStream) or
> System.setErr(PrintStream) -- probably isn't going to play nicely with
> logging in general.
>
> Here's what that means for Geode tests...
>
> If you're writing a Unit Test, then that test will be executed in a
> long-lived JVM in which other Unit Tests are also going to be executed. The
> SystemOutRule from system-rules will work with code that uses System.out
> directly but will not work repeatedly with loggers.
>
> If you're writing an Integration Test or Distributed Test, then that test
> will be executed in a fresh JVM (forkEvery = 1 is the config), so as long
> as you can get the SystemOutRule's before-action to execute before any
> logging occurs, the test will behave. In general that means you'll want to
> annotate it with @BeforeClass.
>
> Here are a couple examples in which log4j2 captures System.out to
> illustrate why SystemOutRule can't intercept the output after invoking
> System.setOut(PrintStream):
>
> log4j-core/src/main/java/org/apache/logging/log4j/core/config/status/StatusConfiguration.java:43:
>   private static final PrintStream DEFAULT_STREAM = System.out;
>
> log4j-api/src/main/java/org/apache/logging/log4j/simple/SimpleLoggerContext.java:87:
>   ps = System.out;
>
> If SystemOutRule executes its before -- System.setOut(PrintStream) --
> after the above code, logging will be printing to a different PrintStream
> than the PrintStream that SystemOutRule inserted into System.
>


Using SystemOutRule or SystemErrRule in Geode tests

2018-11-26 Thread Kirk Lund
Log4J and Logback both capture a reference to System.out and/or System.err
and then use that reference thereafter. This means that any manipulation of
System.out or System.err -- using System.setOut(PrintStream) or
System.setErr(PrintStream) -- probably isn't going to play nicely with
logging in general.

Here's what that means for Geode tests...

If you're writing a Unit Test, then that test will be executed in a
long-lived JVM in which other Unit Tests are also going to be executed. The
SystemOutRule from system-rules will work with code that uses System.out
directly but will not work repeatedly with loggers.

If you're writing an Integration Test or Distributed Test, then that test
will be executed in a fresh JVM (forkEvery = 1 is the config), so as long
as you can get the SystemOutRule's before-action to execute before any
logging occurs, the test will behave. In general that means you'll want to
annotate it with @BeforeClass.

Here are a couple examples in which log4j2 captures System.out to
illustrate why SystemOutRule can't intercept the output after invoking
System.setOut(PrintStream):

log4j-core/src/main/java/org/apache/logging/log4j/core/config/status/StatusConfiguration.java:43:
  private static final PrintStream DEFAULT_STREAM = System.out;

log4j-api/src/main/java/org/apache/logging/log4j/simple/SimpleLoggerContext.java:87:
  ps = System.out;

If SystemOutRule executes its before -- System.setOut(PrintStream) -- after
the above code, logging will be printing to a different PrintStream than
the PrintStream that SystemOutRule inserted into System.
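
A tiny self-contained demo of this capture problem (no Log4J involved: the early-captured PrintStream plays the role of the logger's saved reference, and the swapped-in buffer plays the role of SystemOutRule):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CapturedStreamDemo {

  // Returns everything the swapped-in buffer captured.
  static String demo() {
    // Like Log4J at startup: grab a reference to System.out and keep it.
    PrintStream capturedEarly = System.out;

    // Like SystemOutRule's before(): replace System.out with a buffer.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    System.setOut(new PrintStream(buffer, true));

    capturedEarly.println("via early reference"); // bypasses the buffer
    System.out.println("via System.out");         // lands in the buffer

    System.setOut(capturedEarly); // restore, like SystemOutRule's after()
    return buffer.toString().trim();
  }

  public static void main(String[] args) {
    // Only the direct System.out write is captured; the "logger" output
    // that went through the early reference never reaches the buffer.
    System.out.println("buffer contents: " + demo());
  }
}
```

Running main prints "buffer contents: via System.out": the write through the early-captured reference went to the real console, exactly as logging does when Log4J initializes before the rule's before-action runs.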


Unit test target loads up full Geode logging

2018-11-26 Thread Kirk Lund
The unit test target has geode-core/src/main/resources in the classpath
which means that Log4J finds our log4j2.xml file and creates/registers our
custom appenders. We might want to have a unit test log4j2.xml that gets
placed earlier on the classpath but only for the unit test target. The
integrationTest and distributedTest targets still need the
geode-core/src/main/resources/log4j2.xml earliest on the classpath.

The best solution is probably to hide
geode-core/src/main/resources/log4j2.xml from the unit test target, if
that's possible.

The second-best solution might be to have a resources source set containing
a simplified log4j2.xml that is added to the classpath for the unit test
target but not for the other testing targets.

Anyone with gradle knowledge interested in helping with this?
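
For reference, a stripped-down log4j2.xml for the unit test target might look something like the following (console-only, none of the Geode custom appenders; the pattern and levels here are illustrative placeholders, not what geode-core actually ships):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="[%level] %d{HH:mm:ss.SSS} %c: %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <Root level="warn">
      <AppenderRef ref="STDOUT"/>
    </Root>
  </Loggers>
</Configuration>
```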


AcceptanceTestOpenJDK11 precheckin fails to compile

2018-11-26 Thread Kirk Lund
Just the first few compilation failures are listed below. Is the image
for AcceptanceTestOpenJDK11 precheckin broken in some way?

/home/geode/geode/geode-core/src/main/java/org/apache/geode/internal/InternalInstantiator.java:20:
error: cannot access Constructor
import java.lang.reflect.Constructor;
^
  bad class file:
/usr/lib/jvm/java-8-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/lang/reflect/Constructor.class)
unable to access file: corrupted zip file
Please remove or make sure it appears in the correct subdirectory of
the classpath.
/home/geode/geode/geode-core/src/main/java/org/apache/geode/internal/cache/LocalRegion.java:18:
error: cannot find symbol
import static
org.apache.geode.internal.offheap.annotations.OffHeapIdentifier.ENTRY_EVENT_NEW_VALUE;
^
  symbol:   static ENTRY_EVENT_NEW_VALUE
  location: class
/home/geode/geode/geode-core/src/main/java/org/apache/geode/internal/cache/LocalRegion.java:17:
error: cannot find symbol
import static org.apache.geode.internal.lang.SystemUtils.getLineSeparator;
^
  symbol:   static getLineSeparator
  location: class


Re: Avoiding Unpredictability in Our DUnit Testing Rules

2018-11-26 Thread Kirk Lund
Typo fix: If you need to use CleanupDUnitVMsRule along with other dunit
rules, then you will need to *use RuleChain*.

If you don't need to use CleanupDUnitVMsRule then you don't need to use
RuleChain.

On Mon, Nov 26, 2018 at 11:57 AM Kirk Lund  wrote:

> Actually the only problem with this is specific to bouncing of dunit VMs.
> I filed "GEODE-6033: DistributedTest rules should support VM bounce"
> earlier this month and I have a branch with preliminary changes that seem
> to be working fine.
>
> Aside from bouncing of dunit VMs, the dunit rules you listed do not care
> about ordering of invocation, so you do *not* need to use RuleChain
> (unless you're using CleanupDUnitVMsRule). There are two caveats though:
>
> 1) if you want or need to specify the number of VMs to be something other
> than the default of 4, then you'll need to specify the number of VMs for
> every one of these dunit rules (see example below)
>
> 2) these dunit rules do not currently work with bouncing of vms
>
> If you need to use CleanupDUnitVMsRule along with other dunit rules, then
> you will need to use RuleChain. If you need to use VM bouncing within a
> test, then you'll need to wait until I can merge my branch
> (I have a branch on which I've made some changes to make this work)
>
> So it's actually #2 that's causing your problem. But as I mentioned, I do
> have a branch on which it does now work with bouncing of VMs but I'm not
> quite ready to merge it.
>
> Here's an example for a test that wants to limit the number of dunit VMs
> to 2. RuleChain is not needed, but you do have to specify the # of dunit
> VMs on all 3 dunit rules:
>
>   @Rule
>   public DistributedRule distributedRule = new DistributedRule(2);
>
>   @Rule
>   public SharedCountersRule sharedCountersRule = new SharedCountersRule(2);
>
>   @Rule
>   public SharedErrorCollector errorCollector = new SharedErrorCollector(2);
>
> Please don't use ManagementTestRule at all. I'm trying to modify all tests
> that use it to use DistributedRule with Geode User APIs so that I can
> delete it.
>
> All of those other dunit rules in your list extend AbstractDistributedRule
> which is why rule ordering does not matter (unless you use
> CleanupDUnitVMsRule because it bounces VMs).
>
> Thanks,
> Kirk
>
> On Wed, Nov 21, 2018 at 10:34 AM Patrick Rhomberg 
> wrote:
>
>> tl;dr: Use JUnit RuleChain.
>> 
>>
>> Hello all!
>>
>> Several [1] of our test @Rule classes make use of the fact that our DUnit
>> VMs Host is statically accessible to affect every test VM.  For instance,
>> the SharedCountersRule initializes a counter in every VM, and the
>> CleanupDUnitVMsRule bounces VMs before and after each test.
>>
>> Problematically, JUnit rules are applied in an unpredictable / JVM-dependent
>> ordering. [2]  As a result, some flakiness we find in our tests may be the
>> result of unexpected interaction and ordering of our test rules. [3]
>>
>> Fortunately, a solution to this problem already exists.  Rule ordering can
>> be imposed by JUnit's RuleChain. [4]
>>
>> In early exploration with this rule, some tests failed due to the
>> RuleChain
>> not being serializable.  However, as it should only apply to the test VM,
>> and given that it can be composed of (unannotated) rules that remain
>> accessible and serializable, it should be a simple matter of declaring the
>> offending field transient, as it will only be necessary in the test VM.
>>
>> So, you dear reader: while you're out there making Geode the best it can
>> be, if you find yourself in a test class that uses more than one rule
>> listed in [1], or if you notice some other rule not listed below that
>> reaches out to VMs as part of its @Before or @After, please update that
>> test to use the RuleChain to apply the rules in a predictable order.
>>
>> Imagination is Change.
>> ~Patrick
>>
>> [1] A probably-incomplete list of invasive rules can be found via
>> $> git grep -il inEveryVM | grep Rule.java
>>
>> geode-core/src/distributedTest/java/org/apache/geode/management/ManagementTestRule.java
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/CacheRule.java
>>
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/ClientCacheRule.java
>>
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedDiskDirRule.java
>>
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedRule.java
>>
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedUseJacksonForJsonPathRule.java
>>
>> geode-dunit/src/main/java/org/apache/geode/test/dunit/rul

Re: Avoiding Unpredictability in Our DUnit Testing Rules

2018-11-26 Thread Kirk Lund
Actually the only problem with this is specific to bouncing of dunit VMs. I
filed "GEODE-6033: DistributedTest rules should support VM bounce" earlier
this month and I have a branch with preliminary changes that seem to be
working fine.

Aside from bouncing of dunit VMs, the dunit rules you listed do not care
about ordering of invocation, so you do *not* need to use RuleChain (unless
you're using CleanupDUnitVMsRule). There are two caveats though:

1) if you want or need to specify the number of VMs to be something other
than the default of 4, then you'll need to specify the number of VMs for
every one of these dunit rules (see example below)

2) these dunit rules do not currently work with bouncing of vms

If you need to use CleanupDUnitVMsRule along with other dunit rules, then
you will need to use RuleChain. If you need to use VM bouncing within a
test, then you'll need to wait until I can merge my branch
(I have a branch on which I've made some changes to make this work)

So it's actually #2 that's causing your problem. But as I mentioned, I do
have a branch on which it does now work with bouncing of VMs but I'm not
quite ready to merge it.

Here's an example for a test that wants to limit the number of dunit VMs to
2. RuleChain is not needed, but you do have to specify the # of dunit VMs
on all 3 dunit rules:

  @Rule
  public DistributedRule distributedRule = new DistributedRule(2);

  @Rule
  public SharedCountersRule sharedCountersRule = new SharedCountersRule(2);

  @Rule
  public SharedErrorCollector errorCollector = new SharedErrorCollector(2);

Please don't use ManagementTestRule at all. I'm trying to modify all tests
that use it to use DistributedRule with Geode User APIs so that I can
delete it.

All of those other dunit rules in your list extend AbstractDistributedRule
which is why rule ordering does not matter (unless you use
CleanupDUnitVMsRule because it bounces VMs).

Thanks,
Kirk

On Wed, Nov 21, 2018 at 10:34 AM Patrick Rhomberg 
wrote:

> tl;dr: Use JUnit RuleChain.
> 
>
> Hello all!
>
> Several [1] of our test @Rule classes make use of the fact that our DUnit
> VMs Host is statically accessible to affect every test VM.  For instance,
> the SharedCountersRule initializes a counter in every VM, and the
> CleanupDUnitVMsRule bounces VMs before and after each test.
>
> Problematically, JUnit rules are applied in an unpredictable / JVM-dependent
> ordering. [2]  As a result, some flakiness we find in our tests may be the
> result of unexpected interaction and ordering of our test rules. [3]
>
> Fortunately, a solution to this problem already exists.  Rule ordering can
> be imposed by JUnit's RuleChain. [4]
>
> In early exploration with this rule, some tests failed due to the RuleChain
> not being serializable.  However, as it should only apply to the test VM,
> and given that it can be composed of (unannotated) rules that remain
> accessible and serializable, it should be a simple matter of declaring the
> offending field transient, as it will only be necessary in the test VM.
>
> So, you dear reader: while you're out there making Geode the best it can
> be, if you find yourself in a test class that uses more than one rule
> listed in [1], or if you notice some other rule not listed below that
> reaches out to VMs as part of its @Before or @After, please update that
> test to use the RuleChain to apply the rules in a predictable order.
>
> Imagination is Change.
> ~Patrick
>
> [1] A probably-incomplete list of invasive rules can be found via
> $> git grep -il inEveryVM | grep Rule.java
>
> geode-core/src/distributedTest/java/org/apache/geode/management/ManagementTestRule.java
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/CacheRule.java
>
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/ClientCacheRule.java
>
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedDiskDirRule.java
>
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedRule.java
>
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/DistributedUseJacksonForJsonPathRule.java
>
> geode-dunit/src/main/java/org/apache/geode/test/dunit/rules/SharedCountersRule.java
>
> [2] See the documentation for rules here:
> https://junit.org/junit4/javadoc/4.12/org/junit/Rule.html ; notably,
> "However,
> if there are multiple [Rule] fields (or methods) they will be applied in an
> order that depends on your JVM's implementation of the reflection API,
> which is undefined, in general."
>
> [3] For what it's worth, this was discovered after looking into why the
> DistributedRule check for suspicious strings caused the test *after* the
> test that emitted the strings to fail.  It's only tangentially related, but
> got me looking into when and how the @After was getting applied.
>
> [4] https://junit.org/junit4/javadoc/4.12/org/junit/rules/RuleChain.html
>


Handling precheckin jobs with concourse errors

2018-11-26 Thread Kirk Lund
The DistributedTestOpenJDK11 precheckin job for PR #2878 failed with what I
think is a Concourse error: https://github.com/apache/geode/pull/2878

I believe the expected process from me is to just poke my PR to repeat
precheckin. If that's wrong, please let me know. Thanks!


What is the new process for flaky test failures in precheckin?

2018-11-26 Thread Kirk Lund
I just saw SizingFlagDUnitTest fail in a precheckin but it passes on my
branch when I run directly. I cannot find a Jira ticket for it. What's the
new process for handling these flickering tests?

See:
https://concourse.apachegeode-ci.info/builds/17745

Test failure stack:
org.apache.geode.internal.cache.SizingFlagDUnitTest >
testPRHeapLRUDeltaPutOnPrimary FAILED
org.apache.geode.test.dunit.RMIException: While invoking
org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run in VM 0 running
on Host eb7aca4f2587 with 4 VMs
at org.apache.geode.test.dunit.VM.invoke(VM.java:433)
at org.apache.geode.test.dunit.VM.invoke(VM.java:402)
at org.apache.geode.test.dunit.VM.invoke(VM.java:361)
at
org.apache.geode.internal.cache.SizingFlagDUnitTest.assertValueType(SizingFlagDUnitTest.java:793)
at
org.apache.geode.internal.cache.SizingFlagDUnitTest.doPRDeltaTestLRU(SizingFlagDUnitTest.java:312)
at
org.apache.geode.internal.cache.SizingFlagDUnitTest.testPRHeapLRUDeltaPutOnPrimary(SizingFlagDUnitTest.java:220)

Caused by:
org.apache.geode.cache.EntryNotFoundException: Entry not found for
key 0
at
org.apache.geode.internal.cache.LocalRegion.checkEntryNotFound(LocalRegion.java:2760)
at
org.apache.geode.internal.cache.LocalRegion.nonTXbasicGetValueInVM(LocalRegion.java:3448)
at
org.apache.geode.internal.cache.LocalRegionDataView.getValueInVM(LocalRegionDataView.java:105)
at
org.apache.geode.internal.cache.LocalRegion.basicGetValueInVM(LocalRegion.java:3436)
at
org.apache.geode.internal.cache.LocalRegion.getValueInVM(LocalRegion.java:3424)
at
org.apache.geode.internal.cache.PartitionedRegionDataStore.getLocalValueInVM(PartitionedRegionDataStore.java:2775)
at
org.apache.geode.internal.cache.PartitionedRegion.getValueInVM(PartitionedRegion.java:8786)
at
org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run(SizingFlagDUnitTest.java:797)

