Re: spotless fails Apache license header?

2018-08-22 Thread Udo Kohlmeyer

Hi there Patrick,

Given that you have almost become the token Spotless trainer, I was just 
wondering if we have ever updated the style guides (Eclipse + IntelliJ) 
with the latest version from Google?


https://github.com/google/styleguide

Whilst I am not across ANY of the changes or differences in the two 
"style guides", it would be interesting to see if the two "guides" have 
converged enough for us to use again.


Does it make more sense for us to state that we have based our style 
guide on the Google guides, but have diverged so far that we cannot go back?


Do we have any idea as to why spotless would be flagging this as an 
issue now? It is not like we have just recently introduced this animal 
called spotless.


@all_devs
Whilst it has helped us maintain some sanity when it came to style, I 
also feel that it has overstayed its welcome on many occasions. Caring 
about spaces in comments is just a matter of preference and does not carry 
that much weight when it comes to code. I believe that if we try to follow 
this guide, https://google.github.io/styleguide/javaguide.html, 
we could potentially have spotless be the monitor for code conformity, 
rather than the Skynet it has become.


--Udo

On 8/22/18 12:28, Patrick Rhomberg wrote:

I totally understand your frustrations, Kirk.  It is disheartening to stub
your toe on someone else's rough corners.

This email became more verbose than I intended, which is par for the course
for me, but still...  tl;dr: you can change whitespace rules in the
etc/eclipse*.xml.  Feel free to go relax those rules, if you can figure out
exactly which constants Eclipse wants set to what and where.
Personally, I don't have a horse in this race on what the whitespace rules
should be.  I just think that we should have a standard that is enforced.

-

Just to clarify some possible misconceptions:

* We don't actually use google-java-format utility.  Spotless is provided
by DiffPlug and accepts an arbitrary Eclipse format file.
* We _do_ use a format spec that was initially based off the Google Java
format style guide.  It can be found at the now-suboptimally-named
/etc/eclipse-java-google-style.xml.
* (Of course, problems in the google-java-format utility may share a root
cause with that Eclipse format xml, so that google-java-format issue may
indeed be what we were hitting.)

* The Eclipse format xml should contain just about everything that deals
with whitespacing in our style.
* Everything else that spotless cares about is contained in
/gradle/spotless.gradle
* Most of our not-whitespace-related spotless constraints came from
observing repeated violations (read: "petty annoyances") throughout our own
codebase.  Still, with few exceptions, you really shouldn't be hitting any
non-whitespace spotless issues if you are adhering to standard practices.

Our current spotless rules are as follows:

Best practices rules:
* Remove your unused imports.
* Remove your commented-out imports
* Do not use wildcard imports
* Remove empty Javadoc stubs (e.g., "@param myParam" with no description).
Better would be to write descriptive javadocs.
* Remove empty javadoc or block comments.

Whitespace rules:
* The litany of things defined in the Eclipse format file.  (For instance,
if there is a setting for spaces after punctuation in comments, it would
live here somewhere.)
* Lines should not end with whitespace before the newline.
* A file should end with a newline.

Community standards and for consistency rules:
* We have an opinion on import order.  (Historically, there was a
reorder-war between Eclipse and IntelliJ users that disagreed on where
javax. should go.  This was standardized to reduce diff noise in pull
requests.)
* We have an opinion on modifier word order.  (This appears to be more of a
Java-community standard.  Not a "best practice" per se, but helps with
readability.)
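
For reference, nearly all of the above maps onto ordinary spotless Gradle DSL
calls. A rough sketch of the shape (illustrative only, not a copy of our
/gradle/spotless.gradle; paths and the license-header file name are made up):

spotless {
  java {
    // whitespace rules come from the Eclipse format file
    eclipse().configFile "${rootDir}/etc/eclipse-java-google-style.xml"

    // "best practices" rules
    removeUnusedImports()

    // community-standard rules (this is where the javax reorder-war was settled)
    importOrder 'java', 'javax', 'org', 'com'

    // whitespace hygiene
    trimTrailingWhitespace()
    endWithNewline()

    // the Apache license header check from the thread subject
    licenseHeaderFile "${rootDir}/etc/license-header"
  }
}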


Known issue:
* There is some disagreement between the IntelliJ's style file in etc/ and
the Eclipse style file in the same directory.
* As a corollary and because Spotless uses the Eclipse style file, running
AutoFormat in IntelliJ can create formatting that is considered a violation
of spotless.

And that, everyone, is pretty much everything that I know about Spotless.

Imagination is Change.
~Patrick

On Wed, Aug 22, 2018 at 9:55 AM, Anthony Baker  wrote:


Thanks Kirk!  FWIW, I’m also annoyed with the overzealous spotless
constraints.

Anthony



On Aug 22, 2018, at 9:28 AM, Kirk Lund  wrote:

Sorry to waste everyone’s time with something so trivial. I tried hard to
ensure my instructions for Setting Up IntelliJ were correct and error-free,
so this formatter issue was a big disappointment to me because of that.
The instructions have been updated.

On Tue, Aug 21, 2018 at 3:57 PM, Kirk Lund  wrote:


Looks like it's a feature:
https://github.com/google/google-java-format/issues/62

Is it too late to down-vote our use of google-java-format?

On Tue, Aug 21, 2018 at 3:43 PM, Ki

Re: 2 minute gateway startup time due to GEODE-5591

2018-09-04 Thread Udo Kohlmeyer
Imo (and I'm coming in cold)... We are NOT officially supporting Alpine 
Linux (yet), which is the basis for this ticket, so maybe push this to a 
later release?


I prefer us getting out the fixes we have and releasing a more optimal 
version of GEODE-5591 later.


IF this is a bug that will affect us on EVERY Linux distro, then we 
should fix it; otherwise, I vote to push it to 1.8.


--Udo


On 9/4/18 16:38, Dan Smith wrote:

Splitting this into a separate thread.

I see the issue. The two minute timeout is the constructor for
AcceptorImpl, where it retries to bind for 2 minutes.

That behavior makes sense for CacheServer.start.

But it doesn't make sense for the new logic in GatewayReceiver.start() from
GEODE-5591. That code is trying to use CacheServer.start to scan for an
available port, trying each port in a range. That free port finding logic
really doesn't want to have two minutes of retries for each port. It seems
like we need to rework the fix for GEODE-5591.
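
To make the mismatch concrete, the port-finding side really wants each
attempt to fail fast, roughly like this sketch (tryStartOnPort is a made-up
stand-in for CacheServer.start on a fixed port; this is not the GEODE-5591 code):

class PortScanSketch {

  // made-up stand-in for "start a CacheServer bound to exactly this port"
  static void tryStartOnPort(int port) throws java.io.IOException {
    new java.net.ServerSocket(port).close(); // fails immediately if the port is taken
  }

  static int startOnFirstFreePort(int startPort, int endPort) {
    for (int port = startPort; port <= endPort; port++) {
      try {
        tryStartOnPort(port); // single attempt, no two-minute retry window
        return port;          // success: this is the receiver's port
      } catch (java.io.IOException portInUse) {
        // expected when the port is taken; move on to the next candidate
      }
    }
    throw new IllegalStateException("No free port in range " + startPort + "-" + endPort);
  }
}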

Does it make sense to hold up the release to rework this fix, or should we
just revert it? Have we switched concourse over to using alpine linux,
which I think was the original motivation for this fix?

-Dan

On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith  wrote:


Why is it waiting at all in this case? Where is this 2 minute timeout
coming from?

-Dan

On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda 
wrote:
So the issue is that it takes longer to start than previous releases?
Also, is this wait time only when using Gfsh to create gateway-receiver?

On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag  wrote:


Currently we have a minor issue in the release branch as pointed out by
Barry O.
We will wait till a resolution is figured out for this issue.

Steps:
1. create locator
2. start server --name=server1 --server-port=40404
3. start server --name=server2 --server-port=40405
4. create gateway-receiver --member=server1
5. create gateway-receiver --member=server2 `This gets stuck for 2

minutes`

Is the 2 minute wait time acceptable? Should we document it? When we

revert

GEODE-5591, this issue does not happen.

Regards
Nabarun Nag





Re: 2 minute gateway startup time due to GEODE-5591

2018-09-05 Thread Udo Kohlmeyer
Did we also revert this in 1.7? I assume it has been, but that's not 
directly stated here.



On 9/5/18 10:20, Nabarun Nag wrote:

GEODE-5591 has been reverted in develop
ref: 901da27f227a8ce2b7d6b681619782a1accd9330

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 10:14 AM Ryan McMahon  wrote:


+1 for reverting in both places.

I see that there is already an isGatewayReceiver flag in the AcceptorImpl
constructor.  It's not ideal, but could we use this flag to prevent the 2
minute retry logic from happening if this flag is true?
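
Roughly what that could look like (a sketch only, with illustrative names and
timeouts, not the actual AcceptorImpl code):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.concurrent.TimeUnit;

class BindRetrySketch {

  // Gateway receivers scan a port range, so they should fail fast;
  // a plain CacheServer keeps the existing two-minute retry window.
  static ServerSocket bindWithRetry(InetSocketAddress address, boolean isGatewayReceiver)
      throws IOException, InterruptedException {
    long retryWindowMillis = isGatewayReceiver ? 0 : TimeUnit.MINUTES.toMillis(2);
    long deadline = System.currentTimeMillis() + retryWindowMillis;
    while (true) {
      try {
        ServerSocket socket = new ServerSocket();
        socket.bind(address);
        return socket;
      } catch (IOException bindFailure) {
        if (System.currentTimeMillis() >= deadline) {
          throw bindFailure; // out of retry budget
        }
        Thread.sleep(1000);  // brief pause before the next attempt
      }
    }
  }
}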

Ryan

On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
lhughesgodf...@pivotal.io> wrote:


+1 for reverting in both places.

On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith  wrote:


+1 for reverting in both places. The current fix is not better, that's

why

we are reverting it on the release branch!

-Dan

On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett 

wrote:

I’m not ok with reverting in develop. Revert in 1.7 and modify in

develop.

We shouldn’t go backwards in develop. The current fix is better than

the

bug it fixes.


On Sep 5, 2018, at 9:40 AM, Nabarun Nag  wrote:

If everyone is okay with it, I will revert that change in develop

and

then

cherry pick it to release/1.7.0 branch.
Please do comment.

Regards
Nabarun Nag



On Wed, Sep 5, 2018 at 9:30 AM Dan Smith 

wrote:

+1 to yank it and rework the fix.

Gester's change helps, but it just means that you will sometimes

randomly

have a 2 minute delay starting up a gateway receiver. I don't

think

that is

a great user experience either.

-Dan

On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <

bschucha...@pivotal.io>

wrote:


Let's yank it




On 9/4/18 5:04 PM, Sean Goller wrote:

If it's to get the release out, I'm fine with reverting. I don't

like

it,

but I'm not willing to die on that hill. :)

-S.

On Tue, Sep 4, 2018 at 4:38 PM Dan Smith 

wrote:

Spitting this into a separate thread.

I see the issue. The two minute timeout is the constructor for
AcceptorImpl, where it retries to bind for 2 minutes.

That behavior makes sense for CacheServer.start.

But it doesn't make sense for the new logic in

GatewayReceiver.start()

from
GEODE-5591. That code is trying to use CacheServer.start to

scan

for

an

available port, trying each port in a range. That free port

finding

logic

really doesn't want to have two minutes of retries for each

port.

It

seems
like we need to rework the fix for GEODE-5591.

Does it make sense to hold up the release to rework this fix,

or

should

we
just revert it? Have we switched concourse over to using alpine

linux,

which I think was the original motivation for this fix?

-Dan

On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith 

wrote:

Why is it waiting at all in this case? Where is this 2 minute

timeout

coming from?

-Dan

On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <


sai.boorlaga...@gmail.com


wrote:

So the issue is that it takes longer to start than previous

releases?

Also, is this wait time only when using Gfsh to create
gateway-receiver?

On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag 

wrote:

Currently we have a minor issue in the release branch as

pointed

out

by

Barry O.

We will wait till a resolution is figured out for this

issue.

Steps:
1. create locator
2. start server --name=server1 --server-port=40404
3. start server --name=server2 --server-port=40405
4. create gateway-receiver --member=server1
5. create gateway-receiver --member=server2 `This gets stuck

for 2

minutes`


Is the 2 minute wait time acceptable? Should we document it?

When

we

revert


GEODE-5591, this issue does not happen.

Regards
Nabarun Nag






Re: 2 minute gateway startup time due to GEODE-5591

2018-09-05 Thread Udo Kohlmeyer

Thank you. I must have missed that :)


On 9/5/18 10:54, Nabarun Nag wrote:

@Udo I have mentioned in an earlier mail that it will be reverted in
develop and then cherry picked to release/1.7.0. The release/1.7.0 branch has not
been published yet, as it is undergoing preliminary tests before a release
candidate is published.

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 10:46 AM Udo Kohlmeyer  wrote:


Did we also revert this in 1.7? I assume it has, but not directly stated
here.


On 9/5/18 10:20, Nabarun Nag wrote:

GEODE-5591 has been reverted in develop
ref: 901da27f227a8ce2b7d6b681619782a1accd9330

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 10:14 AM Ryan McMahon 

wrote:

+1 for reverting in both places.

I see that there is already an isGatewayReceiver flag in the

AcceptorImpl

constructor.  It's not ideal, but could we use this flag to prevent the

2

minute retry logic for happening if this flag is true?

Ryan

On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
lhughesgodf...@pivotal.io> wrote:


+1 for reverting in both places.

On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith  wrote:


+1 for reverting in both places. The current fix is not better, that's

why

we are reverting it on the release branch!

-Dan

On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett 

wrote:

I’m not ok with reverting in develop. Revert in 1.7 and modify in

develop.

We shouldn’t go backwards in develop. The current fix is better than

the

bug it fixes.


On Sep 5, 2018, at 9:40 AM, Nabarun Nag  wrote:

If everyone is okay with it, I will revert that change in develop

and

then

cherry pick it to release/1.7.0 branch.
Please do comment.

Regards
Nabarun Nag



On Wed, Sep 5, 2018 at 9:30 AM Dan Smith 

wrote:

+1 to yank it and rework the fix.

Gester's change helps, but it just means that you will sometimes

randomly

have a 2 minute delay starting up a gateway receiver. I don't

think

that is

a great user experience either.

-Dan

On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <

bschucha...@pivotal.io>

wrote:


Let's yank it




On 9/4/18 5:04 PM, Sean Goller wrote:

If it's to get the release out, I'm fine with reverting. I don't

like

it,

but I'm not willing to die on that hill. :)

-S.

On Tue, Sep 4, 2018 at 4:38 PM Dan Smith 

wrote:

Spitting this into a separate thread.

I see the issue. The two minute timeout is the constructor for
AcceptorImpl, where it retries to bind for 2 minutes.

That behavior makes sense for CacheServer.start.

But it doesn't make sense for the new logic in

GatewayReceiver.start()

from
GEODE-5591. That code is trying to use CacheServer.start to

scan

for

an

available port, trying each port in a range. That free port

finding

logic

really doesn't want to have two minutes of retries for each

port.

It

seems
like we need to rework the fix for GEODE-5591.

Does it make sense to hold up the release to rework this fix,

or

should

we
just revert it? Have we switched concourse over to using alpine

linux,

which I think was the original motivation for this fix?

-Dan

On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith 

wrote:

Why is it waiting at all in this case? Where is this 2 minute

timeout

coming from?

-Dan

On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <


sai.boorlaga...@gmail.com


wrote:

So the issue is that it takes longer to start than previous

releases?

Also, is this wait time only when using Gfsh to create
gateway-receiver?

On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag 

wrote:

Currently we have a minor issue in the release branch as

pointed

out

by

Barry O.

We will wait till a resolution is figured out for this

issue.

Steps:
1. create locator
2. start server --name=server1 --server-port=40404
3. start server --name=server2 --server-port=40405
4. create gateway-receiver --member=server1
5. create gateway-receiver --member=server2 `This gets stuck

for 2

minutes`


Is the 2 minute wait time acceptable? Should we document it?

When

we

revert


GEODE-5591, this issue does not happen.

Regards
Nabarun Nag








Re: 2 minute gateway startup time due to GEODE-5591

2018-09-05 Thread Udo Kohlmeyer

+1


On 9/5/18 10:35, Anthony Baker wrote:

Before this improvement is re-merged I’d like to see:

1) A test that characterizes the current behavior (e.g. doesn’t wait 2 min when 
there’s a port conflict)
2) A test that demonstrates how the current logic is insufficient

Anthony



On Sep 5, 2018, at 10:20 AM, Nabarun Nag  wrote:

GEODE-5591 has been reverted in develop
ref: 901da27f227a8ce2b7d6b681619782a1accd9330

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 10:14 AM Ryan McMahon  wrote:


+1 for reverting in both places.

I see that there is already an isGatewayReceiver flag in the AcceptorImpl
constructor.  It's not ideal, but could we use this flag to prevent the 2
minute retry logic for happening if this flag is true?

Ryan

On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
lhughesgodf...@pivotal.io> wrote:


+1 for reverting in both places.

On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith  wrote:


+1 for reverting in both places. The current fix is not better, that's

why

we are reverting it on the release branch!

-Dan

On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett 

wrote:

I’m not ok with reverting in develop. Revert in 1.7 and modify in

develop.

We shouldn’t go backwards in develop. The current fix is better than

the

bug it fixes.


On Sep 5, 2018, at 9:40 AM, Nabarun Nag  wrote:

If everyone is okay with it, I will revert that change in develop

and

then

cherry pick it to release/1.7.0 branch.
Please do comment.

Regards
Nabarun Nag



On Wed, Sep 5, 2018 at 9:30 AM Dan Smith 

wrote:

+1 to yank it and rework the fix.

Gester's change helps, but it just means that you will sometimes

randomly

have a 2 minute delay starting up a gateway receiver. I don't

think

that is

a great user experience either.

-Dan

On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <

bschucha...@pivotal.io>

wrote:


Let's yank it




On 9/4/18 5:04 PM, Sean Goller wrote:

If it's to get the release out, I'm fine with reverting. I don't

like

it,

but I'm not willing to die on that hill. :)

-S.

On Tue, Sep 4, 2018 at 4:38 PM Dan Smith 

wrote:

Spitting this into a separate thread.

I see the issue. The two minute timeout is the constructor for
AcceptorImpl, where it retries to bind for 2 minutes.

That behavior makes sense for CacheServer.start.

But it doesn't make sense for the new logic in

GatewayReceiver.start()

from
GEODE-5591. That code is trying to use CacheServer.start to

scan

for

an

available port, trying each port in a range. That free port

finding

logic

really doesn't want to have two minutes of retries for each

port.

It

seems
like we need to rework the fix for GEODE-5591.

Does it make sense to hold up the release to rework this fix,

or

should

we
just revert it? Have we switched concourse over to using alpine

linux,

which I think was the original motivation for this fix?

-Dan

On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith 

wrote:

Why is it waiting at all in this case? Where is this 2 minute

timeout

coming from?

-Dan

On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <


sai.boorlaga...@gmail.com


wrote:

So the issue is that it takes longer to start than previous

releases?

Also, is this wait time only when using Gfsh to create
gateway-receiver?

On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag 

wrote:

Currently we have a minor issue in the release branch as

pointed

out

by

Barry O.

We will wait till a resolution is figured out for this

issue.

Steps:
1. create locator
2. start server --name=server1 --server-port=40404
3. start server --name=server2 --server-port=40405
4. create gateway-receiver --member=server1
5. create gateway-receiver --member=server2 `This gets stuck

for 2

minutes`


Is the 2 minute wait time acceptable? Should we document it?

When

we

revert


GEODE-5591, this issue does not happen.

Regards
Nabarun Nag






[Discuss] Hosting nightly Geode snapshots

2018-09-05 Thread Udo Kohlmeyer

Hi there Geode Dev-list,

I would like to suggest that Geode starts making nightly build snapshots 
available for downstream consumption.


Would it be possible to start uploading nightly snapshots to the 
http://repository.apache.org/snapshots repo?


I see the last snapshot uploaded was:

gemfire-core-1.0.0-incubating-20160212.122315-223.jar (Fri Feb 12 12:23:15 UTC 2016, 12821153)
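
For context, the Gradle side of publishing to that repository is roughly this
shape (a generic maven-publish sketch, not Geode's actual publishing setup; the
credential property names are made up):

publishing {
  publications {
    maven(MavenPublication) {
      from components.java
    }
  }
  repositories {
    maven {
      url 'https://repository.apache.org/content/repositories/snapshots'
      credentials {
        username asfNexusUsername // hypothetical properties holding ASF Nexus credentials
        password asfNexusPassword
      }
    }
  }
}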



--Udo



Re: Concerning Apache Geode 1.7.0

2018-09-05 Thread Udo Kohlmeyer

Hi there John,

I've started another thread, wrt nightly snapshots. Hopefully we will 
have an answer for you soon.


--Udo


On 9/5/18 15:10, John Blum wrote:

I just need nightly build snapshots (with artifacts accessible from a
snapshot repo) on a regular basis.  That will eventually include the 1.8
bits too.

Anyway, thank you for your efforts and straightening out the bits for 1.7.
Will there be build snapshots available to test with?

If at all possible, I want to avoid having to build the bits manually,
myself.  Plus, I need build snapshot bits to pull for SDG's CI env.

Regards,
John


On Wed, Sep 5, 2018 at 3:04 PM, Nabarun Nag  wrote:


@John, everything you mentioned in your mail is how things happen in Apache
Geode, where we cut a release branch and develop moves to the SNAPSHOT
version for the next release.

But as we mentioned in the release mails, 1.7.0 needs some special work
done.
We created the release branch for 1.7.0 and moved develop to
1.8.0-SNAPSHOT a couple of months ago, but we paused the release and started
working on product improvement.
Now, when we decided to create a new release branch for 1.7.0, there were a
couple of methods, etc. with a hardcoded 1.8.0 version (in develop) which
needed to be reverted and tested.

The release process for the subsequent versions 1.8.0+ will continue in the
way you mentioned. For 1.7.0 there were some unforeseen circumstances,
which are causing some issues with the release process.
I apologize for the inconvenience and hopefully you will soon have a stable
branch to test.

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 2:41 PM John Blum  wrote:


I'd also say that, in general, those branches need to be cleaned up.

There

are a crap load of branches in there!  Are all those branches actively
developed and currently being built by a nightly CI?  I'd argue they

should

be!  But, I guess not since 'release/1.7.0' is still lingering.

Also, it is common for most OSS projects to publish nightly build

snapshots

(preferably from 'master', but I guess in this case, 'develop', where '
develop' should be the 1.7.0 release up to the point where 1.7.0 is
delivered, then a 1.7.x branch is created and 'master', err 'develop'
become the 1.8 development line) with the latest developments to allow
users to test drive with the latest bits, well before any release

happens,

and especially before a vote, no less.

This is part of any good OSS process where there is a significant

ecosystem

around the core project.  How else can downstream projects get a sense

for

what is happening, collaborate on that, or if necessary, even contribute,
if we cannot test things out?

Regarding...

*> It will be unproductive of users to run their experiments on a branch
with issues.*

That is a process problem since there should not be pending changes on

any

"releasable" branch that would affect core functionality, especially not
without adequate test coverage catching problems before bits are checked
in.

Anyway, the point is, there needs to be some build snapshot artifacts
readily available for users/projects to consume in order for a healthy
ecosystem to thrive.

-John


On Wed, Sep 5, 2018 at 2:28 PM, Nabarun Nag  wrote:


@John it will be published soon. The branch is not ready yet. There are
some known issues that need to be solved before we release the branch to the
world to test.

It will be unproductive for users to run their experiments on a branch
with issues. It will be up as soon as the issues are resolved. I'm working
on resolving them at the moment. Hopefully soon we will send out the mail
and the updated release branch.

Regards
Nabarun Nag


On Wed, Sep 5, 2018 at 2:22 PM John Blum  wrote:


Doesn't the pending release branch, or something, publish (build)

snapshot

artifacts to a snapshot repo somewhere?

How do/can users test out the upcoming bits (that is nightly build

snapshot

artifacts)?


On Wed, Sep 5, 2018 at 2:13 PM, Nabarun Nag  wrote:


@John, the [DISCUSS] mail for the new branch has not been sent yet.

The

branch you are looking at is an old one that was abandoned few

months

ago.

The new branch is undergoing some preliminary tests on my local

machine

before getting published to the world. A mail will be sent out as

soon

as

the branch is ready.

Regards
Nabarun Nag


On Wed, Sep 5, 2018 at 2:10 PM John Blum  wrote:


I see that the branch for Apache Geode 1.7.0 (release/1.7.0 [1])

has

been

created in preparation for the upcoming the 1.7 release.

I am trying to get an early sense/feel for the required changes

in

SDG

when

I rebase it on 1.7.

Previously, I was able to obtain Geode snapshots from...



apache-snapshots

https://repository.apache.org/content/repositories/snapshots





But currently, Maven is unable to resolve the 1.7 JARs


$ mvn dependency:resolve
[INFO] Scanning for projects...
[INFO]
[INFO] -< org.springframework.data:spring-data-geode

-

[INFO] Building Spring Data Geode 2.2.0.BUILD-SNAPSHO

Re: [Discuss] Hosting nightly Geode snapshots

2018-09-05 Thread Udo Kohlmeyer

@Dan,

I think what John is after is something a little less manual than changing the 
`geodeVersion` property every time Geode releases a new version, and 
just keeping it pointed at a `NIGHTLY` version.


Do you think this would be possible?

--Udo


On 9/5/18 15:16, Dan Smith wrote:

Hi John,

We are publishing nightly snapshots from the develop branch. See
https://repository.apache.org/content/groups/snapshots/org/apache/geode/geode-core/1.8.0-SNAPSHOT/

Also, see the gradle build for the examples repo for an example of a
build that is consuming these nightly snapshots:

https://github.com/apache/geode-examples/blob/master/build.gradle
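
In short, a downstream build only needs the snapshot repository plus a
-SNAPSHOT version, along these lines (a sketch, not the exact geode-examples
build.gradle):

repositories {
  mavenCentral()
  maven { url 'https://repository.apache.org/content/repositories/snapshots' }
}

dependencies {
  // a fixed snapshot version; a dynamic version such as '1.+' would keep
  // tracking the latest snapshot without editing geodeVersion each release
  compile "org.apache.geode:geode-core:1.8.0-SNAPSHOT"
}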

-Dan

On Wed, Sep 5, 2018 at 3:04 PM, John Blum  wrote:


+1


On Wed, Sep 5, 2018 at 3:00 PM, Udo Kohlmeyer 
wrote:


Hi there Geode Dev-list,

I would like to suggest that Geode starts making nightly build snapshots
available for downstream consumption.

Would it be possible to start uploading nightly snapshots to the
http://repository.apache.org/snapshots repo?

I see the last snapshots uploaded were

gemfire-core-1.0.0-incubating-20160212.122315-223.jar
<https://repository.apache.org/content/groups/snapshots/org/apache/geode/gemfire-core/1.0.0-incubating-SNAPSHOT/gemfire-core-1.0.0-incubating-20160212.122315-223.jar>
Fri Feb 12 12:23:15 UTC 2016  12821153


--Udo




--
-John
john.blum10101 (skype)





Re: Concerning Apache Geode 1.7.0

2018-09-05 Thread Udo Kohlmeyer

@jake, I'm sure we will better utilize the CI pipeline infra from now on.


On 9/5/18 16:47, Jacob Barrett wrote:

We should be utilizing a CI to fix these issues.


On Sep 5, 2018, at 4:46 PM, Nabarun Nag  wrote:

As mentioned in the previous emails, we are still working on fixing issues.


On Wed, Sep 5, 2018 at 4:42 PM Jacob Barrett  wrote:

Why hasn’t the CI pipeline been created though? How can we be validating
the changes going into this branch if there isn’t a CI? If there was a CI
in place the snapshots would be published on clean builds.

-Jake



On Sep 5, 2018, at 3:04 PM, Nabarun Nag  wrote:

@John, everything you mentioned in mail is how things happen in Apache
Geode where we cut a release branch and develop moves to the SNAPSHOT
version for the next release.

But as we mentioned in the release mails, 1.7.0 needs some special work
done.
We created the release branch for 1.7.0 and moved the develop to
1.8.0-SNAPSHOT couple of months ago, but we paused the release and

started

working on product improvement.
Now, when we decided to create a new release branch for 1.7.0, there were
couple of methods, etc with hardcoded 1.8.0 version (in develop) which
needed to reverted and tested.

The release process for the subsequent versions 1.8.0+ will continue in

the

way you mentioned. For 1.7.0 there were some unforeseen circumstances,
which is causing some issues with the release process.
I apologize for the inconvenience and hopefully you will soon have a

stable

branch to test.

Regards
Nabarun Nag


On Wed, Sep 5, 2018 at 2:41 PM John Blum  wrote:

I'd also say that, in general, those branches need to be cleaned up.

There

are a crap load of branches in there!  Are all those branches actively
developed and currently being built by a nightly CI?  I'd argue they

should

be!  But, I guess not since 'release/1.7.0' is still lingering.

Also, it is common for most OSS projects to publish nightly build

snapshots

(preferably from 'master', but I guess in this case, 'develop', where '
develop' should be the 1.7.0 release up to the point where 1.7.0 is
delivered, then a 1.7.x branch is created and 'master', err 'develop'
become the 1.8 development line) with the latest developments to allow
users to test drive with the latest bits, well before any release

happens,

and especially before a vote, no less.

This is part of any good OSS process where their is a significant

ecosystem

around the core project.  How else can downstream projects get a sense

for

what is happening, collaborate on that, or if necessary, even

contribute,

if we cannot test things out?

Regarding...

*> It will be unproductive of users to run their experiments on a branch
with issues.*

That is a process problem since there should not be pending changes on

any

"releasable" branch that would affect core functionality, especially not
without adequate test coverage catching problems before bits are checked
in.

Anyway, the point is, there needs to be some build snapshot artifacts
readily available for users/projects to consume in order for a healthy
ecosystem to thrive.

-John



On Wed, Sep 5, 2018 at 2:28 PM, Nabarun Nag  wrote:

@John it will be published soon. The branch is not ready yet. There are
some known issues that need to solved before we release the branch to

world

to test.

It will be unproductive of users to run their experiments on a branch

with

issues. It will be up as soon as the issues are resolved. I'm working

on

resolving them at the moment. Hopefully soon we will send out the mail

and

the updated release branch.

Regards
Nabarun Nag



On Wed, Sep 5, 2018 at 2:22 PM John Blum  wrote:

Doesn't the pending release branch, or something, publish (build)

snapshot

artifacts to a snapshot repo somewhere?

How do/can users test out the upcoming bits (that is nightly build

snapshot

artifacts)?



On Wed, Sep 5, 2018 at 2:13 PM, Nabarun Nag  wrote:

@John, the [DISCUSS] mail for the new branch has not been sent yet.

The

branch you are looking at is an old one that was abandoned few months

ago.

The new branch is undergoing some preliminary tests on my local

machine

before getting published to the world. A mail will be sent out as

soon

as

the branch is ready.

Regards
Nabarun Nag



On Wed, Sep 5, 2018 at 2:10 PM John Blum  wrote:

I see that the branch for Apache Geode 1.7.0 (release/1.7.0 [1])

has

been

created in preparation for the upcoming the 1.7 release.

I am trying to get an early sense/feel for the required changes in

SDG

when

I rebase it on 1.7.

Previously, I was able to obtain Geode snapshots from...



  apache-snapshots

https://repository.apache.org/content/repositories/snapshots





But currently, Maven is unable to resolve the 1.7 JARs


$ mvn dependency:resolve
[INFO] Scanning for projects...
[INFO]
[INFO] -< org.springframework.data:spring-data-geode

-

[INFO] Building Spring Data Geode 2.2.0.BUILD-SNAPSHOT
[INFO] -

Re: [discuss] Should we evaluate at commit messages as part of PR review?

2018-09-14 Thread Udo Kohlmeyer

-1 on removing the JIRA ticket number from the summary line.

I believe that it is a vital piece of the commit message, and putting it 
up front enforces that all work is committed with a corresponding JIRA.


I'm on the fence when it comes to the whole "let's put a lot more 
information into the git commit message." I feel that the information provided 
should be enough to give an understanding of what was done in the commit. 
But given that we have a "single commit" policy, we might end up having 
to provide A LOT of detail in the commit message.


So how much info is too much and how much is not enough? That becomes a 
question of perspective and judgement...


--Udo


On 9/14/18 11:55, Bradford Boyle wrote:

How would people feel about removing the requirement to include the
"GEODE-: " prefix in the summary line? That accounts for about 25% of
the 52 character limit. We could move it to the first non-summary line of
the commit message.
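
For illustration, the two layouts under discussion would look roughly like
this (ticket number and wording invented for the example):

Current convention:

  GEODE-1234: Prevent two minute wait in gateway receiver start

Proposed alternative:

  Prevent two minute wait in gateway receiver start

  GEODE-1234
  Longer explanation of what changed and why.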

--Bradford





Re: goodbye LogWriterI18n

2018-10-08 Thread Udo Kohlmeyer

Awesome work!!

I've previously attempted to remove the deprecated LogWriterI18n, but 
ran out of time.


--Udo


On 10/8/18 11:20, Bruce Schuchardt wrote:
Removal of LogWriterI18n is in JIRA: 
https://issues.apache.org/jira/browse/GEODE-258


That's a subtask of GEODE-72


On 10/5/18 3:38 PM, Jacob Barrett wrote:
Perhaps it would be wise to create a JIRA that lists things, like 
this, we would like to remove from the public API come 2.0 so we 
don’t forget.


On Oct 5, 2018, at 3:02 PM, Bruce Schuchardt 
 wrote:


I've just checked in a fairly large set of changes to remove all 
uses of LogWriterI18n and to remove the LocalizedStrings.java class 
altogether.  The other I18n classes are still in place because they 
are in a public package or are needed to make that stuff compile, 
but you should no longer use them.


 From now on please use the new log4j-based log service.  If you 
must use a LogWriter do not use the LogWriterI18n variant because 
you no longer have a good framework for creating StringIDs to feed 
to its methods.  Just use regular Strings.
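
For anyone migrating, the replacement is ordinary log4j2-style logging with
plain Strings; a minimal sketch (class name made up):

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class MyComponent {

  private static final Logger logger = LogManager.getLogger(MyComponent.class);

  void doWork() {
    // plain Strings with {} placeholders; no StringId or LocalizedStrings needed
    logger.info("Started work on thread {}", Thread.currentThread().getName());
  }
}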









Re: [DISCUSS] Predictable minor release cadence

2018-10-08 Thread Udo Kohlmeyer
@Alexander, I don't believe that we can use the "DISCUSS" thread to 
conclude that we have made a decision to do something.


Imo, it gauges interest rather than making a decision.

I would rather see a "VOTE" thread started, detailing the proposal 
and the process for how this will work, as I don't see how we can make a 
decision on anything that isn't clearly defined. By clearly defined, I 
mean: what is the process regarding a major, minor and patch release? I 
agree with @Anthony that all releases are treated equally... But as 
@Ken and @John have stated, what happens to the *patch* number? If it 
becomes a redundant, red-taped release, then we might end up with only 
*major*.*minor* releases.


--Udo


On 10/8/18 13:37, Alexander Murmann wrote:

Hi all,

Given the overwhelmingly positive response, I added a release schedule page
to the wiki:
https://cwiki.apache.org/confluence/display/GEODE/Release+Schedule

Given the many "+1"s in this thread, can this be seen as agreed or do we
need a formal [VOTE] thread?

On Mon, Oct 8, 2018 at 1:34 PM Anthony Baker  wrote:


It’s an ASF requirement that PMC’s shepherd releases through a prescribed
set of practices.  It doesn’t matter if a release is major, minor, or
patch—they all must be voted and approved the the PMC.

Anthony



On Oct 8, 2018, at 1:04 PM, John Blum  wrote:

Also, a huge +1 to Ken's suggestion of actually using the

maintenance/patch

release version major.minor.*patch*, perhaps without all the "Apache
ceremony" around releasing.  Patches, should be quick and painless.






Re: Running compatibility and upgrade tests using jdk9+

2018-10-08 Thread Udo Kohlmeyer
Should this not rather be a statement of "Running on JDK 11+" and not 
9+, given that 9 and 10 are no longer supported?
Also, when this is released, we will be running on 11 rather than 9; or 
what is the thought (end goal) with this effort?


Also, does this effort cover some of the main additions to the JDK since 
9, such as the whole modularity theme?


--Udo

On 10/8/18 14:11, Jinmei Liao wrote:

In the effort of making GEODE java 9+ compatible, we already determined
that older released versions of GEODE can NOT be run using jdk9+. With this
in mind, should we still run those compatibility/upgrade DUnit tests in the
java9+ pipeline? The current state of things is:

1. A subset of compatibility/upgrade DUnit tests are passing in java9+
because the dunit test environment happens to have the missing
jars in the classpath.  With the exclusion of Geode 1.4 in those tests, we
can make all of them pass. (Just FYI, only Geode 1.4 is failing in jdk9+
because we introduced SerializationFilter in 1.4, but support for it in
jdk9 was added only in 1.5.)
2. We will have parallel pipelines testing with both jdk8 and jdk9+. So
even if we don't run these tests in the jdk9+ pipeline, we are still running
them in jdk8.

The question to ask is: does running compatibility/upgrade tests in jdk9 in
addition to jdk8 offer additional value?





Re: [VOTE] Time-based release schedule for minor releases

2018-10-08 Thread Udo Kohlmeyer

-0

It seems we have completely disregarded the *patch* version number. Does 
this mean Geode versions will be *major*.*minor*? Can we then remove the 
*patch* version from the release version?


In addition to this, should the test coverage not be sufficient 
to allow "release when green"? I must agree with @Jacob, I would prefer 
something a lot less formalized. If the community has contributed a 
significant fix, should that not warrant an ad-hoc patch release? Or 
if the community has added functionality that could "fill" a 
single minor release by itself, should that not warrant a pre-emptive 
release?


All these questions are not enough to warrant blocking this effort, 
but I would prefer those use cases to be considered in a more comprehensive 
documentation effort than what is currently on the wiki.


In addition to that, is a release with only bug fixes in it really 
still worthy of a minor release number, or should it not rather count as a 
patch release?


--Udo


On 10/8/18 14:27, Jacob Barrett wrote:

+0

My preference is to release when there is something worth releasing rather
than arbitrary points in time but I don't hold that preference strongly
enough to spike this effort.

On Mon, Oct 8, 2018 at 2:24 PM Alexander Murmann 
wrote:


Hi everyone,

As discussed in "Predictable minor release cadence", I'd like us to find
agreement on releasing a new minor version every three months. There are
more details in the other thread and I should have captured everything
relevant on the wiki:
https://cwiki.apache.org/confluence/display/GEODE/Release+Schedule

There are also some discussions about patch releases. Let's please focus
this vote on the proposed minor release schedule and carry on other
discussions in the [DISCUSS] thread.

Thank you all!





Re: [Discuss] Where should simple classes for tests belong?

2018-10-15 Thread Udo Kohlmeyer
+1 to what Jens & Jinmei & Kirk have said. It is far easier to have a 
local definition of the "toy" classes.


Yes, it makes the code base "larger" and most good editors would flag it 
as duplicate code, BUT it is about localization and making sure that one 
can easily see what classes are part of a test and what its definition is.


In addition to this, having a set of shared "toy" definitions lying around 
means that, as more tests use these classes, the test scenario 
inevitably starts seeping into the toy classes, eventually causing 
copies of these classes to be made in any case, as the original no longer 
meets the requirements of the new test.


--Udo


On 10/15/18 09:40, Jinmei Liao wrote:

+1 to what Jens said. The test jar is very specific to the test itself.
Does not need to be in the src at all. It should disappear when the test
finishes. There is no need for it to be in the build process at all.
Producing it inside the test itself is a much cleaner way for the
developer/maintainer of the code to know what the intent of the test/code
is for rather having to hunt around to find out where the test code lives.

On Mon, Oct 15, 2018 at 8:38 AM Jens Deppe  wrote:


For testing functionality like 'gfsh deploy' one has to be aware of
classloading issues. Specifically, when writing dunit tests, classes are
compiled from java resource files so that they are guaranteed not to be on
the classpath during testing. This is less of an issue for gfsh acceptance
tests where there is more control over the classpath and what is launched
as part of the test.

For the dunit deploy testing scenario (and any similar) I would prefer to
see the java test resource as exactly that - a package-equivalent resource
file that is explicitly compiled as part of the test. It makes the test
clearer and doesn't leave anyone wondering where/how resources are built or
injected.
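
As a sketch of what "explicitly compiled as part of the test" can look like
with the JDK's own compiler API (names and file handling simplified; the real
dunit helpers differ):

import java.io.File;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileTestResourceSketch {

  // Compile a .java resource into a scratch directory at test time, so the
  // class is never on the test classpath until the test deploys it itself.
  static File compile(File javaSource, File outputDir) {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    int exitCode = compiler.run(null, null, null,
        "-d", outputDir.getAbsolutePath(),
        javaSource.getAbsolutePath());
    if (exitCode != 0) {
      throw new IllegalStateException("Compilation failed for " + javaSource);
    }
    return outputDir;
  }
}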

--Jens

On Fri, Oct 12, 2018 at 4:22 PM Kirk Lund  wrote:


I usually create a private static inner class and keep it as small as
possible at the end of the test class that uses it. Some of these,
such as MyCacheListener, can be replaced by spy(CacheListener.class) and I
always try to convert these where possible.

If it's a simple class that only tests in one package will use (like a
custom AssertJ class that nothing outside that package will use) then I
make it package-protected and stick it in the same package and srcSet as
the tests that will use it.

I think something like MonthBasedPartitionResolver is simple enough that

it

should exist within the same src set and in the same package as the test

or

a few tests that use it. And if another test in some other src set or in
another package needs something similar, then too bad, that other test
should create its own simple small class that is similar. This sort of
class does NOT belong in a src/main of some module like geode-dunit.

Better

yet make it a static inner class in every test that uses it. I know some
people object to this, but here's why...

If I want to modify one test that uses MonthBasedPartitionResolver and

this

modification involves changing MonthBasedPartitionResolver (which

remember

it's a simple class) then if each test has its own copy I won't risk
breaking other tests. When I'm fixing up, cleaning up and refactoring
tests, the biggest nightmares I run into are shared utility classes such
that if I try to cleanup one test, I can't -- I have to clean up
potentially dozens of tests because they linked to some stupid utility
class because "code sharing" is great. MonthBasedPartitionResolver is a
small example, there are plenty of other examples that are true
monstrosities.
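
As a concrete (made-up) example of the kind of small toy class in question,
the whole thing fits comfortably as a static inner class of the single test
that needs it:

import org.apache.geode.cache.EntryOperation;
import org.apache.geode.cache.PartitionResolver;

// in practice: a static inner class at the bottom of the one test that uses it
class MonthBasedPartitionResolverSketch implements PartitionResolver<String, Object> {

  @Override
  public Object getRoutingObject(EntryOperation<String, Object> opDetails) {
    // route by the "yyyy-MM" prefix of the key, e.g. "2018-09-someId"
    return opDetails.getKey().substring(0, 7);
  }

  @Override
  public String getName() {
    return getClass().getSimpleName();
  }

  @Override
  public void close() {
    // nothing to clean up
  }
}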

Sharing code between tests is NOT great unless you devise a Rule or

custom

AssertJ Assertion that is truly very reusable across many tests -- and

only

in this case it belongs in geode-junit or geode-dunit where it should

live

in a src/main and there should be tests for it under src/test or
src/integrationTest. I also highly recommend that you keep shared testing
objects like this small in scope with a single high-level responsibility.
And please, never create something that simply creates a new improved API
so that you don't have to use the Geode API -- this just promotes leaving
crappy product APIs in place without changing them because by using some
fancy testing API, we don't feel the pain that Users experience with the
product APIs.

On Fri, Oct 12, 2018 at 3:32 PM, Patrick Rhomberg 
wrote:


Hello, all!

   There are a number of classes that require some number of "toy"

classes

as part of their testing framework, e.g., to be used as custom data
containers.  Many of these classes involve testing or use of the

`deploy

jar` command, and so are packaged into jars for this purpose.
   In an effort to promote good coding habits and, as importantly,

consensus

throughout this community and our codebase, I would like to ask which

of

these are considered the preferred method to maintain these additional
cl

Re: About JIRA GEODE-5896

2018-10-22 Thread Udo Kohlmeyer

Hi there Dong Yang,

If you have completed a fix, please submit it via the PR mechanism 
within Github. We will most gladly review and incorporate.


--Udo

On 10/18/18 06:00, Yang, Dong [GTSUS Non-J&J] wrote:

Hi,

I am Dong Yang, and my apache account is twosand. The way we are using GemFire is 
not a common usage scenario; it is more of a mixed OLTP and OLAP 
scenario. The concept is very similar to using the Spark-GemFire connector: we 
have server-side functions that can shuffle data from the server to the client in a 
streaming style. We encountered the thread lock issue in different 
environments. Previously we used GemFire 8, and we are now upgrading to GemFire 9.
GEODE-5896 is a very important use case for us, and I think the same holds for 
others if they want to use Spark to connect to GemFire. For now we apply a 
patch on the client side to force the metadata to be ready before the function is 
executed, but the proper solution should also fix some server-side code.
I can share what I found and where I want to fix it, and you can review whether 
it is reasonable or not. It can be fixed by the current Geode team, or I can do 
it as a contributor.



Thanks,
Dong Yang [GTSUS Non-J&J]





Re: Lombok

2018-11-08 Thread Udo Kohlmeyer

The Spring world/community are heavy users of Lombok.

In essence it is "nice", BUT it adds a new dependency on a 
library whose purpose is to provide functionality that developers should provide 
themselves. IntelliJ IDEA does provide support for Lombok.


I have not yet seen any code bloat that Lombok could reduce for us. 
Also, the reduction is only in terms of what is "visible"; the compiled class 
might be more verbose.


Kotlin, on the other hand, has some of this boilerplate reduction built in as a 
language feature. I prefer that over choosing a library that might have 
compatibility issues in the future.


Also, Kotlin's conciseness is a language feature rather than a 
library plugin. I've also seen cases where compiled Java was larger than 
the equivalent compiled Kotlin code.


--Udo


On 11/8/18 10:31, Aditya Anchuri wrote:

Hi everyone,

I am considering adding Lombok as a compile-time dependency (
https://projectlombok.org/) so we can reduce the amount of boilerplate code
and reduce the size of some of our classes. I have a small proof of concept
PR ready to go. Before I do so, I want to find out if people have tried it
before and how they feel about it, especially when used with IDEs like
IntelliJ and/or Eclipse?

Thanks,
-Aditya





Re: Lombok

2018-11-08 Thread Udo Kohlmeyer
I know that this is "pedantic".. but to me "isDiskSynchronous" reads 
better than "getDiskSynchronous"


but in the end, it returns the declared field.

Kotlin's data class, on the other hand, plays a little "nicer".

If you define the field as "isDiskSynchronous" the generated getter is 
"isDiskSynchronous()". Would Lombok respect similar semantics?


--Udo


On 11/8/18 13:57, Aditya Anchuri wrote:

Note: If the PR gets accepted, people that use IntelliJ idea or Eclipse
will need to use the Lombok plugin for their respective IDEs -- for
IntelliJ people will also need to enable annotation processing in the
compiler settings if not already enabled.

On Thu, Nov 8, 2018 at 12:02 PM Aditya Anchuri  wrote:


I've only touched a few classes in my PR, but I feel like there's a lot
more boilerplate floating around that can be removed. Having said that, I
agree with your point regarding Kotlin, but for the Java code I would find
Lombok pretty useful. Have included a link to the PR:

https://github.com/apache/geode/pull/2815

-Aditya



On Thu, Nov 8, 2018 at 11:24 AM Udo Kohlmeyer  wrote:


The Spring world/community are heavy users of Lombok.

In essence it is "nice", BUT it does now add a new dependency on a
library that is to provide functionality that developers should provide.
IJ Idea does provide support for Lombok.

I have not yet seen any code bloat that Lombok could reduce for us.
Also, the reduction is only in terms of "visible", the compiled class
might be more verbose.

Kotlin on the other hand, as some of the boilerplate code built in as a
language feature. I prefer that over choosing a library, that might have
compatibility issues in the future.

Also, Kotlin's conciseness is also a language feature rather than
library plugin. I've also seen cases where compiled Java was larger than
the equivalent compiled Kotlin code.

--Udo


On 11/8/18 10:31, Aditya Anchuri wrote:

Hi everyone,

I am considering adding Lombok as a compile-time dependency (
https://projectlombok.org/) so we can reduce the amount of boilerplate

code

and reduce the size of some of our classes. I have a small proof of

concept

PR ready to go. Before I do so, I want to find out if people have tried

it

before and how they feel about it, especially when used with IDEs like
IntelliJ and/or Eclipse?

Thanks,
-Aditya







Re: Lombok

2018-11-08 Thread Udo Kohlmeyer

As an informative comparison of conciseness, offering the same functionality:

Java Code:

import java.io.Serializable;

import lombok.Getter;

import org.apache.geode.cache.configuration.CacheConfig.GatewayReceiver;
import org.apache.geode.cache.configuration.DeclarableType;

/**
 * This class stores the arguments provided in the create gateway-receiver command.
 */
@Getter
public class GatewayReceiverFunctionArgs implements Serializable {

  private static final long serialVersionUID = -5158224572470173267L;

  private final Boolean manualStart;
  private final Integer startPort;
  private final Integer endPort;
  private final String bindAddress;
  private final Integer socketBufferSize;
  private final Integer maximumTimeBetweenPings;
  private final String[] gatewayTransportFilters;
  private final String hostnameForSenders;
  private final Boolean ifNotExists;

  public GatewayReceiverFunctionArgs(GatewayReceiver configuration, Boolean ifNotExists) {
    this.manualStart = configuration.isManualStart();
    this.startPort = configuration.getStartPort() != null
        ? Integer.valueOf(configuration.getStartPort()) : null;
    this.endPort = configuration.getEndPort() != null
        ? Integer.valueOf(configuration.getEndPort()) : null;
    this.bindAddress = configuration.getBindAddress();
    this.socketBufferSize = configuration.getSocketBufferSize() != null
        ? Integer.valueOf(configuration.getSocketBufferSize()) : null;
    this.maximumTimeBetweenPings = configuration.getMaximumTimeBetweenPings() != null
        ? Integer.valueOf(configuration.getMaximumTimeBetweenPings()) : null;
    this.gatewayTransportFilters = configuration.getGatewayTransportFilter() != null
        ? configuration.getGatewayTransportFilter().stream()
            .map(DeclarableType::getClassName)
            .toArray(String[]::new)
        : null;
    this.hostnameForSenders = configuration.getHostnameForSenders();
    this.ifNotExists = ifNotExists;
  }
}

The equivalent Kotlin definition is:

import org.apache.geode.cache.configuration.CacheConfig
import org.apache.geode.cache.configuration.ClassWithParametersType
import java.io.Serializable

data class GatewayReceiverFunctionArgs
@JvmOverloads private constructor(val manualStart: Boolean,
                                  val startPort: Int?,
                                  val endPort: Int?,
                                  val bindAddress: String,
                                  val socketBufferSize: Int?,
                                  val maximumTimeBetweenPings: Int?,
                                  val gatewayTransportFilters: Array<String>,
                                  val hostNameForSender: String,
                                  val ifNotExists: Boolean) : Serializable {

  constructor(configuration: CacheConfig.GatewayReceiver, ifNotExists: Boolean) :
      this(configuration.isManualStart,
          configuration.startPort?.toInt(),
          configuration.endPort?.toInt(),
          configuration.bindAddress,
          configuration.socketBufferSize?.toInt(),
          configuration.maximumTimeBetweenPings?.toInt(),
          configuration.gatewayTransportFilter
              ?.map { classWithParametersType: ClassWithParametersType -> classWithParametersType.className }
              ?.toTypedArray()
              ?: emptyArray(),
          configuration.hostnameForSenders,
          ifNotExists)

  companion object {
    @JvmStatic val serialVersionUID = -5158224572470173267L
  }
}


On 11/8/18 12:02, Aditya Anchuri wrote:

I've only touched a few classes in my PR, but I feel like there's a lot
more boilerplate floating around that can be removed. Having said that, I
agree with your point regarding Kotlin, but for the Java code I would find
Lombok pretty useful. Have included a link to the PR:

https://github.com/apache/geode/pull/2815

-Aditya



On Thu, Nov 8, 2018 at 11:24 AM Udo Kohlmeyer  wrote:


The Spring world/community are heavy users of Lombok.

In essence it is "nice", BUT it does now add a new dependency on a
library that is to provide functionality that developers should provide.
IJ Idea does provide support for Lombok.

I have not yet seen any code bloat that Lombok could reduce for us.
Also, the reduction is only in terms of "visible", the compiled class
might be more verbose.

Kotlin on the other hand, as some of the boilerplate code built in as a
language feature. I prefer that over choosing a library, that might have
compatibility issues in the future.

Also, Kotlin's conciseness is also a language feature rather than
library plugin. I've also seen cases where compiled Java was larger than
the equivalent compiled Kotlin code.

--Udo


On 11/8/18 10:31, Aditya Anchuri wrote:

Hi everyone,

I am considering adding Lombok as a compile-time dependency (
https://projectlombok.org/) so we can reduce the amount of boilerplate

code

and reduce the size of some of our classes. I have a small proof of

concept

PR ready to go. Before I do so, I want to find out if people have tried

it

before and how they feel about it, especially when used with IDEs like
IntelliJ and/or Eclipse?

Thanks,
-Aditya







Re: [DISCUSS] Disable merge for failing pull requests

2018-11-12 Thread Udo Kohlmeyer

0

Patrick does make a point. The committers on the project have been voted 
in as "responsible, trustworthy and best of breed" and rights and 
privileges according to those beliefs have been bestowed.


We should live those words and trust our committers. In the end, if 
there is a "rotten" apple in the mix, this should be addressed, maybe not 
as a public flogging, but with stern communications.


On the other side, I've also seen the model where the submitter of a PR is 
not allowed to merge + commit their own PRs. It falls to another 
committer to complete this task, avoiding the whole "I'll just quickly 
commit without due diligence".


--Udo


On 11/12/18 10:23, Patrick Rhomberg wrote:

-1

I really don't think this needs to be codified.  If people are merging red
PRs, that is a failing as a responsible developer.  Defensive programming
is all well and good, but this seems like it's a bit beyond the pale in
that regard.  I foresee it making the correction of a red main pipeline
much more difficult than it needs to be.  And while it's much better than
it had been, I am (anecdotally) still seeing plenty of flakiness in my PRs,
either resulting from infra failures or test classes that need to be
refactored or reevaluated.

If someone is merging truly broken code that has failed precheckin, that
seems to me to be a discussion to have with that person.   If it
persists, perhaps a public flogging would be in order.  But at the end
of the day, the onus is on us to be responsible developers, and no amount
of gatekeeping is going to be a substitute for that.

On Mon, Nov 12, 2018 at 9:38 AM, Galen O'Sullivan 
wrote:


I'm in favor of this change, but only if we have a way to restart failing
PR builds without being the PR submitter. Any committer should be able to
restart the build. The pipeline is still flaky enough and I want to avoid
the situation where a new contributor is asked repeatedly to push empty
commits to trigger a PR build (in between actual PR review) and their PR
gets delayed by days if not weeks. That's a real bad experience for the
people we want to be inviting in.

On Mon, Nov 12, 2018 at 9:23 AM Alexander Murmann 
wrote:


What's the general consensus on flakiness of the pipeline for this

purpose?

If there is consensus that it's still too flaky to disable the merge

button

on failure, we should probably consider doubling down on that issue

again.

It's hard to tell from just looking at the dev pipeline because you

cannot

see easily what failures were legitimate.

On Mon, Nov 12, 2018 at 8:47 AM Bruce Schuchardt 
I'm in favor of this.

Several times over the years we've made a big push to get precheckin to
pass reliably, only to see rapid degradation due to crappy code being pushed
in the face of precheckin failures.  We've just invested another half
year doing it again.  Are we going to do things differently now?
Disabling the Merge button on test failure might be a good start.

On 11/9/18 12:55 PM, Dan Smith wrote:


Hi all,

Kirks emails reminded me - I think we are at the point now with our

PR

checks where we should not be merging anything to develop that

doesn't

pass

all of the PR checks.

I propose we disable the merge button unless a PR is passing all of

the

checks. If we are in agreement I'll follow up with infra to see how

to

make

that happen.

This would not completely prevent pushing directly to develop from

the

command line, but since most developers seem to be using the github

UI, I

hope that it will steer people towards getting the PRs passing

instead

of

using the command line.

Thoughts?
-Dan







Re: [DISCUSS] Disable merge for failing pull requests

2018-11-12 Thread Udo Kohlmeyer
In that case, why don't we make it a rule that the owner of a PR is not 
allowed to merge it?


Another committer must then accept and merge it.

--Udo


On 11/12/18 14:03, Dan Smith wrote:

To be clear, I don't think we have an issue of untrustworthy committers
pushing code they know is broken to develop.

The problem is that it is all too easy to look at a PR with some failures
and *assume* your code didn't cause the failures and merge it anyway. I
think we should all be at least rerunning the tests and not merging the PR
until everything passes. If the merge button is greyed out, that might help
communicate that standard to everyone.

Looking at the OpenJDK 8 metrics, it looks to me like most of the issues
are recently introduced (builds 81-84 and the EvictionDUnitTest), not old
flaky tests. So I think if we were a little more disciplined about always
listening to what the checks are telling us, we would have less noise in
the long run.

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-metrics/jobs/GeodeDistributedTestOpenJDK8Metrics/builds/23

-Dan

On Mon, Nov 12, 2018 at 11:21 AM Udo Kohlmeyer  wrote:


0

Patrick does make a point. The committers on the project have been voted
in as "responsible, trustworthy and best of breed" and rights and
privileges according to those beliefs have been bestowed.

We should live those words and trust our committers. In the end.. If
there is a "rotten" apple in the mix this should be addressed, maybe not
as public flogging, but with stern communications.

On the other side, I've also seen the model where the submitter of PR is
not allowed to merge + commit their own PR's. That befalls to another
committer to complete this task, avoiding the whole, "I'll just quickly
commit without due diligence".

--Udo


On 11/12/18 10:23, Patrick Rhomberg wrote:

-1

I really don't think this needs to be codified.  If people are merging red
PRs, that is a failing as a responsible developer.  Defensive programming
is all well and good, but this seems like it's a bit beyond the pale in
that regard.  I foresee it making the correction of a red main pipeline
much more difficult than it needs to be.  And while it's much better than
it had been, I am (anecdotally) still seeing plenty of flakiness in my PRs,
either resulting from infra failures or test classes that need to be
refactored or reevaluated.

If someone is merging truly broken code that has failed precheckin, that
seems to me to be a discussion to have with that person.  If it persists,
perhaps a public flogging would be in order.  But at the end of the day,
the onus is on us to be responsible developers, and no amount of
gatekeeping is going to be a substitute for that.

On Mon, Nov 12, 2018 at 9:38 AM, Galen O'Sullivan  wrote:

I'm in favor of this change, but only if we have a way to restart failing
PR builds without being the PR submitter. Any committer should be able to
restart the build. The pipeline is still flaky enough and I want to avoid
the situation where a new contributor is asked repeatedly to push empty
commits to trigger a PR build (in between actual PR review) and their PR
gets delayed by days if not weeks. That's a real bad experience for the
people we want to be inviting in.

On Mon, Nov 12, 2018 at 9:23 AM Alexander Murmann 
wrote:


What's the general consensus on flakiness of the pipeline for this purpose?

If there is consensus that it's still too flaky to disable the merge button
on failure, we should probably consider doubling down on that issue again.

It's hard to tell from just looking at the dev pipeline because you cannot
see easily what failures were legitimate.

On Mon, Nov 12, 2018 at 8:47 AM Bruce Schuchardt <bschucha...@pivotal.io> wrote:


I'm in favor of this.

Several times over the years we've made a big push to get precheckin
reliable, only to see rapid degradation due to crappy code being pushed
in the face of precheckin failures.  We've just invested another half
year doing it again.  Are we going to do things differently now?
Disabling the Merge button on test failure might be a good start.

On 11/9/18 12:55 PM, Dan Smith wrote:


Hi all,

Kirk's emails reminded me - I think we are at the point now with our PR
checks where we should not be merging anything to develop that doesn't pass
all of the PR checks.

I propose we disable the merge button unless a PR is passing all of the
checks. If we are in agreement I'll follow up with infra to see how to make
that happen.

This would not completely prevent pushing directly to develop from the
command line, but since most developers seem to be using the github UI, I
hope that it will steer people towards getting the PRs passing instead of
using the command line.

Thoughts?
-Dan







Re: [DISCUSS] - Create new repository for geode benchmarks

2018-11-15 Thread Udo Kohlmeyer

+1, never thought there was any other option...


On 11/15/18 10:47, Dan Smith wrote:

Hi all,

We (Naba, Sean, Brian and I) would like to add some benchmarks for geode,
with a goal of eventually running them as part of our concourse build and
detecting performance changes.

We think it makes sense to put these benchmarks in a separate
geode-benchmarks repository. That will make them less tightly coupled to a
specific revision of geode. What do you all think? If folks are okay with
this, I will go ahead and create the repository.

We have some prototype code in this repository with a simple client/server
put benchmark:  https://github.com/upthewaterspout/geode-performance.

-Dan





Re: [DISCUSS] Disable merge for failing pull requests

2018-11-19 Thread Udo Kohlmeyer
I don't believe "name and shame" is a hammer we should wield, but if we 
have to use it... use it wisely.


Could one not configure the button so that the owner of the PR cannot 
merge the PR?


--Udo


On 11/19/18 16:03, Dan Smith wrote:

Closing the loop on this thread - I don't feel like there was enough
agreement to go forwards with disabling the merge button, so I'm going to
drop this for now.

I would like to see everyone make sure that they only merge green PRs.
Maybe we can go with a name and shame approach? As in, we shouldn't see any
new PRs show up in this query:

https://github.com/apache/geode/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Amerged+status%3Afailure

-Dan

On Tue, Nov 13, 2018 at 10:19 AM Ryan McMahon  wrote:


+1 I like this idea, but I recognize that it will be a challenge when there
is still some flakiness to the pipeline.  I think we'd need clear
guidelines on what to do if your PR fails due to something seemingly
unrelated.  For instance, we ran into GEODE-5943 (flaky EvictionDUnitTest)
in our last PR, and saw that there was already an open ticket for the
flakiness (we even reverted our change and reproduced to be sure).  So we
triggered another PR pipeline and it passed the second time.  Is rerunning
the pipeline again sufficient in this case?  Or should we have stopped what
we were doing and taken up GEODE-5943, assuming it wasn't assigned to
someone?  If it was already assigned to someone, do we wait until that bug
is fixed before we run through the PR pipeline again?

I think if anything this will help us treat bugs that are causing a red
pipeline as critical, and I think that is a good thing.  So I'm still +1
despite the flakiness.  Just curious what people think about how we should
handle cases where there is a known failure and it is indeed unrelated to
our PR.

Ryan


On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao  wrote:


Just to clarify, that flaky EvictionDUnitTest is old flaky. The PR to
refactor the test passed all checks, even the repeatTest as well. I had a
closed PR that just touched the "un-refactored" EvictionDUnitTest, it
wouldn't even pass the repeatTest at all.

On Mon, Nov 12, 2018 at 2:04 PM Dan Smith  wrote:


To be clear, I don't think we have an issue of untrustworthy committers
pushing code they know is broken to develop.

The problem is that it is all too easy to look at a PR with some failures
and *assume* your code didn't cause the failures and merge it anyway. I
think we should all be at least rerunning the tests and not merging the PR
until everything passes. If the merge button is greyed out, that might help
communicate that standard to everyone.

Looking at the OpenJDK 8 metrics, it looks to me like most of the issues
are recently introduced (builds 81-84 and the EvictionDUnitTest), not old
flaky tests. So I think if we were a little more disciplined about always
listening to what the checks are telling us, we would have less noise in
the long run.




https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-metrics/jobs/GeodeDistributedTestOpenJDK8Metrics/builds/23

-Dan

On Mon, Nov 12, 2018 at 11:21 AM Udo Kohlmeyer  wrote:


0

Patrick does make a point. The committers on the project have been voted
in as "responsible, trustworthy and best of breed" and rights and
privileges according to those beliefs have been bestowed.

We should live those words and trust our committers. In the end.. If
there is a "rotten" apple in the mix this should be addressed, maybe not
as public flogging, but with stern communications.

On the other side, I've also seen the model where the submitter of PR is
not allowed to merge + commit their own PR's. That befalls to another
committer to complete this task, avoiding the whole, "I'll just quickly
commit without due diligence".

--Udo


On 11/12/18 10:23, Patrick Rhomberg wrote:

-1

I really don't think this needs to be codified.  If people are merging red
PRs, that is a failing as a responsible developer.  Defensive programming
is all well and good, but this seems like it's a bit beyond the pale in
that regard.  I foresee it making the correction of a red main pipeline
much more difficult than it needs to be.  And while it's much better than
it had been, I am (anecdotally) still seeing plenty of flakiness in my PRs,
either resulting from infra failures or test classes that need to be
refactored or reevaluated.

If someone is merging truly broken code that has failed precheckin, that
seems to me to be a discussion to have with that person.  If it persists,
perhaps a public flogging would be in order.  But at the end of the day,
the onus is on us to be responsible developers, and no amount of
gatekeeping is going to be a substitute for that.

On Mon, Nov 12, 2018 at 9:38 AM, Galen O'Sullivan <gosulli...@pivotal.io> wrote:


I'm 

Re: [DISCUSS] Geode packages mustn't span Jigsaw modules

2018-11-26 Thread Udo Kohlmeyer
Imo, no module should export "internal". Maybe a stretch goal, but 
something we should keep in mind.



On 11/26/18 10:50, Jacob Barrett wrote:

One lingering question I have around Jigsaw: is the split package issue
recursive? By that I mean, if a module exports "org.apache.geode.internal"
and another module exports "org.apache.geode.internal.foo", is this legal?

On Mon, Nov 19, 2018 at 5:21 PM Dan Smith  wrote:


I think we can actually fix most of these issues without geode-2.0. Most of
these are in internal packages, which means we can change the package of
these classes without breaking users. The only concerning one is
org.apache.geode.cache.util, which is a public package. However, the
AutoBalancer is actually still marked @Experimental, so we could
technically move that as well. Or maybe deprecate the old one and its
associated jar, and create a new jar with a reasonable package. JDK11 users
could just avoid the old jar.

I have been wondering for a while if we should just fold geode-cq and
geode-wan back into geode-core. These are not really properly separated
out, they are very tangled with core. The above package overlap kinda shows
that as well. But maybe going through the effort of renaming the packages
to make them java 11 modules would help improve the code anyway!

-Dan




On Mon, Nov 19, 2018 at 5:03 PM Owen Nichols  wrote:


To package Geode as Java 11 Jigsaw module(s), a major hurdle is the
requirement that packages cannot be split across modules.

If we simply map existing Geode jars to Modules, we have about 10 split
packages (see table below).  Any restructuring would potentially have to
wait until Geode 2.0.

A workaround would be to map existing packages into a small number of new
modules (e.g. geode-api and geode-internal).  Existing jars would remain
as-is.  Users making the transition to Java 11 w/ Jigsaw already need to
switch from CLASSPATH to MODULEPATH, so the inconvenience of a different
naming scheme for Geode-as-modules might be acceptable (however, once
module naming and organization is established, we may be stuck with it for
a long time).

Thoughts?

Is it even possible to rearrange all classes in each package below into a
single jar?  Is doing so desirable enough to defer Java 11 support until a
yet-unplanned Geode 2.0, or perhaps to drive such a release?

Or, is it satisfactory to fatten Geode releases to include one set of jars
for CLASSPATH use, plus a different set of jars for MODULEPATH use?


Package: org.apache.geode.cache.client.internal
  Jars: geode-core-1.8.0.jar, geode-cq-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.cache.client.internal.locator.wan
  Jars: geode-core-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.cache.query.internal.cq
  Jars: geode-core-1.8.0.jar, geode-cq-1.8.0.jar
Package: org.apache.geode.cache.util
  Jars: geode-core-1.8.0.jar, geode-rebalancer-1.8.0.jar
Package: org.apache.geode.internal
  Jars: geode-connectors-1.8.0.jar, geode-core-1.8.0.jar, geode-cq-1.8.0.jar, geode-lucene-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.internal.cache.tier.sockets.command
  Jars: geode-core-1.8.0.jar, geode-cq-1.8.0.jar
Package: org.apache.geode.internal.cache.wan
  Jars: geode-core-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.internal.cache.wan.parallel
  Jars: geode-core-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.internal.cache.wan.serial
  Jars: geode-core-1.8.0.jar, geode-wan-1.8.0.jar
Package: org.apache.geode.internal.protocol.protobuf.v1
  Jars: geode-protobuf-1.8.0.jar, geode-protobuf-messages-1.8.0.jar
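
(To make the split-package constraint concrete, here is a minimal sketch; the module names are illustrative only, not a proposal. Note that JPMS packages are not hierarchical, so org.apache.geode.internal and org.apache.geode.internal.foo living in different modules would be legal, but the same package appearing in two modules is not.)

```
// module-info.java in a hypothetical geode.core module
module geode.core {
    exports org.apache.geode.cache.util;   // e.g. CacheListenerAdapter
}

// module-info.java in a hypothetical geode.rebalancer module
module geode.rebalancer {
    // This module also contains classes in org.apache.geode.cache.util
    // (e.g. AutoBalancer). Putting both modules on the module path fails
    // at resolution time with an error along the lines of:
    //   "Package org.apache.geode.cache.util in both module geode.core
    //    and module geode.rebalancer"
    exports org.apache.geode.cache.util;
}
```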




Re: [DISCUSS] Geode packages mustn't span Jigsaw modules

2018-11-26 Thread Udo Kohlmeyer
Geode-core provides suitable public alternatives; I'm sure Spring Data 
Geode would be able to adapt and use the certified public APIs.


--Udo


On 11/26/18 15:48, Dan Smith wrote:

Our wire format and disk format generally do *not* depend on class names
or packages. Internal classes implement DataSerializableFixedID and are
sent using a fixed integer ID. So we should be able to move these classes
around without affecting the on-the-wire format.

I don't think we should shy away from changing *internal* packages. Worst
case is that we break someone's code and encourage them to ask for a public
API, which is a conversation we should be having anyway.

But to allay some concerns I took a look at what internal classes spring
data geode is using (on the master branch). I think all of these are in
geode-core. So if we leave the packages in geode-core alone, SDG will be
fine.

import org.apache.geode.cache.query.internal.ResultsBag;
import org.apache.geode.distributed.internal.DistributionConfig;
import org.apache.geode.distributed.internal.InternalDistributedSystem;
import org.apache.geode.distributed.internal.InternalLocator;
import org.apache.geode.internal.DistributionLocator;
import org.apache.geode.internal.GemFireVersion;
import org.apache.geode.internal.InternalDataSerializer;
import org.apache.geode.internal.cache.GemFireCacheImpl;
import org.apache.geode.internal.cache.LocalRegion;
import org.apache.geode.internal.cache.PartitionedRegion;
import org.apache.geode.internal.cache.PoolManagerImpl;
import org.apache.geode.internal.cache.UserSpecifiedRegionAttributes;
import org.apache.geode.internal.concurrent.ConcurrentHashSet;
import org.apache.geode.internal.datasource.ConfigProperty;
import org.apache.geode.internal.datasource.GemFireBasicDataSource;
import org.apache.geode.internal.jndi.JNDIInvoker;
import org.apache.geode.internal.security.shiro.GeodePermissionResolver;
import org.apache.geode.management.internal.cli.domain.RegionInformation;
import
org.apache.geode.management.internal.cli.functions.GetRegionsFunction;
import org.apache.geode.management.internal.security.ResourcePermissions;
import org.apache.geode.pdx.internal.PdxInstanceEnum;
import org.apache.geode.pdx.internal.PdxInstanceFactoryImpl;


On Mon, Nov 26, 2018 at 3:24 PM Kirk Lund  wrote:


One problem about moving around internal classes is that Geode uses
(proprietary and Java-based) serialization on the wire instead of defining
a wire-format that isn't dependent on class names/structures/packages.

I for one would love to move to a real wire-format with a well-defined
protocol but that's probably a separate project in its own right.

Are you really convinced that we could move internal classes around without
breaking rolling upgrades, client-server backwards compatibility and
diskstore contents?

On Mon, Nov 19, 2018 at 5:21 PM Dan Smith  wrote:


I think we can actually fix most of these issues without geode-2.0. Most of
these are in internal packages, which means we can change the package of
these classes without breaking users. The only concerning one is
org.apache.geode.cache.util, which is a public package. However, the
AutoBalancer is actually still marked @Experimental, so we could
technically move that as well. Or maybe deprecate the old one and its
associated jar, and create a new jar with a reasonable package. JDK11 users
could just avoid the old jar.

I have been wondering for a while if we should just fold geode-cq and
geode-wan back into geode-core. These are not really properly separated
out, they are very tangled with core. The above package overlap kinda shows
that as well. But maybe going through the effort of renaming the packages
to make them java 11 modules would help improve the code anyway!

-Dan




On Mon, Nov 19, 2018 at 5:03 PM Owen Nichols  wrote:

To package Geode as Java 11 Jigsaw module(s), a major hurdle is the
requirement that packages cannot be split across modules.

If we simply map existing Geode jars to Modules, we have about 10 split
packages (see table below).  Any restructuring would potentially have to
wait until Geode 2.0.

A workaround would be to map existing packages into a small number of new
modules (e.g. geode-api and geode-internal).  Existing jars would remain
as-is.  Users making the transition to Java 11 w/ Jigsaw already need to
switch from CLASSPATH to MODULEPATH, so the inconvenience of a different
naming scheme for Geode-as-modules might be acceptable (however, once
module naming and organization is established, we may be stuck with it for
a long time).

Thoughts?

Is it even possible to rearrange all classes in each package below into a
single jar?  Is doing so desirable enough to defer Java 11 support until a
yet-unplanned Geode 2.0, or perhaps to drive such a release?

Or, is it satisfactory to fatten Geode releases to include one set of jars
for CLASSPATH use, plus a different set of jars for MODULEPATH use?


Package
Jar
org.apache.geode.cach

Core tenets of Geode

2018-11-28 Thread Udo Kohlmeyer

Hi there Geode dev's.

I'm starting to notice more and more discussions about proposed features 
or JIRA tickets where, imo, core Geode tenets are being violated. 
Initially I thought that Geode must be lacking core tenets to guide our 
decisions. BUT then I noticed that we do state them on the home page: 
http://geode.apache.org/


I would like to remind everyone working on Geode of the following tenets 
which Geode lives and dies by:


1. Performance
2. Consistency
3. Low Latency
4. High concurrency
5. Elastic scalability
6. Reliable transactions
7. Share-nothing architecture

The reason I am calling this out is that every decision we make, every 
piece of code we write, needs to meet and exceed (if possible) these 
tenets. IF a solution or feature violates ANY one of the tenets, that 
solution needs to be revised to meet these tenets.


I would like to suggest that in the future we add two more tenets:

1. Modular
2. Predictable

Imo, Geode has to be modular. A simple architecture where it is possible 
to easily replace modules of the system with more suitable (and greatly 
improved) successors.


As for */Predictable/*, Geode needs to be predictable in the following 
areas:


 * Latency
 * Error Handling
 * Service contracts

Any thoughts?

--Udo



Re: Core tenets of Geode

2018-11-28 Thread Udo Kohlmeyer

I think the most recent conversations that come to mind are:

 * DLQ for WAN Gateways
 * A current discussion about a bug regarding Tombstones, Entry removal
   and client/server inconsistencies.

All I am pointing out is, if one is working on any feature or bug and 
there is any doubt about direction to pick, going back to the core 
tenets is helpful to guide a solution.


--Udo


On 11/28/18 12:08, Jacob Barrett wrote:

Can you provide examples of discussions you believe violate these tenets?


On Nov 28, 2018, at 11:12 AM, Udo Kohlmeyer  wrote:

Hi there Geode dev's.

I'm starting to notice more and more discussions about proposed features or 
JIRA tickets where, imo, core Geode tenets are being violated. Initially I 
thought that Geode must be lacking core tenets to guide our decisions. BUT 
then I noticed that we do state them on the home page: http://geode.apache.org/

I would like to remind everyone working on Geode of the following tenets which 
Geode lives and dies by:

1. Performance
2. Consistency
3. Low Latency
4. High concurrency
5. Elastic scalability
6. Reliable transactions
7. Share-nothing architecture

The reason I am calling this out is that every decision we make, every piece 
of code we write, needs to meet and exceed (if possible) these tenets. IF a 
solution or feature violates ANY one of the tenets, that solution needs to 
be revised to meet these tenets.

I would like to suggest that in the future we add two more tenets:

1. Modular
2. Predictable

Imo, Geode has to be modular. A simple architecture where it is possible to 
easily replace modules of the system with more suitable (and greatly improved) 
successors.

As for */Predictable/*, Geode needs to be predictable in the following areas:

* Latency
* Error Handling
* Service contracts

Any thoughts?

--Udo





Re: Core tenets of Geode

2018-11-28 Thread Udo Kohlmeyer

John, I 100% agree with your statements!

We should add that to the list as well...

On 11/28/18 11:54, John Blum wrote:

1 more thing...

Yes, CORRECTNESS is not going to be the glamorous thing.  You cannot market
it because it is not sexy, not cool, it is simply expected as the bar to
entry.  No one notices when the system just works as it should.  You will
not get compliments, no thank you's.  It's just "Business As Usual".  Of
course, look out when things don't work!  You'll hear about those, even when
it is not your problem.  Anticipation is key, but not at the expense of over
engineering the problem.  It's a fine balance and an art to be learned.

Performance is one where it's fast one day but slow the next.  Correctness
does not take a day off.

If you can only optimize for 1 thing, opt to be correct.  Users will
silently be thanking you, especially during the holidays when they don't
have to worry about keeping the lights on at work.

+ $0.02 more,
-John


On Wed, Nov 28, 2018 at 11:42 AM John Blum  wrote:


There is only 1 thing I would add to this list is, above all others...

1. Correctness

Of course, this ought to go without saying, but do the right thing,
always, even at the peril of all the other things combined.  It is all too
easy to focus on the shiny things (e.g. performance, latency, high
concurrency), which have external factors (e.g. hardware failures, network
issues, resource limitations) beyond the control of the system itself.
Correctness has no external factors, especially if the system is designed
with resiliency/recovery in mind.  When in doubt, do the simplest thing,
less is more, write a test.  If it is hard to test, it is probably wrong,
poorly designed, or both.  I cannot count how many times I have been proven
wrong simply by writing 1 more test.  If you think it can happen, it will!

Finally, by way of example, I want to pick on "performance".  Ooh!  Ah!
Shiny!  I hate nothing more than premature optimization.  Unfortunately,
performance might have the most external factors out of any of the above
things and if you are not correct, then it really does not matter how fast
you go if you are going in the wrong direction/doing the wrong thing
faster.  You are just magnifying the problem.  Also don't confuse
consistency with correctness.  You can be consistently incorrect, too.

There are many instances where features/functions are NOT doing the right
thing.

$0.02,
-John




On Wed, Nov 28, 2018 at 11:12 AM Udo Kohlmeyer  wrote:


Hi there Geode dev's.

I'm starting to notice more and more discussions about proposed features
or JIRA tickets, where imo, core Geode tenets are being violated.
Initially I thought that Geode must be lacking core tenets, to guide our
decisions. BUT then I noticed that we do state the on the home page.
http://geode.apache.org/

I would like to remind everyone working on Geode of the following tenets
which Geode lives and dies by:

  1. Performance
  2. Consistency
  3. Low Latency
  4. High concurrency
  5. Elastic scalability
  6. Reliable transactions
  7. Share-nothing architecture

The reason I am calling this out, is that every decision we make, every
piece of code we write needs to meet and exceed (if possible) these
tenets. IF a solution or feature violates ANY one of the tenets, that is
solution needs to be revised to meet these tenets.

I would like to suggest that in the future we add two more tenets:

  1. Modular
  2. Predictable

Imo, Geode has to be modular. A simple architecture where it is possible
to easily replace modules of the system with more suitable (and greatly
improved) successors.

As for */Predictable/*, Geode needs to be predictable in the following
areas:

   * Latency
   * Error Handling
   * Service contracts

Any thoughts?

--Udo



--
-John
john.blum10101 (skype)







Geode Open PR's

2018-12-10 Thread Udo Kohlmeyer

Hi fellow Geode devs.

For all those that have PRs that have not yet been closed/merged, could 
you please attend to them. This is ONLY if one is not actively working 
on the tickets in question.


26 open PRs looks a little excessive to me. If there are some that 
require reviews, please respond to this mail with the PR number so that 
the devs know which to look at.


--Udo



JMH performance help regarding InternalDataSerializer.basicReadObject

2018-12-10 Thread Udo Kohlmeyer

Hi there Geode Devs,

I have a problem that needs some input...

In commit bb87f20a2317bb40e2218ec11d467591 I refactored 
InternalDataSerializer.basicReadObject from a chain of `if` statements to 
a `switch` statement, an improved, more readable/maintainable implementation.


This change was never benchmarked, given its obvious improvement over 
the previous implementation.


In a recent refactor, the reverse lookup (DSCodeHelper) and `switch` 
statement were replaced with a single O(1) lookup from `byte` to function.


In order to prove that this implementation is "better" or more 
efficient, the idea was to write a JMH benchmark to quantify this.


Whilst looking at this, I realized that there are around 70 different data 
types that this method can deserialize. In my, possibly flawed, thinking, 
I wanted to write a benchmark that exercises all of these types, BUT as 
you can imagine, this would be an insurmountable task.


Are there any recommendations on how to best achieve this task of 
effectively benchmarking this method, to prove or dispel the notion that 
a single lookup from byte->function is faster than byte->Enum and then 
switch(Enum)->function?


--Udo
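
(For what it's worth, below is a minimal JMH sketch of the kind of comparison being asked about. The dispatch tables are stand-ins, not the real InternalDataSerializer code; benchmarking a handful of representative DSCODE bytes, rather than all ~70 types, may be enough to answer the byte->function vs. byte->Enum->switch question.)

```
import java.util.concurrent.TimeUnit;
import java.util.function.IntUnaryOperator;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ReadDispatchBenchmark {

  // A few representative "DSCODE" bytes; extend as needed.
  @Param({"1", "17", "42", "63"})
  public byte code;

  // byte -> function: a single array lookup, no switch.
  private IntUnaryOperator[] table;

  @Setup
  public void setUp() {
    table = new IntUnaryOperator[64];
    for (int i = 0; i < table.length; i++) {
      final int v = i;
      table[i] = x -> x + v; // stand-in for "deserialize this type"
    }
  }

  @Benchmark
  public int byteToFunction() {
    return table[code].applyAsInt(7);
  }

  @Benchmark
  public int byteToEnumThenSwitch() {
    // stand-in for byte -> DSCODE enum -> switch dispatch
    switch (code % 4) {
      case 0: return 7;
      case 1: return 8;
      case 2: return 9;
      default: return 10;
    }
  }
}
```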



Re: Exploring using Kotlin for building the "Cluster Management Service"

2019-01-05 Thread Udo Kohlmeyer

Hi there Aditya,

Thank you for raising this topic. I've been thinking of using Kotlin for 
a VERY long time now.

I believe that Kotlin will provide us with many benefits beyond just 
`non-nullable` fields/parameters, such as immutability.


Conciseness of code is another benefit, as are the VERY strong functional 
features and much improved lambda support.


Imo, Kotlin is easy to pick up and learn and just as easy to read for a 
non-Kotlin developer. It is definitely a huge step up from Scala in terms 
of simplicity and "just pick up and code".


To address some of the points that were raised:

 * /When is it appropriate to use Kotlin/? - There is no definitive
   answer to this one... Always, never, only when you want to.. The
   interop with Java is second to none and having Kotlin function
   interweaved with Java code is simple and easy to do.
 * /Are there sufficient number of commiters with Kotlin experience to
   maintain the work/? - Kotlin has a strong growth in adoption. It all
   started with Android and JS, but it has definitely gained more
   traction in traditional Java applications as well. Imo, Kotlin is
   easy to read, easy to learn and definitely easy to maintain, not
   like Scala that requires some effort to learn and maintain.
 * /How will the introduction of Kotlin affect development exp and
   build times/? - There is little overhead to compile Kotlin code, add
   a few milliseconds here and there. As for development exp, from all
   developers I've interacted with, they've found Kotlin easy to deal
   with. I don't believe this would be a problem, unless you are averse
   to change and learning...
 * /Does this increase the learning curve for new committers/? - I
   don't believe this to be an issue. Given that we won't be
   exclusively switching over to Kotlin and would still support Java.
   Would it have an impact for committers to maintain existing Kotlin
   code, possibly yes. If you've never written Kotlin code before, it
   would be a learning moment, but not so steep that it cannot be
   mastered. In addition, any code that is committed will be reviewed
   and in true agile fashion, the code would be reviewed, commented on
   and possibly have some pointers on how to do things better, so it's
   a learning moment.

I believe that Kotlin can have a much larger impact. Kotlin Coroutines 
will help with writing async, concurrent code without really having to 
know and understand Java threading extensively.


Ktor, a simplified async communications framework.

RSocket-Kotlin, a reactive communication framework.


Of course Spring provides Kotlin support: Kofu, a simplified Kotlin DSL 
based configuration tool for Spring Boot; of course there is a Java 
equivalent, Jofu.


Kotlin.link is a collection of resources spanning everything from 
learning, libraries, frameworks, etc... all for Kotlin.


The final step of course would be Kotlin Native, which allows the 
compilation of "pure" Kotlin code for many different platforms using 
LLVM, thereby providing the ability to eventually have an application 
that can be compiled into native binaries for many different platforms, 
as described HERE.


So, in my mind, the question that needs to be answered is: can we afford 
not to use Kotlin?


--Udo


On 1/3/19 16:16, Aditya Anchuri wrote:

Bringing the discussion on the PR back to the dev list

An important point that was raised by Anthony was: "

Introducing a new language to the Geode project raises some questions that
we need to answer as a community before adopting this proposal:

- When is it appropriate to use Kotlin? When should we prefer Java?
- Are there a sufficient number of committers with Kotlin experience to
maintain the work over the long-term?
- How will the introduction of Kotlin affect the development experience
and build times?
- Does this increase the learning curve for new committers?

I think I would be more comfortable exploring this change in a submodule
rather than in geode-core. I would also like to see all the REST code move
to geode-web or geode-mgmt so that we can finally fix those broken
dependencies. Specifically we should aim to delete the `webJar` Gradle task
from geode-core."


I feel that some of these questions can be answered better by actually
introducing Kotlin into the codebase and seeing the results -- but
obviously there is a concern with how locked-in we get if we come to the
conclusion that Kotlin is not for us. We will look into introducing it as a
submodule that is a sibling to geo

Re: [Proposal] Adding Micrometer to Apache Geode

2019-01-15 Thread Udo Kohlmeyer

I agree with Jacob here. Great to see such great strides forward

+1 deprecating old Stats APIs

It would be good to see the new Micrometer stats have a logical 
grouping that makes it easier for users to search for metrics.

Does this mean that Geode will now support fully tagged/dimensional 
metrics? What about JMX?


--Udo

On 1/15/19 09:43, Jacob Barrett wrote:

I am good with this proposal as long as it includes the deprecation of all the 
current stats APIs and VSD such that Micrometer is the only go forward stats 
definition and collection API in 2.0.


On Jan 15, 2019, at 9:37 AM, Mark Hanson  wrote:

Hello All,

I would like to propose that we incorporate Micrometer into Geode to allow us 
to collect statistics and make them more easily available to external services 
should someone choose to implement support for that.

In some basic testing, it does not appear to have a significant impact on 
performance to hook this in. I think that in the long run it will really 
improve the visibility of the system's stats in real-time… This will also allow 
some extensibility for people who want to hook into their tracking 
infrastructure, such as New Relic or Data Dog, though we are not putting that 
in.


What do people think?

Thanks,
Mark




Re: [Proposal] adding a new type of PdxInstance

2019-01-15 Thread Udo Kohlmeyer

Darrel, thank you for this.

I would like to propose a counter-proposal.

Instead of introducing another PdxInstance type, why don't we improve 
the serialization framework itself? I know my proposal is most likely 
going to take a little more effort than adding a new type, but I believe 
it is less of a workaround.


My proposal is to have the PDX serialization configuration be a little 
more explicit, in the sense that a user can define serialization details 
down to the Region.Key or Region.Value level.


Why would we possibly have a "one size fits all" approach? Could one 
have a setup where serialization configuration is stored on a per-region 
basis? Maybe in some cases we want to deserialize the key and in some 
cases we don't want to. In some regions we want to leave the value in 
serialized form and in others we don't. The point is, why limit this to 
a single flag?


--Udo

On 1/15/19 10:17, Darrel Schneider wrote:

As part of GEODE-6272 we realized we need a way to use a PdxInstance as key
for a Region entry. The problem with the current PdxInstance behavior is
that in some members the key may be seen as a PdxInstance and in others
seen as an instance of a domain class. This inconsistency can lead to
problems, in particular with partitioned regions because of the key's hash
code being used to determine the bucket. You can read more about this here:
https://geode.apache.org/docs/guide/17/developing/data_serialization/using_pdx_region_entry_keys.html

What we want is a new type of PdxInstance that will never deserialize to a
domain class. It will always be a PdxInstance. This can safely be used as a
Region key since PdxInstance implements equals and hashCode. It can also be
used in other contexts when you just want some structured data with well
defined fields but never need to deserialize that data to a domain class.

We are trying to figure out what to call this new type of PdxInstance.
Currently the pull request for GEODE-6272 has them named as "stable"
because they do not change form; they are always a PdxInstance. Another
suggestion was not to name them but add a boolean parameter to the method
that creates a PdxInstanceFactory named "forcePDXEveryWhere". Internally we
have some code that has a boolean named "noDomainClass". I'd prefer we come
up with a name instead of using boolean parameters. In the Java world you
label fields that can't change "final" and in the object world you call
objects that can't change "immutable". Would either of these be better than
"stable"? Any other ideas for what we could calls this new type of
PdxInstance?
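
(For context, a rough sketch of how this might look from the user's side. The PdxInstanceFactory calls shown are the existing API; the commented-out `neverDeserialize()` call is purely hypothetical and stands in for whatever name this discussion settles on: "stable", "forcePdxEveryWhere", "setDeserializable(false)", etc.)

```
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.pdx.PdxInstance;
import org.apache.geode.pdx.PdxInstanceFactory;

public class PdxKeyExample {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().setPdxReadSerialized(true).create();

    // Today: "CustomerKey" is treated as a class name, so members that are
    // not configured with read-serialized=true may try to deserialize this
    // to a domain class, giving different hashCodes in different members.
    PdxInstanceFactory factory = cache.createPdxInstanceFactory("CustomerKey");
    factory.writeString("country", "US");
    factory.writeInt("customerId", 42);
    PdxInstance key = factory.create();

    // Proposed (name TBD, hypothetical): a PdxInstance that never
    // deserializes to a domain class, so equals/hashCode are stable in
    // every member and it is safe to use as a partitioned-region key.
    // PdxInstance key = cache.createPdxInstanceFactory("CustomerKey")
    //     .neverDeserialize()
    //     .writeString("country", "US")
    //     .writeInt("customerId", 42)
    //     .create();

    Region<Object, String> region =
        cache.<Object, String>createRegionFactory(RegionShortcut.PARTITION)
            .create("Customers");
    region.put(key, "some value");
  }
}
```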



Re: [Proposal] adding a new type of PdxInstance

2019-01-15 Thread Udo Kohlmeyer

Darrel to better understand what you are asking for:

I understand that you want the following:

byte[] -> PdxInstance

and not

byte[] -> PdxInstance -> Pojo

The `read-serialized=true` flag is a single flag that "rules them all", 
which does not make sense anymore and is seemingly a problem. The 
proposal seems to want a type that does not get deserialized into a POJO, 
regardless of the setting of `read-serialized=true`.



I don't believe we need another type that denotes the behavior of "don't 
deserialize into POJO".


Creating a type JUST for a behavior is not helpful in any manner. The 
opposite would actually be true: you would now have to push that logic 
into the code:


```

if (object instanceof StablePdxInstance) {
    // never deserialize; hand the PdxInstance back as-is
    return object;
} else {
    return SerializationFramework.deserialize(object);
}

```

If you want the system NOT to deserialize the PdxInstance to a POJO, then 
the behavior comes from the serializer.

Adding serialization configuration on a per Region basis is not that 
crazy and will solve the problem you are trying to address.


It might be a little more effort, but something that can be more easily 
maintained. What the end goal really should be is to have all 
serialization logic live only in the serialization framework and not 
anywhere else in the code. So if you want to have a region key of type 
PdxInstance that does NOT get deserialized into a POJO (or at least does 
not try), then have the serializer know that. So on a per-region basis, 
tell the serializer that it does not need to deserialize this key into a 
POJO. I would hate to see more code that says "if (object instanceof 
StablePdxInstance)"...


--Udo

On 1/15/19 13:21, Dan Smith wrote:

If I understand this right, you are talking about a way to create a
PdxInstance that has no corresponding java class. How about just a
RegionService.createPdxInstanceFactory() method that doesn't take a
classname, and therefore has no corresponding java class? It seems a
PdxInstances without a class is a more fundamental PdxInstance. A
PdxInstance with a java classname on it is just an extension of the
classless version.

I agree that what Udo is talking about - giving the user better control of
*when* their value is deserialized to a java object - is also valuable, but
a separate feature.

-Dan

On Tue, Jan 15, 2019 at 1:09 PM Darrel Schneider 
wrote:


Even before the JSON pdx support we had internal support for a PdxInstance
that deserializes as a PdxInstance.
This is just adding an external api for that already existing internal
feature. So it is pretty simple to do if we can figure out how to name it.


On Tue, Jan 15, 2019 at 11:18 AM Galen O'Sullivan 
wrote:


I suspect Udo is remembering something we both had to deal with, which is
that the lack of values to get/put PDXInstances on Regions makes some
patterns difficult. In internal code, we have to set some thread-locals

to

get serialized values out, and in general, I think that setting
pdx-read-serialized is a violation of the contract you'd expect from the
type signature of get, put, etc. Having a separate API for serialized
objects, and possibly region-level configuration, makes a lot more sense.
You could even have the non-PDX get fail on regions that are set to only
use PDX-serialized objects for everything.

We already have something like a PdxInstance that always deserializes to

a

PdxInstance -- have a look at the __GEMFIRE_JSON mess that we use for

JSON.

However you end up doing the new PDXInstance stuff, I strongly suggest
using the new solution for JSON objects.

-Galen

On Tue, Jan 15, 2019 at 10:49 AM Darrel Schneider 
I like the idea of adding support to the region configuration that lets
users control how it stores the data. But even if we did that, and you

are

correct that it would be much more work, I don't think it would address
this issue or remove the value of a PdxInstance that always

deserializes

to

a PdxInstance. So I'd like this proposal to stay focused on PdxInstance

and

not get side tracked. PdxInstances can be used outside of regions (for
example arguments to functions).

I'd like to see a separate proposal about being able to configure how a
region stores its data. I could be wrong, but I think that proposal

would

focus on the values, not the keys. Storing keys as serialized data is
tricky because you need to come up with an equals and hashCode and if

those

are going to be done based on a sequence of serialized bytes then you
really need to understand your serialization code and make sure that
"equal" objects always have the same serialized form.


On Tue, Jan 15, 2019 at 10:38 AM Udo Kohlmeyer  wrote:


Darrel, thank you for this.

I would like to propose a counter-proposal.

Instead of introducing another PDXInstance type, why don't we improve
the serialization framework itself? I know my proposal is most likely
going to take a li

Re: Preventing new build warnings

2019-01-15 Thread Udo Kohlmeyer
So, to reduce the number of new warnings, are we then going to 
standardize on JDK versions? I.e., we only build with JDK 8 build 192 and 
JDK 11 build 03, because changes in the JDK can introduce warnings.

I'm all for reducing warnings, but they are warnings. Don't think we 
need to error, or break on them.


-1 for break build on warnings.

--Udo

On 1/15/19 13:28, Dan Smith wrote:

Sounds good. +1 to failing the build if new warnings are introduced.

-Dan

On Tue, Jan 15, 2019 at 12:59 PM Galen O'Sullivan 
wrote:


I'm for failing CI on warnings. It would be nice to reduce or eliminate our
existing build warnings as well.

Thanks,
Galen


On Tue, Jan 15, 2019 at 12:33 PM Peter Tran  wrote:


Hello!

I've noticed that there is no mechanism by which we prevent new PRs from
introducing new build warnings. In our PR template we ask people to self
report that they have a "clean build" but nothing more to ensure we're

not

adding new warnings.

Has there been an initiative to address this in the past? Would it be too
restrictive if CI fails if new warnings are introduced in a PR?

Thanks
--
Peter Tran



Re: [Proposal] Adding Micrometer to Apache Geode

2019-01-15 Thread Udo Kohlmeyer

So...

imo, what is a dimension and what is a metric?

Well, to me a metric is a value that we want to measure across all the 
different permutations of that metric. The metric has to be generic in 
the sense that it exposes no detail about the value it is measuring. 
Metrics are like dry oats: flavorless, bland, colorless, but yet useful. 
e.g. *region.operation.count* just describes the count of all 
operations that have been made against a region.


Dimensions provide "color" to the metric. i.e 
*region.operation.count*... could have the following metrics:


 * Availability zone *optional*
 * distributed system id (unique DS ID, limit 255)
 * server name (options are endless here)
 * region name (the options are endless here)
 * region type (partition,replicate,normal,replicate-proxy)
 * operation type (put, get, invalidate,destroy

What we have is still a single metric, "*region.operation.count*" and 
this metric will work... Anyone can read it and know what it is 
measuring. The "slicing and dicing" comes from the color that is 
provided by the dimensions and the combinations of any -> all of them.


Now, dimensions will change from metric to metric. i.e all the 
dimensions for regions, might not all be applicable for other metrics... 
but that goes without saying...
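
(As a concrete, hedged illustration of one metric carrying several dimensions, using the plain Micrometer API; the meter and tag names below are examples only, not a proposed naming scheme for Geode.)

```
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class RegionOperationMetricExample {
  public static void main(String[] args) {
    MeterRegistry registry = new SimpleMeterRegistry();

    // One generic metric name, many dimensions ("tags" in Micrometer terms).
    Counter putCount = Counter.builder("region.operation.count")
        .tag("ds.id", "1")                 // distributed system id
        .tag("server.name", "server-42")
        .tag("region.name", "customers")
        .tag("region.type", "partition")
        .tag("operation.type", "put")
        .register(registry);

    putCount.increment();

    // A backend (Prometheus, Datadog, ...) can then group or filter on any
    // combination of tags while the metric name stays generic.
    double count = registry.get("region.operation.count")
        .tags("operation.type", "put")
        .counter()
        .count();
    System.out.println(count);
  }
}
```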


I hope that helps.

--Udo

On 1/15/19 13:44, Nicholas Vallely wrote:

FYI

I spoke with Udo about tagging specifically, not necessarily hierarchy 
or naming.  He said that adding tags that are unique, don't change 
very often, and that you might want to group on seemed like the best 
course of action. So for instance we discussed IP-PORT and he 
mentioned that this might be a bad one to use because it could change 
as a VM goes up/down and wouldn't be able to show a trend or be able 
to easily group on in the same way you would get with a ServerName.  
Other than that one, everything else that I have in the spreadsheet 
meets this criterion, he also mentioned 'Region Type' as something 
else we could tag.


Nick

On Tue, Jan 15, 2019 at 11:58 AM Dale Emery <dem...@pivotal.io> wrote:


Hi Udo and all,

> On Jan 15, 2019, at 10:06 AM, Udo Kohlmeyer <u...@apache.org> wrote:
>
> It would be good to see the new Micrometer stats have a logical
grouping, that makes it easier for users to search for metrics.

Do you know of any useful guidelines or conventions for creating a
hierarchy of metrics, and criteria for evaluating the goodness of
the hierarchy?

And for which details to represent in the meter name hierarchy vs
tags/dimensions?

Dale

—
Dale Emery
dem...@pivotal.io



Re: [Proposal] adding a new type of PdxInstance

2019-01-15 Thread Udo Kohlmeyer
Bruce, this was not aimed at Darrel to do. I was just counter-proposing 
that I prefer a different approach to solving the problem.


And yes, all new features are starting to become big changes. I still 
think that the proposal that Darrel has made, could be a byproduct of a 
"Serialization Framework" overhaul.


Maybe a clean, modular approach to the serialization mechanism with some 
configuration capability.


--Udo

On 1/15/19 14:38, Bruce Schuchardt wrote:
While I like the idea of a non-static & more flexible serialization 
service I don't think it's reasonable to foist that onto Darrel 
because it's a great deal of work.



On 1/15/19 2:29 PM, Darrel Schneider wrote:
Dan, we still want a "class name" but it no longer has to be an actual 
name of a java class. It can now be just a logical "type name". For two 
PdxInstances to be equal they need to have the same class name.

So we really do just want a flag that says: never deserialize this 
PdxInstance. Anil suggested a "setDeserializable(boolean)" and 
"isDeserializable()" in place of stable.

On Tue, Jan 15, 2019 at 1:21 PM Dan Smith  wrote:


If I understand this right, you are talking about a way to create a
PdxInstance that has no corresponding java class. How about just a
RegionService.createPdxInstanceFactory() method that doesn't take a
classname, and therefore has no corresponding java class? It seems a
PdxInstances without a class is a more fundamental PdxInstance. A
PdxInstance with a java classname on it is just an extension of the
classless version.

I agree that what Udo is talking about - giving the user better control of
*when* their value is deserialized to a java object - is also valuable, but

a separate feature.

-Dan

On Tue, Jan 15, 2019 at 1:09 PM Darrel Schneider 


wrote:


Even before the JSON pdx support we had internal support for a

PdxInstance

that deserializes as a PdxInstance.
This is just adding an external api for that already existing internal
feature. So it is pretty simple to do if we can figure out how to name

it.


On Tue, Jan 15, 2019 at 11:18 AM Galen O'Sullivan 


wrote:

I suspect Udo is remembering something we both had to deal with, 
which

is

that the lack of values to get/put PDXInstances on Regions make some
patterns difficult. In internal code, we have to set some 
thread-locals

to

get serialized values out, and in general, I think that setting
pdx-read-serialized is a violation of the contract you'd expect from

the

type signature of get, put, etc. Having a separate API for serialized
objects, and possibly region-level configuration, makes a lot more

sense.

You could even have the non-PDX get fail on regions that are set to

only

use PDX-serialized objects for everything.

We already have something like a PdxInstance that always deserializes

to

a

PdxInstance -- have a look at the __GEMFIRE_JSON mess that we use for

JSON.
However you end up doing the new PDXInstance stuff, I strongly 
suggest

using the new solution for JSON objects.

-Galen

On Tue, Jan 15, 2019 at 10:49 AM Darrel Schneider <

dschnei...@pivotal.io

wrote:


I like the idea of adding support to the region configuration that

lets

users control how it stores the data. But even if we did that, and

you

are

correct that it would be much more work, I don't think it would

address

this issue or remove the value of a PdxInstance that always

deserializes

to

a PdxInstance. So I'd like this proposal to stay focused on

PdxInstance

and

not get side tracked. PdxInstances can be used outside of regions

(for

example arguments to functions).

I'd like to see a separate proposal about being able to configure

how a

region stores its data. I could be wrong, but I think that proposal

would
focus on the values, not the keys. Storing keys as serialized 
data is

tricky because you need to come up with a equals and hashCode and if

those
are going to be done based on a sequence of serialized bytes then 
you

really need to understand your serialization code and make sure that
"equal" objects always have the same serialized form.


On Tue, Jan 15, 2019 at 10:38 AM Udo Kohlmeyer 

wrote:

Darrel, thank you for this.

I would like to propose a counter-proposal.

Instead of introducing another PDXInstance type, why don't we

improve

the serialization framework itself? I know my proposal is most

likely

going to take a little more effort than adding a new type, but I

believe

it is less of a work around.

MY proposal is to have the PDX serialization configuration be a

little

more explicit. In the sense that a user can define serialization

details

down to the Region.Key or Region.Value level.

Why would we possibly have a "one size fits all" approach? Could

one

have a setup where serialization configuration is stored on a per

region

basis. Maybe in some cases we want to deserialize the key and in

som

Re: Merging GEODE-6424 into release/1.9.0

2019-02-20 Thread Udo Kohlmeyer

+1, Go,Go,GO!!

On 2/20/19 12:24, Jacob Barrett wrote:

Anyone have issue with merging https://issues.apache.org/jira/browse/GEODE-6424 
 into release/1.9.0?

Without it we will have to wait for the next release before we can have 
meaningful baselines for function and query benchmarks. Without this fix 
baselines will continue to vary by as much as 45% making them useless.

It’s also a big performance boost. Concurrent local cache gets see about a 50% 
bump in throughput due to reduced contention for stats, even with timed stats 
enabled. Other operations haven’t been benchmarked but should see similar 
improvements where stats were the bottleneck.

-Jake




Re: Dependency review for release 1.9.0

2019-02-28 Thread Udo Kohlmeyer

I wonder if we should be looking at shaded jars for the extensions.

--Udo

On 2/28/19 09:47, Charlie Black wrote:

Hopefully, we are thinking about classpath of the server and lazily adding
these jars only when a feature is turned on.

On Thu, Feb 28, 2019 at 9:45 AM Dan Smith  wrote:


I see that geo, grumpy-core, and commons math came from adding geospatial
support to redis -

https://github.com/apache/geode/commit/7bf02251fd047cb1cf575c01b80a9807108618da

-Dan

On Thu, Feb 28, 2019 at 9:41 AM Anthony Baker  wrote:


Looks a number of the new dependencies came in transitively with the

guava

version bump.


On Feb 27, 2019, at 5:32 PM, Anthony Baker  wrote:

I was reviewing the release branch and noticed a number of new

dependencies have been added since the last release.  When you add a new
dependency, please review and follow the project license guide [1].  In
particular, update the LICENSE file in geode-assembly/src/main/dist
depending on the license type.

Currently we need to update the LICENSE file with the additional

MIT/BSD/CDDL dependencies.  We may also need to update NOTICE files.
There’s also a version conflict with multiple versions of Jackson in use
(2.9.6 / 2.9.8).

@Sai - these need to be fixed on release/1.9.0

Here’s the list of additions:

   animal-sniffer-annotations-1.17.jar
   checker-qual-2.5.2.jar
   commons-math3-3.2.jar
   error_prone_annotations-2.2.0.jar
   failureaccess-1.0.jar
   geo-0.7.1.jar
   grumpy-core-0.2.2.jar
   istack-commons-runtime-2.2.jar
   j2objc-annotations-1.1.jar
   javax.activation-1.2.0.jar
   javax.activation-api-1.2.0.jar
   jsr305-3.0.2.jar
    listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar

Removed:

   activation-1.1.1
   jaxb-core-2.2.11.jar

Anthony

[1]

https://cwiki.apache.org/confluence/display/GEODE/License+Guide+for+Contributors

<


https://cwiki.apache.org/confluence/display/GEODE/License+Guide+for+Contributors








Re: [DISCUSS] removal of geode-json module

2019-03-15 Thread Udo Kohlmeyer
IMO, I think it would better serve the project if we were to remove it 
completely and replace it with Jackson.


On 3/15/19 14:06, Bruce Schuchardt wrote:
I've removed use of geode-json in non-test code and I'd like to remove 
it completely and just add a dependency on an org.json package in a 
Maven repository.  The only one available is org.json though, so 
here's the question: Is it acceptable to use org.json with its silly 
license (see below) if we're not including it in our distribution?



   Copyright (c) 2002 JSON.org

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation files
   (the "Software"), to deal in the Software without restriction,
   including without limitation the rights to use, copy, modify, merge,
   publish, distribute, sublicense, and/or sell copies of the Software,
   and to permit persons to whom the Software is furnished to do so,
   subject to the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   _*The Software shall be used for Good, not Evil.*_

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.

Note: We can't use openjson, which is what geode-json is based on, 
because it's packaged as com.github.openjson instead of org.json.





Re: [DISCUSS] TTL setting on WAN

2019-03-20 Thread Udo Kohlmeyer
-1, I don't believe this is a feature that we should support. IF a 
client is experiencing slow WAN replication and users only care about 
the "latest" data, then maybe the user should use "conflation".


With a TTL model, we are messing with our consistency tenet. I am NOT 
in support of a setting that can cause inconsistency.

Dead-letter queues is another area that WILL cause data/site 
inconsistency. I think we really have to take a step back, think about 
WHAT tenets are important to GEODE and then act accordingly.


--Udo

On 3/20/19 10:46, Bruce Schuchardt wrote:
IDK Anil, we'll figure that out in the implementation.  I was thinking 
it would be in the dispatch threads, so if distribution is needed that 
will happen as it does now.  I'm hopeful that this won't perturb the 
code too much.


One thing that was brought up to me in person was the Dead Letter 
Queue 
 
initiative that seems to have stalled.  That seems like a very similar 
idea though it's reacting to errors coming from the receiving side and 
not a local condition.  I like the callback, stats, gfsh and cluster 
config integration in that write-up & think they might be useful 
here.  There is also relevant discussion in that page about things 
like PDX registrations.  Is that initiative going to move forward at 
some point or is it off the boards?


On 3/20/19 10:25 AM, Anilkumar Gingade wrote:

+1. Will the expiration (destroy) be applied on local queues, or will the
expiration be replicated (for both serial and parallel)?

-Anil.


On Wed, Mar 20, 2019 at 8:59 AM Bruce Schuchardt 


wrote:


We've seen situations where the receiving side of a WAN gateway is slow
to accept data or is not accepting any data.  This can cause queues to
fill up on the sending side.  If disk-overflow is being used this can
even lead to an outage.  Some users are concerned more with the latest
data and don't really care if old data is thrown away in this
situation.  They may have set a TTL on their Regions and would like to
be able to do the same thing with their GatewaySenders.

With that in mind I'd like to add this method to GatewaySenderFactory:

/**
 * Sets the timeToLive expiration attribute for queue entries for the next
 * {@code GatewaySender} created.
 *
 * @param timeToLive the timeToLive ExpirationAttributes for entries in this region
 * @return a reference to this GatewaySenderFactory object
 * @throws IllegalArgumentException if timeToLive is null
 * @see RegionFactory#setEntryTimeToLive
 */
public GatewaySenderFactory setEntryTimeToLive(ExpirationAttributes timeToLive);

The exact implementation may not be the same as for Regions since we
probably want to expire the oldest entries first and make sure we do so
in their order in the queue.
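
A rough sketch of how an application might wire up the proposed attribute, assuming the method is added as described above. setEntryTimeToLive is the proposed, not-yet-existing call; ExpirationAttributes, ExpirationAction and the factory methods shown are existing Geode APIs, and the sender id and remote system id are made up for illustration:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.ExpirationAction;
import org.apache.geode.cache.ExpirationAttributes;
import org.apache.geode.cache.wan.GatewaySender;
import org.apache.geode.cache.wan.GatewaySenderFactory;

public class TtlSenderExample {
  public static GatewaySender createSender(Cache cache) {
    GatewaySenderFactory factory = cache.createGatewaySenderFactory();
    factory.setParallel(true);
    // Proposed (hypothetical) call: queue entries older than 60 seconds are destroyed.
    factory.setEntryTimeToLive(new ExpirationAttributes(60, ExpirationAction.DESTROY));
    // "2" is the remote distributed system id; the sender id is arbitrary.
    return factory.create("sender-to-remote-site", 2);
  }
}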






Re: [DISCUSS] TTL setting on WAN

2019-03-20 Thread Udo Kohlmeyer
If all that the customer is concerned about is that the receiving side 
gets the "latest" data, conflation is definitely the best bet. How do we 
classify old? The only classification that I have of old (in this 
context) is that there is a newer version of a data entry. This 
classification is not time based, as a TTL approach would require, but 
state based.


What is the use of WAN replication if we are proposing to only replicate 
*some* of the data? How would the clusters ever know if they are 
in-sync? I believe we are just opening a door that we will never be able 
to close again and that will cause us endless issues.


--Udo

On 3/20/19 10:56, Bruce Schuchardt wrote:
Udo, in the cases I've looked at the user is okay with inconsistency 
because they don't really care about the old data. They're most 
interested in getting the newest data and keeping the sending site 
from going down.  I guess the docs for TTL should make it very clear 
that it will cause inconsistencies.


Conflation does seem like an appropriate thing to try if the same keys 
are being updated - I'll do some investigation and see why it wasn't 
appropriate.



On 3/20/19 10:51 AM, Udo Kohlmeyer wrote:
-1, I don't believe this is a feature that we should support. IF a 
client is experiencing slow WAN replication and users only care about 
the "latest" data, then maybe the user should use "conflation".


With a TTL model, we are messing with our consistency tenet. I am 
NOT in support of a setting that can cause inconsistency.


Dead-letter queues are another area that WILL cause data/site 
inconsistency. I think we really have to take a step back, think 
about WHAT tenets are important to GEODE and then act accordingly.


--Udo

On 3/20/19 10:46, Bruce Schuchardt wrote:
IDK Anil, we'll figure that out in the implementation.  I was 
thinking it would be in the dispatch threads, so if distribution is 
needed that will happen as it does now.  I'm hopeful that this won't 
perturb the code too much.


One thing that was brought up to me in person was the Dead Letter 
Queue 
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
initiative that seems to have stalled.  That seems like a very 
similar idea though it's reacting to errors coming from the 
receiving side and not a local condition.  I like the callback, 
stats, gfsh and cluster config integration in that write-up & think 
they might be useful here.  There is also relevant discussion in 
that page about things like PDX registrations.  Is that initiative 
going to move forward at some point or is it off the boards?


On 3/20/19 10:25 AM, Anilkumar Gingade wrote:

+1. Will the expiration (destroy) be applied on local queues or the
expiration will be replicated (for both serial and parallel)?

-Anil.


On Wed, Mar 20, 2019 at 8:59 AM Bruce Schuchardt wrote:

We've seen situations where the receiving side of a WAN gateway is slow
to accept data or is not accepting any data.  This can cause queues to
fill up on the sending side.  If disk-overflow is being used this can
even lead to an outage.  Some users are concerned more with the latest
data and don't really care if old data is thrown away in this
situation.  They may have set a TTL on their Regions and would like to
be able to do the same thing with their GatewaySenders.

With that in mind I'd like to add this method to GatewaySenderFactory:


/**
 * Sets the timeToLive expiration attribute for queue entries for the next
 * {@code GatewaySender} created.
 *
 * @param timeToLive the timeToLive ExpirationAttributes for entries in this region
 * @return a reference to this GatewaySenderFactory object
 * @throws IllegalArgumentException if timeToLive is null
 * @see RegionFactory#setEntryTimeToLive
 */
public GatewaySenderFactory setEntryTimeToLive(ExpirationAttributes timeToLive);

The exact implementation may not be the same as for Regions since we
probably want to expire the oldest entries first and make sure we do so
in their order in the queue.






Re: [DISCUSS] Move or remove org.apache.geode.admin

2019-04-02 Thread Udo Kohlmeyer

+1 on removing the JMXManager stuff for Geode 2.0 release.

But this "hidden" feature, what is that? Is this something that we would 
like to support or is this something that would be replaced with our 
existing efforts in the ConfigurationService and Metrics area.


On 4/2/19 17:04, Dan Smith wrote:

Hi devs,

The org.apache.geode.admin package has been deprecated for about 7 years
[1].

I'd like to remove it, or at least move it to a separate module. I actually
started some work to move it to a separate module [2]. However in the
course of this process, I've found that this module has extremely little
test coverage, so I'm not confident in the move. For example, it has a
whole JMX manager feature (different than our current JMX manager) which
does not appear to have any tests. This JMX manager feature won't work as
shipped unless you find and add some mx4j jars to your classpath.

I think the best course of action is probably to remove it entirely.
However, this does bring up a couple of questions:

1) Is it ok in general to remove deprecated API in a non X.0 release (eg
geode 1.10 instead of geode 2.0)?

2) How about in this case, when this feature has been deprecated for so
long, and may or may not work?

-Dan

[1]
https://geode.apache.org/releases/latest/javadoc/org/apache/geode/admin/package-summary.html
[2] Draft PR for moving the org.apache.geode.admin to a separate module -
https://github.com/apache/geode/pull/3392



Re: GEODE-6662 for 1.9.0

2019-04-17 Thread Udo Kohlmeyer

Unless this is a critical issue I'd vote -1 for including this.

The process to release 1.9 has already been started and should be closed 
to anything other than critical CVE's.


--Udo

On 4/17/19 11:30, Bruce Schuchardt wrote:
I'd like to include the fix for this memory leak that Darrel found. 
It's new in 1.9 and the fix is pretty simple - I'm putting up a PR now.




Re: Backwards compatibility issue with JSONFormatter

2019-05-14 Thread Udo Kohlmeyer
How did a change to the signature of a public API slip through the review 
process?


Well done Gester for finding this!!

--Udo

On 5/14/19 10:40, Dan Smith wrote:

I think the changes for GEODE-6327 broke backwards compatibility in
JSONFormatter with the change from fromJSON(String jsonString) to
fromJSON(String jsonString, String... identityFields)

Adding an additional varargs parameter to the method breaks code that was
compiled against the non-varargs version. I think we need to overload the
method instead.

Thanks to Gester for discovering this with his test!

-Dan
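
For illustration, a minimal sketch of the overloading approach Dan describes. These are plain static methods standing in for the real org.apache.geode.pdx.JSONFormatter, and the delegation shown is an assumption, not the actual implementation:

import org.apache.geode.pdx.PdxInstance;

public class JsonFormatterSketch {

  // The pre-GEODE-6327 signature stays untouched, so code compiled against it keeps linking.
  public static PdxInstance fromJSON(String jsonString) {
    return fromJSON(jsonString, new String[0]);
  }

  // New overload carries the identity fields introduced by GEODE-6327.
  public static PdxInstance fromJSON(String jsonString, String... identityFields) {
    // ... parse jsonString into a PdxInstance, honoring identityFields ...
    throw new UnsupportedOperationException("sketch only");
  }
}

With the overload in place, old callers keep resolving to the one-argument method at link time, while new callers can pass identity fields explicitly.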



Re: [DISCUSS] is it time to make Windows tests gating?

2019-05-16 Thread Udo Kohlmeyer
I think we need to make sure our Windows tests get to green... If we 
make them gating then we will never release, but at the same time we would be 
motivated to fix them, in order to release.


Maybe they run once every day... to at least start getting an idea of health

On 5/15/19 18:28, Owen Nichols wrote:

For a very long time we’ve had Windows tests in the main pipeline (hidden away, 
not in the default view), but the pipeline proceeds to publish regardless of 
whether Windows tests fail or even run at all.

Now seems like a good time to review whether to:
a) treat Windows tests as first-class tests and prevent the pipeline from 
proceeding if any test fails on Windows
b) keep as-is
c) change Windows tests to trigger only once a week rather than on every commit, if they 
are going to remain "informational only"

One disadvantage to making Windows tests gating is that they currently take 
much longer to run (around 5 hours, vs 2 hours for Linux tests).


Re: Changing external methods to no longer throw UnsupportedOperationException

2019-05-23 Thread Udo Kohlmeyer

+1 to implementing this method.

There is no plausible reason NOT to implement this.

--Udo

On 5/23/19 12:44, Dan Smith wrote:

+1 to implementing this method in a minor release.

I'm with Jake on this one. Every bug we fix changes the behavior of the
product in some small way. This seems like a behavior change for the better
- I can't picture a use case where someone is actually *relying* on this
method throwing UnsupportedOperationException.

I suppose someone might write a test that this feature is not supported -
but it seems like the only reason to do that would be to detect when geode
starts supporting getStatistics, so they'd probably be happy to see
getStatistics start working!


-Dan

On Thu, May 23, 2019 at 10:31 AM Anilkumar Gingade wrote:


I agree, this may not look like a use case that anyone would be using or
depending on. Going by the backward compatibility requirement, this would be
breaking that contract.
Again, based on the scenario and usecases, there could be exceptions. I am
trying to see if the versioning support that's used to keep the backward
compatibility contract can be used here.

On Thu, May 23, 2019 at 10:17 AM Jacob Barrett wrote:


But what application is going to legitimately call this method and expect
that it throws an exception? What would be the function of that usage?

If you assume that calling this method under these conditions had no value
and would therefore never have been called, then one could argue that
implementing this method is adding a feature. It adds a case where one
could legitimately call this method under new conditions.


On May 23, 2019, at 10:06 AM, Anilkumar Gingade wrote:

As this changes the behavior on the existing older application; it seems
to break the backward compatibility requirements.
We use client versions to keep the contracts/behavior the same for older
clients; can we do the same here.

-Anil.


On Thu, May 23, 2019 at 8:33 AM Darrel Schneider <dschnei...@pivotal.io> wrote:


Is it okay, in a minor release, to implement Region.getStatistics for
partitioned regions? See GEODE-2685. The current behavior is for it to
always throw UnsupportedOperationException. I doubt that any application is
depending on that behavior but it could be. I know we have seen changes
like this in the past break the tests of other products that are layered on
top of Geode.
Should this type of change be considered one that breaks backwards
compatibility?
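
A rough sketch of the kind of caller this affects, assuming Region.getStatistics currently throws UnsupportedOperationException for partitioned regions as described in GEODE-2685; the helper and its fallback behavior are hypothetical:

import org.apache.geode.cache.CacheStatistics;
import org.apache.geode.cache.Region;

public class RegionStatsProbe {
  // Written against today's behavior; after the proposed change the catch
  // block simply stops being reached for partitioned regions.
  public static long lastAccessedOrZero(Region<?, ?> region) {
    try {
      CacheStatistics stats = region.getStatistics();
      return stats.getLastAccessedTime();
    } catch (UnsupportedOperationException e) {
      return 0L; // thrown today for partitioned regions (GEODE-2685)
    }
  }
}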





Re: [DISCUSS] Criteria for PMC, committers

2019-05-31 Thread Udo Kohlmeyer
@Owen Some interesting thoughts and proposals. None of which I think we 
could easily implement, but some that I would LOVE to have.


I also apologize for restating many things that Helena has already said, 
but it is taking me a long time to craft this email...


Jake has raised a concern that I share. In too many cases contributors 
are made committers on the basis of "mateship" and less on tangible 
contributions. I remember cases where proposed committers were rejected 
for not having been active enough on mailing lists or even PRs. Yet in 
other cases a lack of contributions on mailing lists is overshadowed by the 
sheer volume of PRs submitted, resulting in the granting of committer 
(and PMC) rights. In most successful nomination cases the nominee sat 
in the same office, same team, had breakfast/lunch/coffee and socialized 
with many other Geode committers/PMC members, which automatically evokes 
a "mateship" bias. "The person is a nice person and I enjoy working with 
them" very often overshadows the lack of dev/user list contributions or 
PRs.


Having guidelines based on code quality and technical knowledge is only 
part of the criteria. Activity in the community (mailing lists) is the 
other part that should be considered. All of these are factors in 
determining a committer.


As a PMC member I also ask myself the questions:

 * "Do I as a PMC member trust that all future commits from a nominee
   to be of the same high standard going forward?"
 * "Will this nominee continue to be active on mailing lists and the
   community at large."
 * "Am I convinced that the PR's that I am seeing is really that of the
   nominee and not that of a team effort of pair programming with
   another committer/pmc member".
   
(https://github.com/apache/geode/pull/3650,https://github.com/apache/geode/pull/3573).


How am I to judge the quality of work of the nominee when multiple 
developers were part of it? I have witnessed this in many cases, where 
developers submit PRs under their name, but the main body of work was 
predominantly done by their more experienced pair. That is synthetic bolstering 
of PR numbers to use as "fact" when it comes to committer nomination. In 
these cases I look to the mailing lists to gain insight into that 
nominee's understanding of the project and problem area, rather than whether 
they can write compliant code.


Being a PMC member, imo, is another role that should be evaluated 
after having become a committer. Activity on mailing lists and reviewing of 
PRs (yes, a committer's responsibility, AND imo an evaluation criterion) 
show commitment to the project. Commitment to the project results in 
rights to become a PMC member. Once again there are no metrics (or exact 
numbers) that we can attribute to this, BUT subjectively everyone can 
evaluate the contributions and gauge if they feel the nominee is strong 
enough. As a side note, anonymization would be awesome here.. taking away 
the face and familiarity and only evaluating on contributions would be 
SOOO much simpler. But alas, too much work and most likely complete and 
utter overkill.


@Owen, you are correct, a single -1 is a strong signal that something is 
not as clear or apparent to a single PMC member as it is to many others. 
This is why a reason is required. This also allows space for other PMC 
members to present their views on the nominee, which might not be as 
apparent to others. In some cases I feel maybe reasons should accompany 
a +1. I mean, why would a -1 need justification and a +1 does not. THAT 
+1 justification might just sway a -1 to change their mind to a "0" or 
even a "+1".


I also want to highlight another elephant in the room. Out of 100 
(registered) committers and 50 (registered) pmc members, only 6 have 
commented on this thread. Pause for thought.


--Udo

On 5/31/19 02:41, Owen Nichols wrote:

I appreciate the concern about bias/cronyism.  If having some criteria will 
“level the playing field”, then let’s discuss what those criteria would be.

However, in voting for a committer, a single -1 carries more weight than all 
the +1’s in the world.  A -1 also requires explanation.  This system of checks 
and balances should be sufficient to combat most implicit bias or voting blocs. 
 If it looks like someone is getting +1 votes for the wrong reasons, it only 
takes one person to call it out by casting a -1.  If you have a reservation, 
don’t be afraid to speak up!

Keep in mind what we are voting for when we make someone a committer.  We are 
not voting “have they satisfied this or that”.  We are voting “can we trust 
them to be a good steward of the codebase”.  Apache’s guidelines are 
intentionally vague, because to be any more specific risks applying an 
artificial definition of “trust”.

Suggesting some signals a voter might take into account to judge 
trustworthiness is very different from prescribing an explicit metric that 
every voter must use.  If we want to go that route, let's follow the DMV model 
and give ea

Re: [DISCUSS] require reviews before merging a PR

2019-05-31 Thread Udo Kohlmeyer

Thank you Dan,

Some guidance around how we can go about reviewing the code.

I want to remind ALL 
committers..https://www.apache.org/dev/new-committers-guide.html#committer-responsibilities


It states "/Each committer has a responsibility to monitor the changes 
made for potential issues, both coding and legal."/


I cannot see a reason to have any reviewers on trivial changes (spelling, typos).

Beyond that, everything should have at least a few reviewers. 1 does not 
an opinion (or reviewed code) make, and I must agree with the statement 
that Owen made: 3 reviewers.


Yes, it is a real pita, as it takes dedication from ppl to review code 
and it takes away from our daily lives and the limited time we have to 
dedicate to it. Yet, I cannot understand how out of 70 committers (I'm 
using a 70% fudge factor) we cannot get 3 reviews on correctness, style and 
potential bugs. In addition, committers CAN nominate reviewers on PRs. 
In addition to the nomination of reviewers, I would advocate that no 
more than 66% of the reviewers on a PR come from a potentially biased team 
(given that I know many committers are employed by the same 
company).


@Naba I agree, there is more work now. I now as a committer actually 
have to review the code of a fellow committer. BUT we all know that the 
space and code we work on is complex. Concurrent, distributed systems 
are not easy and being too close to the problem could cause blinders to 
issues which someone more removed could spot. The opposite is also true, 
too removed will only evaluate on basic criteria and might not spot 
issues. Regardless of this, we have many troops that can do work.. and 1 
code review 1-2 times a week is not going to kill us.


I would also like to request of everyone, that when submitting a PR, 
keep an eye on it. Track it, ping the @dev channel if no progress is 
made. Oversights can happen and in some cases emails can be filtered out 
with new PR's or comments made on one's own PR.


--Udo

On 5/31/19 11:33, Dan Smith wrote:

As others pointed out - I think it's really important to remember that
people and community are much more important than process. That is a key
part of "The Apache Way" - it's not a set very specific rules, but more of
a philosophy of building an open community. I think this page has a good
take on the apache way - http://theapacheway.com/.

The page I linked also mentions "Getting good enough consensus is often
better than voting." Apache is about consensus building, *not* about
voting. Ideally, the only things we should formally vote on are
irreversible decisions such as a release or new committer.

Regarding code reviews, I think having a really strict process implies that
we don't trust our committers. Rather than that, maybe have some guidelines
for committers who aren't sure how much of a review to get. Here's what I
feel like I've been trying to follow:
1. Fixing a typo, broken spotless, etc. - just push it!
2. Straightforward change - get at least one reviewer. Maybe a second
author on the commit should count here?
3. Technically challenging, complicated change - get multiple reviewers
4. New Public API, large features - Write up a wiki page and have a
discussion on the dev list to get feedback

For all of these, if someone rejects your PR/feature, fix the issues or
talk with them. A good rule of thumb is if you are frustrated or annoyed by
what they said - talk to them one-on-one first instead of answering
directly in the PR/thread. If you are really pissed, wait a day and then talk
to them one-on-one.

-Dan

On Fri, May 31, 2019 at 11:00 AM Helena Bales  wrote:


Thanks for the filter, Jinmei!

+1 to Naba and Bruce.

I will add that I think we should have a formal policy of getting *a*
review (and more for large changes), and that PRs that are merged without
one should include (in the PR) a comment providing a justification for this
merge, so that committers can review the reasoning later on if needed. This
has the benefits of being low impact on our current workflow, but also
increasing transparency, accountability, and thoughtfulness around the
proposed changes and their impact.

On Fri, May 31, 2019 at 10:42 AM Jinmei Liao  wrote:


I was told that the screenshot that I sent earlier got filtered out by the dev
list. Basically, the filter puts "notificati...@github.com" in the "From"
section, and puts "review_reques...@noreply.github.com" in the "Doesn't
have" section of the form.

On Fri, May 31, 2019 at 10:36 AM Anthony Baker wrote:



On May 31, 2019, at 10:01 AM, Owen Nichols wrote:

We chose to make Geode an Apache open source project for a reason. If
we no longer wish to embrace The Apache Way <
https://www.apache.org/theapacheway/index.html>, perhaps we should
reconsider.

I strongly disagree with the assertion that we are not following the
Apache Way because we aren’t doing RTC.  Please take a look around other
ASF communities and compare that to our approach.  I think you’ll see a lot
of similarities 

Re: what is the best way to update a geode pull request

2019-05-31 Thread Udo Kohlmeyer

I think a single (squashed if required) commit for the start of a PR.
ANY changes due to comments should be reflected as separate commits on 
the PR.
Once approved, that PR should squash all commits into 1 commit with a 
message that describes what was done,etc...


In the past I've heard developers ask for the following:

1x commit to address the problem/feature
1x commit with refactors.

I must be honest, but I have yet to find 1 developer that keeps a list of 
all changes they want to refactor separate from the bug/feature 
code. OR, better stated, I have yet to find a case where this was sustainable AND 
productive.


It would definitely discourage the community from refactoring, because who 
keeps track of all the code they would like to refactor and then changes the 
code after completing the task? Sounds like double if not triple work to me.


Also, how would this work if there are comments on the code, i.e. better 
approaches, "have you thought about...", or "that might lead to errors"? Are 
these changes applied to the core/base commit or against the refactor or 
both?


Also, a good practice for all is to commit (and push) code on a daily 
basis on their branches. "WIP - completed x,y,z. need to address: a,g,r" 
comments are not uncommon and safeguard against loss of equipment, 
illness of members,etc...


My personal coding style is, I make changes to code as I pass through 
it.. (change variable names when they don't make sense, refactor 
methods, etc...) So, I rarely have the time to retrace my steps and 
post-factum re-apply my refactored changes.


I think the checklist could be reworded, but as with everything in this 
project, we take it in the spirit it was intended and don't follow it to 
the letter. Would the rewording of the checklist enhance/improve/make 
the intent clearer or not? I don't care either way, but when someone 
merges a PR into a main branch with many intermediate commits, it just 
clutters the commit log with "contextually irrelevant" noise (it also 
makes it harder to follow all the changes that are in a branch). What if 
I want to roll back a commit in develop, how many does one have to 
revert? 1 is always simpler than 2-10... Also.. cherry picking becomes 
simpler...


--Udo

On 5/31/19 13:43, Nabarun Nag wrote:

In my opinion, I am okay with multiple commits within a PR.
But please do squash them to a single commit when it is pushed to develop.
This helps us a ton if it is a single commit.
- bisect operations are easier when it is a single commit during major
failure analysis.
- cherrypick is easier when it is one commit.


I also don’t prefer merge commit messages:
  -  none of the big ASF projects do it.
  -  visualizing on tools is a bit skewed.
  -  it is difficult to analyze failures.


I would like to reiterate Dan’s statement on emphasis on people and empathy
over blanket process and rules.

Regards
Naba



On Fri, May 31, 2019 at 1:32 PM Helena Bales  wrote:


+1. I would guess that it is the checklist as part of the PR that is
confusing people.

The other reason that history gets rewritten is when force pushing after a
rebase. While fast-forwarding is necessary on occasion, this can be
accomplished without rewriting history by using a merge.

As part of our document on making PRs, we should include instructions on
how to handle the situation where fast-forwarding is necessary, explicitly
discourage the use of merges and force-pushes once a PR has been opened,
and some guidelines regarding the appropriate number of commits when the PR
is initially opened. Once we have these guidelines, it would be helpful to
link to them from the PR checklist that we currently have, and rework the
checklist so that it is in line with our desired process.

On Fri, May 31, 2019 at 1:20 PM Darrel Schneider 
wrote:


Something I have noticed is that often when I have requested changes be
made to a pull request, the changes are force pushed as a single
commit to the pr. I would actually prefer that the changes show up as a new
commit on the pr instead of everything being rebased into one commit. That
makes the history of the pr easier to follow and makes it easy to see what
has changed since the previous review. What do others think? Have we done
something that makes contributors think the pull request has to be a single
commit? I know the initial pull request is supposed to be, but from then on
I'd prefer that we wait to squash when we merge it to develop.



Re: what is the best way to update a geode pull request

2019-05-31 Thread Udo Kohlmeyer
If we are concerned about the single line that can break the product, 
then our testing has failed us on so many levels that there is no hope.


But I am looking forward to seeing how long one can sustain the "refactor -> 
commit -> make changes required to fulfill JIRA -> commit -> manual 
merge"...


Also, who reviews the refactor? Because even that can introduce 
unintentional bugs... same effort as a single commit. In a single commit, if 
the refactor has made the code cleaner, clearer and simpler, maybe 1 commit 
is easier to follow.


Once again... mileage will vary for EVERYONE.

On 5/31/19 14:31, Jacob Barrett wrote:



On May 31, 2019, at 2:23 PM, Udo Kohlmeyer  wrote:
I must be honest, but I am yet to find 1 developer that keeps a list of all 
changes they want to be refactored separate from the bug/feature code. OR 
better stated I am yet to find where this was sustainable AND productive.

Challenge accepted and complete… I have done this on a few PRs recently. I have 
taken to cleaning up the code first, leaving that as a separate commit on the 
PR, then fixing the code in question as another commit. The trick has come when 
merging the PR I wanted to squash all feedback commits. This was a manual task 
outside of GitHub. Lately I have taken to filing a PR for the area I am cleaning 
up, merging that, then a PR for the fix and merging that. I haven’t decided 
which I find more tedious yet.

I do know that it sucks to find the one line change buried in a 1000 line 
cleanup task. They should always be separate commits into develop.

-Jake



Re: IntelliJ inspect git hooks

2019-06-03 Thread Udo Kohlmeyer
@Peter, I understand that we at least don't want to add to the 
existing pain, BUT I don't know whether any plugin or IntelliJ can 
determine if it was made worse.


I think just cleaning up the warnings should not be too hard... That way 
it can be simple reasoning if it has been made worse.


--Udo

On 6/3/19 11:21, Peter Tran wrote:

Thanks Jake

Is it configurable to warn you if you're adding new warnings? Right now
it's tough to clean up all warnings in a file you touch but at least we
should not be making things worse (at least stop the bleeding so to say)

On Mon, Jun 3, 2019 at 12:55 PM Jacob Barrett  wrote:


If you’re already using IntelliJ then if you commit with IntelliJ you can
enable the commit analysis and it will tell you if there are warnings
before you commit.

-jake



On Jun 3, 2019, at 9:16 AM, Peter Tran  wrote:

Hi All,

I was wondering if anyone has any git hooks setup to ensure your commit is
not introducing new intelliJ warnings. I was gonna play around with this
but was hoping maybe someone has already solved the problem.

Thank you!


Re: IntelliJ inspect git hooks

2019-06-03 Thread Udo Kohlmeyer
Boy scouts.. yeah... @kirk has tried a few times to remind people not to 
add to the crazy... but seemingly it just keeps coming back...


So short of, "you've added warnings" and rejecting, this is not going to 
be easily solved.


--Udo

On 6/3/19 12:44, Peter Tran wrote:

@Udo Kohlmeyer 

Yeah totally agree - make everything better than you found it. Boy scouts
rule? If we all followed it we'd systematically remove all the warnings
from the code base (at least files we still touch).


On Mon, Jun 3, 2019 at 2:30 PM Udo Kohlmeyer  wrote:


@Peter, I understand that we at least don't want to add to the
existing pain, BUT I don't know whether any plugin or IntelliJ can
determine if it was made worse.

I think just cleaning up the warnings should not be too hard... That way
it can be simple reasoning if it has been made worse.

--Udo

On 6/3/19 11:21, Peter Tran wrote:

Thanks Jake

Is it configurable to warn you if you're adding new warnings? Right now
it's tough to clean up all warnings in a file you touch but at least we
should not be making things worse (at least stop the bleeding so to say)

On Mon, Jun 3, 2019 at 12:55 PM Jacob Barrett wrote:

If you’re already using IntelliJ then if you commit with IntelliJ you can
enable the commit analysis and it will tell you if there are warnings
before you commit.

-jake



On Jun 3, 2019, at 9:16 AM, Peter Tran  wrote:

Hi All,

I was wondering if anyone has any git hooks setup to ensure your commit is
not introducing new intelliJ warnings. I was gonna play around with this
but was hoping maybe someone has already solved the problem.

Thank you!


Re: [DISCUSS] require reviews before merging a PR

2019-06-05 Thread Udo Kohlmeyer
@Kirk, I totally understand the pain that you speak of. I too agree that 
every line of changed code should have a test confirming that behavior 
was not changed.


I don't believe that we need to go as far as revoking committer rights 
and reviewing each committer again, BUT it would be AMAZING if, out of 
our 100 committers, 80% of them were more active in PR reviews, on 
mailing lists and, in the end, active on the project outside of their 
focus area.


I do want to remind all Geode committers, it IS your responsibility to 
be part of the PR review cycle. I will hold myself just as accountable 
to this as I hold every committer to it, as I've been just as lazy 
as the rest of them.


BUT

The reality is:

1. Geode code is HUGELY complex and NOT as test complete as we'd like
2. In the past single small changes have caused failures that were
   completely unforeseen by anyone
3. In the past commits with single reviewers have caused backward
   compatibility issues which were only caught by chance in unrelated
   testing.
4. There are 100 committers on Geode, and we keep on arguing that it is
   hard to get PR's reviewed and that is why it is ok to have only 1
   reviewer per PR.
5. There seems to be a majority (unstated) opinion of: "why change, it
   has been working for us so far." (I call it unstated, because being
   silent means you agree with the status quo)
6. By requiring only 1 reviewer on code submissions, we are possibly
   creating areas of the code only understood by a few.

IF, we as a project, have decided that all code shall enter only through 
the flow of PR, then why not extend the QA cycle a little by requiring 
more eyes. Lazy consensus, is as stated, lazy and would only work in a 
project where the levels of complexity and size are not as high as 
Geode's. In addition, with PR submissions, we have admitted that we are 
human and could make mistakes, and that in an already complex environment, 
even at the best of our ability, we can get it wrong.


Now, there are commits that really do not require 3 pairs of eyes, 
because spelling mistakes and typos don't need consensus. But any time 
code logic was amended, this needs to be reviewed.


I have seen different approach to code submissions:

 * The submitter of the PR is NOT the committer of the PR. This task is
   the responsibility of another committer(s) to review, approve and
   finally merge in.
 * A smaller number of committers with a higher number of contributors.
   Yes, this does create a bottleneck, but it promotes a sense of pride
   and responsibility that the individual feels. Possibly a greater
   understanding of the target module is promoted through this approach
   as well.

Now, I don't say we as a project should follow these strict or 
restricting approaches, but from my perspective, if we as a project 
argue that we struggle to find 3 reviewers out of 100, then there are 
bigger problems in the project than we anticipated. It is not a lack of 
trust in our committers, to me it is a sense of pride that I want other 
committers to confirm that I've delivered code to the high standard that 
we want to be known for. Whilst it is painful to go through the process, 
if done correctly it is beneficial to all involved, as differing 
opinions and approaches can be shared and all can learn from them.


In addition, I have personally stumbled upon a few PR's which, upon 
review, I found to be lacking in the areas of code best practices and/or 
design.


I fully support the notion of 3 reviewers per PR. I'm also going to take 
it one step further: in the list of reviewers, there should be at least 1 
reviewer that is not part of the team, as this might drive an unbiased view 
of the code and approach. I would also like to encourage ALL committers 
to review code outside of their focus area. This will only promote a 
broader understanding of the project codebase. I also support the notion 
of pair/mobbing reviews, if a reviewer does not understand the problem 
area enough to effectively review, OR where the solution is not apparent.


--Udo

On 6/4/19 16:49, Kirk Lund wrote:

I'm -1 for requiring N reviews before merging a commit.

Overall, I support Lazy Consensus. If I post a PR that fixes the flakiness
in a test, the precheckin jobs prove it, and it sits there for 2 weeks
without reviews, then I favor merging it in at that point without any
reviews. I'm not going to chase people around or spam the dev list over and
over asking for reviews. Nothing in the Apache Way says you have to do
reviews before committing -- some projects prefer "commit then review"
instead of "review then commit". You can always look at the code someone
changed and you can always change it further or revert it.

I think if we don't trust our committers then we have a bigger systemic
problem that becoming more strict about PR reviews is not going to fix.

Overall, I also favor pairing/mobbing over reviews. Without being there
during the work, a reviewer lacks the context to understan

Re: [DISCUSS] require reviews before merging a PR

2019-06-05 Thread Udo Kohlmeyer
Alexander, thank you for your response. And yes, change is uncomfortable 
and in some cases more reviewers would not have caught issues. BUT, more 
people would have seen the code, maybe become more familiar with it, etc...


I don't say don't trust committers, as I am one. But I also know that 
mistakes are made regardless of intent. If we trust committers, why 
review at all? Just commit... and we might catch a problem, we might not.


--Udo

On 6/5/19 11:20, Alexander Murmann wrote:

Udo, I agree with many of the pains you are addressing, but am pessimistic
that having more reviewers will solve them.

You are absolutely correct in calling out that the code is ugly complex and
missing coverage. The best way to address this is to clean up the code and
improve coverage. You say yourself "In the past single small changes have
caused failures that were completely unforeseen by anyone". I don't think
more eyeballs will go a long way in making someone see complex bugs
introduced by seemingly safe changes.

I also am concerned that introducing a hurdle like this will not make
committers more excited to review PRs with care, but rather might lead to
less care. It would be great if our committers were more passionate about
PR reviews and did them more often, but forcing it rarely accomplishes that
goal.

I'd rather see us trust our committers to decide how much review they
require to feel comfortable about their work and use the time saved to
address the root of the problem (accidental complexity & lack of test
coverage)

On Wed, Jun 5, 2019 at 11:03 AM Udo Kohlmeyer  wrote:


@Kirk, I totally understand the pain that you speak of. I too agree that
every line of changed code should have a test confirming that behavior
was not changed.

I don't believe that we need to go as far as revoking committer rights
and reviewing each committer again, BUT it would be AMAZING that out of
our 100 committers, 80% of them would be more active in PR reviews,
mailing lists and in the end active on the project outside of their
focus area.

I do want to remind all Geode committers, it IS your responsibility to
be part of the PR review cycle. I will hold myself just as accountable
to this than what I hold every committer to, as I've been just as lazy
as the rest of them.

BUT

The reality is:

  1. Geode code is HUGELY complex and NOT as test complete as we'd like
  2. In the past single small changes have caused failures that were
 completely unforeseen by anyone
  3. In the past commits with single reviewers have caused backward
 compatibility issues which were only caught by chance in unrelated
 testing.
  4. There are 100 committers on Geode, and we keep on arguing that it is
 hard to get PR's reviewed and that is why it is ok to have only 1
 reviewer per PR.
  5. There seems to be a majority (unstated) opinion of: "why change, it
 has been working for us so far." (I call it unstated, because being
 silent means you agree with the status quo)
  6. By requiring only 1 reviewer on code submissions, we are possibly
 creating areas of the code only understood by a few.

IF, we as a project, have decided that all code shall enter only through
the flow of PR, then why not extend the QA cycle a little by requiring
more eyes. Lazy consensus, is as stated, lazy and would only work in a
project where the levels of complexity and size are not as high as
Geode's. In addition, with PR submissions, we have admitted that we are
human and could make mistakes and in an already complex environment and
to the best of our ability get it wrong.

Now, there are commits that really do not require 3 pairs of eyes,
because spelling mistakes and typos don't need consensus. But any time
code logic was amended, this needs to be reviewed.

I have seen different approach to code submissions:

   * The submitter of the PR is NOT the committer of the PR. This task is
 the responsibility of another committer(s) to review, approve and
 finally merge in.
   * Smaller amount of committers with higher numbers of contributors.
 Yes, this does create a bottleneck, but it promotes a sense of pride
 and responsibility that individual feels. Possibly a greater
 understanding of the target module is promoted through this approach
 as well.

Now, I don't say we as a project should follow these strict or
restricting approaches, but from my perspective, if we as a project
argue that we struggle to find 3 reviewers out of 100, then there are
bigger problems in the project than we anticipated. It is not a lack of
trust in our committers, to me it is a sense of pride that I want other
committers to confirm that I've delivered code to the high standard that
we want to be known for. Whilst it is painful to go through the process,
but if done correctly it is beneficial to all involved, as differing
opinions and approaches can be shared and all 

Re: [DISCUSS] changing geode 32-bit counter stats to 64-bit

2019-06-07 Thread Udo Kohlmeyer
+1, if the LongAdders have already been added and the additional memory usage 
has already been dealt with, then adding the accessors does not really 
make a difference anymore.


On 6/7/19 13:47, Jacob Barrett wrote:

I like this!

I’d go ahead and change all the usage of the int methods to the long methods. 
I’d deprecate the int methods to make it very clear.

If some consumer is using the int methods they will still work with the same 
rollover issues but perhaps with the deprecated warning they will update their 
code. Without the warning they may never know.

-jake



On Jun 7, 2019, at 1:32 PM, Darrel Schneider  wrote:

We have had a couple of tickets that have problems with 32-bit counters
changing too fast and causing them to be hard to understand when they wrap
around (see GEODE-6425 and GEODE-6834). We have also had problems with some
stats being broken because they were changing the 32-bit one when they
should have been changing the 64-bit one (see GEODE-6776).
The current implementation has one array of values for the 32-bit stats and
another array of values for the 64-bit stats. We use indexes into these
arrays when changing a stat. Given an int "id" used to index these arrays,
we can not tell if we should be indexing the 32-bit array or 64-bit array.
The first 32-bit stat for a type will have an id of 0 and the first 64-bit
stat on that type will also have an id of 0. But our current implementation
has the same type of value in both arrays (LongAdder
see: StripedStatisticsImpl fields intAdders and longAdders). So if we
simply change our implementation to have a single array, then the ids will
no longer be overloaded.

Changing this internal implementation also improves backward compatibility.
Currently when we change one of our counters from 32-bit to 64-bit it is
possible we would break existing applications that are using the Statistics
interface. It has methods on it that allow stats to be read given an id:
getInt(int id) and getLong(int id). If they are currently reading a 32-bit
stat using getInt(id) and we change that stat to be 64-bit (like we did in
GEODE-6425) then they will now be reading the wrong stat when they call
getInt(id). If they used getInt(String) or getInt(StatisticDescriptor) they
will be okay. But if we make this simple change to have 32-bit and 64-bit
stats stored in the same array then getInt(id) will do the right thing when
we change a stat from 32-bit to 64-bit.

Does anyone see a problem with making this change?

After we do it I think we should change all of our counters to 64-bit since
they are always stored in a LongAdder anyway and we should deprecate the
methods that support 32-bit stats.
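
A toy sketch of the single-array idea (not the actual StripedStatisticsImpl, and the class and method names are made up), just to show why getInt(id) and getLong(id) can no longer disagree about which stat an id refers to once both widths share one backing array:

import java.util.concurrent.atomic.LongAdder;

public class SingleArrayStats {
  // One LongAdder slot per descriptor, whether it was declared as int or long,
  // so an id can never point at the "wrong" array.
  private final LongAdder[] values;

  public SingleArrayStats(int statCount) {
    values = new LongAdder[statCount];
    for (int i = 0; i < statCount; i++) {
      values[i] = new LongAdder();
    }
  }

  public void incLong(int id, long delta) {
    values[id].add(delta);
  }

  // getInt keeps reading the same slot even if the descriptor is later widened
  // to 64-bit; it just narrows the underlying value.
  public int getInt(int id) {
    return (int) values[id].sum();
  }

  public long getLong(int id) {
    return values[id].sum();
  }
}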


Re: Unnecessary uses of final on local variables

2019-06-18 Thread Udo Kohlmeyer

+1 to everything Bill said

+1 to what Anthony recommended

I think that in many cases we should not confuse scope with variable 
reassignment.


And we should not confuse variable reassignment with methods that effect 
change on a local object. Stateful objects don't have to be immutable 
and usually are not.. But good practices (wrt threading) dictate that we 
should prefer immutability over local object state changes.


I vote for Object fields to be final from construction and I vote for 
method parameters to be final. AND because I love Kotlin I vote for 
classes to be final and not open for extension without real cause.


But this is not what this thread is about

So.. imo, does it really matter if a developer was more cautious by 
adding final to all their local variable declarations? It is a little 
more defensive, but at the end of the day, it makes for better code... imo.


--Udo

On 6/18/19 11:48, Jinmei Liao wrote:

I agree with Murtuza, most finals on local variables and method parameters
are just noise to me. I only use "final" on these two situations:
1. to declare public static constants of immutable types (e.g. String,
Integer)
2. to prevent children from overriding a method.

But though I can't offer an example, I don't want to put out a blanket
statement saying that "final" on local variables is entirely unnecessary.
There must be a case where it could be useful, so even if we come to a
consensus that local variables should not have final on them, I don't think
using a static analysis tool to get rid of all of them is a good idea.

On Tue, Jun 18, 2019 at 11:14 AM Anthony Baker  wrote:


I’ll offer this alternative:  perhaps shorter method bodies obviate the
need for explicit final vars.

Anthony



On Jun 18, 2019, at 10:30 AM, Ernest Burghardt wrote:

+1 to auto-enforcement (if possible) post-consensus

On Tue, Jun 18, 2019 at 8:33 AM Murtuza Boxwala wrote:

final in Java does not guarantee immutability.  It would be AWESOME if it
did, but all it guarantees is that the variable cannot be reassigned. In
most cases the variable points to an object’s location (memory address), so
you can still call methods on it, e.g.

final Foo foo = new Foo();
foo.mutateState();

final variables like these are in no way thread safe. To make objects
immutable, the objects themselves need to follow a pattern that guarantees
that.  Something like the ValueObject <
https://martinfowler.com/bliki/ValueObject.html> pattern.
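
A small illustrative sketch of that distinction, using hypothetical classes rather than Geode code: final only pins the reference, while immutability comes from the class design itself.

public final class Money {            // ValueObject-style: final fields, no setters
  private final long cents;

  public Money(long cents) {
    this.cents = cents;
  }

  public Money plus(Money other) {    // a "change" produces a new instance
    return new Money(this.cents + other.cents);
  }

  public long cents() {
    return cents;
  }
}

class Counter {                       // mutable, even when held in a final variable
  private int count;

  void increment() {
    count++;
  }
}

class FinalDemo {
  void demo() {
    final Counter counter = new Counter();
    counter.increment();              // fine: the object mutates, the reference does not
    // counter = new Counter();       // would not compile: the reference is final

    final Money ten = new Money(1_000);
    final Money eleven = ten.plus(new Money(100)); // new object, both references stay final
  }
}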

Mutability may well be the enemy, but I don’t think this is the construct
that gets us much/if any closer.

In local variables and parameters final feels like noise to me, and in
fact may make things more difficult to reason about, if we start assuming
variables with final are thread safe.

But I may be missing something.  I am more curious to see how we come to
consensus on something like this, because the worst outcome from all this
will be to have some folks actively adding final and some actively removing
it, which will add noise to PRs and to the code.  And once we reach
consensus, how do we enforce something like this? ./gradlew spA?


On Jun 17, 2019, at 8:55 PM, Jacob Barrett wrote:

I am in camp final too. You could say `final boolean useFinal = true`.
For all the same reasons Bill stated below.

On Jun 17, 2019, at 5:33 PM, Bill Burcham wrote:

The final keyword is not redundant—quite the opposite—it's extremely valuable.

Local variables are not, in general, final, unless you declare them as
such. That being the case, it is not redundant to declare local variables
"final".

What the compiler will do for you, is _if_ it can ensure that a local
variable (or method parameter) is never modified (after initialization)
then that variable is treated as "effectively final". Variables that are
explicitly declared final, or are determined to be "effectively final", may
be referenced in lambdas. That's a nice thing.

I would like to offer a counter-recommendation: final should be the
default everywhere for fields, for method parameters (on classes, not on
interfaces), and for local variables.

Many benefits would accrue to us, should we adopt this default:

1. final fields must be initialized in a constructor and never mutated
again. This makes reasoning about those fields easier.

2. classes that have all their fields final are immutable and hence
easier to reason about: they can be passed between threads, for instance,
with no need to protect from races

3. final method parameters can never be mutated, making them easier to
reason about

4. final local variables can never be mutated, making them easier to
reason about

When final is the rule, non-final is the exception. Another way of
saying that is that when final is the rule, mutability is the exception.
That is as it should be. Mutability is the enemy.

I have turned on a couple IntelliJ settings that make this the default
for me. I encourage you to do the same:

First there are these two "C

Re: Unnecessary uses of final on local variables

2019-06-19 Thread Udo Kohlmeyer

I agree with Darrel, Bill has made some very compelling arguments.

I also add my vote of -1 to removing "noisy" final keywords from local 
variables.


I am VERY interested in understanding how the JVM would handle this, as 
final is a keyword that stops the reassignment of the variable with 
another value. It would never stop immutability concerns. This can only 
be achieved by following stricter rules regarding how Pojos are created 
and coding practices.


I believe it is better to be more secure and opt-out when not required, 
than trying to add-in being more secure (as you are more likely to miss 
things).


--Udo

On 6/19/19 08:41, Darrel Schneider wrote:

I find Bill's argument of using final by default on everything convincing
and removing it when you have something that needs to be changed.
It is not in keeping with my current practice but I'm willing to change.
So I vote -1 to not using "final" on local variables.


On Wed, Jun 19, 2019 at 7:29 AM Anthony Baker  wrote:


Just to confirm, the primary place where we make project decisions is on
the dev@geode list.  Thanks!

Anthony



On Jun 19, 2019, at 7:19 AM, Bill Burcham  wrote:

I feel that a lot more
conversation is needed, outside email. On the other hand, this mailing

list

is a fine place to deliberate. But to deliberate successfully on this
issue, via email, I think we'll have to break it down further, and we'll
have to be comfortable with the process taking a long time.




Re: [DISCUSS] Add a test dependency to geode-core - ArchUnit

2019-06-21 Thread Udo Kohlmeyer
I know that I'm missing the benefit of physically moving the code from 
the package into its own module.


Could you possibly explain to me what it is?

On 6/21/19 07:37, Murtuza Boxwala wrote:

I think that’s a really clever way to increment toward splitting geode-core 
into more modules. I am excited to see what it looks like 👍


On Jun 20, 2019, at 7:45 PM, Jacob Barrett  wrote:

Gotcha! Sounds good.


On Jun 20, 2019, at 4:35 PM, Dan Smith  wrote:

We don't have a membership gradle module, just a package. We're adding this
to geode-core.

For a little more context - we are thinking about refactoring membership
(and/or maybe some other pieces) into separate gradle modules - proposal
forthcoming! However, as a first step we need to untangle those pieces of
code from the rest of geode-core. Rather than creating some long lived
branch we can incrementally untangle the code a piece at a time, on
develop. Having a way to track progress and enforce the direction of
dependencies on the way to a separate gradle module will help with that.

-Dan


On Thu, Jun 20, 2019 at 4:23 PM Jacob Barrett  wrote:

Are you adding this dependency to just the membership module? I am cool
with that.


On Jun 20, 2019, at 2:39 PM, Dan Smith  wrote:

Hi all,

Bill, Ernie, and I would like to add a new (apache licensed) test
dependency to geode-core - https://github.com/TNG/ArchUnit. This is a tool
that lets you write tests that make assertions about the interdependencies
of your code - for example enforcing that package A does not depend on
package B.

Initially we intend to add some tests about what parts of the system the
org.apache.geode.distributed.internal.membership package depends on, with
an eye towards making that code more independently testable (proposal on
that coming soon!).

Does anyone have an issue with adding this test dependency?

-Dan
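
For readers unfamiliar with the library, roughly what such a test looks like with ArchUnit's fluent API. The concrete packages and the test class name below are only illustrative assumptions, not the actual rules being proposed:

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;
import org.junit.Test;

public class MembershipDependenciesJUnitTest {

  @Test
  public void membershipDoesNotDependOnCachePackage() {
    JavaClasses classes = new ClassFileImporter().importPackages("org.apache.geode");

    // Illustrative rule only; the real rules the authors add may differ.
    ArchRule rule = noClasses()
        .that().resideInAPackage("..distributed.internal.membership..")
        .should().dependOnClassesThat().resideInAPackage("..geode.cache..");

    rule.check(classes);
  }
}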




Re: [DISCUSS] Add a test dependency to geode-core - ArchUnit

2019-06-21 Thread Udo Kohlmeyer

Thank you for the response...

#1 - But isn't cyclically dependent package code a smell and poor practice, 
whilst at the same time a uni-directional dependency is preferred?


Soo... I think I see the benefit to be more that ArchUnit allows the 
untangling of code in a modular way WITHOUT a big bang approach of 
moving the code into modules and then having to be concerned about the 
fallout. But also it allows for the managing of package dependencies 
WITHOUT breaking the code out into different separate modules.


I really like ArchUnit :).. We should prioritize adoption :)

--Udo

On 6/21/19 12:48, Murtuza Boxwala wrote:

Two things come to mind:

1) uni-directional dependency
Packages can be dependent on each other, because the classes inside of them can 
use each other. e.g. let’s say package A has class A1 and class A2 and 
package B has class B1 and B2.  A1 can depend on B1, and B2 can depend on A2. 
Hence, the packages are dependent on each other.

Modules can only have uni-directional dependency. If Module A depends on Module 
B, then no class in Module B can reference a class in Module A.  This prevents 
tangling, i.e. spaghetti

2) Incremental compilation
This lack of tangling helps not only developers, but the compiler too.  In the 
packages example above, if I change any of the classes, all the code has to get 
recompiled because the dependency lines can go in any direction, and the 
compiler won’t attempt to optimize.  In the modules case, if Module A changes, 
Module B will not recompile, because the dependency guarantees that nothing 
about Module B could have been affected.


On Jun 21, 2019, at 2:14 PM, Udo Kohlmeyer  wrote:

I know that I'm missing the benefit of physically moving the code from the 
package into its own module.

Could you possibly explain to me what it is?

On 6/21/19 07:37, Murtuza Boxwala wrote:

I think that’s a really clever way to increment toward splitting geode-core 
into more modules. I am excited to see what it looks like 👍


On Jun 20, 2019, at 7:45 PM, Jacob Barrett  wrote:

Gotcha! Sounds good.


On Jun 20, 2019, at 4:35 PM, Dan Smith  wrote:

We don't have a membership gradle module, just a package. We're adding this
to geode-core.

For a little more context - we are thinking about refactoring membership
(and/or maybe some other pieces) into separate gradle modules - proposal
forthcoming! However, as a first step we need to untangle those pieces of
code from the rest of geode-core. Rather than creating some long lived
branch we can incrementally untangle the code a piece at a time, on
develop. Having a way to track progress and enforce the direction of
dependencies on the way to a separate gradle module will help with that.

-Dan


On Thu, Jun 20, 2019 at 4:23 PM Jacob Barrett  wrote:

Are you adding this dependency to just the membership module? I am cool
with that.


On Jun 20, 2019, at 2:39 PM, Dan Smith  wrote:

Hi all,

Bill, Ernie, and I would like to add a new (apache licensed) test
dependency to geode-core - https://github.com/TNG/ArchUnit. This is a tool
that lets you write tests that make assertions about the interdependencies
of your code - for example enforcing that package A does not depend on
package B.

Initially we intend to add some tests about what parts of the system the
org.apache.geode.distributed.internal.membership package depends on, with
an eye towards making that code more independently testable (proposal on
that coming soon!).

Does anyone have an issue with adding this test dependency?

-Dan




Re: [PROPOSAL]: Improve OQL Method Invocation Security

2019-06-24 Thread Udo Kohlmeyer

+1, Count me in

On 6/24/19 13:06, Juan José Ramos wrote:

Hey Jake,

Sure, I guess we could do a live session if there's enough interest after
people have reviewed the proposal.
Best regards.

On Mon, Jun 24, 2019 at 4:17 PM Jacob Barrett  wrote:




On Jun 24, 2019, at 11:49 AM, Juan José Ramos  wrote:

  I’d rather get feedback in any way and aggregate everything on my own than
maybe not getting anything because I'm explicitly limiting the options to
provide it.

Dealers choice so both it is!

Could you also consider a public live session on some medium, like Zoom,
where you can walk through the proposal and take live feedback and
questions?

Thanks,
Jake





Re: [DISCUSS] RFC 0: Lightweight RFC Process

2019-07-15 Thread Udo Kohlmeyer

@Dan,

Thank you for your first attempt at this.

Maybe we should rename "Active" to "Completed". "Active" to me 
means that we are currently working on them, rather than having completed 
them. I don't view these proposals as features that can be toggled 
on/off (or active/inactive).


Also, I disagree with the approach that proposals that are not actively 
worked on are "Dropped", which in itself is incorrect as well. Maybe 
there should be an "Icebox" area that lists a set of proposals that 
have not yet been approved, but also not yet rejected.


I think it is ok to have an "Icebox" of proposals that lists areas of 
improvement we want to target but for which no concrete proposal has yet 
been submitted. Modularity comes to mind. It is not that we don't 
want to do it, it is just that there is no proposal that has been 
accepted/completed.


--Udo

On 7/12/19 12:57, Dan Smith wrote:

Following up on this, I took a stab at organizing our old proposals into
the buckets on the wiki. We now have:

Under Discussion - Draft and In Discussion proposals
In Development - proposals under active development
Active - Proposals that are completely implemented
Dropped - Proposals that were not approved or development stalled out.

If I moved your proposal to "Dropped" erroneously, please feel free to move
it back! I moved things there that did not appear to have been implemented
or have any recent activity.

I put a few things in "Unknown State." If you know what state these
proposals are in, please move them!

https://cwiki.apache.org/confluence/display/GEODE/Project+Proposals+and+Specifications

On Wed, Jun 26, 2019 at 11:20 AM Alexander Murmann 
wrote:


Given discussion here and previous discussion on the PR, I consider this
proposal approved and updated its state accordingly.

I also incorporated Dan's suggestion of moving deprecated proposals and
added a reference to the new process at the top of the Project Proposals
and Specifications page
(https://cwiki.apache.org/confluence/display/GEODE/Project+Proposals+and+Specifications).

Thank you all for your great feedback throughout this process!

On Tue, Jun 25, 2019 at 10:07 AM Dan Smith  wrote:


Will moving the page around on the wiki result in dead links to the

draft

version?


No. If you use the share button in the wiki, you get a permanent link to
the page. Even if you just copy the URL from the address bar it doesn't
include the folder the page is in.

-Dan



--
Alexander J. Murmann
(650) 283-1933



Re: [DISCUSS] Time to cut Geode 1.10.0?

2019-07-22 Thread Udo Kohlmeyer
I don't think we need to wait for this, as there has been no RFC process 
followed.


--Udo

On 7/22/19 3:38 PM, Jinmei Liao wrote:

Work is still being planned to move the cluster management rest service
under an experimental version flag and use a geode property to turn it
on/off. I would say we are ready to cut the geode 1.10.0 after that work is
complete.

On Mon, Jul 22, 2019 at 3:24 PM Alexander Murmann 
wrote:


Hi everyone!

We released Geode 1.9.0 on April 25th. That's about 3 months ago. End of
last year we discussed releasing quarterly. In the past we've had about a
month between cutting a release branch and actually shipping our new minor.
This means we are already behind our target release cadence.

What are your thoughts on cutting a 1.10.0 release branch this week?

Would anyone like to volunteer to be the release manager for geode 1.10.0?

Thank you all!





Re: [DISCUSS] Time to cut Geode 1.10.0?

2019-07-23 Thread Udo Kohlmeyer
I don't believe we should be including anything into the Geode release 
that has not gone through the correct process of feature proposal.


All work under the experimental cluster management service has not yet 
been approved by the agreed upon RFC process.


I don't believe we should be including this work, experimental or otherwise.

--Udo

On 7/22/19 4:51 PM, Alexander Murmann wrote:

Udo, do you mind explaining how the RFC process comes into this? Are you
suggesting that we should wait if an RFC had a target release attached?

On Mon, Jul 22, 2019 at 4:47 PM Udo Kohlmeyer  wrote:


I don't think we need to wait for this, as there has been no RFC process
followed.

--Udo

On 7/22/19 3:38 PM, Jinmei Liao wrote:

Work is still being planned to move the cluster management rest service
under an experimental version flag and use a geode property to turn it
on/off. I would say we are ready to cut the geode 1.10.0 after that work

is

complete.

On Mon, Jul 22, 2019 at 3:24 PM Alexander Murmann 
wrote:


Hi everyone!

We released Geode 1.9.0 on April 25th. That's about 3 months ago. End of
last year we discussed releasing quarterly. In the past we've had about

a

month between cutting a release branch and actually shipping our new

minor.

This means we are already behind our target release cadence.

What are your thoughts on cutting a 1.10.0 release branch this week?

Would anyone like to volunteer to be the release manager for geode

1.10.0?

Thank you all!



Re: [DISCUSS] Time to cut Geode 1.10.0?

2019-07-26 Thread Udo Kohlmeyer

@Alexander + @Jared,

So maybe that was my misunderstanding on the RFC (not being optional for 
new feature work). Given that this is a new feature, there is 
significant risk of getting it "wrong".


I was expecting more discussion around this. I have some objections to 
the current approach/design. Given that my day job does not allow me to 
respond in a timely manner, I would not have been able to get all my 
concerns raised. In addition, throwing something onto the wiki and then, 
a few weeks before we'd like to cut a version, raising a discussion 
thread on work that has been going on for months already does not help 
the community read, digest, think, reason and respond 
BEFORE it is released.


I know `@Experimental` is non-binding on APIs or usage, BUT I prefer 
some of the groundwork to have been discussed and the APIs validated BEFORE 
it is released into the wild. I mean, this is a PUBLIC API, so we'd 
prefer to get it more correct than the previous one.


Maybe it is just me, taking it too seriously... where I prefer not to release 
something that is only close to 95% correct (and discussed).


Anyway.. If we want to cut 1.10... and we should... Let's do so.. but 
I'd prefer more focus on the correctness of the approach.


--Udo

On 7/25/19 11:08 AM, Alexander Murmann wrote:

I don't believe we should be including anything into the Geode release
that has not gone through the correct process of feature proposal.

All work under the experimental cluster management service has not yet
been approved by the agreed upon RFC process.


Udo, I didn't take the RFC process that we agreed on to be a gate keeper,
but rather a way to de-risk work before making a PR.

 From the RFC doc in the wiki:


When to write an RFC?

Writing an RFC should be entirely voluntary. There is always the option of
going straight to a pull request. However, for larger changes, it might be
wise to de-risk the risk of rejection of the pull request by first
gathering input from the community. Therefore it’s up to every member of
our community to decide themselves when they want to reach for this tool.


If we want to change the role of the RFC process, that's fine with me, but
we should have that discussion first.

On Tue, Jul 23, 2019 at 10:30 AM Jared Stewart 
wrote:


What was missing from the RFC process for the cluster management service?
I saw a [Discuss] thread for it, as well as a proposal at

https://cwiki.apache.org/confluence/display/GEODE/Cluster+Management+Service

On Tue, Jul 23, 2019 at 10:02 AM Udo Kohlmeyer  wrote:


I don't believe we should be including anything into the Geode release
that has not gone through the correct process of feature proposal.

All work under the experimental cluster management service has not yet
been approved by the agreed upon RFC process.

I don't believe we should be including this work, experimental or
otherwise.

--Udo

On 7/22/19 4:51 PM, Alexander Murmann wrote:

Udo, do you mind explaining how the RFC process comes into this? Are

you

suggesting that we should wait if an RFC had a target release attached?

On Mon, Jul 22, 2019 at 4:47 PM Udo Kohlmeyer  wrote:


I don't think we need to wait for this, as there has been no RFC

process

followed.

--Udo

On 7/22/19 3:38 PM, Jinmei Liao wrote:

Work is still being planned to move the cluster management rest

service

under an experimental version flag and use a geode property to turn

it

on/off. I would say we are ready to cut the geode 1.10.0 after that

work

is

complete.

On Mon, Jul 22, 2019 at 3:24 PM Alexander Murmann <

amurm...@apache.org

wrote:


Hi everyone!

We released Geode 1.9.0 on April 25th. That's about 3 months ago.

End

of

last year we discussed releasing quarterly. In the past we've had

about

a

month between cutting a release branch and actually shipping our new

minor.

This means we are already behind our target release cadence.

What are your thoughts on cutting a 1.10.0 release branch this week?

Would anyone like to volunteer to be the release manager for geode

1.10.0?

Thank you all!



Re: [DISCUSS] Release Geode 1.9.1 with logging improvements

2019-08-13 Thread Udo Kohlmeyer
The latest version of SBDG 1.2 is already in the RC stage, which means the 
dependent Geode version cannot be changed any more. Currently SBDG 1.2 
is based on Geode 1.9; this will not change. Patch versions of 1.9 are 
supported, but not changes to 1.10 or later.


THUS,

Once SBDG 1.3 (Neuman) is released, it will be based on the latest GA of 
Geode, which at this point would be 1.10 or possibly 1.11 depending on 
release cycles.


In addition...

@Aaron, Whilst it would also be possible to override the underlying 
Geode version that SBDG uses to a later version, I would just like to 
point out that all testing of SBDG will be against a named supported 
version of Geode / GemFire. Which means that failures arising when using 
SBDG / SDG with a non-supported version of Geode / GemFire would 
effectively be unsupported. (Due diligence to confirm the origin of a 
failure will of course be applied.)


Hope this helps...

--Udo

On 8/13/19 10:03 AM, Aaron Lindsey wrote:

Assuming Geode 1.10 is released with the three logging fixes in Kirk’s message, 
can the next GA release of Spring Boot Data Geode consume 1.10 instead of 1.9? 
Also, when would SBDG need this patch release by (whether we do a 1.9.1 release 
or 1.10 release)?

- Aaron


On Aug 13, 2019, at 9:31 AM, Bruce Schuchardt  wrote:

If we release a 1.9.1 I'd like to include the SSL/NIO fix. Cluster SSL 
communications with conserve-sockets=false is currently broken in 1.9.

On 8/13/19 9:25 AM, Kirk Lund wrote:

I'd like to discuss if and how we can release Geode 1.9.1 with logging
improvements. This is primarily to provide a patch release for Spring Data
Geode and Spring Boot to ensure a smoother User experience out-of-the box.
They have very near-future releases that need this as soon as possible.

The specific tickets and commits that would be back-ported are:

*1. GEODE-7058 Log4j-core dependency should be optional in geode-core*

commit 413800bc16d05df689a2af5c30797f180aad6088
Author: Kirk Lund 
Date:   Wed Aug 7 14:33:21 2019 -0700

 GEODE-7058: Mark log4j-core optional in geode-core

 Note: this change requires all commits from GEODE-2644 and GEODE-6122.

*2. GEODE-7050 Log4jAgent should avoid casting non-log4j loggers*

commit e5c9c420f462149fd062847904e3435fbe99afb4
Author: Kirk Lund 
Date:   Thu Aug 8 18:17:32 2019 -0700

 GEODE-7050: Use Log4jAgent only if Log4j is using Log4jProvider (#3892)

 This change prevents Geode from using Log4jAgent if Log4j Core is
 present but not using Log4jProvider.

 For example, Log4j uses SLF4JProvider when log4j-to-slf4j is in the
classpath.

 By disabling Log4jAgent when other Log4j Providers are in use, this
 prevents problems such as ClassCastExceptions when attemping to cast
 loggers from org.apache.logging.slf4j.SLF4JLogger to
 org.apache.logging.log4j.core.Logger to get the LoggerConfig or
 LoggerContext.

 Co-Authored-By: Aaron Lindsey 

*3. GEODE-6959 NPE if AlertAppender is not defined*

commit dd15fec1f2ecbc3bc0cdfc42072252c379e0bb89
Author: Kirk Lund 
Date:   Thu Aug 8 14:59:44 2019 -0700

 GEODE-6959: Prevent NPE in GMSMembershipManager for null AlertAppender
(#3899)

 If a custom log4j2.xml is used without specifying the Geode
AlertAppender,
 GMSMembershipManager may throw a NullPointerException when invoking
 AlertAppender.getInstance().stopSession() during a forceDisconnect. This
 change prevents the NullPointerException allowing forceDisconnect to
finish.

 Users using Spring Boot with Logback are more likely to hit this bug.

 Co-authored-by: Mark Hanson mhan...@pivotal.io



Re: [DISCUSS] Release Geode 1.9.1 with logging improvements

2019-08-13 Thread Udo Kohlmeyer
No, 1.9.1 IS something we require. SBDG 1.2 CAN use 1.9.1; we'd have to 
wait for SBDG 1.3 to use 1.10 or 1.11.


SBDG 1.3 is still a few months off, so maybe getting critical fixes in 
patch versions is required.


On 8/13/19 11:26 AM, Kirk Lund wrote:

Udo, Thanks for the info! Sounds like we shouldn't bother with Geode 1.9.1
then. If I'm misinterpreting what you wrote, let me know.

On Tue, Aug 13, 2019 at 10:36 AM Udo Kohlmeyer  wrote:


The latest version of SBDG 1.2 is already in RC stage. Which means the
dependent Geode version cannot be changed any more. Currently SBDG 1.2
is based on Geode 1.9. This will not change. Patch versions to 1.9 are
supported, but not changes to 1.10 or later.

THUS,

Once SBDG 1.3 (Neuman) is released, it will be based on the latest GA of
Geode, which at this point would be 1.10 or possibly 1.11 depending on
release cycles.

In addition...

@Aaron, Whilst it would also be possible to override the underlying
Geode version that SBDG uses, to a later version, I would just like to
point out that all testing of SBDG will be against a named supported
version of Geode / GemFire. Which means, if failures arise using SBDG /
SDG with a non-supported version of Geode / GemFire would effectively be
unsupported. (due diligence to confirm origin of failure will of course
be applied)

Hope this helps...

--Udo

On 8/13/19 10:03 AM, Aaron Lindsey wrote:

Assuming Geode 1.10 is released with the three logging fixes in Kirk’s

message, can the next GA release of Spring Boot Data Geode consume 1.10
instead of 1.9? Also, when would SBDG need this patch release by (whether
we do a 1.9.1 release or 1.10 release)?

- Aaron


On Aug 13, 2019, at 9:31 AM, Bruce Schuchardt 

wrote:

If we release a 1.9.1 I'd like to include the SSL/NIO fix. Cluster SSL

communications with conserve-sockets=false is currently broken in 1.9.

On 8/13/19 9:25 AM, Kirk Lund wrote:

I'd like to discuss if and how we can release Geode 1.9.1 with logging
improvements. This is primarily to provide a patch release for Spring

Data

Geode and Spring Boot to ensure a smoother User experience out-of-the

box.

They have very near-future releases that need this as soon as possible.

The specific tickets and commits that would be back-ported are:

*1. GEODE-7058 Log4j-core dependency should be optional in geode-core*

commit 413800bc16d05df689a2af5c30797f180aad6088
Author: Kirk Lund 
Date:   Wed Aug 7 14:33:21 2019 -0700

  GEODE-7058: Mark log4j-core optional in geode-core

  Note: this change requires all commits from GEODE-2644 and

GEODE-6122.

*2. GEODE-7050 Log4jAgent should avoid casting non-log4j loggers*

commit e5c9c420f462149fd062847904e3435fbe99afb4
Author: Kirk Lund 
Date:   Thu Aug 8 18:17:32 2019 -0700

  GEODE-7050: Use Log4jAgent only if Log4j is using Log4jProvider

(#3892)

  This change prevents Geode from using Log4jAgent if Log4j Core is
  present but not using Log4jProvider.

  For example, Log4j uses SLF4JProvider when log4j-to-slf4j is in

the

 classpath.

  By disabling Log4jAgent when other Log4j Providers are in use,

this

  prevents problems such as ClassCastExceptions when attemping to

cast

  loggers from org.apache.logging.slf4j.SLF4JLogger to
  org.apache.logging.log4j.core.Logger to get the LoggerConfig or
  LoggerContext.

  Co-Authored-By: Aaron Lindsey 

*3. GEODE-6959 NPE if AlertAppender is not defined*

commit dd15fec1f2ecbc3bc0cdfc42072252c379e0bb89
Author: Kirk Lund 
Date:   Thu Aug 8 14:59:44 2019 -0700

  GEODE-6959: Prevent NPE in GMSMembershipManager for null

AlertAppender

(#3899)

  If a custom log4j2.xml is used without specifying the Geode
AlertAppender,
  GMSMembershipManager may throw a NullPointerException when

invoking

  AlertAppender.getInstance().stopSession() during a

forceDisconnect. This

  change prevents the NullPointerException allowing forceDisconnect

to

finish.

  Users using Spring Boot with Logback are more likely to hit this

bug.

  Co-authored-by: Mark Hanson mhan...@pivotal.io



Re: Propose fix for 1.10 release: Prevent NPE in getLocalSize()

2019-08-13 Thread Udo Kohlmeyer
@Aaron, is this an existing issue (i.e. this was not introduced by a 
recent refactor)?


If the answer is anything other than "This will make the system stop 
working", I would vote: -1.


If this is an existing issue and has been around for a while, I think we 
should hold off on including this.


I think the boat has sailed on the inclusion of issues into the 1.10 
release. Sorry...


--Udo

On 8/13/19 4:58 PM, Aaron Lindsey wrote:

I’d like to propose including 
https://github.com/apache/geode/pull/3913/commits/6f1814d1f719cc06b13769c40a9d6d01f99f927c
in the Geode 1.10 release.

This commit fixes an issue where a NullPointerException is thrown from 
PartitionedRegion.getLocalSize() when the statistics callback sampler is 
invoked before a PartitionedRegion is initialized.

- Aaron





Re: Propose fix for 1.10 release: Prevent NPE in getLocalSize()

2019-08-14 Thread Udo Kohlmeyer

@Aaron,Kirk - thank you for the clarification.

+1 to include the fix, as reverting GEODE-7001 would be more effort :)

--Udo

On 8/14/19 9:25 AM, Aaron Lindsey wrote:

@Udo, I think Kirk explained it well — This issue was introduced very recently 
(right before we cut the release branch) and it has serious consequences 
(requires restarting the server).

- Aaron


On Aug 14, 2019, at 9:06 AM, Kirk Lund  wrote:

+1 to include this fix in 1.10.0

FYI: The race condition for this code path to throw NPE (which is
catastrophic and requires restarting the server) was introduced by commit
279fa0 on July 31 for GEODE-7001.

On Tue, Aug 13, 2019 at 6:22 PM Anthony Baker  wrote:


Given that we’re trying to stabilize the release branch and this fix seems
to *help* that, I’m in favor of merging it.

Anthony



On Aug 13, 2019, at 5:32 PM, Udo Kohlmeyer  wrote:

@Aaron, is this an existing issue (i.e this was not introduced in a

current refactor)?

If the answer is anything other that "This will make the system stop

working", I would vote: -1

If this is an existing issue and has been around for a while, I think we

hold off including this.

I think the boat has sailed on the inclusion of issues into the 1.10

release. Sorry...

--Udo

On 8/13/19 4:58 PM, Aaron Lindsey wrote:

I’d like to propose including

https://github.com/apache/geode/pull/3913/commits/6f1814d1f719cc06b13769c40a9d6d01f99f927c
in the Geode 1.10 release.

This commit fixes an issue where a NullPointerException is thrown from

PartitionedRegion.getLocalSize() when the statistics callback sampler is
invoked before a PartitionedRegion is initialized.

- Aaron







Re: Proposal to Include GEODE-7079 in 1.10.0

2019-08-15 Thread Udo Kohlmeyer

Juan,

From your explanation, it seems this issue is existing and not 
critical. Could we possibly hold this for 1.11?


--Udo

On 8/15/19 5:29 AM, Ju@N wrote:

Hello team,

I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in release
1.10.0.
Long story short: a *NullPointerException* can be continuously thrown
and flood the member's logs if a serial event processor (either
*async-event-queue* or *gateway-sender*) starts processing events from a
recovered persistent queue before the actual region to which it was
attached is fully operational.
Note: *no events are lost (even without the fix)* but, if the region takes
a while to recover, the logs  for the member can grow pretty quickly due to
the continuously thrown *NPEs.*
Best regards.

[1]:
https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b
[2]: https://issues.apache.org/jira/browse/GEODE-7079



Re: Proposal to Include GEODE-7079 in 1.10.0

2019-08-15 Thread Udo Kohlmeyer
Seems everyone is in favor of including a *non-critical* fix to an 
already cut branch of a potential release...


Am I missing something?

Why cut a release at all... just have a perpetual cycle of fixes added 
to develop, and users can choose what nightly snapshot build they would 
want to use..


I'm voting -1 on a non-critical issue, which is pre-existing and whose worst 
effect is to fill the logs with NPEs... (yes, not something we want).


I believed that we (as a Geode community) agreed that once a release has 
been cut, only critical issue fixes will be included. If we just 
continually add to the ALREADY CUT 1.10 release, where do we 
stop and when do we release...


--Udo

On 8/15/19 10:19 AM, Nabarun Nag wrote:

+1

On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann 
wrote:


+1

Agreed to fixing this. It's impossible for a user to discover they hit an
edge case that we fail to support till they are in prod and restart.

On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos 
wrote:


Hello Udo,

Even if it is an existing issue I'd still consider it critical for those
cases on which there are unprocessed events on the persistent queue

after a

restart and the region takes long to recover... you can actually see
millions of *NPEs* flooding the member's logs.
My two cents anyway, it's up to the community to make the final decision.
Cheers.


On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer  wrote:


Juan,

  From your explanation, it seems this issue is existing and not
critical. Could we possibly hold this for 1.11?

--Udo

On 8/15/19 5:29 AM, Ju@N wrote:

Hello team,

I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in

release

1.10.0.
Long story short: a *NullPointerException* can be continuously thrown
and flood the member's logs if a serial event processor (either
*async-event-queue* or *gateway-sender*) starts processing events

from

a

recovered persistent queue before the actual region to which it was
attached is fully operational.
Note: *no events are lost (even without the fix)* but, if the region

takes

a while to recover, the logs  for the member can grow pretty quickly

due

to

the continuously thrown *NPEs.*
Best regards.

[1]:


https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b

[2]: https://issues.apache.org/jira/browse/GEODE-7079



--
Juan José Ramos Cassella
Senior Software Engineer
Email: jra...@pivotal.io



Re: Proposal to Include GEODE-7079 in 1.10.0

2019-08-15 Thread Udo Kohlmeyer
Whilst I agree with "*finish* when we believe the quality of the release 
branch is sufficient", I disagree that we should have cut a branch and 
continue to patch that branch with non-critical fixes. i.e this issue 
has been around for a while and has no averse side effects. Issues like 
GEODE-7081, which is new due to a new commit, AND it has critical 
stability implications on the server, that I can agree we should include 
in a potential release branch.


Otherwise we can ALWAYS argue that said release branch is not of 
"sufficient" quality, especially if there are numerous existing JIRAs 
pertaining to bugs already in the system.


To quote Juan's original email:

/"Note: *no events are lost (even without the fix)* but, if the region 
takes//
//a while to recover, the logs  for the member can grow pretty quickly 
due to//

//the continuously thrown *NPEs.*"/

In addition to this, if there is a commit in a cut release branch which 
requires us to continuously patch the release branch in order to 
stabilize that feature/fix, maybe we should consider reverting that fix 
and releasing it at a later stage, when it is believed that the fix is 
more stable and has better, more comprehensive test coverage.


So far, GEODE-7081 does not have me convinced that it is critical. OR 
maybe it is the latter of my options, where it is a stabilization commit 
to a new feature, which begs the question: should we have accepted the 
original feature commit if there are all manner of side effects that we 
are only now discovering?


--Udo

On 8/15/19 11:08 AM, Anthony Baker wrote:

While we can’t fix *all known bugs*, I think where we do have a fix for an 
important issue we should think hard about the cost of not including that in a 
release.

IMO, the fixed time approach to releases means that we *start* the release 
effort (including stabilization and bug fixing if needed) on a known date and 
we *finish* when we believe the quality of the release branch is sufficient.  
Given the number of important fixes being requested, I’m not sure we are there 
yet.

I think the release branch concept has merit because it allows us to isolate 
ongoing work from the changes needed for a release.

+1 for including GEODE-7079.

Anthony



On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer  wrote:

Seems everyone is in favor or including a /*non-critical*/ fix to an already 
cut branch of the a potential release...

Am I missing something?

Why cut a release at all... just have a perpetual cycle of fixes added to 
develop and users can chose what nightly snapshot build they would want to use..

I'm voting -1 on a non-critical issue, which is existing and worst effect is to 
fill logs will NPE logs... (yes, not something we want).

I believed that we (as a Geode community) agreed that once a release has been 
cut, only critical issue fixes will be included. If we continue just 
continually adding to the ALREADY CUT 1.10 release, where do we stop and when 
do we release...

--Udo

On 8/15/19 10:19 AM, Nabarun Nag wrote:

+1

On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann 
wrote:


+1

Agreed to fixing this. It's impossible for a user to discover they hit an
edge case that we fail to support till they are in prod and restart.

On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos 
wrote:


Hello Udo,

Even if it is an existing issue I'd still consider it critical for those
cases on which there are unprocessed events on the persistent queue

after a

restart and the region takes long to recover... you can actually see
millions of *NPEs* flooding the member's logs.
My two cents anyway, it's up to the community to make the final decision.
Cheers.


On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer  wrote:


Juan,

  From your explanation, it seems this issue is existing and not
critical. Could we possibly hold this for 1.11?

--Udo

On 8/15/19 5:29 AM, Ju@N wrote:

Hello team,

I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in

release

1.10.0.
Long story short: a *NullPointerException* can be continuously thrown
and flood the member's logs if a serial event processor (either
*async-event-queue* or *gateway-sender*) starts processing events

from

a

recovered persistent queue before the actual region to which it was
attached is fully operational.
Note: *no events are lost (even without the fix)* but, if the region

takes

a while to recover, the logs  for the member can grow pretty quickly

due

to

the continuously thrown *NPEs.*
Best regards.

[1]:


https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b

[2]: https://issues.apache.org/jira/browse/GEODE-7079


--
Juan José Ramos Cassella
Senior Software Engineer
Email: jra...@pivotal.io



Re: Proposal to Include GEODE-7079 in 1.10.0

2019-08-15 Thread Udo Kohlmeyer
@Dan, not arguing that logs filling up with NPEs could bring a system 
down with limited disk space, or potentially swallow important logs 
that could be helpful in root-causing issues...


I'm merely raising the question on why this bug fix should receive 
priority inclusion. It has been around for a long time and could be 
included in a subsequent release.


On 8/15/19 12:34 PM, Dan Smith wrote:

+1 to merging Juan's fix for GEODE-7079. I've seen systems taken down by
rapidly filling up the logs in the past, this does seem to be a critical
fix from the perspective of system stability.

Also, this is the change, which doesn't seem particularly risky to me.

-  ConflationKey key = new ConflationKey(gsEvent.getRegion().getFullPath(),
+  ConflationKey key = new ConflationKey(gsEvent.getRegionPath(),

-Dan

On Thu, Aug 15, 2019 at 12:23 PM Udo Kohlmeyer  wrote:


Whilst I agree with "*finish* when we believe the quality of the release
branch is sufficient", I disagree that we should have cut a branch and
continue to patch that branch with non-critical fixes. i.e this issue
has been around for a while and has no averse side effects. Issues like
GEODE-7081, which is new due to a new commit, AND it has critical
stability implications on the server, that I can agree we should include
in a potential release branch.

Otherwise we can ALWAYS argue that said release branch is not of
"sufficient" quality, especially if there are numerous existing JIRA's
pertaining to bugs already in the system.

To quote Juan's original email:

/"Note: *no events are lost (even without the fix)* but, if the region
takes//
//a while to recover, the logs  for the member can grow pretty quickly
due to//
//the continuously thrown *NPEs.*"/

In addition to this, if there is a commit in a cut release branch, which
is requiring us to continuously patching the release branch, in order to
stabilize that feature/fix, maybe we should consider reverting that fix
and release it at a later stage, when it is believed that this fix is
more stable and have better, more comprehensive test coverage.

So far, GEODE-7081, does not have me convinced that it is critical. OR
maybe it is the latter of my options, where it is a stabilization commit
to a new feature, which begs the question, should we have accepted the
original feature commit if there are all manner of side effects which we
are only discovering.

--Udo

On 8/15/19 11:08 AM, Anthony Baker wrote:

While we can’t fix *all known bugs*, I think where we do have a fix for

an important issue we should think hard about the cost of not including
that in a release.

IMO, the fixed time approach to releases means that we *start* the

release effort (including stabilization and bug fixing if needed) on a
known date and we *finish* when new believe the quality of the release
branch is sufficient.  Given the number of important fixes being requested,
I’m not sure we are there yet.

I think the release branch concept has merit because it allows us to

isolate ongoing work from the changes needed for a release.

+1 for including GEODE-7079.

Anthony



On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer 

wrote:

Seems everyone is in favor or including a /*non-critical*/ fix to an

already cut branch of the a potential release...

Am I missing something?

Why cut a release at all... just have a perpetual cycle of fixes added

to develop and users can chose what nightly snapshot build they would want
to use..

I'm voting -1 on a non-critical issue, which is existing and worst

effect is to fill logs will NPE logs... (yes, not something we want).

I believed that we (as a Geode community) agreed that once a release

has been cut, only critical issue fixes will be included. If we continue
just continually adding to the ALREADY CUT 1.10 release, where do we stop
and when do we release...

--Udo

On 8/15/19 10:19 AM, Nabarun Nag wrote:

+1

On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann <

amurm...@apache.org>

wrote:


+1

Agreed to fixing this. It's impossible for a user to discover they

hit an

edge case that we fail to support till they are in prod and restart.

On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos 
wrote:


Hello Udo,

Even if it is an existing issue I'd still consider it critical for

those

cases on which there are unprocessed events on the persistent queue

after a

restart and the region takes long to recover... you can actually see
millions of *NPEs* flooding the member's logs.
My two cents anyway, it's up to the community to make the final

decision.

Cheers.


On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer 

wrote:

Juan,

   From your explanation, it seems this issue is existing and not
critical. Could we possibly hold this for 1.11?

--Udo

On 8/15/19 5:29 AM, Ju@N wrote:

Hello team,

I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in

release

1.10.0.
Lon

Re: Proposal to Include GEODE-7079 in 1.10.0

2019-08-15 Thread Udo Kohlmeyer

I'm changing my vote to +1 on this issue.

The ONLY reason I'm changing my vote is to add to the cleanliness of the 
code of the release. I do 100% disagree with the continual scope creep 
that we have been incurring on this release branch.


--Udo

On 8/15/19 12:34 PM, Dan Smith wrote:

+1 to merging Juan's fix for GEODE-7079. I've seen systems taken down by
rapidly filling up the logs in the past, this does seem to be a critical
fix from the perspective of system stability.

Also, this is the change, which doesn't seem particularly risky to me.

-  ConflationKey key = new ConflationKey(gsEvent.getRegion().getFullPath(),
+  ConflationKey key = new ConflationKey(gsEvent.getRegionPath(),

-Dan

On Thu, Aug 15, 2019 at 12:23 PM Udo Kohlmeyer  wrote:


Whilst I agree with "*finish* when we believe the quality of the release
branch is sufficient", I disagree that we should have cut a branch and
continue to patch that branch with non-critical fixes. i.e this issue
has been around for a while and has no averse side effects. Issues like
GEODE-7081, which is new due to a new commit, AND it has critical
stability implications on the server, that I can agree we should include
in a potential release branch.

Otherwise we can ALWAYS argue that said release branch is not of
"sufficient" quality, especially if there are numerous existing JIRA's
pertaining to bugs already in the system.

To quote Juan's original email:

/"Note: *no events are lost (even without the fix)* but, if the region
takes//
//a while to recover, the logs  for the member can grow pretty quickly
due to//
//the continuously thrown *NPEs.*"/

In addition to this, if there is a commit in a cut release branch, which
is requiring us to continuously patching the release branch, in order to
stabilize that feature/fix, maybe we should consider reverting that fix
and release it at a later stage, when it is believed that this fix is
more stable and have better, more comprehensive test coverage.

So far, GEODE-7081, does not have me convinced that it is critical. OR
maybe it is the latter of my options, where it is a stabilization commit
to a new feature, which begs the question, should we have accepted the
original feature commit if there are all manner of side effects which we
are only discovering.

--Udo

On 8/15/19 11:08 AM, Anthony Baker wrote:

While we can’t fix *all known bugs*, I think where we do have a fix for

an important issue we should think hard about the cost of not including
that in a release.

IMO, the fixed time approach to releases means that we *start* the

release effort (including stabilization and bug fixing if needed) on a
known date and we *finish* when new believe the quality of the release
branch is sufficient.  Given the number of important fixes being requested,
I’m not sure we are there yet.

I think the release branch concept has merit because it allows us to

isolate ongoing work from the changes needed for a release.

+1 for including GEODE-7079.

Anthony



On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer 

wrote:

Seems everyone is in favor or including a /*non-critical*/ fix to an

already cut branch of the a potential release...

Am I missing something?

Why cut a release at all... just have a perpetual cycle of fixes added

to develop and users can chose what nightly snapshot build they would want
to use..

I'm voting -1 on a non-critical issue, which is existing and worst

effect is to fill logs will NPE logs... (yes, not something we want).

I believed that we (as a Geode community) agreed that once a release

has been cut, only critical issue fixes will be included. If we continue
just continually adding to the ALREADY CUT 1.10 release, where do we stop
and when do we release...

--Udo

On 8/15/19 10:19 AM, Nabarun Nag wrote:

+1

On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann <

amurm...@apache.org>

wrote:


+1

Agreed to fixing this. It's impossible for a user to discover they

hit an

edge case that we fail to support till they are in prod and restart.

On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos 
wrote:


Hello Udo,

Even if it is an existing issue I'd still consider it critical for

those

cases on which there are unprocessed events on the persistent queue

after a

restart and the region takes long to recover... you can actually see
millions of *NPEs* flooding the member's logs.
My two cents anyway, it's up to the community to make the final

decision.

Cheers.


On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer 

wrote:

Juan,

   From your explanation, it seems this issue is existing and not
critical. Could we possibly hold this for 1.11?

--Udo

On 8/15/19 5:29 AM, Ju@N wrote:

Hello team,

I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in

release

1.10.0.
Long story short: a *NullPointerException* can be continuously

thrown

and flood the member's logs if a

Re: I propose including the fix for GEODE-3780 in 1.10

2019-08-15 Thread Udo Kohlmeyer
Looking at the Geode ticket number, it seems this problem has 
resurfaced, as it seems to have been addressed in 1.7.0 already.


My concern is: do we know WHAT caused it to resurface? Or was this 
issue always dormant and only recently resurfaced? Not understanding 
why we are seeing "fixed" issues resurface concerns me, as that could 
mean we have made changes that have adverse effects and that we were 
really premature in cutting 1.10.


--Udo

On 8/15/19 2:46 PM, Bruce Schuchardt wrote:
Testing in the past week hit this problem 9 times and it was 
identified as a new issue.



On 8/15/19 2:23 PM, Jacob Barrett wrote:
Because someone will ask: can we be proactive in these requests by 
identifying whether the issue being fixed was introduced in Geode 1.10 or 
is a preexisting condition?


-jake


On Aug 15, 2019, at 2:09 PM, Bruce Schuchardt 
 wrote:


This is a fix for a problem where a member that has lost quorum does 
not detect it and does not shut down.  The fix is small and has been 
extensively tested.  The fix also addresses the possibility of a 
member being kicked out of the cluster when it is only late in 
delivering a heartbeat (i.e., no availability check performed).


SHA: 8e9b04470264983d0aa1c7900f6e9be2374549d9



Re: Propose fix for 1.10 release: Export offline data command failed with EntryDestroyedException

2019-08-16 Thread Udo Kohlmeyer

+1 to include


On 8/16/19 12:43 PM, Eric Shu wrote:

Hi,

I'd like to include the following commit (
https://gitbox.apache.org/repos/asf?p=geode.git;h=aa33060) into Geode 1.10
release.

The commit fixes an issue where a user who tries to export offline data to
a snapshot file fails. This issue has existed since release 1.1.0. However,
it is a critical issue, as it prevents users from getting the data from their
backup disk stores.

Regards,
Eric



Re: Proposal to include GEODE-7088 and GEODE-7089 in 1.10.0

2019-08-26 Thread Udo Kohlmeyer

In order to better understand this request:

Is this an existing issue?

Why is it more critical to squeeze it into an existing (almost released) 
version of Apache Geode?


What guarantees do we have that this fix makes the application more 
stable compared to adding another hidden issue, which we will discover 
in another few weeks from now?



--Udo

On 8/26/19 3:10 PM, Ryan McMahon wrote:

Hi all,

I would like to propose cherry-picking GEODE-7088 and GEODE-7089 to the
1.10.0 release branch.  The two JIRAs are related to the same root problem,
which I would classify as critical.  We discovered a case where a failed
client registration could lead to a memory leak in a server, eventually
causing the server to crash due to lack of memory.

The issue is instigated by a ConcurrentModificationException due to
iteration of a non-thread safe collection while it is being mutated
(GEODE-7088).  This exception occurs when the client's queue image is being
copied from one server to the next during client registration, and it
causes the client's registration to fail.  The client would likely succeed
if it retried registration with that same server, but if it registers with
a different server, we end up leaking events to the client's registration
queue on the original server (GEODE-7089).

The fix for GEODE-7088 is to use thread-safe collections for interested
clients in client update messages.  The fix for GEODE-7089 is to always
drain and remove the registration queue regardless of success or failure.
Together, these fixes prevent the failed registrations and memory leak.
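
For illustration only, the general shape of the GEODE-7088 change is to swap a
plain collection that is iterated on one thread while mutated on another for a
concurrent one. The class and field names below are placeholders, not the
actual Geode code:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Placeholder class -- not the actual client update message code.
class InterestedClients {

  // A plain HashSet here can throw ConcurrentModificationException when one
  // thread iterates the set (e.g. while copying a client's queue image to
  // another server) at the same time another thread adds or removes a client.
  // A concurrent set makes both operations safe.
  private final Set<String> clientIds = ConcurrentHashMap.newKeySet();

  void addInterestedClient(String clientId) {
    clientIds.add(clientId); // safe to call while another thread iterates
  }

  void forEachClient(Consumer<String> action) {
    // Iteration over the concurrent set is weakly consistent and never
    // throws ConcurrentModificationException.
    for (String clientId : clientIds) {
      action.accept(clientId);
    }
  }
}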

The SHAs for the fixes and tests in develop are:

GEODE-7088
- 174af1d23fb7e09eb2bc2fa55479df854850fadb
- 5bb753a8f4ff2886acd8e62d6f51fea58e37881d

GEODE-7089
- 5d0153ad4adb1612a1083673f98b1982819a6589

This proposal is to cherry-pick these commits to 1.10.0 release branch.

Thanks,
Ryan McMahon



[DISCUSS] Pulling the current proposed 1.10 release until we can agree on develop being stable

2019-08-26 Thread Udo Kohlmeyer

Hi there Apache Geode devs,

It has been some weeks since the proposed 1.10 release was cut. We've 
gone through a few cycles where we keep on submitting "please include 
ticket GEODE-XXX" because it is critical and will break the system. 
WHICH in reality tells me that current develop is broken and unstable.


I'm going to suggest that we abandon the current 1.10 release branch. I 
cannot shake the feeling that our continuous cherry picking into a 
branch will result in either the branch becoming unmaintainable, given 
we have only select fixes in the branch OR we end up with a branch that 
is more stable than our current development branch, which would result 
in us having to rebase the develop branch onto the 1.10 branch.


I propose that once we see the pipeline is clean and green for a solid 
stretch, we then again attempt to cut the 1.10 branch.


We CANNOT continue adding to a branch in order to stabilize it.. It just 
means the branch we cut from is unstable. If we cannot cut a branch from 
develop without having to have weeks of stabilization cycles, then our 
main branch is broken...


Either way, not a good spot to be in.

Thoughts?

--Udo



Re: Proposal to include GEODE-7088 and GEODE-7089 in 1.10.0

2019-08-26 Thread Udo Kohlmeyer

Thank you Ryan,

+1 for inclusion

On 8/26/19 3:33 PM, Ryan McMahon wrote:

Udo,

Here are inline answers to your questions:

*Is this an existing issue?*

Short answer - yes, but it has never been in a release version of Geode.
The leak was introduced as part of some changes to address handling
multiple concurrent registration requests for a given client on a single
server.  The issue is only seen if client registration fails which is not
particularly common, which is why we are only seeing it now after months of
testing.  The commit for that was introduced here on April 30th.
https://github.com/apache/geode/commit/bc2a2fa5af374cfedfba4dc1abe6cbc2a7b719c8
The ConcurrentModificationException issue (which ultimately causes the
registration to fail) was introduced on April 22nd here:
https://github.com/apache/geode/commit/afc311c04f6908a8f725834cdf2c49ce6e902b3f


*Why is it more critical to squeeze it into an existing (almost
release) version of Apache Geode?*

Not sure I totally understand this question, but it is critical because the
leak can cause servers to crash due to OOM. Again, because the problems
were introduced in late April and we haven't released Geode since then, I
think it is very important to merge these fixes into 1.10.0 before we
release.



*What guarantees do we have that this fix makes the application more stable
compared to adding another hidden issue, which we will discover in another
few weeks from now?*

I added numerous tests for this scenario to ensure that the leak would
never happen regardless of the cause of a failed client registration.
There obviously is no way to 100% guarantee that there will be no issues
that arise from the fixes themselves, but our existing test coverage plus
the new tests I wrote should give us the confidence we need.

Thanks,
Ryan

On Mon, Aug 26, 2019 at 3:17 PM Udo Kohlmeyer  wrote:


In order to better understand this request:

Is this an existing issue?

Why is it more critical to squeeze it into an existing (almost release)
version of Apache Geode?

What guarantees do we have that this fix makes the application more
stable compared to adding another hidden issue, which we will discover
in another few weeks from now?


--Udo

On 8/26/19 3:10 PM, Ryan McMahon wrote:

Hi all,

I would like to propose cherry-picking GEODE-7088 and GEODE-7089 to the
1.10.0 release branch.  The two JIRAs are related to the same root

problem,

which I would classify as critical.  We discovered a case where a failed
client registration could lead to a memory leak in a server, eventually
causing the server to crash due to lack of memory.

The issue is instigated by a ConcurrentModificationException due to
iteration of a non-thread safe collection while it is being mutated
(GEODE-7088).  This exception occurs when the client's queue image is

being

copied from one server to the next during client registration, and it
causes the client's registration to fail.  The client would likely

succeed

if it retried registration with that same server, but if it registers

with

a different server, we end up leaking events to the client's registration
queue on the original server (GEODE-7089).

The fix for GEODE-7088 is to use thread-safe collections for interested
clients in client update messages.  The fix for GEODE-7089 is to always
drain and remove the registration queue regardless of success or failure.
Together, these fixes prevent the failed registrations and memory leak.

The SHAs for the fixes and tests in develop are:

GEODE-7088
- 174af1d23fb7e09eb2bc2fa55479df854850fadb
- 5bb753a8f4ff2886acd8e62d6f51fea58e37881d

GEODE-7089
- 5d0153ad4adb1612a1083673f98b1982819a6589

This proposal is to cherry-pick these commits to 1.10.0 release branch.

Thanks,
Ryan McMahon



Re: [DISCUSS] Pulling the current proposed 1.10 release until we can agree on develop being stable

2019-08-28 Thread Udo Kohlmeyer
Fundamentally I also don't have a problem with cherry-picking fixes to a 
potential release candidate. My concern is the number of fixes that 
were cherry-picked.


I also believe that "stabilizing" a release does have some 
cherry-picking, BUT the amount of stabilization that we have to apply 
has to be within reason and control.


I also understand that there are fixes that we want to include that are 
critical. BUT if it takes a month (from cutting to generating an RC) to 
get close to releasing, that indicates that whatever is in develop is NOT 
stable. There were fixes included where the cause of the issue 
was "due to a recent commit SHA-- ". Which most likely means the 
review and testing of said bug/issue was not rigorous enough.


Finding issues with maps that should have been concurrent maps and that 
lead to memory leaks is distressing. Now we have code that has been 
refactored that we prefer not to release. If we don't want to release 
it, why is it in the main develop branch? Imo, only code THAT CAN BE 
RELEASED should ever make it to the main branch.


Maybe I am the problem here (again). Maybe it is I who is concerned 
with the quality of what is being committed and then glossed over with 
"it is ok, we can fix any resulting issues coming from that".


If everybody is happy to release 1.10, then so be it... here is my +0.

But I don't think this approach of cutting a release branch and then 
effectively having to maintain two development branches, tracking which 
commits are applied to which branch, is in any way maintainable..


--Udo

On 8/27/19 11:56 AM, Bruce Schuchardt wrote:

+1 for going ahead with the current release/1.10

On 8/27/19 11:31 AM, Dan Smith wrote:

+1 to creating RC1 with the current release/1.10 branch this week.

I don't see a fundamental problem with cherry-picking some targeted and
tested fixes to release/1.10, based on our assessment of the risk to
customers vs. the risk of destabilizing the branch. I think 
release/1.10 is

in a good state, and we should go ahead with the release.

-Dan


On Tue, Aug 27, 2019 at 9:28 AM Bruce Schuchardt 


wrote:


The "develop" branch has a refactoring of membership code that should
not be included in 1.10.  I waited until the release branch was cut to
push these changes.

On 8/26/19 4:06 PM, Udo Kohlmeyer wrote:

Hi there Apache Geode devs,

It has been some weeks since the proposed 1.10 release was cut. We've
gone through a few cycles where we keep on submitting "please include
ticket GEODE-XXX" because it is critical and will break the system.
WHICH in reality tells me that current develop is broken and unstable.

I'm going to suggest that we abandon the current 1.10 release branch.
I cannot shake the feeling that our continuous cherry picking into a
branch will result in either the branch becoming unmaintainable, given
we have only select fixes in the branch OR we end up with a branch
that is more stable than our current development branch, which would
result in us having to rebase the develop branch onto the 1.10 branch.

I propose that once we see the pipeline is clean and green for a solid
we then again attempt to cut 1.10 branch.

We CANNOT continue adding to a branch in order to stabilize it.. It
just means the branch we cut from is unstable. If we cannot cut a
branch from develop without having to have weeks of stabilization
cycles, then our main branch is broken...

Either way, not a good spot to be in.

Thoughts?

--Udo



Re: [VOTE] Apache Geode 1.9.1 RC1

2019-08-28 Thread Udo Kohlmeyer

+1 for a re-vote

On 8/28/19 2:42 PM, Kirk Lund wrote:

SBDG 1.2 is currently in RC and cannot be changed to depend on Geode 1.10.
It must depend on Geode 1.9 or 1.9.1.

So if we want to provide the logging fixes for SBDG 1.2 then we must
release Geode 1.9.1.

Let's open a new vote for releasing Geode 1.9.1.

On Wed, Aug 28, 2019 at 1:37 PM Owen Nichols  wrote:


It's past the announced deadline and the vote has failed due to lack of
quorum.

Voting status
==
+1: zero votes

+0: zero votes

-0: zero votes

-1: zero votes

The voting does not meet the requirements
(https://www.apache.org/foundation/voting.html) of at least 3 PMC members
with +1 votes and a majority of +1 votes.

The matter of what to do next is referred back to the original DISCUSS
thread that proposed 1.9.1.

-Owen & Kirk


On Aug 22, 2019, at 6:10 PM, Owen Nichols  wrote:

Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.9.1.RC1.
Thanks to all the community members for their contributions to this

release!

Please do a review and give your feedback. The deadline is 3PM PST Tue,

August 27 2019.

Release notes can be found at:

https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1
<
https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1>


Please note that we are voting on the source tag: rel/v1.9.1.RC1

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.9.1.RC1 <

https://github.com/apache/geode/tree/rel/v1.9.1.RC1>

Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC1 <

https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC1>

Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.9.1.RC1 <

https://github.com/apache/geode-native/tree/rel/v1.9.1.RC1>

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1/ <

https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1/>

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1055 <

https://repository.apache.org/content/repositories/orgapachegeode-1055>

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS <

https://github.com/apache/geode/blob/develop/KEYS>

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=

https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1 <
https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1>
-PgeodeRepositoryUrl=
https://repository.apache.org/content/repositories/orgapachegeode-1055 <
https://repository.apache.org/content/repositories/orgapachegeode-1055>
build runAll

Regards
Owen Nichols & Kirk Lund





Re: [VOTE] Apache Geode 1.9.1 RC1

2019-08-29 Thread Udo Kohlmeyer

+1

On 8/28/19 5:41 PM, John Blum wrote:

+1

On Wed, Aug 28, 2019 at 2:51 PM Dan Smith  wrote:


I missed this vote email as well - if we reopen the vote I'll cast one. I
don't really have much context on why we want a 1.9.1 but I'm happy to
double check the bits.

One comment on this RC - I noticed that we bumped the ordinal in
Version.java - is that what we actually want to do? That implies a new
version of our communications protocol, which 1.10 will have to understand.
Did we actually change the communication protocol in this release?

-Dan

On Wed, Aug 28, 2019 at 2:42 PM Kirk Lund  wrote:


SBDG 1.2 is currently in RC and cannot be changed to depend on Geode

1.10.

It must depend on Geode 1.9 or 1.9.1.

So if we want to provide the logging fixes for SBDG 1.2 then we must
release Geode 1.9.1.

Let's open a new vote for releasing Geode 1.9.1.

On Wed, Aug 28, 2019 at 1:37 PM Owen Nichols 

wrote:

It's past the announced deadline and the vote has failed due to lack of
quorum.

Voting status
==
+1: zero votes

+0: zero votes

-0: zero votes

-1: zero votes

The voting does not meet the requirements
(https://www.apache.org/foundation/voting.html) of at least 3 PMC members
with +1 votes and a majority of +1 votes.

The matter of what to do next is referred back to the original DISCUSS
thread that proposed 1.9.1.

-Owen & Kirk


On Aug 22, 2019, at 6:10 PM, Owen Nichols 

wrote:

Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.9.1.RC1.
Thanks to all the community members for their contributions to this

release!

Please do a review and give your feedback. The deadline is 3PM PST Tue,
August 27 2019.

Release notes can be found at:
https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1

Please note that we are voting on the source tag: rel/v1.9.1.RC1

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.9.1.RC1

Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC1

Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.9.1.RC1

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1055

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1 -PgeodeRepositoryUrl=https://repository.apache.org/content/repositories/orgapachegeode-1055 build runAll

Regards
Owen Nichols & Kirk Lund







Re: [VOTE] Apache Geode 1.9.1 RC1 (new vote)

2019-08-29 Thread Udo Kohlmeyer

+1

On 8/29/19 9:51 AM, John Blum wrote:

+1

On Thu, Aug 29, 2019 at 9:41 AM Kirk Lund  wrote:


+1 (just in case my vote counts)

On Thu, Aug 29, 2019 at 9:02 AM Kirk Lund  wrote:


Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.9.1.RC1.
Thanks to all the community members for their contributions to this
release!

Please do a review and give your feedback. The deadline is 3PM PST Tue,
August 27 2019.
Release notes can be found at:


https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1

  <


https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1

Please note that we are voting on the source tag: rel/v1.9.1.RC1

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.9.1.RC1

Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC1

Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.9.1.RC1

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1055

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC1 -PgeodeRepositoryUrl=https://repository.apache.org/content/repositories/orgapachegeode-1055 build runAll

Regards
Owen Nichols & Kirk Lund





Re: [VOTE] Apache Geode 1.9.1 RC2

2019-08-29 Thread Udo Kohlmeyer

+1

On 8/29/19 5:02 PM, Owen Nichols wrote:

Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.9.1.RC2.
Thanks to all the community members for their contributions to this release!

Please do a review and give your feedback. The deadline is 3PM PST Wed, 
September 04 2019.
Release notes can be found at: 
https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.9.1 


Please note that we are voting upon the source tags: rel/v1.9.1.RC2

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.9.1.RC2 

Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.9.1.RC2 

Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.9.1.RC2 


Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC2/ 


Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1056 


Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS 


PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=https://dist.apache.org/repos/dist/dev/geode/1.9.1.RC2 -PgeodeRepositoryUrl=https://repository.apache.org/content/repositories/orgapachegeode-1056 build runAll

Regards
Owen & Kirk


Re: [DISCUSS] RFC - Move membership code to a separate gradle sub-project

2019-09-06 Thread Udo Kohlmeyer

I've reviewed and commented on the RFC.

+1 on the thought / notion of extracting modules.

I'm less convinced about the initial extraction of the geode-serialization 
module, and believe some attention should be given to this once a 
decision is made to convert serialization to a stand-alone module.


--Udo

On 8/30/19 3:50 PM, Dan Smith wrote:

Hi all,

We added the following RFC to the wiki about moving Geode's membership
system to a separate gradle sub-project. Please review and comment by
9/6/2019.

https://cwiki.apache.org/confluence/x/WRB4Bw

Thanks!
-Dan



Re: [VOTE] Adding a lucene specific fix to release/1.10.0

2019-09-19 Thread Udo Kohlmeyer

-1

I must agree with Owen's analysis.

It's a known problem, and it will not cause the system to stop working. 
Yes, it is a bug and will cause issues with results, BUT it will NOT 
affect the stability of the system. Stability is one of the only reasons we 
should be adding fixes to an already cut release branch.


--Udo

On 9/19/19 11:48 AM, Owen Nichols wrote:

Thank you for providing some context for what is being voted here.  Based on 
this information, I will give my vote as “+0” (imho it may not meet the 
definition of a “critical fix”, but seems like the risk is low and the 
community wants it, so I have no real objection).



On Sep 19, 2019, at 11:38 AM, Xiaojian Zhou  wrote:

Owen:
Here are the answers:

- Is this fixing an issue of Data loss? Performance degradation?
Backward-compatibility issue? Availability impacts?  Resource exhaustion
(threads, disk, cpu, memory, sockets, etc)?

Without the fix, fields inherited from a parent class cannot be indexed if
they are user objects. For example, I have a Customer class which contains
phoneBook. If I have a subclass LocalCustomer that inherits from Customer, then
I cannot index on phoneBook.
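
For illustration, a minimal sketch of that scenario (the class, index and
region names here are hypothetical, and it assumes the geode-lucene module and
its LuceneService API are on the classpath):

import java.io.Serializable;

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.lucene.LuceneService;
import org.apache.geode.cache.lucene.LuceneServiceProvider;

public class InheritedFieldIndexSketch {

  // Hypothetical domain classes mirroring the Customer/LocalCustomer example.
  static class Customer implements Serializable {
    String name;
    String phoneBook;            // field declared on the parent class
  }

  static class LocalCustomer extends Customer {
    String localBranch;          // field added by the subclass
  }

  public static void main(String[] args) {
    Cache cache = new CacheFactory().set("name", "lucene-index-sketch").create();

    // A Lucene index must be defined before the region it indexes is created.
    LuceneService luceneService = LuceneServiceProvider.get(cache);
    luceneService.createIndexFactory()
        .addField("phoneBook")   // the inherited field we want searchable
        .create("customerIndex", "/customers");

    Region<String, Customer> customers = cache
        .<String, Customer>createRegionFactory(RegionShortcut.PARTITION)
        .create("customers");

    // Before the GEODE-7208 fix, a LocalCustomer value reportedly did not get
    // its inherited phoneBook field indexed; with the fix it should be.
    LocalCustomer customer = new LocalCustomer();
    customer.phoneBook = "555-0100";
    customers.put("c1", customer);
  }
}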

- Did this issue exist in the previous release?
Yes.

- What is the impact of not fixing it?
Customer will see it and they have seen it.

- What are the risks of introducing this change so close to shipping?
No risk. It's a standalone fix that does not impact anywhere else, and it
would need to be backported in the future if we did not do it now.

- How extensively has the fix been tested on develop?
We introduced several dunit and junit tests.

- How “sensitive” is the area of code it touches?
Not sensitive.

- What new tests have been added?
New dunit tests and junit tests.

Regards
Gester

On Thu, Sep 19, 2019 at 11:21 AM Owen Nichols  wrote:


On Sep 19, 2019, at 11:15 AM, Xiaojian Zhou  wrote:

Owen:

The reason is: it's already cherry-picked to 1.9.


Can you kindly point me to the specific SHA where this was fixed in 1.9?
I am not able to find it...


Gester

On Thu, Sep 19, 2019 at 11:13 AM Owen Nichols wrote:

It looks like this has already passed the vote, but I don’t see an
explanation anywhere in this thread for what makes this a "critical fix".

As I recall release/1.10.0 was branched at the beginning of August, so it
seems appropriate to apply a very high level of scrutiny to any continuing
proposals to further delay the release of 1.10.0.

- Is this fixing an issue of Data loss? Performance degradation?
Backward-compatibility issue? Availability impacts?  Resource exhaustion
(threads, disk, cpu, memory, sockets, etc)?
- Did this issue exist in the previous release?
- What is the impact of not fixing it?
- What are the risks of introducing this change so close to shipping?
- How extensively has the fix been tested on develop?
- How “sensitive” is the area of code it touches?
- What new tests have been added?



On Sep 19, 2019, at 11:08 AM, Anilkumar Gingade wrote:

+1

On Thu, Sep 19, 2019 at 11:02 AM Eric Shu  wrote:


+1


On Thu, Sep 19, 2019 at 10:59 AM Benjamin Ross wrote:

+1

On Thu, Sep 19, 2019 at 10:50 AM Nabarun Nag wrote:

+1

On Thu, Sep 19, 2019 at 10:49 AM Xiaojian Zhou wrote:

I want to merge GEODE-7208, which is a lucene-specific fix. The fix will
enable indexing on inherited attributes in user objects.

revision 4ec87419d456748a7d853e979c90ad4e301b2405

Regards
Gester







Re: [VOTE] Adding a lucene specific fix to release/1.10.0

2019-09-19 Thread Udo Kohlmeyer
Given that we have ALREADY merged this in AND it has passed through the 
majority of our pipeline without incident,

I'll change to a -0...

I make this a "-0" because I'm quietly objecting to the fact that we 
have not followed process and merged without waiting for consensus. Also, 
the reasoning to include this issue provides me absolutely NO compelling 
reason to include it in 1.10: there was a majority "+1" without any 
evaluation of whether this improves the stability of the release. I'm not 
talking about the correctness, but stability. Also, given it is an existing 
issue, I believe that this is/was not an issue that affected stability.


Anyway... Hopefully we can get better at releasing.

--Udo

On 9/19/19 2:19 PM, Dick Cavender wrote:

This problem has happened before, and will probably happen in the future.
Recently we adjusted the Geode release process to dictate that the Geode
release manager will handle the merging of approved changes to a release
branch while also allowing the community time for input and discussion on
those. In this case, neither happened. The change was merged by the
committer as soon as there were three votes and I as the release manager
failed to communicate the intended process to them before that happened.

This process needs to be an on-going discussion for us as a community.
What, when and how changes come into a release branch after creation and
until its release. It's enough to scare off future Release Manager
volunteers if not solved.

The GEODE-7208 change is already moving through the 1.10.0 pipeline and is
(at the moment) the final change for 1.10.0.RC2 moving us closer to
release.

Udo- would you consider making your vote a +0 and then being part of the
future work towards improving this part of the process rather than having
to revert the change due to your minus one?


On Thu, Sep 19, 2019 at 1:17 PM Udo Kohlmeyer  wrote:


-1

I must agree with Owen's analysis.

It's a known problem, and it will not cause the system to stop working.
Yes, it is a bug and will cause issues with results, BUT it will NOT
affect the stability of the system. Which is one of the only reasons we
should be adding fixes to an already cut release branch.

--Udo

On 9/19/19 11:48 AM, Owen Nichols wrote:

Thank you for providing some context for what is being voted here.

Based on this information, I will give my vote as “+0” (imho it may not
meet the definition of a “critical fix”, but seems like the risk is low and
the community wants it, so I have no real objection).



On Sep 19, 2019, at 11:38 AM, Xiaojian Zhou  wrote:

Owen:
Here are the answers:

- Is this fixing an issue of Data loss? Performance degradation?
Backward-compatibility issue? Availability impacts?  Resource exhaustion
(threads, disk, cpu, memory, sockets, etc)?

Without the fix, fields in the inherited attributes cannot be indexed, if
it's user object. For example, I have a Customer class, which contains
phoneBook. I have a subclass LocalCustomer to inherit Customer class, then
I cannot index on phoneBook.

- Did this issue exist in the previous release?
Yes.

- What is the impact of not fixing it?
Customer will see it and they have seen it.

- What are the risks of introducing this change so close to shipping?
No risk. It's standalone fix. Not to impact any where else. And it will be
backported in future if we did not do it now.

- How extensively has the fix been tested on develop?
We introduced several dunit and junit tests.

- How “sensitive” is the area of code it touches?
Not sensitive.

- What new tests have been added?
New dunit tests and junit tests.

Regards
Gester

On Thu, Sep 19, 2019 at 11:21 AM Owen Nichols wrote:

On Sep 19, 2019, at 11:15 AM, Xiaojian Zhou  wrote:

Owen:

The reason is: it's already cherry-picked to 1.9.

Can you kindly point me to the specific SHA where this was fixed in 1.9?
I am not able to find it...


Gester

On Thu, Sep 19, 2019 at 11:13 AM Owen Nichols wrote:

It looks like this has already passed the vote, but I don’t see an
explanation anywhere in this thread for what makes this a "critical fix".

As I recall release/1.10.0 was branched at the beginning of August, so it
seems appropriate to apply a very high level of scrutiny to any continuing
proposals to further delay the release of 1.10.0.

- Is this fixing an issue of Data loss? Performance degradation?
Backward-compatibility issue? Availability impacts? Resource exhaustion
(threads, disk, cpu, memory, sockets, etc)?
- Did this issue exist in the previous release?
- What is the impact of not fixing it?
- What are the risks of introducing this change so close to shipping?
- How extensively has the fix been tested on develop?
- How “sensitive” is the area of code it touches?
- What new tests have been added?



On Sep 19, 2019, at 11:08 AM, Anilkumar Gingade <aging...@pivotal.io> wrote:

[DISCUSS] - Cutting of release 1.9.2

2019-09-20 Thread Udo Kohlmeyer

Hi there Geode Dev's,

I would like to propose a release for Geode 1.9.x that includes 
https://issues.apache.org/jira/browse/GEODE-7121.


This is an issue that is present in the 1.9.x branch, which is used by 
Spring Data Geode 2.1.10.


This issue can cause an application that uses Spring Data Geode to bootstrap 
GEODE to deadlock on start-up.


This is a critical system issue and, given that Spring Data Geode cannot 
change its underlying dependency to 1.10+, it would require this fix to be 
back-ported to the 1.9 branch.


--Udo



Re: [DISCUSS] - Cutting of release 1.9.2

2019-09-20 Thread Udo Kohlmeyer

John, thank you!

You are correct, not sure what I was thinking.. Geode 1.9.x IS used by 
SDG 2.2


On 9/20/19 1:11 PM, John Blum wrote:

Correction to *Udo's* comments!

SDG Moore/2.2 is based on Apache Geode 1.9, NOT SDG Lovelace/2.1 (which is
based on Apache Geode 1.6).

You can review the version compatibility matrix beginning with SBDG, here (
https://github.com/spring-projects/spring-boot-data-geode/wiki/Spring-Boot-for-Apache-Geode-and-Pivotal-GemFire-Version-Compatibility-Matrix
).

On Fri, Sep 20, 2019 at 1:09 PM John Blum  wrote:


Hi Kirk - SDG 2.3/Neuman, which is only after SDG 2.2/Moore GAs, which is
tentatively scheduled for Monday, Sept. 30th.

On Fri, Sep 20, 2019 at 1:01 PM Kirk Lund  wrote:


Hi Udo, SDG cannot upgrade to Geode 1.10.x until which version? SDG 2.2.0?

On Fri, Sep 20, 2019 at 12:45 PM Udo Kohlmeyer  wrote:


Hi there Geode Dev's,

I would like to propose a release for Geode 1.9.x that includes
https://issues.apache.org/jira/browse/GEODE-7121.

This is an issue that is current in the 1.9.x branch, which is used by
Spring Data Geode 2.1.10 <https://spring.io/projects/spring-data-geode>.

This issue can cause an application using Spring Data Geode to bootstrap
GEODE to deadlock upon start up.

This is critical system's issue and given that Spring Data Geode cannot
change its underlying dependency to 1.10+, would require this fix to be
back-ported to the 1.9 branch.

--Udo




--
-John
john.blum10101 (skype)





War's vs JArs

2019-09-24 Thread Udo Kohlmeyer

Hi there Geode Dev,

I've just run into an interesting situation where I've realized that 
since Geode 1.8.0, the maven repository artifacts for geode-web and 
geode-web-api have changed from type war to jars.


Is this something that was intentional?

--Udo



Re: War's vs JArs

2019-09-24 Thread Udo Kohlmeyer

Dan, thank you for the investigation.

The war artifacts have been published to maven since the dawn of time. 
Whilst I agree that they are not stand-alone, there is the ability to 
reference them via maven/gradle as dependencies and start a server using 
Geode APIs or other technologies like Spring Boot for Apache Geode 
or Spring Data for Apache Geode.


If we've missed those, would it make sense to maybe update the process 
with the next patch of 1.9.x (we already have a [DISCUSS] thread going 
to cut one)?


--Udo

On 9/24/19 11:12 AM, Dan Smith wrote:

Huh. Looks like we're missing a 'from components.web' in publish.gradle if
we want to publish the war files. I'm not sure why this has changed.

I'm not sure how we want to actually ship these projects - they are in the
.tgz of geode. They are not really standalone war files, from what I
remember - they need to be run within a geode server/locator.

-Dan

On Tue, Sep 24, 2019 at 10:55 AM Udo Kohlmeyer  wrote:


Hi there Geode Dev,

I've just run into an interesting situation where I've realized that
since Geode 1.8.0, the maven repository artifacts for geode-web and
geode-web-api have changed from type war to jars.

Is this something that was intentional?

--Udo




Re: [VOTE] Apache Geode 1.10.0.RC2

2019-09-24 Thread Udo Kohlmeyer
Given that 1.8 and 1.9 are currently publishing the jars to maven 
central instead of wars, I can only assume that 1.10 would do the same.

Also, given that everyone is surprised by this, it means that we have not 
addressed this issue.

Until we can confirm that we would publish war files instead of jars for 
geode-web and geode-web-api, I will have to vote -1. This of course would 
change once this issue is resolved.


-1

--Udo


Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.10.0.RC2.
Thanks to all the community members for their contributions to this
release!

Please do a review and give your feedback including the checks performed.
The voting deadline is *3PM PST Wed, September 25 2019*.

This candidate has passed RC Qualification checks (NEW) which can be

found

at:



https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-release-1-10-0-rc

Release notes can be found at:



https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.10.0

Please note that we are voting upon the source tags: *rel/v1.10.0.RC2*

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.10.0.RC2
Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.10.0.RC2
Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.10.0.RC2

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1059

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2
-PgeodeRepositoryUrl=
https://repository.apache.org/content/repositories/orgapachegeode-1059
build runAll

Regards
Dick Cavender



Re: [VOTE] Apache Geode 1.10.0.RC2

2019-09-24 Thread Udo Kohlmeyer

Hi there Robert,

Whilst it is not a new issue (for 1.10), missing something this 
critical is an automatic -1 in my books. It has been a problem since 
1.8 and we've just missed it; otherwise 1.8 onwards should have been 
rejected.


There is not a JIRA, as I have not yet received a definitive response on 
whether the change to JAR was expected or unexpected.


Also note that my '-1' is conditional on either fixing it or a response 
(and JIRA) outlining that this is a signed-off feature.


--Udo

On 9/24/19 1:21 PM, Robert Houghton wrote:

Udo, is there a Jira about this issue?
Since this is not a new issue in 1.10 (as you say, it is the case in 1.8,
1.9, 1.9.1) is it really a -1 ?

On Tue, Sep 24, 2019 at 12:01 PM Udo Kohlmeyer  wrote:


Given that 1.8, 1.9 are currently publishing the jar's to the maven
central, instead of wars, I can only assume that 1.10 would do the same.

Also, given that everyone is surprised by this, means that we have not
address this issue.

Until we can confirm that we would publish war files instead of jars for
geode-web and geode-web-api, I will have to vote. This of course would
change once this issue is resolved.

-1

--Udo


Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.10.0.RC2.
Thanks to all the community members for their contributions to this
release!

Please do a review and give your feedback including the checks

performed.

The voting deadline is *3PM PST Wed, September 25 2019*.

This candidate has passed RC Qualification checks (NEW) which can be

found

at:



https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-release-1-10-0-rc

Release notes can be found at:



https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.10.0

Please note that we are voting upon the source tags: *rel/v1.10.0.RC2*

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.10.0.RC2
Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.10.0.RC2
Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.10.0.RC2

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1059

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2
-PgeodeRepositoryUrl=
https://repository.apache.org/content/repositories/orgapachegeode-1059
build runAll

Regards
Dick Cavender



Re: [VOTE] Apache Geode 1.10.0.RC2

2019-09-24 Thread Udo Kohlmeyer

I'm going to change my vote to "+1".

This is an issue that needs to be resolved. I would have preferred 1.10, 
but maybe it is good enough to have it fixed in 1.11.


@Robert, https://issues.apache.org/jira/browse/GEODE-7241 is the JIRA 
you were looking for.


We definitely have to fix this issue for 1.9.x, so 1.9.2 will require 
this fix.


--Udo

On 9/24/19 1:21 PM, Robert Houghton wrote:

Udo, is there a Jira about this issue?
Since this is not a new issue in 1.10 (as you say, it is the case in 1.8,
1.9, 1.9.1) is it really a -1 ?

On Tue, Sep 24, 2019 at 12:01 PM Udo Kohlmeyer  wrote:


Given that 1.8, 1.9 are currently publishing the jar's to the maven
central, instead of wars, I can only assume that 1.10 would do the same.

Also, given that everyone is surprised by this, means that we have not
address this issue.

Until we can confirm that we would publish war files instead of jars for
geode-web and geode-web-api, I will have to vote. This of course would
change once this issue is resolved.

-1

--Udo


Hello Geode dev community,

This is a release candidate for Apache Geode, version 1.10.0.RC2.
Thanks to all the community members for their contributions to this
release!

Please do a review and give your feedback including the checks

performed.

The voting deadline is *3PM PST Wed, September 25 2019*.

This candidate has passed RC Qualification checks (NEW) which can be

found

at:



https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-release-1-10-0-rc

Release notes can be found at:



https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.10.0

Please note that we are voting upon the source tags: *rel/v1.10.0.RC2*

Apache Geode:
https://github.com/apache/geode/tree/rel/v1.10.0.RC2
Apache Geode examples:
https://github.com/apache/geode-examples/tree/rel/v1.10.0.RC2
Apache Geode native:
https://github.com/apache/geode-native/tree/rel/v1.10.0.RC2

Source and binary files:
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachegeode-1059

Geode's KEYS file containing PGP keys we use to sign the release:
https://github.com/apache/geode/blob/develop/KEYS

PS: Command to run geode-examples: ./gradlew -PgeodeReleaseUrl=
https://dist.apache.org/repos/dist/dev/geode/1.10.0.RC2
-PgeodeRepositoryUrl=
https://repository.apache.org/content/repositories/orgapachegeode-1059
build runAll

Regards
Dick Cavender



Re: [DISCUSS] GEODE-7241 - make Jar not War?

2019-09-24 Thread Udo Kohlmeyer
Maybe the better question is: WHY are we publishing geode-web and 
geode-web-api?


Pulse, from what I remember, could be a standalone deployment. At least 
that is what I remember.


Most likely an oversight...

--Udo

On 9/24/19 3:38 PM, Robert Houghton wrote:

The geode-pulse module also builds a war, but does not publish it. Is this
an oversight, or by design?

On Tue, Sep 24, 2019 at 3:34 PM Robert Houghton 
wrote:


I am working on the change to get the geode-web and geode-web-api war
artifacts published instead of the jars. I have found the
geode-web-management project is also producing a war artifact, in addition
to a jar. Do we want it to be published as well? What is the criterion we
use to decide?

I think this problem was an oversight from the changes PurelyApplied and I
made to the build when we made the publish plugin 'opt-in' instead of
forced by the root project. Easy to publish one or the other, but I am not
qualified to decide whether a war or jar is more appropriate for these
modules.

Thank you,
-Robert



Re: [DISCUSS] GEODE-7241 - make Jar not War?

2019-09-25 Thread Udo Kohlmeyer

Seems these should have been Jars all along...

On 9/24/19 8:09 PM, Jacob Barrett wrote:

Why publish them as WAR files at all? As they are currently packaged they can't 
be deployed in just any J2EE web container because they lack all the 
dependencies. Sure they look like WAR files internally but they are really 
modules that expect to run in and only in the Geode server. Publishing them as 
a WAR file gives the false impression that they can be consumed by any project 
as a full fledged WAR.

We should make up our own JAR spec based on WAR, perhaps calling it Geode Web 
ARchive or GWAR for short. Yes, I just went there… GWAR!!! 🤘🎸🤘🎸🤘🎸🤘🎸

But seriously, unless it can be deployed in any J2EE web container it shouldn’t 
be considered a WAR.

-Jake


On Sep 24, 2019, at 3:34 PM, Robert Houghton  wrote:

I am working on the change to get the geode-web and geode-web-api war
artifacts published instead of the jars. I have found the
geode-web-management project is also producing a war artifact, in addition
to a jar. Do we want it to be published as well? What is the criterion we
use to decide?

I think this problem was an oversight from the changes PurelyApplied and I
made to the build when we made the publish plugin 'opt-in' instead of
forced by the root project. Easy to publish one or the other, but I am not
qualified to decide whether a war or jar is more appropriate for these
modules.

Thank you,
-Robert


Re: [DISCUSS] GEODE-7241 - make Jar not War?

2019-09-25 Thread Udo Kohlmeyer
@Anthony, the ticket was updated. In a nutshell, running integration tests 
that use the REST endpoints requires starting the server with an 
embedded web server.

As all tests run on dependency management only and don't have access to 
a downloaded product, the HTTP endpoints are not part of the dependencies. 
These are added in by including the WAR files from mavenCentral.
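
As a rough sketch of what such a test server looks like (the member name,
port, and the assumption that the geode-web-api artifact is resolvable on
the test classpath are illustrative, not taken from the ticket):

import org.apache.geode.distributed.ServerLauncher;

public class EmbeddedRestServerSketch {

  public static void main(String[] args) {
    // Start a server with the embedded HTTP service and the developer REST API.
    // The REST endpoints are only served if the geode-web-api artifact is
    // available to the server, which is why that artifact needs to be
    // published and pulled in as a test dependency.
    ServerLauncher launcher = new ServerLauncher.Builder()
        .setMemberName("rest-enabled-server")
        .set("start-dev-rest-api", "true")
        .set("http-service-port", "8080")
        .create();

    launcher.start();

    // ... exercise the REST endpoints, e.g. http://localhost:8080/geode/v1 ...

    launcher.stop();
  }
}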


As @Dan pointed out, GEODE-5660 enabled this behavior.

I think this approach might actually even help with the testing of the 
REST Controller in the Geode codebase itself.


--Udo

On 9/25/19 8:32 AM, Anthony Baker wrote:

Udo,

Can you update GEODE-7241 to help us understand the reason why we need to 
publish geode-web* WARs to maven?  I get that we used to do this but I can’t 
recall why we choose that approach.

There is one request for Pulse on maven (GEODE-6208).

Anthony



On Sep 24, 2019, at 3:44 PM, Udo Kohlmeyer  wrote:

Maybe the better question is, WHY are we publishing geode-web and geode-web-api.

Pulse, from what I remember, could be a standalone deployment. At least that is 
what I remember.

Most likely an oversight...

--Udo

On 9/24/19 3:38 PM, Robert Houghton wrote:

The geode-pulse module also builds a war, but does not publish it. Is this
an oversight, or by design?

On Tue, Sep 24, 2019 at 3:34 PM Robert Houghton 
wrote:


I am working on the change to get the geode-web and geode-web-api war
artifacts published instead of the jars. I have found the
geode-web-management project is also producing a war artifact, in addition
to a jar. Do we want it to be published as well? What is the criterion we
use to decide?

I think this problem was an oversight from the changes PurelyApplied and I
made to the build when we made the publish plugin 'opt-in' instead of
forced by the root project. Easy to publish one or the other, but I am not
qualified to decide whether a war or jar is more appropriate for these
modules.

Thank you,
-Robert



Re: [DISCUSS] GEODE-7241 - make Jar not War?

2019-09-25 Thread Udo Kohlmeyer
So it seems there is consensus around jar vs war. WARs win by a nose 
(until we can find a more creative way to expose said artifacts).


That said, do we need to start another thread about fixing 1.8.x or 1.9.x?

I'm already considering proposing that GEODE-7241 is included in 
1.9.2, as that patch release is already being discussed to backport GEODE-7121.


--Udo

On 9/25/19 10:53 AM, Anthony Baker wrote:

Thanks for the reminder.  If these files are required to start a geode member, 
I agree they should be published artifacts.  Perhaps there’s a better way to 
pull them in…but this seems like the best option for now.

Anthony



On Sep 25, 2019, at 10:22 AM, Udo Kohlmeyer  wrote:

@Anthony. Ticket was updated.. In a nutshell, to run integration tests, using 
the REST endpoints, requires the starting of the server with and embedded 
web-server.

As all tests run on dependency management only and don't have access to a 
downloaded product, the HTTP endpoints are not part of the dependency. These 
are added in by including the WAR files from mavenCentral.

As @Dan pointed out, GEODE-5660 enabled this behavior.

I think this approach might actually even help with the testing of the REST 
Controller in the Geode codebase itself.

--Udo


Re: [DISCUSS] - Cutting of release 1.9.2

2019-09-25 Thread Udo Kohlmeyer

It seems there is consensus to cut a 1.9.2 release to include GEODE-7121.

--Udo

On 9/20/19 3:34 PM, Jens Deppe wrote:

+1 for creating a 1.9.2 release with this fix.

On Fri, Sep 20, 2019 at 2:00 PM Udo Kohlmeyer  wrote:


John, thank you!

You are correct, not sure what I was thinking.. Geode 1.9.x IS used by
SDG 2.2

On 9/20/19 1:11 PM, John Blum wrote:

Correction to *Udo's* comments!

SDG Moore/2.2 is based on Apache Geode 1.9, NOT SDG Lovelace/2.1 (which is
based on Apache Geode 1.6).

You can review the version compatibility matrix beginning with SBDG, here
(https://github.com/spring-projects/spring-boot-data-geode/wiki/Spring-Boot-for-Apache-Geode-and-Pivotal-GemFire-Version-Compatibility-Matrix).

On Fri, Sep 20, 2019 at 1:09 PM John Blum  wrote:


Hi Kirk - SDG 2.3/Neuman, which is only after SDG 2.2/Moore GAs, which is
tentatively scheduled for Monday, Sept. 30th.

On Fri, Sep 20, 2019 at 1:01 PM Kirk Lund  wrote:


Hi Udo, SDG cannot upgrade to Geode 1.10.x until which version? SDG 2.2.0?

On Fri, Sep 20, 2019 at 12:45 PM Udo Kohlmeyer  wrote:


Hi there Geode Dev's,

I would like to propose a release for Geode 1.9.x that includes
https://issues.apache.org/jira/browse/GEODE-7121.

This is an issue that is current in the 1.9.x branch, which is used by
Spring Data Geode 2.1.10 <https://spring.io/projects/spring-data-geode>.

This issue can cause an application using Spring Data Geode to bootstrap
GEODE to deadlock upon start up.

This is critical system's issue and given that Spring Data Geode cannot
change its underlying dependency to 1.10+, would require this fix to be
back-ported to the 1.9 branch.

--Udo



--
-John
john.blum10101 (skype)



Re: [DISCUSS] GEODE-7241 - make Jar not War?

2019-09-25 Thread Udo Kohlmeyer
@Anthony, you must have missed the LACK OF excitement in that 
question... 1.8.x should stay as is... I was merely asking the question.


--Udo

On 9/25/19 11:58 AM, Anthony Baker wrote:

Since SBDG is fixed to Geode 1.9.x I can see an argument for backporting to 
there.  I’m personally not as excited about a new 1.8 patch release but I’m 
open to hearing your ideas :-)

Anthony



On Sep 25, 2019, at 11:46 AM, Udo Kohlmeyer  wrote:

So it seems there is consensus around jar vs war. WAR's win by nose. (until we 
can find a more creative way to expose said artifacts)

That said, do we need to start another thread about fixing 1.8.x or 1.9.x?

I'm already considering proposing that GEODE-7241 is included into 1.9.2, as 
that patch release is already discussed to backport GEODE-7121.

--Udo

On 9/25/19 10:53 AM, Anthony Baker wrote:

Thanks for the reminder.  If these files are required to start a geode member, 
I agree they should be published artifacts.  Perhaps there’s a better way to 
pull them in…but this seems like the best option for now.

Anthony



On Sep 25, 2019, at 10:22 AM, Udo Kohlmeyer  wrote:

@Anthony. Ticket was updated.. In a nutshell, to run integration tests, using 
the REST endpoints, requires the starting of the server with and embedded 
web-server.

As all tests run on dependency management only and don't have access to a 
downloaded product, the HTTP endpoints are not part of the dependency. These 
are added in by including the WAR files from mavenCentral.

As @Dan pointed out, GEODE-5660 enabled this behavior.

I think this approach might actually even help with the testing of the REST 
Controller in the Geode codebase itself.

--Udo

