Re: geodeOldVersionInstalls

2020-04-28 Thread Kirk Lund
Thanks, yes, it is created by Gradle. Unfortunately, we have some tests
that keep part of their setup in Gradle files, which is a major
anti-pattern for tests. We should NOT be doing this. The setup for a test
should be within the test itself.

On Tue, Apr 28, 2020 at 4:32 PM Robert Houghton 
wrote:

> Pretty sure it is generated by Gradle in geode-assembly
>
> On Tue, Apr 28, 2020, 16:09 Kirk Lund  wrote:
>
> > Does anyone know how geodeOldVersionInstalls.txt gets created?
> >
> > When I try to run
> > TomcatSessionBackwardsCompatibilityTomcat8WithOldModuleCanDoPutsTest
> > locally, it fails with the following stack trace.
> >
> > I've searched for geodeOldVersionInstalls.txt and grepped for code that
> > involves the file, but I can't figure out how this file gets created or how
> > to run that test.
> >
> > Anyone have ideas how to debug this test or how to
> > ensure geodeOldVersionInstalls.txt gets created?
> >
> > java.lang.InternalError: VersionManager: unable to locate
> > geodeOldVersionInstalls.txt in class-path
> >
> > at org.apache.geode.test.version.VersionManager.checkForLoadFailure(VersionManager.java:150)
> > at org.apache.geode.test.version.VersionManager.getVersionsWithoutCurrent(VersionManager.java:141)
> > at org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTestBase.data(TomcatSessionBackwardsCompatibilityTestBase.java:55)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> > at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> > at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> > at org.junit.runners.Parameterized.allParameters(Parameterized.java:280)
> > at org.junit.runners.Parameterized.<init>(Parameterized.java:248)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> > at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104)
> > at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86)
> > at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
> > at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
> > at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
> > at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33)
> > at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:49)
> > at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> > at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> >
>


Re: geodeOldVersionInstalls

2020-04-28 Thread Robert Houghton
Pretty sure it is generated by Gradle in geode-assembly

On Tue, Apr 28, 2020, 16:09 Kirk Lund  wrote:

> Does anyone know how geodeOldVersionInstalls.txt gets created?
>
> When I try to run
> TomcatSessionBackwardsCompatibilityTomcat8WithOldModuleCanDoPutsTest
> locally, it fails with the following stack trace.
>
> I've searched for geodeOldVersionInstalls.txt and grepped for code that
> involves the file, but I can't figure out how this file gets created or how
> to run that test.
>
> Anyone have ideas how to debug this test or how to
> ensure geodeOldVersionInstalls.txt gets created?
>
> java.lang.InternalError: VersionManager: unable to locate
> geodeOldVersionInstalls.txt in class-path
>
> at org.apache.geode.test.version.VersionManager.checkForLoadFailure(VersionManager.java:150)
> at org.apache.geode.test.version.VersionManager.getVersionsWithoutCurrent(VersionManager.java:141)
> at org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTestBase.data(TomcatSessionBackwardsCompatibilityTestBase.java:55)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.runners.Parameterized.allParameters(Parameterized.java:280)
> at org.junit.runners.Parameterized.<init>(Parameterized.java:248)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104)
> at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86)
> at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
> at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
> at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
> at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:49)
> at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>


geodeOldVersionInstalls

2020-04-28 Thread Kirk Lund
Does anyone know how geodeOldVersionInstalls.txt gets created?

When I try to run
TomcatSessionBackwardsCompatibilityTomcat8WithOldModuleCanDoPutsTest
locally, it fails with the following stack trace.

I've searched for geodeOldVersionInstalls.txt and grepped for code that
involves the file, but I can't figure out how this file gets created or how
to run that test.

Anyone have ideas how to debug this test or how to
ensure geodeOldVersionInstalls.txt gets created?

java.lang.InternalError: VersionManager: unable to locate
geodeOldVersionInstalls.txt in class-path

at org.apache.geode.test.version.VersionManager.checkForLoadFailure(VersionManager.java:150)
at org.apache.geode.test.version.VersionManager.getVersionsWithoutCurrent(VersionManager.java:141)
at org.apache.geode.session.tests.TomcatSessionBackwardsCompatibilityTestBase.data(TomcatSessionBackwardsCompatibilityTestBase.java:55)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.runners.Parameterized.allParameters(Parameterized.java:280)
at org.junit.runners.Parameterized.<init>(Parameterized.java:248)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104)
at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:49)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
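
The error in this message says the file could not be found on the class-path. As a small debugging aid, the sketch below checks whether a named resource is visible to the JVM's system class loader. It assumes the file is looked up as an ordinary class-path resource, which this thread does not confirm; ClasspathProbe and its methods are hypothetical names and not part of Geode.

```java
import java.net.URL;

/** Hypothetical helper: reports whether a resource is visible on the class-path. */
final class ClasspathProbe {

  /** Returns the URL of the resource, or null if the system class loader cannot see it. */
  static URL locate(String resourceName) {
    return ClassLoader.getSystemResource(resourceName);
  }

  static boolean isOnClasspath(String resourceName) {
    return locate(resourceName) != null;
  }

  public static void main(String[] args) {
    // Defaults to the file named in the error message.
    String name = args.length > 0 ? args[0] : "geodeOldVersionInstalls.txt";
    URL url = locate(name);
    if (url == null) {
      System.out.println(name + " is NOT visible on the class-path");
    } else {
      System.out.println(name + " found at " + url);
    }
  }
}
```

Running it once with the class-path used by the IDE run configuration and once with the class-path Gradle uses for the test task would show whether the generated file is only present in the Gradle case.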


Re: Dangers of using sockets in unit tests

2020-04-28 Thread Kirk Lund
The UpgradeTest job refuses to pass on PR #5011. As far as I can tell,
changing a few tests from UnitTest to IntegrationTest should NOT cause
UpgradeTest to fail. It's the Tomcat Session upgrade tests, and they just
seem to fail every single time in this PR. Again, dunno what to do with
this. Anyone familiar with the Tomcat Session upgrade tests want to take a
look?

On Mon, Apr 27, 2020 at 4:25 PM Kirk Lund  wrote:

> The problems in WindowsUnitTest jobs in CI are caused by various unit
> tests that are either bugged or should be moved to IntegrationTests.
>
> I found 3 tests named *IntegrationTests in src/test that need to move:
>
> * geode-core/src/test/java/org/apache/geode/distributed/internal/InternalDistributedSystemIntegrationTest.java
>
> * geode-protobuf/src/test/java/org/apache/geode/internal/protocol/protobuf/v1/operations/OqlQueryRequestOperationHandlerIntegrationTest.java
>
> * geode-wan/src/test/java/org/apache/geode/internal/cache/wan/GatewaySenderEventRemoteDispatcherIntegrationTest.java
>
> These are unit tests that are better off renamed and moved to
> src/integrationTest because they touch so many Geode classes including
> singletons:
>
> * geode-core/src/test/java/org/apache/geode/distributed/internal/InternalLocatorTest.java
>
> * geode-core/src/test/java/org/apache/geode/distributed/internal/locks/DLockServiceJUnitTest.java
>
> This one creates a full DistributedSystem stack so it belongs in
> src/integrationTest:
>
> * geode-core/src/test/java/org/apache/geode/distributed/internal/InternalDistributedSystemTest.java
>
> And these unit tests create a full Cache stack so they belong in
> src/integrationTest:
>
> * geode-lucene/src/test/java/org/apache/geode/cache/lucene/FlatFormatPdxSerializerJunitTest.java
>
> * geode-protobuf/src/test/java/org/apache/geode/internal/protocol/protobuf/v1/serialization/codec/JsonPdxConverterJUnitTest.java
>
> And one last unit test that I was able to fix -- this test creates a spy
> to test as a partial mock with a common goof that results in a full Cache
> being created and then not used by the test:
>
> * extensions/geode-modules/src/test/java/org/apache/geode/modules/util/BootstrappingFunctionTest.java
>
> I filed a PR to fix all of the above issues:
> https://github.com/apache/geode/pull/5011
>
> Thanks,
> Kirk
>
> On Mon, Apr 27, 2020 at 1:58 PM Kirk Lund  wrote:
>
>> This test started failing consistently on Windows (5 builds in a row so
>> far!):
>>
>> org.apache.geode.internal.net.SocketCreatorFactoryJUnitTest >
>> testNewSSLConfigSSLComponentLocator FAILED
>> java.lang.AssertionError
>> at org.junit.Assert.fail(Assert.java:86)
>> at org.junit.Assert.assertTrue(Assert.java:41)
>> at org.junit.Assert.assertTrue(Assert.java:52)
>> at
>> org.apache.geode.internal.net.SocketCreatorFactoryJUnitTest.testNewSSLConfigSSLComponentLocator(SocketCreatorFactoryJUnitTest.java:106)
>>
>> The method testNewSSLConfigSSLComponentLocator is the first
>> in SocketCreatorFactoryJUnitTest to execute. And because a previous unit
>> test initialized the singleton either directly or indirectly without
>> cleaning it up in tearDown, it pollutes the JVM for later tests. The only
>> later unit test that is affected seems to be SocketCreatorFactoryJUnitTest.
>>
>> Now you could easily fix SocketCreatorFactoryJUnitTest by adding a new
>> setUp() method:
>>
>>
>> @Before
>> public void setUp() throws Exception {
>>   SocketCreatorFactory.close();
>> }
>>
>> @After
>> public void tearDown() throws Exception {
>>   SocketCreatorFactory.close();
>> }
>>
>> This change will fix this specific symptom, but not the underlying
>> problem -- which is "previous unit test initialized SocketCreatorFactory
>> without cleaning it up".
>>
>> You can see why it's a problem if a unit test fails to clean up
>> EVERYTHING that it set up. And you can see why singletons (especially internal
>> non-User API singletons) are so dangerous for maintaining a clean GREEN CI.
>>
>> Cheers,
>> Kirk
>>
>>
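
The pollution described in the quoted message can be reproduced without Geode at all. The sketch below is a minimal illustration in plain Java: FakeFactory is a hypothetical stand-in for a static singleton such as SocketCreatorFactory, and the two static methods play the roles of an earlier test that may forget to clean up and a later test that may or may not reset the singleton in its setUp. None of these names come from the Geode code base.

```java
/** Hypothetical stand-in for a static singleton that keeps its first configuration. */
final class FakeFactory {
  private static FakeFactory instance;
  private final String sslConfig;

  private FakeFactory(String sslConfig) {
    this.sslConfig = sslConfig;
  }

  /** The first caller decides the configuration; later callers get the same instance. */
  static synchronized FakeFactory getOrCreate(String sslConfig) {
    if (instance == null) {
      instance = new FakeFactory(sslConfig);
    }
    return instance;
  }

  static synchronized void close() {
    instance = null;
  }

  String sslConfig() {
    return sslConfig;
  }
}

final class SingletonPollutionDemo {

  /** Plays an earlier unit test that initializes the singleton and may skip cleanup. */
  static void earlierTest(boolean cleansUp) {
    FakeFactory.getOrCreate("ssl-disabled");
    if (cleansUp) {
      FakeFactory.close();
    }
  }

  /** Plays a later unit test that expects to see its own configuration. */
  static boolean laterTestPasses(boolean resetInSetUp) {
    if (resetInSetUp) {
      FakeFactory.close(); // defensive setUp, as suggested in the message
    }
    try {
      return "ssl-enabled".equals(FakeFactory.getOrCreate("ssl-enabled").sslConfig());
    } finally {
      FakeFactory.close(); // tearDown
    }
  }

  public static void main(String[] args) {
    earlierTest(false);
    System.out.println("leaky earlier test, no reset in setUp: " + laterTestPasses(false));
    earlierTest(false);
    System.out.println("leaky earlier test, reset in setUp:    " + laterTestPasses(true));
    earlierTest(true);
    System.out.println("clean earlier test, no reset in setUp: " + laterTestPasses(false));
  }
}
```

Only the first combination fails, which matches the point made in the message: resetting in setUp hides the symptom, while the real defect is the earlier test that never cleaned up.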


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-28 Thread Kirk Lund
In addition to PR precheckin jobs, I've also run a full regression against
the changes to make Cache.close() synchronous. There are failures, but I
have no idea if they are normal or "ok" failures or not. So I'm not sure
what to do next with this change unless someone else wants to review the
Hydra failures. This is the problem with having a bunch of non-open tests
that we can't really discuss on dev list. Let me know what you guys want to
do!

On Tue, Apr 21, 2020 at 2:27 PM Kirk Lund  wrote:

> PR #4963 https://github.com/apache/geode/pull/4963 is ready for review.
> It has passed precheckin once. After self-review, I reverted a couple small
> changes that weren't needed so it needs to go through precheckin again.
>
> On Fri, Apr 17, 2020 at 9:42 AM Kirk Lund  wrote:
>
>> Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
>> Cache.close() calls GeodeMemcachedService.close() which again calls
>> Cache.close(). Looks like the code base has lots of Cache.close() calls
>> -- all of them could theoretically cause issues. I hate to add ThreadLocal
>> isClosingThread or something like it just to allow reentrant calls to
>> Cache.close().
>>
>> Mark let the IntegrationTest job run for 7+ hours which shows the hang in
>> the Memcached IntegrationJUnitTest. (Thanks Mark!)
>>
>> On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund  wrote:
>>
>>> It timed out while running OldFreeListOffHeapRegionJUnitTest but I think
>>> the tests before it were responsible for the timeout being exceeded. I
>>> looked through all of the previously run tests and how long each took, but
>>> without some sort of database recording how long each test takes, it's
>>> impossible to know which test or tests take longer in any given PR.
>>>
>>> The IntegrationTest job that exceeded the timeout:
>>> https://concourse.apachegeode-ci.info/builds/147866
>>>
>>> The Test Summary for the above IntegrationTest job with Duration for
>>> each test:
>>> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/
>>>
>>> Unless we want to start tracking each test class/method and its Duration
>>> in a database, I don't see how we could look for trends or changes to
>>> identify test(s) that suddenly start taking longer. All of the tests take
>>> less than 3 minutes each, so unless one suddenly spikes to 10 minutes or
>>> more, there's really no way to find the test(s).
>>>
>>> On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols 
>>> wrote:
>>>
>>>> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one
>>>> <https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
>>>> that came in just under 45 minutes but did succeed.  It would be nice to
>>>> know what test is occasionally taking longer and why…
>>>>
>>>> Here’s an example of a previous timeout increase (Note that both the
>>>> job timeout and the callstack timeout should be increased by the same
>>>> amount): https://github.com/apache/geode/pull/4231
>>>>
>>>> > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
>>>> >
>>>> > Unfortunately, IntegrationTest exceeds timeout every time I trigger it. The
>>>> > cause does not appear to be a specific test or hang. I
>>>> > think IntegrationTest has already been running very close to the timeout
>>>> > and is exceeding it fairly often even without my changes in #4963.
>>>> >
>>>> > Should we increase the timeout for IntegrationTest? (Anyone know how to
>>>> > increase it?)
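
For readers unfamiliar with the ThreadLocal guard mentioned earlier in this thread (the one Kirk says he would rather not add), here is a minimal single-threaded sketch of the idea. GuardedCache and closeService are hypothetical names; the sketch only mimics the shape of the problem, a close() that triggers a service which calls close() again, and is not how Geode implements Cache.close().

```java
/** Hypothetical cache whose close() tolerates reentrant calls from the same thread. */
final class GuardedCache {
  // TRUE only while the current thread is already inside close().
  private final ThreadLocal<Boolean> closing = ThreadLocal.withInitial(() -> Boolean.FALSE);
  private int serviceCloseCalls;
  private boolean closed;

  void close() {
    if (closing.get()) {
      return; // reentrant call from a service that is being closed by this thread
    }
    closing.set(Boolean.TRUE);
    try {
      if (!closed) {
        closeService();
        closed = true;
      }
    } finally {
      closing.remove();
    }
  }

  /** Stand-in for a service whose own close() calls back into the cache. */
  private void closeService() {
    serviceCloseCalls++;
    close();
  }

  int serviceCloseCalls() {
    return serviceCloseCalls;
  }

  boolean isClosed() {
    return closed;
  }

  public static void main(String[] args) {
    GuardedCache cache = new GuardedCache();
    cache.close();
    cache.close();
    System.out.println("closed=" + cache.isClosed()
        + ", service closes=" + cache.serviceCloseCalls());
  }
}
```

Without the guard, the callback would recurse (or, with a blocking synchronous close, wait on itself). The sketch ignores concurrent callers from other threads, which a real implementation would also have to handle.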




Re: [DISCUSS] Publish Builds, not Snapshots

2020-04-28 Thread Jacob Barrett
Probably for a separate discussion, but I would opt for a future where the 
release manager validates and signs a specific build produced by the CI. 
This removes much of the room for error in what the release manager currently does, 
which has resulted in incorrect artifacts being voted on. 

In this model the release manager would be certifying a release with the full 
version number, or whatever we choose for the release version number.

Until then the release manager could just use the short number. In Gradle/Maven 
versioning a version without qualifier is always newer than one with. So 1.13.0 
is newer than any 1.13.0-build.123.

-Jake


> On Apr 28, 2020, at 10:24 AM, Anthony Baker  wrote:
> 
> Note that I’m asking about building on my local system.  Will the version 
> listed in gradle.properties use the full 1.13.0-build.123 syntax?  How would 
> it get bumped?
> 
> Would a release manager end up using just 1.13.0?  Hoping for yes :-)
> 
> Anthony
> 
>> On Apr 28, 2020, at 9:24 AM, Robert Houghton  wrote:
>> 
>> @anthony
>> /develop would say 1.13.0-build.230 (as of this morning).
>> Once cut, /release/1.13 would say 1.13.0-build.231, and /develop would
>> switch to 1.14.0-build.1.
>> 
>> gfsh would return that full value. That is the artifact version.
>> 
>> On Mon, Apr 27, 2020 at 4:38 PM Anthony Baker  wrote:
>> 
>>> If I build the /develop branch and run `gfsh version` what will it print?
>>> 
>>> If I build the soon-to-be /release/1.13 branch and run `gfsh version` what
>>> will it print?
>>> 
>>> Anthony
>>> 
>>> 
>>> On 4/27/20, 4:32 PM, "Robert Houghton"  wrote:
>>> 
>>>   The artifact would change from "1.13.0-SNAPSHOT" to "1.13.0-build.123".
>>> 
>>> 
>>> 
>>> 
>>> 
> 
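
The ordering discussed in this message (an unqualified 1.13.0 sorting after any 1.13.0-build.N, and build numbers increasing within a release line) can be written down as a small comparator. This is a deliberately simplified sketch that only understands versions of the form MAJOR.MINOR.PATCH or MAJOR.MINOR.PATCH-build.N. It is not Maven's or Gradle's actual comparison algorithm, which handle many more qualifier forms and may order unknown qualifiers differently, and BuildVersionOrder is a hypothetical name.

```java
/** Simplified ordering for "X.Y.Z" and "X.Y.Z-build.N" version strings. */
final class BuildVersionOrder {

  /** Returns a negative, zero, or positive value as a is older than, equal to, or newer than b. */
  static int compare(String a, String b) {
    String[] partsA = a.split("-", 2);
    String[] partsB = b.split("-", 2);
    int base = compareDotted(partsA[0], partsB[0]);
    if (base != 0) {
      return base;
    }
    boolean qualifiedA = partsA.length > 1;
    boolean qualifiedB = partsB.length > 1;
    if (!qualifiedA && !qualifiedB) {
      return 0;
    }
    if (!qualifiedA) {
      return 1; // the unqualified version is treated as the newer one
    }
    if (!qualifiedB) {
      return -1;
    }
    return Integer.compare(buildNumber(partsA[1]), buildNumber(partsB[1]));
  }

  private static int compareDotted(String x, String y) {
    String[] xs = x.split("\\.");
    String[] ys = y.split("\\.");
    for (int i = 0; i < Math.max(xs.length, ys.length); i++) {
      int xi = i < xs.length ? Integer.parseInt(xs[i]) : 0;
      int yi = i < ys.length ? Integer.parseInt(ys[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  private static int buildNumber(String qualifier) {
    if (!qualifier.startsWith("build.")) {
      throw new IllegalArgumentException("unsupported qualifier: " + qualifier);
    }
    return Integer.parseInt(qualifier.substring("build.".length()));
  }

  public static void main(String[] args) {
    System.out.println(compare("1.13.0", "1.13.0-build.123") > 0);
    System.out.println(compare("1.13.0-build.231", "1.13.0-build.230") > 0);
    System.out.println(compare("1.14.0-build.1", "1.13.0-build.231") > 0);
  }
}
```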



Re: [DISCUSS] Publish Builds, not Snapshots

2020-04-28 Thread Anthony Baker
Note that I’m asking about building on my local system.  Will the version 
listed in gradle.properties use the full 1.13.0-build.123 syntax?  How would it 
get bumped?

Would a release manager end up using just 1.13.0?  Hoping for yes :-)

Anthony

> On Apr 28, 2020, at 9:24 AM, Robert Houghton  wrote:
> 
> @anthony
> /develop would say 1.13.0-build.230 (as of this morning).
> Once cut, /release/1.13 would say 1.13.0-build.231, and /develop would
> switch to 1.14.0-build.1.
> 
> gfsh would return that full value. That is the artifact version.
> 
> On Mon, Apr 27, 2020 at 4:38 PM Anthony Baker  wrote:
> 
>> If I build the /develop branch and run `gfsh version` what will it print?
>> 
>> If I build the soon-to-be /release/1.13 branch and run `gfsh version` what
>> will it print?
>> 
>> Anthony
>> 
>> 
>> On 4/27/20, 4:32 PM, "Robert Houghton"  wrote:
>> 
>>The artifact would change from "1.13.0-SNAPSHOT" to "1.13.0-build.123".
>> 
>> 
>> 
>> 
>> 



Re: [DISCUSS] Publish Builds, not Snapshots

2020-04-28 Thread Robert Houghton
@anthony
/develop would say 1.13.0-build.230 (as of this morning).
Once cut, /release/1.13 would say 1.13.0-build.231, and /develop would
switch to 1.14.0-build.1.

gfsh would return that full value. That is the artifact version.

On Mon, Apr 27, 2020 at 4:38 PM Anthony Baker  wrote:

> If I build the /develop branch and run `gfsh version` what will it print?
>
> If I build the soon-to-be /release/1.13 branch and run `gfsh version` what
> will it print?
>
> Anthony
>
>
> On 4/27/20, 4:32 PM, "Robert Houghton"  wrote:
>
> The artifact would change from "1.13.0-SNAPSHOT" to "1.13.0-build.123".
>
>
>
>
>


Re: [DISCUSS] Publish Builds, not Snapshots

2020-04-28 Thread Donal Evans
That just means we get to be trailblazers, I guess.

On Tue, Apr 28, 2020 at 9:02 AM Robert Houghton 
wrote:

> @donal I do not have an example of a Java project that builds using a
> Gradle multi-module build, that cares as much as we do about re-running CI
> tests with identical inputs. I wish I did, I would crib from their answer
> sheet without shame or remorse.
>
> On Mon, Apr 27, 2020 at 4:37 PM Donal Evans  wrote:
>
> > Do we know if this is an issue that other open source projects have dealt
> > with? And if so, is this proposed solution similar to what they might
> have
> > done to remedy it?
> > 
> > From: Robert Houghton 
> > Sent: Monday, April 27, 2020 4:31 PM
> > To: dev@geode.apache.org 
> > Subject: Re: [DISCUSS] Publish Builds, not Snapshots
> >
> > The artifact would change from "1.13.0-SNAPSHOT" to "1.13.0-build.123".
> >
> > The number after the "build" slug is auto-incremented by our CI system
> > anyway, as the "geode-build-version" semver resource. We are actually
> doing
> > *more* work in Gradle to truncate that number from the current "SNAPSHOT"
> > value.
> >
> > On Mon, Apr 27, 2020 at 3:41 PM Anthony Baker  wrote:
> >
> > > @Robert, can you show some examples of what the build number would be
> > > under this proposal?  Does 1.13.0-SNAPSHOT become 1.13.0.N where N
> > > increments every build?
> > >
> > > Seems reasonable.  Since the consumers of pre-release artifacts are
> > either
> > > a) this project or b) close related projects for
> > > integration-testing-purposes-only I’m not super worried about the ugly
> > > syntax.
> > >
> > >
> > > Anthony
> > >
> > >
> > > > On Apr 27, 2020, at 3:25 PM, Jacob Barrett 
> > wrote:
> > > >
> > > > It is unfortunate that the Maven/Gradle community hasn’t addressed
> this
> > > glaring issue with SNAPSHOT for decades now (well maybe not decades but
> > > certainly decade). It is also unfortunate that the Maven version
> > coordinate
> > > is ugly. Aside from that I am totally onboard. Yay for reproducible
> > builds
> > > and predictable downstream builds!
> > > >
> > > > With SNAPSHOTS in a repo the repository automatically prunes back old
> > > builds. Do we have any concerns about having a plethora of builds
> filling
> > > up this new pre-release repository?
> > > >
> > > > -Jake
> > > >
> > > >> On Apr 27, 2020, at 3:21 PM, Robert Houghton 
> > > wrote:
> > > >>
> > > >> Hello to the community,
> > > >>
> > > >> tl;dr - Lets publish builds, not snapshots, for repeatable CI
> builds,
> > as
> > > >> GEODE-8016[1]. Communicate desired artifact version via the existing
> > > >> 'UpdatePassingTokens' job.
> > > >>
> > > >> I have been working on the Geode build and CI systems for a long
> time,
> > > and
> > > >> it has irked me that the geode-examples pipeline[2] does not build
> and
> > > test
> > > >> against the latest artifacts from the develop pipeline. Some work
> has
> > > been
> > > >> done already to allow this via "composite" builds for local testing
> > > without
> > > >> needing to publish Geode to your local Maven repository.
> > > >>
> > > >> From a Concourse CI perspective, composite builds are costly due to
> > the
> > > >> rebuild of the upstream artifacts. They allow repeatable builds, but
> > > only
> > > >> by rebuilding those dependencies. Better would be to point to
> upstream
> > > >> artifacts as concrete build versions. SNAPSHOT builds can and do
> roll
> > > >> (invisibly) as new versions are published. Discrete, numbered builds
> > do
> > > >> not. Downstream consumers can use greedy version specifiers to get
> > their
> > > >> current behavior of "latest".
> > > >>
> > > >> Gradle: 'org.apache.geode:geode-core:1.13.0+'
> > > >> Maven: 'org.apache.geode'
> > > >>'geode-core'
> > > >>'[1.13.0,1.14.0)'
> > > >>
> > > >> What do you all think? Discuss!
> > > >> -Robert Houghton
> > > >>
> > > >> [1] https://issues.apache.org/jira/browse/GEODE-8016
> > > >> [2] https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-examples
> > > >
> > >
> > >
> >
>


Re: [DISCUSS] Publish Builds, not Snapshots

2020-04-28 Thread Robert Houghton
@donal I do not have an example of a Java project that builds using a
Gradle multi-module build, that cares as much as we do about re-running CI
tests with identical inputs. I wish I did, I would crib from their answer
sheet without shame or remorse.

On Mon, Apr 27, 2020 at 4:37 PM Donal Evans  wrote:

> Do we know if this is an issue that other open source projects have dealt
> with? And if so, is this proposed solution similar to what they might have
> done to remedy it?
> 
> From: Robert Houghton 
> Sent: Monday, April 27, 2020 4:31 PM
> To: dev@geode.apache.org 
> Subject: Re: [DISCUSS] Publish Builds, not Snapshots
>
> The artifact would change from "1.13.0-SNAPSHOT" to "1.13.0-build.123".
>
> The number after the "build" slug is auto-incremented by our CI system
> anyway, as the "geode-build-version" semver resource. We are actually doing
> *more* work in Gradle to truncate that number from the current "SNAPSHOT"
> value.
>
> On Mon, Apr 27, 2020 at 3:41 PM Anthony Baker  wrote:
>
> > @Robert, can you show some examples of what the build number would be
> > under this proposal?  Does 1.13.0-SNAPSHOT become 1.13.0.N where N
> > increments every build?
> >
> > Seems reasonable.  Since the consumers of pre-release artifacts are
> either
> > a) this project or b) close related projects for
> > integration-testing-purposes-only I’m not super worried about the ugly
> > syntax.
> >
> >
> > Anthony
> >
> >
> > > On Apr 27, 2020, at 3:25 PM, Jacob Barrett 
> wrote:
> > >
> > > It is unfortunate that the Maven/Gradle community hasn’t addressed this
> > glaring issue with SNAPSHOT for decades now (well maybe not decades but
> > certainly decade). It is also unfortunate that the Maven version
> coordinate
> > is ugly. Aside from that I am totally onboard. Yay for reproducible
> builds
> > and predictable downstream builds!
> > >
> > > With SNAPSHOTS in a repo the repository automatically prunes back old
> > builds. Do we have any concerns about having a plethora of builds filling
> > up this new pre-release repository?
> > >
> > > -Jake
> > >
> > >> On Apr 27, 2020, at 3:21 PM, Robert Houghton 
> > wrote:
> > >>
> > >> Hello to the community,
> > >>
> > >> tl;dr - Lets publish builds, not snapshots, for repeatable CI builds,
> as
> > >> GEODE-8016[1]. Communicate desired artifact version via the existing
> > >> 'UpdatePassingTokens' job.
> > >>
> > >> I have been working on the Geode build and CI systems for a long time,
> > and
> > >> it has irked me that the geode-examples pipeline[2] does not build and
> > test
> > >> against the latest artifacts from the develop pipeline. Some work has
> > been
> > >> done already to allow this via "composite" builds for local testing
> > without
> > >> needing to publish Geode to your local Maven repository.
> > >>
> > >> From a Concourse CI perspective, composite builds are costly due to
> the
> > >> rebuild of the upstream artifacts. They allow repeatable builds, but
> > only
> > >> by rebuilding those dependencies. Better would be to point to upstream
> > >> artifacts as concrete build versions. SNAPSHOT builds can and do roll
> > >> (invisibly) as new versions are published. Discrete, numbered builds
> do
> > >> not. Downstream consumers can use greedy version specifiers to get
> their
> > >> current behavior of "latest".
> > >>
> > >> Gradle: 'org.apache.geode:geode-core:1.13.0+'
> > >> Maven: 'org.apache.geode'
> > >>'geode-core'
> > >>'[1.13.0,1.14.0)'
> > >>
> > >> What do you all think? Discuss!
> > >> -Robert Houghton
> > >>
> > >> [1] https://issues.apache.org/jira/browse/GEODE-8016
> > >> [2] https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-examples
> > >
> >
> >
>


Handling packet drop between sites

2020-04-28 Thread Mario Kevo
Hi geode-dev,

I have a question about how Geode behaves when some packets from a batch are 
dropped.
I created a Geode WAN setup with two sites and established replication between them. 
I also modified iptables to drop all packets that arrive at the receiver port.
In that case some threads get stuck. It seems the gateway sender never 
receives any response back.
[warn 2020/04/27 13:19:04.667 CEST  tid=0x11] Thread 128 (0x80) 
is stuck

[warn 2020/04/27 13:19:04.669 CEST  tid=0x11] Thread <128> 
(0x80) that was executed at <27 Apr 2020 13:18:13 CEST> has been stuck for 
<50.997 seconds> and number of thread monitor iteration <1>
Thread Name  state 
Executor Group 
Monitored metric 
Thread stack:
java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
java.net.Socket.connect(Socket.java:607)
org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102)
org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51)
org.apache.geode.distributed.internal.tcpserver.TcpSocketCreatorImpl.connect(TcpSocketCreatorImpl.java:59)
org.apache.geode.distributed.internal.tcpserver.ClientSocketCreatorImpl.connect(ClientSocketCreatorImpl.java:54)
org.apache.geode.cache.client.internal.ConnectionImpl.connect(ConnectionImpl.java:94)
org.apache.geode.cache.client.internal.ConnectionConnector.connectClientToServer(ConnectionConnector.java:75)
org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:118)
org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:206)
org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.forceCreateConnection(ConnectionManagerImpl.java:216)
org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:326)
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:329)
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:303)
org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:839)
org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:36)
org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:90)
org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1329)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:276)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

I also ran the same test with 200K entries while dropping 70% of packets. 
The same exception appears again, and it takes approx. 40 minutes to transmit 
all entries to the other site.

How does Geode handle dropped packets within a batch? Has anyone tested 
this behavior?

Thanks,
Mario
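
The stack trace in this message shows a thread blocked inside a socket connect while the firewall silently drops packets. As a generic illustration of how a connect attempt can be bounded in plain Java, the sketch below uses java.net.Socket with an explicit connect timeout. It is not Geode code and makes no claim about which timeouts Geode configures; BoundedConnect and its methods are hypothetical names.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper: attempts a TCP connect that gives up after a fixed time. */
final class BoundedConnect {

  /** Returns true if a TCP connection is established within timeoutMillis. */
  static boolean canConnect(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      // With dropped packets this call would otherwise wait for the OS-level timeout.
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Covers SocketTimeoutException (no answer in time) and ConnectException (refused).
      return false;
    }
  }

  /** Self-check against a listener opened on the loopback interface. */
  static boolean connectsToLocalListener() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
      return canConnect(loopback.getHostAddress(), server.getLocalPort(), 1000);
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("local listener reachable: " + connectsToLocalListener());
  }
}
```

A bounded connect only limits how long a single attempt blocks. It does not by itself answer the question of how a batch is retried or acknowledged when packets are lost.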