Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-28 Thread Dan Smith
My understanding from our previous discussion about upgrading Lucene was that
we talked about pausing the asynchronous indexing process during the rolling
upgrade. I don't remember a discussion that it was OK to not allow queries
during the upgrade. But this is what we added to the docs:

"All cluster members must be running the same major Lucene version in order to 
execute Lucene queries."

What happens if a user runs a query during the rolling upgrade and why do we 
need to have this restriction? It seems to me like at a minimum we need to 
allow queries during the upgrade.

We also should consider what will happen to users with server-side query or 
indexing code - will they be able to upgrade or are they likely to hit breaking 
changes in the Lucene API?

-Dan

Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-28 Thread Jacob Barrett
I asked this previously but didn’t see a response in the follow-ups. If we
were to improve the code so that it doesn’t stack overflow when it can’t
read the index version on a specific member/bucket, and instead ignores the
members/buckets that don’t return a result because they can’t read the index,
could we not aggregate the results of the members/buckets that can read the
index and still produce the expected results?

It seems like there should be sufficient coverage across all the nodes: if
the buckets are redundant, given they are replicating, one node should be
able to read the index contained in each bucket, and the other nodes can be
ignored for that bucket.
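
A minimal sketch of that idea in plain Java, under stated assumptions: `BucketCopy`, `Hit`, and `UnreadableIndexException` are hypothetical stand-ins invented for illustration, not Geode or Lucene types. For each bucket it tries the redundant copies in turn, skips any copy that cannot read the index, and merges the hits from the first copy that can. Buckets with no readable copy are reported rather than retried, so there is no re-execution loop.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class MixedVersionQuerySketch {

  /** Hypothetical: thrown by a copy whose Lucene version cannot read the index. */
  static final class UnreadableIndexException extends RuntimeException {
    UnreadableIndexException(String message) {
      super(message);
    }
  }

  /** Hypothetical: one scored query hit. */
  static final class Hit {
    final String key;
    final float score;

    Hit(String key, float score) {
      this.key = key;
      this.score = score;
    }
  }

  /** Hypothetical: one redundant copy of a bucket, hosted on some member. */
  interface BucketCopy {
    List<Hit> search(String query);
  }

  /**
   * Queries each bucket once, using the first copy that can read its index.
   * Bucket ids with no readable copy are appended to unreadableBuckets.
   */
  static List<Hit> query(Map<Integer, List<BucketCopy>> copiesByBucket, String query,
      int limit, List<Integer> unreadableBuckets) {
    List<Hit> merged = new ArrayList<>();
    for (Map.Entry<Integer, List<BucketCopy>> entry : copiesByBucket.entrySet()) {
      boolean answered = false;
      for (BucketCopy copy : entry.getValue()) {
        try {
          merged.addAll(copy.search(query));
          answered = true;
          break; // one readable copy per bucket is enough
        } catch (UnreadableIndexException e) {
          // This member cannot read the index format; move on to the next copy.
        }
      }
      if (!answered) {
        unreadableBuckets.add(entry.getKey());
      }
    }
    // Highest score first, then truncate to the requested limit.
    merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
    return merged.size() > limit ? new ArrayList<>(merged.subList(0, limit)) : merged;
  }
}
```

Whether results are still "expected" depends on every bucket having at least one readable copy; the `unreadableBuckets` list makes that case visible to the caller instead of hiding it.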

-Jake


> On Sep 27, 2021, at 2:33 PM, Nabarun Nag wrote:
> 
> The check that prevents query execution in mixed-version mode also caused
> some problems: the query function is re-executed repeatedly, and that
> results in a stack overflow.
> 
> 
> 
> From: Nabarun Nag 
> Sent: Monday, September 27, 2021 2:30 PM
> To: dev@geode.apache.org 
> Subject: Re: [DISCUSS] Upgrading to Lucene 7.1.0
> 
> In simple words, if Lucene indexes are created by the new version (7.1.0) and
> then replicated to members that are still on the older version, those members
> won't understand the index, and the event processors start throwing exceptions.
> 
> This can be seen by re-enabling the query execution in the DUnit tests and
> commenting out the check blocks: [develop SHA: 
> 68629356f561a932f5dfbace70b01d9971a42473]
> 
> In LuceneEventListener:
> if (cache.hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
>   logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
>   return false;
> }
> 
> In IndexRepositoryFactory:
> if (userRegion.getCache() != null
>     && userRegion.getCache().hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
>   logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
>   return null;
> }
> 
> 
> This is the exception that will be encountered:
> 
> [Exception]
> 
> [vm2_v1.2.0] [warn 2021/09/27 14:24:42.251 PDT  GatewaySender_AsyncEventQueue_index#_aRegion_5> tid=102] An Exception 
> occurred. The dispatcher will continue.
> [vm2_v1.2.0] org.apache.geode.InternalGemFireError: Unable to create index 
> repository
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
> [vm2_v1.2.0] at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:76)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.LuceneEventListener.process(LuceneEventListener.java:87)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.LuceneEventListener.processEvents(LuceneEventListener.java:64)
> [vm2_v1.2.0] at 
> org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher.dispatchBatch(GatewaySenderEventCallbackDispatcher.java:154)
> [vm2_v1.2.0] at 
> org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher.dispatchBatch(GatewaySenderEventCallbackDispatcher.java:80)
> [vm2_v1.2.0] at 
> org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor.processQueue(AbstractGatewaySenderEventProcessor.java:609)
> [vm2_v1.2.0] at 
> org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor.run(AbstractGatewaySenderEventProcessor.java:1051)
> [vm2_v1.2.0] Caused by: org.apache.lucene.index.IndexFormatTooNewException: 
> Format version is not supported (resource 
> BufferedChecksumIndexInput(segments_2)): 7 (needs to be between 4 and 6)
> [vm2_v1.2.0] at 
> org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:216)
> [vm2_v1.2.0] at 
> org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:302)
> [vm2_v1.2.0] at 
> org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
> [vm2_v1.2.0] at 
> org.apache.lucene.index.IndexWriter.(IndexWriter.java:938)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:84)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:42)
> [vm2_v1.2.0] at 
> org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:116)
> 

Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-28 Thread Nabarun Nag
But Mario, just for my clarification: if we re-enable the queries in the tests
in mixed-version server mode, they run into a stack overflow. That is what I
saw when I set hasLuceneVersionMismatch(host) to false in the test so that it
runs the query.

Regards
Naba


Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-28 Thread Mario Kevo
Hi all,

Just a small clarification about the reverted PR.

There were a lot of changes between Lucene versions 6.x and 7.x. There is an
article about that:
Upgrading+to+Lucene+7.1.0.

The first larger change was in the scoring mechanism. We adapted it to one that
is correct for us (verified by DistributedScoringJUnitTest).

The main change was in the Lucene index format. That is where we ran into a
problem with our tests: Lucene 6.x cannot read the index format of Lucene 7.x.
Through the PRs we decided to include the Lucene uplift in Geode 1.15.0 and to
add a check that all members are on version 1.15.0 or higher (this check should
be updated after any later uplift to a Lucene version that changes the index
format). If the check passes, Lucene queries are allowed; if not, a log message
is printed saying that not all members are on version 1.15.0 or higher.
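
A minimal, self-contained sketch of such a gate, for illustration only. `MemberVersion` and `queriesAllowed` are hypothetical names; the actual check in Geode is the `hasMemberOlderThan(KnownVersion.GEODE_1_15_0)` call quoted elsewhere in this thread. The sketch compares versions numerically, which matters because a string comparison would order "1.2.0" after "1.15.0".

```java
import java.util.List;
import java.util.logging.Logger;

final class LuceneQueryGateSketch {
  private static final Logger logger = Logger.getLogger("LuceneQueryGateSketch");

  /** Hypothetical stand-in for a member's Geode version. */
  static final class MemberVersion implements Comparable<MemberVersion> {
    final int major;
    final int minor;
    final int patch;

    MemberVersion(int major, int minor, int patch) {
      this.major = major;
      this.minor = minor;
      this.patch = patch;
    }

    @Override
    public int compareTo(MemberVersion other) {
      // Compare numerically, field by field.
      if (major != other.major) {
        return Integer.compare(major, other.major);
      }
      if (minor != other.minor) {
        return Integer.compare(minor, other.minor);
      }
      return Integer.compare(patch, other.patch);
    }

    @Override
    public String toString() {
      return major + "." + minor + "." + patch;
    }
  }

  /** Assumed minimum: the release that carries the Lucene uplift. */
  static final MemberVersion MINIMUM = new MemberVersion(1, 15, 0);

  /** Returns true only if no member is older than MINIMUM; logs otherwise. */
  static boolean queriesAllowed(List<MemberVersion> members) {
    for (MemberVersion member : members) {
      if (member.compareTo(MINIMUM) < 0) {
        logger.info("Some members are older than " + MINIMUM);
        return false;
      }
    }
    return true;
  }
}
```

With members on 1.15.0 and 1.16.0 the gate returns true; adding a 1.2.0 member makes it return false and log the message.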

Also, you can find a discussion on the dev list from 2 years ago about the
Lucene upgrade: Lucene 
Upgrade

BR,
Mario




From: Udo Kohlmeyer 
Sent: September 28, 2021 1:44
To: dev@geode.apache.org 
Subject: Re: [DISCUSS] Upgrading to Lucene 7.1.0

Might I propose something here?

There is currently a significant amount of work going into completing
Geode-8705, which is the Classloader isolation. We are currently targeting
this for release in Geode 1.16.

My proposal is that we use the capability that Patrick demo’d at the Community
meeting (on this topic), where one can unload and load extensions at runtime
(like our integration with Lucene). This means that one could possibly do a
rolling upgrade on the existing system and keep the version of the Lucene
integration stable.

Once the whole system has been upgraded, the existing Lucene extension
component is unloaded and the newer version of the extension component is
loaded. This means that, at runtime, there will be a period of time where
Lucene queries are not available. As part of the “load” lifecycle of the
extension, there needs to be an initialization step that initializes the
extension component safely.

Once initialized, Lucene queries can become available again.

This of course requires some work around the lifecycles of extension components
and making sure that one can add the extension at runtime and safely
initialize it.
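
A rough, single-threaded sketch of the lifecycle described above, as a small state machine. Everything here (`ExtensionLifecycleSketch`, its states and methods) is hypothetical and for illustration; it is not the GEODE-8705 API, and a real implementation would also need to handle concurrency.

```java
final class ExtensionLifecycleSketch {

  /** Hypothetical lifecycle states of the Lucene extension component. */
  enum State { UNLOADED, INITIALIZING, AVAILABLE }

  private State state = State.UNLOADED;
  private String loadedVersion;

  /** Loads a version of the extension and runs its initialization step. */
  void load(String version, Runnable initializer) {
    if (state != State.UNLOADED) {
      throw new IllegalStateException("Unload the current extension first");
    }
    state = State.INITIALIZING;
    try {
      initializer.run(); // the "safe initialization" step of the load lifecycle
      loadedVersion = version;
      state = State.AVAILABLE;
    } catch (RuntimeException e) {
      state = State.UNLOADED; // failed initialization leaves nothing loaded
      throw e;
    }
  }

  /** Unloads the extension; queries are rejected until the next load. */
  void unload() {
    state = State.UNLOADED;
    loadedVersion = null;
  }

  /** Rejects queries unless the extension is loaded and initialized. */
  String query(String queryString) {
    if (state != State.AVAILABLE) {
      throw new IllegalStateException(
          "Lucene queries are unavailable while the extension is " + state);
    }
    return "results for '" + queryString + "' from extension " + loadedVersion;
  }

  State state() {
    return state;
  }
}
```

After the rolling upgrade finishes, the sequence would be `unload()`, then `load("7.1.0", initializer)`; any `query` call between the two is rejected, which is the unavailability window mentioned above.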

I think this approach allows for a more seamless (lower downtime) upgrade of
system and extension components.

Thoughts?

--Udo

From: Nabarun Nag 
Date: Tuesday, September 28, 2021 at 7:33 AM
To: dev@geode.apache.org 
Subject: Re: [DISCUSS] Upgrading to Lucene 7.1.0
The check that prevents query execution in mixed-version mode also caused
some problems: the query function is re-executed repeatedly, and that
results in a stack overflow.



From: Nabarun Nag 
Sent: Monday, September 27, 2021 2:30 PM
To: dev@geode.apache.org 
Subject: Re: [DISCUSS] Upgrading to Lucene 7.1.0

In simple words, if Lucene indexes are created by the new version (7.1.0) and
then replicated to members that are still on the older version, those members
won't understand the index, and the event processors start throwing exceptions.
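
To illustrate the failure mode, here is a small sketch of the kind of range check a reader applies to the format version stored in an index file. The names `IndexFormatCheckSketch`, `FormatTooNewException`, and `checkFormat` are hypothetical stand-ins, not Lucene's API; the message text and the supported range of 4 to 6 are taken from the IndexFormatTooNewException quoted elsewhere in this thread.

```java
final class IndexFormatCheckSketch {

  /** Hypothetical stand-in for Lucene's IndexFormatTooNewException. */
  static final class FormatTooNewException extends RuntimeException {
    FormatTooNewException(String message) {
      super(message);
    }
  }

  /**
   * Rejects a file whose stored format version is outside the range this
   * reader supports. An older reader fails here on an index written by a
   * newer writer.
   */
  static void checkFormat(String resource, int actual, int minSupported, int maxSupported) {
    if (actual > maxSupported) {
      throw new FormatTooNewException("Format version is not supported (resource " + resource
          + "): " + actual + " (needs to be between " + minSupported + " and "
          + maxSupported + ")");
    }
    if (actual < minSupported) {
      throw new IllegalStateException("Format version " + actual + " of " + resource
          + " is older than the minimum supported version " + minSupported);
    }
  }

  public static void main(String[] args) {
    checkFormat("segments_2", 6, 4, 6); // readable by a reader supporting 4..6
    try {
      checkFormat("segments_2", 7, 4, 6); // written in a newer format
    } catch (FormatTooNewException e) {
      System.out.println(e.getMessage());
    }
  }
}
```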

This can be seen by re-enabling the query execution in the DUnit tests and
commenting out the check blocks: [develop SHA: 
68629356f561a932f5dfbace70b01d9971a42473]

In LuceneEventListener:
if (cache.hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
  logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
  return false;
}

In IndexRepositoryFactory:
if (userRegion.getCache() != null
    && userRegion.getCache().hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
  logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
  return null;
}


This is the exception that will be encountered:

[Exception]

[vm2_v1.2.0] [warn 2021/09/27 14:24:42.251 PDT  tid=102] An Exception occurred. 
The dispatcher will continue.
[vm2_v1.2.0] org.apache.geode.InternalGemFireError: Unable to create index 
repository
[vm2_v1.2.0] at 
org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
[vm2_v1.2.0] at 
java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
[vm2_v1.2.0] at 
org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
[vm2_v1.2.0] at 
org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)