Re: Creating index failed

2022-02-03 Thread Anilkumar Gingade
Another problem is the case where two threads concurrently try to create an 
index with the same name but different index expressions. I assume there are 
ways this could happen.
One way to address the overall issue with index creation on a partitioned 
region is to take a distributed lock on the index name. When an index creation 
request arrives, it first acquires a distributed lock on the index name; any 
additional index creation request with that name is blocked until the earlier 
index with the same name has been created. During this time, if the index 
creation request arrives through the local or the remote path, the exception 
can be ignored, since only one index creation is in progress for the same 
request.
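
As an illustration only, the locking idea above could be sketched as follows. 
This is an in-process model, not Geode code: the class IndexNameLockSketch, its 
method createIndex, and the per-name ReentrantLock are all hypothetical 
stand-ins, and a real cluster would need a distributed lock rather than a 
JVM-local one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** In-process stand-in for the proposal; all names here are hypothetical. */
final class IndexNameLockSketch {
  // Assumption: one lock per index name, standing in for a distributed lock.
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
  // Index name -> indexed expression, for indexes that already exist.
  private final Map<String, String> indexes = new ConcurrentHashMap<>();

  /** Returns true if this call created the index, false if an identical one existed. */
  boolean createIndex(String name, String expression) {
    ReentrantLock lock = locks.computeIfAbsent(name, n -> new ReentrantLock());
    lock.lock(); // later requests for the same name block here
    try {
      String existing = indexes.get(name);
      if (existing == null) {
        indexes.put(name, expression);
        return true;
      }
      if (existing.equals(expression)) {
        return false; // same request seen through another path: safe to ignore
      }
      // Same name, different expression: a genuine conflict.
      throw new IllegalStateException(
          "Index " + name + " already exists with expression " + existing);
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    IndexNameLockSketch sketch = new IndexNameLockSketch();
    // Two paths (local and remote) race to create the same index.
    Thread local = new Thread(() -> sketch.createIndex("idx1", "o.id"));
    Thread remote = new Thread(() -> sketch.createIndex("idx1", "o.id"));
    local.start();
    remote.start();
    local.join();
    remote.join();
    System.out.println(sketch.indexes);
  }
}
```

Whichever thread wins the lock creates the index; the other sees an identical 
definition and returns quietly, while a request with the same name but a 
different expression still fails.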

-Anil.

Re: Creating index failed

2022-02-03 Thread Mario Kevo
Hi devs,

After implementing the change to ignore the exception, some tests failed 
because the command is now allowed to pass when run again (although it does 
nothing, as the same index was already created by an earlier execution). 
https://github.com/apache/geode/pull/7195


Here is a summary of how it works now.

When we create an index on a partitioned region, the locator tells all members 
to create an index on all the data they contain. A partitioned region is 
special in that you normally want to index all of the data, which is 
distributed across all members. As a result, every member tries to create the 
index locally and also sends index creation requests to all members on that 
site.
Every member checks whether the index already exists or its creation is in 
progress, and waits for it in the latter case. If a remotely originated request 
arrives and the index already exists, the member responds with the index and 
sends an acknowledgment to the request sender. If the index does not exist yet, 
the member creates it and then responds to the request sender. This behavior is 
fine with a small number of servers, or when using the --member option while 
creating indexes (which makes no sense on a partitioned region, as already 
described further down in this mail thread).

The problem appears with a larger number of servers (8 or more), or simply with 
debugging on. The whole process slows down, and on some servers the remotely 
originated create index request can arrive before the local request. In that 
case, the remotely originated request sees that there is no index with that 
name and creates a new one. The problem comes afterwards: when the local 
request arrives and finds that the index already exists, it assumes the index 
is left over from an earlier execution and throws IndexNameConflictException. 
https://github.com/apache/geode/blob/develop/geode-core/src/main/java/org/apache/geode/internal/cache/PartitionedRegion.java#L8377
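
The ordering problem described above can be reproduced in a deliberately 
simplified model. This is only a sketch: Member, onRemoteRequest, 
onLocalRequest and NameConflict are hypothetical names (NameConflict stands in 
for IndexNameConflictException), and the real logic in PartitionedRegion is 
more involved.

```java
import java.util.HashSet;
import java.util.Set;

final class CreateIndexOrderingSketch {
  /** Hypothetical stand-in for Geode's IndexNameConflictException. */
  static final class NameConflict extends RuntimeException {
    NameConflict(String name) {
      super("Index named '" + name + "' already exists");
    }
  }

  /** Simplified model of one member: it only remembers which index names exist. */
  static final class Member {
    private final Set<String> indexes = new HashSet<>();

    /** Remotely originated request: create if missing, otherwise just acknowledge. */
    void onRemoteRequest(String name) {
      indexes.add(name);
    }

    /** Local request: an existing name is treated as left over from an earlier command. */
    void onLocalRequest(String name) {
      if (!indexes.add(name)) {
        throw new NameConflict(name);
      }
    }
  }

  /** Returns true if the local request succeeds for the given arrival order. */
  static boolean localSucceeds(boolean remoteFirst) {
    Member member = new Member();
    try {
      if (remoteFirst) {
        member.onRemoteRequest("idx1");
        member.onLocalRequest("idx1");
      } else {
        member.onLocalRequest("idx1");
        member.onRemoteRequest("idx1");
      }
      return true;
    } catch (NameConflict e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("local first:  " + localSucceeds(false));
    System.out.println("remote first: " + localSucceeds(true));
  }
}
```

In this model the local request succeeds when it arrives first and fails when 
the remote request arrives first, even though the index ends up created in both 
cases.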

The create index command fails (even though the index is created on all data, 
some by local requests and some by remotely originated requests).
There are two problems with this implementation:

  1.  The user doesn't know that the index was created and will try to create 
it again, but then the command fails on all servers.
  2.  The cluster config is updated only after the command finishes 
successfully, which is correct, as we cannot update the cluster config before 
anything is done.
The user can use the indexes even though the command failed, but after a 
restart there is nothing in the cluster config, so the index will not be 
created on the servers.
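
The second problem can be illustrated with a small model. It is a sketch under 
stated assumptions, not Geode's implementation: ClusterConfigGapSketch and its 
methods are hypothetical, the cluster config is reduced to a set of index 
names, and a restart is assumed to rebuild indexes from that set only.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class ClusterConfigGapSketch {
  // Assumption: the persisted cluster config is modeled as a set of index names.
  private final Set<String> clusterConfig = new LinkedHashSet<>();
  // Indexes that currently exist in the members' memory.
  private final Set<String> liveIndexes = new LinkedHashSet<>();

  /** Runs a create index command; conflictReported models the failure described above. */
  boolean runCreateIndex(String name, boolean conflictReported) {
    liveIndexes.add(name); // the index is built on the members either way
    if (conflictReported) {
      return false; // command fails, so the config is left untouched
    }
    clusterConfig.add(name); // only a successful command is persisted
    return true;
  }

  /** A restart rebuilds in-memory state from the persisted configuration only. */
  void restart() {
    liveIndexes.clear();
    liveIndexes.addAll(clusterConfig);
  }

  boolean hasIndex(String name) {
    return liveIndexes.contains(name);
  }

  public static void main(String[] args) {
    ClusterConfigGapSketch cluster = new ClusterConfigGapSketch();
    boolean ok = cluster.runCreateIndex("idx1", true);
    System.out.println("command succeeded:     " + ok);
    System.out.println("usable before restart: " + cluster.hasIndex("idx1"));
    cluster.restart();
    System.out.println("usable after restart:  " + cluster.hasIndex("idx1"));
  }
}
```

In the model, the failed command leaves a usable index that silently 
disappears after the restart, which is the gap described in point 2.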

So the question is what to do in this case. How can we avoid this issue?
We could ignore the exceptions and fix the failing tests so that they expect a 
repeated create index command to pass; or disable the --member option when a 
partitioned region is used (or just document it) and not send requests to other 
members, since the command already tells all members to create the index. Or 
maybe something else?

BR,
Mario



From: Mario Kevo 
Sent: December 14, 2021 14:06
To: dev@geode.apache.org 
Subject: Re: Creating index failed

Hi Alexander,

The cluster config is updated at the end of the command execution, and only if 
the command is successful.
I created a PR with Anilkumar's suggestion, but some tests failed. 
https://github.com/apache/geode/pull/7195
I tried ignoring the exception if the index is already created, but in that 
case, running the create index command again with the same name and expression 
does not fail.

BR,
Mario



From: Alexander Murmann 
Sent: December 7, 2021 18:28
To: dev@geode.apache.org 
Subject: Re: Creating index failed

Hi Mario!
I agree with you that the user wanted to index all the data in the region when 
using a partitioned region. But when the command is not successful, the cluster 
config is not updated.
After the server restart, it will not have indexes as it is not stored in the 
cluster configuration.
Interesting! If I understand you correctly, the initial request to each server 
succeeds, but later ones will fail because the index is already there. However, 
the first, successful request should also have updated the cluster config, 
right? Am I misunderstanding something?

From: Mario Kevo 
Sent: Tuesday, December 7, 2021 06:36
To: dev@geode.apache.org 
Subject: Re: Creating index failed

Hi Jason,

I agree with you that the user wanted to index all the data in the region when 
using a partitioned region. But when the command is not successful, the cluster 
config is not updated.
After the server restart, it will not have indexes as it is not stored in the 
cluster configuration.
So some changes are needed, as the index is created on all members but the 
command is not successful.
I'm working on a fix and will create a PR for the already mentioned ticket as 
soon as possible.

BR,
Mario