Hi Anil,

I agree that two threads can concurrently try to create an index with the same
name but different index expressions.
A distributed lock will help to avoid that issue, but the same tests will still
fail.
The first issue is that, in the case above, the first index is created, but the
second one (same name, different expression) is not created, and yet the
command reports success.
The second issue (the tests that are failing on the PR) is that Geode expects
the command to fail if it is run again, because an index with the same name
already exists. With exceptions ignored it passes, but it shouldn't.

So, with a distributed lock we can avoid the issue you mentioned, but the other
issues remain.
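
A minimal sketch of the two issues above, assuming a simplified model in which
a conflicting name is silently ignored. The class `IgnoringCreator` and its
methods are hypothetical and are not part of Geode:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "create index" with name conflicts ignored; not Geode code.
class IgnoringCreator {
  private final Map<String, String> indexes = new HashMap<>(); // name -> expression

  // Always reports success; an existing name is silently left untouched.
  boolean create(String name, String expression) {
    indexes.putIfAbsent(name, expression);
    return true;
  }

  // Returns the expression stored for the name, or null if there is no such index.
  String expressionOf(String name) {
    return indexes.get(name);
  }
}
```

In this model, `create("index1", "id")` followed by `create("index1", "status")`
reports success twice, while only the first expression is stored. Repeating the
identical command also reports success, which is what the failing tests object
to.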

BR,
Mario


________________________________
From: Anilkumar Gingade <aging...@vmware.com>
Sent: February 3, 2022 16:46
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Re: Creating index failed

The other problem that exists is the case where two threads concurrently try to
create an index with the same name but different index expressions. I assume
there are ways this could happen.
One way to address the overall issue with index creation on a partitioned
region is to take a distributed lock on the index name. When an index creation
request comes in, it first acquires a distributed lock on the index name; any
additional index creation with that name is blocked until the earlier index
with the same name has been created. During this time, if an index creation
request comes in, whether local or remote, the exception can be ignored, as
only one index creation will be in progress for the same request.
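
The idea can be sketched as follows. This is only an illustration under stated
assumptions: a per-name `ReentrantLock` inside one JVM stands in for the
distributed lock, and `NamedLockIndexCreator` with its methods is a
hypothetical name, not Geode code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one lock per index name serializes creation requests.
class NamedLockIndexCreator {
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
  private final Map<String, String> indexes = new ConcurrentHashMap<>();

  // Returns true if this call created the index, false if it already existed.
  boolean createIndex(String name, String expression) {
    ReentrantLock lock = locks.computeIfAbsent(name, n -> new ReentrantLock());
    lock.lock(); // later requests for the same name block here
    try {
      if (indexes.containsKey(name)) {
        return false; // the "exception can be ignored" branch
      }
      indexes.put(name, expression);
      return true;
    } finally {
      lock.unlock();
    }
  }

  // Fires concurrent requests for one name and counts how many created the index.
  static int countCreations(int requests) {
    NamedLockIndexCreator creator = new NamedLockIndexCreator();
    ExecutorService pool = Executors.newFixedThreadPool(requests);
    try {
      Callable<Boolean> task = () -> creator.createIndex("index1", "id");
      List<Future<Boolean>> results = new ArrayList<>();
      for (int i = 0; i < requests; i++) {
        results.add(pool.submit(task));
      }
      int created = 0;
      for (Future<Boolean> result : results) {
        if (result.get()) {
          created++;
        }
      }
      return created;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

However many requests arrive for the same name, only one of them performs the
creation; the others see the existing index once they get the lock.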

-Anil.

On 2/3/22, 4:41 AM, "Mario Kevo" <mario.k...@est.tech> wrote:

    Hi devs,

    After I implemented ignoring the exception, some tests failed, because we
    now allow the command to pass when it is run again (although it does
    nothing, as the same index was already created by the earlier execution).
    https://github.com/apache/geode/pull/7195


    Here is a summary of how it works at the moment.

    When we create an index on a partitioned region, the locator tells all
    members to create an index on the data they hold. A partitioned region is
    special in that you normally want to index all of its data, which is
    distributed across all members. As a result, every member tries to create
    the index locally and also sends index create requests to all members on
    that site.
    Each member checks whether the index already exists or whether its creation
    is in progress, and waits for it in the latter case. If a remotely
    originated request arrives and the index already exists, the member
    responds with the index and sends an acknowledgment to the request sender.
    If the index does not exist yet, the member creates it and then responds to
    the request sender. This behavior is fine with a small number of servers,
    or when the --member option is used while creating indexes (which makes no
    sense on a partitioned region, as already described further down in this
    mail thread).

    The problem appears with a larger number of servers (8 or more), or simply
    with debug logging on. That slows down the whole process, and on some
    servers the remotely originated create index request can arrive before the
    local request. In that case, the remotely originated request sees that
    there is no index with that name and creates a new one. The problem comes
    afterwards: when the local request arrives and the index already exists,
    the member assumes it is left over from an earlier execution and throws
    IndexNameConflictException.
    https://github.com/apache/geode/blob/develop/geode-core/src/main/java/org/apache/geode/internal/cache/PartitionedRegion.java#L8377
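
The ordering problem can be shown with a small single-threaded model. This is
a sketch, not the actual PartitionedRegion logic: `MemberModel`,
`NameConflictException` and `RaceDemo` are hypothetical stand-ins, and the two
handlers only mirror the behavior described above:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Geode's IndexNameConflictException (unchecked for brevity).
class NameConflictException extends RuntimeException {
  NameConflictException(String message) {
    super(message);
  }
}

// Hypothetical model of one server's index bookkeeping; not Geode code.
class MemberModel {
  private final Map<String, String> indexes = new HashMap<>(); // name -> expression

  // Remotely originated request: reuse the index if present, otherwise create it.
  void onRemoteRequest(String name, String expression) {
    indexes.putIfAbsent(name, expression);
  }

  // Local request: an existing name is assumed to come from an earlier command.
  void onLocalRequest(String name, String expression) {
    if (indexes.containsKey(name)) {
      throw new NameConflictException("Index \"" + name + "\" already exists.");
    }
    indexes.put(name, expression);
  }

  boolean hasIndex(String name) {
    return indexes.containsKey(name);
  }
}

class RaceDemo {
  // Returns true when the local request fails although the index exists.
  static boolean localFailsAfterRemote() {
    MemberModel member = new MemberModel();
    member.onRemoteRequest("index1", "id"); // the remote request wins the race
    try {
      member.onLocalRequest("index1", "id");
      return false;
    } catch (NameConflictException e) {
      return member.hasIndex("index1");
    }
  }
}
```

When the local request arrives first, both handlers succeed; only the
remote-first ordering produces the conflict.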

    The create index command will fail (even though the index is created on all
    data, on some members by local requests and on others by remotely
    originated requests).
    There are two problems with this implementation:

      1.  The user doesn't know that the index has been created and will try
          to create it again, but then it will fail on all servers.
      2.  The cluster config is updated only after the command finishes
          successfully, which is correct, as we cannot update the cluster
          config before anything is done.
    The user can use the indexes even though the command failed, but the
    problem is that after a restart there is nothing in the cluster config, so
    the index will not be created on the servers.
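
The second problem can be illustrated with a toy model. It assumes, as a
simplification, that a restarted server rebuilds its indexes from the cluster
config only; `ClusterModel` and its members are hypothetical names:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the cluster config versus live member state; not Geode code.
class ClusterModel {
  final Set<String> clusterConfig = new HashSet<>(); // persisted index names
  final Set<String> memberIndexes = new HashSet<>(); // indexes live on the servers

  // The config is written only when the command as a whole succeeds.
  boolean runCreateIndex(String name, boolean localRequestFails) {
    memberIndexes.add(name); // created by either a local or a remote request
    if (localRequestFails) {
      return false; // command reported as failed, config left untouched
    }
    clusterConfig.add(name);
    return true;
  }

  // Assumption: a restart rebuilds member state from the cluster config only.
  void restart() {
    memberIndexes.clear();
    memberIndexes.addAll(clusterConfig);
  }
}
```

After a failed command the index is usable but missing from the config, so in
this model it is gone after `restart()`.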

    So the question is: what should we do in this case, and how can we avoid
    this issue?
    Should we ignore the exceptions and fix the failing tests so that they
    expect a repeated create index command to pass? Or disable the --member
    option when a partitioned region is used (or just document it) and stop
    sending requests to other members, since the command already sends a create
    request to all members? Or maybe something else?

    BR,
    Mario


    ________________________________
    From: Mario Kevo <mario.k...@est.tech>
    Sent: December 14, 2021 14:06
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Re: Creating index failed

    Hi Alexander,

    The cluster config is updated at the end of the command execution, and only
    if the command is successful.
    I created a PR with Anilkumar's suggestion, but some tests failed.
    https://github.com/apache/geode/pull/7195
    I tried ignoring the exception if the index is already created, but in that
    case, running the create index command again with the same name and
    expression does not fail.

    BR,
    Mario


    ________________________________
    From: Alexander Murmann <amurm...@vmware.com>
    Sent: December 7, 2021 18:28
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Re: Creating index failed

    Hi Mario!
    I agree with you that the user wanted to index all the data in the region 
when using a partitioned region. But when the command is not successful, the 
cluster config is not updated.
    After the server restart, it will not have indexes as it is not stored in 
the cluster configuration.
    Interesting! If I understand you correctly, the initial request to each
    server succeeds, but later ones fail because the index is already there.
    However, the first, successful request should also have updated the cluster
    config, right? Am I misunderstanding something?
    ________________________________
    From: Mario Kevo <mario.k...@est.tech>
    Sent: Tuesday, December 7, 2021 06:36
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Re: Creating index failed

    Hi Jason,

    I agree with you that the user wanted to index all the data in the region 
when using a partitioned region. But when the command is not successful, the 
cluster config is not updated.
    After the server restart, it will not have indexes as it is not stored in 
the cluster configuration.
    So there should be some changes, as the index is created on all members but 
the command is not successful.
    I'm working on a fix. I will create a PR on the already mentioned ticket as
    soon as possible.

    BR,
    Mario
    ________________________________
    From: Jason Huynh <jhu...@vmware.com>
    Sent: December 6, 2021 18:45
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Re: Creating index failed

    Hi Mario,

    A lot of the indexing code pre-dates GFSH. The behavior you are seeing is 
when an index is created on a partition region.  When creating an index on a 
partition region, the idea is that the user wanted to index all the data in the 
region.  So the server will let all other servers know to create an index on 
the partition region.

    This is slightly different for an index on a replicated region.  That is
    when the index can be created on a per-member basis, which is what I think
    the --member flag is for.

    GFSH however defaults to sending the create index message to all members 
for any index type from what I remember and from what is being described. That 
is why you’ll see the race condition with indexes created on partitioned 
regions but the end result being that the index that someone wanted to create 
is either created or already there.

    -Jason

    On 12/6/21, 6:37 AM, "Mario Kevo" <mario.k...@est.tech> wrote:

        Hi devs,

        While doing some testing, I found an issue that has already been
        reported here:
        https://issues.apache.org/jira/browse/GEODE-7875

        If we run the create index command, it creates an index locally and
        sends a request to create the index to the other members of that
        region.
        The problem occurs if the remote request arrives before the request
        from the locator; in that case, the request from the locator fails with
        the following message: Index "index1" already exists.  Create failed
        due to duplicate name.

        This can be reproduced by running 6 servers with the DEBUG log level
        (which makes the system slower), creating a partitioned region, and
        then creating an index.

        Why does the server send remote requests to other members when they
        will get a request from the locator to create the index anyway?
        Also, when running the gfsh command to create an index on one member,
        it sends create index requests to all other members. In that case, what
        is the purpose of the --member flag?

        BR,
        Mario


