Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-05-06 Thread via GitHub


mlbiscoc merged PR #3330:
URL: https://github.com/apache/solr/pull/3330


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-05-05 Thread via GitHub


mlbiscoc commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2851908655

   Thanks Pierre. Will merge this in a bit if nothing comes up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-29 Thread via GitHub


mlbiscoc commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2840441739

   Went and added CHANGE.txt. Not a mockito expert but couldn't get a way to 
verify the delete calls on zkclient because I couldn't inject it inside of 
electionContexts as it is a private final attribute with no getters.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-28 Thread via GitHub


kotman12 commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2836130460

   BTW just saw your comment @mlbiscoc ... I think you can try wrapping the 
client exposed by `cluster.getZkClient()` from within `SolrCloudTestCase` in a 
spy... I think that might be the instance that calls delete on the leader elect 
node. Basically not sure this would have to be _that_ difficult re this:
   
   > Not sure how possible this is though with lots of these objects being 
nested deep from coreContainer and some being private/protected we would need 
to inject the spy one way or another.
   
   Although I admit I'm not 100% this will achieve the intended effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-28 Thread via GitHub


kotman12 commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2836117507

   @psalagnac understood. @mlbiscoc had an idea of potentially using mockito 
spy for capturing the delete event although that has the drawback of all the 
pains of bytecode manipulation. The benefit is that Solr already uses this in a 
few spots.
   
   The other thought I had was using 
[log-captor](https://github.com/Hakky54/log-captor) which is a cool little 
apache-licensed lib that can have other uses but maybe a bit heavyweight to 
bring in for one test. We now log when this optimization _doesn't_ occur so 
capturing the logs and validating this line never gets printed might be an ok 
approach.
   
   Otherwise we are fine with no test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-28 Thread via GitHub


psalagnac commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2835958987

   > How about something like 
[this](https://github.com/kotman12/solr/commit/aaa7564cb049bcedf5472617aac1681c3fe31c2b)
 to test this?
   
   I'm not a big fan of such a test as it depends too much on internals. 
Ideally, a test should detect regressions but should not be a blocker for 
evolution.
   
   I don't think adding a test is a strict requirement for this fix. I was 
discussing it as this feature was initially broken at the first shot, but I'm 
fine with skipping it if there is no easy path to write one.
   The shutdown sequence of Solr is very complex and unfortunately not easily 
testable.
   
   Please add a change log entry in 9.9, and I'll merge it this week if nothing 
else comes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-28 Thread via GitHub


mlbiscoc commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2835459509

   > @mlbiscoc @psalagnac How about something like 
[this](https://github.com/kotman12/solr/commit/aaa7564cb049bcedf5472617aac1681c3fe31c2b)
 to test this? I don't _love_ it because it depends on an implementation detail 
of `CoreContainer::shutdown`, namely that the optimized election node deletion 
occurs _before_ the closing of caches. However, this fix addresses 
implementation details of that very method so I don't see how you can avoid 
relying on _some_ detail while also having a deterministic test.
   
   Ah this is pretty clever. Although I agree, I am not a big fan of this test 
being fixed in that closing of caches must happen after otherwise this whole 
test fails. Might confuse someone thinking something like this was intentional 
even though it is kind of a hack to test something not entirely relating to it. 
If we go with this approach, it should be commented in at the minimum this is 
the reason so if it ever fails due to the closing of caches moving, someone can 
at least understand why.
   
   I was also curious on looking into if Mockito.spy was possible and verify 
the calls for zkclient.delete() actually happening. Not sure how possible this 
is though with lots of these objects being nested deep from coreContainer and 
some being private/protected we would need to inject the spy one way or 
another. Maybe through java reflection? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-27 Thread via GitHub


kotman12 commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2833705368

   @mlbiscoc @psalagnac How about something like 
[this](https://github.com/kotman12/solr/commit/aaa7564cb049bcedf5472617aac1681c3fe31c2b)
 to test this? I don't _love_ it because it depends on an implementation detail 
of `CoreContainer::shutdown`, namely that the optimized election node deletion 
occurs _before_ the closing of caches. However, this fix addresses 
implementation details of that very method so I don't see how you can avoid 
relying on _some_ detail while also having a deterministic test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-24 Thread via GitHub


kotman12 commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2828824582

   > > That's a great analysis. Thanks @mlbiscoc and @kotman12.
   > > > Didn't think a test was necessary for this, but we're happy to add one 
if the community feels it's warranted
   > > 
   > > 
   > > That's very unfortunate that a broken feature was initially merged. The 
very complex shutdown sequence somehow led to this broken change. Adding a test 
would make sure there is no future regression. Saying this, I have no idea how 
hard it would be to test this. Having a test is not a requirement but a nice to 
have.
   > 
   > Thanks @psalagnac! I can look into whats possible for a test. I'll try and 
get it to fail on main and see if this PR it passes. We would need maybe an 
integration test with cc.shutdown() and check that the leader election znode in 
zookeeper goes away during shutdown. We see this bug get really bad during 
heavy ingestion and the container get stuck to shutdown until we hit.
   > 
   > ```
   > Timed out waiting for in-flight update requests to complete for core:
   > ```
   
   This might be challenging because I suspect that the zkSys.close which 
happens later on deletes the leader election node as well. So to truly trigger 
this condition you'd have to cut the shutdown process short or suspend it to 
make sure that the optimized leader node removal is the only thing that can 
delete the leader election node.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-24 Thread via GitHub


mlbiscoc commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2827944977

   > That's a great analysis. Thanks @mlbiscoc and @kotman12.
   > 
   > > Didn't think a test was necessary for this, but we're happy to add one 
if the community feels it's warranted
   > 
   > That's very unfortunate that a broken feature was initially merged. The 
very complex shutdown sequence somehow led to this broken change. Adding a test 
would make sure there is no future regression. Saying this, I have no idea how 
hard it would be to test this. Having a test is not a requirement but a nice to 
have.
   
   Thanks @psalagnac! I can look into whats possible for a test. I'll try and 
get it to fail on main and see if this PR it passes. We would need maybe an 
integration test with cc.shutdown() and check that the leader election znode in 
zookeeper goes away during shutdown. We see this bug get really bad during 
heavy ingestion and the container get stuck to shutdown until we hit.
   ```
   Timed out waiting for in-flight update requests to complete for core:
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17745: Reduce leader election optimization never occurs [solr]

2025-04-23 Thread via GitHub


psalagnac commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2824907741

   That's a great analysis. Thanks @mlbiscoc and @kotman12.
   
   > Didn't think a test was necessary for this, but we're happy to add one if 
the community feels it's warranted
   
   That's very unfortunate that a broken feature was initially merged.
   The very complex shutdown sequence somehow led to this broken change. Adding 
a test would make sure there is no future regression.
   Saying this, I have no idea how hard it would be to test this. Having a test 
is not a requirement but a nice to have.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org