[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-24 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1520926805 @tflobbe CHANGES file updated

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-21 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1518426686 I spent some more time on this class and have identified two problems I wanted to share: * this looks like an NPE waiting to happen https://github.com/apache/solr/blob/db4cb6627

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-21 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1518101663 > That is very similar to my other PR that does the same thing for a similar case. I might just list both together, which we do sometimes. The CHANGES.txt proposal sounds good to me,

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-11 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1503952520 Status on current PR: - added the version check on additions to fail when we are not leader and version = 0 (to match delete flows) - changed error status from BAD_REQUEST to IN

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-11 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1503929909 Leaving this here for future reference. I think we could consider allowing doc updates based on the `isSubShardLeader` but this is tricky to verify, and well beyond my knowledge of this co

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-10 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1502421101 > Where is this in the PR? I at least like the sound of it :-) You can find the logic here; basically, not being leader combined with a zero version will trigger an exception: https://github.com

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-07 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1500709938 I have spent some more time unpacking this test. Test setup is: shard `shard1` is split into 2 new slices, `shard1_0` and `shard1_1`, while additions are happening concurrently.

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-04 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1496270123 @dsmiley thanks for taking a look. I think we are looking at the same problem area but different lifecycles. Your summary on #1484 applies here as well: ``` A shard being split

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-04-01 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1493173156 Updated the patch with a better fix. I'm now using the `isSubShardLeader` flag, which seems to have been introduced for the split shard scenario (not sure). Will be running some more tests

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-03-31 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1492480511 seeing some local failures due to the newly added 0 version check ``` org.apache.solr.search.TestRecovery > testLogReplayWithReorderedDBQByAsterixAndChildDocs FAILED java.l

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

2023-03-31 Thread via GitHub
stillalex commented on PR #1504: URL: https://github.com/apache/solr/pull/1504#issuecomment-1492441639 I have pushed a tentative patch for review and, more importantly, for discussion. I have run this test extensively (hundreds of beast jobs) and it looks like this is the way to go, al