Re: Cluster Singleton lifecycle behavior after plugin update

2023-08-22 Thread Alex Deparvu
Hi,

I fully agree with what was said before. but as a Solr newbie that gets
lost quite a lot of times in jiras and PRs I want to say asking the list
for guidance should be a good way to go about clarifying some peculiar
implementation details.

Out of curiosity I dug up some links:
- Jira https://issues.apache.org/jira/browse/SOLR-14749
- PR
https://github.com/apache/solr/commit/67ecd8ff9ad3640016424ded86bfaaadd815b96d#diff-49651d6e4deb06bc01a835ba4acda45b5af54646cadc51a06bd2514a8112ad9cR99

I read through it and will add my 2 cents: it sure does smell like a bug to
me, there is no way this works as intended in this order: add new item to
map with same name and start it _then_ remove item from the map with same
name and stop it. well unless the name of the old item and the new item are
different but that does not look like the case here.

hope this helps,
alex








On Tue, Aug 22, 2023 at 6:05 PM David Smiley 
wrote:

> Hi Paul,
>
> As a technique to inquire about "why" questions, I sometimes find the
> original JIRA issue and/or PR and ask there.  It may be a years-old
> issue/PR but I personally think nothing of it -- it's practical as it pings
> relevant people watching vs. a shot in the dark about an obscure matter.
>
> ~ David
>
>
> On Tue, Aug 22, 2023 at 3:51 PM Paul McArthur
>  wrote:
>
> > Hi all,
> >
> > I’m creating a cluster singleton plugin and have found an issue with the
> > lifecycle management of the singleton if the plugin is updated via the
> API.
> >
> > When the /cluster/plugin api is called with an update payload, the
> > ClusterSingletons.modified method is called, which adds the new plugin to
> > the singletonMap (and starts it if applicable). Then it stops and removes
> > the old one. The order of these operations has a couple of side effects:
> >
> > 1. For a very brief period, there are 2 instances of the plugin running.
> > This may not really be a problem, but does seem to violate the Singleton
> > principle
> >
> > 2. Given that the map is keyed on the plugin name, adding the replacement
> > first will overwrite the existing (old) entry in the map. Then when the
> old
> > one is removed, it actually removes the new one that was just added. This
> > leaves the singletonMap with no entry for the plugin. When the Overseer
> > node goes down, the stop method is not called for the plugin because it
> no
> > longer has an entry in the map.
> >
> > I’ve reproduced the issue by modifying the TestContainerPlugin test, and
> I
> > can create a Jira issue, but I wonder if there is any reason that the
> added
> > and deleted methods are called in this order that I haven’t understood.
> It
> > seems to me that reversing the order in which they are called will solve
> > the issue.
> >
> > Thanks,
> > Paul
>


Re: Cluster Singleton lifecycle behavior after plugin update

2023-08-22 Thread David Smiley
Hi Paul,

As a technique to inquire about "why" questions, I sometimes find the
original JIRA issue and/or PR and ask there.  It may be a years-old
issue/PR but I personally think nothing of it -- it's practical as it pings
relevant people watching vs. a shot in the dark about an obscure matter.

~ David


On Tue, Aug 22, 2023 at 3:51 PM Paul McArthur
 wrote:

> Hi all,
>
> I’m creating a cluster singleton plugin and have found an issue with the
> lifecycle management of the singleton if the plugin is updated via the API.
>
> When the /cluster/plugin api is called with an update payload, the
> ClusterSingletons.modified method is called, which adds the new plugin to
> the singletonMap (and starts it if applicable). Then it stops and removes
> the old one. The order of these operations has a couple of side effects:
>
> 1. For a very brief period, there are 2 instances of the plugin running.
> This may not really be a problem, but does seem to violate the Singleton
> principle
>
> 2. Given that the map is keyed on the plugin name, adding the replacement
> first will overwrite the existing (old) entry in the map. Then when the old
> one is removed, it actually removes the new one that was just added. This
> leaves the singletonMap with no entry for the plugin. When the Overseer
> node goes down, the stop method is not called for the plugin because it no
> longer has an entry in the map.
>
> I’ve reproduced the issue by modifying the TestContainerPlugin test, and I
> can create a Jira issue, but I wonder if there is any reason that the added
> and deleted methods are called in this order that I haven’t understood. It
> seems to me that reversing the order in which they are called will solve
> the issue.
>
> Thanks,
> Paul


Re: Crave patch optimization

2023-08-22 Thread Yuvraaj Kelkar
Sorry for the delay, I had to collect this information from the team.

As you said: Crave doesn't need to know the commit history, it just needs 
enough information from git to be able to reproduce the filesystem state on to 
the remote side without having to do a full rsync.
We have to account for the possibility that there's any combination of: (common 
merge base whether on branch or not) + (staged commits) + (unstaged commits)

Patch process:
Get the merge base
Use the command git rev-list --topo-order --reverse  --not 
--remotes= -- to get a list of commit ID.

If the list is empty, then the HEAD commit in the branch is the common merge 
base. Use it.

If the list is NOT empty, then use the first entry in the list to get one 
commit ID before it. This is the merge base.

With the merge base, creare a series of patches
Staged changes: git --no-pager diff --src-prefix=a/ --dst-prefix=b/ --binary 
--no-ext-diff --ignore-submodules --no-color --cached

Local changes: git --no-pager format-patch --src-prefix=a/ --dst-prefix=b/ 
--no-color --stdout 

Send patches list across to the remote side, and then apply using git apply 
--verbose --whitespace=nowarn 

In the past, we did use git am , but it is all git apply now.
Also, we don't use format-patch because we had to implement "incremental git 
patching" for users repeatedly doing incremental builds in the common case, eg:
User does incremental builds as changes keep getting "collected". At the end of 
3-4 days, there was 200 files modified. Patching 200 files on every incremental 
caused a mtime update that causes the build system from unnecessarily 
rebuilding far more than actually needed.
A simple git diff allows us to do incremental diff patching.

But besides all this explanation, we're looking to improve the entire patching 
process so that there's no need to repeatedly restart Github Actions.
One of the ways we're trying to get debug information is to get access to the 
Action Runner so that we can look at the git tree so we can get a repro without 
having to guess every time.

Thanks,
-Uv

On Aug 2 2023, at 10:50 am, Michael Gibney  wrote:
> Hi Uv:
>
> regarding https://lists.apache.org/thread/l8hgqwkm9vt8yhygoo2blw5j4n5gh121
> Would it be possible for Crave, instead of using `git format-patch` /
> `git am`, to instead generate/apply a simple single diff/patch, via
> `git diff .. > file.patch` / `git apply
> file.patch`?
>
> This would address a couple of problems, and should be more efficient
> on IO as well (which is I gather the motivation for the patch-based
> approach in the first place).
>
> In some cases (e.g., https://github.com/apache/solr/pull/1351), Crave
> fails to apply patches even when the source PR branch is completely
> up-to-date (has merged in the latest commit from `main`). The
> workaround of rebasing is not ideal, in that it necessitates
> force-pushes, which are considered an anti-pattern in some contexts,
> and can generally be an inconvenience for transparency and
> collaborative development.
>
> But separately relevant: Crave doesn't care about the commit history
> -- it's only using patches to optimize IO bandwidth. So generating a
> single diff between the merge-base and the PR head should iiuc be
> exactly what you want, and would be guaranteed to always apply
> cleanly.
>
> Would you mind sharing how you're deriving the "merge-base" commit,
> and the command you're using (I assume, `git format-patch`?) to
> generate the patch files? Having just merged `main` into a PR branch,
> the merge-base of the PR branch and `main` should be: `main`.
>
> Thank you for your consideration!
> Michael
>



Re: Cluster Singleton lifecycle behavior after plugin update

2023-08-22 Thread Jan Høydahl
Hi,

Thanks for reporting. Sounds you have enough supporting proof to create a JIRA 
(bug) issue, and we can continue discussion there.

Jan

> 22. aug. 2023 kl. 21:51 skrev Paul McArthur 
> :
> 
> Hi all,
> 
> I’m creating a cluster singleton plugin and have found an issue with the 
> lifecycle management of the singleton if the plugin is updated via the API.
> 
> When the /cluster/plugin api is called with an update payload, the 
> ClusterSingletons.modified method is called, which adds the new plugin to the 
> singletonMap (and starts it if applicable). Then it stops and removes the old 
> one. The order of these operations has a couple of side effects:
> 
> 1. For a very brief period, there are 2 instances of the plugin running. This 
> may not really be a problem, but does seem to violate the Singleton principle
> 
> 2. Given that the map is keyed on the plugin name, adding the replacement 
> first will overwrite the existing (old) entry in the map. Then when the old 
> one is removed, it actually removes the new one that was just added. This 
> leaves the singletonMap with no entry for the plugin. When the Overseer node 
> goes down, the stop method is not called for the plugin because it no longer 
> has an entry in the map.
> 
> I’ve reproduced the issue by modifying the TestContainerPlugin test, and I 
> can create a Jira issue, but I wonder if there is any reason that the added 
> and deleted methods are called in this order that I haven’t understood. It 
> seems to me that reversing the order in which they are called will solve the 
> issue.
> 
> Thanks,
> Paul


-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org



Cluster Singleton lifecycle behavior after plugin update

2023-08-22 Thread Paul McArthur
Hi all,

I’m creating a cluster singleton plugin and have found an issue with the 
lifecycle management of the singleton if the plugin is updated via the API.

When the /cluster/plugin api is called with an update payload, the 
ClusterSingletons.modified method is called, which adds the new plugin to the 
singletonMap (and starts it if applicable). Then it stops and removes the old 
one. The order of these operations has a couple of side effects:

1. For a very brief period, there are 2 instances of the plugin running. This 
may not really be a problem, but does seem to violate the Singleton principle

2. Given that the map is keyed on the plugin name, adding the replacement first 
will overwrite the existing (old) entry in the map. Then when the old one is 
removed, it actually removes the new one that was just added. This leaves the 
singletonMap with no entry for the plugin. When the Overseer node goes down, 
the stop method is not called for the plugin because it no longer has an entry 
in the map.

I’ve reproduced the issue by modifying the TestContainerPlugin test, and I can 
create a Jira issue, but I wonder if there is any reason that the added and 
deleted methods are called in this order that I haven’t understood. It seems to 
me that reversing the order in which they are called will solve the issue.

Thanks,
Paul

Subscribe

2023-08-22 Thread Dharma Theta Navuluri



Looking for help on shell scripting magic ;-)

2023-08-22 Thread Eric Pugh
Hi all,  

SOLR-16876 is about moving bin/solr auth command argument parsing into Java 
code.   

I’ve put up a draft PR to start this effort, but run on some challenges in my 
shell scripting skills on how best to port some of the directory parsing logic 
that is in the shell script over to Java code.   
https://github.com/apache/solr/pull/1833
Specifically the lines around 1024 in bin/solr and below.   If someone would 
like to take a crack at it, I’d love the eyes, or has suggestions ;-).   I’m a 
bit stuck.

Eric

___
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com  | 
My Free/Busy   
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 


This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.