Thank you for this work. I have found Crave to be really effective in letting me know when small commits have odd side effects. My laptop is older, so I generally can’t run the full test suite the way I want, and seeing the list of green checks from Crave running the tests is always very encouraging. Looking forward to these improvements!
> On Aug 22, 2023, at 8:54 PM, Yuvraaj Kelkar <yuvr...@gmail.com> wrote: > > Sorry for the delay, I had to collect this information from the team. > > As you said: Crave doesn't need to know the commit history, it just needs > enough information from git to be able to reproduce the filesystem state on > to the remote side without having to do a full rsync. > We have to account for the possibility that there's any combination of: > (common merge base whether on branch or not) + (staged commits) + (unstaged > commits) > > Patch process: > Get the merge base > Use the command git rev-list --topo-order --reverse <localBranch> --not > --remotes=<remote> -- to get a list of commit ID. > > If the list is empty, then the HEAD commit in the branch is the common merge > base. Use it. > > If the list is NOT empty, then use the first entry in the list to get one > commit ID before it. This is the merge base. > > With the merge base, creare a series of patches > Staged changes: git --no-pager diff --src-prefix=a/ --dst-prefix=b/ --binary > --no-ext-diff --ignore-submodules --no-color --cached > > Local changes: git --no-pager format-patch --src-prefix=a/ --dst-prefix=b/ > --no-color --stdout <mergeBase> > > Send patches list across to the remote side, and then apply using git apply > --verbose --whitespace=nowarn <patch> > > In the past, we did use git am , but it is all git apply now. > Also, we don't use format-patch because we had to implement "incremental git > patching" for users repeatedly doing incremental builds in the common case, > eg: > User does incremental builds as changes keep getting "collected". At the end > of 3-4 days, there was 200 files modified. Patching 200 files on every > incremental caused a mtime update that causes the build system from > unnecessarily rebuilding far more than actually needed. > A simple git diff allows us to do incremental diff patching. > > But besides all this explanation, we're looking to improve the entire > patching process so that there's no need to repeatedly restart Github Actions. > One of the ways we're trying to get debug information is to get access to the > Action Runner so that we can look at the git tree so we can get a repro > without having to guess every time. > > Thanks, > -Uv > > On Aug 2 2023, at 10:50 am, Michael Gibney <mich...@michaelgibney.net> wrote: >> Hi Uv: >> >> regarding https://lists.apache.org/thread/l8hgqwkm9vt8yhygoo2blw5j4n5gh121 >> Would it be possible for Crave, instead of using `git format-patch` / >> `git am`, to instead generate/apply a simple single diff/patch, via >> `git diff <merge-base>..<pr_head> > file.patch` / `git apply >> file.patch`? >> >> This would address a couple of problems, and should be more efficient >> on IO as well (which is I gather the motivation for the patch-based >> approach in the first place). >> >> In some cases (e.g., https://github.com/apache/solr/pull/1351), Crave >> fails to apply patches even when the source PR branch is completely >> up-to-date (has merged in the latest commit from `main`). The >> workaround of rebasing is not ideal, in that it necessitates >> force-pushes, which are considered an anti-pattern in some contexts, >> and can generally be an inconvenience for transparency and >> collaborative development. >> >> But separately relevant: Crave doesn't care about the commit history >> -- it's only using patches to optimize IO bandwidth. So generating a >> single diff between the merge-base and the PR head should iiuc be >> exactly what you want, and would be guaranteed to always apply >> cleanly. >> >> Would you mind sharing how you're deriving the "merge-base" commit, >> and the command you're using (I assume, `git format-patch`?) to >> generate the patch files? Having just merged `main` into a PR branch, >> the merge-base of the PR branch and `main` should be: `main`. >> >> Thank you for your consideration! >> Michael >> > _______________________ Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw> This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.