Thank you for this work. 

 I have found Crave to be really effective in letting me know when small 
commits have odd side effects.  My laptop is older, so I generally can’t run 
the full test suite the way I want, and seeing the list of green checks from 
Crave running the tests is always very encouraging.   Looking forward to these 
improvements!

> On Aug 22, 2023, at 8:54 PM, Yuvraaj Kelkar <yuvr...@gmail.com> wrote:
> 
> Sorry for the delay, I had to collect this information from the team.
> 
> As you said: Crave doesn't need to know the commit history, it just needs 
> enough information from git to be able to reproduce the filesystem state on 
> to the remote side without having to do a full rsync.
> We have to account for the possibility that there's any combination of: 
> (common merge base whether on branch or not) + (staged commits) + (unstaged 
> commits)
> 
> Patch process:
> Get the merge base
> Use the command git rev-list --topo-order --reverse <localBranch> --not 
> --remotes=<remote> -- to get a list of commit ID.
> 
> If the list is empty, then the HEAD commit in the branch is the common merge 
> base. Use it.
> 
> If the list is NOT empty, then use the first entry in the list to get one 
> commit ID before it. This is the merge base.
> 
> With the merge base, creare a series of patches
> Staged changes: git --no-pager diff --src-prefix=a/ --dst-prefix=b/ --binary 
> --no-ext-diff --ignore-submodules --no-color --cached
> 
> Local changes: git --no-pager format-patch --src-prefix=a/ --dst-prefix=b/ 
> --no-color --stdout <mergeBase>
> 
> Send patches list across to the remote side, and then apply using git apply 
> --verbose --whitespace=nowarn <patch>
> 
> In the past, we did use git am , but it is all git apply now.
> Also, we don't use format-patch because we had to implement "incremental git 
> patching" for users repeatedly doing incremental builds in the common case, 
> eg:
> User does incremental builds as changes keep getting "collected". At the end 
> of 3-4 days, there was 200 files modified. Patching 200 files on every 
> incremental caused a mtime update that causes the build system from 
> unnecessarily rebuilding far more than actually needed.
> A simple git diff allows us to do incremental diff patching.
> 
> But besides all this explanation, we're looking to improve the entire 
> patching process so that there's no need to repeatedly restart Github Actions.
> One of the ways we're trying to get debug information is to get access to the 
> Action Runner so that we can look at the git tree so we can get a repro 
> without having to guess every time.
> 
> Thanks,
> -Uv
> 
> On Aug 2 2023, at 10:50 am, Michael Gibney <mich...@michaelgibney.net> wrote:
>> Hi Uv:
>> 
>> regarding https://lists.apache.org/thread/l8hgqwkm9vt8yhygoo2blw5j4n5gh121
>> Would it be possible for Crave, instead of using `git format-patch` /
>> `git am`, to instead generate/apply a simple single diff/patch, via
>> `git diff <merge-base>..<pr_head> > file.patch` / `git apply
>> file.patch`?
>> 
>> This would address a couple of problems, and should be more efficient
>> on IO as well (which is I gather the motivation for the patch-based
>> approach in the first place).
>> 
>> In some cases (e.g., https://github.com/apache/solr/pull/1351), Crave
>> fails to apply patches even when the source PR branch is completely
>> up-to-date (has merged in the latest commit from `main`). The
>> workaround of rebasing is not ideal, in that it necessitates
>> force-pushes, which are considered an anti-pattern in some contexts,
>> and can generally be an inconvenience for transparency and
>> collaborative development.
>> 
>> But separately relevant: Crave doesn't care about the commit history
>> -- it's only using patches to optimize IO bandwidth. So generating a
>> single diff between the merge-base and the PR head should iiuc be
>> exactly what you want, and would be guaranteed to always apply
>> cleanly.
>> 
>> Would you mind sharing how you're deriving the "merge-base" commit,
>> and the command you're using (I assume, `git format-patch`?) to
>> generate the patch files? Having just merged `main` into a PR branch,
>> the merge-base of the PR branch and `main` should be: `main`.
>> 
>> Thank you for your consideration!
>> Michael
>> 
> 

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.

Reply via email to