On 1/2/24 17:40, Eelco Chaudron wrote:
> 
> 
> On 2 Jan 2024, at 17:31, Ilya Maximets wrote:
> 
>> On 1/2/24 17:14, Eelco Chaudron wrote:
>>>
>>>
>>> On 2 Jan 2024, at 16:55, Ilya Maximets wrote:
>>>
>>>> On 1/2/24 16:19, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>>>>>
>>>>>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>>>>>
>>>>>>>> This patch identifies new static analysis issues during a GitHub action
>>>>>>>> run and reports them. The process involves analyzing the changes 
>>>>>>>> introduced
>>>>>>>> in the current commit and comparing them to those in the preceding 
>>>>>>>> commit.
>>>>>>>>
>>>>>>>> However, there are two cases when the GitHub push action runner does 
>>>>>>>> not
>>>>>>>> provide enough details to determen the preceding commit. These cases 
>>>>>>>> are
>>>>>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>>>>>> select the first commit not done by the current author as the base 
>>>>>>>> commit.
>>>>>>>
>>>>>>> Including Aarons comment on v3 into v4.
>>>>>>>
>>>>>>> +++ As an alternative we can look at the author (go back in time until
>>>>>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>>>>>> +++ older patches you will be reminded again and again ;) Not sure if
>>>>>>> +++ this is easaly possible with a git command, but it might be worth
>>>>>>> +++ trying out.
>>>>>>>
>>>>>>> ++ I did some experiments with the above approach, and it seems to work
>>>>>>> ++  in all the use cases. The only problem could be that if you are also
>>>>>>> ++ the owner of a previous patch series, it might use a reference branch
>>>>>>> ++ too deep. This could potentially show issues not introduced by your
>>>>>>> ++ latest patches. But as you are still the author, you should be
>>>>>>> ++ “punished” for your earlier mistakes ;)
>>>>>>> ++
>>>>>>> ++ I’ll send out a v4, and we can continue the discussion there if 
>>>>>>> needed :)
>>>>>>>
>>>>>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>>>>>> + against an upstream reference (for example, we can reference towards 
>>>>>>> the
>>>>>>> + 'upstream' repository - such a reference does require some kind of
>>>>>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>>>>>> + the script could add the OVS github remote if it doesn't already 
>>>>>>> exist.
>>>>>>> + I think we could use that to find the common ancestor.  But I need to
>>>>>>> + dive into this more.
>>>>>>>
>>>>>>> I guess we can determine the origin branch going back, and then do 
>>>>>>> something like
>>>>>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream 
>>>>>>> repository.
>>>>>>
>>>>>>
>>>>>> The 'determine the origin branch' is still tricky and requires some
>>>>>> guesswork in case of a new branch or a force push.  Maybe something
>>>>>> like listing all the heads in upstream repo and finding the one that
>>>>>> has the closest merge-base may be an option.  Then use that merge base
>>>>>> as a base commit to compare against.
>>>>>>
>>>>>> The merge-base with upstream/master will be a branching point for a
>>>>>> release branch, and that is way too far.
>>>>>
>>>>> ACK, I think going back in time till a different author will solve all 
>>>>> this for now.
>>>>> And this will even work in Ilya’s and my force commit workflow. So for 
>>>>> now I’ll keep
>>>>> it as it.
>>>>
>>>> It will not work in case you're incorporating someone else's patches in
>>>> your patch set, unless you're pushing them one by one.  Robot will catch
>>>> such problems, but people may miss.  Same problem if you are accumulating
>>>> multiple different patches before batch-applying them.
>>>
>>> ACK, this is true. But with this patch in place, the robot should have 
>>> captured this for you ;)
>>> And your actual push to the main branch will also fail. Guess the 
>>> previously suggested patch -1
>>> will have the same problem.
>>>
>>> Don’t think there will be a ‘full’ solution that will work in all cases 
>>> 100% of the time.
>>
>> But why don't you like "closest merge-base against upstream heads" approach?
>> It seems less error prone to me.  At least I can't right away pinpoint a
>> simple scenario where it will do something unexpected, unless the branch is
>> completely diverged from anything upstream.
> 
> The issue only exists for new branches and forced commits. So how would we 
> determine the branch
> to use for the ‘git merge-base upstream/<BRANCH> HEAD’? Check if there is a 
> branch-XX in the
> timeline use it, and if not assume master/main? Or is there a smarter way?

Not smarter, but a bit dumber instead. :)  I was thinking about something
like this:

 BASE=HEAD~1 # In case no mergeable heads found.
 MIN_DISTANCE=1000000

 for upstream_head in $(git ls-remote --heads upstream | cut -f 1); do
     CURR_BASE=$(git merge-base ${upstream_head} HEAD)
     if [ ${CURR_BASE} ]; then
       DISTANCE=$(git log --oneline ${CURR_BASE}..HEAD | wc -l);
       echo $(git name-rev --refs='remotes/upstream/*' ${CURR_BASE}) ${DISTANCE}

       if test ${MIN_DISTANCE} -gt ${DISTANCE}; then
           BASE=${CURR_BASE}
           MIN_DISTANCE=${DISTANCE}
       fi
     fi
 done
 echo $(git name-rev --refs='remotes/upstream/*' ${BASE}) ${MIN_DISTANCE}

> 
> I’m fine with this too, I can do some experimentation…
> 
>>> This approach works for individual developers and the robot. For 
>>> maintainers, we can push the
>>> new branch, and then the actual patches, although making changes in the 
>>> middle might require
>>> deleting the remote branch, re-creating it, and re-apply. Tried this with 
>>> stgit, and this seems easy.
>>>
>>> //Eelco
>>>
>>>>>>> Both solutions will work, and I have no real preference.
>>>>>>>
>>>>>>> //Eelco
>>>>>>>
>>>>>>>> An example error output might look like this:
>>>>>>>>
>>>>>>>>   error level: +0 -0 no changes
>>>>>>>>   warning level: +2 +0
>>>>>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never 
>>>>>>>> read" (1 occurrence)
>>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 
>>>>>>>> 'remote'" (1 occurrence)
>>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>>>>>   note level: +0 -0 no changes
>>>>>>>>   all levels: +2 +0
>>>>>>>>
>>>>>>>> Signed-off-by: Eelco Chaudron <echau...@redhat.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> changes in v2:
>>>>>>>>   - When it's a new branch, it compares it to the HEAD of the default 
>>>>>>>> branch.
>>>>>>>>
>>>>>>>> changes in v3:
>>>>>>>>   - Include the clang version as part of the cache
>>>>>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>>>>>     for patch branches.
>>>>>>>>   - Also compare to the base branch for forced commits.
>>>>>>>>
>>>>>>>> changes in v4:
>>>>>>>>   - No longer look for a default branch, but consume all patches
>>>>>>>>     from the current author.
>>>>>>>>
>>>>>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>>>>>  .github/workflows/build-and-test.yml |  103 
>>>>>>>> ++++++++++++++++++++++++++++++++++
>>>>>>>>  2 files changed, 132 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>>>>>> index aa2ecc505..fedf1398a 100755
>>>>>>>> --- a/.ci/linux-build.sh
>>>>>>>> +++ b/.ci/linux-build.sh
>>>>>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>>>>>      make -j4
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +function clang_analyze()
>>>>>>>> +{
>>>>>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>>>>>> +                                               || cache_build=true
>>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>>> +        # If this is a cache build, proceed to the base branch's 
>>>>>>>> directory.
>>>>>>>> +        cd base_ovs_main
>>>>>>
>>>>>> pushd/popd might be better here.
>>>>>
>>>>> ACK will do this next revision.
>>>>>>
>>>>>>>> +    fi;
>>>>>>>> +
>>>>>>>> +    configure_ovs $OPTS
>>>>>>>> +    make clean
>>>>>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make 
>>>>>>>> -j4
>>>>>>
>>>>>> We have a make target for clang-analyze and it depends on the clean 
>>>>>> already.
>>>>>> Maybe we can extend it to accept custom arguments via env variable and 
>>>>>> pass
>>>>>> the '-sarif' this way?  E.g. make clang-analyze 
>>>>>> SCAN_BUILD_OPTIONS="-sarif".
>>>>>
>>>>> Let me look into this, as I like the idea.
>>>>>
>>>>>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
>>>>>
>>>>> ACK
>>>>>
>>>>>> And we have CC defined in the job description, so should not use clang 
>>>>>> directly.
>>>>>
>>>>> ACK
>>>>>
>>>>>>>> +
>>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>>> +        # Move results, so it will be picked up by the cache.
>>>>>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>>>>>> +        cd ..
>>>>>>>> +    else
>>>>>>>> +        # Only do the compare on the none cache builds.
>>>>>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>>>>>> +                                ./clang-analyzer-results
>>>>>>>> +    fi;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>>>>>> @@ -116,6 +140,11 @@ fi
>>>>>>>>
>>>>>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>>>>>
>>>>>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>>>>>> +    clang_analyze
>>>>>>>> +    exit 0
>>>>>>>> +fi
>>>>>>>> +
>>>>>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>>>>>      # 'distcheck' will reconfigure with required options.
>>>>>>>>      # Now we only need to prepare the Makefile without sparse-wrapped 
>>>>>>>> CC.
>>>>>>>> diff --git a/.github/workflows/build-and-test.yml 
>>>>>>>> b/.github/workflows/build-and-test.yml
>>>>>>>> index 09654205e..cb277ff43 100644
>>>>>>>> --- a/.github/workflows/build-and-test.yml
>>>>>>>> +++ b/.github/workflows/build-and-test.yml
>>>>>>>> @@ -223,6 +223,109 @@ jobs:
>>>>>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>>>>>          path: logs.tgz
>>>>>>>>
>>>>>>>> +  build-clang-analyze:
>>>>>>>> +    needs: build-dpdk
>>>>>>>> +    env:
>>>>>>>> +      dependencies: |
>>>>>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>>>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>>>>>> +        python3-unbound
>>>>>>
>>>>>> Do we need a python3-unbound?
>>>>>
>>>>> Added it, as I removed the 32-bit conditional install, but can verify if 
>>>>> it fails without.
>>>>>
>>>>>>>> +      CC:   clang
>>>>>>>> +      DPDK: dpdk
>>>>>>>> +      CLANG_ANALYZE: true
>>>>>>>> +    name: clang-analyze
>>>>>>>> +    runs-on: ubuntu-22.04
>>>>>>>> +    timeout-minutes: 30
>>>>>>>> +
>>>>>>>> +    steps:
>>>>>>>> +    - name: checkout
>>>>>>>> +      uses: actions/checkout@v3
>>>>>>>> +      with:
>>>>>>>> +        fetch-depth: 0
>>>>>>>> +
>>>>>>>> +    - name: get base branch sha
>>>>>>>> +      id: base_branch
>>>>>>>> +      env:
>>>>>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>>>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>>>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>>>>>> +      run: |
>>>>>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>>>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>>>>>> +        else
>>>>>>>> +          if [ "$EVENT_BEFORE" = 
>>>>>>>> "0000000000000000000000000000000000000000" ] \
>>>>>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>>>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>>>>>> +                        grep -m1 -v $(git log -1 
>>>>>>>> --pretty=format:"%ae") | \
>>>>>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>>>>>> +          else
>>>>>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>>>>>> +          fi
>>>>>>>> +        fi
>>>>>>>> +
>>>>>>>> +    - name: checkout base branch
>>>>>>>> +      uses: actions/checkout@v3
>>>>>>>> +      with:
>>>>>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>>>>>> +        path: base_ovs_main
>>>>>>>> +
>>>>>>>> +    - name: update PATH
>>>>>>>> +      run: |
>>>>>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>>>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>>>>>> +
>>>>>>>> +    - name: generate cache key
>>>>>>>> +      id: cache_key
>>>>>>>> +      run: |
>>>>>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>>>>>
>>>>>> s/clang/${CC}/ ?
>>>>>>
>>>>>> And if we grep for ' version ', the check will work for gcc as well,
>>>>>> even if we do not really care about this case. :)
>>>>>
>>>>> ACK, will make the change below as well.
>>>>>
>>>>>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>>>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main 
>>>>>>>> rev-parse HEAD)" \
>>>>>>>> +          >> $GITHUB_OUTPUT
>>>>>>>> +
>>>>>>>> +    - name: check for analyzer result cache
>>>>>>>> +      id: clang_cache
>>>>>>>> +      uses: actions/cache@v3
>>>>>>>> +      with:
>>>>>>>> +        path: base-clang-analyzer-results
>>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>>> +
>>>>>>>> +    - name: set up python
>>>>>>>> +      uses: actions/setup-python@v4
>>>>>>>> +      with:
>>>>>>>> +        python-version: '3.9'
>>>>>>>> +
>>>>>>>> +    - name: get cached dpdk-dir
>>>>>>>> +      uses: actions/cache/restore@v3
>>>>>>>> +      with:
>>>>>>>> +        path: dpdk-dir
>>>>>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>>>>>> +
>>>>>>>> +    - name: update APT cache
>>>>>>>> +      run:  sudo apt update || true
>>>>>>>> +
>>>>>>>> +    - name: install common dependencies
>>>>>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>>>>>> +
>>>>>>>> +    - name: install sarif tools
>>>>>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>>>>>
>>>>>> '--user' to avoid sudo.  Might also be better to add this to the
>>>>>> existing command in the linux-prepare script instead to allow pip
>>>>>> to choose compatible versions of all the packages.
>>>>>
>>>>> Let me move it to see if this will work.
>>>>>
>>>>>>>> +
>>>>>>>> +    - name: prepare
>>>>>>>> +      run:  ./.ci/linux-prepare.sh
>>>>>>>> +
>>>>>>>> +    - name: build base reference
>>>>>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>>> +
>>>>>>>> +    - name: build
>>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>>> +
>>>>>>>> +    - name: save cache
>>>>>>>> +      uses: actions/cache/save@v3
>>>>>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>>>>>
>>>>>> So, we will save the cache even if the build of the base reference 
>>>>>> failed?
>>>>>> Maybe move this before the 'build' stage and remove the 'always()' check?
>>>>>
>>>>> Good catch, will move it up.
>>>>>
>>>>>>>> +      with:
>>>>>>>> +        path: base-clang-analyzer-results
>>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>>> +
>>>>>>>>    build-osx:
>>>>>>>>      env:
>>>>>>>>        CC:    clang
>>>>>>>>
>>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>>>
>>>>> Thanks for the comments, will prep a v5.
>>>>>
>>>>> //Eelco
>>>>>
>>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to