Team,

On Mon, 6 Aug 2018, SZEDER Gábor wrote:

> [Resending with Clemens' last used email address.
> Clemens, please consider sending a patch to update our .mailmap file.]
> 
> 
> On Mon, Aug 6, 2018 at 5:11 PM SZEDER Gábor <szeder....@gmail.com> wrote:
> >
> > Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
> > [1].  Since then OSX build jobs fail rather frequently because of a
> > SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
> > corrupt idx' in 't5570-git-daemon.sh' [2].  I think this is a symptom
> > a real bug in Git affecting other platforms as well, but these tests
> > are too lax to catch it.

I am seeing this very frequently now, as it feels like failing in the
Azure Pipeline about half of the time.

The symptom is this:

-- snip --
++ cp -R '/Users/vsts/agent/2.146.0/work/1/s/t/trash 
directory.t5570-git-daemon/repo/repo_pack.git' 
'/Users/vsts/agent/2.146.0/work/1/s/t/trash 
directory.t5570-git-daemon/repo/repo_bad2.git'
++ cd '/Users/vsts/agent/2.146.0/work/1/s/t/trash 
directory.t5570-git-daemon/repo/repo_bad2.git'
+++ ls objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
++ p=objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
++ chmod u+w objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
++ printf %0256d 0
++ dd of=objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx bs=256 
count=1 seek=1 conv=notrunc
1+0 records in
1+0 records out
256 bytes transferred in 0.000018 secs (14316558 bytes/sec)
++ mkdir repo_bad2.git
++ cd repo_bad2.git
++ git --bare init
Initialized empty Git repository in /Users/vsts/agent/2.146.0/work/1/s/t/trash 
directory.t5570-git-daemon/repo_bad2.git/
++ test_must_fail git --bare fetch git://127.0.0.1:5570/repo_bad2.git
++ case "$1" in
++ _test_ok=
++ git --bare fetch git://127.0.0.1:5570/repo_bad2.git
[13879] Connection from 127.0.0.1:60504
[13879] Extended attribute "host": 127.0.0.1:5570
[13879] Extended attribute "protocol": version=0
[13879] Request upload-pack for '/repo_bad2.git'
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: refs/heads/master does not point to a valid object!
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: refs/heads/other does not point to a valid object!
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] error: non-monotonic index 
./objects/pack/pack-b2175ab283c0abe358895777ca01c36e88a92e35.idx
[13879] fatal: git upload-pack: not our ref 
b5c2e4226db03597a64815fd226648510b22ba40
[11833] [13879] Disconnected (with error)
++ exit_code=141
++ test 141 -eq 0
++ test_match_signal 13 141
++ test 141 = 141
++ return 0
++ list_contains '' sigpipe
++ case ",$1," in
++ return 1
++ test 141 -gt 129
++ test 141 -le 192
++ echo 'test_must_fail: died by signal 13: git --bare fetch 
git://127.0.0.1:5570/repo_bad2.git'
test_must_fail: died by signal 13: git --bare fetch 
git://127.0.0.1:5570/repo_bad2.git
++ return 1
error: last command exited with $?=1
not ok 10 - fetch notices corrupt idx
#       
#               cp -R "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git 
"$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
#               (cd "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
#                p=$(ls objects/pack/pack-*.idx) &&
#                chmod u+w $p &&
#                printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
#               ) &&
#               mkdir repo_bad2.git &&
#               (cd repo_bad2.git &&
#                git --bare init &&
#                test_must_fail git --bare fetch 
"$GIT_DAEMON_URL/repo_bad2.git" &&
#                test 0 = $(ls objects/pack | wc -l)
#               )
#       
-- snap --

Any ideas how to fix this test, anyone?

Ciao,
Dscho

> >
> > What it boils down to is this sequence:
> >
> >   - The test first prepares a repository containing a corrupt pack,
> >     ready to be server via 'git daemon'.
> >
> >   - Then the test runs 'test_must_fail git fetch ....', which connects
> >     to 'git daemon', which forks 'git upload-pack', which then
> >     advertises refs (only HEAD) and capabilities.  So far so good.
> >
> >   - 'git fetch' eventually calls fetch-pack.c:find_common().  The
> >     first half of this function assembles a request consisting of a
> >     want and a flush pkt-line, and sends it via a send_request() call.
> >
> >     At this point the scheduling becomes important: let's suppose that
> >     fetch is slow and upload-pack is fast.
> >
> >   - 'git upload-pack' receives the request, parses the want line,
> >     notices the corrupt pack, responds with an 'ERR upload-pack: not
> >     our ref' pkt-line, and die()s right away.
> >
> >   - 'git fetch' finally approaches the end of the function, where it
> >     attempts to send a done pkt-line via another send_request() call
> >     through the now closing TCP socket.
> >
> >   - What happens now seems to depend on the platform:
> >
> >     - On Linux, both on my machine and on Travis CI, it shows textbook
> >       example behaviour: write() returns with error and sets errno to
> >       ECONNRESET.  Since it happens in write_or_die(), 'git fetch'
> >       die()s with 'fatal: write error: Connection reset by peer', and
> >       doesn't show the error send by 'git upload-pack'; how could it,
> >       it doesn't even get as far to receive upload-pack's ERR
> >       pkt-line.
> >
> >       The test only checks that 'git fetch' fails, but it doesn't
> >       check whether it failed with the right error message, so the
> >       test still succeeds.  Had it checked the error message as well,
> >       we most likely had noticed this issue already, it doesn't happen
> >       all that rarely.
> >
> >     - On the new OSX images with XCode 9.4 on Travis CI the write()
> >       triggers SIGPIPE right away, and 'test_must_fail' notices it and
> >       fails the test.  I couldn't see any sign of an ECONNRESET or any
> >       other error that we could act upon to avoid the SIGPIPE.
> >
> >     - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
> >       ECONNRESET, but sending the request actually succeeds even
> >       though there is no process on the other end of the socket
> >       anymore.  'git fetch' then simply continues execution, reads and
> >       parses the ERR pkt-line, and then dies()s with 'fatal: remote
> >       error: upload-pack: not our ref'.  So, on the face of it, it
> >       shows the desired behaviour, but I have no idea how that write()
> >       could succeed instead of returning error.
> >
> > I don't know what happens on a real Mac as I don't have access to one;
> > I figured out all the above by enabling packet tracing, adding a
> > couple of well placed tracing printf() and sleep() calls, running a
> > bunch of builds on Travis CI, and looking through their logs.  But
> > without access to a debugger and netstat and what not I can't really
> > go any further.  So I would now happily pass the baton to those who
> > have a Mac and know a thing or two about its porting issues to first
> > check whether OSX on a real Mac shows the same behaviour as it does in
> > Travis CI's virtualized(?) environment.  And then they can pass the
> > baton to those who know all the intricacies of the pack protocol and
> > its implementation to decide what to do with this issue.
> >
> > For a mostly reliable reproduction recipe you might want to fetch this
> > branch:
> >
> >   https://github.com/szeder/git t5570-git-daemon-sigpipe
> >
> > and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'
> >
> >
> > Have fun! ;)
> >
> >
> > 1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce
> >
> > 2 - On git.git's master:
> >     https://travis-ci.org/git/git/jobs/411517552#L2717
> 
> 

Reply via email to