On Tue, Jan 23, 2018 at 5:43 PM, Jeff King <p...@peff.net> wrote:
> On Mon, Jan 22, 2018 at 02:32:19PM +0100, SZEDER Gábor wrote:
>
>> Travis CI runs the 32 bit Linux build job in a Docker container, where
>> all commands are executed as root by default.  Therefore, ever since
>> we added this build job in 88dedd5e7 (Travis: also test on 32-bit
>> Linux, 2017-03-05), we have a bit of code to create a user in the
>> container matching the ID of the host user and then to run the test
>> suite as this user.  Matching the host user ID is important, because
>> otherwise the host user would have no access to any files written by
>> processes running in the container, notably the logs of failed tests
>> couldn't be included in the build job's trace log.
>>
>> Alas, this piece of code never worked, because it sets the variable
>> holding the user name ($CI_USER) in a subshell, meaning it doesn't
>> have any effect by the time we get to the point to actually use the
>> variable to switch users with 'su'.  So all this time we were running
>> the test suite as root.
>
> This all makes sense to me, and the patch looks sane.
>
>> Reorganize that piece of code in 'ci/run-linux32-build.sh' a bit to
>> avoid that problematic subshell and to ensure that we switch to the
>> right user.  Furthermore, make the script's optional host user ID
>> option mandatory, so running the build accidentally as root will
>> become harder when debugging locally.  If someone really wants to run
>> the test suite as root, whatever the reasons might be, it'll still be
>> possible to do so by explicitly passing '0' as host user ID.
>
> Coincidentally, we recently set up a docker test build for our local fork
> of Git. We found the whole "mount the existing git source tree" strategy
> slightly annoying, exactly because of these mismatches between the host
> and docker uids.
>
> Instead, we ended up with something more like:
>
>   git archive HEAD | docker run ...
>
> and the in-container script starts with:
>
>   tar -x
>
> to extract the to-be-tested source, and ends with a dump of the failures
> to stdout.
>
> I don't know if it's worth switching strategies now, but just some food
> for thought. It may be less interesting with Travis, too, because you're
> also trying to carry the prove cache across runs, which does require
> some filesystem interaction.

The prove state in itself doesn't need matching user IDs in the host and
the container, because it's only accessed in the container[1].

I think the key is the handling of verbose logs of failed test(s).  The
original motivation for matching the user IDs was, I suppose, that we
wanted to dump the verbose log of the failed test(s) to the trace log on
the host, because this way it's fairly consistent with how other build
job do the same: it uses the same 'after_failure' script to dump the
verbose logs, it relies on Travis CI to automatically run that script if
something goes wrong, and those logs end up in the 'after_failure'
fold.

[1] - I ignored saving and restoring the cache here: while it's done on
      the host, but it's done by Travis CI, presumably as root, so the
      owner of the file doesn't matter.

> (As an aside, I'm not sure the prove cache is doing much. Running in
> slow-to-fast order helps if you are trying to run massively in parallel,
> but we only use -j3 for our Travis builds).

It saves about a minute / 10% of runtime; it's mentioned in 7e72cfcee
(travis-ci: save prove state for the 32 bit Linux build, 2017-12-27).

Reply via email to