[jira] [Commented] (HIVE-16749) Run YETUS in Docker container

Allen Wittenauer (JIRA) Wed, 28 Jun 2017 08:10:25 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-16749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066649#comment-16066649
 ]


Allen Wittenauer commented on HIVE-16749:
-----------------------------------------

FYI, typically people let Yetus execute the docker commands itself. Two big 
reasons for this:
* Yetus will run docker build, which means that the Dockerfile can be modified 
on the fly as necessary. Yetus will detect if the Dockerfile has changed and 
rebuild as necessary, including as part of the patch being tested!
* The patchdir and basedir will be available after the container exits, which 
means logs and such are available post-build.  This is very useful to have 
access to in case of failures.

Anyway, cutting back the extra bits, given a directory structure of:

artifacts dir for Jenkins here: ${WORKSPACE}/artifacts
git checkout to here: ${WORKSPACE}/source

You just need the following extra lines on the command line:

{code}
        --patch-dir=${WORKSPACE}/artifacts \
        --basedir=${WORKSPACE}/source \
        --docker \
        --dockerfile=${WORKSPACE}/dev-support/docker/Dockerfile \
{code}
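
For context, here is a rough dry-run sketch of how those lines might sit in a Jenkins shell step. It only prints the command instead of running it. The {{TESTPATCH}} and {{ISSUE}} variables are placeholders of mine, and passing the issue as the final argument is an assumption about your setup, so adjust to taste:

```shell
#!/bin/sh
# Sketch only: print the test-patch command line rather than running it.
# WORKSPACE is normally set by Jenkins; the fallback is for local dry runs.
WORKSPACE="${WORKSPACE:-/tmp/workspace}"

# Assumptions: where the Yetus entry point lives and which issue is under test.
TESTPATCH="${TESTPATCH:-test-patch}"
ISSUE="${ISSUE:-HIVE-16749}"

# Collect the arguments from the comment above as positional parameters.
set -- \
  "--patch-dir=${WORKSPACE}/artifacts" \
  "--basedir=${WORKSPACE}/source" \
  --docker \
  "--dockerfile=${WORKSPACE}/dev-support/docker/Dockerfile" \
  "${ISSUE}"

YETUS_CMD="${TESTPATCH} $*"
echo "${YETUS_CMD}"
```

Once the printed command looks right, replace the final echo with a real invocation.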

Be aware that the Dockerfile needs to have *everything* that Yetus will need to 
do its work.  For example, if the pylint test is enabled, then python with all 
the prerequisite pylint eggs needs to be installed too. You can see the 
default/example one that Yetus uses here:  
https://github.com/apache/yetus/blob/405cd9fa6e4f6240690bbba1bad6d054a4241214/precommit/test-patch-docker/Dockerfile

If you have an existing Dockerfile with extra steps that you don't want 
executed as part of the Yetus run, you can still use it, provided you can move 
those steps to the bottom of the file.  See Hadoop's as an example: 
https://github.com/apache/hadoop/blob/ee243e5289212aa2912d191035802ea023367e19/dev-support/docker/Dockerfile
  

The {{{# YETUS CUT HERE}}} line acts as a guard: Yetus does not read anything 
below it.
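
To illustrate the effect of the guard, the sketch below mimics it with sed on a made-up Dockerfile. This is not Yetus's own code, and the Dockerfile contents are invented for the demo; it just shows which part of the file ends up being used, assuming the guard behaves the way the comments in Hadoop's Dockerfile describe:

```shell
#!/bin/sh
# Illustration only: mimic the effect of the guard line with sed.
# This is not how Yetus is implemented; the Dockerfile content is made up.
DEMO_DIR="$(mktemp -d)"
cat > "${DEMO_DIR}/Dockerfile" <<'EOF'
FROM ubuntu:16.04
RUN apt-get -q update && apt-get -q install -y maven
# YETUS CUT HERE
RUN echo "steps only needed outside of Yetus runs"
EOF

# Keep everything before the guard; drop the guard line and all lines after it.
sed '/# YETUS CUT HERE/,$d' "${DEMO_DIR}/Dockerfile" > "${DEMO_DIR}/Dockerfile.yetus"
cat "${DEMO_DIR}/Dockerfile.yetus"
```

Only the FROM line and the first RUN line survive the cut.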

I also HIGHLY recommend using the {{{--mvn-custom-repos}}} and {{{--jenkins}}} 
flags wherever more than one maven run is happening on a Jenkins instance.  
Maven does *zero* locking of its cache, which means that simultaneous runs will 
stomp all over each other and produce wildly inaccurate results.  On Jenkins, 
those flags guarantee that different executors will use different .m2 caches, 
and that different branches will too.  The very first run on a node will take a 
while as it does the mass download, but after that it's pretty quick.  We saw 
unit test failure counts drop significantly after doing that in Hadoop.
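
The idea behind separate caches can be sketched by hand like this. It is not what Yetus runs internally; {{repo_for}}, {{M2_BASE}} and {{BRANCH}} are hypothetical names of mine, while {{EXECUTOR_NUMBER}} is the variable Jenkins sets per executor and {{maven.repo.local}} is the standard Maven property for the local repository:

```shell
#!/bin/sh
# Sketch: give each branch/executor pair its own local Maven repository.
# EXECUTOR_NUMBER comes from Jenkins; BRANCH, M2_BASE and repo_for are
# hypothetical names used only for this illustration.
BRANCH="${BRANCH:-master}"
EXECUTOR_NUMBER="${EXECUTOR_NUMBER:-0}"
M2_BASE="${M2_BASE:-${HOME}/yetus-m2}"

# Build a repository path that is unique per branch and executor.
repo_for() {
  printf '%s/hive-%s-%s\n' "${M2_BASE}" "$1" "$2"
}

MAVEN_LOCAL_REPO="$(repo_for "${BRANCH}" "${EXECUTOR_NUMBER}")"

# Print the Maven command instead of running it.
echo "mvn -Dmaven.repo.local=${MAVEN_LOCAL_REPO} clean install"
```

Two executors building the same branch get different paths, so concurrent runs never write to the same cache.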

One other thing:  you don't need to run patch.  You can monkey patch individual 
functions inside the hive personality file.  It's loaded last, which means it 
can override functions defined earlier... :)
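
The mechanism is plain shell function redefinition. In the minimal sketch below, {{report_result}} is a made-up name standing in for whatever function you want to replace, not a real Yetus function:

```shell
#!/bin/sh
# Stand-in for a function defined by the framework (hypothetical name).
report_result() {
  echo "default: $1"
}

# Stand-in for the personality file, which is sourced after the framework.
# Redefining the function replaces the earlier definition.
report_result() {
  echo "hive override: $1"
}

# Callers now get the personality's version.
report_result "unit tests"
```

Whichever definition the shell reads last is the one that runs.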


> Run YETUS in Docker container
> -----------------------------
>
>                 Key: HIVE-16749
>                 URL: https://issues.apache.org/jira/browse/HIVE-16749
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Peter Vary
>            Assignee: Zoltan Haindrich
>         Attachments: HIVE-16749.1.patch
>
>
> Think about the pros and cons of running YETUS in a docker container:
> - Resources
> - Usage complexity
> - Yetus version changes
> - Findbugs
> - etc.
> If worthwhile run YETUS in a docker container



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
