Thanks for the pointer, Todd!

On Wed, Jun 19, 2019 at 11:53 PM Todd Lipcon <t...@cloudera.com> wrote:
> This same issue was reported a month or two ago for Kudu on Fedora 29. I
> think Alexey Serbin had started to look into it. Alexey, did we figure out
> what was going on here?
>
> -Todd
>
> On Wed, Jun 19, 2019 at 6:00 AM Laszlo Gaal <laszlo.g...@cloudera.com>
> wrote:
>
> > Having looked at the failing build Jim quoted above, the failure seems to
> > come from the security area.
> > This is from the Kudu master's log, from the startup sequence (see
> > https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/16/artifact/Impala/logs_static/logs/cluster/cdh6-node-1/kudu/master/kudu-master.INFO/*view*/
> > ),
> > all this in the context of an Impala minicluster:
> >
> > I0612 04:12:56.129866 8515 sys_catalog.cc:424] T
> > 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515
> > [sys.catalog]: configured and running, proceeding with master startup.
> > W0612 04:12:56.130080 8522 catalog_manager.cc:1113] T
> > 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515:
> > acquiring CA information for follower catalog manager: Not found: root CA
> > entry not found
> > W0612 04:12:56.130123 8522 catalog_manager.cc:596] Not found: root CA
> > entry not found: failed to prepare follower catalog manager, will retry
> > I0612 04:12:56.130151 8521 catalog_manager.cc:1055] Loading table and
> > tablet metadata into memory...
> > I0612 04:12:56.130228 8521 catalog_manager.cc:1066] Initializing Kudu
> > internal certificate authority...
> > W0612 04:12:56.167639 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50174: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.170145 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50176: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.172571 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50178: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.182530 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50180: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.185034 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50182: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.187453 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50184: expected TLS_HANDSHAKE step: SASL_INITIATE
> > I0612 04:12:56.197146 8521 catalog_manager.cc:950] Generated new
> > certificate authority record
> > I0612 04:12:56.198005 8521 catalog_manager.cc:1075] Loading token signing
> > keys...
> > W0612 04:12:56.293697 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50186: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.295320 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50188: expected TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.296821 8636 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50190: expected TLS_HANDSHAKE step: SASL_INITIATE
> > I0612 04:12:56.416918 8521 catalog_manager.cc:4292] T
> > 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515:
> > Generated new TSK 0
> > W0612 04:12:57.174684 8901 negotiation.cc:320] Unauthorized connection
> > attempt: Server connection negotiation failed: server connection from
> > 127.0.0.1:50192: expected TLS_HANDSHAKE step: SASL_INITIATE
> > [and so on...]
> >
> > The same run has very similar messages in the tablet server logs as well:
> > 0612 04:12:56.289767 8396 rpc_server.cc:205] RPC server started. Bound to:
> > 127.0.0.1:31202
> > I0612 04:12:56.289903 8396 webserver.cc:308] Webserver started at
> > http://0.0.0.0:31302/ using document root
> > /home/ubuntu/Impala/toolchain/cdh_components-1137441/kudu-1.10.0-cdh6.x-SNAPSHOT/release/bin/../lib/kudu/www
> > and password file <none>
> > W0612 04:12:56.293773 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (0 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:12:56.296866 8897 heartbeater.cc:380] Failed 3 heartbeats in a
> > row: no longer allowing fast heartbeat attempts.
> > W0612 04:13:56.424613 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (62 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:14:56.556850 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (122 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:15:56.694403 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (182 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:16:56.826400 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (242 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:17:56.955927 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (302 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:18:57.103503 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (362 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051:
> > FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:19:57.237712 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (422 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:20:57.393489 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (482 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:21:57.522513 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (542 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:22:57.652271 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (602 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:23:57.782537 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (662 consecutive failures): Not authorized: Failed to ping
> > master at 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> > W0612 04:24:57.910481 8897 heartbeater.cc:587] Failed to heartbeat to
> > 127.0.0.1:7051 (722 consecutive failures): Not authorized: Failed to ping
> > master at
> > 127.0.0.1:7051: Client connection negotiation failed: client
> > connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected
> > TLS_HANDSHAKE step: SASL_INITIATE
> >
> > On Mon, Jun 17, 2019 at 9:08 PM Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > On Sat, Jun 15, 2019 at 2:20 PM Jim Apple <apa...@jbapple.com> wrote:
> > >
> > > > My goal is to have Impala keep up with (what I perceive to be) the most
> > > > popular version of the most popular Linux distribution, for the purpose
> > > > of easing the workflow of developers, especially new developers.
> > >
> > > Sure, that makes sense. I use Ubuntu 18 myself, but tend to develop Impala
> > > on a remote box running el7 because the dev environment is too
> > > heavy-weight to realistically run on my laptop.
> > >
> > > > 18.04 stopped being able to load data some time between June 9th
> > > > (https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/14/) and
> > > > June 12
> > > > (https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/16/artifact/Impala/logs_static/logs/data_loading/catalogd.ERROR/*view*/).
> > > > I tried reproducing the June 9 run with the same git checkouts (Impala
> > > > and Impala-LZO) as #14 today, and data loading still failed.
> > > >
> > > > What RHEL 7 components did you have in mind that are closer to Ubuntu
> > > > 16.04 than 18.04?
> > >
> > > Stuff like libc, openssl, krb5, sasl, etc. are pretty different
> > > version-wise. At least, I know when we made Kudu pass tests on Ubuntu 18,
> > > we dealt with issues mostly in those libraries, which aren't part of the
> > > toolchain (for security reasons we rely on OS-provided libs).
> > >
> > > Generally I think precommit running on something closer to the oldest
> > > supported OS is better than running on the newest, since it's more likely
> > > that new OSes are backward-compatible.
> > > Otherwise it's very easy to introduce code that uses features not
> > > available on el7, for example.
> > >
> > > > On Wed, May 22, 2019 at 10:41 AM Todd Lipcon <t...@cloudera.com> wrote:
> > > >
> > > > > On Mon, May 20, 2019 at 8:36 PM Jim Apple <apa...@jbapple.com> wrote:
> > > > >
> > > > > > Maybe now would be a good time to implement Everblue jobs that ping
> > > > > > dev@ when they fail. Thoughts?
> > > > >
> > > > > Mixed feelings on that. We already get many test runs per day of the
> > > > > "default" config because people are running precommit builds. Adding an
> > > > > additional cron-based job to the mix that runs the same builds doesn't
> > > > > seem like it adds much unless it tests some other config (e.g. Ubuntu 18
> > > > > or a longer suite of tests). One thing I could get on board with would
> > > > > be switching the precommit builds to run just "core" tests or some other
> > > > > faster subset, and deferring the exhaustive/long runs to scheduled
> > > > > builds or as an optional precommit for particularly invasive patches. I
> > > > > think that would increase dev quality of life substantially (I find my
> > > > > productivity is often hampered by only getting two shots at a precommit
> > > > > run per work day).
> > > > >
> > > > > I'm not against adding a cron-triggered full test/build on Ubuntu 18,
> > > > > but would like to know if someone plans to sign up to triage it when it
> > > > > fails. My experience with other Apache communities is that collective
> > > > > ownership over test triage duty (i.e. "email the dev list on failure")
> > > > > doesn't work. I seem to recall we had such builds back in 2010 or so on
> > > > > Hadoop and they just always got ignored.
> > > > > In various "day job" teams I've seen this work via a prescriptive
> > > > > rotation ("all team members take a triage/build-cop shift") but that's
> > > > > not really compatible with the nature of Apache projects being
> > > > > volunteer communities.
> > > > >
> > > > > So, I think I'll put the question back to you: as a committer you can
> > > > > spend your time as you like. If you think an Ubuntu 18 job running on a
> > > > > schedule would be useful and are willing to sign up to triage failures,
> > > > > sounds great to me :) Personally I don't develop on Ubuntu 18 and in my
> > > > > day job it's not a particularly important deployment platform, so I
> > > > > don't think I'll spend much time triaging that build.
> > > > >
> > > > > Todd
> > > > >
> > > > > > On Mon, May 20, 2019 at 9:09 AM Todd Lipcon <t...@cloudera.com> wrote:
> > > > > >
> > > > > > > Adding a build-only job for 18.04 makes sense to me. A full test run
> > > > > > > on every precommit seems a bit expensive but doing one once a week or
> > > > > > > something like that might be a good idea to prevent runtime
> > > > > > > regressions.
> > > > > > >
> > > > > > > As for switching the precommit from 16.04 to 18.04, I'd lean towards
> > > > > > > keeping to 16.04 due to it being closer in terms of component
> > > > > > > versions to common enterprise distros like RHEL 7.
> > > > > > >
> > > > > > > -Todd
> > > > > > >
> > > > > > > On Sun, May 19, 2019 at 5:03 PM Jim Apple <jbap...@apache.org> wrote:
> > > > > > >
> > > > > > > > HEAD now passes on Ubuntu 18.04:
> > > > > > > >
> > > > > > > > https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/
> > > > > > > >
> > > > > > > > Thanks to the community members who have made this happen!
> > > > > > > >
> > > > > > > > Should we add Ubuntu 18.04 to our pre-merge Jenkins job, replace
> > > > > > > > 16.04 with 18.04 in our pre-merge Jenkins job, or neither?
> > > > > > > >
> > > > > > > > I propose adding 18.04 for now (and so running both 16.04 and 18.04
> > > > > > > > on merge) and removing 16.04 when it starts to become inconvenient.
> > > > > > >
> > > > > > > --
> > > > > > > Todd Lipcon
> > > > > > > Software Engineer, Cloudera
> > > > >
> > > > > --
> > > > > Todd Lipcon
> > > > > Software Engineer, Cloudera
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>