debugged with the help of benm.

https://issues.apache.org/jira/browse/MESOS-1310



On Sun, May 4, 2014 at 10:11 PM, Benjamin Hindman <b...@eecs.berkeley.edu>wrote:

> This is a rather peculiar issue ... has anybody else seen this crash
> before?
>
>
> On Sun, May 4, 2014 at 5:23 PM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
> > See <https://builds.apache.org/job/Mesos-Ubuntu-distcheck/73/>
> >
> > ------------------------------------------
> > [...truncated 13055 lines...]
> > I0505 00:23:09.478384 24455 master.cpp:2838] Successfully authenticated
> > slave(51)@67.195.138.9:37909
> > I0505 00:23:09.478520 24449 slave.cpp:644] Successfully authenticated
> with
> > master master@67.195.138.9:37909
> > I0505 00:23:09.478582 24449 slave.cpp:873] Will retry registration in
> > 6.185133022secs if necessary
> > I0505 00:23:09.478658 24453 master.cpp:2134] Registering slave at
> > slave(51)@67.195.138.9:37909 (vesta.apache.org) with id
> > 20140505-002308-160088899-37909-24266-1
> > I0505 00:23:09.478857 24454 registrar.cpp:422] Attempting to update the
> > 'registry'
> > I0505 00:23:09.480067 24455 log.cpp:680] Attempting to append 515 bytes
> to
> > the log
> > I0505 00:23:09.480121 24449 coordinator.cpp:340] Coordinator attempting
> to
> > write APPEND action at position 5
> > I0505 00:23:09.480514 24455 replica.cpp:508] Replica received write
> > request for position 5
> > I0505 00:23:09.492759 24455 leveldb.cpp:341] Persisting action (534
> bytes)
> > to leveldb took 12.229994ms
> > I0505 00:23:09.492796 24455 replica.cpp:676] Persisted action at 5
> > I0505 00:23:09.493127 24453 replica.cpp:655] Replica received learned
> > notice for position 5
> > I0505 00:23:09.572561 24453 leveldb.cpp:341] Persisting action (536
> bytes)
> > to leveldb took 79.381257ms
> > I0505 00:23:09.572619 24453 replica.cpp:676] Persisted action at 5
> > I0505 00:23:09.572631 24453 replica.cpp:661] Replica learned APPEND
> action
> > at position 5
> > I0505 00:23:09.573096 24453 registrar.cpp:479] Successfully updated
> > 'registry'
> > I0505 00:23:09.573251 24452 log.cpp:699] Attempting to truncate the log
> to
> > 5
> > I0505 00:23:09.573354 24449 coordinator.cpp:340] Coordinator attempting
> to
> > write TRUNCATE action at position 6
> > I0505 00:23:09.573360 24456 master.cpp:2174] Registered slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org)
> > I0505 00:23:09.573384 24456 master.cpp:3283] Adding slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org) with cpus(*):2; mem(*):1024; disk(*):1024;
> > ports(*):[31000-32000]
> > I0505 00:23:09.573531 24452 slave.cpp:677] Registered with master
> > master@67.195.138.9:37909; given slave ID
> > 20140505-002308-160088899-37909-24266-1
> > I0505 00:23:09.573710 24456 replica.cpp:508] Replica received write
> > request for position 6
> > I0505 00:23:09.573719 24452 slave.cpp:690] Checkpointing SlaveInfo to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/
> > slave.info'
> > I0505 00:23:09.573739 24453 hierarchical_allocator_process.hpp:444] Added
> > slave 20140505-002308-160088899-37909-24266-1 (vesta.apache.org) with
> > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and
> > cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> > I0505 00:23:09.573809 24453 hierarchical_allocator_process.hpp:751]
> > Offering cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on
> > slave 20140505-002308-160088899-37909-24266-1 to framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.573995 24453 hierarchical_allocator_process.hpp:707]
> > Performed allocation for slave 20140505-002308-160088899-37909-24266-1 in
> > 224450ns
> > I0505 00:23:09.574116 24452 master.hpp:612] Adding offer
> > 20140505-002308-160088899-37909-24266-1 with resources cpus(*):2;
> > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> > 20140505-002308-160088899-37909-24266-1 (vesta.apache.org)
> > I0505 00:23:09.574167 24452 master.cpp:2747] Sending 1 offers to
> framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.574404 24454 sched.cpp:529] Scheduler::resourceOffers took
> > 31038ns
> > I0505 00:23:09.575747 24453 master.hpp:622] Removing offer
> > 20140505-002308-160088899-37909-24266-1 with resources cpus(*):2;
> > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> > 20140505-002308-160088899-37909-24266-1 (vesta.apache.org)
> > I0505 00:23:09.575811 24453 master.cpp:1812] Processing reply for offers:
> > [ 20140505-002308-160088899-37909-24266-1 ] on slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org) for framework
> 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.575906 24453 master.hpp:584] Adding task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa with resources cpus(*):2;
> mem(*):1024;
> > disk(*):1024; ports(*):[31000-32000] on slave
> > 20140505-002308-160088899-37909-24266-1 (vesta.apache.org)
> > I0505 00:23:09.575959 24453 master.cpp:2922] Launching task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 with resources cpus(*):2;
> > mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org)
> > I0505 00:23:09.576086 24449 slave.cpp:907] Got assigned task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa for framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.576190 24449 slave.cpp:3085] Checkpointing FrameworkInfo
> to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/
> > framework.info'
> > I0505 00:23:09.576302 24449 slave.cpp:3092] Checkpointing framework pid
> > 'scheduler(39)@67.195.138.9:37909' to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/framework.pid'
> > I0505 00:23:09.576511 24449 slave.cpp:1017] Launching task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa for framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.577499 24449 slave.cpp:3399] Checkpointing ExecutorInfo to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/
> > executor.info'
> > I0505 00:23:09.577883 24449 slave.cpp:3496] Checkpointing TaskInfo to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/runs/775df9b1-72da-44de-92f2-f87291e20cb7/tasks/db857d58-97a8-4cd8-a42a-9eb62f4583fa/
> > task.info'
> > I0505 00:23:09.577905 24450 mesos_containerizer.cpp:523] Starting
> > container '775df9b1-72da-44de-92f2-f87291e20cb7' for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > '20140505-002308-160088899-37909-24266-0000'
> > I0505 00:23:09.578043 24449 slave.cpp:1127] Queuing task
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' for executor
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > '20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:09.578121 24449 slave.cpp:488] Successfully attached file
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/runs/775df9b1-72da-44de-92f2-f87291e20cb7'
> > I0505 00:23:09.579169 24450 launcher.cpp:117] Forked child with pid
> > '25462' for container '775df9b1-72da-44de-92f2-f87291e20cb7'
> > I0505 00:23:09.579269 24450 mesos_containerizer.cpp:746] Checkpointing
> > executor's forked pid 25462 to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/runs/775df9b1-72da-44de-92f2-f87291e20cb7/pids/forked.pid'
> > I0505 00:23:09.580445 24449 mesos_containerizer.cpp:623] Fetching URIs
> for
> > container '775df9b1-72da-44de-92f2-f87291e20cb7' using command
> >
> '/home/hudson/jenkins-slave/workspace/Mesos-Ubuntu-distcheck/build/mesos-0.19.0/_build/src/mesos-fetcher'
> > I0505 00:23:09.592635 24456 leveldb.cpp:341] Persisting action (16 bytes)
> > to leveldb took 18.785108ms
> > I0505 00:23:09.592696 24456 replica.cpp:676] Persisted action at 6
> > I0505 00:23:09.593547 24451 replica.cpp:655] Replica received learned
> > notice for position 6
> > I0505 00:23:09.600426 24451 leveldb.cpp:341] Persisting action (18 bytes)
> > to leveldb took 6.841373ms
> > I0505 00:23:09.600484 24451 leveldb.cpp:399] Deleting ~2 keys from
> leveldb
> > took 18966ns
> > I0505 00:23:09.600498 24451 replica.cpp:676] Persisted action at 6
> > I0505 00:23:09.600510 24451 replica.cpp:661] Replica learned TRUNCATE
> > action at position 6
> > I0505 00:23:09.973253 24456 slave.cpp:2284] Monitoring executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > '20140505-002308-160088899-37909-24266-0000' in container
> > '775df9b1-72da-44de-92f2-f87291e20cb7'
> > WARNING: Logging before InitGoogleLogging() is written to STDERR
> > I0505 00:23:10.005146 25484 process.cpp:1611] libprocess is initialized
> on
> > 67.195.138.9:35102 for 8 cpus
> > I0505 00:23:10.006180 25484 exec.cpp:131] Version: 0.19.0
> > I0505 00:23:10.007604 25499 exec.cpp:181] Executor started at:
> executor(1)@
> > 67.195.138.9:35102 with pid 25484
> > I0505 00:23:10.008164 24456 slave.cpp:1600] Got registration for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.008224 24456 slave.cpp:1685] Checkpointing executor pid
> > 'executor(1)@67.195.138.9:35102' to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/runs/775df9b1-72da-44de-92f2-f87291e20cb7/pids/libprocess.pid'
> > I0505 00:23:10.008625 24456 slave.cpp:1719] Flushing queued task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.008990 25506 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.009049 25505 exec.cpp:205] Executor registered on slave
> > 20140505-002308-160088899-37909-24266-1
> > I0505 00:23:10.009913 25505 exec.cpp:217] Executor::registered took
> 83098ns
> > Registered executor on vesta.apache.org
> > I0505 00:23:10.010023 25505 exec.cpp:292] Executor asked to run task
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa'
> > I0505 00:23:10.010076 25505 exec.cpp:301] Executor::launchTask took
> 34587ns
> > Starting task db857d58-97a8-4cd8-a42a-9eb62f4583fa
> > Forked command at 25507
> > sh -c 'sleep 1000'
> > I0505 00:23:10.011773 25503 exec.cpp:524] Executor sending status update
> > TASK_RUNNING (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.012099 24454 slave.cpp:1955] Handling status update
> > TASK_RUNNING (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 from executor(1)@
> > 67.195.138.9:35102
> > I0505 00:23:10.012217 24454 status_update_manager.cpp:320] Received
> status
> > update TASK_RUNNING (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.012229 24454 status_update_manager.cpp:499] Creating
> > StatusUpdate stream for task db857d58-97a8-4cd8-a42a-9eb62f4583fa of
> > framework 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.012406 24454 status_update_manager.hpp:342] Checkpointing
> > UPDATE for status update TASK_RUNNING (UUID:
> > 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.271118 24454 status_update_manager.cpp:373] Forwarding
> > status update TASK_RUNNING (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35)
> for
> > task db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 to master@67.195.138.9:37909
> > I0505 00:23:10.271327 24454 slave.cpp:2072] Status update manager
> > successfully handled status update TASK_RUNNING (UUID:
> > 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.271338 24454 slave.cpp:2078] Sending acknowledgement for
> > status update TASK_RUNNING (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35)
> for
> > task db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 to executor(1)@
> > 67.195.138.9:35102
> > I0505 00:23:10.271390 24453 master.cpp:2450] Status update TASK_RUNNING
> > (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 from slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org)
> > I0505 00:23:10.271458 24453 sched.cpp:620] Scheduler::statusUpdate took
> > 14432ns
> > I0505 00:23:10.271669 25506 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.271704 25498 exec.cpp:338] Executor received status update
> > acknowledgement 1ac72be6-b0db-4d58-9d1a-7bf13de11a35 for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.271924 24454 status_update_manager.cpp:398] Received
> status
> > update acknowledgement (UUID: 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for
> > task db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.271945 24454 status_update_manager.hpp:342] Checkpointing
> > ACK for status update TASK_RUNNING (UUID:
> > 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.298812 24453 hierarchical_allocator_process.hpp:687]
> > Performed allocation for 2 slaves in 43928ns
> > I0505 00:23:10.422714 24450 slave.cpp:1540] Status update manager
> > successfully handled status update acknowledgement (UUID:
> > 1ac72be6-b0db-4d58-9d1a-7bf13de11a35) for task
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.422968 24266 slave.cpp:416] Slave terminating
> > I0505 00:23:10.423104 24450 master.cpp:690] Slave
> > 20140505-002308-160088899-37909-24266-0 at slave(50)@67.195.138.9:37909(
> > vesta.apache.org) disconnected
> > I0505 00:23:10.423132 24450 master.cpp:1262] Disconnecting slave
> > 20140505-002308-160088899-37909-24266-0
> > I0505 00:23:10.423249 24454 hierarchical_allocator_process.hpp:483] Slave
> > 20140505-002308-160088899-37909-24266-0 disconnected
> > I0505 00:23:10.423838 24266 slave.cpp:416] Slave terminating
> > I0505 00:23:10.423907 24453 master.cpp:690] Slave
> > 20140505-002308-160088899-37909-24266-1 at slave(51)@67.195.138.9:37909(
> > vesta.apache.org) disconnected
> > I0505 00:23:10.423933 24453 master.cpp:1262] Disconnecting slave
> > 20140505-002308-160088899-37909-24266-1
> > I0505 00:23:10.424024 24456 hierarchical_allocator_process.hpp:483] Slave
> > 20140505-002308-160088899-37909-24266-1 disconnected
> > I0505 00:23:10.424453 24266 mesos_containerizer.cpp:122] Using isolation:
> > posix/cpu,posix/mem
> > I0505 00:23:10.425937 24455 slave.cpp:140] Slave started on 52)@
> > 67.195.138.9:37909
> > I0505 00:23:10.425968 24455 credentials.hpp:35] Loading credentials for
> > authentication
> > W0505 00:23:10.426074 24455 credentials.hpp:48] Failed to stat
> credentials
> > file 'file:///tmp/SlaveRecoveryTest_0_MultipleSlaves_wg2szi/credential':
> No
> > such file or directory
> > I0505 00:23:10.426100 24455 slave.cpp:231] Slave using credential for:
> > test-principal
> > I0505 00:23:10.426239 24266 mesos_containerizer.cpp:122] Using isolation:
> > posix/cpu,posix/mem
> > I0505 00:23:10.426228 24455 slave.cpp:244] Slave resources: cpus(*):2;
> > mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> > I0505 00:23:10.426309 24455 slave.cpp:272] Slave hostname:
> > vesta.apache.org
> > I0505 00:23:10.426326 24455 slave.cpp:273] Slave checkpoint: true
> > I0505 00:23:10.426879 24456 state.cpp:33] Recovering state from
> > '/tmp/SlaveRecoveryTest_0_MultipleSlaves_wg2szi/meta'
> > I0505 00:23:10.428024 24456 slave.cpp:140] Slave started on 53)@
> > 67.195.138.9:37909
> > I0505 00:23:10.428035 24456 credentials.hpp:35] Loading credentials for
> > authentication
> > W0505 00:23:10.428078 24456 credentials.hpp:48] Failed to stat
> credentials
> > file 'file:///tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/credential':
> No
> > such file or directory
> > I0505 00:23:10.428098 24456 slave.cpp:231] Slave using credential for:
> > test-principal
> > I0505 00:23:10.428185 24456 slave.cpp:244] Slave resources: cpus(*):2;
> > mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> > I0505 00:23:10.428226 24456 slave.cpp:272] Slave hostname:
> > vesta.apache.org
> > I0505 00:23:10.428233 24456 slave.cpp:273] Slave checkpoint: true
> > I0505 00:23:10.428561 24451 slave.cpp:3014] Recovering framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.428632 24451 slave.cpp:3249] Recovering executor
> > 'b67093e7-ce45-41d1-9ad3-b664594f6ff1' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.428848 24452 state.cpp:33] Recovering state from
> > '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta'
> > I0505 00:23:10.429111 24453 status_update_manager.cpp:193] Recovering
> > status update manager
> > I0505 00:23:10.429124 24453 status_update_manager.cpp:201] Recovering
> > executor 'b67093e7-ce45-41d1-9ad3-b664594f6ff1' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.429155 24453 status_update_manager.cpp:499] Creating
> > StatusUpdate stream for task b67093e7-ce45-41d1-9ad3-b664594f6ff1 of
> > framework 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.429311 24453 status_update_manager.hpp:306] Replaying
> > status update stream for task b67093e7-ce45-41d1-9ad3-b664594f6ff1
> > I0505 00:23:10.429317 24451 slave.cpp:488] Successfully attached file
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_wg2szi/slaves/20140505-002308-160088899-37909-24266-0/frameworks/20140505-002308-160088899-37909-24266-0000/executors/b67093e7-ce45-41d1-9ad3-b664594f6ff1/runs/67fa4eaa-e143-4f9f-8230-79ebf19d461b'
> > I0505 00:23:10.429713 24454 mesos_containerizer.cpp:279] Recovering
> > containerizer
> > I0505 00:23:10.429741 24454 mesos_containerizer.cpp:321] Recovering
> > container '67fa4eaa-e143-4f9f-8230-79ebf19d461b' for executor
> > 'b67093e7-ce45-41d1-9ad3-b664594f6ff1' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.430104 24453 slave.cpp:3014] Recovering framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.430132 24453 slave.cpp:3249] Recovering executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.430522 24456 slave.cpp:2887] Sending reconnect request to
> > executor b67093e7-ce45-41d1-9ad3-b664594f6ff1 of framework
> > 20140505-002308-160088899-37909-24266-0000 at executor(1)@
> > 67.195.138.9:56161
> > I0505 00:23:10.430590 24454 status_update_manager.cpp:193] Recovering
> > status update manager
> > I0505 00:23:10.430614 24454 status_update_manager.cpp:201] Recovering
> > executor 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.430642 24454 status_update_manager.cpp:499] Creating
> > StatusUpdate stream for task db857d58-97a8-4cd8-a42a-9eb62f4583fa of
> > framework 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.430778 24453 slave.cpp:488] Successfully attached file
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/executors/db857d58-97a8-4cd8-a42a-9eb62f4583fa/runs/775df9b1-72da-44de-92f2-f87291e20cb7'
> > I0505 00:23:10.430785 24454 status_update_manager.hpp:306] Replaying
> > status update stream for task db857d58-97a8-4cd8-a42a-9eb62f4583fa
> > I0505 00:23:10.430934 25459 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.430989 25451 exec.cpp:251] Received reconnect request from
> > slave 20140505-002308-160088899-37909-24266-0
> > I0505 00:23:10.431207 24455 mesos_containerizer.cpp:279] Recovering
> > containerizer
> > I0505 00:23:10.431246 24455 mesos_containerizer.cpp:321] Recovering
> > container '775df9b1-72da-44de-92f2-f87291e20cb7' for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.431288 24452 slave.cpp:1779] Re-registering executor
> > b67093e7-ce45-41d1-9ad3-b664594f6ff1 of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.431591 25459 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.431643 25455 exec.cpp:228] Executor re-registered on slave
> > 20140505-002308-160088899-37909-24266-0
> > I0505 00:23:10.432095 24456 slave.cpp:2887] Sending reconnect request to
> > executor db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000 at executor(1)@
> > 67.195.138.9:35102
> > I0505 00:23:10.432313 25506 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.432359 25500 exec.cpp:251] Received reconnect request from
> > slave 20140505-002308-160088899-37909-24266-1
> > I0505 00:23:10.432539 25455 exec.cpp:240] Executor::reregistered took
> > 49858ns
> > Re-registered executor on vesta.apache.org
> > I0505 00:23:10.432643 24454 slave.cpp:1779] Re-registering executor
> > db857d58-97a8-4cd8-a42a-9eb62f4583fa of framework
> > 20140505-002308-160088899-37909-24266-0000
> > I0505 00:23:10.432834 25506 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.432868 25500 exec.cpp:228] Executor re-registered on slave
> > 20140505-002308-160088899-37909-24266-1
> > Re-registered executor on vesta.apache.org
> > I0505 00:23:10.433661 25500 exec.cpp:240] Executor::reregistered took
> > 34289ns
> > I0505 00:23:10.458559 24456 slave.cpp:1904] Cleaning up un-reregistered
> > executors
> > I0505 00:23:10.458588 24449 slave.cpp:1904] Cleaning up un-reregistered
> > executors
> > I0505 00:23:10.458596 24450 hierarchical_allocator_process.hpp:687]
> > Performed allocation for 2 slaves in 38569ns
> > I0505 00:23:10.458606 24456 slave.cpp:2946] Finished recovery
> > I0505 00:23:10.458823 24455 monitor.cpp:140] Failed to collect resource
> > usage for container '775df9b1-72da-44de-92f2-f87291e20cb7' for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > '20140505-002308-160088899-37909-24266-0000': Unknown container:
> > 775df9b1-72da-44de-92f2-f87291e20cb7
> > I0505 00:23:10.458905 24456 slave.cpp:527] New master detected at
> > master@67.195.138.9:37909
> > I0505 00:23:10.458927 24456 slave.cpp:587] Authenticating with master
> > master@67.195.138.9:37909
> > I0505 00:23:10.458974 24456 slave.cpp:560] Detecting new master
> > I0505 00:23:10.458997 24455 status_update_manager.cpp:167] New master
> > detected at master@67.195.138.9:37909
> > I0505 00:23:10.459035 24456 authenticatee.hpp:128] Creating new client
> > SASL connection
> > I0505 00:23:10.458734 24449 slave.cpp:2946] Finished recovery
> > I0505 00:23:10.459127 24453 monitor.cpp:140] Failed to collect resource
> > usage for container '775df9b1-72da-44de-92f2-f87291e20cb7' for executor
> > 'db857d58-97a8-4cd8-a42a-9eb62f4583fa' of framework
> > '20140505-002308-160088899-37909-24266-0000': Unknown container:
> > 775df9b1-72da-44de-92f2-f87291e20cb7
> > I0505 00:23:10.459245 24454 master.cpp:2798] Authenticating slave(52)@
> > 67.195.138.9:37909
> > I0505 00:23:10.459270 24449 slave.cpp:527] New master detected at
> > master@67.195.138.9:37909
> > I0505 00:23:10.459310 24449 slave.cpp:587] Authenticating with master
> > master@67.195.138.9:37909
> > I0505 00:23:10.459344 24454 status_update_manager.cpp:167] New master
> > detected at master@67.195.138.9:37909
> > I0505 00:23:10.459383 24453 authenticator.hpp:148] Creating new server
> > SASL connection
> > I0505 00:23:10.459398 24454 authenticatee.hpp:128] Creating new client
> > SASL connection
> > I0505 00:23:10.459388 24449 slave.cpp:560] Detecting new master
> > I0505 00:23:10.459461 24455 master.cpp:2798] Authenticating slave(53)@
> > 67.195.138.9:37909
> > I0505 00:23:10.459481 24454 authenticatee.hpp:219] Received SASL
> > authentication mechanisms: CRAM-MD5
> > I0505 00:23:10.459498 24454 authenticatee.hpp:245] Attempting to
> > authenticate with mechanism 'CRAM-MD5'
> > I0505 00:23:10.459537 24454 authenticator.hpp:148] Creating new server
> > SASL connection
> > I0505 00:23:10.459552 24455 authenticator.hpp:254] Received SASL
> > authentication start
> > I0505 00:23:10.459594 24455 authenticator.hpp:342] Authentication
> requires
> > more steps
> > I0505 00:23:10.460095 24449 authenticatee.hpp:265] Received SASL
> > authentication step
> > I0505 00:23:10.460119 24454 authenticatee.hpp:219] Received SASL
> > authentication mechanisms: CRAM-MD5
> > I0505 00:23:10.460134 24454 authenticatee.hpp:245] Attempting to
> > authenticate with mechanism 'CRAM-MD5'
> > I0505 00:23:10.461935 24449 authenticator.hpp:282] Received SASL
> > authentication step
> > I0505 00:23:10.461978 24456 authenticator.hpp:254] Received SASL
> > authentication start
> > I0505 00:23:10.461993 24449 auxprop.cpp:81] Request to lookup properties
> > for user: 'test-principal' realm: 'vesta.apache.org' server FQDN: '
> > vesta.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> > I0505 00:23:10.462005 24449 auxprop.cpp:153] Looking up auxiliary
> property
> > '*userPassword'
> > I0505 00:23:10.462008 24456 authenticator.hpp:342] Authentication
> requires
> > more steps
> > I0505 00:23:10.462018 24449 auxprop.cpp:153] Looking up auxiliary
> property
> > '*cmusaslsecretCRAM-MD5'
> > I0505 00:23:10.462030 24449 auxprop.cpp:81] Request to lookup properties
> > for user: 'test-principal' realm: 'vesta.apache.org' server FQDN: '
> > vesta.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> > I0505 00:23:10.462036 24456 authenticatee.hpp:265] Received SASL
> > authentication step
> > I0505 00:23:10.462038 24449 auxprop.cpp:103] Skipping auxiliary property
> > '*userPassword' since SASL_AUXPROP_AUTHZID == true
> > I0505 00:23:10.462082 24449 auxprop.cpp:103] Skipping auxiliary property
> > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> > I0505 00:23:10.462087 24456 authenticator.hpp:282] Received SASL
> > authentication step
> > I0505 00:23:10.462097 24449 authenticator.hpp:334] Authentication success
> > I0505 00:23:10.462103 24456 auxprop.cpp:81] Request to lookup properties
> > for user: 'test-principal' realm: 'vesta.apache.org' server FQDN: '
> > vesta.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> > I0505 00:23:10.462116 24456 auxprop.cpp:153] Looking up auxiliary
> property
> > '*userPassword'
> > I0505 00:23:10.462126 24456 auxprop.cpp:153] Looking up auxiliary
> property
> > '*cmusaslsecretCRAM-MD5'
> > I0505 00:23:10.462138 24456 auxprop.cpp:81] Request to lookup properties
> > for user: 'test-principal' realm: 'vesta.apache.org' server FQDN: '
> > vesta.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> > SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> > I0505 00:23:10.462138 24454 authenticatee.hpp:305] Authentication success
> > I0505 00:23:10.462151 24456 auxprop.cpp:103] Skipping auxiliary property
> > '*userPassword' since SASL_AUXPROP_AUTHZID == true
> > I0505 00:23:10.462165 24456 auxprop.cpp:103] Skipping auxiliary property
> > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> > I0505 00:23:10.462179 24456 authenticator.hpp:334] Authentication success
> > I0505 00:23:10.462218 24454 slave.cpp:644] Successfully authenticated
> with
> > master master@67.195.138.9:37909
> > I0505 00:23:10.462229 24456 authenticatee.hpp:305] Authentication success
> > I0505 00:23:10.462257 24456 master.cpp:2838] Successfully authenticated
> > slave(53)@67.195.138.9:37909
> > I0505 00:23:10.462281 24454 slave.cpp:873] Will retry registration in
> > 7.817449361secs if necessary
> > I0505 00:23:10.462399 24456 master.cpp:2192] Queuing up re-registration
> > request from slave(52)@67.195.138.9:37909 because authentication is
> still
> > in progress
> > I0505 00:23:10.462404 24454 slave.cpp:644] Successfully authenticated
> with
> > master master@67.195.138.9:37909
> > I0505 00:23:10.462461 24454 slave.cpp:873] Will retry registration in
> > 8.220002736secs if necessary
> > I0505 00:23:10.462551 24456 master.cpp:2838] Successfully authenticated
> > slave(52)@67.195.138.9:37909
> > W0505 00:23:10.462718 24456 master.cpp:2248] Slave at slave(53)@
> > 67.195.138.9:37909 (vesta.apache.org) is being allowed to re-register
> > with an already in use id (20140505-002308-160088899-37909-24266-1)
> > I0505 00:23:10.462771 24455 slave.cpp:727] Re-registered with master
> > master@67.195.138.9:37909
> > I0505 00:23:10.462811 24455 slave.cpp:1481] Updating framework
> > 20140505-002308-160088899-37909-24266-0000 pid to scheduler(39)@
> > 67.195.138.9:37909
> > W0505 00:23:10.462827 24456 master.cpp:2248] Slave at slave(52)@
> > 67.195.138.9:37909 (vesta.apache.org) is being allowed to re-register
> > with an already in use id (20140505-002308-160088899-37909-24266-0)
> > I0505 00:23:10.462848 24455 slave.cpp:1489] Checkpointing framework pid
> > 'scheduler(39)@67.195.138.9:37909' to
> >
> '/tmp/SlaveRecoveryTest_0_MultipleSlaves_YXcdrN/meta/slaves/20140505-002308-160088899-37909-24266-1/frameworks/20140505-002308-160088899-37909-24266-0000/framework.pid'
> > I0505 00:23:10.462892 24453 hierarchical_allocator_process.hpp:497] Slave
> > 20140505-002308-160088899-37909-24266-1 reconnected
> > F0505 00:23:10.462848 24456 protobuf.hpp:110] Check failed: from
> > Attempting to reply without a sender
> > *** Check failure stack trace: ***
> >     @     0x2afb9d0cbb6d  google::LogMessage::Fail()
> >     @     0x2afb9d0cdbff  google::LogMessage::SendToLog()
> >     @     0x2afb9d0cb75c  google::LogMessage::Flush()
> >     @     0x2afb9d0ce46d  google::LogMessageFatal::~LogMessageFatal()
> >     @     0x2afb9cc1fbc2  ProtobufProcess<>::reply()
> >     @     0x2afb9cc042c9
> >  mesos::internal::master::Master::reregisterSlave()
> >     @     0x2afb9cc0e9e5  std::tr1::_Function_handler<>::_M_invoke()
> >     @           0x4d15e5  std::tr1::_Function_handler<>::_M_invoke()
> >     @     0x2afb9cfff34c  process::ProcessManager::resume()
> >     @     0x2afb9cffff5c  process::schedule()
> >     @     0x2afb9e2e1e9a  start_thread
> >     @     0x2afb9e5ec3fd  (unknown)
> > I0505 00:23:10.541995 25459 process.cpp:1023] Socket closed while
> receiving
> > I0505 00:23:10.542057 25506 process.cpp:1023] Socket closed while
> receiving
> > Imake[4]: 0505 00:23:10.542064 25455 exec.cpp:449] Slave exited, but
> > framework has checkpointing enabled. Waiting 15mins to reconnect with
> slave
> > 20140505-002308-160088899-37909-24266-0
> > *** [check-local] Aborted
> > I0505 00:23:10.542162 25455 exec.cpp:458] Slave exited ... shutting down
> > I0505 00:23:10.542146 25501 exec.cpp:449] Slave exited, but framework has
> > checkpointing enabled. Waiting 15mins to reconnect with slave
> > 20140505-002308-160088899-37909-24266-1
> > II0505 00:23:10.542242 25501 exec.cpp:458] Slave exited ... shutting down
> > 0505 00:23:10.542242 25455 exec.cpp:475] Executor::shutdown took 25710ns
> > I0505 00:23:10.542246 25451 exec.cpp:77] Scheduling shutdown of the
> > executor
> > Shutting down
> > Sending SIGTERM to process tree at pid 25460
> > I0505 00:23:10.542292 25501 exec.cpp:475] Executor::shutdown took 7377ns
> > Shutting down
> > I0505 00:23:10.542317 25500 exec.cpp:77] Scheduling shutdown of the
> > executor
> > Sending SIGTERM to process tree at pid 25507
> > make[4]: Leaving directory
> >
> `/home/hudson/jenkins-slave/workspace/Mesos-Ubuntu-distcheck/build/mesos-0.19.0/_build/src'
> > make[3]: *** [check-am] Error 2
> > make[3]: Leaving directory
> >
> `/home/hudson/jenkins-slave/workspace/Mesos-Ubuntu-distcheck/build/mesos-0.19.0/_build/src'
> > make[2]: *** [check] Error 2
> > make[2]: Leaving directory
> >
> `/home/hudson/jenkins-slave/workspace/Mesos-Ubuntu-distcheck/build/mesos-0.19.0/_build/src'
> > make[1]: *** [check-recursive] Error 1
> > make[1]: Leaving directory
> >
> `/home/hudson/jenkins-slave/workspace/Mesos-Ubuntu-distcheck/build/mesos-0.19.0/_build'
> > make: *** [distcheck] Error 1
> > Killing the following process trees:
> > [
> > -+- 25507 sh -c sleep 1000
> >  \--- 25508 sleep 1000
> > ]
> > Build step 'Execute shell' marked build as failure
> >
>

Reply via email to