An HAProxy reload happened at that time: 2019-04-30 16:16:29. Could you give
it another try?
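
For reference, the timing comparisons made in this thread can be reproduced with a minimal Python sketch. The timestamps are copied from the console excerpts quoted below; the helper name `seconds_into_sleep` and the case labels are hypothetical, and the 10min limit is the original idle timeout mentioned in the thread. It only does the arithmetic; it does not query Jenkins or HAProxy.

```python
from datetime import datetime

FMT = "%H:%M:%S"
IDLE_TIMEOUT_S = 10 * 60  # original 10min idle timeout mentioned in the thread


def seconds_into_sleep(sleep_start: str, failure: str) -> float:
    """Seconds between the start of a sleep and the FATAL line (same day assumed)."""
    start = datetime.strptime(sleep_start, FMT)
    end = datetime.strptime(failure, FMT)
    return (end - start).total_seconds()


# (sleep start, failure time, planned sleep in seconds), taken from the quoted logs.
cases = {
    "build 335": ("16:14:44", "16:16:29", 184),
    "build 333": ("09:32:54", "09:33:09", 184),
    "sleep at 09:26:36": ("09:26:36", "09:32:20", 197),
}

for name, (start, failure, planned_s) in cases.items():
    elapsed = seconds_into_sleep(start, failure)
    print(
        f"{name}: failed {elapsed:.0f}s after the sleep started "
        f"(planned sleep {planned_s}s, "
        f"under idle timeout: {elapsed < IDLE_TIMEOUT_S})"
    )
```

In all three cases the failure comes well before 10 minutes have passed since the last console output, which matches the observation below that this does not look like the idle timeout.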

On Tue Apr 30 12:40:42 2019, vrpo...@cisco.com wrote:
> > I increased idle timeout from 10min to 60min.
> 
> Was it around the time this [2] job failed recently?
> 
> 16:14:44 ++ sleep 184s
> 16:16:29 FATAL: command execution failed
> 
> Vratko.
> 
> [2] https://jenkins.fd.io/job/csit-vpp-perf-verify-master-3n-hsw/335/console
> 
> -----Original Message-----
> From: csit-...@lists.fd.io <csit-...@lists.fd.io> On Behalf Of Kenny
> Paul via RT
> Sent: Tuesday, 2019-April-30 18:22
> To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
> <jgel...@cisco.com>
> Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
> Subject: [csit-dev] [FD.io Helpdesk #73486] Jenkins.fd.io network
> issues
> 
> 
> I increased idle timeout from 10min to 60min. Let's see if that makes
> any difference.
> 
> Regards,
> 
> --
> Anton Baranov
> Sr. System Operations Engineer
> The Linux Foundation
> 
> On Tue Apr 30 10:03:46 2019, vrpo...@cisco.com wrote:
> > >> interleaved by quick periods of activity
> >
> > >>> 09:26:36 ++ sleep 197s
> >
> > > send any keepalive packages
> >
> > I always assumed the console outputs are enough to keep jnlp
> > connection alive.
> >
> > Also, I believe this failure over the weekend hit multiple jobs at
> > once.
> >
> > For example:
> > https://jenkins.fd.io/job/csit-vpp-perf-verify-master-3n-hsw/333/console
> >   09:32:54 ++ sleep 184s
> >   09:33:09 FATAL: command execution failed
> >
> > Vratko.
> >
> > -----Original Message-----
> > From: csit-...@lists.fd.io <csit-...@lists.fd.io> On Behalf Of Kenny
> > Paul via RT
> > Sent: Tuesday, 2019-April-30 15:57
> > To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
> > <jgel...@cisco.com>
> > Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
> > Subject: [csit-dev] [FD.io Helpdesk #73486] Jenkins.fd.io network
> > issues
> >
> > Hello Vratko,
> >
> > Thank you for the explanation. I'm wondering whether, within that period
> > of time when the reservation was unsuccessful (~40min), the job keeps the
> > jnlp connection alive (send any keepalive packages)?
> >
> > I checked the haproxy node where jnlp is running and I don't see any
> > DOWN notification for it.
> >
> > Thanks,
> > --
> > Anton Baranov
> > Sr. System Operations Engineer
> > The Linux Foundation
> >
> > On Tue Apr 30 09:27:56 2019, vrpo...@cisco.com wrote:
> > > > > 05:26:36 mkdir: cannot create directory '/tmp/reservation_dir': File exists
> > >
> > > > That error is expected; it just means the testbed is currently
> > > > used by another job, so this job should sleep a while and try again.
> > >
> > > > the job was waiting (sleep) from 04:45:12 til 05:26:36
> > >
> > > I believe my browser is showing me UTC timestamps, which show
> > > values
> > > larger by 4 hours.
> > >
> > > > we have 10m idle timeout
> > >
> > > > The ~3m periods of sleep are interleaved by quick periods of
> > > > activity, so we usually do not hit the timeout.
> > >
> > > But the final sleep probably took longer for some reason
> > >
> > > 09:26:36 ++ sleep 197s
> > > 09:32:20 FATAL: command execution failed
> > >
> > > and something bad has happened in less than 6 minutes.
> > > So it does not look like the 10m timeout.
> > >
> > > Vratko.
> > >
> > > -----Original Message-----
> > > > From: csit-...@lists.fd.io <csit-...@lists.fd.io> On Behalf Of
> > > > Kenny Paul via RT
> > > > Sent: Tuesday, 2019-April-30 15:09
> > > > To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
> > > > <jgel...@cisco.com>
> > > > Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
> > > > Subject: [csit-dev] [FD.io Helpdesk #73486] Jenkins.fd.io network
> > > > issues
> > >
> > > Hello Jan
> > >
> > > > From the logs I see that the job was waiting (sleep) from 04:45:12 til
> > > > 05:26:36, which could cause the jnlp session to time out, as we have
> > > > 10m idle timeout (client and server side) set on jenkins.fd.io.
> > >
> > > Could you check that error:
> > >
> > > > 05:26:36 Reservation unsuccessful:
> > > > 05:26:36 mkdir: cannot create directory '/tmp/reservation_dir': File exists
> > >
> > > Cheers,
> > >
> > > --
> > > Anton Baranov
> > > Sr. System Operations Engineer
> > > The Linux Foundation
> > >
> > > On Mon Apr 29 02:58:28 2019, jgel...@cisco.com wrote:
> > > > Hello,
> > > >
> > > > > We are experiencing quite a lot of network issues when running
> > > > > CSIT tests for the 19.04 report:
> > > >
> > > > > Caused: hudson.remoting.ChannelClosedException: Channel "unknown":
> > > > > Remote call on JNLP4-connect connection from
> > > > > vex-yul-rot-ingress-1.ci.codeaurora.org/10.30.48.3:41068 failed.
> > > > > The channel is closing down or has closed down
> > > >
> > > > > https://jenkins.fd.io/job/csit-vpp-perf-verify-1904-3n-hsw/13/console
> > > >
> > > > > Could you please have a look at it?
> > > >
> > > > Thank you very much.
> > > >
> > > > Regards,
> > > > Jan
> > >
> > >
> >
> >
> 
> 
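
The "mkdir: cannot create directory ... File exists" message quoted above is typical of a directory-based lock with sleep-and-retry. Below is a minimal Python sketch of that general pattern, not the actual CSIT reservation script; the names `reserve` and `release`, the retry count, and the sleep duration are all hypothetical, and a temporary directory is used instead of `/tmp/reservation_dir`.

```python
import os
import tempfile
import time


def reserve(lock_dir: str, attempts: int, sleep_s: float) -> bool:
    """Try to create lock_dir; sleep and retry while another holder has it."""
    for attempt in range(attempts):
        try:
            # mkdir either creates the directory or fails if it already exists,
            # so only one caller at a time can succeed.
            os.mkdir(lock_dir)
            return True
        except FileExistsError:
            print(f"Reservation unsuccessful (attempt {attempt + 1}), "
                  f"sleeping {sleep_s}s")
            time.sleep(sleep_s)
    return False


def release(lock_dir: str) -> None:
    """Remove the lock directory so the next job can reserve the testbed."""
    os.rmdir(lock_dir)


if __name__ == "__main__":
    # Hypothetical lock location inside a throwaway temporary directory.
    lock = os.path.join(tempfile.mkdtemp(), "reservation_dir")
    print(reserve(lock, attempts=1, sleep_s=0))  # first caller gets the lock
    print(reserve(lock, attempts=2, sleep_s=0))  # second caller retries, gives up
    release(lock)
    print(reserve(lock, attempts=1, sleep_s=0))  # free again after release
```

With this pattern a waiting job prints one line per attempt and is otherwise silent during each sleep, which is the behaviour discussed in the thread.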


-- 
Anton Baranov
Sr. System Operations Engineer
The Linux Foundation