It was the hiveserver2 log that was using up 29 GB all by itself. I'll make sure we clean it at the beginning of each test job.
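(For reference, the cleanup could be sketched roughly as below. This is only a sketch: the hiveserver2 log path and the choice to truncate rather than delete are my assumptions, not taken from the actual job scripts.)

```shell
# Sketch only: the hiveserver2 log path is an assumed default,
# not taken from the real job configuration.
LOG="${HIVESERVER2_LOG:-/var/log/hive/hiveserver2.log}"
if [ -f "$LOG" ]; then
    # Truncate in place rather than deleting, so a running daemon
    # keeps writing to a valid (now empty) open file handle.
    : > "$LOG"
fi
# Report how much space is now free on the log volume.
df -h /var/log | tail -1
```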
--Steve

> -----Original Message-----
> From: Steve Varnau [mailto:[email protected]]
> Sent: Thursday, May 19, 2016 4:54 PM
> To: '[email protected]' <[email protected]>; '[email protected]' <[email protected]>
> Subject: RE: Trafodion master Daily Test Result - 211 - Failure
>
> In the test job log [1], at the beginning, there were 7 GB free on the
> filesystem, but that was before installing Trafodion and running the
> executor suite.
>
> At the end of the job, after completing the suite and uninstalling
> Trafodion, there were 6 GB free.
>
> It is hard to say how much was free when that test ran, but not hard to
> imagine that it was less than 2 GB.
>
> At the beginning, it also lists disk usage of the usual suspects (HDFS
> data and logs):
>
> 2016-05-19 10:23:38 Disk usage of specific directories
> 2016-05-19 10:23:38 Thu May 19 10:23:38 UTC 2016: 1.4G /hadoop/hdfs
> 2016-05-19 10:23:38 Thu May 19 10:23:38 UTC 2016: 29G /var/log
>
> This machine may have run several test suites, but normally the same ones
> every night. So how much space was available in the previous day's job?
>
> 38 GB were free, and /var/log was using only 6.8 MB, not 29 GB.
>
> So that tells me that, even though tests were generally passing, something
> was spewing log messages!
> Looking at the archived HBase logs, they are nice and small.
>
> I'll bring up a VM with the disk volume in question to see where all the
> space went.
>
> --Steve
>
> [1] https://jenkins.esgyn.com/job/core-regress-executor-hdp/220/console
>
> > -----Original Message-----
> > From: Prashanth Vasudev [mailto:[email protected]]
> > Sent: Thursday, May 19, 2016 4:01 PM
> > To: [email protected]; [email protected]
> > Subject: RE: Trafodion master Daily Test Result - 211 - Failure
> >
> > System error 28 is the file system error returned when the disk that
> > holds the configured overflow location has less than 2 GB available.
> > By default, the overflow location is $MY_SQROOT/tmp.
> > The disk hosting this file system is running out of space.
> >
> > Prashanth
> >
> > -----Original Message-----
> > From: Selva Govindarajan [mailto:[email protected]]
> > Sent: Thursday, May 19, 2016 12:09 PM
> > To: [email protected]; [email protected]
> > Subject: RE: Trafodion master Daily Test Result - 211 - Failure
> >
> > It looks like executor/TEST002 failed due to no space in swap.
> >
> > The error shown in the diff is:
> >
> > > *** ERROR[8427] Hash Join Scratch IO Error occurred. Scratch
> > > Error: -10005, System Error: 28, System Error Detail: 0, Details:
> > > SQScratchFile::SQScratchFile(5)
> > 4178c3135
> >
> > System Error 28 is #define ENOSPC 28.
> >
> > By default, the sort overflow mode is MMAP, so I suspect the node was
> > running out of swap space at that moment.
> >
> > Selva
> >
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > Sent: Thursday, May 19, 2016 4:47 AM
> > To: [email protected]
> > Subject: Trafodion master Daily Test Result - 211 - Failure
> >
> > Daily Automated Testing master
> >
> > Jenkins Job: https://jenkins.esgyn.com/job/Check-Daily-master/211/
> > Archived Logs: http://traf-testlogs.esgyn.com/Daily-master/211
> > Bld Downloads: http://traf-builds.esgyn.com
> >
> > Changes since previous daily build:
> >
> > [selvaganesang] [TRAFODION-1988] Better Java exception handling in Trafodion
> >
> > [weiqing.xu] [TRAFODION-1998] JDBC Type2 column information improve
> >
> > Test Job Results:
> >
> > FAILURE core-regress-executor-hdp (1 hr 23 min)
> > SUCCESS build-master-debug (26 min)
> > SUCCESS build-master-release (30 min)
> > SUCCESS core-regress-charsets-cdh (29 min)
> > SUCCESS core-regress-charsets-hdp (37 min)
> > SUCCESS core-regress-compGeneral-cdh (32 min)
> > SUCCESS core-regress-compGeneral-hdp (48 min)
> > SUCCESS core-regress-core-cdh (47 min)
> > SUCCESS core-regress-core-hdp (1 hr 11 min)
> > SUCCESS core-regress-executor-cdh (52 min)
> > SUCCESS core-regress-fullstack2-cdh (10 min)
> > SUCCESS core-regress-fullstack2-hdp (26 min)
> > SUCCESS core-regress-hive-cdh (36 min)
> > SUCCESS core-regress-hive-hdp (51 min)
> > SUCCESS core-regress-privs1-cdh (36 min)
> > SUCCESS core-regress-privs1-hdp (57 min)
> > SUCCESS core-regress-privs2-cdh (41 min)
> > SUCCESS core-regress-privs2-hdp (54 min)
> > SUCCESS core-regress-qat-cdh (20 min)
> > SUCCESS core-regress-qat-hdp (28 min)
> > SUCCESS core-regress-seabase-cdh (53 min)
> > SUCCESS core-regress-seabase-hdp (1 hr 13 min)
> > SUCCESS core-regress-udr-cdh (25 min)
> > SUCCESS core-regress-udr-hdp (31 min)
> > SUCCESS jdbc_test-cdh (28 min)
> > SUCCESS jdbc_test-hdp (32 min)
> > SUCCESS phoenix_part1_T2-cdh (1 hr 1 min)
> > SUCCESS phoenix_part1_T2-hdp (1 hr 18 min)
> > SUCCESS phoenix_part1_T4-cdh (46 min)
> > SUCCESS phoenix_part1_T4-hdp (58 min)
> > SUCCESS phoenix_part2_T2-cdh (51 min)
> > SUCCESS phoenix_part2_T2-hdp (1 hr 28 min)
> > SUCCESS phoenix_part2_T4-cdh (46 min)
> > SUCCESS phoenix_part2_T4-hdp (1 hr 7 min)
> > SUCCESS pyodbc_test-cdh (16 min)
> > SUCCESS pyodbc_test-hdp (14 min)
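(Editor's note: the thread above says system error 28 is ENOSPC, returned when the disk holding the scratch-overflow location, $MY_SQROOT/tmp by default, has less than 2 GB free. A pre-flight guard for the test job could be sketched as below. The 2 GB threshold and the overflow path come from the thread; the check itself, and the /tmp fallback when $MY_SQROOT is unset, are my assumptions.)

```shell
# Sketch of a pre-flight free-space check. Assumptions: the /tmp fallback
# and the warning wording; the 2 GB figure and $MY_SQROOT/tmp location
# are taken from the discussion above.
SCRATCH_DIR="${MY_SQROOT:-/tmp}/tmp"
mkdir -p "$SCRATCH_DIR"
NEED_KB=$((2 * 1024 * 1024))   # 2 GB expressed in 1 KB blocks
# df -Pk: POSIX output format, 1 KB blocks; field 4 of line 2 is "Available"
FREE_KB=$(df -Pk "$SCRATCH_DIR" | awk 'NR==2 {print $4}')
if [ "$FREE_KB" -lt "$NEED_KB" ]; then
    echo "WARNING: only ${FREE_KB} KB free under ${SCRATCH_DIR};" \
         "scratch overflow will hit ENOSPC (system error 28)"
else
    echo "OK: ${FREE_KB} KB free under ${SCRATCH_DIR}"
fi
```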
