A bit more on this,

Ubuntu 12.04 LTS died with a kernel error during last night's rebuild.

The system logged the problem with perl shown below 9 times at 07:39, then
waited 2 minutes and logged it a final time before becoming unresponsive and
requiring a hard reset. All 10 messages are identical apart from the first
line, where the number beginning with 2 (shown here as XXXXX) changes:

 perl            D 00002bd3     0 XXXXX     1 0x00000000

I am going to start testing upgrading 12.04 LTS to 14.04 LTS to get the
newer perl. Hopefully that will help somewhat.

I was thinking about allocating more space to the ramdisk but then
remembered that you recommended running on an x86 system. An x86 system can't
make use of more than 3.6GB, so if I allocate more memory to the ramdisk then
I am severely limiting what is available to the system.

Are the recommendations still the same?

Sample log of the kernel lockup:

Apr 18 00:09:39 mail2 kernel: [746880.520185] perl            D 00002bd3     0 20566      1 0x00000000
Apr 18 00:09:39 mail2 kernel: [746880.520192]  e9ed9e20 00200286 0000bb11 00002bd3 00002bd3 c0910fe0 c0a37e00 c0a37e00
Apr 18 00:09:39 mail2 kernel: [746880.520200]  a3383ba9 0002a70b ebbe4e00 c175d8d0 d21c32c0 00000000 0000001a e9a100c0
Apr 18 00:09:39 mail2 kernel: [746880.520208]  ffffffec 00000000 e9ed9df8 c06aba9d b7585730 00000000 c175d8d0 c16b8000
Apr 18 00:09:39 mail2 kernel: [746880.520225] Call Trace:
Apr 18 00:09:39 mail2 kernel: [746880.520238]  [<c06aba9d>] ? _raw_spin_lock_irqsave+0x2d/0x40
Apr 18 00:09:39 mail2 kernel: [746880.520246]  [<c0158e5c>] ? mm_release+0xdc/0xf0
Apr 18 00:09:39 mail2 kernel: [746880.520250]  [<c06a9ea5>] schedule+0x35/0x50
Apr 18 00:09:39 mail2 kernel: [746880.520254]  [<c015eb7d>] exit_mm+0x6d/0x100
Apr 18 00:09:39 mail2 kernel: [746880.520258]  [<c015ed49>] do_exit+0x139/0x3c0
Apr 18 00:09:39 mail2 kernel: [746880.520263]  [<c016bd97>] ? recalc_sigpending+0x17/0x40
Apr 18 00:09:39 mail2 kernel: [746880.520267]  [<c016bf11>] ? dequeue_signal+0x31/0x190
Apr 18 00:09:39 mail2 kernel: [746880.520271]  [<c015f128>] do_group_exit+0x38/0xa0
Apr 18 00:09:39 mail2 kernel: [746880.520276]  [<c016e1d6>] get_signal_to_deliver+0x1b6/0x3e0
Apr 18 00:09:39 mail2 kernel: [746880.520283]  [<c011197f>] do_signal+0x3f/0xd0
Apr 18 00:09:39 mail2 kernel: [746880.520289]  [<c0109809>] ? xen_clocksource_read+0x19/0x20
Apr 18 00:09:39 mail2 kernel: [746880.520293]  [<c01848db>] ? ktime_get_ts+0xeb/0x120
Apr 18 00:09:39 mail2 kernel: [746880.520300]  [<c0257514>] ? poll_select_set_timeout+0x64/0x80
Apr 18 00:09:39 mail2 kernel: [746880.520304]  [<c025829a>] ? sys_poll+0x5a/0xd0
Apr 18 00:09:39 mail2 kernel: [746880.520308]  [<c0111c25>] do_notify_resume+0x75/0x90
Apr 18 00:09:39 mail2 kernel: [746880.520313]  [<c06abd10>] work_notifysig+0x13/0x1b
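
For anyone wanting to count these reports in a syslog file, here is a minimal
Python sketch. It assumes each kernel message sits on a single line (the
archive wraps them) and it only recognises the task header format seen in the
sample above. The name blocked_tasks is made up for illustration.

```python
import re

# Task header line of a blocked-task report: command, state letter, then
# four numeric columns. Only the layout seen in the sample log is handled.
TASK_HEADER = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]{8}\s+"
    r"\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]{8}$"
)


def blocked_tasks(lines):
    """Return (uptime_seconds, command, pid) for each task reported in state D."""
    found = []
    for line in lines:
        match = TASK_HEADER.search(line.rstrip())
        if match and match.group("state") == "D":
            found.append(
                (float(match.group("uptime")), match.group("comm"), int(match.group("pid")))
            )
    return found


sample = [
    "Apr 18 00:09:39 mail2 kernel: [746880.520185] perl            D 00002bd3     0 20566      1 0x00000000",
    "Apr 18 00:09:39 mail2 kernel: [746880.520225] Call Trace:",
]
print(blocked_tasks(sample))
```

Feeding it the lines of the real log file instead of the two sample lines
would give one tuple per report, which makes it easy to see how the PID
changes between the 10 messages.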

-----Original Message-----
From: Colin Waring [mailto:co...@lanternhosting.co.uk] 
Sent: 16 April 2014 09:21
To: 'ASSP development mailing list'
Subject: Re: [Assp-test] ASSP fails with no error message when tmpDB folder
full

Hi Thomas,

ASSP died again overnight due to tmpDB being full. It looks like
BDBMaxCacheSize doesn't prevent ASSP from dying but allows it to clear the
folder and start up again once it does.

All the best,
Colin Waring.
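
Since ASSP gives no message of its own when tmpDB fills up, an external check
is one way to get early warning. The following is a rough Python sketch, not
part of ASSP: the function names and the 90% threshold are assumptions, and
the path is the one that appears in the strace output further down this
thread.

```python
import os
import shutil

# Path as it appears in the strace output in this thread; adjust for your install.
TMPDB = "/usr/local/assp/tmpDB"


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def tmpdb_status(path=TMPDB, threshold=90.0):
    """Return 'missing', 'ok' or 'full'; the threshold is an assumed value."""
    if not os.path.isdir(path):
        return "missing"
    return "full" if percent_used(path) >= threshold else "ok"


print(tmpdb_status())
```

Run from cron or a monitoring script, a result of 'full' could trigger an
alert before the next rebuildspamdb run.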

-----Original Message-----
From: Thomas Eckardt [mailto:thomas.ecka...@thockar.com]
Sent: 11 April 2014 08:12
To: ASSP development mailing list
Subject: Re: [Assp-test] ASSP fails with no error message when tmpDB folder
full

> Has anything changed recently that would increase the tmpDB requirements?

This could happen - it depends on the config and on the number of files and
words in the corpus.

However, 1GB for tmpDB is also too little for my system.

There are some improvements to the BDB cache calculation in the latest
versions. These cache settings are useless for systems that use a RAM drive
for tmpDB - I'll change this.

Thomas





From:    "Colin Waring" <co...@lanternhosting.co.uk>
To:      "'ASSP development mailing list'" <assp-test@lists.sourceforge.net>
Date:    11 April 2014 08:51
Subject: Re: [Assp-test] ASSP fails with no error message when tmpDB folder full



Hi Thomas,

I didn't have to wait long: it turns out two runs of rebuildspamdb are
enough to fill a 1GB tmpDB folder now.

I have added the entry and ASSP starts back up without coredumping. tmpDB
does remain 100% full, though.

Has anything changed recently that would increase the tmpDB requirements?

All the best,
Colin Waring.

-----Original Message-----
From: Thomas Eckardt [mailto:thomas.ecka...@thockar.com]
Sent: 09 April 2014 09:23
To: ASSP development mailing list
Subject: Re: [Assp-test] ASSP fails with no error message when tmpDB folder
full

Add the following line to 'lib/CorrectASSPcfg.pm':

$main::BDBMaxCacheSize = 0;

and restart ASSP. Tell me whether it works or not.

Thomas



From:    "Colin Waring" <co...@lanternhosting.co.uk>
To:      "'ASSP development mailing list'" <assp-test@lists.sourceforge.net>
Date:    09 April 2014 09:56
Subject: [Assp-test] ASSP fails with no error message when tmpDB folder full



Hi Folks,

At the weekend one of my mailservers died overnight. I had a quick check
over and saw that it was coredumping without any errors. I decided to leave
it till the morning as the other servers could handle things. By morning my
monitoring scripts had restarted it.

This morning I got up to the same issue, except the problem didn't go away
by itself.

ASSP would core dump during startup without outputting any messages. If I
enabled debugging, no debug file was created.

I had to resort to strace to see that it was getting an error about space on
the tmpDB folder, which was indeed completely full. There was over a gigabyte
of data in there, all relating to rebuildspamdb. Some was from last night's
run but some was from two nights prior.

I'm presuming that there is some code in the startup that clears up leftover
rebuildspamdb data, as I emptied the folder but did not remove it. After
starting ASSP the rebuild folder disappeared.

I suspect this code needs to be called much earlier in the startup process.
I'll include the strace failure in case it gives an idea of where in the
process it needs to be moved to.

All the best,
Colin Waring.
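
The suggestion to run the cleanup earlier in startup can be sketched as
follows. This is an illustration of the ordering only, written in Python
rather than taken from ASSP's Perl source: clear_leftovers, start_up, the
"rebuild" name prefix and the open_databases callback are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def clear_leftovers(tmp_dir, prefix="rebuild"):
    """Delete entries in tmp_dir whose names start with prefix; return their names."""
    removed = []
    for entry in sorted(Path(tmp_dir).iterdir()):
        if entry.name.startswith(prefix):
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry.name)
    return removed


def start_up(tmp_dir, open_databases):
    """Free space first, then run the step that needs room in tmp_dir."""
    removed = clear_leftovers(tmp_dir)
    open_databases(tmp_dir)  # hypothetical stand-in for the step that needs space
    return removed


# Demonstration in a throwaway directory, not against a real tmpDB.
with tempfile.TemporaryDirectory() as demo:
    (Path(demo) / "rebuild_old").mkdir()
    (Path(demo) / "rebuild_old" / "data.bin").write_bytes(b"x" * 1024)
    (Path(demo) / "_cachecheck").mkdir()
    opened = []
    print(start_up(demo, opened.append))
```

The point of the ordering is that anything which writes into the folder runs
only after the stale data has gone, so a full folder left by an earlier
rebuild cannot stop the process before the cleanup gets its turn.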

 

 

stat64("/usr/local/assp/tmpDB", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=80, ...}) = 0
stat64("/usr/local/assp/tmpDB/_cachecheck", {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
lstat64("/usr/local/assp/tmpDB/_cachecheck/__db.001", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
unlink("/usr/local/assp/tmpDB/_cachecheck/__db.001") = 0
lstat64("/usr/local/assp/tmpDB/_cachecheck/__db.002", 0x8c07064) = -1 ENOENT (No such file or directory)
lstat64("/usr/local/assp/tmpDB/_cachecheck/__db.003", 0x8c07064) = -1 ENOENT (No such file or directory)
lstat64("/usr/local/assp/tmpDB/_cachecheck/__db.004", 0x8c07064) = -1 ENOENT (No such file or directory)
lstat64("/usr/local/assp/tmpDB/_cachecheck/BDB-cachesize-test-error.txt", {st_mode=S_IFREG|0644, st_size=98, ...}) = 0
unlink("/usr/local/assp/tmpDB/_cachecheck/BDB-cachesize-test-error.txt") = 0
open("/usr/local/assp/tmpDB/_cachecheck/BDB-cachesize-test-error.txt", O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 4
_llseek(4, 0, [0], SEEK_END)            = 0
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfaccc38) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR)            = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
time(NULL)                              = 1397029108
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0
time(NULL)                              = 1397029108
open("/sys/devices/system/cpu/online", O_RDONLY|O_CLOEXEC) = 5
read(5, "0\n", 8192)                    = 2
close(5)                                = 0
write(4, "2014-04-09 08:38:28\nBDB cachesiz"..., 50) = 50
fcntl64(4, F_GETFL)                     = 0x8401 (flags O_WRONLY|O_APPEND|O_LARGEFILE)
fstat64(4, {st_mode=S_IFREG|0644, st_size=50, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6ee0000
_llseek(4, 0, [50], SEEK_CUR)           = 0
open("/usr/local/assp/tmpDB/_cachecheck/DB_CONFIG", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64("/var/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
open("/usr/local/assp/tmpDB/_cachecheck/__db.001", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 5
fcntl64(5, F_GETFD)                     = 0
fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
open("/usr/local/assp/tmpDB/_cachecheck/__db.001", O_RDWR|O_CREAT|O_LARGEFILE, 0666) = 8
fcntl64(8, F_GETFD)                     = 0
fcntl64(8, F_SETFD, FD_CLOEXEC)         = 0
_llseek(8, 16384, [16384], SEEK_SET)    = 0
write(8, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = -1 ENOSPC (No space left on device)
write(4, "write: 0xbc3caf0, 8192: No space"..., 48) = 48
mmap2(NULL, 24576, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0) = 0xb66bd000
close(8)                                = 0
--- SIGBUS (Bus error) @ 0 (0) ---
+++ killed by SIGBUS (core dumped) +++
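
A likely reading of the trace is that the failed write left __db.001 shorter
than the 24576 bytes that were then mapped, and touching a mapped page with
no file data behind it raises SIGBUS. To pick the interesting failure out of
a long trace, a small Python sketch like the one below can help. It assumes
each syscall is on one line, the sample write line is shortened, and the name
failed_calls and the default ignore list are choices made for this example.

```python
import re

# A completed strace line for a failed call: name(args) = -1 ERRNO (description)
FAILED_CALL = re.compile(
    r"^(?P<call>\w+)\(.*\)\s*=\s*-1\s+(?P<errno>E[A-Z0-9]+)\s+\((?P<text>[^)]*)\)"
)


def failed_calls(trace_lines, ignore=("ENOENT", "ENOTTY")):
    """Return (syscall, errno, description) for failures not in the ignore list."""
    results = []
    for line in trace_lines:
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") not in ignore:
            results.append(
                (match.group("call"), match.group("errno"), match.group("text"))
            )
    return results


sample = [
    r'lstat64("/usr/local/assp/tmpDB/_cachecheck/__db.002", 0x8c07064) = -1 ENOENT (No such file or directory)',
    r'write(8, "\0\0\0\0"..., 8192) = -1 ENOSPC (No space left on device)',
    r'mmap2(NULL, 24576, PROT_READ|PROT_WRITE, MAP_SHARED, 8, 0) = 0xb66bd000',
]
print(failed_calls(sample))
```

ENOENT and ENOTTY are skipped by default because they appear routinely in the
trace above and are not the cause of the crash. Pass ignore=() to see every
failure.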

_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test




DISCLAIMER:
*******************************************************
This email and any files transmitted with it may be confidential, legally
privileged and protected in law and are intended solely for the use of the
individual to whom it is addressed.
This email was multiple times scanned for viruses. There should be no
known virus in this email!
*******************************************************



