Thank you, Daniel. This gives us a usable workaround, as we can find other options for rootkit detection. I wonder why the process checking caused such grief and locked the systems so badly that only a power cycle could stop the runaway process.
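For anyone following along, the workaround Daniel describes in the quoted message below would look roughly like this in ossec.conf. This is only a minimal sketch: the <disabled> element under the rootcheck configuration comes from his message, while the enclosing <ossec_config> root element is an assumption about the layout of the surrounding file.

```xml
<ossec_config>
  <!-- Disable rootcheck (rootkit detection), as Daniel suggests -->
  <rootcheck>
    <disabled>yes</disabled>
  </rootcheck>
</ossec_config>
```

After editing the file, the OSSEC processes would presumably need to be restarted to pick up the change.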
The wildcards are great news. I see one uses sregex for ignore directives and POSIX wildcards for localfiles. Shall I assume these remain the same and that POSIX wildcards have been added for directories directives? - John

On Mon, 2009-03-30 at 16:35 -0300, Daniel Cid wrote:
> Hi John,
>
> So, for the first issue (using wildcards), you can do if you update to the latest snapshot:
>
> http://www.ossec.net/files/snapshots/ossec-hids-090330.tar.gz
>
> For the second issue, by looking at the strace output you sent and the logs, it is being caused by rootcheck (that does the rootkit detection) and not by syscheck. However, rootcheck is called from inside syscheck and that's why you are seeing the process ossec-syscheckd going crazy.
>
> If you want to disable rootcheck, just set <disabled> to yes under the rootcheck configuration and this problem should go away. Also, by looking at the strace, the CPU was going very high during the period of process checking, where it tries to loop through all available pids and compare the output of getpid, getpgid, getsid, proc and ps, looking for anomalies... So, it was not dead or hung.
>
> Thanks,
>
> --
> Daniel B. Cid
> dcid ( at ) ossec.net
>
> On Mon, Mar 30, 2009 at 12:01 PM, John A. Sullivan III <jsulli...@opensourcedevel.com> wrote:
> > On Mon, 2009-03-30 at 08:05 -0400, John A. Sullivan III wrote:
> >> On Mon, 2009-03-30 at 07:10 -0400, John A. Sullivan III wrote:
> >> > On Mon, 2009-03-30 at 07:04 -0400, John A. Sullivan III wrote:
> >> > > On Mon, 2009-03-30 at 06:58 -0400, John A. Sullivan III wrote:
> >> > > > On Tue, 2009-03-24 at 11:49 -0400, John A. Sullivan III wrote:
> >> > > > > Here it is. There is another problem. My apologies for wondering why the list was so slow to respond. I am not receiving any email from the list including Nerijus' response below. I only received your direct responses, Daniel. Does one need a gmail account to use googlegroups?
> >> > > > >
> >> > > > > In any event, here is the bzip2 file. Thanks - John
> >> > > > >
> >> > > > > On Tue, 2009-03-24 at 11:44 -0300, Daniel Cid wrote:
> >> > > > > > Yes, try zipping it and sending to the list (or directly to my email if you think it may contain confidential information). It will certainly help us debug this issue.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > >
> >> > > > > > --
> >> > > > > > Daniel B. Cid
> >> > > > > > dcid ( at ) ossec.net
> >> > > > > >
> >> > > > > > On Fri, Mar 20, 2009 at 3:13 AM, Nerijus Krukauskas <nkrukaus...@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > On 19/03/2009, John A. Sullivan III <jsulli...@opensourcedevel.com> wrote:
> >> > > > > > >>
> >> > > > > > >> Thanks, Daniel. I have the trace but it is a 40 MB file. How shall I send it to you? - John
> >> > > > > > >
> >> > > > > > > I believe that if you try to zip it, it's gonna be something around 4 MB... :)
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > http://nk99.org/
> >> > > > > > >
> >> > > > Hello, all. I do have some more information on this serious bug. It has now bitten us on two out of two vservers.
> >> > > >
> >> > > > We first thought it might have to do with our use of wildcards in the localfile definitions, e.g.,
> >> > > > <localfile>
> >> > > >   <log_format>syslog</log_format>
> >> > > >   <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
> >> > > > </localfile>
> >> > > > So we pulled them all out. We still had the same problem. However, it did seem to be coincidental with not being able to find specified files.
> >> > > > We had mistyped some file names and paths and saw this in the error logs before the service spun out of control:
> >> > > >
> >> > > > 2009/03/30 04:57:14 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:16:10 ossec-syscheckd: INFO: Ending syscheck scan (db).
> >> > > >
> >> > > > On our second vserver, we did try wildcards in the directories definitions. That gave us the following before spinning out of control:
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/etc': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/local/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/user/local/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/local/etc': No such file or directory
> >> > > > 2009/03/30 05:51:22 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> > > >
> >> > > > Having corrected the paths in the first vserver and taken out the wildcards, it seems to be behaving itself. However, not being able to use wildcards or regexes in the directories and localfiles definitions is certainly inconvenient when we anticipate hundreds of virtual machines on some of these systems.
> >> > > >
> >> > > > That still leaves us with the base problem. It appears that if ossec syscheckd encounters enough missing files, it does spin out of control and requires a power cycle of the system to recover. Thanks - John
> >> > > >
> >> > > > PS - I'm still not receiving any emails from the mail list.
> >> > > >
> >> > > Oops! I spoke too soon. The first vserver just went out of control but again, it is about missing files. We had defined some directories we knew didn't have any files just in case they were populated in the future. We would hope we could do that to prevent human error. Here is what the logs showed before CPU usage spiked to 100%:
> >> > >
> >> > > 2009/03/30 06:22:20 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:23:07 ossec-syscheckd: Error opening directory: '/vservers/ns02/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:23:57 ossec-syscheckd: Error opening directory: '/vservers/w01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:25:18 ossec-syscheckd: Error opening directory: '/vservers/pg01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:26:43 ossec-syscheckd: Error opening directory: '/vservers/ld01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:28:43 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> > >
> >> > Talk about embarrassment - I just noticed the typo - however, it again emphasizes the point that ossec gets very unhappy if it can't find something that has been defined in ossec.conf - John
> >>
> >> Bad news! The first vserver spun out of control again. This is with all typos corrected and no wildcards. Here is the log since the last reboot:
> >>
> >> 2009/03/30 07:09:44 ossec-execd: INFO: Started (pid: 5743).
> >> 2009/03/30 07:09:44 ossec-agentd(1410): INFO: Reading authentication keys file.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: No previous counter available for 'vs01'.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning sender counter: 6:4637
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Started (pid: 5747).
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Server IP Address: 172.x.x.30
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Trying to connect to server (172.x.x.30:1514).
> >> 2009/03/30 07:09:48 ossec-syscheckd: INFO: Started (pid: 5755).
> >> 2009/03/30 07:09:48 ossec-rootcheck: INFO: Started (pid: 5755).
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssipkipub.error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssipkipub.access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssl_error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssl_access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/admin-serv/error'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/admin-serv/access'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/slapd-ldap01/errors'.
> >> 2009/03/30 07:09:50 ossec-logcollector: INFO: Started (pid: 5751).
> >> 2009/03/30 07:09:59 ossec-agentd(4102): INFO: Connected to the server (172.x.x.30:1514).
> >> 2009/03/30 07:19:03 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> 2009/03/30 07:38:01 ossec-syscheckd: INFO: Ending syscheck scan (db).
> >> 2009/03/30 07:38:21 ossec-rootcheck: INFO: Starting rootcheck scan.
> >>
> >> I suppose this implies it is not about not finding files but something specific to searching these vserver directories. They should appear as normal file systems. I will next try it without any vserver directories - John
> > <snip>
> > Argh! Even worse news. It still hangs - not a single mention of vserver directories. As far as I can tell, this should be just like a regular server - we are only scanning the host. No clues in the log files other than it didn't take long to lock. Here's the log from restart:
> >
> > 2009/03/30 08:11:31 ossec-execd: INFO: Started (pid: 4373).
> > 2009/03/30 08:11:31 ossec-agentd(1410): INFO: Reading authentication keys file.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: No previous counter available for 'vserver01'.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning sender counter: 7:3613
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Started (pid: 4377).
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Trying to connect to server (172.30.10.30:1514).
> > 2009/03/30 08:11:36 ossec-syscheckd: INFO: Started (pid: 4385).
> > 2009/03/30 08:11:36 ossec-rootcheck: INFO: Started (pid: 4385).
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
> > 2009/03/30 08:11:37 ossec-logcollector: INFO: Started (pid: 4381).
> > 2009/03/30 08:11:46 ossec-agentd(4102): INFO: Connected to the server (172.30.10.30:1514).
> > 2009/03/30 08:11:57 ossec-logcollector(1225): INFO: SIGNAL Received. Exit Cleaning...
> > 2009/03/30 08:11:57 ossec-syscheckd(1225): INFO: SIGNAL Received. Exit Cleaning...
> > 2009/03/30 08:11:57 ossec-agentd(1225): INFO: SIGNAL Received. Exit Cleaning...
> > 2009/03/30 08:11:57 ossec-execd(1314): INFO: Shutdown received. Deleting responses.
> > 2009/03/30 08:11:57 ossec-execd(1225): INFO: SIGNAL Received. Exit Cleaning...
> > 2009/03/30 08:12:50 ossec-execd: INFO: Started (pid: 5438).
> > 2009/03/30 08:12:50 ossec-agentd(1410): INFO: Reading authentication keys file.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: No previous counter available for 'vserver01'.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning sender counter: 7:3623
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Started (pid: 5442).
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Trying to connect to server (172.30.10.30:1514).
> > 2009/03/30 08:12:51 ossec-agentd(4102): INFO: Connected to the server (172.30.10.30:1514).
> > 2009/03/30 08:12:54 ossec-syscheckd: INFO: Started (pid: 5450).
> > 2009/03/30 08:12:54 ossec-rootcheck: INFO: Started (pid: 5450).
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
> > 2009/03/30 08:12:56 ossec-logcollector: INFO: Started (pid: 5446).
> > 2009/03/30 08:17:46 ossec-syscheckd: INFO: Starting syscheck scan (db).
> > 2009/03/30 08:24:22 ossec-syscheckd: INFO: Ending syscheck scan (db).
> > 2009/03/30 08:24:42 ossec-rootcheck: INFO: Starting rootcheck scan.
> >
> > Where do I look now to solve this problem? Thanks - John
> > --
> > John A. Sullivan III
> > Open Source Development Corporation
> > +1 207-985-7880
> > jsulli...@opensourcedevel.com
> >
> > http://www.spiritualoutreach.com
> > Making Christianity intelligible to secular society
> >

--
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsulli...@opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society