Thank you, Daniel.  This gives us a usable workaround, as we can find
other options for rootkit detection.  I wonder why the process checking
was causing it such grief that it locked the systems to the point where
only a power cycle was able to stop the runaway process.
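
For anyone following along, here is a minimal sketch of the kind of
check Daniel describes: walk a range of pids and flag any pid on which
different views of the process table disagree.  It is only an
illustration under my own assumptions, not rootcheck's actual code.  The
function names, the set of probes and the pid range are all
hypothetical, and it leaves out the comparison against ps output.

```python
import os


def syscall_sees(call, pid):
    """Return True if the given os-level call reports that pid exists."""
    try:
        call(pid)
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True   # the pid exists, we just may not inspect it
    return True


# Hypothetical probe set: each probe answers "does this pid exist?"
PROBES = {
    "getpgid": lambda pid: syscall_sees(os.getpgid, pid),
    "getsid": lambda pid: syscall_sees(os.getsid, pid),
    "proc": lambda pid: os.path.isdir(f"/proc/{pid}"),
}


def find_anomalies(pids, probes=PROBES):
    """Return {pid: {probe_name: bool}} for pids where the probes disagree."""
    anomalies = {}
    for pid in pids:
        views = {name: probe(pid) for name, probe in probes.items()}
        if len(set(views.values())) > 1:
            anomalies[pid] = views
    return anomalies
```

Calling find_anomalies(range(1, 32769)) would cover every pid on a box
whose pid_max is 32768 (an assumption, check /proc/sys/kernel/pid_max).
Every pid costs several system calls, so the loop is pure CPU work, and
a process that starts or exits between two probes can show up as a
false positive.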

The wildcards are great news.  I see one uses sregex for ignore
directives and POSIX wildcards for localfiles.  Shall I assume these
remain the same and that POSIX wildcards have now been added for
directories directives? - John
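
PS - To make the question concrete, this is the behaviour I am hoping
for from wildcards in directories.  It is a sketch using Python's glob
module, purely as an illustration of shell-style expansion.  It is not
how OSSEC implements it, and the function name is made up.

```python
import glob
import os


def expand_directories(patterns):
    """Expand shell-style wildcards.

    Returns (matched_dirs, unmatched_patterns) so that a pattern matching
    nothing is reported once instead of being opened as a literal path.
    """
    matched, unmatched = [], []
    for pattern in patterns:
        hits = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
        if hits:
            matched.extend(hits)
        else:
            unmatched.append(pattern)
    return matched, unmatched
```

With that, expand_directories(["/vservers/*/etc"]) would return one
entry per guest that has an etc directory, rather than the literal
'/vservers/*/etc' that syscheckd tried to open in the logs below.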

On Mon, 2009-03-30 at 16:35 -0300, Daniel Cid wrote:
> Hi John,
> 
> So, for the first issue (using wildcards), you can do that if you update to
> the latest snapshot:
> 
> http://www.ossec.net/files/snapshots/ossec-hids-090330.tar.gz
> 
> For the second issue, by looking at the strace output you sent and the
> logs, it is being caused by
> rootcheck (that does the rootkit detection) and not by syscheck.
> However, rootcheck is called from
> inside syscheck and that's why you are seeing the process
> ossec-syscheckd going crazy.
> 
> If you want to disable rootcheck, just set <disabled> to yes under the
> rootcheck configuration and this
> problem should go away. Also, by looking at the strace, the CPU was
> going very high during the period
> of process checking, where it tries to loop through all available pids
> and compare the output of
> getpid, getpgid, getsid, proc and ps, looking for anomalies... So, it
> was not dead or hung.
> 
> 
> Thanks,
> 
> --
> Daniel B. Cid
> dcid ( at ) ossec.net
> 
> 
> 
> On Mon, Mar 30, 2009 at 12:01 PM, John A. Sullivan III
> <jsulli...@opensourcedevel.com> wrote:
> > On Mon, 2009-03-30 at 08:05 -0400, John A. Sullivan III wrote:
> >> On Mon, 2009-03-30 at 07:10 -0400, John A. Sullivan III wrote:
> >> > On Mon, 2009-03-30 at 07:04 -0400, John A. Sullivan III wrote:
> >> > > On Mon, 2009-03-30 at 06:58 -0400, John A. Sullivan III wrote:
> >> > > > On Tue, 2009-03-24 at 11:49 -0400, John A. Sullivan III wrote:
> >> > > > > Here it is.  There is another problem.  My apologies for wondering 
> >> > > > > why
> >> > > > > the list was so slow to respond.  I am not receiving any email 
> >> > > > > from the
> >> > > > > list including Nerijus' response below. I only received your direct
> >> > > > > responses, Daniel.  Does one need a gmail account to use 
> >> > > > > googlegroups?
> >> > > > >
> >> > > > > In any event, here is the bzip2 file.  Thanks - John
> >> > > > >
> >> > > > > On Tue, 2009-03-24 at 11:44 -0300, Daniel Cid wrote:
> >> > > > > > Yes, try zipping it and sending to the list (or directly to my 
> >> > > > > > email
> >> > > > > > if you think it may contain confidential
> >> > > > > > information). It will certainly help us debug this issue.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > >
> >> > > > > > --
> >> > > > > > Daniel B. Cid
> >> > > > > > dcid ( at ) ossec.net
> >> > > > > >
> >> > > > > > On Fri, Mar 20, 2009 at 3:13 AM, Nerijus Krukauskas
> >> > > > > > <nkrukaus...@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > On 19/03/2009, John A. Sullivan III 
> >> > > > > > > <jsulli...@opensourcedevel.com> wrote:
> >> > > > > > >>
> >> > > > > > >> Thanks, Daniel.  I have the trace but it is a 40 MB file.  
> >> > > > > > >> How shall I
> >> > > > > > >> send it to you? - John
> >> > > > > > >
> >> > > > > > >  I believe that if you try to zip it, it's gonna be something 
> >> > > > > > > around 4 MB... :)
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > http://nk99.org/
> >> > > > > > >
> >> > > > Hello, all.  I do have some more information on this serious bug.  It
> >> > > > has now bitten us on two out of two vservers.
> >> > > >
> >> > > > We first thought it might have to do with our use of wildcards in the
> >> > > > localfile definitions, e.g.,
> >> > > >   <localfile>
> >> > > >     <log_format>syslog</log_format>
> >> > > >     <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
> >> > > >   </localfile>
> >> > > > So we pulled them all out.  We still had the same problem.  However, 
> >> > > > it
> >> > > > did seem to be coincidental with not being able to find specified 
> >> > > > files.
> >> > > > We had mistyped some file names and paths and saw this in the error 
> >> > > > logs
> >> > > > before the service spun out of control:
> >> > > >
> >> > > > 2009/03/30 04:57:14 ossec-syscheckd: INFO: Starting syscheck scan 
> >> > > > (db).
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
> >> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
> >> > > > available, ignoring it: 
> >> > > > '/vservers/w01/var/log/httpd/ssipki.error_log'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
> >> > > > available, ignoring it: 
> >> > > > '/vservers/w01/var/log/httpd/ssipki.access_log'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
> >> > > > available, ignoring it: '/var/log/dirsrv/admin-serv/error'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
> >> > > > available, ignoring it: '/var/log/dirsrv/admin-serv/access'.
> >> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
> >> > > > available, ignoring it: '/var/log/dirsrv/slapd-ldap01/errors'.
> >> > > > 2009/03/30 05:16:10 ossec-syscheckd: INFO: Ending syscheck scan (db).
> >> > > >
> >> > > > On our second vserver, we did try wildcards in the directories
> >> > > > definitions.  That gave us the following before spinning out of 
> >> > > > control:
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/user/local/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/etc': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/usr/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/usr/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/usr/local/bin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/user/local/sbin': No such file or directory
> >> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
> >> > > > '/vservers/*/usr/local/etc': No such file or directory
> >> > > > 2009/03/30 05:51:22 ossec-syscheckd: INFO: Starting syscheck scan 
> >> > > > (db).
> >> > > >
> >> > > > Having corrected the paths in the first vserver and taken out the 
> >> > > > wild
> >> > > > cards, it seems to be behaving itself.  However, not being able to 
> >> > > > use
> >> > > > wild cards or regexes in the directories and localfiles definitions 
> >> > > > is
> >> > > > certainly inconvenient when we anticipate hundreds of virtual 
> >> > > > machines
> >> > > > on some of these systems.
> >> > > >
> >> > > > That still leaves us with the base problem.  It appears that if ossec
> >> > > > syscheckd encounters enough missing files, it does spin out of 
> >> > > > control
> >> > > > and requires a power cycle of the system to recover.  Thanks - John
> >> > > >
> >> > > > PS - I'm still not receiving any emails from the mail list.
> >> > > >
> >> > > Oops! I spoke too soon.  The first vserver just went out of control but
> >> > > again, it is about missing files.  We had defined some directories we
> >> > > knew didn't have any files just in case they were populated in the
> >> > > future.  We would hope we could do that to prevent human error.  Here 
> >> > > is
> >> > > what the logs showed before CPU usage spiked to 100%:
> >> > >
> >> > > 2009/03/30 06:22:20 ossec-syscheckd: Error opening directory: 
> >> > > '/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:23:07 ossec-syscheckd: Error opening directory: 
> >> > > '/vservers/ns02/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:23:57 ossec-syscheckd: Error opening directory: 
> >> > > '/vservers/w01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:25:18 ossec-syscheckd: Error opening directory: 
> >> > > '/vservers/pg01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:26:43 ossec-syscheckd: Error opening directory: 
> >> > > '/vservers/ld01/user/local/sbin': No such file or directory
> >> > > 2009/03/30 06:28:43 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> > >
> >> > >
> >> > Talk about embarrassment - I just noticed the typo - however, it again
> >> > emphasizes the point that ossec gets very unhappy if it can't find
> >> > something that has been defined in ossec.conf - John
> >>
> >> Bad news! The first vserver spun out of control again.  This is with all
> >> typos corrected and no wild cards.  Here is the log since the last
> >> reboot:
> >>
> >> 2009/03/30 07:09:44 ossec-execd: INFO: Started (pid: 5743).
> >> 2009/03/30 07:09:44 ossec-agentd(1410): INFO: Reading authentication keys 
> >> file.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: No previous counter available for 
> >> 'vs01'.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning counter for agent 
> >> vserver01: '0:0'.
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning sender counter: 6:4637
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Started (pid: 5747).
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Server IP Address: 172.x.x.30
> >> 2009/03/30 07:09:44 ossec-agentd: INFO: Trying to connect to server 
> >> (172.x.x.30:1514).
> >> 2009/03/30 07:09:48 ossec-syscheckd: INFO: Started (pid: 5755).
> >> 2009/03/30 07:09:48 ossec-rootcheck: INFO: Started (pid: 5755).
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/var/log/messages'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/var/log/secure'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/var/log/maillog'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/var/log/cron'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/ssipkipub.error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/ssipkipub.access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/ssl_error_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/w01/var/log/httpd/ssl_access_log'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/ld01/var/log/dirsrv/admin-serv/error'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/ld01/var/log/dirsrv/admin-serv/access'.
> >> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
> >> '/vservers/ld01/var/log/dirsrv/slapd-ldap01/errors'.
> >> 2009/03/30 07:09:50 ossec-logcollector: INFO: Started (pid: 5751).
> >> 2009/03/30 07:09:59 ossec-agentd(4102): INFO: Connected to the server 
> >> (172.x.x.30:1514).
> >> 2009/03/30 07:19:03 ossec-syscheckd: INFO: Starting syscheck scan (db).
> >> 2009/03/30 07:38:01 ossec-syscheckd: INFO: Ending syscheck scan (db).
> >> 2009/03/30 07:38:21 ossec-rootcheck: INFO: Starting rootcheck scan.
> >>
> >> I suppose this implies it is not about not finding files but something
> >> specific to searching these vserver directories. They should appear as
> >> normal file systems.  I will next try it without any vserver directories
> >> - John
> > <snip>
> > Argh! Even worse news.  It still hangs - not a single mention of vserver
> > directories.  As far as I can tell, this should be just like a regular
> > server - we are only scanning the host.  No clues in the log files other
> > than it didn't take long to lock.  Here's the log from restart:
> >
> > 2009/03/30 08:11:31 ossec-execd: INFO: Started (pid: 4373).
> > 2009/03/30 08:11:31 ossec-agentd(1410): INFO: Reading authentication keys 
> > file.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: No previous counter available for 
> > 'vserver01'.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning counter for agent 
> > vserver01: '0:0'.
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning sender counter: 7:3613
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Started (pid: 4377).
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> > 2009/03/30 08:11:31 ossec-agentd: INFO: Trying to connect to server 
> > (172.30.10.30:1514).
> > 2009/03/30 08:11:36 ossec-syscheckd: INFO: Started (pid: 4385).
> > 2009/03/30 08:11:36 ossec-rootcheck: INFO: Started (pid: 4385).
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/messages'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/secure'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/maillog'.
> > 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/cron'.
> > 2009/03/30 08:11:37 ossec-logcollector: INFO: Started (pid: 4381).
> > 2009/03/30 08:11:46 ossec-agentd(4102): INFO: Connected to the server 
> > (172.30.10.30:1514).
> > 2009/03/30 08:11:57 ossec-logcollector(1225): INFO: SIGNAL Received. Exit 
> > Cleaning...
> > 2009/03/30 08:11:57 ossec-syscheckd(1225): INFO: SIGNAL Received. Exit 
> > Cleaning...
> > 2009/03/30 08:11:57 ossec-agentd(1225): INFO: SIGNAL Received. Exit 
> > Cleaning...
> > 2009/03/30 08:11:57 ossec-execd(1314): INFO: Shutdown received. Deleting 
> > responses.
> > 2009/03/30 08:11:57 ossec-execd(1225): INFO: SIGNAL Received. Exit 
> > Cleaning...
> > 2009/03/30 08:12:50 ossec-execd: INFO: Started (pid: 5438).
> > 2009/03/30 08:12:50 ossec-agentd(1410): INFO: Reading authentication keys 
> > file.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: No previous counter available for 
> > 'vserver01'.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning counter for agent 
> > vserver01: '0:0'.
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning sender counter: 7:3623
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Started (pid: 5442).
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> > 2009/03/30 08:12:50 ossec-agentd: INFO: Trying to connect to server 
> > (172.30.10.30:1514).
> > 2009/03/30 08:12:51 ossec-agentd(4102): INFO: Connected to the server 
> > (172.30.10.30:1514).
> > 2009/03/30 08:12:54 ossec-syscheckd: INFO: Started (pid: 5450).
> > 2009/03/30 08:12:54 ossec-rootcheck: INFO: Started (pid: 5450).
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/messages'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/secure'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/maillog'.
> > 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> > '/var/log/cron'.
> > 2009/03/30 08:12:56 ossec-logcollector: INFO: Started (pid: 5446).
> > 2009/03/30 08:17:46 ossec-syscheckd: INFO: Starting syscheck scan (db).
> > 2009/03/30 08:24:22 ossec-syscheckd: INFO: Ending syscheck scan (db).
> > 2009/03/30 08:24:42 ossec-rootcheck: INFO: Starting rootcheck scan.
> >
> > Where do I look now to solve this problem? Thanks - John
> > --
> > John A. Sullivan III
> > Open Source Development Corporation
> > +1 207-985-7880
> > jsulli...@opensourcedevel.com
> >
> > http://www.spiritualoutreach.com
> > Making Christianity intelligible to secular society
> >
> >
-- 
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsulli...@opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society
