On Mon, 2009-03-30 at 08:05 -0400, John A. Sullivan III wrote:
> On Mon, 2009-03-30 at 07:10 -0400, John A. Sullivan III wrote:
> > On Mon, 2009-03-30 at 07:04 -0400, John A. Sullivan III wrote:
> > > On Mon, 2009-03-30 at 06:58 -0400, John A. Sullivan III wrote:
> > > > On Tue, 2009-03-24 at 11:49 -0400, John A. Sullivan III wrote:
> > > > > Here it is. There is another problem. My apologies for wondering why
> > > > > the list was so slow to respond. I am not receiving any email from the
> > > > > list including Nerijus' response below. I only received your direct
> > > > > responses, Daniel. Does one need a gmail account to use googlegroups?
> > > > >
> > > > > In any event, here is the bzip2 file. Thanks - John
> > > > >
> > > > > On Tue, 2009-03-24 at 11:44 -0300, Daniel Cid wrote:
> > > > > > Yes, try zipping it and sending to the list (or directly to my email
> > > > > > if you think it may contain confidential
> > > > > > information). It will certainly help us debug this issue.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --
> > > > > > Daniel B. Cid
> > > > > > dcid ( at ) ossec.net
> > > > > >
> > > > > > On Fri, Mar 20, 2009 at 3:13 AM, Nerijus Krukauskas
> > > > > > <nkrukaus...@gmail.com> wrote:
> > > > > > >
> > > > > > > On 19/03/2009, John A. Sullivan III
> > > > > > > <jsulli...@opensourcedevel.com> wrote:
> > > > > > >>
> > > > > > >> Thanks, Daniel. I have the trace but it is a 40 MB file. How shall I
> > > > > > >> send it to you? - John
> > > > > > >
> > > > > > > I believe that if you try to zip it, it's gonna be something
> > > > > > > around 4 MB... :)
> > > > > > >
> > > > > > > --
> > > > > > > http://nk99.org/
> > > > > > >
> > > > Hello, all. I do have some more information on this serious bug. It
> > > > has now bitten us on two out of two vservers.
> > > >
> > > > We first thought it might have to do with our use of wildcards in the
> > > > localfile definitions, e.g.,
> > > > <localfile>
> > > > <log_format>syslog</log_format>
> > > > <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
> > > > </localfile>
> > > > So we pulled them all out. We still had the same problem. However, it
> > > > did seem to be coincidental with not being able to find specified files.
> > > > We had mistyped some file names and paths and saw this in the error logs
> > > > before the service spun out of control:
> > > >
> > > > 2009/03/30 04:57:14 ossec-syscheckd: INFO: Starting syscheck scan (db).
> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open file '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/vservers/w01/var/log/httpd/ssipki.error_log'.
> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/vservers/w01/var/log/httpd/ssipki.access_log'.
> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/admin-serv/error'.
> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/admin-serv/access'.
> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not available, ignoring it: '/var/log/dirsrv/slapd-ldap01/errors'.
> > > > 2009/03/30 05:16:10 ossec-syscheckd: INFO: Ending syscheck scan (db).
> > > >
> > > > On our second vserver, we did try wildcards in the directories
> > > > definitions. That gave us the following before spinning out of control:
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/etc': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/bin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/sbin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/bin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/sbin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/local/bin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/user/local/sbin': No such file or directory
> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: '/vservers/*/usr/local/etc': No such file or directory
> > > > 2009/03/30 05:51:22 ossec-syscheckd: INFO: Starting syscheck scan (db).
> > > >
> > > > Having corrected the paths in the first vserver and taken out the wild
> > > > cards, it seems to be behaving itself. However, not being able to use
> > > > wild cards or regexes in the directories and localfiles definitions is
> > > > certainly inconvenient when we anticipate hundreds of virtual machines
> > > > on some of these systems.
> > > >
> > > > That still leaves us with the base problem. It appears that if ossec
> > > > syscheckd encounters enough missing files, it does spin out of control
> > > > and requires a power cycle of the system to recover. Thanks - John
> > > >
> > > > PS - I'm still not receiving any emails from the mail list.
> > > >
> > > Oops! I spoke too soon. The first vserver just went out of control but
> > > again, it is about missing files. We had defined some directories we
> > > knew didn't have any files just in case they were populated in the
> > > future. We would hope we could do that to prevent human error. Here is
> > > what the logs showed before CPU usage spiked to 100%:
> > >
> > > 2009/03/30 06:22:20 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
> > > 2009/03/30 06:23:07 ossec-syscheckd: Error opening directory: '/vservers/ns02/user/local/sbin': No such file or directory
> > > 2009/03/30 06:23:57 ossec-syscheckd: Error opening directory: '/vservers/w01/user/local/sbin': No such file or directory
> > > 2009/03/30 06:25:18 ossec-syscheckd: Error opening directory: '/vservers/pg01/user/local/sbin': No such file or directory
> > > 2009/03/30 06:26:43 ossec-syscheckd: Error opening directory: '/vservers/ld01/user/local/sbin': No such file or directory
> > > 2009/03/30 06:28:43 ossec-syscheckd: INFO: Starting syscheck scan (db).
> > >
> > >
> > talk about embarrassment - I just noticed the typo - however, it again
> > emphasizes the point that ossec gets very unhappy if it can't find
> > something that has been defined in ossec.conf - John
>
> Bad news! The first vserver spun out of control again. This is with all
> typos corrected and no wild cards. Here is the log since the last
> reboot:
>
> 2009/03/30 07:09:44 ossec-execd: INFO: Started (pid: 5743).
> 2009/03/30 07:09:44 ossec-agentd(1410): INFO: Reading authentication keys file.
> 2009/03/30 07:09:44 ossec-agentd: INFO: No previous counter available for 'vs01'.
> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning sender counter: 6:4637
> 2009/03/30 07:09:44 ossec-agentd: INFO: Started (pid: 5747).
> 2009/03/30 07:09:44 ossec-agentd: INFO: Server IP Address: 172.x.x.30
> 2009/03/30 07:09:44 ossec-agentd: INFO: Trying to connect to server (172.x.x.30:1514).
> 2009/03/30 07:09:48 ossec-syscheckd: INFO: Started (pid: 5755).
> 2009/03/30 07:09:48 ossec-rootcheck: INFO: Started (pid: 5755).
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssipkipub.error_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssipkipub.access_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/error_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/access_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssl_error_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/w01/var/log/httpd/ssl_access_log'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/admin-serv/error'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/admin-serv/access'.
> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ld01/var/log/dirsrv/slapd-ldap01/errors'.
> 2009/03/30 07:09:50 ossec-logcollector: INFO: Started (pid: 5751).
> 2009/03/30 07:09:59 ossec-agentd(4102): INFO: Connected to the server (172.x.x.30:1514).
> 2009/03/30 07:19:03 ossec-syscheckd: INFO: Starting syscheck scan (db).
> 2009/03/30 07:38:01 ossec-syscheckd: INFO: Ending syscheck scan (db).
> 2009/03/30 07:38:21 ossec-rootcheck: INFO: Starting rootcheck scan.
>
> I suppose this implies it is not about not finding files but something
> specific to searching these vserver directories. They should appear as
> normal file systems. I will next try it without any vserver directories
> - John
<snip>

Argh! Even worse news. It still hangs - not a single mention of vserver
directories. As far as I can tell, this should be just like a regular
server - we are only scanning the host. No clues in the log files other
than it didn't take long to lock. Here's the log from restart:

2009/03/30 08:11:31 ossec-execd: INFO: Started (pid: 4373).
2009/03/30 08:11:31 ossec-agentd(1410): INFO: Reading authentication keys file.
2009/03/30 08:11:31 ossec-agentd: INFO: No previous counter available for 'vserver01'.
2009/03/30 08:11:31 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
2009/03/30 08:11:31 ossec-agentd: INFO: Assigning sender counter: 7:3613
2009/03/30 08:11:31 ossec-agentd: INFO: Started (pid: 4377).
2009/03/30 08:11:31 ossec-agentd: INFO: Server IP Address: 172.30.10.30
2009/03/30 08:11:31 ossec-agentd: INFO: Trying to connect to server (172.30.10.30:1514).
2009/03/30 08:11:36 ossec-syscheckd: INFO: Started (pid: 4385).
2009/03/30 08:11:36 ossec-rootcheck: INFO: Started (pid: 4385).
2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
2009/03/30 08:11:37 ossec-logcollector: INFO: Started (pid: 4381).
2009/03/30 08:11:46 ossec-agentd(4102): INFO: Connected to the server (172.30.10.30:1514).
2009/03/30 08:11:57 ossec-logcollector(1225): INFO: SIGNAL Received. Exit Cleaning...
2009/03/30 08:11:57 ossec-syscheckd(1225): INFO: SIGNAL Received. Exit Cleaning...
2009/03/30 08:11:57 ossec-agentd(1225): INFO: SIGNAL Received. Exit Cleaning...
2009/03/30 08:11:57 ossec-execd(1314): INFO: Shutdown received. Deleting responses.
2009/03/30 08:11:57 ossec-execd(1225): INFO: SIGNAL Received. Exit Cleaning...
2009/03/30 08:12:50 ossec-execd: INFO: Started (pid: 5438).
2009/03/30 08:12:50 ossec-agentd(1410): INFO: Reading authentication keys file.
2009/03/30 08:12:50 ossec-agentd: INFO: No previous counter available for 'vserver01'.
2009/03/30 08:12:50 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
2009/03/30 08:12:50 ossec-agentd: INFO: Assigning sender counter: 7:3623
2009/03/30 08:12:50 ossec-agentd: INFO: Started (pid: 5442).
2009/03/30 08:12:50 ossec-agentd: INFO: Server IP Address: 172.30.10.30
2009/03/30 08:12:50 ossec-agentd: INFO: Trying to connect to server (172.30.10.30:1514).
2009/03/30 08:12:51 ossec-agentd(4102): INFO: Connected to the server (172.30.10.30:1514).
2009/03/30 08:12:54 ossec-syscheckd: INFO: Started (pid: 5450).
2009/03/30 08:12:54 ossec-rootcheck: INFO: Started (pid: 5450).
2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
2009/03/30 08:12:56 ossec-logcollector: INFO: Started (pid: 5446).
2009/03/30 08:17:46 ossec-syscheckd: INFO: Starting syscheck scan (db).
2009/03/30 08:24:22 ossec-syscheckd: INFO: Ending syscheck scan (db).
2009/03/30 08:24:42 ossec-rootcheck: INFO: Starting rootcheck scan.

Where do I look now to solve this problem? Thanks - John
--
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsulli...@opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society
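
A possible stopgap for the wildcard inconvenience mentioned earlier in the thread, offered only as a rough sketch: expand the pattern outside of OSSEC and generate one explicit <localfile> stanza per file that actually exists, so ossec.conf never names a missing path. The helper names expand_localfiles and missing_paths are made up for this illustration and are not part of OSSEC; the element names and the example pattern are the ones quoted above. This does not address the hang itself, which the last test shows can occur with no missing files and no vserver paths at all.

```python
import glob
import os
from xml.sax.saxutils import escape


def expand_localfiles(pattern, log_format="syslog"):
    """Return one <localfile> stanza per existing regular file matching pattern.

    Hypothetical helper: the glob is expanded here, in Python, so the
    generated config contains only literal paths.
    """
    stanzas = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue  # skip directories and anything that is not a plain file
        stanzas.append(
            "<localfile>\n"
            f"<log_format>{escape(log_format)}</log_format>\n"
            f"<location>{escape(path)}</location>\n"
            "</localfile>"
        )
    return stanzas


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk (hypothetical helper)."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Pattern taken from the localfile example quoted in the thread.
    for stanza in expand_localfiles("/vservers/[a-zA-Z0-9]*/var/log/maillog"):
        print(stanza)
```

The printed stanzas would still have to be pasted or templated into ossec.conf by hand, and regenerated whenever a vserver is added or removed. Running missing_paths over the directories listed for syscheck before a restart would have caught the '/user/local/sbin' typo shown in the logs.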