Are you parsing the rawmsg or the msg body?

If you think you are running mmnormalize on the rawmsg but are really running it on the msg body (which I think is the default, but my memory could be faulty), the rules are being matched against different text than you expect.
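
A minimal, untested sketch of what that could look like, assuming the rulebase is meant to match the raw message. useRawMsg is an mmnormalize parameter that defaults to "off" (normalize the msg property), and $parsesuccess is set by mmnormalize to "OK" or "FAIL". The template name cdr_per_program and the /var/log/cdr path are made up for illustration; the rulebase path and the CDR_Events template are the ones from the message below. Once the fields parse, the parsed programname can be used in a DynaFile template to get one file per program:

```
# Hypothetical DynaFile template (name and path are assumptions).
# %$!programname% is the field extracted by the rulebase, not the
# programname property that rsyslog's own parser fills in.
# Templates have to be defined at the top level of the config.
template(
    name="cdr_per_program"
    type="string"
    string="/var/log/cdr/%$!programname%.log"
)

# Normalize the raw message instead of the msg property.
action(
    type="mmnormalize"
    rulebase="/etc/rsyslog.d/cdr.rb"
    useRawMsg="on"
)

# Only write per-program files when a rule actually matched;
# otherwise $!programname is empty and the file name would be ".log".
if $parsesuccess == "OK" then {
    action(
        type="omfile"
        DynaFile="cdr_per_program"
        template="CDR_Events"
    )
}
```

If $parsesuccess comes back as "FAIL", writing %rawmsg% and %msg% to a debug file should show which of the two the rules need to match.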

David Lang

On Mon, 7 Jul 2025, Klaus Pichert via rsyslog wrote:

Date: Mon, 7 Jul 2025 16:35:02 +0200
From: Klaus Pichert via rsyslog <[email protected]>
To: [email protected]
Cc: Klaus Pichert <[email protected]>
Subject: [rsyslog] Rsyslog parsing

Hello everybody,

I've got a strange issue: a program creates some weird syslog messages that look like this when written to disk as %msg%:

 2025-07-07T16:04:19+02:00 hostname MSCW[2660] Getting sanitized file from agent, file_id='yyyyyyy', data_id='xxxx' [msgid: 1288]
 2025-07-07T16:04:19+02:00 hostname MSCW[2660] Adding file to sanitized storage, data_id='xxxx', request_id='187863', signature='yyyyyyy' [msgid: 1265]
 2025-07-07T16:04:19+02:00 hostname MSCW[2660] File successfully added to sanitized storage, data_id='xxxx', file_id='yyyyyyy', path='C:/Program Files/…/data/sanitized/74/41/yyyyyyy' [msgid: 1273]
 2025-07-07T16:04:19+02:00 hostname MSCW[2660] Processing finished, is_sync_scan='false', user='#', workflow_id='lms::workflow::RestWorkflowExecutor(0x2a483245060, name = "WF-127.0.0.1:51688")', root_data_id='', parent_data_id='', data_id='xxxx', fileName='filename.txt', fileSize='21707', fileTypeDesc='ASCII Text', sha256sum='yyyyyyy', firstchunk_ts='1751897059229', lastchunk_ts='1751897059229', blocked='false', blocked_reason='', overallResult='No Threat Detected', threatFoundCount='0', embeddedObjectsWithThreat='0', totalResultCount='8', threatDetectedBy='', threatName='', yaraRuleMatched='', ruleName='File process', source='10.0.0.1', engines-metadata='{}', totalProcessingTime='149' [msgid: 82]
 2025-07-07T16:04:19+02:00 hostname engineprocess[11528] Snd, tid='', data_id='xxxx', file_path='C:\Program Files\...\data\resources\d317643.tmp', tmo='600998', ntsk='1' [msgid: 5576]

There IS a leading space before each event, which is why I was using %msg% in the template. The logs themselves are not RFC-compliant (i.e. the leading space before the timestamp and the missing colon after the process ID).

I have the following settings:
- remote.conf (extract)
        action(
            type="mmnormalize"
            rulebase="/etc/rsyslog.d/cdr.rb"
        )
        action(
            type="omfile"
            DynaFile="remote_host_cdr"
            dirCreateMode="0755"
            dirGroup="splunkfwd"
            fileCreateMode="0644"
            fileGroup="splunkfwd"
            template="CDR_Events"
        )

- mmnormalize rule (different tries)
version=2
rule=:%timestamp:word% %hostname:word% %programname:char-to:[%[%procid:number%] %msg:rest%
rule=: %timestamp:word% %hostname:word% %programname:char-to:[%[%procid:number%] %msg:rest%
rule=:<%pri:number%>%timestamp:word% %hostname:word% %programname:char-to:[%[%procid:number%] %msg:rest%
rule=:<%pri:number%> %timestamp:word% %hostname:word% %programname:char-to:[%[%procid:number%] %msg:rest%

- Template
template(
        name="CDR_Events"
        type="string"
        string="%msg%\n"
)

- Results from lognormalizer
lognormalizer -r /etc/rsyslog.d/cdr.rb < ~/cdr.log  | jq
{
  "msg": "Getting sanitized file from agent, file_id='yyyyyyy', data_id='xxxx' [msgid: 1288]",
  "procid": "2660",
  "programname": "MSCW",
  "hostname": "hostname",
  "timestamp": "2025-07-07T16:04:19+02:00"
}
{
  "msg": "Adding file to sanitized storage, data_id='xxxx', request_id='187863', signature='yyyyyyy' [msgid: 1265]",
  "procid": "2660",
  "programname": "MSCW",
  "hostname": "hostname",
  "timestamp": "2025-07-07T16:04:19+02:00"
}
{
  "msg": "File successfully added to sanitized storage, data_id='xxxx', file_id='yyyyyyy', path='C:/Program Files/…/data/sanitized/74/41/yyyyyyy' [msgid: 1273]",
  "procid": "2660",
  "programname": "MSCW",
  "hostname": "hostname",
  "timestamp": "2025-07-07T16:04:19+02:00"
}
{
  "msg": "Processing finished, is_sync_scan='false', user='#', workflow_id='lms::workflow::RestWorkflowExecutor(0x2a483245060, name = \"WF-127.0.0.1:51688\")', root_data_id='', parent_data_id='', data_id='xxxx', fileName='filename.txt', fileSize='21707', fileTypeDesc='ASCII Text', sha256sum='yyyyyyy', firstchunk_ts='1751897059229', lastchunk_ts='1751897059229', blocked='false', blocked_reason='', overallResult='No Threat Detected', threatFoundCount='0', embeddedObjectsWithThreat='0', totalResultCount='8', threatDetectedBy='', threatName='', yaraRuleMatched='', ruleName='File process', source='10.0.0.1', engines-metadata='{}', totalProcessingTime='149' [msgid: 82]",
  "procid": "2660",
  "programname": "MSCW",
  "hostname": "hostname",
  "timestamp": "2025-07-07T16:04:19+02:00"
}
{
  "msg": "Snd, tid='', data_id='xxxx', file_path='C:\\Program Files\\...\\data\\resources\\d317643.tmp', tmo='600998', ntsk='1' [msgid: 5576]",
  "procid": "11528",
  "programname": "engineprocess",
  "hostname": "hostname",
  "timestamp": "2025-07-07T16:04:19+02:00"
}

Events are written to disk in a single .log file, but my goal is to write a separate log file per programname (enginelogs, MSCW).

Is it possible? Any hints?

Best regards

Klaus




_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.