To fix up the invalid JSON you can try readClob (or maybe readLine) to turn
the body into a string, followed by findReplace or grok to strip the leading
text, followed by toByteArray, followed by
setValues { _attachment_body : "@{message}" } to put the cleaned bytes back,
followed by readJson.
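
A rough, untested sketch of that chain as a morphline config follows. The
morphline id and the regex are made up for illustration, and the
importCommands package is only guessed from the class name in your debug log
(newer Kite releases use org.kitesdk.** instead). Please check the parameter
names against the reference guide for your version, and note that findReplace
may not be available in older releases.

```
morphlines : [
  {
    id : fixJson   # hypothetical id
    importCommands : ["com.cloudera.cdk.morphline.**"]   # assumed package
    commands : [
      # Read _attachment_body as a single string into the "message" field
      { readClob { charset : UTF-8 } }

      # Drop everything before the first "{", e.g. the leading "x - - "
      # (illustrative regex, anchored at the start of the string)
      {
        findReplace {
          field : message
          pattern : "^[^{]*"
          isRegex : true
          replacement : ""
        }
      }

      # Convert the cleaned string back into a byte array
      { toByteArray { field : message, charset : UTF-8 } }

      # Put the bytes back where readJson expects its input
      { setValues { _attachment_body : "@{message}" } }

      { readJson {} }
    ]
  }
]
```

The regex assumes the JSON object is the first thing in the body that starts
with "{". If the prefix itself can contain a brace you would need a stricter
pattern, or grok.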
Wolfgang.
On Mar 26, 2014, at 8:59 PM, Andrew Sammut <[email protected]> wrote:
>
> Hi all
>
> I'm a relative beginner to flume and morphlines, however I will try to
> explain myself.
>
> I am generating logs (with the message in JSON) from PHP and pumping them to
> rsyslog which then in turn pumps the logs into a flume syslog tcp source.
>
> An example message would be the following:
>
> <12>1 2014-03-27T03:46:56.648886+00:00 x x - -
> {"time":1395892016,"level":4,"body":"HTTP_Exception_404 [404]: \/
> KIX-53339f309cafb6.72508132 - Unable to find a route to match the URI: in
> MODPATH\/patches\/classes\/Request.php on line 254"}
>
> And this translates into the following on entry to my morphlines command:
>
> 27 Mar 2014 03:46:56,890 DEBUG [pool-9-thread-1]
> (com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug.log:63) - begin:
> [{Facility=[1], Severity=[4], _attachment_body=[[B@2d913d11],
> environment=[x], host=[x], hostname=[x], pop=[x], product=[x],
> timestamp=[1395892664886]}]
>
> That's all good, however when you look at the actual string representation of
> _attachment_body (using readLine) it is formatted as this:
>
> x - - {"time":1395892016,"level":4,"body":"HTTP_Exception_404 [404]: \/
> KIX-53339f309cafb6.72508132 - Unable to find a route to match the URI: in
> MODPATH\/patches\/classes\/Request.php on line 254"}
>
> Now, if I run readJson on that, it fails completely as it's improperly
> formatted. The question is, how can one process the attachment body to remove
> the leading 'x - - ' so that readJson would work? Am I missing something
> completely?
>
> Regards,
> Andrew S