[ 
https://issues.apache.org/jira/browse/FLUME-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213632#comment-13213632
 ] 

Brock Noland commented on FLUME-828:
------------------------------------

Hi Mike,

new String(body, UTF8) will not give a useful result for non-string event 
bodies, so I think the hex code is required for SimpleEvent. I think each 
event implementation may want to define its own way to "log" the event. For 
example, an XMLEvent or ImageEvent may want to define its own way of 
displaying itself when logged. As a convention we can use toString() or some 
other method; toString() is commonly used for debugging, which is what the 
logger sink will be used for.
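
As a rough sketch of the hex idea (this is not the attached patch; BodyDump 
and dump() are names made up for illustration), something along these lines 
would print up to the first 16 bytes in a "hexdump -C"-like style:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Flume's actual LoggerSink code.
final class BodyDump {

    // Renders at most the first 16 bytes as hex plus printable ASCII,
    // and appends " ..." when the body is longer than 16 bytes.
    static String dump(byte[] body) {
        int n = Math.min(body.length, 16);
        StringBuilder hex = new StringBuilder("00000000");
        StringBuilder ascii = new StringBuilder();
        for (int i = 0; i < n; i++) {
            int b = body[i] & 0xFF; // treat the byte as unsigned
            hex.append(String.format(" %02X", b));
            // Non-printable bytes are shown as '.', as hexdump -C does.
            ascii.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        String suffix = body.length > 16 ? " ..." : "";
        return hex + " |" + ascii + "|" + suffix;
    }

    public static void main(String[] args) {
        byte[] text = "The quick brown fox".getBytes(StandardCharsets.US_ASCII);
        System.out.println(dump(text));

        // Bytes that are not valid UTF-8: decoding does not throw, it
        // substitutes U+FFFD, so the original bytes are lost in the log line.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        System.out.println(new String(binary, StandardCharsets.UTF_8));
        System.out.println(dump(binary));
    }
}
```

The hex form keeps every byte visible, whereas decoding a binary body as 
UTF-8 silently replaces the bytes it cannot map.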


> To be honest, I think this bug is an indication that we may be missing some 
> important type information in the system that one might want to use to 
> determine how to decode a given Event. So regardless of how we fix this bug 
> it ends up being kind of a band-aid. :) What do you think?

I agree with this for SimpleEvent, but SimpleEvent is meant to be a general 
tool. If you want to store higher-level data structures, you can define your 
own event type.
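
To make that concrete, here is a minimal sketch of an event type that knows 
its body is text and renders itself accordingly. LoggableEvent is a stand-in 
interface written for this example, and TextEvent and LoggerSinkSketch are 
hypothetical names; none of them are Flume's real classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sink: it only relies on toString(), so each event type
// decides how it is displayed.
final class LoggerSinkSketch {
    static String render(LoggableEvent event) {
        return String.valueOf(event);
    }

    public static void main(String[] args) {
        System.out.println(render(new TextEvent("<msg>hi</msg>")));
    }
}

// Stand-in for an event interface; an assumption for this sketch.
interface LoggableEvent {
    Map<String, String> getHeaders();
    byte[] getBody();
}

// Hypothetical event type whose body is known to be UTF-8 text (e.g. XML),
// so decoding it as a String in toString() is safe here.
final class TextEvent implements LoggableEvent {
    private final Map<String, String> headers = new HashMap<>();
    private final byte[] body;

    TextEvent(String text) {
        this.body = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public Map<String, String> getHeaders() {
        return headers;
    }

    @Override
    public byte[] getBody() {
        return body;
    }

    @Override
    public String toString() {
        return "Event: { headers:" + headers
                + " body:" + new String(body, StandardCharsets.UTF_8) + " }";
    }
}
```

A generic byte-array event would keep the hex rendering, while types like 
this one carry the knowledge of how their payload should be decoded.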

Thoughts?
                
> LoggerSink representation of the event's body isn't too useful
> --------------------------------------------------------------
>
>                 Key: FLUME-828
>                 URL: https://issues.apache.org/jira/browse/FLUME-828
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: NG alpha 1
>            Reporter: Will McQueen
>            Assignee: Brock Noland
>             Fix For: v1.1.0
>
>         Attachments: FLUME-828-0.patch, FLUME-828-1.patch
>
>
> LoggerSink logs entries to the console that look like this:
>      Event: { headers:{} body:[B@5c1ae90c }
> ...where the body is just "getClass().getName() + "@" + 
> Integer.toHexString(hashCode())". The getClass().getName() call will always 
> resolve to [B.
> The issue seems to be how we can represent a SimpleEvent's payload as a 
> String when the payload is some arbitrary byte array. The array's bytes 
> could represent encoded ASCII chars, encoded UTF-8 chars, or binary data such 
> as an encrypted payload. If we default to ASCII translation for everything, 
> then the resulting String won't be useful for binary payloads, since not all 
> 256 possible byte values have equivalent printable ASCII chars. Here's one idea:
> For each event body, we can print up to the first 16 bytes in hex format. If 
> there are >16 bytes, then print a "..." suffix at the end. The output would 
> look similar to what you get with unix "hexdump -C". Here's what a sample 
> output from LoggerSink would look like:
>      Event: { headers:{} body: 00000000 54 68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 |The quick brown | ... }
> ...where both the hex and the ASCII are displayed for the first 16 bytes.
> Is it the most useful representation of the body? Probably not. Is it at 
> least more useful than printing "[B@" + Integer.toHexString(hashCode())? I 
> think so.
> The commons io lib has a useful HexDump.dump method we can leverage.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
