[ 
https://issues.apache.org/jira/browse/FLUME-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214190#comment-13214190
 ] 

Mike Percy commented on FLUME-828:
----------------------------------

I've thought about this a bit today and I agree with what you're saying.

One remaining concern I have is the performance impact of adding a largish 
implementation of toString() to such a core class, which will be instantiated 
many times. If the event body is very large, the impact will be small, but if 
the event body is very small, it could be a significant percentage of the 
memory usage of each SimpleEvent object.
                
> LoggerSink representation of the event's body isn't too useful
> --------------------------------------------------------------
>
>                 Key: FLUME-828
>                 URL: https://issues.apache.org/jira/browse/FLUME-828
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: NG alpha 1
>            Reporter: Will McQueen
>            Assignee: Brock Noland
>             Fix For: v1.1.0
>
>         Attachments: FLUME-828-0.patch, FLUME-828-1.patch
>
>
> LoggerSink logs entries to the console that look like this:
>      Event: { headers:{} body:[B@5c1ae90c }
> ...where the body is just "getClass().getName() + "@" + 
> Integer.toHexString(hashCode())". The getClass().getName() call will always 
> resolve to [B.
> The issue seems to be how we can represent a SimpleEvent's payload as a 
> String when the payload is some arbitrary byte array... the array's bytes 
> could represent encoded ASCII chars, encoded UTF-8 chars, or binary data such 
> as an encrypted payload. If we default to ASCII translation for everything, 
> then the resulting String won't be useful for binary payloads, since not all 
> 256 possible byte values have equivalent printable ASCII chars. Here's one idea:
> For each event body, we can print up to the first 16 bytes in hex format. If 
> there are >16 bytes, then print a "..." suffix at the end. The output would 
> look similar to what you get with unix "hexdump -C". Here's what a sample 
> output from LoggerSink would look like:
>      Event: { headers:{} body: 00000000 54 68 65 20 71 75 69 63 6B 20 62 72 
> 6F 77 6E 20 |The quick brown | ... }
> ...where both the hex and the ASCII are displayed for the first 16 bytes.
> Is it the most useful representation of the body? Probably not. Is it at 
> least more useful than printing "[B@" + Integer.toHexString(hashCode())? I 
> think so.
> The Commons IO lib has a useful HexDump.dump method we can leverage.
> Thoughts?
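
The format proposed in the description can be sketched roughly as follows. 
This is a hand-rolled illustration only, not the code from the attached 
patches, and it does not use Commons IO's HexDump.dump as the description 
suggests. The class name EventBodyDump, the method dump(), and the choice to 
print '.' for non-printable bytes are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Flume): renders up to the first 16 bytes
// of an event body as hex plus printable ASCII, hexdump -C style.
final class EventBodyDump {
    private static final int MAX_BYTES = 16;

    static String dump(byte[] body) {
        if (body == null) {
            return "null";
        }
        int n = Math.min(body.length, MAX_BYTES);
        StringBuilder hex = new StringBuilder("00000000"); // offset column
        StringBuilder ascii = new StringBuilder();
        for (int i = 0; i < n; i++) {
            int b = body[i] & 0xFF; // treat the byte as unsigned
            hex.append(' ').append(String.format("%02X", b));
            // Assumption: non-printable bytes are shown as '.'
            ascii.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        String out = hex + " |" + ascii + "|";
        // Mark truncation when the body is longer than 16 bytes
        return body.length > MAX_BYTES ? out + " ..." : out;
    }

    public static void main(String[] args) {
        byte[] body = "The quick brown fox".getBytes(StandardCharsets.US_ASCII);
        System.out.println("Event: { headers:{} body: " + dump(body) + " }");
    }
}
```

Because the helper is static and works on at most 16 bytes, it holds no 
per-event state; whether that addresses the performance concern raised in the 
comment above would still need to be measured.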


        
