justinleet commented on issue #1341: METRON-614: Eliminate use of the default 
Charset
URL: https://github.com/apache/metron/pull/1341#issuecomment-493480413
 
 
   Ahh, good call; it's been long enough that I'd forgotten about that 
discussion. There should definitely be at least a README addition, which I'll add.
   
   Re: non-UTF-8 inbound data sets, that's potentially a fair problem, although 
I don't personally know under what circumstances non-UTF-8 string data would be 
coming in (maybe Latin-1?). It seems like the only real way to deal with this 
is to make the charset configurable at the parser level; otherwise, mixing 
incoming charset encodings is a problem (which I think it would be right now 
if everything is just using the platform default, right? Double-check my 
thinking on that). The parser itself then just reads with whatever character 
encoding it has been configured with.
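
   To illustrate why the decoding charset matters, here is a minimal sketch (not Metron code; the class and method names are made up for illustration) that decodes the same bytes two ways. Bytes written as UTF-8 come back intact under UTF-8 but are silently mangled under Latin-1, which is the kind of mismatch a platform-default charset can cause.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Metron.
class CharsetMismatchDemo {

    // Decode raw bytes as UTF-8.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode the same raw bytes as Latin-1 (ISO-8859-1).
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // "caf\u00e9" is "cafe" with an acute accent on the e; in UTF-8 the
    // accented letter takes two bytes (0xC3 0xA9).
    static byte[] sampleUtf8Bytes() {
        return "caf\u00e9".getBytes(StandardCharsets.UTF_8);
    }
}
```

   Decoding `sampleUtf8Bytes()` with `decodeLatin1` turns the two-byte accented letter into two separate characters, with no exception raised, so the corruption would go unnoticed downstream.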
   
   At that point, something like GrokParser would need to do something like 
`new InputStreamReader(commonInputStream, getEncoding());` or similar.
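
   A rough sketch of what that could look like, under stated assumptions: the class name `EncodingAwareReader`, the config key `"encoding"`, and the `configure`/`readFirstLine` methods are all hypothetical and are not existing Metron APIs. Only `getEncoding()` and `commonInputStream` come from the suggestion above. It falls back to UTF-8 when nothing is configured.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical stand-in for a parser; not the real GrokParser.
class EncodingAwareReader {

    // Assumed config key, chosen for illustration only.
    static final String ENCODING_KEY = "encoding";

    // Default to UTF-8 rather than the platform default charset.
    private Charset encoding = StandardCharsets.UTF_8;

    // Read the charset name from the parser-level config, if present.
    // Charset.forName throws an unchecked exception for unknown or illegal names.
    public void configure(Map<String, ?> parserConfig) {
        Object name = parserConfig.get(ENCODING_KEY);
        if (name != null) {
            encoding = Charset.forName(name.toString());
        }
    }

    public Charset getEncoding() {
        return encoding;
    }

    // Read the first line of the stream using the configured charset.
    public String readFirstLine(InputStream commonInputStream) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(commonInputStream, getEncoding()))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

   With this shape, a sensor that receives Latin-1 data would set `"encoding"` to `"ISO-8859-1"` in its parser config, while every other parser keeps reading UTF-8 regardless of the JVM's platform default.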

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
