justinleet commented on issue #1341: METRON-614: Eliminate use of the default Charset
URL: https://github.com/apache/metron/pull/1341#issuecomment-493480413

Ahh, good call, it's been long enough that I'd forgotten about that discussion. At a minimum there should be a README addition, which I'll add.

Re: non-UTF-8 inbound data sets, that's potentially a fair problem, although I don't personally know under what circumstances non-UTF-8 string data would be coming in (maybe Latin-1?). It seems like the only real way to deal with this is to make the encoding configurable at the parser level; otherwise, mixing incoming charset encodings is a problem. (I think it would be a problem right now too, if everything is just using the platform default, right? Double-check my thinking on that.) The parser itself then just reads with whatever character encoding is configured. At that point, something like GrokParser would need to do something like `new InputStreamReader(commonInputStream, getEncoding());` or similar.
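
To make the parser-level idea concrete, here is a minimal sketch of what it might look like. It is not Metron's actual parser API: the class name `EncodingAwareReader`, the `encoding` config key, and the UTF-8 fallback are all assumptions made for illustration. Only `InputStreamReader(InputStream, Charset)` and `Charset.forName` are standard JDK calls.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for a parser; not an existing Metron class.
public class EncodingAwareReader {
  // Hypothetical config key; not an existing Metron setting.
  static final String ENCODING_KEY = "encoding";

  // Assumed fallback: an explicit UTF-8 rather than the platform default.
  private Charset encoding = StandardCharsets.UTF_8;

  // Reads the charset name from the parser config, if one is present.
  public void configure(Map<String, Object> parserConfig) {
    Object name = parserConfig.get(ENCODING_KEY);
    if (name != null) {
      encoding = Charset.forName(name.toString());
    }
  }

  public Charset getEncoding() {
    return encoding;
  }

  // Decodes the stream with the configured charset and returns its first line.
  public String readFirstLine(InputStream commonInputStream) {
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(commonInputStream, getEncoding()))) {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // "caf\u00e9" encoded as Latin-1: the final byte 0xE9 is not valid UTF-8 on its own.
    byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

    EncodingAwareReader parser = new EncodingAwareReader();
    parser.configure(Collections.<String, Object>singletonMap(ENCODING_KEY, "ISO-8859-1"));
    String line = parser.readFirstLine(new ByteArrayInputStream(latin1));
    System.out.println("caf\u00e9".equals(line)); // prints true
  }
}
```

With no `encoding` entry the same bytes would be decoded as UTF-8 and the last character would not round-trip, which is the mixed-encoding problem described above, just with an explicit default instead of the platform one.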
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services