[ 
https://issues.apache.org/jira/browse/METRON-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Miklavcic reassigned METRON-614:
----------------------------------------

    Assignee: Michael Miklavcic

> Eliminate use of the default Charset
> ------------------------------------
>
>                 Key: METRON-614
>                 URL: https://issues.apache.org/jira/browse/METRON-614
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Justin Leet
>            Assignee: Michael Miklavcic
>            Priority: Minor
>          Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> During work on METRON-612, I noticed a lot of our warnings are from implicit 
> use of the default Charset (e.g. when creating InputStreamReaders).  Given 
> that this value is platform dependent, we should at least consider explicitly 
> using a particular Charset.
> In addition, after fixing this, I would like to see it upgraded to a compile 
> error, because in my experience almost everyone tends to use methods that use 
> the implicit default Charset.  This can be done by changing the compiler arg 
> in the pom to instead be
> {code}
> <compilerArgs>
>   <arg>-Xlint:unchecked</arg>
>   <arg>-Xep:DefaultCharset:ERROR</arg>
> </compilerArgs>
> {code}
> My assumption is that we'll want UTF-8, but given a lack of expertise and 
> that usually there's a lot of opinions on character set issues, we may want 
> to consider other options.
> http://errorprone.info/bugpattern/DefaultCharset
> https://docs.oracle.com/javase/8/docs/api/java/nio/charset/Charset.html lists 
> standard Charsets and provides more details.
> {code}
> Charset       Description
> US-ASCII      Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block 
> of the Unicode character set
> ISO-8859-1    ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
> UTF-8 Eight-bit UCS Transformation Format
> UTF-16BE      Sixteen-bit UCS Transformation Format, big-endian byte order
> UTF-16LE      Sixteen-bit UCS Transformation Format, little-endian byte order
> UTF-16        Sixteen-bit UCS Transformation Format, byte order identified by 
> an optional byte-order mark
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to