Like Ulf, I am sometimes annoyed by the use of the "character" misnomer throughout the API docs, and would support an effort to use "character" the way that unicode.org uses it. "char" no longer represents a Unicode character, but at least it provides a short clear name, in the Java language, for "UTF-16 code unit" - if we use it consistently! https://unicode.org/faq/utf_bom.html#utf16-1
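For anyone following along, here is a small, purely illustrative sketch (not proposed spec text) of why "char" and "Unicode character" diverge: a supplementary character outside the BMP occupies two chars, i.e. two UTF-16 code units.

    // U+1F600 is one Unicode character (code point) but two chars (a surrogate pair).
    String s = "\uD83D\uDE00";
    System.out.println(s.length());                        // 2 -> counts chars (code units)
    System.out.println(s.codePointCount(0, s.length()));   // 1 -> counts code points
    System.out.println(Character.charCount(0x1F600));      // 2 -> chars needed for this code point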
On Thu, Sep 26, 2019 at 2:24 PM <mark.reinh...@oracle.com> wrote:
> 2019/9/24 13:00:21 -0700, ulf.zi...@cosoco.de:
> > Am 21.09.19 um 00:03 schrieb mark.reinh...@oracle.com:
> >> To avoid this confusion, a more verbose specification might read:
> >>  * Returns the maximum number of $otype$s that will be produced for each
> >>  * $itype$ of input.  This value may be used to compute the worst-case size
> >>  * of the output buffer required for a given input sequence.  This value
> >>  * accounts for any necessary content-independent prefix or suffix
> >> #if[encoder]
> >>  * $otype$s, such as byte-order marks.
> >> #end[encoder]
> >> #if[decoder]
> >>  * $otype$s.
> >> #end[decoder]
> >
> > wouldn't it be more clear to use "char" or even "{@code char}" instead of
> > "character" as replacement for the $xtype$ parameters?
>
> The specifications of the Charset{De,En}coder classes make it clear
> up front that “character” means “sixteen-bit Unicode character,” so
> I don’t think changing “character” everywhere to “{@code char}” is
> necessary.
>
> This usage of “character” is common throughout the API specification.
> With the introduction of 32-bit Unicode characters we started calling
> those “code points,” but kept on calling sixteen-bit characters just
> “characters.”  (I don’t think the official term “Unicode code unit”
> ever caught on, and it’s a bit of a mouthful anyway.)
>
> - Mark
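As a concrete (illustrative, not normative) reading of the proposed wording, this is how maxBytesPerChar() is typically used to size a worst-case output buffer; note that input.length() counts chars (UTF-16 code units), not code points, which is exactly why the per-$itype$ wording matters. The class name and sample string below are just examples.

    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetEncoder;

    public class WorstCaseBuffer {
        public static void main(String[] args) {
            CharsetEncoder enc = Charset.forName("UTF-8").newEncoder();
            String input = "caf\u00E9 \uD83D\uDE00";   // mixes BMP and supplementary characters

            // maxBytesPerChar() is bytes per char (code unit), so multiply by the char count.
            int worstCase = (int) Math.ceil(input.length() * enc.maxBytesPerChar());
            ByteBuffer out = ByteBuffer.allocate(worstCase);

            enc.encode(CharBuffer.wrap(input), out, true);
            enc.flush(out);
            System.out.println("allocated " + worstCase + " bytes, used " + out.position());
        }
    }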