Jacob Pratt <[email protected]> writes:
> Using .width x along with .mode columns, any non-ASCII character isn't
> counted, causing the column to shrink by one.
>
> I *think* my analysis is correct, but it also might be counted multiple
> times by taking a naïve approach and just counting the number of bytes
> (UTF-8 has multi-byte characters).
Yep, it uses bytes:
#if defined(_WIN32) || defined(WIN32)
...
#elif !defined(utf8_printf)
# define utf8_printf fprintf
#endif
...
static int shell_callback(
...
    case MODE_Column: {
...
        if( w<0 ){
          utf8_printf(p->out,"%*.*s%s",-w,-w,
              azArg[i] ? azArg[i] : p->nullValue,
              i==nArg-1 ? rowSep : " ");
        }else{
          utf8_printf(p->out,"%-*.*s%s",w,w,
              azArg[i] ? azArg[i] : p->nullValue,
              i==nArg-1 ? rowSep : " ");
        }
And fprintf counts bytes, not characters.
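
To see the effect in isolation, here is a minimal sketch that applies the same "%-*.*s" format to a buffer. The name format_cell_bytes is hypothetical and only used for this demonstration; it is not part of shell.c.

```c
#include <stdio.h>

/* Format zArg the way the w>0 branch above does: left-justified, with both
** the field width and the precision set to w. printf measures both in bytes.
** Returns the number of bytes written (excluding the terminating NUL).
** format_cell_bytes is a hypothetical name, not a function from shell.c. */
int format_cell_bytes(char *zBuf, size_t nBuf, const char *zArg, int w){
  return snprintf(zBuf, nBuf, "%-*.*s", w, w, zArg);
}
```

With w=6, "naive" gets one space of padding, while "na\xc3\xafve" (the UTF-8 encoding of "naïve") already occupies 6 bytes and gets none, so it displays one column short. With w=3 the same string is cut after the byte 0xC3, in the middle of the two-byte sequence.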
I'd also like to note that, besides multi-byte characters, there is the
opposite problem: double-width characters. E.g.
あ U+3042 HIRAGANA LETTER A
Its UTF-8 representation takes 3 bytes but occupies *two* column positions.
See wcwidth(3) and wcswidth(3).
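
As a rough, locale-independent illustration of what a column count has to do, the sketch below decodes UTF-8 and treats a handful of East Asian ranges as double width. The function names and the range table are my own assumptions and only approximate what wcwidth(3) reports; combining and control characters are not handled. Real code would more likely convert with mbstowcs(3) under a UTF-8 locale and call wcswidth(3).

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence at s into *cp and return the bytes consumed.
** Malformed input is consumed one byte at a time as U+FFFD.
** Hypothetical helper for illustration only. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp){
  size_t need, i;
  if( s[0]<0x80 ){ *cp = s[0]; return 1; }
  else if( (s[0]&0xE0)==0xC0 ){ *cp = s[0]&0x1F; need = 1; }
  else if( (s[0]&0xF0)==0xE0 ){ *cp = s[0]&0x0F; need = 2; }
  else if( (s[0]&0xF8)==0xF0 ){ *cp = s[0]&0x07; need = 3; }
  else{ *cp = 0xFFFD; return 1; }
  for(i=1; i<=need; i++){
    /* A NUL or any non-continuation byte ends the sequence early. */
    if( (s[i]&0xC0)!=0x80 ){ *cp = 0xFFFD; return 1; }
    *cp = (*cp<<6) | (s[i]&0x3F);
  }
  return need+1;
}

/* Simplified double-width test: an assumed, incomplete range table. */
static int is_wide(uint32_t cp){
  return (cp>=0x1100 && cp<=0x115F)                /* Hangul Jamo */
      || (cp>=0x2E80 && cp<=0xA4CF && cp!=0x303F)  /* CJK, Hiragana, ... */
      || (cp>=0xAC00 && cp<=0xD7A3)                /* Hangul syllables */
      || (cp>=0xF900 && cp<=0xFAFF)                /* CJK compatibility */
      || (cp>=0xFE30 && cp<=0xFE6F)                /* CJK compat. forms */
      || (cp>=0xFF00 && cp<=0xFF60)                /* fullwidth forms */
      || (cp>=0xFFE0 && cp<=0xFFE6);
}

/* Number of terminal columns z would occupy under the assumptions above. */
int utf8_display_width(const char *z){
  const unsigned char *s = (const unsigned char*)z;
  int cols = 0;
  while( *s ){
    uint32_t cp;
    s += utf8_decode(s, &cp);
    cols += is_wide(cp) ? 2 : 1;
  }
  return cols;
}
```

Padding would then be computed as w minus utf8_display_width(z), rather than left to the printf field width.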
BTW, truncating UTF-8 in the middle of a multi-byte sequence, as fprintf
would do, can produce confusing mojibake.
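
One way to avoid the split, sketched here under the assumption that the input is valid UTF-8, is to back the cut point up until it no longer lands on a continuation byte (10xxxxxx). The helper name utf8_safe_prefix is hypothetical.

```c
#include <string.h>

/* Return the largest byte count n <= nMax such that the first n bytes of z
** do not end in the middle of a multi-byte UTF-8 sequence.
** Hypothetical helper; assumes z is valid, NUL-terminated UTF-8. */
size_t utf8_safe_prefix(const char *z, size_t nMax){
  size_t n = strlen(z);
  if( n<=nMax ) return n;           /* whole string fits, nothing to cut */
  n = nMax;
  /* If the first dropped byte is a continuation byte, the cut splits a
  ** character, so move back to the start of that character. */
  while( n>0 && ((unsigned char)z[n] & 0xC0)==0x80 ) n--;
  return n;
}
```

The result can be passed as the precision, e.g. printf("%.*s", (int)utf8_safe_prefix(z, w), z). This still limits by bytes, not columns, so it only addresses the mojibake, not the padding.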
Whether or not it gets fixed (it is definitely annoying to implement),
the current deficiencies should at least be documented.
cat >>sqlite3.1
.SH KNOWN BUGS
.B .mode column
does not handle multibyte or double-width characters correctly. Patches are welcome.