On Mon, May 25, 2026 at 9:04 PM Branko Čibej <[email protected]> wrote:
I took another look at how 'svn blame' aligns its output.
static svn_error_t *
print_line_info(svn_stream_t *out,
svn_revnum_t revision,
const char *author,
const char *date,
const char *path,
svn_boolean_t verbose,
int rev_maxlength,
apr_pool_t *pool)
{
const char *time_utf8;
const char *time_stdout;
const char *rev_str;
rev_str = SVN_IS_VALID_REVNUM(revision)
? apr_psprintf(pool, "%*ld", rev_maxlength,
revision)
: apr_psprintf(pool, "%*s", rev_maxlength, "-");
if (verbose)
{
if (date)
{
SVN_ERR(svn_cl__time_cstring_to_human_cstring(&time_utf8,
date, pool));
SVN_ERR(svn_cmdline_cstring_from_utf8(&time_stdout, time_utf8,
pool));
Converts timestamp to locale encoding ...
}
else
{
/* ### This is a 44 characters long string. It assumes the current
format of svn_time_to_human_cstring and also 3 letter
abbreviations for the month and weekday names. Else, the
line contents will be misaligned. */
time_stdout = " -";
}
SVN_ERR(svn_stream_printf(out, pool, "%s %10s %s ", rev_str,
author ? author : " -",
time_stdout));
But author remains in UTF-8? The author name is extracted from
properties; I don't recall whether we enforce UTF-8 in svn:author. I
know that we do in svn:log.
if (path)
SVN_ERR(svn_stream_printf(out, pool, "%-14s ", path));
And so does the path? The blame-receiver's docstring says nothing
about that.
}
else
{
return svn_stream_printf(out, pool, "%s %10.10s ", rev_str,
author ? author : " -");
}
return SVN_NO_ERROR;
}
I guess most of the time, locale encoding is UTF-8 or some other
Unicode format that's lossless. Otherwise I can't imagine how this
could work correctly, in general.
What am I missing?
I think all APIs should assume UTF-8 strings (with certain exceptions,
such as svn_utf.h itself).
The real question, though, is what happens in practice. Both paths and
property values end up stored on disk as binary blobs, so technically
they could contain anything. But I assume that if a path weren't
UTF-8/ASCII, FSFS wouldn't parse it properly, which would lead to a
corrupted repository.
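
For what it's worth, checking whether such a blob is well-formed UTF-8
is cheap. Below is a minimal standalone sketch of that check. It is not
Subversion's own validation code, and the name is_valid_utf8 is made up
for illustration; it only shows what "valid UTF-8" means at the byte
level.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the Subversion API.
   Return 1 if the LEN bytes at DATA are well-formed UTF-8, else 0.
   Rejects overlong encodings, UTF-16 surrogates and code points
   above U+10FFFF. */
int
is_valid_utf8(const unsigned char *data, size_t len)
{
  size_t i = 0;

  while (i < len)
    {
      unsigned char c = data[i];
      size_t extra;      /* continuation bytes expected after C */
      unsigned long cp;  /* code point decoded so far */
      unsigned long min; /* smallest code point legal at this length */
      size_t k;

      if (c < 0x80)
        {
          i++;           /* plain ASCII byte */
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { extra = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { extra = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { extra = 3; cp = c & 0x07; min = 0x10000; }
      else
        return 0;        /* stray continuation or invalid lead byte */

      if (len - i <= extra)
        return 0;        /* sequence is cut off at the end of the blob */

      for (k = 1; k <= extra; k++)
        {
          if ((data[i + k] & 0xC0) != 0x80)
            return 0;    /* not a continuation byte */
          cp = (cp << 6) | (data[i + k] & 0x3F);
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;        /* overlong, out of range, or surrogate */

      i += extra + 1;
    }

  return 1;
}
```

With this, a UTF-8 author name such as "\xC4\x8C" "ibej" passes, while
the same name in Latin-2 ("\xC8" "ibej") is rejected, because 0xC8 is a
lead byte that is not followed by a continuation byte.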