Re: Semantics of toString

Denis Koroskin Tue, 10 Nov 2009 09:20:45 -0800

On Tue, 10 Nov 2009 15:30:20 +0300, Don <nos...@nospam.com> wrote:

Bill Baxter wrote:
On Tue, Nov 10, 2009 at 2:51 AM, Don <nos...@nospam.com> wrote:
Lutger wrote:
Justin Johansson wrote:
Lutger Wrote:
Justin Johansson wrote:
I assert that the semantics of "toString" or similarly named/purposed
methods/functions in many PL's (including and not limited to D) is
ill-defined.
To put this statement into perspective, I would be most appreciative of
D NG readers responding with their own idea(s) of what the semantics of
"toString" are (or should be) in a language agnostic ideology.
My other reply didn't take the language agnostic into account, sorry.
Semantics of toString would depend on the object, I would think there are
three general types of objects:

1. objects with only one sensible or one clear default string
representation, like integers. Maybe even none of these exist (except
strings themselves?)
2. objects that, given some formatting options or locale, have a clear
string representation: floating points, dates, currency and the like.

3. objects that have no sensible default representation.
toString() would not make sense for 3) type objects and only for 2) type
objects as part of a formatting / localization package.
toString() as a debugging aid sometimes doubles as a formatter for 1) and
2) class objects, but that may be more confusing than it's worth.
Thanks for that Lutger.
Do you think it would make better sense if programming languages/their
libraries separated functions/methods which are currently loosely purposed
as "toString" into methods which are more specific to the types you
suggest (leaving only the types/classifications and number thereof to
argue about)?

In my own D project, I've introduced a toDebugString method and left
toString alone. There are times when I like D's default toString printing
out the name of the object class. For debug purposes there are times also
when I like to see a string printed out in quotes so you can tell the
difference between "123" and 123. Then again, and since I'm working on a
scripting language, sometimes I like to see debug output distinguish
between different numeric types.
Anyway, going by the replies on this topic, it looks like most people view
toString as being good for debug purposes and that's about it.

Cheers
Justin
Your design makes better sense (to me at least) because it is based on why
you want a string from some object.
Take .NET for example: it does provide very elaborate and nice formatting
options based on toString() with parameters. For some types, however, the
default toString() gives you the name of the type itself, which is in no way
related to formatting an object. You learn to work with it, but I find it a
bit muddled.
As a last note, I think people view toString as a debug thing mostly
because it is very underpowered.
There is a definite use for such a thing. But the existing toString() is
much, much worse than useless. People think you can do something with it,
but you can't.
eg, people have asked for BigInt to support toString(). That is an
over-my-dead-body.
You can definitely do something with it -- printf debugging. And if I
were using BigInt, that's exactly why I'd want BigInt to have a
toString.
I almost always want to print the value out in hex. And with some kind of
digit separators, so that I can see how many digits it has.
Just out of curiosity, how does someone print out the
value of a BigInt right now?
In Tango, there's just .toHex() and .toDecimalString(). It needs proper
formatting options; that's the biggest thing which isn't done. I hit one
too many compiler segfaults and started patching the compiler instead
<g>. But I really want a decent toString().
Given a BigInt n, you should be able to just do

writefln("%s %x", n, n);  // Phobos
formatln("{0} {0:X}", n); // Tango
To solve this part of the issue, it would be enough to have toString()
take a string parameter (it would be "x" or "X" in this case).
string toString(string fmt);
But the performance would still be very poor, and that's much more
difficult to solve.
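
A minimal sketch of the format-parameter idea, assuming a hypothetical Number struct standing in for BigInt and assuming single-character format codes ("s"/"d" for decimal, "x"/"X" for hex). It still allocates a new string on every call, which is the performance problem mentioned above:

```d
import std.format : format;

// Hypothetical stand-in for a numeric type such as BigInt.
struct Number
{
    ulong value;

    // fmt is assumed to be one conversion character:
    // "s" or "d" for decimal, "x" or "X" for hexadecimal.
    string toString(string fmt = "s") const
    {
        switch (fmt)
        {
            case "x":      return format("%x", value);
            case "X":      return format("%X", value);
            case "s", "d": return format("%d", value);
            default:       throw new Exception("unsupported format: " ~ fmt);
        }
    }
}

unittest
{
    auto n = Number(255);
    assert(n.toString() == "255");
    assert(n.toString("x") == "ff");
    assert(n.toString("X") == "FF");
}
```

The unittest block runs when the file is compiled with -unittest.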


Yes, it would solve half of the toString problems.

Another part (i.e. memory allocation) could be solved by providing an
optional buffer to toString:

// "s" comes from %s, which is the default qualifier
char[] toString(string format = "s", char[] buffer = null)
{
    // operate on the buffer, possibly resizing it
    // which is safe and fast - it only allocates
    // when *really* necessary, instead of always, as now
    return buffer;
}
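
A filled-in sketch of that signature, assuming a hypothetical Number struct and handling only decimal output for brevity (the format argument is accepted but ignored here). It writes into the caller's buffer when the digits fit and allocates only when they do not:

```d
// Hypothetical example type; only the buffer handling matters here.
struct Number
{
    ulong value;

    // Fills `buffer` with the decimal digits when they fit and returns the
    // filled slice; allocates a new array only when `buffer` is too small.
    char[] toString(string fmt = "s", char[] buffer = null) const
    {
        char[20] tmp;                 // ulong.max has 20 decimal digits
        size_t pos = tmp.length;
        ulong v = value;
        do
        {
            tmp[--pos] = cast(char)('0' + v % 10);
            v /= 10;
        } while (v != 0);

        const digits = tmp[pos .. $];
        if (buffer.length < digits.length)
            buffer = new char[digits.length]; // the only allocation
        buffer[0 .. digits.length] = digits[];
        return buffer[0 .. digits.length];
    }
}

unittest
{
    auto n = Number(12345);

    char[32] stack;
    char[] inPlace = n.toString("s", stack[]);
    assert(inPlace == "12345");
    assert(inPlace.ptr is stack.ptr);   // built in the caller's buffer

    char[2] tiny;
    char[] grown = n.toString("s", tiny[]);
    assert(grown == "12345");
    assert(grown.ptr !is tiny.ptr);     // too small, so a new array was used
}
```

Comparing the returned pointer with the buffer's pointer is how the caller learns whether the string was built in place.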

You can use it almost the same way you used it before:

// assumeUnique is needed because we return a mutable string now
string s = assumeUnique(someObject.toString());


Optimization example:

int sprintf(string format, ...)
{
    char[512] preallocatedBuffer;
    char[] buffer = preallocatedBuffer[]; // buffer may grow, but
    // initially points to a preallocatedBuffer

    char[] storage = buffer[]; // storage for a current element

    ...
    for (...) { // iterate over qualifiers (and arguments)
        string currentQualifier = format[i..j];
        auto currentArgument = argsTuple[n];

        char[] result = currentArgument.toString(currentQualifier, storage);
        if (result.ptr is storage.ptr) {
            // okay, string was constructed in-place
            storage = storage[result.length..$];
        } else {
            // storage didn't have enough space for the whole
            // string (a reallocation occurred)

            size_t offset = buffer.length - storage.length;

            // increase the capacity until the result fits
            while (buffer.length - offset < result.length)
                buffer.length *= 2;

            // append our string to the buffer
            buffer[offset..offset+result.length] = result[];

            // renew the temporary storage: the unused tail of the buffer
            storage = buffer[offset+result.length..$];
        }
    }
    ...
}
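
A compact, compilable variant of the same pattern, as a sketch only: Number and concat are hypothetical names, and the element type is assumed to follow the toString(format, buffer) convention proposed above. Each element is formatted straight into the unused tail of the buffer, and the buffer grows only when an element does not fit:

```d
// Hypothetical element type following the toString(fmt, buffer) convention.
struct Number
{
    ulong value;

    char[] toString(string fmt, char[] buffer) const
    {
        char[20] tmp;                 // ulong.max has 20 decimal digits
        size_t pos = tmp.length;
        ulong v = value;
        do { tmp[--pos] = cast(char)('0' + v % 10); v /= 10; } while (v != 0);
        auto n = tmp.length - pos;
        if (buffer.length < n) buffer = new char[n]; // did not fit
        buffer[0 .. n] = tmp[pos .. $];
        return buffer[0 .. n];
    }
}

// Concatenates the string forms of `items` into `buffer` and returns the
// filled part. Allocates only when an element does not fit.
char[] concat(const Number[] items, char[] buffer)
{
    size_t used = 0;
    foreach (item; items)
    {
        char[] storage = buffer[used .. $];
        char[] result = item.toString("s", storage);
        if (result.ptr !is storage.ptr)
        {
            // The element was built elsewhere: grow the buffer, copy it in.
            auto needed = used + result.length;
            size_t newLength = buffer.length ? buffer.length : 16;
            while (newLength < needed) newLength *= 2;
            buffer.length = newLength;   // reallocates, keeps old contents
            buffer[used .. needed] = result[];
        }
        used += result.length;
    }
    return buffer[0 .. used];
}

unittest
{
    char[8] stack;

    char[] s = concat([Number(1), Number(22), Number(333)], stack[]);
    assert(s == "122333");
    assert(s.ptr is stack.ptr);       // everything fit, no allocation

    char[] t = concat([Number(12345), Number(67890)], stack[]);
    assert(t == "1234567890");
    assert(t.ptr !is stack.ptr);      // the buffer had to grow
}
```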

Another example:

class Array(T)
{
    // ...
    private T[] elements;

    char[] toString(string format = "s", char[] buffer = null) {
        // reallocates when no space is left
        auto builder = StringBuilder(buffer);

        builder.append("[");
        foreach (i, o; elements) {
            if (i > 0) builder.append(", "); // separator

            buffer = builder.getBuffer()[builder.length..$];
            char[] result = o.toString(format, buffer);
            if (result.ptr is buffer.ptr) {
                // no reallocation
                builder.length += result.length; // without copying
            } else {
                builder.append(result);
            }
        }

        builder.append("]");

        return builder.toString();
    }
}

auto array = new Array!(int);
array ~= [0, 1, 2, 3, 4];
assert(array.toString() == "[0, 1, 2, 3, 4]");

It's not very easy to take advantage of, but it's usable the old way
(well, almost).


Any ideas?
