On Thursday, 22 August 2013 at 14:48:57 UTC, Dicebot wrote:
> On Thursday, 22 August 2013 at 03:13:46 UTC, Tyler Jameson Little wrote:
>> On Wednesday, 21 August 2013 at 20:21:49 UTC, Dicebot wrote:
>>> It should be a range of strings - one call to popFront should serialize one object from the input object range and provide a matching string buffer.
>>
>> I don't like this because it still caches the whole object in memory. In a memory-restricted application, this is unacceptable.
>
> Well, in memory-restricted applications, having a large object at all is unacceptable. The rationale is that you hardly ever want a half-deserialized object. If the environment is very restrictive, smaller objects will be used anyway (a list of smaller objects).

It seems you and I are trying to solve two very different problems. Perhaps if I explain my use-case, it'll make things clearer.

I have a server that deserializes data from a socket, processes that data, then updates internal state and sends notifications to clients (which involves serialization as well).

When new clients connect, they need all of this internal state, so the easiest way to do this is to create one large object out of all of the smaller objects:

    class Widget {
    }

    class InternalState {
        Widget[string] widgets;
        // ... other data here
    }

InternalState isn't very big by itself; it just has an associative array of Widget references plus some other rather small data. When serialized, however, it can get quite large, because every Widget it refers to is written out in full. Since archive formats are typically far less compact than the in-memory representation, caching the archived version of the internal state can be prohibitively expensive.

Let's say the serialized form of the internal state is 5MB, and I have 128MB available, while 50MB or so is used by the application. This leaves about 70MB. If every connecting client needs its own 5MB buffer while the state is sent, I can only support 14 clients at once (70MB / 5MB).

With a streaming serializer (per object), I can get that 5MB per client down to a few hundred KB and support many more clients.
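
To make the per-object idea concrete, here is a minimal sketch. It is not a proposed std.serialization API: the Widget fields, the `serializeStreaming` function, the sink delegate signature and the toy `key=name:value` text format are all assumptions made up for illustration. The point is only that one small buffer is reused for every Widget, so peak memory tracks the largest single Widget rather than the whole state.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : writeln;

class Widget
{
    string name; // hypothetical field, for illustration only
    int value;   // hypothetical field, for illustration only

    this(string name, int value)
    {
        this.name = name;
        this.value = value;
    }
}

class InternalState
{
    Widget[string] widgets;
}

// Hypothetical streaming serializer: emits one chunk per Widget.
// The chunk passed to `sink` is only valid during the call, because
// the same buffer is cleared and reused for the next Widget.
void serializeStreaming(InternalState state,
                        scope void delegate(const(char)[]) sink)
{
    auto buf = appender!(char[])();
    foreach (key, w; state.widgets)
    {
        buf.clear(); // reuse the buffer instead of growing it
        buf.formattedWrite("%s=%s:%d\n", key, w.name, w.value);
        sink(buf.data);
    }
}

void main()
{
    auto state = new InternalState;
    state.widgets["a"] = new Widget("first", 1);
    state.widgets["b"] = new Widget("second", 2);

    size_t total, largest;
    serializeStreaming(state, (const(char)[] chunk) {
        // A real server would write `chunk` to the client's socket here.
        total += chunk.length;
        if (chunk.length > largest)
            largest = chunk.length;
    });
    writeln("total bytes: ", total, ", largest chunk: ", largest);
}
```

With this shape, the memory held per client is bounded by `largest`, not `total`, which is the difference between the 5MB and the few hundred KB figures above.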

...
>> There's no reason why the serializer can't output this in chunks
>
> Outputting on its own is not useful to discuss - in a pipe model, output matches input. What is the point of outputting partial chunks of a serialized object if you still need to provide it as a whole to the input?

This only makes sense if you are deserializing right after serializing, which is *not* a common thing to do.

Also, it's much more likely that you need to serialize a single object (as in a REST API, a 3D model parser [think COLLADA] or a config parser). Providing a range seems to fit only a small niche: people who need to dump the state of the system. With single-object serialization and chunked output, you can define your own range to get the same effect, but with an API like the one you detailed, you can't avoid the memory problems without going outside std.
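
As a rough sketch of that last point: given a single-object serializer with chunked output, the range-of-strings behaviour can be layered on top in a few lines. Everything here is hypothetical (the `Point` type, the `serialize` and `serializeWhole` functions, the sink signature and the JSON-like output); it only shows the direction of the composition, using `std.algorithm.map` as the range rather than a hand-written one.

```d
import std.algorithm : map;
import std.array : appender;
import std.conv : to;
import std.stdio : writeln;

struct Point { int x, y; } // hypothetical payload type

// Hypothetical single-object serializer with chunked output.
void serialize(Point p, scope void delegate(const(char)[]) sink)
{
    sink(`{"x":`);
    sink(p.x.to!string);
    sink(`,"y":`);
    sink(p.y.to!string);
    sink(`}`);
}

// Callers who want one string per object collect the chunks themselves.
string serializeWhole(Point p)
{
    auto buf = appender!string();
    serialize(p, (const(char)[] chunk) { buf.put(chunk); });
    return buf.data;
}

void main()
{
    auto points = [Point(1, 2), Point(3, 4)];

    // Lazy range of strings: each element is serialized when it is read.
    auto serialized = points.map!serializeWhole;
    foreach (s; serialized)
        writeln(s);
}
```

Going the other way is not possible: if the library only hands out whole-object strings, the caller has no way to get at the chunks.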
