Eric Blake <ebl...@redhat.com> writes:

> On 10/29/2015 06:44 AM, Markus Armbruster wrote:
>> Commit 29c75dd "json-streamer: limit the maximum recursion depth and
>> maximum token count" attempts to guard against excessive heap usage by
>> limiting total token size (it says "token count", but that's a lie).
>> 
>> Total token size is a rather imprecise predictor of heap usage: many
>> small tokens use more space than few large tokens with the same input
>> size, because there's a constant per-token overhead.
>> 
>> Tighten this up: limit the token count to 128Ki.
>> 
>> If you think 128Ki is too stingy: check-qjson's large_dict test eats a
>> sweet 500MiB and pegs a core for four minutes on my machine to parse
>> ~100K tokens.  Absurdly wasteful.
>
> Sounds like we have some quadratic (or worse) scaling in the parser.
> Worth fixing some day, but I also agree that we don't have to tackle it
> in this series.

I believe it's linear with a criminally negligent constant (several KiB
per token).  The first hog is actually fairly obvious: we use one QDict
per token.
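
To illustrate, the streamer's per-token bookkeeping is roughly the
following (a simplified sketch, not the exact code; field and macro
names approximate):

    /* One QDict per token: the token type, the token text, and the
     * source position each become a separate heap object.
     */
    dict = qdict_new();
    qdict_put(dict, "type", qint_from_int(type));   /* token type */
    qdict_put(dict, "token", token);                /* token text (QString) */
    qdict_put(dict, "x", qint_from_int(x));         /* column */
    qdict_put(dict, "y", qint_from_int(y));         /* line */
    qlist_append(parser->tokens, dict);

    parser->token_size += token->length;

The patch then caps the number of tokens next to the existing size and
nesting limits, along these lines:

    /* MAX_TOKEN_COUNT is the new limit (128Ki per the commit
     * message); the size and nesting limits predate this patch.
     */
    if (parser->token_size > MAX_TOKEN_SIZE ||
        parser->token_count > MAX_TOKEN_COUNT ||
        parser->bracket_count + parser->brace_count > MAX_NESTING) {
        /* drop the buffered tokens and report an error */
        goto out_emit;
    }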

> I'm assuming you temporarily patched check-qjson to use larger constants
> when you hit your ~100K token testing?  Because I am definitely seeing a
> lot of execution time spent on large_dict when running tests/check-qjson
> by hand, in relation to all the other tests of that file, but not
> minutes worth.  Care to post the diff you played with?

I tested on a slow machine.

>> Signed-off-by: Markus Armbruster <arm...@redhat.com>
>> ---
>>  qobject/json-streamer.c | 2 ++
>>  1 file changed, 2 insertions(+)
>
> Reviewed-by: Eric Blake <ebl...@redhat.com>

Thanks!
