Re: [Development] State of "binary JSON" in 5.15+?

2021-04-15 Thread Thiago Macieira
On Thursday, 15 April 2021 13:49:30 PDT Stottlemyer, Brett (B.S.) wrote:
> No, your description is correct.  Thanks for helping to clarify.
> 
> Does this help Thiago?  I'm not using JSON as an input to sending a QString
> or QByteArray over the wire, I want to read a file from disk (quickly) and
> find or extract specific elements from the document.

Yes, it helps.

No, your use-case won't be efficient the way you described. The in-memory 
representation of QJsonDocument has changed, so saving and reloading as 
binary JSON will now make things worse instead. Just keep the QJsonDocument / 
QJsonValue in question, or keep the CBOR or JSON forms.

The memory consumption of the CBOR-based in-memory format should be 
comparable, at least at first approximation, for reasonably-nested structures. 
The QCborContainerPrivate structure that is the backend of everything is 
designed to minimise the overhead and it also keeps your strings in US-ASCII 
as much as possible (most of your JSON object keys will be US-ASCII).

If you REALLY need the binary format, you can try to use the qt5compat module 
in Qt 5. You may need to rename or namespace the class, since QtCore does 
export classes by the same name and you don't want an ODR violation.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering



___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] State of "binary JSON" in 5.15+?

2021-04-15 Thread Stottlemyer, Brett (B.S.)
On 4/15/21, 3:45 PM, "Development on behalf of Elvis Stansvik" wrote:
> With the risk of muddling things even more, but the way I understood
>
> > Think geographic data, where I can prefetch at low-priority, and
> > load/process data on-demand as location changes.
>
> was that his use case was:
>
> 1. In the background, proactively fetch some JSON that might be needed
> soon, parse it and store it as "binary JSON".
> 2. When the time comes and the data is needed, it can be loaded fast
> from disk with fromBinaryData.
>
> And with 5.14, step (2) was faster than loading from JSON text due to
> the mmapping, but in 5.15 it is slower (due to the change in
> in-memory format).
>
> It's quite possible I've misunderstood though.
>
> Elvis

No, your description is correct.  Thanks for helping to clarify.

Does this help Thiago?  I'm not using JSON as an input to sending a QString or
QByteArray over the wire, I want to read a file from disk (quickly) and find
or extract specific elements from the document.

Thanks,
Brett


 



Re: [Development] State of "binary JSON" in 5.15+?

2021-04-15 Thread Elvis Stansvik
On Thu, 15 Apr 2021 at 21:17, Thiago Macieira wrote:
>
> On Thursday, 15 April 2021 07:36:08 PDT Stottlemyer, Brett (B.S.) wrote:
> > On 4/14/21, 12:53 AM, "Development on behalf of Thiago Macieira" wrote:
> > > No, that was it. I assume you're caching templates which you need to
> > > modify slightly for each reply, not plain-text JSON.
> >
> > Ahh, no.  Sorry for not being clear.  I'm the consumer of the REST API, not
> > the producer.
>
> This made it even less clear.
>
> I was trying to draw the distinction between prepared JSON in a QString or
> QByteArray and a template that you needed to fill in. For example, suppose you
> need to send a POST with JSON data but you need to fill in some information in
> that POST before sending, such as a URI or an ID.
>
> Because if the contents don't change, there's no need to use QJsonDocument,
> binary or not, to cache the request or reply. You can do that in QString or
> QByteArray form.

With the risk of muddling things even more, but the way I understood

> Think geographic data, where I can prefetch at low-priority, and load/process 
> data on-demand as location changes.

was that his use case was:

1. In the background, proactively fetch some JSON that might be needed
soon, parse it and store it as "binary JSON".
2. When the time comes and the data is needed, it can be loaded fast
from disk with fromBinaryData.

And with 5.14, step (2) was faster than loading from JSON text due to
the mmapping, but in 5.15 it is slower (due to the change in
in-memory format).

It's quite possible I've misunderstood though.

Elvis

>
> > > In Qt 5, the existing methods in QJsonDocument continue to work like
> > > they have done since 5.0.
> >
> > I think you are saying this with a different use-case in mind, and I'd like
> > to understand your use-case.  For my use-case, I disagree with your
> > statement. Qt 5.14 could memcpy and cast a QByteArray into a QJsonDocument
> > (untested, but by inspection of the Woboq code and the documentation saying
> > binary JSON was "fast for mmap").  Qt 5.15 can't, although the
> > fromBinaryData API is still supported.
>
> The in-memory format of QJsonDocument and related classes has indeed changed.
> But from your point of view that's not relevant. If you cache the prepared
> template in the form of a QJsonDocument, the behaviour and performance should
> be the same in either 5.14 or 5.15.
>
> But that also means there's no reason to talk about the binary format any
> more. It's not a caching technique.
> --
> Thiago Macieira - thiago.macieira (AT) intel.com
>   Software Architect - Intel DPG Cloud Engineering


Re: [Development] State of "binary JSON" in 5.15+?

2021-04-15 Thread Thiago Macieira
On Thursday, 15 April 2021 07:36:08 PDT Stottlemyer, Brett (B.S.) wrote:
> On 4/14/21, 12:53 AM, "Development on behalf of Thiago Macieira" wrote:
> > No, that was it. I assume you're caching templates which you need to
> > modify slightly for each reply, not plain-text JSON.
>
> Ahh, no.  Sorry for not being clear.  I'm the consumer of the REST API, not
> the producer.

This made it even less clear.

I was trying to draw the distinction between prepared JSON in a QString or 
QByteArray and a template that you needed to fill in. For example, suppose you 
need to send a POST with JSON data but you need to fill in some information in 
that POST before sending, such as a URI or an ID.

Because if the contents don't change, there's no need to use QJsonDocument, 
binary or not, to cache the request or reply. You can do that in QString or 
QByteArray form.
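
A minimal sketch of that idea, in standard C++ rather than Qt: unmodified reply
bodies are kept as raw bytes and never parsed. `RawReplyCache` and its members
are hypothetical names, and std::string is only a stand-in for QByteArray.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Keeps unmodified reply bodies as raw bytes, keyed by request URL.
// std::string stands in for QByteArray here; no JSON parsing is involved.
class RawReplyCache {
public:
    void store(const std::string &url, std::string body) {
        bodies_[url] = std::move(body);
    }

    // Returns the cached bytes, or nullptr if nothing is cached for `url`.
    const std::string *find(const std::string &url) const {
        const auto it = bodies_.find(url);
        return it == bodies_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, std::string> bodies_;
};
```

This only covers the "contents don't change" case; once individual elements
have to be looked up or edited, a parsed form is needed instead.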

> > In Qt 5, the existing methods in QJsonDocument continue to work like
> > they have done since 5.0.
>
> I think you are saying this with a different use-case in mind, and I'd like
> to understand your use-case.  For my use-case, I disagree with your
> statement. Qt 5.14 could memcpy and cast a QByteArray into a QJsonDocument
> (untested, but by inspection of the Woboq code and the documentation saying
> binary JSON was "fast for mmap").  Qt 5.15 can't, although the
> fromBinaryData API is still supported.

The in-memory format of QJsonDocument and related classes has indeed changed. 
But from your point of view that's not relevant. If you cache the prepared 
template in the form of a QJsonDocument, the behaviour and performance should 
be the same in either 5.14 or 5.15.

But that also means there's no reason to talk about the binary format any 
more. It's not a caching technique.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Development] State of "binary JSON" in 5.15+?

2021-04-15 Thread Stottlemyer, Brett (B.S.)
Apologies if this was already sent, I meant to send yesterday and found it in a
still open window.

On 4/14/21, 12:53 AM, "Development on behalf of Thiago Macieira" wrote:

> No, that was it. I assume you're caching templates which you need to modify
> slightly for each reply, not plain-text JSON.

Ahh, no.  Sorry for not being clear.  I'm the consumer of the REST API, not the
producer.

Think geographic data, where I can prefetch at low-priority, and load/process
data on-demand as location changes.

> In Qt 5, the existing methods in QJsonDocument continue to work like they
> have done since 5.0.

I think you are saying this with a different use-case in mind, and I'd like to
understand your use-case.  For my use-case, I disagree with your statement.
Qt 5.14 could memcpy and cast a QByteArray into a QJsonDocument (untested,
but by inspection of the Woboq code and the documentation saying binary JSON
was "fast for mmap").  Qt 5.15 can't, although the fromBinaryData API is still
supported.

Under the covers (see
https://codereview.qt-project.org/c/qt/qtbase/+/265312/37/src/corelib/serialization/qjsondocument.cpp#362),
after the memcpy operations:
    Qt 5.14 -> return QJsonDocument(d);
    Qt 5.15 -> return d->toJsonDocument();
        -> toJsonDocument() calls toJsonObject() (or toJsonArray())
            -> toJsonObject() recurses over all of the elements
               (See here
               https://codereview.qt-project.org/c/qt/qtbase/+/265312/37/src/corelib/serialization/qbinaryjson.cpp#204)
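
The difference between the two return paths can be illustrated with a toy
model in standard C++. This is not Qt's code: `Node`, `adopt`,
`convertRecursively` and `makeTree` are hypothetical names. Adopting an
already-built structure costs the same regardless of size, while a recursive
conversion has to touch every element.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a parsed document tree (hypothetical, not a Qt type).
struct Node {
    int value = 0;
    std::vector<Node> children;
};

// 5.14-style: wrap the existing data; cost does not depend on document size.
inline std::shared_ptr<const Node> adopt(Node &&root) {
    return std::make_shared<Node>(std::move(root));
}

// 5.15-style: build a new representation by walking every element.
// `visited` counts how many nodes the conversion had to touch.
inline Node convertRecursively(const Node &src, std::size_t &visited) {
    ++visited;
    Node out;
    out.value = src.value;
    out.children.reserve(src.children.size());
    for (const Node &child : src.children)
        out.children.push_back(convertRecursively(child, visited));
    return out;
}

// Builds a tree with `width` children per node and `depth` levels below the root.
inline Node makeTree(int width, int depth) {
    Node n;
    if (depth > 0)
        for (int i = 0; i < width; ++i)
            n.children.push_back(makeTree(width, depth - 1));
    return n;
}
```

In this model the work done by `convertRecursively` grows with the number of
nodes, which matches the observation that the 5.15 path scales with document
size.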

The result is that fromBinaryData() is significantly slower than parsing the
JSON text directly, not significantly faster.  This format is different, and I'm
not sure what use-case it is better for...

Thanks,
Brett




Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Thiago Macieira
On Tuesday, 13 April 2021 19:11:51 PDT Stottlemyer, Brett (B.S.) wrote:
> Yes, small but reparsing isn't negligible.  The idea is to cache (to disk) a
> number of files, and when run, the application dynamically picks the right
> files to use. I am caching in RAM as well, once the files are parsed, but
> it is the initial load time I was trying to reduce.  Or did I misunderstand
> your suggestion? 

No, that was it. I assume you're caching templates which you need to modify 
slightly for each reply, not plain-text JSON.

> Thanks, Mårten.   Unfortunately, I'm still on Qt 5.15 which seems to be
> incompatible with that code (crash).  I'm guessing because the
> QJsonDocument input/output from the new methods is expecting the Qt6
> version of the class. 

In Qt 5, the existing methods in QJsonDocument continue to work like they have 
done since 5.0.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Stottlemyer, Brett (B.S.)
On 4/13/21, 5:23 PM, "Development on behalf of Thiago Macieira" wrote:

> Are you sure you can't just use a memory cache?
>
> It sounds like your JSON snippet is small enough that it will not hit the
> 128 MB limit any time soon but large enough that reparsing it is not
> negligible. Is that it?

Yes, small but reparsing isn't negligible.  The idea is to cache (to disk) a
number of files, and when run, the application dynamically picks the right
files to use.  I am caching in RAM as well, once the files are parsed, but it
is the initial load time I was trying to reduce.  Or did I misunderstand your
suggestion?

On 4/13/21, 3:42 PM, "Mårten Nordheim" wrote:

> It's in the qt5compat library ...

Thanks, Mårten.  Unfortunately, I'm still on Qt 5.15 which seems to be
incompatible with that code (crash).  I'm guessing because the QJsonDocument
input/output from the new methods is expecting the Qt6 version of the class.

https://code.woboq.org/qt5/qtbase/src/corelib/serialization/qjsondocument.cpp.html#_ZN13QJsonDocument14fromBinaryDataERK10QByteArrayNS_14DataValidationE
(which I guess is 5.14 or earlier) looks like it replaces the reparsing with a
few memcpy calls, but that was incompatible with Qt 5.15.  A similar mechanism
is obviously possible with Qt6, given the Qt5compat code.  Is it just Qt 5.15
that is problematic?

Thanks,
Brett



Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Thiago Macieira
On Tuesday, 13 April 2021 12:20:49 PDT Stottlemyer, Brett (B.S.) wrote:
> The 128 MB piece isn't an issue for my specific case, I will have smaller
> files than that.  I was hoping there was an easy way to cache the REST
> response locally, where I may load the data many times before I need to
> refresh the cache.  In this case, additional storage is worthwhile if it
> improves the parsing time and memory allocation.  I'd be interested in
> seeing if QBinaryJsonDocument is applicable, but I can't find it.  Can I
> trouble you for a link?

Are you sure you can't just use a memory cache?

It sounds like your JSON snippet is small enough that it will not hit the 
128 MB limit any time soon but large enough that reparsing it is not 
negligible. Is that it?

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Mårten Nordheim
Hey Brett,

It's in the qt5compat library 
(https://code.qt.io/cgit/qt/qt5compat.git/tree/src/core5/serialization)

Mårten


From: Development on behalf of Stottlemyer, Brett (B.S.)
Sent: Tuesday, April 13, 2021 21:20
To: Macieira, Thiago; development@qt-project.org
Subject: Re: [Development] State of "binary JSON" in 5.15+?

On Tuesday, 13 April 2021 07:29:48 PDT Lars Knoll wrote:

> The binary JSON support is deprecated and only there for backwards
> compatibility. It had some issues (e.g. it couldn’t handle large JSON
> files), that’s why we deprecated it. It’s gone in Qt 6. I guess the docs
> need some adjustment though.

I created https://bugreports.qt.io/browse/QTBUG-92826 for the
documentation.

On 4/13/21, 10:52 AM, "Development on behalf of Thiago Macieira" wrote:

> There's a QBinaryJsonDocument class that is not part of QtCore or qtbase
> that you can use to load your old documents and then extract data from,
> even save data to it again if you need to retain compatibility. The
> limitation of 128 MB maximum size remains.

The 128 MB piece isn't an issue for my specific case, I will have smaller
files than that.  I was hoping there was an easy way to cache the REST
response locally, where I may load the data many times before I need to
refresh the cache.  In this case, additional storage is worthwhile if it
improves the parsing time and memory allocation.  I'd be interested in seeing
if QBinaryJsonDocument is applicable, but I can't find it.  Can I trouble you
for a link?

Thanks to you both for the clarification.

Regards,
Brett



Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Stottlemyer, Brett (B.S.)
On Tuesday, 13 April 2021 07:29:48 PDT Lars Knoll wrote:

> The binary JSON support is deprecated and only there for backwards
> compatibility. It had some issues (e.g. it couldn’t handle large JSON
> files), that’s why we deprecated it. It’s gone in Qt 6. I guess the docs
> need some adjustment though.

I created https://bugreports.qt.io/browse/QTBUG-92826 for the
documentation.

On 4/13/21, 10:52 AM, "Development on behalf of Thiago Macieira" wrote:

> There's a QBinaryJsonDocument class that is not part of QtCore or qtbase
> that you can use to load your old documents and then extract data from,
> even save data to it again if you need to retain compatibility. The
> limitation of 128 MB maximum size remains.

The 128 MB piece isn't an issue for my specific case, I will have smaller
files than that.  I was hoping there was an easy way to cache the REST
response locally, where I may load the data many times before I need to
refresh the cache.  In this case, additional storage is worthwhile if it
improves the parsing time and memory allocation.  I'd be interested in seeing
if QBinaryJsonDocument is applicable, but I can't find it.  Can I trouble you
for a link?

Thanks to you both for the clarification.

Regards,
Brett



Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Thiago Macieira
On Tuesday, 13 April 2021 07:29:48 PDT Lars Knoll wrote:
> The binary JSON support is deprecated and only there for backwards
> compatibility. It had some issues (e.g. it couldn’t handle large JSON
> files), that’s why we deprecated it. It’s gone in Qt 6. I guess the docs
> need some adjustment though.

There's a QBinaryJsonDocument class that is not part of QtCore or qtbase that 
you can use to load your old documents and then extract data from, even save 
data to it again if you need to retain compatibility. The limitation of 128 MB 
maximum size remains.

QCborValue had a size limitation in 64-bit Qt 5 due to internally using 
QVector and QByteArray, so it couldn't handle any array with more than 128 
* 2^20 elements or maps with half as many, or any arrays or objects where 
string content in that level added up to more than 2 GB. Those limitations are 
gone in Qt 6 and you'll run out of memory before you run into the limitations.
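
For scale, the Qt 5 limits quoted above work out as follows. This is a small
arithmetic sketch in standard C++; the constant names are made up for
illustration, and "2 GB" is assumed here to mean 2^31 bytes.

```cpp
#include <cstdint>

// Limits as described in the message above; names are illustrative only.
constexpr std::int64_t kMaxArrayElements = 128LL * (1LL << 20);  // 128 * 2^20
constexpr std::int64_t kMaxMapEntries = kMaxArrayElements / 2;   // half as many
constexpr std::int64_t kMaxStringBytes = 1LL << 31;  // assuming 2 GB = 2^31 bytes

static_assert(kMaxArrayElements == 134217728, "about 134 million array elements");
static_assert(kMaxMapEntries == 67108864, "about 67 million map entries");
static_assert(kMaxStringBytes == 2147483648LL, "2^31 bytes of string content");
```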

Parsing time of text JSON and CBOR is mostly the same and remains so from 
before the 5.15 switch of QJsonDocument's backend. They're all dominated by 
memory allocation. The big advantage of the binary JSON is that it made one 
huge allocation at the beginning equal to the size of the input, and in most 
cases that was just large enough for the binary data and would need no 
reallocation, with acceptable overhead. That overhead could be large if you 
had deeply-nested arrays and objects, with a lot of whitespace in your 
non-compact JSON text form, though.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Lars Knoll
Hi Brett,

The binary JSON support is deprecated and only there for backwards 
compatibility. It had some issues (e.g. it couldn’t handle large JSON files), 
that’s why we deprecated it. It’s gone in Qt 6. I guess the docs need some 
adjustment though.

Cheers,
Lars

On 13 Apr 2021, at 15:43, Stottlemyer, Brett (B.S.) <bstot...@ford.com> wrote:

Hi,

I was at the Contributor’s Summit where it was discussed, so I know there were 
good reasons to deprecate the binary JSON format.  The actual change looks to 
have been made in https://codereview.qt-project.org/c/qt/qtbase/+/265312.

IIUC, the original intent was that parsing the JSON creates a lookup table to 
speed up subsequent access, and the binary format saved that table to disk (or 
for streaming).  This allowed “opening the JSON” (technically no longer JSON?) 
much more quickly.

Is such a mechanism still available currently?  Or is parsing the data from 
scratch necessary now?

For reference, I tried the deprecated calls and also used the conversion tool 
to convert to CBOR and tried reading that.  With a particular (~20MB) JSON file 
from a REST call, parsing the CBOR version of the file was about twice as fast 
as JSON, while the binary JSON format was about 2.5x *slower* than parsing the 
JSON (just measuring the document creation time).

FWIW, if the capability is no longer available, the overview for Qt JSON 
(https://doc.qt.io/qt-5/json.html) should probably have “It also contains 
support for saving this data in a binary format that is directly "mmap"-able 
and very fast to access” removed.

Thanks!
Brett




[Development] State of "binary JSON" in 5.15+?

2021-04-13 Thread Stottlemyer, Brett (B.S.)
Hi,

I was at the Contributor’s Summit where it was discussed, so I know there were 
good reasons to deprecate the binary JSON format.  The actual change looks to 
have been made in https://codereview.qt-project.org/c/qt/qtbase/+/265312.

IIUC, the original intent was that parsing the JSON creates a lookup table to 
speed up subsequent access, and the binary format saved that table to disk (or 
for streaming).  This allowed “opening the JSON” (technically no longer JSON?) 
much more quickly.

Is such a mechanism still available currently?  Or is parsing the data from 
scratch necessary now?

For reference, I tried the deprecated calls and also used the conversion tool 
to convert to CBOR and tried reading that.  With a particular (~20MB) JSON file 
from a REST call, parsing the CBOR version of the file was about twice as fast 
as JSON, while the binary JSON format was about 2.5x *slower* than parsing the 
JSON (just measuring the document creation time).
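
A comparison like this can be taken with a small timing harness along the
following lines. It is a sketch in standard C++: `timeMs` is a hypothetical
helper, and the callable passed in would wrap whichever load path is being
measured (JSON text, CBOR, or binary JSON).

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double timeMs(Work &&work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A single run is noisy; repeating the call and comparing medians would give a
fairer picture of the relative load times.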

FWIW, if the capability is no longer available, the overview for Qt JSON 
(https://doc.qt.io/qt-5/json.html) should probably have “It also contains 
support for saving this data in a binary format that is directly "mmap"-able 
and very fast to access” removed.

Thanks!
Brett

