On Tue, Mar 28, 2017 at 12:52 PM, Paul Sanderson <
sandersonforens...@gmail.com> wrote:

> I am sure Richard will correct me if I am wrong. But...
>
> The format for a record is
>
> 1. payload length varint
> 2. rowid varint (optional)
> 3. serial type array varint
> 4. serial types
> followed by the data for the serial types
>
> The issues, as I see them:
>
> The payload length varint (1 above) is the sum of 3 + 4 above plus all of
> the following data forming the record. So as things stand you can't store
> any record where the sum of the bytes in the serial types array and the
> actual data that follows is greater than MAXVARINT because the total length
> must be stored in 1. (MAXVARINT is actually max positive varint - see
> below).
>

Good point. But still, MAXVARINT is 64-bit (see below) not 32-bit.

> The record format makes extensive use of the variable-length integer or
> varint representation of 64-bit signed integers defined above.
>
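
For reference, here is a minimal sketch of the varint encoding as the file
format documentation describes it: 1 to 9 bytes, big-endian, 7 payload bits in
each of the first 8 bytes and all 8 bits of the ninth. The function names are
my own, not SQLite's, and this is an illustration rather than the library's
actual code.

```python
MAX_POSITIVE_VARINT = 2**63 - 1  # largest value when read as a signed 64-bit integer


def encode_varint(value: int) -> bytes:
    """Encode an unsigned 64-bit value as a SQLite-style varint (1 to 9 bytes)."""
    if not 0 <= value < 1 << 64:
        raise ValueError("value does not fit in 64 bits")
    if value >> 56:
        # More than 56 bits: the ninth byte carries 8 raw bits,
        # and all eight preceding bytes have the continuation bit set.
        out = [value & 0xFF]
        value >>= 8
        for _ in range(8):
            out.append((value & 0x7F) | 0x80)
            value >>= 7
        return bytes(reversed(out))
    out = [value & 0x7F]  # final byte: continuation bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def decode_varint(buf: bytes, offset: int = 0) -> tuple:
    """Return (unsigned value, number of bytes consumed)."""
    value = 0
    for i in range(8):
        byte = buf[offset + i]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, i + 1
    # Ninth byte contributes all 8 of its bits.
    return (value << 8) | buf[offset + 8], 9


def to_signed(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >= 1 << 63 else value
```

For example, encode_varint(128).hex() gives '8100', and the largest positive
value, 2**63 - 1, takes the full 9 bytes.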


> If you want to use one of the reserved serial types to store a blob of 6GB
> then the serial type itself must be capable of storing the size of the
> blob. Currently, a blob is *any* even serial type >= 12, so the maximum
> size for a blob is (MAXVARINT-12)/2, and a text value is *any* odd serial
> type >= 13. All of the
> remaining utilised serial types (i.e. those <= 9) refer to fixed length
> data (ints and a 64 bit real).
>

I understand that. That's why I put the length in the "old style" blob
value itself. But again, the varint encodes a 64-bit signed integer, and
the "new style" blob could be assumed if the blob length exceeds 2 GiB (or
4 GiB), without even resorting to the two reserved serial types.
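
To make the arithmetic concrete, here is a small sketch of the serial type to
content size mapping from the file format documentation, together with the
(MAXVARINT-12)/2 bound quoted above. The helper names are hypothetical, and
this covers only what the encoding can express, not what the implementation
accepts.

```python
MAX_POSITIVE_VARINT = 2**63 - 1

# Content sizes in bytes for the fixed-length serial types 0..9.
FIXED_SIZES = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 6, 6: 8, 7: 8, 8: 0, 9: 0}


def content_size(serial_type: int) -> int:
    """Number of content bytes implied by a serial type."""
    if serial_type in FIXED_SIZES:
        return FIXED_SIZES[serial_type]
    if serial_type in (10, 11):
        raise ValueError("serial types 10 and 11 are reserved")
    if serial_type % 2 == 0:
        return (serial_type - 12) // 2  # blob: even, >= 12
    return (serial_type - 13) // 2      # text: odd, >= 13


def blob_serial_type(length: int) -> int:
    """Serial type that encodes a blob of the given length."""
    return 12 + 2 * length


# Largest blob length whose serial type still fits a positive 64-bit varint.
max_blob_length = (MAX_POSITIVE_VARINT - 12) // 2

six_gib = 6 * 2**30
assert blob_serial_type(six_gib) <= MAX_POSITIVE_VARINT
assert content_size(blob_serial_type(six_gib)) == six_gib
```

Under these assumptions the bound works out to 2**62 - 7 bytes, so a 6 GiB
length is expressible in the serial type itself, which is the point I am
making about the format as opposed to the implementation.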


> The remaining 2 serial types (remember these are just two bits from a
> 64-bit serial type, each serial type is not a separate varint in its own
> right) could be used to signify something like a 128-bit integer or some
> other fixed-length data type, but, 1 bit by definition cannot store an
> arbitrary length value.
>

I understand that (see above). But using a level of indirection, storing in
the record only the meta-data of the blob, e.g. its full length, its
in-record length (in case of using serial type 10 or 11, which cannot
encode the length like the traditional text and blob serial types), and
the ordered list of blob pages to read the blob from, seems completely
possible.
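
As a purely hypothetical illustration of that indirection (nothing like this
exists in SQLite today), the in-record meta-data could look roughly like the
sketch below. The BlobDescriptor name, the field names, and the fixed-width
layout are all assumptions made for the example; the only thing taken from
the existing format is that page numbers are 32-bit big-endian integers.

```python
import struct
from dataclasses import dataclass
from typing import List

_HEADER = ">QII"  # full length (64-bit), in-record length, page count


@dataclass
class BlobDescriptor:
    """Hypothetical in-record meta-data for an out-of-record large blob."""
    full_length: int       # total blob size in bytes, may exceed 4 GiB
    in_record_length: int  # bytes of the blob stored inline in the record
    pages: List[int]       # ordered page numbers holding the rest of the blob

    def pack(self) -> bytes:
        header = struct.pack(_HEADER, self.full_length,
                             self.in_record_length, len(self.pages))
        return header + struct.pack(f">{len(self.pages)}I", *self.pages)

    @classmethod
    def unpack(cls, data: bytes) -> "BlobDescriptor":
        full, inline, count = struct.unpack_from(_HEADER, data, 0)
        offset = struct.calcsize(_HEADER)
        pages = list(struct.unpack_from(f">{count}I", data, offset))
        return cls(full, inline, pages)


desc = BlobDescriptor(full_length=6 * 2**30, in_record_length=0,
                      pages=[1001, 1002, 1003])
assert BlobDescriptor.unpack(desc.pack()) == desc
```

A real design would more likely point at the head of a page chain or a tree
of pages than list every page inline, but the sketch shows that the record
itself only needs to carry a small, fixed-shape descriptor.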


> I guess that the change Richard mentions (to up to 4GB) would be by
> treating the varints as unsigned integers, rather than signed as they
> currently are. This could be done (as far as I can see) for all varints
> other than the rowid without affecting existing DBs.
>

That would be an implementation limitation though, not a file format
limitation.

Again, I'm probably naive here, but I still don't clearly see the file
format limitation, and that's what I'm trying to understand. I completely
accept this would be a lot of work and that the incentive for Richard to do
it is rather low, to extremely low, although of course that does bum me
out, I have to admit :), but really understanding the limitation I'm not
seeing now is what I'm after here. Thanks, --DD

PS: The alternate scheme of assuming a new-style blob for length > 4 GiB,
which is more backward-compatible, could be further refined via a pragma
that sets the threshold lower. That would make the DB incompatible with
older SQLite versions, but no more so than the many other opt-in features
old versions don't support.
