[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-04 Thread Dominique Devienne
On Fri, Mar 4, 2016 at 4:45 PM, Eric Grange  wrote:

> [...] why can't we both have our cake and eat it? :)
>

Exactly. That's why I've been asking/advocating for out-of-row (large)
blobs for a while now.
See for example
http://permalink.gmane.org/gmane.comp.db.sqlite.general/97686. --DD


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-04 Thread Eric Grange
Thanks for all the details!

I was already storing the really large blobs in their own tables.

However I will have to review the situations for smallish blobs (about as
large as the rest of the fields in a record), as moving them to another
table would halve the number of pages involved for queries that do not need
them... of course that would have to be balanced with more complex sql, and
more page reads in the cases where those blobs are needed... why can't we
both have our cake and eat it? :)

On Thu, Mar 3, 2016 at 5:40 PM, Drago, William @ CSG - NARDA-MITEQ <
William.Drago at l-3com.com> wrote:

> I haven't seen any anyone else mention this yet. Another consideration
> when working with large blobs is to make those columns that last ones in
> your schema. To quote Dr. Hipp:
>
> "make the BLOB columns the last column in your tables.  Or even store the
> BLOBs in a separate table which only has two columns: an integer primary
> key and the blob itself, and then access the BLOB content using a join if
> you need to. If you put various small integer fields after the BLOB, then
> SQLite has to scan through the entire BLOB content (following the linked
> list of disk pages) to get to the integer fields at the end, and that
> definitely can slow you down."
>
> Found here:
>
>
> http://sqlite.1065341.n5.nabble.com/Effect-of-blobs-on-performance-td19559.html#a19560
>
> Your blobs are small so this probably doesn't apply to your application,
> but something to keep in mind for future projects.
>
> --
> Bill Drago
> Staff Engineer
> L3 Narda-MITEQ
> 435 Moreland Road
> Hauppauge, NY 11788
> 631-272-5947 / William.Drago at L-3COM.com
>
>
> > -Original Message-
> > From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> > bounces at mailinglists.sqlite.org] On Behalf Of Eric Grange
> > Sent: Thursday, March 03, 2016 3:16 AM
> > To: SQLite mailing list
> > Subject: Re: [sqlite] Are BLOB stored inline with the rest of the record
> fields?
> >
> > > All BLOBs are stored inline.  The only complication comes when the
> > > total
> > row length is longer than a page.
> >
> > Ok thanks!
> >
> > That is besides my original question but what is the complication in
> that case?
> > Is the blob partially stored in the row, or is some indirection to other
> pages
> > used, with only some pointer stored in the row?
> >
> > > Surely you mean big-endian?
> >
> > Yes, my mistake :)
> >
> > > Storing Blobs has a few conversions added if you try to use the SQL to
> > store it, but it's quite efficient when using the API.
> >
> > By "use SQL", you mean as a literal blob embedded in the SQL string?
> >
> > My main use case would be through the API, the actual numbers are fixed
> > precision, and so scattered over the range they cannot be displayed to
> end
> > users without using exponents, so some formatting will have to happen.
> > Besides size, using a blob rather than base32/base64 would simplify the
> > encoding/decoding, and for debugging, blobs usually display in an
> > hexadecimal form here, so a big-endian blob would be directly "readable".
> >
> >
> >
> > On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie 
> > wrote:
> >
> > > On Wed, Mar 2, 2016 at 4:42 PM, R Smith  wrote:
> > > >
> > > >
> > > > Personally, unless your space is constrained, I would simply save
> > > > the numbers as strings, perhaps Hex or BCD with leading chars and
> > > > convert as needed. This would sort correctly without tricks and not
> > > > do much worse
> > > for
> > > > space. (Base64 would be even better space-wise but won't sort
> correct).
> > > >
> > >
> > > There is an encoding defined in RFC 4648 "Base-N Encodings" that does
> > > preserve sort order; it is called Base 32 Encoding with Extended Hex
> > > Alphabet. I would think the Base64 alphabet could be rearranged to
> > > have the same property.
> > >
> > > e
> > > ___
> > > sqlite-users mailing list
> > > sqlite-users at mailinglists.sqlite.org
> > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> > >
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER

[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-03 Thread Drago, William @ CSG - NARDA-MITEQ
I haven't seen any anyone else mention this yet. Another consideration when 
working with large blobs is to make those columns that last ones in your 
schema. To quote Dr. Hipp:

"make the BLOB columns the last column in your tables.  Or even store the BLOBs 
in a separate table which only has two columns: an integer primary key and the 
blob itself, and then access the BLOB content using a join if you need to. If 
you put various small integer fields after the BLOB, then SQLite has to scan 
through the entire BLOB content (following the linked list of disk pages) to 
get to the integer fields at the end, and that definitely can slow you down."

Found here:

http://sqlite.1065341.n5.nabble.com/Effect-of-blobs-on-performance-td19559.html#a19560

Your blobs are small so this probably doesn't apply to your application, but 
something to keep in mind for future projects.

--
Bill Drago
Staff Engineer
L3 Narda-MITEQ
435 Moreland Road
Hauppauge, NY 11788
631-272-5947 / William.Drago at L-3COM.com


> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Eric Grange
> Sent: Thursday, March 03, 2016 3:16 AM
> To: SQLite mailing list
> Subject: Re: [sqlite] Are BLOB stored inline with the rest of the record 
> fields?
>
> > All BLOBs are stored inline.  The only complication comes when the
> > total
> row length is longer than a page.
>
> Ok thanks!
>
> That is besides my original question but what is the complication in that 
> case?
> Is the blob partially stored in the row, or is some indirection to other pages
> used, with only some pointer stored in the row?
>
> > Surely you mean big-endian?
>
> Yes, my mistake :)
>
> > Storing Blobs has a few conversions added if you try to use the SQL to
> store it, but it's quite efficient when using the API.
>
> By "use SQL", you mean as a literal blob embedded in the SQL string?
>
> My main use case would be through the API, the actual numbers are fixed
> precision, and so scattered over the range they cannot be displayed to end
> users without using exponents, so some formatting will have to happen.
> Besides size, using a blob rather than base32/base64 would simplify the
> encoding/decoding, and for debugging, blobs usually display in an
> hexadecimal form here, so a big-endian blob would be directly "readable".
>
>
>
> On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie 
> wrote:
>
> > On Wed, Mar 2, 2016 at 4:42 PM, R Smith  wrote:
> > >
> > >
> > > Personally, unless your space is constrained, I would simply save
> > > the numbers as strings, perhaps Hex or BCD with leading chars and
> > > convert as needed. This would sort correctly without tricks and not
> > > do much worse
> > for
> > > space. (Base64 would be even better space-wise but won't sort correct).
> > >
> >
> > There is an encoding defined in RFC 4648 "Base-N Encodings" that does
> > preserve sort order; it is called Base 32 Encoding with Extended Hex
> > Alphabet. I would think the Base64 alphabet could be rearranged to
> > have the same property.
> >
> > e
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely for the use of the addressee and may contain information 
that is privileged or confidential. Any disclosure, use or distribution of the 
information contained herein is prohibited. In the event this e-mail contains 
technical data within the definition of the International Traffic in Arms 
Regulations or Export Administration Regulations, it is subject to the export 
control laws of the U.S.Government. The recipient should check this e-mail and 
any attachments for the presence of viruses as L-3 does not accept any 
liability associated with the transmission of this e-mail. If you have received 
this communication in error, please notify the sender by reply e-mail and 
immediately delete this message and any attachments.


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-03 Thread Eric Grange
> All BLOBs are stored inline.  The only complication comes when the total
row length is longer than a page.

Ok thanks!

That is besides my original question but what is the complication in that
case?
Is the blob partially stored in the row, or is some indirection to other
pages used, with only some pointer stored in the row?

> Surely you mean big-endian?

Yes, my mistake :)

> Storing Blobs has a few conversions added if you try to use the SQL to
store it, but it's quite efficient when using the API.

By "use SQL", you mean as a literal blob embedded in the SQL string?

My main use case would be through the API, the actual numbers are fixed
precision, and so scattered over the range they cannot be displayed to end
users without using exponents, so some formatting will have to happen.
Besides size, using a blob rather than base32/base64 would simplify the
encoding/decoding, and for debugging, blobs usually display in an
hexadecimal form here, so a big-endian blob would be directly "readable".



On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie  wrote:

> On Wed, Mar 2, 2016 at 4:42 PM, R Smith  wrote:
> >
> >
> > Personally, unless your space is constrained, I would simply save the
> > numbers as strings, perhaps Hex or BCD with leading chars and convert as
> > needed. This would sort correctly without tricks and not do much worse
> for
> > space. (Base64 would be even better space-wise but won't sort correct).
> >
>
> There is an encoding defined in RFC 4648 "Base-N Encodings" that does
> preserve sort order; it is called Base 32 Encoding with Extended Hex
> Alphabet. I would think the Base64 alphabet could be rearranged to have the
> same property.
>
> e
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-03 Thread Paul Sanderson
What is the likely max length of a row in your table? if your row
length is less than page length-35 then it will fit in to one page.

The amount of wasted page depends on what fraction of a page size your
typical record is.

If you have a record that is 600 bytes long on a DB with 1024 byte
page size then you would "waste" about 1024-600 = 424 bytes per record
. 

If your page size is 4096 then for the same record size you fit 6
records into the page so would waste about 4096 - (600*6) = 496 / 6 =
82 bytes per record.

Pick your page size appropriately although there is usually more to
think about with page size than just wasted space.
Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence


On 3 March 2016 at 08:58, Simon Slavin  wrote:
>
>> On 3 Mar 2016, at 8:16am, Eric Grange  wrote:
>>
>>> All BLOBs are stored inline.  The only complication comes when the total
>>> row length is longer than a page.
>>
>> Ok thanks!
>>
>> That is besides my original question but what is the complication in that
>> case?
>> Is the blob partially stored in the row, or is some indirection to other
>> pages used, with only some pointer stored in the row?
>
> SQLite database files are split up into pages.  Every page in a database is 
> the same number of bytes long.  Every page belongs to the header, or to free 
> space, or to a specific TABLE or INDEX.
>
> SQLite stores all the values for a row together.  Changing the value of any 
> column for a row requires rewriting the entire row, and then the row entry 
> for every index on that table which was changed. [1]
>
> Ideally a row of data would fit in a page.  This simplifies the job of 
> reading a row from disk or changing the data in a row.  SQLite does this 
> where possible.
>
> But it's possible for a row to be longer than a page.  This happens 
> especially where the row contains a long string or a long BLOB.  In that case 
> SQLite has to fetch more than one page from disk just to access the values 
> for that row.  And it's possible for a BLOB to be so long that it requires 
> one or more pages entirely to itself.  So the word 'inline' is a little 
> tricky here because there is more than one 'line' involved.
>
> Simon.
>
> [1] Handwave a few exceptions.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-03 Thread Simon Slavin

> On 3 Mar 2016, at 8:16am, Eric Grange  wrote:
> 
>> All BLOBs are stored inline.  The only complication comes when the total
>> row length is longer than a page.
> 
> Ok thanks!
> 
> That is besides my original question but what is the complication in that
> case?
> Is the blob partially stored in the row, or is some indirection to other
> pages used, with only some pointer stored in the row?

SQLite database files are split up into pages.  Every page in a database is the 
same number of bytes long.  Every page belongs to the header, or to free space, 
or to a specific TABLE or INDEX.

SQLite stores all the values for a row together.  Changing the value of any 
column for a row requires rewriting the entire row, and then the row entry for 
every index on that table which was changed. [1]

Ideally a row of data would fit in a page.  This simplifies the job of reading 
a row from disk or changing the data in a row.  SQLite does this where possible.

But it's possible for a row to be longer than a page.  This happens especially 
where the row contains a long string or a long BLOB.  In that case SQLite has 
to fetch more than one page from disk just to access the values for that row.  
And it's possible for a BLOB to be so long that it requires one or more pages 
entirely to itself.  So the word 'inline' is a little tricky here because there 
is more than one 'line' involved.

Simon.

[1] Handwave a few exceptions.


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-02 Thread R Smith


On 2016/03/02 6:47 PM, Eric Grange wrote:
> Hi,
>
> I am wondering if very small BLOBs are stored inline with the record fields
> (in the same page) or in another page?
>
> So far I had not worried about it, but now I have a usage case where I need
> to store 128 and 256 bit integers (numbers, not IDs or hashes), SQLite does
> not support such large integers natively, so I am planning to store them as
> little-endian binary blobs, so that the BINARY collation can work for
> sorting, and for the rest I would do the processing with custom functions
> or plain old code.
>
> However I will have quite a lot of those numbers, and would like to know if
> they would be stored inline with the rest of the records (would be
> preferable for db size), or with an indirection in another page of the db
> (so larger db), and if stored in another page, if there is an alternative,
> such as passing binary as "utf8 text"and forcing a COLLATE BINARY on the
> column?
>
> (yes, I could test that or use the source, but I would also like to have
> some "designer feedback", as what current SQLite version does may not be
> what designers intend for future SQLite versions)

Personally, unless your space is constrained, I would simply save the 
numbers as strings, perhaps Hex or BCD with leading chars and convert as 
needed. This would sort correctly without tricks and not do much worse 
for space. (Base64 would be even better space-wise but won't sort correct).

Storing Blobs has a few conversions added if you try to use the SQL to 
store it, but it's quite efficient when using the API.

I am confused as to why you would prefer little-endian word order? The 
binary sorting would not work at all I think, unless I am missing some 
trick in SQLite's sort methods.

Lots of considerations to take into account, speed, DB size etc. If I 
were you, I'd definitely do some testing first of different methods with 
data simulated for your intended use case. I don't think there's a 
one-size-fits-all-solution.

Cheers,
Ryan




[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-02 Thread Eric Grange
Hi,

I am wondering if very small BLOBs are stored inline with the record fields
(in the same page) or in another page?

So far I had not worried about it, but now I have a usage case where I need
to store 128 and 256 bit integers (numbers, not IDs or hashes), SQLite does
not support such large integers natively, so I am planning to store them as
little-endian binary blobs, so that the BINARY collation can work for
sorting, and for the rest I would do the processing with custom functions
or plain old code.

However I will have quite a lot of those numbers, and would like to know if
they would be stored inline with the rest of the records (would be
preferable for db size), or with an indirection in another page of the db
(so larger db), and if stored in another page, if there is an alternative,
such as passing binary as "utf8 text"and forcing a COLLATE BINARY on the
column?

(yes, I could test that or use the source, but I would also like to have
some "designer feedback", as what current SQLite version does may not be
what designers intend for future SQLite versions)

Thanks!


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-02 Thread Doug Currie
On Wed, Mar 2, 2016 at 4:42 PM, R Smith  wrote:
>
>
> Personally, unless your space is constrained, I would simply save the
> numbers as strings, perhaps Hex or BCD with leading chars and convert as
> needed. This would sort correctly without tricks and not do much worse for
> space. (Base64 would be even better space-wise but won't sort correct).
>

There is an encoding defined in RFC 4648 "Base-N Encodings" that does
preserve sort order; it is called Base 32 Encoding with Extended Hex
Alphabet. I would think the Base64 alphabet could be rearranged to have the
same property.

e


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-02 Thread Simon Slavin

On 2 Mar 2016, at 4:47pm, Eric Grange  wrote:

> I am wondering if very small BLOBs are stored inline with the record fields
> (in the same page) or in another page?

All BLOBs are stored inline.  The only complication comes when the total row 
length is longer than a page.

Simon.


[sqlite] Are BLOB stored inline with the rest of the record fields?

2016-03-02 Thread Richard Hipp
On 3/2/16, Eric Grange  wrote:
>  I am planning to store them as
> little-endian binary blobs, so that the BINARY collation can work for
> sorting,

Surely you mean big-endian?
-- 
D. Richard Hipp
drh at sqlite.org