[sqlite] Are BLOB stored inline with the rest of the record fields?
On Fri, Mar 4, 2016 at 4:45 PM, Eric Grange wrote: > [...] why can't we both have our cake and eat it? :) > Exactly. That's why I've been asking/advocating for out-of-row (large) blobs for a while now. See for example http://permalink.gmane.org/gmane.comp.db.sqlite.general/97686. --DD
[sqlite] Are BLOB stored inline with the rest of the record fields?
Thanks for all the details! I was already storing the really large blobs in their own tables. However I will have to review the situations for smallish blobs (about as large as the rest of the fields in a record), as moving them to another table would halve the number of pages involved for queries that do not need them... of course that would have to be balanced with more complex sql, and more page reads in the cases where those blobs are needed... why can't we both have our cake and eat it? :) On Thu, Mar 3, 2016 at 5:40 PM, Drago, William @ CSG - NARDA-MITEQ < William.Drago at l-3com.com> wrote: > I haven't seen any anyone else mention this yet. Another consideration > when working with large blobs is to make those columns that last ones in > your schema. To quote Dr. Hipp: > > "make the BLOB columns the last column in your tables. Or even store the > BLOBs in a separate table which only has two columns: an integer primary > key and the blob itself, and then access the BLOB content using a join if > you need to. If you put various small integer fields after the BLOB, then > SQLite has to scan through the entire BLOB content (following the linked > list of disk pages) to get to the integer fields at the end, and that > definitely can slow you down." > > Found here: > > > http://sqlite.1065341.n5.nabble.com/Effect-of-blobs-on-performance-td19559.html#a19560 > > Your blobs are small so this probably doesn't apply to your application, > but something to keep in mind for future projects. > > -- > Bill Drago > Staff Engineer > L3 Narda-MITEQ > 435 Moreland Road > Hauppauge, NY 11788 > 631-272-5947 / William.Drago at L-3COM.com > > > > -Original Message- > > From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users- > > bounces at mailinglists.sqlite.org] On Behalf Of Eric Grange > > Sent: Thursday, March 03, 2016 3:16 AM > > To: SQLite mailing list > > Subject: Re: [sqlite] Are BLOB stored inline with the rest of the record > fields? > > > > > All BLOBs are stored inline. The only complication comes when the > > > total > > row length is longer than a page. > > > > Ok thanks! > > > > That is besides my original question but what is the complication in > that case? > > Is the blob partially stored in the row, or is some indirection to other > pages > > used, with only some pointer stored in the row? > > > > > Surely you mean big-endian? > > > > Yes, my mistake :) > > > > > Storing Blobs has a few conversions added if you try to use the SQL to > > store it, but it's quite efficient when using the API. > > > > By "use SQL", you mean as a literal blob embedded in the SQL string? > > > > My main use case would be through the API, the actual numbers are fixed > > precision, and so scattered over the range they cannot be displayed to > end > > users without using exponents, so some formatting will have to happen. > > Besides size, using a blob rather than base32/base64 would simplify the > > encoding/decoding, and for debugging, blobs usually display in an > > hexadecimal form here, so a big-endian blob would be directly "readable". > > > > > > > > On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie > > wrote: > > > > > On Wed, Mar 2, 2016 at 4:42 PM, R Smith wrote: > > > > > > > > > > > > Personally, unless your space is constrained, I would simply save > > > > the numbers as strings, perhaps Hex or BCD with leading chars and > > > > convert as needed. This would sort correctly without tricks and not > > > > do much worse > > > for > > > > space. (Base64 would be even better space-wise but won't sort > correct). > > > > > > > > > > There is an encoding defined in RFC 4648 "Base-N Encodings" that does > > > preserve sort order; it is called Base 32 Encoding with Extended Hex > > > Alphabet. I would think the Base64 alphabet could be rearranged to > > > have the same property. > > > > > > e > > > ___ > > > sqlite-users mailing list > > > sqlite-users at mailinglists.sqlite.org > > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > > > ___ > > sqlite-users mailing list > > sqlite-users at mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER
[sqlite] Are BLOB stored inline with the rest of the record fields?
I haven't seen any anyone else mention this yet. Another consideration when working with large blobs is to make those columns that last ones in your schema. To quote Dr. Hipp: "make the BLOB columns the last column in your tables. Or even store the BLOBs in a separate table which only has two columns: an integer primary key and the blob itself, and then access the BLOB content using a join if you need to. If you put various small integer fields after the BLOB, then SQLite has to scan through the entire BLOB content (following the linked list of disk pages) to get to the integer fields at the end, and that definitely can slow you down." Found here: http://sqlite.1065341.n5.nabble.com/Effect-of-blobs-on-performance-td19559.html#a19560 Your blobs are small so this probably doesn't apply to your application, but something to keep in mind for future projects. -- Bill Drago Staff Engineer L3 Narda-MITEQ 435 Moreland Road Hauppauge, NY 11788 631-272-5947 / William.Drago at L-3COM.com > -Original Message- > From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users- > bounces at mailinglists.sqlite.org] On Behalf Of Eric Grange > Sent: Thursday, March 03, 2016 3:16 AM > To: SQLite mailing list > Subject: Re: [sqlite] Are BLOB stored inline with the rest of the record > fields? > > > All BLOBs are stored inline. The only complication comes when the > > total > row length is longer than a page. > > Ok thanks! > > That is besides my original question but what is the complication in that > case? > Is the blob partially stored in the row, or is some indirection to other pages > used, with only some pointer stored in the row? > > > Surely you mean big-endian? > > Yes, my mistake :) > > > Storing Blobs has a few conversions added if you try to use the SQL to > store it, but it's quite efficient when using the API. > > By "use SQL", you mean as a literal blob embedded in the SQL string? > > My main use case would be through the API, the actual numbers are fixed > precision, and so scattered over the range they cannot be displayed to end > users without using exponents, so some formatting will have to happen. > Besides size, using a blob rather than base32/base64 would simplify the > encoding/decoding, and for debugging, blobs usually display in an > hexadecimal form here, so a big-endian blob would be directly "readable". > > > > On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie > wrote: > > > On Wed, Mar 2, 2016 at 4:42 PM, R Smith wrote: > > > > > > > > > Personally, unless your space is constrained, I would simply save > > > the numbers as strings, perhaps Hex or BCD with leading chars and > > > convert as needed. This would sort correctly without tricks and not > > > do much worse > > for > > > space. (Base64 would be even better space-wise but won't sort correct). > > > > > > > There is an encoding defined in RFC 4648 "Base-N Encodings" that does > > preserve sort order; it is called Base 32 Encoding with Extended Hex > > Alphabet. I would think the Base64 alphabet could be rearranged to > > have the same property. > > > > e > > ___ > > sqlite-users mailing list > > sqlite-users at mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any attachments are solely for the use of the addressee and may contain information that is privileged or confidential. Any disclosure, use or distribution of the information contained herein is prohibited. In the event this e-mail contains technical data within the definition of the International Traffic in Arms Regulations or Export Administration Regulations, it is subject to the export control laws of the U.S.Government. The recipient should check this e-mail and any attachments for the presence of viruses as L-3 does not accept any liability associated with the transmission of this e-mail. If you have received this communication in error, please notify the sender by reply e-mail and immediately delete this message and any attachments.
[sqlite] Are BLOB stored inline with the rest of the record fields?
> All BLOBs are stored inline. The only complication comes when the total row length is longer than a page. Ok thanks! That is besides my original question but what is the complication in that case? Is the blob partially stored in the row, or is some indirection to other pages used, with only some pointer stored in the row? > Surely you mean big-endian? Yes, my mistake :) > Storing Blobs has a few conversions added if you try to use the SQL to store it, but it's quite efficient when using the API. By "use SQL", you mean as a literal blob embedded in the SQL string? My main use case would be through the API, the actual numbers are fixed precision, and so scattered over the range they cannot be displayed to end users without using exponents, so some formatting will have to happen. Besides size, using a blob rather than base32/base64 would simplify the encoding/decoding, and for debugging, blobs usually display in an hexadecimal form here, so a big-endian blob would be directly "readable". On Wed, Mar 2, 2016 at 11:07 PM, Doug Currie wrote: > On Wed, Mar 2, 2016 at 4:42 PM, R Smith wrote: > > > > > > Personally, unless your space is constrained, I would simply save the > > numbers as strings, perhaps Hex or BCD with leading chars and convert as > > needed. This would sort correctly without tricks and not do much worse > for > > space. (Base64 would be even better space-wise but won't sort correct). > > > > There is an encoding defined in RFC 4648 "Base-N Encodings" that does > preserve sort order; it is called Base 32 Encoding with Extended Hex > Alphabet. I would think the Base64 alphabet could be rearranged to have the > same property. > > e > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
[sqlite] Are BLOB stored inline with the rest of the record fields?
What is the likely max length of a row in your table? if your row length is less than page length-35 then it will fit in to one page. The amount of wasted page depends on what fraction of a page size your typical record is. If you have a record that is 600 bytes long on a DB with 1024 byte page size then you would "waste" about 1024-600 = 424 bytes per record . If your page size is 4096 then for the same record size you fit 6 records into the page so would waste about 4096 - (600*6) = 496 / 6 = 82 bytes per record. Pick your page size appropriately although there is usually more to think about with page size than just wasted space. Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 3 March 2016 at 08:58, Simon Slavin wrote: > >> On 3 Mar 2016, at 8:16am, Eric Grange wrote: >> >>> All BLOBs are stored inline. The only complication comes when the total >>> row length is longer than a page. >> >> Ok thanks! >> >> That is besides my original question but what is the complication in that >> case? >> Is the blob partially stored in the row, or is some indirection to other >> pages used, with only some pointer stored in the row? > > SQLite database files are split up into pages. Every page in a database is > the same number of bytes long. Every page belongs to the header, or to free > space, or to a specific TABLE or INDEX. > > SQLite stores all the values for a row together. Changing the value of any > column for a row requires rewriting the entire row, and then the row entry > for every index on that table which was changed. [1] > > Ideally a row of data would fit in a page. This simplifies the job of > reading a row from disk or changing the data in a row. SQLite does this > where possible. > > But it's possible for a row to be longer than a page. This happens > especially where the row contains a long string or a long BLOB. In that case > SQLite has to fetch more than one page from disk just to access the values > for that row. And it's possible for a BLOB to be so long that it requires > one or more pages entirely to itself. So the word 'inline' is a little > tricky here because there is more than one 'line' involved. > > Simon. > > [1] Handwave a few exceptions. > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Are BLOB stored inline with the rest of the record fields?
> On 3 Mar 2016, at 8:16am, Eric Grange wrote: > >> All BLOBs are stored inline. The only complication comes when the total >> row length is longer than a page. > > Ok thanks! > > That is besides my original question but what is the complication in that > case? > Is the blob partially stored in the row, or is some indirection to other > pages used, with only some pointer stored in the row? SQLite database files are split up into pages. Every page in a database is the same number of bytes long. Every page belongs to the header, or to free space, or to a specific TABLE or INDEX. SQLite stores all the values for a row together. Changing the value of any column for a row requires rewriting the entire row, and then the row entry for every index on that table which was changed. [1] Ideally a row of data would fit in a page. This simplifies the job of reading a row from disk or changing the data in a row. SQLite does this where possible. But it's possible for a row to be longer than a page. This happens especially where the row contains a long string or a long BLOB. In that case SQLite has to fetch more than one page from disk just to access the values for that row. And it's possible for a BLOB to be so long that it requires one or more pages entirely to itself. So the word 'inline' is a little tricky here because there is more than one 'line' involved. Simon. [1] Handwave a few exceptions.
[sqlite] Are BLOB stored inline with the rest of the record fields?
On 2016/03/02 6:47 PM, Eric Grange wrote: > Hi, > > I am wondering if very small BLOBs are stored inline with the record fields > (in the same page) or in another page? > > So far I had not worried about it, but now I have a usage case where I need > to store 128 and 256 bit integers (numbers, not IDs or hashes), SQLite does > not support such large integers natively, so I am planning to store them as > little-endian binary blobs, so that the BINARY collation can work for > sorting, and for the rest I would do the processing with custom functions > or plain old code. > > However I will have quite a lot of those numbers, and would like to know if > they would be stored inline with the rest of the records (would be > preferable for db size), or with an indirection in another page of the db > (so larger db), and if stored in another page, if there is an alternative, > such as passing binary as "utf8 text"and forcing a COLLATE BINARY on the > column? > > (yes, I could test that or use the source, but I would also like to have > some "designer feedback", as what current SQLite version does may not be > what designers intend for future SQLite versions) Personally, unless your space is constrained, I would simply save the numbers as strings, perhaps Hex or BCD with leading chars and convert as needed. This would sort correctly without tricks and not do much worse for space. (Base64 would be even better space-wise but won't sort correct). Storing Blobs has a few conversions added if you try to use the SQL to store it, but it's quite efficient when using the API. I am confused as to why you would prefer little-endian word order? The binary sorting would not work at all I think, unless I am missing some trick in SQLite's sort methods. Lots of considerations to take into account, speed, DB size etc. If I were you, I'd definitely do some testing first of different methods with data simulated for your intended use case. I don't think there's a one-size-fits-all-solution. Cheers, Ryan
[sqlite] Are BLOB stored inline with the rest of the record fields?
Hi, I am wondering if very small BLOBs are stored inline with the record fields (in the same page) or in another page? So far I had not worried about it, but now I have a usage case where I need to store 128 and 256 bit integers (numbers, not IDs or hashes), SQLite does not support such large integers natively, so I am planning to store them as little-endian binary blobs, so that the BINARY collation can work for sorting, and for the rest I would do the processing with custom functions or plain old code. However I will have quite a lot of those numbers, and would like to know if they would be stored inline with the rest of the records (would be preferable for db size), or with an indirection in another page of the db (so larger db), and if stored in another page, if there is an alternative, such as passing binary as "utf8 text"and forcing a COLLATE BINARY on the column? (yes, I could test that or use the source, but I would also like to have some "designer feedback", as what current SQLite version does may not be what designers intend for future SQLite versions) Thanks!
[sqlite] Are BLOB stored inline with the rest of the record fields?
On Wed, Mar 2, 2016 at 4:42 PM, R Smith wrote: > > > Personally, unless your space is constrained, I would simply save the > numbers as strings, perhaps Hex or BCD with leading chars and convert as > needed. This would sort correctly without tricks and not do much worse for > space. (Base64 would be even better space-wise but won't sort correct). > There is an encoding defined in RFC 4648 "Base-N Encodings" that does preserve sort order; it is called Base 32 Encoding with Extended Hex Alphabet. I would think the Base64 alphabet could be rearranged to have the same property. e
[sqlite] Are BLOB stored inline with the rest of the record fields?
On 2 Mar 2016, at 4:47pm, Eric Grange wrote: > I am wondering if very small BLOBs are stored inline with the record fields > (in the same page) or in another page? All BLOBs are stored inline. The only complication comes when the total row length is longer than a page. Simon.
[sqlite] Are BLOB stored inline with the rest of the record fields?
On 3/2/16, Eric Grange wrote: > I am planning to store them as > little-endian binary blobs, so that the BINARY collation can work for > sorting, Surely you mean big-endian? -- D. Richard Hipp drh at sqlite.org