Hi Richard,
You know that is the first clear and concise explanation of why not to store large 
blobs in a database that I have heard anywhere.

Greg 
  ----- Original Message ----- 
  From: D. Richard Hipp 
  Cc: [EMAIL PROTECTED] 
  Sent: Monday, April 19, 2004 9:50 AM
  Subject: Re: [sqlite] row size limit


  [EMAIL PROTECTED] wrote:
  > According to the FAQ on sqlite.org, the row size is arbitrarily
  > limited to 1MB, which can be increased to 16MB by changing a
  > #define in the source code.
  > 
  > My question is, why even limit the row size? Is there a way the
  > code can be modified so that there is no limit for the row size (other
  > than the available disk/memory space)?
  > 

  The current file format allocates 24 bits for storing the number of
  bytes in a particular row.  (See http://www.sqlite.org/fileformat.html
  for details.)  So the current version of SQLite will never allow more
  than 16MB in one row.  The proposed version 3.0.0 of SQLite uses a
  variable-length integer to store the row size and can thus accommodate
  up to 2^64 bytes in a single row.  In theory.
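  A quick back-of-the-envelope check of those limits (plain arithmetic,
  not SQLite internals):

```python
# A 24-bit length field caps a row at 2^24 bytes.
max_row_24bit = 2 ** 24
print(max_row_24bit)                    # 16777216 bytes
print(max_row_24bit // (1024 * 1024))   # 16 (MB)

# A variable-length integer spanning 64 bits raises the
# theoretical ceiling to 2^64 bytes (~18.4 exabytes).
max_row_varint = 2 ** 64
print(max_row_varint)
```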

  But a limit of 1MB will probably still be enforced.  Why is this?

  SQLite stores large rows by breaking the data up into 1K chunks and
  storing each chunk in a separate page of the database.  Filesystems
  do much the same thing in an operating system.  But filesystems have
  an advantage over databases in that they can access the many chunks
  that make up a file in parallel, whereas SQLite has to access them
  one at a time.

  Suppose you have a 1MB row in SQLite and you want to read the whole
  thing.  SQLite must first ask for the 1st 1K page and wait for it to
  be retrieved.  Then it asks for the 2nd 1K page and waits for it.
  And so forth for all 1000+ pages.  If each page retrieval requires
  1 rotation of the disk, that's about 8.5 seconds on a 7200RPM disk
  drive.
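  The 8.5-second figure follows from simple arithmetic, assuming (as
  above) one full platter rotation per 1K page:

```python
pages = 1024              # a 1MB row split into 1K chunks
rpm = 7200                # disk rotational speed
rotations_per_second = rpm / 60        # 120 rotations per second
seconds_per_rotation = 1 / rotations_per_second  # ~8.33 ms each
total_seconds = pages * seconds_per_rotation
print(f"{total_seconds:.2f} seconds")  # 8.53 seconds
```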

  The filesystem code inside the operating system can do the same task
  in parallel.  If you ask the operating system to read in all of a
  1MB file for you, it can request many separate blocks from the disk
  controller at once.  The blocks might arrive out of order, but the
  OS can reassemble them into the correct order before returning the
  result up to the user-space process.  Using this approach, only a few
  rotations of the disk platter would be required to retrieve a 1MB
  file, instead of thousands.  The retrieval will be 100s of times faster.

  The moral of the story:  If you have large amounts of data you want
  to store, it is best to store that data in a separate file and then
  write the name of that file into the database.
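  A minimal sketch of that pattern using Python's sqlite3 module (the
  table and column names here are illustrative, not from the message):

```python
import os
import sqlite3
import tempfile

# Write the large payload to its own file on disk...
blob = b"x" * (1024 * 1024)  # stand-in for 1MB of real data
datadir = tempfile.mkdtemp()
path = os.path.join(datadir, "payload.bin")
with open(path, "wb") as f:
    f.write(blob)

# ...and store only the file's name in the database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, path TEXT)")
db.execute("INSERT INTO docs (path) VALUES (?)", (path,))

# To read the data back, look up the path in SQLite, then let
# the operating system's filesystem fetch the blocks in parallel.
(stored_path,) = db.execute("SELECT path FROM docs").fetchone()
with open(stored_path, "rb") as f:
    data = f.read()
print(len(data))  # 1048576
```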
  -- 
  D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565


