> > Thank you all, it all makes sense (at least to me) . > > The "row" object contains a pointer to a location elsewhere in the SAME > FILE where the BLOB itself is kept. That pointer (and maybe the size of > the BLOB itself) add the 5 to 9 bytes to the row as discussed in the > docs. The actual BLOB data is included with the other "row" data in the > same file but not actually stored as part of the row. > > This brings up a tangent question. As related in another thread (I can't > find it right now) another DBA suggested that for greater performance, > that BLOB columns should be split from the non-BLOB columns into their > own table. His reasoning was that in order to process a WHERE clause > that the entire row (including the BLOB columns) was retrieved. He said > he was able to improve his query performance by only including the BLOB > data when it was actually needed for the output. > > Could his performance differences be because the BLOB objects are > interspersed _between_ the "ROW" objects and is forcing his disks to > seek more to get from row to row? Or, does anyone know if his initial > theory (that the BLOBs were loaded even if not needed) is right? > > humbly impressed and freshly educated, > > Shawn Green > Database Administrator > Unimin Corporation - Spruce Pine > > > > > [EMAIL PROTECTED] > om > To: > <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> > > cc: > <[EMAIL PROTECTED]>, > <[EMAIL PROTECTED]> > > 06/17/2004 11:12 Fax to: > AM > Subject: RE: Where are > BLOBs / TEXTs stored? > > > > > > > Here's an excerpt from the MySQL internal doc, the format is better in > html but not appropriate for this forum. The last paragraph sums it up: > > > Hexadecimal Display of Table1.MYD file > F1 61 62 63 00 F5 64 00 66 00 ... .abc..d e. > > > Here's how to read this hexadecimal-dump display: > > > The hexadecimal numbers F1 61 62 63 00 F5 64 20 66 00 are byte values > and the column on the right is an attempt to show the same bytes in > ASCII. > The F1 byte means that there are no null fields in the first row. > The F5 byte means that the second column of the second row is NULL. > (It's probably easier to understand the flag setting if you restate F5 > as 11110101 binary, and (a) notice that the third flag bit from the > right is on, and (b) remember that the first flag bit is the X bit.) > > > There are complications -- the record header is more complex if there > are variable-length fields -- but the simple display shown in the > example is exactly what you'd see if you looked at the MySQL Data file > with a debugger or a hexadecimal file dumper. > > > So much for the fixed format. Now, let's discuss the dynamic format. > > > The dynamic file format is necessary if rows can vary in size. That will > be the case if there are BLOB columns, or "true" VARCHAR columns. > (Remember that MySQL may treat VARCHAR columns as if they're CHAR > columns, in which case the fixed format is used.) A dynamic row has more > fields in the header. The important ones are "the actual length", "the > unused length", and "the overflow pointer". The actual length is the > total number of bytes in all the columns. The unused length is the total > number of bytes between one physical record and the next one. The > overflow pointer is the location of the rest of the record if there are > multiple parts. > > > For example, here is a dynamic row: > 03, 00 start of header > 04 actual length > 0c unused length > 01, fc flags + overflow pointer > **** data in the row > ************ unused bytes > <-- next row starts here) > > > > > In the example, the actual length and the unused length are short (one > byte each) because the table definition says that the columns are short > -- if the columns were potentially large, then the actual length and the > unused length could be two bytes each, three bytes each, and so on. In > this case, actual length plus unused length is 10 hexadecimal (sixteen > decimal), which is a minimum. > > As for the third format -- packed -- we will only say briefly that: > > Numeric values are stored in a form that depends on the range (start/end > values) for the data type. > All columns are packed using either Huffman or enum coding. > For details, see the source files /myisam/mi_statrec.c (for fixed > format), /myisam/mi_dynrec.c (for dynamic format), and > /myisam/mi_packrec.c (for packed format). > > Note: Internally, MySQL uses a format much like the fixed format which > it uses for disk storage. The main differences are: > > BLOBs have a length and a memory pointer rather than being stored > inline. > "True VARCHAR" (a column storage which will be fully implemented in > version 5.0) will have a 16-bit length plus the data. > All integer or floating-point numbers are stored with the low byte > first. Point (3) does not apply for ISAM storage or internals. > > Ed > ------------------------------------------------------------------------ > ---- > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Thursday, June 17, 2004 9:00 AM > To: [EMAIL PROTECTED] > Cc: emierzwa; [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: RE: Where are BLOBs / TEXTs stored? > > > [EMAIL PROTECTED] wrote on 17/06/2004 15:35:36: > >> >> I am curious about this, too. However, I don't think that you answer > the >> original question. >> >> Are BLOBs stored as separate files, one file per object? Are they > combined >> into a single large BLOB file? are they aggregated into several medium >> sized files? Answering "where are they stored on the disk" may be a > better >> response as the docs state that they are not stored in the MyISAM > table >> itself but as "separate objects" (which means what, exactly?) > > This is not how I read the section of the manual. Normally, a database > row > is a single "Object" within the MyISAM file, whcih contains many > "Objects". Successive numeric fields will be stored in adjacent words of > > the MyISAM file in exactly the order you see them when you do a "SELECT > *". If you want to access this record, then only one disk seek is needed > > to fetch it. However, because large BLOBs are rarely involved in > searches, > rather than creating a single huge record with the BLOB embedded in it, > the BLOB is stored elsewhere *in the same .myd file*, with only a > pointer > to the position of the blob within the file. > > The upside of this is that for searches not involving the BLOB field, > and > after the indexes have been exausted, only the relatively small non-BLOB > > needs to be read and checked. The downside is that if the search > involves > the BLOB field, or if the BLOB field needsw to be fetched, then a second > > disk access is required, reducing performance. > > That is how I understand it: if anybody knows better, feel free to > correct > me - one learns by ones mistakes. > > Alec > There are certain performance advantages if the rows in a table are fixed length. If blobs/text fields are moved to a separate table then the varchar fields can be stored as char. Also, you will no have to clean up the database after large numbers of deletes/updates/inserts because the space in the file will not become fragmented.
-- William R. Mussatto, Senior Systems Engineer Ph. 909-920-9154 ext. 27 FAX. 909-608-7061 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]