Re: blob data types

2014-05-26 Thread Reindl Harald

Am 26.05.2014 11:40, schrieb geetanjali mehra:
 I want to know where MyISAM and InnoDB store their BLOB data: inside
 the table or outside the table. I tried to understand BLOBs using the MySQL
 online docs but failed.

inside the table, it's just a field type





Re: blob data types

2014-05-26 Thread geetanjali mehra
Is it possible to move BLOB values out of the table and keep them on a
separate page, while the BLOB column remains part of the table?


Geetanjali Mehra
Oracle and MySQL DBA Corporate Trainer



On Mon, May 26, 2014 at 3:26 PM, Reindl Harald h.rei...@thelounge.net wrote:


 Am 26.05.2014 11:40, schrieb geetanjali mehra:
  I want to know where MyISAM and InnoDB store their BLOB data: inside
  the table or outside the table. I tried to understand BLOBs using the MySQL
  online docs but failed.

 inside the table, it's just a field type




Re: blob data types

2014-05-26 Thread yoku ts.
Hello,

As far as I can see, MyISAM stores BLOB columns in the same space as other
column types, but InnoDB doesn't (if you mean the same .ibd file, that part
is true).

http://dev.mysql.com/doc/refman/5.6/en/innodb-row-format-overview.html

When a BLOB is stored off-page, ROW_FORMAT=COMPACT keeps the first 768 bytes
of the column in the row, while ROW_FORMAT=DYNAMIC or COMPRESSED keeps only
a 20-byte pointer in the row.

http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-compression-internals-storage-blobs.html
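
A minimal sketch of checking (and changing) a table's row format from a
client; mysql-connector-python, the credentials and the table name t1 are
assumptions for illustration only:

# Sketch only: inspect and change an InnoDB table's row format.
# mysql-connector-python, the credentials and table "t1" are assumed.
import mysql.connector

cnx = mysql.connector.connect(user="root", password="secret",
                              host="localhost", database="test")
cur = cnx.cursor()

cur.execute("SELECT table_name, row_format"
            " FROM information_schema.tables"
            " WHERE table_schema = %s AND table_name = %s",
            ("test", "t1"))
print(cur.fetchone())        # e.g. ('t1', 'Compact')

# Switching to DYNAMIC stores long BLOB values fully off-page
# (on 5.5/5.6 this may also require innodb_file_format=Barracuda
# and innodb_file_per_table=1).
cur.execute("ALTER TABLE t1 ROW_FORMAT=DYNAMIC")

cur.close()
cnx.close()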



2014-05-26 18:56 GMT+09:00 Reindl Harald h.rei...@thelounge.net:


 Am 26.05.2014 11:40, schrieb geetanjali mehra:
  I want to know where MyISAM and InnoDB store their BLOB data: inside
  the table or outside the table. I tried to understand BLOBs using the MySQL
  online docs but failed.

 inside the table, it's just a field type




Re: blob data types

2014-05-26 Thread Reindl Harald
just don't store large binary data in tables

save the files somewhere in the application
and keep only references to the files in
the database
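
A minimal sketch of that approach; the storage directory, table and column
names are made up for the example, and mysql-connector-python stands in for
whatever client library the application actually uses:

# Sketch only: write the bytes to a file and store just a reference
# (plus some metadata) in MySQL. All names below are illustrative.
import os
import uuid
import mysql.connector

STORAGE_ROOT = "/var/app/files"          # made-up directory

def save_attachment(cnx, original_name, data):
    rel_path = "%s_%s" % (uuid.uuid4().hex, original_name)
    with open(os.path.join(STORAGE_ROOT, rel_path), "wb") as fh:
        fh.write(data)
    cur = cnx.cursor()
    cur.execute("INSERT INTO attachments (orig_name, rel_path, size)"
                " VALUES (%s, %s, %s)",
                (original_name, rel_path, len(data)))
    cnx.commit()
    cur.close()
    return rel_path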

Am 26.05.2014 12:11, schrieb geetanjali mehra:
 Is it possible to move BLOB values out of the table and keep them on a
 separate page, while the BLOB column remains part of the table?
 
 On Mon, May 26, 2014 at 3:26 PM, Reindl Harald h.rei...@thelounge.net wrote:
 

 Am 26.05.2014 11:40, schrieb geetanjali mehra:
 I want to know where MyISAM and InnoDB store their BLOB data: inside
 the table or outside the table. I tried to understand BLOBs using the MySQL
 online docs but failed.

 inside the table, it's just a field type





Re: blob data types

2014-05-26 Thread Johan De Meersman
- Original Message -
 From: Reindl Harald h.rei...@thelounge.net
 To: mysql@lists.mysql.com
 Sent: Monday, 26 May, 2014 11:56:26 AM
 Subject: Re: blob data types
 
 
 Am 26.05.2014 11:40, schrieb geetanjali mehra:
  I want to know where MyISAM and InnoDB store their BLOB data: inside
  the table or outside the table. I tried to understand BLOBs using the MySQL
  online docs but failed.
 
 inside the table, it's just a field type

Didn't InnoDB store them out of line (but still inside the datafile, of
course)? I might be remembering wrong, though.

You could design your application to store blobs in files instead; but then
you preferably also need redundant/shared storage. It's a common enough
design, but it needs thought and understanding :-)

As a tangential thought, there was also a third-party plugin engine at some
point that streamed blobs outside of the normal MySQL protocol. Can't
remember the name.


-- 
Unhappiness is discouraged and will be corrected with kitten pictures.




RE: BLOB data gets encoded as utf8!

2011-02-09 Thread Andreas Iwanowski
Thank you for your reply, Janusz.

I appreciate your help.

I have tried making that call before the INSERT statement, but to no avail.
The table collation is set to utf8 - default collation, and all columns are 
set to Table default.
I am thinking that this problem might be due to me sending the BLOB data in a 
unicode string to the server, i.e:

INSERT INTO TableName (Rawfield) VALUES ('%s'); - whereas the %s string is a 
unicode string

That string is encoded by escaping mysql characters according to 
http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html (0x00, 0x22, 0x27, 
0x5C).
Then each byte (and each escaped byte) is followed by a 0x00 before being sent 
to the server, just so that I can send it as a unicode string via ODBC.

I.e. the binary data FA 5C 93 A4 would be expanded into FA 00 5C 00 5C 00 93 00 
A4 00

Otherwise I can't send the data in a unicode INSERT statement via ODBC from C++ 
/ MFC.

Do you think that could be the issue? If so, do you have a suggestion on how to 
do it better?

Thank you again for your help!

Sincerely,
   Andreas Iwanowski

-Original Message-
From: Janusz Paśkiewicz [mailto:ad...@multipasko.pl] 
Sent: Wednesday, February 09, 2011 7:45 AM
To: Andreas Iwanowski
Subject: BLOB data gets encoded as utf8! (Anyone?)

Before inserting BLOB data use:

 

mysql_query("SET CHARACTER SET latin1");

after that you can set it back to:

mysql_query("SET CHARACTER SET utf8");








This is due to:

http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html




If the column has a binary data type (BINARY, VARBINARY, BLOB), all the
values that it contains must be encoded using a single character set (the
character set you're converting the column to). If you use a binary column
to store information in multiple character sets, MySQL has no way to know
which values use which character set and cannot convert the data properly.







Kind Regards,
Janusz Paskiewicz

www.freelancer4u.net







Re: BLOB data gets encoded as utf8!

2011-02-09 Thread Johan De Meersman
I can't help but wonder: if you send a string, does that mean you're putting
text in a blob? Blobs are binary, and thus don't get encoded in the sense
of UTF8 vs Unicode. For a string, you may want a TEXT type column.

On the other hand, if you're indeed trying to insert binary data, it is not
the best of ideas to insert it into an SQL string, where it will be subject
to UTF8 or whatever interpretation.

For large objects, and generally with repeating statements too, it's best to
use bind variables to send your data. That also removes the need to worry
about escaping, SQL injection and the like.
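
For what it's worth, a minimal sketch of what bind variables look like for a
blob insert. The OP is on ODBC/MFC, so mysql-connector-python is used here
purely for illustration; TableName and Rawfield are the OP's own placeholders:

# Sketch only: send the binary payload as a bound parameter instead of
# splicing it into the SQL text, so it is never escaped or re-encoded
# by hand. mysql-connector-python is an assumption here.
import mysql.connector

cnx = mysql.connector.connect(user="app", password="secret",
                              host="localhost", database="test")
cur = cnx.cursor(prepared=True)          # server-side prepared statement

with open("image.bin", "rb") as fh:      # made-up input file
    payload = fh.read()

cur.execute("INSERT INTO TableName (Rawfield) VALUES (%s)", (payload,))
cnx.commit()
cur.close()
cnx.close()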


2011/2/10 Andreas Iwanowski namez...@afim.info

 Thank you for your reply, Janusz.

 I appreciate your help.

 I have tried making that call before the INSERT statement, but to no avail.
 The table collation is set to utf8 - default collation, and all columns
 are set to Table default.
 I am thinking that this problem might be due to me sending the BLOB data in
 a unicode string to the server, i.e:

 INSERT INTO TableName (Rawfield) VALUES ('%s'); - whereas the %s string is
 a unicode string

 That string is encoded by escaping mysql characters according to
 http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html (0x00, 0x22,
 0x27, 0x5C).
 Then each byte (and each escaped byte) is followed by a 0x00 before being
 sent to the server, just so that I can send it as a unicode string via ODBC.

 I.e. the binary data FA 5C 93 A4 would be expanded into FA 00 5C 00 5C 00
 93 00 A4 00

 Otherwise I can't send the data in a unicode INSERT statement via ODBC from
 C++ / MFC.

 Do you think that could be the issue? If so, do you have a suggestion on
 how to do it better?

 Thank you again for your help!

 Sincerely,
   Andreas Iwanowski

 -Original Message-
 From: Janusz Paśkiewicz [mailto:ad...@multipasko.pl]
 Sent: Wednesday, February 09, 2011 7:45 AM
 To: Andreas Iwanowski
 Subject: BLOB data gets encoded as utf8! (Anyone?)

 Before inserting BLOB data use:



 mysql_query("SET CHARACTER SET latin1");




 after that you can set it back to:




 mysql_query("SET CHARACTER SET utf8");








 This is due to:

 http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html




 If the column has a binary data type (BINARY, VARBINARY, BLOB), all the
 values that it contains must be encoded using a single character set (the
 character set you're converting the column to). If you use a binary column
 to store information in multiple character sets, MySQL has no way to know
 which values use which character set and cannot convert the data properly.







 Kind Regards,
 Janusz Paskiewicz

 www.freelancer4u.net








-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel


Re: BLOB data gets encoded as utf8! (Anyone?)

2010-08-30 Thread Carlos Proal


I think the mistake is probably in the way you are inserting the blob.
If I understand right, you are inserting it with a traditional INSERT
statement? In my experience I always use prepared statements to insert
blobs; that way the data goes in exactly as in the source, without any
UTF-8 conversion.
Besides this, a plain INSERT can truncate your blob/statement, because
some drivers/connectors limit the size of the SQL statement, so if you are
uploading a huge file the statement will not work (I don't know if this
happens with the MySQL driver or what the limit is). So I strongly
recommend using prepared statements for blobs with any database.
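
On the size limit: in MySQL the cap on a single statement/packet is the
max_allowed_packet server variable, so it is worth checking it before sending
huge blobs. A quick sketch, with mysql-connector-python and the credentials
assumed for illustration:

# Sketch only: a single INSERT (or one prepared-statement execution)
# cannot exceed max_allowed_packet, so check it first.
import mysql.connector

cnx = mysql.connector.connect(user="app", password="secret",
                              host="localhost", database="test")
cur = cnx.cursor()
cur.execute("SHOW VARIABLES LIKE 'max_allowed_packet'")
name, value = cur.fetchone()
print(name, int(value))      # bytes; raise it in my.cnf if blobs are larger
cur.close()
cnx.close()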


Hope this helps.

Carlos Proal

On 8/28/2010 1:04 AM, Andreas Iwanowski wrote:

  Has no one encountered this before? There has got to be a solution, and
I still haven't found it... So if you have any input please let me know!

-Original Message-
From: Andreas Iwanowski [mailto:namez...@afim.info]
Sent: Tuesday, August 24, 2010 2:48 PM
To: mysql@lists.mysql.com
Subject: BLOB data gets encoded as utf8!

Hello everyone!

I am using an MFC unicode project that uses ODBC to access a MySQL
5.1.50 database via the MySQL ODBC 5.1.6 driver.
character_set_connection is set to utf8 (which I believe is the default
for the driver). One of the tables contains two LONGBLOB columns, and the
table default charset is utf8 (since the application is Unicode).

However, when inserting into the LONGBLOB columns via an INSERT
statement, the data gets corrupted/modified because it is incorrectly UTF-8
encoded.

My insert statement (simplified) does this:

INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_');

I have also tried the _binary introducer
(http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html)

INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_');

I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that
MySQL requires to be escaped per documentation for BLOB fields, and
according to the MySQL docs columns of any BLOB type use no character
set.

Here is the problem: I found that, for example, 0xFF gets modified to
0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF.

Needless to say the data becomes useless. I did, however, determine that
the data is already corrupted in the myodbc.log file that the driver
outputs if the option is set, so there is likely a problem with the
driver settings or with the statement itself.

I cannot be the first one to encounter this issue, so maybe if you have
an idea (or a solution would be even better! :) then please let me know.











Re: BLOB data gets encoded as utf8! (Anyone?)

2010-08-29 Thread Johan De Meersman
Is the code you use to get the data out in the same charset as the code you
use to put the data in? Both should ideally also match your database
setting. Have you tried explicitly setting the connection to UTF8?

Just swinging in the dark, here, really.
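
Two quick checks that usually narrow this kind of thing down: look at the
session character_set_* variables, and look at the stored bytes with HEX() to
confirm whether 0xFF really became 0xC3BF. A sketch, with mysql-connector-python
assumed and the OP's _table_/Data placeholders reused:

# Sketch only: inspect the session charsets and the stored bytes.
import mysql.connector

cnx = mysql.connector.connect(user="app", password="secret",
                              host="localhost", database="test")
cur = cnx.cursor()

cur.execute("SHOW VARIABLES LIKE 'character_set%'")
for name, value in cur:
    print(name, "=", value)

# HEX() shows the column content exactly as stored, bypassing any
# client-side decoding. _table_ and Data are the OP's placeholders.
cur.execute("SELECT HEX(LEFT(Data, 8)) FROM _table_ LIMIT 1")
print(cur.fetchone())

cur.close()
cnx.close()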


On Sat, Aug 28, 2010 at 8:04 AM, Andreas Iwanowski namez...@afim.infowrote:

  Has no one encountered this before? There has got to be a solution, and
 I still haven't found it... So if you have any input please let me know!

 -Original Message-
 From: Andreas Iwanowski [mailto:namez...@afim.info]
 Sent: Tuesday, August 24, 2010 2:48 PM
 To: mysql@lists.mysql.com
 Subject: BLOB data gets encoded as utf8!

 Hello everyone!

 I am using an MFC unicode project that uses ODBC to access a MySQL
 5.1.50 database via the MySQL ODBC 5.1.6 driver.
 character_set_connection is set to utf8 (which I believe is the default
 for the driver). One of the tables contains two LONGBLOB columns, and the
 table default charset is utf8 (since the application is Unicode).

 However, when inserting into the LONGBLOB columns via an INSERT
 statement, the data gets corrupted/modified because it is incorrectly UTF-8
 encoded.

 My insert statement (simplified) does this:

 INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_');

 I have also tried the _binary introducer
 (http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html)

 INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_');

 I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that
 MySQL requires to be escaped per documentation for BLOB fields, and
 according to the MySQL docs columns of any BLOB type use no character
 set.

 Here is the problem: I found that, for example, 0xFF gets modified to
 0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF.

 Needless to say the data becomes useless. I did, however, determine that
 the data is already corrupted in the myodbc.log file that the driver
 outputs if the option is set, so there is likely a problem with the
 driver settings or with the statement itself.

 I cannot be the first one to encounter this issue, so maybe if you have
 an idea (or a solution would be even better! :) then please let me know.









-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel


RE: Blob data

2007-07-03 Thread Rick James
I gave up on putting large blobs in MySQL -- too many limits around 16MB.

Instead I broke blobs into pieces, inserting them with a sequence number.

Added benefit: it does not clog up replication while a huge single insert is
being copied over the network and re-executed on the slaves.
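
A minimal sketch of the chunking idea (not Rick's actual code): a made-up
table blob_chunks(doc_id, seq, chunk) and mysql-connector-python, with each
piece kept well under max_allowed_packet:

# Sketch only: split a file into fixed-size pieces and insert each one
# with a sequence number; reassemble with ORDER BY seq.
import mysql.connector

CHUNK_SIZE = 512 * 1024      # keep each piece well under max_allowed_packet

def store_file(cnx, doc_id, path):
    cur = cnx.cursor()
    with open(path, "rb") as fh:
        seq = 0
        while True:
            piece = fh.read(CHUNK_SIZE)
            if not piece:
                break
            cur.execute("INSERT INTO blob_chunks (doc_id, seq, chunk)"
                        " VALUES (%s, %s, %s)", (doc_id, seq, piece))
            seq += 1
    cnx.commit()
    cur.close()

# Reading it back:
#   SELECT chunk FROM blob_chunks WHERE doc_id = %s ORDER BY seq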

 -Original Message-
 From: Paul McCullagh [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, June 27, 2007 2:57 AM
 To: Ann W. Harrison
 Cc: MySQL List; MySQL Internal
 Subject: Re: Blob data
 
 Hi Ann,
 
 Currently, the thoughts on how to make the BLOB references secure go  
 like this:
 
 The BLOB reference consists of 2 components: The first component is  
 basically an index used to find the BLOB on the server. The second  
 component is a random number generated when the BLOB is created.
 
 The random number acts as an authorization code, and is checked  
 when the BLOB is requested. So if the authorization code supplied in  
 the BLOB reference does not match the code stored by the server for  
 that BLOB, then the BLOB is not returned.
 
 If the authorization code is a 4-byte number, then the chances of  
 getting the correct code for any particular BLOB is 1 in 4 billion.  
 This makes it practically impossible to discover a BLOB by  
 generating BLOB references and requesting them from the server.
 
 However, it does mean that once you have a valid BLOB reference it  
 remains valid until the BLOB is deleted. So you can pass it 
 around to  
 your friends, or post it on the internet if you like.
 
 In order to prevent this (it will depend on the site, as to whether  
 this is required), it would be possible to add a dynamic 
 component to  
 the BLOB reference which has a certain lifetime (for example, it  
 expires after a certain amount of time, or when a database 
 session is  
 closed).
 
 Such a component would have to be added to the BLOB reference URL by  
 the storage engine on the fly. So, as the SELECT result is being  
 generated, the dynamic component is added to the BLOB references  
 returned in the rowset.
 
 Security of the BLOB streaming stuff is one of the major issues, so  
 further comments, questions and ideas are welcome!
 
 Best regards,
 
 Paul
 
 On Jun 26, 2007, at 4:36 PM, Ann W. Harrison wrote:
 
  Paul McCullagh wrote:
 
  It will also be possible to store the BLOBs out-of-row. In this  
  case, only a BLOB reference is stored in the row. The 
 reference is  
  basically a URL which can be used to retrieve the data. So when  
  you do an SQL SELECT which includes a BLOB column, the resulting  
  rowset does not contain the data, just the BLOB reference (URL).
 
  How does this work with access privileges?  Can you just send random
  numbers in the URL until you start seeing blob data?
 
  Best regards,
 
 
  Ann
 
 
 
 





Re: Blob data

2007-07-03 Thread Ann W. Harrison

Rick James wrote:


Instead I broke blobs into pieces, inserting them with a sequence number.


Even understanding the underlying problem, that still seems like an
unnatural way to store pictures and documents.


Added benefit:  Does not clog up replication while huge single-insert is
being copied over network and reexecuted on slaves. 


The design of blobs that Jim did at DEC included the ability to send
them across the network in chunks of a client-specified size.  In 1982
it was quite common to have blobs that were larger than physical memory.
What he did more recently was add a blob repository separate from the
active tablespace that allowed the backup function to skip unchanged
blobs while backing up active data.  It also allows replicants to share
a single copy of blobs, if appropriate.

There are lots of ways of making large blobs work better in relational
databases.

Regards,


Ann





RE: Blob data

2007-07-03 Thread colbey

Rick is dead-on correct; I call it chunking blob data.  There is an
article here on a simple implementation:

http://www.dreamwerx.net/phpforum/?id=1

I've had hundreds of thousands of files in this type of storage before
with no issues.


On Tue, 3 Jul 2007, Rick James wrote:

 I gave up on putting large blobs in Mysql -- too many limits around 16MB.

 Instead I broke blobs into pieces, inserting them with a sequence number.

 Added benefit:  Does not clog up replication while huge single-insert is
 being copied over network and reexecuted on slaves.

  -Original Message-
  From: Paul McCullagh [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, June 27, 2007 2:57 AM
  To: Ann W. Harrison
  Cc: MySQL List; MySQL Internal
  Subject: Re: Blob data
 
  Hi Ann,
 
  Currently, the thoughts on how to make the BLOB references secure go
  like this:
 
  The BLOB reference consists of 2 components: The first component is
  basically an index used to find the BLOB on the server. The second
  component is a random number generated when the BLOB is created.
 
  The random number acts as an authorization code, and is checked
  when the BLOB is requested. So if the authorization code supplied in
  the BLOB reference does not match the code stored by the server for
  that BLOB, then the BLOB is not returned.
 
  If the authorization code is a 4-byte number, then the chances of
  getting the correct code for any particular BLOB is 1 in 4 billion.
  This makes it practically impossible to discover a BLOB by
  generating BLOB references and requesting them from the server.
 
  However, it does mean that once you have a valid BLOB reference it
  remains valid until the BLOB is deleted. So you can pass it
  around to
  your friends, or post it on the internet if you like.
 
  In order to prevent this (it will depend on the site, as to whether
  this is required), it would be possible to add a dynamic
  component to
  the BLOB reference which has a certain lifetime (for example, it
  expires after a certain amount of time, or when a database
  session is
  closed).
 
  Such a component would have to be added to the BLOB reference URL by
  the storage engine on the fly. So, as the SELECT result is being
  generated, the dynamic component is added to the BLOB references
  returned in the rowset.
 
  Security of the BLOB streaming stuff is one of the major issues, so
  further comments, questions and ideas are welcome!
 
  Best regards,
 
  Paul
 
  On Jun 26, 2007, at 4:36 PM, Ann W. Harrison wrote:
 
   Paul McCullagh wrote:
  
   It will also be possible to store the BLOBs out-of-row. In this
   case, only a BLOB reference is stored in the row. The
  reference is
   basically a URL which can be used to retrieve the data. So when
   you do an SQL SELECT which includes a BLOB column, the resulting
   rowset does not contain the data, just the BLOB reference (URL).
  
   How does this work with access privileges?  Can you just send random
   numbers in the URL until you start seeing blob data?
  
   Best regards,
  
  
   Ann
 
 
 
 







RE: Blob data

2007-07-03 Thread Rick James
And while you are at it, you may as well compress the chunks.  Your machine
can probably compress/uncompress faster than it can write/read disk.  I use
Perl's Compress::Zlib or PHP's equivalent instead of MySQL's function for 2
reasons:
 * The network traffic is compressed.
 * MySQL puts an unnecessary extra byte on the end of the string (ok, this is
totally insignificant)

And definitely compress each chunk separately.  It seems that those library
routines slow down (excessive memory realloc??) after about 50K.  That is,
you can probably compress twenty 50K chunks faster than one 1M chunk.

My implementation did File -> Database -- the huge blob was never
instantiated completely in RAM, only one chunk at a time.  (Imagine trying
to store a 50GB file.)
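
A sketch of the same per-chunk compression using Python's zlib, standing in
for the Perl/PHP routines mentioned above; the 50K chunk size just mirrors
the figure in the message:

# Sketch only: compress each chunk on the client before inserting it,
# and decompress after reading the chunks back in sequence order.
import zlib

CHUNK_SIZE = 50 * 1024               # compress each ~50K piece separately

def compressed_chunks(fh):
    while True:
        piece = fh.read(CHUNK_SIZE)
        if not piece:
            break
        yield zlib.compress(piece)   # the network traffic is compressed too

def decompressed_chunks(rows):
    for blob in rows:                # rows: compressed chunks, in order
        yield zlib.decompress(blob)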

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, July 03, 2007 2:08 PM
 To: Rick James
 Cc: 'Paul McCullagh'; 'Ann W. Harrison'; 'MySQL List'
 Subject: RE: Blob data
 
 
  Rick is dead-on correct; I call it chunking blob data.  There is an
  article here on a simple implementation:
 
 http://www.dreamwerx.net/phpforum/?id=1
 
 I've had hundreds of thousands of files in this type of storage before
 with no issues.
 
 
 On Tue, 3 Jul 2007, Rick James wrote:
 
  I gave up on putting large blobs in Mysql -- too many 
 limits around 16MB.
 
  Instead I broke blobs into pieces, inserting them with a 
 sequence number.
 
  Added benefit:  Does not clog up replication while huge 
 single-insert is
  being copied over network and reexecuted on slaves.
 
   -Original Message-
   From: Paul McCullagh [mailto:[EMAIL PROTECTED]
   Sent: Wednesday, June 27, 2007 2:57 AM
   To: Ann W. Harrison
   Cc: MySQL List; MySQL Internal
   Subject: Re: Blob data
  
   Hi Ann,
  
   Currently, the thoughts on how to make the BLOB 
 references secure go
   like this:
  
   The BLOB reference consists of 2 components: The first 
 component is
   basically an index used to find the BLOB on the server. The second
   component is a random number generated when the BLOB is created.
  
   The random number acts as an authorization code, and is checked
   when the BLOB is requested. So if the authorization code 
 supplied in
   the BLOB reference does not match the code stored by the 
 server for
   that BLOB, then the BLOB is not returned.
  
   If the authorization code is a 4-byte number, then the chances of
   getting the correct code for any particular BLOB is 1 in 
 4 billion.
   This makes it practically impossible to discover a BLOB by
   generating BLOB references and requesting them from the server.
  
   However, it does mean that once you have a valid BLOB reference it
   remains valid until the BLOB is deleted. So you can pass it
   around to
   your friends, or post it on the internet if you like.
  
   In order to prevent this (it will depend on the site, as 
 to whether
   this is required), it would be possible to add a dynamic
   component to
   the BLOB reference which has a certain lifetime (for example, it
   expires after a certain amount of time, or when a database
   session is
   closed).
  
   Such a component would have to be added to the BLOB 
 reference URL by
   the storage engine on the fly. So, as the SELECT result is being
   generated, the dynamic component is added to the BLOB references
   returned in the rowset.
  
   Security of the BLOB streaming stuff is one of the major 
 issues, so
   further comments, questions and ideas are welcome!
  
   Best regards,
  
   Paul
  
   On Jun 26, 2007, at 4:36 PM, Ann W. Harrison wrote:
  
Paul McCullagh wrote:
   
It will also be possible to store the BLOBs 
 out-of-row. In this
case, only a BLOB reference is stored in the row. The
   reference is
basically a URL which can be used to retrieve the data. So when
you do an SQL SELECT which includes a BLOB column, the 
 resulting
rowset does not contain the data, just the BLOB 
 reference (URL).
   
How does this work with access privileges?  Can you 
 just send random
numbers in the URL until you start seeing blob data?
   
Best regards,
   
   
Ann
  
  
  
  
 
 
 
 





RE: Blob data

2007-07-03 Thread colbey

Interesting, I never tried compressing the data; sounds like that might be
a nice add-on.  Do you have any performance numbers you can share?  I posted
some performance numbers on one of my implementations some time ago.

I found the thread here:
http://lists.mysql.com/mysql/206337





On Tue, 3 Jul 2007, Rick James wrote:

 And while you are at it, you may as well compress the chunks.  You machine
 probably can compress/uncompress faster than it can write/read disk.  I use
 Perl's Zlib::Compress or PHP's equivalent instead of Mysql's function for 2
 reasons:
  * The network traffic is compressed.
  * Mysql puts an unnecessary extra byte on end of the string (ok, this is
 totally insignificant)

 And definitely compress each chunk separately.  It seems that those library
 routines slow down (excessive memory realloc??) after about 50K.  That is,
 you can probably compress 20 50K chunks faster than 1 1M chunk.

 My implementation did File - Database -- the huge blob was never
 instantiated completely in RAM, only one chunk at a time.  (Imagine trying
 to store a 50GB file.)

  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, July 03, 2007 2:08 PM
  To: Rick James
  Cc: 'Paul McCullagh'; 'Ann W. Harrison'; 'MySQL List'
  Subject: RE: Blob data
 
 
   Rick is dead-on correct; I call it chunking blob data.  There is an
   article here on a simple implementation:
 
  http://www.dreamwerx.net/phpforum/?id=1
 
  I've had hundreds of thousands of files in this type of storage before
  with no issues.
 
 
  On Tue, 3 Jul 2007, Rick James wrote:
 
   I gave up on putting large blobs in Mysql -- too many
  limits around 16MB.
  
   Instead I broke blobs into pieces, inserting them with a
  sequence number.
  
   Added benefit:  Does not clog up replication while huge
  single-insert is
   being copied over network and reexecuted on slaves.
  
-Original Message-
From: Paul McCullagh [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 27, 2007 2:57 AM
To: Ann W. Harrison
Cc: MySQL List; MySQL Internal
Subject: Re: Blob data
   
Hi Ann,
   
Currently, the thoughts on how to make the BLOB
  references secure go
like this:
   
The BLOB reference consists of 2 components: The first
  component is
basically an index used to find the BLOB on the server. The second
component is a random number generated when the BLOB is created.
   
The random number acts as an authorization code, and is checked
when the BLOB is requested. So if the authorization code
  supplied in
the BLOB reference does not match the code stored by the
  server for
that BLOB, then the BLOB is not returned.
   
If the authorization code is a 4-byte number, then the chances of
getting the correct code for any particular BLOB is 1 in
  4 billion.
This makes it practically impossible to discover a BLOB by
generating BLOB references and requesting them from the server.
   
However, it does mean that once you have a valid BLOB reference it
remains valid until the BLOB is deleted. So you can pass it
around to
your friends, or post it on the internet if you like.
   
In order to prevent this (it will depend on the site, as
  to whether
this is required), it would be possible to add a dynamic
component to
the BLOB reference which has a certain lifetime (for example, it
expires after a certain amount of time, or when a database
session is
closed).
   
Such a component would have to be added to the BLOB
  reference URL by
the storage engine on the fly. So, as the SELECT result is being
generated, the dynamic component is added to the BLOB references
returned in the rowset.
   
Security of the BLOB streaming stuff is one of the major
  issues, so
further comments, questions and ideas are welcome!
   
Best regards,
   
Paul
   
On Jun 26, 2007, at 4:36 PM, Ann W. Harrison wrote:
   
 Paul McCullagh wrote:

 It will also be possible to store the BLOBs
  out-of-row. In this
 case, only a BLOB reference is stored in the row. The
reference is
 basically a URL which can be used to retrieve the data. So when
 you do an SQL SELECT which includes a BLOB column, the
  resulting
 rowset does not contain the data, just the BLOB
  reference (URL).

 How does this work with access privileges?  Can you
  just send random
 numbers in the URL until you start seeing blob data?

 Best regards,


 Ann
   
   
   
   
  
  
  
 



Re: Blob data

2007-06-27 Thread Paul McCullagh

Hi Ann,

Currently, the thoughts on how to make the BLOB references secure go  
like this:


The BLOB reference consists of 2 components: The first component is  
basically an index used to find the BLOB on the server. The second  
component is a random number generated when the BLOB is created.


The random number acts as an authorization code, and is checked  
when the BLOB is requested. So if the authorization code supplied in  
the BLOB reference does not match the code stored by the server for  
that BLOB, then the BLOB is not returned.


If the authorization code is a 4-byte number, then the chances of  
getting the correct code for any particular BLOB is 1 in 4 billion.  
This makes it practically impossible to discover a BLOB by  
generating BLOB references and requesting them from the server.
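
A toy illustration of the two-part reference described above (an index plus a
random authorization code that is checked on fetch); this is not the engine's
actual reference format, just the idea:

# Toy sketch only: blob reference = index + random 4-byte auth code.
import secrets
import hmac

_auth_codes = {}                         # blob_id -> code held by the "server"

def create_reference(blob_id):
    code = secrets.token_hex(4)          # 4 random bytes -> 1 in ~4 billion
    _auth_codes[blob_id] = code
    return "%d-%s" % (blob_id, code)     # what the client gets back

def may_fetch(reference):
    blob_id, code = reference.split("-", 1)
    stored = _auth_codes.get(int(blob_id))
    # wrong or guessed codes are rejected; compare in constant time
    return stored is not None and hmac.compare_digest(stored, code)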


However, it does mean that once you have a valid BLOB reference it  
remains valid until the BLOB is deleted. So you can pass it around to  
your friends, or post it on the internet if you like.


In order to prevent this (it will depend on the site, as to whether  
this is required), it would be possible to add a dynamic component to  
the BLOB reference which has a certain lifetime (for example, it  
expires after a certain amount of time, or when a database session is  
closed).


Such a component would have to be added to the BLOB reference URL by  
the storage engine on the fly. So, as the SELECT result is being  
generated, the dynamic component is added to the BLOB references  
returned in the rowset.


Security of the BLOB streaming stuff is one of the major issues, so  
further comments, questions and ideas are welcome!


Best regards,

Paul

On Jun 26, 2007, at 4:36 PM, Ann W. Harrison wrote:


Paul McCullagh wrote:


It will also be possible to store the BLOBs out-of-row. In this  
case, only a BLOB reference is stored in the row. The reference is  
basically a URL which can be used to retrieve the data. So when  
you do an SQL SELECT which includes a BLOB column, the resulting  
rowset does not contain the data, just the BLOB reference (URL).


How does this work with access privileges?  Can you just send random
numbers in the URL until you start seeing blob data?

Best regards,


Ann






Re: Blob data

2007-06-26 Thread Kevin Waterson
This one time, at band camp, Warren Young [EMAIL PROTECTED] wrote:

 Storing arbitrary data in a BLOB column is 
 tantamount to trying to turn the database into a file system.  If you 
 want a file system, use a file system.

What is a file system, if not a database?

Kevin




Re: Blob data

2007-06-26 Thread Paul McCullagh

Hi Steve,

Some very good points.

We are working on a solution for BLOBs in a MySQL database at
www.blobstreaming.org.


Any good solution will have to take your comments into account...

On Jun 22, 2007, at 9:24 AM, Steve Edberg wrote:


At 12:11 PM +0530 6/22/07, Ratheesh K J wrote:

Hello All,

I want a clarification. We run a forum wherein people send
messages with/without attachments. Attachments may contain images,
documents, etc. We are actually storing the attachment in a blob
column. Sometimes the attachments are big, and today the table
size has grown to 40 GB. This has created a headache for any
maintenance task, backup, restoration, etc.


I want to know whether this is the right approach, or should we
actually store the attachments in directories and just store the
attachment path in the database?


Kindly suggest the best approach so that I can reduce the database  
size.


Thanks in advance



Yes, storing files - especially non-textual files - in the file  
system instead of the database is generally considered the best  
practice.


At one point I had created a document management system that stored  
everything in the database as you are doing; my rationale was that  
it allowed me to manage permissions using the existing database  
permissions, and to back up the whole database using mysqldump, vs  
mysqldump + doing a tar of the files. However, I abandoned this  
approach for the following reasons:


(1) Storing non-plaintext items (eg; pictures) in the database  
makes it bigger and slower without added value - you can't (at  
least not yet, or in the foreseeable future) do a meaningful search  
on a blob.


(2) It becomes more difficult to split storage out onto multiple  
filesystems; eg, leaving the database files in /var/database,  
putting the documents themselves into /home/docmanager, etc.


(3) It makes queries on the commandline unwieldy; if you have a  
blob field, doing a select * to check a record's contents can dump  
a lot of garbage on the screen.


(4) It can make doing incremental backups more difficult; if the  
documents themselves are relatively static, but the document  
metadata stored in the database is very dynamic, it becomes simple  
to do a compact daily database dump + a weekly document directory  
backup (for example) if the files are not in the database.


What I do is create a unique SHA1 hash when a file is uploaded (eg;  
sha1(rand()). The original filename and the 40-character hash are  
stored in the database, and the document is stored in the  
filesystem using the hash as the filename. I can optionally  
compress and encrypt the document as well, storing the encryption  
key in the database. This gives (for me) adequate document  
security. An additional advantage is that you can take advantage of  
the filesystem tree if you have a large number of documents. For  
example, if a document hash is 'f0118e9bd2c4fb29c64ee03abce698b8',  
you can store the file in the directory tree
f0/11/8e/f0118e9bd2c4fb29c64ee03abce698b8 (extending to as many levels as
you feel necessary). By keeping the number of files per directory
fairly small, file retrieval becomes relatively fast. As the hashes
approximate a random distribution, you should always have a
close-to-balanced tree.


Lastly, I store a hash of the document itself in the database as  
well. This allows me to detect if duplicate files are uploaded, and  
to determine if a previously-uploaded file has been corrupted in  
some way.


steve

--
+--- my people are the people of the dessert, ---+
| Steve Edberghttp://pgfsun.ucdavis.edu/ |
| UC Davis Genome Center[EMAIL PROTECTED] |
| Bioinformatics programming/database/sysadmin (530)754-9127 |
+ said t e lawrence, picking up his fork +










Re: Blob data

2007-06-26 Thread Paul McCullagh

Hi Konstantin,

On Jun 26, 2007, at 1:37 PM, Konstantin Osipov wrote:


* Paul McCullagh [EMAIL PROTECTED] [07/06/26 15:35]:

I am about to release the first version of the BLOB Streaming engine.
We are not that far yet (obviously), but our aim is to create an open
platform which will be extendable. So adding various image and
fulltext searching should be possible. Also the inclusion of
compression and encryption as you mentioned below.


Our client/server protocol and SQL interpreter do not have support
for blob streaming. We always copy the entire column to and from
the engine and, in the protocol, to the client.
So it would actually be very interesting to see how you
contemplate delivering data between applications and the storage
engine.


The BLOB Streaming engine allows you to stream BLOB data directly  
from a MySQL table using HTTP.


To store a BLOB to the database you will use the HTTP PUT method to  
upload the data into a table row, and to retrieve a BLOB, you use the  
HTTP GET method to download the data from a table row.


The URL specifies the database, table, column and search condition,  
in order to identify the BLOB location.


It will also be possible to store the BLOBs out-of-row. In this  
case, only a BLOB reference is stored in the row. The reference is  
basically a URL which can be used to retrieve the data. So when you  
do an SQL SELECT which includes a BLOB column, the resulting rowset  
does not contain the data, just the BLOB reference (URL).


The BLOB reference URL can be placed directly into an HTML page, or  
the data can be retrieved by the application using HTTP when it is  
needed.


In order to store a BLOB out-of-row, you upload the BLOB to MySQL
without specifying a target row. The PUT command returns the BLOB  
reference. Then you do a normal INSERT, placing the BLOB reference in  
the BLOB column. When the INSERT is committed, the uploaded BLOB is  
committed to the database as well.


Best regards,

Paul

P.S. Thanks for your reply to "MySQL forces engine shutdown, and
hangs". Saved me quite a bit of time! :)



--
-- Konstantin Osipov  Software Developer, Moscow, Russia
-- MySQL AB, www.mysql.com   The best DATABASE COMPANY in the GALAXY







Re: Blob data

2007-06-26 Thread Warren Young

Kevin Waterson wrote:

This one time, at band camp, Warren Young [EMAIL PROTECTED] wrote:

 Storing arbitrary data in a BLOB column is 
tantamount to trying to turn the database into a file system.  If you 
want a file system, use a file system.


What is a file system, if not a database?


That's the sort of logic that leads one to believe that the only data
structure anyone needs is a hash table (Javascript), or that because bc
is Turing complete, it is a good general-purpose programming language.





Re: Blob data

2007-06-25 Thread Alex Arul Lurthu

Ratheesh,

If you still want to place your blob data in the database, separate the
blob storage out into a separate table. This will help you alleviate a few
of your performance and maintenance problems.

~Alex
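
A sketch of that split, with made-up table names and credentials: the wide
LONGBLOB lives in its own table, joined 1:1 on the message id, so queries on
the hot message data never drag the attachment bytes along:

# Sketch only: keep the BLOB in a separate 1:1 table.
# Table/column names and credentials are made up for the example.
import mysql.connector

DDL = [
    """CREATE TABLE messages (
           message_id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
           subject    VARCHAR(255),
           body       TEXT
       ) ENGINE=InnoDB""",
    """CREATE TABLE message_blobs (
           message_id BIGINT UNSIGNED PRIMARY KEY,
           attachment LONGBLOB,
           FOREIGN KEY (message_id) REFERENCES messages (message_id)
       ) ENGINE=InnoDB""",
]

cnx = mysql.connector.connect(user="app", password="secret",
                              host="localhost", database="forum")
cur = cnx.cursor()
for stmt in DDL:
    cur.execute(stmt)
cur.close()
cnx.close()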

On 6/22/07, Steve Edberg [EMAIL PROTECTED] wrote:


At 12:11 PM +0530 6/22/07, Ratheesh K J wrote:
Hello All,

I want a clarification. We run a forum wherein people send messages
with/without attachments. Attachments may contain images, documents,
etc. We are actually storing the attachment in a blob column.
Sometimes the attachments are big, and today the table size has
grown to 40 GB. This has created a headache for any maintenance
task, backup, restoration, etc.

I want to know whether this is the right approach, or should we
actually store the attachments in directories and just store the
attachment path in the database?

Kindly suggest the best approach so that I can reduce the database size.

Thanks in advance


Yes, storing files - especially non-textual files - in the file
system instead of the database is generally considered the best
practice.

At one point I had created a document management system that stored
everything in the database as you are doing; my rationale was that it
allowed me to manage permissions using the existing database
permissions, and to back up the whole database using mysqldump, vs
mysqldump + doing a tar of the files. However, I abandoned this
approach for the following reasons:

(1) Storing non-plaintext items (eg; pictures) in the database makes
it bigger and slower without added value - you can't (at least not
yet, or in the foreseeable future) do a meaningful search on a blob.

(2) It becomes more difficult to split storage out onto multiple
filesystems; eg, leaving the database files in /var/database, putting
the documents themselves into /home/docmanager, etc.

(3) It makes queries on the commandline unwieldy; if you have a blob
field, doing a select * to check a record's contents can dump a lot
of garbage on the screen.

(4) It can make doing incremental backups more difficult; if the
documents themselves are relatively static, but the document metadata
stored in the database is very dynamic, it becomes simple to do a
compact daily database dump + a weekly document directory backup (for
example) if the files are not in the database.

What I do is create a unique SHA1 hash when a file is uploaded (eg;
sha1(rand()). The original filename and the 40-character hash are
stored in the database, and the document is stored in the filesystem
using the hash as the filename. I can optionally compress and encrypt
the document as well, storing the encryption key in the database.
This gives (for me) adequate document security. An additional
advantage is that you can take advantage of the filesystem tree if
you have a large number of documents. For example, if a document hash
is 'f0118e9bd2c4fb29c64ee03abce698b8', you can store the file in the
directory tree f0/11/8e/f0118e9bd2c4fb29c64ee03abce698b8 (extending
to as many levels as you feel necessary). By keeping the number of
files per directory fairly small, file retrieval becomes relatively
fast. As the hashes approximate a random distribution, you should
always have a close-to-balanced tree.

Lastly, I store a hash of the document itself in the database as
well. This allows me to detect if duplicate files are uploaded, and
to determine if a previously-uploaded file has been corrupted in some
way.

steve

--
+--- my people are the people of the dessert, ---+
| Steve Edberghttp://pgfsun.ucdavis.edu/ |
| UC Davis Genome Center[EMAIL PROTECTED] |
| Bioinformatics programming/database/sysadmin (530)754-9127 |
+ said t e lawrence, picking up his fork +





Re: Blob data

2007-06-25 Thread Warren Young

Ratheesh K J wrote:


I want to know whether this is the right approach, or should we
actually store the attachments in directories and just store the
attachment path in the database?


Databases are designed to handle arbitrarily large numbers of rows of 
structured data, where each datum is small and roughly the same size as 
all others of its kind.  Storing arbitrary data in a BLOB column is 
tantamount to trying to turn the database into a file system.  If you 
want a file system, use a file system.


Microsoft's been promising a database-backed file system for something 
like 15 years now.  Maybe it's because they don't write software all 
that well up there in Redmond.  Or maybe it's because this is one of 
those ideas that sounds good on paper but doesn't work out so well in 
practice.  I'm betting on the latter.





Re: Blob data

2007-06-22 Thread Steve Edberg

At 12:11 PM +0530 6/22/07, Ratheesh K J wrote:

Hello All,

I want a clarification. We run a forum wherein people send messages
with/without attachments. Attachments may contain images, documents,
etc. We are actually storing the attachment in a blob column.
Sometimes the attachments are big, and today the table size has
grown to 40 GB. This has created a headache for any maintenance
task, backup, restoration, etc.


I want to know whether this is the right approach, or should we
actually store the attachments in directories and just store the
attachment path in the database?


Kindly suggest the best approach so that I can reduce the database size.

Thanks in advance



Yes, storing files - especially non-textual files - in the file 
system instead of the database is generally considered the best 
practice.


At one point I had created a document management system that stored 
everything in the database as you are doing; my rationale was that it 
allowed me to manage permissions using the existing database 
permissions, and to back up the whole database using mysqldump, vs 
mysqldump + doing a tar of the files. However, I abandoned this 
approach for the following reasons:


(1) Storing non-plaintext items (eg; pictures) in the database makes 
it bigger and slower without added value - you can't (at least not 
yet, or in the foreseeable future) do a meaningful search on a blob.


(2) It becomes more difficult to split storage out onto multiple 
filesystems; eg, leaving the database files in /var/database, putting 
the documents themselves into /home/docmanager, etc.


(3) It makes queries on the commandline unwieldy; if you have a blob 
field, doing a select * to check a record's contents can dump a lot 
of garbage on the screen.


(4) It can make doing incremental backups more difficult; if the 
documents themselves are relatively static, but the document metadata 
stored in the database is very dynamic, it becomes simple to do a 
compact daily database dump + a weekly document directory backup (for 
example) if the files are not in the database.


What I do is create a unique SHA1 hash when a file is uploaded (eg;
sha1(rand())). The original filename and the 40-character hash are
stored in the database, and the document is stored in the filesystem 
using the hash as the filename. I can optionally compress and encrypt 
the document as well, storing the encryption key in the database. 
This gives (for me) adequate document security. An additional 
advantage is that you can take advantage of the filesystem tree if 
you have a large number of documents. For example, if a document hash 
is 'f0118e9bd2c4fb29c64ee03abce698b8', you can store the file in the 
directory tree f0/11/8e/f0118e9bd2c4fb29c64ee03abce698b8 (extending 
to as many levels as you feel necessary). By keeping the number of 
files per directory fairly small, file retrieval becomes relatively 
fast. As the hashes approximate a random distribution, you should 
always have a close-to-balanced tree.


Lastly, I store a hash of the document itself in the database as 
well. This allows me to detect if duplicate files are uploaded, and 
to determine if a previously-uploaded file has been corrupted in some 
way.
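
A sketch of the upload-side bookkeeping described above (random name hash,
fan-out directories, plus a content hash for duplicate/corruption checks);
the storage path is illustrative only:

# Sketch only: random storage name, f0/11/8e/... fan-out directories,
# and a content hash for later duplicate/corruption detection.
import os
import hashlib
import secrets

STORAGE_ROOT = "/home/docmanager/store"      # illustrative path

def store_document(data):
    name_hash = hashlib.sha1(secrets.token_bytes(20)).hexdigest()
    content_hash = hashlib.sha1(data).hexdigest()
    subdir = os.path.join(STORAGE_ROOT,
                          name_hash[0:2], name_hash[2:4], name_hash[4:6])
    os.makedirs(subdir, exist_ok=True)
    with open(os.path.join(subdir, name_hash), "wb") as fh:
        fh.write(data)
    # name_hash and content_hash (40 hex chars each) go into the database
    return name_hash, content_hash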


steve

--
+--- my people are the people of the dessert, ---+
| Steve Edberghttp://pgfsun.ucdavis.edu/ |
| UC Davis Genome Center[EMAIL PROTECTED] |
| Bioinformatics programming/database/sysadmin (530)754-9127 |
+ said t e lawrence, picking up his fork +
