Storing huge amount of binary data

2005-07-11 Thread Cabbar Duzayak
Hi Everyone,

I will be starting a project for which I will need to store about 1
million image files all of which are about 50-100K in size. I will be
using Linux for this project. The database won't be very busy, there
will be batch image uploads 1-2 times a day, and around 100-200
concurrent users at most, most of which will be reading from the db
and writing only session information type of data, etc... And, I don't
really need transaction support (InnoDB)...

Adding this up, the image data size will be around 50-100 Gb, and I
will need to store a total of 1-2 Gb text information (1K for each
image) along with each of these images...

First of all, I heard that MySQL does not perform very well when
table size goes above 1 Gb. Is this a myth? The image table is not a big
deal, since I can partition/distribute it across ~ 100-200 tables, i.e. by
table_(id % 100). However, the text information needs to stay in a single
table (since I need to run queries on it) and there
will be multiple indexes over this information.
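In application code, the id-modulo sharding described above might be sketched like this (the table naming is hypothetical, just to illustrate the idea; MySQL itself plays no part in picking the shard):

```python
# Sketch of the table_(id % 100) sharding idea: the application, not
# MySQL, decides which of the ~100 image tables a given id lives in.
# Table names here are hypothetical.

N_SHARDS = 100

def shard_table(image_id: int) -> str:
    """Name of the table holding the BLOB for this image id."""
    return f"images_{image_id % N_SHARDS:02d}"

def fetch_sql(image_id: int) -> str:
    """Per-shard SELECT; actual parameter binding is left to the driver."""
    return f"SELECT data FROM {shard_table(image_id)} WHERE id = %s"
```

Each lookup then touches exactly one small table and one small index.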

And, as you can imagine, I am not sure if mysql can handle something
like this, and was wondering if you can provide some feedback.

So my questions are:

1. The main question is, do you guys have any experience with this
much binary and regular data? Do you think MySQL can handle this much
data in a reliable manner (without corrupting data and/or terrible
performance degradation)?

2. Can I implement this using regular SCSI disks with regular MySQL?
Or do I need advanced solutions such as clustering, replication,
etc.?

3. Again, as you can understand, I want to minimize the cost here. If
you don't think I can use mysql, do you think Microsoft SQL server is
good enough for this task?

Thanks...

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



RE: Storing huge amount of binary data

2005-07-11 Thread Matt Babineau
Sometimes the easiest way to do this is to use the file system of the Linux
machine to store the files and make reference to them in the DB - storing
no binary data in the DB and getting rid of a whole class of possible problems.
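A common sketch of this files-on-disk approach (a hypothetical layout, not something the post prescribes) hashes the image id into a two-level directory fan-out, so no single directory ever holds anywhere near a million files; only the resulting path and the metadata go into MySQL:

```python
import hashlib
import os

def image_path(base: str, image_id: int) -> str:
    """Map an image id to a two-level fan-out path, e.g. base/a3/f1/1742.jpg.
    The fan-out keeps each directory to roughly 256 entries per level."""
    h = hashlib.md5(str(image_id).encode()).hexdigest()
    return os.path.join(base, h[:2], h[2:4], f"{image_id}.jpg")

def store_image(base: str, image_id: int, data: bytes) -> str:
    """Write the bytes to disk and return the path to record in the DB."""
    path = image_path(base, image_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

The row in MySQL then stores only the returned path alongside the 1K of text metadata, keeping the database itself small.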



Thanks,

Matt Babineau
Criticalcode
858.733.0160
[EMAIL PROTECTED]
http://www.criticalcode.com
 
-Original Message-
From: Cabbar Duzayak [mailto:[EMAIL PROTECTED] 
Sent: Saturday, July 09, 2005 12:01 AM
To: mysql@lists.mysql.com
Subject: Storing huge amount of binary data







Re: Storing huge amount of binary data

2005-07-11 Thread Per Jessen
Cabbar Duzayak wrote:

 So my questions are:
 
 1. The main question is, do you guys have any experience with this
 much binary and regular data? Do you think Mysql can handle this much
 data in a reliable manner (without corrupting data and/or
 degrading/terrible performance) ?

I would say so, yes.  One of my biggest databases holds 50-60 million rows
and takes up about 5Gb of disk space.  I don't think MySQL will have any
problems running what you describe.

 2. Can I implement this using regular SCSI disks with regular mysql?
 Or do I have need advanced solutions such as clustered, replicated,
 etc?

Clustering and replication have more to do with data availability.  You'll
probably benefit from using RAID in some form; which form depends on whether
you need reliability or speed.

 3. Again, as you can understand, I want to minimize the cost here. If
 you don't think I can use mysql, do you think Microsoft SQL server is
 good enough for this task? 

I don't think so, no.


/Per Jessen, Zürich


-- 
http://www.spamchek.com/freetrial - managed anti-spam and anti-virus solution.
Sign up for your free 30-day trial now!





Re: Storing huge amount of binary data

2005-07-11 Thread Devananda

 First of all, I heard that Mysql does not perform very well when
 tablesize goes above 1 Gb. Is this a myth?

I don't have much experience with individual tables being that large, 
though I have used MySQL databases with 100G total data with no 
problems at all. Based on my experience, it's not the table size which 
can degrade performance, but the index size - if the index is too large 
or clumsy, then MySQL can not search quickly enough. If the index is 
well-structured, the data is found quickly and overall performance is 
great regardless of the quantity of data.



 Image table is not a big
 deal, since I can partition/distribute it to ~ 100-200 tables, i.e. by
 table_(id % 100).

This is a good way to help keep your indexes small and thus keep 
search/seek time to a minimum :)



 However, text information needs to stay in a single
 table (since I need to do queries on it for information) and there
 will be multiple indexes over this information.

A suggestion to help keep your indexes on this table small: if you have 
large text fields that you need to index, you may want to use partial 
indexes. (For example, if you have a column filename varchar(200), 
create an index on filename(50), or whatever is appropriate. This will 
cut the size of the index file a lot, and even though it may result in 
some duplicate entries in the index, it will probably increase overall 
performance.)
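In MySQL syntax, that partial (prefix) index might look something like this; the table and column names are made up purely for illustration:

```sql
-- Hypothetical metadata table: only the first 50 characters of
-- filename go into the index, which keeps the index file much
-- smaller than indexing the full varchar(200).
CREATE TABLE image_meta (
    id       INT UNSIGNED NOT NULL PRIMARY KEY,
    filename VARCHAR(200) NOT NULL,
    caption  VARCHAR(255),
    INDEX idx_filename (filename(50))
);
```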



 And, as you can imagine, I am not sure if mysql can handle something
 like this, and was wondering if you can provide some feedback.

 So my questions are:

 1. The main question is, do you guys have any experience with this
 much binary and regular data? Do you think Mysql can handle this much
 data in a reliable manner (without corrupting data and/or
 degrading/terrible performance) ?

Based on my experience with 100GB InnoDB databases, with the right 
indexes, MySQL can easily handle 3,000 reads/sec on text/numerical data. 
(This is on dual Xeon 3GHz, 4 GB RAM, SCSI RAID-5 disks.) I've never had 
any problem with data corruption, but I use primarily the InnoDB engine, 
which is not prone to corruption in the same way as MyISAM.


I don't have experience storing large amounts of binary data, so I can't 
say anything about MySQL's performance in that area specifically.




 2. Can I implement this using regular SCSI disks with regular mysql?

Probably.

 Or do I have need advanced solutions such as clustered, replicated,
 etc?

No need for clustered or distributed databases, from what you've 
described. Cluster would be useful if you need 100% availability, even 
in the event of hardware failures. Replication, such as single master - 
multiple slaves, is useful if you have massive reads and minimal writes, 
and _may_ be something you will need.
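If replication does become necessary, the read/write split typically lives in the application: writes go to the master, SELECTs fan out across the slaves. A minimal, hypothetical router (the endpoint names are made up; real code would hold driver connections rather than strings):

```python
import random

# Hypothetical connection endpoints for a single-master,
# multiple-slave replication setup.
MASTER = "master-db:3306"
SLAVES = ["slave1-db:3306", "slave2-db:3306"]

def pick_endpoint(sql: str) -> str:
    """Send reads to a random replica, everything else to the master."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb == "SELECT":
        return random.choice(SLAVES)
    return MASTER
```

Note that a slave can lag slightly behind the master, so reads that must see a just-written row should still go to the master.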


I would recommend using the command SHOW PROCESSLIST, or a tool like 
MyTop, to see which state the client connections spend the most time in 
(searching the index, or sending data over the network? If it's the 
latter, you would benefit from distributing the read load across 
multiple slave servers).




Regards,
Devananda vdv




Re: Storing huge amount of binary data

2005-07-11 Thread Stephen Cook

Per Jessen wrote:


 3. Again, as you can understand, I want to minimize the cost here. If
 you don't think I can use mysql, do you think Microsoft SQL server is
 good enough for this task?



I don't think so, no.


What are you basing this on?

SQL Server is a truly great database package; don't let some foolish 
bias blind you to the fact that it is professional-grade software.

I like MySQL as much as the next guy, but its being open source does not 
mean you should ignore the facts.

