First of all, I heard that MySQL does not perform very well when
table size goes above 1 GB. Is this a myth?
I don't have much experience with individual tables that large, though I have used MySQL databases with more than 100 GB of total data with no problems at all. In my experience, it's not the table size that degrades performance but the index size: if the index is too large or clumsy, MySQL cannot search quickly enough. If the index is well structured, the data is found quickly and overall performance is great regardless of the quantity of data.
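If you want to see how big an index actually is relative to the data, MySQL reports both per table (the table name here is just a placeholder; for InnoDB the numbers are approximate):

  SHOW TABLE STATUS LIKE 'mytable';

The Data_length and Index_length columns give the on-disk sizes in bytes. If Index_length is growing much faster than you expect, that's a sign the indexing strategy needs rethinking.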

The image table is not a big
deal, since I can partition/distribute it across ~100-200 tables, i.e. by
table_(id % 100).
This is a good way to help keep your indexes small and thus keep search/seek time to a minimum :)
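For what it's worth, here is a rough sketch of what that scheme could look like (table and column names are made up):

  CREATE TABLE images_00 (
    id  INT UNSIGNED NOT NULL,
    img MEDIUMBLOB   NOT NULL,
    PRIMARY KEY (id)
  );
  -- ...and likewise images_01 through images_99.

  -- The application computes the table name as images_(id % 100),
  -- so a lookup for id 1234 goes to:
  SELECT img FROM images_34 WHERE id = 1234;

The catch is that the application has to do the routing itself; plain SQL can't pick the table for you.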

However, the text information needs to stay in a single
table (since I need to run queries on it), and there
will be multiple indexes on this information.
A suggestion to help keep the indexes on this table small: if you have large text fields that you need to index, you may want to use partial (prefix) indexes. For example, if you have a column "filename varchar(200)", create an index on filename(50), or whatever length is appropriate. This cuts the size of the index file a lot, and even though it may result in some duplicate entries in the index, it will probably improve overall performance.
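To make that concrete, the prefix index from the example above would look like this (documents is a made-up table name):

  CREATE TABLE documents (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    filename VARCHAR(200) NOT NULL,
    PRIMARY KEY (id)
  );

  -- Index only the first 50 characters of filename. The index stays
  -- much smaller, at the cost of some non-unique prefix entries.
  CREATE INDEX idx_filename ON documents (filename(50));

You can also declare it inline in the table definition with INDEX idx_filename (filename(50)).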

And, as you can imagine, I am not sure if MySQL can handle something
like this, and was wondering if you can provide some feedback.

So my questions are:

1. The main question is, do you guys have any experience with this
much binary and regular data? Do you think MySQL can handle this much
data in a reliable manner (without corrupting data and/or
degrading to terrible performance)?
Based on my experience with >100 GB InnoDB databases, with the right indexes MySQL can easily handle 3,000 reads/sec on text/numerical data. (This is on dual 3 GHz Xeons, 4 GB RAM, and SCSI RAID-5 disks.) I've never had any problem with data corruption, but I primarily use the InnoDB engine, which is not prone to corruption in the same way as MyISAM.

I don't have experience storing large amounts of binary data, so I can't say anything about MySQL's performance in that area specifically.


2. Can I implement this using regular SCSI disks with regular MySQL?
Probably.
Or do I need advanced solutions such as clustering, replication,
etc.?
No need for clustered or distributed databases, from what you've described. Cluster would be useful if you need 100% availability, even in the event of hardware failures. Replication, such as single master -> multiple slaves, is useful if you have massive reads and minimal writes, and _may_ be something you will need.
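In case replication does turn out to be necessary, the single-master setup is not much work. Roughly (the host name and the repl account are placeholders; see the replication chapter of the manual for the full procedure, including taking a consistent snapshot first):

  # master my.cnf
  [mysqld]
  server-id = 1
  log-bin   = mysql-bin

  # slave my.cnf
  [mysqld]
  server-id = 2

Then on the slave:

  CHANGE MASTER TO
      MASTER_HOST='master.example.com',
      MASTER_USER='repl',
      MASTER_PASSWORD='secret';
  START SLAVE;

All writes go to the master; SELECTs can then be spread across as many slaves as you need.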

I would recommend using the command SHOW PROCESSLIST, or a tool like MyTop, to see which state the client connections spend the most time in. (Searching the index, or sending data over the network? If it's the latter, you would benefit from distributing the read load to multiple slave servers.)
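For instance:

  SHOW FULL PROCESSLIST;

The FULL keyword shows the complete query text rather than only the first 100 characters, and the State column tells you what each thread is doing at that moment (e.g. "Sorting result" or "Copying to tmp table").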



Regards,
Devananda vdv
