Hello,

Just looking for a little feedback before deciding on a database engine path and configuration.
We're working on an application that is expected to have a high growth rate over the coming years, and we're at the stage of designing the backend databases to maximize performance while keeping costs at a minimum. (I know, everyone's heard this one before.) The database already exists, but in a small, controlled environment.

Here is the overall gist of the database. We have a couple of tables that will be updated infrequently but read from 100-200 times per second. We also have a couple of tables that will be updated up to a couple hundred times per second. The tables that are read may only have 50K rows in them; the tables that are written to will continue to grow (we're expecting 50M rows per year).

Here is my basic idea for the overall design of the system (based on how the app will work). The app has three distinct parts:

- Part A will primarily do lookups (50K rows - Client table).
- Part B will take transactional information and insert it into a database (50M rows - Transaction table).
- Part C will be used to query the transactional data.

I was thinking of creating a master database that will hold the basic information needed across the three processes. Since parts A, B and C will all need to access the Client tables, I thought that maybe I should create a master database for the Client tables. We will call this ServerA. From there I figure we can create replicated slave ServerA-nodes that all of the processes can read from in a load-balanced scenario (using IPVSADM). This is the easy part. (A sketch of the replication setup is in the P.S. below.)

Since part B is write-heavy and part C is read-heavy, I figure we can use a similar scenario: have a single large cluster for part B's writes (including the use of table partitions) on ServerB, and then create replicated slave ServerB-nodes that all of the part C processes would read from. Replication has to be near real time; we have a contractual agreement to report the incoming processed data within 30 minutes.

General table information:

Client tables (multiple tables, 1:many):
- Table A: 500 bytes/row, ~200 reads/s (50K records)
- Table B: text, < 3000 bytes/row, rarely read (mostly contact info)
- Table C: 250 bytes/row, ~1 read/s (150K records)

Transaction tables (multiple tables, each 1:1):
- Table A: 400 bytes/row, ~200 inserts/s (peak 500/s, low 10/s) (50M records)
- Table B: 200 bytes/row, ~50 inserts/s (peak 200/s, low 10/s) (15M records)
- Table C: text, < 2000 bytes/row, ~200 inserts/s (peak 500/s, low 10/s) (50M records)

So, in my end design I'm thinking two master clusters, one for the Client tables and one for the Transaction tables, and slave everything else (into two distinct groups). Does anyone see any downside to this? Any better suggestions?

Also, I've been looking at some data partitioning schemes. Would partitioning help performance in the long run, by allowing me to put the different files on different drive arrays in the future? (A rough sketch of what I'm considering is in the P.S.)

For the hardware, the masters would be dual dual-core AMD boxes with 8 GB RAM and RAID 5; the slaves would be single dual-core AMD boxes with 4 GB RAM and RAID 5. All with gigabit NICs.

Any feedback would be greatly appreciated.

Gary
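P.S. To make the replication part concrete, here is a minimal sketch of how I'd wire a ServerA-node up to the ServerA master. The host address, account name and password are placeholders, and the binlog file/position would come from SHOW MASTER STATUS on the master; the master would also need log-bin enabled and a unique server-id in my.cnf (each slave gets its own server-id as well):

    -- On the master (ServerA): create an account the slaves replicate as.
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.0.%' IDENTIFIED BY 'secret';

    -- On each ServerA-node: point it at the master and start the slave threads.
    CHANGE MASTER TO
        MASTER_HOST     = '192.168.0.10',      -- placeholder address for ServerA
        MASTER_USER     = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',  -- taken from SHOW MASTER STATUS
        MASTER_LOG_POS  = 4;
    START SLAVE;

IPVSADM would then just balance the read connections across the nodes on port 3306.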
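On the partitioning question, what I've been looking at is RANGE partitioning by date (MySQL 5.1), since old partitions could later be dropped or moved onto other drive arrays. A rough sketch with made-up table and column names follows; note that in 5.1 the partitioning column has to appear in every unique key, hence the composite primary key:

    CREATE TABLE transaction_a (
        id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        client_id INT UNSIGNED    NOT NULL,
        created   DATE            NOT NULL,
        payload   VARCHAR(255)    NOT NULL,
        PRIMARY KEY (id, created)
    )
    PARTITION BY RANGE (TO_DAYS(created)) (
        PARTITION p2007h1 VALUES LESS THAN (TO_DAYS('2007-07-01')),
        PARTITION p2007h2 VALUES LESS THAN (TO_DAYS('2008-01-01')),
        PARTITION pmax    VALUES LESS THAN MAXVALUE
    );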
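And for the 30-minute contractual window, the plan would be to poll each slave and alert well before the limit. Seconds_Behind_Master from SHOW SLAVE STATUS is what I'd watch; if it goes NULL (replication broken) or above, say, 1500 seconds, we page someone:

    -- Run periodically on each slave node.
    SHOW SLAVE STATUS\G
    -- Watch: Slave_IO_Running / Slave_SQL_Running (should both be Yes)
    --        Seconds_Behind_Master (should stay well under 1800)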