Max Nitribitt wrote: > Hi Holger, > > thank you for your help. I'm running the database on > a: > OS: LINUX (Suse 7.2) > 768 MB DDR-RAM > ATHLON 1400 MHz (256 K full speed L2 Cache) > HD: 160 GB (Raid5) > > The table has 560 Million rows (not 5 600 000 000 as I > stated before (sorry ;))) with about 450 Million > distinct values in the field I'm trying to Index. > TABLE: TOF > Column a: INTEGER <- this is what I want to Index > Column b: CHAR (9) > (No Primary Key specified; single candidate key is > combination of both columns) > > There are no other users connected to the database nor > to the machine it's running on yet. > What would be a sensible Value for DATA_CACHE and > _IDX_FILE_SIZE under these circumstances? >
>>1. DATA_CACHE should be as big as possible to >>minimize disk I/O. >> ... >>algorithm. A larger number of >>buffers decreases the number of merge steps, and >>thus increases performance. >> ... >>3. Spread your database over several disks to >>increase performance by >>parallel I/O. Hi Max, Holger, The very first thing to consider is upping the amount of RAM. Can you cram 2, 4 or 6+ GB in that box? The second thing is minimising the load by other apps (if any) on the CPU but especially the RAM and disks. Another thing to consider is the L2 cache, since that's even faster than main memory. Besides increasing the number of spindles you can also increase the number of channels. As well as making sure you've got the latest and greatest SCSI goodies. If the disks are on the same bus they still can get in each others way. What would be the difference betwixt having but one candidate key and a primary key? The primary key is a candidate key picked more or less at random, using arcane arts or sometimes by divination. It's still a candidate key. Anyway, having but the int for the value and another one to point to the physical row, you already have to juggle 4480 M bytes of information around. That is without the balanced tree pointer rigmarole. So you got a problem. You can't fit that into main memory. That means virtual memory which means disk access. Better have a separate and very snappy disk for that. And controller, naturally. Else you got one very fast processor twiddling tumbs, waiting for the next batch of data from whatever disk it's coming from. My 2 cents. Roelof PS good thing hardware prices keep going down ;). -- _______________________________________________________________________ eBOA® est. 1982 http://eBOA.com/ tel. +31-58-2123014 mailto:[EMAIL PROTECTED]?subject=Information_request fax. +31-58-2160293