Simon, why don't you split the file and use the LOAD DATA INFILE command? That should improve performance when loading into an empty table with the keys disabled.
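Something along these lines, for example (just a rough sketch: the table name enwiki_text, the blob column name body, and the file paths are placeholders for whatever your schema actually uses, and it assumes each chunk has been converted to a tab-delimited file first):

    ALTER TABLE enwiki_text DISABLE KEYS;   -- skip non-unique index maintenance during the load

    LOAD DATA INFILE '/data/enwiki_chunk_001.tsv'
        INTO TABLE enwiki_text
        FIELDS TERMINATED BY '\t'
        LINES TERMINATED BY '\n'
        (id, body);
    -- ... repeat for each chunk ...

    ALTER TABLE enwiki_text ENABLE KEYS;    -- rebuild the indexes once, at the end

LOAD DATA INFILE also avoids parsing millions of individual INSERT statements, which is a good chunk of the overhead of feeding a huge .sql file through the client.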
Regards,
anandkl

On 6/5/08, Simon Collins <[EMAIL PROTECTED]> wrote:
>
> I'm loading the data through the command below:
>
>   mysql -f -u root -p enwiki < enwiki.sql
>
> The version is MySQL 5.0.51a-community.
>
> I've disabled the primary key, so there are no indexes. The CPU has 2 cores
> and the machine has 2 GB of memory.
>
> The import fell over overnight with a "table full" error as it hit 1 TB (I
> think this may be a file system problem). As it's no longer importing, show
> status isn't going to provide any interesting info; however, I did notice
> that mysql was not consuming much CPU time (~10%).
>
> I wouldn't like to split the data up into separate tables as it would
> change the schema, and I'm not in charge of the schema design - just the
> DBA at the backend.
>
> Cheers
>
> Simon
>
> mos wrote:
>
>> Simon,
>> As someone else mentioned, how are you loading the data? Can you post the
>> SQL?
>>
>> You have an Id field, so is that not the primary key? If so, the slowdown
>> could be maintaining the index. In that case, add up to 30% of your
>> available RAM to your key_buffer_size in your my.cnf file and restart the
>> server. How much RAM do you have on your machine and how many CPUs do you
>> have? What version of MySQL are you using? Also, can you post your "Show
>> Status" output after it has started to slow down? How much CPU is being
>> used after the import slows down?
>>
>> Now from what you've said, it looks like you are using this table as a
>> lookup table, so if it just has an id and a blob field, you probably
>> return the blob field for a given id, correct? If it were up to me, I
>> would break the data into more manageable tables. If you have 100 million
>> rows, then I'd break it into 10 tables of 10 million rows each: table_1
>> would have ids from 1 to 9,999,999, table_2 ids from 10,000,000 to
>> 19,999,999, and so on. Your lookup would call a stored procedure which
>> determines which table to use based on the id it was given. If you really
>> had to search all the tables, you can then use a Merge table based on
>> those 10 tables. I use Merge tables quite a bit and the performance is
>> quite good.
>>
>> Mike
>>
>> At 11:42 AM 6/4/2008, you wrote:
>>
>>> Dear all,
>>>
>>> I'm presently trying to import the full Wikipedia dump for one of our
>>> research users. Unsurprisingly it's a massive import file (2.7 TB).
>>>
>>> Most of the data is importing into a single MyISAM table which has an id
>>> field and a blob field. There are no constraints / indexes on this
>>> table. We're using an XFS filesystem.
>>>
>>> The import starts off quickly but gets increasingly slower as it
>>> progresses, starting at about 60 GB per hour, but now that the MyISAM
>>> table is ~1 TB it has slowed to a load rate of about 5 GB per hour. At
>>> this rate the import will not finish for a considerable time, if at all.
>>>
>>> Can anyone suggest why this is happening and whether there's a way to
>>> improve performance? If there's a more suitable list to discuss this,
>>> please let me know.
>>>
>>> Regards
>>>
>>> Simon
>>>
>
> --
> Dr Simon Collins
> Data Grid Consultant
> National Grid Service
> University of Manchester
> Research Computing Services
> Kilburn Building
> Oxford Road
> Manchester
> M13 9PL
>
> Tel 0161 275 0604
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
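For reference, a rough sketch of the split-table and MERGE layout Mike describes above (all table, column, and procedure names here are illustrative, not from the original schema, and the routine is untested):

    -- Ten identical MyISAM tables, one per 10-million-row id range.
    CREATE TABLE page_text_1 (
        id   INT UNSIGNED NOT NULL,
        body LONGBLOB,
        PRIMARY KEY (id)
    ) ENGINE=MyISAM;
    -- ... page_text_2 through page_text_10 are created identically ...

    -- Optional MERGE table for the rare query that has to scan every range.
    CREATE TABLE page_text_all (
        id   INT UNSIGNED NOT NULL,
        body LONGBLOB,
        KEY (id)
    ) ENGINE=MERGE UNION=(page_text_1, page_text_2) INSERT_METHOD=NO;
    -- (list all ten underlying tables in UNION=(...) in practice)

    -- Stored procedure that routes a lookup to the right table by id range:
    -- ids 1-9,999,999 -> page_text_1, 10,000,000-19,999,999 -> page_text_2, etc.
    DELIMITER //
    CREATE PROCEDURE get_page_text(IN p_id INT UNSIGNED)
    BEGIN
        SET @qry = CONCAT('SELECT body FROM page_text_',
                          FLOOR(p_id / 10000000) + 1,
                          ' WHERE id = ?');
        PREPARE stmt FROM @qry;
        SET @lookup_id = p_id;
        EXECUTE stmt USING @lookup_id;
        DEALLOCATE PREPARE stmt;
    END //
    DELIMITER ;

    -- Usage: CALL get_page_text(12345678);   -- reads from page_text_2

A MERGE table only works over identical MyISAM tables, and since the procedure builds the table name from the id range, a point lookup never touches more than one underlying table.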