Simon,
Why don't you split the file and use the LOAD DATA INFILE command? It would
improve performance when loading into an empty table with the keys
disabled.
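
A rough sketch of what that could look like (assuming the rows can first be
dumped to a tab-delimited file, and using "page" as a placeholder table
name):

    # split the tab-delimited data file into manageable chunks
    split -l 10000000 enwiki.tab enwiki_part_

and then in mysql:

    ALTER TABLE page DISABLE KEYS;
    LOAD DATA INFILE '/data/enwiki_part_aa' INTO TABLE page;
    LOAD DATA INFILE '/data/enwiki_part_ab' INTO TABLE page;
    -- ... one statement per chunk ...
    ALTER TABLE page ENABLE KEYS;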

regards
anandkl


On 6/5/08, Simon Collins <[EMAIL PROTECTED]> wrote:
>
> I'm loading the data with the following command: mysql -f -u root -p enwiki <
> enwiki.sql
>
> The version is MySQL 5.0.51a-community
>
> I've disabled the primary key, so there are no indexes. The machine has a
> 2-core CPU and 2 GB of memory.
>
> The import fell over overnight with a "table full" error as it hit 1 TB (I
> think this may be a file system problem). As it's no longer importing, SHOW
> STATUS isn't going to provide any interesting info; however, I did notice
> that mysql was not consuming much CPU time (~10%).
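> It could also be MyISAM's own max data length rather than the filesystem;
> purely as an illustration (with "page" as a placeholder table name), the
> check and the fix would look something like this:
>
>     SHOW TABLE STATUS LIKE 'page'\G  -- compare Data_length to Max_data_length
>     ALTER TABLE page MAX_ROWS=200000000 AVG_ROW_LENGTH=20000;  -- raises the limit, but rebuilds the table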
>
> I wouldn't like to split the data up into separate tables as it would
> change the schema, and I'm not in charge of the schema design - just the
> DBA at the back end.
>
> Cheers
>
> Simon
>
> mos wrote:
>
>> Simon,
>> As someone else mentioned, how are you loading the data? Can you post the
>> SQL?
>>
>> You have an Id field, so is that not the primary key? If so, the slowdown
>> could be from maintaining the index. In that case, add up to 30% of your
>> available RAM to key_buffer_size in your my.cnf file and restart the
>> server. How much RAM do you have on your machine and how many CPUs do you
>> have? What version of MySQL are you using? Also, can you post your SHOW
>> STATUS output after it has started to slow down? How much CPU is being
>> used after the import slows down?
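>>
>> For example (just a sketch -- the right number depends on how much RAM the
>> box actually has; on a 2 GB machine, something like this in my.cnf and then
>> restart mysqld):
>>
>>     [mysqld]
>>     key_buffer_size = 512M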
>>
>> Now from what you've said, it looks like you are using this table as a
>> lookup table, so if it just has an id and a blob field, you probably return
>> the blob field for a given id, correct? If it were up to me, I would break
>> the data into more manageable tables. If you have 100 million rows, then I'd
>> break it into 10x10 million row tables: table_1 would have ids from 1 to
>> 9,999,999, table_2 ids from 10,000,000 to 19,999,999, and so on. Your lookup
>> would call a stored procedure which determines which table to use based on
>> the id it was given. If you really had to search all the tables, you could
>> then use a MERGE table based on those 10 tables. I use MERGE tables quite a
>> bit and the performance is quite good.
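>>
>> As a rough sketch (table, column, and procedure names here are purely
>> illustrative, and only the first two tables are shown):
>>
>>     CREATE TABLE table_1 (
>>       id   INT UNSIGNED NOT NULL,
>>       body LONGBLOB,
>>       PRIMARY KEY (id)
>>     ) ENGINE=MyISAM;
>>     -- table_2 .. table_10 get the same definition
>>
>>     -- a MERGE table over the pieces, for the rare scan-everything query
>>     CREATE TABLE all_rows (
>>       id   INT UNSIGNED NOT NULL,
>>       body LONGBLOB,
>>       KEY (id)
>>     ) ENGINE=MERGE UNION=(table_1, table_2) INSERT_METHOD=NO;
>>
>>     -- lookup routed to the right table by id range
>>     DELIMITER //
>>     CREATE PROCEDURE get_body(IN p_id INT UNSIGNED)
>>     BEGIN
>>       CASE
>>         WHEN p_id < 10000000 THEN SELECT body FROM table_1 WHERE id = p_id;
>>         ELSE SELECT body FROM table_2 WHERE id = p_id;
>>       END CASE;
>>     END //
>>     DELIMITER ;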
>>
>> Mike
>>
>> At 11:42 AM 6/4/2008, you wrote:
>>
>>> Dear all,
>>>
>>> I'm presently trying to import the full Wikipedia dump for one of our
>>> research users. Unsurprisingly, it's a massive import file (2.7 TB).
>>>
>>> Most of the data is importing into a single MyISAM table which has an id
>>> field and a blob field. There are no constraints / indexes on this table.
>>> We're using an XFS filesystem.
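>>>
>>> Roughly, the table is shaped like this (the names here are only
>>> illustrative):
>>>
>>>     CREATE TABLE page (
>>>       id   INT UNSIGNED,
>>>       body LONGBLOB
>>>     ) ENGINE=MyISAM;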
>>>
>>> The import starts off quickly but gets increasingly slower as it
>>> progresses: it started at about 60 GB per hour, but now that the MyISAM
>>> table is ~1 TB it has slowed to about 5 GB per hour. At this rate the
>>> import will not finish for a considerable time, if at all.
>>>
>>> Can anyone suggest why this is happening and whether there's a way to
>>> improve performance? If there's a more suitable list to discuss this,
>>> please let me know.
>>>
>>> Regards
>>>
>>> Simon
>>>
>>
>>
>>
>
> --
> Dr Simon Collins
> Data Grid Consultant
> National Grid Service
> University of Manchester
> Research Computing Services
> Kilburn Building
> Oxford Road
> Manchester
> M13 9PL
>
> Tel 0161 275 0604
>
>
