On Jan 15, 2008, at 8:01 PM, Andreas Becker wrote:

> Hi Steffen and Ries
>
> It does not convert all types of content - if this were fixed it would
> be very useful. With convert2utf8 you get your tt_content converted,
> but not e.g. tt_news - I guess the same serialize problem.
>
> I guess the only suitable solution might be to convert your database
> into binaries before performing the conversion with iconv. Then edit
> the charset and collation settings in a text editor (depending on how
> big the dump is, PSPad is a wonderful tool for this), then restore
> everything back into MySQL.
>
> When you convert your data into binaries before doing the conversion,
> all types of arrays should work.
>
> @ Steffen
> What about MySQLDumper? It offers export to UTF-8 - does it do any
> conversion? Wouldn't it be possible to dump a file directly to UTF-8,
> then create a new database with charsets and collations set to UTF-8,
> and restore your dumped UTF-8 file? What is MySQLDumper actually doing
> when it offers to export to UTF-8?
>
> MySQLDumper would also be a very useful tool to perform such a
> conversion, as it can store and upload BIG data files without timeout
> problems. Is there a chance to implement this UTF-8 conversion there
> if it doesn't exist already?
>
> Andi
HEY ANDI,

Thank you for your lengthy response. To get it straight(er):

1) Some older versions of mysqldump dump in the charset the database was
   created in.
2) Newer versions ALWAYS dump in UTF-8 by default.

So when you want to change your latin-x database to UTF-8 with a newer
version of mysqldump, you can simply dump, and the latin-1 => UTF-8
conversion will be done for you. However, the charset you will see in the
dump is still latin-x, so during import of the tables and data MySQL will
reconvert from UTF-8 back to latin-x. If you want to leave it all in
UTF-8, all you need to do is change latin-x to utf8 in the dumped SQL
files and then re-import. Usually you can leave the collation alone
(collation controls things like sorting, not how data is stored).

So, theoretically, the only correct way to convert a latin-x database to
UTF-8 is to do it programmatically: you simply load each record and each
field of each table, check whether it's a serialized array, and if so
de-serialize it and re-load it (over another DB connection) into a new
UTF-8 database. I have done something like that to migrate a MySQL
database to PostgreSQL, and it works perfectly.

The catch: during the conversion from latin-x to UTF-8 there's a chance
that single-byte characters get converted to two or more bytes, so the
length indicator of a serialized object becomes wrong.

Clear enough, or is there still doubt? Anyone? And correct me if I am
wrong!

Ries
_______________________________________________
TYPO3-english mailing list
TYPO3-english@lists.netfielders.de
http://lists.netfielders.de/cgi-bin/mailman/listinfo/typo3-english
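[Editor's note] The "change latin-x to utf8 in the dumped SQL files" step above can be sketched roughly like this. This is only an illustration in Python, not a tool from TYPO3 or MySQLDumper; the function name and the sample dump text are made up, and a real dump has more charset declarations than the two handled here:

```python
import re

def relabel_dump(sql: str) -> str:
    """Hypothetical helper: relabel a dump's charset declarations from
    latin1 to utf8, so MySQL does not reconvert the (already-UTF-8)
    data back to latin-1 on import."""
    sql = sql.replace("SET NAMES latin1", "SET NAMES utf8")
    # Per-table default charset in CREATE TABLE statements:
    return re.sub(r"DEFAULT CHARSET=latin1", "DEFAULT CHARSET=utf8", sql)

dump = "SET NAMES latin1;\nCREATE TABLE t (c TEXT) DEFAULT CHARSET=latin1;\n"
print(relabel_dump(dump))
# SET NAMES utf8;
# CREATE TABLE t (c TEXT) DEFAULT CHARSET=utf8;
```

The point is just that the bytes in the dump are untouched; only the labels telling MySQL how to interpret them change.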
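[Editor's note] The serialized-object pitfall mentioned above can be made concrete. A PHP-serialized string stores its BYTE length, e.g. `s:4:"café";` is correct in latin-1 (é = 1 byte) but wrong after re-encoding to UTF-8 (é = 2 bytes). The Python sketch below recomputes those length prefixes; it is an assumption of mine, not part of convert2utf8 or TYPO3, and its naive regex assumes no payload contains the sequence `";` - real data may violate that:

```python
import re

def fix_serialized_lengths(data: bytes) -> bytes:
    """Hypothetical helper: recompute the byte-length prefixes of
    PHP-serialized strings after a latin-1 -> UTF-8 re-encoding."""
    def repl(m: "re.Match[bytes]") -> bytes:
        payload = m.group(2)
        # Replace the stale length with the actual byte length.
        return b's:' + str(len(payload)).encode() + b':"' + payload + b'";'
    # Naive: matches s:<len>:"<payload>"; and re-counts the payload.
    return re.sub(rb's:(\d+):"(.*?)";', repl, data, flags=re.DOTALL)

# latin-1 prefix said 4 bytes; after UTF-8 re-encoding it must say 5.
blob = 's:4:"café";'.encode('utf-8')
print(fix_serialized_lengths(blob))  # b's:5:"caf\xc3\xa9";'
```

This is exactly why Ries's row-by-row approach (de-serialize, convert, re-serialize) is the only fully correct one: re-serializing rebuilds the length prefixes instead of patching them.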