Hi Jeremy,
Thanks - a list of pages that need fixing is not a problem - it's pretty much a 
one-man wiki at the moment, so most of the content will need to be converted.

To add a bit of confusion to the issue, however, I've noticed that the system 
messages are also encoded as ISO-8859-1 and thus display badly in UTF-8. 
They haven't even been customized through the wiki, and I've tried clearing the 
l10n_cache table.  I'm not sure where it's getting the non-UTF-8 versions from. 
Any ideas how I can go about fixing that?  When I switch the page encoding to 
ISO-8859-1 the text displays correctly... 
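
For reference, this is roughly the conversion I have in mind for the affected 
text, as a minimal Python sketch rather than anything MediaWiki provides. The 
function name to_utf8 is made up for illustration, and it assumes that any 
bytes which fail to decode as UTF-8 really are ISO-8859-1. That is only a 
heuristic: a few ISO-8859-1 byte sequences also happen to be valid UTF-8 and 
would be passed through unchanged.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode it.

    Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave it alone
    except UnicodeDecodeError:
        # Every byte is a valid ISO-8859-1 character, so this decode
        # cannot fail; the result is then written back out as UTF-8.
        return raw.decode("iso-8859-1").encode("utf-8")


# A lone 0xFC byte ("ü" in ISO-8859-1) is not valid UTF-8.
legacy = "Bücher".encode("iso-8859-1")
fixed = to_utf8(legacy)
```

Text that is already UTF-8 goes through untouched, so running it over a whole 
page that mixes old and new content line by line should be safe under the 
assumption above.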

Thanks
Andru

On 12/11/2013, at 13:00, mediawiki-l-requ...@lists.wikimedia.org wrote:
> 
> 
> From: Andru Vallance <an...@tinymighty.com>
> Subject: [MediaWiki-l] Character set problem
> Date: 11 November 2013 17:17:07 GMT+01:00
> To: "mediawiki-l@lists.wikimedia.org" <mediawiki-l@lists.wikimedia.org>
> Reply-To: MediaWiki announcements and site admin list 
> <mediawiki-l@lists.wikimedia.org>
> 
> 
> I'm setting up a new wiki installation and running into some problems with 
> garbage characters showing up due to mismatched character sets. The wiki in 
> question is here: http://wikiausland.de/bookshop/Hauptseite
> 
> New articles written in the wiki are fine and display in UTF-8 as expected, but the 
> owner has copied over some content, presumably from an old wiki or MS Word, 
> and it seems like it's in ISO-8859-1 and thus showing a heap of question 
> marks for all the umlauts etc… does anyone know how I can go about converting 
> a page from ISO-8859-1 to UTF-8 easily enough?
> 
> I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it might 
> do the conversion for me on article save, but no joy.  Are there any other 
> options? 
> 
> Any tips would be greatly appreciated!
> 
> Andru
> 
> [1] https://www.mediawiki.org/wiki/Manual:$wgLegacyEncoding
> 
> 
> 
> From: Jeremy Baron <jer...@tuxmachine.com>
> Subject: Re: [MediaWiki-l] Character set problem
> Date: 11 November 2013 17:38:33 GMT+01:00
> To: MediaWiki announcements and site admin list 
> <mediawiki-l@lists.wikimedia.org>
> Reply-To: MediaWiki announcements and site admin list 
> <mediawiki-l@lists.wikimedia.org>
> 
> 
> On Mon, Nov 11, 2013 at 4:17 PM, Andru Vallance <an...@tinymighty.com> wrote:
>> I'm setting up a new wiki installation and running into some problems with 
>> garbage characters showing up due to mismatched character sets. The wiki in 
>> question is here: http://wikiausland.de/bookshop/Hauptseite
>> 
>> New articles written in the wiki are fine and display in UTF-8 as expected, but the 
>> owner has copied over some content, presumably from an old wiki or MS Word, 
>> and it seems like it's in ISO-8859-1 and thus showing a heap of question 
>> marks for all the umlauts etc… does anyone know how I can go about 
>> converting a page from ISO-8859-1 to UTF-8 easily enough?
>> 
>> I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it 
>> might do the conversion for me on article save, but no joy.  Are there any 
>> other options?
> 
> I guess he copied over into a wiki that was already utf8 and so the
> row was marked as being utf8 already when saved.
> 
> $wgLegacyEncoding should do nothing if the row is already utf8. You
> could fix this with a bot or possibly by changing the flag in the DB
> (idk how safe that is...).
> 
> But the very first thing you need is a list of pages that need fixing.
> Maybe that's just as simple as listing that particular user's
> contribs.
> 
> -Jeremy
> 
> 
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l