Hi Jeremy,

Thanks. A list of pages that need fixing is not a problem: it's pretty much a one-man wiki at the moment, so most of the content will probably need to be converted.
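For reference, the per-page conversion itself could look roughly like the sketch below. It assumes the raw page text is available as bytes and that anything that fails to decode as UTF-8 is really ISO-8859-1; the function name `latin1_to_utf8` is made up for illustration and is not part of MediaWiki.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Return raw re-encoded as UTF-8, leaving already-valid UTF-8 untouched."""
    try:
        # If this succeeds, the text is already valid UTF-8: do not convert twice.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        return raw.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    broken = "Bücher für alle".encode("iso-8859-1")
    print(latin1_to_utf8(broken).decode("utf-8"))
```

The UTF-8 check comes first so that pages written directly in the wiki are not double-encoded. A short ISO-8859-1 string can occasionally also be valid UTF-8 by coincidence, so it is worth spot-checking the results.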

To add a bit of confusion to the issue, however, I've noticed that the system messages are also encoded as ISO-8859-1 and thus display badly in UTF-8. They haven't even been customized through the wiki, and I've tried clearing the l10n_cache table. I'm not sure where it's getting the non-UTF-8 versions from. Any ideas how I can go about fixing that?

When I switch the page encoding to ISO-8859-1 the text displays correctly...

Thanks
Andru

On 12/11/2013, at 13:00, mediawiki-l-requ...@lists.wikimedia.org wrote:

>
> From: Andru Vallance <an...@tinymighty.com>
> Subject: [MediaWiki-l] Character set problem
> Date: 11 November 2013 17:17:07 GMT+01:00
> To: "mediawiki-l@lists.wikimedia.org" <mediawiki-l@lists.wikimedia.org>
> Reply-To: MediaWiki announcements and site admin list <mediawiki-l@lists.wikimedia.org>
>
>
> I'm setting up a new wiki installation and running into some problems with
> garbage characters showing up due to mismatched character sets. The wiki in
> question is here: http://wikiausland.de/bookshop/Hauptseite
>
> New articles written in are fine and display in UTF-8 as expected, but the
> owner has copied over some content, presumably from an old wiki or MS Word,
> and it seems like it's in ISO-8859-1 and thus showing a heap of question
> marks for all the umlauts etc… does anyone know how I can go about converting
> a page from ISO-8859-1 to UTF-8 easily enough?
>
> I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it might
> do the conversion for me on article save, but no joy. Are there any other
> options?
>
> Any tips would be greatly appreciated!
>
> Andru
>
> [1] https://www.mediawiki.org/wiki/Manual:$wgLegacyEncoding
>
>
> From: Jeremy Baron <jer...@tuxmachine.com>
> Subject: Re: [MediaWiki-l] Character set problem
> Date: 11 November 2013 17:38:33 GMT+01:00
> To: MediaWiki announcements and site admin list <mediawiki-l@lists.wikimedia.org>
> Reply-To: MediaWiki announcements and site admin list <mediawiki-l@lists.wikimedia.org>
>
>
> On Mon, Nov 11, 2013 at 4:17 PM, Andru Vallance <an...@tinymighty.com> wrote:
>> I'm setting up a new wiki installation and running into some problems with
>> garbage characters showing up due to mismatched character sets. The wiki in
>> question is here: http://wikiausland.de/bookshop/Hauptseite
>>
>> New articles written in are fine and display in UTF-8 as expected, but the
>> owner has copied over some content, presumably from an old wiki or MS Word,
>> and it seems like it's in ISO-8859-1 and thus showing a heap of question
>> marks for all the umlauts etc… does anyone know how I can go about
>> converting a page from ISO-8859-1 to UTF-8 easily enough?
>>
>> I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it
>> might do the conversion for me on article save, but no joy. Are there any
>> other options?
>
> I guess he copied over into a wiki that was already utf8 and so the
> row was marked as being utf8 already when saved.
>
> $wgLegacyEncoding should do nothing if the row is already utf8. You
> could fix this with a bot or possibly by changing the flag in the DB
> (idk how safe that is...).
>
> But the very first thing you need is a list of pages that need fixing.
> Maybe that's just as simple as listing that particular user's
> contribs.
>
> -Jeremy
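
Jeremy's suggestion of listing one user's contributions could be sketched as below. This only builds a query URL for the MediaWiki web API's `list=usercontribs` module and sends nothing. The api.php location and the user name are hypothetical placeholders, since neither is given in this thread.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: the real api.php location and account name
# are not stated anywhere in this thread.
API_URL = "http://wiki.example/api.php"
USER = "ExampleUser"


def contribs_query_url(api_url: str, user: str) -> str:
    """Build a URL asking the MediaWiki API for the titles a user has edited."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,      # whose contributions to list
        "ucprop": "title",   # only page titles are needed here
        "uclimit": "max",    # as many results per request as the wiki allows
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(contribs_query_url(API_URL, USER))
```

The titles in the response would be the candidate list of pages to convert. A long contribution history is returned in batches, so a real script would also have to follow the continuation values the API sends back.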