https://bugzilla.wikimedia.org/show_bug.cgi?id=62870
Bawolff (Brian Wolff) <bawolff...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |d_ent...@yahoo.com
          Component|Media storage               |GWToolset
            Product|Wikimedia                   |MediaWiki extensions
            Summary|Ghost file with strange     |GWtoolset uploaded a file
                   |properties on Commons -     |with non-normalized unicode
                   |database corruption?        |characters causing subtle
                   |                            |breakage

--- Comment #2 from Bawolff (Brian Wolff) <bawolff...@gmail.com> ---
Further investigation. Note that the file is still accessible at
https://commons.wikimedia.org/?curid=31451688

Basically, it appears that somewhere along the line GWToolset didn't normalize
the page title properly, and so created it with the letter 'é' in decomposed
form (i.e. using combining characters: U+0065 followed by U+0301) instead of
the precomposed 'é' (U+00E9). Titles are supposed to be in NFC, so various
things subtly explode when the non-NFC sequence U+0065 U+0301 is used.

All the symptoms mentioned are consistent with an incorrectly normalized db
entry, except maybe symptom 1, which seems to imply that at one point there was
a page using the other form of the é. It is unclear what happened there, given
that the page has since been moved/deleted. Perhaps there were page entries for
both variants, but the proper variant was broken (e.g. the file was fully
uploaded under the wrong é, but as part of the process it was partially
uploaded under the correct é too). Hard to know.

My previous comment (comment 1) seems to have been incorrect, and this has
nothing to do with bug 32551.
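To illustrate the difference between the two forms of é described above, here
is a minimal sketch in Python. It is not GWToolset or MediaWiki code; the
helper names (to_nfc, is_nfc) and the example titles are hypothetical and are
not the actual file name from this bug. It only shows that the decomposed and
precomposed spellings are different strings until they are normalized to NFC.

```python
import unicodedata


def to_nfc(title: str) -> str:
    """Return the title in Unicode Normalization Form C (NFC)."""
    return unicodedata.normalize("NFC", title)


def is_nfc(title: str) -> bool:
    """Return True if the title is already in NFC."""
    return to_nfc(title) == title


# Hypothetical titles for illustration only.
decomposed = "Cafe\u0301.jpg"   # 'e' (U+0065) + combining acute accent (U+0301)
precomposed = "Caf\u00e9.jpg"   # precomposed 'é' (U+00E9)

print(decomposed == precomposed)          # False: different code point sequences
print(is_nfc(decomposed))                 # False: the combining form is not NFC
print(to_nfc(decomposed) == precomposed)  # True: NFC composes e + U+0301 into U+00E9
```

Because the two strings render identically but compare as unequal, a lookup
keyed on the NFC title would presumably miss a row stored under the decomposed
title, which fits the kind of subtle breakage described in this comment.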