MadBob created this task. MadBob added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION

**Steps to replicate the issue** (include links if applicable):
- Do a fresh install of Debian and OpenJDK.
- Follow the instructions at https://github.com/wikimedia/wikidata-query-rdf/blob/master/docs/getting-started.md
- After the munge.sh step, examine the wikidump-000000*.ttl.gz files.
- Find many occurrences of � (U+FFFD, the Unicode replacement character) in place of non-ASCII characters.

**What happens?**:
By default, OpenJDK (at least on Debian) runs with file.encoding set to ANSI_X3.4-1968, that is, US-ASCII. This breaks every non-ASCII character in the UTF-8 input. Either the file encoding should be forced within the WQS programs, or the documentation should tell users to change their system configuration.

**What should have happened instead?**:
Munged output should contain proper UTF-8 strings.

**Software version** (skip for WMF-hosted wikis like Wikipedia): service-0.3.118-SNAPSHOT

TASK DETAIL
https://phabricator.wikimedia.org/T323575

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: MadBob
Cc: MadBob, Aklapper, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
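
For illustration, the symptom described in this task can be reproduced outside of WQS with a few lines of Java. This is only a minimal sketch, not code from the wikidata-query-rdf repository: the class name `EncodingDemo`, the helper `decode`, and the sample string are hypothetical. It decodes UTF-8 bytes once as US-ASCII (the charset that ANSI_X3.4-1968 names) and once as UTF-8, which is roughly what happens when a reader falls back to the JVM default charset instead of naming one explicitly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: decode raw bytes with an explicitly chosen charset,
    // standing in for a reader that would otherwise use the JVM default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Zürich", written with a Unicode escape so the source file itself
        // does not depend on the compiler's source encoding.
        String original = "Z\u00fcrich";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Shows which charset the JVM picked up from the environment.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Each byte >= 0x80 is invalid in US-ASCII and becomes U+FFFD.
        String broken = decode(utf8Bytes, StandardCharsets.US_ASCII);
        // Naming UTF-8 explicitly round-trips the string unchanged.
        String intact = decode(utf8Bytes, StandardCharsets.UTF_8);

        System.out.println("round trip via UTF-8 intact: " + intact.equals(original));
        System.out.println("U+FFFD present via US-ASCII: " + broken.contains("\uFFFD"));
    }
}
```

Passing an explicit charset at every read and write is one way to make a program independent of the environment. A commonly used workaround on the user side is to start the JVM with `-Dfile.encoding=UTF-8` or to configure a UTF-8 locale; whether either is the right fix for munge.sh is left open here.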