[Wikidata-bugs] [Maniphest] T323575: wikidata import process breaks utf-8 characters

2022-12-20 Thread MadBob
MadBob added a comment.

Sorry for the delay... I had to set `JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8` in my environment, and then it worked as expected: munge and load produced correct UTF-8 strings. When we discussed this issue on IRC (wikimedia-search), it appeared a bit strange that WQS tools missed
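
To illustrate why forcing the default encoding matters, here is a minimal sketch of what typically happens when UTF-8 bytes are decoded with a different charset. It is not taken from the WQS tools: the helper name `decode_with` is hypothetical, and Latin-1 is only an assumed stand-in, since the thread does not say which charset the JVM actually fell back to.

```python
# Illustration only: shows how decoding UTF-8 bytes with the wrong
# charset corrupts non-ASCII characters. The thread does not state
# which default charset the JVM used; Latin-1 is an assumption here.

def decode_with(data: bytes, charset: str) -> str:
    """Decode raw bytes using the given charset name (hypothetical helper)."""
    return data.decode(charset)


if __name__ == "__main__":
    original = "Zürich"
    raw = original.encode("utf-8")  # bytes as they would sit in a UTF-8 dump

    # Correct charset: the string round-trips unchanged.
    print(decode_with(raw, "utf-8") == original)

    # Wrong charset: each multi-byte sequence becomes several wrong characters.
    print(decode_with(raw, "latin-1") == original)
```

Setting `-Dfile.encoding=UTF-8` plays the role of passing `"utf-8"` explicitly above: it removes the dependence on whatever default the environment happens to provide.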

[Wikidata-bugs] [Maniphest] T323575: wikidata import process breaks utf-8 characters

2022-11-22 Thread MadBob
MadBob created this task.
MadBob added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION

**Steps to replicate the issue** (include links if applicable):

- fresh install of Debian and OpenJDK
- follow the instructions found here