Hello, I can't read some text properly because of the way NetBeans configures System.in for the running app. The input cannot be decoded with any charset, so the string that is read is essentially corrupted.
I have a Maven app, and I have already configured -Dfile.encoding=utf-8 globally for Maven. I've also added -J-Dfile.encoding=utf-8 for NetBeans, just to be sure. Still, if I paste Γίνεται into the Output window, I get the following from System.in.read():

Charset.defaultCharset: UTF-8
Input Γίνεται :
Γίνεται
147(-109) 175(-81) 189(-67) 181(-75) 196(-60) 177(-79) 185(-71)

The values above are the int (and byte) values that were read. Below is what .getBytes() returns for the actual string constant:

Internal Γίνεται: [-50, -109, -50, -81, -50, -67, -50, -75, -49, -124, -50, -79, -50, -71]
UTF8 Γίνεται : [-50, -109, -50, -81, -50, -67, -50, -75, -49, -124, -50, -79, -50, -71]

So it looks like each character should be encoded as two bytes in UTF-8, but when I paste the text, each character arrives as a single byte. Oddly enough, the text is *displayed* properly both in the editor (for the constant) and in the Output window (for the input text).

Does anybody have UTF-8 input working with their configuration?

Sample code:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public static void main(String[] args) throws IOException {
    System.out.println("Charset.defaultCharset: " + Charset.defaultCharset().displayName());
    String x = "Γίνεται";
    int b;
    System.out.println("Input Γίνεται : ");
    // Read raw bytes from System.in up to the newline (or end of stream),
    // printing each value as an int and as a signed byte.
    List<Integer> bytes = new ArrayList<>();
    while ((b = System.in.read()) != '\n' && b != -1) {
        bytes.add(Integer.valueOf(b));
        System.out.print(b + "(" + Integer.valueOf(b).byteValue() + ") ");
    }
    System.out.println();
    // For comparison: the bytes of the string constant, in the default charset and in UTF-8.
    System.out.println("Internal Γίνεται: " + Arrays.toString(x.getBytes()));
    System.out.println("UTF8 Γίνεται : " + Arrays.toString(x.getBytes("UTF-8")));
    byte[] actualbytes = new byte[bytes.size()];
    for (int i = 0; i < bytes.size(); i++) {
        actualbytes[i] = bytes.get(i).byteValue();
    }
    // Try to decode the bytes that were read with every charset the JVM offers.
    Charset.availableCharsets().forEach((name, charset) -> {
        Scanner s = new Scanner(new ByteArrayInputStream(actualbytes), charset);
        String r = s.next();
        // String r = new String(actualbytes, charset);
        System.out.println(name + ": " + r);
        // A charset is a match only if the decoded text encodes back to the same bytes.
        if (Arrays.equals(actualbytes, r.getBytes())) {
            System.out.println("======== BINGO!!! ==========");
        }
    });
}

Output:

Charset.defaultCharset: UTF-8
Input Γίνεται :
Γίνεται
147(-109) 175(-81) 189(-67) 181(-75) 196(-60) 177(-79) 185(-71)
Internal Γίνεται: [-50, -109, -50, -81, -50, -67, -50, -75, -49, -124, -50, -79, -50, -71]
UTF8 Γίνεται : [-50, -109, -50, -81, -50, -67, -50, -75, -49, -124, -50, -79, -50, -71]
Big5: �紗腔措
Big5-HKSCS: 𢵧蔥覺�
CESU-8: ����ı�
EUC-JP: �週脹�
EUC-KR: ��슉캇�
GB18030: 摨降谋�
GB2312: ��降谋�
GBK: 摨降谋�
IBM-Thai: lฮา๕D๑๙
IBM00858: ô»¢Á─▒╣
IBM01140: l®¨§D£¾
IBM01141: l®¨@D£¾
IBM01142: l®¨§D£¾
IBM01143: l®¨[D£¾
IBM01144: l®¨@D#¾
IBM01145: l®~§D£¾
IBM01146: l®¨§D[¾
IBM01147: l®~]D#¾
IBM01148: l®¨§D£¾
IBM01149: l®¨§D£¾
IBM037: l®¨§D£¾
IBM1026: l®¨§D£¾
IBM1047: l®]§D£¾
IBM273: l®¨@D£¾
IBM277: l®¨§D£¾
IBM278: l®¨[D£¾
IBM280: l®¨@D#¾
IBM284: l®~§D£¾
IBM285: l®¨§D[¾
IBM290: ツルンvD¢z
IBM297: l®~]D#¾
IBM420: lكنﻸDلﻼ
IBM424: l®¨§D£¾
IBM437: ô»╜╡─▒╣
IBM500: l®¨§D£¾
IBM775: ō»ĮĄ─▒╣
IBM850: ô»¢Á─▒╣
IBM852: ô»ŻÁ─▒╣
IBM855: Њ»йх─▒╣
IBM857: ô»¢Á─▒╣
IBM860: ô»╜╡─▒╣
IBM861: ô»╜╡─▒╣
IBM862: ף»╜╡─▒╣
IBM863: ô»╜╡─▒╣
IBM864: ±ﺥﺵ٥ﺅ١٩
IBM865: ô¤╜╡─▒╣
IBM866: Уп╜╡─▒╣
IBM868: ﭖ»ﺿﺷ─▒╣
IBM869: �»ΞΚ─▒╣
IBM870: lި§DĄŹ
IBM871: l®¨§D£¾
IBM918: lﻋﮔﻑDﻍﮎ
ISO-2022-CN: “¯½µÄ±¹
ISO-2022-JP: �������
ISO-2022-JP-2: �������
ISO-2022-KR: “¯½µÄ±¹
ISO-8859-1: “¯½µÄ±¹
ISO-8859-13: “ƽµÄ±¹
ISO-8859-15: “¯œµÄ±¹
ISO-8859-16: “Żœ”ıč
ISO-8859-2: “Ż˝ľÄąš
ISO-8859-3: “Ż½µÄħı
ISO-8859-4: “¯ŊĩÄąš
ISO-8859-5: “ЏНЕФБЙ
ISO-8859-6: “���ؤ��
ISO-8859-7: “―½΅Δ±Ή
ISO-8859-8: “¯½µ�±¹
ISO-8859-9: “¯½µÄ±¹
JIS_X0201: �ッスオトアケ
JIS_X0212-1990: ����
KOI8-R: ⌠╞╫╣д╠╧
KOI8-U: ⌠╞Ґ╣д╠╧
Shift_JIS: 同スオトアケ
TIS-620: �ฏฝตฤฑน
US-ASCII: �������
UTF-16: 鎯붵쒱�
UTF-16BE: 鎯붵쒱�
UTF-16LE: 꾓떽뇄�
UTF-32: ��
UTF-32BE: ��
UTF-32LE: ��
UTF-8: ����ı�
windows-1250: “Ż˝µÄ±ą
windows-1251: “ЇЅµД±№
windows-1252: “¯½µÄ±¹
windows-1253: “―½µΔ±Ή
windows-1254: “¯½µÄ±¹
windows-1255: “¯½µִ±¹
windows-1256: “¯½µؤ±¹
windows-1257: “ƽµÄ±¹
windows-1258: “¯½µÄ±¹
windows-31j: 同スオトアケ
x-Big5-HKSCS-2001: 𢵧蔥覺�
x-Big5-Solaris: �紗腔措
x-euc-jp-linux: �週脹�
x-EUC-TW: ����
x-eucJP-Open: �週脹�
x-IBM1006: “ﺁﺛﭖﺥﺎﺗ
x-IBM1025: lвЕщDыА
x-IBM1046: ﹿﺳﺿ٥ﺅ١٩
x-IBM1097: lﻎﻟﻗDﻐﮔ
x-IBM1098: ﺑ»¤ﺽ─▒╣
x-IBM1112: l®Ķ§D£¾
x-IBM1122: l®¨[D£¾
x-IBM1123: lвЕщDыА
x-IBM1124: “ЏНЕФБЙ
x-IBM1129: “¯½µÄ±¹
x-IBM1166: lвЕщDыА
x-IBM1364: lᅲ��D��
x-IBM1381: 降谋�
x-IBM1383: “的惫
x-IBM29626C: “�議厩
x-IBM300: ����
x-IBM33722: “�議厩
x-IBM737: Υψ╜╡─▒╣
x-IBM833: lᅲ��D��
x-IBM834: �쫏��
x-IBM856: ף»¢�─▒╣
x-IBM874: �ฏฝตฤฑน
x-IBM875: lσφίDάώ
x-IBM921: “ƽµÄ±¹
x-IBM922: “‾½µÄ±¹
x-IBM930: ツルンvD¢z
x-IBM933: lᅲ��D��
x-IBM935: l���D��
x-IBM937: l���D��
x-IBM939: lモ]ヨD£レ
x-IBM942: 同スオトアケ
x-IBM942C: 同スオトアケ
x-IBM943: 同スオトアケ
x-IBM943C: 同スオトアケ
x-IBM948: 胞慔燅�
x-IBM949: 슉캇�
x-IBM949C: 슉캇�
x-IBM950: 蔥覺�
x-IBM964: “���
x-IBM970: “�돨국
x-ISCII91: �ऒटगदऔछ
x-ISO-2022-CN-CNS: “¯½µÄ±¹
x-ISO-2022-CN-GB: “¯½µÄ±¹
x-iso-8859-11: “ฏฝตฤฑน
x-JIS0208: ����
x-JISAutoDetect: 同スオトアケ
x-Johab: 닖쫏캼�
x-MacArabic: …��٥ؤ١٩
x-MacCentralEurope: ďĮĹĶńĪĻ
x-MacCroatian: ìØΩµƒ±š
x-MacCyrillic: Уѓљµƒ±є
x-MacDingbat: �④❽⑩➄⑥❹
x-MacGreek: ™·ΫΒΡ±Ι
x-MacHebrew: ì������
x-MacIceland: ìØΩµƒ±π
x-MacRoman: ìØΩµƒ±π
x-MacRomania: ìŞΩµƒ±π
x-MacSymbol: �↓∝⊗±≠
x-MacThai: ฏฝตฤฑน
x-MacTurkish: ìØΩµƒ±π
x-MacUkraine: Уѓљµƒ±є
x-MS932_0213: 同スオトアケ
x-MS950-HKSCS: 𢵧蔥覺�
x-MS950-HKSCS-XP: 蔥覺�
x-mswin-936: 摨降谋�
x-PCK: 同スオトアケ
x-SJIS_0213: 同スオトアケ
x-UTF-16LE-BOM: 꾓떽뇄�
X-UTF-32BE-BOM: ��
X-UTF-32LE-BOM: ��
x-windows-50220: �������
x-windows-50221: �������
x-windows-874: “ฏฝตฤฑน
x-windows-949: 벏슉캇�
x-windows-950: 蔥覺�
x-windows-iso2022jp: �������

--emi
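
P.S. One pattern in the numbers above may help narrow this down: the seven values read from System.in (147, 175, 189, 181, 196, 177, 185) match the low 8 bits of each character's UTF-16 code unit, for example Γ is U+0393 and 0x93 is 147. The sketch below only checks that arithmetic. It does not show what NetBeans actually does internally, which is an assumption here, and the class and method names (LowByteCheck, lowBytes) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LowByteCheck {
    // The test word "Γίνεται", written with Unicode escapes so that the
    // encoding of this source file cannot influence the result.
    static final String WORD = "\u0393\u03AF\u03BD\u03B5\u03C4\u03B1\u03B9";

    // Hypothetical helper: keeps only the low 8 bits of every UTF-16 code unit,
    // which is what a write of a char through a byte-oriented stream would keep.
    static int[] lowBytes(String s) {
        int[] out = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = s.charAt(i) & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [147, 175, 189, 181, 196, 177, 185], the values reported from System.in.read().
        System.out.println("low bytes: " + Arrays.toString(lowBytes(WORD)));
        // Prints the 14 signed UTF-8 bytes, the same as the "UTF8" line in the output above.
        System.out.println("UTF-8    : " + Arrays.toString(WORD.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If that is indeed what happens, the high byte (0x03 for these Greek letters) is discarded before the program ever sees the input, which would explain why none of the charsets in the list above can decode it.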