** Attachment added: "addon_english.txt"
   
https://bugs.launchpad.net/ubuntu/+source/gedit/+bug/1671512/+attachment/4834676/+files/addon_english.txt

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to gedit in Ubuntu.
https://bugs.launchpad.net/bugs/1671512

Title:
  Gedit fails to read UTF-16 encoded file

Status in gedit package in Ubuntu:
  New

Bug description:
  Affected version: 3.22.0

  I'm trying to open a certain text file. Unsure of the exact encoding
  used, I opened another text file in the same folder (part of the same
  thing) and had gedit auto-detect its encoding as UTF-16. Viewing the
  file in a hex editor, this does indeed seem to be the case. The other
  file contains a lot of CJK characters, while this file contains very
  few (mostly ASCII English, with a few special symbols). You can see it
  in the file's structure: almost every even byte is a zero byte (0x00).
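
  The zero-byte pattern described above can be checked without a hex
  editor. The following is a minimal Python sketch, not anything gedit
  does internally: the helper name sniff_utf16 is hypothetical, and it
  counts offsets from 0, so whether the zeros land on 'even' or 'odd'
  bytes depends on the counting convention used.

```python
import codecs


def sniff_utf16(data: bytes) -> dict:
    """Report the UTF-16 BOM (if any) and the share of zero bytes
    at even and odd 0-based offsets. Hypothetical helper for illustration."""
    if data.startswith(codecs.BOM_UTF16_LE):
        bom = "UTF-16LE"
    elif data.startswith(codecs.BOM_UTF16_BE):
        bom = "UTF-16BE"
    else:
        bom = None

    even = data[0::2]  # bytes at offsets 0, 2, 4, ...
    odd = data[1::2]   # bytes at offsets 1, 3, 5, ...
    zero_even = even.count(0) / len(even) if even else 0.0
    zero_odd = odd.count(0) / len(odd) if odd else 0.0
    return {"bom": bom, "zero_even": zero_even, "zero_odd": zero_odd}
```

  Calling it on the raw bytes of the file, for example
  sniff_utf16(open("addon_english.txt", "rb").read()), should show a
  high zero fraction on one side if the file is mostly-ASCII UTF-16.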

  Gedit fails to open the file with the message:
  Could not open the file “/hdd/programs/thd2/resource/addon_english.txt”.
  Unexpected error: Invalid byte sequence in conversion input

  I first figured the problem was with the text file, so I tried
  'fixing' the file by converting it to its own encoding with the
  'iconv' tool, discarding invalid sequences (-c).

  $ iconv -c -f 'UTF-16' -t 'UTF-16' addon_english.txt > addon_english_fixed.txt
  $ sha1sum addon_english.txt
  e0e9f360482f2f234e5aeb09406c10081ebb6e1a  addon_english.txt
  $ sha1sum addon_english_fixed.txt
  e0e9f360482f2f234e5aeb09406c10081ebb6e1a  addon_english_fixed.txt

  As you can clearly see, nothing changed. Therefore I suspect
  something's wrong with gedit here.
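
  As an independent cross-check of the iconv result, a strict decode in
  Python reports the byte offset of the first sequence it rejects. This
  is only a sketch: first_utf16_error is a hypothetical helper, and
  Python's decoder is a separate implementation from both iconv and
  gedit, so a clean result here does not show what gedit itself sees.

```python
def first_utf16_error(data: bytes, encoding: str = "utf-16"):
    """Return (byte_offset, reason) for the first sequence the strict
    decoder rejects, or None if the data decodes cleanly.
    Hypothetical helper for illustration."""
    try:
        # The "utf-16" codec uses a leading BOM to pick the byte order.
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None
```

  Running first_utf16_error(open("addon_english.txt", "rb").read()) and
  getting None would agree with the unchanged checksum above.
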
  As an aside, other editors also don't like this file much: 

  GNU nano won't open it by default.
  vim will open it, but can't display all the characters in it
  (probably Han unification issues).
  leafpad will nuke the contents, replacing them with a literal
  byte-order mark (a BOM as rendered in Latin-1).

  My locale settings are EN-GB for language and UTF-8 for preferred
  charset used by the OS itself.

  
  The file in question has been attached to this bug report for
  reproduction purposes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/gedit/+bug/1671512/+subscriptions

-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to     : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp
