Would it be helpful to have the option to change the encoding on the fly
like jEdit's "reload with encoding"?

-Keegan


On Wed, Apr 3, 2013 at 4:38 PM, Nick <[email protected]> wrote:

> Inline below.
>
> On Thu, 2013-04-04 at 05:50 +1000, Kai Willadsen wrote:
> > (Answering lots of things at once, and not in order.)
> >
> > On 4 April 2013 00:06, Nick <[email protected]> wrote:
> > > I think I found a solution.  It required 3 pieces:
> > >
> > > 1.  Set Meld's Encodings to:
> > >        utf8, utf-16le, iso8859
> > >
> > > 2.  Set SVN's mime-type property on the UTF-16 files to
> > >     text/plain;encoding=UTF-16LE.
> > >
> > > 3.  Placed a BOM in the UTF-16 files.
> > >
> > > With this configuration I am able to view UTF-8 and UTF-16 files in Meld
> > > without changing the configuration.  The files can come directly from the
> > > filesystem (i.e. meld file1 file2) or via the SVN hook within Meld.
> > >
> > >
> > > In the process of experimenting on this (and I think contributing to the
> > > problem), I think I found a bug in Meld.  It seems that once an attempt
> > > to view/diff a file in SVN fails, other files which normally work also
> > > fail.  Here are the steps where I observe this happening (using
> > > Meld 1.7.0):
> > >
> > > (1)  Open Meld for a directory inside a SVN working copy, which contains
> > > 3 files:  a.xml (a UTF-16LE file without a BOM), b.xml (a UTF-16LE file
> > > with a BOM), c.txt (a UTF-8 file).
>
> The issue seems to be tied to opening a binary file.  In this case,
> a.xml only needs to be considered binary (SVN's svn:mime-type property
> set to application/octet-stream).
>
> > > (2)  Set Meld's Encodings configuration to "utf8, utf-16le"
> > > (3)  Open/View b.xml.  This should work.
> > > (4)  Open/View c.txt.  This should work.
> > > (5)  Attempt to open a.xml.  This should yield an error that the file is
> > > binary (as expected).
> > > (6)  Now attempt to open/view b.xml again.  It fails with the same
> > > error.
> > >
> > > The only way I've found to get it out of this stuck state is to refresh
> > > the listing.
> > >
> > > I can try creating a screen recording of this behavior if it helps.
> >
> > A screen recording wouldn't really help - those are pretty clear
> > instructions - but if there's any way you could provide a SVN working
> > copy to reproduce the problem, then that would be great.
>
> I've attached a patch which exhibits the problem that you should be able
> to apply to any WC since it doesn't require modifications to existing
> files (ie. new files marked for addition exhibit the problem fine).  Let
> me know if you can't apply this patch to a WC and I'll provide a
> procedure or a script to create the same (it's very easy).
>
> Note the svn:mime-type property on each file:
>  - File1.txt is UTF-16LE that's marked as binary by SVN
> (application/octet-stream).
>  - File2.txt is UTF-16LE that's marked as UTF-16LE by SVN.
>  - File3.txt is UTF-8.
>
> To see the problem:
> 1.  Open meld for the directory in the WC where you applied the patch.
> 2.  Set the Encodings to "utf8, utf-16le".
> 3.  Open/view the files in reverse order: (File3.txt, File2.txt,
> File1.txt).  Notice that you can view File3 & File2, but not File1.
> That's sort-of expected since File1.txt is marked as binary.  (I say
> sort-of because the file is really UTF-16LE and includes a suitable
> BOM).
> 4.  Now open/view File2.txt again, and notice that it does not open this
> time.  Refresh the directory listing in Meld, and you can once again
> open it.
>
> Let me know if I can help w/ more info.
>
>
> >
> > > On Wed, 2013-04-03 at 09:41 -0400, Nick wrote:
> > >> Looks like if I change the order of the codecs such that utf16 is listed
> > >> first, then Meld displays the file fine.  But then I lose the ability to
> > >> view UTF-8 files.  So it seems like it's one or the other, but not both.
> > >>
> > >> If this is true, I don't understand the purpose of being able to specify
> > >> more than one encoding in the Preferences dialog.
> > >>
> > >> Can Meld try each specified encoding in turn for as long as the file is
> > >> not displayable (including when it is found to be a 'binary' file)?  That
> > >> would let me specify "utf8, utf16" for the encodings, so that both UTF-8
> > >> and UTF-16 files can be used in Meld w/out changing the configuration.
> >
> > That's exactly what we do... except that the binary file check is
> > unrelated to the rest. Having said that, reordering those really
> > shouldn't avoid the binary file check.
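
The ordered fallback described above can be sketched in Python.  This is
only an illustration, not Meld's actual code; decode_with_fallback and the
sample byte strings are hypothetical names made up for the example.  It
suggests one plausible reason why listing utf-16le first costs you UTF-8
files: UTF-16LE will "successfully" decode almost any even-length byte
string, so the later entries are never tried.

```python
import codecs


def decode_with_fallback(data, encodings):
    """Return (text, name) for the first listed codec that decodes data."""
    for name in encodings:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Bytes are invalid for this codec, or the codec name is unknown.
            continue
    raise ValueError("no listed encoding could decode the data")


# Hypothetical sample contents: a UTF-16LE file with a BOM, and a UTF-8 file
# whose length happens to be an even number of bytes.
utf16_file = codecs.BOM_UTF16_LE + "<root/>".encode("utf-16-le")
utf8_file = "plain text!\n".encode("utf-8")

# utf8 first: the BOM byte 0xFF is invalid UTF-8, so utf-16le gets its turn.
print(decode_with_fallback(utf16_file, ["utf8", "utf-16le"])[1])  # utf-16le
print(decode_with_fallback(utf8_file, ["utf8", "utf-16le"])[1])   # utf8

# utf-16le first: the UTF-8 bytes decode without error, but as garbage.
text, name = decode_with_fallback(utf8_file, ["utf-16le", "utf8"])
print(name, text == "plain text!\n")  # utf-16le False
```

With utf8 listed first, a UTF-16 file with a BOM falls through to the next
codec because 0xFF can never appear in valid UTF-8.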
> >
> > >> On Wed, 2013-04-03 at 08:48 -0400, Nick wrote:
> > >> > Hi,
> > >> >
> > >> > First and foremost, thanks for a great diff & merge tool!
> > >> >
> > >> > My project involves XML files which need to be encoded in UTF-16 Little
> > >> > Endian.  I cannot seem to view or diff UTF-16 files with Meld.
> > >> >
> > >> > In the Encoding tab of the Preferences dialog I have this for the
> > >> > codecs:
> > >> >
> > >> >     utf8, iso8859, utf16, utf-16, utf16le, utf-16le
> > >> >
> > >> > When I try to open a UTF-16LE file that's in SVN, Meld displays a yellow
> > >> > error bar on top which reads, "Error fetching original comparison file".
> > >> > I've confirmed UTF-8 files in the repo open fine--it's only an issue w/
> > >> > UTF-16 files.
> > >> >
> > >> > It behaves the same even for files which are marked for addition in the
> > >> > repo but not yet added (so in this case, there's nothing to diff
> > >> > against, but normally Meld will display the contents of the file
> > >> > alongside a blank pane).
> > >> >
> > >> > I've tried UTF-16 files that contain a BOM and files which do not; no
> > >> > difference.
> > >> >
> > >> > I notice that SVN sets the mime-type on these files as binary
> > >> > (application/octet-stream).  If I manually change it to UTF-16LE
> > >> > (text/plain;encoding=UTF-16LE), Meld displays a yellow error bar on top
> > >> > which reads, "Could not read file" "test.xml appears to be a binary
> > >> > file."--but it still doesn't display the contents of the file.
> >
> > I had no idea the mime-type behaviour would be different... we
> > certainly don't do anything on the SVN end with regards to that. I
> > guess that's a possibly-interesting issue with the new SVN support.
>
> Yeah, I got the idea from
> http://rhubbarb.wordpress.com/2012/04/28/svn-unicode/ which speaks only
> about subversion support.  I confirmed that the svn command line tool
> functions fine according to the mime-type.
>
>
> > What version of Subversion are you using here? We fetch files in very
> > different ways for <1.6 and 1.7.
>
> Client and server (same machine) are 1.7:
>
> nick@nimble ~/test_repo $ svn --version
> svn, version 1.7.7 (r1393599)
>    compiled Jan  5 2013, 15:01:56
>
>
> > >> > If I call meld and pass it 2 UTF-16 files on the file system (ie. not
> > >> > trying to open a file from the SVN listing), I still get a yellow error
> > >> > bar on top which reports "Could not read file" "test.xml appears to be a
> > >> > binary file."
> > >> >
> > >> > Is there something else I need to do?
> > >> >
> > >> > Has anyone used Meld to diff UTF-16 files?
> >
> > No, and it's known not to work. In fact, it shouldn't be possible to
> > view UTF-16 files in Meld. Or at least, this is what I would have said
> > if I'd seen this email before I saw your follow-ups.
> >
> > The problem is that in FileDiff._load_files, we check for null bytes
> > in the file we're reading in, and throw up our hands and declare a
> > file to be binary if there are any. This works shockingly well,
> > considering how wrong it is. Obviously it falls over pretty badly for
> > UTF16. What I'm actually more puzzled by is that you've somehow
> > managed to find a way around this!
> >
> > Also, this is bug 632540:
> >     https://bugzilla.gnome.org/show_bug.cgi?id=632540
> >
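
To make the null-byte problem concrete, here is a minimal Python sketch of
that kind of heuristic.  It is not the code from FileDiff._load_files;
looks_binary is a hypothetical stand-in.  ASCII-range characters encoded as
UTF-16 carry a zero byte each, so any such check flags ordinary UTF-16 text
as binary.

```python
def looks_binary(data):
    """Naive heuristic: any NUL byte means the content is treated as binary."""
    return b"\x00" in data


# The same hypothetical XML snippet in two encodings.
utf8_bytes = "<root/>".encode("utf-8")
utf16_bytes = "<root/>".encode("utf-16-le")

print(looks_binary(utf8_bytes))   # False
print(looks_binary(utf16_bytes))  # True: each ASCII char is followed by 0x00
```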
>
> It's nice when things work when they ought not to, huh?  Certainly nicer
> than the opposite.  :)
>
> I'm too lazy to look at the code now, so do you scan the entire file for
> null bytes, or just the beginning?  As I mentioned, the presence of the
> BOM makes a difference.  Is it possible it's only looking at the
> beginning couple bytes?
>
> One thing I noticed was that specifying invalid encodings makes a
> difference.  I got the clue from some error messages printed to the
> console about one of my encodings being invalid.  I got the initial list
> from iconv's list, where they are all valid, but I guess Meld's
> underlying library has a different list.  Anyway, that's how my list shrank
> from: utf8, iso8859, utf16, utf-16, utf16le, utf-16le
> to:   utf8, utf-16le, iso8859
>
> Using process of elimination I found the only encoding that actually
> worked for UTF-16LE files is utf-16le.  The others I had (like utf16)
> did not work.  I mention this because in the bug you referenced, Martin
> Weis reports Meld does not work and cites the encodings "utf16 utf-16".
> It's possible that is the cause of the problem for him.
> But I can say for sure the prescriptive steps I provided above work.
> Please don't break it. :)
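
One way to vet an encodings list before pasting it into the Preferences
dialog is to ask Python's codec registry which names it recognises.  This
is a sketch under the assumption that the registry Meld consults behaves
like Python's standard one, which may not hold for every Meld version.
split_valid_encodings and "no-such-codec" are hypothetical.

```python
import codecs


def split_valid_encodings(names):
    """Partition names into those Python's codec registry knows and the rest."""
    valid, invalid = [], []
    for name in names:
        try:
            codecs.lookup(name)
            valid.append(name)
        except LookupError:
            # Raised for names the registry cannot resolve.
            invalid.append(name)
    return valid, invalid


valid, invalid = split_valid_encodings(["utf8", "utf-16le", "no-such-codec"])
print(valid)    # ['utf8', 'utf-16le']
print(invalid)  # ['no-such-codec']
```

A name being recognised only means the codec exists.  It says nothing
about whether that codec can decode a given file.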
>
>
> > cheers,
> > Kai
>
> _______________________________________________
> meld-list mailing list
> [email protected]
> https://mail.gnome.org/mailman/listinfo/meld-list
>