On 4 April 2013 06:38, Nick <[email protected]> wrote:
> On Thu, 2013-04-04 at 05:50 +1000, Kai Willadsen wrote:
>> A screen recording wouldn't really help - those are pretty clear
>> instructions - but if there's any way you could provide a SVN working
>> copy to reproduce the problem, then that would be great.
>
> I've attached a patch which exhibits the problem that you should be able
> to apply to any WC since it doesn't require modifications to existing
> files (i.e. new files marked for addition exhibit the problem fine).  Let
> me know if you can't apply this patch to a WC and I'll provide a
> procedure or a script to create the same (it's very easy).
>
> Note the svn:mime-type property on each file:
>  - File1.txt is UTF-16LE that's marked as binary by SVN
> (application/octet-stream).
>  - File2.txt is UTF-16LE that's marked as UTF-16LE by SVN.
>  - File3.txt is UTF-8.
>
> To see the problem:
> 1.  Open meld for the directory in the WC where you applied the patch.
> 2.  Set the Encodings to "utf8, utf-16le".
> 3.  Open/view the files in reverse order: (File3.txt, File2.txt,
> File1.txt).  Notice that you can view File3 & File2, but not File1.
> That's sort-of expected since File1.txt is marked as binary.  (I say
> sort-of because the file is really UTF-16LE and includes a suitable
> BOM).
> 4.  Now open/view File2.txt again, and notice that it does not open this
> time.  Refresh the directory listing in Meld, and you can once again
> open it.
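
For reference, Nick's point about the BOM can be checked mechanically. The
following is only a minimal sketch, not anything Meld is known to do:
sniff_bom and the _BOMS table are hypothetical names introduced here, and
only the BOM constants come from Python's standard codecs module.

```python
import codecs

# Known byte-order marks. The UTF-32 marks come first because the UTF-32LE
# BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading BOM, or None (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# A file like File1.txt or File2.txt would start with the UTF-16LE BOM.
sample = codecs.BOM_UTF16_LE + "File1".encode("utf-16-le")
print(sniff_bom(sample))
print(sniff_bom(b"plain ASCII"))
```

A check like this only says what the file claims to be; it is independent of
whatever svn:mime-type says about the file.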

So I've applied your patch and can confirm the first few bits, but I
can't get the stuck state to occur. When you hit the problem, do you
get any command-line tracebacks?

Also, this is Meld 1.7.0, right? 1.7.1 (and head) is very different,
but I can't reproduce with either of them from your sample. I'd
certainly be interested to know whether you can reproduce with the
current git version (it's easy to try! clone and run from the
directory).

>> What version of Subversion are you using here? We fetch files in very
>> different ways for <1.6 and 1.7.
>
> Client and server (same machine) are 1.7:
>
> nick@nimble ~/test_repo $ svn --version
> svn, version 1.7.7 (r1393599)
>    compiled Jan  5 2013, 15:01:56

Right. When I said this, I was thinking of Meld 1.7.1+. It won't make
any difference on 1.7.0, but I'm testing with the same SVN version
anyway.

>> No, and it's known not to work. In fact, it shouldn't be possible to
>> view UTF-16 files in Meld. Or at least, this is what I would have said
>> if I'd seen this email before I saw your follow-ups.
>>
>> The problem is that in FileDiff._load_files, we check for null bytes
>> in the file we're reading in, and throw up our hands and declare a
>> file to be binary if there are any. This works shockingly well,
>> considering how wrong it is. Obviously it falls over pretty badly for
>> UTF16. What I'm actually more puzzled by is that you've somehow
>> managed to find a way around this!
>>
>> Also, this is bug 632540:
>>     https://bugzilla.gnome.org/show_bug.cgi?id=632540
>>
>
> It's nice when things work when they ought not to, huh?  Certainly nicer
> than the opposite.  :)

No! I like knowing why things that work work, and why things that
break break! :)

> I'm too lazy to look at the code now, so do you scan the entire file for
> null bytes, or just the beginning?  As I mentioned, the presence of the
> BOM makes a difference.  Is it possible it's only looking at the
> beginning couple bytes?

We scan each chunk as we read it from the file. Of course, it's not
like UTF-16 is guaranteed to have null bytes, so maybe your sample just
happens to work?
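
To illustrate the kind of per-chunk check described above, here is a
minimal sketch. It is not the actual FileDiff._load_files code;
looks_binary and its chunk_size parameter are hypothetical names used
only for this example.

```python
import io


def looks_binary(stream, chunk_size=4096):
    """Return True if any chunk read from stream contains a null byte.

    Hypothetical helper: it mimics a "null byte means binary" heuristic.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False
        if b"\x00" in chunk:
            return True


# ASCII text encoded as UTF-16LE: every second byte is 0x00.
ascii_as_utf16 = "hello".encode("utf-16-le")
# Three CJK characters (U+65E5, U+672C, U+8A9E): neither byte of any code
# unit is 0x00, so the heuristic does not trigger.
cjk_as_utf16 = "\u65e5\u672c\u8a9e".encode("utf-16-le")

print(looks_binary(io.BytesIO(ascii_as_utf16)))  # True
print(looks_binary(io.BytesIO(cjk_as_utf16)))    # False
```

So a UTF-16 file made up mostly of ASCII-range characters would normally
trip a check like this, while one with no such characters could slip
through.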

> One thing I noticed was that specifying invalid encodings makes a
> difference.  I got the clue from some error messages printed to the
> console about one of my encodings being invalid.  I got the initial list
> from iconv's list, where they are all valid, but I guess Meld's
> underlying library has a different list.  Anyway, that's how my list shrank
> from: utf8, iso8859, utf16, utf-16, utf16le, utf-16le
> to:   utf8, utf-16le, iso8859

We use Python's list of encodings which is... not authoritative. I
would have thought that most of those would be recognised aliases, but
I haven't checked.
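
One way to check is to ask Python directly which names it resolves. This
is a minimal sketch using the standard codecs.lookup call;
resolve_encodings is a hypothetical helper, and which aliases are
recognised may differ between Python versions, so no particular result is
claimed for the full list.

```python
import codecs


def resolve_encodings(names):
    """Map each name to Python's canonical codec name, or None if unknown."""
    resolved = {}
    for name in names:
        try:
            resolved[name] = codecs.lookup(name).name
        except LookupError:
            # codecs.lookup raises LookupError for names it cannot resolve.
            resolved[name] = None
    return resolved


# The original list from the report above.
candidates = ["utf8", "iso8859", "utf16", "utf-16", "utf16le", "utf-16le"]
for name, canonical in resolve_encodings(candidates).items():
    print(name, "->", canonical)
```

Any name that maps to None here is one Python itself does not know,
regardless of what iconv lists.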

> Using process of elimination I found the only encoding that actually
> worked for UTF-16LE files is utf-16le.  The others I had (like utf16)
> did not work.  I mention this because in the bug you referenced, Martin
> Weis reports Meld does not work and cites the encodings "utf16 utf-16".
> It's possible that is the cause of the problem for him.
> But I can say for sure the prescriptive steps I provided above work.
> Please don't break it. :)

I'll try not to, though I'd still like to know why it's working. Could
I ask you to dump whatever you currently have to hand in that bugzilla
bug, just for future reference?

...and I'm glad that it's working for you, even if it scares me slightly.

cheers,
Kai
_______________________________________________
meld-list mailing list
[email protected]
https://mail.gnome.org/mailman/listinfo/meld-list
