Re: Test::Builder versus Unicode

2004-12-22 Thread Nicholas Clark
On Wed, Dec 22, 2004 at 11:41:56AM -0800, Ovid wrote:
> --- Nicholas Clark <[EMAIL PROTECTED]> wrote:
> 
> > On Wed, Dec 22, 2004 at 10:26:02AM -0800, David Wheeler wrote:
> > > 1. Perl gets smarter about duping file handles, so that the dupes
> > > get the same i/o layer settings as the handles they dupe.
> > 
> > Changing this going forwards doesn't change any of the installed
> > perls out there in the wild.
> 
> So far, given that this problem has only surfaced in relation to
> Unicode, I can't say I'm overly concerned about fixing it on versions
> of Perl where Unicode is already known to be broken.  Of course, as

Which you're sort of implying is all versions up to and including 5.8.6 :-)
(Well, I can misread it as that. I don't think that you really are implying
this).

Personally I'd be quite happy using anything 5.8.3 or later with Unicode.
The later the better, as more bugs have got fixed. But I feel that Unicode
isn't more broken than any other existing part of perl by 5.8.1. And they
are out there, and they aren't going away rapidly.

Nicholas Clark




Re: Test::Builder versus Unicode

2004-12-22 Thread Ovid
--- Nicholas Clark <[EMAIL PROTECTED]> wrote:

> On Wed, Dec 22, 2004 at 10:26:02AM -0800, David Wheeler wrote:
> > 1. Perl gets smarter about duping file handles, so that the dupes
> > get the same i/o layer settings as the handles they dupe.
> 
> Changing this going forwards doesn't change any of the installed
> perls out there in the wild.

So far, given that this problem has only surfaced in relation to
Unicode, I can't say I'm overly concerned about fixing it on versions
of Perl where Unicode is already known to be broken.  Of course, as
soon as someone comes up with other layers for which we have a
definite problem, I'll shut up.

Cheers,
Ovid

=
Silence is Evil
http://users.easystreet.com/ovid/philosophy/decency.html
Ovid   http://www.perlmonks.org/index.pl?node_id=17000
Web Programming with Perl  http://users.easystreet.com/ovid/cgi_course/


Re: Test::Builder versus Unicode

2004-12-22 Thread Nicholas Clark
On Wed, Dec 22, 2004 at 10:26:02AM -0800, David Wheeler wrote:
> 1. Perl gets smarter about duping file handles, so that the dupes get 
> the same i/o layer settings as the handles they dupe.


Changing this going forwards doesn't change any of the installed perls out
there in the wild. So whatever happens in the future, if we want things to
work on existing installations, we need to work round the problem in some
way.

(I don't currently have time to look into or think about whether it's a bug.
Or how to fix it if it is. Or who could fix it.)

Nicholas Clark
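
The point about installed perls suggests that any workaround has to degrade quietly on older versions. A minimal sketch of such a guard follows. The helper name binmode_if_supported is hypothetical, and the 5.8.0 cutoff is an assumption based on when PerlIO layers arrived; it is not something settled in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: apply an I/O layer only where the running perl
# is assumed new enough to have PerlIO layers (5.8.0 or later).
sub binmode_if_supported {
    my ($fh, $layer) = @_;

    # Assumption: perls older than 5.8.0 are left alone entirely.
    return 0 if $] < 5.008;

    # binmode returns false on failure; the eval also traps a die.
    return eval { binmode($fh, $layer) } ? 1 : 0;
}
```

A caller would do something like binmode_if_supported($fh, ':utf8') and simply carry on when it returns 0.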


Re: Test::Builder versus Unicode

2004-12-22 Thread David Wheeler
On Dec 20, 2004, at 6:44 PM, David Wheeler wrote:
>> PS  Somebody should drag autrijus into this.
>
> I'll try to grab him on IRC in the morning...

I got him this morning. Here's the discussion:
[09:50am] Theory: seen autrijus
[09:50am] purl: autrijus was last seen on #p5p 1 hour and 32 minutes 
ago, saying: rofl
[09:50am] autrijus: you seek me?
[09:50am] modred: you didn't even need to say his name three times
[09:50am] xantus: seek(0)
[09:50am] Theory: autrijus: I do!
[09:51am] autrijus: pray tell(), why?
[09:51am] Theory: autrijus: Are you on perl-qa?
[09:51am] autrijus: nay, I am not
[09:51am] Theory: autrijus: Let me find an archive link for a unicode & 
Test::Builder discussion there.
[09:51am] Theory: I have a feeling you'll already be familiar with the 
issue...
[09:51am] autrijus: http://www.nntp.perl.org/group/perl.qa/3404
[09:52am] Theory: autrijus: D'oh!
[09:52am] Theory: You're way ahead of me.
[09:52am] Theory: autrijus: What do you think?
[09:52am] purl: I think Theory should try flossing more often!
[09:52am] autrijus: give me a test case that involves big5?
[09:52am] Theory vomits on purl
[09:53am] Theory: autrijus: Hrm. I'd have to dig up some Big5.
[09:53am] Theory: autrijus: The problem won't come up with Big5 unless 
you binmode Test::Builder's FHs to utf8.
[09:54am] Theory: autrijus: Schwern was suggesting that they just 
always be utf8, but I thought that'd break things when you used 
non-utf8 characters.
[09:54am] autrijus: if it's always in utf8 and you send random binary 
data there
[09:54am] autrijus: it could be unhappy.
[09:54am] Theory: autrijus: Exactly, and that's when Schwern threw up 
his hands in disgust.
[09:54am] Theory: seen Schwern
[09:54am] purl: Schwern was last seen on #perl 24 minutes ago, saying: 
crab: I've read "Naked"
[09:55am] Schwern left the chat room. (Ping timeout: 240 seconds)
[09:56am] Theory: autrijus: So it seems to me that there needs to be 
some way to tell Test::Builder what binmode to use on its file handles.
[09:56am] autrijus: so T::B uses its own fh
[09:56am] autrijus: apart from STDOUT
[09:56am] autrijus: and it does not inherit STDOUT's layers.
[09:56am] autrijus: is it that?
[09:56am] Theory: autrijus: Yes, it dupes STDOUT and STDERR, but the 
duping doesn't preserve binmode.
[09:56am] Theory: right
[09:57am] autrijus: so it seems to me that the right fix is for the 
duping to fix binmode.
[09:57am] Theory: It does the duping so it can know what it's 
outputting to STDOUT (and STDERR) as opposed to what the scripts are 
outputting.
[09:57am] Theory: autrijus: Me too, but I don't know if there's a way 
to detect the layer assigned to a file handle. Do you?
[09:57am] autrijus: I do.
[09:57am] uri: Theory: i bet duping occurs at the system level so 
binmode (which is perl i/o level) is lost
[09:58am] autrijus: I don't want to go there.
[09:58am] Theory: uri: Good point.
[09:58am] autrijus thinks.
[09:58am] Theory: autrijus: Uh, why not? What is it?
[09:58am] uri: Theory: perl prolly calls dup() or variant to do it
[09:58am] Theory: uri: Yeah
[09:59am] uri: and just returns a plain handle. it should be smart 
about copying i/o flags
[09:59am] autrijus: you can use a layer to read io layers.
[09:59am] uri: but you can write a smart dup sub
[09:59am] autrijus: I don't think there is an api for that.
[09:59am] hachi: dup sub!
[09:59am] autrijus: and even if we write one, T::B could not ship it
[09:59am] autrijus: because it's, well, XS
[10:01am] Theory: autrijus: I was afraid of that.
[10:01am] autrijus: and sometimes it does not make sense.
[10:01am] Theory: autrijus: So the workaround is to have some way to 
tell Test::Builder what mode to use.
[10:01am] autrijus: yes.
[10:02am] autrijus: and in this regard I agree with your binmode proposal.
[10:02am] Theory: autrijus: Okay.
[10:02am] Theory: pokes schwern
Ovid: Autrijus: what about using Ingy's Devel::Pointer to get down to 
that flag info? It's pure Perl and relatively cross-platform.
[10:11am] hachi: you can't actually twiddle the flag with that, I 
thought
[10:12am] Ovid: But you should be able to read them and create a new 
handle with appropriate flags, yes?
[10:12am] Theory: uri: I agree that it should be smarter about copying 
i/o flags.
[10:12am] Theory: Maybe I should mention it on #p5p
[10:12am] crab: what are we talking about again?
[10:12am] hachi: Ovid: I think the problem is that you can't actually 
touch the FH that is in question here... otherwise they would just do a 
binmode() on it
[10:13am] Ovid: Oh, I take it back. Devel::Pointer is Cozens' module. 
It's http://search.cpan.org/~ingy/Pointer-0.10/ that I meant.
[10:13am] Ovid: Yes, but aren't the filehandles getting duped in T::B? 
If so, you can read the flags on the handles they're getting duped 
from.
[10:14am] Theory: ovid: Yes, the ideal solution would be for Test::B to 
detect the flags on the STDERR and STDOUT file handles and copy them to 
its duped versions.
[10:14am] Theory: ovid: Even better would be if Perl did it 

Re: Test::Builder versus Unicode

2004-12-20 Thread David Wheeler
On Dec 20, 2004, at 6:41 PM, Michael G Schwern wrote:
> PS  Somebody should drag autrijus into this.

I'll try to grab him on IRC in the morning...

Regards,
David


Re: Test::Builder versus Unicode

2004-12-20 Thread Michael G Schwern
My Official Policy on this is now to let people who actually understand
character encodings work it out and just wait for a patch.

PS  Somebody should drag autrijus into this.

-- 
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
We don't know.  But if we did, we wouldn't tell you.


Re: Test::Builder versus Unicode

2004-12-20 Thread David Wheeler
On Dec 20, 2004, at 6:19 PM, Michael G Schwern wrote:
>> Is there a module or function in Perl that can provide this 
>> information?
>
> Why does it matter what it was set to before?  I'm always going to be
> shoving text out through this filehandle.

It matters because if I'm using Big5 in my module, I *don't* want 
binmode set to ":utf8", which is Perl's internal representation of 
UTF-8. I would want it set to ":big5".

> Again, this is not something the user should have to care about.
> Only text is shoved through those filehandles so setting them to handle
> Unicode should always be the right thing to do, unless it breaks an old
> perl.

Well, if that's the case, then the smarter thing might be to encode 
utf8 strings in Test::Builder before outputting them. You'd have to do 
something like this:

  print $fh map { Encode::is_utf8($_) ? Encode::encode_utf8($_) : $_ } @_;

This should prevent the warning from happening.
Regards,
David
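
To illustrate why the choice of encoding matters here, the following sketch encodes the same character both ways and compares the byte lengths. The helper name encode_for_output is hypothetical. It also assumes the encoding is named 'big5' as the Encode module spells it; the corresponding I/O layer would be written ':encoding(big5)' rather than ':big5'.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn character strings into bytes in the
# encoding the caller asks for, leaving the originals untouched.
sub encode_for_output {
    my ($encoding, @strings) = @_;
    return map { my $copy = $_; encode($encoding, $copy) } @strings;
}

my $char = "\x{4E2D}";    # a single CJK character
my ($as_utf8) = encode_for_output('UTF-8', $char);
my ($as_big5) = encode_for_output('big5',  $char);

# The two byte strings differ, so output forced to UTF-8 would not be
# what a Big5 terminal or log reader expects.
printf "UTF-8: %d bytes, Big5: %d bytes\n",
    length($as_utf8), length($as_big5);
```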


Re: Test::Builder versus Unicode

2004-12-20 Thread Michael G Schwern
On Mon, Dec 20, 2004 at 06:20:41PM -0800, David Wheeler wrote:
> If not, another option is to add a binmode option to Test::Builder (and 
> the modules that depend on it). So you could do something like this:
> 
>   use Test::More tests => 6, binmode => ':utf8';
> 
> Thoughts?

Again, this is not something the user should have to care about.

Only text is shoved through those filehandles so setting them to handle
Unicode should always be the right thing to do, unless it breaks an old
perl.


-- 
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
I hate war as only a soldier who has lived it can, only as one who has
seen its brutality, its stupidity.
-- Dwight D. Eisenhower


Re: Test::Builder versus Unicode

2004-12-20 Thread chromatic
On Mon, 2004-12-20 at 18:20 -0800, David Wheeler wrote:

> If not, another option is to add a binmode option to Test::Builder (and 
> the modules that depend on it). So you could do something like this:
> 
>   use Test::More tests => 6, binmode => ':utf8';
> 
> Thoughts?

I'd rather override Test::Builder::Output.  Schwern, how's that
refactoring we planned two years and a few months ago coming?

Oh right.  Yeah, me too.

Sorry,
-- c



Re: Test::Builder versus Unicode

2004-12-20 Thread David Wheeler
On Dec 20, 2004, at 6:13 PM, David Wheeler wrote:
> If there was a way to tell what mode was on STDERR before you duped 
> it, you could just set it to the same. Something like:
>
>   my $mode = what_binmode(STDERR);
>   my $fh = $builder->failure_output;
>   binmode $fh, $mode;
>
> Is there a module or function in Perl that can provide this 
> information?

If not, another option is to add a binmode option to Test::Builder (and 
the modules that depend on it). So you could do something like this:

  use Test::More tests => 6, binmode => ':utf8';

Thoughts?
Regards,
David
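
The binmode import option above is only a proposal. Until something like it exists, a test script can get a similar effect by setting the layer on each of the builder's handles itself. This is a sketch with a hypothetical helper name, set_builder_layer; it relies only on the output, failure_output and todo_output accessors that Test::Builder documents.

```perl
use strict;
use warnings;
use Test::Builder;

# Hypothetical helper, not part of Test::Builder: apply one layer to
# every handle the shared builder object writes to.
sub set_builder_layer {
    my ($layer) = @_;
    my $builder = Test::Builder->new;

    for my $fh ($builder->output,
                $builder->failure_output,
                $builder->todo_output) {
        binmode($fh, $layer) or die "binmode failed: $!";
    }
    return $builder;
}
```

Calling set_builder_layer(':utf8') in a BEGIN block placed after the sub definition, before any test output, would play the role of the proposed import option.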


Re: Test::Builder versus Unicode

2004-12-20 Thread Michael G Schwern
On Mon, Dec 20, 2004 at 06:13:54PM -0800, David Wheeler wrote:
> > Test::Builder should do something like this internally, it's not like
> > anyone's going to drive binary data through a TB filehandle.  The
> > question is how does one do it without breaking older perls?
> 
> If there was a way to tell what mode was on STDERR before you duped it, 
> you could just set it to the same. Something like:
> 
>   my $mode = what_binmode(STDERR);
>   my $fh = $builder->failure_output;
>   binmode $fh, $mode;
> 
> Is there a module or function in Perl that can provide this information?

Why does it matter what it was set to before?  I'm always going to be
shoving text out through this filehandle.


-- 
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
And God was pleased.
And Dog was happy and wagged his tail.
And Adam was greatly improved.
And Cat did not care one way or the other.
-- http://www.catsarefrommars.com/creationist.htm


Re: Test::Builder versus Unicode

2004-12-20 Thread David Wheeler
On Dec 20, 2004, at 6:06 PM, Michael G Schwern wrote:
>   use Test::Builder;
>   BEGIN {my $fh = Test::Builder->new->failure_output; binmode $fh, ':utf8';}
>
> Test::Builder should do something like this internally, it's not like
> anyone's going to drive binary data through a TB filehandle.  The
> question is how does one do it without breaking older perls?

If there was a way to tell what mode was on STDERR before you duped it, 
you could just set it to the same. Something like:

  my $mode = what_binmode(STDERR);
  my $fh = $builder->failure_output;
  binmode $fh, $mode;

Is there a module or function in Perl that can provide this information?
Regards,
David
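
One possible answer to the question above, offered as a sketch rather than a tested fix: on perls that provide PerlIO::get_layers, the layers on a handle can be read and reapplied to a dup. what_binmode here is a hypothetical implementation of the function wished for in the message, and dup_with_layers is likewise an invented name. The sketch assumes the base layers are called unix and perlio, which may not hold on every build.

```perl
use strict;
use warnings;
use PerlIO ();

# Hypothetical stand-in for the what_binmode() wished for above.
# Returns a layer string usable as binmode's second argument.
sub what_binmode {
    my ($fh) = @_;
    my %seen;

    # Assumption: "unix" and "perlio" are the base layers and need not
    # be reapplied; everything else is kept once, in order.
    my @layers = grep { !/^(?:unix|perlio)$/ && !$seen{$_}++ }
                 PerlIO::get_layers($fh);
    return join ':', '', 'raw', @layers;
}

# Hypothetical helper: dup $src for writing and copy its layers over.
sub dup_with_layers {
    my ($src) = @_;
    open(my $dup, '>&', $src) or die "Cannot dup handle: $!";
    binmode($dup, what_binmode($src)) or die "Cannot set layers: $!";
    return $dup;
}
```

With these in place, the snippet in the message would read binmode $fh, what_binmode(\*STDERR), passing the handle as a glob reference.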


Re: Test::Builder versus Unicode

2004-12-20 Thread Michael G Schwern
On Mon, Dec 20, 2004 at 04:50:57PM -0800, Ovid wrote:
> And looking at line 1005:
> 
>   sub _print_diag {
>       my $self = shift;
>
>       local($\, $", $,) = (undef, ' ', '');
>       my $fh = $self->todo ? $self->todo_output : $self->failure_output;
>       print $fh @_; # here there be smart quotes
>   }
> 
> There are a few strange paths in the code which could be causing this
> (I'm wondering about the autoflush), but I was wondering if anyone has
> seen this and knows how to cope with it?  As you can see, I've tried
> that standard binmode ':utf8' and using utf8, but to no avail.  

For one, diag() goes to STDERR.  But binmode'ing that doesn't work either.
It must not survive the filehandle dup Test::Builder does.

This shuts it up.

  use Test::Builder;
  BEGIN {my $fh = Test::Builder->new->failure_output; binmode $fh, ':utf8';}

Test::Builder should do something like this internally, it's not like anyone's
going to drive binary data through a TB filehandle.  The question is
how does one do it without breaking older perls?

-- 
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
Once is a prank.  Twice is a nuisance.  But NINE TIMES is a TRADITION.
-- Mark-Jason Dominus in <[EMAIL PROTECTED]>
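
For anyone wanting to reproduce the symptom outside Test::Builder, the sketch below prints smart quotes to a plain file handle with and without a :utf8 layer and counts the warnings. The helper name count_print_warnings is hypothetical, and the sketch assumes no default layers have been set through the environment or the open pragma.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: print a string containing smart quotes to a
# fresh temporary file and report how many warnings that produced.
sub count_print_warnings {
    my ($layer) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my ($fh) = tempfile(UNLINK => 1);
    binmode($fh, $layer) if defined $layer;

    # U+201C and U+201D are the left and right double quotation marks.
    print {$fh} "\x{201C}smart quotes\x{201D}\n";
    close $fh;

    return scalar @warnings;
}
```

Called with no argument it should report the "Wide character in print" warning; called with ':utf8' it should report none, which matches the effect of the BEGIN block above.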