Re: Fwd: [demerphq@gmail.com: Re: fixing is_deeply]

demerphq Fri, 01 Jul 2005 04:32:33 -0700

On 7/1/05, Smylers <[EMAIL PROTECTED]> wrote:
> demerphq writes:
> 
> > On 7/1/05, Michael G Schwern <[EMAIL PROTECTED]> wrote:
> >
> > > ... I'm of the opinion that is_deeply() is currently doing the right
> > > thing ... Largely it comes down to the Principle of Least Surprise.
> >
> > I can't agree with this analysis. If you go down this route surprise
> > abounds.
> >
> > is_deeply($x,$y);  #ok
> >
> > $x->[0]{a}=1;
> > $y->[0]{a}=1;
> >
> > is_deeply($x,$y);  #Surprise!
> 
> For those of us not following along too closely (and without the right
> stuff installed to test the above line right now), it isn't clear what
> the answer to the above is -- since it's the thing being disputed and
> there are apparently arguments on each side, either result is of course
> possible -- that you find surprising.
> 
> $x and $y look to me to contain the same data when looked at deeply, so
> I'd expect them to compare the same; I'd be surprised if they didn't,
> which I think means I disagree with you.


So you expect the first test to pass and the second to fail? Because
that is what will happen. Here is the code in one piece and using only
standard modules:

use Test::More tests=>2;
use Data::Dumper;

my $a={};
my $b={};
my $c={};

$x=[$a,$a];  # the same hash referenced twice
$y=[$b,$c];  # two distinct hashes

diag(Data::Dumper->new([$x,$y])->Purity(1)->Dump());
is_deeply($x,$y);  #ok

$_->[0]{a}=1 for $x,$y;

diag(Data::Dumper->new([$x,$y])->Purity(1)->Dump());
is_deeply($x,$y);  # Surprise!



__END__
1..2
# $VAR1 = [
#           {},
#           {}
#         ];
# $VAR1->[1] = $VAR1->[0];
# $VAR2 = [
#           {},
#           {}
#         ];
ok 1
# $VAR1 = [
#           {
#             'a' => 1
#           },
#           {}
#         ];
# $VAR1->[1] = $VAR1->[0];
# $VAR2 = [
#           {
#             'a' => 1
#           },
#           {}
#         ];
not ok 2
#     Failed test (c:\temp\tm.pl at line 17)
#     Structures begin differing at:
#          $got->[1]{a} = '1'
#     $expected->[1]{a} = Does not exist
# Looks like you failed 1 test of 2.

Even Data::Dumper shows the two to be different. Why should
is_deeply() behave any differently?!
 
> > That's a MUCH bigger surprise IMO. And a fatal one for anybody really
> > relying on is_deeply.
> 
> I'd say exactly the same about it comparing them differently.

Which _them_: the first test or the second test?
 
> > > In fact, I would even argue that this violates some of the black
> > > boxness of the test.  In much the same way that we don't check to
> > > see if a given piece of data is blessed, tied or overloaded, we
> > > wouldn't care if we have a repeating reference or two different
> > > references to equivalent data.
> >
> > But they COMPLETELY change the datastructure. Utterly and totally.
> 
> Not necessarily.  It depends on what you're using the data for.

Well, that says there are two different behaviours that people expect.
They are mutually exclusive.

> 
> > The question you have to ask yourself is why should a reference be
> > treated different from any other value? It is a VALUE.
> 
> Except it isn't.  Or at least, not all the time: it depends how you wish
> to look at it.  If you just consider a reference to be a value
> (effectively a pointer, a memory address) then you aren't examining a
> data structure _deeply_; you're just doing a _shallow_ comparison of it
> as a reference.

No, I think you've missed the point here. See Yitzchak's comments.

Put simply, I am arguing that you should be able to traverse the two
data structures in parallel and should not encounter an object in one
traversal more often than you encounter its parallel object in the
other structure.
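
A minimal sketch of that parallel traversal, to make the rule concrete.
The name same_shape() is hypothetical and is not part of Test::More; it
assumes unblessed data and simply records which reference in one
structure was paired with which reference in the other, failing as soon
as the pairing stops being one-to-one.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper (not a Test::More function): deep comparison that
# also requires shared references to line up in both structures.
sub same_shape {
    my ($got, $exp, $seen_got, $seen_exp) = @_;
    $seen_got ||= {};   # address in $got -> partner address in $exp
    $seen_exp ||= {};   # address in $exp -> partner address in $got

    # Plain (non-reference) values are compared as strings.
    if (!ref $got && !ref $exp) {
        return 1 if !defined $got && !defined $exp;
        return 0 if !defined $got || !defined $exp;
        return $got eq $exp ? 1 : 0;
    }
    return 0 if !ref $got || !ref $exp;

    my ($ga, $ea) = (refaddr($got), refaddr($exp));

    # If either reference was met before, its partner must be the one it
    # was paired with the first time. This also stops cycles.
    if (exists $seen_got->{$ga} || exists $seen_exp->{$ea}) {
        return (exists $seen_got->{$ga} && exists $seen_exp->{$ea}
                && $seen_got->{$ga} == $ea
                && $seen_exp->{$ea} == $ga) ? 1 : 0;
    }
    $seen_got->{$ga} = $ea;
    $seen_exp->{$ea} = $ga;

    my $type = reftype($got);
    return 0 if $type ne reftype($exp);

    if ($type eq 'ARRAY') {
        return 0 if @$got != @$exp;
        for my $i (0 .. $#$got) {
            return 0
                unless same_shape($got->[$i], $exp->[$i], $seen_got, $seen_exp);
        }
        return 1;
    }
    if ($type eq 'HASH') {
        return 0 if keys(%$got) != keys(%$exp);
        for my $k (keys %$got) {
            return 0 unless exists $exp->{$k};
            return 0
                unless same_shape($got->{$k}, $exp->{$k}, $seen_got, $seen_exp);
        }
        return 1;
    }
    if ($type eq 'SCALAR' || $type eq 'REF') {
        return same_shape($$got, $$exp, $seen_got, $seen_exp);
    }

    # Other reference types (CODE, GLOB, ...) only match themselves here.
    return $ga == $ea ? 1 : 0;
}

my ($h, $i, $j) = ({}, {}, {});
print same_shape([$h, $h], [$i, $j]) ? "same\n" : "different\n";  # different
print same_shape([$h, $h], [$i, $i]) ? "same\n" : "different\n";  # same
```

Under this rule [$h,$h] and [$i,$j] already differ while all three
hashes are still empty, which is the behaviour being argued for.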

> 
> > IE: why should [tweaked as per later mail]
> >
> > $x=1;$y=1;$z=2;
> >
> > $a1=[$x,$x];
> > $a2=[$y,$z];
> >
> > $a1 and $a2 are different but
> 
> Clearly, because you've assigned different values into $x and $z.  If
> you'd assigned the same values into each of them then they should
> compare as the same:
> 
>   $x=1;$y=1;$z=1;
> 
> > $x={};$y={};$z={};

This is a bad comparison. The real comparison should be:

my %hash;
$x=\%hash; $y=\%hash; $z=\%hash;


> >
> > $a3=[$x,$x];
> > $a4=[$y,$z];
> >
> > $a3 and $a4 are not?
> 
> In this case you've assigned the same value -- an empty hash ref -- to
> each of $x and $z, so the data structures in each case are the same.

No, the $x hash is present twice. Thus altering the $x hash and the $y
hash equivalently results in what we all agree are different
structures. I'm saying that if that is possible then the original
structures were different in the first place.

> 
> > Isn't the situation identical?
> 
> No.  Just look at the source code:
>   $x=1;$y=1;$z=2;
>   $x={};$y={};$z={};
> 
> A system in which the 1st of those lines puts something different in the
> variables but in which the 2nd puts the same in each variable seems
> entirely plausible to me.

Each of the hashes there is a totally distinct hash. Each variable
gets something totally and utterly distinct.

The argument being made goes something like this:

Because indexes 0 and 1 of both arrays reference empty hashes, they are
equivalent.

But the thing is, one array references the same hash twice, while the
other references two different hashes.

If we were going to be consistent with our logic on this then we
should say that

[1,1] and [1,2]

are the _same_ because indexes 0 and 1 of both arrays contain numbers. Why
do we care whether they are the same numbers here, but not care
when they are hashes?

This is a logical inconsistency that is unnecessary and confusing. I
argue that just as

[1,1] and [1,2] should be considered different so should [\%h,\%h] and [\%i,\%j]

> 
> > ... the refs turn into their addresses and you will consider them
> > different? This just doesn't make sense.
> 
> Yes it does, cos once you're looking at the value of refs that's become
> a shallow copy -- you're no longer bothered by what's inside them.

I'm sorry, but this doesn't make any sense. The point is that the
originals are different. When we add 0 to them they are different. But
is_deeply() considers the first pair equivalent, and the second not.
Since they are both different representations of the same thing, and
indeed with a little pack magic you could reverse the process (assuming
the hashes were otherwise referenced to prevent them from being
deallocated by the assignment), this is an inconsistency.
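
A small sketch of the "add 0" step, assuming plain unblessed hashes
(the variable names are only for illustration). Numifying a reference
gives its address, so the shared pair collapses to one number while the
distinct pair gives two:

```perl
use strict;
use warnings;

my (%h, %i, %j);
my @shared   = (\%h, \%h);   # the same hash twice
my @distinct = (\%i, \%j);   # two separate hashes

# Adding 0 to an unblessed reference yields its address as a number.
my @shared_addr   = map { $_ + 0 } @shared;
my @distinct_addr = map { $_ + 0 } @distinct;

# prints "shared:   equal"
print "shared:   ",
    ($shared_addr[0] == $shared_addr[1] ? "equal" : "different"), "\n";
# prints "distinct: different"
print "distinct: ",
    ($distinct_addr[0] == $distinct_addr[1] ? "equal" : "different"), "\n";
```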

> Going back to your first example:
> 
> > $x=1;$y=1;$z=2;
> > $a1=[$x,$x];
> > $a2=[$y,$z];
> 
> $a1 and $a2 are different, and you seem happy with that.  Suppose it
> were instead:
> 
>   $a1=[1, 1];
>   $a2=[1, 1];
> 
> Would you be happy for $a1 and $a2 now to compare the same?  

IMO they are the same.

> If not then obviously you never need to use is_deeply cos you could just use 
> ==
> instead (and treat the array-refs as values).  If you are happy for them
> to compare the same, despite being different anon array-refs in memory,
> then why aren't you happy for the same rule to apply nested within those
> array-refs?

Because when you put COPIES of references to the SAME object inside
you produce a totally different structure than when you put references
to DIFFERENT objects inside.

Like I said: if I have two structures that are deep copies of each
other, I should be able to alter both in an equivalent way and still
have them equivalent. If they aren't then they weren't.

(Sorry, I'm repeating myself here; I'm getting frustrated.)
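
To illustrate the deep-copy test, here is a sketch using Storable's
dclone(), which keeps shared references shared inside the copy. The
$flat structure is an assumed stand-in for something that merely dumps
the same as the original while it is empty.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $h    = {};
my $orig = [$h, $h];        # one hash, referenced twice
my $copy = dclone($orig);   # real deep copy: sharing is preserved
my $flat = [{}, {}];        # two distinct hashes

# Apply the same alteration to all three.
$_->[0]{a} = 1 for $orig, $copy, $flat;

for my $pair ([orig => $orig], [copy => $copy], [flat => $flat]) {
    my ($name, $s) = @$pair;
    print "$name: ",
        (exists $s->[1]{a} ? "a visible at index 1" : "index 1 still empty"),
        "\n";
}
```

Here $orig and $copy stay equivalent after the change, while $flat
diverges from both.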

> 
> To me 'deeply' implies recursing as deep as the data structure goes, not
> that there's a special rule for the top-level that's treated differently
> from the others.

Nobody is talking about changing this.

> 
> > is_deeply should be reliable and do what its name says.
> 
> I think everybody's in agreement about that!  There's just the matter of
> agreeing what its name does say ...

No. There is no agreement about it. Hence this debate.

> 
> > The fact is it wouldn't be too difficult to support both [behaviours]
> > off of the same code base.
> 
> Then we're still in the position of arguing which should be the default
> behaviour ...

Yes. And I posit it should be the one that is most consistent. And the
existing logic is not consistent.

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
