TestAgainstBrzozowski weirdness

Van Den Berghe, Vincent Thu, 23 Mar 2017 07:57:32 -0700

Here's a fun fact about TestAgainstBrzozowski.
This is the original test which fails every time:


                Automaton a = AutomatonTestUtil.RandomAutomaton(Random());
                AutomatonTestUtil.MinimizeSimple(a);
                Automaton b = (Automaton)a.Clone();
                MinimizationOperations.Minimize(b);
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());

If we change the call to AutomatonTestUtil.MinimizeSimple(a) by 
MinimizationOperations.Minimize(a), the test succeeds:

                Automaton a = AutomatonTestUtil.RandomAutomaton(Random());
                MinimizationOperations.Minimize(a);
                Automaton b = (Automaton)a.Clone();
                MinimizationOperations.Minimize(b);
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());

Before you say "big deal", if we replace the call to 
MinimizationOperations.Minimize(b) by AutomatonTestUtil.MinimizeSimple(b), the 
test fails every time:

                Automaton a = AutomatonTestUtil.RandomAutomaton(Random());
                AutomatonTestUtil.MinimizeSimple(a);
                Automaton b = (Automaton)a.Clone();
                AutomatonTestUtil.MinimizeSimple(b);
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());

Not only that, but if we re-order the clone operation to be executed on the 
un-minimized automaton (to make sure we really have 2 identical automatons) 
like this:

                Automaton a = AutomatonTestUtil.RandomAutomaton(Random());
                Automaton b = (Automaton)a.Clone();
                AutomatonTestUtil.MinimizeSimple(a);
                AutomatonTestUtil.MinimizeSimple(b);
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());

... the test fails as well. This is contrary to what is claimed in the big 
giant comment.
To make sure that cloning works, this:

                Automaton a = AutomatonTestUtil.RandomAutomaton(Random());
                Automaton b = (Automaton)a.Clone();
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());
                AutomatonTestUtil.MinimizeSimple(a);
                AutomatonTestUtil.MinimizeSimple(b);
                Assert.IsTrue(BasicOperations.SameLanguage(a, b));
                Assert.AreEqual(a.GetNumberOfStates(), b.GetNumberOfStates());
                Assert.AreEqual(a.GetNumberOfTransitions(), 
b.GetNumberOfTransitions());

... never fails in the first 3 assertions. Always fails in one of the last ones.

Preliminary conclusion:  if AutomatonTestUtil.MinimizeSimple can't give the 
same results for identical automatons, it's not a good basis for a test.
As to why... I searched for random factors in the 
AutomatonTestUtil.MinimizeSimple, but found none.  It looks like the algorithm 
is deterministic.
The only strange thing is the GetHashCode() implementation on State: it's bad 
form to define a hash code without an Equals method, especially if you're 
putting States in hash sets or dictionaries. But it seems like reference 
comparison is all that's needed, and since the hashcode is a different one for 
every instance, removing the GetHashCode method or adding the Equals method has 
no effect.

However, I can make the last 3 failing tests cited above work by correcting a 
bug in ValueHashSet<T>.GetHashCode. The current definition is this:


        public override int GetHashCode()
        {
            int h = 0;
            var i = GetEnumerator();
            while (i.MoveNext())
            {
                T obj = i.Current;
                if (!EqualityComparer<T>.Default.Equals(obj, default(T)))
                {
                    h = HashHelpers.CombineHashCodes(h, obj.GetHashCode());
                }
            }
            return h;
        }


This definition is wrong, since it relies on the incorrect assumption that sets 
containing identical elements will enumerate them in identical order. There is 
no order defined in a HashSet<T>, and if you have 2 sets to which you add items 
a after b in one, and b after a in another, the sets are identical but their 
enumerators will not necessarily be. The operation 
HashHelpers.CombineHashCodes(h, obj.GetHashCode()) is noncommutative, causing 
equal set to generate different hashcodes. Using:

        public override int GetHashCode()
        {
            int h = 0;
            foreach(var obj in this)
            {
                if (!EqualityComparer<T>.Default.Equals(obj, default(T)))
                {
                                      h += obj.GetHashCode();
                }
            }
            return h;
        }


Will make the last 3 failing tests work correct. But not the original one!



More weirdness, I suppose.

Vincent

TestAgainstBrzozowski weirdness

Reply via email to