Does this look similar to https://issues.apache.org/jira/browse/BEAM-6813 ? I
also ran into a problem with state in DirectRunner but haven't been able to
figure it out yet:
https://lists.apache.org/thread.html/dae8b605a218532c085a0eea4e71338eae51922c26820f37b24875c0@%3Cdev.beam.apache.org%3E

Regards,
Anton

*From: *Jan Lukavský <je...@seznam.cz>
*Date: *Thu, May 9, 2019 at 8:13 AM
*To: * <dev@beam.apache.org>

Because of the use of hashCode in StateSpecs, I'd say that it is. But it is
> not obvious. That's why I'd suggest making it abstract on Coder, so that
> all implementations have to override it. That's a simple solution, but the
> question is: should hashCode of Coder be used that way? I think that
> a StateSpec instance should be equal only to itself. Then the hashCode can
> be stored in the instance, e.g.
>
>   private final int hashCode = System.identityHashCode(this);
>
> and returned in hashCode(). There would be no need for Coder to implement
> hashCode anymore (unless there are other cases where it is needed, in
> which case it would still be better to add abstract hashCode and equals
> methods on Coder).
>
> Jan
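
A minimal sketch of the identity-based idea described above, in plain Java. The class names IdentityStateSpec and IdentityStateSpecDemo are hypothetical and are not Beam classes; the real StateSpec implementations carry more state and may have additional requirements.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical spec whose equality is pure object identity, so no Coder is
// involved in equals() or hashCode().
class IdentityStateSpec {
  // Computed once per instance, as suggested in the message above.
  private final int hashCode = System.identityHashCode(this);

  @Override
  public boolean equals(Object other) {
    return this == other;
  }

  @Override
  public int hashCode() {
    return hashCode;
  }
}

class IdentityStateSpecDemo {
  public static void main(String[] args) {
    IdentityStateSpec first = new IdentityStateSpec();
    IdentityStateSpec second = new IdentityStateSpec();

    Map<IdentityStateSpec, String> state = new HashMap<>();
    state.put(first, "buffered element");

    // The same instance always finds its entry; a different instance never does.
    System.out.println(state.containsKey(first));
    System.out.println(state.containsKey(second));
  }
}
```

With this approach two specs created with the same arguments are distinct map keys, which matches the "equal only to itself" semantics proposed above.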
> On 5/9/19 5:04 PM, Reuven Lax wrote:
>
> Is a valid hashCode on Coder part of our contract or not? If it is, then
> the lack of hashCode on SchemaCoder is simply a bug.
>
> On Thu, May 9, 2019 at 7:42 AM Jan Lukavský <je...@seznam.cz> wrote:
>
>> Hi,
>>
>> I have spent several hours digging into a strange issue with DirectRunner
>> that manifested as a non-deterministic run of a pipeline. The pipeline
>> contains basically only a single stateful ParDo, which adds elements into
>> state and, after some timeout, flushes these elements to the output. The
>> issue was that sometimes (very often) when the timer fired, the state
>> appeared to be empty, although I had actually added something into the
>> state. I will skip the details, but the problem boils down to the fact that
>> StateSpecs hash the Coder into hashCode, e.g.
>>
>>      @Override
>>      public int hashCode() {
>>        return Objects.hash(getClass(), coder);
>>      }
>>
>> in ValueStateSpec. Now, when a Coder doesn't have hashCode and equals
>> implemented (and there are some of those in the codebase itself, e.g.
>> SchemaCoder), it all blows up in a very hard-to-debug manner. So the
>> proposal is either to add abstract hashCode and equals to Coder, or
>> to not hash the Coder into the hashCode of StateSpecs (we can generate a
>> unique ID for each StateSpec instance, for example).
>>
>> Any thoughts about which path to follow? Or maybe both? :)
>>
>> Jan
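
The message above skips the details, so the following is only an illustrative sketch in plain Java of how a Coder without equals/hashCode can make two otherwise identical specs disagree. All class names here (CoderWithoutEquals, CoderWithEquals, SimpleValueStateSpec, StateSpecHashDemo) are hypothetical stand-ins, and the assumption that state is looked up in a map keyed by the spec is mine, not a claim about DirectRunner internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical coder that inherits identity-based equals/hashCode from Object.
class CoderWithoutEquals {}

// Hypothetical coder with value-based equals/hashCode: all instances are equal.
class CoderWithEquals {
  @Override
  public boolean equals(Object other) {
    return other instanceof CoderWithEquals;
  }

  @Override
  public int hashCode() {
    return CoderWithEquals.class.hashCode();
  }
}

// Simplified spec that mixes the coder into hashCode, as in the quoted snippet.
class SimpleValueStateSpec {
  private final Object coder;

  SimpleValueStateSpec(Object coder) {
    this.coder = coder;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof SimpleValueStateSpec
        && Objects.equals(coder, ((SimpleValueStateSpec) other).coder);
  }

  @Override
  public int hashCode() {
    return Objects.hash(getClass(), coder);
  }
}

class StateSpecHashDemo {
  // Writes under one spec instance, then looks up under a freshly built one.
  static boolean roundTrips(Object writeCoder, Object readCoder) {
    Map<SimpleValueStateSpec, String> state = new HashMap<>();
    state.put(new SimpleValueStateSpec(writeCoder), "buffered element");
    return state.containsKey(new SimpleValueStateSpec(readCoder));
  }

  public static void main(String[] args) {
    // Value-based coder: the second spec equals the first, so the entry is found.
    System.out.println(roundTrips(new CoderWithEquals(), new CoderWithEquals()));
    // Identity-based coder: the specs differ, so the state appears to be empty.
    System.out.println(roundTrips(new CoderWithoutEquals(), new CoderWithoutEquals()));
  }
}
```

In this sketch the lookup only fails when the coder instance used for reading is not the same object as the one used for writing, for example if a copy of the coder is created somewhere along the way.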
>>
>>
>>
