[ 
https://issues.apache.org/jira/browse/BEAM-5866?focusedWorklogId=159727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-159727
 ]

ASF GitHub Bot logged work on BEAM-5866:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Oct/18 14:24
            Start Date: 28/Oct/18 14:24
    Worklog Time Spent: 10m 
      Work Description: kanterov commented on issue #6845: [BEAM-5866] Fix 
`Row#equals`
URL: https://github.com/apache/beam/pull/6845#issuecomment-433710063
 
 
   @kennknowles as it was last implemented, `Row#equals` didn't do any 
allocations, or used `structuralValue`. It became a bit confusing because ghere 
are two different, but related issues: 
   1. `structuralValue` wasn't working properly for `Row`, `Map` and `List`
   2.`Row#equals` didn't handle lists or maps with BYTES
   
   I've put the fix for the first issue into a separate pull request: 
https://github.com/apache/beam/pull/6862. In this pull request, I removed 
everything not relevant for `Row#equals`
   
   As we already discussed, there are two approaches to fix equality:
   - rolling own `deepEquals` 
   - wrapping `byte[]` into object implementing equality, but not in 
`structuralValue`, but while we build rows
   
   Not to talk about performance without having any numbers, I did two 
implementations and benchmarked both with JMH. Approach with own `deepEquals` 
has better performance by almost 50%. There are two reasons:
   
   - we can abuse the fact that our lists have random access and don't need to 
allocate iterators, it makes it even more performant than anything using 
`AbstractList#equals`
   - getting field type from schema is faster than using reflection
   
   There is still a problem with `Map<byte[], ?>`, it could be fixed either by 
banning such schemas, or rolling own `ByteArrayMap<K> extends Map<byte[], K>`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 159727)
    Time Spent: 3h 50m  (was: 3h 40m)

> RowCoder doesn't implement structuralValue
> ------------------------------------------
>
>                 Key: BEAM-5866
>                 URL: https://issues.apache.org/jira/browse/BEAM-5866
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Gleb Kanterov
>            Assignee: Gleb Kanterov
>            Priority: Major
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> These two properties fail for RowCoder with `BYTES` field, or `Map<BYTES, ?>` 
> field. 
> {code}
>   public static <T> void testConsistentWithEquals(Coder<T> coder, T example) {
>     assumeTrue(coder.consistentWithEquals());
>     byte[] bytes = encodeBytes(coder, example);
>     // even if the coder is non-deterministic, if the encoded bytes match,
>     // coder is consistent with equals, decoded values must be equal
>     T out0 = decodeBytes(coder, bytes);
>     T out1 = decodeBytes(coder, bytes);
>     assertEquals("If the encoded bytes match, decoded values must be equal", 
> out0, out1);
>     assertEquals(
>         "If two values are equal, their hash codes must be equal",
>         out0.hashCode(),
>         out1.hashCode());
>   }
>   public static <T> void testStructuralValueConsistentWithEquals(Coder<T> 
> coder, T example) {
>     byte[] bytes = encodeBytes(coder, example);
>     // even if coder is non-deterministic, if the encoded bytes match,
>     // structural values must be equal
>     Object out0 = coder.structuralValue(decodeBytes(coder, bytes));
>     Object out1 = coder.structuralValue(decodeBytes(coder, bytes));
>     assertEquals("If the encoded bytes match, structural values must be 
> equal", out0, out1);
>     assertEquals(
>         "If two values are equal, their hash codes must be equal",
>         out0.hashCode(),
>         out1.hashCode());
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to