[ 
https://issues.apache.org/jira/browse/BEAM-4008?focusedWorklogId=110940&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110940
 ]

ASF GitHub Bot logged work on BEAM-4008:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Jun/18 03:35
            Start Date: 12/Jun/18 03:35
    Worklog Time Spent: 10m 
      Work Description: tvalentyn commented on a change in pull request #5336: 
[BEAM-4008] Futurize utils subpackage
URL: https://github.com/apache/beam/pull/5336#discussion_r194606891
 
 

 ##########
 File path: sdks/python/apache_beam/utils/windowed_value.py
 ##########
 @@ -178,33 +182,14 @@ def __repr__(self):
         self.windows,
         self.pane_info)
 
-  def __hash__(self):
-    return (hash(self.value) +
-            3 * self.timestamp_micros +
-            7 * hash(self.windows) +
-            11 * hash(self.pane_info))
-
-  # We'd rather implement __eq__, but Cython supports that via __richcmp__
-  # instead.  Fortunately __cmp__ is understood by both (but not by Python 3).
-  def __cmp__(left, right):  # pylint: disable=no-self-argument
-    """Compares left and right for equality.
-
-    For performance reasons, doesn't actually impose an ordering
-    on unequal values (always returning 1).
-    """
-    if type(left) is not type(right):
-      return cmp(type(left), type(right))
+  def _key(self):
+    return self.value, self.timestamp_micros, self.windows, self.pane_info
 
-    # TODO(robertwb): Avoid the type checks?
-    # Returns False (0) if equal, and True (1) if not.
-    return not WindowedValue._typed_eq(left, right)
+  def __eq__(self, other):
+    return type(self) == type(other) and self._key() == other._key()
 
-  @staticmethod
-  def _typed_eq(left, right):
-    return (left.timestamp_micros == right.timestamp_micros
-            and left.value == right.value
-            and left.windows == right.windows
-            and left.pane_info == right.pane_info)
+  def __hash__(self):
 
 Review comment:
   Thanks for your patience for awaiting the feedback, @RobbeSneyders.
   
   I doublechecked  performance of __hash__  using a microbenchmark. Previous 
implementation of __hash__ appears to be at least 7.5% faster, possibly due to 
overheads to create a tuple, and/or the way Cython compiles it.
   
   I suggest we revert to previous implementation of __hash__. For consistency  
we can also implement __eq__  by comparing fields directly.  
   
   My microbenchmark can be found in 
https://github.com/apache/beam/compare/master...tvalentyn:utils_futurization_benchmark,
 see 
https://github.com/tvalentyn/beam/commit/67d3df0f1248f43feb84503361f0071efc560fc2.
 It depends on the mini-framework that's being finalized in 
https://github.com/apache/beam/pull/5565.
   
   CC: @robertwb 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 110940)
    Time Spent: 1h 20m  (was: 1h 10m)

> Futurize and fix python 2 compatibility for utils subpackage
> ------------------------------------------------------------
>
>                 Key: BEAM-4008
>                 URL: https://issues.apache.org/jira/browse/BEAM-4008
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Robbe
>            Assignee: Robbe
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to