yifan-c commented on code in PR #46:
URL: https://github.com/apache/cassandra-analytics/pull/46#discussion_r1541824843


##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/utils/XXHash32DigestAlgorithm.java:
##########
@@ -61,7 +61,10 @@ public Digest calculateFileDigest(Path path) throws IOException
             {
                 hasher.update(buffer, 0, len);
             }
-            return new XXHash32Digest(Long.toHexString(hasher.getValue()), SEED);
+            // lz4 library doesn't mask the hash value, so we need to mask it to
+            // prevent forwarding the negative sign bit when converting to a long value
+            long hash = hasher.getValue() & 0xffffffffL;

Review Comment:
   The comment is not accurate.
   lz4's implementation does not need a mask, since it returns an `int`. 
Masking is required because our code widens that `int` to a `long`, and the 
sign extension must not be carried over into the upper 32 bits. 
   Alternatively, you can skip the masking and call this instead.
   
   ```java
   Integer.toHexString(hasher.getValue())
   ```
   
   And the proof, in `jshell`:
   ```
   jshell> int i = -1;
   i ==> -1
   
   jshell> Integer.toHexString(i);
   $2 ==> "ffffffff"
   
   jshell> long l = i & 0xffffffffL;
   l ==> 4294967295
   
   jshell> Long.toHexString(l);
   $4 ==> "ffffffff"
   ```
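
For contrast, here is a minimal sketch of what the unmasked widening produces, next to the two approaches above. The class and method names are hypothetical and exist only for illustration; they are not part of the PR.

```java
// Hypothetical demo class; not part of cassandra-analytics.
class SignExtensionDemo
{
    // Widening int -> long sign-extends, so a negative hash gains 32 leading one bits.
    static String unmaskedLongHex(int hash)
    {
        return Long.toHexString(hash);
    }

    // The mask clears the upper 32 bits after widening.
    static String maskedLongHex(int hash)
    {
        return Long.toHexString(hash & 0xffffffffL);
    }

    // Formats the int directly as unsigned hex, so no widening is involved.
    static String intHex(int hash)
    {
        return Integer.toHexString(hash);
    }

    public static void main(String[] args)
    {
        int hash = -1;
        System.out.println(unmaskedLongHex(hash)); // ffffffffffffffff
        System.out.println(maskedLongHex(hash));   // ffffffff
        System.out.println(intHex(hash));          // ffffffff
    }
}
```

The masked and `Integer.toHexString` variants agree for every `int`, so the choice between them is a matter of readability.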



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

