GitHub user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6911#discussion_r32972763
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java ---
    @@ -311,21 +310,23 @@ public double getDouble(int i) {
       }
     
       public UTF8String getUTF8String(int i) {
    +    return UTF8String.fromBytes(getBinary(i));
    +  }
    +
    +  public byte[] getBinary(int i) {
         assertIndexIsValid(i);
    -    final UTF8String str = new UTF8String();
    -    final long offsetToStringSize = getLong(i);
    -    final int stringSizeInBytes =
    -      (int) PlatformDependent.UNSAFE.getLong(baseObject, baseOffset + offsetToStringSize);
    -    final byte[] strBytes = new byte[stringSizeInBytes];
    +    final long offsetAndSize = getLong(i);
    +    final int offset = (int)(offsetAndSize >> 32);
    --- End diff --
    
    Do we need to mask out the upper 32 bits before converting to an int?  I
    guess the uppermost bit probably can't be 1 because the offset can't be
    negative, so we don't need to worry about sign-extension during the shift.
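
    To make the question concrete, here is a minimal, self-contained sketch
    of the shift-and-cast in isolation. It assumes the layout implied by the
    diff (offset in the upper 32 bits of the long, size in the lower 32
    bits); the class name and the `pack` helper are hypothetical and are not
    part of UnsafeRow.

```java
public class OffsetAndSizeDemo {
    // Hypothetical helper (not in UnsafeRow): offset in the upper 32 bits,
    // size in the lower 32 bits. The mask keeps a negative size from
    // sign-extending into the offset half.
    static long pack(int offset, int size) {
        return (((long) offset) << 32) | (size & 0xFFFFFFFFL);
    }

    // Same expression as in the diff: arithmetic shift, then narrow to int.
    static int offsetOf(long offsetAndSize) {
        return (int) (offsetAndSize >> 32);
    }

    // Variant using the unsigned (logical) shift, for comparison.
    static int offsetOfUnsigned(long offsetAndSize) {
        return (int) (offsetAndSize >>> 32);
    }

    // Narrowing a long to an int keeps only the low 32 bits.
    static int sizeOf(long offsetAndSize) {
        return (int) offsetAndSize;
    }

    public static void main(String[] args) {
        long word = pack(24, 11);
        System.out.println(offsetOf(word)); // 24
        System.out.println(sizeOf(word));   // 11

        // With the top bit set, the two shifts differ as longs...
        long neg = pack(-1, 5);
        System.out.println((neg >> 32) == (neg >>> 32));            // false
        // ...but agree once narrowed to int, because the cast discards
        // the sign-extended upper half.
        System.out.println(offsetOf(neg) == offsetOfUnsigned(neg)); // true
    }
}
```

    In this sketch the int cast discards whatever the shift put in the upper
    32 bits, so sign-extension would only matter if the shifted value were
    kept as a long.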


