gene-db commented on code in PR #47473:
URL: https://github.com/apache/spark/pull/47473#discussion_r1691812112


##########
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##########
@@ -120,6 +120,12 @@ public class VariantUtil {
   // Long string value. The content is (4-byte little-endian unsigned integer representing the
   // string size) + (size bytes of string content).
   public static final int LONG_STR = 16;
+  // year-month interval value. The content is one byte representing the start and end field values
+  // (1 bit each starting at least significant bits) and a 4-byte little-endian signed integer
+  public static final int YEAR_MONTH_INTERVAL = 19;
+  // day-time interval value. The content is one byte representing the start and end field values
+  // (2 bits each starting at least significant bits) and an 8-byte little-endian signed integer

Review Comment:
   Similar to the README, we should document what 0, 1, 2, 3 represent for the start and end fields.
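
   To illustrate the kind of mapping that could be documented, here is a minimal sketch. It assumes the codes follow Spark's `DayTimeIntervalType` constants (0 = DAY, 1 = HOUR, 2 = MINUTE, 3 = SECOND) and that the start field occupies bits 0-1 and the end field bits 2-3, by analogy with the year-month decoding in this PR. The class and method names are hypothetical and not part of the diff.

```java
// Sketch only: the 0..3 -> DAY/HOUR/MINUTE/SECOND mapping and the bit positions
// are assumptions, not taken from this diff.
final class DayTimeFieldsSketch {
  // Hypothetical display names for the 2-bit field codes.
  static final String[] FIELD_NAMES = {"DAY", "HOUR", "MINUTE", "SECOND"};

  // Bits 0-1 of the field-info byte are assumed to hold the start field.
  static int startField(byte fieldInfo) {
    return fieldInfo & 0x3;
  }

  // Bits 2-3 of the field-info byte are assumed to hold the end field.
  static int endField(byte fieldInfo) {
    return (fieldInfo >> 2) & 0x3;
  }

  // Render the pair as a readable range, e.g. "DAY TO SECOND".
  static String describe(byte fieldInfo) {
    int start = startField(fieldInfo);
    int end = endField(fieldInfo);
    if (end < start) {
      throw new IllegalArgumentException("end field precedes start field");
    }
    return start == end ? FIELD_NAMES[start] : FIELD_NAMES[start] + " TO " + FIELD_NAMES[end];
  }
}
```

   Under these assumptions, a field-info byte of `0x0C` (start 0, end 3) would read as `DAY TO SECOND`.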



##########
common/variant/src/main/java/org/apache/spark/types/variant/Variant.java:
##########
@@ -88,6 +91,16 @@ public long getLong() {
     return VariantUtil.getLong(value, pos);
   }
 
+  // Get the start and end fields of a year-month interval from the variant.
+  public IntervalFields getYearMonthIntervalFields() {

Review Comment:
   What uses these APIs?



##########
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##########
@@ -377,11 +405,52 @@ public static long getLong(byte[] value, int pos) {
       case TIMESTAMP:
       case TIMESTAMP_NTZ:
         return readLong(value, pos + 1, 8);
+      case YEAR_MONTH_INTERVAL:
+        return readLong(value, pos + 2, 4);
+      case DAY_TIME_INTERVAL:
+        return readLong(value, pos + 2, 8);
       default:
         throw new IllegalStateException(exceptionMessage);
     }
   }
 
+  // Class used to pass around start and end fields of year-month and day-time interval values.
+  public static class IntervalFields {
+    public IntervalFields(byte startField, byte endField) {
+      this.startField = startField;
+      this.endField = endField;
+    }
+
+    public final byte startField;
+    public final byte endField;
+  }
+
+  // Get the start and end fields of a variant value representing a year-month interval value. The
+  // returned array contains the start field at the zeroth index and the end field at the first
+  // index.
+  public static IntervalFields getYearMonthIntervalFields(byte[] value, int pos) {
+    long fieldInfo = readLong(value, pos + 1, 1);
+    IntervalFields intervalFields = new IntervalFields((byte) (fieldInfo & 0x1),
+            (byte) ((fieldInfo >> 1) & 0x1));
+    if (intervalFields.endField < intervalFields.startField) {

Review Comment:
   Do we need to check that the type is year-month interval, and also check the length? Some other functions call `checkIndex`.
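
   A rough sketch of what such checks could look like follows. The header layout (basic type in the low 2 bits, type info in the upper 6 bits, `PRIMITIVE = 0`) is an assumption about `VariantUtil`. The local `checkIndex` and the `IllegalStateException` are stand-ins rather than the real helper and error type, and the method returns a plain `byte[]` only to keep the example self-contained.

```java
// Sketch only: constants other than YEAR_MONTH_INTERVAL are assumed, and
// checkIndex here is a local stand-in for the helper mentioned in the review.
final class YearMonthChecksSketch {
  static final int BASIC_TYPE_BITS = 2;      // assumed header layout
  static final int BASIC_TYPE_MASK = 0x3;    // assumed
  static final int TYPE_INFO_MASK = 0x3F;    // assumed
  static final int PRIMITIVE = 0;            // assumed
  static final int YEAR_MONTH_INTERVAL = 19; // from the diff

  // Stand-in bounds check: fail if pos is outside [0, length).
  static void checkIndex(int pos, int length) {
    if (pos < 0 || pos >= length) {
      throw new IllegalStateException("malformed variant");
    }
  }

  // Returns {startField, endField} after validating type and length.
  static byte[] getYearMonthIntervalFields(byte[] value, int pos) {
    checkIndex(pos, value.length);
    int basicType = value[pos] & BASIC_TYPE_MASK;
    int typeInfo = (value[pos] >> BASIC_TYPE_BITS) & TYPE_INFO_MASK;
    if (basicType != PRIMITIVE || typeInfo != YEAR_MONTH_INTERVAL) {
      throw new IllegalStateException("not a year-month interval");
    }
    // Header byte + field-info byte + 4-byte value: last byte is at pos + 5.
    checkIndex(pos + 5, value.length);
    int fieldInfo = value[pos + 1];
    byte start = (byte) (fieldInfo & 0x1);
    byte end = (byte) ((fieldInfo >> 1) & 0x1);
    if (end < start) {
      throw new IllegalStateException("malformed variant");
    }
    return new byte[] {start, end};
  }
}
```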



##########
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##########
@@ -355,9 +380,12 @@ public static boolean getBoolean(byte[] value, int pos) {
 
   // Get a long value from variant value `value[pos...]`.
   // It is only legal to call it if `getType` returns one of `Type.LONG/DATE/TIMESTAMP/
-  // TIMESTAMP_NTZ`. If the type is `DATE`, the return value is guaranteed to fit into an int and
-  // represents the number of days from the Unix epoch. If the type is `TIMESTAMP/TIMESTAMP_NTZ`,
-  // the return value represents the number of microseconds from the Unix epoch.
+  // TIMESTAMP_NTZ/YEAR_MONTH_INTERVAL/DAY_TIME_INTERVAL`. If the type is `DATE`, the return value
+  // is guaranteed to fit into an int and represents the number of days from the Unix epoch.
+  // If the type is `TIMESTAMP/TIMESTAMP_NTZ`, the return value represents the number of
+  // microseconds from the Unix epoch. If the type is `YEAR_MONTH_INTERVAL`, the return value

Review Comment:
   We should also mention that the year-month interval value is guaranteed to fit into an int.
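
   The reasoning is that the payload is a 4-byte little-endian signed integer, so sign-extending it to a `long` can never leave the `int` range. A minimal sketch, where `readLong` is a hypothetical re-implementation for illustration (not copied from `VariantUtil`) and `getYearMonthMonths` is a made-up helper name:

```java
// Sketch only: a local little-endian reader to show why a 4-byte payload
// always fits into an int after sign extension.
final class ReadLongSketch {
  // Read numBytes little-endian bytes starting at pos as a signed value.
  static long readLong(byte[] bytes, int pos, int numBytes) {
    long result = 0;
    // All bytes except the most significant one are treated as unsigned.
    for (int i = 0; i < numBytes - 1; ++i) {
      long unsignedByte = bytes[pos + i] & 0xFF;
      result |= unsignedByte << (8 * i);
    }
    // The most significant byte keeps its sign, which sign-extends the result.
    long signedByte = bytes[pos + numBytes - 1];
    result |= signedByte << (8 * (numBytes - 1));
    return result;
  }

  // Hypothetical helper: narrow the 4-byte month count to an int.
  // Math.toIntExact would throw if the value did not fit, which cannot
  // happen for a 4-byte read.
  static int getYearMonthMonths(byte[] bytes, int pos) {
    return Math.toIntExact(readLong(bytes, pos, 4));
  }
}
```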



##########
common/variant/src/main/java/org/apache/spark/types/variant/Variant.java:
##########
@@ -113,6 +126,11 @@ public String getString() {
     return VariantUtil.getString(value, pos);
   }
 
+  // Get the type info bits from a variant value.
+  public int getTypeInfo() {

Review Comment:
   I can't easily tell, but what uses this API?



##########
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##########
@@ -120,6 +120,12 @@ public class VariantUtil {
   // Long string value. The content is (4-byte little-endian unsigned integer representing the
   // string size) + (size bytes of string content).
   public static final int LONG_STR = 16;
+  // year-month interval value. The content is one byte representing the start and end field values
+  // (1 bit each starting at least significant bits) and a 4-byte little-endian signed integer

Review Comment:
   Similar to the README, we should document what 0 and 1 represent for the start and end fields.
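
   As a concrete illustration of the layout being documented, here is a small encoder sketch. It assumes 0 = YEAR and 1 = MONTH (matching Spark's `YearMonthIntervalType` constants) and a header byte of `(typeInfo << 2) | PRIMITIVE`. Everything except the value 19 and the bit positions is an assumption, and the class is hypothetical.

```java
// Sketch only: YEAR/MONTH codes and the header layout are assumptions.
final class YearMonthEncodeSketch {
  static final byte YEAR = 0;                // assumed meaning of field code 0
  static final byte MONTH = 1;               // assumed meaning of field code 1
  static final int PRIMITIVE = 0;            // assumed basic type id
  static final int BASIC_TYPE_BITS = 2;      // assumed header layout
  static final int YEAR_MONTH_INTERVAL = 19; // from the diff

  // Build the 6-byte value: header, field-info byte, 4-byte little-endian months.
  static byte[] encode(byte startField, byte endField, int months) {
    if (startField < YEAR || endField > MONTH || endField < startField) {
      throw new IllegalArgumentException("invalid start/end fields");
    }
    byte[] out = new byte[6];
    out[0] = (byte) ((YEAR_MONTH_INTERVAL << BASIC_TYPE_BITS) | PRIMITIVE);
    // Start field in bit 0, end field in bit 1.
    out[1] = (byte) (startField | (endField << 1));
    // Least significant byte first.
    for (int i = 0; i < 4; ++i) {
      out[2 + i] = (byte) (months >> (8 * i));
    }
    return out;
  }
}
```

   With these assumptions, `encode(YEAR, MONTH, 14)` would produce the bytes `4C 02 0E 00 00 00`.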



##########
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##########
@@ -233,6 +245,13 @@ public enum Type {
     TIMESTAMP_NTZ,
     FLOAT,
     BINARY,
+    YEAR_MONTH_INTERVAL,
+    DAY_TIME_INTERVAL,
+  }
+
+  public static int getTypeInfo(byte[] value, int pos) {

Review Comment:
   Similar to an earlier question, what uses this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

