[GitHub] [spark] cloud-fan commented on a change in pull request #28481: [SPARK-31665][SQL][TESTS] Check parquet dictionary encoding of random dates/timestamps

2020-05-11 Thread GitBox


cloud-fan commented on a change in pull request #28481:
URL: https://github.com/apache/spark/pull/28481#discussion_r422836209



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
##
@@ -162,52 +162,66 @@ object RandomDataGenerator {
   })
   case BooleanType => Some(() => rand.nextBoolean())
   case DateType =>
-val generator =
-  () => {
-var milliseconds = rand.nextLong() % 25340232959L
-// -6213574080L is the number of milliseconds before January 
1, 1970, 00:00:00 GMT
-// for "0001-01-01 00:00:00.00". We need to find a
-// number that is greater or equals to this number as a valid 
timestamp value.
-while (milliseconds < -6213574080L) {
-  // 25340232959L is the number of milliseconds since
-  // January 1, 1970, 00:00:00 GMT for "-12-31 
23:59:59.99".
-  milliseconds = rand.nextLong() % 25340232959L
-}
-val date = DateTimeUtils.toJavaDate((milliseconds / 
MILLIS_PER_DAY).toInt)
-// The generated `date` is based on the hybrid calendar Julian + 
Gregorian since
-// 1582-10-15 but it should be valid in Proleptic Gregorian 
calendar too which is used
-// by Spark SQL since version 3.0 (see SPARK-26651). We try to 
convert `date` to
-// a local date in Proleptic Gregorian calendar to satisfy this 
requirement.
-// Some years are leap years in Julian calendar but not in 
Proleptic Gregorian calendar.
-// As the consequence of that, 29 February of such years might not 
exist in Proleptic
-// Gregorian calendar. When this happens, we shift the date by one 
day.
-Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + 
MILLIS_PER_DAY))
+def uniformDateRand(rand: Random): java.sql.Date = {
+  var milliseconds = rand.nextLong() % 25340232959L
+  // -6213574080L is the number of milliseconds before January 1, 
1970, 00:00:00 GMT
+  // for "0001-01-01 00:00:00.00". We need to find a
+  // number that is greater or equals to this number as a valid 
timestamp value.

Review comment:
   oh sorry, the upper/lower bound is already here. Then I'm fine





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28481: [SPARK-31665][SQL][TESTS] Check parquet dictionary encoding of random dates/timestamps

2020-05-10 Thread GitBox


cloud-fan commented on a change in pull request #28481:
URL: https://github.com/apache/spark/pull/28481#discussion_r422784030



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
##
@@ -162,52 +162,66 @@ object RandomDataGenerator {
   })
   case BooleanType => Some(() => rand.nextBoolean())
   case DateType =>
-val generator =
-  () => {
-var milliseconds = rand.nextLong() % 25340232959L
-// -6213574080L is the number of milliseconds before January 
1, 1970, 00:00:00 GMT
-// for "0001-01-01 00:00:00.00". We need to find a
-// number that is greater or equals to this number as a valid 
timestamp value.
-while (milliseconds < -6213574080L) {
-  // 25340232959L is the number of milliseconds since
-  // January 1, 1970, 00:00:00 GMT for "-12-31 
23:59:59.99".
-  milliseconds = rand.nextLong() % 25340232959L
-}
-val date = DateTimeUtils.toJavaDate((milliseconds / 
MILLIS_PER_DAY).toInt)
-// The generated `date` is based on the hybrid calendar Julian + 
Gregorian since
-// 1582-10-15 but it should be valid in Proleptic Gregorian 
calendar too which is used
-// by Spark SQL since version 3.0 (see SPARK-26651). We try to 
convert `date` to
-// a local date in Proleptic Gregorian calendar to satisfy this 
requirement.
-// Some years are leap years in Julian calendar but not in 
Proleptic Gregorian calendar.
-// As the consequence of that, 29 February of such years might not 
exist in Proleptic
-// Gregorian calendar. When this happens, we shift the date by one 
day.
-Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + 
MILLIS_PER_DAY))
+def uniformDateRand(rand: Random): java.sql.Date = {
+  var milliseconds = rand.nextLong() % 25340232959L
+  // -6213574080L is the number of milliseconds before January 1, 
1970, 00:00:00 GMT
+  // for "0001-01-01 00:00:00.00". We need to find a
+  // number that is greater or equals to this number as a valid 
timestamp value.

Review comment:
   I'm a bit unsure about this. Spark does support negative years for now, 
and we should keep testing it.

##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
##
@@ -162,52 +162,66 @@ object RandomDataGenerator {
   })
   case BooleanType => Some(() => rand.nextBoolean())
   case DateType =>
-val generator =
-  () => {
-var milliseconds = rand.nextLong() % 25340232959L
-// -6213574080L is the number of milliseconds before January 
1, 1970, 00:00:00 GMT
-// for "0001-01-01 00:00:00.00". We need to find a
-// number that is greater or equals to this number as a valid 
timestamp value.
-while (milliseconds < -6213574080L) {
-  // 25340232959L is the number of milliseconds since
-  // January 1, 1970, 00:00:00 GMT for "-12-31 
23:59:59.99".
-  milliseconds = rand.nextLong() % 25340232959L
-}
-val date = DateTimeUtils.toJavaDate((milliseconds / 
MILLIS_PER_DAY).toInt)
-// The generated `date` is based on the hybrid calendar Julian + 
Gregorian since
-// 1582-10-15 but it should be valid in Proleptic Gregorian 
calendar too which is used
-// by Spark SQL since version 3.0 (see SPARK-26651). We try to 
convert `date` to
-// a local date in Proleptic Gregorian calendar to satisfy this 
requirement.
-// Some years are leap years in Julian calendar but not in 
Proleptic Gregorian calendar.
-// As the consequence of that, 29 February of such years might not 
exist in Proleptic
-// Gregorian calendar. When this happens, we shift the date by one 
day.
-Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + 
MILLIS_PER_DAY))
+def uniformDateRand(rand: Random): java.sql.Date = {
+  var milliseconds = rand.nextLong() % 25340232959L
+  // -6213574080L is the number of milliseconds before January 1, 
1970, 00:00:00 GMT
+  // for "0001-01-01 00:00:00.00". We need to find a
+  // number that is greater or equals to this number as a valid 
timestamp value.
+  while (milliseconds < -6213574080L) {
+// 25340232959L is the number of milliseconds since
+// January 1, 1970, 00:00:00 GMT for "-12-31 23:59:59.99".

Review comment:
   ditto.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above