[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20787 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186725257 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2[, roundOff]) - If `date1` is later than `date2`, then the result --- End diff -- sorry if I wasn't clear. mind changing date1 and date2 to timestamp1 and timestamp2 here in `usage`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user aditkumar commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186475229 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2[, roundOff]) - If `date1` is later than `date2`, then the result --- End diff -- ðð¾ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186467842 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2[, roundOff]) - If `date1` is later than `date2`, then the result --- End diff -- Yea, I got that they are consistent now but actually I think timestamp is more correct since the actual expected types here are timestmaps, not dates. Let's just leave them as were alone. It had a long review iterations. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user aditkumar commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186464442 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2[, roundOff]) - If `date1` is later than `date2`, then the result --- End diff -- I'm going to leave it as date1, date2 because that also lines up with the variable names in the function definition. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186458260 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2[, roundOff]) - If `date1` is later than `date2`, then the result --- End diff -- Last comment. Let's just leave `timestamp1` and `timestamp2` as were in the `usage` specifically.. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user aditkumar commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186436718 --- Diff: R/pkg/R/functions.R --- @@ -1906,6 +1906,7 @@ setMethod("atan2", signature(y = "Column"), #' @details #' \code{datediff}: Returns the number of days from \code{y} to \code{x}. +#' If \code{y} is later than \code{x} then the result is positive. --- End diff -- Done --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186277140 --- Diff: R/pkg/R/functions.R --- @@ -1906,6 +1906,7 @@ setMethod("atan2", signature(y = "Column"), #' @details #' \code{datediff}: Returns the number of days from \code{y} to \code{x}. +#' If \code{y} is later than \code{x} then the result is positive. --- End diff -- I'd add the same sentence to Python and Scala side too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r186277122 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1194,13 +1194,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `date1` and `date2`. + * If `date1` is later than `date2`, then the result is positive. + * If `date1` and `date2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits unless roundOff=false. */ // scalastyle:off line.size.limit @ExpressionDescription( usage = """ -_FUNC_(timestamp1, timestamp2[, roundOff]) - Returns number of months between `timestamp1` and `timestamp2`. - The result is rounded to 8 decimal places by default. Set roundOff=false otherwise., +_FUNC_(date1, date2) - If `date1` is later than `date2`, then the result --- End diff -- wait .. why did you change ` _FUNC_(timestamp1, timestamp2[, roundOff])`?? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r185976200 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -887,11 +887,25 @@ object DateTimeUtils { * Returns number of months between time1 and time2. time1 and time2 are expressed in * microseconds since 1.1.1970. * - * If time1 and time2 having the same day of month, or both are the last day of month, - * it returns an integer (time under a day will be ignored). + * If time1 and time2 are on the same day of month, or both are the last day of month, + * time of a day will be ignored. + * + * Otherwise, the difference is calculated based on 31 days per month, and rounded to + * 8 digits. + */ + def monthsBetween(time1: SQLTimestamp, time2: SQLTimestamp): Double = { --- End diff -- From a very quick look. there was a mistake when merging it in. @aditkumar, mind if I ask to rebase and sync it please? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r183221673 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1117,11 +1117,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `timestamp1` and `timestamp2`. + * If `timestamp` is later than `timestamp2`, then the result is positive. --- End diff -- Nit: timestamp -> timestamp1. Same below. Nit: These are called date1 and date2 in Python, and also here in the Scala code. Worth being consistent? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r182701579 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1117,11 +1117,21 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `timestamp1` and `timestamp2`. + * If `timestamp` is later than `timestamp2`, then the result is positive. + * If `timestamp1` and `timestamp2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits. */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(timestamp1, timestamp2) - Returns number of months between `timestamp1` and `timestamp2`.", + usage = """ +_FUNC_(timestamp1, timestamp2) - If `timestamp` is later than `timestamp2`, then the result +is positive. If `timestamp1` and `timestamp2` are on the same day of month, or both +are the last day of month, time of day will be ignored. Otherwise, the +difference is calculated based on 31 days per month, and rounded to 8 digits. + """, --- End diff -- I am not sure why you don't like this format: ``` usage = """ _FUNC_(timestamp1, timestamp2) - If `timestamp` is later than `timestamp2`, then the result is positive. If `timestamp1` and `timestamp2` are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits. """, ``` It's actually consistent with other ones such as: https://github.com/apache/spark/blob/df7fc3ef3899cadd252d2837092bebe3442d6523/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala#L51-L60 not a big deal but it should be good to match while we are here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r181971230 --- Diff: R/pkg/R/functions.R --- @@ -1957,8 +1958,11 @@ setMethod("levenshtein", signature(y = "Column"), }) #' @details -#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. #' +#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. +#' If \code{y} is later than \code{x}, then the result is positive. If \code{y} and \code{x} +#' are on the same day of month, or both are the last day of month, time of day will be ignored. +#' Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits. #' @rdname column_datetime_diff_functions --- End diff -- https://github.com/apache/spark/pull/20787#discussion_r177662053 roxygen is very particular with "empty line" which is significant. please change this to ``` #' @details #' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. #' If \code{y} is later than \code{x}, then the result is positive. If \code{y} and \code{x} #' are on the same day of month, or both are the last day of month, time of day will be ignored. #' Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits. #' #' @rdname column_datetime_diff_functions ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r177663453 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1117,11 +1117,22 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `timestamp1` and `timestamp2`. + * If `timestamp` is later than `timestamp2`, then the result is positive. + * If `timestamp1` and `timestamp2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. Otherwise, the + * difference is calculated based on 31 days per month, and rounded to + * 8 digits. */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(timestamp1, timestamp2) - Returns number of months between `timestamp1` and `timestamp2`.", + usage = """ +_FUNC_(timestamp1, timestamp2) - +If `timestamp` is later than `timestamp2`, then the result is positive. +If `timestamp1` and `timestamp2` are on the same day of month, or both +are the last day of month, time of day will be ignored. Otherwise, the +difference is calculated based on 31 days per month, and rounded to 8 digits. + """, --- End diff -- Seems different from what I suggested at https://github.com/apache/spark/pull/20787#discussion_r176949408 .. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r177662053 --- Diff: R/pkg/R/functions.R --- @@ -1958,6 +1958,7 @@ setMethod("levenshtein", signature(y = "Column"), }) #' @details +#' --- End diff -- I don't think that's what I meant? `#'` on a newline after ` 8 digits. ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r176949408 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1117,11 +1117,23 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `timestamp1` and `timestamp2`. + * If `timestamp` is later than `timestamp2`, then the result is positive. + * If `timestamp1` and `timestamp2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. + * Otherwise, the difference is calculated based on 31 days per month, and + * rounded to 8 digits. */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(timestamp1, timestamp2) - Returns number of months between `timestamp1` and `timestamp2`.", + usage = """ +_FUNC_(timestamp1, timestamp2) - +If `timestamp` is later than `timestamp2`, then the result is positive. +If `timestamp1` and `timestamp2` are on the same day of month, or both +are the last day of month, time of day will be ignored. +Otherwise, the difference is calculated based on 31 days per month, and +rounded to 8 digits. +""", --- End diff -- ```scala usage = """ _FUNC_(timestamp1, timestamp2) - If `timestamp` is later than `timestamp2`, then the result is positive. If `timestamp1` and `timestamp2` are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits. """, ``` Let's keep the format consistent. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r176946746 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1117,11 +1117,23 @@ case class AddMonths(startDate: Expression, numMonths: Expression) } /** - * Returns number of months between dates date1 and date2. + * Returns number of months between dates `timestamp1` and `timestamp2`. + * If `timestamp` is later than `timestamp2`, then the result is positive. + * If `timestamp1` and `timestamp2` are on the same day of month, or both + * are the last day of month, time of day will be ignored. + * Otherwise, the difference is calculated based on 31 days per month, and + * rounded to 8 digits. --- End diff -- Thanks for improving our doc! Could you also add the related tests in our test suite? https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala#L488-L511 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r176925831 --- Diff: R/pkg/R/functions.R --- @@ -1957,8 +1958,12 @@ setMethod("levenshtein", signature(y = "Column"), }) #' @details -#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. -#' +#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. +#' If \code{y} is later than \code{x}, then the result is positive. +#' If \code{y} and \code{x} are on the same day of month, or both are the last day of month, +#' time of day will be ignored. +#' Otherwise, the difference is calculated based on 31 days per month, and rounded to +#' 8 digits. #' @rdname column_datetime_diff_functions --- End diff -- you should leave a line `#'` before this --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20787: [MINOR][DOCS] Documenting months_between directio...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/20787#discussion_r176925842 --- Diff: R/pkg/R/functions.R --- @@ -1957,8 +1958,12 @@ setMethod("levenshtein", signature(y = "Column"), }) #' @details -#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. -#' +#' \code{months_between}: Returns number of months between dates \code{y} and \code{x}. +#' If \code{y} is later than \code{x}, then the result is positive. +#' If \code{y} and \code{x} are on the same day of month, or both are the last day of month, +#' time of day will be ignored. +#' Otherwise, the difference is calculated based on 31 days per month, and rounded to +#' 8 digits. #' @rdname column_datetime_diff_functions --- End diff -- also just as reference, the whitespace/newline will be stripped ``` time of day will be ignored. Otherwise ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org