[SPARK-17963][SQL][DOCUMENTATION] Add examples (extend) in each expression and improve documentation
## What changes were proposed in this pull request?

This PR proposes to change the documentation for functions. Please refer to the discussion in https://github.com/apache/spark/pull/15513. The changes include:

- Re-indenting the documentation
- Adding examples/arguments in `extended` where the arguments are multiple or in a specific format (e.g. xml/json)

For example, the documentation was updated as below:

### Functions with single-line usage

**Before**

- `pow`

``` sql
Usage: pow(x1, x2) - Raise x1 to the power of x2.
Extended Usage:
> SELECT pow(2, 3);
8.0
```

- `current_timestamp`

``` sql
Usage: current_timestamp() - Returns the current timestamp at the start of query evaluation.
Extended Usage:
No example for current_timestamp.
```

**After**

- `pow`

``` sql
Usage: pow(expr1, expr2) - Raises `expr1` to the power of `expr2`.
Extended Usage:
    Examples:
      > SELECT pow(2, 3);
       8.0
```

- `current_timestamp`

``` sql
Usage: current_timestamp() - Returns the current timestamp at the start of query evaluation.
Extended Usage:
    No example/argument for current_timestamp.
```

### Functions with (already) multi-line usage

**Before**

- `approx_count_distinct`

``` sql
Usage: approx_count_distinct(expr) - Returns the estimated cardinality by HyperLogLog++.
    approx_count_distinct(expr, relativeSD=0.05) - Returns the estimated cardinality by HyperLogLog++
      with relativeSD, the maximum estimation error allowed.
Extended Usage:
No example for approx_count_distinct.
```

- `percentile_approx`

``` sql
Usage:
    percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
      column `col` at the given percentage. The value of percentage must be between 0.0
      and 1.0. The `accuracy` parameter (default: 10000) is a positive integer literal which
      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
      better accuracy, `1.0/accuracy` is the relative error of the approximation.

    percentile_approx(col, array(percentage1 [, percentage2]...) [, accuracy]) - Returns the approximate
      percentile array of column `col` at the given percentage array. Each value of the
      percentage array must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is
      a positive integer literal which controls approximation accuracy at the cost of memory.
      Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of
      the approximation.
Extended Usage:
No example for percentile_approx.
```

**After**

- `approx_count_distinct`

``` sql
Usage:
    approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++.
      `relativeSD` defines the maximum estimation error allowed.
Extended Usage:
    No example/argument for approx_count_distinct.
```

- `percentile_approx`

``` sql
Usage:
    percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
      column `col` at the given percentage. The value of percentage must be between 0.0
      and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which
      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
      better accuracy, `1.0/accuracy` is the relative error of the approximation.
      When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0.
      In this case, returns the approximate percentile array of column `col` at the given
      percentage array.
Extended Usage:
    Examples:
      > SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
       [10.0,10.0,10.0]
      > SELECT percentile_approx(10.0, 0.5, 100);
       10.0
```

## How was this patch tested?

Manually tested.

**When examples are multiple**

``` sql
spark-sql> describe function extended reflect;
Function: reflect
Class: org.apache.spark.sql.catalyst.expressions.CallMethodViaReflection
Usage: reflect(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
Extended Usage:
    Examples:
      > SELECT reflect('java.util.UUID', 'randomUUID');
       c33fb387-8500-4bfa-81d2-6e0e3e930df2
      > SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
       a5cf6c42-0c85-418f-af6c-3e4e5b1328f2
```

**When `Usage` is in a single line**

``` sql
spark-sql> describe function extended min;
Function: min
Class: org.apache.spark.sql.catalyst.expressions.aggregate.Min
Usage: min(expr) - Returns the minimum value of `expr`.
Extended Usage:
    No example/argument for min.
```

**When `Usage` is already in multiple lines**

``` sql
spark-sql> describe function extended percentile_approx;
Function: percentile_approx
Class: org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile
Usage:
    percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
      column `col` at the given percentage. The value of percentage must be between 0.0
      and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which
      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
      better accuracy, `1.0/accuracy` is the relative error of the approximation.
      When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0.
      In this case, returns the approximate percentile array of column `col` at the given
      percentage array.
Extended Usage:
    Examples:
      > SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
       [10.0,10.0,10.0]
      > SELECT percentile_approx(10.0, 0.5, 100);
       10.0
```

**When example/argument is missing**

``` sql
spark-sql> describe function extended rank;
Function: rank
Class: org.apache.spark.sql.catalyst.expressions.Rank
Usage:
    rank() - Computes the rank of a value in a group of values. The result is one plus the number
      of rows preceding or equal to the current row in the ordering of the partition. The values
      will produce gaps in the sequence.
Extended Usage:
    No example/argument for rank.
```

Author: hyukjinkwon <gurwls...@gmail.com>

Closes #15677 from HyukjinKwon/SPARK-17963-1.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7eb2ca8e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7eb2ca8e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7eb2ca8e

Branch: refs/heads/master
Commit: 7eb2ca8e338e04034a662920261e028f56b07395
Parents: 3a1bc6f
Author: hyukjinkwon <gurwls...@gmail.com>
Authored: Wed Nov 2 20:56:30 2016 -0700
Committer: gatorsmile <gatorsm...@gmail.com>
Committed: Wed Nov 2 20:56:30 2016 -0700

----------------------------------------------------------------------
 .../expressions/ExpressionDescription.java      |   2 +-
 .../expressions/CallMethodViaReflection.scala   |  12 +-
 .../spark/sql/catalyst/expressions/Cast.scala   |   8 +-
 .../catalyst/expressions/InputFileName.scala    |   3 +-
 .../expressions/MonotonicallyIncreasingID.scala |  14 +-
 .../catalyst/expressions/SparkPartitionID.scala |   3 +-
 .../aggregate/ApproximatePercentile.scala       |  26 +-
 .../expressions/aggregate/Average.scala         |   2 +-
 .../aggregate/CentralMomentAgg.scala            |  14 +-
 .../catalyst/expressions/aggregate/Corr.scala   |   4 +-
 .../catalyst/expressions/aggregate/Count.scala  |  10 +-
 .../expressions/aggregate/Covariance.scala      |   4 +-
 .../catalyst/expressions/aggregate/First.scala  |   8 +-
 .../aggregate/HyperLogLogPlusPlus.scala         |   8 +-
 .../catalyst/expressions/aggregate/Last.scala   |   5 +-
 .../catalyst/expressions/aggregate/Max.scala    |   2 +-
 .../catalyst/expressions/aggregate/Min.scala    |   2 +-
 .../catalyst/expressions/aggregate/Sum.scala    |   2 +-
 .../expressions/aggregate/collect.scala         |   2 +-
 .../sql/catalyst/expressions/arithmetic.scala   |  79 ++++-
 .../expressions/bitwiseExpressions.scala        |  32 +-
 .../expressions/collectionOperations.scala      |  36 +-
 .../expressions/complexTypeCreator.scala        |  29 +-
 .../expressions/conditionalExpressions.scala    |   9 +-
 .../expressions/datetimeExpressions.scala       | 199 ++++++++---
 .../sql/catalyst/expressions/generators.scala   |  36 +-
 .../catalyst/expressions/jsonExpressions.scala  |  14 +-
 .../catalyst/expressions/mathExpressions.scala  | 346 ++++++++++++++-----
 .../spark/sql/catalyst/expressions/misc.scala   |  59 +++-
 .../catalyst/expressions/nullExpressions.scala  |  72 +++-
 .../sql/catalyst/expressions/predicates.scala   |  24 +-
 .../expressions/randomExpressions.scala         |  24 +-
 .../expressions/regexpExpressions.scala         |  30 +-
 .../expressions/stringExpressions.scala         | 317 ++++++++++++-----
 .../expressions/windowExpressions.scala         | 117 ++++---
 .../sql/catalyst/expressions/xml/xpath.scala    |  78 ++++-
 .../spark/sql/execution/command/functions.scala |  22 +-
 .../org/apache/spark/sql/SQLQuerySuite.scala    |   7 +-
 .../spark/sql/execution/command/DDLSuite.scala  |  22 +-
 .../sql/hive/execution/SQLQuerySuite.scala      |  24 +-
 40 files changed, 1256 insertions(+), 451 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
index 9e10f27..62a2ce4 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
@@ -39,5 +39,5 @@ import java.lang.annotation.RetentionPolicy;
 @Retention(RetentionPolicy.RUNTIME)
 public @interface ExpressionDescription {
     String usage() default "_FUNC_ is undocumented";
-    String extended() default "No example for _FUNC_.";
+    String extended() default "\n    No example/argument for _FUNC_.\n";
 }
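The annotation change above only alters a default value, which callers observe whenever `extended` is left unset. A minimal, self-contained sketch of that mechanism (the `Description` annotation and `MinLike` class below are hypothetical stand-ins, not Spark's actual classes) shows how a runtime-retained annotation's `default` string is returned when the element is not specified:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationDefaultDemo {
    // Hypothetical stand-in mirroring the shape of ExpressionDescription.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Description {
        String usage() default "_FUNC_ is undocumented";
        String extended() default "\n    No example/argument for _FUNC_.\n";
    }

    // `usage` is set explicitly; `extended` falls back to the default above.
    @Description(usage = "_FUNC_(expr) - Returns the minimum value of `expr`.")
    static class MinLike {}

    public static void main(String[] args) {
        Description d = MinLike.class.getAnnotation(Description.class);
        System.out.println(d.usage());
        System.out.print(d.extended());
    }
}
```

This is why changing only the annotation default in `ExpressionDescription.java` updates the "no example" message for every expression that never declares `extended`.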
http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
index fe24c04..40f1b14 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
@@ -43,11 +43,15 @@ import org.apache.spark.util.Utils
  * and the second element should be a literal string for the method name,
  * and the remaining are input arguments to the Java method.
  */
-// scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection",
-  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2")
-// scalastyle:on line.size.limit
+  usage = "_FUNC_(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('java.util.UUID', 'randomUUID');
+       c33fb387-8500-4bfa-81d2-6e0e3e930df2
+      > SELECT _FUNC_('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
+       a5cf6c42-0c85-418f-af6c-3e4e5b1328f2
+  """)
 case class CallMethodViaReflection(children: Seq[Expression])
   extends Expression with CodegenFallback {

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
index 58fd65f..4db1ae6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
@@ -114,8 +114,12 @@ object Cast {
 /** Cast the child expression to the target data type. */
 @ExpressionDescription(
-  usage = " - Cast value v to the target data type.",
-  extended = "> SELECT _FUNC_('10' as int);\n 10")
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('10' as int);
+       10
+  """)
 case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with NullIntolerant {

   override def toString: String = s"cast($child as ${dataType.simpleString})"

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala
index b6c12c5..b7fb285 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala
@@ -27,8 +27,7 @@ import org.apache.spark.unsafe.types.UTF8String
  * Expression that returns the name of the current file being read.
  */
 @ExpressionDescription(
-  usage = "_FUNC_() - Returns the name of the current file being read if available",
-  extended = "> SELECT _FUNC_();\n ''")
+  usage = "_FUNC_() - Returns the name of the current file being read if available.")
 case class InputFileName() extends LeafExpression with Nondeterministic {

   override def nullable: Boolean = true

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala
index 72b8dcc..32358a9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala
@@ -33,13 +33,13 @@ import org.apache.spark.sql.types.{DataType, LongType}
  * Since this expression is stateful, it cannot be a case object.
  */
 @ExpressionDescription(
-  usage =
-    """_FUNC_() - Returns monotonically increasing 64-bit integers.
-      The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive.
-      The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits
-      represent the record number within each partition. The assumption is that the data frame has
-      less than 1 billion partitions, and each partition has less than 8 billion records.""",
-  extended = "> SELECT _FUNC_();\n 0")
+  usage = """
+    _FUNC_() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed
+      to be monotonically increasing and unique, but not consecutive. The current implementation
+      puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number
+      within each partition. The assumption is that the data frame has less than 1 billion
+      partitions, and each partition has less than 8 billion records.
+  """)
 case class MonotonicallyIncreasingID() extends LeafExpression with Nondeterministic {

   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala
index 6bef473..8db7efd 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala
@@ -25,8 +25,7 @@ import org.apache.spark.sql.types.{DataType, IntegerType}
  * Expression that returns the current partition id.
  */
 @ExpressionDescription(
-  usage = "_FUNC_() - Returns the current partition id",
-  extended = "> SELECT _FUNC_();\n 0")
+  usage = "_FUNC_() - Returns the current partition id.")
 case class SparkPartitionID() extends LeafExpression with Nondeterministic {

   override def nullable: Boolean = false

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index f91ff87..692cbd7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -49,21 +49,23 @@ import org.apache.spark.sql.types._
  * DEFAULT_PERCENTILE_ACCURACY.
  */
 @ExpressionDescription(
-  usage =
-    """
-      _FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
+  usage = """
+    _FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
       column `col` at the given percentage. The value of percentage must be between 0.0
-      and 1.0. The `accuracy` parameter (default: 10000) is a positive integer literal which
+      and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which
       controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
       better accuracy, `1.0/accuracy` is the relative error of the approximation.
-
-      _FUNC_(col, array(percentage1 [, percentage2]...) [, accuracy]) - Returns the approximate
-      percentile array of column `col` at the given percentage array. Each value of the
-      percentage array must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is
-      a positive integer literal which controls approximation accuracy at the cost of memory.
-      Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of
-      the approximation.
-    """)
+      When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0.
+      In this case, returns the approximate percentile array of column `col` at the given
+      percentage array.
+  """,
+  extended = """
+    Examples:
+      > SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
+       [10.0,10.0,10.0]
+      > SELECT percentile_approx(10.0, 0.5, 100);
+       10.0
+  """)
 case class ApproximatePercentile(
     child: Expression,
     percentageExpression: Expression,

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
index ff70774..d523420 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala
@@ -24,7 +24,7 @@ import org.apache.spark.sql.catalyst.util.TypeUtils
 import org.apache.spark.sql.types._

 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the mean calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the mean calculated from values of a group.")
 case class Average(child: Expression) extends DeclarativeAggregate {

   override def prettyName: String = "avg"
http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala
index 17a7c6d..3020547 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala
@@ -132,7 +132,7 @@ abstract class CentralMomentAgg(child: Expression) extends DeclarativeAggregate
 // Compute the population standard deviation of a column
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the population standard deviation calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the population standard deviation calculated from values of a group.")
 // scalastyle:on line.size.limit
 case class StddevPop(child: Expression) extends CentralMomentAgg(child) {
@@ -147,8 +147,10 @@ case class StddevPop(child: Expression) extends CentralMomentAgg(child) {
 }

 // Compute the sample standard deviation of a column
+// scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the sample standard deviation calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the sample standard deviation calculated from values of a group.")
+// scalastyle:on line.size.limit
 case class StddevSamp(child: Expression) extends CentralMomentAgg(child) {

   override protected def momentOrder = 2
@@ -164,7 +166,7 @@ case class StddevSamp(child: Expression) extends CentralMomentAgg(child) {

 // Compute the population variance of a column
 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the population variance calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the population variance calculated from values of a group.")
 case class VariancePop(child: Expression) extends CentralMomentAgg(child) {

   override protected def momentOrder = 2
@@ -179,7 +181,7 @@ case class VariancePop(child: Expression) extends CentralMomentAgg(child) {

 // Compute the sample variance of a column
 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the sample variance calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the sample variance calculated from values of a group.")
 case class VarianceSamp(child: Expression) extends CentralMomentAgg(child) {

   override protected def momentOrder = 2
@@ -194,7 +196,7 @@ case class VarianceSamp(child: Expression) extends CentralMomentAgg(child) {
 }

 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the Skewness value calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the skewness value calculated from values of a group.")
 case class Skewness(child: Expression) extends CentralMomentAgg(child) {

   override def prettyName: String = "skewness"
@@ -209,7 +211,7 @@ case class Skewness(child: Expression) extends CentralMomentAgg(child) {
 }

 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the Kurtosis value calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the kurtosis value calculated from values of a group.")
 case class Kurtosis(child: Expression) extends CentralMomentAgg(child) {

   override protected def momentOrder = 4

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Corr.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Corr.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Corr.scala
index e29265e..657f519 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Corr.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Corr.scala
@@ -28,8 +28,10 @@ import org.apache.spark.sql.types._
  * Definition of Pearson correlation can be found at
  * http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
  */
+// scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(x,y) - Returns Pearson coefficient of correlation between a set of number pairs.")
+  usage = "_FUNC_(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs.")
+// scalastyle:on line.size.limit
 case class Corr(x: Expression, y: Expression) extends DeclarativeAggregate {

   override def children: Seq[Expression] = Seq(x, y)

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
index 17ae012..bcae0dc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Count.scala
@@ -23,9 +23,13 @@ import org.apache.spark.sql.types._

 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = """_FUNC_(*) - Returns the total number of retrieved rows, including rows containing NULL values.
-    _FUNC_(expr) - Returns the number of rows for which the supplied expression is non-NULL.
-    _FUNC_(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.""")
+  usage = """
+    _FUNC_(*) - Returns the total number of retrieved rows, including rows containing null.
+
+    _FUNC_(expr) - Returns the number of rows for which the supplied expression is non-null.
+
+    _FUNC_(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.
+  """)
 // scalastyle:on line.size.limit
 case class Count(children: Seq[Expression]) extends DeclarativeAggregate {

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala
index d80afbe..ae5ed77 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala
@@ -77,7 +77,7 @@ abstract class Covariance(x: Expression, y: Expression) extends DeclarativeAggre
 }

 @ExpressionDescription(
-  usage = "_FUNC_(x,y) - Returns the population covariance of a set of number pairs.")
+  usage = "_FUNC_(expr1, expr2) - Returns the population covariance of a set of number pairs.")
 case class CovPopulation(left: Expression, right: Expression) extends Covariance(left, right) {
   override val evaluateExpression: Expression = {
     If(n === Literal(0.0), Literal.create(null, DoubleType),
@@ -88,7 +88,7 @@ case class CovPopulation(left: Expression, right: Expression) extends Covariance

 @ExpressionDescription(
-  usage = "_FUNC_(x,y) - Returns the sample covariance of a set of number pairs.")
+  usage = "_FUNC_(expr1, expr2) - Returns the sample covariance of a set of number pairs.")
 case class CovSample(left: Expression, right: Expression) extends Covariance(left, right) {
   override val evaluateExpression: Expression = {
     If(n === Literal(0.0), Literal.create(null, DoubleType),

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala
index d702c08..29b8947 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala
@@ -29,10 +29,10 @@ import org.apache.spark.sql.types._
  * a single partition, and we use a single reducer to do the aggregation.).
  */
 @ExpressionDescription(
-  usage = """_FUNC_(expr) - Returns the first value of `child` for a group of rows.
-    _FUNC_(expr,isIgnoreNull=false) - Returns the first value of `child` for a group of rows.
-      If isIgnoreNull is true, returns only non-null values.
-  """)
+  usage = """
+    _FUNC_(expr[, isIgnoreNull]) - Returns the first value of `expr` for a group of rows.
+      If `isIgnoreNull` is true, returns only non-null values.
+  """)
 case class First(child: Expression, ignoreNullsExpr: Expression) extends DeclarativeAggregate {

   def this(child: Expression) = this(child, Literal.create(false, BooleanType))

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
index 83c8d40..b9862aa 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala
@@ -47,10 +47,10 @@ import org.apache.spark.sql.types._
  */
 // scalastyle:on
 @ExpressionDescription(
-  usage = """_FUNC_(expr) - Returns the estimated cardinality by HyperLogLog++.
-    _FUNC_(expr, relativeSD=0.05) - Returns the estimated cardinality by HyperLogLog++
-      with relativeSD, the maximum estimation error allowed.
-  """)
+  usage = """
+    _FUNC_(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++.
+      `relativeSD` defines the maximum estimation error allowed.
+  """)
 case class HyperLogLogPlusPlus(
     child: Expression,
     relativeSD: Double = 0.05,

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala
index 8579f72..b0a363e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala
@@ -29,7 +29,10 @@ import org.apache.spark.sql.types._
  * a single partition, and we use a single reducer to do the aggregation.).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(expr,isIgnoreNull) - Returns the last value of `child` for a group of rows.")
+  usage = """
+    _FUNC_(expr[, isIgnoreNull]) - Returns the last value of `expr` for a group of rows.
+      If `isIgnoreNull` is true, returns only non-null values.
+  """)
 case class Last(child: Expression, ignoreNullsExpr: Expression) extends DeclarativeAggregate {

   def this(child: Expression) = this(child, Literal.create(false, BooleanType))

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala
index c534fe4..f32c9c6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala
@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.util.TypeUtils
 import org.apache.spark.sql.types._

 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Returns the maximum value of expr.")
+  usage = "_FUNC_(expr) - Returns the maximum value of `expr`.")
 case class Max(child: Expression) extends DeclarativeAggregate {

   override def children: Seq[Expression] = child :: Nil

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Min.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Min.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Min.scala
index 35289b4..9ef42b9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Min.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Min.scala
@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.util.TypeUtils
 import org.apache.spark.sql.types._

 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Returns the minimum value of expr.")
+  usage = "_FUNC_(expr) - Returns the minimum value of `expr`.")
 case class Min(child: Expression) extends DeclarativeAggregate {

   override def children: Seq[Expression] = child :: Nil

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
index ad217f2..f3731d4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.util.TypeUtils
 import org.apache.spark.sql.types._

 @ExpressionDescription(
-  usage = "_FUNC_(x) - Returns the sum calculated from values of a group.")
+  usage = "_FUNC_(expr) - Returns the sum calculated from values of a group.")
 case class Sum(child: Expression) extends DeclarativeAggregate {

   override def children: Seq[Expression] = child :: Nil

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
index 89eb864..d2880d5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
@@ -106,7 +106,7 @@ case class CollectList(
 }

 /**
- * Collect a list of unique elements.
+ * Collect a set of unique elements.
  */
 @ExpressionDescription(
   usage = "_FUNC_(expr) - Collects and returns a set of unique elements.")

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
index 6f3db79..4870093 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
@@ -25,7 +25,12 @@ import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.CalendarInterval

 @ExpressionDescription(
-  usage = "_FUNC_(a) - Returns -a.")
+  usage = "_FUNC_(expr) - Returns the negated value of `expr`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(1);
+       -1
+  """)
 case class UnaryMinus(child: Expression) extends UnaryExpression
     with ExpectsInputTypes with NullIntolerant {
@@ -62,7 +67,7 @@ case class UnaryMinus(child: Expression) extends UnaryExpression
 }

 @ExpressionDescription(
-  usage = "_FUNC_(a) - Returns a.")
+  usage = "_FUNC_(expr) - Returns the value of `expr`.")
 case class UnaryPositive(child: Expression)
     extends UnaryExpression with ExpectsInputTypes with NullIntolerant {
   override def prettyName: String = "positive"
@@ -84,7 +89,11 @@ case class UnaryPositive(child: Expression)
  */
 @ExpressionDescription(
   usage = "_FUNC_(expr) - Returns the absolute value of the numeric value.",
-  extended = "> SELECT _FUNC_('-1');\n 1")
+  extended = """
+    Examples:
+      > SELECT _FUNC_(-1);
+       1
+  """)
 case class Abs(child: Expression)
     extends UnaryExpression with ExpectsInputTypes with NullIntolerant {
@@ -131,7 +140,12 @@ object BinaryArithmetic {
 }

 @ExpressionDescription(
-  usage = "a _FUNC_ b - 
Returns a+b.") + usage = "expr1 _FUNC_ expr2 - Returns `expr1`+`expr2`.", + extended = """ + Examples: + > SELECT 1 _FUNC_ 2; + 3 + """) case class Add(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { override def inputType: AbstractDataType = TypeCollection.NumericAndInterval @@ -162,7 +176,12 @@ case class Add(left: Expression, right: Expression) extends BinaryArithmetic wit } @ExpressionDescription( - usage = "a _FUNC_ b - Returns a-b.") + usage = "expr1 _FUNC_ expr2 - Returns `expr1`-`expr2`.", + extended = """ + Examples: + > SELECT 2 _FUNC_ 1; + 1 + """) case class Subtract(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { @@ -194,7 +213,12 @@ case class Subtract(left: Expression, right: Expression) } @ExpressionDescription( - usage = "a _FUNC_ b - Multiplies a by b.") + usage = "expr1 _FUNC_ expr2 - Returns `expr1`*`expr2`.", + extended = """ + Examples: + > SELECT 2 _FUNC_ 3; + 6 + """) case class Multiply(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { @@ -208,9 +232,17 @@ case class Multiply(left: Expression, right: Expression) protected override def nullSafeEval(input1: Any, input2: Any): Any = numeric.times(input1, input2) } +// scalastyle:off line.size.limit @ExpressionDescription( - usage = "a _FUNC_ b - Divides a by b.", - extended = "> SELECT 3 _FUNC_ 2;\n 1.5") + usage = "expr1 _FUNC_ expr2 - Returns `expr1`/`expr2`. 
It always performs floating point division.", + extended = """ + Examples: + > SELECT 3 _FUNC_ 2; + 1.5 + > SELECT 2L _FUNC_ 2L; + 1.0 + """) +// scalastyle:on line.size.limit case class Divide(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { @@ -286,7 +318,12 @@ case class Divide(left: Expression, right: Expression) } @ExpressionDescription( - usage = "a _FUNC_ b - Returns the remainder when dividing a by b.") + usage = "expr1 _FUNC_ expr2 - Returns the remainder after `expr1`/`expr2`.", + extended = """ + Examples: + > SELECT 2 _FUNC_ 1.8; + 0.2 + """) case class Remainder(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { @@ -367,8 +404,14 @@ case class Remainder(left: Expression, right: Expression) } @ExpressionDescription( - usage = "_FUNC_(a, b) - Returns the positive modulo", - extended = "> SELECT _FUNC_(10,3);\n 1") + usage = "_FUNC_(expr1, expr2) - Returns the positive value of `expr1` mod `expr2`.", + extended = """ + Examples: + > SELECT _FUNC_(10, 3); + 1 + > SELECT _FUNC_(-10, 3); + 2 + """) case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic with NullIntolerant { override def toString: String = s"pmod($left, $right)" @@ -471,7 +514,12 @@ case class Pmod(left: Expression, right: Expression) extends BinaryArithmetic wi * It takes at least 2 parameters, and returns null iff all parameters are null. */ @ExpressionDescription( - usage = "_FUNC_(n1, ...) - Returns the least value of all parameters, skipping null values.") + usage = "_FUNC_(expr, ...) 
- Returns the least value of all parameters, skipping null values.", + extended = """ + Examples: + > SELECT _FUNC_(10, 9, 2, 4, 3); + 2 + """) case class Least(children: Seq[Expression]) extends Expression { override def nullable: Boolean = children.forall(_.nullable) @@ -531,7 +579,12 @@ case class Least(children: Seq[Expression]) extends Expression { * It takes at least 2 parameters, and returns null iff all parameters are null. */ @ExpressionDescription( - usage = "_FUNC_(n1, ...) - Returns the greatest value of all parameters, skipping null values.") + usage = "_FUNC_(expr, ...) - Returns the greatest value of all parameters, skipping null values.", + extended = """ + Examples: + > SELECT _FUNC_(10, 9, 2, 4, 3); + 10 + """) case class Greatest(children: Seq[Expression]) extends Expression { override def nullable: Boolean = children.forall(_.nullable) http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala index 3a0a882..2918040 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/bitwiseExpressions.scala @@ -27,8 +27,12 @@ import org.apache.spark.sql.types._ * Code generation inherited from BinaryArithmetic. 
*/ @ExpressionDescription( - usage = "a _FUNC_ b - Bitwise AND.", - extended = "> SELECT 3 _FUNC_ 5; 1") + usage = "expr1 _FUNC_ expr2 - Returns the result of bitwise AND of `expr1` and `expr2`.", + extended = """ + Examples: + > SELECT 3 _FUNC_ 5; + 1 + """) case class BitwiseAnd(left: Expression, right: Expression) extends BinaryArithmetic { override def inputType: AbstractDataType = IntegralType @@ -55,8 +59,12 @@ case class BitwiseAnd(left: Expression, right: Expression) extends BinaryArithme * Code generation inherited from BinaryArithmetic. */ @ExpressionDescription( - usage = "a _FUNC_ b - Bitwise OR.", - extended = "> SELECT 3 _FUNC_ 5; 7") + usage = "expr1 _FUNC_ expr2 - Returns the result of bitwise OR of `expr1` and `expr2`.", + extended = """ + Examples: + > SELECT 3 _FUNC_ 5; + 7 + """) case class BitwiseOr(left: Expression, right: Expression) extends BinaryArithmetic { override def inputType: AbstractDataType = IntegralType @@ -83,8 +91,12 @@ case class BitwiseOr(left: Expression, right: Expression) extends BinaryArithmet * Code generation inherited from BinaryArithmetic. */ @ExpressionDescription( - usage = "a _FUNC_ b - Bitwise exclusive OR.", - extended = "> SELECT 3 _FUNC_ 5; 2") + usage = "expr1 _FUNC_ expr2 - Returns the result of bitwise exclusive OR of `expr1` and `expr2`.", + extended = """ + Examples: + > SELECT 3 _FUNC_ 5; + 2 + """) case class BitwiseXor(left: Expression, right: Expression) extends BinaryArithmetic { override def inputType: AbstractDataType = IntegralType @@ -109,8 +121,12 @@ case class BitwiseXor(left: Expression, right: Expression) extends BinaryArithme * A function that calculates bitwise not(~) of a number. 
*/ @ExpressionDescription( - usage = "_FUNC_ b - Bitwise NOT.", - extended = "> SELECT _FUNC_ 0; -1") + usage = "_FUNC_ expr - Returns the result of bitwise NOT of `expr`.", + extended = """ + Examples: + > SELECT _FUNC_ 0; + -1 + """) case class BitwiseNot(child: Expression) extends UnaryExpression with ExpectsInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(IntegralType) http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala index f56bb39..c863ba4 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala @@ -28,8 +28,12 @@ import org.apache.spark.sql.types._ * Given an array or map, returns its size. Returns -1 if null. */ @ExpressionDescription( - usage = "_FUNC_(expr) - Returns the size of an array or a map.", - extended = " > SELECT _FUNC_(array('b', 'd', 'c', 'a'));\n 4") + usage = "_FUNC_(expr) - Returns the size of an array or a map. 
Returns -1 if null.", + extended = """ + Examples: + > SELECT _FUNC_(array('b', 'd', 'c', 'a')); + 4 + """) case class Size(child: Expression) extends UnaryExpression with ExpectsInputTypes { override def dataType: DataType = IntegerType override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection(ArrayType, MapType)) @@ -60,7 +64,11 @@ case class Size(child: Expression) extends UnaryExpression with ExpectsInputType */ @ExpressionDescription( usage = "_FUNC_(map) - Returns an unordered array containing the keys of the map.", - extended = " > SELECT _FUNC_(map(1, 'a', 2, 'b'));\n [1,2]") + extended = """ + Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b')); + [1,2] + """) case class MapKeys(child: Expression) extends UnaryExpression with ExpectsInputTypes { @@ -84,7 +92,11 @@ case class MapKeys(child: Expression) */ @ExpressionDescription( usage = "_FUNC_(map) - Returns an unordered array containing the values of the map.", - extended = " > SELECT _FUNC_(map(1, 'a', 2, 'b'));\n [\"a\",\"b\"]") + extended = """ + Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b')); + ["a","b"] + """) case class MapValues(child: Expression) extends UnaryExpression with ExpectsInputTypes { @@ -109,8 +121,12 @@ case class MapValues(child: Expression) */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(array(obj1, obj2, ...), ascendingOrder) - Sorts the input array in ascending order according to the natural ordering of the array elements.", - extended = " > SELECT _FUNC_(array('b', 'd', 'c', 'a'), true);\n 'a', 'b', 'c', 'd'") + usage = "_FUNC_(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements.", + extended = """ + Examples: + > SELECT _FUNC_(array('b', 'd', 'c', 'a'), true); + ["a","b","c","d"] + """) // scalastyle:on line.size.limit case class SortArray(base: Expression, ascendingOrder: Expression) extends BinaryExpression with ExpectsInputTypes with CodegenFallback { 
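The updated `pmod` docs earlier in this diff show `pmod(10, 3)` returning 1 and `pmod(-10, 3)` returning 2, in contrast to the `%` remainder, whose sign follows the dividend. A minimal Python sketch of the two documented behaviors (the helper names are illustrative, not Spark APIs):

```python
def remainder(a, b):
    # SQL-style remainder: truncate the quotient toward zero,
    # so the result takes the sign of the dividend (like expr1 % expr2)
    return a - b * int(a / b)

def pmod(a, b):
    # Positive modulo as documented for pmod: shift a negative
    # remainder up by the divisor so the result is non-negative.
    r = remainder(a, b)
    return r + b if r < 0 else r
```

With these definitions, `remainder(-10, 3)` gives -1 while `pmod(-10, 3)` gives 2, matching the examples added in the hunk above.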
@@ -200,8 +216,12 @@ case class SortArray(base: Expression, ascendingOrder: Expression) * Checks if the array (left) has the element (right) */ @ExpressionDescription( - usage = "_FUNC_(array, value) - Returns TRUE if the array contains the value.", - extended = " > SELECT _FUNC_(array(1, 2, 3), 2);\n true") + usage = "_FUNC_(array, value) - Returns true if the array contains the value.", + extended = """ + Examples: + > SELECT _FUNC_(array(1, 2, 3), 2); + true + """) case class ArrayContains(left: Expression, right: Expression) extends BinaryExpression with ImplicitCastInputTypes { http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala index dbfb299..c9f3664 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala @@ -30,7 +30,12 @@ import org.apache.spark.unsafe.types.UTF8String * Returns an Array containing the evaluation of all children expressions. */ @ExpressionDescription( - usage = "_FUNC_(n0, ...) - Returns an array with the given elements.") + usage = "_FUNC_(expr, ...) - Returns an array with the given elements.", + extended = """ + Examples: + > SELECT _FUNC_(1, 2, 3); + [1,2,3] + """) case class CreateArray(children: Seq[Expression]) extends Expression { override def foldable: Boolean = children.forall(_.foldable) @@ -84,7 +89,12 @@ case class CreateArray(children: Seq[Expression]) extends Expression { * The children are a flatted sequence of kv pairs, e.g. (key1, value1, key2, value2, ...) 
*/ @ExpressionDescription( - usage = "_FUNC_(key0, value0, key1, value1...) - Creates a map with the given key/value pairs.") + usage = "_FUNC_(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs.", + extended = """ + Examples: + > SELECT _FUNC_(1.0, '2', 3.0, '4'); + {1.0:"2",3.0:"4"} + """) case class CreateMap(children: Seq[Expression]) extends Expression { lazy val keys = children.indices.filter(_ % 2 == 0).map(children) lazy val values = children.indices.filter(_ % 2 != 0).map(children) @@ -276,7 +286,12 @@ trait CreateNamedStructLike extends Expression { */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.") + usage = "_FUNC_(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.", + extended = """ + Examples: + > SELECT _FUNC_("a", 1, "b", 2, "c", 3); + {"a":1,"b":2,"c":3} + """) // scalastyle:on line.size.limit case class CreateNamedStruct(children: Seq[Expression]) extends CreateNamedStructLike { @@ -329,8 +344,12 @@ case class CreateNamedStructUnsafe(children: Seq[Expression]) extends CreateName */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(text[, pairDelim, keyValueDelim]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.", - extended = """ > SELECT _FUNC_('a:1,b:2,c:3',',',':');\n map("a":"1","b":"2","c":"3") """) + usage = "_FUNC_(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. 
Default delimiters are ',' for `pairDelim` and ':' for `keyValueDelim`.", + extended = """ + Examples: + > SELECT _FUNC_('a:1,b:2,c:3', ',', ':'); + map("a":"1","b":"2","c":"3") + """) // scalastyle:on line.size.limit case class StringToMap(text: Expression, pairDelim: Expression, keyValueDelim: Expression) extends TernaryExpression with CodegenFallback with ExpectsInputTypes { http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala index 71d4e9a..a7d9e2d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala @@ -24,7 +24,12 @@ import org.apache.spark.sql.types._ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(expr1,expr2,expr3) - If expr1 is TRUE then IF() returns expr2; otherwise it returns expr3.") + usage = "_FUNC_(expr1, expr2, expr3) - If `expr1` evaluates to true, then returns `expr2`; otherwise returns `expr3`.", + extended = """ + Examples: + > SELECT _FUNC_(1 < 2, 'a', 'b'); + a + """) // scalastyle:on line.size.limit case class If(predicate: Expression, trueValue: Expression, falseValue: Expression) extends Expression { @@ -162,7 +167,7 @@ abstract class CaseWhenBase( */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e.") + usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; when `expr3` = 
true, return `expr4`; else return `expr5`.") // scalastyle:on line.size.limit case class CaseWhen( val branches: Seq[(Expression, Expression)], http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index 05bfa7d..9cec6be 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -75,8 +75,12 @@ case class CurrentTimestamp() extends LeafExpression with CodegenFallback { * Adds a number of days to startdate. */ @ExpressionDescription( - usage = "_FUNC_(start_date, num_days) - Returns the date that is num_days after start_date.", - extended = "> SELECT _FUNC_('2016-07-30', 1);\n '2016-07-31'") + usage = "_FUNC_(start_date, num_days) - Returns the date that is `num_days` after `start_date`.", + extended = """ + Examples: + > SELECT _FUNC_('2016-07-30', 1); + 2016-07-31 + """) case class DateAdd(startDate: Expression, days: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -104,8 +108,12 @@ case class DateAdd(startDate: Expression, days: Expression) * Subtracts a number of days to startdate. 
*/ @ExpressionDescription( - usage = "_FUNC_(start_date, num_days) - Returns the date that is num_days before start_date.", - extended = "> SELECT _FUNC_('2016-07-30', 1);\n '2016-07-29'") + usage = "_FUNC_(start_date, num_days) - Returns the date that is `num_days` before `start_date`.", + extended = """ + Examples: + > SELECT _FUNC_('2016-07-30', 1); + 2016-07-29 + """) case class DateSub(startDate: Expression, days: Expression) extends BinaryExpression with ImplicitCastInputTypes { override def left: Expression = startDate @@ -129,8 +137,12 @@ case class DateSub(startDate: Expression, days: Expression) } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the hour component of the string/timestamp/interval.", - extended = "> SELECT _FUNC_('2009-07-30 12:58:59');\n 12") + usage = "_FUNC_(timestamp) - Returns the hour component of the string/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2009-07-30 12:58:59'); + 12 + """) case class Hour(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(TimestampType) @@ -148,8 +160,12 @@ case class Hour(child: Expression) extends UnaryExpression with ImplicitCastInpu } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the minute component of the string/timestamp/interval.", - extended = "> SELECT _FUNC_('2009-07-30 12:58:59');\n 58") + usage = "_FUNC_(timestamp) - Returns the minute component of the string/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2009-07-30 12:58:59'); + 58 + """) case class Minute(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(TimestampType) @@ -167,8 +183,12 @@ case class Minute(child: Expression) extends UnaryExpression with ImplicitCastIn } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the second component of the string/timestamp/interval.", - extended = "> SELECT _FUNC_('2009-07-30 
12:58:59');\n 59") + usage = "_FUNC_(timestamp) - Returns the second component of the string/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2009-07-30 12:58:59'); + 59 + """) case class Second(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(TimestampType) @@ -186,8 +206,12 @@ case class Second(child: Expression) extends UnaryExpression with ImplicitCastIn } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the day of year of date/timestamp.", - extended = "> SELECT _FUNC_('2016-04-09');\n 100") + usage = "_FUNC_(date) - Returns the day of year of the date/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2016-04-09'); + 100 + """) case class DayOfYear(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(DateType) @@ -205,8 +229,12 @@ case class DayOfYear(child: Expression) extends UnaryExpression with ImplicitCas } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the year component of the date/timestamp/interval.", - extended = "> SELECT _FUNC_('2016-07-30');\n 2016") + usage = "_FUNC_(date) - Returns the year component of the date/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2016-07-30'); + 2016 + """) case class Year(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(DateType) @@ -224,7 +252,12 @@ case class Year(child: Expression) extends UnaryExpression with ImplicitCastInpu } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the quarter of the year for date, in the range 1 to 4.") + usage = "_FUNC_(date) - Returns the quarter of the year for date, in the range 1 to 4.", + extended = """ + Examples: + > SELECT _FUNC_('2016-08-31'); + 3 + """) case class Quarter(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: 
Seq[AbstractDataType] = Seq(DateType) @@ -242,8 +275,12 @@ case class Quarter(child: Expression) extends UnaryExpression with ImplicitCastI } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the month component of the date/timestamp/interval", - extended = "> SELECT _FUNC_('2016-07-30');\n 7") + usage = "_FUNC_(date) - Returns the month component of the date/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2016-07-30'); + 7 + """) case class Month(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(DateType) @@ -261,8 +298,12 @@ case class Month(child: Expression) extends UnaryExpression with ImplicitCastInp } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the day of month of date/timestamp, or the day of interval.", - extended = "> SELECT _FUNC_('2009-07-30');\n 30") + usage = "_FUNC_(date) - Returns the day of month of the date/timestamp.", + extended = """ + Examples: + > SELECT _FUNC_('2009-07-30'); + 30 + """) case class DayOfMonth(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(DateType) @@ -280,8 +321,12 @@ case class DayOfMonth(child: Expression) extends UnaryExpression with ImplicitCa } @ExpressionDescription( - usage = "_FUNC_(param) - Returns the week of the year of the given date.", - extended = "> SELECT _FUNC_('2008-02-20');\n 8") + usage = "_FUNC_(date) - Returns the week of the year of the given date.", + extended = """ + Examples: + > SELECT _FUNC_('2008-02-20'); + 8 + """) case class WeekOfYear(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def inputTypes: Seq[AbstractDataType] = Seq(DateType) @@ -320,8 +365,12 @@ case class WeekOfYear(child: Expression) extends UnaryExpression with ImplicitCa // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(date/timestamp/string, fmt) - Converts a date/timestamp/string 
to a value of string in the format specified by the date format fmt.", - extended = "> SELECT _FUNC_('2016-04-08', 'y')\n '2016'") + usage = "_FUNC_(timestamp, fmt) - Converts `timestamp` to a value of string in the format specified by the date format `fmt`.", + extended = """ + Examples: + > SELECT _FUNC_('2016-04-08', 'y'); + 2016 + """) // scalastyle:on line.size.limit case class DateFormatClass(left: Expression, right: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -351,7 +400,12 @@ case class DateFormatClass(left: Expression, right: Expression) extends BinaryEx * Deterministic version of [[UnixTimestamp]], must have at least one parameter. */ @ExpressionDescription( - usage = "_FUNC_(date[, pattern]) - Returns the UNIX timestamp of the give time.") + usage = "_FUNC_(expr[, pattern]) - Returns the UNIX timestamp of the given time.", + extended = """ + Examples: + > SELECT _FUNC_('2016-04-08', 'yyyy-MM-dd'); + 1460041200 + """) case class ToUnixTimestamp(timeExp: Expression, format: Expression) extends UnixTime { override def left: Expression = timeExp override def right: Expression = format @@ -374,7 +428,14 @@ case class ToUnixTimestamp(timeExp: Expression, format: Expression) extends Unix * second parameter. */ @ExpressionDescription( - usage = "_FUNC_([date[, pattern]]) - Returns the UNIX timestamp of current or specified time.") + usage = "_FUNC_([expr[, pattern]]) - Returns the UNIX timestamp of current or specified time.", + extended = """ + Examples: + > SELECT _FUNC_(); + 1476884637 + > SELECT _FUNC_('2016-04-08', 'yyyy-MM-dd'); + 1460041200 + """) case class UnixTimestamp(timeExp: Expression, format: Expression) extends UnixTime { override def left: Expression = timeExp override def right: Expression = format @@ -497,8 +558,12 @@ abstract class UnixTime extends BinaryExpression with ExpectsInputTypes { * Note that hive Language Manual says it returns 0 if fail, but in fact it returns null.
*/ @ExpressionDescription( - usage = "_FUNC_(unix_time, format) - Returns unix_time in the specified format", - extended = "> SELECT _FUNC_(0, 'yyyy-MM-dd HH:mm:ss');\n '1970-01-01 00:00:00'") + usage = "_FUNC_(unix_time, format) - Returns `unix_time` in the specified `format`.", + extended = """ + Examples: + > SELECT _FUNC_(0, 'yyyy-MM-dd HH:mm:ss'); + 1970-01-01 00:00:00 + """) case class FromUnixTime(sec: Expression, format: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -586,7 +651,11 @@ case class FromUnixTime(sec: Expression, format: Expression) */ @ExpressionDescription( usage = "_FUNC_(date) - Returns the last day of the month which the date belongs to.", - extended = "> SELECT _FUNC_('2009-01-12');\n '2009-01-31'") + extended = """ + Examples: + > SELECT _FUNC_('2009-01-12'); + 2009-01-31 + """) case class LastDay(startDate: Expression) extends UnaryExpression with ImplicitCastInputTypes { override def child: Expression = startDate @@ -615,8 +684,12 @@ case class LastDay(startDate: Expression) extends UnaryExpression with ImplicitC */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.", - extended = "> SELECT _FUNC_('2015-01-14', 'TU');\n '2015-01-20'") + usage = "_FUNC_(start_date, day_of_week) - Returns the first date which is later than `start_date` and named as indicated.", + extended = """ + Examples: + > SELECT _FUNC_('2015-01-14', 'TU'); + 2015-01-20 + """) // scalastyle:on line.size.limit case class NextDay(startDate: Expression, dayOfWeek: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -701,11 +774,17 @@ case class TimeAdd(start: Expression, interval: Expression) } /** - * Assumes given timestamp is UTC and converts to given timezone. 
+ * Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp + * that corresponds to the same time of day in the given timezone. */ // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone.") + usage = "_FUNC_(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp that corresponds to the same time of day in the given timezone.", + extended = """ + Examples: + > SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul'); + 2016-08-31 09:00:00 + """) // scalastyle:on line.size.limit case class FromUTCTimestamp(left: Expression, right: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -784,9 +863,15 @@ case class TimeSub(start: Expression, interval: Expression) /** * Returns the date that is num_months after start_date. */ +// scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(start_date, num_months) - Returns the date that is num_months after start_date.", - extended = "> SELECT _FUNC_('2016-08-31', 1);\n '2016-09-30'") + usage = "_FUNC_(start_date, num_months) - Returns the date that is `num_months` after `start_date`.", + extended = """ + Examples: + > SELECT _FUNC_('2016-08-31', 1); + 2016-09-30 + """) +// scalastyle:on line.size.limit case class AddMonths(startDate: Expression, numMonths: Expression) extends BinaryExpression with ImplicitCastInputTypes { @@ -814,9 +899,15 @@ case class AddMonths(startDate: Expression, numMonths: Expression) /** * Returns number of months between dates date1 and date2. 
  */
+// scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(date1, date2) - returns number of months between dates date1 and date2.",
-  extended = "> SELECT _FUNC_('1997-02-28 10:30:00', '1996-10-30');\n 3.94959677")
+  usage = "_FUNC_(timestamp1, timestamp2) - Returns the number of months between `timestamp1` and `timestamp2`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('1997-02-28 10:30:00', '1996-10-30');
+       3.94959677
+  """)
+// scalastyle:on line.size.limit
 case class MonthsBetween(date1: Expression, date2: Expression)
   extends BinaryExpression with ImplicitCastInputTypes {
@@ -842,11 +933,17 @@ case class MonthsBetween(date1: Expression, date2: Expression)
 }

 /**
- * Assumes given timestamp is in given timezone and converts to UTC.
+ * Given a timestamp, which corresponds to a certain time of day in the given timezone, returns
+ * another timestamp that corresponds to the same time of day in UTC.
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC.")
+  usage = "_FUNC_(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in the given timezone, returns another timestamp that corresponds to the same time of day in UTC.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2016-08-31', 'Asia/Seoul');
+       2016-08-30 15:00:00
+  """)
 // scalastyle:on line.size.limit
 case class ToUTCTimestamp(left: Expression, right: Expression)
   extends BinaryExpression with ImplicitCastInputTypes {
@@ -897,8 +994,12 @@ case class ToUTCTimestamp(left: Expression, right: Expression)
  * Returns the date part of a timestamp or string.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Extracts the date part of the date or datetime expression expr.",
-  extended = "> SELECT _FUNC_('2009-07-30 04:17:52');\n '2009-07-30'")
+  usage = "_FUNC_(expr) - Extracts the date part of the date or timestamp expression `expr`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2009-07-30 04:17:52');
+       2009-07-30
+  """)
 case class ToDate(child: Expression) extends UnaryExpression with ImplicitCastInputTypes {

   // Implicit casting of spark will accept string in both date and timestamp format, as
@@ -921,8 +1022,14 @@ case class ToDate(child: Expression) extends UnaryExpression with ImplicitCastIn
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(date, fmt) - Returns returns date with the time portion of the day truncated to the unit specified by the format model fmt.",
-  extended = "> SELECT _FUNC_('2009-02-12', 'MM')\n '2009-02-01'\n> SELECT _FUNC_('2015-10-27', 'YEAR');\n '2015-01-01'")
+  usage = "_FUNC_(date, fmt) - Returns `date` with the time portion of the day truncated to the unit specified by the format model `fmt`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2009-02-12', 'MM');
+       2009-02-01
+      > SELECT _FUNC_('2015-10-27', 'YEAR');
+       2015-01-01
+  """)
 // scalastyle:on line.size.limit
 case class TruncDate(date: Expression, format: Expression)
   extends BinaryExpression with ImplicitCastInputTypes {
@@ -994,8 +1101,12 @@ case class TruncDate(date: Expression, format: Expression)
  * Returns the number of days from startDate to endDate.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(date1, date2) - Returns the number of days between date1 and date2.",
-  extended = "> SELECT _FUNC_('2009-07-30', '2009-07-31');\n 1")
+  usage = "_FUNC_(date1, date2) - Returns the number of days between `date1` and `date2`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('2009-07-30', '2009-07-31');
+       1
+  """)
 case class DateDiff(endDate: Expression, startDate: Expression)
   extends BinaryExpression with ImplicitCastInputTypes {

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
index f74208f..d042bfb 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
@@ -102,8 +102,13 @@ case class UserDefinedGenerator(
  * }}}
  */
 @ExpressionDescription(
-  usage = "_FUNC_(n, v1, ..., vk) - Separate v1, ..., vk into n rows.",
-  extended = "> SELECT _FUNC_(2, 1, 2, 3);\n [1,2]\n [3,null]")
+  usage = "_FUNC_(n, expr1, ..., exprk) - Separates `expr1`, ..., `exprk` into `n` rows.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(2, 1, 2, 3);
+       1  2
+       3  NULL
+  """)
 case class Stack(children: Seq[Expression])
     extends Expression with Generator with CodegenFallback {
@@ -226,8 +231,13 @@ abstract class ExplodeBase(child: Expression, position: Boolean)
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
-  extended = "> SELECT _FUNC_(array(10,20));\n 10\n 20")
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple rows, or the elements of map `expr` into multiple rows and columns.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(array(10, 20));
+       10
+       20
+  """)
 // scalastyle:on line.size.limit
 case class Explode(child: Expression) extends ExplodeBase(child, position = false)
@@ -242,8 +252,13 @@ case class Explode(child: Expression) extends ExplodeBase(child, position = fals
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows with positions, or the elements of a map into multiple rows and columns with positions.",
-  extended = "> SELECT _FUNC_(array(10,20));\n 0\t10\n 1\t20")
+  usage = "_FUNC_(expr) - Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(array(10, 20));
+       0  10
+       1  20
+  """)
 // scalastyle:on line.size.limit
 case class PosExplode(child: Expression) extends ExplodeBase(child, position = true)
@@ -251,8 +266,13 @@ case class PosExplode(child: Expression) extends ExplodeBase(child, position = t
  * Explodes an array of structs into a table.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(a) - Explodes an array of structs into a table.",
-  extended = "> SELECT _FUNC_(array(struct(1, 'a'), struct(2, 'b')));\n [1,a]\n [2,b]")
+  usage = "_FUNC_(expr) - Explodes an array of structs into a table.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(array(struct(1, 'a'), struct(2, 'b')));
+       1  a
+       2  b
+  """)
 case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback {

   override def checkInputDataTypes(): TypeCheckResult = child.dataType match {

http://git-wip-us.apache.org/repos/asf/spark/blob/7eb2ca8e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
index 244a5a3..e034735 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
@@ -110,7 +110,12 @@ private[this] object SharedFactory {
  * of the extracted json object. It will return null if the input json string is invalid.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(json_txt, path) - Extract a json object from path")
+  usage = "_FUNC_(json_txt, path) - Extracts a json object from `path`.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('{"a":"b"}', '$.a');
+       b
+  """)
 case class GetJsonObject(json: Expression, path: Expression)
   extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
@@ -326,7 +331,12 @@ case class GetJsonObject(json: Expression, path: Expression)

 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(jsonStr, p1, p2, ..., pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string.")
+  usage = "_FUNC_(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_('{"a":1, "b":2}', 'a', 'b');
+       1  2
+  """)
 // scalastyle:on line.size.limit
 case class JsonTuple(children: Seq[Expression])
   extends Generator with CodegenFallback {
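The from_utc_timestamp / to_utc_timestamp examples documented in the diff can be sanity-checked outside Spark. The sketch below is not Spark's implementation; it is a rough Python illustration (using the standard-library `zoneinfo` module) that treats the bare date `'2016-08-31'` as midnight, the way Spark casts it to a timestamp, and reproduces the two example outputs:

```python
# Rough check of the documented examples, not Spark's implementation.
from datetime import datetime
from zoneinfo import ZoneInfo

def from_utc_timestamp(ts: str, tz: str) -> str:
    """Interpret a naive timestamp as UTC; render the same instant as wall-clock time in `tz`."""
    dt = datetime.fromisoformat(ts).replace(tzinfo=ZoneInfo("UTC"))
    return dt.astimezone(ZoneInfo(tz)).strftime("%Y-%m-%d %H:%M:%S")

def to_utc_timestamp(ts: str, tz: str) -> str:
    """Interpret a naive timestamp as wall-clock time in `tz`; render the same instant as UTC."""
    dt = datetime.fromisoformat(ts).replace(tzinfo=ZoneInfo(tz))
    return dt.astimezone(ZoneInfo("UTC")).strftime("%Y-%m-%d %H:%M:%S")

# Seoul is UTC+9 with no DST, matching the Examples added above.
print(from_utc_timestamp("2016-08-31", "Asia/Seoul"))  # 2016-08-31 09:00:00
print(to_utc_timestamp("2016-08-31", "Asia/Seoul"))    # 2016-08-30 15:00:00
```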