cloud-fan commented on code in PR #36066:
URL: https://github.com/apache/spark/pull/36066#discussion_r843997726


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala:
##########
@@ -22,48 +22,58 @@ import java.util.Locale
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, 
CodeGenerator, ExprCode}
 import org.apache.spark.sql.catalyst.expressions.codegen.Block.BlockHelper
-import org.apache.spark.sql.catalyst.util.NumberFormatter
+import org.apache.spark.sql.catalyst.util.ToNumberParser
 import org.apache.spark.sql.types.{DataType, StringType}
 import org.apache.spark.unsafe.types.UTF8String
 
 /**
- * A function that converts string to numeric.
+ * A function that converts strings to decimal values, returning an exception 
if the input string
+ * fails to match the format string.
  */
 @ExpressionDescription(
   usage = """
-     _FUNC_(strExpr, formatExpr) - Convert `strExpr` to a number based on the 
`formatExpr`.
-       The format can consist of the following characters:
-         '0' or '9':  digit position
-         '.' or 'D':  decimal point (only allowed once)
-         ',' or 'G':  group (thousands) separator
-         '-' or 'S':  sign anchored to number (only allowed once)
-         '$':  value with a leading dollar sign (only allowed once)
+     _FUNC_(expr, fmt) - Convert string 'expr' to a number based on the string 
format 'fmt'.
+       Throws an exception if the conversion fails. The format can consist of 
the following
+       characters, case insensitive:
+         '0' or '9':  Specifies an expected digit between 0 and 9. A 0 to the 
left of the decimal
+           point indicates that 'expr' must have at least as many digits. A 
leading 9 indicates
+           that 'expr' may omit these digits. 'expr' must not be larger than 
the number of digits

Review Comment:
   We can improve the doc a little bit. For a sequence of `0` or `9` or a mix 
of them, the behavior is decided by the left-most digit if the sequence is 
before the decimal point, or by the right-most digit if the sequence is after 
the decimal point.
   
   A simpler idea is to forbid mixing `0` and `9`. What do you think? @srielau 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to