Ohad Raviv created SPARK-26645:
----------------------------------

             Summary: CSV infer schema bug infers decimal(9,-1)
                 Key: SPARK-26645
                 URL: https://issues.apache.org/jira/browse/SPARK-26645
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Ohad Raviv
We have a file /tmp/t1/file.txt that contains only one line: "1.18927098E9". Running:

{code:python}
df = spark.read.csv('/tmp/t1', header=False, inferSchema=True, sep='\t')
print(df.dtypes)
{code}

fails with:

{noformat}
ValueError: Could not parse datatype: decimal(9,-1)
{noformat}

I'm not sure where the bug is: in inferSchema or in dtypes. I saw in the code (CSVInferSchema.scala) that a decimal with a negative scale is considered legal:

{code:scala}
if (bigDecimal.scale <= 0) {
  // `DecimalType` conversion can fail when
  // 1. The precision is bigger than 38.
  // 2. scale is bigger than precision.
  DecimalType(bigDecimal.precision, bigDecimal.scale)
}
{code}

but what does it mean?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
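For context on where decimal(9,-1) comes from: Java's BigDecimal represents "1.18927098E9" as the unscaled value 118927098 times 10^1, i.e. 9 significant digits with scale -1 (a negative scale means trailing zeros before the decimal point), and the snippet above passes precision and scale straight into DecimalType. A minimal sketch mirroring that BigDecimal behavior with Python's stdlib decimal module (an illustration only, not Spark code):

{code:python}
from decimal import Decimal

# "1.18927098E9" is stored as coefficient 118927098 with exponent 1,
# analogous to BigDecimal's unscaledValue x 10^(-scale).
d = Decimal("1.18927098E9")
sign, digits, exponent = d.as_tuple()

precision = len(digits)  # number of significant digits -> 9
scale = -exponent        # BigDecimal-style scale -> -1

print(precision, scale)  # 9 -1
{code}

So the inferred type decimal(9,-1) is internally consistent with BigDecimal's model; the ValueError suggests the Python side's datatype parser simply does not accept a negative scale string.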