Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23201#discussion_r240038837

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala ---
    @@ -121,7 +122,26 @@ private[sql] class JsonInferSchema(options: JSONOptions) extends Serializable {
               DecimalType(bigDecimal.precision, bigDecimal.scale)
             }
             decimalTry.getOrElse(StringType)
    -      case VALUE_STRING => StringType
    +      case VALUE_STRING =>
    +        val stringValue = parser.getText
    --- End diff --

I didn't mean type inference in partition values, but you are probably right: we should follow the same logic for schema inference in datasources and for partition value types. I'm just wondering how it works now. This code:

    https://github.com/apache/spark/blob/5a140b7844936cf2b65f08853b8cfd8c499d4f13/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L474-L482

and this:

    https://github.com/apache/spark/blob/f982ca07e80074bdc1e3b742c5e21cf368e4ede2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala#L163

can use different timestamp patterns, or is it supposed to work only with default settings? Maybe `inferPartitionColumnValue` should ask the datasource to infer date/timestamp types?
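To make the concern concrete, here is a minimal, self-contained Scala sketch (not code from Spark itself; the pattern strings are hypothetical stand-ins for the partition-inference default and a user-supplied `timestampFormat` option, and parsing is simplified to java.time) showing how two inference paths with different patterns can disagree about the same value:

    import java.time.LocalDateTime
    import java.time.format.DateTimeFormatter
    import scala.util.Try

    object TimestampPatternMismatch {
      // Hypothetical stand-in for the fixed pattern used when inferring partition values.
      val partitionPattern = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

      // Hypothetical stand-in for a user-supplied `timestampFormat` datasource option.
      val dataSourcePattern = DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm")

      // A value "looks like" a timestamp under a pattern if parsing succeeds.
      def looksLikeTimestamp(fmt: DateTimeFormatter, s: String): Boolean =
        Try(LocalDateTime.parse(s, fmt)).isSuccess

      def main(args: Array[String]): Unit = {
        val value = "31/12/2018 23:59"
        println(looksLikeTimestamp(dataSourcePattern, value)) // true  -> would infer TimestampType
        println(looksLikeTimestamp(partitionPattern, value))  // false -> would fall back to StringType
      }
    }

Under non-default settings like these, the datasource's schema inference would see a timestamp while partition value inference would fall back to a string, which is exactly the divergence being asked about.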