Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Hyukjin Kwon
Thanks all for feedback.
> 1. when merging NullType with another type, the result should always be that type.
> 2. when merging StringType with another type, the result should always be StringType.
> 3. when merging integral types, the priority from high to low: DecimalType, LongType, …

Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Reynold Xin
Most of those thoughts from Wenchen make sense to me. Rather than a list, can we create a table? X-axis is data type, and Y-axis is also data type, and the intersection explains what the coerced type is? Can we also look at what Hive and standard SQL (Postgres?) do? Also, this shouldn't be …

Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Wenchen Fan
My 2 cents:
1. when merging NullType with another type, the result should always be that type.
2. when merging StringType with another type, the result should always be StringType.
3. when merging integral types, the priority from high to low: DecimalType, LongType, IntegerType. This is because …
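The three merge rules above can be sketched as a pairwise function. This is a minimal Python simulation of the rules as stated in this message, not Spark's actual implementation; the type names mirror Spark SQL's, and the priority ordering is taken directly from rule 3.

```python
# Illustrative simulation of the proposed pairwise type-merge rules.
# Rule 3's priority, high to low: DecimalType, LongType, IntegerType.
PRIORITY = ["IntegerType", "LongType", "DecimalType"]

def merge_types(a: str, b: str) -> str:
    # Rule 1: NullType merged with another type yields that type.
    if a == "NullType":
        return b
    if b == "NullType":
        return a
    # Rule 2: StringType merged with another type yields StringType.
    if a == "StringType" or b == "StringType":
        return "StringType"
    # Rule 3: among integral/decimal types, take the higher-priority one.
    return a if PRIORITY.index(a) >= PRIORITY.index(b) else b

print(merge_types("NullType", "LongType"))        # LongType
print(merge_types("IntegerType", "DecimalType"))  # DecimalType
print(merge_types("LongType", "StringType"))      # StringType
```

Reynold's suggested coercion table would be the cross product of this function over all supported types.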

[discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Hyukjin Kwon
Hi dev, I would like to post a proposal about partitioned column type inference (related to the 'spark.sql.sources.partitionColumnTypeInference.enabled' configuration). This thread focuses on the type coercion (finding the common type) in partitioned columns, in particular, when the different form …
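For context, partition column type inference tries to parse each partition directory value (e.g. `part=42` in a path) as progressively wider types before falling back to string. The sketch below is an illustrative, simplified Python simulation of that idea, not Spark's actual code; Spark's real inference also considers types such as dates and timestamps, which are omitted here.

```python
from decimal import Decimal, InvalidOperation

def infer_partition_value_type(raw: str) -> str:
    # Simplified, hypothetical inference order: integer -> long -> decimal -> string.
    try:
        v = int(raw)
        # 32-bit range picks IntegerType, wider values fall to LongType.
        return "IntegerType" if -2**31 <= v < 2**31 else "LongType"
    except ValueError:
        pass
    try:
        Decimal(raw)
        return "DecimalType"
    except InvalidOperation:
        pass
    return "StringType"

print(infer_partition_value_type("42"))            # IntegerType
print(infer_partition_value_type("10000000000"))   # LongType
print(infer_partition_value_type("1.5"))           # DecimalType
print(infer_partition_value_type("abc"))           # StringType
```

The type-coercion question this thread raises appears when values in different partition directories infer to different types and a single common column type must be chosen.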