Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Hyukjin Kwon
Thanks all for feedback.
> 1. when merging NullType with another type, the result should always be that type.
> 2. when merging StringType with another type, the result should always be StringType.
> 3. when merging integral types, the priority from high to low: DecimalType, LongType, …

Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Reynold Xin
Most of those thoughts from Wenchen make sense to me. Rather than a list, can we create a table? X-axis is data type, and Y-axis is also data type, and the intersection explains what the coerced type is? Can we also look at what Hive and standard SQL (Postgres?) do? Also, this shouldn't be …

Re: [discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Wenchen Fan
My 2 cents:
1. when merging NullType with another type, the result should always be that type.
2. when merging StringType with another type, the result should always be StringType.
3. when merging integral types, the priority from high to low: DecimalType, LongType, IntegerType. This is because …
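The three merge rules above can be sketched as a pairwise function. This is a minimal Python simulation of the rules as stated in this message, not Spark's actual implementation; the type names mirror Spark SQL's, and the priority ordering is taken directly from rule 3.

```python
# Illustrative simulation of the proposed pairwise type-merge rules.
# Rule 3's priority, high to low: DecimalType, LongType, IntegerType.
PRIORITY = ["IntegerType", "LongType", "DecimalType"]

def merge_types(a: str, b: str) -> str:
    # Rule 1: NullType merged with another type yields that type.
    if a == "NullType":
        return b
    if b == "NullType":
        return a
    # Rule 2: StringType merged with another type yields StringType.
    if a == "StringType" or b == "StringType":
        return "StringType"
    # Rule 3: among integral/decimal types, take the higher-priority one.
    return a if PRIORITY.index(a) >= PRIORITY.index(b) else b

print(merge_types("NullType", "LongType"))        # LongType
print(merge_types("IntegerType", "DecimalType"))  # DecimalType
print(merge_types("LongType", "StringType"))      # StringType
```

Reynold's suggested coercion table would be the cross product of this function over all supported types.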

[discuss][SQL] Partitioned column type inference proposal

2017-11-14 Thread Hyukjin Kwon
Hi dev, I would like to post a proposal about partitioned column type inference (related to the 'spark.sql.sources.partitionColumnTypeInference.enabled' configuration). This thread focuses on the type coercion (finding the common type) in partitioned columns, in particular, when the different form …
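For context, partition column type inference tries to parse each partition directory value (e.g. `part=42` in a path) as progressively wider types before falling back to string. The sketch below is an illustrative, simplified Python simulation of that idea, not Spark's actual code; Spark's real inference also considers types such as dates and timestamps, which are omitted here.

```python
from decimal import Decimal, InvalidOperation

def infer_partition_value_type(raw: str) -> str:
    # Simplified, hypothetical inference order: integer -> long -> decimal -> string.
    try:
        v = int(raw)
        # 32-bit range picks IntegerType, wider values fall to LongType.
        return "IntegerType" if -2**31 <= v < 2**31 else "LongType"
    except ValueError:
        pass
    try:
        Decimal(raw)
        return "DecimalType"
    except InvalidOperation:
        pass
    return "StringType"

print(infer_partition_value_type("42"))            # IntegerType
print(infer_partition_value_type("10000000000"))   # LongType
print(infer_partition_value_type("1.5"))           # DecimalType
print(infer_partition_value_type("abc"))           # StringType
```

The type-coercion question this thread raises appears when values in different partition directories infer to different types and a single common column type must be chosen.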