Optimize the type conversion of spark array function and map function in calcite

Cancai Cai Mon, 15 Apr 2024 08:11:20 -0700

Hi Calcite community,

Recently, I have been testing Spark's map and array related functions in
Calcite. I found that in some cases Spark's type conversion behaves a
little differently from what we expect.


For example, Spark reports that a map with integer keys contains the key
2.0:

scala>  val df = spark.sql("select map_contains_key(map(1, 'a', 2, 'b'), 2.0)")
val df: org.apache.spark.sql.DataFrame = [map_contains_key(map(1, a, 2, b), 2.0): boolean]

scala> df.show()
+--------------------------------------+
|map_contains_key(map(1, a, 2, b), 2.0)|
+--------------------------------------+
|                                  true|
+--------------------------------------+

Mihai Budiu pointed out that Spark may be doing something like the
following, implicitly casting the map keys to the type of the lookup key:

map_contains_key(map<Double, String>((Double)1, 'a', (Double)2, 'b'), 2.0)

We can't say that Spark is wrong; we should adapt to this behavior. I was
thinking of adding an adjustTypeForMapContainsKey method to perform an
explicit conversion here. However, this situation is unlikely to be limited
to map_contains_key: we cannot guarantee that other related functions, such
as map_concat, do not have similar problems. Therefore, we should find out
what these functions have in common in their type conversion and
encapsulate that in a unified method, instead of adding a similar adjust
method for each function.
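
To make the idea of a unified method more concrete, here is a minimal
sketch in plain Java. It is not Calcite code: SimpleType, widen and
commonComponentType are hypothetical names, and the widening order is a
simplifying assumption that only mirrors the INTEGER-to-DOUBLE case above,
not Spark's full coercion rules. In Calcite itself this role would more
likely be played by RelDataType together with
RelDataTypeFactory#leastRestrictive.

```java
class CollectionTypeCoercionSketch {
  /** Hypothetical, simplified types, ordered from narrowest to widest (an assumption). */
  enum SimpleType { INTEGER, BIGINT, DOUBLE }

  /** Returns the wider of two types under the toy ordering above. */
  static SimpleType widen(SimpleType a, SimpleType b) {
    return a.ordinal() >= b.ordinal() ? a : b;
  }

  /**
   * Shared helper: the type that both the collection's component (map key or
   * array element) and the probe argument would be cast to.
   */
  static SimpleType commonComponentType(SimpleType componentType, SimpleType probeType) {
    return widen(componentType, probeType);
  }

  /** map_contains_key(map<K, V>, key): delegates instead of having its own adjust method. */
  static SimpleType keyTypeForMapContainsKey(SimpleType mapKeyType, SimpleType probeType) {
    return commonComponentType(mapKeyType, probeType);
  }

  /** array_contains(array<E>, value): reuses the same helper. */
  static SimpleType elementTypeForArrayContains(SimpleType elementType, SimpleType probeType) {
    return commonComponentType(elementType, probeType);
  }

  public static void main(String[] args) {
    // map(1, 'a', 2, 'b') has INTEGER keys; the probe 2.0 is modeled here as DOUBLE.
    System.out.println(
        keyTypeForMapContainsKey(SimpleType.INTEGER, SimpleType.DOUBLE)); // prints DOUBLE
    System.out.println(
        elementTypeForArrayContains(SimpleType.DOUBLE, SimpleType.INTEGER)); // prints DOUBLE
  }
}
```

The point of the sketch is only the shape: each function delegates to one
shared helper, so a fix to the coercion rule applies to all of them at
once.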

I think this should be done in three steps.

① Test the various cases of the map and array functions in Spark, and
raise a Jira issue wherever Calcite's behavior is inconsistent with
Spark's.

② Summarize the characteristics that these functions share and find out
whether they are related.

③ For the shared characteristics, use a single method to encapsulate the
type conversion.

The above are my personal thoughts. I feel that this approach would make
the Calcite code easier to maintain.

Finally, thank you for reading.

Best wishes,

Cancai Cai
