Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19792#discussion_r153130303

    --- Diff: python/pyspark/sql/types.py ---
    @@ -1108,19 +1109,23 @@ def _has_nulltype(dt):
             return isinstance(dt, NullType)

    -def _merge_type(a, b):
    +def _merge_type(a, b, path=''):
    --- End diff --

Yup, I agree with it for the current status. It's kind of difficult to come up with a good message format in such cases. I actually gave up on a pretty format and just chose prose before (in PR #18521). To be honest, I still don't have a good idea for showing the nested structure in the error message.

I was thinking of one of: prose, a piece of code (like `schema['f1'].dataType.elementType.keyType` instead of https://github.com/apache/spark/pull/19792#issuecomment-346681841), or something pretty like `printSchema()`. For now, maybe I am thinking of referring to other formats in Spark or elsewhere (like Pandas), or to the piece-of-code style.

So, to cut it short, here is what is on my mind for the example in https://github.com/apache/spark/pull/19792#issuecomment-346681841:

1. Prose

```
TypeError: key in map field in array element in field f1: Can not blabla
```

2. Piece of code with `StructType`

```
TypeError: schema(?)['f1'].dataType.elementType.keyType: Can not blabla
TypeError: struct(?)['f1'].dataType.elementType.keyType: Can not blabla
```

3. `printSchema()`

```
TypeError:
root
 |-- f1: array (nullable = false)
 |    |-- element: map (containsNull = false)
 |    |    |-- key: integer*
 |    |    |-- value: long (valueContainsNull = false)
: *Can not blabla
```
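
As a rough illustration of options 1 and 2, the sketch below renders one nested path in both styles. It is not code from the PR: the `(kind, name)` step tuples and the helpers `path_as_prose` and `path_as_code` are hypothetical, and the prose wording only approximates the message shown above. The attribute names `dataType`, `elementType`, `keyType` and `valueType` are the existing PySpark ones.

```python
# Hypothetical helpers for illustration only; not part of pyspark.
# A path is a list of (kind, name) steps, outermost first.
# kind is "field", "element", "key" or "value"; name is used only for "field".


def path_as_prose(steps):
    """Option 1: describe the path in words, innermost step first."""
    words = {"element": "array element", "key": "map key", "value": "map value"}
    parts = []
    for kind, name in reversed(steps):
        parts.append("field %s" % name if kind == "field" else words[kind])
    return " in ".join(parts)


def path_as_code(steps, root="schema"):
    """Option 2: build the expression that reaches the type from the schema."""
    attrs = {"element": ".elementType", "key": ".keyType", "value": ".valueType"}
    out = root
    for kind, name in steps:
        out += "[%r].dataType" % name if kind == "field" else attrs[kind]
    return out


# The key type of the map inside the array stored in field f1.
steps = [("field", "f1"), ("element", None), ("key", None)]
print("TypeError: %s: Can not blabla" % path_as_prose(steps))
print("TypeError: %s: Can not blabla" % path_as_code(steps))
```

One possible argument for the second style is that the printed expression could be pasted back into a session to inspect the offending type, whereas the prose form has to be read and followed by hand.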