[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17435


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137773078
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
+"StructField does not have typeName."
--- End diff --

I am sorry. Could you add a space at the end `..eName."` -> `...eName. "`?
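
For reference, a minimal sketch of how the overridden `typeName` could look once the comments in this thread are addressed (trailing space, `typeName` spelling, and concatenated string literals instead of a backslash continuation); this is an illustrative reconstruction, not necessarily the exact code that was merged:

```python
# Standalone sketch; in pyspark.sql.types this method lives on StructField,
# which (in PySpark) inherits from DataType.
class StructField(object):
    def typeName(self):
        # StructField is not itself a data type, so fail loudly and point
        # the caller at the wrapped dataType instead.
        raise TypeError(
            "StructField does not have typeName. "
            "Use typeName on its type explicitly instead.")
```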


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137752617
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
+"StructField does not have typename. \
+You can use self.dataType.simpleString() instead.")
--- End diff --

I think it should suggest `typeName` on its dataType, because `typeName` is what 
was called.


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-08 Thread szalai1
Github user szalai1 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137749545
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
+"StructField does not have typename. \
+You can use self.dataType.simpleString() instead.")
--- End diff --

You mean `simpleString()` and not `typeName()`, right?


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137685731
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
+"StructField does not have typename. \
--- End diff --

Little nit: looks like a typo, typename -> typeName.


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137685629
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
+"StructField does not have typename. \
+You can use self.dataType.simpleString() instead.")
--- End diff --

I'd remove `self` here and just say something like ` use typeName() on its 
type explicitly ...`.


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-09-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r137684263
  
--- Diff: python/pyspark/sql/types.py ---
@@ -438,6 +438,11 @@ def toInternal(self, obj):
 def fromInternal(self, obj):
 return self.dataType.fromInternal(obj)
 
+def typeName(self):
+raise TypeError(
--- End diff --

Could we do like ...

```python
raise TypeError(
"..."
"...")
```
if it doesn't bother you much?


---




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-29 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108691360
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

+1


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108685381
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

Yup, I think we should still fix it, and overriding it is good enough.


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-29 Thread szalai1
Github user szalai1 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108683954
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

@viirya You are right. I will use it that way.  Thanks @HyukjinKwon !

But my bug report is still valid,  I think. Can I override the `typeName` 
function? 


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-28 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108593342
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

Btw, I don't think `i.typeName()` is a valid usage. We'd better let it throw 
an exception when calling `typeName` on `StructField`.

`i.dataType.typeName()` is a more reasonable call to me.
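
A quick illustration of the difference, reusing the two-column example from elsewhere in this thread (a sketch that assumes an active `SparkSession` named `spark`):

```python
from pyspark.sql import Row

df = spark.createDataFrame([[Row(a=1), 2]])

field = df.schema[0]                      # a StructField, not a DataType
print(field.dataType.typeName())          # 'struct' -- the type of the column
print(df.schema[1].dataType.typeName())   # 'long'
# field.typeName() is the call this thread ends up rejecting for StructField
```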




---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-28 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108593127
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

@szalai1 I think @HyukjinKwon's code snippets should address your request. 
Don't they?


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108437512
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

I guess it would not produce the correct fields if `struct` is printed for a 
`StructField`.

```python
>>> from pyspark.sql import Row
>>>
>>> df = spark.createDataFrame([[Row(a=1), 2]])
>>> cols = []
>>> for i in df.schema:
...     cols.append("`" + i.name + "`" + "\t" + i.typeName())
...
>>> print ",\n".join(cols)
`_1`	struct,
`_2`	struct
```

because actual types are as below:

```python
>>> df.schema.simpleString()
'struct<_1:struct<a:bigint>,_2:bigint>'
```

How about the one below?

```python
from pyspark.sql import Row

df = spark.createDataFrame([[Row(a=1), 2]])
cols = []
for i in df.schema:
    cols.append("`" + i.name + "`" + "\t" + i.dataType.simpleString())

print ",\n".join(cols)
```

prints

```
`_1`	struct<a:bigint>,
`_2`	bigint
```


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-28 Thread szalai1
Github user szalai1 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108348472
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

The reason I called `typeName` is that I wanted to generate a Hive table 
dynamically from data, and to do this I need the type of each column. 

```
>>> from pyspark.sql import SQLContext
>>> sqlContext = SQLContext(sc)
>>> df = sqlContext.read.json('path_to_a_json_doc')
>>> cols = []
>>> for i in df.schema:
...   cols.append("`" + i.name + "`" + "\t" +  i.typeName())
>>> ",\n".join(cols)
```

In the real case, I converted the type name to a Hive-compatible form. 
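
A sketch of that kind of DDL generation using `dataType.simpleString()` rather than `typeName()`; the table name is made up for illustration, and it reuses the `sqlContext` and JSON path from the snippet above:

```python
# Build a Hive-style CREATE TABLE statement from an inferred DataFrame schema.
df = sqlContext.read.json('path_to_a_json_doc')

cols = []
for field in df.schema:
    # simpleString() yields Hive-friendly type names, e.g. 'bigint', 'struct<a:bigint>'
    cols.append("`" + field.name + "` " + field.dataType.simpleString())

ddl = "CREATE TABLE my_table (\n  " + ",\n  ".join(cols) + "\n)"
print(ddl)
```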


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108103801
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

It is a valid call for `DataType`s, but I don't think it is against 
`StructField`. If you look at the Scala-side code, `StructField` is not a 
`DataType`; it seems the parent became `DataType` in Python for some reason 
in the PR I pointed out. In any case, it is ["a field inside a 
StructType"](https://github.com/apache/spark/blob/3694ba48f0db0f47baea4b005cdeef3f454b7329/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructField.scala#L26).

If you want to know the schema, you could simply

```python
>>> spark.range(1).schema[0].simpleString()
'id:bigint'
>>> spark.range(1).schema.simpleString()
'struct<id:bigint>'
```

Could you elaborate your use-case and why `simpleString` is not enough?



---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-27 Thread szalai1
Github user szalai1 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108100937
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

@viirya The way I found this bug: 
I wanted to figure out the schema of a dataset. I loaded it into a data 
frame and asked for its schema. Then I called `typeName` on each column. I do 
not know whether this is/was the best way to do it, but I think it is valid to 
call `typeName` against a `dataType` to get its real type. 




---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-26 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108082197
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

Yeah, I don't think it is valid to call `typeName` against a `StructField`. 
Actually, `StructField` is not a data type, strictly speaking...

I don't know why `StructField` inherits `DataType` in PySpark. In Scala, it 
does not.

Throwing an exception when calling `typeName` on `StructField` seems good 
enough, instead of a map like this.


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108079001
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

cc @holdenk and @viirya WDYT?


---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17435#discussion_r108078597
  
--- Diff: python/pyspark/sql/types.py ---
@@ -57,7 +57,25 @@ def __ne__(self, other):
 
 @classmethod
 def typeName(cls):
-return cls.__name__[:-4].lower()
+typeTypeNameMap = {"DataType": "data",
+   "NullType": "null",
+   "StringType": "string",
+   "BinaryType": "binary",
+   "BooleanType": "boolean",
+   "DateType": "date",
+   "TimestampType": "timestamp",
+   "DecimalType": "decimal",
+   "DoubleType": "double",
+   "FloatType": "float",
+   "ByteType": "byte",
+   "IntegerType": "integer",
+   "LongType": "long",
+   "ShortType": "short",
+   "ArrayType": "array",
+   "MapType": "map",
+   "StructField": "struct",
--- End diff --

It seems this problem only applies to `StructField`. Could we just 
override `typeName` to simply throw an exception? I think users are not 
supposed to call `typeName` against `StructField`, but rather `simpleString` 
against the type instance.

BTW, it seems a bit odd that it extends `DataType` at all. I guess some 
tests would probably break if we changed the parent, since it seems dependent 
on the parent, judging from https://github.com/apache/spark/pull/1598. So I 
guess the minimised fix would be just to override it.



---



[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

2017-03-26 Thread szalai1
GitHub user szalai1 opened a pull request:

https://github.com/apache/spark/pull/17435

[SPARK-20098][PYSPARK] dataType's typeName fix 

## What changes were proposed in this pull request?
The `typeName` classmethod has been fixed by using a type -> typeName map. 
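
For context, the behaviour being fixed comes from `cls.__name__[:-4].lower()`, which only makes sense for class names ending in `Type`; on a build without this change it gives, for example (illustrative REPL output):

```python
>>> from pyspark.sql.types import LongType, StructField
>>> LongType.typeName()      # 'LongType'[:-4].lower(), as intended
'long'
>>> StructField.typeName()   # 'StructField'[:-4].lower() -- only 'ield' is stripped
'structf'
```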

## How was this patch tested?
local build



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/szalai1/spark datatype-gettype-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17435.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17435


commit 00994e922be77d686c303d3d6bc3a484478d95d7
Author: Peter Szalai 
Date:   2017-03-26T17:51:10Z

getType fix

commit 933f3cb7f0b97f3bd79ea55bbd15dc27d0946c4b
Author: Peter Szalai 
Date:   2017-03-26T18:10:29Z

type




---