Re: df.dtypes -> pyspark.sql.types

2016-03-20 Thread Reynold Xin
We probably should have the alias. Is this still a problem on master branch?

On Wed, Mar 16, 2016 at 9:40 AM, Ruslan Dautkhanov wrote:
> Running following:
>
> #fix schema for gaid which should not be Double
> from pyspark.sql.types import *
> customSchema = StructType()

Re: df.dtypes -> pyspark.sql.types

2016-03-19 Thread Ruslan Dautkhanov
Spark 1.5 is the latest version I have access to, and it is where this problem happens. I don't see that it's fixed in master, but I might be wrong. Diff attached.

https://raw.githubusercontent.com/apache/spark/branch-1.5/python/pyspark/sql/types.py

Re: df.dtypes -> pyspark.sql.types

2016-03-19 Thread Ruslan Dautkhanov
Running the following:

#fix schema for gaid which should not be Double
from pyspark.sql.types import *
customSchema = StructType()
for (col, typ) in tsp_orig.dtypes:
    if col == 'Agility_GAID':
        typ = 'string'
    customSchema.add(col, typ, True)

Getting ValueError: Could not parse
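A sketch of a workaround, assuming the failure is the mismatch the thread circles around: df.dtypes returns Hive-style "simple strings" (e.g. 'bigint', 'int'), while the string parser in Spark 1.5's types.py only recognizes typeName() spellings (e.g. 'long', 'integer'). The alias table and normalize() helper below are hypothetical, not part of PySpark:

```python
# Hypothetical alias table: df.dtypes "simple string" spelling -> the
# typeName() spelling that the Spark 1.5 string parser recognizes.
# Names like 'string' and 'double' are the same in both spellings, so
# they simply pass through.
SIMPLE_TO_TYPENAME = {
    'int': 'integer',
    'bigint': 'long',
    'smallint': 'short',
    'tinyint': 'byte',
}

def normalize(typ):
    """Map a df.dtypes type string to a name the 1.5 parser accepts."""
    return SIMPLE_TO_TYPENAME.get(typ, typ)
```

With something like this in place, the loop above would call customSchema.add(col, normalize(typ), True) instead of passing the dtypes string straight through.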

df.dtypes -> pyspark.sql.types

2016-03-19 Thread Ruslan Dautkhanov
Hello,

Looking at https://spark.apache.org/docs/1.5.1/api/python/_modules/pyspark/sql/types.html, I can't wrap my head around how to convert string data type names into actual pyspark.sql.types data types. Does pyspark.sql.types have an interface that returns StringType() for "string",
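One way to get from a name to a type object mirrors the registry that types.py builds internally (an _all_atomic_types dict keyed by each type's typeName()). This is only a sketch with illustrative stand-in classes, assuming the typeName() rule of "class name, lowercased, minus the 'Type' suffix"; in real PySpark you would import the actual classes from pyspark.sql.types instead:

```python
# Illustrative stand-ins for a few pyspark.sql.types atomic classes;
# the real module defines these, so in PySpark you would import them.
class StringType: pass
class DoubleType: pass
class LongType: pass

def type_name(cls):
    # Same naming rule the thread's types.py link uses: 'StringType' -> 'string'
    return cls.__name__[:-len('Type')].lower()

# Name -> class registry, analogous to _all_atomic_types in types.py.
ALL_ATOMIC_TYPES = {type_name(t): t for t in (StringType, DoubleType, LongType)}

def from_name(name):
    """Return an instance of the type whose name matches, e.g. 'string'."""
    return ALL_ATOMIC_TYPES[name]()
```

So from_name("string") gives a StringType() instance, which is the lookup direction the question asks about.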