from pyspark.sql.types import *

# Define an explicit schema so the numeric columns come out as doubles.
data = [("buck trends", "ceo", 200000.00, 0.25, "100")]

schema = StructType([
    StructField("name", StringType(), True),
    StructField("title", StringType(), True),
    StructField("salary", DoubleType(), True),
    StructField("rate", DoubleType(), True),
    StructField("insurance", StringType(), True),
])

df = spark.createDataFrame(data=data, schema=schema)

On Wed, Jan 26, 2022 at 6:49 PM <capitnfrak...@free.fr> wrote:

> when creating a dataframe from a list, how can I specify the col type?
>
> such as:
>
> >>> df = spark.createDataFrame(list, ["name","title","salary","rate","insurance"])
> >>> df.show()
> +-----------+---------+------+----+---------+
> |       name|    title|salary|rate|insurance|
> +-----------+---------+------+----+---------+
> |buck trends|      ceo|200000|0.25|      100|
> |cindy banks|      cfo|170000|0.22|      120|
> |  joe coder|developer|130000| 0.2|      120|
> +-----------+---------+------+----+---------+
>
> >>> df.describe()
> DataFrame[summary: string, name: string, title: string, salary: string, rate: string, insurance: string]
>
> I want the salary, rate, insurance to be Double type, not a String type.
>
> Thank you.
> Frakass
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
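As a side note (a minimal sketch, not part of the reply above): you could also keep the plain column-name list and cast the columns afterwards, which would make insurance a double as well. The variable name rows is just for illustration.

from pyspark.sql.functions import col

rows = [("buck trends", "ceo", 200000, 0.25, "100"),
        ("cindy banks", "cfo", 170000, 0.22, "120"),
        ("joe coder", "developer", 130000, 0.2, "120")]

df = spark.createDataFrame(rows, ["name", "title", "salary", "rate", "insurance"])

# Cast the columns that should be numeric; cast("double") works on both
# integer and string inputs.
df = (df
      .withColumn("salary", col("salary").cast("double"))
      .withColumn("rate", col("rate").cast("double"))
      .withColumn("insurance", col("insurance").cast("double")))

df.printSchema()

printSchema() should then report salary, rate and insurance as double.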