[ https://issues.apache.org/jira/browse/SPARK-9714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu updated SPARK-9714:
------------------------------
    Assignee: Yin Huai

> Cannot insert into a table using pySpark
> ----------------------------------------
>
>                 Key: SPARK-9714
>                 URL: https://issues.apache.org/jira/browse/SPARK-9714
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Yun Park
>            Assignee: Yin Huai
>            Priority: Blocker
>
> This is a bug on the master branch. After creating the table ("yun" is the table name) with the corresponding fields, I ran the following command.
>
> from pyspark.sql import *
> sc.parallelize([Row(id=1, name="test", description="")]).toDF().write.mode("append").saveAsTable("yun")
>
> I get the following error:
>
> Py4JJavaError: An error occurred while calling o100.saveAsTable.
> : org.apache.spark.SparkException: Task not serializable
> Caused by: java.io.NotSerializableException: org.apache.hadoop.fs.Path
> Serialization stack:
>     - object not serializable (class: org.apache.hadoop.fs.Path, value: dbfs:/user/hive/warehouse/yun)
>     - field (class: org.apache.hadoop.hive.ql.metadata.Table, name: path, type: class org.apache.hadoop.fs.Path)
>     - object (class org.apache.hadoop.hive.ql.metadata.Table, yun)
>     - field (class: org.apache.hadoop.hive.ql.metadata.Partition, name: table, type: class org.apache.hadoop.hive.ql.metadata.Table)
>     - object (class org.apache.hadoop.hive.ql.metadata.Partition, yun())
>     - field (class: scala.collection.immutable.Stream$Cons, name: hd, type: class java.lang.Object)
>     - object (class scala.collection.immutable.Stream$Cons, Stream(yun()))
>     - field (class: scala.collection.immutable.Stream$$anonfun$map$1, name: $outer, type: class scala.collection.immutable.Stream)
>     - object (class scala.collection.immutable.Stream$$anonfun$map$1, <function0>)
>     - field (class: scala.collection.immutable.Stream$Cons, name: tl, type: interface scala.Function0)
>     - object (class scala.collection.immutable.Stream$Cons, Stream(HivePartition(List(),HiveStorageDescriptor(dbfs:/user/hive/warehouse/yun,org.apache.hadoop.mapred.TextInputFormat,org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,Map(serialization.format -> 1)))))
>     - field (class: scala.collection.immutable.Stream$$anonfun$map$1, name: $outer, type: class scala.collection.immutable.Stream)
>     - object (class scala.collection.immutable.Stream$$anonfun$map$1, <function0>)
>     - field (class: scala.collection.immutable.Stream$Cons, name: tl, type: interface scala.Function0)
>     - object (class scala.collection.immutable.Stream$Cons, Stream(dbfs:/user/hive/warehouse/yun))
>     - field (class: org.apache.spark.sql.hive.MetastoreRelation, name: paths, type: interface scala.collection.Seq)
>     - object (class org.apache.spark.sql.hive.MetastoreRelation, MetastoreRelation default, yun, None
> )
>     - field (class: org.apache.spark.sql.hive.execution.InsertIntoHiveTable, name: table, type: class org.apache.spark.sql.hive.MetastoreRelation)
>     - object (class org.apache.spark.sql.hive.execution.InsertIntoHiveTable, InsertIntoHiveTable (MetastoreRelation default, yun, None), Map(), false, false
>  ConvertToSafe
>   TungstenProject [CAST(description#10, FloatType) AS description#16,CAST(id#11L, StringType) AS id#17,name#12]
>    PhysicalRDD [description#10,id#11L,name#12], MapPartitionsRDD[17] at applySchemaToPythonRDD at NativeMethodAccessorImpl.java:-2
> )
>     - field (class: org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3, name: $outer, type: class org.apache.spark.sql.hive.execution.InsertIntoHiveTable)
>     - object (class org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3, <function2>)
>     at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
>     at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
>     ... 30 more
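For anyone trying to reproduce this, a minimal end-to-end sketch of the failing sequence in the PySpark shell is below. Only the toDF()/saveAsTable("yun") call comes from the report; the CREATE TABLE statement and its column types are assumptions inferred from the CASTs in the physical plan above (description -> FloatType, id -> StringType), and it assumes the shell-provided sc and sqlContext (a HiveContext).

    # Repro sketch. The table schema here is assumed, not taken from the report,
    # which only says the table "yun" was created "with the corresponding fields".
    from pyspark.sql import Row

    sqlContext.sql(
        "CREATE TABLE IF NOT EXISTS yun (description FLOAT, id STRING, name STRING)")

    df = sc.parallelize([Row(id=1, name="test", description="")]).toDF()

    # Appending to the existing Hive table is what fails: per the serialization
    # stack above, the saveAsHiveFile closure drags in InsertIntoHiveTable, whose
    # MetastoreRelation holds Hive Table/Partition metadata containing
    # non-serializable org.apache.hadoop.fs.Path values.
    df.write.mode("append").saveAsTable("yun")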