Hi,
How can I efficiently convert a pair RDD [K, V] into a map [K, Array(V)] in PySpark?
Best,
Patcharee
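A minimal sketch of one common approach (the variable names here are illustrative, not from the original message): group the values per key with groupByKey, turn each group into a list, and collect the result to the driver as a dict.

from pyspark import SparkContext

sc = SparkContext(appName="pairs-to-map")  # assumes a plain SparkContext is available

pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])

# Group values per key, materialize each group as a list, and collect to the
# driver as a dict: {"a": [1, 3], "b": [2]}
key_to_values = pairs.groupByKey().mapValues(list).collectAsMap()

Note that groupByKey shuffles all values; if the per-key result can be built incrementally, reduceByKey or aggregateByKey is usually cheaper.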
Hi,
How can a DataFrame (which API) access Hive complex types (Struct, Array, Map)?
Thanks,
Patcharee
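For reference, a hedged sketch of how complex Hive columns are typically read with the DataFrame API (the table name complex_tbl and the column names struct_col, arr_col, map_col are made up for illustration):

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive-complex-types")
sqlContext = HiveContext(sc)  # a HiveContext is needed to see Hive tables

df = sqlContext.table("complex_tbl")  # hypothetical Hive table

# Struct fields, array elements, and map values can be addressed with
# dot/bracket syntax in selectExpr, or with getField/getItem on a Column.
df.selectExpr("struct_col.field_a", "arr_col[0]", "map_col['some_key']").show()
df.select(df.arr_col.getItem(0), df.map_col.getItem("some_key")).show()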
Hi,
I am using Spark 1.4. I wanted to serialize with KryoSerializer, but got a
ClassNotFoundException. The configuration and exception are below. When I
submitted the job, I also provided --jars mylib.jar, which contains
WRFVariableZ.
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
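For context, a typical Kryo setup in PySpark looks roughly like the sketch below. mylib.jar and WRFVariableZ come from the message above, but the fully qualified class name and the extraClassPath entries are assumptions; they are a commonly suggested workaround when Kryo cannot resolve classes shipped only via --jars, not a confirmed fix for this case.

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        # Register the custom class so Kryo serializes it without writing full class names.
        .set("spark.kryo.classesToRegister", "no.uni.WRFVariableZ")  # hypothetical package
        # Sometimes needed so the driver and executor classloaders can see the jar:
        .set("spark.driver.extraClassPath", "mylib.jar")
        .set("spark.executor.extraClassPath", "mylib.jar"))

sc = SparkContext(conf=conf)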
Hi,
How can I know how much memory each executor (with one core) needs to
execute a job? If there are many cores per executor, will the required memory
be the product (memory needed for a single-core executor * number of cores)?
Any suggestions/guidelines?
BR,
Patcharee
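For reference, all concurrent tasks in an executor share its heap, so a common first estimate is per-task memory times spark.executor.cores, plus some headroom; a hedged configuration sketch (the 2g and 4 values are only placeholders):

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        # Each executor runs spark.executor.cores tasks at once, all sharing
        # spark.executor.memory, so memory is roughly sized as
        # (per-task working set) * (cores per executor) plus overhead.
        .set("spark.executor.memory", "2g")   # placeholder
        .set("spark.executor.cores", "4"))    # placeholder

sc = SparkContext(conf=conf)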
https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
Hope this helps,
Will
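The linked section covers reading and writing Hive tables through a HiveContext; a minimal hedged sketch of the pattern it describes (the table names are illustrative):

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive-tables")
sqlContext = HiveContext(sc)  # HiveContext exposes Hive tables, including ORC-backed ones

# Read an existing Hive table and write a DataFrame back as an ORC Hive table.
df = sqlContext.sql("SELECT * FROM some_hive_table")  # hypothetical table
df.write.format("orc").mode("append").saveAsTable("some_orc_table")  # hypothetical table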
On June 13, 2015, at 3:36 PM, pth001 patcharee.thong...@uni.no wrote:
Hi,
I am using Spark 1.4. I am trying to insert data into a Hive table (in ORC
format) from a DataFrame.
partitionedTestDF.write.format("org.apache.spark.sql.hive.orc.DefaultSource")
  .mode(org.apache.spark.sql.SaveMode.Append)
  .partitionBy("zone", "z", "year", "month")
  .saveAsTable("testorc")
When this job is submitted by