Hi,
How can I efficiently convert a pair RDD [K, V] into a map [K, Array(V)] in PySpark?
Best,
Patcharee
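In PySpark this is usually `rdd.groupByKey().mapValues(list).collectAsMap()`. As a minimal sketch of the same semantics in plain Python (no Spark required, and assuming the collected result fits in driver memory):

```python
# Plain-Python sketch of groupByKey().mapValues(list).collectAsMap():
# gather all values per key into a list, yielding {K: [V, ...]}.
from collections import defaultdict

def group_pairs(pairs):
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return dict(grouped)

pairs = [("a", 1), ("b", 2), ("a", 3)]
print(group_pairs(pairs))  # {'a': [1, 3], 'b': [2]}
```

Note that `groupByKey` shuffles every value; if the per-key result can be built incrementally (a sum, a count), `reduceByKey` or `aggregateByKey` is cheaper.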
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Hi,
Which DataFrame API can access Hive complex types (Struct, Array, Map)?
Thanks,
Patcharee
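In Spark SQL, Hive complex types map directly onto DataFrame column types: a STRUCT comes back as a Row (field access via dot notation, e.g. `df.select("s.field")`), an ARRAY as a Python list (element access via `getItem` or `explode`), and a MAP as a Python dict. A rough sketch of those Python-side representations, using a namedtuple as a stand-in for Row and entirely hypothetical column names:

```python
# Sketch of how Hive complex types surface in PySpark rows:
# STRUCT -> Row (namedtuple-like), ARRAY -> list, MAP -> dict.
from collections import namedtuple

Point = namedtuple("Point", ["lat", "lon"])  # stand-in for STRUCT<lat:DOUBLE, lon:DOUBLE>

row = {
    "location": Point(60.39, 5.32),     # STRUCT column
    "readings": [12.5, 7.4, 8.6],       # ARRAY<DOUBLE> column
    "tags": {"station": "bergen"},      # MAP<STRING,STRING> column
}

print(row["location"].lat)       # like location.lat in Spark SQL
print(row["readings"][0])        # like readings[0] in Spark SQL
print(row["tags"]["station"])    # like tags['station'] in Spark SQL
```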
Hi,
How can I write a partitioned ORC file using OrcNewOutputFormat in MapReduce?
Thanks
Patcharee
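A Hive partition is ultimately just a directory convention (one `col=value/` directory per partition key), so a partitioned writer has to route each record to the right path; in MapReduce one common approach (an assumption about your setup, not the only way) is `MultipleOutputs` with a per-partition base path feeding OrcNewOutputFormat. A plain-Python sketch of the path computation, with hypothetical partition columns:

```python
# Sketch: compute the Hive partition directory for a record, following
# Hive's "col=value" directory-per-partition convention.
def partition_path(base, partition_cols, record):
    parts = [f"{c}={record[c]}" for c in partition_cols]
    return "/".join([base] + parts)

rec = {"zone": 7, "year": 2015, "value": 3.14}
print(partition_path("/warehouse/testorc", ["zone", "year"], rec))
# /warehouse/testorc/zone=7/year=2015
```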
Hi,
How can I override the log4j level using --hiveconf? I want to use the ERROR
level for some tasks.
Thanks,
Patcharee
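For the Hive CLI of that era, the usual knob is `hive.root.logger` (level plus appender), which can be passed per invocation with --hiveconf; a sketch, hedged in that the exact property name can vary with your hive-log4j configuration:

```shell
# Lower Hive's log4j verbosity to ERROR for this session only
# (property name assumed from common Hive setups; check hive-log4j.properties).
hive --hiveconf hive.root.logger=ERROR,console -e "SELECT 1"
```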
Hi,
The query result
[query-result rows omitted: the column delimiters were lost in the archive, leaving the numeric values run together]
Hi,
I have a table partitioned by the columns a, b, c, and d. I want to run
ALTER ... CONCATENATE on this table. Is it possible to use a wildcard in the
ALTER command to alter several partitions at a time? For example:
alter table TestHive partition (a=1, b=*, c=2, d=*) CONCATENATE;
BR,
Patcharee
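As far as I know, Hive's ALTER ... PARTITION syntax does not accept wildcards; each partition spec must be concrete. A common workaround is to generate one statement per partition combination. A plain-Python sketch with hypothetical partition values:

```python
# Sketch: expand concrete ALTER ... CONCATENATE statements for every
# combination of the varying partition values, since wildcards are not
# accepted in Hive partition specs.
from itertools import product

def concatenate_statements(table, fixed, varying):
    stmts = []
    for combo in product(*varying.values()):
        spec = dict(fixed)
        spec.update(zip(varying.keys(), combo))
        cols = ", ".join(f"{k}={v}" for k, v in spec.items())
        stmts.append(f"ALTER TABLE {table} PARTITION ({cols}) CONCATENATE;")
    return stmts

for s in concatenate_statements("TestHive", {"a": 1, "c": 2},
                                {"b": [1, 2], "d": [1, 2]}):
    print(s)
```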
Hi,
I am using Spark 1.4. I wanted to serialize with KryoSerializer, but got a
ClassNotFoundException. The configuration and exception are below. When I
submitted the job, I also provided --jars mylib.jar, which contains
WRFVariableZ.
conf.set("spark.serializer",
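For reference (a hedged sketch, not the poster's actual configuration): Kryo is enabled by setting `spark.serializer` to `org.apache.spark.serializer.KryoSerializer`, and a custom class such as WRFVariableZ is normally registered via `spark.kryo.classesToRegister` (or a KryoRegistrator). A ClassNotFoundException here often means the jar reached the driver but not the executors, so it must also be shipped via `spark.jars` / --jars. A spark-defaults-style fragment:

```properties
spark.serializer              org.apache.spark.serializer.KryoSerializer
# Fully qualified class name required; the package of WRFVariableZ is not
# shown in the original message, so "your.package" is a placeholder.
spark.kryo.classesToRegister  your.package.WRFVariableZ
# Ship the jar to the executors as well (equivalent to --jars mylib.jar).
spark.jars                    mylib.jar
```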
Hi,
How can I determine how much memory each executor (with one core) needs to
execute a job? If there are several cores per executor, will the required
memory be the product (memory needed per single-core executor × number of
cores)?
Any suggestions/guidelines?
BR,
Patcharee
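`spark.executor.memory` is shared by all task slots in an executor, so to a first approximation yes: an executor running one concurrent task per core needs roughly the per-core (per-task) memory times the number of cores, plus a fixed overhead (and, in Spark 1.x, headroom for the storage/shuffle fractions). A back-of-the-envelope sketch with made-up numbers:

```python
# Rough sizing sketch (illustrative numbers, not a Spark API):
# each concurrent task needs its own working memory, so an executor
# running `cores` tasks at once needs roughly cores * per_task, plus overhead.
def executor_memory_mb(per_task_mb, cores, overhead_mb=384):
    return per_task_mb * cores + overhead_mb

print(executor_memory_mb(512, 1))  # 896
print(executor_memory_mb(512, 4))  # 2432
```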
https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
Hope this helps,
Will
On June 13, 2015, at 3:36 PM, pth001 patcharee.thong...@uni.no wrote:
Hi,
I am using Spark 0.14. I am trying to insert data into a Hive table (in ORC
format) from a DataFrame.
partitionedTestDF.write.format
Hi,
I am using Spark 0.14. I am trying to insert data into a Hive table (in ORC
format) from a DataFrame.
partitionedTestDF.write.format("org.apache.spark.sql.hive.orc.DefaultSource")
.mode(org.apache.spark.sql.SaveMode.Append).partitionBy("zone", "z", "year", "month").saveAsTable("testorc")
When this job is submitted by
Hi,
My Pig-on-Tez job (which stores a dataset into a partitioned Hive table)
throws the following exception. What could be wrong? How can I fix it?
2015-06-09 10:59:57,268 ERROR [TezChild] runtime.PigProcessor:
Encountered exception while processing:
org.apache.pig.backend.executionengine.ExecException:
Hi,
I am new to Pig. First I queried a Hive table (x = LOAD 'x' USING
org.apache.hive.hcatalog.pig.HCatLoader();) and got a single
record/value. How can I use this single value to filter in another
query? I hope to get better performance by filtering as early as possible.
BR,
Patcharee
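Pig can use a single-row relation as a scalar: projecting `relation.field` inside an expression substitutes the value (and fails at runtime if the relation holds more than one row). A hedged Pig Latin sketch with hypothetical relation and field names:

```pig
-- x holds exactly one record; x.threshold is used as a scalar below.
x = LOAD 'x' USING org.apache.hive.hcatalog.pig.HCatLoader();
y = LOAD 'y' USING org.apache.hive.hcatalog.pig.HCatLoader();
-- "threshold" and "value" are hypothetical field names; the scalar lets
-- the filter run early, before any join or grouping.
y_filtered = FILTER y BY value > (double) x.threshold;
```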
Hi,
How can I create a pipeline (a sequence of Pig scripts)?
BR,
Patcharee
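Pig's Grunt shell offers `exec` and `run` for invoking one script from another, and a simple pipeline can also be driven from the shell (or a workflow engine such as Oozie) by chaining pig invocations. A minimal shell sketch with hypothetical script names:

```shell
# Run the stages in order; && stops the pipeline at the first failure.
# step1.pig / step2.pig / step3.pig are hypothetical script names.
pig -f step1.pig && pig -f step2.pig && pig -f step3.pig
```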