2015-03-27 15:45 GMT+08:00 Tom Walwyn twal...@gmail.com:
Hi,
The behaviour is the same for me in Scala and Python, so posting here in
Python. When I use DataFrame.saveAsTable with the path option, I expect an
external Hive table to be created at the specified path. Specifically, when
I call:
df.saveAsTable(..., path="/tmp/test")
I expect an external Hive table to be created at the specified path.
Another follow-up: saveAsTable works as expected when running on a Hadoop
cluster with Hive installed. It's only locally that I'm getting this
strange behaviour. Any ideas why this is happening?
Kind Regards.
Tom
On 27 March 2015 at 11:29, Tom Walwyn twal...@gmail.com wrote:
We can set a path
On ..., 2015 at 1:43 AM, Tom Walwyn twal...@gmail.com wrote:
Thanks for the reply, I'll try your suggestions.
Apologies, in my previous post I was mistaken. rdd is actually a PairRDD
of (Int, Int). I'm doing the self-join so I can count two things. First, I
can count the number of times a value appears.
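The counting idea behind the self-join can be sketched without Spark. Below is a minimal plain-Python illustration of what a PairRDD self-join does to (key, value) tuples; the sample data here is made up for illustration, not from the original job:

```python
from collections import defaultdict

# A tiny stand-in for a PairRDD[(Int, Int)]: a list of (key, value) tuples.
pairs = [(1, 10), (1, 20), (2, 30), (1, 10)]

# A join first groups values by key...
by_key = defaultdict(list)
for k, v in pairs:
    by_key[k].append(v)

# ...then a self-join emits every cross-pair of values under each key.
joined = [(k, (a, b)) for k, vs in by_key.items() for a in vs for b in vs]

# The per-key occurrence count falls out of the grouping directly:
# a key with n values contributes n * n records to the self-join.
counts = {k: len(vs) for k, vs in by_key.items()}
```

With the sample data above, key 1 appears 3 times, so it alone contributes 9 of the 10 joined records.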
Thanks
Best Regards
On Wed, Feb 18, 2015 at 12:21 PM, Tom Walwyn twal...@gmail.com wrote:
Hi All,
I'm a new Spark (and Hadoop) user and I want to find out if the cluster
resources I am using are feasible for my use-case. The following is a
snippet of code that is causing an OOM exception in the executor after about
125/1000 tasks during the map stage.
val rdd2 = rdd.join(rdd,
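One common way a self-join exhausts executor memory is key skew: a key that occurs n times produces n * n records in the join output. A hypothetical plain-Python sketch of the blowup (the numbers are invented for illustration, not taken from the job above):

```python
# Suppose one "hot" key occurs 10,000 times in the input.
n = 10_000
skewed = [("hot", i) for i in range(n)]

# A self-join materialises every pair of values for each key, so the
# hot key alone yields n * n = 100,000,000 output records -- easily
# enough to exceed an executor's heap at realistic record sizes.
output_size = n * n
```

If the key distribution is heavily skewed, re-partitioning alone won't help, since all records for one key meet in a single task.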