…process).
> The amount of memory required also depends on how many fields are used in
> the results.
>
> On Tue, Aug 9, 2016 at 11:09 AM, Zoltan Fedor <zoltan.1.fe...@gmail.com>
> wrote:
> >> Does this mean you only have 1.6G memory for executor (others left for …
…attributes of the UDF?
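(A minimal sketch of the memory settings in play here, assuming Spark on
YARN; the values are illustrative, not a recommendation. The executor JVM
and the Python worker processes that run the UDF share the same YARN
container, so both have to fit.)

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .set("spark.executor.memory", "4g")  # executor JVM heap
            # extra container headroom on YARN; the Python workers that
            # execute the UDF live outside the JVM heap
            .set("spark.yarn.executor.memoryOverhead", "2048")
            # memory a Python worker may use before spilling to disk
            .set("spark.python.worker.memory", "1g"))
    sc = SparkContext(conf=conf)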
On Mon, Aug 8, 2016 at 5:59 PM, Davies Liu <dav...@databricks.com> wrote:
> On Mon, Aug 8, 2016 at 2:24 PM, Zoltan Fedor <zoltan.1.fe...@gmail.com>
> wrote:
> > Hi all,
> >
> > I have an interesting issue trying to use UDFs from SparkSQL in …
Hi all,
I have an interesting issue trying to use UDFs from SparkSQL in Spark 2.0.0
using pyspark.
There is a big table (5.6 billion rows, 450 GB in memory) loaded into 300
executors' memory in SparkSQL, on which we would do some calculations using
UDFs in pyspark.
If I run my SQL on only a …
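A minimal sketch of this kind of setup, assuming Spark 2.0.0 with Hive
support; the table name, column name and the toy UDF are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import IntegerType

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # toy Python UDF; each input value is serialized out to a Python
    # worker process and the result is shipped back to the JVM
    def str_len(s):
        return len(s) if s is not None else 0

    spark.udf.register("str_len", str_len, IntegerType())
    spark.sql("SELECT str_len(some_col) FROM big_table LIMIT 10").show()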
…calling
None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o20))
>>>
On Thu, Oct 29, 2015 at 7:20 AM, Deenar Toraskar <deenar.toras...@gmail.com>
wrote:
> Hi Zoltan
>
> Add hive-site.xml to your YARN_CONF_DIR, i.e. $SPARK_HOME/conf/yarn-conf.
>
> Deenar
…-Phive-thriftserver
> -DskipTests clean package
>
>
> On 29 October 2015 at 13:08, Zoltan Fedor <zoltan.0.fe...@gmail.com>
> wrote:
>
>> Hi Deenar,
>> As suggested, I have moved the hive-site.xml from HADOOP_CONF_DIR
>> ($SPARK_HOME/hadoop-conf) to YARN_CONF_DIR …
Yes, I have the hive-site.xml in $SPARK_HOME/conf, also in yarn-conf, in
/etc/hive/conf, etc.
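A quick way to check from pyspark whether the metastore configuration is
actually being picked up (a sketch, assuming a Hive-enabled Spark 1.5.x
build):

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="hive-site-check")
    sqlContext = HiveContext(sc)
    # with hive-site.xml visible this lists the metastore databases;
    # otherwise only "default" from a local Derby metastore shows up
    print(sqlContext.sql("show databases").collect())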
On Thu, Oct 29, 2015 at 10:46 AM, Kai Wei <kai.wei...@gmail.com> wrote:
> Did you try copying it to the spark/conf dir?
> On 30 Oct 2015 1:42 am, "Zoltan Fedor" <zoltan…
…-Phive-thriftserver -DskipTests clean package
On Thu, Oct 29, 2015 at 11:05 AM, Kai Wei <kai.wei...@gmail.com> wrote:
> Failed to see a hadoop-2.5 profile in pom. Maybe that's the problem.
> On 30 Oct 2015 1:51 am, "Zoltan Fedor" <zoltan.0.fe...@gmail.com> wrote:
>
>> The funny thing is …
…know a lot about how pyspark works. Can you possibly try running
> spark-shell and do the same?
>
> sqlContext.sql("show databases").collect
>
> Deenar
>
> On 29 October 2015 at 14:18, Zoltan Fedor <zoltan.0.fe...@gmail.com>
> wrote:
>
>> Yes, I am.
The funny thing is that with Spark 1.2.0 on the same machine (Spark 1.2.0
is the default shipped with CDH 5.3.3) the same hive-site.xml is picked up
and I have no problem whatsoever.
On Thu, Oct 29, 2015 at 10:48 AM, Zoltan Fedor <zoltan.0.fe...@gmail.com>
wrote:
> Yes, I have …
There is /user/biapp in HDFS. The problem is that the hive-site.xml is
being ignored, so Spark is looking for it locally.
On Thu, Oct 29, 2015 at 10:40 AM, Kai Wei <kai.wei...@gmail.com> wrote:
> Create /user/biapp in hdfs manually first.
> On 30 Oct 2015 1:36 am, "Zoltan Fedor" <zoltan…
…spark/conf/yarn-conf to
> $SPARK_HOME/conf/yarn-conf
>
> and it worked. You may be better off with a custom build for CDH 5.3.3
> Hadoop, which you already have done.
>
> Deenar
>
> On 29 October 2015 at 14:35, Zoltan Fedor <zoltan.0.fe...@gmail.com>
> wrote:
>
…HiveContext.\n', JavaObject id=o24))
>>>
On Thu, Oct 29, 2015 at 11:44 AM, Deenar Toraskar
<deenar.toras...@gmail.com> wrote:
>
> Zoltan
>
> you should have these in your existing CDH 5.3, that's the best place to
> get them. Find where Spark is running from and you should …
Hi,
We have a shared CDH 5.3.3 cluster and are trying to use Spark 1.5.1 on it
in yarn-client mode with Hive.
I have compiled Spark 1.5.1 with SPARK_HIVE=true, but it seems I am not
able to make SparkSQL pick up the hive-site.xml when running pyspark.
hive-site.xml is located in …
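Since HiveConf loads hive-site.xml from the classpath, one way to check
whether the driver can actually see the file is a classpath lookup from
pyspark (a sketch; it pokes at py4j internals via sc._jvm):

    from pyspark import SparkContext

    sc = SparkContext(appName="hive-site-locator")
    # if this prints None, hive-site.xml is not on the driver classpath
    # and HiveContext will fall back to a local Derby metastore
    loader = sc._jvm.java.lang.Thread.currentThread().getContextClassLoader()
    print(loader.getResource("hive-site.xml"))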