I have a Scala application in which I have added some extra rules to
Catalyst.
While adding some unit tests, I am trying to use some existing functions
from Catalyst's test code, specifically comparePlans() and normalizePlan()
under PlanTestBase.
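For reference, a minimal sketch of what such a test can look like, assuming
the spark-catalyst and spark-core test-jars are on the test classpath
(PlanTestBase lives in Catalyst's test sources); MyRule here is a trivial
stand-in for one of the custom rules:

import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.PlanTest
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule

// Stand-in for one of the custom rules; a real rule would rewrite the plan.
object MyRule extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan
}

// PlanTest mixes in PlanTestBase, which supplies comparePlans() and
// normalizePlan(); comparePlans() normalizes expression IDs itself, so
// plans that differ only in attribute numbering still compare equal.
class MyRuleSuite extends PlanTest {
  test("MyRule leaves an already-minimal plan unchanged") {
    val plan = LocalRelation('a.int, 'b.int).where('a > 1).select('a).analyze
    comparePlans(MyRule(plan), plan)
  }
}

In sbt, the test code typically comes in via a dependency like
"org.apache.spark" %% "spark-catalyst" % sparkVersion % Test classifier "tests".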
> On 29 Sep 2015, at 04:47, James Pirz <james.p...@gmail.com> wrote:
>
> Thanks for your reply.
>
> Setting it as
>
> --conf spark.executor.cores=1
>
> when I start spark-shell (as an example application) indeed sets the
> number of cores used per executor, and I get 4 executors per worker since
> each worker has 4 cores.
Hi,
I am using Spark 1.5 (standalone mode) on a cluster with 10 nodes, where
each machine has 12GB of RAM and 4 cores. On each machine I have one worker
which is running one executor that grabs all 4 cores. I am interested in
checking the performance with one worker but 4 executors per machine, each
executor using a single core.
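A minimal sketch of that setup in code, as an alternative to passing --conf
flags to spark-shell; the master URL and memory figure are illustrative,
not from this thread:

import org.apache.spark.{SparkConf, SparkContext}

// With 12GB and 4 cores per machine, capping each executor at 1 core and
// ~2.5GB lets the standalone master place 4 executors on each worker
// while leaving headroom for the worker process and the OS.
val conf = new SparkConf()
  .setMaster("spark://master-host:7077") // placeholder master URL
  .setAppName("four-executors-per-node")
  .set("spark.executor.cores", "1")      // one core per executor
  .set("spark.executor.memory", "2500m") // illustrative sizing
val sc = new SparkContext(conf)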
I am using Spark 1.4.1, in stand-alone mode, on a cluster of 3 nodes.
Using Spark SQL and HiveContext, I am trying to run a simple scan query on
an existing Hive table (which is an external table consisting of rows in
text files stored in HDFS - it is NOT Parquet, ORC, or any other richer
format). I would expect the read tasks to be scheduled across all nodes,
as it is a map-only job and reading can happen in parallel.
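For context, a minimal sketch of such a scan with the Spark 1.4-era API;
the table name is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("hive-scan"))
val hiveContext = new HiveContext(sc)

// Full scan of the external, text-backed table; "my_table" is a
// placeholder. Each HDFS block of the underlying files becomes one read
// task, which is why the map-only job can read in parallel across nodes.
val df = hiveContext.sql("SELECT * FROM my_table")
println(df.count())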
On Thu, Aug 13, 2015 at 9:10 PM, James Pirz james.p...@gmail.com wrote:
Hi,
I am using Spark 1.4 on a cluster (stand-alone mode), across 3 machines,
for a workload similar to TPC-H (analytical queries with multiple/multi-way
large joins and aggregations). Each machine has 12GB of memory and 4 cores.
My total data size is 150GB, stored in HDFS as Hive tables.
Spark SQL needs a HiveContext, rather than a plain SQLContext, to
communicate with the Hive metastore. So your program needs to instantiate
an `org.apache.spark.sql.hive.HiveContext` instead.
Cheng
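A minimal sketch of that change, assuming a Spark 1.x application (the app
name is illustrative):

import org.apache.spark.{SparkConf, SparkContext}
// HiveContext ships in the separate spark-hive artifact, so the build
// needs it as a dependency, e.g. "org.apache.spark" %% "spark-hive".
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("uses-hive-metastore"))

// Unlike the plain SQLContext, HiveContext reads table definitions from
// the Hive metastore described by hive-site.xml on the classpath.
val hiveContext = new HiveContext(sc)
hiveContext.sql("SHOW TABLES").show()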
On 6/10/15 10:19 AM, James Pirz wrote:
I am using Spark (standalone) to run queries (from a remote client)
against data in tables that are already defined/loaded in Hive.
The first step is to connect to Hive, which should
work even without Spark.
Best
Ayan
On Tue, Jun 9, 2015 at 10:42 AM, James Pirz james.p...@gmail.com wrote:
Thanks for the help!
I am actually trying to use Spark SQL to run queries against tables that
I've defined in Hive.
I follow these steps:
- I start the Hive metastore service, and then run queries through the
Spark SQL CLI (either interactively or by passing a query file with the -f
flag). Looking at the Spark SQL documentation, it seems that this is
possible. Please correct me if I am wrong.
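The same thing can also be done programmatically; a sketch, assuming a
SparkContext is already available (spark-shell provides one) and
"queries.sql" is a placeholder file of semicolon-terminated statements:

import scala.io.Source
import org.apache.spark.sql.hive.HiveContext

// Assumes an existing SparkContext named `sc`.
val hiveContext = new HiveContext(sc)

// Splitting on ';' mirrors what the CLI's -f flag does for a file of
// semicolon-terminated statements; "queries.sql" is a placeholder name.
Source.fromFile("queries.sql").mkString.split(";")
  .map(_.trim).filter(_.nonEmpty)
  .foreach(stmt => hiveContext.sql(stmt).show())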
I am using Spark (standalone) to run queries (from a remote client) against
data in tables that are already defined/loaded in Hive.
I have started metastore service in Hive successfully, and by putting
hive-site.xml, with proper metastore.uri, in the $SPARK_HOME/conf directory,
I tried to share its metastore configuration with Spark. If someone can
point out what I am missing, that would be
highly appreciated.
Thnx
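A sketch of the same metastore wiring done programmatically, as an
alternative to hive-site.xml; the thrift URI is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("remote-metastore"))
val hiveContext = new HiveContext(sc)

// Mirrors the metastore.uris entry in hive-site.xml; the host and port
// are placeholders for the machine running the metastore service. This
// should be set before the first query, since the metastore client is
// created lazily.
hiveContext.setConf("hive.metastore.uris", "thrift://metastore-host:9083")
hiveContext.sql("SHOW TABLES").show()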
On Sun, Jun 7, 2015 at 6:39 AM, Cheng Lian lian.cs@gmail.com wrote:
On 6/6/15 9:06 AM, James Pirz wrote:
I am pretty new to Spark. Using Spark 1.3.1, I am trying to use Spark
SQL to run some SQL scripts on the cluster.