Hi guys,
I am running some SQL queries, but all my tasks are reported as either
NODE_LOCAL or PROCESS_LOCAL.
In the Hadoop world, reduce tasks are rack-local or non-local because they
have to aggregate data from multiple hosts. However, in Spark even the
aggregation stages are reported as NODE_LOCAL or PROCESS_LOCAL.
Guys,
Do you have any thoughts on this?
Thanks,
Robert
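As far as I understand, this is expected: Spark's reduce-side tasks usually
have no locality preference at all (they fetch shuffle blocks from many
hosts), and tasks with no preference have historically been displayed as
PROCESS_LOCAL in the UI. One way to inspect the locality level actually
assigned to each task is a SparkListener; a minimal sketch against the 1.x
listener API:

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Minimal sketch: print the locality level of every finished task.
// Register it with sc.addSparkListener(new LocalityListener) before
// running the query.
class LocalityListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val info = taskEnd.taskInfo
    println(s"stage=${taskEnd.stageId} task=${info.taskId} locality=${info.taskLocality}")
  }
}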
On Sunday, April 12, 2015 5:35 PM, Grandl Robert
rgra...@yahoo.com.INVALID wrote:
Hi guys,
I was trying to find counters in Spark that report the amount of CPU or
memory (in some metric) used by a task/stage/job, but I could not find any.
Is there any such counter available?
Thank you,
Robert
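I don't think the 1.2-era UI exposed per-task CPU or memory gauges, but each
task does carry a TaskMetrics object with executor run time, JVM GC time, and
spill sizes, which a listener can pull out. A sketch (field names per the 1.x
TaskMetrics API; worth double-checking against your exact version):

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Sketch: dump a few TaskMetrics fields for every finished task.
// taskMetrics can be null for failed tasks, hence the guard.
class MetricsListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val m = taskEnd.taskMetrics
    if (m != null) {
      println(s"stage=${taskEnd.stageId} task=${taskEnd.taskInfo.taskId} " +
        s"runTimeMs=${m.executorRunTime} gcTimeMs=${m.jvmGCTime} " +
        s"spilledBytes=${m.memoryBytesSpilled}")
    }
  }
}

Aggregating these per stage or per job is then just a matter of grouping by
taskEnd.stageId.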
Hi guys,
I am trying to get a better understanding of the DAG generation for a job in
Spark.
Ideally, what I want is to run some SQL query and extract the DAG generated by
Spark. By DAG I mean the stages and the dependencies among them, and the number
of tasks in every stage.
Could you guys suggest a way to extract this information?
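Two places expose most of this without instrumenting Spark itself, sketched
below (assuming a 1.2-era sqlContext; the query is just an example):

// 1. The plans Spark SQL generates for a query (parsed, analyzed,
//    optimized, physical):
val result = sqlContext.sql("SELECT l_orderkey, count(*) FROM lineitem GROUP BY l_orderkey")
println(result.queryExecution)

// 2. The RDD lineage behind it; shuffle dependencies mark the stage
//    boundaries, and the parenthesized number at the start of each line
//    is that RDD's partition (hence task) count:
println(result.toDebugString)

For the stage/task breakdown of an actual run, setting
spark.eventLog.enabled=true also writes per-stage JSON events (including the
number of tasks) that can be parsed offline.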
Hi guys,
I have a stupid question, but I am not sure how to get out of it.
I deployed Spark 1.2.1 on a cluster of 30 nodes. Looking at master:8088, I can
see all the workers I have created so far. (I start the cluster with
sbin/start-all.sh)
However, when running a Spark SQL query or even a simple application, nothing
shows up under running applications in the web UI.
Sorry guys for this.
It seems that I need to start the thrift server with the --master
spark://ms0220:7077 option, and now I can see applications running in my web UI.
Thanks,
Robert
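For anyone hitting the same thing: the script takes the same --master flag as
spark-submit, so the working invocation is presumably along these lines (path
per a stock 1.2.x layout):

./sbin/start-thriftserver.sh --master spark://ms0220:7077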
On Thursday, March 12, 2015 10:57 AM, Grandl Robert
rgra...@yahoo.com.INVALID wrote:
I figured it out for spark-shell by passing the --master option. However, I am
still troubleshooting how to launch SQL queries. My current command looks like
this:
./bin/beeline -u jdbc:hive2://ms0220:1 -n `whoami` -p ignored -f
tpch_query10.sql
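If scripting this outside beeline ever helps, the same endpoint can be reached
over plain JDBC. A sketch (assumes the Hive JDBC driver is on the classpath
and that the server listens on 10000, the HiveServer2 default; the query is
just an example):

import java.sql.DriverManager

// Sketch: run one query against the Spark SQL thrift server over JDBC.
// ms0220:10000 is an assumption; 10000 is only the default port.
Class.forName("org.apache.hive.jdbc.HiveDriver")
val conn = DriverManager.getConnection(
  "jdbc:hive2://ms0220:10000", System.getProperty("user.name"), "ignored")
val stmt = conn.createStatement()
val rs = stmt.executeQuery("SELECT count(*) FROM lineitem")
while (rs.next()) println(rs.getLong(1))
conn.close()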
On Thursday, March 12, 2015 10:37 AM, Grandl
Hi guys,
I am a newbie at running Spark SQL / Spark. My goal is to run some TPC-H
queries atop Spark SQL using the Hive metastore.
It looks like the Spark 1.2.1 release has Spark SQL / Hive support. However, I
am not able to fully connect all the dots.
I did the following:
1. Copied hive-site.xml
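The message is truncated here, but for anyone connecting the same dots: the
usual 1.2.x recipe is to drop hive-site.xml into Spark's conf/ directory and
then go through a HiveContext. A minimal sketch (assumes the TPC-H tables
already exist in the metastore):

import org.apache.spark.sql.hive.HiveContext

// Sketch for Spark 1.2.x: hive-site.xml must sit in $SPARK_HOME/conf
// so the metastore location is picked up. sc is the usual SparkContext.
val hiveContext = new HiveContext(sc)
hiveContext.sql("SHOW TABLES").collect().foreach(println)
hiveContext.sql("SELECT count(*) FROM lineitem").collect().foreach(println)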
Hi guys,
I deployed BlinkDB (built atop Shark), running on Spark 0.9.
I tried to run several TPC-DS Shark queries taken from
https://github.com/cloudera/impala-tpcds-kit/tree/master/queries-sql92-modified/queries/shark.
However, I ran into the following exceptions. Do you have any idea why they occur?
On Sunday, February 15, 2015 9:18 AM, Akhil Das
ak...@sigmoidanalytics.com wrote:
I'd suggest updating your Spark to the latest version and trying Spark SQL
instead of Shark.
Thanks
Best Regards
On Sun, Feb 15, 2015 at 7:36 AM, Grandl Robert rgra...@yahoo.com.invalid
wrote:
Hi guys,
Probably a dumb question. Do you know how to compile Spark 0.9 so that it
integrates easily with HDFS 2.6.0?
I tried
sbt/sbt -Pyarn -Phadoop-2.6 assembly
or
mvn -Dhadoop.version=2.6.0 -DskipTests clean package
but neither approach succeeded.
Thanks,
Robert
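One caveat worth flagging: the -Phadoop-* profiles belong to the 1.x build,
while the 0.9 build selected its Hadoop version through an environment
variable, per the 0.9 docs, i.e. something like:

SPARK_HADOOP_VERSION=2.6.0 sbt/sbt assembly

Whether 0.9 actually compiles against 2.6.0 is another matter, since Hadoop
2.6 postdates that release.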
, Grandl Robert rgra...@yahoo.com wrote:
Thanks Sean for your prompt response.
I was trying to compile as follows:
mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package
but I got a bunch of errors (see below). Hadoop 2.6.0 compiled correctly, and
all Hadoop jars are in .m2.
For standalone mode, you don't need -Pyarn. There is no
-Phadoop-2.6; you should use -Phadoop-2.4 for 2.4+. Yes, set
-Dhadoop.version=2.6.0. That should be it.
If that still doesn't work, define "doesn't succeed".
On Fri, Feb 13, 2015 at 7:13 PM, Grandl Robert
rgra...@yahoo.com.invalid wrote:
Hi guys