Hi guys,
I have a question about Spark Streaming.
An application keeps sending transaction records into a Spark stream at
about 50k TPS.
Each record represents a sale, with customer id / product id /
time / price columns.
The application is required to monitor the change
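A monitor like this is typically a windowed aggregation over the stream. The per-batch computation can be sketched in plain Python (the `Sale` record and its field names are illustrative; in Spark Streaming this would be a `reduceByKeyAndWindow` over a DStream of records):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Sale:
    customer_id: str
    product_id: str
    time: int       # epoch seconds
    price: float

def revenue_per_product(batch):
    """What reduceByKeyAndWindow((a, b) => a + b) computes for one window:
    total revenue keyed by product_id."""
    totals = defaultdict(float)
    for sale in batch:
        totals[sale.product_id] += sale.price
    return dict(totals)

batch = [Sale("c1", "p1", 0, 10.0),
         Sale("c2", "p1", 1, 5.0),
         Sale("c1", "p2", 2, 7.5)]
print(revenue_per_product(batch))  # {'p1': 15.0, 'p2': 7.5}
```

At 50k TPS the heavy lifting is in partitioning by key across executors; the aggregation logic itself stays this simple.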
Hi all,
We are planning to use SparkSQL in a DW system, and have a question about
the caching mechanism of SparkSQL.
For example, if I have a SQL like sqlContext.sql("select c1, sum(c2) from T1,
T2 where T1.key=T2.key group by c1").cache()
is it going to cache the final result or the raw data?
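For what it's worth, `cache()` on the returned SchemaRDD/DataFrame marks the *query result* (the joined, aggregated rows) for caching; the raw T1/T2 data is not cached, and the join runs once when the first action materializes it. That behavior can be sketched in plain Python as memoization of the full query (names are illustrative):

```python
class CachedQuery:
    """Mimics cache() on a query result: the first action computes and
    stores the final rows; later actions reuse them instead of
    re-reading the raw tables."""
    def __init__(self, compute):
        self._compute = compute   # the full plan: join + group-by + sum
        self._result = None
        self.computations = 0

    def collect(self):
        if self._result is None:      # first action materializes the cache
            self._result = self._compute()
            self.computations += 1
        return self._result           # cached final rows, not raw data

t1 = [("a", 1), ("a", 2), ("b", 3)]   # stand-in for the joined input
q = CachedQuery(lambda: {k: sum(v for kk, v in t1 if kk == k)
                         for k in {kk for kk, _ in t1}})
q.collect()
q.collect()
print(q.computations)  # 1: the join/aggregate ran only once
```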
If I run Spark in standalone mode (not YARN mode), is there any tool, like
Sqoop, that is able to transfer data from an RDBMS into Spark storage?
Thanks
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional
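Spark's built-in JDBC data source covers much of what Sqoop does here: it pulls table rows over a database connection, optionally split into partitions by ranges on a numeric column. That core mechanism can be sketched with Python's stdlib `sqlite3` standing in for the RDBMS (table and column names are made up):

```python
import sqlite3

# Stand-in for an RDBMS reachable over JDBC (e.g. jdbc:mysql://host/db).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (key TEXT, c2 INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [("a", 1), ("b", 2)])

def fetch_partition(conn, lower, upper):
    """Spark's JDBC source splits the table into partitions with WHERE
    clauses on a numeric column; each partition becomes one RDD slice."""
    cur = conn.execute(
        "SELECT key, c2 FROM t1 WHERE c2 >= ? AND c2 < ?", (lower, upper))
    return cur.fetchall()

rows = fetch_partition(conn, 1, 3)
print(rows)  # [('a', 1), ('b', 2)]
```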
Hey guys,
Not sure if I'm the only one who has hit this. We are building a highly
available standalone Spark environment, using ZooKeeper with 3 masters in
the cluster.
However, sbin/start-slaves.sh calls start-slave.sh for each member of the
conf/slaves file, and specifies the master using $SPARK_MASTER_IP and
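For ZooKeeper HA the worker needs the full comma-separated list of masters rather than the single $SPARK_MASTER_IP; one workaround is to start each worker by hand with the full list (host names below are placeholders, and older releases also expect a worker instance number as the first argument):

```shell
# On each worker host; host1..host3 are the three ZK-backed masters
./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077
```

The worker registers with all listed masters and fails over to whichever one ZooKeeper elects as leader.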
Hey guys,
In my understanding SparkSQL only supports JDBC connections through the
Hive thrift server; is this correct?
Thanks
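That matches my understanding: JDBC access to SparkSQL goes through the HiveServer2-compatible thrift server, which ships with Spark. A minimal session looks like this (the master URL is a placeholder):

```shell
# Start the thrift server, then connect with any HiveServer2 JDBC client
./sbin/start-thriftserver.sh --master spark://master:7077
./bin/beeline -u jdbc:hive2://localhost:10000
```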
Guys, is there an easier way to build all modules with mvn?
Right now if I run "mvn package" in the Spark root directory I get:
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ... SUCCESS [ 8.327 s]
[INFO] Spark Project Networking ...
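Two things usually speed this up: skipping tests, and building only the module you care about plus its dependencies via Maven's reactor flags (the `sql/core` path below assumes the standard Spark source layout):

```shell
# Full build, but skip the test runs
mvn -DskipTests clean package

# Build one module and whatever it depends on (-am = also-make)
mvn -pl sql/core -am -DskipTests package
```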
/data/sequoiadb-driver-1.10.jar,/data/spark-sequoiadb-0.0.1-SNAPSHOT.jar::/data/spark/conf:/data/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar
-XX:MaxPermSize=128m -Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=centos-151:2181,centos-152:2181
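Rather than baking the `-Dspark.deploy.*` flags into the launch command line as above, they are normally set once in spark-env.sh so every daemon picks them up (the ZooKeeper hosts below are the ones from the posted command):

```shell
# conf/spark-env.sh on every master node
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=centos-151:2181,centos-152:2181"
```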
job that periodically cleans up /tmp dir ?
Cheers
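There is no built-in Spark job for this; the usual fixes are pointing `spark.local.dir` at a dedicated scratch directory, or a cron entry along these lines (the path, pattern, and retention period are illustrative):

```shell
# crontab: nightly at 03:00, remove spark scratch dirs untouched for 7+ days
0 3 * * * find /tmp -maxdepth 1 -name 'spark-*' -mtime +7 -exec rm -rf {} +
```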
On Thu, Mar 12, 2015 at 6:18 PM, sequoiadb <mailing-list-r...@sequoiadb.com> wrote:
Checking the script, it seems spark-daemon.sh is unable to stop the worker:
$ ./spark-daemon.sh stop
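A bare `stop` won't match anything: spark-daemon.sh expects the daemon class and an instance number after the action, so stopping the first worker instance would look like this:

```shell
./sbin/spark-daemon.sh stop org.apache.spark.deploy.worker.Worker 1
```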