Hi,
I am trying to broadcast large objects (on the order of a few hundred MB).
However, I keep getting errors when trying to do so:
Traceback (most recent call last):
File "/LORM_experiment.py", line 510, in <module>
broadcast_gradient_function = sc.broadcast(gradient_function)
File
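(If it helps: the broadcast block size is configurable. A minimal Scala sketch of raising it before broadcasting a large object; the property name is from the Spark 1.2 configuration page, and the values are illustrative assumptions, not tested:)

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("LargeBroadcast")
      .set("spark.broadcast.blockSize", "8192") // KB in Spark 1.2; default is 4096
    val sc = new SparkContext(conf)
    val payload = Array.fill(200 * 1024 * 1024)(0.toByte) // ~200 MB stand-in object
    val bcast = sc.broadcast(payload)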
Hello All,
I am new to Apache Spark, and I am trying to run JavaKMeans.java from the Spark
examples on my Ubuntu system.
I downloaded spark-1.2.1-bin-hadoop2.4.tgz
http://www.apache.org/dyn/closer.cgi/spark/spark-1.2.1/spark-1.2.1-bin-hadoop2.4.tgz
and started sbin/start-master.sh
After starting
If you would like a more detailed walkthrough, I wrote one recently:
https://dataissexy.wordpress.com/2015/02/03/apache-spark-standalone-clusters-bigdata-hadoop-spark/
Regards
Jason Bell
On 22 Feb 2015 14:16, VISHNU SUBRAMANIAN johnfedrickena...@gmail.com
wrote:
Try restarting your Spark
Hi Francisco,
While I haven't tried this, have a look at the contents of
start-thriftserver.sh - all it's doing is setting up a few variables and
calling:
/bin/spark-submit --class
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
and passing some additional parameters. Perhaps doing the
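For what it's worth, a sketch of what the equivalent manual invocation might look like on Windows, assuming the spark-submit.cmd launcher shipped in bin\; the master URL is a placeholder, and the spark-internal primary resource is taken from the Linux script and untested here:

    REM Hypothetical manual launch of the Thrift server on Windows,
    REM mirroring what start-thriftserver.sh does on Linux.
    bin\spark-submit.cmd --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 ^
      --master spark://master-host:7077 ^
      spark-internal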
Try restarting your Spark cluster.
./sbin/stop-all.sh
./sbin/start-all.sh
Thanks,
Vishnu
On Sun, Feb 22, 2015 at 7:30 PM, Surendran Duraisamy
2013ht12...@wilp.bits-pilani.ac.in wrote:
Hello All,
I am new to Apache Spark, and I am trying to run JavaKMeans.java from the Spark
examples on my Ubuntu
Hi Akhil,
thanks for your reply. I am using the latest version of Spark 1.2.1 (also
tried the 1.3 developer branch). If I am not mistaken, TorrentBroadcast is
the default there, isn't it?
Thanks,
Tassilo
On Sun, Feb 22, 2015 at 10:59 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Did you try
Thank You Jason,
Got the program working after setting
SPARK_WORKER_CORES
SPARK_WORKER_MEMORY
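These normally go in conf/spark-env.sh; a minimal sketch (the values are illustrative assumptions, adjust to your machine):

    # conf/spark-env.sh -- illustrative values
    export SPARK_WORKER_CORES=2    # cores each worker is allowed to use
    export SPARK_WORKER_MEMORY=2g  # memory each worker may hand to executors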
While running the program from Eclipse, I got a strange ClassNotFoundException.
In JavaKMeans.java, ParsePoint is a static inner class. When running the
program I got ClassNotFoundException for ParsePoint.
I have
Hi Francisco,
Out of curiosity - why ROLAP using multi-dimensional mode (vs. tabular)
from SSAS to Spark? As a past SSAS guy, you've definitely piqued my
interest.
The one thing that you may run into is that the SQL generated by SSAS can
be quite convoluted. When we were doing the same thing
You can simply follow these http://spark.apache.org/docs/1.2.0/tuning.html
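As a first step, that guide suggests surfacing GC timings; a sketch of passing the logging flags at submit time (the app class and jar here are hypothetical placeholders):

    # Sketch: enable GC logging on executors, per the tuning guide.
    ./bin/spark-submit \
      --class MyApp \
      --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
      my-app.jar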
Thanks
Best Regards
On Sun, Feb 22, 2015 at 1:14 AM, java8964 java8...@hotmail.com wrote:
Can someone share some ideas about how to tune the GC time?
Thanks
--
From: java8...@hotmail.com
Did you try with torrent broadcast factory?
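(A sketch of selecting it explicitly, using the property name from the Spark 1.2 configuration page:)

    // Sketch: choosing the torrent broadcast factory explicitly.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("BroadcastTest")
      .set("spark.broadcast.factory",
        "org.apache.spark.broadcast.TorrentBroadcastFactory")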
Thanks
Best Regards
On Sun, Feb 22, 2015 at 3:29 PM, TJ Klein tjkl...@gmail.com wrote:
Hi,
I am trying to broadcast large objects (on the order of a few hundred MB).
However, I keep getting errors when trying to do so:
Traceback (most recent call
I see, thanks. Yes, I have already tried all sorts of changes to these
parameters. Unfortunately, none of them seemed to have any impact.
Thanks,
Tassilo
On Sun, Feb 22, 2015 at 1:24 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Yes it is, you have some more customizable options over here
Back to thrift, there was an earlier thread on this topic at
http://mail-archives.apache.org/mod_mbox/spark-user/201411.mbox/%3CCABPQxsvXA-ROPeXN=wjcev_n9gv-drqxujukbp_goutvnyx...@mail.gmail.com%3E
that may be useful as well.
On Sun Feb 22 2015 at 8:42:29 AM Denny Lee denny.g@gmail.com wrote:
Yes it is, you have some more customizable options over here
http://spark.apache.org/docs/1.2.0/configuration.html#compression-and-serialization
Thanks
Best Regards
On Sun, Feb 22, 2015 at 11:47 PM, Tassilo Klein tjkl...@gmail.com wrote:
Hi Akhil,
thanks for your reply. I am using the
Hello,
I work at an MS consulting company and we are evaluating including Spark in our
Big Data offering. We are particularly interested in testing Spark as a ROLAP engine
for SSAS, but we cannot find a way to activate the ODBC server (Thrift) on a
Windows cluster. There is no start-thriftserver.sh
Do you guys have dynamic allocation turned on for YARN?
Anders, was Task 450 in your job acting like a Reducer and fetching the Map
spill output data from a different node?
If a Reducer task can't read the remote data it needs, that could cause the
stage to fail. Sometimes this forces the
The SparkConf doesn't allow you to set arbitrary variables. You can use
SparkContext's HadoopRDD and create a JobConf (with whatever variables you
want), and then grab them out of the JobConf in your RecordReader.
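A rough Scala sketch of that approach, using the stock TextInputFormat as a stand-in for the custom one (the input path and key name are hypothetical):

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("JobConfVars"))
    // Put the user variable into a JobConf instead of SparkConf.
    val jobConf = new JobConf(sc.hadoopConfiguration)
    jobConf.set("developer", "MyName")
    FileInputFormat.setInputPaths(jobConf, "hdfs:///tmp/input") // hypothetical path
    // Substitute the custom InputFormat here; inside its RecordReader the
    // value comes back via job.get("developer").
    val rdd = sc.hadoopRDD(jobConf, classOf[TextInputFormat],
      classOf[LongWritable], classOf[Text])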
On Sun, Feb 22, 2015 at 4:28 PM, hnahak harihar1...@gmail.com wrote:
Hi,
I
I'm also facing the same issue. This is the third time: whenever I post anything,
it never gets accepted by the community, and at the same time I get a failure mail
at my registered mail id.
And when I click the "subscribe to this mailing list" link, I didn't get any new
subscription mail in my inbox.
Please, anyone
I haven't found the method in
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.SchemaRDD
The new DataFrame has this method:
/**
* Returns the content of the [[DataFrame]] as an [[RDD]] of [[Row]]s.
* @group rdd
*/
def rdd: RDD[Row] = {
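A usage sketch (df stands in for any DataFrame obtained from a SQLContext in Spark 1.3):

    // Sketch: pulling the row RDD out of a Spark 1.3 DataFrame.
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.Row

    val rows: RDD[Row] = df.rdd // df is a hypothetical DataFrame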
FYI
On Sun, Feb
Hi,
I have written a custom InputFormat and RecordReader for Spark, and I need to
use user variables from the Spark client program.
I added them in SparkConf:
val sparkConf = new
SparkConf().setAppName(args(0)).set("developer", "MyName")
*and in InputFormat class*
protected boolean
Hi Michael,
I think that the feature (converting a SchemaRDD to an RDD of a structured class) is
now available. But I didn't understand from the PR how exactly to do this. Can
you give an example or doc links?
Best regards
bq. i didn't get any new subscription mail in my inbox.
Have you checked your Spam folder ?
Cheers
On Sun, Feb 22, 2015 at 2:36 PM, hnahak harihar1...@gmail.com wrote:
I'm also facing the same issue. This is the third time: whenever I post anything,
it never gets accepted by the community, and at the same
Hi
I have installed Spark on a 3-node cluster. Spark services are up and
running, but I want to integrate HBase with Spark.
Do I need to install HBase on the Hadoop cluster or the Spark cluster?
Please let me know asap.
Regards,
Sandeep.v
I've set up the EC2 cluster with Spark. Everything works, all master/slaves
are up and running.
I'm trying to submit a sample job (SparkPi). When I ssh to the cluster and
submit it from there, everything works fine. However, when the driver is created
on a remote host (my laptop), it doesn't work. I've
Did anyone fix this error?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/cannot-run-spark-shell-in-yarn-client-mode-tp4013p21761.html
If you have both clusters on the same network, then I'd suggest
installing it on the Hadoop cluster. If you install it on the Spark
cluster itself, then HBase might take up a few CPU cycles and there's a
chance for the job to lag.
Thanks
Best Regards
On Mon, Feb 23, 2015 at 12:48
I checked it but I didn't see any mail from the user list. Let me do it one
more time.
--Harihar
On Mon, Feb 23, 2015 at 11:50 AM, Ted Yu yuzhih...@gmail.com wrote:
bq. i didn't get any new subscription mail in my inbox.
Have you checked your Spam folder ?
Cheers
On
Hi,
On Sat, Feb 21, 2015 at 1:05 AM, craigv craigvanderbo...@gmail.com wrote:
"Might it be possible to perform large batch processing on HDFS time
series data using Spark Streaming?"
1. I understand that there is not currently an InputDStream that could do
what's needed. I would have
Spark Streaming already directly supports Kafka
http://spark.apache.org/docs/latest/streaming-programming-guide.html#advanced-sources
Is there any reason why that is not sufficient?
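(A minimal receiver-based sketch against the 1.2 API, assuming the spark-streaming-kafka artifact is on the classpath; the ZooKeeper address, consumer group, and topic are placeholders:)

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("KafkaIngest")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Receiver-based stream; placeholder ZK quorum, group id, and topic map.
    val stream = KafkaUtils.createStream(ssc,
      "zk-host:2181", "my-consumer-group", Map("my-topic" -> 1))
    stream.map(_._2).print() // message payloads
    ssc.start()
    ssc.awaitTermination()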
TD
On Sun, Feb 22, 2015 at 5:18 PM, mykidong mykid...@gmail.com wrote:
In java, you can see this example:
bq. bash: git: command not found
Looks like the AMI doesn't have git pre-installed.
Cheers
On Sun, Feb 22, 2015 at 4:29 PM, olegshirokikh o...@solver.com wrote:
I'm trying to launch Spark cluster on AWS EC2 with custom AMI (Ubuntu)
using
the following:
./ec2/spark-ec2 --key-pair=***
Thanks. I extracted the Hadoop configuration, set my arbitrary variable, and
was able to get it inside the InputFormat from the JobContext configuration.
On Mon, Feb 23, 2015 at 12:04 PM, Tom Vacek minnesota...@gmail.com wrote:
The SparkConf doesn't allow you to set arbitrary variables. You can use
Instead of setting it in SparkConf, set it via
SparkContext.hadoopConfiguration.set(key, value)
and extract the same key from the JobContext.
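A sketch of that variant (the key and value follow this thread; the InputFormat side is shown as a comment since it lives in Java/Hadoop code):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("HadoopConfVars"))
    // Driver side: stash the variable in the shared Hadoop configuration.
    sc.hadoopConfiguration.set("developer", "MyName")
    // InputFormat/RecordReader side, it comes back from the JobContext:
    //   String developer = context.getConfiguration().get("developer");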
--Harihar
I'm trying to launch Spark cluster on AWS EC2 with custom AMI (Ubuntu) using
the following:
./ec2/spark-ec2 --key-pair=*** --identity-file='/home/***.pem'
--region=us-west-2 --zone=us-west-2b --spark-version=1.2.1 --slaves=2
--instance-type=t2.micro --ami=ami-29ebb519 --user=ubuntu launch
In java, you can see this example:
https://github.com/mykidong/spark-kafka-simple-consumer-receiver
- Kidong.
-- Original Message --
From: icecreamlc [via Apache Spark User List]
ml-node+s1001560n21746...@n3.nabble.com
To: mykidong mykid...@gmail.com
Sent: 2015-02-21 11:16:37 AM
See if https://issues.apache.org/jira/browse/SPARK-3660 helps you. My patch
has been accepted, and this enhancement is scheduled for 1.3.0.
This lets you specify initialRDD for updateStateByKey operation. Let me
know if you need any information.
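A sketch of the new overload, adapted from the stateful word-count pattern (the stream source and names are illustrative):

    import org.apache.spark.{HashPartitioner, SparkConf}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("StatefulCount")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("checkpoint") // required for stateful operations

    // Hypothetical initial state and input stream.
    val initialRDD = ssc.sparkContext.parallelize(Seq(("hello", 1), ("world", 1)))
    val pairs = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" ")).map((_, 1))

    val updateFunc = (values: Seq[Int], state: Option[Int]) =>
      Some(values.sum + state.getOrElse(0))

    // The third argument is the initialRDD introduced by SPARK-3660.
    val state = pairs.updateStateByKey[Int](updateFunc,
      new HashPartitioner(ssc.sparkContext.defaultParallelism), initialRDD)
    state.print()
    ssc.start()
    ssc.awaitTermination()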
On Sun, Feb 22, 2015 at 5:21 PM, Tobias Pfeiffer