Hi Sarath,
By any chance, have you resolved this issue?
Thanks,
Padma CH
On Tue, Apr 28, 2015 at 11:20 PM, sarath [via Apache Spark User List] wrote:
>
> I am trying to train a large dataset consisting of 8 million data points
> and 20 million
By default, Spark uses 2 executors with one core each. Have you allocated
more executors using the command-line args, i.e.
--num-executors 25 --executor-cores <x> ?
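For example (the executor count, cores, and memory below are placeholders to tune for your cluster; class and jar names are hypothetical):

spark-submit --master yarn \
  --num-executors 25 \
  --executor-cores 4 \
  --executor-memory 8g \
  --class YourMainClass yourapp.jar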
What do you mean by "the difference between the nodes is huge"?
Regards,
Padma Ch
On Tue, Mar 15, 2016 at 6:57 PM, bkapukaranov [via Apache Spark User List] wrote:
Something like the below ...

Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space
        at org.apache.spark.util.io.ByteArrayChunkOutputStream.allocateNewChunkIfNeeded(ByteArrayChunkOutputStream.scala:66)
Hi,
Can you please post the stack trace line by line? It's a bit difficult to
read it as one paragraph and make sense of it.
On Mon, Mar 14, 2016 at 3:11 PM, adamreith [via Apache Spark User List] wrote:
> Hi,
>
> I'm using spark
Hi All,
I am facing the same issue. Taking k values from 60 to 120, incrementing
by 10 each time (i.e. k takes the values 60, 70, 80, ..., 120), the
algorithm takes around 2.5 hours on an 800 MB data set with 38 dimensions.
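A minimal sketch of that kind of sweep (assuming the input is parsed into an RDD[Vector] and cached so every value of k reuses it; the path and maxIterations value are placeholders):

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.textFile("hdfs:///path/to/data")
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .cache()   // parse once, reuse across all values of k

for (k <- 60 to 120 by 10) {
  val model = KMeans.train(data, k, 20)   // 20 = maxIterations (placeholder)
  println(s"k=$k cost=${model.computeCost(data)}")
}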
On Sun, Mar 29, 2015 at 9:34 AM, davidshen84 [via Apache Spark User List] wrote:
Hi,
I am interested in the Streaming k-means algorithm and its forgetfulness
parameter. Can someone please shed some light on this?
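For reference, a minimal sketch of the API in question (the k, dimension, and decay values are arbitrary; a decay factor of 1 weights all history equally, while values closer to 0 forget old batches faster):

import org.apache.spark.mllib.clustering.StreamingKMeans

val model = new StreamingKMeans()
  .setK(3)                      // arbitrary number of clusters
  .setDecayFactor(0.5)          // the forgetfulness knob
  .setRandomCenters(2, 0.0)     // dimension 2, initial center weight 0.0

model.trainOn(trainingStream)   // trainingStream is a DStream[Vector]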
On Wed, Jul 29, 2015 at 11:23 AM, AmmarYasir [via Apache Spark User List] wrote:
>
> I read the post regarding
If you want to do the processing in parallel, never use collect() or any
action such as count() or first(); those compute the result and bring it
back to the driver. rdd.map does the processing in parallel. Once you have
processed the RDD, then save it to the DB.
rdd.foreach executes on the workers; in fact, it returns Unit.
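A minimal sketch of the pattern (transform, createConnection and insert are hypothetical placeholders for your own logic):

val processed = rdd.map(record => transform(record))   // runs in parallel on the workers

processed.foreachPartition { partition =>
  val conn = createConnection()                        // one DB connection per partition
  partition.foreach(row => insert(conn, row))
  conn.close()
}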
Hi,
I didn't get the point you want to make, i.e. "distribute computation
across nodes by restricting parallelism on each node". Do you mean you are
expecting only one task to run per node?
Can you please paste the configuration changes you made?
On Wed, Feb 24, 2016 at 11:24 PM,
rdd.collect() does not leave any of your processing on the workers: it
brings the entire RDD back to the driver as an in-memory collection, so
anything you do with the result afterwards runs on the driver.
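To make that concrete (a toy sketch):

val rdd = sc.parallelize(1 to 1000000)
val sum = rdd.map(_ * 2).reduce(_ + _)   // computed on the workers; only the result returns
val all = rdd.collect()                  // ships every element to the driver as an Array[Int]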
On Wed, Feb 24, 2016 at 10:58 PM, Anurag [via Apache Spark User List] wrote:
> Hi Everyone
>
> I am new to Scala and Spark.
>
> I
Hi Vaibhav,
As you said, from the second link I can figure out that it is not able
to cast the class when it is trying to read from the checkpoint. Can you
try an explicit cast, like asInstanceOf[T], on the broadcasted value?
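A sketch of the cast I mean (the Map value is just an example):

val bv = sc.broadcast(Map("a" -> 1))
val typed = bv.value.asInstanceOf[Map[String, Int]]   // explicit cast on read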
From the bug, it looks like it affects version 1.5. Try sample
Hi Vaibhav,
Please try the Kafka direct API approach. Is this not working?
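A minimal sketch of the direct approach on Spark 1.3 (broker list and topic name are placeholders):

import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
val topics = Set("your_topic")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)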
-- Padma Ch
On Tue, Feb 23, 2016 at 12:36 AM, vaibhavrtk1 [via Apache Spark User List] wrote:
> Hi
>
> I am using kafka with spark streaming 1.3.0 . When the spark
Hi,
When you say that you want to produce new information, are you looking to
push the processed data to other consumers?
Spark is definitely a good choice for real-time streaming computations.
Are you looking for near-real-time processing or strictly real-time
processing?
On Sun, Feb
Hi,
I am interested in this opportunity. I am working as a Research Engineer at
Impetus Technologies, Bangalore, India. In fact, we implemented distributed
deep learning on Spark. I will share my CV if you are interested.
Please visit the below link:
Yes. I built Spark 1.2 with Apache Hadoop 2.2. No compatibility issues.
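Roughly the build command, per the Spark 1.2 build docs (-Pyarn is only needed if you want YARN support):

mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package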
On Sat, Jan 17, 2015 at 4:47 AM, bhavyateja [via Apache Spark User List] wrote:
Is Spark 1.2 compatible with HDP 2.1?
and check where I am going wrong. My word count program is erroring out
when using Spark 1.2 on YARN, but it executes fine using Spark 0.9.1.
On Sat, Jan 17, 2015 at 5:55 AM, Chitturi Padma [via Apache Spark User List] wrote:
Hi,
I tried with try/catch blocks. In fact, inside mapPartitionsWithIndex a
method is invoked which does the operation. I put the operations inside
that function in a try...catch block, but that's of no use; the error still
persists. I even commented out all the operations and left a simple print statement
Include commons-math3 3.3 on the classpath while submitting the jar to the
Spark cluster, like:

spark-submit --driver-class-path /path/to/commons-math3-3.3.jar --class MainClass --master <spark cluster URL> app.jar
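If the executors need the library as well (not only the driver), --jars will ship it to the workers too, e.g.:

spark-submit --jars /path/to/commons-math3-3.3.jar --class MainClass --master <spark cluster URL> app.jar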
On Mon, Nov 17, 2014 at 1:55 PM, Ritesh Kumar Singh [via Apache Spark User List] wrote:
(/path/to/jar) within spark-shell and in my project source file.
It still didn't import the jar in either location.
More
Any fixes? Please help.
On Mon, Nov 17, 2014 at 2:14 PM, Chitturi Padma wrote:
Include the commons-math3/3.3
which means the details are not persisted, and hence after any failures the
worker and master daemons wouldn't restart normally... right?
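For reference, these are the standalone recovery settings I am referring to (the directory is a placeholder; they go in spark-env.sh per the standalone docs):

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/var/spark/recovery"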
On Wed, Oct 15, 2014 at 12:17 PM, Prashant Sharma [via Apache Spark User List] wrote:
[Removing dev lists]
You are
Is it possible to view the persisted RDD blocks?
If I use YARN, will the RDD blocks be persisted to HDFS, and will I then be
able to read those blocks as I could in Hadoop?
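For concreteness, the distinction I am asking about (the path is a placeholder):

import org.apache.spark.storage.StorageLevel

rdd.persist(StorageLevel.MEMORY_AND_DISK)   // blocks go to executor memory / spark.local.dir
rdd.saveAsTextFile("hdfs:///user/me/out")   // this is what writes readable files to HDFS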
On Tue, Sep 23, 2014 at 5:56 PM, Shao, Saisai [via Apache Spark User List] wrote:
I couldn't even see the spark-<id> folder in the default /tmp directory of
spark.local.dir.
On Tue, Sep 23, 2014 at 6:01 PM, Priya Ch learnings.chitt...@gmail.com
wrote:
Is it possible to view the persisted RDD blocks?
If I use YARN, will the RDD blocks be persisted to HDFS, and will I then be able
Hi,
I have a similar problem. I need matrix operations such as dot product,
cross product, transpose, and matrix multiplication to be performed on
Spark. Does Spark have a built-in API to support these?
I see a matrix factorization implementation in MLlib.
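For what it's worth, the local linear algebra library that MLlib bundles (Breeze) covers most of these on the driver side; a sketch with toy values (cross product not shown):

import breeze.linalg.{DenseMatrix, DenseVector}

val a = DenseMatrix((1.0, 2.0), (3.0, 4.0))
val b = DenseMatrix((5.0, 6.0), (7.0, 8.0))
val product = a * b    // matrix multiplication
val at = a.t           // transpose
val dot = DenseVector(1.0, 2.0) dot DenseVector(3.0, 4.0)   // dot product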
On Fri, Aug 8, 2014 at 12:38 PM, yaochunnan [via Apache Spark User List] wrote:
Hi,
I wanted to set up a standalone cluster on a Windows machine, but
unfortunately the spark-master.cmd file is not available. Can someone
suggest how to proceed, or is spark-master.cmd missing from spark-1.0.0?