will
work fine if there is only a single access at a time to this object. So, my
question is: how many threads in each worker access broadcast variables?
Thanks in advance,
Eduardo
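For the archive: each executor runs one task per core concurrently in the same
JVM, and all of those task threads read the same broadcast value, so an object
that only tolerates a single access at a time does need protection (or should
be immutable). A minimal Scala sketch of the safe pattern, with hypothetical
names:

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("broadcast-demo"))

  // Broadcast immutable data: safe to read from many task threads at once.
  val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))

  val data = sc.parallelize(Seq("a", "b", "a"))
  // Several tasks on one executor may call lookup.value concurrently;
  // with an immutable Map this needs no locking.
  data.map(word => lookup.value.getOrElse(word, 0)).collect().foreach(println)

For mutable, non-thread-safe state, one alternative is to build a fresh
instance per partition inside mapPartitions rather than sharing a single
broadcast instance across task threads.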
I had this problem at my work.
I solved it by increasing the Unix ulimit, because Spark was trying to open too
many files.
On Sep 29, 2017, 5:05 PM, "Anthony Thomas"
wrote:
> Hi Spark Users,
>
> I recently compiled spark 2.2.0 from source on an EC2 m4.2xlarge instance
mps
are equal)?
Additionally, what is the use of a sliding window?
Thanks,
Eduardo
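For reference, a sliding window counts each event in every window it overlaps;
omitting the slide duration gives non-overlapping (tumbling) windows. A minimal
Structured Streaming sketch in Scala, assuming a hypothetical socket source and
processing-time timestamps:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.window

  val spark = SparkSession.builder.appName("window-demo").getOrCreate()
  import spark.implicits._

  val events = spark.readStream
    .format("socket").option("host", "localhost").option("port", 9999)
    .load()
    .selectExpr("CAST(value AS STRING) AS word", "current_timestamp() AS ts")

  // 5-minute windows sliding every minute: each event lands in up to 5 windows.
  val counts = events
    .groupBy(window($"ts", "5 minutes", "1 minute"), $"word")
    .count()

  counts.writeStream.outputMode("complete").format("console").start()
    .awaitTermination()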
2017-09-11 13:11 GMT-03:00 Burak Yavuz <brk...@gmail.com>:
> Hi Eduardo,
>
> What you have written out is to output counts "as fast as possible" for
> windows of 5 minute length and with a s
seems to change the behavior, but
it is still far from what I expected.
What is wrong with my assumptions on the way it should work? Given the
code, how should the sample output be interpreted or used?
Thanks,
Eduardo
What program do you use to profile Spark?
On Fri, Jun 23, 2017 at 3:07 PM, Marcelo Vanzin wrote:
> That thread looks like the connection between the Spark process and
> jvisualvm. It's expected to show high up when doing sampling if the
> app is not doing much else.
>
> On
You can add "?zeroDateTimeBehavior=convertToNull" to the connection string.
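The parameter rides along in the JDBC URL itself; a Scala sketch with
hypothetical host, database, and table names (the option makes MySQL's
Connector/J return NULL for zeroed '0000-00-00' dates instead of throwing):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder.appName("jdbc-demo").getOrCreate()

  // The MySQL JDBC driver must be on the classpath.
  val url = "jdbc:mysql://dbhost:3306/mydb?zeroDateTimeBehavior=convertToNull"

  val df = spark.read.format("jdbc")
    .option("url", url)
    .option("dbtable", "events")   // hypothetical table
    .option("user", "user")
    .option("password", "secret")
    .load()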
On Wed, Jun 21, 2017 at 9:04 AM, Aviral Agarwal
wrote:
> The exception is happening in JDBC RDD code where getNext() is called to
> get the next row.
> I do not have access to the result set. I am
Is there a way to write a transformation that, for each entry of an RDD, uses
certain other values of another RDD? As an example, imagine you have an RDD of
entries for which you want to predict a certain label. In a second RDD, you
have historical data. So for each entry in the first RDD, you want to find similar
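One common way to express this, sketched in Scala with made-up data: pair every
entry with every historical record via cartesian and keep the closest match
(for a large history, broadcasting the smaller dataset is the usual
alternative):

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("similarity-demo"))

  val toPredict = sc.parallelize(Seq((1L, 0.5), (2L, 1.5)))    // (id, feature)
  val history   = sc.parallelize(Seq((0.4, "A"), (1.6, "B")))  // (feature, label)

  // For each entry, keep the historical record with the smallest distance.
  val nearest = toPredict.cartesian(history)
    .map { case ((id, x), (hx, label)) => (id, (math.abs(x - hx), label)) }
    .reduceByKey((a, b) => if (a._1 < b._1) a else b)

  nearest.collect().foreach(println)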
Hi Aida,
The build has detected Maven version 3.0.3. Update to 3.3.3 and
try again.
On 08/Mar/2016 14:06, "Aida" <aida1.tef...@gmail.com> wrote:
> Hi all,
>
> Thanks everyone for your responses; really appreciate it.
>
> Eduardo - I tried your sugge
Hi Aida
Run only "build/mvn -DskipTests clean package"
BR
Eduardo Costa Alfaia
Ph.D. Student in Telecommunications Engineering
Università degli Studi di Brescia
Tel: +39 3209333018
On 3/4/16, 16:18, "Aida" <aida1.tef...@gmail.com> wrote:
>Hi all,
Hi,
try http://OAhtvJ5MCA:8080
BR
On 2/19/16, 07:18, "vasbhat" wrote:
>OAhtvJ5MCA
Hi Gourav,
I did a test as you suggested and it is working for me; I am using Spark in
local mode, master and worker on the same machine. I ran the example in
spark-shell --packages com.databricks:spark-csv_2.10:1.3.0 without errors.
BR
From: Gourav Sengupta
Date: Monday,
Hi Guys,
How can I unsubscribe the address e.costaalf...@studenti.unibs.it, which is an
alias of my address e.costaalf...@unibs.it and is registered in the mailing
list?
Thanks
Eduardo Costa Alfaia
PhD Student Telecommunication Engineering
Università degli Studi di Brescia-UNIBS
in
the "Executor Computing Time" in History Server.
Do you recommend any documentation to understand better the History Server
logs and maybe more stats included in the log files?
Thanks in advance,
Carlos Eduardo M. Santos
CS PhD student
Akhil Das:
Thanks for your reply. I am using exactly the same installation everywhere.
Actually, the spark directory is shared among all nodes, including the
place where I start pyspark. So, I believe this is not the problem.
Regards,
Eduardo
On Mon, Jul 13, 2015 at 3:56 AM, Akhil Das ak
My installation of Spark is not working correctly in my local cluster. I
downloaded spark-1.4.0-bin-hadoop2.6.tgz and untarred it in a directory
visible to all nodes (these nodes are all accessible by ssh without a
password). In addition, I edited conf/slaves so that it contains the names
of the nodes.
Hi, I changed my process flow.
Now I am processing a file per hour, instead of processing at the end of the
day.
This decreased the memory consumption.
Regards
Eduardo
On Thu, Mar 26, 2015 at 3:16 PM, Davies Liu dav...@databricks.com wrote:
Could you narrow down to a step which cause
, Mar 26, 2015 at 10:02 AM, Eduardo Cusa
eduardo.c...@usmediaconsulting.com wrote:
I am running on EC2:
1 Master: 4 CPU, 15 GB RAM (2 GB swap)
2 Slaves: 4 CPU, 15 GB RAM
The uncompressed dataset size is 15 GB.
On Thu, Mar 26, 2015 at 10:41 AM, Eduardo Cusa
eduardo.c
I am running on EC2:
1 Master: 4 CPU, 15 GB RAM (2 GB swap)
2 Slaves: 4 CPU, 15 GB RAM
The uncompressed dataset size is 15 GB.
On Thu, Mar 26, 2015 at 10:41 AM, Eduardo Cusa
eduardo.c...@usmediaconsulting.com wrote:
Hi Davies, I upgraded to 1.3.0 and am still getting Out of Memory.
I ran
a taste for the new DataFrame API.
On Wed, Mar 25, 2015 at 11:49 AM, Eduardo Cusa
eduardo.c...@usmediaconsulting.com wrote:
Hi Davies, I am running 1.1.0.
Now I'm following this thread, which recommends using the batchsize parameter = 1:
http://apache-spark-user-list.1001560.n3.nabble.com/pySpark
dataset completed successfully
Any ideas for debugging are welcome.
Regards
Eduardo
Liu dav...@databricks.com wrote:
What's the version of Spark you are running?
There is a bug in the SQL Python API [1]; it's fixed in 1.2.1 and 1.3.
[1] https://issues.apache.org/jira/browse/SPARK-6055
On Wed, Mar 25, 2015 at 10:33 AM, Eduardo Cusa
eduardo.c...@usmediaconsulting.com wrote:
Hi
Thanks Ted.
On Feb 10, 2015, at 20:06, Ted Yu yuzhih...@gmail.com wrote:
Please take a look at:
examples/scala-2.10/src/main/java/org/apache/spark/examples/streaming/JavaDirectKafkaWordCount.java
which was checked in yesterday.
On Sat, Feb 7, 2015 at 10:53 AM, Eduardo Costa Alfaia
Hi Guys,
How could I do in Java the Scala code below?
val KafkaDStreams = (1 to numStreams) map { _ =>
KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](ssc,
kafkaParams, topicMap, storageLevel = StorageLevel.MEMORY_ONLY).map(_._2)
}
val unifiedStream =
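The unifiedStream line is cut off above; for completeness, a sketch of the full
Scala idiom with placeholder Kafka settings (the Java API has a matching
JavaStreamingContext.union for the same step):

  import kafka.serializer.StringDecoder
  import org.apache.spark.SparkConf
  import org.apache.spark.storage.StorageLevel
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  val ssc = new StreamingContext(new SparkConf().setAppName("kafka-union"), Seconds(2))
  val kafkaParams = Map("zookeeper.connect" -> "zkhost:2181", "group.id" -> "demo")
  val topicMap = Map("test" -> 1)
  val numStreams = 3

  val kafkaDStreams = (1 to numStreams).map { _ =>
    KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicMap, StorageLevel.MEMORY_ONLY).map(_._2)
  }
  // Merge the partial streams into a single DStream.
  val unifiedStream = ssc.union(kafkaDStreams)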
Hi Guys,
I'm getting this error in KafkaWordCount:
TaskSetManager: Lost task 0.0 in stage 4095.0 (TID 1281, 10.20.10.234):
java.lang.ClassCastException: [B cannot be cast to java.lang.String
I don't think so, Sean.
On Feb 5, 2015, at 16:57, Sean Owen so...@cloudera.com wrote:
Is SPARK-4905 / https://github.com/apache/spark/pull/4371/files the same
issue?
On Thu, Feb 5, 2015 at 7:03 AM, Eduardo Costa Alfaia
e.costaalf...@unibs.it wrote:
Hi Guys,
I’m getting this error
.
`DefaultDecoder` returns Array[Byte], not String, so the class cast here
will fail.
Thanks
Jerry
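A sketch of the corresponding fix, reusing ssc, kafkaParams and topicMap from
the code in this thread: declare both decoders as StringDecoder so the stream
carries (String, String) pairs and no Array[Byte]-to-String cast happens.

  import kafka.serializer.StringDecoder
  import org.apache.spark.storage.StorageLevel
  import org.apache.spark.streaming.kafka.KafkaUtils

  val lines = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicMap, StorageLevel.MEMORY_ONLY)
    .map(_._2)  // keep only the message value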
-Original Message-
From: Eduardo Costa Alfaia [mailto:e.costaalf...@unibs.it]
Sent: Friday, February 6, 2015 12:04 AM
To: Sean Owen
Cc: user@spark.apache.org
Subject: Re: Error
Hi Guys,
I would like to put the Kafka parameter
val kafkaParams = Map("fetch.message.max.bytes" -> "400")
into the KafkaWordCount Scala code. I've put this variable in like this:
val KafkaDStreams = (1 to numStreams) map { _ =>
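The snippet breaks off here; a sketch of the whole pattern (ZooKeeper address
and group id are placeholders, imports as in the union sketch above):
createStream forwards every entry of kafkaParams to the underlying Kafka
consumer, so fetch.message.max.bytes travels with the rest of the
configuration.

  val kafkaParams = Map(
    "zookeeper.connect"       -> "zkhost:2181",
    "group.id"                -> "wordcount-group",
    "fetch.message.max.bytes" -> "400")

  val kafkaStreams = (1 to numStreams).map { _ =>
    KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicMap, StorageLevel.MEMORY_ONLY).map(_._2)
  }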
Hi Guys,
Any idea how to solve this error?
[error]
/sata_disk/workspace/spark-1.1.1/examples/src/main/scala/org/apache/spark/examples/streaming/KafkaWordCount.scala:76:
missing parameter type for expanded function ((x$6, x$7) => x$6.$plus(x$7))
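The expanded function in the error is most likely the `_ + _` / `_ - _`
shorthand at line 76 of the stock KafkaWordCount. The usual fix, sketched here
assuming words: DStream[String] as in that example, is to spell out the
parameter types the compiler could not infer:

  import org.apache.spark.streaming.{Minutes, Seconds}
  import org.apache.spark.streaming.StreamingContext._  // pair-DStream ops in Spark 1.x
  import org.apache.spark.streaming.dstream.DStream

  // Note: the inverse-reduce form keeps state, so ssc.checkpoint(...) is required.
  def windowedCounts(words: DStream[String]): DStream[(String, Long)] =
    words.map(x => (x, 1L)).reduceByKeyAndWindow(
      (a: Long, b: Long) => a + b,   // counts entering the window
      (a: Long, b: Long) => a - b,   // counts leaving the window
      Minutes(10), Seconds(2), 2)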
I have the same issue.
----- Original message -----
From: Rasika Pohankar rasikapohan...@gmail.com
Sent: 18/01/2015 18:48
To: user@spark.apache.org user@spark.apache.org
Subject: Spark Streaming with Kafka
I am using Spark Streaming to process data received through Kafka. The Spark
the build
file https://github.com/knoldus/Play-Spark-Scala/blob/master/build.sbt
of your Play application, it seems that it uses Spark 1.0.1.
Thanks
Best Regards
On Fri, Jan 9, 2015 at 7:17 PM, Eduardo Cusa
eduardo.c...@usmediaconsulting.com wrote:
Hi guys, I am running the following example
Hi guys, I am running the following example:
https://github.com/knoldus/Play-Spark-Scala in the same machine as the
spark master, and the spark cluster was launched with the ec2 script.
I'm stuck with these errors; any idea how to fix them?
Regards
Eduardo
call the play app prints the following
Eduardo
On Sat, Dec 20, 2014 at 7:53 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
What version of the script are you running? What did you see in the EC2
web console when this happened?
Sometimes instances just don't come up in a reasonable amount of time and
you have to kill
cluster vpc_spark...
Spark AMI: ami-5bb18832
Launching instances...
Launched 1 slaves in us-east-1a, regid = r-e9d603c4
Launched master in us-east-1a, regid = r-89d104a4
Waiting for cluster to enter 'ssh-ready' state...
Any ideas what happened?
regards
Eduardo
Hi guys.
I ran the following command to launch a new cluster:
./spark-ec2 -k test -i test.pem -s 1 --vpc-id vpc-X --subnet-id
subnet-X launch vpc_spark
The instances started OK, but the command never ends, with the following
output:
Setting up security groups...
Searching for existing
Hi guys,
Has anyone already tried doing this work?
Thanks
Hi Guys,
I am doing some tests with JavaKafkaWordCount; my cluster is composed of 8
workers and 1 driver with spark-1.1.0. I am using Kafka too, and I have some
questions about it.
1 - When I launch the command:
bin/spark-submit --class org.apache.spark.examples.streaming.JavaKafkaWordCount
Hi guys,
Were the Kafka examples in the master branch removed?
Thanks
: Association failed with
[akka.tcp://sparkMaster@10.0.2.20:7077]
My Spark master runs on 10.0.2.20.
From pyspark I can work properly.
Regards
Eduardo
Hi Guys,
I am doing some tests with Spark Streaming and Kafka, but I have seen something
strange. I have modified the JavaKafkaWordCount to use reduceByKeyAndWindow and
to print on the screen the accumulated counts of the words. In the beginning
Spark works very well; in each iteration the
.
On Thu, Nov 6, 2014 at 9:32 AM, Eduardo Costa Alfaia e.costaalf...@unibs.it
wrote:
Hi Guys,
I am doing some tests with Spark Streaming and Kafka, but I have seen
something strange. I have modified the JavaKafkaWordCount to use
reduceByKeyAndWindow and to print on the screen
.
On Mon, Nov 3, 2014 at 6:57 AM, Eduardo Costa Alfaia e.costaalf...@unibs.it
wrote:
Hi Guys,
Could anyone explain to me how to use Kafka with Spark? I am using the
JavaKafkaWordCount.java as a test, and the command line is:
./run-example org.apache.spark.streaming.examples.JavaKafkaWordCount
Hi Guys,
Could anyone explain to me how to use Kafka with Spark? I am using the
JavaKafkaWordCount.java as a test, and the command line is:
./run-example org.apache.spark.streaming.examples.JavaKafkaWordCount
spark://192.168.0.13:7077 computer49:2181 test-consumer-group unibs.it 3
and like a
Hi TD,
I have sent more information now using 8 workers. The gap is now 27 seconds.
Have you seen it?
Thanks
BR
Ok Andrew,
Thanks
I sent information about the test with 8 workers, and the gap has grown.
On May 4, 2014, at 2:31, Andrew Ash and...@andrewash.com wrote:
From the logs, I see that the print() starts printing stuff 10 seconds
after the context is started. And that 10 seconds is taken by the
. And that does
not seem to be a persistent problem as after that 10 seconds, the data is
being received and processed.
TD
On Fri, May 2, 2014 at 2:14 PM, Eduardo Costa Alfaia e.costaalf...@unibs.it
wrote:
Hi TD,
I got more information today using Spark 1.0 RC3, and the situation
Hi TD,
In my tests with Spark Streaming, I'm using JavaNetworkWordCount (modified) code
and a program that I wrote that sends words to the Spark worker; I use TCP as
the transport. I verified that after starting Spark, it connects to my source,
which actually starts sending, but the first word count
no
room for processing the received data. It could be that after 30 seconds, the
server disconnects, the receiver terminates, releasing the single slot for
the processing to proceed.
TD
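The practical consequence of TD's point: a receiver pins one task slot for its
whole lifetime, so the application needs more cores than receivers for any
processing to run at all. A minimal Scala sketch (local mode shown; on a
cluster the analogue is total executor cores greater than the number of
receivers):

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  // "local[2]": one core for the socket receiver, at least one for processing.
  val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
  val ssc = new StreamingContext(conf, Seconds(1))
  ssc.socketTextStream("localhost", 9999).count().print()
  ssc.start()
  ssc.awaitTermination()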
On Tue, Apr 29, 2014 at 2:28 PM, Eduardo Costa Alfaia
e.costaalf...@unibs.it wrote:
Hi TD
are facing?
TD
On Fri, Apr 4, 2014 at 8:03 AM, Eduardo Costa Alfaia
e.costaalf...@unibs.it wrote:
Hi guys,
I would like to know if this piece of code is right to use with a window.
JavaPairDStream<String, Integer> wordCounts = words.map(
new
Hi Guys,
I would like to understand why the driver's RAM goes down. Does the
processing occur only in the workers?
Thanks
# Start Tests
computer1(Worker/Source Stream)
23:57:18 up 12:03, 1 user, load average: 0.03, 0.31, 0.44
[truncated `free` output]
Hi all,
Could anyone explain the lines below to me?
computer1 - worker
computer8 - driver(master)
14/04/04 14:24:56 INFO BlockManagerMasterActor$BlockManagerInfo: Added
input-0-1396614314800 in memory on computer1.ant-net:60820 (size: 1262.5
KB, free: 540.3 MB)
14/04/04 14:24:56 INFO
Hi all,
I am doing some tests using JavaNetworkWordcount and I have some
questions about the machine performance; my tests last
approximately 2 minutes.
Why does the RAM usage decrease so markedly? I have done tests with 2 and 3
machines and got the same behavior.
What should I
Hi Guys,
Could anyone help me understand this driver behavior when I start the
JavaNetworkWordCount?
computer8
16:24:07 up 121 days, 22:21, 12 users, load average: 0.66, 1.27, 1.55
[truncated `free` output: Mem total 5897 MB]
Hi all,
I have put this line in my spark-env.sh:
-Dspark.default.parallelism=20
Is this parallelism level correct?
The machine's processor is a dual core.
Thanks
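For reference, the same property can also be set from application code; a
sketch (the value is illustrative, and the tuning guide's rule of thumb is
roughly 2-3 tasks per CPU core, so 20 is high for a dual-core machine):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("parallelism-demo")
    .set("spark.default.parallelism", "20")  // illustrative value
  val sc = new SparkContext(conf)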
Hi Guys,
Could anyone explain this behavior to me? After 2 minutes of tests:
computer1- worker
computer10 - worker
computer8 - driver(master)
computer1
18:24:31 up 73 days, 7:14, 1 user, load average: 3.93, 2.45, 1.14
[truncated `free` output]
, Eduardo Costa Alfaia
e.costaalf...@unibs.it wrote:
Hi all,
I have put this line in my spark-env.sh:
-Dspark.default.parallelism=20
Is this parallelism level correct?
The machine's processor is a dual core.
Thanks
problem you are facing?
TD
On Fri, Apr 4, 2014 at 8:03 AM, Eduardo Costa Alfaia
e.costaalf...@unibs.it wrote:
Hi guys,
I would like to know if this piece of code is right to use with a window.
JavaPairDStream<String, Integer> wordCounts = words.map
Hi Guys
I would like to print the content of `lines` in:
JavaDStream<String> lines = ssc.socketTextStream(args[1],
Integer.parseInt(args[2]));
JavaDStream<String> words = lines.flatMap(new
FlatMapFunction<String, String>() {
@Override
public Iterable<String> call(String x) {
Thank you very much Sourav
BR
On 3/26/14, 17:29, Sourav Chandra wrote:
def print() {
  def foreachFunc = (rdd: RDD[T], time: Time) => {
    val total = rdd.collect().toList
    println("-------------------------------------------")
    println("Time: " + time)
    println
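Given that implementation, one way to print more than the default 10 elements
without patching Spark is to replicate print() with foreachRDD; a sketch with a
hypothetical helper:

  import org.apache.spark.streaming.dstream.DStream

  // Hypothetical helper: print the first `num` elements of each batch.
  def printFirst[T](stream: DStream[T], num: Int): Unit =
    stream.foreachRDD { (rdd, time) =>
      println("-------------------------------------------")
      println("Time: " + time)
      rdd.take(num).foreach(println)  // take(num) avoids collecting the whole RDD
    }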
Hi Guys,
I think I have already asked this question, but I don't remember whether anyone
has answered me. I would like to change, in the print() function, the
quantity of words and the frequency counts that are sent to the driver's
screen. The default value is 10.
Could anyone help me with this?
Best
Hi Guys,
Could anyone help me understand this piece of the log (marked in red)? Why did
this happen?
Thanks
14/03/10 16:55:20 INFO SparkContext: Starting job: first at
NetworkWordCount.scala:87
14/03/10 16:55:20 INFO JobScheduler: Finished job streaming job
1394466892000 ms.0 from job set of time
Yes TD,
I can use tcpdump to see whether the data are being accepted by the receiver
and whether they are arriving in the IP packets.
Thanks
On 3/8/14, 4:19, Tathagata Das wrote:
I am not sure how to debug this without any more information about the
source. Can you monitor on the receiver side