Executors slow down when running on the same node

2018-05-21 Thread Javier Pareja
Mesos assigns resources, I might have to submit it several times until it works. Does anyone know what could be wrong? Any idea what I can look into? Network, Max Host Connections, shared VM...? Javier Pareja

Re: [Spark2.1] SparkStreaming to Cassandra performance problem

2018-04-30 Thread Javier Pareja
, > tt timestamp, > in_tt timestamp, > out_tt timestamp, > sensor_id int, > measure double, > PRIMARY KEY (mid, tt, sensor_id, in_tt, out_tt) > ) with compact storage; > > The system CPU while the demo is running is almost always at 100% for both > cores.

Re: [Spark2.1] SparkStreaming to Cassandra performance problem

2018-04-29 Thread Javier Pareja
sed "map" directly instead of using transform, but > the *kafkaStream* is created with KafkaUtils which does not have a method > to save to Cassandra directly. > > Do you know any workaround for this? > > > Thank you for the suggestion. > > Best Regards, > > O

Re: [Spark2.1] SparkStreaming to Cassandra performance problem

2018-04-29 Thread Javier Pareja
Hi Saulo, I'm no expert but I will give it a try. I would remove the rdd2.count(); I can't see the point of it, and you will gain performance right away. Because of this, I would not use a transform, just the map directly. I have not used Python, but in Scala the cassandra-spark connector can save