How to skip nonexistent file when read files with spark?

2018-05-20 Thread JF Chen
Hi everyone, I ran into a tricky problem recently. I am trying to read some file paths generated by another method. The file paths are given as a list of wildcard patterns, like ['/data/*/12', '/data/*/13']. But in practice, if a wildcard cannot match any existing path, it will throw an
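
One possible way to approach the question above is to drop the patterns that match nothing before handing the list to Spark. The sketch below is only an illustration, not a method from the thread: it assumes the paths live on a local filesystem that Python's `glob` module can see, and the helper name `existing_patterns` is hypothetical. For HDFS or an object store, the same check would have to go through the Hadoop filesystem API instead.

```python
import glob
from typing import Iterable, List


def existing_patterns(patterns: Iterable[str]) -> List[str]:
    """Keep only the wildcard patterns that match at least one local path.

    Hypothetical helper: assumes a local filesystem visible to this process.
    """
    kept = []
    for pattern in patterns:
        # glob.glob returns an empty list when the pattern matches nothing.
        if glob.glob(pattern):
            kept.append(pattern)
    return kept
```

The filtered list could then be passed to the reader in place of the original one, and an empty result can be handled explicitly before any read is attempted.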

Re: [Spark2.1] SparkStreaming to Cassandra performance problem

2018-05-20 Thread Saulo Sobreiro
Hi Javier, Thank you a lot for the feedback. Indeed, the CPU is a huge limitation. I had a lot of trouble trying to run this use case in yarn-client mode. I managed to run this in standalone (local master) mode only. I do not have the hardware available to run this setup in a cluster yet, so I

is it possible to create one KafkaDirectStream (Dstream) per topic?

2018-05-20 Thread kant kodali
Hi All, I have 5 Kafka topics and I am wondering if it is even possible to create one KafkaDirectStream (DStream) per topic within the same JVM, i.e., using only one SparkContext? Thanks!

Re: Does Spark shows logical or physical plan when executing job on the yarn cluster

2018-05-20 Thread Ajay
You can look at the Spark application UI (port 4040 by default). It should tell you all the currently running stages as well as past/future stages. On Sun, May 20, 2018, 12:22 AM giri ar wrote: > Hi, > Good Day. > Could you please let me know whether we can see spark logical or

Does Spark shows logical or physical plan when executing job on the yarn cluster

2018-05-20 Thread giri ar
Hi, Good day. Could you please let me know whether we can see the Spark logical or physical plan while running a Spark job on the YARN cluster (e.g., the number of stages)? Thanks in advance. Thanks, Giri