Hi All,
I am struggling with an odd issue and would like your help in addressing it.
Environment
AWS cluster (40 Spark nodes and a 4-node Kafka cluster)
Spark Kafka streaming job submitted in YARN cluster mode
Kafka - Single topic, 400 partitions
Spark 2.1 on Cloudera
Kafka 0.10.0 on Cloudera
We have zero
Hi,
You can enable backpressure to handle this.
spark.streaming.backpressure.enabled
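spark.streaming.receiver.maxRate (receiver-based streams only; with the direct approach the cap is spark.streaming.kafka.maxRatePerPartition)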
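
As a rough sketch of how these settings might be passed at submit time, the snippet below builds the matching --conf flags. The helper names (backpressure_conf, to_submit_args) and the rate of 1000 records per second are hypothetical placeholders, not values from this thread. It assumes that a direct Kafka stream should be capped with spark.streaming.kafka.maxRatePerPartition, since spark.streaming.receiver.maxRate only affects receiver-based streams.

```python
def backpressure_conf(max_rate, direct_stream=True):
    """Return Spark properties that enable backpressure and cap the ingest rate.

    max_rate is in records per second: per Kafka partition for a direct
    stream, per receiver otherwise. The value is a placeholder to tune.
    """
    conf = {"spark.streaming.backpressure.enabled": "true"}
    if direct_stream:
        # Direct approach: there is no receiver, so the cap is per partition.
        conf["spark.streaming.kafka.maxRatePerPartition"] = str(max_rate)
    else:
        # Receiver-based approach: the cap applies to each receiver.
        conf["spark.streaming.receiver.maxRate"] = str(max_rate)
    return conf


def to_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit --conf flags."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical cap of 1000 records/sec/partition for a direct stream.
    print(" ".join(to_submit_args(backpressure_conf(1000))))
```

The printed flags can be appended to the spark-submit command line. The cap acts as an upper bound, and backpressure then adjusts the actual rate below it based on batch processing times.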
Thanks,
Edwin
On Mar 18, 2017, 12:53 AM -0400, sagarcasual . , wrote:
> Hi, we have Spark 1.6.1 streaming from a Kafka (0.10.1) topic using direct
>
Hi All,
I believe what we are looking for here is a serving layer where user queries
can be executed on a subset of the processed data.
In this scenario, we are using Impala, as it provides layered caching.
In our use case it caches some of the data in memory, some in HDFS, and
the full