Hi, did you guys figure it out?
Thanks
Soumitra
On Sun, Mar 5, 2017 at 9:51 PM nguyen duc Tuan wrote:
> Hi everyone,
> We are deploying a Kafka cluster for ingesting streaming data. But
> sometimes, some of the nodes in the cluster have trouble (a node dies, the Kafka
> daemon is
If your job involves a shuffle, then the total time for the entire batch will
increase with network latency. What would be interesting is to see how much
time each task/job/stage takes.
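One way to see that, as a sketch rather than anything from this thread, is to register a SparkListener that logs task and stage durations (the Spark UI exposes the same numbers interactively):

import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted, SparkListenerTaskEnd}

// Logs per-task and per-stage wall-clock time so batch slowdowns
// can be traced to a specific stage. Register it with
// ssc.sparkContext.addSparkListener(new TimingListener).
class TimingListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val info = taskEnd.taskInfo
    // TaskInfo.duration = finish time - launch time, in milliseconds
    println(s"stage=${taskEnd.stageId} task=${info.taskId} durationMs=${info.duration}")
  }

  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
    val si = stageCompleted.stageInfo
    val durationMs = for {
      start <- si.submissionTime
      end   <- si.completionTime
    } yield end - start
    println(s"stage=${si.stageId} name=${si.name} durationMs=${durationMs.getOrElse(-1L)}")
  }
}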
On Thu, Sep 22, 2016 at 5:11 PM Peter Figliozzi
wrote:
> It seems to me they must
Hi,
I am running a streaming job with 4 executors and 16 cores so that each
executor has two cores to work with. The input Kafka topic has 4 partitions.
With this given configuration I was expecting MapWithStateRDD to be evenly
distributed across all executors; however, I see that it uses only two
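A minimal sketch of one fix, assuming a mapWithState pipeline: StateSpec accepts an explicit partition count (or a Partitioner), so the state RDD does not have to inherit the topic's 4 partitions. The update function and stream name below are hypothetical:

import org.apache.spark.HashPartitioner
import org.apache.spark.streaming.{State, StateSpec}

// Hypothetical update function: keeps a running sum per key.
def trackState(key: String, value: Option[Int], state: State[Int]): Option[(String, Int)] = {
  val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
  state.update(sum)
  Some((key, sum))
}

// Ask for 8 state partitions (one per core across the executors)
// instead of inheriting the 4 partitions of the Kafka topic:
val spec = StateSpec.function(trackState _).numPartitions(8)
// or: StateSpec.function(trackState _).partitioner(new HashPartitioner(8))

// `kafkaPairStream` is a hypothetical DStream[(String, Int)] read from the topic.
val stateStream = kafkaPairStream.mapWithState(spec)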
I think a better partitioning scheme can help you too.
On Tue, May 10, 2016 at 10:31 AM Cody Koeninger wrote:
> maxRate is not used by the direct stream.
>
> Significant skew in rate across different partitions for the same
> topic is going to cause you all kinds of problems,
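For reference, a sketch of the settings that do apply to the direct stream (the values are placeholders):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("DirectKafkaApp") // hypothetical app name
  // Max records per second *per Kafka partition*; unlike
  // spark.streaming.receiver.maxRate this is honored by the direct stream.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
  // Optionally let Spark adapt the ingest rate to the batch processing time:
  .set("spark.streaming.backpressure.enabled", "true")

Because the cap is per partition, a skewed topic still produces skewed tasks; it only bounds how bad the hottest partition can get.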
Hi,
I have two applications : App1 and App2.
On a single cluster I have to spawn 5 instances of App1 and 1 instance of
App2.
What would be the best way to send data from the 5 App1 instances to the
single App2 instance ?
Right now I am using Kafka to send data from one spark application to the
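Since Kafka is already the transport, here is a minimal sketch of the App1 producing side; the broker address, the topic name "app2-input", and the DStream name `outputStream` are all assumptions:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// One producer per RDD partition; every App1 instance writes to the
// same topic, and the single App2 instance consumes it.
outputStream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)
    records.foreach { r =>
      producer.send(new ProducerRecord[String, String]("app2-input", r))
    }
    producer.close()
  }
}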
Hi,
I am relatively new to Spark and am using updateStateByKey() operation to
maintain state in my Spark Streaming application. The input data is coming
through a Kafka topic.
1. I want to understand how DStreams are partitioned.
2. How does the partitioning work with mapWithState() or
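A sketch of the relevant API, under the usual assumptions: pair DStreams default to a HashPartitioner with spark.default.parallelism partitions, and updateStateByKey accepts an explicit partitioner for the state RDD. The names `pairs` and `updateCount` below are hypothetical:

import org.apache.spark.HashPartitioner

// Running count per key; returning None would drop the key from the state.
def updateCount(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] =
  Some(newValues.sum + runningCount.getOrElse(0))

// `pairs` is a DStream[(String, Int)] built from the Kafka topic.
// Passing a partitioner pins the state RDD to 8 partitions:
val stateDStream = pairs.updateStateByKey(updateCount _, new HashPartitioner(8))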
Hi, in the example:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/StatefulNetworkWordCount.scala
the streaming frequency is 1 second; however, I do not want to print the
contents of the word counts every minute and resend the word counts
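If the goal is minute-level output on a 1-second batch interval, one hedged sketch is to window the state stream with a one-minute slide, so the print() output operation only fires when the window slides; `stateDstream` stands in for the example's state stream:

import org.apache.spark.streaming.Minutes

// RDDs of the windowed stream materialize once per minute, so print()
// emits once per minute; the per-second state updates continue underneath.
stateDstream.window(Minutes(1), Minutes(1)).print()

The window unions sixty batches of emitted records, so a reduceByKey over the window (or mapWithState's stateSnapshots()) may be needed to keep one row per word.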
Hi All,
I have a Graph with millions of edges. Edges are represented by
org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[String]]
= MappedRDD[4]. I have two questions:
1) How can I fetch all the nodes with a given edge label (all edges with a
given property)?
2) Is it possible to create
Hi,
With respect to the first issue, one possible way is to filter the graph
via 'graph.subgraph(epred = e => e.attr == edgeLabel)', but I am still
curious if we can index RDDs.
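Expanding that into a runnable sketch (the `graph` value and the label are assumptions):

val edgeLabel = "follows" // hypothetical label

// 1) Keep only the edges carrying the label:
val labeled = graph.subgraph(epred = e => e.attr == edgeLabel)

// 2) The node ids touched by those edges (subgraph keeps all vertices,
//    so collect ids from the surviving edges instead):
val nodeIds = labeled.edges
  .flatMap(e => Iterator(e.srcId, e.dstId))
  .distinct()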
Warm Regards
Soumitra
On Tue, Oct 14, 2014 at 2:46 PM, Soumitra Johri
soumitra.siddha...@gmail.com wrote:
Hi