Hi all,
I'm running a Spark Streaming application with 1-hour batches to join two
data feeds and write the output to disk. The total size of one data feed is
about 40 GB per hour (split into multiple files), while the size of the second
data feed is about 600-800 MB per hour (also split into multiple files).
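Per batch, joining the large feed against the much smaller one amounts to a hash join, with the small feed held entirely in memory (in Spark it could even be broadcast rather than shuffled). A minimal plain-Java sketch of that per-batch step, with hypothetical sample data and class names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class HashJoinSketch {
    // Inner-join records of the large feed against the small feed,
    // which is held entirely in an in-memory map keyed by join key.
    static List<String> join(List<String[]> large, Map<String, String> small) {
        List<String> out = new ArrayList<>();
        for (String[] kv : large) {
            String match = small.get(kv[0]);
            if (match != null) {            // inner join: keep matching keys only
                out.add(kv[0] + "," + kv[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> small = Map.of("k1", "x", "k3", "z");
        List<String[]> large =
            List.of(new String[]{"k1", "a"}, new String[]{"k2", "b"});
        System.out.println(join(large, small)); // [k1,a,x]
    }
}
```

Whether this pattern is applicable depends on the small feed actually fitting in executor memory; otherwise a shuffled join is needed.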
Hi all,
I'm running a Spark Streaming application that uses reduceByKeyAndWindow().
The window interval is 2 hours, while the slide interval is 1 hour. I have a
JavaPairRDD in which both keys and values are strings. Each time the
reduceByKeyAndWindow() function is called, it uses appendString()
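The per-window work of reduceByKeyAndWindow() with a string-appending reduce function can be sketched in plain Java; appendString() below is a hypothetical stand-in for the function mentioned above, and the sample pairs are made up:

```java
import java.util.HashMap;
import java.util.Map;

public class WindowReduceSketch {
    // Hypothetical reduce function, analogous to the appendString()
    // mentioned above: merges two string values by concatenation.
    static String appendString(String a, String b) {
        return a + "," + b;
    }

    // Reduce all K/V pairs in the window by key, which is what
    // reduceByKeyAndWindow() does across the batches in the window.
    static Map<String, String> reduceByKey(String[][] pairs) {
        Map<String, String> out = new HashMap<>();
        for (String[] kv : pairs) {
            out.merge(kv[0], kv[1], WindowReduceSketch::appendString);
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] window = { {"k1", "a"}, {"k2", "b"}, {"k1", "c"} };
        Map<String, String> reduced = reduceByKey(window);
        System.out.println(reduced.get("k1")); // a,c
        System.out.println(reduced.get("k2")); // b
    }
}
```

Note that repeated string concatenation in a reduce function copies the accumulated string on every merge; collecting values into a list (or using groupByKeyAndWindow) may be cheaper for long windows.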
Dear all,
Can someone please explain to me how Spark Streaming executes the window()
operation? From the Spark 1.6.1 documentation, it seems that windowed
batches are automatically cached in memory, but looking at the web UI it
seems that operations already executed in previous batches are executed
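Conceptually, a 2-hour window with a 1-hour slide keeps the RDDs of the last two batches and unions them on every slide. The bookkeeping can be sketched in plain Java (this is an illustration of the semantics, not Spark's internals; whether the per-batch results are recomputed or reused depends on caching):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class WindowSketch {
    private final int batchesPerWindow;   // windowLength / batchInterval
    private final Deque<List<String>> batches = new ArrayDeque<>();

    WindowSketch(int batchesPerWindow) {
        this.batchesPerWindow = batchesPerWindow;
    }

    // Add the newest batch, evict the oldest once the window is full,
    // and return the union of all batches currently in the window.
    List<String> slide(List<String> newBatch) {
        batches.addLast(newBatch);
        if (batches.size() > batchesPerWindow) {
            batches.removeFirst();
        }
        List<String> windowed = new ArrayList<>();
        for (List<String> b : batches) {
            windowed.addAll(b);
        }
        return windowed;
    }

    public static void main(String[] args) {
        WindowSketch w = new WindowSketch(2);       // e.g. 2-hour window, 1-hour slide
        System.out.println(w.slide(List.of("h1"))); // [h1]
        System.out.println(w.slide(List.of("h2"))); // [h1, h2]
        System.out.println(w.slide(List.of("h3"))); // [h2, h3]
    }
}
```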
Hi all,
It seems to me that Spark Streaming doesn't read symbolic links. Can you
confirm that?
Marco
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Symbolic-links-in-Spark-tp27062.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi experts,
I'm using Apache Spark Streaming 1.6.1 to write a Java application that
joins two Key/Value data streams and writes the output to HDFS. The two data
streams contain K/V strings and are periodically ingested into Spark from
HDFS using textFileStream().
The two data streams aren't
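For reference, JavaPairRDD.join() produces, for each key present in both inputs, a pair of the two values, i.e. (K, (V1, V2)). A plain-Java sketch of that inner-join result shape, with hypothetical sample data:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PairJoinSketch {
    // Mimic the result shape of JavaPairRDD.join(): for each key present
    // in both inputs, emit the key with the pair of left and right values.
    static List<String> join(Map<String, String> left, Map<String, String> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            String rv = right.get(e.getKey());
            if (rv != null) {
                out.add(e.getKey() + " -> (" + e.getValue() + ", " + rv + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> left = Map.of("k1", "a", "k2", "b");
        Map<String, String> right = Map.of("k1", "x");
        System.out.println(join(left, right)); // [k1 -> (a, x)]
    }
}
```

Keys present in only one stream are dropped by join(); leftOuterJoin() or fullOuterJoin() keep them.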