I have the streaming program writing sequence files. I can find one of the files and load it in the shell using:
scala> val rdd = sc.sequenceFile[String, Int]("tachyon://localhost:19998/files/WordCounts/20140724-213930")
14/07/24 21:47:50 INFO storage.MemoryStore: ensureFreeSpace(32856) called with curMem=0, maxMem=309225062
14/07/24 21:47:50 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 32.1 KB, free 294.9 MB)
rdd: org.apache.spark.rdd.RDD[(String, Int)] = MappedRDD[1] at sequenceFile at <console>:12

So I got some type information back; seems good. It took a while to research, but I got the following streaming code to compile and run:

    val wordCounts = ssc.fileStream[String, Int, SequenceFileInputFormat[String, Int]](args(0))

It works now, and I offer this for reference to anybody else who may be curious about saving sequence files and then streaming them back in.

Question: when running both streaming programs at the same time using spark-submit, I noticed that only one app would really run. To get one app to continue, I had to stop the other app. Is there a way to get these running simultaneously?
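
For reference, here is a minimal, self-contained sketch of the read side built around that fileStream call. The object name, batch interval, and the print() output are placeholders rather than part of the actual program; the one non-obvious piece is that fileStream expects an InputFormat from the new Hadoop API, so the import is org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat

    object SequenceFileStreamReader {
      def main(args: Array[String]): Unit = {
        // args(0) is the directory to monitor, e.g. the Tachyon path the writer uses.
        val conf = new SparkConf().setAppName("SequenceFileStreamReader")
        val ssc = new StreamingContext(conf, Seconds(10))

        // fileStream takes the key type, the value type, and an InputFormat from the
        // "new" Hadoop API (org.apache.hadoop.mapreduce); the key/value types must
        // match how the sequence files were written.
        val wordCounts =
          ssc.fileStream[String, Int, SequenceFileInputFormat[String, Int]](args(0))

        wordCounts.print()
        ssc.start()
        ssc.awaitTermination()
      }
    }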