Re: Get filename in Spark Streaming

2015-02-24 Thread Emre Sevinc
Hello Subacini, Until someone more knowledgeable suggests a simpler, more straightforward approach with a working code snippet, I suggest the following workaround / hack: inputStream.foreachRDD(rdd => { val myStr = rdd.toDebugString // process myStr string value, e.g. using
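
A minimal sketch of the "process myStr" step suggested above, assuming the debug string contains the full input paths as whitespace-separated tokens. The layout of toDebugString output is not a stable API, so that is an assumption that should be checked against real output. The names DebugStringFiles and txtPaths are hypothetical, and the .txt suffix is taken from the example file name in the original question.

```scala
object DebugStringFiles {
  // Path-like tokens ending in ".txt": a run of characters that are not
  // whitespace, parentheses or square brackets, followed by the suffix.
  // Assumption: the debug string prints each input path as one such token.
  private val TxtPath = """[^\s()\[\]]+\.txt""".r

  // Returns every distinct ".txt" path found in the given debug string,
  // in order of first appearance.
  def txtPaths(debugString: String): List[String] =
    TxtPath.findAllIn(debugString).toList.distinct
}
```

Inside the streaming job this would be called as DebugStringFiles.txtPaths(rdd.toDebugString) within foreachRDD. Since it relies on a string meant for debugging, it is best treated as a stopgap.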

Re: Get filename in Spark Streaming

2015-02-06 Thread Subacini B
Thank you Emre, This helps; I am able to get the filename. But I am not sure how to fit this into a DStream RDD. val inputStream = ssc.textFileStream("/hdfs Path/") inputStream is a DStream, and I am doing my processing in foreachRDD: inputStream.foreachRDD(rdd => { *//how to get filename here??*

Re: Get filename in Spark Streaming

2015-02-05 Thread Emre Sevinc
Hello, Did you check the following? http://themodernlife.github.io/scala/spark/hadoop/hdfs/2014/09/28/spark-input-filename/ http://apache-spark-user-list.1001560.n3.nabble.com/access-hdfs-file-name-in-map-td6551.html -- Emre Sevinç On Fri, Feb 6, 2015 at 2:16 AM, Subacini B

Get filename in Spark Streaming

2015-02-05 Thread Subacini B
Hi All, We have file names that contain a timestamp, e.g. ABC_1421893256000.txt, and the timestamp needs to be extracted from the file name for further processing. Is there a way to get the name of the input file picked up by a Spark Streaming job? Thanks in advance, Subacini
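
Once the file name is available, pulling out the timestamp is plain string handling. A minimal sketch, assuming every file follows the pattern of the example above (a prefix, an underscore, digits, then .txt) and that the digits are epoch milliseconds. The names FileNameTimestamp and timestampOf are hypothetical.

```scala
import scala.util.Try
import scala.util.matching.Regex

object FileNameTimestamp {
  // Assumed naming scheme: <prefix>_<digits>.txt, e.g. ABC_1421893256000.txt.
  // The leading ".*" also lets a full path in front of the name match.
  private val Pattern: Regex = """.*_(\d+)\.txt""".r

  // Returns the numeric part of the name, or None if the name does not
  // follow the assumed scheme or the digits do not fit in a Long.
  def timestampOf(fileName: String): Option[Long] = fileName match {
    case Pattern(digits) => Try(digits.toLong).toOption
    case _               => None
  }
}
```

Returning an Option keeps files that do not match the scheme from failing the whole batch; the caller can skip or log them.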