Hi All,

I am receiving data from AWS Kinesis using Spark Streaming and am writing
the collected data in the DStream to S3 using the output operation:

dstreamData.saveAsTextFiles("s3n://XXX:XXX@XXXX/")

After running the application for several seconds, I end up with a sequence
of directories in S3 with names like [PREFIX]-1425597204000.

At the same time I'd like to run a copy command on Redshift that pulls over
the exported data. The problem is that I am not sure how to extract the
folder names from the dstream object in order to construct the appropriate
COPY command.
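
One idea I've been toying with (a sketch only, assuming the Spark 1.2 Scala API; the table name, delimiter, and credentials below are placeholders): instead of saveAsTextFiles, use foreachRDD in the overload that passes in each batch's Time. That timestamp is the same one saveAsTextFiles appends to the prefix, so the output path can be rebuilt per batch and fed into the COPY statement:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.Time

// Placeholder prefix; substitute your bucket, path, and credentials.
val outputPrefix = "s3n://XXX:XXX@XXXX/batch"

dstreamData.foreachRDD { (rdd: RDD[String], time: Time) =>
  // Rebuild the same per-batch directory name that
  // saveAsTextFiles would have produced: [PREFIX]-<batch ms>.
  val path = s"$outputPrefix-${time.milliseconds}"
  rdd.saveAsTextFile(path)

  // With the path known, construct the Redshift COPY for this batch
  // (placeholder table and credentials; execute via JDBC).
  val copySql =
    s"""COPY my_table FROM '$path'
       |CREDENTIALS 'aws_access_key_id=XXX;aws_secret_access_key=XXX'
       |DELIMITER '\t'""".stripMargin
  // statement.execute(copySql)
}
```

This only works if the batch timestamp is in fact the only variable part of the directory name, which seems to match what I'm seeing in S3.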

https://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.streaming.dstream.DStream

Anyone have any ideas?

Thanks, Mike.
