Hello Jose,
We hit the same issue a couple of months ago. It is possible to write
directly to files instead of creating directories, but it is not
straightforward, and I haven't seen any clear demonstration of it in books,
tutorials, etc.
We do something like:
SparkConf sparkConf = new
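The snippet above is cut off in the archive. What follows is a minimal sketch of the kind of setup this describes, assuming Spark's Java streaming API and Hadoop's FileSystem API; the class name, paths, and batch interval are illustrative assumptions, not the original code. The idea is to bypass saveAsTextFiles (which creates a per-batch directory of part-* files) and write each batch as a single HDFS file:

```java
// Sketch: write each streaming batch directly to ONE HDFS file instead of
// letting saveAsTextFiles create a per-batch directory of part-* files.
// Class name, paths, and interval are illustrative assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import java.util.List;

public class DirectFileOutput {
  public static void main(String[] args) throws Exception {
    SparkConf sparkConf = new SparkConf().setAppName("DirectFileOutput");
    JavaStreamingContext jssc =
        new JavaStreamingContext(sparkConf, Durations.seconds(30));

    // Source is illustrative; the thread's actual source is RabbitMQ.
    JavaDStream<String> lines = jssc.textFileStream("hdfs:///input");

    lines.foreachRDD((rdd, time) -> {
      // Collect the batch on the driver and write it as one file.
      // This is only reasonable for small batches; for large ones,
      // coalesce(1), save, and rename the single part file instead.
      List<String> batch = rdd.collect();
      if (batch.isEmpty()) {
        return;
      }
      FileSystem fs = FileSystem.get(new Configuration());
      Path out = new Path("hdfs:///output/batch-" + time.milliseconds() + ".txt");
      try (FSDataOutputStream os = fs.create(out)) {
        for (String line : batch) {
          os.writeBytes(line + "\n");
        }
      }
    });

    jssc.start();
    jssc.awaitTermination();
  }
}
```

Writing from the driver keeps the output as a plain file that a downstream fileStream can pick up, at the cost of funneling each batch through one machine.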
To clarify, sometimes in the world of Hadoop people freely refer to an
output 'file' when it's really a directory containing 'part-*' files which
are pieces of the file. It's imprecise but that's the meaning. I think the
scaladoc may be referring to 'the path to the file, which includes this
*Sent:* February 18, 2015 1:53 AM
*To:* Emre Sevinc
*Cc:* Jose Fernandez; user@spark.apache.org
*Subject:* Re: Spark Streaming output cannot be used as input?
Hello folks,
Our intended use case is:
- Spark Streaming app #1 reads from RabbitMQ and outputs to HDFS
- Spark Streaming app #2 reads #1's output and stores the data into
Elasticsearch
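The reading side (app #2) would watch app #1's output directory for new files. A minimal sketch, assuming Spark's Java streaming API and the elasticsearch-hadoop connector; the paths, index name, and ES address are illustrative assumptions:

```java
// Sketch of app #2: tail app #1's HDFS output directory and index each
// batch into Elasticsearch. Paths, index name, and the elasticsearch-hadoop
// connector usage are illustrative assumptions.
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

public class HdfsToElasticsearch {
  public static void main(String[] args) throws Exception {
    SparkConf conf = new SparkConf()
        .setAppName("HdfsToElasticsearch")
        .set("es.nodes", "localhost:9200"); // ES location is an assumption

    JavaStreamingContext jssc =
        new JavaStreamingContext(conf, Durations.seconds(30));

    // textFileStream only picks up files newly placed into this directory,
    // which is why app #1 needs to produce real files here rather than
    // nested per-batch directories.
    JavaDStream<String> docs = jssc.textFileStream("hdfs:///output");

    docs.foreachRDD(rdd -> {
      if (!rdd.isEmpty()) {
        // Assumes each line is already a JSON document.
        JavaEsSpark.saveJsonToEs(rdd, "myindex/mytype");
      }
    });

    jssc.start();
    jssc.awaitTermination();
  }
}
```

Decoupling the two apps through HDFS this way means data keeps accumulating on disk while Elasticsearch is unavailable, and app #2 drains the backlog when it comes back.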
The idea behind this architecture is that if Elasticsearch is down due to an
upgrade or