I definitely delete the file on the right HDFS; I only have one HDFS instance.
The problem seems to be in the CassandraRDD - reading always fails in some way
when run on the cluster, but single-machine reads are okay.
On Feb 20, 2015, at 4:20 AM, Ilya Ganelin ilgan...@gmail.com wrote:
The stupid question is whether you're deleting the file from hdfs on the
right node?
Yeah, I do manually delete the files, but it still fails with this error.
On Feb 19, 2015, at 8:16 PM, Ganelin, Ilya ilya.gane...@capitalone.com
wrote:
When writing to hdfs Spark will not overwrite existing files or directories.
You must either manually delete these or use Java's Hadoop FileSystem class to
remove them.
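(For illustration only, not part of the original message: a minimal sketch of that FileSystem cleanup, run before the job saves its output; the output path here is hypothetical.)

import org.apache.hadoop.fs.{FileSystem, Path}

// Recursively delete the existing output directory so the subsequent
// saveAsTextFile call does not fail on an already-existing path.
val fs = FileSystem.get(sc.hadoopConfiguration)
val out = new Path("hdfs:///user/pavel/output")  // hypothetical output path
if (fs.exists(out)) fs.delete(out, true)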
On Feb 19, 2015, at 7:29 PM, Pavel Velikhov pavel.velik...@icloud.com wrote:
I have a simple Spark job that goes out to Cassandra, runs a pipe and stores
results:
val sc = new SparkContext(conf)
val rdd = sc.cassandraTable("keyspace", "table")
  .map(r => r.getInt("column") + "\t" +
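(The snippet above is cut off in the original. For illustration, a minimal sketch of the full pipeline the message describes: read from Cassandra, run each row through an external pipe, and save the results to HDFS. The column names, pipe command, and output path are hypothetical.)

import com.datastax.spark.connector._   // provides sc.cassandraTable

val sc = new SparkContext(conf)
val results = sc.cassandraTable("keyspace", "table")
  .map(r => r.getInt("id") + "\t" + r.getString("text"))  // hypothetical columns
  .pipe("./process.sh")                                    // hypothetical external command
results.saveAsTextFile("hdfs:///user/pavel/output")        // hypothetical output path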