Starting spark-shell in local mode seems to solve this, but it still cannot
recognize a file whose name begins with a '.':
MASTER=local[4] ./bin/spark-shell
scala> val lineCount = sc.textFile("/home/monir/ref").count
lineCount: Long = 68

scala> val lineCount2 = sc.textFile("/home/monir/.ref").count
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
file:/home/monir/.ref
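The hidden file is most likely being skipped by Hadoop's FileInputFormat, whose default path filter excludes names beginning with '.' or '_' (my reading of the stack trace, not something I have confirmed in the source). If the file is small enough to read on the driver, one workaround is plain Scala I/O followed by parallelize; the helper below is hypothetical:

```scala
import scala.io.Source

// Hypothetical helper: read a local file on the driver with plain Scala I/O,
// sidestepping Hadoop's input-path filtering entirely.
def readLocalLines(path: String): List[String] =
  Source.fromFile(path).getLines().toList

// In spark-shell the result can then be distributed:
//   val rdd = sc.parallelize(readLocalLines("/home/monir/.ref"))
//   rdd.count
```

This only makes sense for files that fit comfortably in driver memory, since the whole file is materialized before being distributed.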
Though I am OK with running spark-shell in local mode to get basic examples
running, I was wondering whether accessing local files on the cluster nodes is
possible when all of the worker nodes have the file in question in their local
file systems. I am still fairly new to Spark, so bear with me if this is easily
tunable by some config param.
Bests,
-Monir
-Original Message-
From: Mozumder, Monir
Sent: Thursday, September 11, 2014 12:15 PM
To: user@spark.apache.org
Subject: RE: cannot read file form a local path
I am seeing this same issue with Spark 1.0.1 (tried with file:// for the local
file):
scala> val lines = sc.textFile("file:///home/monir/.bashrc")
lines: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at
<console>:12

scala> val linecount = lines.count
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
file:/home/monir/.bashrc
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:175)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
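Since file:///home/monir/.bashrc also begins with a dot, it is presumably hitting the same hidden-file filter rather than any problem with the file:// scheme itself. A crude workaround (the paths and helper name are just illustrations) is to copy the file to a non-hidden name before reading it:

```scala
import java.nio.file.{Files, Paths, StandardCopyOption}

// Hypothetical helper: copy a hidden file to a non-hidden name so that
// Hadoop's default input-path filter no longer excludes it.
def unhide(src: String, dst: String): String = {
  Files.copy(Paths.get(src), Paths.get(dst), StandardCopyOption.REPLACE_EXISTING)
  dst
}

// Then, in spark-shell:
//   sc.textFile("file://" + unhide("/home/monir/.bashrc", "/tmp/bashrc-visible")).count
```

On a real cluster the copy would have to exist at the same path on every worker, not just the driver.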
-Original Message-
From: wsun
Sent: Feb 03, 2014; 12:44pm
To: u...@spark.incubator.apache.org
Subject: cannot read file form a local path
After installing Spark 0.8.1 on an EC2 cluster, I launched the Spark shell on
the master. This is what happened to me:

scala> val textFile = sc.textFile("README.md")
14/02/03 20:38:08 INFO storage.MemoryStore: ensureFreeSpace(34380) called with curMem=0, maxMem=4082116853
14/02/03 20:38:08 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 33.6 KB, free 3.8 GB)
textFile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at
<console>:12
scala> textFile.count()
14/02/03 20:38:39 WARN snappy.LoadSnappy: Snappy native library is available
14/02/03 20:38:39 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/03 20:38:39 INFO snappy.LoadSnappy: Snappy native library loaded
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
hdfs://ec2-54-234-136-50.compute-1.amazonaws.com:9000/user/root/README.md
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:141)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
at scala.Option.getOrElse(Option.scala:108)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:26)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:199)
at scala.Option.getOrElse(Option.scala:108)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:886)
at org.apache.spark.rdd.RDD.count(RDD.scala:698)
Spark seems to be looking for README.md in HDFS. However, I did not specify
that the file is located in HDFS. I am just wondering whether there is any
configuration in Spark that forces Spark to read files from the local file
system. Thanks in advance for any help.
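As far as I understand it, this is Hadoop's normal path resolution: a path with no scheme is resolved against the default filesystem configured in core-site.xml (fs.default.name on these Hadoop versions), which the EC2 scripts point at HDFS, so prefixing file:// is the way to force a local read. A rough sketch of the rule, with the base URI taken from the error message above:

```scala
import java.net.URI

// Sketch of the resolution rule: an input path with no scheme inherits the
// configured default filesystem; an explicit scheme is used as-is.
// The base URI below is copied from the error message and is illustrative.
val defaultFs = new URI("hdfs://ec2-54-234-136-50.compute-1.amazonaws.com:9000/user/root/")

def resolveInput(path: String): URI = {
  val u = new URI(path)
  if (u.getScheme == null) defaultFs.resolve(u) else u
}

// resolveInput("README.md")                    -> hdfs://.../user/root/README.md
// resolveInput("file:///root/spark/README.md") -> stays on the local filesystem
```

Note that with a file:// path, the file must exist at that same path on every worker node, or tasks scheduled there will fail.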
wp
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org