Re: [Spark 1.4.0] java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation

2015-06-13 Thread Steve Loughran

That's the Tachyon FS there, which appears to be missing a method override.
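For reference, Hadoop's FileSystem base class is what raises this: its default getScheme() just throws, and every filesystem client is expected to override it. A rough sketch of both sides (not the verbatim Hadoop/Tachyon source, but it should be close):

  // Default in org.apache.hadoop.fs.FileSystem -- the source of the exception:
  public String getScheme() {
    throw new UnsupportedOperationException("Not implemented by the "
        + getClass().getSimpleName() + " FileSystem implementation");
  }

  // What a client like TFS needs to supply (illustrative override):
  @Override
  public String getScheme() {
    return "tachyon";
  }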

On 12 Jun 2015, at 19:58, Peter Haumer <phau...@us.ibm.com> wrote:


Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)

A quick look at the Tachyon source shows that the missing method is now implemented there:
https://github.com/amplab/tachyon/blob/8408edd04430b11bf9ccfc1dbe1e8a7e502bb582/clients/unshaded/src/main/java/tachyon/hadoop/TFS.java

...which means you really need a Tachyon client version consistent with the rest of the code, or you need to somehow get TFS out of the pipeline.
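If you want to see which filesystem clients are actually getting picked up, note that loadFileSystems() walks the JDK ServiceLoader registrations and calls getScheme() on each one, which is exactly where this blows up. A quick diagnostic along these lines should work (a sketch; it only assumes hadoop-common and your app's jars on the classpath):

  import java.util.ServiceLoader;
  import org.apache.hadoop.fs.FileSystem;

  public class ListFileSystems {
    public static void main(String[] args) {
      // Same discovery mechanism FileSystem.loadFileSystems() uses.
      for (FileSystem fs : ServiceLoader.load(FileSystem.class)) {
        try {
          System.out.println(fs.getClass().getName() + " -> " + fs.getScheme());
        } catch (UnsupportedOperationException e) {
          // A client (e.g. an old TFS) that never overrode getScheme().
          System.out.println(fs.getClass().getName() + " -> getScheme() missing");
        }
      }
    }
  }

If a stale Tachyon client jar shows up there, excluding it or matching its version to your Spark build should clear this.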


[Spark 1.4.0] java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation

2015-06-12 Thread Peter Haumer


Hello.
I used to be able to run and debug my Spark apps in Eclipse with Spark 1.3.1 by
creating a launch configuration and setting the VM argument "-Dspark.master=local[4]".
I am now playing with the new 1.4 and trying out some of my simple samples,
which all fail with the same exception shown below. Running them with
spark-submit works fine.

Does anybody have any hints for getting this to work in the IDE again? It seems to
be related to loading the input files, whose paths I provide via the main args
and load via sc.textFile() in Java 8. Are there any new options I missed for
telling the app to use the local file system?
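In case it helps, my samples boil down to something like this (class and variable names here are simplified placeholders, not my actual code):

  import org.apache.spark.SparkConf;
  import org.apache.spark.api.java.JavaRDD;
  import org.apache.spark.api.java.JavaSparkContext;

  public class SimpleSample {
    public static void main(String[] args) {
      // Launched from Eclipse with the VM argument -Dspark.master=local[4].
      SparkConf conf = new SparkConf().setAppName("SimpleSample");
      JavaSparkContext sc = new JavaSparkContext(conf);
      // args[0] is a plain local path; evaluating the RDD is what
      // triggers the exception below.
      JavaRDD<String> lines = sc.textFile(args[0]);
      System.out.println(lines.count());
      sc.stop();
    }
  }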

Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1535)
    at org.apache.spark.rdd.RDD.reduce(RDD.scala:900)
    at org.apache.spark.api.java.JavaRDDLike$class.reduce(JavaRDDLike.scala:357)
    at org.apache.spark.api.java.AbstractJavaRDDLike.reduce(JavaRDDLike.scala:46)
    at com.databricks.apps.logs.LogAnalyzer.main(LogAnalyzer.java:60)



Thanks and best regards,
Peter Haumer.