[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643652#comment-14643652 ]

Steve Loughran commented on SPARK-8385:
---------------------------------------

What's happening? Hadoop is trying to enumerate all the filesystems via the service loader mechanism (HADOOP-7549), auto-registering every filesystem listed in any JAR's resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}} in a map indexed by the filesystem scheme as returned by {{FileSystem.getScheme()}}.

The default implementation of that method raises an exception, so the FS init fails and the user gets to see a stack trace.

# Every filesystem needs to implement this method.
# Hadoop's FS contract tests need to explicitly call the method and verify that it is non-null and non-empty, so that anyone who implements those tests gets to find the problem. (There's an implicit probe already.)
# Maybe Hadoop should be more forgiving of filesystems which don't know their own name, yet have metadata entries. That's a tough call: it'd be more forgiving at startup time, but less intuitive downstream when things simply don't work if a filesystem is named but not found (i.e. there's no fallback to an fs.*.impl=classname entry in the cluster configs).

Spark does ship with Tachyon 0.6.4, which has the method, but Tachyon 0.5.0 does not. Except Tachyon 0.5.0 does not have a resource file {{META-INF/services/org.apache.hadoop.fs.FileSystem}}; that is new with 0.6.0.

Which leads to the following hypothesis about what is going wrong:

# There are two versions of Tachyon on the classpath.
# Tachyon 0.6.4+ explicitly declares the FS in the metadata file, triggering an auto instantiate/load.
# Tachyon 0.5.0's version of the FS class is the one being loaded by Hadoop (i.e. that JAR comes first in the classpath).

It's OK to have 0.5.0 on the classpath, or 0.6.4: it's the combination which is triggering the problem.
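The failure mode in the hypothesis above can be reproduced in miniature. What follows is a simplified model and not Hadoop's actual code: {{BaseFs}}, {{OldTfs}}, {{NewTfs}} and {{register}} are hypothetical names standing in for {{FileSystem}}, the Tachyon 0.5.0 and 0.6.4 FS classes, and the loading loop. It assumes only what the comment states: the base {{getScheme()}} throws by default, and registration indexes each declared filesystem by its scheme.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of scheme-indexed registration. All names are hypothetical
// and only mirror the behaviour described in the comment above.
class SchemeRegistryDemo {

    // Stands in for the base filesystem class: the default getScheme() throws.
    static class BaseFs {
        String getScheme() {
            throw new UnsupportedOperationException(
                "Not implemented by the " + getClass().getSimpleName()
                + " FileSystem implementation");
        }
    }

    // Models an old implementation that never overrides getScheme().
    static class OldTfs extends BaseFs { }

    // Models a newer implementation that knows its own scheme.
    static class NewTfs extends BaseFs {
        @Override
        String getScheme() {
            return "tachyon";
        }
    }

    // Models the loading loop: every declared filesystem is asked for its
    // scheme, so one implementation that throws aborts the whole registration.
    static Map<String, Class<? extends BaseFs>> register(List<? extends BaseFs> declared) {
        Map<String, Class<? extends BaseFs>> byScheme = new LinkedHashMap<>();
        for (BaseFs fs : declared) {
            byScheme.put(fs.getScheme(), fs.getClass());
        }
        return byScheme;
    }

    public static void main(String[] args) {
        // A newer class registers under its scheme.
        System.out.println(register(List.of(new NewTfs())).keySet());
        // An old class that is declared anyway fails during registration.
        try {
            register(List.of(new OldTfs()));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the old class only becomes a problem once something declares it for registration, which matches the point that either version alone is harmless.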
This isn't something Spark can fix, nor can Hadoop: duplicate, inconsistent JAR versions are always a disaster. This stack trace is how the specific case of more than one Tachyon JAR on the classpath surfaces if v0.5.0 comes first.

Closing as a WONTFIX as it's an installation-side problem, not anything that is fixable in source.

h2. For anyone seeing this:

# Check your SPARK_HOME environment variable and make sure it's not pointing to an older installation than the one the rest of your code is trying to use.
# Check your build to make sure you aren't explicitly pulling in a Tachyon JAR; there's one packaged up in spark-assembly.
# Make sure that you aren't pulling in another assembly module with its own Tachyon version.
# Make sure no Tachyon JAR has been copied into any of your Hadoop directories (i.e. {{HADOOP_HOME/lib}}).

java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
---------------------------------------------------------------------------------------------

Key: SPARK-8385
URL: https://issues.apache.org/jira/browse/SPARK-8385
Project: Spark
Issue Type: Bug
Components: Input/Output
Affects Versions: 1.4.0
Environment: RHEL 7.1
Reporter: Peter Haumer

I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. Losing the ability to debug that way has a major impact on the usability of Spark.
The following exception is thrown:

Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at
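Part of the checklist in the comment above can be checked from inside the failing JVM. The sketch below uses only the JDK's {{ClassLoader.getResources}} to list every classpath location that supplies a given resource. The class-file path {{tachyon/hadoop/TFS.class}} is an assumption about where Tachyon keeps its FS class, and {{ClasspathProbe}} and {{locate}} are names made up for this illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Diagnostic sketch: lists every classpath location that provides a resource.
class ClasspathProbe {

    // Returns the URL of every copy of the named resource visible to the
    // loader, in the order the loader reports them.
    static List<String> locate(ClassLoader loader, String resource) throws IOException {
        List<String> found = new ArrayList<>();
        for (URL url : Collections.list(loader.getResources(resource))) {
            found.add(url.toString());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        // Service file named in the comment above.
        String serviceFile = "META-INF/services/org.apache.hadoop.fs.FileSystem";
        // Assumed class-file path for Tachyon's FS class; adjust if it differs.
        String tfsClass = "tachyon/hadoop/TFS.class";

        System.out.println("Service files: " + locate(loader, serviceFile));
        System.out.println("TFS classes:   " + locate(loader, tfsClass));
    }
}
```

Run with the same classpath as the failing application, more than one location for the class file would point at duplicate Tachyon versions, and the first one listed is normally the copy that gets loaded.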
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617067#comment-14617067 ]

Jerrick Hoang commented on SPARK-8385:
--------------------------------------

[~angel2014] I'm running into the same issue; it'd be awesome if you could provide some more details. Thanks!

java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
---------------------------------------------------------------------------------------------

Key: SPARK-8385
URL: https://issues.apache.org/jira/browse/SPARK-8385
Project: Spark
Issue Type: Bug
Components: Input/Output
Affects Versions: 1.4.0
Environment: RHEL 7.1
Reporter: Peter Haumer

I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. Losing the ability to debug that way has a major impact on the usability of Spark.

The following exception is thrown:

Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1535)
    at org.apache.spark.rdd.RDD.reduce(RDD.scala:900)
    at org.apache.spark.api.java.JavaRDDLike$class.reduce(JavaRDDLike.scala:357)
    at org.apache.spark.api.java.AbstractJavaRDDLike.reduce(JavaRDDLike.scala:46)
    at com.databricks.apps.logs.LogAnalyzer.main(LogAnalyzer.java:60)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617512#comment-14617512 ]

Yuri Brovman commented on SPARK-8385:
-------------------------------------

I had the same issue when moving from Spark 1.3 to Spark 1.4 using IntelliJ. I was able to resolve it by updating the Tachyon dependency under Project Structure, Libraries, from org.tachyonproject:tachyon:0.5.0:jar to org.tachyonproject:tachyon:0.6.4:jar. Make sure to delete any of the old dependencies. Now the project runs locally in IntelliJ.
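After changing the dependency in an IDE, it can help to confirm that the old JAR really is gone from the run configuration. The following is a rough sketch that scans a classpath string for entries whose file name looks like a Tachyon JAR. The "tachyon*.jar" naming pattern and the names {{TachyonJarScan}} and {{tachyonJars}} are assumptions made for illustration, and the scan cannot see Tachyon classes bundled inside an assembly JAR.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch: flags classpath entries whose file name looks like a Tachyon JAR.
class TachyonJarScan {

    // Returns the classpath entries whose file name starts with "tachyon" and
    // ends with ".jar". The naming pattern is an assumption for illustration.
    static List<String> tachyonJars(String classpath, String separator) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            String name = new File(entry).getName().toLowerCase();
            if (name.startsWith("tachyon") && name.endsWith(".jar")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Inspect the classpath of the JVM this program is running in.
        List<String> hits = tachyonJars(
            System.getProperty("java.class.path"), File.pathSeparator);
        hits.forEach(System.out::println);
        if (hits.size() > 1) {
            System.out.println("More than one Tachyon JAR is on the classpath.");
        }
    }
}
```

Launching this from the same IDE run configuration as the failing application shows which Tachyon JARs that configuration still references.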
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617680#comment-14617680 ]

Jerrick Hoang commented on SPARK-8385:
--------------------------------------

I got this error when trying to launch the spark-sql shell. I'm running on a YARN cluster that previously used Spark 1.2. I don't use any IDE. From what I've gathered, this is probably a configuration issue.
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617639#comment-14617639 ]

Sean Owen commented on SPARK-8385:
----------------------------------

Oh, I just noticed the statement in the JIRA that spark-submit works. Yeah, I think the OP is using this in an unsupported way, like through an IDE, so this kind of thing could be the problem and solution.
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605712#comment-14605712 ]

Ángel Álvarez commented on SPARK-8385:
--------------------------------------

A simple WordCount test worked fine in my Eclipse environment with Spark 1.4 (in both local and yarn-cluster modes). Make sure you don't have any reference to the previous 1.3 version in your project and launch configuration.
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605744#comment-14605744 ]

Ángel Álvarez commented on SPARK-8385:
--------------------------------------

I could finally reproduce this same error in Eclipse (yarn-cluster mode), and it was due to a reference to the Spark 1.3 assembly in my launch configuration.
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588330#comment-14588330 ] Peter Haumer commented on SPARK-8385: - Sean, I see the class in the big assembly file of the Spark for Hadoop 2.6 distributions for 1.3.1 and 1.4.0. However, it seems that with 1.4 a version was packaged that has unimplemented methods, which causes the regression. java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation - Key: SPARK-8385 URL: https://issues.apache.org/jira/browse/SPARK-8385 Project: Spark Issue Type: Bug Components: Input/Output Affects Versions: 1.4.0 Environment: RHEL 7.1 Reporter: Peter Haumer I used to be able to debug my Spark apps in Eclipse. With Spark 1.3.1 I created a launch and just set the vm var -Dspark.master=local[4]. With 1.4 this stopped working when reading files from the OS filesystem. Running the same apps with spark-submit works fine. Loosing the ability to debug that way has a major impact on the usability of Spark. 
The following exception is thrown:

Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2401)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.SparkContext$$anonfun$28.apply(SparkContext.scala:762)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:172)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:172)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:196)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1535)
    at org.apache.spark.rdd.RDD.reduce(RDD.scala:900)
    at org.apache.spark.api.java.JavaRDDLike$class.reduce(JavaRDDLike.scala:357)
    at org.apache.spark.api.java.AbstractJavaRDDLike.reduce(JavaRDDLike.scala:46)
    at com.databricks.apps.logs.LogAnalyzer.main(LogAnalyzer.java:60)
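The top frames of the trace show FileSystem.loadFileSystems calling getScheme() while it registers file systems. As a rough way to see which JARs on the launch classpath declare file systems for that step, the sketch below lists every META-INF/services/org.apache.hadoop.fs.FileSystem resource visible to the class loader, together with the class names each one declares. It uses only JDK calls. The class name FsServiceScan and the helper parseProviders are hypothetical and are not part of Spark, Hadoop, or Tachyon.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; run it with the same classpath as the failing launch.
public class FsServiceScan {
    // Service file that lists Hadoop FileSystem implementations for auto-registration.
    static final String RESOURCE = "META-INF/services/org.apache.hadoop.fs.FileSystem";

    // Parses a service file: one class name per line, '#' starts a comment.
    static List<String> parseProviders(BufferedReader reader) throws IOException {
        List<String> names = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = FsServiceScan.class.getClassLoader();
        // Every classpath entry that ships the service file shows up as its own URL.
        List<URL> files = Collections.list(loader.getResources(RESOURCE));
        if (files.isEmpty()) {
            System.out.println("No " + RESOURCE + " found on the classpath");
        }
        for (URL url : files) {
            System.out.println(url);
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                for (String name : parseProviders(reader)) {
                    System.out.println("  declares " + name);
                }
            }
        }
    }
}
```

Running it once with the Eclipse launch classpath and once with the classpath used by spark-submit may show whether the two differ in which JARs declare a file system.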
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588540#comment-14588540 ]

Sean Owen commented on SPARK-8385:

Oh, is TFS Tachyon? Not sure what the status is on that, whether it's supposed to work without extra steps and just happened to in the past, or what.
[jira] [Commented] (SPARK-8385) java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
[ https://issues.apache.org/jira/browse/SPARK-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587527#comment-14587527 ]

Sean Owen commented on SPARK-8385:

What is the TFS file system? It sounds like you are missing a JAR that adds support for this to Hadoop FileSystem somewhere.
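Whether a JAR is missing, or an unexpected one is present, can be checked from inside the failing launch by asking the JVM where a class was loaded from. The following is a minimal sketch using only JDK calls. The class name WhichJar and the method locate are hypothetical, and the fully qualified name tachyon.hadoop.TFS is an assumption: only the simple name TFS appears in the exception message.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; run it with the same classpath as the failing launch.
public class WhichJar {
    // Returns the location a class was loaded from, or a short note if that is unavailable.
    static String locate(String className, ClassLoader loader) {
        try {
            // 'false' avoids running static initializers; we only want the origin.
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = WhichJar.class.getClassLoader();
        // Default names: the Hadoop base class from the trace, plus an ASSUMED Tachyon class name.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.fs.FileSystem", "tachyon.hadoop.TFS"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name, loader));
        }
    }
}
```

If the printed location for the TFS class is not the JAR that was expected, that would suggest the launch classpath differs from the one spark-submit builds.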