Can someone tell me at what point this error could occur? In one of my use cases, I am trying to use a Hadoop custom input format. Here is my code:
val hConf: Configuration = sc.hadoopConfiguration
hConf.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
hConf.set("fs.file.impl", classOf[org.apache.hadoop.fs.LocalFileSystem].getName)

var job = new Job(hConf)
FileInputFormat.setInputPaths(job, new Path("hdfs:///user/bala/MyBinaryFile"))

var hRDD = new NewHadoopRDD(sc, classOf[RandomAccessInputFormat],
  classOf[IntWritable], classOf[BytesWritable], job.getConfiguration())

val count = hRDD.mapPartitionsWithInputSplit { (split, iter) => myfuncPart(split, iter) }

The moment I invoke the mapPartitionsWithInputSplit() method, I get the error below in my spark-submit launch:

15/10/30 11:11:39 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 40.221.94.235): java.io.IOException: No FileSystem for scheme: spark
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)

Any help in moving towards a fix would be greatly appreciated.

Thanks,
Bala
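For reference, the `getFileSystemClass` frame in the trace is where Hadoop resolves a URI scheme to a FileSystem class via `fs.<scheme>.impl` configuration keys (the same keys set for `hdfs` and `file` above); when no entry matches the scheme, it throws exactly this "No FileSystem for scheme" IOException. Below is a simplified, self-contained sketch of that lookup mechanism in plain Java — it is not Hadoop's actual code, and the class name and map are made up for illustration:

```java
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class SchemeLookupSketch {
    // Stand-in for the Hadoop Configuration entries like fs.hdfs.impl
    // (hypothetical map, mimicking the keys set in the question).
    static final Map<String, String> conf = new HashMap<>();
    static {
        conf.put("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        conf.put("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
    }

    // Mimics the shape of FileSystem.getFileSystemClass: look up the
    // scheme-specific impl key, and fail if nothing is registered.
    static String getFileSystemClass(URI uri) throws IOException {
        String impl = conf.get("fs." + uri.getScheme() + ".impl");
        if (impl == null) {
            // Same message shape as the stack trace in the question.
            throw new IOException("No FileSystem for scheme: " + uri.getScheme());
        }
        return impl;
    }

    public static void main(String[] args) throws Exception {
        // Registered scheme resolves to an implementation class name.
        System.out.println(getFileSystemClass(new URI("hdfs:///user/bala/MyBinaryFile")));
        // Unregistered scheme ("spark") reproduces the failure mode.
        try {
            getFileSystemClass(new URI("spark://host:7077/app.jar"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the error means a `spark://` URI reached Hadoop's FileSystem resolver, for which no `fs.spark.impl` mapping exists.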