spark-itemsimilarity No FileSystem for scheme error

roy Tue, 05 Jan 2016 12:22:52 -0800

Hi, we are using CDH 5.4.0 with Spark 1.5.2 (which is not the version that ships with CDH 5.4.0).


I am following this guide
https://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
to test and create a new algorithm with Mahout item similarity.

I am running the following command:

        ./bin/mahout spark-itemsimilarity \
            --input $INPUT \
            --output $OUTPUT \
            --filter1 o --filter2 v \
            --inDelim "\t" \
            --itemIDColumn 2 --rowIDColumn 0 --filterColumn 1 \
            --master yarn-client \
            -D:fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem \
            -D:fs.file.impl=org.apache.hadoop.fs.LocalFileSystem

I am getting the following error:
 
java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at org.apache.spark.deploy.yarn.Client.cleanupStagingDir(Client.scala:143)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:129)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:523)
at org.apache.mahout.sparkbindings.package$.mahoutSparkContext(package.scala:91)
at org.apache.mahout.drivers.MahoutSparkDriver.start(MahoutSparkDriver.scala:83)
at org.apache.mahout.drivers.ItemSimilarityDriver$.start(ItemSimilarityDriver.scala:118)
at org.apache.mahout.drivers.ItemSimilarityDriver$.process(ItemSimilarityDriver.scala:199)
at org.apache.mahout.drivers.ItemSimilarityDriver$$anonfun$main$1.apply(ItemSimilarityDriver.scala:112)
at org.apache.mahout.drivers.ItemSimilarityDriver$$anonfun$main$1.apply(ItemSimilarityDriver.scala:110)
at scala.Option.map(Option.scala:145)
at org.apache.mahout.drivers.ItemSimilarityDriver$.main(ItemSimilarityDriver.scala:110)
at org.apache.mahout.drivers.ItemSimilarityDriver.main(ItemSimilarityDriver.scala)


I found a solution: adding the following properties to
/etc/hadoop/conf/core-site.xml on the client/gateway machine makes the error go away.

<property>
  <name>fs.file.impl</name>
  <value>org.apache.hadoop.fs.LocalFileSystem</value>
  <description>The FileSystem for file: uris.</description>
</property>

<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  <description>The FileSystem for hdfs: uris.</description> 
</property> 
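
To confirm that both properties are actually present on a given gateway machine, a small check like the following can help. This is only a minimal sketch: the function name `missing_fs_impls` and the inline sample are my own illustration and not part of Hadoop, Spark, or Mahout. It inspects the text of a single file and does not resolve the full Hadoop configuration chain.

```python
import xml.etree.ElementTree as ET

# The two properties from the workaround above.
REQUIRED = {
    "fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem",
    "fs.hdfs.impl": "org.apache.hadoop.hdfs.DistributedFileSystem",
}


def missing_fs_impls(core_site_xml: str) -> dict:
    """Return the required properties that are absent or set to another value.

    `core_site_xml` is the text of a core-site.xml file (hypothetical helper).
    """
    root = ET.fromstring(core_site_xml)
    # Collect every <property> as name -> value, ignoring <description>.
    found = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    return {name: value for name, value in REQUIRED.items() if found.get(name) != value}


# Illustrative input: a file that defines only fs.hdfs.impl.
SAMPLE = """
<configuration>
  <property>
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  </property>
</configuration>
"""

print(missing_fs_impls(SAMPLE))
# → {'fs.file.impl': 'org.apache.hadoop.fs.LocalFileSystem'}
```

To check a real machine, read the contents of /etc/hadoop/conf/core-site.xml and pass them to the function. An empty result means both entries are set as shown above.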

But is there a better way to solve this error?

Thanks



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-itemsimilarity-No-FileSystem-for-scheme-error-tp25887.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

