[ https://issues.apache.org/jira/browse/SPARK-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-13265:
----------------------------------
    Assignee: Yu Ishikawa

> Refactoring of basic ML import/export for other file system besides HDFS
> ------------------------------------------------------------------------
>
>                 Key: SPARK-13265
>                 URL: https://issues.apache.org/jira/browse/SPARK-13265
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>            Reporter: Yu Ishikawa
>            Assignee: Yu Ishikawa
>
> We can't save a model to a file system other than HDFS, for example Amazon S3,
> because the file system is fixed in Spark 1.6.
> https://github.com/apache/spark/blob/v1.6.0/mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala#L78
> When I tried to export a KMeans model to Amazon S3, I got the following error.
> {noformat}
> scala> val kmeans = new KMeans().setK(2)
> scala> val model = kmeans.fit(train)
> scala> model.write.overwrite().save("s3n://test-bucket/tmp/test-kmeans/")
> java.lang.IllegalArgumentException: Wrong FS: s3n://test-bucket/tmp/test-kmeans, expected: hdfs://ec2-54-248-42-97.ap-northeast-1.compute.amazonaws.com:9000
>       at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:590)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:170)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:803)
>       at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1332)
>       at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:80)
>       at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>       at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
>       at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
>       at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
>       at $iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
>       at $iwC$$iwC$$iwC.<init>(<console>:49)
>       at $iwC$$iwC.<init>(<console>:51)
>       at $iwC.<init>(<console>:53)
>       at <init>(<console>:55)
>       at .<init>(<console>:59)
>       at .<clinit>(<console>)
>       at .<init>(<console>:7)
>       at .<clinit>(<console>)
>       at $print(<console>)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>       at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>       at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>       at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>       at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>       at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>       at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>       at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>       at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>       at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>       at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>       at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>       at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>       at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>       at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>       at org.apache.spark.repl.Main$.main(Main.scala:31)
>       at org.apache.spark.repl.Main.main(Main.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>       at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {noformat}
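
The "Wrong FS" error in the report comes from calling exists() on the cluster's default file system (HDFS here) with a path whose scheme is s3n. A minimal sketch of one way to avoid that mismatch is to ask the path itself for its file system via Hadoop's Path.getFileSystem(Configuration). The object name FsResolution and the helpers fileSystemFor and existsOnOwnFs are hypothetical, introduced only for illustration; this is not necessarily the change made for this issue. It assumes hadoop-common is on the classpath, and an s3n:// path would additionally need the matching connector and credentials configured.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helpers, not part of Spark or Hadoop.
object FsResolution {

  // Resolve the FileSystem from the path's own scheme and authority
  // (file://, hdfs://, s3n://, ...) instead of using the default FS.
  def fileSystemFor(path: String, conf: Configuration): FileSystem =
    new Path(path).getFileSystem(conf)

  // Existence check performed on the file system that owns the path,
  // so the path and the FileSystem instance always agree.
  def existsOnOwnFs(path: String, conf: Configuration): Boolean = {
    val p = new Path(path)
    p.getFileSystem(conf).exists(p)
  }
}
```

In a Spark shell the configuration would typically be sc.hadoopConfiguration, for example FsResolution.existsOnOwnFs("s3n://test-bucket/tmp/test-kmeans/", sc.hadoopConfiguration). By contrast, FileSystem.get(conf) always returns the default file system regardless of the path, which is the situation the stack trace above shows.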