Hi Gen,

Thanks for your feedback. We do have a business reason to run Spark on Windows: we have an existing C# .NET application that runs on Windows, and we are considering adding Spark to it for parallel processing of large data sets. We want Spark to run on Windows so that it integrates easily with our existing app.
Has anybody used Spark on Windows for a production system? Is Spark reliable on Windows?

Ningjun

From: gen tang [mailto:gen.tan...@gmail.com]
Sent: Thursday, January 29, 2015 12:53 PM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: Fail to launch spark-shell on windows 2008 R2

Hi,

Using Spark under Windows is a really bad idea, because even if you solve the problems with Hadoop, you will probably hit java.net.SocketException: connection reset by peer. It is caused by the fact that we request socket ports too frequently under Windows. To my knowledge, it is really difficult to solve, and you will find something really odd: the same code sometimes works and sometimes does not, even in shell mode.

I am sorry, but I don't see the point of running Spark under Windows, much less on the local file system, in a business environment. Do you have a cluster running Windows?

FYI, I have used Spark prebuilt for Hadoop 1 under Windows 7: there is no problem launching it, but it has the java.net.SocketException problem. If you are using Spark prebuilt for Hadoop 2, you should consider following the solution provided in https://issues.apache.org/jira/browse/SPARK-2356

Cheers
Gen

On Thu, Jan 29, 2015 at 5:54 PM, Wang, Ningjun (LNG-NPV) <ningjun.w...@lexisnexis.com> wrote:

Install VirtualBox and run Linux? That does not help us. We have a business reason to run it on a Windows operating system, e.g. Windows 2008 R2. If anybody has done that, please give some advice on what version of Spark to use, which version of Hadoop to build Spark against, etc. Note that we only use the local file system and do not have any HDFS file system at all. I don't understand why Spark generates so many Hadoop errors when we don't even need HDFS.
Ningjun

From: gen tang [mailto:gen.tan...@gmail.com]
Sent: Thursday, January 29, 2015 10:45 AM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: Fail to launch spark-shell on windows 2008 R2

Hi,

I tried to use Spark under Windows once. However, the only solution that I found was to install VirtualBox. Hope this can help you.

Best
Gen

On Thu, Jan 29, 2015 at 4:18 PM, Wang, Ningjun (LNG-NPV) <ningjun.w...@lexisnexis.com> wrote:

I deployed spark-1.1.0 on Windows 7 and was able to launch the spark-shell. I then deployed it to Windows 2008 R2, launched the spark-shell, and got this error:

java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "ls": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
        at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:710)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getPermission(RawLocalFileSystem.java:418)

Here is the detailed output:

C:\spark-1.1.0\bin> spark-shell
15/01/29 10:13:13 INFO SecurityManager: Changing view acls to: ningjun.wang,
15/01/29 10:13:13 INFO SecurityManager: Changing modify acls to: ningjun.wang,
15/01/29 10:13:13 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ningjun.wang, ); users with modify permissions: Set(ningjun.wang, )
15/01/29 10:13:13 INFO HttpServer: Starting HTTP Server
15/01/29 10:13:14 INFO Server: jetty-8.y.z-SNAPSHOT
15/01/29 10:13:14 INFO AbstractConnector: Started SocketConnector@0.0.0.0:53692
15/01/29 10:13:14 INFO Utils: Successfully started service 'HTTP class server' on port 53692.
Failed to created SparkJLineReader: java.lang.NoClassDefFoundError: Could not initialize class org.fusesource.jansi.internal.Kernel32
Falling back to SimpleReader.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_71)
Type in expressions to have them evaluated.
Type :help for more information.
15/01/29 10:13:18 INFO SecurityManager: Changing view acls to: ningjun.wang,
15/01/29 10:13:18 INFO SecurityManager: Changing modify acls to: ningjun.wang,
15/01/29 10:13:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ningjun.wang, ); users with modify permissions: Set(ningjun.wang, )
15/01/29 10:13:18 INFO Slf4jLogger: Slf4jLogger started
15/01/29 10:13:18 INFO Remoting: Starting remoting
15/01/29 10:13:19 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@LAB4-WIN01.pcc.lexisnexis.com:53705]
15/01/29 10:13:19 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@LAB4-WIN01.pcc.lexisnexis.com:53705]
15/01/29 10:13:19 INFO Utils: Successfully started service 'sparkDriver' on port 53705.
15/01/29 10:13:19 INFO SparkEnv: Registering MapOutputTracker
15/01/29 10:13:19 INFO SparkEnv: Registering BlockManagerMaster
15/01/29 10:13:19 INFO DiskBlockManager: Created local directory at C:\Users\NINGJU~1.WAN\AppData\Local\Temp\3\spark-local-20150129101319-f9da
15/01/29 10:13:19 INFO Utils: Successfully started service 'Connection manager for block manager' on port 53708.
15/01/29 10:13:19 INFO ConnectionManager: Bound socket to port 53708 with id = ConnectionManagerId(LAB4-WIN01.pcc.lexisnexis.com,53708)
15/01/29 10:13:19 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/01/29 10:13:19 INFO BlockManagerMaster: Trying to register BlockManager
15/01/29 10:13:19 INFO BlockManagerMasterActor: Registering block manager LAB4-WIN01.pcc.lexisnexis.com:53708 with 265.4 MB RAM
15/01/29 10:13:19 INFO BlockManagerMaster: Registered BlockManager
15/01/29 10:13:19 INFO HttpFileServer: HTTP File server directory is C:\Users\NINGJU~1.WAN\AppData\Local\Temp\3\spark-2f65b1c3-00e2-489b-967c-4e1f41520583
15/01/29 10:13:19 INFO HttpServer: Starting HTTP Server
15/01/29 10:13:19 INFO Server: jetty-8.y.z-SNAPSHOT
15/01/29 10:13:19 INFO AbstractConnector: Started SocketConnector@0.0.0.0:53709
15/01/29 10:13:19 INFO Utils: Successfully started service 'HTTP file server' on port 53709.
15/01/29 10:13:20 INFO Server: jetty-8.y.z-SNAPSHOT
15/01/29 10:13:20 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/01/29 10:13:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/01/29 10:13:20 INFO SparkUI: Started SparkUI at http://LAB4-WIN01.pcc.lexisnexis.com:4040
java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "ls": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
        at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:710)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getPermission(RawLocalFileSystem.java:418)
        at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:113)
        at org.apache.spark.util.FileLogger.start(FileLogger.scala:91)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:79)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:252)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:972)
        at $iwC$$iwC.<init>(<console>:8)
        at $iwC.<init>(<console>:14)
        at <init>(<console>:16)
        at .<init>(<console>:20)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:859)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:771)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:264)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:931)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:948)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:997)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(Unknown Source)
        at java.lang.ProcessImpl.start(Unknown Source)
        ... 59 more
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getPermission(RawLocalFileSystem.java:418)
        at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:113)
        at org.apache.spark.util.FileLogger.start(FileLogger.scala:91)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:79)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:252)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:972)
        at $iwC$$iwC.<init>(<console>:8)
        at $iwC.<init>(<console>:14)
        at <init>(<console>:16)
        at .<init>(<console>:20)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:859)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:771)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:264)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:931)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:948)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:997)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Spark context available as sc.

Please advise.

Ningjun
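[Editor's note, for readers who hit the same trace] The `Cannot run program "ls"` failure occurs because Hadoop's RawLocalFileSystem shells out to Unix tools while Spark's event logger creates its log directory (see the FileLogger.createLogDir and EventLoggingListener.start frames above). One commonly suggested workaround, offered here only as an unverified sketch for spark-1.1.0: if event logging is enabled in your configuration, disabling it skips the directory creation, and with it the failing permission check.

```
# conf/spark-defaults.conf -- skip the event-log directory
# (and the Unix permission check that creating it triggers)
spark.eventLog.enabled    false
```

Alternatively, for builds prebuilt against Hadoop 2, the workaround discussed in SPARK-2356 is to install winutils.exe (e.g. under C:\hadoop\bin, an example path) and run `set HADOOP_HOME=C:\hadoop` before launching spark-shell, so Hadoop uses its Windows utilities instead of `ls`.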
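[Editor's note] On Gen's warning about java.net.SocketException: connection reset by peer: on Windows Server 2008 R2 this pattern is often a symptom of ephemeral-port exhaustion, where closed sockets linger in TIME_WAIT faster than the OS recycles them. A system-level tweak that is sometimes suggested is sketched below; it is not specific to Spark, requires administrator rights, and should be tested carefully before any production use.

```
rem Run in an elevated command prompt.
rem Widen the dynamic (ephemeral) TCP port range:
netsh int ipv4 set dynamicport tcp start=10000 num=55535

rem Shorten how long closed sockets linger in TIME_WAIT
rem (seconds; the default is 240). Takes effect after reboot.
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpTimedWaitDelay /t REG_DWORD /d 30 /f
```

This widens the pool of ports Spark can cycle through and returns them to the pool sooner; it mitigates rather than fixes the underlying behavior Gen describes.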