Re: Is it feasible to build and run Spark on Windows?

2019-12-10 Thread Ping Liu
I am new and plan to be an individual contributor for bug fixes.  I assume I
need to build the project if I'll be working on source code from the master
branch, which is ahead of the released binaries.  Do you think this makes sense?
Please let me know if, in this case, I can still use the binaries instead of
building the project.

On Tue, Dec 10, 2019 at 7:00 AM Deepak Vohra  wrote:

> The initial question was to build from source. Any reason to build when
> binaries are available at https://spark.apache.org/downloads.html
>
> On Tuesday, December 10, 2019, 03:05:44 AM UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Super.  Thanks Deepak!
>
> On Mon, Dec 9, 2019 at 6:58 PM Deepak Vohra  wrote:
>
> Please install Apache Spark on Windows as discussed in Apache Spark on
> Windows - DZone Open Source
> <https://dzone.com/articles/working-on-apache-spark-on-windows>
>
> Apache Spark on Windows - DZone Open Source
>
> This article explains and provides solutions for some of the most common
> errors developers come across when inst...
> <https://dzone.com/articles/working-on-apache-spark-on-windows>
>
>
>
> On Monday, December 9, 2019, 11:27:53 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Thanks Deepak!  Yes, I want to try it with Docker.  But my AWS account ran
> out of free period.  Is there a shared EC2 for Spark that we can use for
> free?
>
> Ping
>
>
> On Monday, December 9, 2019, Deepak Vohra  wrote:
> > Haven't tested, but the general procedure is to exclude all guava
> > dependencies that are not needed. The hadoop-common dependency does not have
> > a dependency on guava according to Maven Repository: org.apache.hadoop »
> > hadoop-common
> >
> > Maven Repository: org.apache.hadoop » hadoop-common
> >
> > Apache Spark 2.4 has a dependency on guava 14.
> > If a Docker image for Cloudera Hadoop is used, Spark may be installed
> > on Docker for Windows.
> > For Docker on Windows on EC2, refer to Getting Started with Docker for
> > Windows - Developer.com
> >
> > Getting Started with Docker for Windows - Developer.com
> >
> > Docker for Windows makes it feasible to run a Docker daemon on Windows
> Server 2016. Learn to harness its power.
> >
> >
> > Conflicting versions are not an issue if Docker is used.
> > "Apache Spark applications usually have a complex set of required
> software dependencies. Spark applications may require specific versions of
> these dependencies (such as Pyspark and R) on the Spark executor hosts,
> sometimes with conflicting versions."
> > Running Spark in Docker Containers on YARN
> >
> > Running Spark in Docker Containers on YARN
> >
> >
> >
> >
> >
> > On Monday, December 9, 2019, 08:37:47 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
> >
> > Hi Deepak,
> > I tried it.  Unfortunately, it still doesn't work.  28.1-jre isn't
> > downloaded for some reason.  I'll try something else.  Thank you very much for
> > your help!
> > Ping
> >
> > On Fri, Dec 6, 2019 at 5:28 PM Deepak Vohra  wrote:
> >
> > As multiple guava versions are found, exclude guava from all the
> > dependencies it could have been downloaded with, and explicitly add a recent
> > guava version.
> >
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-common</artifactId>
> >   <version>3.2.1</version>
> >   <exclusions>
> >     <exclusion>
> >       <groupId>com.google.guava</groupId>
> >       <artifactId>guava</artifactId>
> >     </exclusion>
> >   </exclusions>
> > </dependency>
> > <dependency>
> >   <groupId>com.google.guava</groupId>
> >   <artifactId>guava</artifactId>
> >   <version>28.1-jre</version>
> > </dependency>
> >
> > On Friday, December 6, 2019, 10:12:55 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
> >
> > Hi Deepak,
> > Following your suggestion, I put exclusion of guava in topmost POM
> (under Spark home directly) as follows.
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-common</artifactId>
> >   <version>3.2.1</version>
> >   <exclusions>
> >     <exclusion>
> >       <groupId>com.google.guava</groupId>
> >       <artifactId>guava</artifactId>
> >     </exclusion>
> >   </exclusions>
> > </dependency>
> > I also set properties for spark.executor.userClassPathFirst=true and
> spark.driver.userClassPathFirst=true
> > D:\apache\spark>mvn -Pyarn -Phadoop-3.2 -Dhadoop-version=3.2.1
> -Dspark.executor.userClassPathFirst=true
> -Dspark.driver.userClassPathFirst=true -DskipTests clean package
> > and rebuilt spark.
> > But I got the same error when running spark-shell.
> >
> > [INFO

Re: Is it feasible to build and run Spark on Windows?

2019-12-10 Thread Deepak Vohra
 The initial question was to build from source. Any reason to build when 
binaries are available at https://spark.apache.org/downloads.html
On Tuesday, December 10, 2019, 03:05:44 AM UTC, Ping Liu 
 wrote:  
 
 Super.  Thanks Deepak!

On Mon, Dec 9, 2019 at 6:58 PM Deepak Vohra  wrote:

 Please install Apache Spark on Windows as discussed in Apache Spark on Windows 
- DZone Open Source

This article explains and provides solutions for some of the most common errors 
developers come across when inst...




On Monday, December 9, 2019, 11:27:53 p.m. UTC, Ping Liu 
 wrote:  
 
 Thanks Deepak!  Yes, I want to try it with Docker.  But my AWS account ran out 
of free period.  Is there a shared EC2 for Spark that we can use for free?

Ping


On Monday, December 9, 2019, Deepak Vohra  wrote:
> Haven't tested, but the general procedure is to exclude all guava dependencies 
> that are not needed. The hadoop-common dependency does not have a dependency 
> on guava according to Maven Repository: org.apache.hadoop » hadoop-common
>
> Maven Repository: org.apache.hadoop » hadoop-common
>
> Apache Spark 2.4 has a dependency on guava 14. 
> If a Docker image for Cloudera Hadoop is used, Spark may be installed on 
> Docker for Windows.  
> For Docker on Windows on EC2, refer to Getting Started with Docker for Windows - 
> Developer.com
>
> Getting Started with Docker for Windows - Developer.com
>
> Docker for Windows makes it feasible to run a Docker daemon on Windows Server 
> 2016. Learn to harness its power.
>
>
> Conflicting versions are not an issue if Docker is used.
> "Apache Spark applications usually have a complex set of required software 
> dependencies. Spark applications may require specific versions of these 
> dependencies (such as Pyspark and R) on the Spark executor hosts, sometimes 
> with conflicting versions."
> Running Spark in Docker Containers on YARN
>
> Running Spark in Docker Containers on YARN
>
>
>
>
>
> On Monday, December 9, 2019, 08:37:47 p.m. UTC, Ping Liu 
>  wrote:
>
> Hi Deepak,
> I tried it.  Unfortunately, it still doesn't work.  28.1-jre isn't downloaded 
> for some reason.  I'll try something else.  Thank you very much for your help!
> Ping
>
> On Fri, Dec 6, 2019 at 5:28 PM Deepak Vohra  wrote:
>
> As multiple guava versions are found, exclude guava from all the dependencies 
> it could have been downloaded with, and explicitly add a recent guava version.
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>3.2.1</version>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>com.google.guava</groupId>
>   <artifactId>guava</artifactId>
>   <version>28.1-jre</version>
> </dependency>
>
> On Friday, December 6, 2019, 10:12:55 p.m. UTC, Ping Liu 
>  wrote:
>
> Hi Deepak,
> Following your suggestion, I put exclusion of guava in topmost POM (under 
> Spark home directly) as follows.
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>3.2.1</version>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> I also set properties for spark.executor.userClassPathFirst=true and 
> spark.driver.userClassPathFirst=true
> D:\apache\spark>mvn -Pyarn -Phadoop-3.2 -Dhadoop-version=3.2.1 
> -Dspark.executor.userClassPathFirst=true 
> -Dspark.driver.userClassPathFirst=true -DskipTests clean package
> and rebuilt spark.
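> Note: spark.executor.userClassPathFirst and spark.driver.userClassPathFirst are
> runtime Spark configuration properties rather than build properties, so passing
> them as -D flags to Maven probably has no effect on the built distribution.  A
> minimal sketch of setting them when launching the shell instead (assuming the
> build under D:\apache\spark) would be:
>
> D:\apache\spark\bin>spark-shell --conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true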
> But I got the same error when running spark-shell.
>
> [INFO] Reactor Summary for Spark Project Parent POM 3.0.0-SNAPSHOT:
> [INFO]
> [INFO] Spark Project Parent POM ... SUCCESS [ 25.092 
> s]
> [INFO] Spark Project Tags . SUCCESS [ 22.093 
> s]
> [INFO] Spark Project Sketch ... SUCCESS [ 19.546 
> s]
> [INFO] Spark Project Local DB . SUCCESS [ 10.468 
> s]
> [INFO] Spark Project Networking ... SUCCESS [ 17.733 
> s]
> [INFO] Spark Project Shuffle Streaming Service  SUCCESS [  6.531 
> s]
> [INFO] Spark Project Unsafe ... SUCCESS [ 25.327 
> s]
> [INFO] Spark Project Launcher . SUCCESS [ 27.264 
> s]
> [INFO] Spark Project Core . SUCCESS [07:59 
> min]
> [INFO] Spark Project ML Local Library . SUCCESS [01:39 
> min]
> [INFO] Spark Project GraphX ... SUCCESS [02:08 
> min]
> [INFO] S

Re: Is it feasible to build and run Spark on Windows?

2019-12-09 Thread Ping Liu
Super.  Thanks Deepak!

On Mon, Dec 9, 2019 at 6:58 PM Deepak Vohra  wrote:

> Please install Apache Spark on Windows as discussed in Apache Spark on
> Windows - DZone Open Source
> <https://dzone.com/articles/working-on-apache-spark-on-windows>
>
> Apache Spark on Windows - DZone Open Source
>
> This article explains and provides solutions for some of the most common
> errors developers come across when inst...
> <https://dzone.com/articles/working-on-apache-spark-on-windows>
>
>
>
> On Monday, December 9, 2019, 11:27:53 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Thanks Deepak!  Yes, I want to try it with Docker.  But my AWS account ran
> out of free period.  Is there a shared EC2 for Spark that we can use for
> free?
>
> Ping
>
>
> On Monday, December 9, 2019, Deepak Vohra  wrote:
> > Haven't tested, but the general procedure is to exclude all guava
> > dependencies that are not needed. The hadoop-common dependency does not have
> > a dependency on guava according to Maven Repository: org.apache.hadoop »
> > hadoop-common
> >
> > Maven Repository: org.apache.hadoop » hadoop-common
> >
> > Apache Spark 2.4 has a dependency on guava 14.
> > If a Docker image for Cloudera Hadoop is used, Spark may be installed
> > on Docker for Windows.
> > For Docker on Windows on EC2, refer to Getting Started with Docker for
> > Windows - Developer.com
> >
> > Getting Started with Docker for Windows - Developer.com
> >
> > Docker for Windows makes it feasible to run a Docker daemon on Windows
> Server 2016. Learn to harness its power.
> >
> >
> > Conflicting versions are not an issue if Docker is used.
> > "Apache Spark applications usually have a complex set of required
> software dependencies. Spark applications may require specific versions of
> these dependencies (such as Pyspark and R) on the Spark executor hosts,
> sometimes with conflicting versions."
> > Running Spark in Docker Containers on YARN
> >
> > Running Spark in Docker Containers on YARN
> >
> >
> >
> >
> >
> > On Monday, December 9, 2019, 08:37:47 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
> >
> > Hi Deepak,
> > I tried it.  Unfortunately, it still doesn't work.  28.1-jre isn't
> > downloaded for some reason.  I'll try something else.  Thank you very much for
> > your help!
> > Ping
> >
> > On Fri, Dec 6, 2019 at 5:28 PM Deepak Vohra  wrote:
> >
> > As multiple guava versions are found, exclude guava from all the
> > dependencies it could have been downloaded with, and explicitly add a recent
> > guava version.
> >
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-common</artifactId>
> >   <version>3.2.1</version>
> >   <exclusions>
> >     <exclusion>
> >       <groupId>com.google.guava</groupId>
> >       <artifactId>guava</artifactId>
> >     </exclusion>
> >   </exclusions>
> > </dependency>
> > <dependency>
> >   <groupId>com.google.guava</groupId>
> >   <artifactId>guava</artifactId>
> >   <version>28.1-jre</version>
> > </dependency>
> >
> > On Friday, December 6, 2019, 10:12:55 p.m. UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
> >
> > Hi Deepak,
> > Following your suggestion, I put exclusion of guava in topmost POM
> (under Spark home directly) as follows.
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-common</artifactId>
> >   <version>3.2.1</version>
> >   <exclusions>
> >     <exclusion>
> >       <groupId>com.google.guava</groupId>
> >       <artifactId>guava</artifactId>
> >     </exclusion>
> >   </exclusions>
> > </dependency>
> > I also set properties for spark.executor.userClassPathFirst=true and
> spark.driver.userClassPathFirst=true
> > D:\apache\spark>mvn -Pyarn -Phadoop-3.2 -Dhadoop-version=3.2.1
> -Dspark.executor.userClassPathFirst=true
> -Dspark.driver.userClassPathFirst=true -DskipTests clean package
> > and rebuilt spark.
> > But I got the same error when running spark-shell.
> >
> > [INFO] Reactor Summary for Spark Project Parent POM 3.0.0-SNAPSHOT:
> > [INFO]
> > [INFO] Spark Project Parent POM ... SUCCESS [
> 25.092 s]
> > [INFO] Spark Project Tags . SUCCESS [
> 22.093 s]
> > [INFO] Spark Project Sketch ... SUCCESS [
> 19.546 s]
> > [INFO] Spark Project Local DB . SUCCESS [
> 10.468 s]
> > [INFO] Spark Project Networking ... SUCCESS [
> 17.733 s]
> > [INFO] Spark Project Shuffle Streaming Service  SUCCESS [
>  6.531 s]
> > [INFO

Re: Is it feasible to build and run Spark on Windows?

2019-12-09 Thread Deepak Vohra
 Please install Apache Spark on Windows as discussed in Apache Spark on Windows 
- DZone Open Source

This article explains and provides solutions for some of the most common errors 
developers come across when inst...




On Monday, December 9, 2019, 11:27:53 p.m. UTC, Ping Liu 
 wrote:  
 
 Thanks Deepak!  Yes, I want to try it with Docker.  But my AWS account ran out 
of free period.  Is there a shared EC2 for Spark that we can use for free?

Ping


On Monday, December 9, 2019, Deepak Vohra  wrote:
> Haven't tested, but the general procedure is to exclude all guava dependencies 
> that are not needed. The hadoop-common dependency does not have a dependency 
> on guava according to Maven Repository: org.apache.hadoop » hadoop-common
>
> Maven Repository: org.apache.hadoop » hadoop-common
>
> Apache Spark 2.4 has a dependency on guava 14. 
> If a Docker image for Cloudera Hadoop is used, Spark may be installed on 
> Docker for Windows.  
> For Docker on Windows on EC2, refer to Getting Started with Docker for Windows - 
> Developer.com
>
> Getting Started with Docker for Windows - Developer.com
>
> Docker for Windows makes it feasible to run a Docker daemon on Windows Server 
> 2016. Learn to harness its power.
>
>
> Conflicting versions are not an issue if Docker is used.
> "Apache Spark applications usually have a complex set of required software 
> dependencies. Spark applications may require specific versions of these 
> dependencies (such as Pyspark and R) on the Spark executor hosts, sometimes 
> with conflicting versions."
> Running Spark in Docker Containers on YARN
>
> Running Spark in Docker Containers on YARN
>
>
>
>
>
> On Monday, December 9, 2019, 08:37:47 p.m. UTC, Ping Liu 
>  wrote:
>
> Hi Deepak,
> I tried it.  Unfortunately, it still doesn't work.  28.1-jre isn't downloaded 
> for some reason.  I'll try something else.  Thank you very much for your help!
> Ping
>
> On Fri, Dec 6, 2019 at 5:28 PM Deepak Vohra  wrote:
>
> As multiple guava versions are found, exclude guava from all the dependencies 
> it could have been downloaded with, and explicitly add a recent guava version.
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>3.2.1</version>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> <dependency>
>   <groupId>com.google.guava</groupId>
>   <artifactId>guava</artifactId>
>   <version>28.1-jre</version>
> </dependency>
>
> On Friday, December 6, 2019, 10:12:55 p.m. UTC, Ping Liu 
>  wrote:
>
> Hi Deepak,
> Following your suggestion, I put exclusion of guava in topmost POM (under 
> Spark home directly) as follows.
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>3.2.1</version>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
> I also set properties for spark.executor.userClassPathFirst=true and 
> spark.driver.userClassPathFirst=true
> D:\apache\spark>mvn -Pyarn -Phadoop-3.2 -Dhadoop-version=3.2.1 
> -Dspark.executor.userClassPathFirst=true 
> -Dspark.driver.userClassPathFirst=true -DskipTests clean package
> and rebuilt spark.
> But I got the same error when running spark-shell.
>
> [INFO] Reactor Summary for Spark Project Parent POM 3.0.0-SNAPSHOT:
> [INFO]
> [INFO] Spark Project Parent POM ... SUCCESS [ 25.092 
> s]
> [INFO] Spark Project Tags . SUCCESS [ 22.093 
> s]
> [INFO] Spark Project Sketch ... SUCCESS [ 19.546 
> s]
> [INFO] Spark Project Local DB . SUCCESS [ 10.468 
> s]
> [INFO] Spark Project Networking ... SUCCESS [ 17.733 
> s]
> [INFO] Spark Project Shuffle Streaming Service  SUCCESS [  6.531 
> s]
> [INFO] Spark Project Unsafe ... SUCCESS [ 25.327 
> s]
> [INFO] Spark Project Launcher . SUCCESS [ 27.264 
> s]
> [INFO] Spark Project Core . SUCCESS [07:59 
> min]
> [INFO] Spark Project ML Local Library . SUCCESS [01:39 
> min]
> [INFO] Spark Project GraphX ... SUCCESS [02:08 
> min]
> [INFO] Spark Project Streaming  SUCCESS [02:56 
> min]
> [INFO] Spark Project Catalyst . SUCCESS [08:55 
> min]
> [INFO] Spark Project SQL .. SUCCESS [12:33 
> min]
> [INFO] Spark Project ML Library 

Re: Is it feasible to build and run Spark on Windows?

2019-12-09 Thread Ping Liu
 SUCCESS
[03:16 min]
> [INFO] Spark Project Catalyst . SUCCESS
[08:45 min]
> [INFO] Spark Project SQL .. SUCCESS
[12:12 min]
> [INFO] Spark Project ML Library ... SUCCESS [
 16:28 h]
> [INFO] Spark Project Tools  SUCCESS [
23.602 s]
> [INFO] Spark Project Hive . SUCCESS
[07:50 min]
> [INFO] Spark Project Graph API  SUCCESS [
 8.734 s]
> [INFO] Spark Project Cypher ... SUCCESS [
12.420 s]
> [INFO] Spark Project Graph  SUCCESS [
10.186 s]
> [INFO] Spark Project REPL . SUCCESS
[01:03 min]
> [INFO] Spark Project YARN Shuffle Service . SUCCESS
[01:19 min]
> [INFO] Spark Project YARN . SUCCESS
[02:19 min]
> [INFO] Spark Project Assembly . SUCCESS [
18.912 s]
> [INFO] Kafka 0.10+ Token Provider for Streaming ... SUCCESS [
57.925 s]
> [INFO] Spark Integration for Kafka 0.10 ... SUCCESS
[01:20 min]
> [INFO] Kafka 0.10+ Source for Structured Streaming  SUCCESS
[02:26 min]
> [INFO] Spark Project Examples . SUCCESS
[02:00 min]
> [INFO] Spark Integration for Kafka 0.10 Assembly .. SUCCESS [
28.354 s]
> [INFO] Spark Avro . SUCCESS
[01:44 min]
> [INFO]

> [INFO] BUILD SUCCESS
> [INFO]

> [INFO] Total time:  17:30 h
> [INFO] Finished at: 2019-12-05T12:20:01-08:00
> [INFO]

>
> D:\apache\spark>cd bin
>
> D:\apache\spark\bin>ls
> beeline   load-spark-env.cmd  run-example   spark-shell
spark-sql2.cmd sparkR.cmd
> beeline.cmd   load-spark-env.sh   run-example.cmd
spark-shell.cmd   spark-submit   sparkR2.cmd
> docker-image-tool.sh  pyspark spark-class
spark-shell2.cmd  spark-submit.cmd
> find-spark-home   pyspark.cmd spark-class.cmd   spark-sql
spark-submit2.cmd
> find-spark-home.cmd   pyspark2.cmdspark-class2.cmd  spark-sql.cmd
sparkR
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError:
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at
org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at
org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at
org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> at
org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> at
org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> at
org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
Source)
> at scala.Option.getOrElse(Option.scala:189)
> at
org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> at org.apache.spark.deploy.SparkSubmit.org
$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> at
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at
org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> at
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> D:\apache\spark\bin>
> On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
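> (For example, run it from the source root rather than from inside bin; a minimal
> sketch, assuming the checkout at D:\apache\spark:
>
> D:\apache\spark>bin\spark-shell
> )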
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>>
>> Hello,
>>
>> I understand Spark is preferably built on Linux.  But I have a Windows
>> machine with a slow VirtualBox for Linux.  So I hope to be able to build
>> and run Spark code in a Windows environment.
>>
>> Unfortunately,
>>
>> # Apache Hadoop 2.6.X
>> ./build/mvn -Pyarn -DskipTests clean package
>>
>> # Apache Hadoop 2.7.X and later
>> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
package
>>
>>
>> Both are listed on
http://sp

Re: Is it feasible to build and run Spark on Windows?

2019-12-09 Thread Deepak Vohra
ssembly . SUCCESS [ 18.912 s]
[INFO] Kafka 0.10+ Token Provider for Streaming ... SUCCESS [ 57.925 s]
[INFO] Spark Integration for Kafka 0.10 ... SUCCESS [01:20 min]
[INFO] Kafka 0.10+ Source for Structured Streaming  SUCCESS [02:26 min]
[INFO] Spark Project Examples . SUCCESS [02:00 min]
[INFO] Spark Integration for Kafka 0.10 Assembly .. SUCCESS [ 28.354 s]
[INFO] Spark Avro . SUCCESS [01:44 min]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time:  17:30 h
[INFO] Finished at: 2019-12-05T12:20:01-08:00
[INFO] 

D:\apache\spark>cd bin

D:\apache\spark\bin>ls
beeline               load-spark-env.cmd  run-example       spark-shell       
spark-sql2.cmd     sparkR.cmd
beeline.cmd           load-spark-env.sh   run-example.cmd   spark-shell.cmd   
spark-submit       sparkR2.cmd
docker-image-tool.sh  pyspark             spark-class       spark-shell2.cmd  
spark-submit.cmd
find-spark-home       pyspark.cmd         spark-class.cmd   spark-sql         
spark-submit2.cmd
find-spark-home.cmd   pyspark2.cmd        spark-class2.cmd  spark-sql.cmd     
sparkR

D:\apache\spark\bin>spark-shell
Exception in thread "main" java.lang.NoSuchMethodError: 
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
        at 
org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
        at 
org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
        at 
org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
        at 
org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
        at scala.Option.getOrElse(Option.scala:189)
        at 
org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
        at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
        at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

D:\apache\spark\bin>
On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:

What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow VirtualBox for Linux.  So I hope to be able to build and 
> run Spark code in a Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under the spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package").
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now the build works.  But when I run spark-shell, I get the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>         at 
>org.apache.spark.de

Re: Is it feasible to build and run Spark on Windows?

2019-12-09 Thread Ping Liu
gt; 18.912 s]
> [INFO] Kafka 0.10+ Token Provider for Streaming ... SUCCESS [
> 57.925 s]
> [INFO] Spark Integration for Kafka 0.10 ... SUCCESS [01:20
> min]
> [INFO] Kafka 0.10+ Source for Structured Streaming  SUCCESS [02:26
> min]
> [INFO] Spark Project Examples . SUCCESS [02:00
> min]
> [INFO] Spark Integration for Kafka 0.10 Assembly .. SUCCESS [
> 28.354 s]
> [INFO] Spark Avro . SUCCESS [01:44
> min]
> [INFO]
> 
> [INFO] BUILD SUCCESS
> [INFO]
> 
> [INFO] Total time:  17:30 h
> [INFO] Finished at: 2019-12-05T12:20:01-08:00
> [INFO]
> 
>
> D:\apache\spark>cd bin
>
> D:\apache\spark\bin>ls
> beeline   load-spark-env.cmd  run-example   spark-shell
> spark-sql2.cmd sparkR.cmd
> beeline.cmd   load-spark-env.sh   run-example.cmd
> spark-shell.cmd   spark-submit   sparkR2.cmd
> docker-image-tool.sh  pyspark spark-class
> spark-shell2.cmd  spark-submit.cmd
> find-spark-home   pyspark.cmd spark-class.cmd   spark-sql
> spark-submit2.cmd
> find-spark-home.cmd   pyspark2.cmdspark-class2.cmd  spark-sql.cmd
> sparkR
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> at scala.Option.getOrElse(Option.scala:189)
> at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> D:\apache\spark\bin>
>
> On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> > machine with a slow VirtualBox for Linux.  So I hope to be able to build
> > and run Spark code in a Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under the spark root directory and
> > run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> > package").
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> > clean package"
> >
> > Now the build works.  But when I run spark-shell, I get the following error.
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1

Re: Is it feasible to build and run Spark on Windows?

2019-12-06 Thread Deepak Vohra
conditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
        at 
org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
        at 
org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
        at 
org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
        at 
org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
        at scala.Option.getOrElse(Option.scala:189)
        at 
org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
        at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
        at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

D:\apache\spark\bin>
On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:

What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow VirtualBox for Linux.  So I hope to be able to build and 
> run Spark code in a Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under the spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package").
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now the build works.  But when I run spark-shell, I get the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>         at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>         at scala.Option.getOrElse(Option.scala:189)
>         at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>         at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>         at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

  
  

Re: Is it feasible to build and run Spark on Windows?

2019-12-06 Thread Ping Liu
eption in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> at scala.Option.getOrElse(Option.scala:189)
> at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> D:\apache\spark\bin>
>
> On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> > machine with a slow VirtualBox for Linux.  So I hope to be able to build
> > and run Spark code in a Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under the spark root directory and
> > run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> > package").
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> > clean package"
> >
> > Now the build works.  But when I run spark-shell, I get the following error.
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> > at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> > at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> > at scala.Option.getOrElse(Option.scala:189)
> > at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> > at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> > at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> > at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> > at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> > at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> > at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >
> >
> > Has anyone experienced building and running Spark source code
> successfully on Windows?  Could you please share your experience?
> >
> > Thanks a lot!
> >
> > Ping
> >
>
>


Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
runMain(SparkSubmit.scala:871)
        at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

D:\apache\spark\bin>
On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:

What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow VirtualBox for Linux.  So I hope to be able to build and 
> run Spark code in a Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under the spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package").
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now the build works.  But when I run spark-shell, I get the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>         at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>         at scala.Option.getOrElse(Option.scala:189)
>         at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>         at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>         at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

  

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
 Is Hadoop 3.x not set as a dependency? If so, exclude the guava provided by 
Hadoop.

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>3.2.1</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>

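To see which dependencies actually pull in guava (and which versions), something
like the following should help from the Spark source root (a sketch using the
standard Maven dependency plugin):

    mvn dependency:tree -Dincludes=com.google.guava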

On Friday, December 6, 2019, 12:20:49 AM UTC, Ping Liu 
 wrote:  
 
 Thanks Deepak!  I'll try it.

On Thu, Dec 5, 2019 at 4:13 PM Deepak Vohra  wrote:

 The Guava issue could be fixed in one of two ways:
- Use Hadoop v3
- Create an Uber jar, refer
https://gite.lirmm.fr/yagoubi/spark/commit/c9f743957fa963bc1dbed7a44a346ffce1a45cf2
  Managing Java dependencies for Apache Spark applications on Cloud Dataproc | 
Google Cloud Blog

Learn how to set up Java imported packages for Apache Spark on Cloud Dataproc 
to avoid conflicts.




On Thursday, December 5, 2019, 11:49:47 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
For Spark, I am using the master branch and just updated the code yesterday.
For Guava, I actually deleted my old versions from the local Maven repo.  The 
build process of Spark automatically downloaded a few versions.  The oldest 
version is 14.0.1.
But even in 14.0.1 
(https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html)
Preconditions already requires boolean as the first parameter:

    static void checkArgument(boolean expression, String errorMessageTemplate, Object... errorMessageArgs)

In newer Guava versions, the checkArgument() overloads all take boolean as the first 
parameter as well.  (The descriptor in the NoSuchMethodError, 
(ZLjava/lang/String;Ljava/lang/Object;)V, is the non-varargs overload 
checkArgument(boolean, String, Object), which appears to have been added only in 
later Guava releases; Guava 14 has just the Object... varargs form, which is 
presumably why Hadoop 3.x classes compiled against a newer Guava fail when Spark's 
Guava 14 is on the classpath.)
For Docker, using EC2 is a good idea.  Is there a document or guidance for it?
Thanks.
Ping



On Thu, Dec 5, 2019 at 3:30 PM Deepak Vohra  wrote:

This type of exception could occur if a dependency (most likely Guava) version is 
not supported by the Spark version. What are the Spark and Guava versions? Use a 
more recent Guava version dependency in the Maven pom.xml. 
Regarding Docker, a cloud platform instance such as EC2 could be used with 
Hyper-V support.
On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
Yes, I did use Maven. I even have the build pass successfully when setting 
Hadoop version to 3.2.  Please see my response to Sean's email.
Unfortunately, I only have Docker Toolbox as my Windows doesn't have Microsoft 
Hyper-V.  So I want to avoid using Docker to do major work if possible.
Thanks!
Ping

On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:

 Several alternatives are available:
- Use Maven to build Spark on Windows. 
http://spark.apache.org/docs/latest/building-spark.html#apache-maven

- Use a Docker image for CDH on Windows: Docker Hub






On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen 
 wrote:  
 
 What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow VirtualBox for Linux.  So I hope to be able to build and 
> run Spark code in a Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under the spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package").
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now the build works.  But when I run spark-shell, I get the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>        at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>        at scala.Option.getOrElse(Option.scala:189)
>        at 
>org.apache.spark.deploy.S

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
 Sorry, didn't notice, Hadoop v3.x is already being used.
On Thursday, December 5, 2019, 11:49:47 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
For Spark, I am using the master branch and just updated the code yesterday.
For Guava, I actually deleted my old versions from the local Maven repo.  The 
build process of Spark automatically downloaded a few versions.  The oldest 
version is 14.0.1.
But even in 14.0.1 
(https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html)
Preconditions already requires boolean as the first parameter:

    static void checkArgument(boolean expression, String errorMessageTemplate, Object... errorMessageArgs)

In newer Guava versions, the checkArgument() overloads all take boolean as the first 
parameter as well.
For Docker, using EC2 is a good idea.  Is there a document or guidance for it?
Thanks.
Ping



On Thu, Dec 5, 2019 at 3:30 PM Deepak Vohra  wrote:

This type of exception could occur if a dependency (most likely Guava) version is 
not supported by the Spark version. What are the Spark and Guava versions? Use a 
more recent Guava version dependency in the Maven pom.xml. 
Regarding Docker, a cloud platform instance such as EC2 could be used with 
Hyper-V support.
On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
Yes, I did use Maven. I even have the build pass successfully when setting 
Hadoop version to 3.2.  Please see my response to Sean's email.
Unfortunately, I only have Docker Toolbox as my Windows doesn't have Microsoft 
Hyper-V.  So I want to avoid using Docker to do major work if possible.
Thanks!
Ping

On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:

 Several alternatives are available:
- Use Maven to build Spark on Windows. 
http://spark.apache.org/docs/latest/building-spark.html#apache-maven

- Use a Docker image for CDH on Windows: Docker Hub






On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen 
 wrote:  
 
 What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow VirtualBox for Linux.  So I hope to be able to build and 
> run Spark code in a Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under the spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package").
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now the build works.  But when I run spark-shell, I get the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>        at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>        at scala.Option.getOrElse(Option.scala:189)
>        at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>        at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>        at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

  
  
  

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
Thanks Deepak!  I'll try it.

On Thu, Dec 5, 2019 at 4:13 PM Deepak Vohra  wrote:

> The Guava issue could be fixed in one of two ways:
>
> - Use Hadoop v3
> - Create an Uber jar, refer
>
> https://gite.lirmm.fr/yagoubi/spark/commit/c9f743957fa963bc1dbed7a44a346ffce1a45cf2
>   Managing Java dependencies for Apache Spark applications on Cloud
> Dataproc | Google Cloud Blog
> <https://cloud.google.com/blog/products/data-analytics/managing-java-dependencies-apache-spark-applications-cloud-dataproc>
>
> Managing Java dependencies for Apache Spark applications on Cloud Datapr...
>
> Learn how to set up Java imported packages for Apache Spark on Cloud
> Dataproc to avoid conflicts.
>
> <https://cloud.google.com/blog/products/data-analytics/managing-java-dependencies-apache-spark-applications-cloud-dataproc>
>
>
>
> On Thursday, December 5, 2019, 11:49:47 PM UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Hi Deepak,
>
> For Spark, I am using the master branch and just updated the code yesterday.
>
> For Guava, I actually deleted my old versions from the local Maven repo.
> The build process of Spark automatically downloaded a few versions.  The
> oldest version is 14.0.1.
>
> But even in 14.0.1 (
> https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html)
> Preconditions already requires boolean as first parameter.
>
> static void checkArgument(boolean expression, String errorMessageTemplate, Object... errorMessageArgs)
> <https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html#checkArgument(boolean,%20java.lang.String,%20java.lang.Object...)>
>
> In newer Guava versions, the checkArgument() overloads all take boolean as the
> first parameter as well.
>
> For Docker, using EC2 is a good idea.  Is there a document or guidance for
> it?
>
> Thanks.
>
> Ping
>
>
>
> On Thu, Dec 5, 2019 at 3:30 PM Deepak Vohra  wrote:
>
> This type of exception could occur if a dependency (most likely Guava)
> version is not supported by the Spark version. What are the Spark and Guava
> versions? Use a more recent Guava version dependency in the Maven pom.xml.
>
> Regarding Docker, a cloud platform instance such as EC2 could be used with
> Hyper-V support.
>
> On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Hi Deepak,
>
> Yes, I did use Maven. I even have the build pass successfully when setting
> Hadoop version to 3.2.  Please see my response to Sean's email.
>
> Unfortunately, I only have Docker Toolbox as my Windows doesn't have
> Microsoft Hyper-V.  So I want to avoid using Docker to do major work if
> possible.
>
> Thanks!
>
> Ping
>
>
> On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:
>
> Several alternatives are available:
>
> - Use Maven to build Spark on Windows.
> http://spark.apache.org/docs/latest/building-spark.html#apache-maven
>
> - Use Docker image for  CDH on Windows
> Docker Hub <https://hub.docker.com/u/cloudera>
>
> Docker Hub
>
> <https://hub.docker.com/u/cloudera>
>
>
>
>
> On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen <
> sro...@gmail.com> wrote:
>
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> > machine with a slow VirtualBox for Linux.  So I hope to be able to build
> > and run Spark code in a Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under the spark root directory and
> > run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> > package").
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> > clean package"
> >
> > Now the build works.  But when

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
 The Guava issue could be fixed in one of two ways:
- Use Hadoop v3
- Create an Uber jar, refer
https://gite.lirmm.fr/yagoubi/spark/commit/c9f743957fa963bc1dbed7a44a346ffce1a45cf2
  Managing Java dependencies for Apache Spark applications on Cloud Dataproc | 
Google Cloud Blog
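
A minimal sketch of the Uber-jar approach for an application's own pom.xml,
relocating Guava with the maven-shade-plugin (the plugin version and the shaded
package prefix here are illustrative assumptions, not taken from the linked
article):

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.1</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals><goal>shade</goal></goals>
        <configuration>
          <relocations>
            <relocation>
              <!-- move the application's Guava out of the way of Spark's Guava 14 -->
              <pattern>com.google.common</pattern>
              <shadedPattern>repackaged.com.google.common</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>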

Learn how to set up Java imported packages for Apache Spark on Cloud Dataproc 
to avoid conflicts.




On Thursday, December 5, 2019, 11:49:47 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
For Spark, I am using master branch and just have code updated yesterday.
For Guava, I actually deleted my old versions from the local Maven repo.  The 
build process of Spark automatically downloaded a few versions.  The oldest 
version is 14.0.1.
But even in 14.0.1
(https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html)
 Preconditions already requires boolean as first parameter.

static void checkArgument(boolean expression, String errorMessageTemplate,
Object... errorMessageArgs)


The newer Guava version, checkArgument() all require boolean as first parameter.
For Docker, using EC2 is a good idea.  Is there a document or guidance for it?
Thanks.
Ping



On Thu, Dec 5, 2019 at 3:30 PM Deepak Vohra  wrote:

 Such type exception could occur if a dependency (most likely Guava) version is 
not supported by the Spark version. What is the Spark and Guava versions? Use a 
more recent Guava version dependency in Maven pom.xml. 
Regarding Docker, a cloud platform instance such as EC2 could be used with 
Hyper-V support.
On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
Yes, I did use Maven. I even have the build pass successfully when setting 
Hadoop version to 3.2.  Please see my response to Sean's email.
Unfortunately, I only have Docker Toolbox as my Windows doesn't have Microsoft 
Hyper-V.  So I want to avoid using Docker to do major work if possible.
Thanks!
Ping

On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:

 Several alternatives are available:
- Use Maven to build Spark on Windows. 
http://spark.apache.org/docs/latest/building-spark.html#apache-maven

- Use Docker image for CDH on Windows: Docker Hub
<https://hub.docker.com/u/cloudera>





On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen 
 wrote:  
 
 What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow Virtual Box for Linux.  So I wish I am able to build and 
> run Spark code on Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package"
>
> and
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now build works.  But when I run spark-shell.  I got the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>        at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>        at scala.Option.getOrElse(Option.scala:189)
>        at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>        at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>        at org.apache.spark.deploy.SparkSubmit.submit(Spark

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
Hi Deepak,

For Spark, I am using master branch and just have code updated yesterday.

For Guava, I actually deleted my old versions from the local Maven repo.
The build process of Spark automatically downloaded a few versions.  The
oldest version is 14.0.1.

But even in 14.0.1 (
https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html)
Preconditions already requires boolean as first parameter.

static void checkArgument(boolean expression, String errorMessageTemplate,
Object... errorMessageArgs)
<https://guava.dev/releases/14.0.1/api/docs/com/google/common/base/Preconditions.html#checkArgument(boolean,%20java.lang.String,%20java.lang.Object...)>

In the newer Guava versions, the checkArgument() overloads likewise all require
a boolean as the first parameter.
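
A minimal sketch of where the NoSuchMethodError comes from (hypothetical code,
not taken from the Spark or Hadoop sources): the descriptor in the stack trace,
checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V, is a single-Object
overload that only newer Guava releases provide, while 14.0.1 only ships the
varargs form, so Hadoop classes compiled against a newer Guava fail at runtime
when 14.0.1 is the version on the classpath.

// Hypothetical sketch only - not Spark or Hadoop source code.
import com.google.common.base.Preconditions;

public class GuavaOverloadSketch {
    public static void main(String[] args) {
        // The varargs overload checkArgument(boolean, String, Object...) is
        // present in Guava 14.0.1, so building and running this against
        // 14.0.1 works fine.
        Preconditions.checkArgument(args != null,
                "args must not be null, got %s", (Object) args);

        // Hadoop 3.x code such as Configuration.set() is compiled against a
        // newer Guava, where javac selects the non-varargs overload
        // checkArgument(boolean, String, Object) added in later releases.
        // That method is missing from Guava 14.0.1, so when 14.0.1 wins on
        // the classpath the call fails at runtime with the NoSuchMethodError
        // shown in the spark-shell output elsewhere in this thread.
    }
}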

For Docker, using EC2 is a good idea.  Is there a document or guidance for
it?

Thanks.

Ping



On Thu, Dec 5, 2019 at 3:30 PM Deepak Vohra  wrote:

> Such type exception could occur if a dependency (most likely Guava)
> version is not supported by the Spark version. What is the Spark and Guava
> versions? Use a more recent Guava version dependency in Maven pom.xml.
>
> Regarding Docker, a cloud platform instance such as EC2 could be used with
> Hyper-V support.
>
> On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu <
> pingpinga...@gmail.com> wrote:
>
>
> Hi Deepak,
>
> Yes, I did use Maven. I even have the build pass successfully when setting
> Hadoop version to 3.2.  Please see my response to Sean's email.
>
> Unfortunately, I only have Docker Toolbox as my Windows doesn't have
> Microsoft Hyper-V.  So I want to avoid using Docker to do major work if
> possible.
>
> Thanks!
>
> Ping
>
>
> On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:
>
> Several alternatives are available:
>
> - Use Maven to build Spark on Windows.
> http://spark.apache.org/docs/latest/building-spark.html#apache-maven
>
> - Use Docker image for  CDH on Windows
> Docker Hub <https://hub.docker.com/u/cloudera>
>
>
>
>
>
> On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen <
> sro...@gmail.com> wrote:
>
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> machine with a slow Virtual Box for Linux.  So I wish I am able to build
> and run Spark code on Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under spark root directory and
> run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package"
> >
> > and
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> clean package"
> >
> > Now build works.  But when I run spark-shell.  I got the following error.
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> >at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> >at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> >at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> >at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> >at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> >at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> >at scala.Option.getOrElse(Option.scala:189)
> >at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> >at org.apache.spark.deploy

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
ark Project Streaming  SUCCESS [03:16 
> min]
> [INFO] Spark Project Catalyst . SUCCESS [08:45 
> min]
> [INFO] Spark Project SQL .. SUCCESS [12:12 
> min]
> [INFO] Spark Project ML Library ... SUCCESS [  16:28 
> h]
> [INFO] Spark Project Tools  SUCCESS [ 23.602 
> s]
> [INFO] Spark Project Hive . SUCCESS [07:50 
> min]
> [INFO] Spark Project Graph API  SUCCESS [  8.734 
> s]
> [INFO] Spark Project Cypher ... SUCCESS [ 12.420 
> s]
> [INFO] Spark Project Graph  SUCCESS [ 10.186 
> s]
> [INFO] Spark Project REPL . SUCCESS [01:03 
> min]
> [INFO] Spark Project YARN Shuffle Service . SUCCESS [01:19 
> min]
> [INFO] Spark Project YARN . SUCCESS [02:19 
> min]
> [INFO] Spark Project Assembly . SUCCESS [ 18.912 
> s]
> [INFO] Kafka 0.10+ Token Provider for Streaming ... SUCCESS [ 57.925 
> s]
> [INFO] Spark Integration for Kafka 0.10 ... SUCCESS [01:20 
> min]
> [INFO] Kafka 0.10+ Source for Structured Streaming  SUCCESS [02:26 
> min]
> [INFO] Spark Project Examples . SUCCESS [02:00 
> min]
> [INFO] Spark Integration for Kafka 0.10 Assembly .. SUCCESS [ 28.354 
> s]
> [INFO] Spark Avro . SUCCESS [01:44 
> min]
> [INFO] 
> 
> [INFO] BUILD SUCCESS
> [INFO] 
> 
> [INFO] Total time:  17:30 h
> [INFO] Finished at: 2019-12-05T12:20:01-08:00
> [INFO] 
> 
>
> D:\apache\spark>cd bin
>
> D:\apache\spark\bin>ls
> beeline               load-spark-env.cmd  run-example       spark-shell       
> spark-sql2.cmd     sparkR.cmd
> beeline.cmd           load-spark-env.sh   run-example.cmd   spark-shell.cmd   
> spark-submit       sparkR2.cmd
> docker-image-tool.sh  pyspark             spark-class       spark-shell2.cmd  
> spark-submit.cmd
> find-spark-home       pyspark.cmd         spark-class.cmd   spark-sql         
> spark-submit2.cmd
> find-spark-home.cmd   pyspark2.cmd        spark-class2.cmd  spark-sql.cmd     
> sparkR
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>         at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>         at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>         at scala.Option.getOrElse(Option.scala:189)
>         at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>         at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>         at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>         at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> D:\apache\spark\bin>
>
> On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
>>
>> What was the build error? you didn't say. Are you sure it succeeded?
>> Try running from the Spark home dir, not bin.
>> I know we do run Windows tests and it appears to pass tests, etc.
>>
>> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>> >
>> > Hello,
>> >
>> > I understand Spark is preferably built on Linux.  But I have a Windows 
>> > machine with a slow Virtual Box for Linux.  So I wish I am able to build 
>> > and run Spark code on Windows 

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
 Project ML Local Library . SUCCESS
> [01:51 min]
> > [INFO] Spark Project GraphX ... SUCCESS
> [02:20 min]
> > [INFO] Spark Project Streaming  SUCCESS
> [03:16 min]
> > [INFO] Spark Project Catalyst . SUCCESS
> [08:45 min]
> > [INFO] Spark Project SQL .. SUCCESS
> [12:12 min]
> > [INFO] Spark Project ML Library ... SUCCESS [
> 16:28 h]
> > [INFO] Spark Project Tools  SUCCESS [
> 23.602 s]
> > [INFO] Spark Project Hive . SUCCESS
> [07:50 min]
> > [INFO] Spark Project Graph API  SUCCESS [
> 8.734 s]
> > [INFO] Spark Project Cypher ... SUCCESS [
> 12.420 s]
> > [INFO] Spark Project Graph  SUCCESS [
> 10.186 s]
> > [INFO] Spark Project REPL . SUCCESS
> [01:03 min]
> > [INFO] Spark Project YARN Shuffle Service . SUCCESS
> [01:19 min]
> > [INFO] Spark Project YARN . SUCCESS
> [02:19 min]
> > [INFO] Spark Project Assembly . SUCCESS [
> 18.912 s]
> > [INFO] Kafka 0.10+ Token Provider for Streaming ... SUCCESS [
> 57.925 s]
> > [INFO] Spark Integration for Kafka 0.10 ... SUCCESS
> [01:20 min]
> > [INFO] Kafka 0.10+ Source for Structured Streaming  SUCCESS
> [02:26 min]
> > [INFO] Spark Project Examples . SUCCESS
> [02:00 min]
> > [INFO] Spark Integration for Kafka 0.10 Assembly .. SUCCESS [
> 28.354 s]
> > [INFO] Spark Avro . SUCCESS
> [01:44 min]
> > [INFO]
> 
> > [INFO] BUILD SUCCESS
> > [INFO]
> 
> > [INFO] Total time:  17:30 h
> > [INFO] Finished at: 2019-12-05T12:20:01-08:00
> > [INFO]
> 
> >
> > D:\apache\spark>cd bin
> >
> > D:\apache\spark\bin>ls
> > beeline   load-spark-env.cmd  run-example   spark-shell
>  spark-sql2.cmd sparkR.cmd
> > beeline.cmd   load-spark-env.sh   run-example.cmd
>  spark-shell.cmd   spark-submit   sparkR2.cmd
> > docker-image-tool.sh  pyspark spark-class
>  spark-shell2.cmd  spark-submit.cmd
> > find-spark-home   pyspark.cmd spark-class.cmd   spark-sql
>  spark-submit2.cmd
> > find-spark-home.cmd   pyspark2.cmdspark-class2.cmd
> spark-sql.cmd sparkR
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> > at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> > at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> > at scala.Option.getOrElse(Option.scala:189)
> > at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> > at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> > at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> > at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> > at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> > at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> > at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >
> > D:\apache\spark\bin>
> >
> > On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
> >>
> >> What was the build error? you didn't say. Are y

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
This type of exception can occur if a dependency (most likely Guava) version is
not supported by the Spark version. What are the Spark and Guava versions? Use a
more recent Guava version dependency in the Maven pom.xml.
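
One quick way to see which Guava jar actually wins on the runtime classpath (a
hypothetical diagnostic, not something posted in this thread) is to ask the
class where it was loaded from:

// Hypothetical diagnostic: prints which jar provides Guava's Preconditions
// class on the current classpath.
import com.google.common.base.Preconditions;

public class WhichGuava {
    public static void main(String[] args) {
        System.out.println(
            Preconditions.class.getProtectionDomain()
                               .getCodeSource()
                               .getLocation());
    }
}
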
Regarding Docker, a cloud platform instance such as EC2 could be used with 
Hyper-V support.
On Thursday, December 5, 2019, 10:51:59 PM UTC, Ping Liu 
 wrote:  
 
 Hi Deepak,
Yes, I did use Maven. I even have the build pass successfully when setting 
Hadoop version to 3.2.  Please see my response to Sean's email.
Unfortunately, I only have Docker Toolbox as my Windows doesn't have Microsoft 
Hyper-V.  So I want to avoid using Docker to do major work if possible.
Thanks!
Ping

On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:

 Several alternatives are available:
- Use Maven to build Spark on Windows. 
http://spark.apache.org/docs/latest/building-spark.html#apache-maven

- Use Docker image for CDH on Windows: Docker Hub
<https://hub.docker.com/u/cloudera>





On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen 
 wrote:  
 
 What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow Virtual Box for Linux.  So I wish I am able to build and 
> run Spark code on Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package"
>
> and
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now build works.  But when I run spark-shell.  I got the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>        at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>        at scala.Option.getOrElse(Option.scala:189)
>        at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>        at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>        at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

  
  

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
Hi Deepak,

Yes, I did use Maven. I even have the build pass successfully when setting
Hadoop version to 3.2.  Please see my response to Sean's email.

Unfortunately, I only have Docker Toolbox as my Windows doesn't have
Microsoft Hyper-V.  So I want to avoid using Docker to do major work if
possible.

Thanks!

Ping


On Thu, Dec 5, 2019 at 2:24 PM Deepak Vohra  wrote:

> Several alternatives are available:
>
> - Use Maven to build Spark on Windows.
> http://spark.apache.org/docs/latest/building-spark.html#apache-maven
>
> - Use Docker image for  CDH on Windows
> Docker Hub <https://hub.docker.com/u/cloudera>
>
>
>
>
>
> On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen <
> sro...@gmail.com> wrote:
>
>
> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> machine with a slow Virtual Box for Linux.  So I wish I am able to build
> and run Spark code on Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under spark root directory and
> run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package"
> >
> > and
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> clean package"
> >
> > Now build works.  But when I run spark-shell.  I got the following error.
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> >at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> >at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> >at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> >at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> >at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> >at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> >at scala.Option.getOrElse(Option.scala:189)
> >at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> >at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> >at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> >at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> >at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> >at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> >at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> >at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >
> >
> > Has anyone experienced building and running Spark source code
> successfully on Windows?  Could you please share your experience?
> >
> > Thanks a lot!
> >
> > Ping
>
> >
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
>


Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Sean Owen
at 
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> at 
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> at 
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown 
> Source)
> at scala.Option.getOrElse(Option.scala:189)
> at 
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> D:\apache\spark\bin>
>
> On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:
>>
>> What was the build error? you didn't say. Are you sure it succeeded?
>> Try running from the Spark home dir, not bin.
>> I know we do run Windows tests and it appears to pass tests, etc.
>>
>> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>> >
>> > Hello,
>> >
>> > I understand Spark is preferably built on Linux.  But I have a Windows 
>> > machine with a slow Virtual Box for Linux.  So I wish I am able to build 
>> > and run Spark code on Windows environment.
>> >
>> > Unfortunately,
>> >
>> > # Apache Hadoop 2.6.X
>> > ./build/mvn -Pyarn -DskipTests clean package
>> >
>> > # Apache Hadoop 2.7.X and later
>> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
>> > package
>> >
>> >
>> > Both are listed on 
>> > http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>> >
>> > But neither works for me (I stay directly under spark root directory and 
>> > run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
>> > package"
>> >
>> > and
>> >
>> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
>> > clean package"
>> >
>> > Now build works.  But when I run spark-shell.  I got the following error.
>> >
>> > D:\apache\spark\bin>spark-shell
>> > Exception in thread "main" java.lang.NoSuchMethodError: 
>> > com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>> > at 
>> > org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>> > at 
>> > org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>> > at 
>> > org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>> > at 
>> > org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown 
>> > Source)
>> > at scala.Option.getOrElse(Option.scala:189)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>> > at 
>> > org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>> > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> >
>> >
>> > Has anyone experienced building and running Spark source code successfully 
>> > on Windows?  Could you please share your experience?
>> >
>> > Thanks a lot!
>> >
>> > Ping
>> >

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
ache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

D:\apache\spark\bin>

On Thu, Dec 5, 2019 at 1:33 PM Sean Owen  wrote:

> What was the build error? you didn't say. Are you sure it succeeded?
> Try running from the Spark home dir, not bin.
> I know we do run Windows tests and it appears to pass tests, etc.
>
> On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
> >
> > Hello,
> >
> > I understand Spark is preferably built on Linux.  But I have a Windows
> machine with a slow Virtual Box for Linux.  So I wish I am able to build
> and run Spark code on Windows environment.
> >
> > Unfortunately,
> >
> > # Apache Hadoop 2.6.X
> > ./build/mvn -Pyarn -DskipTests clean package
> >
> > # Apache Hadoop 2.7.X and later
> > ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package
> >
> >
> > Both are listed on
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
> >
> > But neither works for me (I stay directly under spark root directory and
> run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
> package"
> >
> > and
> >
> > Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
> clean package"
> >
> > Now build works.  But when I run spark-shell.  I got the following error.
> >
> > D:\apache\spark\bin>spark-shell
> > Exception in thread "main" java.lang.NoSuchMethodError:
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> > at
> org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> > at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> > at
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> > at
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
> Source)
> > at scala.Option.getOrElse(Option.scala:189)
> > at
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> > at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> > at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> > at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> > at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> > at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> > at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >
> >
> > Has anyone experienced building and running Spark source code
> successfully on Windows?  Could you please share your experience?
> >
> > Thanks a lot!
> >
> > Ping
> >
>


Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Deepak Vohra
 Several alternatives are available:
- Use Maven to build Spark on Windows. 
http://spark.apache.org/docs/latest/building-spark.html#apache-maven

- Use Docker image for CDH on Windows: Docker Hub
<https://hub.docker.com/u/cloudera>





On Thursday, December 5, 2019, 09:33:43 p.m. UTC, Sean Owen 
 wrote:  
 
 What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow Virtual Box for Linux.  So I wish I am able to build and 
> run Spark code on Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package"
>
> and
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now build works.  But when I run spark-shell.  I got the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
>        at 
>org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
>        at 
>org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown Source)
>        at scala.Option.getOrElse(Option.scala:189)
>        at 
>org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
>        at 
>org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
>        at 
>org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>        at 
>org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
>        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

  

Re: Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Sean Owen
What was the build error? you didn't say. Are you sure it succeeded?
Try running from the Spark home dir, not bin.
I know we do run Windows tests and it appears to pass tests, etc.

On Thu, Dec 5, 2019 at 3:28 PM Ping Liu  wrote:
>
> Hello,
>
> I understand Spark is preferably built on Linux.  But I have a Windows 
> machine with a slow Virtual Box for Linux.  So I wish I am able to build and 
> run Spark code on Windows environment.
>
> Unfortunately,
>
> # Apache Hadoop 2.6.X
> ./build/mvn -Pyarn -DskipTests clean package
>
> # Apache Hadoop 2.7.X and later
> ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean 
> package
>
>
> Both are listed on 
> http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn
>
> But neither works for me (I stay directly under spark root directory and run 
> "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package"
>
> and
>
> Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests 
> clean package"
>
> Now build works.  But when I run spark-shell.  I got the following error.
>
> D:\apache\spark\bin>spark-shell
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at 
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
> at 
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
> at 
> org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
> at 
> org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown 
> Source)
> at scala.Option.getOrElse(Option.scala:189)
> at 
> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
> at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> Has anyone experienced building and running Spark source code successfully on 
> Windows?  Could you please share your experience?
>
> Thanks a lot!
>
> Ping
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Is it feasible to build and run Spark on Windows?

2019-12-05 Thread Ping Liu
Hello,

I understand Spark is preferably built on Linux.  But I have a Windows
machine with a slow Virtual Box for Linux.  So I hope I am able to build
and run Spark code in a Windows environment.

Unfortunately,

# Apache Hadoop 2.6.X
./build/mvn -Pyarn -DskipTests clean package

# Apache Hadoop 2.7.X and later
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package


Both are listed on
http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn

But neither works for me (I stay directly under the spark root directory and
run "mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean
package").

Then I tried "mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.1 -DskipTests
clean package"

Now the build works.  But when I run spark-shell, I get the following error.

D:\apache\spark\bin>spark-shell
Exception in thread "main" java.lang.NoSuchMethodError:
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at
org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
at
org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
at
org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
at
org.apache.spark.deploy.SparkSubmit$$Lambda$132/817978763.apply(Unknown
Source)
at scala.Option.getOrElse(Option.scala:189)
at
org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
at org.apache.spark.deploy.SparkSubmit.org
$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
at
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


Has anyone experienced building and running Spark source code successfully
on Windows?  Could you please share your experience?

Thanks a lot!

Ping


Re: Not able to sort out environment settings to start spark from windows

2018-06-16 Thread Raymond Xie
Thank you. But there is no special char or space, I actually copied it from
Program Files to the root to ensure no space in the path.


**
*Sincerely yours,*


*Raymond*

On Sat, Jun 16, 2018 at 3:42 PM, vaquar khan  wrote:

> Plz check ur Java Home path .
> May be spacial char or space on ur path.
>
> Regards,
> Vaquar khan
>
> On Sat, Jun 16, 2018, 1:36 PM Raymond Xie  wrote:
>
>> I am trying to run spark-shell in Windows but receive error of:
>>
>> \Java\jre1.8.0_151\bin\java was unexpected at this time.
>>
>> Environment:
>>
>> System variables:
>>
>> SPARK_HOME:
>>
>> c:\spark
>>
>> Path:
>>
>> C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\
>> ProgramData\Anaconda2;C:\ProgramData\Anaconda2\Library\
>> mingw-w64\bin;C:\ProgramData\Anaconda2\Library\usr\bin;C:\
>> ProgramData\Anaconda2\Library\bin;C:\ProgramData\Anaconda2\
>> Scripts;C:\ProgramData\Oracle\Java\javapath;C:\Windows\
>> system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\
>> WindowsPowerShell\v1.0\;I:\Anaconda2;I:\Anaconda2\
>> Scripts;I:\Anaconda2\Library\bin;C:\Program Files
>> (x86)\sbt\\bin;C:\Program Files (x86)\Microsoft SQL
>> Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
>> Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
>> Server\100\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL
>> Server\100\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files
>> (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\;C:\Program
>> Files (x86)\Microsoft SQL Server\100\DTS\Binn\;%DDPATH%;
>> %USERPROFILE%\.dnx\bin;C:\Program Files\Microsoft DNX\Dnvm\;C:\Program
>> Files\Microsoft SQL 
>> Server\130\Tools\Binn\;C:\jre1.8.0_151\bin\server;C:\Program
>> Files (x86)\OpenSSH\bin;C:\Program Files (x86)\Calibre2\;C:\Program
>> Files\nodejs\;C:\Program Files (x86)\Skype\Phone\;
>> %JAVA_HOME%\bin;%JAVA_HOME%\jre\bin;C:\Program Files
>> (x86)\scala\bin;C:\hadoop\bin;C:\Program Files\Git\cmd;I:\Program
>> Files\EmEditor; C:\RXIE\Learning\Spark\bin;C:\spark\bin
>>
>> JAVA_HOME:
>>
>> C:\jdk1.8.0_151\bin
>>
>> JDK_HOME:
>>
>> C:\jdk1.8.0_151
>>
>> I also copied all  C:\jdk1.8.0_151 to  C:\Java\jdk1.8.0_151, and
>> received the same error.
>>
>> Any help is greatly appreciated.
>>
>> Thanks.
>>
>>
>>
>>
>> **
>> *Sincerely yours,*
>>
>>
>> *Raymond*
>>
>


Re: Not able to sort out environment settings to start spark from windows

2018-06-16 Thread vaquar khan
Please check your JAVA_HOME path.
There may be a special character or space in your path.

Regards,
Vaquar khan
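
One common cause of that launcher error, not confirmed in this thread, is a
JAVA_HOME that points at ...\bin or contains spaces, since the Windows .cmd
scripts build the command as %JAVA_HOME%\bin\java. A small check (hypothetical
sketch, hypothetical class name) can make that visible:

// Hypothetical sketch: print the variables the Windows launch scripts rely on.
public class JavaHomeCheck {
    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        System.out.println("JAVA_HOME       = " + javaHome);
        System.out.println("SPARK_HOME      = " + System.getenv("SPARK_HOME"));
        System.out.println("java.home (JVM) = " + System.getProperty("java.home"));
        if (javaHome != null && javaHome.endsWith("\\bin")) {
            System.out.println("JAVA_HOME ends with \\bin - point it at the JDK root instead.");
        }
        if (javaHome != null && javaHome.contains(" ")) {
            System.out.println("JAVA_HOME contains spaces - the .cmd scripts may mis-parse it.");
        }
    }
}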

On Sat, Jun 16, 2018, 1:36 PM Raymond Xie  wrote:

> I am trying to run spark-shell in Windows but receive error of:
>
> \Java\jre1.8.0_151\bin\java was unexpected at this time.
>
> Environment:
>
> System variables:
>
> SPARK_HOME:
>
> c:\spark
>
> Path:
>
> C:\Program Files (x86)\Common
> Files\Oracle\Java\javapath;C:\ProgramData\Anaconda2;C:\ProgramData\Anaconda2\Library\mingw-w64\bin;C:\ProgramData\Anaconda2\Library\usr\bin;C:\ProgramData\Anaconda2\Library\bin;C:\ProgramData\Anaconda2\Scripts;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;I:\Anaconda2;I:\Anaconda2\Scripts;I:\Anaconda2\Library\bin;C:\Program
> Files (x86)\sbt\\bin;C:\Program Files (x86)\Microsoft SQL
> Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
> Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
> Server\100\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL
> Server\100\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files (x86)\Microsoft
> Visual Studio 9.0\Common7\IDE\PrivateAssemblies\;C:\Program Files
> (x86)\Microsoft SQL
> Server\100\DTS\Binn\;%DDPATH%;%USERPROFILE%\.dnx\bin;C:\Program
> Files\Microsoft DNX\Dnvm\;C:\Program Files\Microsoft SQL
> Server\130\Tools\Binn\;C:\jre1.8.0_151\bin\server;C:\Program Files
> (x86)\OpenSSH\bin;C:\Program Files (x86)\Calibre2\;C:\Program
> Files\nodejs\;C:\Program Files (x86)\Skype\Phone\;
> %JAVA_HOME%\bin;%JAVA_HOME%\jre\bin;C:\Program Files
> (x86)\scala\bin;C:\hadoop\bin;C:\Program Files\Git\cmd;I:\Program
> Files\EmEditor; C:\RXIE\Learning\Spark\bin;C:\spark\bin
>
> JAVA_HOME:
>
> C:\jdk1.8.0_151\bin
>
> JDK_HOME:
>
> C:\jdk1.8.0_151
>
> I also copied all  C:\jdk1.8.0_151 to  C:\Java\jdk1.8.0_151, and received
> the same error.
>
> Any help is greatly appreciated.
>
> Thanks.
>
>
>
>
> **
> *Sincerely yours,*
>
>
> *Raymond*
>


Not able to sort out environment settings to start spark from windows

2018-06-16 Thread Raymond Xie
I am trying to run spark-shell in Windows but receive error of:

\Java\jre1.8.0_151\bin\java was unexpected at this time.

Environment:

System variables:

SPARK_HOME:

c:\spark

Path:

C:\Program Files (x86)\Common
Files\Oracle\Java\javapath;C:\ProgramData\Anaconda2;C:\ProgramData\Anaconda2\Library\mingw-w64\bin;C:\ProgramData\Anaconda2\Library\usr\bin;C:\ProgramData\Anaconda2\Library\bin;C:\ProgramData\Anaconda2\Scripts;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;I:\Anaconda2;I:\Anaconda2\Scripts;I:\Anaconda2\Library\bin;C:\Program
Files (x86)\sbt\\bin;C:\Program Files (x86)\Microsoft SQL
Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL
Server\100\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL
Server\100\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files (x86)\Microsoft
Visual Studio 9.0\Common7\IDE\PrivateAssemblies\;C:\Program Files
(x86)\Microsoft SQL
Server\100\DTS\Binn\;%DDPATH%;%USERPROFILE%\.dnx\bin;C:\Program
Files\Microsoft DNX\Dnvm\;C:\Program Files\Microsoft SQL
Server\130\Tools\Binn\;C:\jre1.8.0_151\bin\server;C:\Program Files
(x86)\OpenSSH\bin;C:\Program Files (x86)\Calibre2\;C:\Program
Files\nodejs\;C:\Program Files (x86)\Skype\Phone\;
%JAVA_HOME%\bin;%JAVA_HOME%\jre\bin;C:\Program Files
(x86)\scala\bin;C:\hadoop\bin;C:\Program Files\Git\cmd;I:\Program
Files\EmEditor; C:\RXIE\Learning\Spark\bin;C:\spark\bin

JAVA_HOME:

C:\jdk1.8.0_151\bin

JDK_HOME:

C:\jdk1.8.0_151

I also copied all  C:\jdk1.8.0_151 to  C:\Java\jdk1.8.0_151, and received
the same error.

Any help is greatly appreciated.

Thanks.




**
*Sincerely yours,*


*Raymond*


Apache spark on windows without shortnames enabled

2018-04-15 Thread ashwini
Hi,

We use Apache Spark 2.2.0 in our stack. Our software, like most other
software, gets installed by default under "C:\Program Files\". We have a
restriction that we cannot ask our customers to enable short names on their
machines. From our experience, Spark does not handle absolute paths well
when there is whitespace in the path, neither when calling spark-class2.cmd
from the command line nor in the paths inside spark-env.cmd.

We have tried the following:
1. Using double quotes around the path.
2. Escaping the whitespaces.
3. Using relative paths. 

None of them have been successful in bringing up spark. How do you recommend
handling this?



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Add snappy support for spark in Windows

2017-12-04 Thread Junfeng Chen
I have put winutils and hadoop.dll within HADOOP_HOME, and spark works well
with it, but the snappy decompress function throws the above exception.


Regard,
Junfeng Chen

On Mon, Dec 4, 2017 at 7:07 PM, Qiao, Richard <richard.q...@capitalone.com>
wrote:

> Junjeng, it worth a try to start your spark local with
> hadoop.dll/winutils.exe etc hadoop windows support package in HADOOP_HOME,
> if you didn’t do that yet.
>
>
>
> Best Regards
>
> Richard
>
>
>
>
>
> *From: *Junfeng Chen <darou...@gmail.com>
> *Date: *Monday, December 4, 2017 at 3:53 AM
> *To: *"Qiao, Richard" <richard.q...@capitalone.com>
> *Cc: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *Re: Add snappy support for spark in Windows
>
>
>
> But I am working on my local development machine, so it should have no
> relative to workers/executers.
>
>
>
> I find some documents about enable snappy on hadoop. If I want to use
> snappy with spark, do I need to config spark as hadoop or have some easy
> way to access it?
>
>
>
>
> Regard,
> Junfeng Chen
>
>
>
> On Mon, Dec 4, 2017 at 4:12 PM, Qiao, Richard <richard.q...@capitalone.com>
> wrote:
>
> It seems a common mistake that the path is not accessible by
> workers/executors.
>
>
>
> Best regards
>
> Richard
>
> Sent from my iPhone
>
>
> On Dec 3, 2017, at 22:32, Junfeng Chen <darou...@gmail.com> wrote:
>
> I am working on importing snappy compressed json file into spark rdd or
> dataset. However I meet this error: java.lang.UnsatisfiedLinkError:
> org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
>
> I have set the following configuration:
>
> SparkConf conf = new SparkConf()
>
> .setAppName("normal spark")
>
> .setMaster("local")
>
> .set("spark.io.compression.codec", 
> "org.apache.spark.io.SnappyCompressionCodec")
>
> 
> .set("spark.driver.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
>
> 
> .set("spark.driver.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
>
> 
> .set("spark.executor.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
>
> 
> .set("spark.executor.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
>
> ;
>
> Where D:\Downloads\spark-2.2.0-bin-hadoop2.7 is my spark unpacked path,
> and I can find the snappy jar file snappy-0.2.jar and
> snappy-java-1.1.2.6.jar in
>
> D:\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\jars\
>
> However nothing works and even the error message not change.
>
> How can I fix it?
>
>
>
> ref of stackoverflow: https://stackoverflow.com/questions/
> 47626012/config-snappy-support-for-spark-in-windows
> <https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows>
>
>
>
>
>
>
> Regard,
> Junfeng Chen
>
>
> --
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>
>
>
> --
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>


Re: Add snappy support for spark in Windows

2017-12-04 Thread Qiao, Richard
Junfeng, it is worth a try to start your local Spark with hadoop.dll/winutils.exe
and the rest of the Hadoop Windows support package in HADOOP_HOME, if you didn't
do that yet.
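
A minimal local-mode sketch of that setup (hypothetical paths and class name;
it assumes winutils.exe and a snappy-enabled hadoop.dll sit under
C:\hadoop\bin):

// Hypothetical local-mode sketch - the paths are examples only.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalSparkWithWinutils {
    public static void main(String[] args) {
        // Hadoop's Shell utility looks at this system property (or the
        // HADOOP_HOME environment variable) to locate bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");

        // hadoop.dll also has to be visible to the JVM's native loader,
        // e.g. by having C:\hadoop\bin on PATH or by launching the JVM with
        // -Djava.library.path=C:\hadoop\bin (it cannot be set reliably from
        // inside main()).

        SparkConf conf = new SparkConf()
                .setAppName("local spark with winutils")
                .setMaster("local[*]");

        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            System.out.println("Spark started, version " + sc.version());
        }
    }
}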

Best Regards
Richard


From: Junfeng Chen <darou...@gmail.com>
Date: Monday, December 4, 2017 at 3:53 AM
To: "Qiao, Richard" <richard.q...@capitalone.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: Add snappy support for spark in Windows

But I am working on my local development machine, so it should have no relative 
to workers/executers.

I find some documents about enable snappy on hadoop. If I want to use snappy 
with spark, do I need to config spark as hadoop or have some easy way to access 
it?


Regard,
Junfeng Chen

On Mon, Dec 4, 2017 at 4:12 PM, Qiao, Richard 
<richard.q...@capitalone.com<mailto:richard.q...@capitalone.com>> wrote:
It seems a common mistake that the path is not accessible by workers/executors.

Best regards
Richard

Sent from my iPhone

On Dec 3, 2017, at 22:32, Junfeng Chen 
<darou...@gmail.com<mailto:darou...@gmail.com>> wrote:

I am working on importing snappy compressed json file into spark rdd or 
dataset. However I meet this error: java.lang.UnsatisfiedLinkError: 
org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z

I have set the following configuration:

SparkConf conf = new SparkConf()

.setAppName("normal spark")

.setMaster("local")

.set("spark.io.compression.codec", 
"org.apache.spark.io<http://org.apache.spark.io>.SnappyCompressionCodec")


.set("spark.driver.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")


.set("spark.driver.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")


.set("spark.executor.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")


.set("spark.executor.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

;

Where D:\Downloads\spark-2.2.0-bin-hadoop2.7 is my spark unpacked path, and I 
can find the snappy jar file snappy-0.2.jar and snappy-java-1.1.2.6.jar in

D:\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\jars\

However nothing works and even the error message not change.

How can I fix it?



ref of stackoverflow: 
https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows
 
<https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows>


Regard,
Junfeng Chen



The information contained in this e-mail is confidential and/or proprietary to 
Capital One and/or its affiliates and may only be used solely in performance of 
work or services for Capital One. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed. If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.



The information contained in this e-mail is confidential and/or proprietary to 
Capital One and/or its affiliates and may only be used solely in performance of 
work or services for Capital One. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed. If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.


Re: Add snappy support for spark in Windows

2017-12-04 Thread Junfeng Chen
But I am working on my local development machine, so it should have nothing
to do with workers/executors.

I found some documents about enabling snappy on Hadoop. If I want to use
snappy with Spark, do I need to configure Spark the same way as Hadoop, or is
there some easier way to access it?


Regard,
Junfeng Chen

On Mon, Dec 4, 2017 at 4:12 PM, Qiao, Richard <richard.q...@capitalone.com>
wrote:

> It seems a common mistake that the path is not accessible by
> workers/executors.
>
> Best regards
> Richard
>
> Sent from my iPhone
>
> On Dec 3, 2017, at 22:32, Junfeng Chen <darou...@gmail.com> wrote:
>
> I am working on importing snappy compressed json file into spark rdd or
> dataset. However I meet this error: java.lang.UnsatisfiedLinkError:
> org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
>
> I have set the following configuration:
>
> SparkConf conf = new SparkConf()
> .setAppName("normal spark")
> .setMaster("local")
> .set("spark.io.compression.codec", 
> "org.apache.spark.io.SnappyCompressionCodec")
> 
> .set("spark.driver.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
> 
> .set("spark.driver.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
> 
> .set("spark.executor.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
> 
> .set("spark.executor.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
> ;
>
> Where D:\Downloads\spark-2.2.0-bin-hadoop2.7 is my spark unpacked path,
> and I can find the snappy jar file snappy-0.2.jar and
> snappy-java-1.1.2.6.jar in
>
> D:\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\jars\
>
> However nothing works and even the error message not change.
>
> How can I fix it?
>
>
> ref of stackoverflow: https://stackoverflow.com/questions/47626012/
> config-snappy-support-for-spark-in-windows
> <https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows>
>
>
>
> Regard,
> Junfeng Chen
>
>
> --
>
>


Re: Add snappy support for spark in Windows

2017-12-04 Thread Qiao, Richard
It seems to be a common mistake that the path is not accessible by workers/executors.

Best regards
Richard

Sent from my iPhone

On Dec 3, 2017, at 22:32, Junfeng Chen <darou...@gmail.com> wrote:


I am working on importing snappy compressed json file into spark rdd or 
dataset. However I meet this error: java.lang.UnsatisfiedLinkError: 
org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z

I have set the following configuration:

SparkConf conf = new SparkConf()
.setAppName("normal spark")
.setMaster("local")
.set("spark.io.compression.codec", 
"org.apache.spark.io<http://org.apache.spark.io>.SnappyCompressionCodec")

.set("spark.driver.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.driver.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.executor.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.executor.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
;

Here D:\Downloads\spark-2.2.0-bin-hadoop2.7 is the path where I unpacked Spark, and I
can find the snappy jar files snappy-0.2.jar and snappy-java-1.1.2.6.jar in

D:\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\jars\

However, nothing works, and the error message does not even change.

How can I fix it?


ref of stackoverflow:
https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows


Regard,
Junfeng Chen




Add snappy support for spark in Windows

2017-12-03 Thread Junfeng Chen
I am working on importing snappy compressed json file into spark rdd or
dataset. However I meet this error: java.lang.UnsatisfiedLinkError:
org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z

I have set the following configuration:

SparkConf conf = new SparkConf()
.setAppName("normal spark")
.setMaster("local")
.set("spark.io.compression.codec",
"org.apache.spark.io.SnappyCompressionCodec")

.set("spark.driver.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.driver.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.executor.extraLibraryPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")

.set("spark.executor.extraClassPath","D:\\Downloads\\spark-2.2.0-bin-hadoop2.7\\spark-2.2.0-bin-hadoop2.7\\jars")
;

Here D:\Downloads\spark-2.2.0-bin-hadoop2.7 is the path where I unpacked Spark, and
I can find the snappy jar files snappy-0.2.jar and snappy-java-1.1.2.6.jar in

D:\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\jars\

However, nothing works, and the error message does not even change.

How can I fix it?


ref of stackoverflow:
https://stackoverflow.com/questions/47626012/config-snappy-support-for-spark-in-windows



Regard,
Junfeng Chen


Re: Spark (on Windows) not picking up HADOOP_CONF_DIR

2016-07-17 Thread Jacek Laskowski
Hi,

How did you set it? How do you run the app? Use sys.env to know whether it
was set or not.

Jacek

On 17 Jul 2016 11:33 a.m., "Daniel Haviv" 
wrote:

> Hi,
> I'm running Spark using IntelliJ on Windows and even though I set
> HADOOP_CONF_DIR it does not affect the contents of sc.hadoopConfiguration.
>
> Anybody encountered it ?
>
> Thanks,
> Daniel
>


Spark (on Windows) not picking up HADOOP_CONF_DIR

2016-07-17 Thread Daniel Haviv
Hi,
I'm running Spark using IntelliJ on Windows and even though I set
HADOOP_CONF_DIR it does not affect the contents of sc.hadoopConfiguration.

Anybody encountered it ?

Thanks,
Daniel
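
A hedged sketch of the check Jacek suggests above, plus a manual fallback (the paths
are assumptions, not from the thread):

    // confirm the variable is visible to the JVM that IntelliJ launched
    println(sys.env.get("HADOOP_CONF_DIR"))

    // if it is not, add it to the run configuration's environment variables,
    // or load the Hadoop XML files into the context explicitly:
    import org.apache.hadoop.fs.Path
    sc.hadoopConfiguration.addResource(new Path("C:\\hadoop\\conf\\core-site.xml"))
    sc.hadoopConfiguration.addResource(new Path("C:\\hadoop\\conf\\hdfs-site.xml"))

HADOOP_CONF_DIR is normally honoured by the spark-submit launch scripts, which put it
on the classpath; an application started straight from the IDE typically does not get
that treatment, which would explain the empty sc.hadoopConfiguration.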


Re: Spark on Windows platform

2016-03-01 Thread Sabarish Sasidharan
If all you want is Spark standalone, then it's as simple as installing the
binaries and calling spark-submit with your main class. I would advise
against running Hadoop on Windows, it's a bit of trouble. But yes, you
can do it if you want to.

Regards
Sab

On 29-Feb-2016 6:58 pm, "gaurav pathak" <gauravpathak...@gmail.com> wrote:

> Can someone guide me the steps and information regarding, installation of
> SPARK on Windows 7/8.1/10 , as well as on Windows Server. Also, it will be
> great to read your experiences in using SPARK on Windows platform.
>
>
> Thanks & Regards,
> Gaurav Pathak
>


Re: Spark on Windows platform

2016-02-29 Thread Steve Loughran

On 29 Feb 2016, at 13:40, gaurav pathak <gauravpathak...@gmail.com> wrote:


Thanks Jorn.

Any guidance on how to get started with getting SPARK on Windows, is highly 
appreciated.

Thanks & Regards

Gaurav Pathak


you are at risk of seeing stack traces when you try to talk to the local 
filesystem, on account of (a) hadoop being part of the process and (b) it 
needing some native windows binaries

details: https://wiki.apache.org/hadoop/WindowsProblems

those binaries: https://github.com/steveloughran/winutils

(I need to add some 2.7.2 binaries in there, by the look of things)
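
A hedged illustration of that winutils setup (the install location C:\hadoop is an
arbitrary assumption):

    // set before the first SparkContext / local filesystem access
    System.setProperty("hadoop.home.dir", "C:\\hadoop")
    val sc = new org.apache.spark.SparkContext(
      new org.apache.spark.SparkConf().setAppName("local test").setMaster("local[*]"))

Alternatively set the HADOOP_HOME environment variable; either way C:\hadoop\bin must
contain winutils.exe (and hadoop.dll) built for the Hadoop version Spark was compiled
against.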



Re: Spark on Windows platform

2016-02-29 Thread Gaurav Agarwal
> Hi
> I am running spark on windows but a standalone one.
>
> Use this code
>
> SparkConf conf = new
SparkConf().setMaster("local[1]").setAppName("spark").setSparkHome("c:/spark/bin/spark-submit.cmd");
>
> Where sparkHome is the path where you extracted your spark binaries, down to
bin/*.cmd
>
> You will get spark context or streaming context
>
> Thanks
>
> On Feb 29, 2016 7:10 PM, "gaurav pathak" <gauravpathak...@gmail.com>
wrote:
>>
>> Thanks Jorn.
>>
>> Any guidance on how to get started with getting SPARK on Windows, is
highly appreciated.
>>
>> Thanks & Regards
>>
>> Gaurav Pathak
>>
>> ~ sent from handheld device
>>
>> On Feb 29, 2016 5:34 AM, "Jörn Franke" <jornfra...@gmail.com> wrote:
>>>
>>> I think Hortonworks has a Windows Spark distribution. Maybe Bigtop as
well?
>>>
>>> > On 29 Feb 2016, at 14:27, gaurav pathak <gauravpathak...@gmail.com>
wrote:
>>> >
>>> > Can someone guide me the steps and information regarding,
installation of SPARK on Windows 7/8.1/10 , as well as on Windows Server.
Also, it will be great to read your experiences in using SPARK on Windows
platform.
>>> >
>>> >
>>> > Thanks & Regards,
>>> > Gaurav Pathak


Re: Spark on Windows platform

2016-02-29 Thread gaurav pathak
Thanks Jorn.

Any guidance on how to get started with getting SPARK on Windows, is highly
appreciated.

Thanks & Regards

Gaurav Pathak

~ sent from handheld device
On Feb 29, 2016 5:34 AM, "Jörn Franke" <jornfra...@gmail.com> wrote:

> I think Hortonworks has a Windows Spark distribution. Maybe Bigtop as well?
>
> > On 29 Feb 2016, at 14:27, gaurav pathak <gauravpathak...@gmail.com>
> wrote:
> >
> > Can someone guide me the steps and information regarding, installation
> of SPARK on Windows 7/8.1/10 , as well as on Windows Server. Also, it will
> be great to read your experiences in using SPARK on Windows platform.
> >
> >
> > Thanks & Regards,
> > Gaurav Pathak
>


Re: Spark on Windows platform

2016-02-29 Thread Jörn Franke
I think Hortonworks has a Windows Spark distribution. Maybe Bigtop as well? 

> On 29 Feb 2016, at 14:27, gaurav pathak <gauravpathak...@gmail.com> wrote:
> 
> Can someone guide me the steps and information regarding, installation of 
> SPARK on Windows 7/8.1/10 , as well as on Windows Server. Also, it will be 
> great to read your experiences in using SPARK on Windows platform.
> 
> 
> Thanks & Regards,
> Gaurav Pathak

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark on Windows platform

2016-02-29 Thread gaurav pathak
Can someone guide me the steps and information regarding, installation of
SPARK on Windows 7/8.1/10 , as well as on Windows Server. Also, it will be
great to read your experiences in using SPARK on Windows platform.


Thanks & Regards,
Gaurav Pathak


Re: Spark on Windows

2016-02-15 Thread UMESH CHAUDHARY
You can check "spark.master" property in conf/spark-defaults.conf and try
to give IP of the VM in place of "localhost".

On Tue, Feb 16, 2016 at 7:48 AM, KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:

> Hi,
>
> I am new to spark and starting working on it by writing small programs. I
> am able to run those in cloudera quickstart VM but not able to run in the
> eclipse when giving master URL
>
> *Steps I perfromed:*
>
> Started Master and can access it through http://localhost:8080
>
> Started worker and access it.
>
> Ran the wordcount by giving master as spark://localhost:7077 but no output
> and I cant see the application Id also in master web UI.
>
> I tried with master as local and was able to run successfully. I want to
> run on the master so that I can view logs in master and worker. any
> suggestions for this?
>
> Thanks,
> Asmath
>
>
>


Spark on Windows

2016-02-15 Thread KhajaAsmath Mohammed
Hi,

I am new to Spark and started working on it by writing small programs. I
am able to run those in the Cloudera QuickStart VM but not able to run them from
Eclipse when giving the master URL.

*Steps I performed:*

Started the master and can access it through http://localhost:8080

Started a worker and can access it.

Ran the word count giving the master as spark://localhost:7077, but there is no output
and I can't see the application ID in the master web UI either.

I tried with the master as local and it ran successfully. I want to
run against the master so that I can view logs on the master and worker. Any
suggestions for this?

Thanks,
Asmath
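
A hedged example of the change Umesh suggests above, with illustrative values (the IP,
port and jar path are assumptions):

    val conf = new org.apache.spark.SparkConf()
      .setAppName("wordcount")
      .setMaster("spark://192.168.56.101:7077")       // the VM's IP, not localhost on the host
      .setJars(Seq("target/wordcount-assembly.jar"))  // ship the application classes to the executors
    val sc = new org.apache.spark.SparkContext(conf)

When the driver runs inside Eclipse, the standalone workers do not have the project's
classes on their classpath, so setJars (or packaging the app and using spark-submit) is
usually needed as well.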


Spark 1.5.2 error on quitting spark in windows 7

2015-12-09 Thread skypickle
If I start spark-shell then just quit, I get an error.


scala> :q
Stopping spark context.
15/12/09 23:43:32 ERROR ShutdownHookManager: Exception while deleting Spark
temp dir:
C:\Users\Stefan\AppData\Local\Temp\spark-68d3a813-9c55-4649-aa7a-5fc269e669e7
java.io.IOException: Failed to delete:
C:\Users\Stefan\AppData\Local\Temp\spark-68d3a813-9c55-4649-aa7a-5fc269e669e7
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884)

So, if you use winutils to examine the directory:

C:\Users\Stefan\AppData\Local\Temp>winutils ls
spark-cb325426-4a3c-48ec-becc-baaa077bea1f
drwx-- 1 BloomBear-SSD\Stefan BloomBear-SSD\None 0 Dec 10 2015
spark-cb325426-4a3c-48ec-becc-baaa077bea1f

I interpret this to mean that the OWNER has read/write/execute privileges on
this folder.
So why does Scala have a problem deleting it?

Just for fun I also installed a set of Windows executables that are ports of
common UNIX utilities -
http://sourceforge.net/projects/unxutils/?source=typ_redirect

So now I can run a command like ls and get:

C:\Users\Stefan\AppData\Local\Temp>ls -al
total 61
drwxrwxrwx   1 user group   0 Dec  9 23:44 .
drwxrwxrwx   1 user group   0 Dec  9 22:27 ..
drwxrwxrwx   1 user group   0 Dec  9 23:43
61135062-623a-4624-b406-fbd0ae9308ae_resources
drwxrwxrwx   1 user group   0 Dec  9 23:43
9cc17e8c-2941-4768-9f55-e740e54dab0b_resources
-rw-rw-rw-   1 user group   0 Sep  4  2013
FXSAPIDebugLogFile.txt
drwxrwxrwx   1 user group   0 Dec  9 23:43 Stefan
-rw-rw-rw-   1 user group   16400 Dec  9 21:07
etilqs_3SQb9MejUX0BHwy
-rw-rw-rw-   1 user group2052 Dec  9 21:41
etilqs_8YWZWJEClIYRrKf
drwxrwxrwx   1 user group   0 Dec  9 23:43 hsperfdata_Stefan
-rw-rw-rw-   1 user group   19968 Dec  9 23:09
jansi-64-1-8475478299913367674.11
-rw-rw-rw-   1 user group   18944 Dec  9 23:43 jansi-64-1.5.2.dll
-rw-rw-rw-   1 user group2031 Dec  9 23:15
sbt3359615202868869571.log
drwxrwxrwx   1 user group   0 Dec  9 23:43
spark-68d3a813-9c55-4649-aa7a-5fc269e669e7

Now the Spark directory is being seen by Windows as fully readable by
EVERYONE.
In any event, can someone enlighten me about their environment to avoid this
irritating error? Here is my environment:

windows 7 64 bit
Spark 1.5.2
Scala 2.10.6
Python 2.7.10 (from Anaconda)

PATH includes:
C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6\bin
C:\ProgramData\Oracle\Java\javapath
C:\Users\Stefan\scala
C:\Users\Stefan\hadoop-2.6.0\bin
C:\ProgramData\Oracle\Java\javapath

SYSTEM variables set are:
SPARK_HOME=C:\Users\Stefan\spark-1.5.2-bin-hadoop2.6
JAVA_HOME=C:\Program Files\Java\jre1.8.0_65
HADOOP_HOME=C:\Users\Stefan\hadoop-2.6.0
(where the bin\winutils resides)
winutils.exe chmod 777 /tmp/hive

\tmp\hive directory at root on C; drive with full permissions,
e.g.
>winutils ls \tmp\hive
drwxrwxrwx 1 BloomBear-SSD\Stefan BloomBear-SSD\None 0 Dec  8 2015 \tmp\hive




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-5-2-error-on-quitting-spark-in-windows-7-tp25659.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
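
A hedged note beyond what the thread covers: the exception is raised by Spark's shutdown
hook while something (often winutils or the JVM itself) still holds a handle inside the
temp directory, so it is noisy but does not affect the results of the session. One
workaround some people use is to move the scratch directories somewhere easy to wipe,
with illustrative paths:

in conf\spark-defaults.conf:

    spark.local.dir    C:/spark-temp

and when launching:

    spark-shell --driver-java-options "-Djava.io.tmpdir=C:/spark-temp"

This does not always silence the message, but it keeps the leftovers in one place.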



RE: Error building Spark on Windows with sbt

2015-10-30 Thread Judy Nash
I have not had any success building using sbt/sbt on Windows.
However, I have been able to build the binary by using the Maven command directly.

From: Richard Eggert [mailto:richard.egg...@gmail.com]
Sent: Sunday, October 25, 2015 12:51 PM
To: Ted Yu <yuzhih...@gmail.com>
Cc: User <user@spark.apache.org>
Subject: Re: Error building Spark on Windows with sbt

Yes, I know, but it would be nice to be able to test things myself before I 
push commits.

On Sun, Oct 25, 2015 at 3:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
If you have a pull request, Jenkins can test your change for you.

FYI

On Oct 25, 2015, at 12:43 PM, Richard Eggert <richard.egg...@gmail.com> wrote:
Also, if I run the Maven build on Windows or Linux without setting 
-DskipTests=true, it hangs indefinitely when it gets to 
org.apache.spark.JavaAPISuite.

It's hard to test patches when the build doesn't work. :-/

On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert <richard.egg...@gmail.com> wrote:
By "it works", I mean, "It gets past that particular error". It still fails 
several minutes later with a different error:

java.lang.IllegalStateException: impossible to get artifacts when data has not 
been loaded. IvyNode = org.scala-lang#scala-library;2.10.3


On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert <richard.egg...@gmail.com> wrote:

When I try to start up sbt for the Spark build,  or if I try to import it in 
IntelliJ IDEA as an sbt project, it fails with a "No such file or directory" 
error when it attempts to "git clone" sbt-pom-reader into 
.sbt/0.13/staging/some-sha1-hash.

If I manually create the expected directory before running sbt or importing 
into IntelliJ, then it works. Why is it necessary to do this,  and what can be 
done to make it not necessary?

Rich



--
Rich



--
Rich



--
Rich


Re: Error building Spark on Windows with sbt

2015-10-25 Thread Ted Yu
If you have a pull request, Jenkins can test your change for you. 

FYI 

> On Oct 25, 2015, at 12:43 PM, Richard Eggert  wrote:
> 
> Also, if I run the Maven build on Windows or Linux without setting 
> -DskipTests=true, it hangs indefinitely when it gets to 
> org.apache.spark.JavaAPISuite.
> 
> It's hard to test patches when the build doesn't work. :-/
> 
>> On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert  
>> wrote:
>> By "it works", I mean, "It gets past that particular error". It still fails 
>> several minutes later with a different error: 
>> 
>> java.lang.IllegalStateException: impossible to get artifacts when data has 
>> not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3
>> 
>> 
>>> On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert  
>>> wrote:
>>> When I try to start up sbt for the Spark build,  or if I try to import it 
>>> in IntelliJ IDEA as an sbt project, it fails with a "No such file or 
>>> directory" error when it attempts to "git clone" sbt-pom-reader into 
>>> .sbt/0.13/staging/some-sha1-hash.
>>> 
>>> If I manually create the expected directory before running sbt or importing 
>>> into IntelliJ, then it works. Why is it necessary to do this,  and what can 
>>> be done to make it not necessary?
>>> 
>>> Rich
>>> 
>> 
>> 
>> 
>> -- 
>> Rich
> 
> 
> 
> -- 
> Rich


Re: Error building Spark on Windows with sbt

2015-10-25 Thread Richard Eggert
Yes, I know, but it would be nice to be able to test things myself before I
push commits.

On Sun, Oct 25, 2015 at 3:50 PM, Ted Yu  wrote:

> If you have a pull request, Jenkins can test your change for you.
>
> FYI
>
> On Oct 25, 2015, at 12:43 PM, Richard Eggert 
> wrote:
>
> Also, if I run the Maven build on Windows or Linux without setting
> -DskipTests=true, it hangs indefinitely when it gets to
> org.apache.spark.JavaAPISuite.
>
> It's hard to test patches when the build doesn't work. :-/
>
> On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert 
> wrote:
>
>> By "it works", I mean, "It gets past that particular error". It still
>> fails several minutes later with a different error:
>>
>> java.lang.IllegalStateException: impossible to get artifacts when data
>> has not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3
>>
>>
>> On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert > > wrote:
>>
>>> When I try to start up sbt for the Spark build,  or if I try to import
>>> it in IntelliJ IDEA as an sbt project, it fails with a "No such file or
>>> directory" error when it attempts to "git clone" sbt-pom-reader into
>>> .sbt/0.13/staging/some-sha1-hash.
>>>
>>> If I manually create the expected directory before running sbt or
>>> importing into IntelliJ, then it works. Why is it necessary to do this,
>>> and what can be done to make it not necessary?
>>>
>>> Rich
>>>
>>
>>
>>
>> --
>> Rich
>>
>
>
>
> --
> Rich
>
>


-- 
Rich


Error building Spark on Windows with sbt

2015-10-25 Thread Richard Eggert
When I try to start up sbt for the Spark build,  or if I try to import it
in IntelliJ IDEA as an sbt project, it fails with a "No such file or
directory" error when it attempts to "git clone" sbt-pom-reader into
.sbt/0.13/staging/some-sha1-hash.

If I manually create the expected directory before running sbt or importing
into IntelliJ, then it works. Why is it necessary to do this,  and what can
be done to make it not necessary?

Rich


Re: Error building Spark on Windows with sbt

2015-10-25 Thread Richard Eggert
Also, if I run the Maven build on Windows or Linux without setting
-DskipTests=true, it hangs indefinitely when it gets to
org.apache.spark.JavaAPISuite.

It's hard to test patches when the build doesn't work. :-/

On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert 
wrote:

> By "it works", I mean, "It gets past that particular error". It still
> fails several minutes later with a different error:
>
> java.lang.IllegalStateException: impossible to get artifacts when data has
> not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3
>
>
> On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert 
> wrote:
>
>> When I try to start up sbt for the Spark build,  or if I try to import it
>> in IntelliJ IDEA as an sbt project, it fails with a "No such file or
>> directory" error when it attempts to "git clone" sbt-pom-reader into
>> .sbt/0.13/staging/some-sha1-hash.
>>
>> If I manually create the expected directory before running sbt or
>> importing into IntelliJ, then it works. Why is it necessary to do this,
>> and what can be done to make it not necessary?
>>
>> Rich
>>
>
>
>
> --
> Rich
>



-- 
Rich


Re: Error building Spark on Windows with sbt

2015-10-25 Thread Richard Eggert
By "it works", I mean, "It gets past that particular error". It still fails
several minutes later with a different error:

java.lang.IllegalStateException: impossible to get artifacts when data has
not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3


On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert 
wrote:

> When I try to start up sbt for the Spark build,  or if I try to import it
> in IntelliJ IDEA as an sbt project, it fails with a "No such file or
> directory" error when it attempts to "git clone" sbt-pom-reader into
> .sbt/0.13/staging/some-sha1-hash.
>
> If I manually create the expected directory before running sbt or
> importing into IntelliJ, then it works. Why is it necessary to do this,
> and what can be done to make it not necessary?
>
> Rich
>



-- 
Rich


Re: Download Apache Spark on Windows 7 for a Proof of Concept installation

2015-07-26 Thread Jörn Franke
Use a Hadoop distribution that supports Windows and has Spark included.
Generally - if you want to use windows - you should use the server version.

Le sam. 25 juil. 2015 à 20:11, Peter Leventis pleven...@telkomsa.net a
écrit :

 I just wanted an easy step by step guide as to exactly what version of what
 ever to download for a Proof of Concept installation of Apache Spark on
 Windows 7. I have spent quite some time following  a number of different
 recipes to no avail. I have tried about 10 different permutations to date.

 I prefer the easiest approach, e.g. download Pre-build Version of ... etc



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Download-Apache-Spark-on-Windows-7-for-a-Proof-of-Concept-installation-tp23992.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: Download Apache Spark on Windows 7 for a Proof of Concept installation

2015-07-26 Thread Peter Leventis
Thank you for the answers. I followed numerous recipes, including videos, and
encountered many obstacles, such as 7-Zip being unable to unzip the *.gx file and
the need to use SBT.

My situation is fixed: I use a Windows 7 PC (not Linux). I would be very
grateful for an approach that simply works. This is the first time in 15
years that I have struggled so much to download and install Open Source
software from Apache. I managed to download and install Apache Drill in
minutes.

Apache Spark is just so awkward! Please help. Any version would do for the
required proof of concept.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Download-Apache-Spark-on-Windows-7-for-a-Proof-of-Concept-installation-tp23992p23998.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Download Apache Spark on Windows 7 for a Proof of Concept installation

2015-07-25 Thread Peter Leventis
I just wanted an easy step-by-step guide as to exactly what version of whatever
to download for a Proof of Concept installation of Apache Spark on
Windows 7. I have spent quite some time following a number of different
recipes to no avail. I have tried about 10 different permutations to date.

I prefer the easiest approach, e.g. downloading the pre-built version of ... etc.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Download-Apache-Spark-on-Windows-7-for-a-Proof-of-Concept-installation-tp23992.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: spark on Windows 2008 failed to save RDD to windows shared folder

2015-05-26 Thread Wang, Ningjun (LNG-NPV)
It is Hadoop-2.4.0 with spark-1.3.0.

I found that the problem only happen if there are multi nodes. If the cluster 
has only one node, it works fine.

For example if the cluster has a spark-master on machine A and a spark-worker 
on machine B, this problem happen. If both spark-master and spark-worker are on 
machine A, then no problem.

I do not use HDFS. I am just saving the RDD to a window share folder
rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")

With the T: drive mapped to \\10.196.119.230\myshare

Ningjun

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, May 22, 2015 5:02 PM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: spark on Windows 2008 failed to save RDD to windows shared folder

The stack trace is related to hdfs.

Can you tell us which hadoop release you are using ?

Is this a secure cluster ?

Thanks

On Fri, May 22, 2015 at 1:55 PM, Wang, Ningjun (LNG-NPV) 
ningjun.w...@lexisnexis.com wrote:
I used spark standalone cluster on Windows 2008. I kept on getting the 
following error when trying to save an RDD to a windows shared folder

rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")

15/05/22 16:49:05 ERROR Executor: Exception in task 0.0 in stage 12.0 (TID 12)
java.io.IOException: Mkdirs failed to create 
file:/T:/lab4-win02/IndexRoot01/tobacco-07/tmp/docs-150522204904805.op/_temporary/0/_temporary/attempt_201505221649_0012_m_00_12
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
at 
org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1071)
at 
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
at 
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:527)
at 
org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:63)
at 
org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1068)
at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The T: drive is mapped to a windows shared folder, e.g.  T:  -
\\10.196.119.230\myshare

The id running spark does have write permission to this folder. It works most 
of the time but failed sometime.

Can anybody tell me what is the problem here?

Please advise. Thanks.



spark on Windows 2008 failed to save RDD to windows shared folder

2015-05-22 Thread Wang, Ningjun (LNG-NPV)
I used spark standalone cluster on Windows 2008. I kept on getting the 
following error when trying to save an RDD to a windows shared folder

rdd.saveAsObjectFile(file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj)

15/05/22 16:49:05 ERROR Executor: Exception in task 0.0 in stage 12.0 (TID 12)
java.io.IOException: Mkdirs failed to create 
file:/T:/lab4-win02/IndexRoot01/tobacco-07/tmp/docs-150522204904805.op/_temporary/0/_temporary/attempt_201505221649_0012_m_00_12
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
at 
org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1071)
at 
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
at 
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:527)
at 
org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:63)
at 
org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1068)
at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The T: drive is mapped to a windows shared folder, e.g.  T:  -  
\\10.196.119.230\myshare

The id running spark does have write permission to this folder. It works most 
of the time but failed sometime.

Can anybody tell me what is the problem here?

Please advise. Thanks.


Re: spark on Windows 2008 failed to save RDD to windows shared folder

2015-05-22 Thread Ted Yu
The stack trace is related to hdfs.

Can you tell us which hadoop release you are using ?

Is this a secure cluster ?

Thanks

On Fri, May 22, 2015 at 1:55 PM, Wang, Ningjun (LNG-NPV) 
ningjun.w...@lexisnexis.com wrote:

  I used spark standalone cluster on Windows 2008. I kept on getting the
 following error when trying to save an RDD to a windows shared folder




 rdd.saveAsObjectFile(“file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj”)



 15/05/22 16:49:05 ERROR Executor: Exception in task 0.0 in stage 12.0 (TID
 12)

 java.io.IOException: Mkdirs failed to create
 file:/T:/lab4-win02/IndexRoot01/tobacco-07/tmp/docs-150522204904805.op/_temporary/0/_temporary/attempt_201505221649_0012_m_00_12

 at
 org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)

 at
 org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)

 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)

 at
 org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1071)

 at
 org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)

 at
 org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:527)

 at
 org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:63)

 at
 org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)

 at
 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1068)

 at
 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)

 at
 org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)

 at org.apache.spark.scheduler.Task.run(Task.scala:64)

 at
 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:745)

  The T: drive is mapped to a windows shared folder, e.g.  T:  -
 \\10.196.119.230\myshare



 The id running spark does have write permission to this folder. It works
 most of the time but failed sometime.



 Can anybody tell me what is the problem here?



 Please advise. Thanks.



Fwd: Change ivy cache for spark on Windows

2015-04-27 Thread Burak Yavuz
+user

-- Forwarded message --
From: Burak Yavuz brk...@gmail.com
Date: Mon, Apr 27, 2015 at 1:59 PM
Subject: Re: Change ivy cache for spark on Windows
To: mj jone...@gmail.com


Hi,

In your conf file (SPARK_HOME\conf\spark-defaults.conf) you can set:

`spark.jars.ivy \your\path`


Best,
Burak
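
The same setting can also be passed on the command line when launching the shell; a
hedged example with an arbitrary cache directory:

    .\spark-shell.cmd --conf spark.jars.ivy=C:\ivy_cache --packages com.databricks:spark-csv_2.10:1.0.3

A cache path without spaces sidesteps the URISyntaxException raised for the default
C:/Users/My Name/.ivy2 location.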

On Mon, Apr 27, 2015 at 1:49 PM, mj jone...@gmail.com wrote:

 Hi,

 I'm having trouble using the --packages option for spark-shell.cmd - I have
 to use Windows at work and have been issued a username with a space in it
 that means when I use the --packages option it fails with this message:

 Exception in thread main java.net.URISyntaxException: Illegal character
 in path at index 13: C:/Users/My Name/.ivy2/jars/spark-csv_2.10.jar

 The command I'm trying to run is:
 .\spark-shell.cmd --packages com.databricks:spark-csv_2.10:1.0.3

 I've tried creating an ivysettings.xml file with the content below in my
 .ivy2 directory, but spark doesn't seem to pick it up. Does anyone have any
 ideas of how to get around this issue?

 <ivysettings>
   <caches defaultCacheDir="c:\ivy_cache"/>
 </ivysettings>




 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Change-ivy-cache-for-spark-on-Windows-tp22675.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Change ivy cache for spark on Windows

2015-04-27 Thread mj
Hi,

I'm having trouble using the --packages option for spark-shell.cmd - I have
to use Windows at work and have been issued a username with a space in it
that means when I use the --packages option it fails with this message:

Exception in thread main java.net.URISyntaxException: Illegal character
in path at index 13: C:/Users/My Name/.ivy2/jars/spark-csv_2.10.jar

The command I'm trying to run is:
.\spark-shell.cmd --packages com.databricks:spark-csv_2.10:1.0.3

I've tried creating an ivysettings.xml file with the content below in my
.ivy2 directory, but spark doesn't seem to pick it up. Does anyone have any
ideas of how to get around this issue?

<ivysettings>
  <caches defaultCacheDir="c:\ivy_cache"/>
</ivysettings>




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Change-ivy-cache-for-spark-on-Windows-tp22675.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on Windows

2015-04-17 Thread Sree V
The spark 'master' branch (i.e. v1.4.0) builds successfully on Windows 8.1, Intel i7
64-bit, with Oracle JDK 8u45, with MAVEN_OPTS set without the flag
-XX:ReservedCodeCacheSize=1g.
It takes about 33 minutes.
Thanking you.

With Regards
Sree  
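
A hedged sketch of a Maven invocation along those lines (the profile and memory settings
follow the Spark 1.x building docs and are assumptions to adjust):

    set MAVEN_OPTS=-Xmx2g -XX:MaxPermSize=512m
    mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

run from a plain cmd prompt in the source root.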


 On Thursday, April 16, 2015 9:07 PM, Arun Lists lists.a...@gmail.com 
wrote:
   

 Here is what I got from the engineer who worked on building Spark and using it 
on Windows:
1)  Hadoop winutils.exe is needed on Windows, even for local files – and you
have to set the hadoop.home.dir in spark-class2.cmd (for the two lines with
$RUNNER near the end), by adding "-Dhadoop.home.dir=dir" after downloading the
Hadoop binaries + winutils.
2)  Java/Spark cannot delete the Spark temporary files and it throws an
exception (the program still works though). Manual clean-up works just fine,
and it is not a permissions issue, as it has rights to create the file (I have
also tried using my own directory rather than the default, same error).
3)  Tried building Spark again, and have attached the log – I don't get any
errors, just warnings. However, when I try to use that JAR I just get the error
message "Error: Could not find or load main class
org.apache.spark.deploy.SparkSubmit".
On Thu, Apr 16, 2015 at 12:19 PM, Arun Lists lists.a...@gmail.com wrote:

Thanks, Matei! We'll try that and let you know if it works. You are correct in 
inferring that some of the problems we had were with dependencies.
We also had problems with the spark-submit scripts. I will get the details from 
the engineer who worked on the Windows builds and provide them to you.
arun

On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

You could build Spark with Scala 2.11 on Mac / Linux and transfer it over to 
Windows. AFAIK it should build on Windows too, the only problem is that Maven 
might take a long time to download dependencies. What errors are you seeing?

Matei

 On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:

 We run Spark on Mac and Linux but also need to run it on Windows 8.1 and  
 Windows Server. We ran into problems with the Scala 2.10 binary bundle for 
 Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we are on 
 Scala 2.11.6 (we built Spark from the sources). On Windows, however despite 
 our best efforts we cannot get Spark 1.3.0 as built from sources working for 
 Scala 2.11.6. Spark has too many moving parts and dependencies!

 When can we expect to see a binary bundle for Spark 1.3.0 that is built for 
 Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0 is 
 still built for Scala 2.10 is because Kafka is still on Scala 2.10. For those 
 of us who don't use Kafka, can we have a Scala 2.10 bundle.

 If there isn't an official bundle arriving any time soon, can someone who has 
 built it for Windows 8.1 successfully please share with the group?

 Thanks,
 arun








-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

  

Re: Spark on Windows

2015-04-17 Thread Arun Lists
Thanks, Sree!

Are you able to run your applications using spark-submit? Even after we
were able to build successfully, we ran into problems with running the
spark-submit script. If everything worked correctly for you, we can hope
that things will be smoother when 1.4.0 is made generally available.

arun

On Thu, Apr 16, 2015 at 10:18 PM, Sree V sree_at_ch...@yahoo.com wrote:

 spark 'master' branch (i.e. v1.4.0) builds successfully on windows 8.1
 intel i7 64-bit with oracle jdk8_45.
 with maven opts without the flag -XX:ReservedCodeCacheSize=1g.
 takes about 33 minutes.

 Thanking you.

 With Regards
 Sree




   On Thursday, April 16, 2015 9:07 PM, Arun Lists lists.a...@gmail.com
 wrote:


 Here is what I got from the engineer who worked on building Spark and
 using it on Windows:

 1)  Hadoop winutils.exe is needed on Windows, even for local files – and
 you have to set the Hadoop.home.dir in the spark-class2.cmd (for the two
 lines with $RUNNER near the end, by adding “-Dhadoop.home.dir=dir” file
 after downloading Hadoop binaries + winutils.
 2)  Java/Spark cannot delete the spark temporary files and it throws an
 exception (program still works though).  Manual clean-up works just fine,
 and it is not a permissions issue as it has rights to create the file (I
 have also tried using my own directory rather than the default, same error).
 3)  tried building Spark again, and have attached the log – I don’t get
 any errors, just warnings.  However when I try to use that JAR I just get
 the error message “Error: Could not find or load main class
 org.apache.spark.deploy.SparkSubmit”.

 On Thu, Apr 16, 2015 at 12:19 PM, Arun Lists lists.a...@gmail.com wrote:

 Thanks, Matei! We'll try that and let you know if it works. You are
 correct in inferring that some of the problems we had were with
 dependencies.

 We also had problems with the spark-submit scripts. I will get the details
 from the engineer who worked on the Windows builds and provide them to you.

 arun


 On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com
 wrote:

 You could build Spark with Scala 2.11 on Mac / Linux and transfer it over
 to Windows. AFAIK it should build on Windows too, the only problem is that
 Maven might take a long time to download dependencies. What errors are you
 seeing?

 Matei

  On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:
 
  We run Spark on Mac and Linux but also need to run it on Windows 8.1
 and  Windows Server. We ran into problems with the Scala 2.10 binary bundle
 for Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we
 are on Scala 2.11.6 (we built Spark from the sources). On Windows, however
 despite our best efforts we cannot get Spark 1.3.0 as built from sources
 working for Scala 2.11.6. Spark has too many moving parts and dependencies!
 
  When can we expect to see a binary bundle for Spark 1.3.0 that is built
 for Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0
 is still built for Scala 2.10 is because Kafka is still on Scala 2.10. For
 those of us who don't use Kafka, can we have a Scala 2.10 bundle.
 
  If there isn't an official bundle arriving any time soon, can someone
 who has built it for Windows 8.1 successfully please share with the group?
 
  Thanks,
  arun
 





 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org





Spark on Windows

2015-04-16 Thread Arun Lists
We run Spark on Mac and Linux but also need to run it on Windows 8.1 and
 Windows Server. We ran into problems with the Scala 2.10 binary bundle for
Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we are on
Scala 2.11.6 (we built Spark from the sources). On Windows, however despite
our best efforts we cannot get Spark 1.3.0 as built from sources working
for Scala 2.11.6. Spark has too many moving parts and dependencies!

When can we expect to see a binary bundle for Spark 1.3.0 that is built for
Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0 is
still built for Scala 2.10 is because Kafka is still on Scala 2.10. For
those of us who don't use Kafka, can we have a Scala 2.10 bundle.

If there isn't an official bundle arriving any time soon, can someone who
has built it for Windows 8.1 successfully please share with the group?

Thanks,
arun


Re: Spark on Windows

2015-04-16 Thread Matei Zaharia
You could build Spark with Scala 2.11 on Mac / Linux and transfer it over to 
Windows. AFAIK it should build on Windows too, the only problem is that Maven 
might take a long time to download dependencies. What errors are you seeing?

Matei

 On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:
 
 We run Spark on Mac and Linux but also need to run it on Windows 8.1 and  
 Windows Server. We ran into problems with the Scala 2.10 binary bundle for 
 Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we are on 
 Scala 2.11.6 (we built Spark from the sources). On Windows, however despite 
 our best efforts we cannot get Spark 1.3.0 as built from sources working for 
 Scala 2.11.6. Spark has too many moving parts and dependencies!
 
 When can we expect to see a binary bundle for Spark 1.3.0 that is built for 
 Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0 is 
 still built for Scala 2.10 is because Kafka is still on Scala 2.10. For those 
 of us who don't use Kafka, can we have a Scala 2.10 bundle.
 
 If there isn't an official bundle arriving any time soon, can someone who has 
 built it for Windows 8.1 successfully please share with the group?
 
 Thanks,
 arun
 


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on Windows

2015-04-16 Thread Arun Lists
Thanks, Matei! We'll try that and let you know if it works. You are correct
in inferring that some of the problems we had were with dependencies.

We also had problems with the spark-submit scripts. I will get the details
from the engineer who worked on the Windows builds and provide them to you.

arun


On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com
wrote:

 You could build Spark with Scala 2.11 on Mac / Linux and transfer it over
 to Windows. AFAIK it should build on Windows too, the only problem is that
 Maven might take a long time to download dependencies. What errors are you
 seeing?

 Matei

  On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:
 
  We run Spark on Mac and Linux but also need to run it on Windows 8.1
 and  Windows Server. We ran into problems with the Scala 2.10 binary bundle
 for Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we
 are on Scala 2.11.6 (we built Spark from the sources). On Windows, however
 despite our best efforts we cannot get Spark 1.3.0 as built from sources
 working for Scala 2.11.6. Spark has too many moving parts and dependencies!
 
  When can we expect to see a binary bundle for Spark 1.3.0 that is built
 for Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0
 is still built for Scala 2.10 is because Kafka is still on Scala 2.10. For
 those of us who don't use Kafka, can we have a Scala 2.10 bundle.
 
  If there isn't an official bundle arriving any time soon, can someone
 who has built it for Windows 8.1 successfully please share with the group?
 
  Thanks,
  arun
 




Re: Spark on Windows

2015-04-16 Thread Stephen Boesch
The Hadoop support from Hortonworks only *actually* works with Windows
Server - well, at least as of Spark Summit last year - and AFAIK that has
not changed since.

2015-04-16 15:18 GMT-07:00 Dean Wampler deanwamp...@gmail.com:

 If you're running Hadoop, too, now that Hortonworks supports Spark, you
 might be able to use their distribution.

 Dean Wampler, Ph.D.
 Author: Programming Scala, 2nd Edition
 http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
 Typesafe http://typesafe.com
 @deanwampler http://twitter.com/deanwampler
 http://polyglotprogramming.com

 On Thu, Apr 16, 2015 at 2:19 PM, Arun Lists lists.a...@gmail.com wrote:

 Thanks, Matei! We'll try that and let you know if it works. You are
 correct in inferring that some of the problems we had were with
 dependencies.

 We also had problems with the spark-submit scripts. I will get the
 details from the engineer who worked on the Windows builds and provide them
 to you.

 arun


 On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com
 wrote:

 You could build Spark with Scala 2.11 on Mac / Linux and transfer it
 over to Windows. AFAIK it should build on Windows too, the only problem is
 that Maven might take a long time to download dependencies. What errors are
 you seeing?

 Matei

  On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:
 
  We run Spark on Mac and Linux but also need to run it on Windows 8.1
 and  Windows Server. We ran into problems with the Scala 2.10 binary bundle
 for Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we
 are on Scala 2.11.6 (we built Spark from the sources). On Windows, however
 despite our best efforts we cannot get Spark 1.3.0 as built from sources
 working for Scala 2.11.6. Spark has too many moving parts and dependencies!
 
  When can we expect to see a binary bundle for Spark 1.3.0 that is
 built for Scala 2.11.6?  I read somewhere that the only reason that Spark
 1.3.0 is still built for Scala 2.10 is because Kafka is still on Scala
 2.10. For those of us who don't use Kafka, can we have a Scala 2.10 bundle.
 
  If there isn't an official bundle arriving any time soon, can someone
 who has built it for Windows 8.1 successfully please share with the group?
 
  Thanks,
  arun
 






Re: Spark on Windows

2015-04-16 Thread Dean Wampler
If you're running Hadoop, too, now that Hortonworks supports Spark, you
might be able to use their distribution.

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com

On Thu, Apr 16, 2015 at 2:19 PM, Arun Lists lists.a...@gmail.com wrote:

 Thanks, Matei! We'll try that and let you know if it works. You are
 correct in inferring that some of the problems we had were with
 dependencies.

 We also had problems with the spark-submit scripts. I will get the details
 from the engineer who worked on the Windows builds and provide them to you.

 arun


 On Thu, Apr 16, 2015 at 10:44 AM, Matei Zaharia matei.zaha...@gmail.com
 wrote:

 You could build Spark with Scala 2.11 on Mac / Linux and transfer it over
 to Windows. AFAIK it should build on Windows too, the only problem is that
 Maven might take a long time to download dependencies. What errors are you
 seeing?

 Matei

  On Apr 16, 2015, at 9:23 AM, Arun Lists lists.a...@gmail.com wrote:
 
  We run Spark on Mac and Linux but also need to run it on Windows 8.1
 and  Windows Server. We ran into problems with the Scala 2.10 binary bundle
 for Spark 1.3.0 but managed to get it working. However, on Mac/Linux, we
 are on Scala 2.11.6 (we built Spark from the sources). On Windows, however
 despite our best efforts we cannot get Spark 1.3.0 as built from sources
 working for Scala 2.11.6. Spark has too many moving parts and dependencies!
 
  When can we expect to see a binary bundle for Spark 1.3.0 that is built
 for Scala 2.11.6?  I read somewhere that the only reason that Spark 1.3.0
 is still built for Scala 2.10 is because Kafka is still on Scala 2.10. For
 those of us who don't use Kafka, can we have a Scala 2.10 bundle.
 
  If there isn't an official bundle arriving any time soon, can someone
 who has built it for Windows 8.1 successfully please share with the group?
 
  Thanks,
  arun
 





Error when running Spark on Windows 8.1

2015-04-07 Thread Arun Lists
Hi,

We are trying to run a Spark application using spark-submit on Windows 8.1.
The application runs successfully to completion on MacOS 10.10 and on
Ubuntu Linux. On Windows, we get the following error messages (see below).
It appears that Spark is trying to delete some temporary directory that it
creates.

How do we solve this problem?

Thanks,
arun

5/04/07 10:55:14 ERROR Utils: Exception while deleting Spark temp dir:
C:\Users\JOSHMC~1\AppData\Local\Temp\spark-339bf2d9-8b89-46e9-b5c1-404caf9d3cd7\userFiles-62976ef7-ab56-41c0-a35b-793c7dca31c7

java.io.IOException: Failed to delete:
C:\Users\JOSHMC~1\AppData\Local\Temp\spark-339bf2d9-8b89-46e9-b5c1-404caf9d3cd7\userFiles-62976ef7-ab56-41c0-a35b-793c7dca31c7

  at
org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:932)

  at
org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:181)

  at
org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:179)

  at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)

  at
org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply$mcV$sp(Utils.scala:179)

  at
org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)

  at
org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)

  at
org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)

  at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:177)


Building Spark on Windows WAS: Any IRC channel on Spark?

2015-03-17 Thread Ted Yu
Have you tried with -X switch ?

Thanks



 On Mar 17, 2015, at 1:47 AM, Ahmed Nawar ahmed.na...@gmail.com wrote:
 
 Dears,
 
 Is there any instructions to build spark 1.3.0 on windows 7.
 
 I tried mvn -Phive -Phive-thriftserver -DskipTests clean package but i 
 got below errors
 
 
 [INFO] Spark Project Parent POM ... SUCCESS [  7.845 
 s]
 [INFO] Spark Project Networking ... SUCCESS [ 26.209 
 s]
 [INFO] Spark Project Shuffle Streaming Service  SUCCESS [  9.701 
 s]
 [INFO] Spark Project Core . SUCCESS [04:29 
 min]
 [INFO] Spark Project Bagel  SUCCESS [ 22.215 
 s]
 [INFO] Spark Project GraphX ... SUCCESS [ 59.676 
 s]
 [INFO] Spark Project Streaming  SUCCESS [01:46 
 min]
 [INFO] Spark Project Catalyst . SUCCESS [01:40 
 min]
 [INFO] Spark Project SQL .. SUCCESS [03:05 
 min]
 [INFO] Spark Project ML Library ... FAILURE [03:49 
 min]
 [INFO] Spark Project Tools  SKIPPED
 [INFO] Spark Project Hive . SKIPPED
 [INFO] Spark Project REPL . SKIPPED
 [INFO] Spark Project Hive Thrift Server ... SKIPPED
 [INFO] Spark Project Assembly . SKIPPED
 [INFO] Spark Project External Twitter . SKIPPED
 [INFO] Spark Project External Flume Sink .. SKIPPED
 [INFO] Spark Project External Flume ... SKIPPED
 [INFO] Spark Project External MQTT  SKIPPED
 [INFO] Spark Project External ZeroMQ .. SKIPPED
 [INFO] Spark Project External Kafka ... SKIPPED
 [INFO] Spark Project Examples . SKIPPED
 [INFO] Spark Project External Kafka Assembly .. SKIPPED
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 16:58 min
 [INFO] Finished at: 2015-03-17T11:04:40+03:00
 [INFO] Final Memory: 77M/1840M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default) on project 
 spark-mllib_2.10: Failed during scalastyle exe
 p 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10
 
 
 
 
 
  
 
 
 On Tue, Mar 17, 2015 at 10:06 AM, Akhil Das ak...@sigmoidanalytics.com 
 wrote:
 There's one on Freenode, You can join #Apache-Spark There's like 60 people 
 idling. :)
 
 Thanks
 Best Regards
 
 On Mon, Mar 16, 2015 at 10:46 PM, Feng Lin lfliu.x...@gmail.com wrote:
 Hi, everyone,
 I'm wondering whether there is a possibility to setup an official IRC 
 channel on freenode.
 
 I noticed that a lot of apache projects would have a such channel to let 
 people talk directly.
 
 Best 
 Michael
 


Re: Building Spark on Windows WAS: Any IRC channel on Spark?

2015-03-17 Thread Ahmed Nawar
Scalastyle violation(s).
at
org.scalastyle.maven.plugin.ScalastyleViolationCheckMojo.performCheck(ScalastyleViolationCheckMojo.java:230)
... 22 more
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the
command
[ERROR]   mvn goals -rf :spark-mllib_2.10
C:\Nawwar\Hadoop\spark\spark-1.3.0>mvn -X -Phive -Phive-thriftserver
-DskipTests clean package
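
As a hedged aside, not something tried in this thread: the failure above comes from the
scalastyle-maven-plugin rather than from compilation, so a usable build can often be
obtained while the style violation is investigated by skipping the check, assuming the
plugin's standard skip property:

    mvn -Phive -Phive-thriftserver -DskipTests -Dscalastyle.skip=true clean package

The violation itself still needs fixing before any patch is submitted.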





On Tue, Mar 17, 2015 at 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:

 Have you tried with -X switch ?

 Thanks



 On Mar 17, 2015, at 1:47 AM, Ahmed Nawar ahmed.na...@gmail.com wrote:

 Dears,

 Is there any instructions to build spark 1.3.0 on windows 7.

 I tried mvn -Phive -Phive-thriftserver -DskipTests clean package but
 i got below errors


 [INFO] Spark Project Parent POM ... SUCCESS [
  7.845 s]
 [INFO] Spark Project Networking ... SUCCESS [
 26.209 s]
 [INFO] Spark Project Shuffle Streaming Service  SUCCESS [
  9.701 s]
 [INFO] Spark Project Core . SUCCESS [04:29
 min]
 [INFO] Spark Project Bagel  SUCCESS [
 22.215 s]
 [INFO] Spark Project GraphX ... SUCCESS [
 59.676 s]
 [INFO] Spark Project Streaming  SUCCESS [01:46
 min]
 [INFO] Spark Project Catalyst . SUCCESS [01:40
 min]
 [INFO] Spark Project SQL .. SUCCESS [03:05
 min]
 [INFO] Spark Project ML Library ... FAILURE [03:49
 min]
 [INFO] Spark Project Tools  SKIPPED
 [INFO] Spark Project Hive . SKIPPED
 [INFO] Spark Project REPL . SKIPPED
 [INFO] Spark Project Hive Thrift Server ... SKIPPED
 [INFO] Spark Project Assembly . SKIPPED
 [INFO] Spark Project External Twitter . SKIPPED
 [INFO] Spark Project External Flume Sink .. SKIPPED
 [INFO] Spark Project External Flume ... SKIPPED
 [INFO] Spark Project External MQTT  SKIPPED
 [INFO] Spark Project External ZeroMQ .. SKIPPED
 [INFO] Spark Project External Kafka ... SKIPPED
 [INFO] Spark Project Examples . SKIPPED
 [INFO] Spark Project External Kafka Assembly .. SKIPPED
 [INFO]
 
 [INFO] BUILD FAILURE
 [INFO]
 
 [INFO] Total time: 16:58 min
 [INFO] Finished at: 2015-03-17T11:04:40+03:00
 [INFO] Final Memory: 77M/1840M
 [INFO]
 
 [ERROR] Failed to execute goal
 org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default) on project
 spark-mllib_2.10: Failed during scalastyle execution -> [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the
 -e switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions,
 please read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10








 On Tue, Mar 17, 2015 at 10:06 AM, Akhil Das ak...@sigmoidanalytics.com
 wrote:

 There's one on Freenode. You can join #Apache-Spark; there are around 60
 people idling. :)

 Thanks
 Best Regards

 On Mon, Mar 16, 2015 at 10:46 PM, Feng Lin lfliu.x...@gmail.com wrote:

 Hi, everyone,
 I'm wondering whether there is a possibility to set up an official IRC
 channel on freenode.

 I noticed that a lot of Apache projects have such a channel to let
 people talk directly.

 Best
 Michael






Re: Building Spark on Windows WAS: Any IRC channel on Spark?

2015-03-17 Thread Ted Yu
)
 at 
 org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
 at 
 org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
 ... 19 more
 Caused by: org.apache.maven.plugin.MojoFailureException: You have 1 
 Scalastyle violation(s).
 at 
 org.scalastyle.maven.plugin.ScalastyleViolationCheckMojo.performCheck(ScalastyleViolationCheckMojo.java:230)
 ... 22 more
 [ERROR]
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10
 C:\Nawwar\Hadoop\spark\spark-1.3.0>mvn -X -Phive -Phive-thriftserver 
 -DskipTests clean package
 
 
 
 
 
 On Tue, Mar 17, 2015 at 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:
 Have you tried with the -X switch?
 
 Thanks
 
 
 
 On Mar 17, 2015, at 1:47 AM, Ahmed Nawar ahmed.na...@gmail.com wrote:
 
 Dears,
 
 Are there any instructions to build Spark 1.3.0 on Windows 7?
 
 I tried mvn -Phive -Phive-thriftserver -DskipTests clean package but
 I got the errors below.
 
 
 [INFO] Spark Project Parent POM ... SUCCESS [  
 7.845 s]
 [INFO] Spark Project Networking ... SUCCESS [ 
 26.209 s]
 [INFO] Spark Project Shuffle Streaming Service  SUCCESS [  
 9.701 s]
 [INFO] Spark Project Core . SUCCESS [04:29 
 min]
 [INFO] Spark Project Bagel  SUCCESS [ 
 22.215 s]
 [INFO] Spark Project GraphX ... SUCCESS [ 
 59.676 s]
 [INFO] Spark Project Streaming  SUCCESS [01:46 
 min]
 [INFO] Spark Project Catalyst . SUCCESS [01:40 
 min]
 [INFO] Spark Project SQL .. SUCCESS [03:05 
 min]
 [INFO] Spark Project ML Library ... FAILURE [03:49 
 min]
 [INFO] Spark Project Tools  SKIPPED
 [INFO] Spark Project Hive . SKIPPED
 [INFO] Spark Project REPL . SKIPPED
 [INFO] Spark Project Hive Thrift Server ... SKIPPED
 [INFO] Spark Project Assembly . SKIPPED
 [INFO] Spark Project External Twitter . SKIPPED
 [INFO] Spark Project External Flume Sink .. SKIPPED
 [INFO] Spark Project External Flume ... SKIPPED
 [INFO] Spark Project External MQTT  SKIPPED
 [INFO] Spark Project External ZeroMQ .. SKIPPED
 [INFO] Spark Project External Kafka ... SKIPPED
 [INFO] Spark Project Examples . SKIPPED
 [INFO] Spark Project External Kafka Assembly .. SKIPPED
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 16:58 min
 [INFO] Finished at: 2015-03-17T11:04:40+03:00
 [INFO] Final Memory: 77M/1840M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default) on project 
 spark-mllib_2.10: Failed during scalastyle execution -> [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, 
 please read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10
 
 
 
 
 
  
 
 
 On Tue, Mar 17, 2015 at 10:06 AM, Akhil Das ak...@sigmoidanalytics.com 
 wrote:
 There's one on Freenode. You can join #Apache-Spark; there are around 60 people
 idling. :)
 
 Thanks
 Best Regards
 
 On Mon, Mar 16, 2015 at 10:46 PM, Feng Lin lfliu.x...@gmail.com wrote:
 Hi, everyone,
 I'm wondering whether there is a possibility to set up an official IRC
 channel on freenode.
 
 I noticed that a lot of Apache projects have such a channel to let
 people talk directly.
 
 Best 
 Michael
 


Re: Building Spark on Windows WAS: Any IRC channel on Spark?

2015-03-17 Thread Ahmed Nawar
(Launcher.java:356)
 Caused by: org.apache.maven.plugin.MojoExecutionException: Failed during
 scalastyle execution
 at
 org.scalastyle.maven.plugin.ScalastyleViolationCheckMojo.performCheck(ScalastyleViolationCheckMojo.java:238)
 at
 org.scalastyle.maven.plugin.ScalastyleViolationCheckMojo.execute(ScalastyleViolationCheckMojo.java:199)
 at
 org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
 at
 org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
 ... 19 more
 Caused by: org.apache.maven.plugin.MojoFailureException: You have 1
 Scalastyle violation(s).
 at
 org.scalastyle.maven.plugin.ScalastyleViolationCheckMojo.performCheck(ScalastyleViolationCheckMojo.java:230)
 ... 22 more
 [ERROR]
 [ERROR]
 [ERROR] For more information about the errors and possible solutions,
 please read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10
 C:\Nawwar\Hadoop\spark\spark-1.3.0>mvn -X -Phive -Phive-thriftserver
 -DskipTests clean package





 On Tue, Mar 17, 2015 at 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:

 Have you tried with the -X switch?

 Thanks



 On Mar 17, 2015, at 1:47 AM, Ahmed Nawar ahmed.na...@gmail.com wrote:

 Dears,

 Are there any instructions to build Spark 1.3.0 on Windows 7?

 I tried mvn -Phive -Phive-thriftserver -DskipTests clean package
 but I got the errors below.


 [INFO] Spark Project Parent POM ... SUCCESS [
  7.845 s]
 [INFO] Spark Project Networking ... SUCCESS [
 26.209 s]
 [INFO] Spark Project Shuffle Streaming Service  SUCCESS [
  9.701 s]
 [INFO] Spark Project Core . SUCCESS
 [04:29 min]
 [INFO] Spark Project Bagel  SUCCESS [
 22.215 s]
 [INFO] Spark Project GraphX ... SUCCESS [
 59.676 s]
 [INFO] Spark Project Streaming  SUCCESS
 [01:46 min]
 [INFO] Spark Project Catalyst . SUCCESS
 [01:40 min]
 [INFO] Spark Project SQL .. SUCCESS
 [03:05 min]
 [INFO] Spark Project ML Library ... FAILURE
 [03:49 min]
 [INFO] Spark Project Tools  SKIPPED
 [INFO] Spark Project Hive . SKIPPED
 [INFO] Spark Project REPL . SKIPPED
 [INFO] Spark Project Hive Thrift Server ... SKIPPED
 [INFO] Spark Project Assembly . SKIPPED
 [INFO] Spark Project External Twitter . SKIPPED
 [INFO] Spark Project External Flume Sink .. SKIPPED
 [INFO] Spark Project External Flume ... SKIPPED
 [INFO] Spark Project External MQTT  SKIPPED
 [INFO] Spark Project External ZeroMQ .. SKIPPED
 [INFO] Spark Project External Kafka ... SKIPPED
 [INFO] Spark Project Examples . SKIPPED
 [INFO] Spark Project External Kafka Assembly .. SKIPPED
 [INFO]
 
 [INFO] BUILD FAILURE
 [INFO]
 
 [INFO] Total time: 16:58 min
 [INFO] Finished at: 2015-03-17T11:04:40+03:00
 [INFO] Final Memory: 77M/1840M
 [INFO]
 
 [ERROR] Failed to execute goal
 org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default) on project
 spark-mllib_2.10: Failed during scalastyle execution -> [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the
 -e switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions,
 please read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the
 command
 [ERROR]   mvn goals -rf :spark-mllib_2.10








 On Tue, Mar 17, 2015 at 10:06 AM, Akhil Das ak...@sigmoidanalytics.com
 wrote:

 There's one on Freenode. You can join #Apache-Spark; there are around 60
 people idling. :)

 Thanks
 Best Regards

 On Mon, Mar 16, 2015 at 10:46 PM, Feng Lin lfliu.x...@gmail.com wrote:

 Hi, everyone,
 I'm wondering whether there is a possibility to set up an official IRC
 channel on freenode.

 I noticed that a lot of Apache projects have such a channel to
 let people talk directly.

 Best
 Michael







Re: can not submit job to spark in windows

2015-03-12 Thread Arush Kharbanda
-56b32155-2779-4345-9597-2bfa6a87a51d\pi.py
 Traceback (most recent call last):
   File C:/spark-1.2.1-bin-hadoop2.4/bin/pi.py, line 29, in module
 sc = SparkContext(appName=PythonPi)
   File C:\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py, line 105,
 in __
 init__
 conf, jsc)
   File C:\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py, line 153,
 in _d
 o_init
 self._jsc = jsc or self._initialize_context(self._conf._jconf)
   File C:\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py, line 202,
 in _i
 nitialize_context
 return self._jvm.JavaSparkContext(jconf)
   File
 C:\spark-1.2.1-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\java_g
 ateway.py, line 701, in __call__
   File
 C:\spark-1.2.1-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\protoc
 ol.py, line 300, in get_return_value
 py4j.protocol.Py4JJavaError: An error occurred while calling
 None.org.apache.spa
 rk.api.java.JavaSparkContext.
 : java.lang.NullPointerException
 at java.lang.ProcessBuilder.start(Unknown Source)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
 at org.apache.hadoop.util.Shell.run(Shell.java:418)
 at
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
 650)
 at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
 at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
 at org.apache.spark.util.Utils$.fetchFile(Utils.scala:445)
 at org.apache.spark.SparkContext.addFile(SparkContext.scala:1004)
 at
 org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:28
 8)
 at
 org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:28
 8)
 at scala.collection.immutable.List.foreach(List.scala:318)
 at org.apache.spark.SparkContext.init(SparkContext.scala:288)
 at
 org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.sc
 ala:61)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
 Method)

 at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown
 Source)

 at
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
 Sou
 rce)
 at java.lang.reflect.Constructor.newInstance(Unknown Source)
 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
 at
 py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
 at py4j.Gateway.invoke(Gateway.java:214)
 at
 py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand
 .java:79)
 at
 py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
 at py4j.GatewayConnection.run(GatewayConnection.java:207)
 at java.lang.Thread.run(Unknown Source)

 What am I doing wrong here?

 Should I run some scripts before spark-submit.cmd?

 Regards,
 Sergey.



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/can-not-submit-job-to-spark-in-windows-tp21824.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




-- 


*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com


can not submit job to spark in windows

2015-02-26 Thread sergunok
-bin-hadoop2.4\python\pyspark\context.py, line 153,
in _d
o_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File C:\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py, line 202,
in _i
nitialize_context
return self._jvm.JavaSparkContext(jconf)
  File
C:\spark-1.2.1-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\java_g
ateway.py, line 701, in __call__
  File
C:\spark-1.2.1-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\protoc
ol.py, line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
None.org.apache.spa
rk.api.java.JavaSparkContext.
: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
650)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:445)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1004)
at
org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:28
8)
at
org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:28
8)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.init(SparkContext.scala:288)
at
org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.sc
ala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown
Source)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
Sou
rce)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand
.java:79)
at
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Unknown Source)

What am I doing wrong here?

Should I run some scripts before spark-submit.cmd?
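
Note on the NullPointerException at java.lang.ProcessBuilder.start() above:
this is the usual symptom of Hadoop's Shell helper not finding winutils.exe on
Windows, the same root cause as the "Could not locate executable
null\bin\winutils.exe" errors elsewhere in this archive. A sketch of the common
workaround, with placeholder paths and not verified on this exact setup:

    rem obtain a winutils.exe built for the Hadoop version the Spark binary
    rem targets (hadoop2.4 here) and put it at C:\hadoop\bin\winutils.exe
    set HADOOP_HOME=C:\hadoop
    set PATH=%HADOOP_HOME%\bin;%PATH%
    spark-submit pi.py

No extra scripts are required before spark-submit.cmd; the missing piece is the
winutils binary. SPARK-2356 tracks this and lists further workarounds.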

Regards,
Sergey.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/can-not-submit-job-to-spark-in-windows-tp21824.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Stand-alone Spark on windows

2015-02-26 Thread Sergey Gerasimov
Hi!

I downloaded the Spark binaries, unpacked them, and could successfully run the
pyspark shell and write and execute some code there.

BUT

I failed to submit stand-alone Python scripts or jar files via
spark-submit:
spark-submit pi.py

I always get an exception stack trace with a NullPointerException in
java.lang.ProcessBuilder.start().

What could be wrong?

Should I start some scripts before spark-submit?

I have Windows 7 and Spark 1.2.1.

Sergey.



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: Spark on Windows 2008 R2 server does not work

2015-01-29 Thread Wang, Ningjun (LNG-NPV)
I solved this problem following this article
http://qnalist.com/questions/4994960/run-spark-unit-test-on-windows-7

1) Downloaded the compiled winutils.exe from
http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) Put this file into d:\winutil\bin
3) Added in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

It solved my original problem. But then I got a new error

java.lang.NullPointerException at org.apache.hadoop.util.Shell.runCommand
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:411)
at 
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDep
endencies$6.apply(Executor.scala:350)
at 
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDep
endencies$6.apply(Executor.scala:347)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scal
a:772)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at 
org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies
(Executor.scala:347)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Very frustrating. Has anybody successfully gotten Spark running on Windows?
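
Note: this second NullPointerException (Shell.runCommand called from
FileUtil.chmod in the executor) usually means winutils.exe still could not be
located or executed when Hadoop's Shell class was initialised. Points worth
checking, offered as general guidance rather than a verified fix for this
setup: hadoop.home.dir must be set before any Hadoop class touches Shell (its
winutils lookup happens in a static initializer), the binary must sit at
%hadoop.home.dir%\bin\winutils.exe, and a 64-bit JVM needs a 64-bit winutils
built against a matching Hadoop version (a missing Visual C++ runtime can also
keep it from launching). Setting the environment variable machine-wide is often
more reliable than the system property, e.g. with the paths used above:

    setx HADOOP_HOME d:\winutil
    rem sanity check that the binary actually runs:
    d:\winutil\bin\winutils.exe ls d:\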


Regards,

Ningjun Wang
Consulting Software Engineer
LexisNexis
121 Chanlon Road
New Providence, NJ 07974-1541

-Original Message-
From: Marcelo Vanzin [mailto:van...@cloudera.com] 
Sent: Wednesday, January 28, 2015 5:15 PM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: Spark on Windows 2008 R2 server does not work

https://issues.apache.org/jira/browse/SPARK-2356

Take a look through the comments; there are some workarounds listed there.

On Wed, Jan 28, 2015 at 1:40 PM, Wang, Ningjun (LNG-NPV) 
ningjun.w...@lexisnexis.com wrote:
 Has anybody successfully installed and run spark-1.2.0 on Windows 2008
 R2 or Windows 7? How did you get it to work?



 Regards,



 Ningjun Wang

 Consulting Software Engineer

 LexisNexis

 121 Chanlon Road

 New Providence, NJ 07974-1541



 From: Wang, Ningjun (LNG-NPV) [mailto:ningjun.w...@lexisnexis.com]
 Sent: Tuesday, January 27, 2015 10:28 PM
 To: user@spark.apache.org
 Subject: Spark on Windows 2008 R2 server does not work



 I downloaded and installed the spark-1.2.0-bin-hadoop2.4.tgz pre-built
 version on a Windows 2008 R2 server. When I submit a job using
 spark-submit, I get the following error:



 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load 
 native-hadoop library for your platform

 ... using builtin-java classes where applicable

 ERROR org.apache.hadoop.util.Shell: Failed to locate the winutils 
 binary in the hadoop binary path

 java.io.IOException: Could not locate executable null\bin\winutils.exe 
 in the Hadoop binaries.

 at 
 org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)

 at 
 org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)

 at org.apache.hadoop.util.Shell.clinit(Shell.java:326)

 at 
 org.apache.hadoop.util.StringUtils.clinit(StringUtils.java:76)

 at
 org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)

 at org.apache.hadoop.security.Groups.init(Groups.java:77)

 at
 org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups
 .java:240)

 at
 org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupIn
 formation.java:255)





 Please advise. Thanks.





 Ningjun







--
Marcelo

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: Spark on Windows 2008 R2 server does not work

2015-01-28 Thread Wang, Ningjun (LNG-NPV)
Has anybody successfully installed and run spark-1.2.0 on Windows 2008 R2 or
Windows 7? How did you get it to work?

Regards,

Ningjun Wang
Consulting Software Engineer
LexisNexis
121 Chanlon Road
New Providence, NJ 07974-1541

From: Wang, Ningjun (LNG-NPV) [mailto:ningjun.w...@lexisnexis.com]
Sent: Tuesday, January 27, 2015 10:28 PM
To: user@spark.apache.org
Subject: Spark on Windows 2008 R2 server does not work

I downloaded and installed the spark-1.2.0-bin-hadoop2.4.tgz pre-built version on
a Windows 2008 R2 server. When I submit a job using spark-submit, I get the
following error:

WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform
... using builtin-java classes where applicable
ERROR org.apache.hadoop.util.Shell: Failed to locate the winutils binary in the 
hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the 
Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.clinit(Shell.java:326)
at org.apache.hadoop.util.StringUtils.clinit(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.init(Groups.java:77)
at 
org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at 
org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)


Please advise. Thanks.


Ningjun




Re: Spark on Windows 2008 R2 server does not work

2015-01-28 Thread Marcelo Vanzin
https://issues.apache.org/jira/browse/SPARK-2356

Take a look through the comments; there are some workarounds listed there.
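
For reference, the "null" in "Could not locate executable null\bin\winutils.exe"
is literally the unset Hadoop home directory: neither the HADOOP_HOME
environment variable nor the hadoop.home.dir system property was set, so Hadoop
could not build a path to winutils.exe. The workarounds in that JIRA generally
amount to providing a winutils.exe and pointing Hadoop at it before the job
runs, either through the environment or from code before the SparkContext is
created. A minimal sketch of the in-code variant, with a placeholder path
(C:\hadoop is an assumption, not something taken from this thread):

    import org.apache.spark.{SparkConf, SparkContext}

    object WinutilsExample {
      def main(args: Array[String]): Unit = {
        // assumes winutils.exe has been placed at C:\hadoop\bin\winutils.exe
        System.setProperty("hadoop.home.dir", "C:\\hadoop")
        val sc = new SparkContext(
          new SparkConf().setAppName("winutils-check").setMaster("local[*]"))
        println(sc.parallelize(1 to 10).sum())
        sc.stop()
      }
    }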

On Wed, Jan 28, 2015 at 1:40 PM, Wang, Ningjun (LNG-NPV)
ningjun.w...@lexisnexis.com wrote:
 Has anybody successfully installed and run spark-1.2.0 on Windows 2008 R2 or
 Windows 7? How did you get it to work?



 Regards,



 Ningjun Wang

 Consulting Software Engineer

 LexisNexis

 121 Chanlon Road

 New Providence, NJ 07974-1541



 From: Wang, Ningjun (LNG-NPV) [mailto:ningjun.w...@lexisnexis.com]
 Sent: Tuesday, January 27, 2015 10:28 PM
 To: user@spark.apache.org
 Subject: Spark on Windows 2008 R2 server does not work



 I downloaded and installed the spark-1.2.0-bin-hadoop2.4.tgz pre-built version on
 a Windows 2008 R2 server. When I submit a job using spark-submit, I get the
 following error:



 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
 library for your platform

 ... using builtin-java classes where applicable

 ERROR org.apache.hadoop.util.Shell: Failed to locate the winutils binary in
 the hadoop binary path

 java.io.IOException: Could not locate executable null\bin\winutils.exe in
 the Hadoop binaries.

 at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)

 at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)

 at org.apache.hadoop.util.Shell.clinit(Shell.java:326)

 at org.apache.hadoop.util.StringUtils.clinit(StringUtils.java:76)

 at
 org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)

 at org.apache.hadoop.security.Groups.init(Groups.java:77)

 at
 org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)

 at
 org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)





 Please advise. Thanks.





 Ningjun







-- 
Marcelo

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark on Windows 2008 R2 server does not work

2015-01-27 Thread Wang, Ningjun (LNG-NPV)
I downloaded and installed the spark-1.2.0-bin-hadoop2.4.tgz pre-built version on
a Windows 2008 R2 server. When I submit a job using spark-submit, I get the
following error:

WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform
... using builtin-java classes where applicable
ERROR org.apache.hadoop.util.Shell: Failed to locate the winutils binary in the 
hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the 
Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.clinit(Shell.java:326)
at org.apache.hadoop.util.StringUtils.clinit(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.init(Groups.java:77)
at 
org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at 
org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)


Please advise. Thanks.


Ningjun