Re: hadoop2.6.0 + spark1.4.1 + python2.7.10

2015-09-09 Thread Ashish Dutt
> My understanding of yarn and spark is that these binaries get compressed and packaged with Java to be pushed to the worker nodes.
> Regards,
> On Sep 7, 2015 9:00 PM, "Ashish Dutt" <ashish.du...@gmail.com> wrote:
>> Hello Sasha,
>> I have no answer

Re: hadoop2.6.0 + spark1.4.1 + python2.7.10

2015-09-07 Thread Ashish Dutt
> s not getting packages required to run the app.
> If someone confirms that I need to build everything from source with a specific version of the software I will do that, but at this point I am not sure what to do to remedy this situation...
> --sasha
> On Sun, Sep 6, 201

Re: hadoop2.6.0 + spark1.4.1 + python2.7.10

2015-09-06 Thread Ashish Dutt
Stack Overflow <http://stackoverflow.com/search?q=no+module+named+pyspark> website. Sincerely, Ashish Dutt. On Mon, Sep 7, 2015 at 7:17 AM, Sasha Kacanski <skacan...@gmail.com> wrote:
> Hi,
> I am successfully running a python app via PyCharm in local mode setMaster("local[*]")
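The `setMaster("local[*]")` call above is what separates the working local run from the failing cluster run. As a rough illustration of the master-URL forms involved (a pure-Python sketch; the helper name and the hosts are mine, not from the thread):

```python
import re

# Hypothetical helper (not from the thread): check the shape of the
# "master" URL handed to setMaster() or --master.
# "local[*]" runs Spark in-process with one thread per core;
# "spark://host:7077" targets a standalone cluster master.
def is_valid_master(url):
    local = re.fullmatch(r"local(\[(\*|\d+)\])?", url)
    standalone = re.fullmatch(r"spark://[\w.\-]+:\d+", url)
    yarn = url in ("yarn", "yarn-client", "yarn-cluster")
    return bool(local or standalone) or yarn

print(is_valid_master("local[*]"))               # True
print(is_valid_master("spark://10.0.0.5:7077"))  # True
print(is_valid_master("hdfs://namenode:9000"))   # False
```

An app that runs under `local[*]` but fails under `spark://…` usually points at the cluster side (missing packages on workers), not the app itself.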

PySpark in Pycharm- unable to connect to remote server

2015-08-05 Thread Ashish Dutt
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
  File C:/Users/ashish dutt/PycharmProjects

How to connect to remote HDFS programmatically to retrieve data, analyse it and then write the data back to HDFS?

2015-08-05 Thread Ashish Dutt
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
  File C:/Users/ashish dutt/PycharmProjects

Re: SparkR Error in sparkR.init(master=“local”) in RStudio

2015-07-13 Thread Ashish Dutt
()? Where is it in the Windows environment? Thanks for your help. Sincerely, Ashish Dutt. On Mon, Jul 13, 2015 at 3:48 PM, Sun, Rui rui@intel.com wrote:
> Hi Kachau,
> If you are using SparkR with RStudio, have you followed the guidelines in the section "Using SparkR from RStudio" in https

Re: Is it possible to change the default port number 7077 for spark?

2015-07-13 Thread Ashish Dutt
Hello Arun, Thank you for the descriptive response. And thank you for providing the sample file too. It certainly is a great help. Sincerely, Ashish On Mon, Jul 13, 2015 at 10:30 PM, Arun Verma arun.verma...@gmail.com wrote: PFA sample file On Mon, Jul 13, 2015 at 7:37 PM, Arun Verma
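For the record, in standalone mode the port this thread asks about is configurable through `conf/spark-env.sh`. A minimal sketch; the port values below are arbitrary examples, not taken from the thread:

```shell
# conf/spark-env.sh (standalone mode); example values only
export SPARK_MASTER_PORT=7177         # master RPC port, default 7077
export SPARK_MASTER_WEBUI_PORT=8090   # master web UI, default 8080
```

Workers and drivers must then point at the new port, e.g. `spark://master-host:7177`.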

Re: Connecting to nodes on cluster

2015-07-09 Thread Ashish Dutt
Hello Akhil, Thanks for the response. I will have to figure this out. Sincerely, Ashish On Thu, Jul 9, 2015 at 3:40 PM, Akhil Das ak...@sigmoidanalytics.com wrote: On Wed, Jul 8, 2015 at 7:31 PM, Ashish Dutt ashish.du...@gmail.com wrote: Hi, We have a cluster with 4 nodes. The cluster

Re: PySpark MLlib: py4j cannot find trainImplicitALSModel method

2015-07-08 Thread Ashish Dutt
My apologies for double posting, but I missed the web links that I followed, which are:
1. http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/
2. http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/
3.
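The linked posts boil down to pointing PySpark's driver Python at IPython via environment variables before launching. A hedged sketch of that setup; the paths are examples, not values from the thread:

```shell
# Example setup following the linked posts; adjust paths for your install.
export SPARK_HOME=/opt/spark
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
"$SPARK_HOME/bin/pyspark"   # starts PySpark inside an IPython notebook
```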

Re: Parallelizing multiple RDD / DataFrame creation in Spark

2015-07-08 Thread Ashish Dutt
Thank you, Akhil, for the link. Sincerely, Ashish Dutt, PhD Candidate, Department of Information Systems, University of Malaya, Lembah Pantai, 50603 Kuala Lumpur, Malaysia. On Wed, Jul 8, 2015 at 3:43 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
> Have a look http://alvinalexander.com/scala/how

Re: PySpark MLlib: py4j cannot find trainImplicitALSModel method

2015-07-08 Thread Ashish Dutt
and hence not much help to me. I am able to launch ipython on localhost but cannot get it to work on the cluster Sincerely, Ashish Dutt On Wed, Jul 8, 2015 at 5:49 PM, sooraj soora...@gmail.com wrote: That turned out to be a silly data type mistake. At one point in the iterative call, I

How to upgrade Spark version in CDH 5.4

2015-07-08 Thread Ashish Dutt
Hi, I need to upgrade Spark from version 1.3 to version 1.4 on CDH 5.4. I checked the documentation here

Re: Getting started with spark-scala development in eclipse.

2015-07-08 Thread Ashish Dutt
Hello Prateek, I started with getting the pre-built binaries so as to skip the hassle of building them from scratch. I am not familiar with Scala so I can't comment on it. I have documented my experiences on my blog www.edumine.wordpress.com. Perhaps it might be useful to you. On 08-Jul-2015 9:39

Connecting to nodes on cluster

2015-07-08 Thread Ashish Dutt
Sincerely, Ashish Dutt

Re: Parallelizing multiple RDD / DataFrame creation in Spark

2015-07-08 Thread Ashish Dutt
Thanks for your reply, Akhil. How do you multithread it? Sincerely, Ashish Dutt. On Wed, Jul 8, 2015 at 3:29 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
> What's the point of creating them in parallel? You can multi-thread it to run it in parallel though. Thanks, Best Regards. On Wed, Jul 8

Re: Connecting to nodes on cluster

2015-07-08 Thread Ashish Dutt
The error is "JVM has not responded after 10 seconds". On 08-Jul-2015 10:54 PM, ayan guha guha.a...@gmail.com wrote:
> What's the error you are getting? On 9 Jul 2015 00:01, Ashish Dutt ashish.du...@gmail.com wrote:
>> Hi, We have a cluster with 4 nodes. The cluster uses CDH 5.4 for the past two

DLL load failed: %1 is not a valid win32 application on invoking pyspark

2015-07-08 Thread Ashish Dutt
Sincerely, Ashish Dutt

Re: PySpark without PySpark

2015-07-08 Thread Ashish Dutt
written something wrong here. Cannot seem to figure out what it is. Thank you for your help. Sincerely, Ashish Dutt. On Thu, Jul 9, 2015 at 11:53 AM, Sujit Pal sujitatgt...@gmail.com wrote:
> Hi Ashish, Nice post. Agreed, kudos to the author of the post, Benjamin Benfort of District Labs

How to verify that the worker is connected to master in CDH5.4

2015-07-07 Thread Ashish Dutt
Hi, I have CDH 5.4 installed on a Linux server. It has 1 cluster in which Spark is deployed as a history server. I am trying to connect my laptop to the Spark history server. When I run spark-shell --master <ip>:<port> I get the following output. How can I verify that the worker is connected to
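One quick sanity check before reading spark-shell output is whether the master's RPC port is reachable from the laptop at all. A minimal stdlib sketch; the host and port in the example comment are placeholders, not values from this thread:

```python
import socket

# Probe whether a host is accepting TCP connections on a given port,
# e.g. a Spark standalone master on its default RPC port 7077.
def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder host): port_open("192.168.1.10", 7077)
```

If the port is closed, the problem is networking or the master process, not the spark-shell invocation.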

Re: How to verify that the worker is connected to master in CDH5.4

2015-07-07 Thread Ashish Dutt
Thank you, Ayan, for your response. But I have just realised that Spark is configured to be a history server. Please, can somebody suggest how I can convert the Spark history server to be a master server? Thank you. Sincerely, Ashish Dutt. On Wed, Jul 8, 2015 at 12:28 PM, ayan guha guha.a

Re: How to verify that the worker is connected to master in CDH5.4

2015-07-07 Thread Ashish Dutt
initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/08 11:28:35 INFO SecurityManager: Changing view acls to: Ashish Dutt
15/07/08 11:28:35 INFO

Re: How to verify that the worker is connected to master in CDH5.4

2015-07-07 Thread Ashish Dutt
All I want for now is to connect my laptop to the spark cluster machine using either pyspark or SparkR (I have Python 2.7). On my laptop I am using winutils in place of Hadoop and have Spark 1.4 installed. Thank you. Sincerely, Ashish Dutt, PhD Candidate, Department of Information Systems, University
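The winutils setup mentioned here usually means pointing `HADOOP_HOME` at a directory that contains `bin\winutils.exe` on Windows. A small stdlib sketch to verify that layout; the `C:\hadoop` path is an example, not one taken from the thread:

```python
import os

# Check that a candidate HADOOP_HOME directory actually contains
# bin/winutils.exe, which Spark needs on Windows in place of Hadoop.
def winutils_present(hadoop_home):
    return os.path.isfile(os.path.join(hadoop_home, "bin", "winutils.exe"))

# Example (placeholder path): winutils_present(r"C:\hadoop")
```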

JVM is not ready after 10 seconds.

2015-07-06 Thread Ashish Dutt
the sparkR.init() from RStudio. Any help in this regard will be appreciated. Thank you, Ashish Dutt

Re: JVM is not ready after 10 seconds

2015-07-06 Thread Ashish Dutt
Hello Shivaram, Thank you for your response. Being a novice at this stage, can you also tell me how to configure or set the execute permission for the spark-submit file? Thank you for your time. Sincerely, Ashish Dutt. On Tue, Jul 7, 2015 at 9:21 AM, Shivaram Venkataraman shiva
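Setting the execute permission asked about here is just `chmod +x` on the spark-submit script. A Python equivalent for reference; the spark-submit path in the comment is an example location, not one confirmed by the thread:

```python
import os
import stat

# Programmatic equivalent of `chmod +x <path>`: add the execute bit
# for user, group, and others on top of the file's current mode.
def make_executable(path):
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# Example (placeholder path): make_executable("/opt/spark/bin/spark-submit")
```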

Re: JVM is not ready after 10 seconds

2015-07-06 Thread Ashish Dutt
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
Sincerely, Ashish Dutt
On Tue, Jul 7, 2015 at 9:30 AM, Ashish Dutt ashish.du...@gmail.com wrote:
> Hello Shivaram, Thank you for your response. Being a novice at this stage