Running Java Program using Eclipse on Existing Spark Cluster

2016-03-09 Thread Gaini Rajeshwar
Hi All, I have one master & 2 workers on my local machine. I wrote the following Java program to count the number of lines in the README.md file (I am using a Maven project to do this): import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.api.java.JavaRDD; import

Re: Installing Spark on Mac

2016-03-09 Thread Gaini Rajeshwar
It should just work with these steps. You don't need to configure much. As mentioned, some settings on your machine are overriding the default Spark settings. Running as super-user should not be a problem either; it works just fine that way as well. Can you tell us what version of Java you are

Re: GroupBy on DataFrame taking too much time

2016-01-11 Thread Gaini Rajeshwar
e issue,
> please check the jdbc link and the data is loaded successfully??
>
> Thanks
> Xingchi
>
> 2016-01-11 15:43 GMT+08:00 Gaini Rajeshwar <raja.rajeshwar2...@gmail.com>:
>
>> Hi All,
>>
>> I have a table named *customer *(customer_id, event, countr

Re: Getting an error while submitting spark jar

2016-01-11 Thread Gaini Rajeshwar
Hi Sree, I think it has to be *--class mllib.perf.TestRunner* instead of *--class mllib.perf.TesRunner*
On Mon, Jan 11, 2016 at 1:19 PM, Sree Eedupuganti wrote:
> The way how i submitting jar
>
> hadoop@localhost:/usr/local/hadoop/spark$ ./bin/spark-submit \
>
> --class

XML column not supported in Database

2016-01-11 Thread Gaini Rajeshwar
Hi All, I am using a PostgreSQL database. I am using the following JDBC call to access a customer table *(customer_id int, event text, country text, content xml)* in my database. *val dataframe1 = sqlContext.load("jdbc", Map("url" ->
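For the XML column question above, one workaround that is often suggested (it is not confirmed anywhere in this thread) is to hand Spark a parenthesised subquery that casts the xml column to text, so the JDBC source only ever sees a text column. A minimal sketch in plain Java, assuming the column list from the post. The class name, the alias `customer_text` and the connection URL are placeholders, and the Spark call shown in the comment assumes the `DataFrameReader` API rather than the `sqlContext.load` call used in the post:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class XmlColumnJdbcOptions {

    // Subquery that reads the xml column as text. Spark's JDBC source accepts
    // anything valid in a FROM clause as "dbtable", including a subquery in
    // parentheses with an alias. The alias name is a placeholder.
    static String xmlAsTextSubquery() {
        return "(SELECT customer_id, event, country, "
             + "CAST(content AS text) AS content FROM customer) AS customer_text";
    }

    // Options for the JDBC source. The URL passed in is supplied by the caller;
    // the thread does not show the real one.
    static Map<String, String> jdbcOptions(String url) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("url", url);
        opts.put("dbtable", xmlAsTextSubquery());
        return opts;
    }

    public static void main(String[] args) {
        // Placeholder URL, for illustration only.
        Map<String, String> opts = jdbcOptions("jdbc:postgresql://localhost:5432/mydb");
        // With Spark on the classpath, the assumed call would be:
        //   DataFrame df = sqlContext.read().format("jdbc").options(opts).load();
        System.out.println(opts.get("dbtable"));
    }
}
```

The content column then arrives as a plain string, so any XML parsing would have to happen on the Spark side.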

Re: XML column not supported in Database

2016-01-11 Thread Gaini Rajeshwar
RK
>
> On Mon, Jan 11, 2016 at 1:44 AM, Gaini Rajeshwar <raja.rajeshwar2...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am using PostgreSQL database. I am using the following jdbc call to
>> access a customer table (*customer_id int, event text, count

Re: Unable to compile from source

2016-01-11 Thread Gaini Rajeshwar
stions/21252800/maven-trusting-all-certs-unlimited-java-policy
>
> Check above link to know how to disable SSL check.
>
> - hareesh.
> On Jan 8, 2016 4:54 PM, "Gaini Rajeshwar" <raja.rajeshwar2...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am new to

GroupBy on DataFrame taking too much time

2016-01-10 Thread Gaini Rajeshwar
Hi All, I have a table named *customer* (customer_id, event, country, ) in a PostgreSQL database. The table has more than 100 million rows. I want to know the number of events from each country. To achieve that, I am doing a groupBy using Spark as follows. *val dataframe1 =
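On the slow groupBy: when the JDBC source is given only a URL and a table name, Spark reads the whole table through a single partition, which is a plausible (but unconfirmed) reason a scan of 100 million rows takes so long. A sketch of the partitioning options the JDBC source accepts, in plain Java. It assumes customer_id is numeric; the class name, URL, ID bounds and partition count are made-up placeholders, not values from the thread:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionedJdbcOptions {

    // Builds JDBC source options that split the read of the customer table
    // into numPartitions ranges of customer_id. All four partitioning options
    // must be given together.
    static Map<String, String> partitionedOptions(String url, long minId,
                                                  long maxId, int numPartitions) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("url", url);
        opts.put("dbtable", "customer");
        opts.put("partitionColumn", "customer_id"); // assumed to be numeric
        opts.put("lowerBound", Long.toString(minId));
        opts.put("upperBound", Long.toString(maxId));
        opts.put("numPartitions", Integer.toString(numPartitions));
        return opts;
    }

    public static void main(String[] args) {
        // Placeholder URL and bounds, for illustration only.
        Map<String, String> opts = partitionedOptions(
                "jdbc:postgresql://localhost:5432/mydb", 1L, 100000000L, 16);
        // With Spark on the classpath, the assumed call would be:
        //   DataFrame df = sqlContext.read().format("jdbc").options(opts).load();
        //   df.groupBy("country").count().show();
        for (Map.Entry<String, String> e : opts.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

The bounds only decide how the customer_id range is cut into partitions; rows outside them are still read, so they do not need to be exact.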

Unable to compile from source

2016-01-08 Thread Gaini Rajeshwar
Hi All, I am new to Apache Spark. I have downloaded the *Spark 1.6.0 (Jan 04 2016) source code version*. I ran the following command as per the Spark documentation. build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0