Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
AND %sql select f13, count(1) value from summary group by f13 throws org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 132.0 failed 4 times, most recent failure: Lost task 0.3 in stage 132.0 (TID 3364, datanode-9-7497.phx01.dev.ebayc3.com): java.lang.NumberFormatE

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
select f1,f11 from summary works, but when I do select f1, f11 from summary group by f1 it throws the error org.apache.spark.sql.AnalysisException: expression 'f1' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you ge
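The rule behind that AnalysisException can be sketched with plain Scala collections standing in for the summary table (the `Row` fields and data below are illustrative, not the real schema): grouping collapses each key to one output row, so any non-grouped column must be aggregated.

```scala
// Illustrative stand-in for the summary table; field names are assumptions.
case class Row(f1: String, f11: Int)

val rows = Seq(Row("a", 1), Row("a", 2), Row("b", 3))

// Grouping by f1 leaves one output row per key, so f11 must be aggregated
// (count, sum, ...) -- there is no single f11 value per group.
val counts = rows.groupBy(_.f1).map { case (k, g) => (k, g.size) }

// The "wrap in first()" suggestion corresponds to picking an arbitrary member:
val firsts = rows.groupBy(_.f1).map { case (k, g) => (k, g.head.f11) }
```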

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
Figured it out val summary = rowStructText.filter(s => s.length != 1).map(s => s.split("\t")) AND select * from summary shows the table On Wed, Aug 5, 2015 at 10:37 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote: > For some reason the path of the HDFS is coming up in the data i am reading. > > > rowStructText*.fil
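The fix quoted above can be sketched locally with a plain Seq in place of the rowStructText RDD (the sample lines below are made up for illustration): drop the stray one-character lines before splitting on tabs.

```scala
// `lines` stands in for the rowStructText RDD; sample data is hypothetical.
val lines = Seq("2015-07-27\t12459\t31242", "2", "2015-07-28\t99\t100")

val summary = lines
  .filter(s => s.length != 1)   // discard the one-character noise lines
  .map(s => s.split("\t"))      // the file is tab-delimited, not comma-delimited

val fieldCounts = summary.map(_.length)
```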

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
For some reason the path of the HDFS file is coming up in the data I am reading. rowStructText*.filter(s => s.length != 1)*.map(s => { println(s) s.split("\t").size }).countByValue foreach println However the output (println()) on the executors still has the characters of the HDFS file

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
I see the Spark job. The println statements have one character per line. 2 0 1 5 / 0 8 / 0 3 / r e g u l a r / p a r t - m On Wed, Aug 5, 2015 at 10:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote: > val summary = rowStructText.map(s => s.split(",")).map( > { > s => > *println(s)* > Summary(for

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
val summary = rowStructText.map(s => s.split(",")).map( { s => *println(s)* Summary(formatStringAsDate(s(0)), s(1).replaceAll("\"", "").toLong, s(3).replaceAll("\"", "").toLong, s(4).replaceAll("\"", "").toInt, s(5).replaceAll("\"", ""),
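The per-field cleanup the snippet performs can be sketched as below; the helper names are mine, not from the thread. Because the map is lazy, a malformed field (e.g. an HDFS path fragment leaking into the data) raises NumberFormatException only when the job finally runs, so wrapping the conversion in Try makes bad rows visible instead of fatal.

```scala
// Strip the surrounding quotes, then convert -- mirroring
// s(1).replaceAll("\"", "").toLong in the quoted snippet.
def unquote(s: String): String = s.replaceAll("\"", "")

def parseLongField(s: String): Long = unquote(s).toLong

// Hedged variant: Option instead of an executor-side exception.
def safeLong(s: String): Option[Long] =
  scala.util.Try(unquote(s).toLong).toOption
```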

Authentication and Authorization on Zeppelin

2015-08-05 Thread manya cancerian
Hi, has anyone implemented an authentication and authorization mechanism over Zeppelin? Right now absolutely no login or permissions are required to access the notebook interface. This is imperative if we use it with any valuable and important data. Regards Manya

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
summary: org.apache.spark.rdd.RDD[Summary] = MapPartitionsRDD[285] at map at :169 (1,517252) What does that mean ? On Wed, Aug 5, 2015 at 10:14 PM, Jeff Zhang wrote: > You data might have format issue (with less fields than you expect) > > Please try execute the following code to check whether

Re: Unable to run count(*)

2015-08-05 Thread Jeff Zhang
Your data might have a format issue (with fewer fields than you expect). Please try executing the following code to check whether all the lines have 14 fields: rowStructText.map(s => s.split(",").size).countByValue foreach println On Thu, Aug 6, 2015 at 1:01 PM, Randy Gelhausen wrote: > You lik
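Jeff's diagnostic builds a histogram of field counts per line. A local equivalent on plain collections (with made-up sample lines; `groupBy` stands in for RDD.countByValue):

```scala
// Illustrative lines, not the real data.
val sample = Seq("a,b,c", "a,b,c", "a,b")

// Histogram of "number of fields" -> "number of lines with that many fields".
val sizeHistogram: Map[Int, Int] =
  sample.map(_.split(",").length).groupBy(identity).map { case (k, v) => (k, v.size) }

// A clean 14-field file would yield a single entry: Map(14 -> totalLines).
```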

Re: Unable to run count(*)

2015-08-05 Thread Randy Gelhausen
You likely have a problem with your parsing logic. I can’t see the data to know for sure, but since Spark is lazily evaluated, it doesn’t try to run your map until you execute the SQL that applies it to the data. That’s why your first paragraph can run (it’s only defining metadata), but paragra
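Randy's point about lazy evaluation can be shown without Spark: like an RDD, a Scala Iterator defers its map until something consumes it, so the defining step "succeeds" and the failure surfaces only at consumption time (mirroring the %sql paragraph).

```scala
// "oops" cannot be parsed, but defining the map raises nothing yet.
val raw = Iterator("123", "oops", "456")
val parsed = raw.map(_.toInt)   // lazy: no exception here

// The NumberFormatException fires only when the result is materialized.
val failedLazily =
  try { parsed.toList; false }
  catch { case _: NumberFormatException => true }
```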

Re: Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
%sql select * from summary Throws same error On Wed, Aug 5, 2015 at 9:33 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote: > Para-1 > import java.text.SimpleDateFormat > import java.util.Calendar > import java.sql.Date > > def formatStringAsDate(dateStr: String) = new java.sql.Date(new > SimpleDateFormat("-MM-dd").

Unable to run count(*)

2015-08-05 Thread ๏̯͡๏
Para-1 import java.text.SimpleDateFormat import java.util.Calendar import java.sql.Date def formatStringAsDate(dateStr: String) = new java.sql.Date(new SimpleDateFormat("-MM-dd").parse(dateStr).getTime()) //(2015-07-27,12459,,31242,6,Daily,-999,2099-01-01,2099-01-02,1,0,0.1,0,1,-1,isGeo,,,204
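The archived snippet shows SimpleDateFormat("-MM-dd"); the intended pattern is presumably "yyyy-MM-dd", with the year part lost to the archiver. That is an assumption on my part; a sketch of the helper under that assumption:

```scala
import java.text.SimpleDateFormat

// Assumed pattern "yyyy-MM-dd" -- the archive shows only "-MM-dd".
def formatStringAsDate(dateStr: String): java.sql.Date =
  new java.sql.Date(new SimpleDateFormat("yyyy-MM-dd").parse(dateStr).getTime())

val d = formatStringAsDate("2015-07-27")
```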

Re: ZeppelinContext not found in spark executor classpath

2015-08-05 Thread moon soo Lee
Hi, Rémy Thanks for sharing the problem. Could you give a complete example that can reproduce the error? Thanks, moon On Fri, Jul 24, 2015 at 2:51 AM PHELIPOT, REMY wrote: > Hello, > > I'm trying to use Zeppelin with a machine learning algorithm: > val c = regionRDD.map{r => (r.id > ,clusters.

RE: Documented and Published- Install zeppelin on CDH

2015-08-05 Thread Vadla, Karthik
Hi Lee, Thank you very much for the comments. In my case my admin installed Spark on every node, I guess. That's the reason I didn't encounter any issues. Sure, it's a good point to mention. Thanks Karthik Vadla From: Jongyoul Lee [mailto:jongy...@gmail.com] Sent: Tuesday, August 4, 2015 11:46

RE: Exception while submitting spark job using Yarn

2015-08-05 Thread Vadla, Karthik
Hi Naveen, This will help you as a startup guide to setup zeppelin. http://blog.cloudera.com/blog/2015/07/how-to-install-apache-zeppelin-on-cdh/ Thanks Karthik Vadla From: Naveenkumar GP [mailto:naveenkumar...@infosys.com] Sent: Wednesday, August 5, 2015 5:06 AM To: users@zeppelin.incubator.apa

Re: Exception while submitting spark job using Yarn

2015-08-05 Thread Todd Nist
@Naveen, This email thread has the steps to follow: https://mail.google.com/mail/u/0/?ui=2&ik=89501a9ed8&view=lg&msg=14ef946743652b50 Along with these from the specific vendors depending on your Hadoop Installation: Cloudera: https://mail.google.com/mail/u/0/?ui=2&ik=89501a9ed8&view=lg&msg=14e

Re: Exception while submitting spark job using Yarn

2015-08-05 Thread manya cancerian
Hey guys, resolved the issue... there was an entry in the /etc/hosts file with localhost, due to which YARN was trying to connect to the Spark driver on localhost of the client machine. Once the entry was removed, it picked up the hostname and was able to connect. Thanks Jongyoul Lee, Todd Nist for the help... this f

RE: Exception while submitting spark job using Yarn

2015-08-05 Thread Naveenkumar GP
No. How do I do that one? From: Todd Nist [mailto:tsind...@gmail.com] Sent: Wednesday, August 05, 2015 5:34 PM To: users@zeppelin.incubator.apache.org Subject: Re: Exception while submitting spark job using Yarn Have you built Zeppelin with against the version of Hadoop & Spark you are using? It

Re: Exception while submitting spark job using Yarn

2015-08-05 Thread Todd Nist
Have you built Zeppelin against the version of Hadoop & Spark you are using? It has to be built with the appropriate versions, as this will pull in the required libraries from Hadoop and Spark. By default Zeppelin will not work on YARN without doing the build. @deepujain posted a fairly com

Re: Yarn + Spark + Zepplin ?

2015-08-05 Thread manya cancerian
JL, I tried after disabling the firewalls as well , but no luck :( Regards Manya On Wed, Aug 5, 2015 at 12:07 PM, Jongyoul Lee wrote: > Hi, > > You should check your firewalls because Spark executor try to connect > Spark driver which runs in your client machine in yarn-client mode. > > Regard