AND
%sql
select f13, count(1) value from summary group by f13
throws
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 132.0 failed 4 times, most recent failure: Lost task 0.3 in stage
132.0 (TID 3364, datanode-9-7497.phx01.dev.ebayc3.com):
java.lang.NumberFormatE
select f1, f11 from summary works,
but when I do
select f1, f11 from summary group by f1
it throws an error:
org.apache.spark.sql.AnalysisException: expression 'f1' is neither present
in the group by, nor is it an aggregate function. Add to group by or wrap
in first() if you don't care which value you get.
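For reference, that error message points at the two standard fixes, shown here on the thread's own query (either form resolves the AnalysisException):

```sql
%sql
-- either make every selected column part of the grouping...
select f1, f11 from summary group by f1, f11

-- ...or wrap the non-grouped column in first() if any value will do
select f1, first(f11) from summary group by f1
```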
Figured it out:
val summary = rowStructText.filter(s => s.length != 1).map(s => s.split("\t"))
AND
select * from summary shows the table
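A minimal plain-Scala sketch of the same cleanup (no Spark needed; the sample lines are invented for illustration): single-character lines are dropped, the rest are split on tabs.

```scala
// Hypothetical sample: stray one-character lines mixed with real tab-separated rows
val lines = Seq("2", "0", "1", "5", "a\tb\tc", "d\te\tf\tg")

// Same shape as the fix above: filter out length-1 lines, then split on "\t"
val rows = lines.filter(_.length != 1).map(_.split("\t"))

rows.map(_.length)  // field counts of the surviving rows: List(3, 4)
```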
On Wed, Aug 5, 2015 at 10:37 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
For some reason the path of the HDFS file is coming up in the data I am reading.
rowStructText.filter(s => s.length != 1).map(s => {
  println(s)
  s.split("\t").size
}).countByValue foreach println
However the output (println()) on the executors still has the characters of the
HDFS file.
I see the Spark job.
The println statements print one character per line:
2
0
1
5
/
0
8
/
0
3
/
r
e
g
u
l
a
r
/
p
a
r
t
-
m
On Wed, Aug 5, 2015 at 10:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
val summary = rowStructText.map(s => s.split(",")).map(
  {
    s =>
      println(s)
      Summary(formatStringAsDate(s(0)),
        s(1).replaceAll("\"", "").toLong,
        s(3).replaceAll("\"", "").toLong,
        s(4).replaceAll("\"", "").toInt,
        s(5).replaceAll("\"", ""),
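The NumberFormatException upthread is what toLong throws when a field is still non-numeric after the quote-stripping. A hedged sketch (plain Scala, helper name invented) of parsing the same way while surviving bad rows:

```scala
// Strip surrounding quotes and parse defensively: a malformed field becomes
// None instead of aborting the whole stage with NumberFormatException.
def cleanLong(field: String): Option[Long] =
  scala.util.Try(field.replaceAll("\"", "").trim.toLong).toOption

cleanLong("\"12459\"")  // Some(12459)
cleanLong("isGeo")      // None
```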
Hi,
Has anyone implemented any authentication and authorization mechanism over
Zeppelin? Right now there is absolutely no login and no permissions required
to access the notebook interface.
It's imperative in case we use it with any valuable and important data.
Regards
Manya
summary: org.apache.spark.rdd.RDD[Summary] = MapPartitionsRDD[285] at map
at <console>:169
(1,517252)
What does that mean?
On Wed, Aug 5, 2015 at 10:14 PM, Jeff Zhang wrote:
Your data might have a format issue (fewer fields than you expect).
Please try executing the following code to check whether all the lines have
14 fields:
rowStructText.map(s => s.split(",").size).countByValue foreach
println
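For anyone without a cluster handy, the same check in plain Scala (sample data invented; countByValue is just a group-and-count over the split sizes):

```scala
// Count how many lines produce each field count when split on commas —
// a single field count in the result means the data is uniformly shaped.
val sample = Seq("a,b,c", "a,b,c", "x,y")
val counts = sample
  .map(_.split(",").length)
  .groupBy(identity)
  .map { case (size, hits) => size -> hits.size }

counts foreach println  // e.g. (3,2) and (2,1)
```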
On Thu, Aug 6, 2015 at 1:01 PM, Randy Gelhausen
wrote:
You likely have a problem with your parsing logic. I can’t see the data to know
for sure, but since Spark is lazily evaluated, it doesn’t try to run your map
until you execute the SQL that applies it to the data.
That’s why your first paragraph can run (it’s only defining metadata), but the
later paragraph fails as soon as it actually touches the data.
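The laziness described here can be seen without Spark; a plain Scala Iterator behaves the same way (an analogy only, not Spark's actual machinery):

```scala
// Nothing inside the map body runs when the pipeline is merely defined...
var evaluated = false
val pipeline = Iterator(1, 2, 3).map { n => evaluated = true; n * 2 }
assert(!evaluated)  // defining the transformation executes nothing

// ...it only runs when something forces the elements, like a Spark action does.
val result = pipeline.toList
assert(evaluated)
assert(result == List(2, 4, 6))
```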
%sql
select * from summary
Throws same error
On Wed, Aug 5, 2015 at 9:33 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
Para-1
import java.text.SimpleDateFormat
import java.util.Calendar
import java.sql.Date
def formatStringAsDate(dateStr: String) = new java.sql.Date(new
SimpleDateFormat("yyyy-MM-dd").parse(dateStr).getTime())
//(2015-07-27,12459,,31242,6,Daily,-999,2099-01-01,2099-01-02,1,0,0.1,0,1,-1,isGeo,,,204
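As archived, the pattern string above is missing its year component; with sample dates like 2015-07-27 it presumably should be "yyyy-MM-dd". A self-contained check of the helper with the year part restored:

```scala
import java.text.SimpleDateFormat

// With "yyyy-MM-dd", the helper round-trips the thread's sample date.
def formatStringAsDate(dateStr: String) =
  new java.sql.Date(new SimpleDateFormat("yyyy-MM-dd").parse(dateStr).getTime())

formatStringAsDate("2015-07-27").toString  // "2015-07-27"
```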
Hi, Rémy
Thanks for sharing the problem. Could you give a complete example that can
reproduce the error?
Thanks,
moon
On Fri, Jul 24, 2015 at 2:51 AM PHELIPOT, REMY wrote:
> Hello,
>
> I’m trying to use Zeppelin with a machine learning algorithms:
> val c = regionRDD.map{r => (r.id
> ,clusters.
Hi Lee,
Thank you very much for the comments.
In my case my admin installed Spark on every node, I guess. That’s the reason I
didn’t encounter any issues.
Sure, it’s a good point to mention.
Thanks
Karthik Vadla
From: Jongyoul Lee [mailto:jongy...@gmail.com]
Sent: Tuesday, August 4, 2015 11:46
Hi Naveen,
This will help you as a startup guide to set up Zeppelin:
http://blog.cloudera.com/blog/2015/07/how-to-install-apache-zeppelin-on-cdh/
Thanks
Karthik Vadla
From: Naveenkumar GP [mailto:naveenkumar...@infosys.com]
Sent: Wednesday, August 5, 2015 5:06 AM
To: users@zeppelin.incubator.apa
@Naveen,
This email thread has the steps to follow:
https://mail.google.com/mail/u/0/?ui=2&ik=89501a9ed8&view=lg&msg=14ef946743652b50
Along with these from the specific vendors depending on your Hadoop
Installation:
Cloudera:
https://mail.google.com/mail/u/0/?ui=2&ik=89501a9ed8&view=lg&msg=14e
Hey guys, resolved the issue... there was an entry in the /etc/hosts file with
localhost, due to which YARN was trying to connect to the Spark driver on the
localhost of the client machine.
Once the entry was removed, it picked up the hostname and was able to connect.
Thanks Jongyoul Lee, Todd Nist for the help... this f
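For anyone hitting the same thing, the problematic pattern looks roughly like this (hostnames and addresses invented for illustration):

```
# /etc/hosts — problematic: the client's hostname resolves to loopback,
# so the driver advertises localhost to the YARN executors
127.0.0.1   localhost client-machine.example.com

# fixed: the hostname resolves to a routable address
127.0.0.1   localhost
10.0.0.15   client-machine.example.com
```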
No. How to do that one..
From: Todd Nist [mailto:tsind...@gmail.com]
Sent: Wednesday, August 05, 2015 5:34 PM
To: users@zeppelin.incubator.apache.org
Subject: Re: Exception while submitting spark job using Yarn
Have you built Zeppelin against the version of Hadoop & Spark you are
using? It has to be built with the appropriate versions, as this will pull
in the required libraries from Hadoop and Spark. By default Zeppelin will
not work on YARN without doing the build.
@deepujain posted a fairly com
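For reference, the Zeppelin build against a specific Hadoop/Spark looked roughly like this at the time; profile names and version numbers vary by release, so treat these flags as an illustration rather than the exact command:

```
mvn clean package -Pspark-1.4 -Phadoop-2.6 -Dhadoop.version=2.6.0 -Pyarn -DskipTests
```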
JL,
I tried after disabling the firewalls as well, but no luck :(
Regards
Manya
On Wed, Aug 5, 2015 at 12:07 PM, Jongyoul Lee wrote:
> Hi,
>
> You should check your firewalls, because the Spark executor tries to connect
> to the Spark driver, which runs on your client machine in yarn-client mode.
>
> Regard