Re: Spark SQL

Burak Yavuz Sun, 14 Sep 2014 00:37:51 -0700

Hi,

I'm no expert on Spark SQL, but from what I understand, the problem is that 
you're trying to access an RDD inside an RDD in statements 1 and 2, where 
sqlContext.sql("select rate ... is called inside file.map(line => ...).
RDDs can't be serialized and used inside other RDDs' tasks. The function passed 
to map runs on the executors, where sqlContext can't be used, so you're 
receiving the NullPointerException.


More specifically, you are trying to generate a SchemaRDD inside an RDD, which 
you can't do.

If the file isn't huge, you can call .collect() to bring the RDD back to the 
driver as an Array and then use .map() on that Array.
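
A minimal sketch of that driver-side route, written without any Spark imports 
so that it stands alone. The object name, rateQuery, lookupRates and the 
runQuery parameter are made up for illustration; the substring offsets, table 
and column names are copied from statement 1.

```scala
import scala.reflect.ClassTag

object DriverSideLookup {
  // Builds the per-line query used in statement 1. Like the original code,
  // fxCurCodesMap(...) throws if the code at offsets 77-82 is not in the map.
  def rateQuery(line: String, fxCurCodesMap: Map[String, String]): String =
    "select rate from CurrencyCodeRates" +
      " where txCurCode = '" + line.substring(202, 205) + "'" +
      " and fxCurCode = '" + fxCurCodesMap(line.substring(77, 82)) + "'" +
      " and effectiveDate >= '" + line.substring(221, 229) + "'" +
      " order by effectiveDate desc"

  // `lines` is a local Array on the driver (for example file.collect()), so
  // Array.map runs in the driver JVM, where sqlContext is usable.
  // `runQuery` is a hypothetical stand-in for
  //   q => extractCurRate(sqlContext.sql(q))
  def lookupRates[T: ClassTag](lines: Array[String],
                               fxCurCodesMap: Map[String, String])
                              (runQuery: String => T): Array[T] =
    lines.map(line => runQuery(rateQuery(line, fxCurCodesMap)))
}
```

With the names from the original post this would be called roughly as 
lookupRates(file.collect(), fxCurCodesMap)(q => extractCurRate(sqlContext.sql(q))). 
It issues one query per line, so it only makes sense for small files.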

If the file is huge, then you can run statement 3 first, join the two RDDs 
using 'txCurCode' as the key, and then do the filtering operations, etc.
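
A rough sketch of the join route, under some assumptions: the Rate case class, 
the JoinRates object and ratesPerLine are hypothetical names, and the rates RDD 
is assumed to have already been built from the SchemaRDD that statement 3 
returns (one Rate per row; the actual column types aren't shown in the original 
post). The substring offsets are the ones from statement 1.

```scala
// Needed on older Spark 1.x releases for pair-RDD operations such as join.
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

object JoinRates {
  // Hypothetical row type for the CurrencyCodeRates table.
  case class Rate(txCurCode: String, fxCurCode: String,
                  effectiveDate: String, rate: Double)

  // Pairs every input line with the rates that match it. No RDD or
  // sqlContext is touched inside a closure; only the plain Map is captured.
  def ratesPerLine(file: RDD[String],
                   rates: RDD[Rate],
                   fxCurCodesMap: Map[String, String]): RDD[(String, Rate)] = {
    // Key both sides by txCurCode.
    val linesByTx = file.map(line => (line.substring(202, 205), line))
    val ratesByTx = rates.map(r => (r.txCurCode, r))

    linesByTx.join(ratesByTx)
      .filter { case (_, (line, r)) =>
        // Same conditions as the where clause in statement 1. Dates are
        // yyyyMMdd strings, so string comparison orders them correctly.
        // Unlike the original, a code missing from the map drops the line
        // instead of throwing.
        fxCurCodesMap.get(line.substring(77, 82)).exists(_ == r.fxCurCode) &&
          r.effectiveDate >= line.substring(221, 229)
      }
      .values
  }
}
```

Each line can still match several rates here; picking one per line (the 
"order by effectiveDate desc" part) would be a further step on the result.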

Best,
Burak

----- Original Message -----
From: "rkishore999" <rkishore...@yahoo.com>
To: u...@spark.incubator.apache.org
Sent: Saturday, September 13, 2014 10:29:26 PM
Subject: Spark SQL

val file =
sc.textFile("hdfs://ec2-54-164-243-97.compute-1.amazonaws.com:9010/user/fin/events.txt")

1. val xyz = file.map(line => extractCurRate(sqlContext.sql("select rate
from CurrencyCodeRates where txCurCode = '" + line.substring(202,205) + "'
and fxCurCode = '" + fxCurCodesMap(line.substring(77,82)) + "' and
effectiveDate >= '" + line.substring(221,229) + "' order by effectiveDate
desc")))

2. val xyz = file.map(line => sqlContext.sql("select rate, txCurCode,
fxCurCode, effectiveDate from CurrencyCodeRates where txCurCode = 'USD' and
fxCurCode = 'CSD' and effectiveDate >= '20140901' order by effectiveDate
desc"))

3. val xyz = sqlContext.sql("select rate, txCurCode, fxCurCode,
effectiveDate from CurrencyCodeRates where txCurCode = 'USD' and fxCurCode =
'CSD' and effectiveDate >= '20140901' order by effectiveDate desc")

xyz.saveAsTextFile("/user/output")

In statements 1 and 2 I'm getting a NullPointerException, but statement 3
works. I'm guessing the Spark context and SQL context are not working well
together.

Any suggestions regarding how I can achieve this?

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-tp14183.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

