In your example, the *rs* instance should be a DataFrame object. In other words, the result of *HiveContext.sql* is a DataFrame that you can manipulate using *filter*, *map*, etc.
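
As a rough sketch of what that could look like, the snippet below redoes your two queries on "tmp" directly on the DataFrame. It assumes a spark-shell session on Spark 1.x in which *rs* has already been created by your HiveContext.sql call; the value names (byMonthAndChannel, maxByChannel, largeTotals, labels) and the 1000 threshold are made up for illustration and are not from your code.

```scala
// Assumption: run in spark-shell after `rs` has been defined as in the
// quoted message (rs = HiveContext.sql("""SELECT ... GROUP BY ..."""")).
import org.apache.spark.sql.functions.{col, max}

// Equivalent of the first query on "tmp": rename the columns, then sort.
val byMonthAndChannel = rs
  .select(
    col("calendar_month_desc").as("MONTH"),
    col("channel_desc").as("CHANNEL"),
    col("TotalSales"))
  .orderBy("MONTH", "CHANNEL")
byMonthAndChannel.collect.foreach(println)

// Equivalent of the second query: largest TotalSales per channel, descending.
val maxByChannel = rs
  .groupBy("channel_desc")
  .agg(max("TotalSales").as("SALES"))
  .orderBy(col("SALES").desc)
  .select(col("channel_desc").as("CHANNEL"), col("SALES"))
maxByChannel.collect.foreach(println)

// filter accepts a Column expression; the threshold here is hypothetical.
val largeTotals = rs.filter(col("TotalSales") > 1000)
largeTotals.show()

// map applies a function to each Row; on a Spark 1.x DataFrame it returns an RDD.
val labels = rs.map(row => s"${row.get(0)} / ${row.get(1)}")
labels.take(5).foreach(println)
```

With this approach the registerTempTable step is only needed if you still want to query the result with SQL; the DataFrame calls work on *rs* directly.
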
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.hive.HiveContext

On Mon, Feb 22, 2016 at 5:16 PM, Mich Talebzadeh <mich.talebza...@cloudtechnologypartners.co.uk> wrote:

> Hi,
>
> I have data stored in Hive tables that I want to do simple manipulation.
>
> Currently in Spark I perform the following with getting the result set
> using SQL from Hive tables, registering as a temporary table in Spark
>
> Now Ideally I can get the result set into a DF and work on DF to slice and
> dice the data using functional programming with filter, map. split etc.
>
> I wanted to get some ideas on how to go about it.
>
> thanks
>
> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>
> HiveContext.sql("use oraclehadoop")
> val rs = HiveContext.sql("""SELECT t.calendar_month_desc, c.channel_desc,
> SUM(s.amount_sold) AS TotalSales
> FROM smallsales s, times t, channels c
> WHERE s.time_id = t.time_id
> AND s.channel_id = c.channel_id
> GROUP BY t.calendar_month_desc, c.channel_desc
> """)
> *rs.registerTempTable("tmp")*
>
>
> HiveContext.sql("""
> SELECT calendar_month_desc AS MONTH, channel_desc AS CHANNEL, TotalSales
> from tmp
> ORDER BY MONTH, CHANNEL
> """).collect.foreach(println)
> HiveContext.sql("""
> SELECT channel_desc AS CHANNEL, MAX(TotalSales) AS SALES
> FROM tmp
> GROUP BY channel_desc
> order by SALES DESC
> """).collect.foreach(println)
>
>
> --
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> http://talebzadehmich.wordpress.com