> CSV data is stored in an underlying table in Hive (actually created and
> populated as an ORC table by Spark)

How is that possible?
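(For context, one common recipe for this, as a sketch only: Spark reads the CSV and writes it out with `saveAsTable` using the ORC format, which registers an ordinary table in the Hive metastore. In Spark 1.x the CSV reader comes from the `spark-csv` package. The file path, database and table names below are illustrative, not necessarily what was actually used.)

```scala
// Sketch only: assumes a Spark 1.x shell with Hive support (sc in scope)
// and the com.databricks:spark-csv package on the classpath.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)

// Read the CSV file (hypothetical path), inferring the schema
val df = hiveContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/data/nw_10124772.csv")

// Persist it as an ORC-backed Hive table; Hive then sees a normal table
hiveContext.sql("use accounts")
df.write.format("orc").saveAsTable("nw_10124772")
```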

On Mon, Mar 28, 2016 at 1:50 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> A while back I was looking for a functional-programming way to filter out
> transactions older than n months.
>
> This turned out to be pretty easy.
>
> I get today's date as follows:
>
> val today = sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(),
> 'yyyy-MM-dd')").collect.apply(0).getString(0)
>
>
> CSV data is stored in an underlying table in Hive (actually created and
> populated as an ORC table by Spark)
>
> HiveContext.sql("use accounts")
> var n = HiveContext.table("nw_10124772")
>
> scala> n.printSchema
> root
>  |-- transactiondate: date (nullable = true)
>  |-- transactiontype: string (nullable = true)
>  |-- description: string (nullable = true)
>  |-- value: double (nullable = true)
>  |-- balance: double (nullable = true)
>  |-- accountname: string (nullable = true)
>  |-- accountnumber: integer (nullable = true)
>
> //
> // Check for historical transactions > 60 months old
> //
> val old = 60
>
> import org.apache.spark.sql.functions.{add_months, col, lit}
>
> n.filter(add_months(col("transactiondate"), old) < lit(today))
>  .select(lit(today), col("transactiondate"), add_months(col("transactiondate"), old))
>  .collect
>  .foreach(println)
>
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-22,2016-03-22]
> [2016-03-27,2011-03-23,2016-03-23]
> [2016-03-27,2011-03-23,2016-03-23]
>
>
> Which seems to work. Any other suggestions would be appreciated.
>
> Thanks
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
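The `add_months` comparison in the quoted code can also be checked outside Spark. A minimal `java.time` sketch of the same rule (the helper name and the sample dates are mine, chosen to mirror the output rows above):

```scala
import java.time.LocalDate

// A transaction is "historical" when transactionDate + n months < today,
// mirroring the Spark filter add_months(col("transactiondate"), old) < lit(today).
def isOlderThanMonths(transactionDate: LocalDate, today: LocalDate, n: Int): Boolean =
  transactionDate.plusMonths(n).isBefore(today)

val today = LocalDate.parse("2016-03-27")

println(isOlderThanMonths(LocalDate.parse("2011-03-22"), today, 60)) // true:  2016-03-22 is before 2016-03-27
println(isOlderThanMonths(LocalDate.parse("2011-03-23"), today, 60)) // true:  2016-03-23 is before 2016-03-27
println(isOlderThanMonths(LocalDate.parse("2015-12-01"), today, 60)) // false: 2020-12-01 is after today
```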
