Re: Window function in Spark SQL

2015-12-11 Thread Michael Armbrust
Can you change the permissions on that directory so that Hive can write to it?
We start up a mini version of Hive so that we can use some of its
functionality.
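
If the directory was created by a different user, something along these lines
may unblock you (a minimal sketch via the Hadoop FileSystem API, assuming the
Hadoop client jars are on the classpath and that /tmp/hive is the scratch dir
named in the error; running chmod 733 /tmp/hive from a shell on the node that
owns the directory achieves the same thing):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.fs.permission.FsPermission

    // Opens whichever filesystem fs.defaultFS points at -- the local FS when
    // no Hadoop cluster is running, as in your setup.
    val fs = FileSystem.get(new Configuration())

    // Octal 733 = rwx-wx-wx: the owner keeps full access and everyone else
    // can write into the scratch dir, which is what the error complains about.
    fs.setPermission(new Path("/tmp/hive"),
      new FsPermission(Integer.parseInt("733", 8).toShort))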

On Fri, Dec 11, 2015 at 12:47 PM, Sourav Mazumder <
sourav.mazumde...@gmail.com> wrote:

> In 1.5.x, whenever I try to create a HiveContext from a SparkContext I get
> the following error. Please note that I'm not running any Hadoop/Hive server
> in my cluster; I'm only running Spark.
>
> I never faced a HiveContext creation problem like this previously in 1.4.x.
>
> Is it now a requirement in 1.5.x that a Hive server be running in order to
> create a HiveContext?
>
> Regards,
> Sourav
>
>
> -- Forwarded message --
> From: <ross.cramb...@thomsonreuters.com>
> Date: Fri, Dec 11, 2015 at 11:39 AM
> Subject: Re: Window function in Spark SQL
> To: sourav.mazumde...@gmail.com
>
>
> I’m not familiar with that issue, and I wasn’t able to reproduce it in my
> environment - you might want to copy that to the Spark user list. Sorry!
>
> On Dec 11, 2015, at 1:37 PM, Sourav Mazumder <sourav.mazumde...@gmail.com>
> wrote:
>
> Hi Ross,
>
> Thanks for your answer.
>
> In 1.5.x, whenever I try to create a HiveContext from a SparkContext I get
> the following error. Please note that I'm not running any Hadoop/Hive server
> in my cluster; I'm only running Spark.
>
> I never faced a HiveContext creation problem like this previously.
>
> Regards,
> Sourav
>
> java.lang.RuntimeException: java.lang.RuntimeException: The root scratch
> dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>     at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>     at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
>     at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
>     at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
>     at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>     at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
>     at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:174)
>     at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:177)
>     at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
>     at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:20)
>     at $iwC$$iwC$$iwC$$iwC.<init>(<console>:22)
>     at $iwC$$iwC$$iwC.<init>(<console>:24)
>     at $iwC$$iwC.<init>(<console>:26)
>     at $iwC.<init>(<console>:28)
>     at <init>(<console>:30)
>     at .<init>(<console>:34)
>     at .<clinit>(<console>)
>     at .<init>(<console>:7)
>     at .<clinit>(<console>)
>     at $print(<console>)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> On Fri, Dec 11, 2015 at 10:10 AM, <ross.cramb...@thomsonreuters.com>
> wrote:
>
>> Hey Sourav,
>> Window functions require using a HiveContext rather than the default
>> SQLContext. See here:
>> http://spark.apache.org/docs/latest/sql-programming-guide.html#starting-point-sqlcontext
>>
>> HiveContext provides all the same functionality as SQLContext, as well as
>> extra features like window functions.
>>
>> - Ross
>>
>> On Dec 11, 2015, at 12:59 PM, Sourav Mazumder <
>> sourav.mazumde...@gmail.com> wrote:
>>
>> Hi,
>>
>> The Spark SQL documentation says that it is compatible with Hive 1.2.1 APIs
>> and supports window functions. I'm using Spark 1.5.0.
>>
>> However, when I try to execute something like the query below, I get an error:
>>
>> val lol5 = sqlContext.sql("select ky, lead(ky, 5, 0) over (order by ky
>> rows 5 following) from lolt")
>>
>> java.lang.RuntimeException: [1.32] failure: ``union'' expected but `('
>> found select ky, lead(ky, 5, 0) over (order by ky rows 5 following) from lolt

Re: Window function in Spark SQL

2015-12-11 Thread Ross.Cramblit
Hey Sourav,
Window functions require using a HiveContext rather than the default 
SQLContext. See here: 
http://spark.apache.org/docs/latest/sql-programming-guide.html#starting-point-sqlcontext

HiveContext provides all the same functionality as SQLContext, as well as extra
features like window functions.
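
For reference, a minimal sketch of the switch (assuming a spark-shell session
where sc is already defined; the query is yours, unchanged, and I have only
checked the context swap, not the frame clause itself):

    import org.apache.spark.sql.hive.HiveContext

    // HiveContext is a superset of SQLContext, so existing SQLContext code
    // keeps working; it adds the HiveQL parser, window functions, Hive UDFs.
    val hiveContext = new HiveContext(sc)

    // The statement that fails on a plain SQLContext parses here, because
    // the HiveQL parser understands OVER clauses.
    val lol5 = hiveContext.sql(
      "select ky, lead(ky, 5, 0) over (order by ky rows 5 following) from lolt")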

- Ross

On Dec 11, 2015, at 12:59 PM, Sourav Mazumder wrote:

Hi,

The Spark SQL documentation says that it is compatible with Hive 1.2.1 APIs and
supports window functions. I'm using Spark 1.5.0.

However, when I try to execute something like the query below, I get an error:

val lol5 = sqlContext.sql("select ky, lead(ky, 5, 0) over (order by ky rows 5 
following) from lolt")

java.lang.RuntimeException: [1.32] failure: ``union'' expected but `(' found

select ky, lead(ky, 5, 0) over (order by ky rows 5 following) from lolt
                               ^
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
    at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:166)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:166)
    at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:42)
    at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:189)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:719)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:63)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:68)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:70)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:72)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:74)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:76)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:78)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:80)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:82)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:84)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:86)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:88)
    at $iwC$$iwC$$iwC.<init>(<console>:90)
    at $iwC$$iwC.<init>(<console>:92)
    at $iwC.<init>(<console>:94)
    at <init>(<console>:96)
    at .<init>(<console>:100)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)

Regards,
Sourav



Re: Window function by Spark SQL

2014-12-04 Thread Cheng Lian
Window functions are not supported yet, but there is a PR for it: 
https://github.com/apache/spark/pull/2953
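
Until that is merged, the usual workaround for "first row per group" is
reduceByKey on a plain RDD. A minimal sketch (the Rec type and the grp/ord
column names are hypothetical, and sc is assumed to be a live SparkContext):

    import org.apache.spark.SparkContext._  // brings in pair-RDD operations

    // Hypothetical record: grp is the grouping column, ord the ordering one.
    case class Rec(grp: String, ord: Long, payload: String)

    val records = sc.parallelize(Seq(
      Rec("a", 2L, "x"), Rec("a", 1L, "y"), Rec("b", 3L, "z")))

    // Per group, keep the record with the smallest ord -- the same row a
    // window query ordered by ord would rank first.
    val firstPerGroup = records
      .map(r => (r.grp, r))
      .reduceByKey((a, b) => if (a.ord <= b.ord) a else b)
      .values

    firstPerGroup.collect().foreach(println)  // Rec(a,1,y), Rec(b,3,z)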


On 12/5/14 12:22 PM, Dai, Kevin wrote:


Hi all,

How can I group by one column, order by another, and then select the first
row of each group (which is just what a window function does) in Spark SQL?


Best Regards,

Kevin.