Re: sparkSession.sql("sql query") vs df.sqlContext().sql(this.query) ?

2017-12-07 Thread khathiravan raj maadhaven
Hi Kant,

Based on my understanding, the only difference is the overhead of
selecting (or creating) the SQLContext for the query you pass. Since the
table/view is already registered, sparkSession.sql("your query")
should be simple and good enough.

The following uses the session/context that is created and available by default:

sparkSession.sql("select value from table")

while the following would look up (or create) one and then run the query,
which I believe is extra overhead:

df.sqlContext().sql("select value from table")
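Something like the following batch sketch illustrates the point (assuming Spark 2.2 on the classpath; the local session, the range() stand-in for the Kafka stream, and the view/column names reused from the original mail are just placeholders, and I may be wrong about the internals):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SqlVsSqlContext {
    public static void main(String[] args) {
        // Local session just for illustration.
        SparkSession sparkSession = SparkSession.builder()
                .master("local[*]")
                .appName("sql-vs-sqlContext")
                .getOrCreate();

        // A batch stand-in for the Kafka stream in the original mail.
        Dataset<Row> df = sparkSession.range(3).toDF("value");
        df.createOrReplaceTempView("table");

        // Both calls resolve the temp view registered on the session,
        // so they should produce the same result.
        Dataset<Row> viaSession = sparkSession.sql("select value from table");
        Dataset<Row> viaContext = df.sqlContext().sql("select value from table");

        viaSession.show();
        viaContext.show();

        sparkSession.stop();
    }
}
```

In other words, whatever SQLContext df.sqlContext() hands back is still tied to the same catalog the temp view was registered in, so the results match; the question is only which entry point is cheaper.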

Regards
Raj



On Wed, Dec 6, 2017 at 6:07 PM, kant kodali  wrote:

> Hi All,
>
> I have the following code snippets and I wonder what the difference is
> between the two, and which one I should use. I am using Spark 2.2.
>
> Dataset<Row> df = sparkSession.readStream()
>         .format("kafka")
>         .load();
>
> df.createOrReplaceTempView("table");
> df.printSchema();
>
> Dataset<Row> resultSet = df.sqlContext().sql(
>         "select value from table"); // sparkSession.sql(this.query);
> StreamingQuery streamingQuery = resultSet
>         .writeStream()
>         .trigger(Trigger.ProcessingTime(1000))
>         .format("console")
>         .start();
>
>
> vs
>
>
> Dataset<Row> df = sparkSession.readStream()
>         .format("kafka")
>         .load();
>
> df.createOrReplaceTempView("table");
>
> Dataset<Row> resultSet = sparkSession.sql(
>         "select value from table"); // sparkSession.sql(this.query);
> StreamingQuery streamingQuery = resultSet
>         .writeStream()
>         .trigger(Trigger.ProcessingTime(1000))
>         .format("console")
>         .start();
>
>
> Thanks!
>
>


sparkSession.sql("sql query") vs df.sqlContext().sql(this.query) ?

2017-12-06 Thread kant kodali
Hi All,

I have the following code snippets and I wonder what the difference is
between the two, and which one I should use. I am using Spark 2.2.

Dataset<Row> df = sparkSession.readStream()
        .format("kafka")
        .load();

df.createOrReplaceTempView("table");
df.printSchema();

Dataset<Row> resultSet = df.sqlContext().sql(
        "select value from table"); // sparkSession.sql(this.query);
StreamingQuery streamingQuery = resultSet
        .writeStream()
        .trigger(Trigger.ProcessingTime(1000))
        .format("console")
        .start();


vs


Dataset<Row> df = sparkSession.readStream()
        .format("kafka")
        .load();

df.createOrReplaceTempView("table");

Dataset<Row> resultSet = sparkSession.sql(
        "select value from table"); // sparkSession.sql(this.query);
StreamingQuery streamingQuery = resultSet
        .writeStream()
        .trigger(Trigger.ProcessingTime(1000))
        .format("console")
        .start();


Thanks!