Re: Tungsten and Spark Streaming

2015-09-10 Thread Paolo Platter
Do you plan to modify the DStream interface so that it works with DataFrames? It 
would be nice to handle DStreams without generics.

Paolo

Sent from my Windows Phone

From: Tathagata Das<mailto:t...@databricks.com>
Sent: 10/09/2015 07:42
To: N B<mailto:nb.nos...@gmail.com>
Cc: user<mailto:user@spark.apache.org>
Subject: Re: Tungsten and Spark Streaming

Rewriting is necessary. You will have to convert RDD/DStream operations to 
DataFrame operations. So get the RDDs in DStream, using transform/foreachRDD, 
convert to DataFrames and then do DataFrame operations.

On Wed, Sep 9, 2015 at 9:23 PM, N B 
<nb.nos...@gmail.com<mailto:nb.nos...@gmail.com>> wrote:
Hello,

How can we start taking advantage of the performance gains made under Project 
Tungsten in Spark 1.5 for a Spark Streaming program?

From what I understand, this is available by default for Dataframes. But for a 
program written using Spark Streaming, would we see any potential gains "out 
of the box" in 1.5 or will we have to rewrite some portions of the application 
code to realize that benefit?

Any insight/documentation links etc in this regard will be appreciated.

Thanks
Nikunj




Re: Tungsten and Spark Streaming

2015-09-10 Thread Todd Nist
https://issues.apache.org/jira/browse/SPARK-8360?jql=project%20%3D%20SPARK%20AND%20text%20~%20Streaming

-Todd

On Thu, Sep 10, 2015 at 10:22 AM, Gurvinder Singh <
gurvinder.si...@uninett.no> wrote:

> On 09/10/2015 07:42 AM, Tathagata Das wrote:
> > Rewriting is necessary. You will have to convert RDD/DStream operations
> > to DataFrame operations. So get the RDDs in DStream, using
> > transform/foreachRDD, convert to DataFrames and then do DataFrame
> > operations.
>
> Are there any plans for 1.6 or later to add Tungsten support to
> RDD/DStream directly, or is it intended that users should switch to
> DataFrames rather than operating at the RDD/DStream level?
>
> >
> > On Wed, Sep 9, 2015 at 9:23 PM, N B wrote:
> >
> > Hello,
> >
> > How can we start taking advantage of the performance gains made
> > under Project Tungsten in Spark 1.5 for a Spark Streaming program?
> >
> > From what I understand, this is available by default for Dataframes.
> > But for a program written using Spark Streaming, would we see any
> > potential gains "out of the box" in 1.5 or will we have to rewrite
> > some portions of the application code to realize that benefit?
> >
> > Any insight/documentation links etc in this regard will be
> > appreciated.
> >
> > Thanks
> > Nikunj
> >
> >
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: Tungsten and Spark Streaming

2015-09-10 Thread Gurvinder Singh
On 09/10/2015 07:42 AM, Tathagata Das wrote:
> Rewriting is necessary. You will have to convert RDD/DStream operations
> to DataFrame operations. So get the RDDs in DStream, using
> transform/foreachRDD, convert to DataFrames and then do DataFrame
> operations.

Are there any plans for 1.6 or later to add Tungsten support to
RDD/DStream directly, or is it intended that users should switch to
DataFrames rather than operating at the RDD/DStream level?

> 
> On Wed, Sep 9, 2015 at 9:23 PM, N B wrote:
> 
> Hello,
> 
> How can we start taking advantage of the performance gains made
> under Project Tungsten in Spark 1.5 for a Spark Streaming program? 
> 
> From what I understand, this is available by default for Dataframes.
> But for a program written using Spark Streaming, would we see any
> potential gains "out of the box" in 1.5 or will we have to rewrite
> some portions of the application code to realize that benefit?
> 
> Any insight/documentation links etc in this regard will be appreciated.
> 
> Thanks
> Nikunj
> 
> 





Tungsten and Spark Streaming

2015-09-09 Thread N B
Hello,

How can we start taking advantage of the performance gains made under
Project Tungsten in Spark 1.5 for a Spark Streaming program?

From what I understand, this is available by default for Dataframes. But
for a program written using Spark Streaming, would we see any potential
gains "out of the box" in 1.5 or will we have to rewrite some portions of
the application code to realize that benefit?

Any insight/documentation links etc in this regard will be appreciated.

Thanks
Nikunj


Re: Tungsten and Spark Streaming

2015-09-09 Thread Tathagata Das
Rewriting is necessary. You will have to convert RDD/DStream operations to
DataFrame operations. So get the RDDs in DStream, using
transform/foreachRDD, convert to DataFrames and then do DataFrame
operations.
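
[Editor's sketch, not part of the original message.] The steps above might
look roughly like the following in Scala against the Spark 1.5 APIs. The
Event case class, the socket source on localhost:9999, the "user,bytes" line
format, and the aggregation are all hypothetical and chosen only for
illustration. SQLContext.getOrCreate is used so that one SQLContext is reused
across batches.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical record type; defined at top level so toDF() can infer a schema.
case class Event(user: String, bytes: Long)

object StreamingWithDataFrames {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("StreamingWithDataFrames")
      .setMaster("local[2]") // assumption: local run with two threads
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical source: lines of the form "user,bytes" on a local socket.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Ordinary DStream operations: parse each line into an Event.
    // Malformed numbers would throw here; real code should handle that.
    val events = lines
      .map(_.split(","))
      .filter(_.length == 2)
      .map(fields => Event(fields(0), fields(1).trim.toLong))

    // Get at the RDD of each batch, convert it to a DataFrame, and do the
    // heavy lifting with DataFrame operations.
    events.foreachRDD { rdd =>
      val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
      import sqlContext.implicits._

      val df = rdd.toDF()
      df.groupBy("user").sum("bytes").show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Only the work expressed as DataFrame operations inside foreachRDD goes
through the DataFrame execution path; the parsing done with map and filter
on the DStream still runs as plain RDD code.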

On Wed, Sep 9, 2015 at 9:23 PM, N B wrote:

> Hello,
>
> How can we start taking advantage of the performance gains made under
> Project Tungsten in Spark 1.5 for a Spark Streaming program?
>
> From what I understand, this is available by default for Dataframes. But
> for a program written using Spark Streaming, would we see any potential
> gains "out of the box" in 1.5 or will we have to rewrite some portions of
> the application code to realize that benefit?
>
> Any insight/documentation links etc in this regard will be appreciated.
>
> Thanks
> Nikunj
>
>