Re: Efficient way to compare the current row with previous row contents

2018-02-12 Thread Georg Heiler
See
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
and
https://stackoverflow.com/questions/42448564/spark-sql-window-function-with-complex-condition
for a more involved example




Re: Efficient way to compare the current row with previous row contents

2018-02-12 Thread KhajaAsmath Mohammed
I am also looking for the same answer. Will this work in a streaming
application too?




Re: Efficient way to compare the current row with previous row contents

2018-02-12 Thread Debabrata Ghosh
Georg, thanks! Would you be able to help me with a few examples, please?

Thanks in advance again!

Cheers,
D



Re: Efficient way to compare the current row with previous row contents

2018-02-12 Thread Georg Heiler
You should look into window functions for Spark SQL.


Efficient way to compare the current row with previous row contents

2018-02-12 Thread Debabrata Ghosh
Hi,
Greetings!

I need an efficient way in PySpark to compare the current row with the
previous row across all attributes. My intent is to make the best use of
Spark's distributed framework so that I can achieve good speed. Can anyone
please suggest a suitable algorithm or command? Here is a snapshot of the
underlying data that I need to compare:

[image: Inline image 1]

Thanks in advance!

D