...@gmail.com
*Cc:* Tobias Pfeiffer t...@preferred.jp; Ganelin, Ilya
ilya.gane...@capitalone.com; derrickburns derrickrbu...@gmail.com;
user@spark.apache.org user@spark.apache.org
*Sent:* Friday, January 30, 2015 7:11 AM
*Subject:* Re: spark challenge: zip with next???
assuming the data can be partitioned then you have many timeseries for
which you want to detect potential gaps. also assuming the resulting gaps
info per timeseries is much smaller data than the timeseries data itself,
then this is a classical example to me of a sorted (streaming) foldLeft,
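To make the sorted foldLeft idea concrete, here is a minimal sketch in plain Python (no Spark involved), assuming each timeseries is a list of numeric timestamps and that a "gap" means two consecutive timestamps more than `max_step` apart. The names `find_gaps`, `max_step`, and the sample `series` data are hypothetical and chosen only for illustration. The fold carries the previous timestamp and the gaps found so far, so the result per series stays small.

```python
from functools import reduce

def find_gaps(timestamps, max_step):
    """Fold over sorted timestamps and collect (prev, curr) pairs
    that are more than max_step apart."""
    def step(state, curr):
        prev, gaps = state
        if prev is not None and curr - prev > max_step:
            gaps.append((prev, curr))
        return (curr, gaps)

    # Initial state: no previous element yet, no gaps found.
    _, gaps = reduce(step, sorted(timestamps), (None, []))
    return gaps

# Hypothetical data: one list of timestamps per key (one timeseries each).
series = {
    "sensor-a": [1, 2, 3, 7, 8],
    "sensor-b": [10, 11, 12],
}

# Each timeseries is folded independently of the others.
gaps_by_key = {key: find_gaps(ts, max_step=1) for key, ts in series.items()}
print(gaps_by_key)
```

In a Spark job the same fold would presumably run once per key over that key's sorted values; how the data gets grouped and sorted is left open here.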
Another solution would be to use the reduce action.
Mohammed
From: Ganelin, Ilya [mailto:ilya.gane...@capitalone.com]
Sent: Thursday, January 29, 2015 1:32 PM
To: 'derrickburns'; 'user@spark.apache.org'
Subject: RE: spark challenge: zip with next???
Make a copy of your RDD with an extra entry in the beginning to offset. Then you
can zip the two RDDs and run a map to generate an RDD of differences.
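The offset-and-zip idea can be sketched on an ordinary Python list, assuming the values are numeric and already in order. The function name `differences` and the sample `readings` are hypothetical. This is only a local illustration of the technique, not Spark code: as far as I know, Spark's `RDD.zip` expects both RDDs to have the same number of partitions and the same number of elements in each partition, so a shifted copy may not line up without extra work.

```python
def differences(values):
    """Pair each element with its predecessor via a shifted copy,
    then map each pair to a difference."""
    # Copy with one extra placeholder entry at the beginning (the offset).
    shifted = [None] + list(values)
    # zip stops at the shorter input, giving (None, v0), (v0, v1), ...
    pairs = zip(shifted, values)
    # Skip the first pair, which has no real predecessor.
    return [curr - prev for prev, curr in pairs if prev is not None]

readings = [3, 5, 9, 10]
diffs = differences(readings)
print(diffs)  # [2, 4, 1]
```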
-Original Message-
From: derrickburns [mailto:derrickrbu...@gmail.com]
Sent:
Hi,
On Fri, Jan 30, 2015 at 6:32 AM, Ganelin, Ilya ilya.gane...@capitalone.com
wrote:
Make a copy of your RDD with an extra entry in the beginning to offset.
Then you can zip the two RDDs and run a map to generate an RDD of
differences.
Does that work? I recently tried something to compute
http://mail-archives.apache.org/mod_mbox/spark-user/201405.mbox/%3ccalrvtpkn65rolzbetc+ddk4o+yjm+tfaf5dz8eucpl-2yhy...@mail.gmail.com%3E
you can use the MLlib