Hi Sujit,
Thanks very much for your kind help.I have found the sliding doc in both scala 
and java spark, it is from mlib RDDFunctions, though in the doc there is always 
not enough example.
Best Regards,Zhiliang

 


     On Monday, September 21, 2015 11:48 PM, Sujit Pal <sujitatgt...@gmail.com> 
wrote:
   

 Hi Zhiliang, 
Would something like this work?
val rdd2 = rdd1.sliding(2).map(v => v(1) - v(0))
-sujit

On Mon, Sep 21, 2015 at 7:58 AM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid> 
wrote:

Hi Romi,
Thanks very much for your kind help comment~~
In fact there is some valid backgroud of the application, it is about R data 
analysis.... #fund_nav_daily is a M X N (or M X 1) matrix or data.frame, col is 
each daily fund return, row is the daily date#fund_return_daily needs to count 
the each fund's daily return subtracted the previous day's return 
fund_return_daily <- diff(log(fund_nav_daily)) 
#the first row would be all 0, since there is no previous row ahead first row
fund_return_daily <- rbind(matrix(0,ncol = ncol(fund_return_daily)), 
fund_return_daily) ...
I need to exactly code the R program by way of spark, then RDD/DataFrame is 
used to replace R data.frame, however, I just found that it is VERY MUCH 
diffcult to make the spark program to flexibly descript & transform R backgroud 
applications.I think I have seriously lost myself into risk about this...
Would you help direct me some about the above coding issue... and my risk about 
practice in spark/R application...
I must show all my sincere thanks torwards your kind help.

P.S. currently sparkR in spark 1.4.1 , there is many bug in the API 
createDataFrame/except/unionAll, and it seemsthat spark Java has more functions 
than sparkR.Also, no specific R regression algorithmn is including in sparkR .
Best Regards,Zhiliang


     On Monday, September 21, 2015 7:36 PM, Romi Kuntsman <r...@totango.com> 
wrote:
   

 RDD is a set of data rows (in your case numbers), there is no meaning for the 
order of the items.
What exactly are you trying to accomplish?

Romi Kuntsman, Big Data Engineer
http://www.totango.com

On Mon, Sep 21, 2015 at 2:29 PM, Zhiliang Zhu <zchl.j...@yahoo.com.invalid> 
wrote:

Dear , 

I have took lots of days to think into this issue, however, without any 
success...I shall appreciate your all kind help.
There is an RDD<int> rdd1, I would like get a new RDD<int> rdd2, each row in 
rdd2[ i ] = rdd1[ i ] - rdd[i - 1] .What kinds of API or function would I use...

Thanks very much!John




   



  

Reply via email to