Re: [R] Best and worst values for each date

arun Wed, 25 Sep 2013 13:31:08 -0700


Hi,
Maybe you can try this:


obj_name <- load("arun.RData")
Pred1 <- get(obj_name[1])
Actual1 <- get(obj_name[2])
library(reshape2)
# reshape to long form
dat <- cbind(melt(Pred1, id.vars = "S1"),
             value2 = melt(Actual1, id.vars = "S1")[, 3])
colnames(dat)[3:4] <- c("Predict", "Actual")
dat$variable <- as.character(dat$variable)  # not strictly needed
# remove rows with NA in columns "Predict" or "Actual"
dat1 <- dat[!(is.na(dat$Predict) | is.na(dat$Actual)), ]
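
In case the reshaping step is unclear, here is a small sketch on made-up data. The frames `pred` and `act`, the stock names A and B, and the helper `to_long` are invented for illustration and are not taken from arun.RData. It assumes, as above, that the column S1 holds the date and every other column is a stock. It uses base R only, as a stand-in for melt(), so it runs without reshape2.

```r
# Made-up wide frames: S1 is the date column, A and B are hypothetical stocks
pred <- data.frame(S1 = c("d1", "d2"), A = c(1.5, NA), B = c(-2, 3),
                   stringsAsFactors = FALSE)
act  <- data.frame(S1 = c("d1", "d2"), A = c(0.4, 1.1), B = c(NA, -0.7),
                   stringsAsFactors = FALSE)

# Base-R stand-in for melt(): stack the stock columns and repeat the dates
to_long <- function(d, value_name) {
  stocks <- setdiff(names(d), "S1")
  out <- data.frame(S1 = rep(d$S1, times = length(stocks)),
                    variable = rep(stocks, each = nrow(d)),
                    value = unlist(d[stocks], use.names = FALSE),
                    stringsAsFactors = FALSE)
  names(out)[3] <- value_name
  out
}

# Both frames have the same shape, so the long rows line up one to one
long <- cbind(to_long(pred, "Predict"), Actual = to_long(act, "Actual")$Actual)

# Keep only rows where both Predict and Actual are present
long_ok <- long[complete.cases(long[, c("Predict", "Actual")]), ]
long_ok
```

complete.cases() on the two value columns keeps the same rows as the !(is.na(...) | is.na(...)) filter.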


res <- do.call(rbind, lapply(split(dat1, dat1$S1), function(x) {
  x1 <- x[order(x$Predict), ]
  # bottom five: in cases where you don't have 5 negative numbers, keep them all
  xlow <- if (sum(x1$Predict < 0) < 5) {
    x1[x1$Predict < 0, ]
  } else {
    x1[x1$Predict < 0, ][1:5, ]  # select first five rows
  }
  # top five: in cases where you don't have 5 positive numbers, keep them all
  xhigh <- if (sum(x1$Predict > 0) < 5) {
    x1[x1$Predict > 0, ]
  } else {
    tail(x1[x1$Predict > 0, ], 5)
  }
  # reverse the order of the high values
  rbind(xhigh[rev(order(xhigh$Predict)), ], xlow)
}))
dim(res)
#[1] 480   4
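
As a check of the selection logic, here is a minimal sketch that applies it to the single example day from the question below. The values are copied from that example. The column name Stock and the helper name `pick_extremes` are my own choices for illustration. head() and tail() already return fewer rows when fewer are available, so the if/else can be dropped in this variant.

```r
# The example day from the question, already in long form
ex <- data.frame(
  Stock   = c("S1", "S20", "S3", "S4", "S5", "S6", "S7", "S8", "S9", "S10",
              "S11", "S12", "S13", "S14", "S15", "S16", "S17", "S18", "S19",
              "S20"),
  Predict = c(3, 4, 2, 1, 0, -1:-15),
  Actual  = c(-1.943, 10.376, 8.611, 7.465, 1.648, 5.36, 4.36, 3.574, 2.748,
              1.933, 0.548, -0.66, -1.793, -2.163, -3.077, -4.723, -5.919,
              -6.529, -7.979, -8.064),
  stringsAsFactors = FALSE
)

# Hypothetical helper: up to n positive and up to n negative predictions
pick_extremes <- function(x, n = 5) {
  x <- x[order(x$Predict), ]
  low  <- head(x[x$Predict < 0, ], n)  # most negative first
  high <- tail(x[x$Predict > 0, ], n)  # largest positives, still ascending
  rbind(high[rev(seq_len(nrow(high))), ], low)  # positives descending, then negatives
}

sel <- pick_extremes(ex)
sel$Predict
# 4 3 2 1 -15 -14 -13 -12 -11
```

Only four predictions are positive on that day, so nine rows remain, as in the expected result in the question. To run it per date, something like do.call(rbind, lapply(split(dat1, dat1$S1), pick_extremes)) should work.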



A.K.

________________________________
From: Ira Sharenow <irasharenow...@yahoo.com>
To: arun <smartpink...@yahoo.com> 
Sent: Wednesday, September 25, 2013 12:55 PM
Subject: Best and worst values for each date



Arun,

I hope you have been doing well.

I have a new problem.

I have two data frames, one for predictions and one for the actual returns.

Each day I act on the stocks whose predicted returns have the five highest 
values and the five lowest values. I then want to compare them to the actual 
values. So I need to subset my two original data frames so that only the stocks 
selected for each day, and their values, remain. At the end of filtering there 
will be one data frame for predictions and one data frame for actual values.

Now for an enhancement. NA values cannot be part of the reduced data frames but 
will occur in great proportion in the original data frames. Each day I need to 
check that the top five are positive; otherwise I need to reduce that number as 
needed. Similarly, I need the bottom five to be negative. At the end of 50 
days each reduced data frame would have 5 * 2 * 50 = 500 rows, but this step 
may reduce that number.

I attached a smallish file with the two data frames. The real ones have 
hundreds of columns and over 1,000 rows.

Please aim for simplicity. If the solution is complex, please explain.


Do you want me to use a different email address?


Thanks.

Ira

Example. But the stocks are not set up this way.

The highlighted stocks are in the first data frames.



Date      Stock  Predict   Actual
1/3/2006  S1           3   -1.943
1/3/2006  S20          4   10.376
1/3/2006  S3           2    8.611
1/3/2006  S4           1    7.465
1/3/2006  S5           0    1.648
1/3/2006  S6          -1    5.36
1/3/2006  S7          -2    4.36
1/3/2006  S8          -3    3.574
1/3/2006  S9          -4    2.748
1/3/2006  S10         -5    1.933
1/3/2006  S11         -6    0.548
1/3/2006  S12         -7   -0.66
1/3/2006  S13         -8   -1.793
1/3/2006  S14         -9   -2.163
1/3/2006  S15        -10   -3.077
1/3/2006  S16        -11   -4.723
1/3/2006  S17        -12   -5.919
1/3/2006  S18        -13   -6.529
1/3/2006  S19        -14   -7.979
1/3/2006  S20        -15   -8.064


After making sure that only positive predictions are in the top 5 and only 
negative predictions are in the bottom 5:

Date      Stock  Predict   Actual
1/3/2006  S1           3   -1.943
1/3/2006  S20          4   10.376
1/3/2006  S3           2    8.611
1/3/2006  S4           1    7.465
1/3/2006  S16        -11   -4.723
1/3/2006  S17        -12   -5.919
1/3/2006  S18        -13   -6.529
1/3/2006  S19        -14   -7.979
1/3/2006  S20        -15   -8.064

Note that the next day different stocks may be selected. Also, there cannot be 
any NA in either the Predict or Actual columns.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
