Re: [R] trouble with looping for effect of sampling interval increase

2012-08-07 Thread White, William Patrick
My apologies, here is a sample dataset generator:

#Running sum Test Data
Coin <- c(-1,1)
flips <- sample(Coin, 1000, replace=TRUE)
Runningsum <- cumsum (flips)
#A deactivated plot
#plot (Runningsum)
Test <- cbind (Runningsum)
datasetORIGINAL <- cbind (Runningsum)






From: Jean V Adams [jvad...@usgs.gov]
Sent: Monday, August 06, 2012 1:33 PM
To: White, William Patrick
Cc: r-help@r-project.org
Subject: Re: [R] trouble with looping for effect of sampling interval increase

You would make it much easier for R-help readers to solve your problem if you 
provided a small example data set with your code, so that we could reproduce 
your results and troubleshoot the issues.

Jean


Re: [R] trouble with looping for effect of sampling interval increase

2012-08-06 Thread Jean V Adams
You would make it much easier for R-help readers to solve your problem if 
you provided a small example data set with your code, so that we could 
reproduce your results and troubleshoot the issues.

Jean



[R] trouble with looping for effect of sampling interval increase

2012-08-05 Thread Naidraug
I've looked everywhere and tinkered for three days now, so I figure asking
might be good.

Here's a general rundown of what I am trying to get my code to do. I am
giving you the whole rundown because I need a solution that retains certain
ways of doing things, since they give me the information I need.

I want to examine the effect of increasing my sampling interval on my data.
Example: what if instead of sampling every hour I sampled every two? How
about every three? And so on, ad nauseam. The way I want to do this is to
take the data I have now and add an index to it that contains counters.
Those counters look something like 1,2,1,2,... for the first one and
1,2,3,1,2,3,... for the next one. I have a lot of them, say a thousand.

Then, for each column in the index, my loops should start in the first
column, run only the ones, store that, then run the twos and store that in
the same column of output in a different row. Then move to the next column:
run the ones, store in the next column of output, run the twos, store in
the next row of that column, run the threes, and so on until there are no
more.

I want to use this index for two reasons. The first is that after this I
will be going back through and using a different method for sub-sampling
while keeping all else the same, so all I have to do there is change the way
I generate the index. The second is that it allows me to run many subsamples
and see their range.

The code I have made generates my index and does the heavy lifting
correctly, as well as my averages and quartiles, but a look at the head() of
my key output (IntervalBetas) shows that something has gone amiss. You have
to look closely to catch it. The values generated for each row of output are
identical. This should not be the case: row one of the first output column
should be generated from all values indexed by a one in the first index
column, whereas in column two there are different values indexed by the
number one.

I've checked about everything I can think of, done print() on my loop
counters (those little i and j), and wiggled about everything. I am
flummoxed. I think the bit that is messing up is in here:
#Here is the loop for betas from sampling interval increase
 c <- WHOLESIZE[2]-1
 for (i in 1:c)
 {
 x <- length(unique(index[,i]))

 for (j in 1:x)
 {

 data <- WHOLE [WHOLE[,x]==j,1]
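
A small check of the indexing in the snippet above, offered as a sketch and not as a diagnosis. It assumes that dataset has a single column (as in the sample generator posted in this thread) and uses a 12-row toy series in place of the real data; the name same_column is made up for the illustration. It builds the counters the same way and asks whether WHOLE[, x] picks the same column as index[, i].

```r
# Toy stand-in for the poster's data: a single-column matrix (assumption)
dataset <- cbind(Runningsum = cumsum(rep(c(1, -1), length.out = 12)))
n <- nrow(dataset)

# Build counters 1; 1,2; 1,2,3; ... then drop the all-ones first column
index <- NULL
for (a in 1:4) {
  index <- cbind(index, rep(1:a, length.out = n))
}
index <- index[, -1, drop = FALSE]

# Column 1 is the data, columns 2.. are the index columns
WHOLE <- cbind(dataset, index)

# For each index column i: is x equal to i + 1, and does WHOLE[, x]
# hold exactly the counters stored in index[, i]?
same_column <- sapply(seq_len(ncol(index)), function(i) {
  x <- length(unique(index[, i]))
  x == i + 1 && all(WHOLE[, x] == index[, i])
})
print(same_column)
```

Under these assumptions every element of same_column is TRUE: index column i holds the counters 1 to i+1, so x happens to equal the position of that column in WHOLE. The match would no longer hold if dataset had more than one column, or if an index column had fewer distinct values than expected.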

But also here is the whole code in case I am wrong that that is the problem
area: 

#loop for making index

 #clean dataset of empty cells
 dataset <- na.omit (datasetORIGINAL)
 #how messed up was the data?
 holeyDATA <- datasetORIGINAL - dataset

 D <- dim(dataset)

#what is the smallest sample?
tinysample <- 100

#how long is the dataset?
 datalength <- length (dataset)

 #MD - how many divisions
MD <- datalength/tinysample

 #clear things up for the index loop
 WHOLE <- NULL
index <- NULL
 #do the index loop

 for (a in 1:MD)
 {
 index <- cbind (index, rep (1:a, length = D[1]))
 }
index <- subset(index, select = -c(1) )

 #merge dataset and index loop
 WHOLE <- cbind (dataset, index)

 WHOLESIZE <- dim (WHOLE)

#Housekeeping before loops
IntervalBetas <- NULL

IntervalBetas <- c(NA,NA)
IntervalBetas <- as.data.frame (IntervalBetas)
IntervalLowerQ <- NULL
IntervalUpperQ <- NULL
IntervalMean <- NULL
IntervalMedian <- NULL

#Here is the loop for betas from sampling interval increase
 c <- WHOLESIZE[2]-1
 for (i in 1:c)
 {
 x <- length(unique(index[,i]))

 for (j in 1:x)
 {

 data <- WHOLE [WHOLE[,x]==j,1]

 #get power spectral density

 PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE)
 frequency <- PSDPLOT$freq
 PSD <- PSDPLOT$spec
 #log transform the power spectral density
 Logfrequency <- log(frequency)
 LogPSD <- log(PSD)
 #fit my line to the data
 Line <- lm (LogPSD ~ Logfrequency)
 #store the slope of the line
 Betas <- rbind (Betas, -coef(Line)[2])

#Get values on the curve shape
BSkew <- skew (Betas)
BMean <- mean (Betas)
BMedian <- median (Betas)
Q <- quantile (Betas)

#store curve shape values
IntervalLowerQ <- rbind (IntervalLowerQ , Q[2])
IntervalUpperQ <- rbind (IntervalUpperQ , Q[4])
IntervalSkew <- rbind (IntervalSkew , BSkew)
IntervalMean <- rbind (IntervalMean , BMean)
IntervalMedian <- rbind (IntervalMedian , BMedian)

#Store the Betas
#This is a pain

BetaSave <- Betas
no.r <- nrow(IntervalBetas)
l.v <- length(BetaSave)
difer <- no.r - l.v
difers <- abs(difer)
if (no.r < l.v){
IntervalBetas <- rbind(IntervalBetas,rep(NA,difers))
}
else {
(BetaSave <- rbind(BetaSave,rep(NA,difers)))
}

IntervalBetas <- cbind (IntervalBetas, BetaSave)

 }

 }

#That ends the loop within a loop for how sampling interval
#changes beta
head (IntervalBetas)
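
One possible reading of the symptom, not confirmed anywhere in this thread: Betas is never reset between index columns, and the storing step sits inside the inner loop, so each new column of IntervalBetas may begin with values carried over from earlier passes. The sketch below shows one way to keep the betas of each sampling interval separate. It is an illustration under that assumption, not the poster's method: the names beta_of, betas_by_interval and max_interval are hypothetical, the series x is regenerated in the style of the sample generator posted in this thread, and only intervals 2 to 5 are tried.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
x <- cumsum(sample(c(-1, 1), 1000, replace = TRUE))

# Negative slope of log(PSD) against log(frequency) for one subsample
# (hypothetical helper wrapping the spectrum() and lm() steps from the post)
beta_of <- function(series) {
  psd <- spectrum(series, detrend = TRUE, plot = FALSE)
  fit <- lm(log(psd$spec) ~ log(psd$freq))
  unname(-coef(fit)[2])
}

max_interval <- 5  # assumption: small, for illustration only

betas_by_interval <- lapply(2:max_interval, function(k) {
  # counters 1,2,...,k repeated along the series
  idx <- rep(1:k, length.out = length(x))
  # a fresh vector for every interval, so nothing carries over
  vapply(1:k, function(j) beta_of(x[idx == j]), numeric(1))
})
names(betas_by_interval) <- paste0("every_", 2:max_interval)

# Pad with NA to a common length to get one column per interval
n_rows <- max(lengths(betas_by_interval))
IntervalBetas <- sapply(betas_by_interval,
                        function(b) c(b, rep(NA, n_rows - length(b))))
head(IntervalBetas)
```

Keeping each interval's betas in its own list element avoids the manual NA padding inside the loop; the padding is done once at the end. Summary values such as the mean, median and quartiles could then be computed per list element with sapply().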





--
View this message in context: 
http://r.789695.n4.nabble.com/trouble-with-looping-for-effect-of-sampling-interval-increase-tp4639213.html
Sent from the R help mailing list archive at Nabble.com.
