from:"Adaikalavan Ramasamy"

Re: [R] Writing a single output file

2010-12-25 Thread Adaikalavan Ramasamy

There are many ways of doing this, and you have to think about the efficiency and 
logistics of the different approaches.


If the data is not large, you can read all n files into a list and then 
combine. If data is very large, you may wish to read one file at a time, 
combining and then deleting it before reading the next file. You can use 
cbind() to combine if all the Date columns are the same, otherwise 
merge() is useful.


The simple brute force approach would be:

 fns <- list.files(pattern="^output")
 do.call( cbind, lapply(fns, read.csv, row.names=1) )


The slightly more optimized and flexible option, though slightly less 
elegant, could be something like this:


 fns <- list.files(pattern="^output")
 out <- read.csv(fns[1], row.names=NULL)

 for(fn in fns[-1]){
   tmp <- read.csv(fn, row.names=NULL)
   out <- merge(out, tmp, by=1, all=TRUE)
   rm(tmp); gc()
 }

You have to see which option is best for your file sizes. Good luck.
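
As a minimal, self-contained sketch of the merge approach: the demo files, the temporary directory and the renaming step are assumptions added for illustration, not part of the original reply. It writes three small files, reads them back in numeric order and merges them on the date column, renaming yield_rate to yield_rate1, yield_rate2, ... so the columns stay distinct.

```r
# Hypothetical demo data: three small files written to a temporary directory.
demo_dir <- tempfile("yield_demo")
dir.create(demo_dir)
dates <- c("12/23/2010", "12/22/2010")
rates <- list(c(5.25, 5.19), c(4.16, 4.59), c(6.15, 6.41))
for (i in seq_along(rates)) {
  write.csv(data.frame(date = dates, yield_rate = rates[[i]]),
            file = file.path(demo_dir, paste0("output", i, ".csv")),
            row.names = FALSE)
}

# list.files() sorts names as text, so output10.csv would come before
# output2.csv. Extract the number from each name and reorder by it.
fns <- list.files(demo_dir, pattern = "^output[0-9]+\\.csv$", full.names = TRUE)
idx <- as.integer(gsub("[^0-9]", "", basename(fns)))
ord <- order(idx)
fns <- fns[ord]
idx <- idx[ord]

# Rename yield_rate to yield_rate<i> before merging.
pieces <- Map(function(fn, i) {
  d <- read.csv(fn, stringsAsFactors = FALSE)
  names(d)[names(d) == "yield_rate"] <- paste0("yield_rate", i)
  d
}, fns, idx)

# Merge all pieces on the date column, keeping dates missing from some files.
combined <- Reduce(function(a, b) merge(a, b, by = "date", all = TRUE), pieces)
combined
```

Note that merge() sorts the result by the date column as text. If chronological order matters, converting with as.Date(combined$date, format = "%m/%d/%Y") and reordering afterwards is one option.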

Regards, Adai



On 23/12/2010 13:07, Amy Milano wrote:

Dear R helpers!

Let me first wish all of you Merry Christmas and Very Happy New year 2011

Christmas day is a day of Joy and Charity,
May God make you rich in both - Phillips Brooks

## 


I have a process which generates number of outputs. The R code for the same is 
as given below.

for(i in 1:n)
{
write.csv(output[i], file = paste("output", i, ".csv", sep = ""), row.names = 
FALSE)
}

Depending on value of 'n', I get different output files.

Suppose n = 3, that means I am having three output csv files viz. 
'output1.csv', 'output2.csv' and 'output3.csv'

output1.csv
date         yield_rate
12/23/2010   5.25
12/22/2010   5.19
.
.

output2.csv
date         yield_rate
12/23/2010   4.16
12/22/2010   4.59
.
.

output3.csv
date         yield_rate
12/23/2010   6.15
12/22/2010   6.41
.
.



Thus all the output files have same column names viz. Date and yield_rate. 
Also, I do need these files individually too.

My further requirement is to have a single dataframe as given below.

Date         yield_rate1   yield_rate2   yield_rate3
12/23/2010   5.25          4.16          6.15
12/22/2010   5.19          4.59          6.41
...
...

where yield_rate1 = output1$yield_rate and so on.

One way is to simply create a dataframe as

df = data.frame(Date = read.csv('output1.csv')$Date, yield_rate1 =  
read.csv('output1.csv')$yield_rate,   yield_rate2 = 
read.csv('output2.csv')$yield_rate,
yield_rate3 = read.csv('output3.csv')$yield_rate)

However, the problem arises when I am not aware how many output files are there 
as n can be 5 or even 100.

So is it possible to write some loop or some function which will enable me to read 'n' 
files individually and then keeping Date common, only pickup the yield_curve 
data from each output file.

Thanking in advance for any guidance.

Regards

Amy








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help Please!!!!!!!!!

2010-11-30 Thread Adaikalavan Ramasamy


Dear Melissa,

If Jim's solution doesn't work then for some reason your function is 
converting numerical values into either character or factor and I would 
suggest you use the colClasses argument to force the right class.


For example,

 mat <- read.table( file="lala.txt", sep="\t", row.names=1, header=TRUE,
                    colClasses=c("character", rep("numeric", 4)) )

Then do a str(mat) and see what you get.

Regards, Adai



On 29/11/2010 13:02, jim holtman wrote:

Your data seems to read in just fine, so what is the problem you are
trying to solve?


x <- read.table('clipboard', sep='\t', header=TRUE)
str(x)

'data.frame':   5 obs. of  5 variables:
  $ X     : Factor w/ 5 levels "JE","JM","S",..: 5 2 4 1 3
  $ None  : int  4 4 25 18 10
  $ Light : int  2 3 10 24 6
  $ Medium: int  3 7 12 33 7
  $ Heavy : int  2 4 4 13 2

summary(x)

   X          None          Light         Medium          Heavy
  JE:1   Min.   : 4.0   Min.   : 2   Min.   : 3.0   Min.   : 2
  JM:1   1st Qu.: 4.0   1st Qu.: 3   1st Qu.: 7.0   1st Qu.: 2
  S :1   Median :10.0   Median : 6   Median : 7.0   Median : 4
  SE:1   Mean   :12.2   Mean   : 9   Mean   :12.4   Mean   : 5
  SM:1   3rd Qu.:18.0   3rd Qu.:10   3rd Qu.:12.0   3rd Qu.: 4
         Max.   :25.0   Max.   :24   Max.   :33.0   Max.   :13


On Mon, Nov 29, 2010 at 12:29 AM, Melissa Waldman
melissawald...@gmail.com  wrote:

Hi,

I have been working with Program R for my stats class and I keep coming upon
the same error, I have read so many sites about inputting data from a text
file into R and I'm using the data to do a correspondence analysis.  I feel
like I have read everything and it is still not explaining why the error
message keeps coming up, I have used the exact examples I have seen in
articles and the same error keeps popping up: Error in sum(N) : invalid
'type' (character) of argument

I have spent so long trying to figure this out without success,
I am sure it has to do with the fact that my rows have names in them.  I
have attached the text file I have been using and if you have any ideas as
to how I can get R to plot the data using correspondence analysis with the
column and row names that would be really helpful!  Or if you could pass
this email to someone who may know how to help me, that would be much
appreciated.

Thank you,
Melissa Waldman

my email: melissawald...@gmail.com


Re: [R] saving multiple panes to PNG

2010-11-30 Thread Adaikalavan Ramasamy

I cannot run your example because I cannot identify which package the 
function returns() is from.


Nonetheless, something like par(mfrow=c(2,3)) should do the trick.

Regards, Adai



On 30/11/2010 14:22, Charles Evans wrote:

After searching multiple combinations of keywords over the past two
days and downloading n R graphics tutorials, I have not been able to
find anything online or in my R books about how to save multiple plot
panes to PNG.

Specifically, I am using the irf() function in the vars package to
generate plots of Impulse Response Functions:

x.data <- cbind(na.omit(returns(p[,2])), na.omit(returns(n[,2])))
colnames(x.data) <- c("p.ret", "n.ret")
x.jo <- ca.jo(x.data, type="trace", ecdet="none", spec="transitory")
x.var <- vec2var(x.jo)
x.irf <- irf(x.var, n.ahead=30)
plot(x.irf)

This results in a plot containing a pair of IRF graphs in Quartz and
the following message in the Console:

Hit <Return> to see next plot:

When one hits <Return>, the next pair of IRF graphs appears in Quartz.

When I try to save the plots to PNG

png(...)
plot(...)
dev.off()

I am able to save only one of the plots.  How does one tell plot() to
plot first one of the panes and then the second?

Any help would be greatly appreciated.

Yours,

Charles Evans


Re: [R] more flexible ave

2010-11-30 Thread Adaikalavan Ramasamy


Here is a possible solution using sweep instead of ave:

  df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                   gr = c("total", "x1", "x2", "x1", "total", "x2"),
                   value1 = c(212, 56, 87, 33, 456, 213),
                   value2 = c(1546, 560, 543, 234, 654, 312) )

  sdf <- split(df, df$site)

  out <- lapply( sdf, function(mat){

    small.mat <- mat[ , -c(1,2)]
    totals    <- mat[ which( mat[ , "gr"] == "total" ), -c(1,2) ]
    totals    <- as.numeric(totals)

    percent <- sweep( small.mat, MARGIN=2, STATS=totals, FUN="/" )
    colnames(percent) <- paste("percent_", colnames(percent), sep="")
    return( cbind(mat, percent) )
  } )

  do.call(rbind, out)

      site    gr value1 value2 percent_value1 percent_value2
  a.1    a total    212   1546     1.00000000      1.0000000
  a.2    a    x1     56    560     0.26415094      0.3622251
  a.3    a    x2     87    543     0.41037736      0.3512290
  b.4    b    x1     33    234     0.07236842      0.3577982
  b.5    b total    456    654     1.00000000      1.0000000
  b.6    b    x2    213    312     0.46710526      0.4770642

Also I think it might be more efficient to replace your gr variable 
with a binary 0/1 indicator where 1 marks the total row. That way you don't have 
to generate x1, x2, x3, ...


Regards, Adai


On 30/11/2010 14:42, Patrick Hausmann wrote:

Hi all,

I would like to calculate the percent of the total per group for this
data.frame:

df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213))
df

calcPercent <- function(df) {

  df <- transform(df, pct_val1 = ave(df[, -c(1:2)], df$gr,
                                     FUN = function(x)
                                       x/df[df$gr == "total", "value1"]) )
}

# This works as intended...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

# ... but when I add a new column
df$value2 <- c(1546, 560, 543, 234, 654, 312)

# the result is not what I want...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

Clearly I have to change the function, (particularly value1) - but
how... I've also played around with apply but without any success.

Thanks for any help!
Patrick


Re: [R] Significance of the difference between two correlation coefficients

2010-11-29 Thread Adaikalavan Ramasamy

Thanks for providing the example but it would be useful to know who I am 
communicating with or from which institute, but nevermind ...


I don't know much about this subject but a quick google search gives me 
the following site: http://davidmlane.com/hyperstat/A50760.html


Using the info from that website, I can code up the following to give 
the two-tailed p-value of difference in correlations:


 diff.corr <- function( r1, n1, r2, n2 ){

   # Fisher's z transformation of each correlation
   Z1 <- 0.5 * log( (1+r1)/(1-r1) )
   Z2 <- 0.5 * log( (1+r2)/(1-r2) )

   diff   <- Z1 - Z2
   SEdiff <- sqrt( 1/(n1 - 3) + 1/(n2 - 3) )
   diff.Z <- diff/SEdiff

   p <- 2*pnorm( abs(diff.Z), lower.tail=FALSE )
   cat( "Two-tailed p-value", p, "\n" )
 }

 diff.corr( r1=0.5, n1=100, r2=0.40, n2=80 )
 ## Two-tailed p-value 0.4103526

 diff.corr( r1=0.1, n1=100, r2=-0.1, n2=80 )
 ## Two-tailed p-value 0.1885966

The p-value here is slightly different from the Vassar website because 
the website rounds its diff.Z values to 2 digits.
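
Since the question mentions huge data sets, a variant that returns the p-values rather than printing them may be easier to automate. The following is only a sketch of the same calculation. The name diff.corr.p is hypothetical, and it assumes the four inputs are vectors of equal length.

```r
# Hypothetical helper (not from the original post): returns the p-values
# instead of printing them, and works element-wise on vectors.
diff.corr.p <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)   # atanh(r) equals 0.5 * log((1 + r) / (1 - r))
  z2 <- atanh(r2)
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  2 * pnorm(abs((z1 - z2) / se), lower.tail = FALSE)
}

# The two examples from above, computed in a single call.
p <- diff.corr.p(r1 = c(0.5, 0.1), n1 = c(100, 100),
                 r2 = c(0.4, -0.1), n2 = c(80, 80))
p
```

The result is a plain numeric vector, so it can be stored as a column of a data frame alongside the correlations.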


Regards, Adai



On 29/11/2010 15:30, syrvn wrote:


Hi,

based on the sample size I want to calculate whether two correlation
coefficients are significantly different or not. I know that as a first step
both coefficients
have to be converted to z values using Fisher's z transformation. I have
done this already but I don't know how to proceed from there.

unlike for correlation coefficients I know that the difference for z values
is mathematically defined but I do not know how to incorporate the sample
size.

I found a couple of websites that provide that service but since I have huge
data sets I need to automate this procedure.

(http://faculty.vassar.edu/lowry/rdiff.html)

Can anyone help?

Cheers,
syrvn




Re: [R] standardize columns selectively within a dataframe

2010-09-01 Thread Adaikalavan Ramasamy


If you want to scale within columns, you could try

 cbind( scale(df[,1:2]), df[ ,-c(1:2)] )
    a  b c  d
 1 -1 -1 7 10
 2  0  0 8 11
 3  1  1 9 12

and it is data.frame() btw.
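
An alternative sketch, assuming the columns to standardize are known by name: assign the standardized values back into a copy of the data frame, which keeps the column order whatever the position of the selected columns. The names cols and std are made up for this example.

```r
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9), d = c(10, 11, 12))

cols <- c("a", "b")        # columns to standardize
std  <- df                 # work on a copy, leave df untouched
std[cols] <- lapply(df[cols], function(x) (x - mean(x)) / sd(x))
std
```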


On 01/09/2010 15:35, Olga Lyashevska wrote:

Dear all,

I have a dataframe:
df <- dataframe(a=c(1,2,3),b=c(4,5,6),c=c(7,8,9),d=c(10,11,12))

I want to obtain a new dataframe with columns a and b being standardized
((x-mean(x))/sd(x)); the other two columns (c,d) I want to leave
unchanged. What is the best way to achieve this? I have been trying to
use subscripts but did not succeed so far.

Any tips?

Many thanks,
Olga


Re: [R] forest plot

2010-08-24 Thread Adaikalavan Ramasamy

You can also do meta.summaries() - from rmeta package - followed by a 
plot() on the resulting object.


Or for a much more flexible plot try forestplot() function, also from 
rmeta package, but this requires a bit of work to set it up.


Regards, Adai


On 24/08/2010 05:50, C.H. wrote:

The correct command for forest plot should be plot (instead of
forest) if you are using metagen from meta package.

For help:

?plot.meta

On Tue, Aug 24, 2010 at 11:03 AM, zhangweiweiweiweizhan...@hotmail.com  wrote:


Dear Sir or Madam,



I am trying to plot forest plot. I extracted odds ratio and their corresponding 
95% confidence interval from papers, then I calculated the log(OR) and standard 
error using the following command

  OR <- metagen(logOR, selogOR, sm="OR")

forest(OR,comb.fixed=TRUE,comb.random=TRUE,digits=2)



However, it does not produce a forest plot.  Can someone kindly help? Thank you 
in advance.



Best wishes

weiwei



Re: [R] Finding a scalar value...

2010-08-19 Thread Adaikalavan Ramasamy


Your best option is to read the relevant help files.

A simple (untested) example to find R when P, T and scal.fn=Z are given 
is to do this:


 my.fun <- function(R, P, T, Z) scal.fn(P, R, T) - Z
 uniroot( my.fun, P=pp, T=tt, Z=zz, lower=-100, upper=100 )$root

uniroot() solves for the first argument of the function, which is why R 
comes first in my.fun. You have to make an intelligent guess on the upper 
and lower ranges for the parameter R. I have used +/- 100 as a silly example.


HOWEVER, I do not think this works when P, R, T, Z are vectors. Try it to 
be sure. If not, then you may have to write a for or apply loop.
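
The loop idea can be sketched as follows. The input values and the search interval are assumptions chosen for illustration; for real data the interval has to be picked so that it brackets the root and does not contain R = 0, where 1/R is undefined.

```r
scal.fn <- function(P, R, T) (1 / R - T) / (P - T)

# uniroot() varies the first argument, so R comes first here.
my.fun <- function(R, P, T, Z) scal.fn(P, R, T) - Z

# Hypothetical inputs: one row per problem to solve.
dat <- data.frame(P = c(2, 3), T = c(0.5, 1), Z = c(1, 2))

# One uniroot() call per row. The interval stays on one side of R = 0,
# because 1/R jumps from -Inf to +Inf there.
dat$R <- mapply(function(P, T, Z) {
  uniroot(my.fun, P = P, T = T, Z = Z,
          lower = 0.01, upper = 100, tol = 1e-10)$root
}, dat$P, dat$T, dat$Z)

dat
```

For this particular formula the root can also be written down directly as R = 1 / (Z * (P - T) + T), which is a convenient check on the numerical answer.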


Regards, Adai




On 16/08/2010 13:19, Petar Milin wrote:

Thanks for the answer!
However, if I would have scal.fn() like below, how would I apply
uniroot() or optimize() or the like?

Best,
PM

On 16/08/10 13:24, Adaikalavan Ramasamy wrote:

You probably need to look up on how to write functions.

Try

  scal.fn <- function(P, R, T){
   out <- ( 1/R - T ) / ( P - T )
   return(out)
  }

Here is a fake example:

  df <- cbind.data.frame( P=rnorm(10), R=rnorm(10), T=rnorm(10) )
  scal.fn( df$P, df$R, df$T )

Or are you trying to solve other parameters given scal values? If so,
try having a look at functions like uniroot().

Regards, Adai


On 16/08/2010 11:48, Petar Milin wrote:

Hello!
I need to find a simple scalar value:
Scal = ((1/R) - T) / (P - T),
where R, T, and P are vectors in a data.frame.

Please, can anyone tell me how to solve that in R?

Best,
PM


Re: [R] Finding a scalar value...

2010-08-16 Thread Adaikalavan Ramasamy


You probably need to look up on how to write functions.

Try

 scal.fn <- function(P, R, T){
  out <- ( 1/R - T ) / ( P - T )
  return(out)
 }

Here is a fake example:

 df <- cbind.data.frame( P=rnorm(10), R=rnorm(10), T=rnorm(10) )
 scal.fn( df$P, df$R, df$T )

Or are you trying to solve other parameters given scal values? If so, 
try having a look at functions like uniroot().


Regards, Adai


On 16/08/2010 11:48, Petar Milin wrote:

Hello!
I need to find a simple scalar value:
Scal = ((1/R) - T) / (P - T),
where R, T, and P are vectors in a data.frame.

Please, can anyone tell me how to solve that in R?

Best,
PM


[R] working out main effect variance when different parameterization is used and interaction term exists

2010-07-13 Thread Adaikalavan Ramasamy


Dear all,

Apologies if this question is bit theoretical and for the longish email. 
I am meta-analyzing the coefficients and standard errors from multiple 
studies where the raw data is not available.


Each study analyst runs a model that includes an interaction term between, 
say, sex and smoking, plus age as a covariate.


Here is an illustrative example for one study:

 set.seed(1066)

 status <- rbinom( 1000, 1, 0.2 )
 males  <- rbinom( 1000, 1, 0.6 )
 smoke  <- rbinom( 1000, 1, 0.3 )
 age    <- runif(1000, min=20, max=80)

 coef( summary( f1 <- glm( status ~ males*smoke + age,
                           family=binomial ) ) )
 #                 Estimate  Std. Error    z value     Pr(>|z|)
 # (Intercept) -1.520399871 0.284464584 -5.3447774 9.052825e-08
 # males        0.213851446 0.201717381  1.0601538 2.890746e-01
 # smoke       -0.123103049 0.292346483 -0.4210861 6.736922e-01
 # age         -0.001056007 0.004612947 -0.2289223 8.189293e-01
 # males:smoke  0.283775173 0.362821438  0.7821345 4.341355e-01


Now, unfortunately some analysts coded sex as females instead of males. 
Using the same dataset, I get the following output with females:


 females <- 1 - males
 coef( summary( f2 <- glm( status ~ females*smoke + age,
                           family=binomial ) ) )
 #                   Estimate   Std. Error    z value     Pr(>|z|)
 # (Intercept)   -1.306548425 0.262573162* -4.9759405 6.493160e-07
 # females       -0.213851446 0.201717381* -1.0601538 2.890746e-01
 # smoke          0.160672124 0.214923130*  0.7475795 4.547138e-01
 # age           -0.001056007 0.004612947  -0.2289223 8.189293e-01
 # females:smoke -0.283775173 0.362821438  -0.7821345 4.341355e-01


I have worked out algebraically (and numerically) the following:

 Beta(females)       =  -Beta(males)
 Var(females)        =  Var(males)

 Beta(females:smoke) =  -Beta(males:smoke)
 Var(females:smoke)  =  Var(males:smoke)

 Beta(smoke | fit1)  =  Beta(smoke | fit2) + Beta(females:smoke)
                     =  0.160672124 - 0.283775173
                     =  -0.1231030

How can I calculate the Var(smoke | fit1) from Var(smoke | fit2) ?

I tried to derive this algebraically but ended up with a covariance term 
which I could not solve. If I could cleverly convert Var(smoke | fit2) 
to Var(smoke | fit1) then I could avoid going back to each analyst, since 
this particular analysis is only one of many hundreds we run and it 
would be annoying to ask each analyst to use the same parameterisation.
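
For reference, the covariance term can be seen numerically when the raw data are at hand. The sketch below is an illustration on the simulated data only, with fit1 and fit2 standing for the male-coded and female-coded models. Because Beta(smoke | fit1) is the sum of two fit2 coefficients, its variance follows from Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b), using the covariance matrix from vcov().

```r
set.seed(1066)
status  <- rbinom(1000, 1, 0.2)
males   <- rbinom(1000, 1, 0.6)
smoke   <- rbinom(1000, 1, 0.3)
age     <- runif(1000, min = 20, max = 80)
females <- 1 - males

fit1 <- glm(status ~ males * smoke + age,   family = binomial)
fit2 <- glm(status ~ females * smoke + age, family = binomial)

V2 <- vcov(fit2)   # full covariance matrix of the fit2 coefficients

# Var(smoke | fit1) rebuilt from fit2 quantities only.
var_smoke_fit1 <- V2["smoke", "smoke"] +
                  V2["females:smoke", "females:smoke"] +
                  2 * V2["smoke", "females:smoke"]

c(from_fit2 = sqrt(var_smoke_fit1),
  from_fit1 = sqrt(vcov(fit1)["smoke", "smoke"]))
```

The two standard errors agree, but the calculation needs the off-diagonal element of the covariance matrix. The reported standard errors alone do not determine it, so it would still have to be obtained from each analyst.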


Any suggestions are much appreciated. Many thanks in advance.

Regards, Adai


Re: [R] column selection in list

2010-01-25 Thread Adaikalavan Ramasamy

If the columns of all elements of the list are in the same order, then 
you can collapse it first and then extract.


   out <- do.call(rbind, SPECSHOR_tx_Asfc)
   out[ , "Asfc.median"]
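
If the list structure should be kept, as in the desired output of the question, a sketch along these lines may help. The small list built here is a hypothetical stand-in for the real object, using the first two rows of two of its elements.

```r
# Hypothetical stand-in for the list in the question.
SPECSHOR_tx_Asfc <- list(
  cotau = data.frame(SPECSHOR = "cotau", Asfc.median = c(381.0247, 154.6280),
                     row.names = c("38", "39")),
  eqgre = data.frame(SPECSHOR = "eqgre", Asfc.median = c(219.5389, 162.5926),
                     row.names = c("145", "146"))
)

# Keep the list and select one column from every element.
# drop = FALSE keeps each result a one-column data frame with its row names.
only_asfc <- lapply(SPECSHOR_tx_Asfc,
                    function(d) d[, "Asfc.median", drop = FALSE])
only_asfc$cotau
```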

Regards, Adai


Ivan Calandra wrote:

Hi everybody!

I have a (stupid) question but I cannot find a way to do it!

I have a list like:
  SPECSHOR_tx_Asfc
$cotau
   SPECSHOR Asfc.median
38    cotau    381.0247
39    cotau    154.6280
40    cotau    303.3219
41    cotau    351.2933
42    cotau    156.5327
$eqgre
    SPECSHOR Asfc.median
145    eqgre    219.5389
146    eqgre    162.5926
147    eqgre    146.3726
148    eqgre    127.6413
149    eqgre    274.2888
$gicam
    SPECSHOR Asfc.median
263    gicam    174.7445
264    gicam     83.4821
265    gicam    157.6005
266    gicam    153.7519
267    gicam    344.9775

I would just like to remove the column SPECSHOR (or extract the other 
one) so that it looks like

$cotau
   Asfc.median
38    381.0247
39    154.6280
40    303.3219
41    351.2933
42    156.5327
etc.

How should I do it? I know how to select each element like 
SPECSHOR_tx_Asfc[[1]], but I don't know how to select a single column 
within an element.


Could you please help me on that?

Thanks
Ivan


Re: [R] lm on group

2010-01-23 Thread Adaikalavan Ramasamy

You can guess by looking at class(g). It is a factor. It is NOT 
regressing on the mean of g (i.e. 2.5 and 7.5) and you could have 
changed g from (0,5] and (5,10] to A and B with the same results.


Read some books or help(lm) to get an idea of what the outputs mean.
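
A short sketch using the data from the question shows both points: the coefficients are the group means of y, and relabelling the factor levels changes only the coefficient names. The object names fit and g2 are made up for this example.

```r
x <- 1:10
y <- x * 3
g <- cut(x, seq(0, 10, by = 5))     # factor with levels (0,5] and (5,10]

fit <- lm(y ~ g - 1)                # no intercept: one coefficient per level
coef(fit)
tapply(y, g, mean)                  # the same numbers: group means of y

# Relabelling the levels changes only the coefficient names.
g2 <- factor(g, labels = c("A", "B"))
coef(lm(y ~ g2 - 1))
```

The standard error of 2.121 in the summary is the pooled residual standard deviation divided by the square root of the group size (5 observations per group), and each t value tests whether that group mean differs from zero.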

Regards, Adai



newbieR wrote:

Hi all,

  I have a quick question about lm on group, say I have:


x <- 1:10
y <- x*3
buckets <- seq(0, 10, by=5)
g <- cut(x, buckets)
summary(lm(y ~ g - 1))

  Coefficients:
          Estimate Std. Error t value Pr(>|t|)
g(0,5]       9.000      2.121   4.243  0.00283 **
g(5,10]     24.000      2.121  11.314 3.35e-06 ***


 What is it doing exactly? I guess the estimate is the mean of the y's
in each group. 


 How about other stats.. what do they exactly mean when we do lm on
groups? 



Thanks a lot!



Re: [R] first and second derivative calculation

2010-01-23 Thread Adaikalavan Ramasamy


How about?

 eval( D( expression( t^3-6*t^2+5*t+30 ), "t" ) )
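
Spelled out a little further as a sketch: eval() looks up t in the workspace, so defining t as the grid first gives the derivative at every grid point. The names f, d1, d2, y1 and y2 are made up for this example.

```r
t <- seq(0, 4, by = 0.1)
f <- expression(t^3 - 6*t^2 + 5*t + 30)

d1 <- D(f, "t")       # first derivative, as an unevaluated call
d2 <- D(d1, "t")      # second derivative

y1 <- eval(d1)        # evaluated at every value of t
y2 <- eval(d2)

head(cbind(t, y1, y2), 3)
```

The vectors y1 and y2 can then be passed to plot(t, y1, type = "l") and plot(t, y2, type = "l") in the same way as the original curve.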



David Winsemius wrote:

On Jan 22, 2010, at 6:49 PM, Marlin Keith Cox wrote:


I can plot this just fine:
t <- seq(0,4, by=.1)
y <- t^3-6*t^2+5*t+30
plot(t,y ,xlab="t-values", ylab="f(t)", type="l")
This is the first derivative, how do I make a similar plot?
t <- seq(0,4, by=.1)
y <- t^3-6*t^2+5*t+30
y1 <- D(expression(t^3-6*t^2+5*t+30), 't')


There might be some sort of deparse() operation that one could do on  
y1, but what follows sidesteps that level of programming.



y1fn <- function(t) {3 * t^2 - 6 * (2 * t) + 5}
par(new=TRUE)
plot(t, y1fn(t), ylab="", xlab="", axes=FALSE)
  axis(side=4, at=seq(-7,5,by=1) )


--

David.


Thanks ahead of time.

kc


On Fri, Jan 22, 2010 at 12:41 PM, Doran, Harold hdo...@air.org  
wrote:



D(expression(t^3-6*t^2+5*t + 30), 't')

3 * t^2 - 6 * (2 * t) + 5


D(D(expression(t^3-6*t^2+5*t + 30), 't'), 't')

3 * (2 * t) - 6 * 2

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
]

On Behalf Of Marlin Keith Cox
Sent: Friday, January 22, 2010 4:37 PM
To: r-help@r-project.org
Subject: [R] first and second derivative calculation

I would like to calculate a first and second derivative and am having
problems finding a simple solution.  My syntax may be off as I am  
not a

mathematician, so pardon ahead of time.
data:
t <- seq(0,4, by=.1)
The function is:
H(t) = t^3-6*t^2+5*t + 30

from here I plot the curve:
plot(x,y ,xlab="x-values", ylab="f(x)", type="l")
But would like to similarly plot the curve for both the first and  
second

derivatives.
I can calculate the derivatives by hand but would like to get R to  
do this

for me.
by hand:
H'(t) = 3*t^2 - 12*t + 5
H''(t) = 6*t-12
Keith

--
M. Keith Cox, Ph.D.
Alaska NOAA Fisheries, National Marine Fisheries Service
Auke Bay Laboratories
17109 Pt. Lena Loop Rd.
Juneau, AK 99801
keith@noaa.gov
marlink...@gmail.com
U.S. (907) 789-6603








David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] output

2010-01-18 Thread Adaikalavan Ramasamy

Season X is taken as the reference category. So the output 
factor(season)y   10.59739 means the feed_intake is higher by 10.60 
units in Season Y _compared to_ Season X (at weight = 0, since the model 
also contains a weight:season interaction).


Change your levels in season. E.g.
  season <- factor(season, levels=c("z", "x", "y"))
which means that z will be taken as the reference category. Also read 
help(contrasts).
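
As an illustration of how the reference category changes the output, here is a sketch on simulated data. The data, the additive model without the interaction, and the names fit_x, fit_z and season_z are all assumptions made for this example. It uses relevel(), which moves one level to the front without retyping all the levels.

```r
set.seed(1)
season <- factor(rep(c("x", "y", "z"), each = 10))
weight <- runif(30, 50, 100)
feed_intake <- 20 + 2 * weight + c(0, 10, 1)[as.integer(season)] + rnorm(30)

# Default: the first level ("x") is the reference category.
fit_x <- lm(feed_intake ~ weight + season)
names(coef(fit_x))

# Make "z" the reference category instead.
season_z <- relevel(season, ref = "z")
fit_z <- lm(feed_intake ~ weight + season_z)
names(coef(fit_z))
```

In fit_z the coefficient for level x is the x versus z difference, which is the negative of the z coefficient in fit_x. The fitted values of the two models are identical.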


Regards, Adai



Ashta wrote:

Hi all,
I am trying to interpret the result of the following output from lm:


fit1 = lm(Feed_Intake ~ weight + season + weight*season)
Season has three classes (x, y, z).

Results are

                        Estimate
(Intercept)             21.51559
weight                   2.13051
factor(season)y         10.59739
factor(season)z          1.30421
weight:factor(season)y  10.1
weight:factor(season)z  21.70288

My question are  what is the estimate of season x?

Could it be possible to change the output in the following way?

factor(season)x
factor(season)y
weight:factor(season)x
weight:factor(season)y

Thanks in adavance


Re: [R] Learning R

2009-11-30 Thread Adaikalavan Ramasamy


Dear Julia,

Welcome. It is good that you wish to learn more about R.

R has certainly become very vast in the last few years. Do you wish to 
learn R for a particular reason (financial analyses, multivariate, 
prediction/classification, genetics)? You might get more targeted 
reading materials, books and websites to follow up.


Regards, Adai



Julia Cains wrote:

Dear R helpers,

Almost 15 days back I have become member of this very active and wonderful 
group. So far I have been only raising  queries and in turn got them solved too 
and I really thank for the spirit this group member show when it comes to the 
guidance.

I wish to learn R language and I have given 2 months time for this. Can anyone 
please guide me as how do I begin i.e. from basics to advance.

R is such a vast thing to learn, so I wish to learn it step by step without 
getting lost at any stage.

Please guide me where do I start and upgrade myself to higher level step by 
step.

Regards

Julia







Only a man of Worth sees Worth in other men






  

[R] how to analyze this design using lmer

2009-11-27 Thread Adaikalavan Ramasamy


Dear all,

A friend of mine requested me to analyze some data she has generated. I 
am hoping for some advice on best way of properly analyzing the data as 
I have never worked with such complicated or nested designs.


Here is the setup. She has taken material from 5 animals and each 
material is subdivided into 6 plates (30 plates in total). Each plate is 
then assigned as either a control or treated with a chemical AND kept 
at one of three concentrations. A sample is taken daily from each plate 
for six consecutive days and measured (180 measurements in total). Her 
main question is whether treatment has an effect.


Here is a simulated dataset:

 df <- expand.grid( animal=LETTERS[1:5], group=c("Control", "Treated"),
                    conc=c("X", "Y", "Z"), day=1:6 )
 df$plate <- as.numeric(factor(apply(df[ ,1:3], 1, paste, collapse="")))
 df <- df[ order(df$plate), ]
 df$plate <- as.factor(df$plate)
 rownames(df) <- NULL

 set.seed(1066)
 df$value <- runif(90, 1, 2)*(df$group=="Control") +
             c(0, -0.5, -0.20)[as.numeric(df$conc)] +
             rnorm(30)[ as.numeric(df$plate) ] +
             runif(180, 0.9, 1.1)*df$day + rnorm(180, sd=0.5)

 df[1:10, ]
     animal   group conc day plate     value
 1        A Control    X   1     1 3.3403510
 2        A Control    X   2     1 5.1042965
 3        A Control    X   3     1 5.4003462
 ...
 ...
 178      E Treated    Z   4    30 2.8558186
 179      E Treated    Z   5    30 4.4567206
 180      E Treated    Z   6    30 5.4542460


I have tried analyzing the data as follows:
library(lme4)
lmer( value ~ group + day + conc + (1 | animal/plate), data=df )
lmer( value ~ group + day + conc + (1 | animal), data=df )
lmer( value ~ group + day + conc + (1 | plate), data=df )

BUT I am not sure which of the models above is appropriate. Any advice 
would be very useful. Many thanks in advance.


Regards, Adai


Re: [R] Split column

2009-11-24 Thread Adaikalavan Ramasamy


Not very elegant but this does the trick:

df <- cbind( var1=c(1,3,2,1,2), var2=c(3,1,1,2,3) )

out <- df
out[ which(df==1, arr.ind=TRUE) ] <- 11
out[ which(df==2, arr.ind=TRUE) ] <- 12
out[ which(df==3, arr.ind=TRUE) ] <- 22

# strsplit() needs character input, so convert each column first
outlist <- apply(out, 2, function(x) strsplit(as.character(x), split=""))
do.call( cbind.data.frame, lapply( outlist, do.call, what=rbind ) )
  var1.1 var1.2 var2.1 var2.2
1      1      1      2      2
2      2      2      1      1
3      1      2      1      1
4      1      1      1      2
5      1      2      2      2

Please check.
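
The same recoding can also be sketched with a small lookup matrix, which avoids the detour through strings. The names lookup, parts and res are made up for this example, and the data frame is the one from the question.

```r
df <- data.frame(id = 1:5, var1 = c(1, 3, 2, 1, 2), var2 = c(3, 1, 1, 2, 3))

# Row k of the lookup matrix holds the pair that code k is split into:
# 1 -> (1, 1), 2 -> (1, 2), 3 -> (2, 2).
lookup <- matrix(c(1, 1,
                   1, 2,
                   2, 2), ncol = 2, byrow = TRUE)

# Indexing the rows of lookup by each column gives a two-column matrix.
parts <- lapply(df[c("var1", "var2")], function(v) lookup[v, , drop = FALSE])

res <- data.frame(id = df$id,
                  var1.1 = parts$var1[, 1], var1.2 = parts$var1[, 2],
                  var2.1 = parts$var2[, 1], var2.2 = parts$var2[, 2])
res
```

Unlike the string-splitting version, the resulting columns are numeric rather than character.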

Regards, Adai



Lisaj wrote:

Hello, R users,

I have a dataset that looks like this: 

id   var1   var2   
 1  1  3   
 2  3  1   
 3  2  1   
 4  1  2   
 5  2  3   


I want to split one column to two columns with 1 = 1 and 1, 2 = 1 and 2, 3 =
2 and 2: 

id   var1.1  var1.2  var2.1  var2.2 
1 1   1   2   2 
2 2   2   1   1

3 1   2   1   1
4 1   1   1   2
5 1   2   2   2

Can anyone please help how to get this done? Thanks a lot in advance

Lisa




Re: [R] replace a whole word with sub()

2009-11-13 Thread Adaikalavan Ramasamy


Isn't this more straightforward?

 w <- grep("^Ig", vec)
 vec[w] <- "0"
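
If you do want to stay with sub(), one option is to let the pattern swallow the rest of the string. A small sketch, where vec is just the example vector from the question:

```r
vec <- c("xxx", "yyy", "zzz", "IgA", "IgG", "kkk", "IgM", "aaa")

# Option 1: index the matches and overwrite them
vec1 <- vec
vec1[grep("^Ig", vec1)] <- "0"

# Option 2: match the whole string, so sub() replaces all of it
vec2 <- sub("^Ig.*$", "0", vec)

identical(vec1, vec2)
```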

Regards, Adai


Giulio Di Giovanni wrote:
 

Dear all, 

 


I cannot figure out how to solve a small problem (well, not for me), surely 
somebody can help me in few seconds.

 


I have a series of strings in a vector X of the type  xxx, yyy, zzz, IgA, IgG, kkk, 
IgM, aaa.

I want to substitute every ENTIRE string beginning with Ig with 0.

So, I'd like to have xxx, yyy, zzz, 0, 0, kkk, 0, aaa.

 


I can easily identify these strings with grep("^Ig", X), but if I use this criterion in the sub() function 
(sub("^Ig", "0", X)) I obviously get 0A, 0G etc.

 


I didn't expect to do it in this way and I tried with metacharacters and regexps 
in order to grep and substitute the whole word (\b \, $). I don't post my 
attempts here, because they were obviously wrong.

Please can you help me?

 


Giulio
 		 	   		  

Re: [R] convert list to numeric

2009-11-02 Thread Adaikalavan Ramasamy


It's a way of extracting from a list. See help("[") or help(Extract)
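
A short illustration of the difference, using a made-up data frame called task (the column names a and b are only for this example):

```r
task <- data.frame(a = c(1, 2, 3), b = c(10, 20, 40))

class(task["a"])     # single bracket keeps the container: a data frame
class(task[["a"]])   # double bracket extracts the element: a numeric vector

# Mean-centering every column works once the columns are plain vectors
centered <- lapply(task, function(col) col - mean(col, na.rm = TRUE))
centered$a
```

Inside a loop, task[[i]] therefore gives something that mean() can work with, whereas task[i] is still a one-column data frame.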

Regards, Adai

dadrivr wrote:

Great, that works very well.  What is the purpose of double brackets vs
single ones?  I will remember next time to include a subset of the data, so
that readers can run the script.  Thanks again for your help!
 


Benilton Carvalho wrote:

it appears that what you really want is to use:

task[[i]]

instead of task[i]

b

On Nov 1, 2009, at 11:04 PM, dadrivr wrote:

I would like to preface this by saying that I am new to R, so I would ask
that you be patient and thorough, so that I'm not completely clueless.  I am
trying to convert a list to numeric so that I can perform computations on it
(specifically mean-center the variable), but I am running into problems.  I
have imported the data set into task (data frame).  The data frame is made
of factors with variable names in the first row.  I am running a loop to set
a variable equal to a column in the data frame.  Here is an example of my
problem:

for (i in 1:dim(task)[2]){
predictor.loop <- c(task[i])
predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T)
}

I get the following error:
Error in predictor.loop - mean(predictor.loop, na.rm = T) :
 non-numeric argument to binary operator
In addition: Warning message:
In mean.default(predictor.loop, na.rm = T) :
 argument is not numeric or logical: returning NA

The column is entirely made up of numerical data, except for the header,
which is a string.  My problem is that I receive an error because the
predictor.loop variable is not numerical, so I need to find a way to convert
it.  I tried using:
predictor.loop <- c(as.numeric(task[i]))
But I get the following error: Error: (list) object cannot be coerced to
type 'double'

If I call the variable, I can assign it to a numerical list (e.g.,
predictor.loop <- task$variablename), but since I am assigning the variable in
a loop, I have to find another way as the variable name would have to change
in each loop iteration.  Any help would be greatly appreciated.  Thanks!

Re: [R] how to print the full name of the factors in summary?

2009-11-02 Thread Adaikalavan Ramasamy

It would be useful to say which package the object SJ comes from or 
provide a more reproducible example.


Assuming that the Demand variable is continuous and you are fitting a 
standard lm() model, then your results look suspicious. Where are the 
coefficients for Month, Holiday, Season?
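
On the labels themselves: names such as Weekday.L, Weekday.Q and Weekday.C are what R prints for an ordered factor, because ordered factors get polynomial contrasts by default. A rough sketch with made-up data (the SJ data frame below is simulated, not yours) shows the difference:

```r
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Simulated stand-in for the SJ data; values are arbitrary
set.seed(1)
SJ <- data.frame(Weekday = rep(1:7, 10), Demand = rnorm(70, 100, 20))

SJ$WeekdayOrd <- factor(SJ$Weekday, 1:7, days, ordered = TRUE)  # ordered
SJ$WeekdayFac <- factor(SJ$Weekday, 1:7, days)                  # unordered

fit_ord <- lm(Demand ~ WeekdayOrd, data = SJ)
fit_fac <- lm(Demand ~ WeekdayFac, data = SJ)

names(coef(fit_ord))  # polynomial terms: .L, .Q, .C, ^4, ...
names(coef(fit_fac))  # one coefficient per level after the first: Tue, Wed, ...
```

So dropping ordered=T (or changing the contrasts) should give coefficients labelled by level name.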





Jen-Chien Chang wrote:

Hi,

I am wondering if there is a simple way to fix the problem I am having. 
For unknown reason, I could not get the full name of the factors to be 
printed in the summary. I have tried to used summary.lm as well but the 
problem still persists.


SJ$Weekday <- 
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T) 



attach(SJ)
lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
summary(lm.SJ)
Call:
lm(formula = Demand ~ Weekday + Month + Holiday + Season)

Residuals:
    Min      1Q  Median      3Q     Max
-69.767 -12.224  -1.378  10.857  91.376

Coefficients: (3 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7091     3.3442  26.527  < 2e-16 ***
Weekday.L    20.8132     2.8140   7.396 1.08e-12 ***
Weekday.Q   -12.7667     2.8156  -4.534 7.99e-06 ***
Weekday.C   -10.6375     2.8113  -3.784 0.000182 ***
Weekday^4    -8.3325     2.8103  -2.965 0.003238 **
-

Is there a way for summary to print the full name of the factors and 
levels? Say Weekday.Tue instead Weekday.L?


Thanks!

Jack Chang




Re: [R] Removing generating data by category

2009-10-30 Thread Adaikalavan Ramasamy

Hmm, so if I read correctly you want to remove exactly duplicated rows. So 
maybe try the following to begin with.


 duplicated(newdf[ , c("id", "loc", "clm")])
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
[12]  TRUE  TRUE


Then you can remove the duplicated rows before proceeding with what has 
been suggested before.


Also you can try unique(newdf[ , c("id", "loc", "clm")]) if you are not 
interested in carrying over other corresponding variables.


See help(duplicated) and help(unique).
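
For instance, on a small made-up data frame (the amount column is just an extra variable for illustration):

```r
newdf <- data.frame(id     = c("A1", "A2", "A2", "A3"),
                    loc    = c("B1", "B2", "B2", "B3"),
                    clm    = c("General", "Life", "Life", "General"),
                    amount = c(10, 20, 25, 30))

# TRUE marks a row whose (id, loc, clm) combination was already seen
dup <- duplicated(newdf[, c("id", "loc", "clm")])

dedup <- newdf[!dup, ]                             # keeps the other columns
keys  <- unique(newdf[, c("id", "loc", "clm")])    # key columns only
```

Note that duplicated() flags only the second and later occurrences, so the first row of each group is kept.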

Regards, Adai




David Winsemius wrote:

Color me puzzled. Can you express the rule more clearly in Boolean logic?

If someone has five policies: 3 Life and 2 General ...  is he in or out?

Applying the alternate strategy to that data set I get:
out <- tapply( dat$clm, dat$uid, paste, collapse="," )

  out
                         A1.B1                          A2.B2                          A3.B1
                     "General"                 "General,Life"                      "General"
                         A3.B3                          A4.B4                          A5.B5
"General,Life,General,General"         "General,Life,General"                 "General,Life"


Please explain why you want A3.B3.




Re: [R] Is there a faster way to do it?

2009-10-29 Thread Adaikalavan Ramasamy


You might also want to consider using na.strings="9" in the scan().
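
A minimal sketch of that idea, reading from a text string instead of a file so it can be run as is (the numbers and the vector freq are made up):

```r
# Fields equal to "9" are read as NA
vals <- scan(text = "6 9 7 8\n9 7 9 6", na.strings = "9", quiet = TRUE)
m <- matrix(vals, nrow = 2, byrow = TRUE)

# Replacement values, one per column
freq <- c(0.1, 0.2, 0.3, 0.4)

# Fill each missing cell with the value belonging to its column
idx <- which(is.na(m), arr.ind = TRUE)
m[idx] <- freq[idx[, "col"]]
m
```

Having the missing values as NA from the start also means functions with an na.rm argument can be used directly.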



jim holtman wrote:

Here is a faster way of doing the replacement: (provide reproducible
data next time)


x <- matrix(sample(6:9, 64, TRUE), 8)
x

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    8    7    7    6    7    8    7    9
[2,]    7    7    8    6    7    6    7    7
[3,]    7    7    7    6    9    6    6    7
[4,]    9    9    7    6    8    7    6    6
[5,]    6    9    9    8    8    9    8    9
[6,]    9    7    6    9    7    8    6    7
[7,]    7    9    8    9    7    9    7    8
[8,]    9    9    6    9    9    8    8    6

x.f <- 1:8  # replacement values based on column
x.ind <- which(x == 9, arr.ind=TRUE)
x.ind

      row col
 [1,]   4   1
 [2,]   6   1
 [3,]   8   1
 [4,]   4   2
 [5,]   5   2
 [6,]   7   2
 [7,]   8   2
 [8,]   5   3
 [9,]   6   4
[10,]   7   4
[11,]   8   4
[12,]   3   5
[13,]   8   5
[14,]   5   6
[15,]   7   6
[16,]   1   8
[17,]   5   8

x[x.ind] <- x.f[x.ind[,'col']]
x

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    8    7    7    6    7    8    7    8
[2,]    7    7    8    6    7    6    7    7
[3,]    7    7    7    6    5    6    6    7
[4,]    1    2    7    6    8    7    6    6
[5,]    6    2    3    8    8    6    8    8
[6,]    1    7    6    4    7    8    6    7
[7,]    7    2    8    4    7    6    7    8
[8,]    1    2    6    4    5    8    8    6


On Wed, Oct 28, 2009 at 12:55 PM, Marcio Resende
mresende...@yahoo.com.br wrote:

#Mdarts is a matrix 2343x788
#frequencia is a vector 2343x1
# 9 in Mdarts[fri,frj] stands for my missing values which i want to replace
by the value in the vector frequencia


Mdarts <- t(matrix(scan("C:/GWS/CNB/dartg.txt"),ncol=nindT,nrow=nm, byrow=T))
frequencia <- matrix(scan("C:/GWS/CNB/freq.txt"),ncol=1)
for (fri in 1:nindT){
for (frj in 1:nm){
Mdarts[fri,frj] <- if (Mdarts[fri,frj] == 9) frequencia[frj] else
Mdarts[fri,frj]
Mdarts[fri,frj] <- Mdarts[fri,frj]/1-(frequencia[frj]^2)
}
}

Is there a faster way to it?
Maybe using any apply function?
Thanks in advance

Re: [R] Removing generating data by category

2009-10-29 Thread Adaikalavan Ramasamy


Here is another way based on pasting ids as hinted below:

a <- data.frame(id=c(c("A1","A2","A3","A4","A5"),
                     c("A3","A2","A3","A4","A5")),
                loc=c("B1","B2","B3","B4","B5"),
                clm=c(rep(("General"),6),rep("Life",4)))

a$uid <- paste(a$id, ".", a$loc, sep="")

out <- tapply( a$clm, a$uid, paste ) # can also add collapse=","
$A1.B1
[1] "General"

$A2.B2
[1] "General" "Life"

$A3.B1
[1] "General"

$A3.B3
[1] "General" "Life"

$A4.B4
[1] "General" "Life"

$A5.B5
[1] "General" "Life"


Then here are those with single policies.

 out[ which( sapply(out, length) == 1 ) ]
$A1.B1
[1] "General"

$A3.B1
[1] "General"
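
If you want the original rows back rather than a list, something along these lines might do. It is only a sketch: key, n_clm and single are names made up here, and it counts distinct clm values per id/loc pair instead of pasting them.

```r
a <- data.frame(id  = c("A1", "A2", "A3", "A4", "A5",
                        "A3", "A2", "A3", "A4", "A5"),
                loc = c("B1", "B2", "B3", "B4", "B5"),
                clm = c(rep("General", 6), rep("Life", 4)))

# One key per id/loc combination
key <- paste(a$id, a$loc, sep = ".")

# Number of distinct categories seen for each key
n_clm <- tapply(as.character(a$clm), key, function(z) length(unique(z)))

# Keep the rows whose key occurs with a single category only
single <- a[as.vector(n_clm[key] == 1), ]
single
```

Since this avoids an explicit loop over rows, it should scale reasonably to larger data, though I have not timed it on anything of the size you mention.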



David Winsemius wrote:

On Oct 28, 2009, at 9:30 PM, Steven Kang wrote:


Dear R users,


Basically, from the following arbitrary data set:

a <- data.frame(id=c(c("A1","A2","A3","A4","A5"),c("A3","A2","A3","A4","A5")),
                loc=c("B1","B2","B3","B4","B5"),
                clm=c(rep(("General"),6),rep("Life",4)))



a

   id loc     clm
1  A1  B1 General
2  A2  B2 General
3  A3  B3 General
4  A4  B4 General
5  A5  B5 General
6  A3  B1 General
7  A2  B2    Life
8  A3  B3    Life
9  A4  B4    Life
10 A5  B5    Life

I desire removing records (highlighted records above) with identical values
in each field (id & loc) but with a different value of clm (i.e.
according to category)


Take a look at this merge operation on separate rows of a.

  merge( a[a$clm=="Life", ], a[a$clm=="General", ], by=c("id", "loc"), all=T)

   id loc clm.x   clm.y
1  A1  B1  <NA> General
2  A2  B2  Life General
3  A3  B1  <NA> General
4  A3  B3  Life General
5  A4  B4  Life General
6  A5  B5  Life General

Assignment of that object and selection with is.na should complete the
process.


  a2m <- merge( a[a$clm=="Life", ], a[a$clm=="General", ],
                by=c("id", "loc"), all=T)


  a2m[ is.na(a2m$clm.x) | is.na(a2m$clm.y), ]
   id loc clm.x   clm.y
1  A1  B1  <NA> General
3  A3  B1  <NA> General

Alternate methods might include paste-ing id to loc and removing  
duplicates.




i.e

categ <- table(a$id,a$clm)
categ

     General Life
 A1        1    0
 A2        1    1
 A3        2    1
 A4        1    1
 A5        1    1

The desired output is

   id   loc  clm
1  A1  B1 General
6  A3  B1 General

Because the data set I am working on is quite big (~ 800,000 x 20)
with the majority of the field values being long strings, looping turned out to
be very inefficient in comparing individual rows.

Are there any alternative efficient methods for implementing this?

Steven



Re: [R] Lost all script

2009-10-28 Thread Adaikalavan Ramasamy

To stop in Rgui mode, you can try pressing the ESC key. If you are using 
R within Emacs, change to the R buffer and try C-c C-c to stop it.


I am not sure how to recover the script (Emacs usually makes a .R~ 
backup). If you still have the output printed to the screen or 
terminal, make a copy of it - you may be able to rewrite the script with 
some work. If your machine is backed up on a regular basis, then try to 
get the last available backup.


Also note that you can view the same file (even while it is in the R 
session) using notepad etc externally. So next time, if you face a 
similar situation then you can check/save externally first.


Regards, Adai




David Young wrote:

Hi all,

I just had a rather unpleasant experience.  After considerable work I
finally got a script working and set it to run.  It had some memory
allocation problems when I came back so I used Windows to stop it.
During that process it told me that the script had been changed and
asked if I wanted to save it.  Not being positive that I'd saved the
very last changes I said yes.  Now when I turn on R again the script
is now completely blank.

I guess my questions are:
Is there a way to interrupt a program without using Windows?
Is there anyway to recover my script?

And a nice to know:
Anybody know why it saved blank space as the new script?

Thanks for any advice.

A humble, and humbled, new R user.







Re: [R] Selecting rows according to a column

2009-10-28 Thread Adaikalavan Ramasamy


Not very elegant but try:

 z <- data.frame(a = 1:5, b=10*(1:5), c = c("a", "a", "b", "b", "b") )
 z[ cbind( 1:nrow(z), match( as.character(z$c) , colnames(z) ) ) ]

If you have very few columns, you can use ifelse() too.
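
One caveat with matrix-indexing the whole data frame is that the mixed column types get converted to character first. A sketch that restricts the lookup to the numeric columns, so the result stays numeric (num_cols and picked are names chosen here for illustration):

```r
z <- data.frame(a = 1:5, b = 10 * (1:5), c = c("a", "a", "b", "b", "b"))

# Index only the numeric columns with (row, column) pairs
num_cols <- c("a", "b")
picked <- as.matrix(z[num_cols])[cbind(seq_len(nrow(z)),
                                       match(as.character(z$c), num_cols))]
picked

# The ifelse() version for the two-column case
picked2 <- ifelse(z$c == "a", z$a, z$b)
```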

Regards, Adai



Gurpal Kalsi wrote:

Hi,

With a data such as:

z = data.frame(a = 1:5, b=10*a, c = c("a", "a", "b", "b", "b") )

* a  b  c*
 *1* 10 *a*
 *2* 20 *a*
 3 *30* *b*
 4 *40* *b*
 5 *50* *b*

Can anyone suggest a way to select [1, 2, 30, 40, 50],
ie. using column c to specify which column is selected for each row.

Many thanks

G


Re: [R] New variables remember how they were created?

2009-10-28 Thread Adaikalavan Ramasamy


Your example is too complicated for me. But a few points:

1) What do you mean by instrument? Do you mean variable?

2) diff(demand) is identical to demand[-1] - demand[-204]

3) system() is a built-in R function, so avoid using it as a variable name

4) The variable yd is in the eqInvest formula, which in turn goes into the 
system list. The variable y.1 is in the instruments formula. Both are 
passed on to the systemfit() call. Thus I see no surprises here.
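
To illustrate points 2 and 4 with toy numbers (the vector demand below is made up, and the formula is never fitted): a plain numeric vector carries no record of how it was computed, while a formula only stores variable names that are looked up later.

```r
demand <- c(10, 12, 15, 19, 24)
n <- length(demand)

y.1 <- demand[-n]            # lagged values
yd  <- demand[-1] - y.1      # first difference

all.equal(yd, diff(demand))  # same numbers as diff()
attributes(yd)               # NULL: nothing about y.1 is stored in yd

# A formula records names only; values are found when the model is fitted
f <- realinvs ~ tbilrate + yd
all.vars(f)
```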


Try simplifying and rephrasing please if you want further help.

Regards, Adai




Skipper Seabold wrote:

Hello all,

I hope this question is appropriate for this ML.

Basically, I am wondering if when you create a new variable, if the
variable holds some information about how it was created.

Let me explain, I have the following code to replicate an example in a
textbook (Greene's Econometric Analysis), using the systemfit package.

dta <- 
read.table('http://pages.stern.nyu.edu/~wgreene/Text/Edition6/TableF5-1.txt',
header = TRUE)
attach(dta)
library(systemfit)
demand <- realcons + realinvs + realgovt
c.1 <- realcons[-204]
y.1 <- demand[-204]
yd <- demand[-1] - y.1
eqConsump <- realcons[-1] ~ demand[-1] + c.1
eqInvest <- realinvs[-1] ~ tbilrate[-1] + yd
system <- list( Consumption = eqConsump, Investment = eqInvest)
instruments <- ~ realgovt[-1] + tbilrate[-1] + c.1 + y.1
# 2SLS
greene2sls <- systemfit( system, "2SLS", inst = instruments,
methodResidCov = "noDfCor" )

When I do the 2SLS fit, it seems that even though I declared y.1 as an
instrument that the estimator knows that yd was created using y1, so
it (correctly) transforms yd to use the instrument in the final
estimation.

So I'm wondering if yd somehow carries knowledge of how it was created.

Thanks,

Skipper


Re: [R] basic statistics to csv

2009-10-27 Thread Adaikalavan Ramasamy


It would be useful to have a simplified version of the 'nsu' object.

I am guessing it is a list of some sort (e.g. mean is single value, 
quantiles here returns 5 numbers) and not a matrix or dataframe (i.e. 
regular array). So you can have several choices here:


1) print nsu to a file, e.g. cat(nsu, file="lala", append=T) or using 
the sequence sink(file="lala"); print(nsu); sink()


2) compile the nsu objects into a list (if generating nsu takes time, 
you can save each nsu and then have a script to read them all into a 
list). Then extract the means across the elements in the list (e.g. 
with sapply) and compile into a regular array before using write.csv().
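
As a rough sketch of the "regular array" idea, skipping numSummary() altogether and computing the same statistics in base R (the data frame x and the file name are made up for illustration):

```r
x <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40), c = c(5, 5, 6, 8))

# One row per variable, one column per statistic
stats <- t(sapply(x, function(v) {
  c(mean = mean(v), sd = sd(v), quantile(v, c(0, .25, .5, .75, 1)))
}))
stats

# A plain matrix like this writes cleanly to csv
out_file <- tempfile(fileext = ".csv")
write.csv(stats, file = out_file)
```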


Regards, Adai




lanc...@fns.uniba.sk wrote:

I know that my question is a very newbie question, but at the moment
I am stuck with it and I need a quick solution. I need to make an overall
statistical overview of various datasets; the summary() and numSummary()
functions are fully sufficient. My question is, how can I export the results
to a spreadsheet-like file, such as a .csv? For the summary() with an x
dataset I can do it this way:

su <- summary(x)
write.csv(su, file = "summary.csv")

The problem with this is that the csv file is rather chaotic.

But when I apply the same to the numSummary(x) output like:

nsu <- numSummary(x[,c("a", "b", "c")], statistics=c("mean", "sd",
"quantiles"), quantiles=c(0,.25,.5,.75,1))


write.csv(nsu, file = "numsummary.csv")

I get the  ERROR: cannot coerce class "numSummary" into a data.frame
message.

Is there a more convenient way to get a spreadsheet-like output for the
basic statistics?

Many thanks for any help

Tomas


Re: [R] writing R extensions

2008-11-04 Thread Adaikalavan Ramasamy

It sounds like you simply uncompressed your .tar.gz file and then zipped 
it up. If so, it should not work correctly.


You need to compile it for windows. Try something like

Rcmd build --binary myRpackageDir

and you may need to include --force option in the command above.

Also check to make sure the R version in the machine you compile on and 
the machine you install on are recent versions.


Regards, Adai



micha_ wrote:

Hi,

I'm working on a package and got some problems. After I've done R CMD check
and build I get the package.tar.gz which I can install under Linux without
any problems. Now I wanted to have a Windows version. I heard that I only
have to zip the package folder. That worked once, but now the package can't
be installed. I got 1 warning while I did R CMD check, and this was 1 not
documented dataset, but it was also already in the old version that worked.
So is there anything I have to take care of for the Windows version? Or is
there a way to check what happend. The error message in Windows is this:


utils:::menuInstallLocal()

updating HTML package descriptions

library(mask)

Error in library(mask) :
  'mask' is not a valid package -- installed < 2.0.0?

Can anybody help me with this?

Michael



Re: [R] request: How can we ignore a component of list having no element

2008-10-15 Thread Adaikalavan Ramasamy


Try
x[ !sapply(x, is.null) ]
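
For example, with a small made-up list:

```r
x <- list(matrix(3, 2, 2), NULL, c(1, 2, 3))

# Keep only the components that are not NULL
kept <- x[!sapply(x, is.null)]
length(kept)

# Filter() expresses the same thing
kept2 <- Filter(Negate(is.null), x)
```

Note that in the second example from the question, x3=c() is also NULL, so the same line drops it.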


hadley wickham wrote:

An alternative approach would be to store 0 x 0 matrices instead of
NULLs.  This way every object in your list is a consistent type.

Hadley

On Wed, Oct 15, 2008 at 5:23 AM, Muhammad Azam [EMAIL PROTECTED] wrote:

Dear friends
There is a list of arrays comprising different no of rows and columns even 
sometimes NULL, such as [[2]] given below. How can we ignore [[2]] or others 
like this in the complete list. Any help in this regard is needed. Thanks

[[1]]
     [,1] [,2]
[1,]    3    1
[2,]    3    1
[3,]    3    1

[[2]]
NULL

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    3    1    0    0    0    0    0
[2,]    3    1    0    0    0    0    0
[3,]    3    1    0    0    0    0    0
[4,]    3    1    3    1    3    2    1
[5,]    3    1    3    1    3    2    1
[6,]    3    1    3    1    3    2    0

[[4]]
     [,1] [,2] [,3] [,4]
[1,]    3    0    0    0
[2,]    3    1    3    3
[3,]    3    1    3    3
[4,]    3    1    3    0

OR
x1=c(1,2,3); x2=c(1,2,3,4,6); x3=c(); x=list(x1,x2,x3)

M.Azam




[R] back transforming output from negative binomial

2008-10-02 Thread Adaikalavan Ramasamy


Dear all,

I used the glm.nb with the default values from the MASS package to run a 
negative binomial regression. Here is a simple example:



   library(MASS)
   set.seed(123)
   y <- c( rep(0, 30), rpois(70, lambda=2) )
   smoke  <- factor( sample( c("NO", "YES"), 100, replace=T ) )
   height <- c( rnorm(30, mean=100, sd=20), rnorm(70, mean=150, sd=20) )

   fit <- glm.nb( y ~ smoke + height )
   coef(summary(fit))
                  Estimate  Std. Error    z value     Pr(>|z|)
   (Intercept) -2.34907191 0.537610710 -4.3694664 1.245505e-05
   smokeYES    -0.03479730 0.197627539 -0.1760751 8.602349e-01
   height       0.01942373 0.003527538  5.5063142 3.664243e-08


The question now is how do I report the results, say, for height? Do I 
simply take the anti logs. i.e. 1.019613 = exp(0.019423) ?


I have seen one paper where they report using anti log base 10 instead 
of natural base but they use STATA though.
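
For what it is worth, glm.nb() uses a log (natural base) link by default, so exp() of a coefficient can be read as the multiplicative change in the expected count per unit increase of that predictor. A sketch of how such rate ratios with approximate Wald intervals could be tabulated, reusing the simulated data above (the 1.96 multiplier assumes a normal approximation; rr is a name made up here):

```r
library(MASS)

set.seed(123)
y      <- c(rep(0, 30), rpois(70, lambda = 2))
smoke  <- factor(sample(c("NO", "YES"), 100, replace = TRUE))
height <- c(rnorm(30, mean = 100, sd = 20), rnorm(70, mean = 150, sd = 20))

fit <- glm.nb(y ~ smoke + height)
est <- coef(summary(fit))

# Back-transform estimates and Wald limits from the log scale
rr <- exp(cbind(RateRatio = est[, "Estimate"],
                Lower     = est[, "Estimate"] - 1.96 * est[, "Std. Error"],
                Upper     = est[, "Estimate"] + 1.96 * est[, "Std. Error"]))
rr
```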


Please kindly advise. Thank you.

Regards, Adai


Re: [R] OHLC Plot with EMA in it

2008-09-25 Thread Adaikalavan Ramasamy


Can you give us a simple example which produces the same behavior?


Michael Zak wrote:

Hi there

I have some timeseries data which I plot in a OHLC Plot. In the same 
plot I'd like to have the EMA of this timeseries. I tried to add the EMA 
point to OHLC with lines(), but this doesn't work. Has anyone an idea 
how to handle it?


Regards, Michael Zak


Re: [R] rowSums()

2008-09-24 Thread Adaikalavan Ramasamy


I guess the fastest way would be:

 rs <- rowSums( testDat, na.rm=T)
 rs[ which( rowMeans(is.na(testDat)) == 1 ) ] <- NA

since both rowSums and rowMeans are internally coded in C.
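
A runnable check on the example data from the question, written with a count of non-missing values instead of rowMeans() (a minor variation, not a different method):

```r
testDat <- data.frame(A = c(1, NA, 3), B = c(NA, NA, 3))

# Sum ignoring NAs, then blank out rows that had no observed value at all
rs <- rowSums(testDat, na.rm = TRUE)
rs[rowSums(!is.na(testDat)) == 0] <- NA
rs
```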

Regards, Adai



Doran, Harold wrote:

Say I have the following data:

testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3))


testDat

   A  B
1  1 NA
2 NA NA
3  3  3

rowSums() with na.rm=TRUE generates the following, which is not desired:


rowSums(testDat[, c('A', 'B')], na.rm=T)

[1] 1 0 6

rowSums() with na.rm=F generates the following, which is also not
desired:



rowSums(testDat[, c('A', 'B')], na.rm=F)

[1] NA NA  6

I see why this occurs, but what I hope to have returned would be:
[1] 1 NA  6

To get what I want I could do the following, but normally my ideas are
bad ideas and there are codified and proper ways to do things. 


rr <- numeric(nrow(testDat))
for(i in 1:nrow(testDat)) rr[i] <- if(all(is.na(testDat[i,]))) NA else
sum(testDat[i,], na.rm=T)


rr

[1]  1 NA  6

Is there a proper way to do this? In my real data, nrow is over
100,000

Thanks,
Harold


sessionInfo()
R version 2.7.2 (2008-08-25) 
i386-pc-mingw32 


locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] MiscPsycho_1.2  lattice_0.17-13 statmod_1.3.6  


loaded via a namespace (and not attached):
[1] grid_2.7.2


Re: [R] Changing a plot

2008-09-24 Thread Adaikalavan Ramasamy

One way is to keep a copy of the original and then return to it when you 
need it.


 x <- rnorm(100,1,0.5)
 y <- rnorm(100,1,0.5)
 plot(x,y,pch=16)
 original <- recordPlot()

 for( i in 1:10 ){
   points( x[i], y[i], pch=19, col="yellow", cex=3)
   points( x[i], y[i], pch=16)
   Sys.sleep(1)  # slow the graphs a bit
   replayPlot(original)
 }

Regards, Adai



R Help wrote:

Hello list,

I've been working on this problem for a while and I haven't been able
to come up with a solution.

I have a couple of functions that plot a bunch of data, then a single
point on top of it.  What I want is to be able to change the plot of
the point without replotting all the data.  Consider the following
example:

x = rnorm(100,1,0.5)
y = rnorm(100,1,0.5)
plot(x,y,pch=16)
points(x[35],y[35],pch=19,col=6,cex=3)

What I want to be able to do is to change the purple point to a
different value without replotting everything.

I know this seems like an odd suggestion, but it comes up a lot with
the work I'm doing.  I've prepared a package on CRAN called
ResearchMethods for a course I'm working on, and there are several
functions in there who's GUIs could work better if I could figure this
out.

If anyone has any ideas, or needs some further explanation, feel free
to contact me.

Thanks a lot,
Sam Stewart


Re: [R] lower / upper case letters in a plot

2008-09-24 Thread Adaikalavan Ramasamy


An example would help.

You generally control the titles using arguments like main, xlab, ylab, 
sub in the plotting functions or afterwards using title() function. You 
can get the upper/lower case using toupper()/tolower() functions. See 
help(par), help(title), help(tolower). Here is an example:



string <- "My x-axis corresponding to something"
plot( rnorm(10), xlab=toupper(string) )
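
Since the question asked for lower case, here is a minimal sketch of the same idea using tolower() and title(). The label text is made up for illustration, and pdf(file = NULL) is only an assumption to let the example run without opening a graphics window.

```r
# Hypothetical axis label, not taken from the original post
label <- "My X-Axis LABEL"

lower <- tolower(label)   # every character converted to lower case
upper <- toupper(label)   # every character converted to upper case

pdf(file = NULL)              # null device: nothing is written to disk
plot(rnorm(10), xlab = "")    # suppress the default x-axis title
title(xlab = lower)           # add the lower-case title afterwards
invisible(dev.off())
```

Setting xlab = "" in plot() and adding the title afterwards avoids the two labels being drawn on top of each other.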

Regards, Adai



Jörg Groß wrote:

Hi,

How can I generate lower case letters for my axis-titles?



Thanks,
Jörg

Re: [R] t tests/ANOVA

2008-09-24 Thread Adaikalavan Ramasamy

First check that your data satisfy the normality assumption. If so, 
start with the ANOVA test


   summary( fit <- aov( genomes ~ clonefed ) )

and *if* you find a significant F-value, you can check which pairwise 
differences are significant, i.e. a post-hoc analysis:


   TukeyHSD( fit, "clonefed" )

You can use help(aov) etc. to find out more details, including examples.
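
As a rough, self-contained sketch of the steps above: the data frame below holds only nine of the posted values (three per clone), and the name eggs and the use of shapiro.test() on the residuals as a normality check are assumptions for illustration, not part of the original advice.

```r
# Small subset of the posted data, for illustration only
eggs <- data.frame(
  clonefed = factor(c("HB3", "HB3", "HB3", "3D7", "3D7", "3D7",
                      "MIX", "MIX", "MIX")),
  genomes  = c(21.3, 23.5, 25.9, 27.2, 57.6, 62.1, 35.1, 37.9, 42.1)
)

# One-way ANOVA of genome number on clone
fit <- aov(genomes ~ clonefed, data = eggs)

# One possible normality check: Shapiro-Wilk test on the residuals
shapiro.test(residuals(fit))

summary(fit)

# Post-hoc pairwise comparisons; only worth reading if the F-test is significant
tuk <- TukeyHSD(fit, "clonefed")
tuk
```

TukeyHSD() needs the grouping variable to be a factor, which is why clonefed is created with factor() here.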

Regards, Adai



Georgina Sarah Humphreys wrote:

I have a set of data that comprises genome numbers in single eggs from three 
different parasite clones - 3D7, HB3, and MIX.  I can draw a boxplot of the 
genome numbers for each clonefed but how do I carry out a t test or ANOVA to 
compare if the means are significantly different? (Data is listed below)

Many thanks,
Georgina Humphreys

clonefed genomes
HB3 21.3
HB3 23.5
HB3 25.9
3D7 27.2
HB3 28.1
MIX 35.1
MIX 37.9
MIX 42.1
MIX 42.4
HB3 46.3
HB3 46.3
MIX 48.4
MIX 52.1
HB3 54.6
MIX 55.4
3D7 57.6
HB3 58.4
3D7 62.1
MIX 63.6
MIX 66.5
3D7 69.1
3D7 76.2
MIX 77.5
MIX 80.4
MIX 85.5
MIX 85.9
HB3 96
HB3 106.3
3D7 108.1
MIX 113.8
MIX 117.4
MIX 118
3D7 122.8
3D7 131.4
MIX 138.7
MIX 142.6
MIX 143
3D7 144
MIX 151.6
MIX 155.2
MIX 162.4
MIX 168.4
MIX 169.3
3D7 172.3
HB3 173
HB3 191.9
MIX 192.7
HB3 200
MIX 206.3
3D7 210.2
HB3 223.7
HB3 223.9
3D7 232.1
HB3 238.6
MIX 240.8
3D7 254.3
3D7 257.6
3D7 261.8
3D7 269.9
HB3 277
MIX 289.1
MIX 293.2
MIX 295.2
MIX 295.7
MIX 310.4
3D7 311.9
3D7 311.9
MIX 313.1
MIX 317.8
MIX 332.2
3D7 334.9
3D7 338.2
MIX 340
MIX 360.5
3D7 372.8
3D7 376.6
HB3 390.3
MIX 419.1
3D7 420
MIX 427.4
MIX 443
MIX 449.7
MIX 452.8
MIX 501.4
3D7 502.9
3D7 505.5
3D7 506.3
3D7 529
MIX 534.4
MIX 540.6
MIX 542
3D7 545.2
MIX 547.2
MIX 554.2
MIX 556.5
3D7 564.9
3D7 575.1
3D7 580.6
MIX 591.5
3D7 655.5
3D7 666.1
3D7 667.2
3D7 699
3D7 741.2
3D7 744.8
3D7 752.2
MIX 795.9
3D7 810.9
HB3 816.4
MIX 849.2
3D7 852.9
3D7 875.4
3D7 891.3
MIX 906.5
MIX 922.3
MIX 949.6
MIX 986.1
MIX 994.3
MIX 1005.3
MIX 1061.3
MIX 1159.5
3D7 1163.2
MIX 1177.5
3D7 1211.3
3D7 1249.7
3D7 1318.3
MIX 1579.3
MIX 1585.2
MIX 1590.3
MIX 1788.7
MIX 2012.9
3D7 2067.4

PhD Student
Division of Infection and Immunity
B5-29, GBRC
120 University Place
Glasgow
G12 8TA
Tel: 0141 330 5650


Re: [R] looping through variables

2008-09-24 Thread Adaikalavan Ramasamy


Perhaps what you want is get().

   apple <- rnorm(5)
   orange <- runif(5)

   fruits <- c("apple", "orange")

   fruit.data <- NULL
   for( fruit in fruits ){
     v <- get(fruit)
     fruit.data <- cbind(fruit.data, v)
   }
   colnames(fruit.data) <- fruits

   fruit.data

Here the resulting output is a matrix which works if all of your inputs 
have the same length. If they don't, then you probably want to use a 
list instead. Also have a look at assign().
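
A minimal sketch of the list variant for inputs of unequal length, using mget() (the vectorised relative of get()) and assign(). The object names and values are invented for illustration.

```r
# Two hypothetical inputs of different lengths
apple  <- c(1.2, 3.4, 5.6)
orange <- c(0.1, 0.2)

fruits <- c("apple", "orange")

# mget() looks up several objects by name and returns a named list
fruit.list <- mget(fruits)

# Work on every element without typing the names again
fruit.lengths <- sapply(fruit.list, length)

# assign() goes the other way: create an object from a name held in a string
assign("banana", c(7, 8, 9))
```

Because the result is a list, each element keeps its own length, which a matrix built with cbind() cannot do.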


Regards, Adai




K. Fleischer wrote:

Hello everyone,

I have the following problem:

My analysis includes many predictor variables (50) in the form of 
raster maps (asc), but I am trying to avoid having to type all their 
names over and over again in the analysis (e.g. for vectorisation, for 
deletion of NA's, etc.)


So ideally I would like to store them in some way that their names 
only have to be typed once and can always be referred back to.


First step would be to automate the vectorisation of the raster maps:

# these are the raster maps which need to be combined somehow ??
variables <- (temperature, precipitation, elevation, vegcover) 


VariablesNew=c()

for (i in 1:length(variables)) {
Varnew <- as.vector(variables[i])
VariablesNew <- cbind(VariablesNew, Varnew)
}

This should return a data frame called VariablesNew with each column 
representing one of the variables. 

So the BIG QUESTION is how to input the variable names that they can 
be referred to easily and, the variable itself can be pulled out and 
not just its name!!

I believe this cant be too difficult??

Thanx in advance,
Katrin


Re: [R] Is there a single command that can revert all the plotting parameters to default?

2008-09-24 Thread Adaikalavan Ramasamy

What do you mean? If you kill the existing graph, perhaps using 
dev.off(), the next plot generated should use default values. Is this 
what you want?


Some plotting functions use this at the start, before modifying any 
graphical parameters:

  oldpar <- par(no.readonly=TRUE)
  on.exit(par(oldpar))
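
A minimal sketch of that idiom inside a function. The function name two_panel is hypothetical, and pdf(file = NULL) is only an assumption so the example runs without opening a window.

```r
# Hypothetical plotting function that changes par() and restores it on exit
two_panel <- function(x) {
  oldpar <- par(no.readonly = TRUE)   # save all settable parameters
  on.exit(par(oldpar))                # restore them however the function ends
  par(mfrow = c(1, 2))                # temporary change: two panels side by side
  hist(x)
  boxplot(x)
  invisible(NULL)
}

pdf(file = NULL)          # null device: nothing is written to disk
before <- par("mfrow")
two_panel(rnorm(50))
after <- par("mfrow")     # same as 'before' because on.exit() restored it
invisible(dev.off())
```

The on.exit() call means the settings are restored even if the plotting code stops with an error.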

Regards, Adai




Arthur Roberts wrote:

Hi, all,

This might be a stupid question.  Is there a single command in R that 
can revert parameters to default?  It is much appreciated.


Best wishes,
Art Roberts
University of Washington


Re: [R] keep the row indexes/names when do aggregate

2008-09-24 Thread Adaikalavan Ramasamy


Not the most elegant solution but here goes.

   df <- data.frame(g=c("g1","g2","g1","g1","g2"),v=c(1,7,3,2,8))

   rownames.which.max <- function(m, col){
     w <- which.max( m[ , col] )
     return( rownames(m)[w] )
   }

   df.split <- split(df, df$g)
   ws <- sapply( df.split, rownames.which.max, col="v" )

   ws
    g1  g2
   "3" "5"

   df[ws, ]
      g v
   3 g1 3
   5 g2 8
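
An alternative sketch, not the method above: ave() can compute the group maximum for every row, and a logical comparison then keeps the matching rows with their original row names. Note that, unlike which.max(), this keeps all tied rows if a group has several maxima.

```r
df <- data.frame(g = c("g1", "g2", "g1", "g1", "g2"),
                 v = c(1, 7, 3, 2, 8))

# ave() returns, for each row, the maximum of v within that row's group
group.max <- ave(df$v, df$g, FUN = max)

# Keep the rows whose value equals their group maximum
top <- df[df$v == group.max, ]
top
```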

Regards, Adai

zhihuali wrote:

Hi, R-users,

If I have a data frame like this:

x<-data.frame(g=c("g1","g2","g1","g1","g2"),v=c(1,7,3,2,8))

   g v
1 g1 1
2 g2 7
3 g1 3
4 g1 2
5 g2 8


It contains two groups, g1 and g2. Now for each group I want the max v:


aggregate(x$v,list(g=x$g),max)

   g x
1 g1 3
2 g2 8

Beautiful. But what if I want to keep the row index of (g1 3) and (g2 8) in the original x? 
So what I want is:

do something

   g x
 3 g1 3
 5 g2 8

Of course it would make much more sense if the row indexes were some row names 
that I want to keep.

Is there a simple way to do that?

Thanks a lot!

Z


 




Re: [R] Calling object outside function

2008-09-24 Thread Adaikalavan Ramasamy

I don't understand why you need to use a function at all, especially 
when all your function arguments are overwritten inside the loop.


Here is a simplified example of what you are doing:

f <- function(x){
 x <- 5
 print(x)
}

Therefore f(1), f(2), ..., f(1000) etc. all give you the same answer.

However, you can set a default value for x, which will allow you to vary 
it at a later stage if you wish to.


f <- function(x=5){
 print(x)
}

So now f() gives 5, f(10) gives 10, ...


Similarly, assuming that you want to vary the file, Loc_Mod_TAZ, 
Dev_Size later, you might be interested in perhaps:


loadTestData <- function(file="TAZ_VAC_ACRES.csv",
                         Loc_Mod_TAZ=120, Dev_Size=58){

   #Loads TAZ and corresponding vacant acres data
   TAZ_VAC_ACRES <- read.csv(file=file, header=TRUE)

   #Determines vacant acres by TAZ
   TAZDetermine <- TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ==Loc_Mod_TAZ, 2]
   return(TAZDetermine)
}

out <- loadTestData()
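
If several objects created inside the function are needed afterwards, one option is to return them together in a list. The sketch below is self-contained and therefore writes a small made-up CSV to a temporary file; the function name load_test_data, the column name VAC_ACRES and the numbers are assumptions for illustration only.

```r
# Made-up stand-in for TAZ_VAC_ACRES.csv, written to a temporary file
csv <- tempfile(fileext = ".csv")
write.csv(data.frame(TAZ = c(119, 120, 121),
                     VAC_ACRES = c(10.5, 42, 7.25)),
          csv, row.names = FALSE)

load_test_data <- function(file, Loc_Mod_TAZ = 120, Dev_Size = 58) {
  taz <- read.csv(file, header = TRUE)
  # Return everything needed later as named list elements
  list(TAZDetermine = taz[taz$TAZ == Loc_Mod_TAZ, 2],
       Dev_Size     = Dev_Size)
}

res <- load_test_data(csv)
res$TAZDetermine   # vacant acres for the selected TAZ
res$Dev_Size
```

Objects created inside a function disappear when it returns, so whatever is needed later has to be part of the return value and assigned by the caller.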

Regards, Adai



PDXRugger wrote:

What I thought was a simple process isn't working for me.  After I create
multiple objects in a function (see below), how do I use those objects later
in the program?  I tried calling the function again and then the object I
wanted, and it worked the first time but now it doesn't (I think I defined the
object outside the function accidentally, so it worked then, but when run
properly it doesn't).  I did this using Testdata(TAZDetermine) to first recall
the function and then the object I wanted to use.  This doesn't work and it
errors that the object cannot be found.  Do I use attach?  This didn't seem to
work either.  I just want to call an object defined in a function from outside
of the function.  Hope you can help.

Cheers,
JR


#Function to create hypothetical numbers for process testing 
Testdata=function(TAZ_VAC_ACRES,Loc_Mod_TAZ,Dev_Size,TAZDetermine,Dev_Size){


#Loads TAZ and corresponding vacant acres data
TAZ_VAC_ACRES=
read.csv(file="I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/TAZ_VAC_ACRES.csv",header=TRUE);


#Test Location Choice Model selected TAZ 
Loc_Mod_TAZ = 120
#Create test Development 
Dev_Size=58


#Determines vacant acres by TAZ 
TAZDetermine=TAZ_VAC_ACRES[TAZ_VAC_ACRES$TAZ==Loc_Mod_TAZ,2]


#Displays number of vacant acres in Location Choice Model selected TAZ
TAZDetermine

}

Testdata(TAZDetermine)

error indicating that the function cannot be found even though it's part of
the argument list in the main function.  




Re: [R] how to keep up with R?

2008-09-19 Thread Adaikalavan Ramasamy

I agree! The best way to learn (and remember for longer) is to teach 
someone else about it.


And there is no reason not to repeat some of the analysis done in SAS 
with R. That way you can verify your outputs or compare the 
presentations. If you consistently find differences in the outputs, then 
trying to figure out the reason may lead you to better understand the 
methods (e.g. different optimization or estimation procedures).


Regards, Adai



Barry Rowlingson wrote:

2008/9/19 Wensui Liu [EMAIL PROTECTED]:

Dear Listers,

I've been a big fan of R since graduate school. After working in the
industry for years, I haven't had many opportunities to use R and am mainly
using SAS. However, I am still forcing myself really hard to stay close to R
by reading R-help and books and writing R code by myself for fun. But by and
by, I start realizing I have hard time to keep up with R and am afraid that
I would totally forget how to program in R.

I really like it and am very unwilling to give it up. Is there any idea how
I might keep touch with R without using it in work on daily basis? I really
appreciate it.



 How about doing some kind of presentation on R at your work? It's
possible that some of the old fossils don't even know about it at all,
and use SAS because to them the alternative is SPSS. Do some R
evangelization. Find a task that R does better than SAS (not
difficult) and illustrate that to your superiors. Then when they ask
how much a corporate R license is, you tell them it's free, or say
it'll cost them a 2% raise in your salary, or say it will cost them
your resignation if you are feeling brave!

 Sure you may be tied to SAS for some other reasons, but there's no
reason why you can't use R for other things. Work out how to get it
into your corporate framework. Encourage your colleagues to look at it
for their tasks. Enthuse.

 The good thing about training and evangelization is that at first you
don't need mad skillz at R to do it. I have trouble understanding some
of the tips on R-help (especially when do.call() is used), but you can
teach new people with a good knowledge of the basics, which you should
still have. Eventually the hope is that enough people use R at your
workplace to develop a community where everyone keeps everyone else on
their toes with R questions!

 Good luck!

Barry


Re: [R] selecting dataframe values that are not nulls

2008-09-19 Thread Adaikalavan Ramasamy

Ramya, you sent four near identical emails with different subject lines. 
Since the list is run by unpaid volunteers, please avoid wasting 
people's time (and yours too) with such redundancies.


Please read http://www.r-project.org/posting-guide.html and search the 
mailing lists and documentation.


Did you receive the replies to your 1st request from miltinho and Moshe?

If not, have a look at help(merge) with the all.x, all.y and all 
arguments. You might also be interested in unique, is.na, list.
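
For the question as asked (dropping rows that contain missing values), a minimal sketch built on is.na() could look like this. The data frame is invented for illustration, and it assumes the "nulls" are stored as NA.

```r
# Invented example with missing values in two different columns
df <- data.frame(id = 1:4,
                 a  = c(1, NA, 3, 4),
                 b  = c("x", "y", NA, "z"))

# is.na(df) is a logical matrix; a row is complete if it has no TRUE entries
keep  <- rowSums(is.na(df)) == 0
clean <- df[keep, ]

# na.omit() gives the same rows in one step
clean2 <- na.omit(df)
```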


Regards, Adai



Rajasekaramya wrote:

Hi,

I have a dataframe with 14319 rows and 9 columns. For some rows there are null
values. I want a dataframe without these null values, i.e. I want to select only
those rows that have values != NA. 


kindly let me know how to do that.

Ramya




Re: [R] unix-type commandline keystrokes in the windows RGUI

2008-09-18 Thread Adaikalavan Ramasamy

Ah, I didn't realize Rterm existed (Start -> Run -> Rterm). It works 
with CTRL-R as you said. Thank you!


Regards, Adai



Peter Dalgaard wrote:

Adaikalavan Ramasamy wrote:

...
Anyway, here is how to do what you want:

1) Install bash on your Windows machine - You can use Cygwin. Or 
download and unzip http://www.steve.org.uk/Software/bash/


2) Make sure the directories containing bash.exe and R.exe are in your 
PATH variable.

3) Start -> Run -> cmd

4) Start R.exe

and now you should have your CTRL-R functionality (along with ls and 
other bash goodies). Yes, I know you asked about Rgui.exe and not 
R.exe. But this is the best I can do.

Er, I don't think you need bash or Cygwin for this, do you? It is not 
normal that the shell has any influence on programs that run under it. 
Plain Rterm in a console should do it, if and only if linked against 
libreadline, which I believe it is.





Re: [R] unix-type commandline keystrokes in the windows RGUI

2008-09-17 Thread Adaikalavan Ramasamy


Why not use a script?

I feel that it is much better than using the history via [CTRL]-R in 
unix, which also pulls up erroneous commands.


A script is vital for statistical analysis and research where you may 
want to or be asked to repeat or reproduce the analysis months later.


Rgui (on windows) has a built in script editor. There are many external 
editors capable of working with R. My recommendation is to use emacs via 
ESS (emacs speaks statistics) which works in most, if not all, operating 
systems and has a Unix feel to it.


If you insist on wanting to use [CTRL]-R like features, then have a look 
at history() within R. You can also try installing Cygwin or bash etc 
and see if that works from the DOS prompt.


Regards, Adai



mfrumin wrote:

Hi all,

I am generally quite fond of the unix commandline keystrokes (e.g. searching
back in your history with [CTRL]-R, and cutting/pasting with [CTRL]-K/Y)
which work in the R commandline in *nix.  Does anyone know if there's any
way to get similar functionality in the Windows RGUI?

I know that as of now, [CTRL]-A and -E do the same as unix (beginning and
end of line) and [CTRL]-Y does a paste, but [CTRL]-K crops from the cursor
to the end of the line but doesn't put the text into the clipboard.  the
most important thing I want is the [CTRL]-R functionality which is so poorly
approximated by pressing the up arrow a million times.

I've searched on the archives and didn't find anything about this.   Any
thoughts?

Thanks,
Mike



Re: [R] unix-type commandline keystrokes in the windows RGUI

2008-09-17 Thread Adaikalavan Ramasamy

Well, I don't see why you need the CTRL-R functionality when you can 
use the search functionality in scripts just as rapidly and efficiently 
(CTRL-F in most applications, CTRL-S in emacs etc).


BTW, I am quite familiar with Unix, Linux and Sun Solaris and what 
CTRL-R does (yes, I used it frequently). Which is why I am able to tell 
you that CTRL-R will pull up all matching commands - even commands that 
had failed! At least in a script environment, you tend to correct failed 
commands. So you know when you search scripts, it will likely be the 
correct command.


To summarize my view, I feel that CTRL-R is appropriate for shell 
operations where one codes on the fly while using a search functionality 
and scripting is appropriate for a scientific programming software.



Anyway, here is how to do what you want:

1) Install bash on your Windows machine - You can use Cygwin. Or 
download and unzip http://www.steve.org.uk/Software/bash/


2) Make sure the directories containing bash.exe and R.exe are in your 
PATH variable.

3) Start -> Run -> cmd

4) Start R.exe

and now you should have your CTRL-R functionality (along with ls and 
other bash goodies). Yes, I know you asked about Rgui.exe and not R.exe. 
But this is the best I can do.


By all means go bother the R developers (most of whom I suspect are on 
the mailing list). I will be interested in what they say.


Regards, Adai



mfrumin wrote:

Adaikalavan, thanks.

Perhaps I was not specific enough in what I want, for those not so
familiar with unix commandline features.  I'm looking for the 'reverse
search' functionality where you hit CTRL-R, then start typing a bit of text
and it finds previous commands with that bit of text, which you just hit
enter to execute.

I already do write tons of code/scripts in R (using Emacs in fact!).  But
one of the great features of R/SPSS/Matlab/etc is that they are interactive
environments.  Thus, I spend lots of time issuing commands as well as
writing code.  I want to be able to search back through those commands as
rapidly and efficiently as you can in the unix (and R unix) commandline.

Another way to think about this is -- the unix commandline environment is a
scripting environment where you can use emacs.  Yet users of unix love the
CTRL-R functionality anyway (they wrote it!).

So, any suggestions to help do what I specifically asked, or should I go
bother the R developers?

thanks,
Mike



Re: [R] difference of two data frames

2008-09-14 Thread Adaikalavan Ramasamy

It would be useful to have indexed both dataframes with a unique 
identifier, such as in rownames etc.


Without that information, you could possibly try to use the same 
approach as duplicated() does by pasting together a character 
representation of rows using | (or any other separator).


   keys1 <- apply(DF1, 1, paste, collapse="|")
   keys1
   [1] "1|a" "2|b" "3|c" "4|d" "5|e" "6|f"
   duplicated(keys1)
   [1] FALSE FALSE FALSE FALSE FALSE FALSE

   keys2 <- apply(DF2, 1, paste, collapse="|")
   keys2
   [1] "1|a" "2|b" "3|c"
   duplicated(keys2)
   [1] FALSE FALSE FALSE

The duplicated() check is necessary to ensure that each generated key is 
truly unique. You might want to experiment and see if you can create a 
unique key using just a few columns.



   keys1 %in% keys2
   [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

   w <- which( !(keys1 %in% keys2) )
   DF1[ w, ]
     V1 V2
   4  4  d
   5  5  e
   6  6  f
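
A variation on the same key idea, sketched under the assumption that both data frames have the same columns in the same order. It builds the keys with do.call(paste, ...) instead of apply(), because apply() first converts the data frame with as.matrix(), which can pad numeric columns with spaces so that keys from two data frames of different size may not match. The helper name row_key is hypothetical.

```r
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6])
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3])

# Hypothetical helper: paste all columns of a data frame into one key per row
row_key <- function(d) do.call(paste, c(d, sep = "|"))

keys1 <- row_key(DF1)
keys2 <- row_key(DF2)

# Keys should be unique within each data frame before they are compared
stopifnot(!anyDuplicated(keys1), !anyDuplicated(keys2))

# Rows of DF1 whose key does not occur in DF2
newDF <- DF1[!(keys1 %in% keys2), ]
newDF
```

The separator should be a character that cannot occur in the data, otherwise two different rows could produce the same key.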

Regards, Adai



joseph wrote:

Hi Jorge
both commands work; 
can you extend it to several columns? The reason I am asking is that in my real data the uniqueness of the rows depends on all the columns; in other words, V1 might have duplicates.

Thanks




- Original Message 
From: Jorge Ivan Velez [EMAIL PROTECTED]
To: joseph [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Sunday, September 14, 2008 10:23:33 AM
Subject: Re: [R] difference of two data frames



Hi Joseph,

Try this:


DF1[!DF1$V1%in%DF2$V1,]

subset(DF1,!V1%in%DF2$V1)


HTH,

Jorge


On Sun, Sep 14, 2008 at 12:49 PM, joseph [EMAIL PROTECTED] wrote:

Hello
I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1:
DF1= data.frame(V1=1:6, V2= letters[1:6])
DF2= data.frame(V1=1:3, V2= letters[1:3])
How do I create a new data frame of the difference between DF1 and DF2
newDF=data.frame(V1=4:6, V2= letters[4:6])
In my real data, the rows are not in order as in the example I provided.
Thanks much
Joseph




Re: [R] Calculate mean/var by ID

2008-09-12 Thread Adaikalavan Ramasamy

AFAIK, tapply() only works for one variable (apart from the grouping 
variable). It might be perhaps better to use split() here:


   df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                    value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                    Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1) )

   df.s <- split( df, df$ID )

   out <- sapply( df.s, function(m){
     c( mu=mean(m$value), var=var(m$value),
        min=min(m$Seg), max=max(m$Seg) ) })
   out <- t(out)
   out
         mu  var min max
   111 4.33 4.33   2   2
   138 6.00 4.67   1   1
   178 5.00 8.00   4   4

You could also have used range() here instead of calculating min and max 
separately but naming the resulting columns becomes a bit tricky.
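
One possible way around the naming issue with range() is setNames(), sketched here on the same data. This is only an illustration of the remark above, not a better method.

```r
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg   = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

out <- sapply(split(df, df$ID), function(m) {
  c(mu  = mean(m$value),
    var = var(m$value),
    # range() returns c(min, max) without names, so attach them explicitly
    setNames(range(m$Seg), c("min", "max")))
})
out <- t(out)   # one row per ID, one column per statistic
out
```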


Regards, Adai

PS: If you do a dput() on a subset of the data, you can get a simple 
reproducible example that other R users can easily read in.




Julia Liu wrote:

Adai,

Thank you so much for your help. I like your code the best. :) So simple. I have another question though, if you don't mind. I'd like to include another variable in res. This variable defines the segmentation of each person (ranging, say, from 1 to 4).

 ID   value   Seg
111 5  2
111 6  2
111 2  2
178 7  4
178 3  4
138 3  1
138 8  1
138 7  1
138 6  1

How to do this? Thank you so much for the help.
Sincerely
Julia

--- On Thu, 9/11/08, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
From: Adaikalavan Ramasamy [EMAIL PROTECTED]
Subject: Re: [R] Calculate mean/var by ID
To: Jorge Ivan Velez [EMAIL PROTECTED]
Cc: liujb [EMAIL PROTECTED], r-help@r-project.org
Date: Thursday, September 11, 2008, 10:28 PM

A slight variation of what Jorge has proposed is:

f <- function(x) c( mu=mean(x), var=var(x) )

do.call( rbind, tapply( df$value, df$ID, f ) )

 mu  var
   111 4.33 4.33
   138 6.00 4.67
   178 5.00 8.00

Regards, Adai



Jorge Ivan Velez wrote:

Dear Julia,
Try also

x=read.table(textConnection("ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6"),header=TRUE)
 closeAllConnections()
attach(x)

do.call(rbind,tapply(value,ID, function(x){
res=c(mean(x,na.rm=TRUE),var(x,na.rm=TRUE))
names(res)=c('Mean','Variance')
res
}
)
)

HTH,

Jorge




On Thu, Sep 11, 2008 at 1:45 PM, liujb [EMAIL PROTECTED] wrote:


Hello,

I have a data set that looks like this.
ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6
.
.
.

I'd like to calculate the mean and var for each object identified by the
ID.
I can in theory just loop through the whole thing..., but is there an easier
way/command which let me calculate the mean/var by ID?

Thanks,
Julia
--
View this message in context:


http://www.nabble.com/Calculate-mean-var-by-ID-tp19440461p19440461.html

Sent from the R help mailing list archive at Nabble.com.


Re: [R] subsetting of factor

2008-09-12 Thread Adaikalavan Ramasamy


help(rowttests) says that fac needs to be a factor. So how about ?

  m <- matrix( rnorm(30), ncol=6 )
  genotype <- c("a", "a", "b", "b", "c", "c")

  w1 <- which( genotype %in% c("a", "b") )
  w2 <- which( genotype %in% c("a", "c") )
  w3 <- which( genotype %in% c("b", "c") )

  list( ab = rowttests( m[ , w1], factor( genotype[w1] ) ),
        ac = rowttests( m[ , w2], factor( genotype[w2] ) ),
        bc = rowttests( m[ , w3], factor( genotype[w3] ) ) )
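
For readers without the genefilter package, the same pairwise subsetting can be sketched in base R. This version swaps rowttests() for a row-wise t.test() with var.equal=TRUE, which is an assumption about what best mirrors it and will be much slower on 6000 genes. combn() generates the three genotype pairs.

```r
set.seed(1)
m <- matrix(rnorm(30), ncol = 6)                 # 5 genes x 6 samples
genotype <- c("a", "a", "b", "b", "c", "c")

# All pairs of genotypes: (a,b), (a,c), (b,c)
pairs <- combn(unique(genotype), 2, simplify = FALSE)

pvals <- lapply(pairs, function(p) {
  w <- which(genotype %in% p)     # columns belonging to this pair
  f <- factor(genotype[w])        # re-create the factor so it has only 2 levels
  apply(m[, w], 1, function(row) t.test(row ~ f, var.equal = TRUE)$p.value)
})
names(pvals) <- sapply(pairs, paste, collapse = "")
pvals
```

Re-creating the factor after subsetting is the key step: it drops the unused third level, so each test sees exactly two groups.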

Regards, Adai



Hui-Yi Chu wrote:

Dear R list,

I think my question maybe easy for you but I really spent entire day to
resolve it.
Say I have a matrix, rows are 6000 genes, columns(1-6) are 3 genotypes
(a,b,c) with 2 repeat.
I have to use two groups each time for t-test, a vs. c or b vs. c, but I
dont know how to write correct codes.
Below is my codes, the last two lines are needed to be corrected

library(genefilter)
ef <- exprs(esetsub)
kk <- factor(esetsub$genotype == c("a", "c"))
tt <- rowttests(ef[,c(1,2,5,6)], kk)

ps.  column 1-6 is a,a,b,b,c,c
  depending on the document, the kk should be a factor..

Any suggestions are really appreciated!!

Best regards,
Hui-Yi


Re: [R] How to load functions in R

2008-09-11 Thread Adaikalavan Ramasamy

I would recommend saving the functions into a separate file and then 
using source() as bartjoosen suggested.


I do not recommend using save() here because the output is non-readable 
(even when using ascii=TRUE option). Which means that you have to load() 
it, then copy-and-paste into an editor before making changes and then 
running it again in R and then save() again.


Another better option is to consider making your own package. It may 
sound complicated but once you mastered it, it makes your functions more 
portable and encourages you to document it. Further, the function 
package.skeleton() simplifies much of it.
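
A minimal sketch of the source() approach. The file is created with tempfile() only to keep the example self-contained, and the two functions square and cube are made up; in practice the file would be an ordinary .R script edited by hand.

```r
# Write two hypothetical functions to a plain-text script
script <- tempfile(fileext = ".R")
writeLines(c("square <- function(x) x^2",
             "cube   <- function(x) x^3"),
           script)

# source() evaluates every expression in the file, so both functions are defined
source(script)

square(3)
cube(2)
```

Because the script stays a readable text file, it can be edited directly and sourced again, which is the advantage over save() and load() mentioned above.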


Regards, Adai



Yihui Xie wrote:

Hi, you may save your functions somewhere on your disk using save()
and load them next time when you want to use them. See ?save and ?load

Yihui

On Thu, Sep 11, 2008 at 9:30 PM,  [EMAIL PROTECTED] wrote:

Hello,

I am trying to use self created functions in other scripts than the one
where they are stored.
For the moment I am using the following structure of commands to do
that:

1. Load the text file with the functions in the current script:
x=parse(path)
2. transform the tex in a function: f1=eval(x[1]), f2=eval(x[2]) if more
than one function is stored in the text file
3. use the functions as normal

Is there another possibility to do the same?
Thank you,

Mihai Mirauta


Re: [R] How to load functions in R

2008-09-11 Thread Adaikalavan Ramasamy


Strange.

source() should read all the functions in that file unless there was a 
syntax error or something else preventing the other functions from being 
parsed correctly. Could you send us a simplified example that reproduces 
this problem?


Thanks.

Regards, Adai



[EMAIL PROTECTED] wrote:
 
Hello,
It seems that all methods work. 
Source() however loads only the last function. with save(a,b,file=path) i can save more than 1 function. 
Thanks a lot,


Mihai
-----Original Message-----
From: Yihui Xie [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 11 September 2008 16:48
To: [EMAIL PROTECTED]
Cc: Mirauta, Mihai; r-help@r-project.org
Subject: Re: [R] How to load functions in R

We may just read them in the R console instead of an external editor, and fix() or 
edit() them when we need to make any modifications. A trivial advantage of saving them 
as an image file in Windows is that you can double-click the file and R will be started with these 
objects loaded automatically. Anyway, to save the functions as ASCII files or even write a package 
are also good solutions :-)

Regards,
Yihui

On Thu, Sep 11, 2008 at 10:34 PM, Adaikalavan Ramasamy [EMAIL PROTECTED] 
wrote:
  
I would recommend saving the functions into a separate file and then 
using source() as bartjoosen suggested.

I do not recommend using save() here because the output is 
non-readable (even when using ascii=TRUE option). Which means that you 
have to load() it, then copy-and-paste into an editor before making 
changes and then running it again in R and then save() again.


Another, better option is to consider making your own package. It may 
sound complicated but once you have mastered it, it makes your functions 
more portable and encourages you to document them. Further, the function 
package.skeleton() simplifies much of it.
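
A minimal sketch of the source() approach recommended above. The file and the two functions f1 and f2 are hypothetical; they are written to a temporary file only so that the example is self-contained.

```r
# Hypothetical file of helper functions, created in a temp location
# so that this sketch runs on its own.
fn_file <- tempfile(fileext = ".R")
writeLines(c(
  "f1 <- function(x) x + 1",
  "f2 <- function(x) x * 2"
), fn_file)

# source() evaluates every top-level expression in the file,
# so both f1 and f2 should be defined afterwards.
source(fn_file)

f1(1)
f2(3)
```

If only the last function appears to be loaded, checking the file for a stray syntax error or an unbalanced brace is a reasonable first step.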


Regards, Adai



Yihui Xie wrote:


Hi, you may save your functions somewhere on your disk using save()
and load them next time when you want to use them. See ?save and 
?load


Yihui

On Thu, Sep 11, 2008 at 9:30 PM,  [EMAIL PROTECTED] wrote:
  

Hello,

I am trying to use self created functions in other scripts than the 
one where they are stored.

For the moment I am using the following structure of commands to do
that:

1. Load the text file with the functions in the current script:
x=parse(path)
2. transform the text into functions: f1=eval(x[1]), f2=eval(x[2]) if 
more than one function is stored in the text file
3. use the functions as normal


Is there another possibility to do the same?
Thank you,

Mihai Mirauta








--
Yihui Xie [EMAIL PROTECTED]
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building, Renmin University of 
China, Beijing, 100872, China





Re: [R] Calculate mean/var by ID

2008-09-11 Thread Adaikalavan Ramasamy


A slight variation of what Jorge has proposed is:

   f <- function(x) c( mu=mean(x), var=var(x) )

   do.call( rbind, tapply( df$value, df$ID, f ) )

         mu  var
   111 4.33 4.33
   138 6.00 4.67
   178 5.00 8.00
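
For completeness, a self-contained sketch of the same idea. It assumes the data from the original question are held in a data frame called df with columns ID and value; both names are assumptions made for illustration.

```r
# Data from the original question, typed in by hand
df <- data.frame(ID    = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6))

# Per-group summary: returns a named vector of length 2
f <- function(x) c(mu = mean(x), var = var(x))

# tapply() gives one summary vector per ID; rbind stacks them into a matrix
res <- do.call(rbind, tapply(df$value, df$ID, f))
round(res, 2)
```

The rows come out in sorted ID order because tapply() treats ID as a factor.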

Regards, Adai



Jorge Ivan Velez wrote:

Dear Julia,
Try also

x=read.table(textConnection("ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6"),header=TRUE)
closeAllConnections()
attach(x)

do.call(rbind,tapply(value,ID, function(x){
res=c(mean(x,na.rm=TRUE),var(x,na.rm=TRUE))
names(res)=c('Mean','Variance')
res
}
)
)

HTH,

Jorge




On Thu, Sep 11, 2008 at 1:45 PM, liujb [EMAIL PROTECTED] wrote:


Hello,

I have a data set that looks like this.
ID value
111 5
111 6
111 2
178 7
178 3
138 3
138 8
138 7
138 6
.
.
.

I'd like to calculate the mean and var for each object identified by the
ID.
I can in theory just loop through the whole thing..., but is there an easier
way/command which lets me calculate the mean/var by ID?

Thanks,
Julia
--
View this message in context:
http://www.nabble.com/Calculate-mean-var-by-ID-tp19440461p19440461.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] give all combinations

2008-09-02 Thread Adaikalavan Ramasamy

Yuan Jian, sending 9 emails within the span of a few seconds, all with 
similar text, is very confusing to say the least!


Carl, look up combinations() and permutations() in the gtools package.


For the two-case scenario, you can use combinations():

   v <- c("a","b","c")

   library(gtools)
   tmp <- combinations(3, 2, v, repeats.allowed=TRUE)
   apply( tmp, 1, paste, collapse="" )
   [1] "aa" "ab" "ac" "bb" "bc" "cc"


For more than two cases, I don't know of an elegant way except to 
generate all possible permutations and then eliminate those with the 
same ingredients. This function will be slow for large numbers!



   multiple.combinations <- function( vec, times ){

     input <- vector( mode="list", times )
     for(i in 1:times) input[[i]] <- vec

     out <- expand.grid( input )
     out <- apply( out, 1, function(x) paste( sort(x), collapse="" ) )
     unique.out <- unique(out)
     return(unique.out)
   }

   multiple.combinations( v, 3 )
   [1] "aaa" "aab" "aac" "abb" "abc" "acc" "bbb" "bbc" "bcc" "ccc"

   multiple.combinations( v, 6 )
    [1] "aaaaaa" "aaaaab" "aaaaac" "aaaabb" "aaaabc" "aaaacc" "aaabbb"
    [8] "aaabbc" "aaabcc" "aaaccc" "aabbbb" "aabbbc" "aabbcc" "aabccc"
   [15] "aacccc" "abbbbb" "abbbbc" "abbbcc" "abbccc" "abcccc" "accccc"
   [22] "bbbbbb" "bbbbbc" "bbbbcc" "bbbccc" "bbcccc" "bccccc" "cccccc"
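
As a sanity check on the brute-force approach, the number of combinations with repetition of size k drawn from n items should equal choose(n + k - 1, k). The sketch below restates the function so that it runs on its own; using rep(list(vec), times) to build the input is a small assumption-free shortcut, not part of the original post.

```r
# Restated so the block is self-contained; same idea as above:
# enumerate all tuples, sort within each tuple, keep unique strings.
multiple.combinations <- function(vec, times) {
  input <- rep(list(vec), times)
  out <- expand.grid(input, stringsAsFactors = FALSE)
  out <- apply(out, 1, function(x) paste(sort(x), collapse = ""))
  unique(out)
}

v <- c("a", "b", "c")
res3 <- multiple.combinations(v, 3)
res6 <- multiple.combinations(v, 6)

# Compare against the closed-form count choose(n + k - 1, k)
length(res3) == choose(3 + 3 - 1, 3)
length(res6) == choose(3 + 6 - 1, 6)
```

Note that expand.grid() builds length(vec)^times rows before the duplicates are dropped, which is why the function becomes slow for large inputs.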

Regards, Adai


Carl Witthoft wrote:

I seem to be missing something here:

given a set X:{a,b,c,whatever...}
the mathematical definition of 'permutation' is the set of all possible 
sequences of the elements of X.
The definition of 'combination' is all elements of 'permutation' which 
cannot be re-ordered to become a different element.


example:  X:{a,b,c}

perm(X) = ab, ac, bc, ba, ca, cb
comb(X) = ab, ac, bc


So maybe a better question for this mailing list is:  Are there 
functions available in any R package which produce perm() and comb() 
(perhaps as well as other standard combinatoric functions) ?


Carl


Re: [R] subsetting the gene list

2008-09-02 Thread Adaikalavan Ramasamy

Have you tried reading some of the material from the BioConductor 
workshop http://bioconductor.org/workshops/ ?


Here is a simplistic way of proceeding:

 ## Calculate p-values from a t-test, one gene (row) at a time
 p <- apply( mat, 1, function(x) t.test( x ~ cl )$p.value )

 ## Subset to the genes below the chosen cutoff (e.g. 0.001)
 mat.sub <- mat[ p < 0.001, ]

 ## Cluster
 heatmap(mat.sub)
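
A self-contained sketch of the same steps on simulated data. Everything about the data here is an assumption made for illustration: mat is a hypothetical 200 x 10 expression matrix, cl a hypothetical two-group factor, and the first ten genes are shifted artificially so that something passes the cutoff.

```r
set.seed(1)

# Hypothetical expression matrix: 200 genes (rows) x 10 samples (columns)
mat <- matrix(rnorm(200 * 10), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("s", 1:10)))

# Hypothetical class labels: two groups of five samples
cl <- factor(rep(c("A", "B"), each = 5))

# Make the first ten genes clearly differential between the groups
mat[1:10, cl == "B"] <- mat[1:10, cl == "B"] + 8

# Row-wise t-test p-values
p <- apply(mat, 1, function(x) t.test(x ~ cl)$p.value)

# Keep genes below the cutoff; drop = FALSE keeps the matrix shape
mat.sub <- mat[p < 0.001, , drop = FALSE]

# heatmap() needs at least two rows and two columns
if (nrow(mat.sub) >= 2) heatmap(mat.sub)
```

This is only the simplistic route; it ignores multiple testing, which matters when thousands of genes are tested.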

Regards, Adai




Abhilash Venu wrote:

Hi all,

I am working on single-color expression data using limma. I would like to
perform a cluster analysis after selecting the differentially expressed genes based on
the P value (say 0.001). As far as I know, I have to subset the normalized
data (MA) to these selected genes to
retrieve the distribution across the samples.
But I am wondering whether I can do this with an R script.
I would appreciate any help.




Re: [R] Maintaining repeated ID numbers when transposing with reshape

2008-08-28 Thread Adaikalavan Ramasamy

Not the prettiest code but it returns what you want. Might be slow for 
large dataframes.


df <- data.frame( ID=c(1,1,1,1,2,2),
                  TEST=factor(c("A","A","B","C","B","B")),
                  RESULT=c(17,12,15,12,8,9) )

big.out <- list(NULL)

for( uID in unique(df$ID) ){
  ## rows for this ID (assumed sorted by TEST, as in the example)
  m <- df[ df$ID == uID, , drop=FALSE ]
  ## running index 1, 2, ... within each TEST
  run.order <- unlist(sapply( table(m$TEST), function(x) if(x > 0) 1:x) )
  m <- cbind( m, run.order=run.order )

  nr <- max(run.order)
  out <- matrix( nrow=nr, ncol=nlevels(m$TEST),
                 dimnames=list( rep(uID, nr), levels(m$TEST) ))

  for(i in 1:nrow(m)) out[ m$run.order[i], as.character(m$TEST[i]) ] <- m$RESULT[i]
  big.out[[uID]] <- out
}

do.call( rbind, big.out )

   A  B  C
1 17 15 12
1 12 NA NA
2 NA  8 NA
2 NA  9 NA
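
An alternative sketch, not taken from the thread: number the repeats within each ID/TEST pair with ave(), then let reshape() use that counter as part of the row identifier. The column name run is an assumption introduced for illustration.

```r
df <- data.frame(ID     = c(1, 1, 1, 1, 2, 2),
                 TEST   = c("A", "A", "B", "C", "B", "B"),
                 RESULT = c(17, 12, 15, 12, 8, 9))

# Occurrence number of each result within its ID/TEST pair: 1, 2, ...
df$run <- ave(df$RESULT, df$ID, df$TEST, FUN = seq_along)

# One output row per (ID, run); one RESULT column per TEST
wide <- reshape(df, v.names = "RESULT", idvar = c("ID", "run"),
                timevar = "TEST", direction = "wide")
wide <- wide[order(wide$ID, wide$run), ]
wide
```

Because the repeat counter is part of idvar, repeated tests land in separate rows instead of being dropped, and missing combinations are filled with NA.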


Regards, Adai


jcarmichael wrote:

Thank you for your suggestion, I will play around with it. I guess my concern
is that I need each test result to occupy its own cell rather than have
one or more in the same row.


Adaikalavan Ramasamy-2 wrote:
There might be a more elegant way of doing this but here is a way of 
doing it without reshape().


df <- data.frame( ID=c(1,1,1,1,2,2),
                  TEST=factor(c("A","A","B","C","B","B")),
                  RESULT=c(17,12,15,12,8,9) )

df.s <- split( df, df$ID )

out  <- sapply( df.s, function(m)
                tapply( m$RESULT, m$TEST, paste, collapse="," ) )

t(out)

  A       B     C   
1 "17,12" "15"  "12"
2 NA      "8,9" NA  

Not the same output as you wanted. This makes more sense unless you have 
a reason to prioritize 17 instead of 12 in the first row.


Regards, Adai


jcarmichael wrote:

I have a dataset in long format that looks something like this:

ID   TEST   RESULT
1    A      17
1    A      12
1    B      15
1    C      12
2    B       8
2    B       9

Now what I would like to do is transpose it like so:

ID   TEST A   TEST B   TEST C
1    17       15       12
1    12       .        .
2    .        8        .
2    .        9        .

When I try:

reshape(mydata, v.names="result", idvar="id", timevar="test",
direction="wide")

It gives me only the first occurrence of each test for each subject.  How
can I transpose my dataset in this way without losing information about
repeated tests?

Any help or guidance would be appreciated!  Thanks!


Re: [R] Cluster

2008-08-28 Thread Adaikalavan Ramasamy

Try reading help(hclust) and help(matplot) and run the examples given in 
the documentation. If that doesn't work, try posting again with a simple 
reproducible example.
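
A rough sketch of how hclust() and matplot() might be combined for this kind of data. The matrix x is simulated as a stand-in for the real table, and the choice of three clusters is arbitrary; both are assumptions made for illustration.

```r
set.seed(42)

# Hypothetical stand-in for the real data: 30 rows, 3 ratio columns
x <- matrix(runif(30 * 3, min = 0.4, max = 1.2), ncol = 3,
            dimnames = list(NULL, c("115/114", "116/114", "117/114")))

# Hierarchical clustering of the rows, cut into 3 groups (arbitrary choice)
hc  <- hclust(dist(x))
grp <- cutree(hc, k = 3)

# matplot() draws one line per column, so transpose to get one line per row;
# col = grp colors each line by its cluster membership
matplot(t(x), type = "l", lty = 1, col = grp,
        xaxt = "n", xlab = "Ratio", ylab = "Value")
axis(1, at = 1:3, labels = colnames(x))
```

The number of clusters and the distance measure would need to be chosen with the real data in mind.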


Regards, Adai



Marco Chiapello wrote:

Hi all,
I'm trying to do a cluster analysis, but I don't know if it's possible in
the way that I want.
I have a data set like the following:

115/114   116/114   117/114
0.45      0.72      0.41
1.16      0.63      0.91
0.42      0.94      0.61

My real data set is just a bit bigger: 610 entries.
I want to plot each row on the same graph, as a line (see the attached
file). Then, if possible, I want to perform a cluster analysis. The
final perfect result would be a graph with many lines, with the lines of
each cluster in the same color.
Any advice?
Marco





Re: [R] Maintaining repeated ID numbers when transposing with reshape

2008-08-25 Thread Adaikalavan Ramasamy

There might be a more elegant way of doing this but here is a way of 
doing it without reshape().


   df <- data.frame( ID=c(1,1,1,1,2,2),
                     TEST=factor(c("A","A","B","C","B","B")),
                     RESULT=c(17,12,15,12,8,9) )

   df.s <- split( df, df$ID )

   out  <- sapply( df.s, function(m)
                   tapply( m$RESULT, m$TEST, paste, collapse="," ) )

   t(out)

     A       B     C   
   1 "17,12" "15"  "12"
   2 NA      "8,9" NA  

Not the same output as you wanted. This makes more sense unless you have 
a reason to prioritize 17 instead of 12 in the first row.


Regards, Adai


jcarmichael wrote:

I have a dataset in long format that looks something like this:

ID   TEST   RESULT
1    A      17
1    A      12
1    B      15
1    C      12
2    B       8
2    B       9

Now what I would like to do is transpose it like so:

ID   TEST A   TEST B   TEST C
1    17       15       12
1    12       .        .
2    .        8        .
2    .        9        .

When I try:

reshape(mydata, v.names="result", idvar="id", timevar="test",
direction="wide")

It gives me only the first occurrence of each test for each subject.  How
can I transpose my dataset in this way without losing information about
repeated tests?

Any help or guidance would be appreciated!  Thanks!



Re: [R] howto optimize operations between pairs of rows in a single matrix like cor and pairs

2008-08-25 Thread Adaikalavan Ramasamy

Thank you to Jim and Moshe. I will try the Rprof option as well as 
running the function on columns instead. Thank you.
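
One possible sketch along these lines, not taken from the thread: convert the input to a matrix once (row indexing of a data frame is comparatively slow), enumerate the i < j pairs with combn(), and fill the result by matrix indexing. The name pairwise.apply2 is hypothetical, and the sketch assumes all columns are numeric.

```r
pairwise.apply2 <- function(x, FUN, ...) {
  x <- as.matrix(x)                 # assumes all columns are numeric
  n <- nrow(x)
  r <- rownames(x)
  output <- matrix(NA, nrow = n, ncol = n, dimnames = list(r, r))

  idx  <- combn(n, 2)               # 2 x choose(n, 2) matrix of pairs with i < j
  vals <- apply(idx, 2, function(ij) FUN(x[ij[1], ], x[ij[2], ], ...))

  output[t(idx)] <- vals            # fill the upper triangle in one step
  output
}

m   <- iris[1:5, 1:4]
res <- pairwise.apply2(m, sum)
res
```

The apply() call is still a loop internally, so any speed-up would likely come from the matrix conversion and from skipping the i >= j pairs; if most of the time is spent inside FUN, as Rprof would show, this will not help much.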




jim holtman wrote:

Use Rprof to see where time is being spent.  If it is in FUN, then
there is probably no way to optimize outside of changing the way FUN
works.  So the first thing is to decide where time is being spent.

On Sun, Aug 24, 2008 at 6:35 PM, Adaikalavan Ramasamy
[EMAIL PROTECTED] wrote:

Hi,

I am calculating the output of a function when applied to pairs of rows from a
single matrix or dataframe, similar to how cor() and pairs() work. This is
the code that I have been using:

  pairwise.apply <- function(x, FUN, ...){

    n <- nrow(x)
    r <- rownames(x)
    output <- matrix(NA, nc=n, nr=n, dimnames=list(r, r))

    for(i in 1:n){
      for(j in 1:n){
        if(i >= j) next
        output[i, j] <- FUN( x[i,], x[j,] )
      }
    }
    return(output)
  }

I realize that the output of the pairwise operation needs to be scalar. Here
is an example. The actual function and dataset I want to use are more
complicated and thus the function runs slowly for large datasets.

  m <- iris[ 1:5, 1:4 ]

  pairwise.apply(m, sum)
     1    2    3    4    5
  1 NA 19.7 19.6 19.6 20.4
  2 NA   NA 18.9 18.9 19.7
  3 NA   NA   NA 18.8 19.6
  4 NA   NA   NA   NA 19.6
  5 NA   NA   NA   NA   NA

Can I use apply() or any of its family to optimize the code? I have tried
playing around with outer, kronecker, mapply without any success.

Any suggestions? Thank you.

Regards, Adai


[R] howto optimize operations between pairs of rows in a single matrix like cor and pairs

2008-08-24 Thread Adaikalavan Ramasamy


Hi,

I am calculating the output of a function when applied to pairs of rows from 
a single matrix or dataframe, similar to how cor() and pairs() work. This 
is the code that I have been using:

   pairwise.apply <- function(x, FUN, ...){

     n <- nrow(x)
     r <- rownames(x)
     output <- matrix(NA, nc=n, nr=n, dimnames=list(r, r))

     for(i in 1:n){
       for(j in 1:n){
         if(i >= j) next
         output[i, j] <- FUN( x[i,], x[j,] )
       }
     }
     return(output)
   }

I realize that the output of the pairwise operation needs to be scalar. 
Here is an example. The actual function and dataset I want to use are 
more complicated and thus the function runs slowly for large datasets.

   m <- iris[ 1:5, 1:4 ]

   pairwise.apply(m, sum)
      1    2    3    4    5
   1 NA 19.7 19.6 19.6 20.4
   2 NA   NA 18.9 18.9 19.7
   3 NA   NA   NA 18.8 19.6
   4 NA   NA   NA   NA 19.6
   5 NA   NA   NA   NA   NA

Can I use apply() or any of its family to optimize the code? I have 
tried playing around with outer, kronecker, mapply without any success.


Any suggestions? Thank you.

Regards, Adai
