date:20110413

Re: [R] is this an ANOVA ?

2011-04-13 Thread Lao Meng

You may try the following to perform anova:
 anova(lm(y~x))
or
summary(aov(y~x))




2011/4/13 Ubuntu Diego ubuntu.di...@gmail.com

 Hi all,
I have a very easy questions (I hope). I had measure a property of
 plants, growing in three different substrates (A, B and C). The rest of the
 conditions remained constant. There was very high variation on the results.
I want to do address, whether there is any difference in the
 response (my measurement) from substrate to substrate?

 x-c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C') #
 Substrate type
 y - c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) # Results of the measurement
 MD-data.frame(x,y)

I wrote a linear model for this:

 summary(lm(y~x,data=MD))

This is the output:

 Call:
 lm(formula = y ~ x, data = MD)

 Residuals:
   Min 1Q Median 3QMax
 -2.000e+00 -1.000e+00  5.551e-17  1.000e+00  2.000e+00

 Coefficients:
Estimate Std. Error t value Pr(|t|)
 (Intercept)   3. 0.7071   4.243 0.001142 **
 xB5. 1.   5.000 0.000309 ***
 xC   10. 1.  10.000 3.58e-07 ***
 ---
 Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

 Residual standard error: 1.581 on 12 degrees of freedom
 Multiple R-squared: 0.8929, Adjusted R-squared: 0.875
 F-statistic:50 on 2 and 12 DF,  p-value: 1.513e-06

I conclude that there is an effect of substrate type (x).
NOW the questions :
1) Do the fact that the all p-values are significant means
 that all the groups are different from each other ?
2) Is there a (easy) way to plot,  mean plus/minus 2*sd for
 each substrate type ? (with asterisks denoting significant differences ?)


THANKS !

 version
 platform   x86_64-apple-darwin9.8.0
 arch   x86_64
 os darwin9.8.0
 system x86_64, darwin9.8.0
 status
 major  2
 minor  11.1
 year   2010
 month  05
 day31
 svn rev52157
 language   R
 version.string R version 2.11.1 (2010-05-31)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Dump the source code of data frame

2011-04-13 Thread C.H.

Dear R experts,

I remember a similar function existed and have been mentioned in
R-help before. I tried my best to search but I really can't find it
out.

suppose I have an data frame like this:

 somedata - data.frame(age.min = 1, age.max = 1.5, male = TRUE, l = -1.013, 
 m=16.133, s=0.07656)

In order to back up the data and I don't want to use write.table(), I
would like to back up the source code of the data frame. When I
apply that function (let's call it dumpdf() ), the function will
reproduce the source code that creates the data.frame. For example:

 dumpdf(somedata)
somedata - data.frame(age.min = 1, age.max = 1.5, male = TRUE, l =
-1.013, m=16.133, s=0.07656)

Is there any function similar to the dumpdf() above?

Thank you so much!

Regards,

CH

-- 
CH Chan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Dump the source code of data frame

2011-04-13 Thread Jorge Ivan Velez

Hi CH,

Take a look at  ?dput

HTH,
Jorge


On Wed, Apr 13, 2011 at 3:09 AM, C.H.  wrote:

 Dear R experts,

 I remember a similar function existed and have been mentioned in
 R-help before. I tried my best to search but I really can't find it
 out.

 suppose I have an data frame like this:

  somedata - data.frame(age.min = 1, age.max = 1.5, male = TRUE, l =
 -1.013, m=16.133, s=0.07656)

 In order to back up the data and I don't want to use write.table(), I
 would like to back up the source code of the data frame. When I
 apply that function (let's call it dumpdf() ), the function will
 reproduce the source code that creates the data.frame. For example:

  dumpdf(somedata)
 somedata - data.frame(age.min = 1, age.max = 1.5, male = TRUE, l =
 -1.013, m=16.133, s=0.07656)

 Is there any function similar to the dumpdf() above?

 Thank you so much!

 Regards,

 CH

 --
 CH Chan

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Compatibility with Work Load/Resource Managers

2011-04-13 Thread Philipp Pagel

 I was wondering if anyone knew whether R is capable of integrating
 with the following work load/resource managers TORQUE, OpenPBS, PBS
 Pro, LSF, and SGE? 

I am running R scripts in our cluster under SGE on a regular basis and
have also done that under Platform LSF in the past but I am not sure
what you mean by integrating with these systems.

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] forest + igraph ?

2011-04-13 Thread Viechtbauer Wolfgang (STAT)

Sure. Here is an example:

library(metafor)
data(dat.bcg)
windows(height=8, width=6, pointsize=10)
par(mfrow=c(2,1))
dat - escalc(measure=RR, ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res - rma(yi, vi, data=dat)
forest(res, atransf=exp)
title(Forest Plot of Relative Risks)
dat - escalc(measure=OR, ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res - rma(yi, vi, data=dat)
forest(res, atransf=exp)
title(Forest Plot of Odds Ratios)

Best,

-- 
Wolfgang Viechtbauer 
Department of Psychiatry and Neuropsychology 
School for Mental Health and Neuroscience 
Maastricht University, P.O. Box 616 
6200 MD Maastricht, The Netherlands 
Tel: +31 (43) 368-5248 
Fax: +31 (43) 368-8689 
Web: http://www.wvbauer.com 

 -Original Message-
 From: Samor Gandhi [mailto:samorgan...@yahoo.com]
 Sent: Tuesday, April 12, 2011 15:59
 To: r-h...@stat.math.ethz.ch; Viechtbauer Wolfgang (STAT)
 Subject: RE: [R] forest + igraph ?
 
 Thanks, That is a nice one. Is there any option that I can plot the pooled
 estimate?
 
 Regards,
 Samor
 
 --- On Tue, 12/4/11, Viechtbauer Wolfgang (STAT)
 wolfgang.viechtba...@maastrichtuniversity.nl wrote:
 
 From: Viechtbauer Wolfgang (STAT)
 wolfgang.viechtba...@maastrichtuniversity.nl
 Subject: RE: [R] forest + igraph ?
 To: Samor Gandhi samorgan...@yahoo.com, r-h...@stat.math.ethz.ch r-
 h...@stat.math.ethz.ch
 Date: Tuesday, 12 April, 2011, 15:54
 You said that you do NOT want to use par(mfrow=c(2,1)). Why not?
 
 Isn't this (below) what you want?
 
 library(metafor)
 data(dat.bcg)
 
 windows(height=8, width=6, pointsize=10)
 par(mfrow=c(2,1))
 dat - escalc(measure=RR, ai=tpos, bi=tneg, ci=cpos, di=cneg,
 data=dat.bcg)
 forest(dat$yi, dat$vi, atransf=exp)
 title(Forest Plot of Relative Risks)
 dat - escalc(measure=OR, ai=tpos, bi=tneg, ci=cpos, di=cneg,
 data=dat.bcg)
 forest(dat$yi, dat$vi, atransf=exp)
 title(Forest Plot of Odds Ratios)
 
 Best,
 
 --
 Wolfgang Viechtbauer
 Department of Psychiatry and Neuropsychology
 School for Mental Health and Neuroscience
 Maastricht University, P.O. Box 616
 6200 MD Maastricht, The Netherlands
 Tel: +31 (43) 368-5248
 Fax: +31 (43) 368-8689
 Web: http://www.wvbauer.com
 
 
  -Original Message-
  From: Samor Gandhi [mailto:samorgan...@yahoo.com]
  Sent: Tuesday, April 12, 2011 11:52
  To: r-h...@stat.math.ethz.ch; Viechtbauer Wolfgang (STAT)
  Subject: RE: [R] forest + igraph ?
 
  Thank you for your reply. I would like to have two forest plots one on
 top
  and the other on the bottom. I am using R version 2.12.2 (32-bit)
 Windows.
  The code you sent me still plotting two windows one after the other?
 
  Best wishes and many thanks,
  Samor
 
  --- On Tue, 12/4/11, Viechtbauer Wolfgang (STAT)
  wolfgang.viechtba...@maastrichtuniversity.nl wrote:
 
  From: Viechtbauer Wolfgang (STAT)
  wolfgang.viechtba...@maastrichtuniversity.nl
  Subject: RE: [R] forest + igraph ?
  To: r-h...@stat.math.ethz.ch r-h...@stat.math.ethz.ch
  Cc: Samor Gandhi samorgan...@yahoo.com
  Date: Tuesday, 12 April, 2011, 13:42
  It is not clear (at least to me) what exactly you want. You want two
  forest plots in one graph but apparently not side-by-side or one on
 top
  and the other on the bottom. So, you want to superimpose them? How do
 you
  want to do that without creating an illegible mess? Or do you want one
  graph, where you can scroll through various plots? Then try this:
 
  library(metafor)
  data(dat.bcg)
 
  if (interactive()) {
     windows(record=TRUE)
     dat - escalc(measure=RR, ai=tpos, bi=tneg, ci=cpos, di=cneg,
  data=dat.bcg)
     forest(dat$yi, dat$vi)
     dat - escalc(measure=OR, ai=tpos, bi=tneg, ci=cpos, di=cneg,
  data=dat.bcg)
     forest(dat$yi, dat$vi)
  }
 
  and then use PageUp and PageDown to switch between the figures.
 
  (I don't know what OS you are using, so windows() may not work).
 
  Best,
 
  --
  Wolfgang Viechtbauer
  Department of Psychiatry and Neuropsychology
  School for Mental Health and Neuroscience
  Maastricht University, P.O. Box 616
  6200 MD Maastricht, The Netherlands
  Tel: +31 (43) 368-5248
  Fax: +31 (43) 368-8689
  Web: http://www.wvbauer.com
 
   -Original Message-
   From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org]
   On Behalf Of Samor Gandhi
   Sent: Monday, April 11, 2011 18:25
   To: r-h...@stat.math.ethz.ch
   Subject: [R] forest + igraph ?
  
   Hello,
  
   Is it possible to have two meta-plots in one graph (not
  par(mfrow=c(2,1))?
   But somthing like
  
    library(metafor)
    library(igraph)
  
    if (interactive()) {
       forest(dat.Treat$RR, ci.lb=dat.Treat$lower, ci.ub=dat.Treat$upper,
   xlab=Relative Risk,slab=dat.Treat$ID,refline=1)
       forest(dat.Control$RR, ci.lb=dat.Control$lower,
   ci.ub=dat.Control$upper, xlab=Relative
   Risk,slab=dat.Control$ID,refline=1)
     }
  
   i.e. both metaplots on the same graph!
  
   Regards,
   Samor

__
R-help@r-project.org mailing list

[R] R 2.13.0 is released

2011-04-13 Thread Peter Dalgaard

I've rolled up R-2.13.0.tar.gz a short while ago. This is a development
release which contains a number of new features.

Also, a number of mostly minor bugs have been fixed (but notice that serious 
build issues were fixed in 2.12.2). See the full list
of changes below.

You can get it from

http://cran.r-project.org/src/base/R-2/R-2.13.0.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you.

Binaries for various platforms will appear in due course.

 For the R Core Team

 Peter Dalgaard

These are the md5sums for the freshly created files, in case you wish
to check that they are uncorrupted:

MD5 (AUTHORS) = ac9746b4845ae81f51cfc99262f5
MD5 (COPYING) = eb723b61539feef013de476e68b5c50a
MD5 (COPYING.LIB) = a6f89e2100d9b6cdffcea4f398e37343
MD5 (FAQ) = 3cbcd5d33708d03431cd13f695d1bcb0
MD5 (INSTALL) = 70447ae7f2c35233d3065b004aa4f331
MD5 (NEWS) = de39d09adf2c02e573722507c610d0f6
MD5 (ONEWS) = 0c3e10eef74439786e5fceddd06dac71
MD5 (OONEWS) = b0d650eba25fc5664980528c147a20db
MD5 (R-latest.tar.gz) = ecfb928067cfd932e75135f8b8bba3e7
MD5 (README) = 296871fcf14f49787910c57b92655c76
MD5 (RESOURCES) = 020479f381d5f9038dcb18708997f5da
MD5 (THANKS) = 03d783ff71270c77d497fd573d711fd8
MD5 (R-2/R-2.13.0.tar.gz) = ecfb928067cfd932e75135f8b8bba3e7


This is the relevant part of the NEWS file:

R News

CHANGES IN R VERSION 2.13.0:

  SIGNIFICANT USER-VISIBLE CHANGES:

• replicate() (by default) and vapply() (always) now return a
  higher-dimensional array instead of a matrix in the case where
  the inner function value is an array of dimension = 2.

• Printing and formatting of floating point numbers is now using
  the correct number of digits, where it previously rarely differed
  by a few digits. (See “scientific” entry below.)  This affects
  _many_ *.Rout.save checks in packages.

  NEW FEATURES:

• normalizePath() has been moved to the base package (from utils):
  this is so it can be used by library() and friends.

  It now does tilde expansion.

  It gains new arguments winslash (to select the separator on
  Windows) and mustWork to control the action if a canonical path
  cannot be found.

• The previously barely documented limit of 256 bytes on a symbol
  name has been raised to 10,000 bytes (a sanity check).  Long
  symbol names can sometimes occur when deparsing expressions (for
  example, in model.frame).

• reformulate() gains a intercept argument.

• cmdscale(add = FALSE) now uses the more common definition that
  there is a representation in n-1 or less dimensions, and only
  dimensions corresponding to positive eigenvalues are used.
  (Avoids confusion such as PR#14397.)

• Names used by c(), unlist(), cbind() and rbind() are marked with
  an encoding when this can be ascertained.

• R colours are now defined to refer to the sRGB color space.

  The PDF, PostScript, and Quartz graphics devices record this
  fact.  X11 (and Cairo) and Windows just assume that your screen
  conforms.

• system.file() gains a mustWork argument (suggestion of Bill
  Dunlap).

• new.env(hash = TRUE) is now the default.

• list2env(envir = NULL) defaults to hashing (with a suitably sized
  environment) for lists of more than 100 elements.

• text() gains a formula method.

• IQR() now has a type argument which is passed to quantile().

• as.vector(), as.double() etc duplicate less when they leave the
  mode unchanged but remove attributes.

  as.vector(mode = any) no longer duplicates when it does not
  remove attributes.  This helps memory usage in matrix() and
  array().

  matrix() duplicates less if data is an atomic vector with
  attributes such as names (but no class).

  dim(x) - NULL duplicates less if x has neither dimensions nor
  names (since this operation removes names and dimnames).

• setRepositories() gains an addURLs argument.

• chisq.test() now also returns a stdres component, for
  standardized residuals (which have unit variance, unlike the
  Pearson residuals).

• write.table() and friends gain a fileEncoding argument, to
  simplify writing files for use on other OSes (e.g. a spreadsheet
  intended for Windows or Mac OS X Excel).

• Assignment expressions of the form foo::bar(x) - y and
  foo:::bar(x) - y now work; the replacement functions used are
  foo::`bar-` and foo:::`bar-`.

• Sys.getenv() gains a names argument so Sys.getenv(x, names =
  FALSE) can replace the common idiom of as.vector(Sys.getenv()).
  The default has been changed to not name a length-one result.

• Lazy loading of environments now preserves attributes and locked
  status. (The locked status of bindings and active bindings are
  still not preserved; this may be addressed in the future).

• options(install.lock) may be set to FALSE so that
  install.packages()

[R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Matthieu Stigler


Hi

We are about to publish a book, which contains figures made with R 
plots. An important detail that we did not take into account is that the 
book will not be printed in 4 colors (cmyk mode), but only 2 (black 
+spotcolor). The spotcolor we use is part of the big Pantone family.


The problem is that both pdf() and postscript() offer either rgb or 
cmyk, but no spotcolors such as pantone. I'm afraid this constraint 
can't be solved at all, and we can't use R for creating these plots? I 
did not find any package that would extend the colormodel to include 
spot colors... Did anyone had a similar experience?


Thanks!!

Matthieu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] MLE where loglikelihood function is a function of numerical solutions

2011-04-13 Thread Kristian Lind

Albyn and others,

Thank you for your replies.
In order to be more specific I've constructed my program. I know it's long
and in some places quite messy. It works until the last part where the
log-likelihood function has to be defined and maximized wrt the parameters.

The log-likelihood has the form L = 1/T*sum_t=0^T[log(transdens(v_t,
r_t))-log(Jac(r_t,v_t))], where the functions transdens(v_t, r_t) and
Jac(r_t,v_t) are defined below.

The problem remains the same, how do I construct the log-likelihood function
such that the numerical procedures employed are updated in the maximization
procedure?

I was thinking something along the lines of

minuslogLik - function(x,x2)
{f - rep(NA, length(x))
for(i in 1:T)
{
f[1] - -1/T*sum(log(transdens(parameters = parameters, x =
c(v[i],v[i+1])))-log(Jac(outmat=outmat, x2=c(v[i],r[i])))
}
f
 }

Thank you in advance.

Kristian

--Program starts
here--
# Solving ODEs
parameters - c(K_vv- 0.0047,
K_rv--0.0268,
K_rr-0.3384,
theta_v - 50,
theta_r -5.68,
Sigma_rv-0.0436,
Sigma_rr-0.1145,
lambda_v-0,
lambda_r--0.0764,
 k_v-K_vv*theta_v,
k_r- K_rv*theta_v+K_rr*theta_r,
alpha_r - 0,
B_rv- (Sigma_rr+Sigma_rv)^2)
#declaring state variables i.e. the functions and their intial conditions
state - c(b_1 = 0,
b_2 = 0,
a = 0)
#delclaring function and system of equations.
Kristian - function(t, state, parameters)
{
with(as.list(c(state, parameters)),
{
db_1 =
-((K_vv+lambda_v)*b_1+(K_rv+Sigma_rv*lambda_v+Sigma_rr*lambda_r)*b_2+0.5*(b_1)^2+Sigma_rv*b_1*b_2+0.5*((Sigma_rv)^2+(Sigma_rr)^2)*(b_2)^2
)
db_2 = -K_rr*b_2+1
da = K_vv*theta_v*b_1+(K_rv*theta_v+K_rr*theta_r)*b_2
list(c(db_1, db_2, da))
})
}
# time making a sequence from t to T evaluated at each delta seq(t, T, by =
delta)
times - seq(0, 10, by = 0.5)

#solving the model using the deSolve function ode
library(deSolve)
outmat - ode(y = state, times = times, func = Kristian, parms = parameters)
#print(outmat)
#Solving 2 equations in 2 unknowns using nleslv
# simulating data for testing
c2 - matrix(data = rnorm(20, mean =0, sd =1)+2, nrow = 20, ncol =1)
c10 - matrix(data = rnorm(20, mean =0, sd =1)+5, nrow = 20, ncol =1)
besselinput - matrix(data = 0, nrow =20, ncol = 1)
vncChi - matrix(data = 0, nrow =20, ncol = 1)
v - matrix(data = 0, nrow =20, ncol = 1)
r - matrix(data = 0, nrow =20, ncol = 1)
for(i in 1:20)
{
Bo - function(x, s, outmat)
{
f - rep(NA, length(x))
z - - outmat[,2:4] %*% c(x[2],x[1],1)
f[1] - (1-exp(z[4]))/sum(exp(z[1:4])) - s[1]
f[2] - (1-exp(z[20]))/sum(exp(z)) - s[2]
f
}
s - c(c2[i],c10[i] )
p - c(50, 5)
# loading nleqslv package
library(nleqslv)
ans.nlq - nleqslv(x=p, fn=Bo, s=s, outmat=outmat, control = list(matxit
=10))
v[i] - ans.nlq$x[1]
r[i] - ans.nlq$x[2]
#print(ans.nlq$termcd)
#ans.nlq$fvec
}
#calculating transition density as a function of parameters
transdens - function(parameters, x)
{
delta -1
c - 2*K_vv*(1-exp(-K_vv*delta))^(-1) #2.004704
q - 2*k_v-1 #0.00959666
f - rep(NA, 1)
#besselinput[2] - 2*c*(x[2]*x[1]*exp(-K_vv*delta))^(0.5)
f[1]-
c*exp(c*(x[2]+exp(-K_vv*delta)*x[1]))*(x[2]/(exp(-K_vv*delta)*x[1]))^(q/2)*besselI(2*c*(x[2]*x[1]*exp(-K_vv*delta))^(0.5),
q, expon.scaled = FALSE)
f
}
vncChi - transdens(parameters = parameters, x=c(v[1],v[2]))
print(vncChi)
#calculating Determinant of Jacobian as a function of outmat.

Jac - function(outmat, x2){
f - rep(NA, 1)
y - - outmat[,2:4] %*% c(x2[1],x2[2],1)
w1 - outmat[,2]*exp(-outmat[,2:4] %*% c(x2[1],x2[2],1))
w2 - outmat[,3]*exp(-outmat[,2:4] %*% c(x2[1],x2[2],1))
f[1] - abs((outmat[5,2]*exp(y[4])/(sum(exp(y[1:4])))+
(1-exp(y[4]))*sum(w1[1:4])/(sum(exp(y[1:4])))^2)*(outmat[21,3]
*exp(y[21])/(sum(exp(y)))+(1-exp(y[21]))*sum(w2)/
(sum(exp(y)))^2)-(outmat[21,2]*exp(y[21])/(sum(exp(y)))
+(1-exp(y[21]))*sum(w1)/(sum(exp(y)))^2)*(outmat[5,3]
*exp(y[4])/(sum(exp(y[1:4])))+(1-exp(y[4]))*sum(w2[1:4])
/(sum(exp(y[1:4])))^2))

f
}

#--Program works as it is
supposed to until here

#Maximum likelihood estimation using mle package
library(stats4)
#defining loglikelighood function
#T - length(v)
#minuslogLik - function(x,x2)
#{f - rep(NA, length(x))
#for(i in 1:T)
#{
#f[1] - -1/T*sum(log(transdens(parameters = parameters, x =
c(v[i],v[i+1])))-log(Jac(outmat=outmat, x2=c(v[i],r[i])))
#}
#f
 }


2011/4/10 Albyn Jones jo...@reed.edu

 to clarify: by  if you knew that LL(psi+eps) were well

[R] strategy for writing out file with lines header initiated with comment sign

2011-04-13 Thread Mao Jianfeng

Dear all,

I have data.frame object in R. I want to export it in tab-delimited
file with several lines of header initiated with comment sign (#). I
do not know how to do that in R. Could you please give helps on this
problem?

Thanks in advance.

Best,
Jian-Feng,

##
The lines I want to write in the header lines look like, with words in
the last line (here the line #CHROM POS IDREF ALTQUAL
FILTER INFO  FORMAT  NA1) be
separated by tab :


##fileformat=VCFv4.1
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=file:///seq/references/1000GenomesPilot-NCBI36.fasta
##contig=ID=20,length=62435964,assembly=B36,md5=f126cdf8a6e0c7f379d618ff66beb2da,species=Homo
sapiens,taxonomy=x
##phasing=partial
##INFO=ID=NS,Number=1,Type=Integer,Description=Number of Samples With Data
##INFO=ID=DP,Number=1,Type=Integer,Description=Total Depth
##INFO=ID=AF,Number=A,Type=Float,Description=Allele Frequency
##INFO=ID=AA,Number=1,Type=String,Description=Ancestral Allele
##INFO=ID=DB,Number=0,Type=Flag,Description=dbSNP membership, build 129
##INFO=ID=H2,Number=0,Type=Flag,Description=HapMap2 membership
##FILTER=ID=q10,Description=Quality below 10
##FILTER=ID=s50,Description=Less than 50% of samples have data
##FORMAT=ID=GT,Number=1,Type=String,Description=Genotype
##FORMAT=ID=GQ,Number=1,Type=Integer,Description=Genotype Quality
##FORMAT=ID=DP,Number=1,Type=Integer,Description=Read Depth
##FORMAT=ID=HQ,Number=2,Type=Integer,Description=Haplotype Quality
#CHROM POS IDREF ALTQUAL FILTER INFO
   FORMAT  NA1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Is there a better way to parse strings than this?

2011-04-13 Thread Dennis Murphy

Hi:

Here's one approach:

strings - c(
A5.Brands.bought...Dulux,
A5.Brands.bought...Haymes,
A5.Brands.bought...Solver,
A5.Brands.bought...Taubmans.or.Bristol,
A5.Brands.bought...Wattyl,
A5.Brands.bought...Other)

slist - strsplit(strings, '\\.\\.\\.')

# Conversion to data frame:
library(plyr)
ldply(slist, rbind)
V1  V2
1 A5.Brands.bought   Dulux
2 A5.Brands.bought  Haymes
3 A5.Brands.bought  Solver
4 A5.Brands.bought Taubmans.or.Bristol
5 A5.Brands.bought  Wattyl
6 A5.Brands.bought   Other

# Conversion to matrix:
laply(slist, rbind)
do.call(rbind, slist)

...and one can subselect from there.

HTH,
Dennis


On Tue, Apr 12, 2011 at 9:07 PM, Chris Howden
ch...@trickysolutions.com.auwrote:

 Hi Everyone,


 I needed to parse some strings recently.

 The code I've wound up using seems rather clunky, and I was wondering if
 anyone had any suggestions on a better way?

 Basically I do the following:

 1) Use substr() to do the parsing
 2) Use regexpr() to find the location of the string I want to parse on, I
 then pass this onto substr()
 3) Use nchar() as the stop input to substr() where necessary



 I've got a simple example of the parsing code I used below. It takes
 questionnaire variable names that includes the question and the brand it
 was answered for and then parses it so the variable name and the brand are
 in separate columns. I then use this to restructure the data from
 unstacked to stacked, but that's another story.

  # this is the data set
  test
 [1] A5.Brands.bought...Dulux
 [2] A5.Brands.bought...Haymes
 [3] A5.Brands.bought...Solver
 [4] A5.Brands.bought...Taubmans.or.Bristol
 [5] A5.Brands.bought...Wattyl
 [6] A5.Brands.bought...Other

  # Where do I want to parse?
  break1 -  regexpr('...',test, fixed=TRUE)
  break1
 [1] 17 17 17 17 17 17
 attr(,match.length)
 [1] 3 3 3 3 3 3

  # Put Variable name in a variable
  str1 - substr(test,1,break1-1)
  str1
 [1] A5.Brands.bought A5.Brands.bought A5.Brands.bought
 A5.Brands.bought
 [5] A5.Brands.bought A5.Brands.bought

  # Put Brand name in a variable
  str2 - substr(test,break1+3, nchar(test))
  str2
 [1] Dulux   Haymes  Solver
 [4] Taubmans.or.Bristol Wattyl  Other



 Thanks for any and all suggestions


 Chris Howden
 Founding Partner
 Tricky Solutions
 Tricky Solutions 4 Tricky Problems
 Evidence Based Strategic Development, IP Commercialisation and Innovation,
 Data Analysis, Modelling and Training
 (mobile) 0410 689 945
 (fax / office) (+618) 8952 7878
 ch...@trickysolutions.com.au

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

2011-04-13 Thread Peter Ehlers


On 2011-04-12 16:57, Michael G Rupert wrote:

I have a question concerning the Wilcoxon signed-rank test, and
specifically, which R subroutine I should use for my particular dataset.
There are three different commands in R (that I'm aware of) that calculate
the Wilcoxon signed-rank test; wilcox.test, wilcox.exact, and
wilcoxsign_test. When I run the three commands on the same dataset, I get
different p-values. I'm hoping that someone can give me guidance on the
strengths and weaknesses of each command, why they produce different
p-values, and which one is the most appropriate for my particular needs.

First, let me describe the dataset I am working with. The project I am
working on collected water samples from groups/networks of about 30 water
wells and analyzed them for nitrate, major ions, and other chemical
constituents. We revisited those same wells about 10 years later and
analyzed the water samples for the same chemical constituents. I now have
a paired dataset, and the question I would like to answer is whether there
was a significant change in concentrations of those chemical
constituents (such as nitrate or chloride). Concentrations measured in
water from some wells have increased, some have decreased, and some have
stayed the same over the ten-year time period. In water from some wells,
the concentrations were below the laboratory detection limits, so those
concentrations are tied at the reporting level. The following is an
example of the data I am evaluating.

x- c(13.60, 9.10, 22.01, 9.08, 1.97, 2.81, 0.66, 0.97, 0.21, 2.23, 0.08,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 3.44, 15.18, 5.25, 4.27, 17.81)
y- c( 4.32, 3.39, 16.36, 7.10, 0.08, 2.02, 0.19, 0.59, 0.06, 2.15, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 4.02, 16.13, 7.30, 7.98, 24.37)

The nonparametric Wilcoxon signed-rank test seems to be the most
appropriate test for these data. There are two different methods to
calculate the signed-rank test. The first is by Wilcoxon (1945), who
discards any tied data and then calculates the signed ranks. The second
method incorporates tied values in the ranking procedure (see J.W. Pratt,
1959, Remarks on zeros and ties in the Wilcoxon signed rank procedure:
Journal of the American Statistical Association, Vol. 54, No. 287, pp.
655-667). There are two commands in R that calculate the original method
by Wilcoxon (that I know of), wilcox.test and wilcoxsign_test (make sure
to include the argument zero.method = c(Wilcoxon)). There are two
other commands in R that incorporate ties in the signed-rank test,
wilcox.exact and wilcoxsign_test (make sure to include the
argumentzero.method = c(Pratt)).

Here's my problem. I get different p-values from each of the 4 signed-rank
tests in R, and I don't know which one to believe. Wilcox.test and
wilcoxsign_test(zero.method = c(Wilcoxon) calculate the standard
Wilcoxon signed-rank test. Even though they are not designed to deal with
tied data, they should at least calculate the same p-value, but they do
not. I ran the same datasets in SYSTAT and Minitab to check on the results
from R. Minitab gives the same results as wilcox.test, and SYSTAT gives
the same results as wilcoxsign_test(zero.method = c(Wilcoxon).
Similarly, wilcox.exact and wilcoxsign_test(zero.method = c(Pratt)) are
designed to incorporate ties, but they give different p-values from each
other. The signed-rank test procedure is relatively straightforward, so
I'm surprised I'm not getting identical results.

To check on these R commands, I calculated the signed-rank tests using the
dataset shown on page 658-659 of Pratt (1959). These R routines do not
produce the same results as that listed in Pratt, which makes me think
that the R routines are not calculating the statistics correctly.


Ahem  that's a pretty strong claim.

Actually, the problem is user misunderstanding and the relevant
help pages do tell you where the differences lie.

Let's take the 3 functions one at a time, using your
x,y data from Pratt:

1. wilcox.test() in the stats package
This function automatically switches to using a Normal
approximation when there are ties in the data:

 wilcox.test(x, y, paired=TRUE)$p.value
#[1] 0.05802402
(You can suppress the warning (due to ties) by specifying
the argument 'exact=FALSE'.)

This function also uses a continuity correction unless
told not to:

 wilcox.test(x, y, paired=TRUE, correct=FALSE)$p.value
#[1] 0.05061243

2. wilcox.exact() in pkg exactRankTests
This function can handle ties (using the Wilcoxon method)
with an 'exact' calculation:

 wilcox.exact(x, y, paired=TRUE)$p.value
#[1] 0.0546875

If you want the Normal approximation:

 wilcox.exact(x, y, paired=TRUE, exact=FALSE)$p.value
#[1] 0.05061243  -- cf. above

3. wilcoxsign_test() in pkg coin
This is the most comprehensive of these functions.
It is also the only one that offers the Pratt method
of handling ties. It will default to this

Re: [R] The three routines in R that calculate the wilcoxon signed-rank test give different p-values.......which is correct?

2011-04-13 Thread peter dalgaard


On Apr 13, 2011, at 01:57 , Michael G Rupert wrote:

 I have a question concerning the Wilcoxon signed-rank test, and 
 specifically, which R subroutine I should use for my particular dataset. 
 There are three different commands in R (that I'm aware of) that calculate 
 the Wilcoxon signed-rank test; wilcox.test, wilcox.exact, and 
 wilcoxsign_test. When I run the three commands on the same dataset, I get 
 different p-values. I'm hoping that someone can give me guidance on the 
 strengths and weaknesses of each command, why they produce different 
 p-values, and which one is the most appropriate for my particular needs.

Well, there are two version of zero-handling, and for each of these, you can 
have exact p values or asymptotic p values with or without continuity 
correction, so that's 6 possibilities already. 


 
 First, let me describe the dataset I am working with. The project I am 
 working on collected water samples from groups/networks of about 30 water 
 wells and analyzed them for nitrate, major ions, and other chemical 
 constituents. We revisited those same wells about 10 years later and 
 analyzed the water samples for the same chemical constituents. I now have 
 a paired dataset, and the question I would like to answer is whether there 
 was a significant change in concentrations of those chemical 
 constituents (such as nitrate or chloride). Concentrations measured in 
 water from some wells have increased, some have decreased, and some have 
 stayed the same over the ten-year time period. In water from some wells, 
 the concentrations were below the laboratory detection limits, so those 
 concentrations are tied at the reporting level. The following is an 
 example of the data I am evaluating. 
 
 x - c(13.60, 9.10, 22.01, 9.08, 1.97, 2.81, 0.66, 0.97, 0.21, 2.23, 0.08, 
 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
 0.06, 0.06, 3.44, 15.18, 5.25, 4.27, 17.81)
 y - c( 4.32, 3.39, 16.36, 7.10, 0.08, 2.02, 0.19, 0.59, 0.06, 2.15, 0.06, 
 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 
 0.06, 0.06, 4.02, 16.13, 7.30, 7.98, 24.37)
 
 The nonparametric Wilcoxon signed-rank test seems to be the most 
 appropriate test for these data. There are two different methods to 
 calculate the signed-rank test. The first is by Wilcoxon (1945), who 
 discards any tied data and then calculates the signed ranks. The second 
 method incorporates tied values in the ranking procedure (see J.W. Pratt, 
 1959, Remarks on zeros and ties in the Wilcoxon signed rank procedure: 
 Journal of the American Statistical Association, Vol. 54, No. 287, pp. 
 655-667). There are two commands in R that calculate the original method 
 by Wilcoxon (that I know of), wilcox.test and wilcoxsign_test (make sure 
 to include the argument zero.method = c(Wilcoxon)). There are two 
 other commands in R that incorporate ties in the signed-rank test, 
 wilcox.exact and wilcoxsign_test (make sure to include the 
 argumentzero.method = c(Pratt)).
 
 Here's my problem. I get different p-values from each of the 4 signed-rank 
 tests in R, and I don't know which one to believe. Wilcox.test and 
 wilcoxsign_test(zero.method = c(Wilcoxon) calculate the standard 
 Wilcoxon signed-rank test. Even though they are not designed to deal with 
 tied data, they should at least calculate the same p-value, but they do 
 not.

They do if you turn off the continuity correction in wilcox.test:

 wilcox.test(x, y, alternative='two.sided', paired=TRUE, correct=F)

Wilcoxon signed rank test

data:  x and y 
V = 39, p-value = 0.05061
alternative hypothesis: true location shift is not equal to 0 



 I ran the same datasets in SYSTAT and Minitab to check on the results 
 from R. Minitab gives the same results as wilcox.test, and SYSTAT gives 
 the same results as wilcoxsign_test(zero.method = c(Wilcoxon).

So one does continuity correction and the other not.

  
 Similarly, wilcox.exact and wilcoxsign_test(zero.method = c(Pratt)) are 
 designed to incorporate ties, but they give different p-values from each 
 other.

They still handle zeros differently. wilcox.exact does not handle the Pratt 
ranking.

To get exact p values for Pratt ranks, try

 perm.test(c(-3,-4,-5,6:11))

1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08984
alternative hypothesis: true mu is not equal to 0 


... and for the asymptotic counterpart:

 perm.test(c(-3,-4,-5,6:11), exact=F)

Asymptotic 1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08144
alternative hypothesis: true mu is not equal to 0 


 The signed-rank test procedure is relatively straightforward, so 
 I'm surprised I'm not getting identical results. 
 
 To check on these R commands, I calculated the signed-rank tests using the 
 dataset shown on page 658-659 of Pratt (1959).

Not found. Apparently, you _constructed_ a data set to get the same set of 
ranks. 

 These R routines do not

Re: [R] Layout within levelplot from the lattice package

2011-04-13 Thread Ian Renner

Hi Dieter, 

Thank you for that!  Your post helped me on my way by  introducing me to the 
padding settings within lattice, and I'm nearly  there now. 


My new problem related to this graph is that I would like to add  a polygon to 
one of the panels, but it seems that my code also adds the  polygon to the 
paired panel.  Is there a way to condition it so that it  only appears on the 
top panel in column 3? 


Here is my updated code: 

start = expand.grid(1:10,1:14) 
start2 = rbind(start,start,start,start,start,start) 
z = rnorm(840) 
factor.1 = c(rep(A, 280), rep(B, 280), rep(C, 280)) 
factor.2 = c(rep(1, 140), rep(2, 140), rep(1, 140), rep(2, 140), 
rep(1, 140), rep(2, 140)) 

data = data.frame(start2, z, factor.1, factor.2) 
names(data)[1:2] = c(x, y) 

data.A = data[data$factor.1 == A,] 
data.B = data[data$factor.1 == B,] 
data.C = data[data$factor.1 == C,] 

plot.A =  levelplot(z~x*y|1*factor.2,data.A,col.regions=heat.colors, strip =  
FALSE, asp=iso,xlab = , ylab = , colorkey = list(space=bottom),  
scales=list(y=list(draw=F),x=list(draw=F))) 

plot.B = levelplot(z~x*y|1*factor.2,data.B,col.regions=topo.colors,  strip = 
FALSE, asp=iso,xlab = , ylab = , colorkey =  list(space=bottom), 
scales=list(y=list(draw=F),x=list(draw=F))) 

plot.C =  levelplot(z~x*y|1*factor.2,data.C,col.regions=terrain.colors, strip = 
 
FALSE, asp=iso,xlab = , ylab = , colorkey = list(space=bottom),  
scales=list(y=list(draw=F),x=list(draw=F)), 

panel = function(x, y, subscripts, ...) { 
panel.levelplot(x, y, subscripts, ...) 
panel.polygon(c(2, 5, 5, 2), c(3, 3, 8, 8), col=blue) 
} 
) 

print(plot.A, split=c(1,1,3,1)) 
print(plot.B, split=c(2,1,3,1), newpage = FALSE) 
print(plot.C, split=c(3,1,3,1), newpage = FALSE) 

Thanks again for your help! 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] AR(1)

2011-04-13 Thread Alaios

Dear all,
I would like to ask you if you know any AR(1) function that can create a  
sequence of values for different time lags.

The AR(1) based on a starting value and with the gaussian error can produce 
this time lags.

Could you please help me find a function that can do and does not depend on any 
sequence?

I would like to thank you in advance for your help

Best Regards
Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] FW: [r] how to enclose two xyplot

2011-04-13 Thread Francesco Nutini


 Dear R-users,

I have to plot two xyplot, and I wish to enclose this two graphs with just one 
headline, the same x scale, the same grid etc.
These parameters should tie in, in order to obtain, visually, a unique graph 
formed by two xyplot. 

I try to give an idea:
 
xyplot1: |_|_|_|



xyplot2: |_|_|_|



what i want: |  |  | |

   |_|_|_|


I tried to use the command par, but it's doesn't work with xyplot. The two 
plot have, by default, the same x-axis scale.
I know it's just a visual solution, but it could be nice for a paper!

Thanks a lot,

Francesco Nutini
PhD student




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to set the dimension of a matrix correctly?

2011-04-13 Thread Ben Bolker

Lin Pei-Ling barthealin at hotmail.com writes:

  Hi all, I use kriging to interpolate the precipitation from
 stations, but the map of this results show lots of stripes. (please
 see the attachment)I think there's something wrong with the setting
 of the dimension of this matrix, however, I have no idea how to know
 or test to see if this setting is correct or not.I've tried to
 switch the latitude and longitude, but still got the same results
 (stripes). Hope anyone can take a look at it and give me some
 suggestion.  Thanks for help,Peiling


  My main suggestion is that you try to boil your example down
to something smaller, simpler, and reproducible.  The attention
span of R-helpers is only about 10 minutes, maximum (maybe a little
longer for problems they are particularly interested in), and
it took me almost that long just to reformat your R code so
it was readable.

   Often the process of trying to simplify the problem leads
you to discover your own problem.

  good luck,
   Ben Bolker


  
 
 --
library(geoR) #functions for geostatistical data analysis
coords- as.matrix(read.table('/Users/R/Code/stncoords.dat'))
ppt- as.matrix(read.table('/Users/R/Code/ppt_15day.dat'))
xx - dim(ppt) # (77,528)
plat - seq(37.5,42,by=0.07273)
plon - seq(-105.5,-93.5,by=0.07273)
pgrid - expand.grid(x=plon,y=plat)
pdim - dim(pgrid) # (10230,2)
#plot(pgrid, cex=0.5)
lat - coords[,1] 
lon - coords[,2] 
ppt1 - ppt[,1:xx[2]] # 1:528
pptpred - matrix(0,ncol=xx[2],nrow=1)
 
# Only test one period data##
ptemp - ppt1[,3]   
ll - which(ptemp0)
ppt2 - matrix(0,nrow=length(ll),ncol=3) # (lon,lat,ptemp)  
ppt2[,1] - lat[ll] # y-axis
ppt2[,2] - lon[ll] # x-axis
ppt2[,3] - ptemp[ll] # ppt
pptd - as.geodata(ppt2)
bin1 - variog(pptd) 
##  plot(bin1) # fig1
bin2 - variog(pptd,estimator.type=modulus)
##  plot(bin2) # fig2
ini1 - max(bin1$v) 
ols - variofit(bin1, fix.nugget = FALSE,
   weights=cressie,ini.cov.pars=c(ini1,4))
kc - krige.conv(pptd,loc=pgrid,
  krige=krige.control(type.krige=OK,trend.d=2nd,
   trend.l=2nd,cov.pars=ols[2]$cov.pars))
pvalxx - which(kc$predict  0) 
kc$predict[pvalxx] - 0 
## ?? something got mangled here ??
## pptpred - kc$predict}else{
##  pptpred - 0*(1:pdim[1])}}
 
## need to fit the lat/lon frame
newpptpred - matrix(pptpred, nrow=62, ncol=165) 
plat - seq(37.5,42,by=0.07273)  ## repeat from above?
plon - seq(-105.5,-93.5,by=0.07273)  ## repeat from above?
lat2 - which((plat 37)(plat 42))
lon2 - which((plon -106)(plon -93))
## fixed ? typos ?
newpptpred - pptpred[lat2,lon2]
nLevel - 60
quartz()
filled.contour(plon,plat,t(newpredppt),
   col=rainbow(nLevel),plot.axes={axis(1);axis(2)})

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Ben Bolker

Matthieu Stigler matthieu.stigler at gmail.com writes:

 
 Hi
 
 We are about to publish a book, which contains figures made with R 
 plots. An important detail that we did not take into account is that the 
 book will not be printed in 4 colors (cmyk mode), but only 2 (black 
 +spotcolor). The spotcolor we use is part of the big Pantone family.
 
 The problem is that both pdf() and postscript() offer either rgb or 
 cmyk, but no spotcolors such as pantone. I'm afraid this constraint 
 can't be solved at all, and we can't use R for creating these plots? I 
 did not find any package that would extend the colormodel to include 
 spot colors... Did anyone had a similar experience?
 

  Wasn't aware of spotcolors, but I bet you could hack the PDF
reasonably easily (if you have many figures you might have to
use awk/sed/perl ?) ... if you don't use R, what is your alternative
for creating the figures?

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Is there a better way to parse strings than this?

2011-04-13 Thread Hadley Wickham

On Wed, Apr 13, 2011 at 5:18 AM, Dennis Murphy djmu...@gmail.com wrote:
 Hi:

 Here's one approach:

 strings - c(
 A5.Brands.bought...Dulux,
 A5.Brands.bought...Haymes,
 A5.Brands.bought...Solver,
 A5.Brands.bought...Taubmans.or.Bristol,
 A5.Brands.bought...Wattyl,
 A5.Brands.bought...Other)

 slist - strsplit(strings, '\\.\\.\\.')

Or with stringr:

library(stringr)
str_split_fixed(strings, fixed(...), n = 2)

# or maybe
str_match(strings, (..).*\\.\\.\\.(.*))

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Is there a better way to parse strings than this?

2011-04-13 Thread Gabor Grothendieck

On Wed, Apr 13, 2011 at 12:07 AM, Chris Howden
ch...@trickysolutions.com.au wrote:
 Hi Everyone,


 I needed to parse some strings recently.

 The code I've wound up using seems rather clunky, and I was wondering if
 anyone had any suggestions on a better way?

 Basically I do the following:

 1) Use substr() to do the parsing
 2) Use regexpr() to find the location of the string I want to parse on, I
 then pass this onto substr()
 3) Use nchar() as the stop input to substr() where necessary



 I've got a simple example of the parsing code I used below. It takes
 questionnaire variable names that includes the question and the brand it
 was answered for and then parses it so the variable name and the brand are
 in separate columns. I then use this to restructure the data from
 unstacked to stacked, but that's another story.

 # this is the data set
 test
 [1] A5.Brands.bought...Dulux
 [2] A5.Brands.bought...Haymes
 [3] A5.Brands.bought...Solver
 [4] A5.Brands.bought...Taubmans.or.Bristol
 [5] A5.Brands.bought...Wattyl
 [6] A5.Brands.bought...Other

 # Where do I want to parse?
 break1 -  regexpr('...',test, fixed=TRUE)
 break1
 [1] 17 17 17 17 17 17
 attr(,match.length)
 [1] 3 3 3 3 3 3

 # Put Variable name in a variable
 str1 - substr(test,1,break1-1)
 str1
 [1] A5.Brands.bought A5.Brands.bought A5.Brands.bought
 A5.Brands.bought
 [5] A5.Brands.bought A5.Brands.bought

 # Put Brand name in a variable
 str2 - substr(test,break1+3, nchar(test))
 str2
 [1] Dulux               Haymes              Solver
 [4] Taubmans.or.Bristol Wattyl              Other



Try this:

 x - c(A5.Brands.bought...Dulux, A5.Brands.bought...Haymes,
+ A5.Brands.bought...Solver)

 do.call(rbind, strsplit(x, ..., fixed = TRUE))
 [,1]   [,2]
[1,] A5.Brands.bought Dulux
[2,] A5.Brands.bought Haymes
[3,] A5.Brands.bought Solver

 # or
 xa - sub(..., \1, x, fixed = TRUE)
 read.table(textConnection(xa), sep = \1, as.is = TRUE)
V1 V2
1 A5.Brands.bought  Dulux
2 A5.Brands.bought Haymes
3 A5.Brands.bought Solver


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] BackCast and Forecast with stats.arima and forecast.Arima

2011-04-13 Thread Paolo Rossi

Hello everyone,

I am experiencng some problems in producing forecasts and backcast from an
ARIMA(1,0,0) model. I need to produce an insample backcast and a seasonal
normal backcast and forecast.
I have a seasonal consumption function. By using actual data I get actual
demand. By passing Seasonal Normal regressors I need to obtain seasonal
normal demand.

I tested backcasting in arima by passing data used in the estimation back to
the model through predict. This didnt work as the output was different from
the fitted value I could see for the estimated model. In other words, in
arima I am not able to produce  my seasonal normal backcast.

I then started looking at Arima in Forecast. By passing the estimated model
to  Arima I can reproduce the fitted value from the model. My problem comes
when I try to apply that model  to different dependent (y) data, as the y is
supposed to be passed to Arima, so essentially I can do only an in sample
backcast for same dependent variable.

Is there a way I can get a Seasonal Normal backcast based on the model
estimated on actual data but by passing Seasonal Normal regressors?

Thanks,

Paolo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] strategy for writing out file with lines header initiated with comment sign

2011-04-13 Thread jim holtman

Here is an outline of how to do it using connections:

con - file('/temp/mytemp.txt', 'w')
writeLines(c(#comment, # lines, # in the file), con = con)
# create some data to be output as 'tab' separated
myData - as.data.frame(matrix(letters[1:25], 5))
write.table(myData, file = con, sep = '\t')
close(con)
con - file('/temp/mytemp.txt', 'w')
writeLines(c(#comment, # lines, # in the file), con = con)
# create some data to be output as 'tab' separated
myData - as.data.frame(matrix(letters[1:25], 5))
write.table(myData, file = con, sep = '\t')
close(con)



On Wed, Apr 13, 2011 at 6:15 AM, Mao Jianfeng jianfeng@gmail.com wrote:
 Dear all,

 I have data.frame object in R. I want to export it in tab-delimited
 file with several lines of header initiated with comment sign (#). I
 do not know how to do that in R. Could you please give helps on this
 problem?

 Thanks in advance.

 Best,
 Jian-Feng,

 ##
 The lines I want to write in the header lines look like, with words in
 the last line (here the line #CHROM POS     ID        REF ALT    QUAL
 FILTER INFO                              FORMAT      NA1) be
 separated by tab :


 ##fileformat=VCFv4.1
 ##fileDate=20090805
 ##source=myImputationProgramV3.1
 ##reference=file:///seq/references/1000GenomesPilot-NCBI36.fasta
 ##contig=ID=20,length=62435964,assembly=B36,md5=f126cdf8a6e0c7f379d618ff66beb2da,species=Homo
 sapiens,taxonomy=x
 ##phasing=partial
 ##INFO=ID=NS,Number=1,Type=Integer,Description=Number of Samples With Data
 ##INFO=ID=DP,Number=1,Type=Integer,Description=Total Depth
 ##INFO=ID=AF,Number=A,Type=Float,Description=Allele Frequency
 ##INFO=ID=AA,Number=1,Type=String,Description=Ancestral Allele
 ##INFO=ID=DB,Number=0,Type=Flag,Description=dbSNP membership, build 129
 ##INFO=ID=H2,Number=0,Type=Flag,Description=HapMap2 membership
 ##FILTER=ID=q10,Description=Quality below 10
 ##FILTER=ID=s50,Description=Less than 50% of samples have data
 ##FORMAT=ID=GT,Number=1,Type=String,Description=Genotype
 ##FORMAT=ID=GQ,Number=1,Type=Integer,Description=Genotype Quality
 ##FORMAT=ID=DP,Number=1,Type=Integer,Description=Read Depth
 ##FORMAT=ID=HQ,Number=2,Type=Integer,Description=Haplotype Quality
 #CHROM POS     ID        REF ALT    QUAL FILTER INFO
           FORMAT      NA1

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Ted Harding

On 13-Apr-11 12:30:26, Ben Bolker wrote:
 Matthieu Stigler matthieu.stigler at gmail.com writes:
 
 Hi
 We are about to publish a book, which contains figures made
 with R plots. An important detail that we did not take into
 account is that the book will not be printed in 4 colors
 (cmyk mode), but only 2 (black +spotcolor). The spotcolor
 we use is part of the big Pantone family.
 
 The problem is that both pdf() and postscript() offer either
 rgb or cmyk, but no spotcolors such as pantone. I'm afraid
 this constraint can't be solved at all, and we can't use R
 for creating these plots? I did not find any package that
 would extend the colormodel to include spot colors... Did
 anyone had a similar experience?
 
   Wasn't aware of spotcolors, but I bet you could hack the PDF
 reasonably easily (if you have many figures you might have to
 use awk/sed/perl ?) ... if you don't use R, what is your
 alternative for creating the figures?
 
 Ben Bolker

Don't expect to hack PDF reasonably easily -- for many reasons,
one of which is that in PDF different bits of a document can be
(and usually are) encapsulated in PDF objects, whose physical
location in the PDF file can be pretty random (there is a kind
of hash index at the beginning which points to them). So a PDF
file can be heavily fragmented (as also can a PS file, though
usually not nearly so heavily). In theory it would be possible
for every single character in a textual document to be in a
separate PDF object and located in random order in the file!

As a general comment (which unfortunately doesn't address the
main problem raised by Matthieu), it can often be better to
use independent software to create figures/diagrams based on
numerical results computed by R. R's plots are quite nicely
done by default, but tweaking them to achieve a preferred
layout in R itself can be painfully time-consuming. Myself,
I farm this out to the 'pic' preprocessor in troff/groff,
using which any details whatever can be arranged exactly to
one's taste.

Since spotcolour printing is a multi-pass procedure, one can
prepare the separate layers in the respective colours, along
with any necessary crop-marks or bulls-eyes, quite easily.

However, this too generates PS output in the first instance
(convertible to PDF of course), so suffers the same binding
to the RGB/CMYK colour paradigm. So Pantone would not be
available in the first instance (except insofar as a subset
of the Pantone spectrum corresponds to colours in CMYK).

However, I presume it is highly likely that there is software
which can take a file (PS or PDF) prepared using RGB/CMYK, and
convert this to a Pantone-compatible file.

Even so, this would depend on what your publisher/printer
requires in what you submit. It would be important to obtain
from them a full and exact specification of what they require
for colour printing in files submitted to them for printing.

Hoping this is of some help ...
Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 13-Apr-11   Time: 14:17:56
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] strategy for writing out file with lines header initiated with comment sign

2011-04-13 Thread Rainer M Krug

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 13/04/11 15:15, jim holtman wrote:
 Here is an outline of how to do it using connections:
 
 con - file('/temp/mytemp.txt', 'w')
 writeLines(c(#comment, # lines, # in the file), con = con)
 # create some data to be output as 'tab' separated
 myData - as.data.frame(matrix(letters[1:25], 5))
 write.table(myData, file = con, sep = '\t')
 close(con)
 con - file('/temp/mytemp.txt', 'w')
 writeLines(c(#comment, # lines, # in the file), con = con)
 # create some data to be output as 'tab' separated
 myData - as.data.frame(matrix(letters[1:25], 5))
 write.table(myData, file = con, sep = '\t')
 close(con)

or:

myData - as.data.frame(matrix(letters[1:25], 5))
writeLines(c(#comment, # lines, # in the file), con = ./test.csv)
write.table(myData, ./test.csv, append=TRUE)

Cheers,

Rainer

 
 
 
 On Wed, Apr 13, 2011 at 6:15 AM, Mao Jianfeng jianfeng@gmail.com wrote:
 Dear all,

 I have data.frame object in R. I want to export it in tab-delimited
 file with several lines of header initiated with comment sign (#). I
 do not know how to do that in R. Could you please give helps on this
 problem?

 Thanks in advance.

 Best,
 Jian-Feng,

 ##
 The lines I want to write in the header lines look like, with words in
 the last line (here the line #CHROM POS IDREF ALTQUAL
 FILTER INFO  FORMAT  NA1) be
 separated by tab :


 ##fileformat=VCFv4.1
 ##fileDate=20090805
 ##source=myImputationProgramV3.1
 ##reference=file:///seq/references/1000GenomesPilot-NCBI36.fasta
 ##contig=ID=20,length=62435964,assembly=B36,md5=f126cdf8a6e0c7f379d618ff66beb2da,species=Homo
 sapiens,taxonomy=x
 ##phasing=partial
 ##INFO=ID=NS,Number=1,Type=Integer,Description=Number of Samples With 
 Data
 ##INFO=ID=DP,Number=1,Type=Integer,Description=Total Depth
 ##INFO=ID=AF,Number=A,Type=Float,Description=Allele Frequency
 ##INFO=ID=AA,Number=1,Type=String,Description=Ancestral Allele
 ##INFO=ID=DB,Number=0,Type=Flag,Description=dbSNP membership, build 129
 ##INFO=ID=H2,Number=0,Type=Flag,Description=HapMap2 membership
 ##FILTER=ID=q10,Description=Quality below 10
 ##FILTER=ID=s50,Description=Less than 50% of samples have data
 ##FORMAT=ID=GT,Number=1,Type=String,Description=Genotype
 ##FORMAT=ID=GQ,Number=1,Type=Integer,Description=Genotype Quality
 ##FORMAT=ID=DP,Number=1,Type=Integer,Description=Read Depth
 ##FORMAT=ID=HQ,Number=2,Type=Integer,Description=Haplotype Quality
 #CHROM POS IDREF ALTQUAL FILTER INFO
   FORMAT  NA1

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 


- -- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2lo98ACgkQoYgNqgF2egovxwCeN7G2It+1petLynKW37fIqKbW
YIkAmwfMSgfLLZwTGDHJ2dNsVp5muj+7
=Cfj2
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Prof Brian Ripley


On Wed, 13 Apr 2011, Ben Bolker wrote:


Matthieu Stigler matthieu.stigler at gmail.com writes:



Hi

We are about to publish a book, which contains figures made with R
plots. An important detail that we did not take into account is that the
book will not be printed in 4 colors (cmyk mode), but only 2 (black
+spotcolor). The spotcolor we use is part of the big Pantone family.

The problem is that both pdf() and postscript() offer either rgb or
cmyk, but no spotcolors such as pantone.


Well, how could it?  R's colour model is sRGB, and it has not other 
way to refer to colours.  The colour model is not at the level of a 
package 


I'm afraid this constraint can't be solved at all, and we can't use 
R for creating these plots? I did not find any package that would 
extend the colormodel to include spot colors... Did anyone had a 
similar experience?


 Wasn't aware of spotcolors, but I bet you could hack the PDF
reasonably easily (if you have many figures you might have to
use awk/sed/perl ?) ... if you don't use R, what is your alternative
for creating the figures?


No, PDF is not a text format and not easy to hack.  It has a binary 
index of byte positions so you edit it at your peril.


However, this is exactly what professionals have PDF editing tools 
for.  I believe I used Acrobat (not Reader) to do it when I needed to 
for my books.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] strategy for writing out file with lines header initiated with comment sign

2011-04-13 Thread Mao Jianfeng

Dear Jim and Rainer,

I learned much from you all. It is my first time to experience the
functions, like file(), writeLines(), close().

Thanks a lot.

Best,

Jian-Feng,

2011/4/13 Rainer M Krug r.m.k...@gmail.com:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 13/04/11 15:15, jim holtman wrote:
 Here is an outline of how to do it using connections:

 con - file('/temp/mytemp.txt', 'w')
 writeLines(c(#comment, # lines, # in the file), con = con)
 # create some data to be output as 'tab' separated
 myData - as.data.frame(matrix(letters[1:25], 5))
 write.table(myData, file = con, sep = '\t')
 close(con)
 con - file('/temp/mytemp.txt', 'w')
 writeLines(c(#comment, # lines, # in the file), con = con)
 # create some data to be output as 'tab' separated
 myData - as.data.frame(matrix(letters[1:25], 5))
 write.table(myData, file = con, sep = '\t')
 close(con)

 or:

 myData - as.data.frame(matrix(letters[1:25], 5))
 writeLines(c(#comment, # lines, # in the file), con = ./test.csv)
 write.table(myData, ./test.csv, append=TRUE)

 Cheers,

 Rainer




 On Wed, Apr 13, 2011 at 6:15 AM, Mao Jianfeng jianfeng@gmail.com wrote:
 Dear all,

 I have data.frame object in R. I want to export it in tab-delimited
 file with several lines of header initiated with comment sign (#). I
 do not know how to do that in R. Could you please give helps on this
 problem?

 Thanks in advance.

 Best,
 Jian-Feng,

 ##
 The lines I want to write in the header lines look like, with words in
 the last line (here the line #CHROM POS     ID        REF ALT    QUAL
 FILTER INFO                              FORMAT      NA1) be
 separated by tab :


 ##fileformat=VCFv4.1
 ##fileDate=20090805
 ##source=myImputationProgramV3.1
 ##reference=file:///seq/references/1000GenomesPilot-NCBI36.fasta
 ##contig=ID=20,length=62435964,assembly=B36,md5=f126cdf8a6e0c7f379d618ff66beb2da,species=Homo
 sapiens,taxonomy=x
 ##phasing=partial
 ##INFO=ID=NS,Number=1,Type=Integer,Description=Number of Samples With 
 Data
 ##INFO=ID=DP,Number=1,Type=Integer,Description=Total Depth
 ##INFO=ID=AF,Number=A,Type=Float,Description=Allele Frequency
 ##INFO=ID=AA,Number=1,Type=String,Description=Ancestral Allele
 ##INFO=ID=DB,Number=0,Type=Flag,Description=dbSNP membership, build 129
 ##INFO=ID=H2,Number=0,Type=Flag,Description=HapMap2 membership
 ##FILTER=ID=q10,Description=Quality below 10
 ##FILTER=ID=s50,Description=Less than 50% of samples have data
 ##FORMAT=ID=GT,Number=1,Type=String,Description=Genotype
 ##FORMAT=ID=GQ,Number=1,Type=Integer,Description=Genotype Quality
 ##FORMAT=ID=DP,Number=1,Type=Integer,Description=Read Depth
 ##FORMAT=ID=HQ,Number=2,Type=Integer,Description=Haplotype Quality
 #CHROM POS     ID        REF ALT    QUAL FILTER INFO
           FORMAT      NA1

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






 - --
 Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
 Biology, UCT), Dipl. Phys. (Germany)

 Centre of Excellence for Invasion Biology
 Stellenbosch University
 South Africa

 Tel :       +33 - (0)9 53 10 27 44
 Cell:       +33 - (0)6 85 62 59 98
 Fax :       +33 - (0)9 58 10 27 44

 Fax (D):    +49 - (0)3 21 21 25 22 44

 email:      rai...@krugs.de

 Skype:      RMkrug
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.10 (GNU/Linux)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

 iEYEARECAAYFAk2lo98ACgkQoYgNqgF2egovxwCeN7G2It+1petLynKW37fIqKbW
 YIkAmwfMSgfLLZwTGDHJ2dNsVp5muj+7
 =Cfj2
 -END PGP SIGNATURE-




-- 
Jian-Feng, Mao

the Institute of Botany,
Chinese Academy of Botany,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Ben Bolker

Prof Brian Ripley ripley at stats.ox.ac.uk writes:

 
 On Wed, 13 Apr 2011, Ben Bolker wrote:
 
  Matthieu Stigler matthieu.stigler at gmail.com writes:
 
 
  Hi
 
  We are about to publish a book, which contains figures made with R
  plots. An important detail that we did not take into account is that the
  book will not be printed in 4 colors (cmyk mode), but only 2 (black
  +spotcolor). The spotcolor we use is part of the big Pantone family.
 
  The problem is that both pdf() and postscript() offer either rgb or
  cmyk, but no spotcolors such as pantone.
 
 Well, how could it?  R's colour model is sRGB, and it has not other 
 way to refer to colours.  The colour model is not at the level of a 
 package 
 
  I'm afraid this constraint can't be solved at all, and we can't use 
  R for creating these plots? I did not find any package that would 
  extend the colormodel to include spot colors... Did anyone had a 
  similar experience?
 
   Wasn't aware of spotcolors, but I bet you could hack the PDF
  reasonably easily (if you have many figures you might have to
  use awk/sed/perl ?) ... if you don't use R, what is your alternative
  for creating the figures?
 
 No, PDF is not a text format and not easy to hack.  It has a binary 
 index of byte positions so you edit it at your peril.
 
 However, this is exactly what professionals have PDF editing tools 
 for.  I believe I used Acrobat (not Reader) to do it when I needed to 
 for my books.

  OK. I was misremembering the good old days when I used to hack
the PostScript coming out of gnuplot.  I must admit that when I look
at PDFs coming out of R, as in

pdf(test.pdf)
plot(1:10,1:10,pch=16,col=rep(1:5,2),cex=2)
dev.off()

  I still see text-like bits like

/sRGB cs 0.000 0.804 0.000 scn

  that are clearly (by experiment) hackable.
  That doesn't mean it's easy or a good idea in practice.

   Re Ted's comment that it's better to compute in R and draw
figures outside: that really depends on one's comfort level with
various tools and the tradeoffs between (1) command-line
control and reproducibility (2) the ability to do subtle visual
design adjustments by hand.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] MLE where loglikelihood function is a function of numerical solutions

2011-04-13 Thread Berend Hasselman

Questions:

1. why are you defining Bo within a loop? 
2. Why are you doing library(nleqslv) within the loop?

Doing both those statements outside the loop once is more efficient.

In your transdens function you are not using the function argument
parameters, why?
Shouldn't there be a with(parameters) since otherwise where is for example
K_vv supposed to come from?

I can't believe that the code worked: in the call of nleqslv you have ...
control=list(matxit=10) ...
It should be maxit and nleqslv will issue an error message and stop (at
least in the latest versions).
And why 10? If that is required, something is not ok with starting
values and/or functions.

Finally the likelihood function at the end of your code

#Maximum likelihood estimation using mle package 
library(stats4) 
#defining loglikelighood function 
#T - length(v) 
#minuslogLik - function(x,x2) 
#{f - rep(NA, length(x)) 
#for(i in 1:T) 
#{ 
#f[1] - -1/T*sum(log(transdens(parameters = parameters, x = 
c(v[i],v[i+1])))-log(Jac(outmat=outmat, x2=c(v[i],r[i]))) 
#} 
#f 
# } 

How do the arguments of your function x and x2 influence the calculations in
the likelihood function?
As written now with argument x and x2 not being used in the body of the
function, there is nothing to optimize.
Shouldn't f[1] be f[i] because otherwise the question is why are looping
for( i in 1:T)?
But then returning f as a vector seems wrong here. Shouldn't a likelihood
function return a scalar?


Berend

--
View this message in context: 
http://r.789695.n4.nabble.com/MLE-where-loglikelihood-function-is-a-function-of-numerical-solutions-tp3439436p3447224.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Hadley Wickham

 Even so, this would depend on what your publisher/printer
 requires in what you submit. It would be important to obtain
 from them a full and exact specification of what they require
 for colour printing in files submitted to them for printing.

No one else has mentioned this, but the publisher is trying to make
money, not make your life easier.  Sometimes the right thing to do is
say Hey - you guys are the experts at this, you convert my RGB pdfs
to the correct format.  It's worthwhile to push back a bit to
publishers and get them to do their job.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-13 Thread Karl Ove Hufthammer

Marius Hofert wrote:

 Haha, I found a hack (using the letter l):
 
 plot(0,0,main=expression(italic(X)[1]^bolditalic(l)))

Why cheat when you can use a *real* prime character:

  plot(0, 0, main=expression(paste(italic(X)[1],\u2032)))

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Merge matrix

2011-04-13 Thread Pankaj Barah

Hi Kevin,

Sorry.it is not Bt ? A and B are two independent Matrices with equal number
rows and different number of columns.

dim (A)
[1] 30380   104

dim(B)

[1] 3038063

I want to  combine both A and B to matrix C wheredim(C)
[1] 30380   167

So I got the answer

C-cbind(A,B)

Thanks and regards,

Pankaj





Regards,

Pankaj



On Tue, Apr 12, 2011 at 5:36 PM, Kevin E. Thorpe
kevin.tho...@utoronto.cawrote:

 On 04/12/2011 10:54 AM, pankaj borah wrote:

 I have two matrices A and B

  dim (A)

 [1] 30380   104

  dim(Bt)


 [1] 3038063

 I want to  combine both A and B to matrix C where

 dim(C)

 [1] 30380   167

 How do I do that ?


 Assuming that Bt is the transpose of B and you want C = [A|Bt] you could
 do:

 C - cbind(A,t(B))


 Regards,

 Pankaj Barah
 Department of Biology,
 Norwegian University of Science  Technology (NTNU)
 Realfagbygget, N-7491

[[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Kevin E. Thorpe
 Biostatistician/Trialist, Knowledge Translation Program
 Assistant Professor, Dalla Lana School of Public Health
 University of Toronto
 email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] iterative method in R software

2011-04-13 Thread rvohen

In confirmatory factor analysis,we need to estimate unknown parameters . I
read a book about it,which needs iterative method to estimate parameters,but
I don't know it.Does someone konw it?Thank u!

--
View this message in context: 
http://r.789695.n4.nabble.com/iterative-method-in-R-software-tp3446918p3446918.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] split string into individual valus while reading in R

2011-04-13 Thread Ram H. Sharma

Sorry NULL in V8 should be corrected as NA

On Wed, Apr 13, 2011 at 7:33 AM, Ram H. Sharma sharma.ra...@gmail.comwrote:

 It looks like my question is not clear, I have not get any suggestion, yet
 let me reiterate my problem:

 My data looks like this to be read from a text file. As I provided earlier
 the A, B, and H in V4 column has much longer chain.

 V1 V2   V3V4
 1_1ch1   0.0   AHAH
 1_2ch1   0.20 AHAAA

 ...so on

 I want to read this as (splitting AHBinto individual characters)
 V1 V2   V3 V4   V5V6  V7  V8
 1_1ch1   0.0   AH  AHNA
 1_2ch1   0.20 AH  AAA

 Any trick to do this?

 Thanks;

 Ram H


 On Tue, Apr 12, 2011 at 5:19 PM, Ram H. Sharma sharma.ra...@gmail.comwrote:

 Dear R experts

 Sorry for posting:
 I have text file that I need to read into R (this is out from different
 program).   A portion of  data look like:

 ;example data
 1_1 ch1 0.0
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_2 ch1 0.1
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_3 ch1 0.1
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_4 ch1 0.2
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_5 ch1 0.2
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_6 ch1 0.3
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_7 ch1 0.3
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_8 ch1 0.4
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_9 ch1 0.4
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_10 ch1 0.5
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_11 ch1 0.5
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_12 ch1 0.6
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_13 ch1 0.6
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_14 ch1 0.7
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_15 ch1 0.7
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH

  read_data - read.table (my_text_file.txt,header=F,comment=;)
 How can I read this into table where columns first five column are
 maintained as such (e.g. 1_1  ch1  0.0 H ) where as the long single column
 (filled with A, H, B) split into individual columns. Or alternatively first
 read using the following command and then able to split the variable 4.

 Thank you for help;


 Ram H




 --

 Ram H




-- 

Ram H

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 1 continuous non-normal variable ~ 4 factors + 1 continuous covariate (with interactions)

2011-04-13 Thread Alal

Thanks, I guess I can do that, and it actually seem appropriate for one of my
variable. 

But can you do post-hoc tests on a survival analysis? Use contrasts or
something? 

--
View this message in context: 
http://r.789695.n4.nabble.com/1-continuous-non-normal-variable-4-factors-1-continuous-covariate-with-interactions-tp3444378p3446782.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Find characters in a matrix and return a TRUE-FALSE vector

2011-04-13 Thread Vitrifizierung

I have the following problem:

My data is a matrix of multiple columns and rows. The column I am interested
in looks like that (I think it is a column in a matrix that contains a
vector each?):

[[1]]
[1] A B C
[4] D E

[[2]]
[1] A D E

[[3]]
[1] C E F
[4] G

I now want to look for a special word (for example C; in my case the single
entries are words, not single letters) and return a vector, like

[1] TRUE FALSE TRUE

Is there an easy way to do it?
Sorry, I am really a beginner and I did not really solve the problem by
looking into other threads...

Thanks a lot,
Nina

--
View this message in context: 
http://r.789695.n4.nabble.com/Find-characters-in-a-matrix-and-return-a-TRUE-FALSE-vector-tp3446868p3446868.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] split string into individual valus while reading in R

2011-04-13 Thread Ram H. Sharma

It looks like my question is not clear, I have not get any suggestion, yet
let me reiterate my problem:

My data looks like this to be read from a text file. As I provided earlier
the A, B, and H in V4 column has much longer chain.

V1 V2   V3V4
1_1ch1   0.0   AHAH
1_2ch1   0.20 AHAAA

...so on

I want to read this as (splitting AHBinto individual characters)
V1 V2   V3 V4   V5V6  V7  V8
1_1ch1   0.0   AH  AHNULL
1_2ch1   0.20 AH  AAA

Any trick to do this?

Thanks;

Ram H


On Tue, Apr 12, 2011 at 5:19 PM, Ram H. Sharma sharma.ra...@gmail.comwrote:

 Dear R experts

 Sorry for posting:
 I have text file that I need to read into R (this is out from different
 program).   A portion of  data look like:

 ;example data
 1_1 ch1 0.0
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_2 ch1 0.1
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_3 ch1 0.1
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_4 ch1 0.2
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_5 ch1 0.2
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_6 ch1 0.3
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_7 ch1 0.3
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_8 ch1 0.4
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_9 ch1 0.4
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_10 ch1 0.5
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_11 ch1 0.5
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_12 ch1 0.6
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_13 ch1 0.6
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_14 ch1 0.7
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_15 ch1 0.7
 HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH

  read_data - read.table (my_text_file.txt,header=F,comment=;)
 How can I read this into table where columns first five column are
 maintained as such (e.g. 1_1  ch1  0.0 H ) where as the long single column
 (filled with A, H, B) split into individual columns. Or alternatively first
 read using the following command and then able to split the variable 4.

 Thank you for help;


 Ram H




-- 

Ram H

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ddply and nlminb

2011-04-13 Thread Robert Clement


Hello

I'm new to R (one week) so please excuse any obvious mistakes in my code 
or posting.


I am attempting to fit a non linear function defining the relationship 
between dependent variable A and  the variables PAR and T grouped by the 
condition Di.


The following steps are taken in the Rcode below:
1) load the data (not shown)
2) define the function to be fit
3) define the starting values of the fit parameters and their upper and 
lower limits
4) fit the function using all data using nlminb (Di groupings not 
considered)

5) estimate the parameter standard errors
6) fit the function with the data grouped by Di using ddply

The first 5 steps appear to be working alright and produce reasonable 
fit parameters and parameter errors.


However, when attempting to fit the function with the data grouped by Di 
the parameters are returned with the same value for each grouping:


 R3Dpar
Di V1   V2   
V3 V4V5  V6

1 0.033461 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
2 0.082682 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
3 0.133670 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
4 0.195940 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
5 0.272430 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
6 0.368960 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
7 0.500150 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518
8 0.748380 0.06081973 -2.119109e-05 31.75188 -2.631321 17.32565 0.821518


Can someone point out the error in my ways, and also the correct way to 
return both the parameter estimates and the parameter errors.


Thank you

Robert




#  END OF CODE 
--



#  Data loaded in data frame M2 and attached - not shown.

#  Define function
fnonRecHypT - function(x){
qeo   = x[1]
dqe   = x[2]
Fmo   = x[3]
TL= x[4]
To= x[5]
TH= 45
theta = x[6]

qe = qeo + dqe*T
theta =
phi = (TH-To)/(To-TL)
Fm = Fmo*((T-TL)*(TH-T)^ phi) / ((To-TL)*(TH-To)^ phi)
Aest = (qe*PAR + Fm - ((qe*PAR + Fm)^2 - 4*theta*qe*PAR*Fm)^0.5)/(2.*theta)
result = sum((Aest-A)^2 )
}

#  Define parameter starting values and limits
startval =  c(0.05,-0.01,20, -5,15,0.5)
lowval   =  c(0.01,-0.05, 5,-15, 7,0.1)
uppval   =  c(0.2,  0.02,50,  0,25,0.99)

#  Fit using entire data set
R3 -  nlminb(startval, fnonRecHypT, control=list(trace=1), 
lower=lowval, upper = uppval)


#  estimate fit parameter standard errors
x = R3$par
D3 = hessian(fnonRecHypT,x)
e3 = sqrt(diag(solve(D3)))

#  attempt to fit by grouping variable, Di   (Di is a real 
vector with 8 possible values)

R3Dpar= ddply(M2,
.(Di),
function(x) {res - nlminb(startval, fnonRecHypT, 
control=list(trace=1), lower=lowval, upper = uppval)

return(res$par)
}
)


# - END OF CODE 
-


--
School of Geosciences
University of Edinburgh
202 Crew Building
West Mains Road
Edinburgh
Scotland
EH9 3JN

Ph  +44 (0)131 650 7732
Fx  +44 (0)131 662 0478
email  robert.clem...@ed.ac.uk


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Boxplot with two or more Y vectors

2011-04-13 Thread Håvard Wahl Kongsgård

Hi, for a simple boxplot in R, in the formula is it possible to include two
or more Y vectors directly. Or is that only possibility by aggregating the
data first?


-- 
Håvard Wahl Kongsgård
Peace Research Institute Oslo (PRIO)

http://havard.security-review.net/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ordinal predictor in anova

2011-04-13 Thread Vincent Staszewski


  Hi,

  I have a dataset with a continuous response variable and, among  
other predictors, an ordinal variable.


  Here is what it could look like

treatment - factor(rep(c(AA, AC, AD,AE, AB), each = 10))

length - c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,
 57, 58, 60, 59, 62, 60, 60, 57, 59, 61,
 58, 61, 56, 58, 57, 56, 61, 60, 57, 58,
 58, 59, 58, 61, 57, 56, 58, 57, 57, 59,
 62, 66, 65, 63, 64, 62, 65, 65, 62, 67)

  This ordinal variable (treatment) contains 5 classes and can not be  
recoded as a numerical variable (for example 1, 2,3,4 and 5) because I  
have no information on the relative difference between classes (it  
could as well be 1, 24, 25, 50,55). Coding it simply as a categorical  
variable is also not ok because there is a hierarchy between the  
groups. This is the definition of ordinal variable.


  I have defined the ordinal variable (the order is from a priori prediction)

#NOW DEFINING THE ORDERED VARIABLE:
sugars$treatment_ordered-ordered(sugars$treatment,c(AA, AB,AC,  
AD,AE))


  The problem is that when I run the ANOVA or perform model  
comparison, R is giving me the same results if I use either  
treatment as a predictor in the model or treatment_ordered.


anova(lm(length ~ treatment_ordered, sugars));anova(lm(length ~  
treatment, sugars))



  I've thought about using planned contrasts but I do not manage to  
translate the prediction (AAABAC ADAE)

would it be
(AA(AB,AC,AD,AE))((AA,AB)(AC,AD,AE))etc etc
(-1,0.25,0.25,0.25,0.25)(-0.5,-0.5,0.33,0.33,0.33)...

OR
(AAAB)(AB,AC)...
(-1,1,0,0,0)(0,-1,1,0,0)...


and anyway you can only test one matrix at a time.


 To make thinks a bit more complicated, I also have another factor,  
categorical variable and I'd like to test the interaction with the  
ordinal factor (of course sample sizes are higher in the real  
dataset):


treatment - factor(rep(c(AA, AC, AD,AE, AB), each = 10))

length - c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,
 57, 58, 60, 59, 62, 60, 60, 57, 59, 61,
 58, 61, 56, 58, 57, 56, 61, 60, 57, 58,
 58, 59, 58, 61, 57, 56, 58, 57, 57, 59,
 62, 66, 65, 63, 64, 62, 65, 65, 62, 67)
treatment2 - c(BA, BA, BB, BB, BC, BC, BD, BD, BE, BE,
 BA, BA, BB, BB, BC, BC, BD, BD, BE, BE,
 BA, BA, BB, BB, BC, BC, BD, BD, BE, BE,
 BA, BA, BB, BB, BC, BC, BD, BD, BE, BE,
 BA, BA, BB, BB, BC, BC, BD, BD, BE, BE)

sugars - data.frame(treatment, length,treatment2)

sugars$treatment_ordered-ordered(sugars$treatment,c(AA, AB,AC,  
AD,AE))


 anova(lm(length ~  
treatment_ordered+treatment2+treatment_ordered:treatment2, sugars))


  Any suggestions?


  Thanks,

   V.



--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



- End forwarded message -


--
Vincent STASZEWSKI
Institute of Infection and Immunology Research
Ashworth Laboratories
Kings' Buildings
EH9 3JT
Edinburgh
Scotland, UK
Tel: 0044(0)131 650 8682
webpage: http://reece.bio.ed.ac.uk/vincent-staszewski.html

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Find characters in a matrix and return a TRUE-FALSE vector

2011-04-13 Thread Jorge Ivan Velez

Hi Nina,

You might try

sapply(yourdata, function(x) any(x == C))

See ?sapply for more details.

HTH,
Jorge


On Wed, Apr 13, 2011 at 7:17 AM, Vitrifizierung  wrote:

 I have the following problem:

 My data is a matrix of multiple columns and rows. The column I am
 interested
 in looks like that (I think it is a column in a matrix that contains a
 vector each?):

 [[1]]
 [1] A B C
 [4] D E

 [[2]]
 [1] A D E

 [[3]]
 [1] C E F
 [4] G

 I now want to look for a special word (for example C; in my case the single
 entries are words, not single letters) and return a vector, like

 [1] TRUE FALSE TRUE

 Is there an easy way to do it?
 Sorry, I am really a beginner and I did not really solve the problem by
 looking into other threads...

 Thanks a lot,
 Nina

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Find-characters-in-a-matrix-and-return-a-TRUE-FALSE-vector-tp3446868p3446868.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Overcoming warning in package zoo

2011-04-13 Thread Rita Carreira


Dear R users,I have a long program that I am trying to run--I am using RStudio 
as my interface with R. The pieces of the program run well individually but 
when I try to run everything in sequence it bogs down because of a warning 
after using rollmax from package zoo. Here is the warning:
In rollmax.zoo(zoo(Pmat), 7, na.pad = FALSE, align = right) :   na.pad is 
deprecated. Use fill. 
The code that generates this warning islibrary(zoo)Pmax - 
as.data.frame(rollmax(zoo(Pmat), 7, na.pad = FALSE, align = right))
I also obtain the same warning when I run Pmax - 
as.data.frame(rollmax(zoo(Pmat), 7, na.pad = TRUE, align = right))
Note that Pmat is a 136x271 double matrix and Pmax is a dataframe (265 obs. of 
136 variables). The function rollmax is doing exactly what I want it to do, so 
my data is as it's supposed to be and everything is fine, except for the 
warning, that is... 
The version of zoo that I am running was obtained by running
install.packages(zoo, repos = http://r-forge.r-project.org;)
So, my ultimate goal is to either find a way for R to ignore this warning and 
move on to the next step or fix the problem that is causing the warning.
Any thoughts or suggestions?
Thanks so much and have a great day!

Rita 

If you think education is expensive, try ignorance--Derek Bok


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Overcoming warning in package zoo

2011-04-13 Thread Gabor Grothendieck

On Wed, Apr 13, 2011 at 10:38 AM, Rita Carreira
ritacarre...@hotmail.com wrote:

 Dear R users,I have a long program that I am trying to run--I am using 
 RStudio as my interface with R. The pieces of the program run well 
 individually but when I try to run everything in sequence it bogs down 
 because of a warning after using rollmax from package zoo. Here is the 
 warning:
 In rollmax.zoo(zoo(Pmat), 7, na.pad = FALSE, align = right) :   na.pad is 
 deprecated. Use fill.
 The code that generates this warning islibrary(zoo)Pmax - 
 as.data.frame(rollmax(zoo(Pmat), 7, na.pad = FALSE, align = right))
 I also obtain the same warning when I run Pmax - 
 as.data.frame(rollmax(zoo(Pmat), 7, na.pad = TRUE, align = right))
 Note that Pmat is a 136x271 double matrix and Pmax is a dataframe (265 obs. 
 of 136 variables). The function rollmax is doing exactly what I want it to 
 do, so my data is as it's supposed to be and everything is fine, except for 
 the warning, that is...
 The version of zoo that I am running was obtained by running
 install.packages(zoo, repos = http://r-forge.r-project.org;)
 So, my ultimate goal is to either find a way for R to ignore this warning and 
 move on to the next step or fix the problem that is causing the warning.
 Any thoughts or suggestions?
 Thanks so much and have a great day!


Use the CRAN version of zoo.  If you do want to use the development
version then surely the error message is pretty explicit in what you
have to do.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Removing objects and clearing memory

2011-04-13 Thread Jim Silverton

Hey guys thanks very much for al your responses. It was very helpful.

Jim

On Tue, Apr 12, 2011 at 6:59 PM, Peter Ehlers ehl...@ucalgary.ca wrote:

 On 2011-04-12 15:52, seeliger.c...@epamail.epa.gov wrote:

 How do I remove all objects except one in R?

  rm(list=ls()) #will remove ALL objects


 But he wanted to remove all objects ***except one***!!!  So what's the
 point of this answer?

 I can't see any way except a rather round-about kludge to do what the
 OP wants. ...


 That might work.  Or try this, replacing QQQ with the name of the object
 you want to keep.
 rm(list=ls()[!grepl('^QQQ$',ls())])

 cur


 package gdata has function keep() which does just what the OP asks for.

 Peter Ehlers




-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Wilcoxon rank sum in unbalanced design

2011-04-13 Thread netrunner

Hi everyone!
I need to perform a Wilcoxon rank sum test, but I have some ties and the
groups have different size also. When I deal with ties I use the
wilcox.exact function, how can I solve the different size problem using this
function? 

thanks

net

--
View this message in context: 
http://r.789695.n4.nabble.com/Wilcoxon-rank-sum-in-unbalanced-design-tp3447400p3447400.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata

Hello,

When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.0, etc.  I
have tried manipulating the options, but it is only effective once I
multiply by 1..  

I appreciate any suggestions!

Thanks! 

Example:

z - sqldf (select ST,
SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
ss09 group by ST)

z shouldn't be whole numbers.  


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] glmnet

2011-04-13 Thread Janina Hemmersbach

Hello,
I´m trying to in install the package 'glmnet' but I get always the error 
massage package ‘Matrix’ is not available. I search on you site, but I 
coundn´t find the package there either. Is their still a package called 
Matrix? Or how can I use glmnet?

Thank You in advance.
Kind regards

J.Hemmersbach

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] split string into individual valus while reading in R

2011-04-13 Thread Upton, Stephen (Steve) (CIV)

Hi Ram,

Try this on your V4:
d - yourdf$V4
unlist(lapply(1:nchar(d),function(x) substr(d,x,x)))

HTH
steve
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ram H. Sharma
Sent: Wednesday, April 13, 2011 7:34 AM
To: r-help@r-project.org
Subject: Re: [R] split string into individual valus while reading in R

It looks like my question is not clear, I have not get any suggestion, yet
let me reiterate my problem:

My data looks like this to be read from a text file. As I provided earlier
the A, B, and H in V4 column has much longer chain.

V1 V2   V3V4
1_1ch1   0.0   AHAH
1_2ch1   0.20 AHAAA

...so on

I want to read this as (splitting AHBinto individual characters)
V1 V2   V3 V4   V5V6  V7  V8
1_1ch1   0.0   AH  AHNULL
1_2ch1   0.20 AH  AAA

Any trick to do this?

Thanks;

Ram H


On Tue, Apr 12, 2011 at 5:19 PM, Ram H. Sharma
sharma.ra...@gmail.comwrote:

 Dear R experts

 Sorry for posting:
 I have text file that I need to read into R (this is out from different
 program).   A portion of  data look like:

 ;example data
 1_1 ch1 0.0

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_2 ch1 0.1

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_3 ch1 0.1

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_4 ch1 0.2

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_5 ch1 0.2

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_6 ch1 0.3

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_7 ch1 0.3

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_8 ch1 0.4

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_9 ch1 0.4

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_10 ch1 0.5

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_11 ch1 0.5

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHA
 1_12 ch1 0.6

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_13 ch1 0.6

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_14 ch1 0.7

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH
 1_15 ch1 0.7

HAAAHHAAAHHAHHAHHAHAHHHAAHHAHHAAAHHAHHAAHAHHHAAAHHHAAAHHAHHHAAAH
HHHAHHAAHHAHAAAHAHHAAHAAHAAHHHAHAHHAHH

  read_data - read.table (my_text_file.txt,header=F,comment=;)
 How can I read this into table where columns first five column are
 maintained as such (e.g. 1_1  ch1  0.0 H ) where as the long single column
 (filled with A, H, B) split into individual columns. Or alternatively
first
 read using the following command and then able to split the variable 4.

 Thank you for help;


 Ram H




-- 

Ram H

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Matthieu Stigler


Jim

Thanks for your feedback! The problem is that the people those 
responsible for layout are us... We are doing the book in Latex, and 
till now did not need any other software.


But I am scared we will need use kind of Indesign  co softwares to be 
able to use our R plots, since R can't export into spotcolor...


Thanks!

Mat





Le 13/04/2011 13:41, Jim Lemon a écrit :

On 04/13/2011 07:15 PM, Matthieu Stigler wrote:

Hi

We are about to publish a book, which contains figures made with R
plots. An important detail that we did not take into account is that the
book will not be printed in 4 colors (cmyk mode), but only 2 (black
+spotcolor). The spotcolor we use is part of the big Pantone family.

The problem is that both pdf() and postscript() offer either rgb or
cmyk, but no spotcolors such as pantone. I'm afraid this constraint
can't be solved at all, and we can't use R for creating these plots? I
did not find any package that would extend the colormodel to include
spot colors... Did anyone had a similar experience?


Hi Matthieu,
I have occasionally had to handle this. In every case, those 
responsible for the actual layout would take grayscale figures (black, 
white and some gray level) and then convert the gray level to the 
desired color in the illustrations. I have always supplied Postscript 
in these cases.


Jim



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] glmnet

2011-04-13 Thread Ben Bolker

Janina Hemmersbach janina.hemmersbach at scai.fraunhofer.de writes:

 
 Hello,
 I´m trying to in install the package 'glmnet'
 but I get always the error massage package ‘Matrix’ is
 not available. I search on you site, but I coundn´t 
 find the package there either. Is their still a
 package called Matrix? Or how can I use glmnet?

  We need more information.
  What are the results of sessionInfo()?
  Are you using the latest version of R (at least R 2.12.x)?
  What repository are you using or trying to use?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotting line graphs for output from crosstabs function

2011-04-13 Thread taby gathoni


Hi R-users,

This is a generic question, is there a way to plot a line graph for the output 
from crosstable function? one of the inputs to the crosstab function is 
categorical.

Taby --








[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] glmnet

2011-04-13 Thread Alexander Engelhardt


Am 13.04.2011 16:58, schrieb Janina Hemmersbach:

Hello,
I´m trying to in install the package 'glmnet' but I get always the error massage package ‘Matrix’ is 
not available. I search on you site, but I coundn´t find the package there either. Is their still a 
package called Matrix? Or how can I use glmnet?


I had the same error when I tried
 install.packages(glmnet)

However, it worked when I first installed Matrix and then glmnet:
 install.packages(Matrix)
 install.packages(glmnet)

Have you tried this?

 -- Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to get the names of all the packages that depend on a give package?

2011-04-13 Thread thmsfuller...@gmail.com

Dear All,

I want to check what packages depends on a given package. For example,
glmnet depends on Matrix. I want get the names of all the packages
that depend on Matrix on CRAN. Is there a way to do so? Thanks!

 library(glmnet)
Loading required package: Matrix
Loading required package: lattice

-- 
Tom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decimals in R/SQL

2011-04-13 Thread jim holtman

You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
+ select key, sum(V1 * V2) / sum(V3)
+ from x
+ group by key
+ ')


 xsum
  key sum(V1 * V2) / sum(V3)
1   1   19.38166
2   2   17.40503
3   3   71.48818
 dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3)
), row.names = c(NA, 3L), class = data.frame)




On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Barry Rowlingson

2011/4/13 Matthieu Stigler matthieu.stig...@gmail.com:
 Jim

 Thanks for your feedback! The problem is that the people those responsible
 for layout are us... We are doing the book in Latex, and till now did not
 need any other software.

 But I am scared we will need use kind of Indesign  co softwares to be able
 to use our R plots, since R can't export into spotcolor...

 I've just had a look into how PostScript handles spot colours, having
been hacking PS files for many years its not something I'd played with
before.

 This page gave me a PS file that used spot colours:

http://pslib.sourceforge.net/examples.php

and it seems that you can set a colour using a colour space array:

[ /Separation (PANTONE Violet C)
  /DeviceCMYK { dup 0.75 mul exch dup 0.94 mul exch dup
0.00 mul exch 0.00 mul }
] setcolorspace 1.00 setcolor

Now, my screen previewer has no idea what colour PANTONE Violet C is,
so there's an alternative CMYK colour space version. In fact PANTONE
Violet C is just a label. The idea is that when processed for colour
separations the software will use the separation colours.

Now, to get this into R is tricky. Either you have to rewrite the PS
driver or hack the PS output file. I produced a simple R plot with one
purple (col=#FF00FF) blob and saved to a PS file. The bit of the PS
that set the colour was this:

 /bg { 1 0 1 rgb } def
 1 0 1 rgb

 - the two settings here are for the outline and fill of the blob,
which was created using pch=19.

 To specify this as a spot colour (with a CMYK alternative for
preview), I changed that to this:

/pvc {
[ /Separation (PANTONE Violet C)
  /DeviceCMYK { dup 0.75 mul exch dup 0.94 mul exch dup
0.00 mul exch 0.00 mul }
] setcolorspace
} def

/bg {
pvc 1.00 setcolor
} def
pvc 1.00 setcolor

Now if you only have one or two spot colours and only a few places
where they occur then it might be possible for you to do this kind of
edit/replacement - either manually or automatically. And beware I'm
not even sure this is right or that your publisher would be happy
anyway.

 Anyway, thats enough PS hacking for one day.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Jeremy Hetzel

Scribus claims to be able to convert RGB/CMYK colors to spot colors:
http://documentation.scribus.net/index.php/Spot_Colors

I've never used Scribus, but it's floss.

Jeremy
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata

Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large table 
and here is a subset of the variables I am working with.  


SS09
ST  AGEPPWGTP
33323130 130
33324110 186
333251 2 162
33326180  93
33327129 135
33328166  54
33329162  54
0121 138
1129 103
21 7 144
31 5 143

z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
from ss09 group by ST)

ST wgtage
1   1 37
2   2 33
3   4 36
4   5 37
5   6 35

z - dbGetQuery( connSQLite , select ST, 
SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
group by ST)

ST   wgtage
1   1 37.57083
2   2 33.94322
3   4 36.14499
4   5 37.51233
5   6 35.65581

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Wednesday, April 13, 2011 12:16 PM
To: Rachel Licata
Cc: r-help@r-project.org
Subject: Re: [R] Decimals in R/SQL

You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
+ select key, sum(V1 * V2) / sum(V3)
+ from x
+ group by key
+ ')


 xsum
  key sum(V1 * V2) / sum(V3)
1   1   19.38166
2   2   17.40503
3   3   71.48818
 dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3)
), row.names = c(NA, 3L), class = data.frame)




On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Barry Rowlingson

On Wed, Apr 13, 2011 at 5:26 PM, Jeremy Hetzel jthet...@gmail.com wrote:
 Scribus claims to be able to convert RGB/CMYK colors to spot colors:
 http://documentation.scribus.net/index.php/Spot_Colors

 I've never used Scribus, but it's floss.

 Scribus can import output from R's svg driver (with at least some
degree of success).

 I just imported a simple plot to Scribus, specified some spot
colours, and saved as EPS - the colours were specified as /Separation
colours as described in my earlier message.

 I did try inkscape for this, but forgot about scribus. Inkscape
doesnt seem to have such colour management.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decimals in R/SQL

2011-04-13 Thread jim holtman

The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':

 x - read.table(textConnection(   ST  AGEP   
  PWGTP
+ 33323130 130
+ 33324110 186
+ 333251 2 162
+ 33326180  93
+ 33327129 135
+ 33328166  54
+ 33329162  54
+ 0121 138
+ 1129 103
+ 21 7 144
+ 31 5 143), header = TRUE)
 closeAllConnections()
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 require(sqldf)
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1 23
 # change to numeric instead of integer
 x$AGEP - as.numeric(x$AGEP)
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1   23.81446



On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote:
 Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large 
 table and here is a subset of the variables I am working with.


 SS09
            ST                  AGEP                    PWGTP
 33323    1                30                 130
 33324    1                10                 186
 33325    1                 2                 162
 33326    1                80                  93
 33327    1                29                 135
 33328    1                66                  54
 33329    1                62                  54
 0    1                21                 138
 1    1                29                 103
 2    1                 7                 144
 3    1                 5                 143

 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
 from ss09 group by ST)

 ST wgtage
 1   1     37
 2   2     33
 3   4     36
 4   5     37
 5   6     35

 z - dbGetQuery( connSQLite , select ST, 
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
 group by ST)

 ST   wgtage
 1   1 37.57083
 2   2 33.94322
 3   4 36.14499
 4   5 37.51233
 5   6 35.65581

 -Original Message-
 From: jim holtman [mailto:jholt...@gmail.com]
 Sent: Wednesday, April 13, 2011 12:16 PM
 To: Rachel Licata
 Cc: r-help@r-project.org
 Subject: Re: [R] Decimals in R/SQL

 You at least have to provide a subset of 'ss09' so we can see what the
 original data looks like.  I have not had any problems with decimals
 in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
 +     select key, sum(V1 * V2) / sum(V3)
 +         from x
 +         group by key
 + ')


 xsum
  key sum(V1 * V2) / sum(V3)
 1   1               19.38166
 2   2               17.40503
 3   3               71.48818
 dput(xsum)
 structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / 
 sum(V3)
 ), row.names = c(NA, 3L), class = data.frame)




 On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] How to get the names of all the packages that depend on a give package?

2011-04-13 Thread Dennis Murphy

Hi:

Here's one way, assuming Matrix (or whichever package you want to
investigate) is installed on your system:

u - installed.packages()
u['Matrix', 'Depends']
[1] R (= 2.10.0), stats, methods, utils, lattice

HTH,
Dennis

On Wed, Apr 13, 2011 at 8:57 AM, thmsfuller...@gmail.com 
thmsfuller...@gmail.com wrote:

 Dear All,

 I want to check what packages depends on a given package. For example,
 glmnet depends on Matrix. I want get the names of all the packages
 that depend on Matrix on CRAN. Is there a way to do so? Thanks!

  library(glmnet)
 Loading required package: Matrix
 Loading required package: lattice

 --
 Tom

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to get the names of all the packages that depend on a give package?

2011-04-13 Thread Gabor Grothendieck

On Wed, Apr 13, 2011 at 11:57 AM, thmsfuller...@gmail.com
thmsfuller...@gmail.com wrote:
 Dear All,

 I want to check what packages depends on a given package. For example,
 glmnet depends on Matrix. I want get the names of all the packages
 that depend on Matrix on CRAN. Is there a way to do so? Thanks!

 library(glmnet)
 Loading required package: Matrix
 Loading required package: lattice


Look at the bottom of this page:
http://cran.r-project.org/web/packages/Matrix/index.html

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] B %*% t(B) = R , then solve for B

2011-04-13 Thread Doran, Harold

Correct. In the example I gave you yesterday, I used a different matrix, but 
showed this solution because it also answered the other question you had about 
doing it on non-square matrices. Of course, Spencer Graves also gave a very 
useful answer suggesting QR decomposition. 

I also gave you the example in the context of linear regression because that is 
commonly why statisticians will use these factorizations. 

If your matrix is small, chol() works fine. If your matrix is big and sparse, 
see Cholesky() in the matrix package (a package that I often refer to as a 
God-send)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Shawn Koppenhoefer
 Sent: Tuesday, April 12, 2011 12:33 PM
 To: r-help@r-project.org
 Subject: Re: [R] B %*% t(B) = R , then solve for B
 Importance: High
 
 Solution found...
 
 Sorry for not having known this,...
 
 Apparently, what I was after is called a Choleski factorization.
 
 
 The solution pops right out of R, as follows:
 
   M-matrix(c(0.6098601,  0.2557882,   0.1857773,
 + 0.2557882,  0.5127065,  -0.1384238,
 + 0.1857773, -0.1384238,   0.9351089 ),
 +   nrow=3, ncol=3, byrow=TRUE)
   chol(M)
[,1]  [,2]   [,3]
 [1,] 0.7809354 0.3275408  0.2378907
 [2,] 0.000 0.6367288 -0.3397722
 [3,] 0.000 0.000  0.8735398
  
 
 
 
 Thanks again for all your help!
 
 
 /shawn
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread Jeremy Hetzel

By the way, I had trouble importing PDFs into Scribus 1.3.3.  However, 
Scribus 1.4.0rc3 had no problem opening multi-page PDFs, assuming the 
appropriate Ghostscript was also installed (I'm on Windows 7 at the moment). 
 So Matthieu might be able to combine all of his figures into a single PDF, 
convert to spot color in Scribus, then break the PDF apart again for Latex.

Jeremy
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] latex, eps graphics and transparent colors

2011-04-13 Thread Michael Friendly

I have a diagram to be included in latex, where all my figures are .eps 
graphics (so pdflatex is not an

option) and I want to achieve something
like the following: three concentric filled circles varying in lightness 
or saturation.  It is easiest to do this using
transparency, but in my test using the postscript driver, the 
transparent color fills do not appear.  Is it

correct that postscript() does not support transparency?

circle -function (radius = 1, segments=61) {
angles - (0:segments)*2*pi/segments
radius * cbind( cos(angles), sin(angles))
}

plot(1:5, 1:5, type='n', xlim=c(-1,5), ylim=c(-1,5), xlab='', ylab='',
asp=1, xaxt=n, yaxt=n)

#clrs - trans.colors(lightblue, alpha=c(.2, .4, .6))  ## from heplots 
package

clrs - c(#ADD8E633, #ADD8E666, #ADD8E699)

c1 - circle(3)
polygon(c1, col=clrs[1], border=lightblue)
polygon(.67*c1, col=clrs[2], border=lightblue)
polygon(.33*c1, col=clrs[3], border=lightblue)

arrows(-1,  0, 5, 0, angle=10, length=.2, lwd=2, col=darkgray)
arrows( 0, -1, 0, 5, angle=10, length=.2, lwd=2, col=darkgray)

One alternative that sort of works is to use the png() driver, and then
convert fig.png fig.eps
but I need very high resolution to make the real diagram legible.

It might suffice to use hcl() colors to approximate what I've done with 
transparency,
but I don't know how to start with a given color (lightblue) and 
achieve roughly

similar resuts.

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fwd: CRAN problem with plyr-1.4.1

2011-04-13 Thread Uwe Ligges




On 12.04.2011 17:29, Hadley Wickham wrote:

Then, can we have the ERROR message, please?
Otherwise the only explanation I can guess is that a mirror grabs the
contents of a repository exactly in the second the repository is updated and
that is unlikely, particularly if more than one mirror is involved.


Isn't one possible explanation that PACKAGES.gz on the mirror was
updated before the package directory was?


It is in the same directory.

Uwe


That would seem to me a
plausible hypothesis to me, because rsync seems to send files in the
top level directory before files in directories below.

Hadley



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Previously attainable fisher's exact test

2011-04-13 Thread Jim Silverton

I have a matrix say,

1  4
23  30

and I want to find the previously attainable fisher's exact test p-value. Is
there a way to do this in R?


-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to get the names of all the packages that depend on a give package?

2011-04-13 Thread Ben Bolker

Dennis Murphy djmuser at gmail.com writes:

 
 Hi:
 
 Here's one way, assuming Matrix (or whichever package you want to
 investigate) is installed on your system:
 
 u - installed.packages()
 u['Matrix', 'Depends']
 [1] R (= 2.10.0), stats, methods, utils, lattice
 
 HTH,
 Dennis
 

  Or, if it's not installed locally, use

u - available.packages()
u['Matrix', 'Depends']

instead.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Wilcoxon rank sum in unbalanced design

2011-04-13 Thread peter dalgaard


On Apr 13, 2011, at 17:07 , netrunner wrote:

 Hi everyone!
 I need to perform a Wilcoxon rank sum test, but I have some ties and the
 groups have different size also. When I deal with ties I use the
 wilcox.exact function, how can I solve the different size problem using this
 function? 

What problem? Wilcoxon tests work quite happily with unequal group sizes. 
There's some loss of power due to imbalance, but the only way around _that_ is 
to move observations from one group to the other...

(And exact testing actually solves a slightly different problem than the tie 
issue, but never mind.)

 
 thanks
 
 net
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Wilcoxon-rank-sum-in-unbalanced-design-tp3447400p3447400.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Removing objects and clearing memory

2011-04-13 Thread Jim Silverton

On Wed, Apr 13, 2011 at 11:12 AM, Jim Silverton jim.silver...@gmail.comwrote:

 Hey guys thanks very much for al your responses. It was very helpful.

 Jim


 On Tue, Apr 12, 2011 at 6:59 PM, Peter Ehlers ehl...@ucalgary.ca wrote:

 On 2011-04-12 15:52, seeliger.c...@epamail.epa.gov wrote:

  How do I remove all objects except one in R?

  rm(list=ls()) #will remove ALL objects


 But he wanted to remove all objects ***except one***!!!  So what's the
 point of this answer?

 I can't see any way except a rather round-about kludge to do what the
 OP wants. ...


 That might work.  Or try this, replacing QQQ with the name of the object
 you want to keep.
 rm(list=ls()[!grepl('^QQQ$',ls())])

 cur


 package gdata has function keep() which does just what the OP asks for.

 Peter Ehlers




 --
 Thanks,
 Jim.




-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data with different time lines for one analysis

2011-04-13 Thread Oliver

Hello

I have one data set that is in the range from 1950 to 2009
with yearly values (one value per year),
and another one from 1959 to 1983 with 364 values,
which means 15 values per year.

Now I want to use both sets in some analysis,
one for example ccf, but also other analyses.

What would you recommend as way to analyze the data?

Should I use the lowest date as t=0 and then use days as base for data
exploration? And later recreate the real date values from that?

I found out that strptime can help me in converting the dates as string into
internal POSIX representation, but I now stcuk with how to go on.

Ciao,
   Oliver

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread baptiste auguie

Hi,

I may be wrong, but I have the impression that tikz (a LaTeX drawing
package) can handle spot colors (that's what Google seemed to tell me
[*]). If this is the case you could output R graphics using the
tikzDevice package, post-process the output (readable, plain text
file), and eventually have LaTeX produce the pdf image with spot
colors.

Worth a try, perhaps.

baptiste


[*] http://wiki.contextgarden.net/Colors#In_TikZ

On 13 April 2011 21:15, Matthieu Stigler matthieu.stig...@gmail.com wrote:
 Hi

 We are about to publish a book, which contains figures made with R plots. An
 important detail that we did not take into account is that the book will not
 be printed in 4 colors (cmyk mode), but only 2 (black +spotcolor). The
 spotcolor we use is part of the big Pantone family.

 The problem is that both pdf() and postscript() offer either rgb or cmyk,
 but no spotcolors such as pantone. I'm afraid this constraint can't be
 solved at all, and we can't use R for creating these plots? I did not find
 any package that would extend the colormodel to include spot colors... Did
 anyone had a similar experience?

 Thanks!!

 Matthieu

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R plots pdf() does not allow spotcolors?

2011-04-13 Thread mat


Barry, thanks a lot for checking for this!! Find below answers.

Le 13. 04. 11 18:26, Barry Rowlingson a écrit :

2011/4/13 Matthieu Stiglermatthieu.stig...@gmail.com:

Jim

Thanks for your feedback! The problem is that the people those responsible
for layout are us... We are doing the book in Latex, and till now did not
need any other software.

But I am scared we will need use kind of Indesign  co softwares to be able
to use our R plots, since R can't export into spotcolor...

  I've just had a look into how PostScript handles spot colours, having
been hacking PS files for many years its not something I'd played with
before.

  This page gave me a PS file that used spot colours:

http://pslib.sourceforge.net/examples.php

and it seems that you can set a colour using a colour space array:

 [ /Separation (PANTONE Violet C)
   /DeviceCMYK { dup 0.75 mul exch dup 0.94 mul exch dup
0.00 mul exch 0.00 mul }
 ] setcolorspace 1.00 setcolor

Now, my screen previewer has no idea what colour PANTONE Violet C is,
so there's an alternative CMYK colour space version. In fact PANTONE
Violet C is just a label. The idea is that when processed for colour
separations the software will use the separation colours.

Now, to get this into R is tricky. Either you have to rewrite the PS
driver or hack the PS output file. I produced a simple R plot with one
purple (col=#FF00FF) blob and saved to a PS file. The bit of the PS
that set the colour was this:

  /bg { 1 0 1 rgb } def
  1 0 1 rgb


I do no get something similar... I used:
set.seed(123)

a-runif(100)

postscript(RplotRGB.ps, colormodel=rgb)
plot(a, col=#FF00FF)
dev.off()

Interestingly, I compared with:
postscript(RplotCMYK.ps, colormodel=cmyk)
plot(a, col=#FF00FF)
dev.off()

and making a diff, I see only:
87c87,88
 1 0 1 rgb
---
 0 1 0 0 setcmykcolor

194c195,196
 0 setgray
---
 0 0 0 1 setcmykcolor

278c280,281
 0 setgray
---
 0 0 0 1 setcmykcolor

does not look so similar to what you obtained...


  - the two settings here are for the outline and fill of the blob,
which was created using pch=19.


may I ask you what are outline and fill?

  To specify this as a spot colour (with a CMYK alternative for
preview), I changed that to this:

/pvc {
 [ /Separation (PANTONE Violet C)
   /DeviceCMYK { dup 0.75 mul exch dup 0.94 mul exch dup
0.00 mul exch 0.00 mul }
 ] setcolorspace
} def

/bg {
pvc 1.00 setcolor
} def
pvc 1.00 setcolor

Now if you only have one or two spot colours and only a few places
where they occur then it might be possible for you to do this kind of
edit/replacement - either manually or automatically. And beware I'm
not even sure this is right or that your publisher would be happy
anyway.

no but seems quite a nice solution, as I have only known cmyk colors (5) 
for which I can find Pentone matching. Could setup small script then! 
Just need to figure out why we do not get similar results you and me!!


Thanks so much!!



  Anyway, thats enough PS hacking for one day.

Barry


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Line plots in base graphics

2011-04-13 Thread Hadley Wickham

Am I missing something obvious on how to draw multi-line plots in base graphics?

In ggplot2, I can do:

data(Oxboys, package = nlme)
library(ggplot2)

qplot(age, height, data = Oxboys, geom = line, group = Subject)

But in base graphics, the best I can come up with is this:

with(Oxboys, plot(age, height, type = n))
lapply(split(Oxboys[c(age, height)], Oxboys$Subject), lines)

Am I missing something obvious?

Thanks!

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] latex, eps graphics and transparent colors

2011-04-13 Thread Ben Bolker

Michael Friendly friendly at yorku.ca writes:

 
 I have a diagram to be included in latex, where all my figures are .eps 
 graphics (so pdflatex is not an
 option) and I want to achieve something
 like the following: three concentric filled circles varying in lightness 
 or saturation.  It is easiest to do this using
 transparency, but in my test using the postscript driver, the 
 transparent color fills do not appear.  Is it
 correct that postscript() does not support transparency?
 
 circle -function (radius = 1, segments=61) {
  angles - (0:segments)*2*pi/segments
  radius * cbind( cos(angles), sin(angles))
 }
 
 plot(1:5, 1:5, type='n', xlim=c(-1,5), ylim=c(-1,5), xlab='', ylab='',
  asp=1, xaxt=n, yaxt=n)
 
 #clrs - trans.colors(lightblue, alpha=c(.2, .4, .6))  ## from heplots 
 package
 clrs - c(#ADD8E633, #ADD8E666, #ADD8E699)
 
 c1 - circle(3)
 polygon(c1, col=clrs[1], border=lightblue)
 polygon(.67*c1, col=clrs[2], border=lightblue)
 polygon(.33*c1, col=clrs[3], border=lightblue)
 
 arrows(-1,  0, 5, 0, angle=10, length=.2, lwd=2, col=darkgray)
 arrows( 0, -1, 0, 5, angle=10, length=.2, lwd=2, col=darkgray)
 
 One alternative that sort of works is to use the png() driver, and then
 convert fig.png fig.eps
 but I need very high resolution to make the real diagram legible.
 
 It might suffice to use hcl() colors to approximate what I've done with 
 transparency,
 but I don't know how to start with a given color (lightblue) and 
 achieve roughly
 similar resuts.
 

  If you really only want to lighten a specified colour (rather
than overlaying multiple colours), then something like this ought
to do the trick:


testfun - function(clrs=c(#ADD8E633, #ADD8E666, #ADD8E699)) {
  plot(1:5, 1:5, type='n', xlim=c(-1,5), ylim=c(-1,5), xlab='', ylab='',
   asp=1, xaxt=n, yaxt=n)
  c1 - circle(3)
  polygon(c1, col=clrs[1], border=lightblue)
  polygon(.67*c1, col=clrs[2], border=lightblue)
  polygon(.33*c1, col=clrs[3], border=lightblue)

  arrows(-1,  0, 5, 0, angle=10, length=.2, lwd=2, col=darkgray)
  arrows( 0, -1, 0, 5, angle=10, length=.2, lwd=2, col=darkgray)
}

postscript(testalpha1.ps)
testfun()
dev.off()


lblue - #ADD8E6
alphafy - function(col,alpha=1) {
  rr - 1-alpha*(1-c(col2rgb(col)/255))
  rgb(rr[1],rr[2],rr[3])
}
alphafy(#ADD8E6)
alphafy(#ADD8E6,alpha=0)

postscript(testalpha2.ps)
testfun(clrs=c(alphafy(lblue,0.2),alphafy(lblue,0.4),alphafy(lblue,0.6)))
dev.off()

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line plots in base graphics

2011-04-13 Thread Ben Bolker

Hadley Wickham hadley at rice.edu writes:

 
 Am I missing something obvious on how to draw multi-line plots in
 base graphics?
 
 In ggplot2, I can do:
 
data(Oxboys, package = nlme)
library(ggplot2)
 
qplot(age, height, data = Oxboys, geom = line, group = Subject)
 
 But in base graphics, the best I can come up with is this:
 
with(Oxboys, plot(age, height, type = n))
lapply(split(Oxboys[c(age, height)], Oxboys$Subject), lines)

[quoting removed to fool gmane] 
 Am I missing something obvious?
 

 reshape to wide format and matplot()?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] latex, eps graphics and transparent colors

2011-04-13 Thread Thomas Lumley

On Thu, Apr 14, 2011 at 5:30 AM, Michael Friendly frien...@yorku.ca wrote:
 I have a diagram to be included in latex, where all my figures are .eps
 graphics (so pdflatex is not an
 option)

You could use the pdf() device and then use pdf2ps to convert to PostScript.

and I want to achieve something
 like the following: three concentric filled circles varying in lightness or
 saturation.  It is easiest to do this using
 transparency, but in my test using the postscript driver, the transparent
 color fills do not appear.  Is it
 correct that postscript() does not support transparency?

 circle -function (radius = 1, segments=61) {
        angles - (0:segments)*2*pi/segments
        radius * cbind( cos(angles), sin(angles))
 }

 plot(1:5, 1:5, type='n', xlim=c(-1,5), ylim=c(-1,5), xlab='', ylab='',
        asp=1, xaxt=n, yaxt=n)

 #clrs - trans.colors(lightblue, alpha=c(.2, .4, .6))  ## from heplots

There's now an adjustcolor() function in base R to do this.

 package
 clrs - c(#ADD8E633, #ADD8E666, #ADD8E699)

 c1 - circle(3)
 polygon(    c1, col=clrs[1], border=lightblue)
 polygon(.67*c1, col=clrs[2], border=lightblue)
 polygon(.33*c1, col=clrs[3], border=lightblue)

 arrows(-1,  0, 5, 0, angle=10, length=.2, lwd=2, col=darkgray)
 arrows( 0, -1, 0, 5, angle=10, length=.2, lwd=2, col=darkgray)

 One alternative that sort of works is to use the png() driver, and then
 convert fig.png fig.eps
 but I need very high resolution to make the real diagram legible.

 It might suffice to use hcl() colors to approximate what I've done with
 transparency,
 but I don't know how to start with a given color (lightblue) and achieve
 roughly
 similar resuts.

It would be useful to have an alpha-blending function for this sort of
purpose, but I don't think we have one.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata

Thanks again Jim - that is really helpful and I apologize that I am new to R.  
How can I convert to numeric in SQL and when I am working on a table in a 
database?  The file is huge so that is why I am using SQL and the database to 
work through it.  
Thanks!

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Wednesday, April 13, 2011 12:52 PM
To: Rachel Licata
Cc: r-help@r-project.org
Subject: Re: [R] Decimals in R/SQL

The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':

 x - read.table(textConnection(   ST  AGEP   
  PWGTP
+ 33323130 130
+ 33324110 186
+ 333251 2 162
+ 33326180  93
+ 33327129 135
+ 33328166  54
+ 33329162  54
+ 0121 138
+ 1129 103
+ 21 7 144
+ 31 5 143), header = TRUE)
 closeAllConnections()
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 require(sqldf)
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1 23
 # change to numeric instead of integer
 x$AGEP - as.numeric(x$AGEP)
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1   23.81446



On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote:
 Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large 
 table and here is a subset of the variables I am working with.


 SS09
            ST                  AGEP                    PWGTP
 33323    1                30                 130
 33324    1                10                 186
 33325    1                 2                 162
 33326    1                80                  93
 33327    1                29                 135
 33328    1                66                  54
 33329    1                62                  54
 0    1                21                 138
 1    1                29                 103
 2    1                 7                 144
 3    1                 5                 143

 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
 from ss09 group by ST)

 ST wgtage
 1   1     37
 2   2     33
 3   4     36
 4   5     37
 5   6     35

 z - dbGetQuery( connSQLite , select ST, 
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
 group by ST)

 ST   wgtage
 1   1 37.57083
 2   2 33.94322
 3   4 36.14499
 4   5 37.51233
 5   6 35.65581

 -Original Message-
 From: jim holtman [mailto:jholt...@gmail.com]
 Sent: Wednesday, April 13, 2011 12:16 PM
 To: Rachel Licata
 Cc: r-help@r-project.org
 Subject: Re: [R] Decimals in R/SQL

 You at least have to provide a subset of 'ss09' so we can see what the
 original data looks like.  I have not had any problems with decimals
 in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
 +     select key, sum(V1 * V2) / sum(V3)
 +         from x
 +         group by key
 + ')


 xsum
  key sum(V1 * V2) / sum(V3)
 1   1               19.38166
 2   2               17.40503
 3   3               71.48818
 dput(xsum)
 structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / 
 sum(V3)
 ), row.names = c(NA, 3L), class = data.frame)




 On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Decimals in R/SQL

2011-04-13 Thread Gabor Grothendieck

On Wed, Apr 13, 2011 at 4:34 PM, Rachel Licata rach...@kff.org wrote:
 Thanks again Jim - that is really helpful and I apologize that I am new to R. 
  How can I convert to numeric in SQL and when I am working on a table in a 
 database?  The file is huge so that is why I am using SQL and the database to 
 work through it.
 Thanks!


There are some examples here (a few of them, as noted, require the
development version of sqldf):

http://code.google.com/p/sqldf/#15._Why_do_certain_calculations_come_out_as_integer_rather_than

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with all/all.equal

2011-04-13 Thread MacQueen, Don

The help pages for identical() and all.equal() have information that will
make it clear why they don't do what you want.

In the meantime, I tend to use a construct such as:

   length(unique(x))==1

But be careful if x is not a vector.
No doubt there are other ways.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





-Original Message-
From: Laura Smith smithlaura...@gmail.com
Date: Wed, 6 Apr 2011 15:09:14 -0700
To: r-help@r-project.org r-help@r-project.org
Subject: [R]  problem with all/all.equal

Hi!

In a function, I may have an instance in which all elements are equal.

 x - rep(1,5)

 x
[1] 1 1 1 1 1
 identical(x)
Error in .Internal(identical(x, y, num.eq, single.NA, attrib.as.set)) :
  'y' is missing
 all.equal(x)
Error in is.expression(x) : 'x' is missing


I don't care what particular value it is, I just want to know if they are
all equal.

What am I doing wrong, please?

Thanks,
Laura

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FW: [r] how to enclose two xyplot

2011-04-13 Thread baptiste auguie

Hi,

Have you tried ?c.trellis in the latticeExtra package?

HTH,

baptiste

On 13 April 2011 23:36, Francesco Nutini nutini.france...@gmail.com wrote:

  Dear R-users,

 I have to plot two xyplot, and I wish to enclose this two graphs with just 
 one headline, the same x scale, the same grid etc.
 These parameters should tie in, in order to obtain, visually, a unique graph 
 formed by two xyplot.

 I try to give an idea:

 xyplot1: |_|_|_|



 xyplot2: |_|_|_|



 what i want: |  |  | |

                   |_|_|_|


 I tried to use the command par, but it's doesn't work with xyplot. The two 
 plot have, by default, the same x-axis scale.
 I know it's just a visual solution, but it could be nice for a paper!

 Thanks a lot,

 Francesco Nutini
 PhD student





        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] latex, eps graphics and transparent colors

2011-04-13 Thread Ben Bolker

Thomas Lumley tlumley at uw.edu writes:

 
 On Thu, Apr 14, 2011 at 5:30 AM, 
  Michael Friendly friendly at yorku.ca wrote:
  I have a diagram to be included in latex, where all my figures are .eps
  graphics (so pdflatex is not an option)
 
 You could use the pdf() device and then use pdf2ps to convert to PostScript.

  Clever.

   [snip]

 There's now an adjustcolor() function in base R to do this.
 

  That makes my solution more or less obsolete.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line plots in base graphics

2011-04-13 Thread Hadley Wickham

On Wed, Apr 13, 2011 at 2:58 PM, Ben Bolker bbol...@gmail.com wrote:
 Hadley Wickham hadley at rice.edu writes:


 Am I missing something obvious on how to draw multi-line plots in
 base graphics?

 In ggplot2, I can do:

 data(Oxboys, package = nlme)
 library(ggplot2)

 qplot(age, height, data = Oxboys, geom = line, group = Subject)

 But in base graphics, the best I can come up with is this:

 with(Oxboys, plot(age, height, type = n))
 lapply(split(Oxboys[c(age, height)], Oxboys$Subject), lines)

 [quoting removed to fool gmane]
 Am I missing something obvious?


  reshape to wide format and matplot()?

Hmmm, that doesn't work if your measurements are at different times e.g:

Oxboys2 - transform(Oxboys, age = age + runif(234))

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decimals in R/SQL

2011-04-13 Thread Seth Falcon

On Wed, Apr 13, 2011 at 1:34 PM, Rachel Licata rach...@kff.org wrote:

 Thanks again Jim - that is really helpful and I apologize that I am
 new to R.  How can I convert to numeric in SQL and when I am working
 on a table in a database?  The file is huge so that is why I am
 using SQL and the database to work through it.

I believe that RSQLite will do the right thing if you provide the
correct types in your schema.  So for a new database, you want to make
sure that the columns that you want to be numeric are created like:

CREATE table sometable (my_data REAL);

You should be able to create a new table from an existing table using
SQL where the new table has the types you want.

+ seth


-- 
Seth Falcon | @sfalcon | http://userprimary.net/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] File Save As...

2011-04-13 Thread Duncan Murdoch


On 13/04/2011 5:20 PM, Gene Leynes wrote:

as of right now
 x = function(a) print(a)
 attr(x, srcref)
returns NULL in 2.13, am I doing something wrong?


There's a limitation to the debug information:  it can't be attached to 
a function whose body consists of a single simple expression like 
print(a).  If you put braces around print(a), it will be attached to the 
body of x:


Put this line into a file, and source it:

x - function(a) { print(a) }

source(test.R)
attr(body(x), srcref)

In case you're interested, the reason for this limitation is that there 
are some objects in R (NULL is the most obvious one) which can be a 
function body but which can't hold attributes.  At the time the debug 
info is added, the function hasn't been created, but its body has, so 
the attribute has to go there.


Duncan Murdoch



(also, should I post this to a new thread, or the development thread?)

About me: I like long walks on the beach, and this is my current version
of R:

 t(as.data.frame(R.Version()))

[,1]
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status RC
major 2
minor 13.0
year 2011
month 04
day 11
svn.rev 55409
language R
version.string R version 2.13.0 RC (2011-04-11 r55409)





On Wed, Mar 16, 2011 at 2:44 PM, Gene Leynes gleyne...@gmail.com
mailto:gleynes%...@gmail.com wrote:


Thanks for showing me the link to the code / your response / your
work in general.

It seems that the real magic is happening in the call to the
function attributes, via the line
attr(x, srcref)
I'm guessing that attributes must be defined somewhere deep inside
the R machinery (since I didn't find it as a file in base)...  And
there's probably not much benefit for me to know more beyond that.

So, I'll be looking forward to 2.13!



On Tue, Mar 15, 2011 at 3:37 PM, Duncan Murdoch
murdoch.dun...@gmail.com mailto:murdoch.dun...@gmail.com wrote:

On 15/03/2011 2:56 PM, Gene Leynes wrote:

The getSrcFilename function is exactly what I was trying
to describe, and
I'm excited to know that it's on it way!

I have tried to create that type of function, but I didn't
think it was
possible with currently available functions.  I would be
interested in
seeing how the new function works, maybe I'll check it out
using the google
code search

toolhttp://www.google.com/codesearch?hl=enlr=q=lang%3Ar+sbtn=Search

http://www.google.com/codesearch?hl=enlr=q=lang%3Ar+sbtn=Search(although

I usually have a hard time making sense of that code).


The source is available in

https://svn.r-project.org/R/trunk/src/library/utils/R/sourceutils.R

Duncan Murdoch


Please let me briefly clarify this part:

   But it can.  If you open a script and choose save, it
will be saved to the
   same place.
 

I just mean that when you do save as... R doesn't seem to
use the same
information that it uses during a normal save (the directory
or script
name).  In other applications like Microsoft Word, or
Python's IDLE (screen
shot attached) the user is shown a dialogue box with the
file name in the
current directory of that file.

This is a very minor annoyance though.  I only brought it up
because I
thought it would be easier to explain than asking about a
function that
would do the job of getSrcFilename, which is really what I
was after.

I rarely upgrade my R versions, but this will definitely be
an occasion when
I do!

This makes me want to go back and look at the past release
notes to see what
other goodies I've been overlooking.

Thanks again,

Gene



On Mon, Mar 14, 2011 at 8:17 PM, Duncan
Murdochmurdoch.dun...@gmail.com
mailto:murdoch.dun...@gmail.comwrote:

   On 11-03-14 8:12 PM, Gene Leynes wrote:
 
   Yes, I understand.  Normally I use Eclipse, which does
what I want for
  save as...
 
   The bigger issue is that R can't tell the location of
an open script,
   which makes it harder to create new versions of
existing work
 
 
   But it can.  If you open a script and choose save, it
will be saved to the
   same place.  Or do you mean an executing script?  There
are indirect ways to
   find the name of the executing script.  For example,
   in R-devel (to become 2.13.0 next month), you can do this:

[R] for loop performance

2011-04-13 Thread Barth B. Riley

Dear list

I am running some simulations in R involving reading in several hundred 
datasets, performing some statistics and outputting those statistics to file. I 
have noticed that it seems that the time it takes to process of a dataset (or, 
say, a set of 100 datasets) seems to take longer as the simulation progresses. 
Has anyone else noticed this? I am curious to know if this has to do with how R 
processes code in loops or if it might be due to memory usage issues (e.g., 
repeatedly reading data into the same matrix).

Thanks in advance

Barth

PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] File Save As...

2011-04-13 Thread Gene Leynes

That's very interesting.  It's not what I was thinking about or expecting,
but I'm glad to know about it (and it will probably be useful at some
point).

Initially, I was asking for a method to find out the name of the current
script.  I mean, the current script that you're editing / running.

However, this is useful.  Just by putting a few lines of code, you can
change the working directory with a simple drag and drop.  Put the following
lines into a script, and drop it into the R GUI.
curpath = dirname(getSrcFilename(function(){}, full=TRUE))
setwd(curpath)
cat('Now in ', curpath, '\n')

What I was looking for is almost accomplished by this:
filepath = getSrcFilename(function(){}, full=TRUE)
setwd(dirname(filepath))
browser()

Of course, you wouldn't want to stay in browser mode forever...

It would be nice to get the name of your current script with something like
this.file().  Of course, I don't know how that would work if someone typed
it into the console.

I'm genuinely excited to explore the new updates in 2.13.  Either it's a
good update, or I'm just now learning enough to appreciate the updates.

Thanks

Gene

On Wed, Apr 13, 2011 at 4:33 PM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 13/04/2011 5:20 PM, Gene Leynes wrote:

 as of right now
 x = function(a) print(a)
 attr(x, srcref)
 returns NULL in 2.13, am I doing something wrong?


 There's a limitation to the debug information:  it can't be attached to a
 function whose body consists of a single simple expression like print(a).
  If you put braces around print(a), it will be attached to the body of x:

 Put this line into a file, and source it:


 x - function(a) { print(a) }

 source(test.R)
 attr(body(x), srcref)

 In case you're interested, the reason for this limitation is that there are
 some objects in R (NULL is the most obvious one) which can be a function
 body but which can't hold attributes.  At the time the debug info is added,
 the function hasn't been created, but its body has, so the attribute has to
 go there.

 Duncan Murdoch


 (also, should I post this to a new thread, or the development thread?)

 About me: I like long walks on the beach, and this is my current version
 of R:

  t(as.data.frame(R.Version()))

[,1]
 platform i386-pc-mingw32
 arch i386
 os mingw32
 system i386, mingw32
 status RC
 major 2
 minor 13.0
 year 2011
 month 04
 day 11
 svn.rev 55409
 language R
 version.string R version 2.13.0 RC (2011-04-11 r55409)




 On Wed, Mar 16, 2011 at 2:44 PM, Gene Leynes gleyne...@gmail.com
 mailto:gleynes%...@gmail.com wrote:


Thanks for showing me the link to the code / your response / your
work in general.

It seems that the real magic is happening in the call to the
function attributes, via the line
attr(x, srcref)
I'm guessing that attributes must be defined somewhere deep inside
the R machinery (since I didn't find it as a file in base)...  And
there's probably not much benefit for me to know more beyond that.

So, I'll be looking forward to 2.13!



On Tue, Mar 15, 2011 at 3:37 PM, Duncan Murdoch
murdoch.dun...@gmail.com mailto:murdoch.dun...@gmail.com wrote:

On 15/03/2011 2:56 PM, Gene Leynes wrote:

The getSrcFilename function is exactly what I was trying
to describe, and
I'm excited to know that it's on it way!

I have tried to create that type of function, but I didn't
think it was
possible with currently available functions.  I would be
interested in
seeing how the new function works, maybe I'll check it out
using the google
code search
tool
 http://www.google.com/codesearch?hl=enlr=q=lang%3Ar+sbtn=Search

 http://www.google.com/codesearch?hl=enlr=q=lang%3Ar+sbtn=Search
 (although

I usually have a hard time making sense of that code).


The source is available in


 https://svn.r-project.org/R/trunk/src/library/utils/R/sourceutils.R

Duncan Murdoch


Please let me briefly clarify this part:

   But it can.  If you open a script and choose save, it
will be saved to the
   same place.
 

I just mean that when you do save as... R doesn't seem to
use the same
information that it uses during a normal save (the directory
or script
name).  In other applications like Microsoft Word, or
Python's IDLE (screen
shot attached) the user is shown a dialogue box with the
file name in the
current directory of that file.

This is a very minor annoyance though.  I only brought it up
because I
thought it would be easier to explain than asking about a
function that
would do the job of getSrcFilename, which is really what I

Re: [R] area under roc curve

2011-04-13 Thread Frank Harrell

ROC area does not measure goodness of prediction but does measure pure
predictive discrimination.  The generalization of the ROC area is the
C-index for continuous or censored Y.  See for example the rcorr.cens
function in the Hmisc package.
Frank

agent dunham wrote:
 
 Dear all, 
 
 
 I want to measure the goodness of prediction of my linear model. That's
 why I was thinking about the area under roc curve. 
 
 I'm trying the following, but I don't know how to avoid the error. Any
 help would be appreciated. 
 
 library(ROCR) 
 
 model.lm - lm(log(outcome)~log(v1)+log(v2)+factor1)
 pred-predict(model.lm)
 pred-prediction(as.numeric(pred), as.numeric(log(outcome)))
 auc-performance(pred,auc)
 
 Error en prediction(as.numeric(pred), as.numeric(log(outcome))) : 
   Number of classes is not equal to 2.
 ROCR currently supports only evaluation of binary classification tasks.
 u...@host.com
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/area-under-roc-curve-tp3447420p3448377.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] is this an ANOVA ?

2011-04-13 Thread Bill.Venables

You probably want to do something like this:

 fm - lm(y ~ x, MD)
 anova(fm)
Analysis of Variance Table

Response: y
  Df Sum Sq Mean Sq F valuePr(F)
x  2250   125.0  50 1.513e-06
Residuals 12 30 2.5   


Answers to questions:

1. No.
2. Yes.

(whoever you are). 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Ubuntu Diego
Sent: Wednesday, 13 April 2011 10:10 AM
To: r-help@r-project.org
Subject: [R] is this an ANOVA ?

Hi all,
I have a very easy questions (I hope). I had measure a property of 
plants, growing in three different substrates (A, B and C). The rest of the 
conditions remained constant. There was very high variation on the results. 
I want to do address, whether there is any difference in the response 
(my measurement) from substrate to substrate?

x-c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C') # Substrate 
type
y - c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) # Results of the measurement
MD-data.frame(x,y)

I wrote a linear model for this:

summary(lm(y~x,data=MD))

This is the output:

Call:
lm(formula = y ~ x, data = MD)

Residuals:
   Min 1Q Median 3QMax 
-2.000e+00 -1.000e+00  5.551e-17  1.000e+00  2.000e+00 

Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)   3. 0.7071   4.243 0.001142 ** 
xB5. 1.   5.000 0.000309 ***
xC   10. 1.  10.000 3.58e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.581 on 12 degrees of freedom
Multiple R-squared: 0.8929, Adjusted R-squared: 0.875 
F-statistic:50 on 2 and 12 DF,  p-value: 1.513e-06 

I conclude that there is an effect of substrate type (x). 
NOW the questions :
1) Do the fact that the all p-values are significant means that 
all the groups are different from each other ?
2) Is there a (easy) way to plot,  mean plus/minus 2*sd for 
each substrate type ? (with asterisks denoting significant differences ?)


THANKS !

version
platform   x86_64-apple-darwin9.8.0 
arch   x86_64   
os darwin9.8.0  
system x86_64, darwin9.8.0  
status  
major  2
minor  11.1 
year   2010 
month  05   
day31   
svn rev52157
language   R
version.string R version 2.11.1 (2010-05-31)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line plots in base graphics

2011-04-13 Thread Ben Bolker

Hadley Wickham hadley at rice.edu writes:

 On Wed, Apr 13, 2011 at 2:58 PM, Ben Bolker bbolker at gmail.com wrote:
  Hadley Wickham hadley at rice.edu writes:
 
 
  Am I missing something obvious on how to draw multi-line plots in
  base graphics?

 [snip]

  But in base graphics, the best I can come up with is this:
 
  with(Oxboys, plot(age, height, type = n))
  lapply(split(Oxboys[c(age, height)], Oxboys$Subject), lines)
 
  [quoting removed to fool gmane]
  Am I missing something obvious?
 
   reshape to wide format and matplot()?
 
 Hmmm, that doesn't work if your measurements are at different times e.g:
 
 Oxboys2 - transform(Oxboys, age = age + runif(234))

   In that case I think you're stuck with your lapply() approach,
or (I think) using lattice graphics with the group= argument
(that's not base though).

  Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Line plots in base graphics

2011-04-13 Thread baptiste auguie

On 14 April 2011 07:51, Hadley Wickham had...@rice.edu wrote:
 Am I missing something obvious on how to draw multi-line plots in base 
 graphics?

 In ggplot2, I can do:

It appears you've been infected with what I like to call the Dijkstra
syndrome [*], quoting

The tools we use have a profound (and devious!) influence on our
thinking habits, and, therefore, on our thinking abilities.

You can probably blame ggplot2 here, messing with our minds and
spoiling us. I can't seem able to think like spreadsheets anymore
either, because of R.

Thanks though,

baptiste


[*] http://www.cs.virginia.edu/~evans/cs655/readings/ewd498.html


 data(Oxboys, package = nlme)
 library(ggplot2)

 qplot(age, height, data = Oxboys, geom = line, group = Subject)

 But in base graphics, the best I can come up with is this:

 with(Oxboys, plot(age, height, type = n))
 lapply(split(Oxboys[c(age, height)], Oxboys$Subject), lines)

 Am I missing something obvious?

 Thanks!

 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Previously attainable fisher's exact test

2011-04-13 Thread Ted Harding

On 13-Apr-11 17:40:53, Jim Silverton wrote:
 I have a matrix say,
 
 1  4
 23  30
 
 and I want to find the previously attainable fisher's exact test
 p-value. Is there a way to do this in R?
 -- 
 Thanks,
 Jim.

I do not understand what you mean by previously attainable.

As far as that particular matrix is concerned, the fisher.test()
function will yield its exact Fisher P-value:

  M - matrix(c(1, 4, 23, 30), byrow=TRUE, nrow=2)
  M
  #  [,1] [,2]
  # [1,]14
  # [2,]   23   30
  fisher.test(M)
  # Fisher's Exact Test for Count Data
  # data:  M 
  # p-value = 0.3918
  # alternative hypothesis: true odds ratio is not equal to 1 
  # 95 percent confidence interval:
  #  0.006355278 3.653391412 
  # sample estimates:
  # odds ratio 
  #  0.3316483

So the P-value is 0.3918 (as attained now, and as attainable
at any time previously if you had done the above ... !).

Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 14-Apr-11   Time: 00:28:59
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Density estimation and plots

2011-04-13 Thread Jim Silverton

I have two columns of data, one is a subset of the other. All the data lie
beteen 0 and 1 inclusive. I want to fit both densities on the same graph. I
would also like the ability to extract the fitted values of both smoothed
density (using the best method of course).

For example, if
h = c( runif(1000, 0, 1), rbeta(500, 3,9))  and
k = rbeta(500,3,9)
then how do I get both density plots on the same diagram and if I want the
actual y values (fitted values) for each of the x values in h and k, how do
I get these?

-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] add names to data frame

2011-04-13 Thread Radhouane Aniba

Hi,

I have a vector V of values I used to create a distance matrix using dist()
function with diag=TRUE and upper=TRUE parameters.

I would like to assign names in another vector on top of each column instead
of 1 2 3 4 ...

How can we do that ? is the distance matrix generated a data frame or a
matrix ?

Regards
-- 
*Radhouane *

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] any package for Heckman selection model when the outcome equation also probit ?

2011-04-13 Thread Yong Wang

Hi, all

Can anybody hint if there is extant package or function to deal with
Heckman selection model where the outcome model is also probit?

 In stata, it is called heckprob.


Thank you

yong

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

92 matches

Mail list logo