[R] splitting a vector of strings

2016-07-21 Thread Eric Elguero

Hi everybody,

I have a vector of character strings.
Each string has the same pattern and I want
to split them in pieces and get a vector made
of the first pieces of each string.

The problem is that strsplit returns a list.

All I found is

uu<- matrix(unlist(strsplit(x,";")),ncol=3,byrow=T)[,1]

where x is the vector, ";" is the delimiting character,
and I know that each string will be cut in 3 pieces.

That works for my problem but I would prefer a
more elegant solution. Besides, it would not
work if all the strings didn't have the same
number of pieces.

does someone have a better solution?
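
One way to sidestep the list, offered as a sketch rather than a definitive answer: apply an extractor to each element of the strsplit result with vapply, which also copes with strings that have different numbers of pieces. The sample vector x below is made up for illustration.

```r
# Hypothetical sample data; the second string has only two pieces.
x <- c("a1;b1;c1", "a2;b2", "a3;b3;c3")

# strsplit returns a list with one character vector per string;
# vapply pulls element 1 out of each and returns a plain character vector.
first_piece <- vapply(strsplit(x, ";", fixed = TRUE), `[`, character(1), 1)

# Alternative without strsplit: delete everything from the first ";" onwards.
first_piece_sub <- sub(";.*$", "", x)

print(first_piece)
```

vapply checks that every extracted piece is a single character value, so a malformed element fails loudly instead of silently changing the shape of the result.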

sorry if that topic was discussed recently.
There is too much traffic on the r-help list,
I cannot catch up.

--
Eric Elguero

MIVEGEC. - UMR (CNRS/IRD/UM) 5290
Maladies Infectieuses et Vecteurs, Génétique, Evolution et Contrôle
Institut de Recherche pour le Développement (IRD)
911, Avenue Agropolis
BP 64501
34394 Montpellier Cedex 5, France

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with function "polygon"

2014-11-07 Thread Eric Elguero

On 11/07/2014 04:35 PM, Duncan Murdoch wrote:


You are not using the polygon() function from the graphics package,
you're using one coming from somewhere else (maybe an old version of R,
or some package).  The polygon() function in the graphics package
doesn't call .Internal(polygon(..., it calls

.External.graphics(C_polygon, ...

If at some point you made a
copy of the polygon() function and saved it, you're stuck with that one
forever (or at least until you delete it from your workspace, or even
better, delete the whole saved workspace).



you're absolutely right. I was using a "polygon" function
from package ade4 (that I copied to my workspace, don't
remember why). I will ask the ade4 developers.
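
For anyone hitting the same symptom, here is a minimal sketch of how a masking copy can be located and removed. It simulates the situation with a dummy function placed in the global workspace; the dummy is an assumption for illustration, not the ade4 code.

```r
# Simulate the situation: a stale copy named 'polygon' in the global workspace.
assign("polygon", function(...) stop("stale copy"), envir = globalenv())

# find() lists every attached place that defines the name, in search order;
# ".GlobalEnv" comes first, so the copy masks the graphics version.
where_before <- find("polygon")

# Remove the copy from the workspace only; the package version is untouched.
rm("polygon", envir = globalenv())

where_after <- find("polygon")
print(where_before)
print(where_after)
```

If the copy lives in a saved .RData file it will come back at the next startup unless the workspace is saved again without it.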

thank you.

e.e.



[R] problem with function "polygon"

2014-11-07 Thread Eric Elguero

Hi all,

I'm trying to use the polygon function from
the graphics package, and get this error
message :

> polygon(x=c(1,2,3,1),y=c(1,4,5,1))
Error in .Internal(polygon(xy$x, xy$y, col, border, lty, ...)) :
  there is no .Internal function 'polygon'

That annoys me because polygon is actually
called by several other functions I need.

my R version:

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

and I just updated everything.

e.e.



[R] transmission of parameters to the glmmadmb function

2014-01-09 Thread Eric Elguero

Hi everybody,

I wrote a function where several variables
are created, and then used in a generalized
mixed model, from the glmmADMB package.

here is part of the function:


print(length(spy))

uu<-summary(glmmadmb(spy~sex+poswing+spx+(1|host),data=ni,
family="nbinom",zeroInflation=TRUE))

when I run the function I get

[1] 596
Error in eval(expr, envir, enclos) : object 'spy' not found

(so spy is known to the function "print" but not
to the function glmmadmb)


now I modify my function:


print(length(spy))

ni$spy<-spy
ni$spx<-spx

uu<-summary(glmmadmb(spy~sex+poswing+spx+(1|host),data=ni,
family="nbinom",zeroInflation=TRUE))

and that works.

however, when I call glmmadmb interactively, it
accepts in the formula variables which are in
the dataframe specified by the 'data' argument,
as well as variables which are not.

and if in my function I replace glmmadmb by glm
it works even if spy and spx are not in the 'ni'
dataframe.

that puzzles me.
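
A sketch of why glm copes with this case, using only base R: a formula records the environment where it was created, and model.frame looks for variables first in data and then in that environment. Whether glmmadmb follows the same lookup is not shown here; the function make_fit and the toy data frame are made up for illustration.

```r
make_fit <- function(ni) {
  spy <- ni$count + 1              # local variable, not a column of ni
  f <- spy ~ sex                   # formula created inside the function
  # The formula remembers the function's own evaluation frame.
  stopifnot(identical(environment(f), environment()))
  # 'sex' is found in ni, 'spy' in the formula's environment.
  glm(f, data = ni, family = poisson)
}

# Toy data, invented for this example.
ni <- data.frame(sex = factor(c("f", "m", "f", "m", "f", "m")),
                 count = c(1, 3, 2, 5, 0, 4))
fit <- make_fit(ni)
print(coef(fit))
```

Copying the variables into the data frame, as in the modified function above, avoids relying on this lookup at all.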



[R] censored counts and glmer/glmmADMB

2013-12-12 Thread Eric Elguero

dear R-users,

I have to model counts where all counts above some threshold
have been censored. In the same dataset I have too many zeroes for
a Poisson or even a negative binomial distribution to make
sense, so I would need a zero-inflated-censored negative binomial
family for use in glmer (or glmmADMB?). That seems not to exist.

my question is :
how could I add a custom-built family of distributions that
I could call in glmer/glmmADMB ?

if it's not possible, I am considering imputing fake values
to replace the censored ones, but I am unsure whether this
is bad or very bad...

Eric Elguero
MIVEGEC (UM1- UM2 -CNRS 5290-IRD 224)
Maladies infectieuses et vecteurs :
écologie, génétique, évolution et contrôle
Centre IRD de Montpellier
911 Av Agropolis - BP 64501
34394 Montpellier Cedex



[R] call to system returns warning : status 2 (Ubuntu)

2012-03-18 Thread Eric Elguero
Hi everybody,

I have to run a program repeatedly under Ubuntu
with different arguments and I am using R just to
generate the data files and call the external program.

basically, in my script I have inside a loop these two lines:

command <- paste(,sep="")
system(command,intern=T,wait=T)

when I run this script, I get a number of warnings,
like this one:

16: running command '~/LDhat22/ldconvert -seq ld/serca/serca-Trs.fas
-freqcut 0.0 -missfreqcut 100.0 -sites 1 3687 -nous 6 >
ld/serca/serca-Trs.out' had status 2


however, when I run the very same command at the bash prompt,
everything seems fine (no complaint).

in either case, the output is the same and looks correct.

So, may I just ignore these warnings or is there something
I should fix?
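
To see what the warning refers to, the exit status can be inspected directly. A minimal sketch, assuming a Unix-like shell; the command here is a stand-in that prints a line and exits with status 2, not the ldconvert call above.

```r
# Stand-in command: writes one line to stdout, then exits with status 2.
cmd <- "echo hello; exit 2"

# With intern = TRUE the output is returned as a character vector; a non-zero
# exit status triggers a warning and is stored in the "status" attribute.
out <- suppressWarnings(system(cmd, intern = TRUE))
status <- attr(out, "status")

# With intern = FALSE (the default) the return value is the exit status itself.
status_direct <- system("exit 2")

print(status)
```

The bash prompt does not print a command's exit status unless asked (echo $?), which may be why nothing shows up there.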

thank you in advance,

Eric Elguero
MIVEGEC
IRD -CNRS - UM1
Montpellier - France



[R] problem with missing package

2011-06-16 Thread Eric Elguero
Hi everybody,

I just tried to run R on one of my projects
but it did not want to run:



R version 2.12.1 (2010-12-16)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)



Loading required package: utils
Error in loadNamespace(i[[1L]], c(lib.loc, .libPaths())) : 
  there is no package called 'nlme'
Fatal error: unable to restore saved data in .RData


there are two things I do not understand:

i) I had actually nlme installed, and working,
but when I look in /usr/lib/R/library/nlme
I find only a text file named "COPYING"
and containing the gnu license.
Where is the package gone? (by itself)

ii) I tried to reinstall nlme but could not
find it in the usual repositories.

in any case, I would like to recover at least those R objects
that do not depend on nlme.

I tried : 

$mv .RData xxx
$R
>load("xxx")

but that doesn't help. 

Is there a method to extract some information from .RData 
without loading it?
 
Eric Elguero



[R] modifying some elements of a vector

2011-02-10 Thread Eric Elguero
Hi everybody,

I want to add 1 to some elements of a vector:

x is a vector
u is a vector of indices, that is, integers
assumed to be within the range 1..length(x)
and I want to add 1 to the elements of x
each time their index appears in u

x[u]<-x[u]+1 works only when there are no
duplicated values in u

I found this solution:

tu <- table(u)
indices <- as.numeric(names(tu))
x[indices] <- x[indices]+tu

but it looks ugly to me and I would
prefer to avoid calling the function 'table'
since this is to be done millions of times
as part of a simulation program.
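
A possible alternative, sketched with made-up values: tabulate counts how often each integer from 1 to nbins occurs, so its result lines up with x and can be added directly. It assumes, as stated above, that u holds integers in 1..length(x).

```r
x <- c(10, 20, 30, 40)
u <- c(1, 3, 3, 4, 3)      # index 3 appears three times

# tabulate(u, nbins) returns a count for every position 1..nbins,
# including zeros for indices that never occur.
x_new <- x + tabulate(u, nbins = length(x))

print(x_new)   # 11 20 33 41
```

tabulate works on integer bins only and skips the factor and name handling that table does, which is why it is usually the lighter call inside a tight loop.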

Eric Elguero
Génétique & Adaptation des Plasmodium
IRD
Montpellier - France



[R] using gedit

2010-10-22 Thread Eric Elguero
Dear all,

I'm using R (2.10.1) under Ubuntu (9.10) and,
as I don't like vi, I edit my functions with
the command : edit(.,editor="gedit") which works
fine, except when gedit happens to be already
running. Then a new tab is created, and on exit
all changes are lost, regardless if I close the tab
or the editor altogether. Is there a solution 
to this problem?

thanks in advance.

Eric Elguero



Re: [R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Eric Elguero
On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:

> Well, it is called cross validation which is based on random sampling if 
> you do not have k=n -fold CV (=leave-one-out).
> Again, to get reproducible results, you will need to set a seed.
> 

thank you. I thought that "leave-one-out" was the default.

I looked at the reference file and I am not sure how to get it.

Is that by setting fold=1 ?
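
The point about seeds in the quoted advice can be illustrated in base R. This is only a sketch: the helper assign_folds is hypothetical and is not how stepclass builds its folds internally.

```r
n <- 89   # number of observations, as in the stepclass output
k <- 10   # number of folds

# Hypothetical helper: randomly assign each observation to one of k folds
# of nearly equal size.
assign_folds <- function(n, k) sample(rep_len(seq_len(k), n))

set.seed(123)
folds_a <- assign_folds(n, k)
folds_b <- assign_folds(n, k)   # no reseeding: a different random split
set.seed(123)
folds_c <- assign_folds(n, k)   # same seed: the same split as folds_a

same_seed_same_folds <- identical(folds_a, folds_c)

# With k = n every observation is its own fold (leave-one-out), so the
# partition itself no longer depends on the random draw.
loo <- assign_folds(n, n)

print(same_seed_same_folds)
```

Under this reading, leave-one-out corresponds to as many folds as observations rather than to a single fold.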

> 
> If the results are that unstable: Do you really have a sufficient number 
> of observations for your classification problem?

you're probably right.

e.e.


Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5,
France



[R] randomness in stepclass (klaR) or lda (MASS) ?

2010-04-29 Thread Eric Elguero
Hi,

a colleague ran a stepwise discriminant analysis
twice in a row and got different results, suggesting
some "stochasticity" in the algorithms involved.
I looked at her data and found that there was a lot
of collinearity, so that I reckoned that maybe "stepclass" 
(klaR) cannot find a clear winner when trying to include a 
new variable and makes a random choice. Is that true?
another possibility is that "lda" (from MASS) computes
CV classification rates from a random subsample instead of
using all the data (?) That might be a sensible choice
with a very large sample.
I advised her to run the function several times and
see if a consensus emerges, but that doesn't seem to
be the case, and besides, I would like to know what
really is going on.

thanks

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases, 
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD,
IRD Montpellier, 
911 Avenue Agropolis, BP 64501, 
34394 Montpellier Cedex 5, 
France


> f4.U.spDA <- stepclass(f.mes, f.gp4,
"lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness
rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333;  in: "X2";  variables (1): X2 
correctness rate: 0.66389;  in: "X9";  variables (2): X2, X9 
correctness rate: 0.69583;  in: "X27";  variables (3): X2, X9, X27 

 hr.elapsed min.elapsed sec.elapsed 
       0.00        0.00       20.77 

> f4.U.spDA <- stepclass(f.mes, f.gp4,
"lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness
rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556;  in: "X2";  variables (1): X2 
correctness rate: 0.71806;  in: "X6";  variables (2): X2, X6 

 hr.elapsed min.elapsed sec.elapsed 
       0.00        0.00       15.14



[R] sigma in glmer (lme4)

2009-05-06 Thread Eric Elguero
dear R-users,

I am trying to understand what the
sigma parameter returned by glmer is.

I thought it was (an estimate of) the sigma parameter
defined by McCullagh & Nelder (e.g. p 126 of 2nd edition)
but I ran some simulations and it seems that this is
something else.

I simulated data corresponding to a binomial model,
intended to be fitted by this command:

glmer(cbind(success,failure)~X+(1|group),family=binomial)

but I instead fitted the following model:

glmer(cbind(success,failure)~X+(1|group),family=quasibinomial)

(and repeated this process 500 times)

I expected sigma to be close to 1 but I found
that the mean sigma was about 0.05 (sd = 0.003)

If I do the analogous simulation study with glm,
that is, I simulate binomial data and fit them
with family=quasibinomial instead of binomial,
I find a mean dispersion parameter = 0.
(sd=0.09).
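
For reference, the glm side of this comparison can be sketched as follows. The dispersion that summary() reports for a quasibinomial fit is the Pearson chi-square divided by the residual degrees of freedom, and for truly binomial data it should come out near 1. The sample size, coefficients and seed below are arbitrary choices for illustration.

```r
set.seed(42)
n <- 200                      # number of binomial observations (arbitrary)
size <- 50                    # trials per observation
X <- runif(n)
p <- plogis(-0.5 + 1 * X)     # true success probabilities on the logit scale
success <- rbinom(n, size = size, prob = p)
failure <- size - success

fit <- glm(cbind(success, failure) ~ X, family = quasibinomial)

# Dispersion as reported by summary()
phi <- summary(fit)$dispersion

# The same quantity by hand: Pearson chi-square over residual df
phi_manual <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

print(c(phi, phi_manual))
```

This says nothing about what glmer stores as sigma; it only pins down the glm quantity that the simulation is being compared against.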

changing parameters values does not alter this
pattern.

In both cases, the fixed effects parameters are 
correctly estimated.


here is the function I used to simulate data (taking
0 as the standard deviation  of random effect
provides data suitable to glm)

sim.data.mixed <- function(x,theta,sigmag,nb.groups=10,size=50)
#-
# sim.data.mixed
#-
# simulates data for glmer
# Y is Binom(p,size)
# with logit(p) = theta1 + theta2*X + B
# where B is Norm(0,sigmag)
#-
# x : the x values (same for each group)
# length(x) is the number of observations per group
# theta: the fixed effects parameters (intercept & slope)
# sigmag : the random effects standard deviation
# size : the binomial parameter (same for everybody)
{
group<-rep(1:nb.groups,each=length(x))
random.effect<-rnorm(nb.groups,mean=0,sd=sigmag)
xmat<-expand.grid(x,random.effect)
eta<-theta[1]+theta[2]*xmat[,1]+xmat[,2]
# plogis() is the inverse logit in base R (stats package)
y<-rbinom(length(eta),size=size,prob=plogis(eta))
return(data.frame(success=y,failure=size-y,x=xmat[,1],group=group,b=xmat[,2]))
#--
# b (random effects) is returned here but not used by glmer
}



Re: [R] refit with binomial model (lme4)

2009-04-28 Thread Eric Elguero
On Mon, 2009-04-27 at 08:30 -0500, Douglas Bates wrote:
> This is related to using the matrix form of the response for a
> binomial glmm.  The refit method for a model fit by lmer is based on a
> numeric vector response.
> 

thank you for this explanation.

> Is it possible to use the expanded form (i.e. a vector of 0/1 values)
> of the responses instead of the matrix form?
> 

yes I could but I found that I could use the
probability/weights form, at least in my case 
where I am simulating new binomial data with
the observed number of trials.

Eric Elguero

> On Mon, Apr 27, 2009 at 7:20 AM, Eric Elguero wrote:
> > Dear R users,
> >
> > I'm trying to use function 'refit' from lme4
> > and I get this error that I can't understand:
> >
> >> refit(dolo4.model4,cbind(uu,50-uu))
> > Error in function (classes, fdef, mtable)  :
> >  unable to find an inherited method for function "refit", for signature
> > "mer", "matrix"
> >
> > if I try:
> >
> >> refit(dolo4.model4,uu)
> > Error in asMethod(object) : matrix is not symmetric [1,2]
> >
> > I get this error message, which I understand
> > no better, but which suggests that refit expects
> > two columns.
> >
> >
> > the initial model was:
> >
> >> dolo4.mod...@call
> > glmer(formula = cbind(sortis, restes) ~ mean.co2 + (1 | sujet),
> >data = dollo4.df, family = binomial)
> >
> >
> >
> > R version 2.9.0 (2009-04-17)
> >
> > and
> >
> > Package: lme4
> > Version: 0.999375-28
> > Date: 2008-12-13
> >
> > thank you in advance
> >
> > e.e.
> >
> >



[R] refit with binomial model (lme4)

2009-04-27 Thread Eric Elguero
Dear R users,

I'm trying to use function 'refit' from lme4
and I get this error that I can't understand:

> refit(dolo4.model4,cbind(uu,50-uu))
Error in function (classes, fdef, mtable)  : 
  unable to find an inherited method for function "refit", for signature
"mer", "matrix"

if I try:

> refit(dolo4.model4,uu)
Error in asMethod(object) : matrix is not symmetric [1,2]

I get this error message, which I understand
no better, but which suggests that refit expects
two columns.


the initial model was:

> dolo4.mod...@call
glmer(formula = cbind(sortis, restes) ~ mean.co2 + (1 | sujet), 
data = dollo4.df, family = binomial)



R version 2.9.0 (2009-04-17)

and

Package: lme4
Version: 0.999375-28
Date: 2008-12-13

thank you in advance

e.e.



[R] strptime

2008-06-18 Thread Eric Elguero

Hi,

what's wrong with that?


strptime("06:00:00 03.01.2008",format="%H:%M%:%S %d.%m.%Y",tz="GMT")

[1] NA

the command seems to comply with the rules in the help file but returns NA
(R 2.6.1 Windows XP)
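
A possible cause, for what it is worth: the format string has a stray % before the second colon (%M%:%S), so the format no longer matches the input. A sketch with that character removed:

```r
# Same input as above; only the format string differs (%M:%S, not %M%:%S).
res <- strptime("06:00:00 03.01.2008",
                format = "%H:%M:%S %d.%m.%Y", tz = "GMT")
print(res)
```

strptime returns NA whenever the input cannot be matched against the format, without saying which part failed.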

Eric Elguero



Re: [R] precision in seq

2008-02-05 Thread Eric Elguero
thank you to all who answered.


> 0+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05 - 0.95
[1] 3.330669e-16

> seq(0,1,0.05)[20] - 0.95
[1] 1.110223e-16

> 0+19*0.05 - 0.95
[1] 1.110223e-16

so this is the way seq calculates. I would have guessed
that addition was more accurate than multiplication,
but that is not the case.

this one however bothers me:
> 19/20-0.95
[1] 0


I noticed this problem when I tried to extract rows of a matrix
according to whether values of some vector were in the set
(0,0.05,...,0.95,1), with something like x%in%seq(0,1,0.05)
Now I understand that I should not use this construction
unless x is of type integer. Would you agree?
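
Two possible workarounds for the membership test, sketched here with a made-up x: compare within a tolerance, or move to an integer scale before comparing. The helper near_in and the tolerance 1e-8 are assumptions chosen for illustration.

```r
grid <- seq(0, 1, 0.05)
x <- c(0.95, 0.9, 0.33, 1)      # made-up values to look up

# Hypothetical helper: TRUE where a value is within 'tol' of any grid point.
near_in <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}
in_grid <- near_in(x, grid)

# Alternative: rescale so that grid points become the integers 0..20,
# then test for closeness to an integer inside that range.
k <- x * 20
in_grid_int <- abs(k - round(k)) < 1e-8 & k > -1e-8 & k < 20 + 1e-8

print(in_grid)
print(in_grid_int)
```

Both versions treat 0.95 as a member of the grid, which the exact comparison x %in% seq(0,1,0.05) does not.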

Eric Elguero



[R] precision in seq

2008-02-04 Thread Eric Elguero
Hi everybody,

this is a warning more than a question.

I noticed that seq produces approximate results:

> seq(0,1,0.05)[19]==0.9
[1] TRUE
> seq(0,1,0.05)[20]==0.95
[1] FALSE
> seq(0,1,0.05)[21]==1
[1] TRUE

> seq(0,1,0.05)[20]-0.95
[1] 1.110223024625157e-16

I do not understand why 0.9 and 1 are correct (within some
tolerance or strictly exact?)  and 0.95 is not.

this one works:

> ((0:20)/20)[20]==0.95
[1] TRUE

Eric Elguero



[R] Tabulations in command-line under linux

2008-01-11 Thread Eric Elguero
Hi everybody,

I'm trying to use R (2.4.1) under Linux (debian) and a thing bothers me:
sometimes I paste lines from a text editor into the R command line and
tabulations are caught by the completing-names function of the csh.
How could this behaviour be inhibited?

thanks in advance

Eric Elguero



Re: [R] Contour plot (level curves)

2007-10-05 Thread Eric Elguero
> I have a sample of n values from a bivariate distribution (from a MCMC
> procedure). How could I draw a contour plot of "the joint density" based on
> that sample ?

here is a fast 2D density estimator. Not very sophisticated, but works.
The function assumes that data are in the form of a matrix
with (first) two columns containing x and y coordinates.

To plot the result:
image(dens2d(x)) or contour(dens2d(x))

Play with the h parameter to change the smoothness of the surface.


>dens2d
function(x, nx = 20, ny = 20, margin = 0.05, h = 1)
{
 xrange <- max(x[, 1]) - min(x[, 1])
 yrange <- max(x[, 2]) - min(x[, 2])
 xmin <- min(x[, 1]) - xrange * margin
 xmax <- max(x[, 1]) + xrange * margin
 ymin <- min(x[, 2]) - yrange * margin
 ymax <- max(x[, 2]) + yrange * margin
 xstep <- (xmax - xmin)/(nx - 1)
 ystep <- (ymax - ymin)/(ny - 1)
 xx <- xmin + (0:(nx - 1)) * xstep
 yy <- ymin + (0:(ny - 1)) * ystep
 # rows index x and columns index y, matching coefx %*% t(coefy) below
 g <- matrix(0, nrow = nx, ncol = ny)
 n <- dim(x)[[1]]
 for(i in 1:n) {
  coefx <- dnorm(xx - x[i, 1], mean = 0, sd = h)
  coefy <- dnorm(yy - x[i, 2], mean = 0, sd = h)
  g <- g + coefx %*% t(coefy)/n
 }
 return(list(x = xx, y = yy, z = g))
}
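
The loop over observations can also be written as a single matrix product. The following is a sketch of the same Gaussian product-kernel estimate, not a replacement endorsed by the post; the name dens2d_vec and the simulated sample are made up for illustration.

```r
# Vectorized variant: builds the kernel weights for all points at once.
dens2d_vec <- function(x, nx = 20, ny = 20, margin = 0.05, h = 1) {
  rx <- range(x[, 1])
  ry <- range(x[, 2])
  # Evaluation grid, extended by 'margin' times the range on each side.
  xx <- seq(rx[1] - diff(rx) * margin, rx[2] + diff(rx) * margin,
            length.out = nx)
  yy <- seq(ry[1] - diff(ry) * margin, ry[2] + diff(ry) * margin,
            length.out = ny)
  # Kx[i, j] is the kernel weight of data point i at grid value xx[j].
  Kx <- outer(x[, 1], xx, function(xi, g) dnorm(g - xi, sd = h))
  Ky <- outer(x[, 2], yy, function(yi, g) dnorm(g - yi, sd = h))
  # crossprod(Kx, Ky) is t(Kx) %*% Ky: the sum over points of the
  # per-point outer products, giving an nx-by-ny matrix.
  list(x = xx, y = yy, z = crossprod(Kx, Ky) / nrow(x))
}

set.seed(1)
xy <- cbind(rnorm(200), rnorm(200))      # made-up bivariate sample
d <- dens2d_vec(xy, nx = 30, ny = 20, h = 0.3)
print(dim(d$z))
```

The returned list has the x, y, z components that contour(d) and image(d) accept, and z has one row per x grid value, so nx and ny need not be equal.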
