[R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-12 Thread Emmanuel Levy
Dear All, I have a large data frame ( 270 lines and 14 columns), and I would like to extract the information in a particular way illustrated below: Given a data frame "df": > col1=sample(c(0,1),10, rep=T) > names = factor(c(rep("A",5),rep("B",5))) > df = data.frame(names,col1) > df names

Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
gers does > t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x)))) > print(t4/t1); > user system elapsed > 0.417 0.000 0.3636364 > > So, the latter seems to be the fastest way to identify those elements. > > My $.02 > > /Hen
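The integer-code comparison quoted in this reply can be tried on a small factor. The following is only a sketch: the vector x, its size, and its level names are made up for illustration, and the timings will depend on the machine and R version.

```r
# Illustrative data only; the size and the level names are assumptions.
set.seed(1)
x <- factor(sample(c("A", "B"), 1e5, replace = TRUE))

# Direct comparison against the level label, as in the subject line.
t1 <- system.time(idx_label <- which(x == "A"))

# Look up the integer code of level "A" once, then compare integer codes.
t4 <- system.time(idx_code <- which(as.integer(x) == match("A", levels(x))))

# Both forms select the same positions.
stopifnot(identical(idx_label, idx_code))

# Print the two timings for comparison.
print(rbind(label = t1, code = t4)[, c("user.self", "sys.self", "elapsed")])
```

Both expressions return the same indices, so the choice between them is purely about speed.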

Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
l example > that shows what you have and what you want? > > Is ?split what you are after? > > Emmanuel Levy wrote: >> >> Dear Peter and Henrik, >> >> Thanks for your replies - this helps speed up a bit, but I thought >> there would be something much faster. >>

Re: [R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

2008-08-13 Thread Emmanuel Levy
are doing. Can you make a small example > that shows what you have and what you want? > > Is ?split what you are after? > > Emmanuel Levy wrote: >> >> Dear Peter and Henrik, >> >> Thanks for your replies - this helps speed up a bit, but I thought >> t
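For the ?split suggestion quoted above, a minimal sketch follows. It assumes the goal is to get the row indices for every level in one pass instead of calling which() once per level; the data frame below is a toy stand-in modelled on the original post, not the poster's data.

```r
# Toy data frame modelled on the original post; the values are assumptions.
set.seed(1)
df <- data.frame(names = factor(c(rep("A", 5), rep("B", 5))),
                 col1  = sample(c(0, 1), 10, replace = TRUE))

# One pass over the factor: a named list with the row indices of each level.
idx_by_name <- split(seq_len(nrow(df)), df$names)

# The rows for a given level are then a cheap list lookup.
rows_A <- idx_by_name[["A"]]
stopifnot(identical(rows_A, which(df$names == "A")))

# The same call splits a data column by level.
col1_by_name <- split(df$col1, df$names)
```

Whether this fits depends on what is done with the indices afterwards, which the truncated messages do not show.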

[R] RCurl compilation error on ubuntu hardy

2008-09-16 Thread Emmanuel Levy
Dear list members, I encountered this problem and the solution pointed out in a previous thread did not work for me. (e.g. install.packages("RCurl", repos = "http://www.omegahat.org/R") I work with Ubuntu Hardy, and installed R 2.6.2 via apt-get. I really need RCurl in order to use biomaRt ...

Re: [R] RCurl compilation error on ubuntu hardy

2008-09-17 Thread Emmanuel Levy
oblem should disappear. It relates to encoding of strings. > > D. > > Emmanuel Levy wrote: >> Dear list members, >> >> I encountered this problem and the solution pointed out in a previous >> thread did not work for me. >> (e.g. install.packages("

[R] Smoothing z-values according to their x, y positions

2008-03-19 Thread Emmanuel Levy
Dear All, I'm sure this is not the first time this question comes up but I couldn't find the keywords that would point me out to it - so apologies if this is a re-post. Basically I've got thousands of points, each depending on three variables: x, y, and z. if I do a plot(x,y, col=z), I get somet

Re: [R] Smoothing z-values according to their x, y positions

2008-03-19 Thread Emmanuel Levy
in > the base distribution, which will do exactly what you requested. > > > Bert Gunter > Genentech Nonclinical Statistics > > > > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On > Behalf Of Emmanuel Levy > Sent: Wednesday

Re: [R] Smoothing z-values according to their x, y positions

2008-03-19 Thread Emmanuel Levy
0 - 0.17. I haven't looked yet at the locfit package as it is not installed, but I will check it out! Thanks for helping! Emmanuel On 20/03/2008, David Winsemius <[EMAIL PROTECTED]> wrote: > "Emmanuel Levy" <[EMAIL PROTECTED]> wrote in > news:[EMAIL PROTECTED]

[R] Mclust problem with mclust1Dplot: Error in to - from : non-numeric argument to binary operator

2008-10-20 Thread Emmanuel Levy
Dear list members, I am using Mclust in order to deconvolute a distribution that I believe is a sum of two gaussians. First I can make a model: > my.data.model = Mclust(my.data, modelNames=c("E"), warn=T, G=1:3) But then, when I try to plot the result, I get the following error: > mclust1Dplot(

Re: [R] Mclust problem with mclust1Dplot: Error in to - from : non-numeric argument to binary operator

2008-10-21 Thread Emmanuel Levy
this would be great; is it possible to somehow force the parameters (e.g. variance) to be greater than a particular threshold? Thanks, Emmanuel 2008/10/20 Emmanuel Levy <[EMAIL PROTECTED]>: > Dear list members, > > I am using Mclust in order to deconvolute a distribution that I >

Re: [R] Mclust problem with mclust1Dplot: Error in to - from : non-numeric argument to binary operator

2008-10-21 Thread Emmanuel Levy
,0.15),type="n",xlab=" ",ylab=" ",axes=F, ylim=c(0,0.4) ) axis(side=1) for (i in 1:2) { ni <- v$parameters$pro[i]*dnorm(x0, mean=as.numeric(v$parameters$mean[i]),sd=1) lines(x0,ni,col=1) nt <- nt+ni } lines(x0,nt,lwd=3) segments(my.data,0,my.data,0.02) Best,

[R] unimodal VS bimodal normal distribution - how to get a pvalue?

2008-10-21 Thread Emmanuel Levy
Dear All, I have a distribution of values and I would like to assess the uni/bimodality of the distribution. I managed to decompose it into two normal distribs using Mclust, and the BIC criteria is best for two parameters. However, the problem is that the BIC criteria is not a P-value, which I wo

Re: [R] unimodal VS bimodal normal distribution - how to get a pvalue?

2008-10-21 Thread Emmanuel Levy
Hi Duncan, I'm really stupid --- yes of course!! Thanks for pointing me out the (now) obvious. All the best, E 2008/10/21 Duncan Murdoch <[EMAIL PROTECTED]>: > On 10/21/2008 2:56 PM, Emmanuel Levy wrote: >> >> Dear All, >> >> I have a distribution of

[R] If I known d1 (density1), and dmix is a mix between d1 and d2 (d2 is unknown), can one infer d2?

2008-10-22 Thread Emmanuel Levy
Dear All, I hope the title speaks for itself. I believe that there should be a solution when I see what Mclust is able to do. However, this problem is quite particular in that d2 is not known and does not necessarily correspond to a common distribution (e.g. normal, exponential ...). However it mu

[R] gregexpr slow and increases exponentially with string length --> how to speed it up?

2008-10-30 Thread Emmanuel Levy
Dear All, I have a long string and need to search for regular expressions in there. However it becomes horribly slow as the string length increases. Below is an example: when "i" increases by 5, the time spent increases by more! (my string is 11,000,000 letters long!) I also noticed that - the s
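The example from the original message is cut off here. The sketch below shows one way to measure how gregexpr timing grows with string length. The test string, the pattern and the lengths are all assumptions for illustration. It uses fixed = TRUE, which treats the pattern as a literal string, so it only applies when no regular-expression syntax is needed.

```r
# Build a test string of a given length; "ACGT" and the sizes are illustrative assumptions.
make_string <- function(n_repeats) paste(rep("ACGT", n_repeats), collapse = "")

for (n in c(1e3, 1e4, 1e5)) {
  s <- make_string(n)
  # Time a literal (non-regex) search for "GT" over the whole string.
  elapsed <- system.time(m <- gregexpr("GT", s, fixed = TRUE)[[1]])[["elapsed"]]
  cat(nchar(s), "chars:", length(m), "matches,", elapsed, "s\n")
}
```

Running the same loop without fixed = TRUE, or with the real pattern, shows whether the slowdown comes from the regular-expression engine or from the string length alone.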

Re: [R] gregexpr slow and increases exponentially with string length --> how to speed it up?

2008-10-30 Thread Emmanuel Levy
Hi Chuck, Thanks a lot for your suggestion. > You can find all such matches (not just the disjoint ones that gregexpr > finds) using something like this: > >twomatch <-function(x,y) intersect(x+1,y) >match.list <- >list( >which( vec %in% c(3

[R] Mathematica now working with Nvidia GPUs --> any plan for R?

2008-11-18 Thread Emmanuel Levy
Dear All, I just read an announcement saying that Mathematica is launching a version working with Nvidia GPUs. It is claimed that it'd make it ~10-100x faster! http://www.physorg.com/news146247669.html I was wondering if you are aware of any development going into this direction with R? Thanks f

Re: [R] Mathematica now working with Nvidia GPUs --> any plan for R?

2008-11-20 Thread Emmanuel Levy
Dear Brian, Mose, Peter and Stefan, Thanks a lot for your replies - the issues are now clearer to me. (and I apologize for not using the appropriate list). Best wishes, Emmanuel 2008/11/19 Peter Dalgaard <[EMAIL PROTECTED]>: > Stefan Evert wrote: >> >> On 19 Nov 2008, at 07:56, Prof Brian Ri

[R] Is it normal that normalize.loess does not tolerate a single NA value?

2009-03-13 Thread Emmanuel Levy
Dear all, I have been using normalize.loess and I get the following error message when my matrix contains NA values: > my.mat = matrix(nrow=100, ncol=4, runif(400) ) > my.mat[1,1]=NA > my.mat.n = normalize.loess(my.mat, verbose=TRUE) Done with 1 vs 2 in iteration 1 Done with 1 vs 3 in iteration 1

[R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
Dear All, I cannot find a solution to the following problem although I imagine that it is a classic, hence my email. I have a vector V of X values comprised between 1 and N. I would like to get random samples of X values also comprised between 1 and N, but the important point is: * I would like

[R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
Dear All,(my apologies if it got posted twice, it seems it didn't get through) I cannot find a solution to the following problem although I suppose this is a classic. I have a vector V of X=length(V) values comprised between 1 and N. I would like to get random samples of X values also compri

Re: [R] Random sampling while keeping distribution of nearest ne

2009-08-12 Thread Emmanuel Levy
lp me solve it. Many thanks! Emmanuel PS: I apologize that I sent a second post. This one did not appear in my "R-help" label so I assumed it wasn't sent for some reason. 2009/8/12 Ted Harding : > On 12-Aug-09 22:05:24, Emmanuel Levy wrote: >> Dear All, >>

Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
with this problem? Or even better of a package? Thanks for your help, Emmanuel 2009/8/12 Nordlund, Dan (DSHS/RDA) : >> -----Original Message----- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf Of Emmanuel Levy >> Sent: Wedne

Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
> But if the 1st order differences are the same, then doesn't it follow that > the 2nd, 3rd, ... order differences must be the same between the original and > the new "random" vector.  What am I missing? You are missing nothing sorry, I wrote something wrong. What I would like to be preserved is