Re: [R] How to get best performance from R on Linux?

2009-07-01 Thread Rainer M Krug
On Wed, Jul 1, 2009 at 2:47 AM, Dirk Eddelbuettel  wrote:
>
> Rainer,
>
> On 30 June 2009 at 14:30, Rainer M Krug wrote:
> | following a discussion on difference in speed of R between R and Linux, I am
> | wondering: is there a howto to get the most (concerning speed) out of R? I
> | am not talking about vectorisation and techniques in doing the analysis, but
> | what should I look at when I want to get "the fastest R" on my computer -
> | compiling myself? specific switches? compile libraries?, ...
>
> Did you look at R Admin manual and its Appendix B on Unix configuration?

Yes - thanks - but I seem to have skipped Appendix B. There it says:
###
On most platforms using gcc, having ‘-O3’ in CFLAGS produces
worthwhile performance gains. On systems using the GNU linker
(especially those using R as a shared library), it is likely that
including ‘-Wl,-O1’ in LDFLAGS is worthwhile, and on recent systems
‘-Bdirect,--hash-style=both,-Wl,-O1’ is recommended at
http://lwn.net/Articles/192624/. Tuning compilation to a specific CPU
family (e.g. ‘-mtune=core2’ for gcc) can give worthwhile performance
gains, especially on older architectures such as ‘ix86’.
###

 I'll try it out when I compile R 2.9.1 and will see if it improves
compared to the default values.

Some more looking around, I found the following pages which might be helpful:

GNU optimisation page:
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Linux Review page on optimized gcc compiling:
http://linuxreviews.org/howtos/compiling/

Gentoo Wiki CFLAGS guide:
http://en.gentoo-wiki.com/wiki/CFLAGS

There seems to be a lot to learn.

>
> At the end of the day, it will most likely depend on exactly what it is you
> are trying to do, so you get back to square one and the need to profile,
> benchmark, ...

Yup - so much to benchmark and so little time...
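A quick way to compare builds before and after changing flags is to time one BLAS-bound and one interpreter-bound workload (a minimal sketch; the workload is illustrative, not Rainer's actual analysis):

```r
set.seed(1)

# BLAS-bound: sensitive to compiler flags and the BLAS the build links against
m <- matrix(rnorm(1e6), nrow = 1000)
blas.time <- system.time(crossprod(m))["elapsed"]

# interpreter-bound: sensitive to how the R engine itself was compiled
loop.time <- system.time({ s <- 0; for (i in 1:1e5) s <- s + sqrt(i) })["elapsed"]

c(blas = blas.time, loop = loop.time)
```

Running the same script under each build gives a crude but repeatable comparison.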

Cheers and thanks,

Rainer

>
> Hth, Dirk
>
> --
> Three out of two people have difficulties with fractions.



--
Rainer M. Krug, Centre of Excellence for Invasion Biology,
Stellenbosch University, South Africa

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interaction plots (six on one page)

2009-07-01 Thread ukoenig

Thank you, Jim!

It looks much better with that new aspect ratio!

Unfortunately the legend is located at the same place,
too far on the right side next to the border.
Any ideas?

Thanks, Udo



Quoting jim holtman :


add

par(mar=c(2.5,4,1,1))

just after layout
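A self-contained sketch of that fix with made-up data (the real variables are in the quoted message below); drawing the legend inside a panel also works around the right-border problem:

```r
layout(matrix(1:6, 3, 2))
par(mar = c(2.5, 4, 1, 1))   # tighter margins -> less empty space between rows

# made-up stand-ins for time, BMIakt and one response column
time  <- factor(rep(1:2, each = 20))
group <- factor(rep(0:1, 20))
y     <- rnorm(40)

interaction.plot(time, group, y, legend = FALSE)
# place the legend inside the panel instead of at the right border
legend("topleft", legend = c("1", "0"), title = "BMIakt",
       lty = 1:2, bty = "n", cex = 0.8)
```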

On Tue, Jun 30, 2009 at 4:20 PM,  wrote:


#Dear R users,
#I want six interaction plots to be on one page,
#but the following problem occurs: the legend "BMIakt" appears,
#but it is exactly on the border of the plots (too far right).
#My second question is how I can reduce the empty space in the
#y-direction between the plots.

#Please have a look at my syntax below.

#Many thanks!
#Udo


myData <-
("id,BMIakt,time,thanaa,thalcho,thalino,ponaa,pocho,poino
1,0,1,1.63,0.79,0.28,1.63,0.98,0.58
2,0,2,1.7,0.72,0.37,1.83,0.97,0.42
3,1,1,1.73,0.83,0.32,2.24,1,0.88
4,1,2,1.87,0.76,0.49,1.78,0.68,0.61
5,0,1,1.81,0.99,0.47,1.98,0.96,0.6
6,0,2,1.7,0.77,0.38,1.68,0.79,0.51
7,0,1,1.79,0.95,0.48,1.65,0.9,0.73
8,0,2,2.5,0.79,0.56,2.09,0.81,0.64
9,1,1,1.63,0.71,0.36,2.13,0.98,0.68
10,1,2,1.69,0.85,0.43,2,0.94,0.59
11,1,1,1.95,1.04,0.38,1.7,0.92,0.47
12,1,2,2.16,0.84,0.25,2.01,0.73,0.38
13,0,1,1.65,0.8,0.2,1.74,0.95,0.43
14,0,2,1.83,0.96,0.59,1.88,1.2,0.73
15,1,1,2.02,0.79,0.26,2.01,0.94,0.59
16,1,2,1.71,0.57,0.42,1.87,0.73,0.59
17,1,1,1.5,0.78,0.35,1.68,0.84,0.48
18,1,2,1.4,0.66,0.43,1.87,1.02,0.39
19,1,1,1.45,0.69,0.32,1.74,0.67,0.44
20,1,2,1.65,0.88,0.27,1.7,0.87,0.55
21,1,1,1.89,0.66,0.4,1.93,0.88,0.58
22,1,2,1.71,0.81,0.34,1.87,0.8,0.53
23,1,1,1.71,0.87,0.34,1.65,1.16,0.65
24,1,2,1.82,1.29,0.49,1.98,1.31,0.57
25,1,1,1.66,0.86,0.28,1.4,0.8,0.38
26,1,2,1.82,0.82,0.45,1.4,1.1,0.68
27,1,1,1.67,0.71,0.32,1.83,0.84,0.63
28,1,2,2.06,0.69,0.41,1.62,0.9,0.57
29,1,1,1.62,0.81,0.47,1.88,1.11,0.6
30,1,2,1.76,0.77,0.71,1.74,1.2,0.55
31,0,1,1.8,0.78,0.27,1.96,0.86,0.47
32,0,2,1.7,0.63,0.35,2.22,0.83,0.58
33,1,1,1.92,0.8,0.37,1.8,0.98,0.43
34,1,2,1.94,0.84,0.48,1.92,0.86,0.61
35,1,1,1.55,0.6,0.44,1.78,0.86,0.64
36,1,2,1.68,0.61,0.39,1.84,0.85,0.65
37,0,1,1.77,0.84,0.47,1.72,0.84,0.57
38,0,2,1.79,0.85,0.55,1.89,0.85,0.54
39,1,1,1.87,0.9,0.52,2.01,1.01,0.7
40,1,2,1.91,0.72,0.28,1.81,0.78,0.65")

data <- read.table(textConnection(myData),
header=TRUE, sep=",", row.names="id")

attach(data)
layout(matrix(1:6, 3, 2))


interaction.plot(time, BMIakt, thanaa)
interaction.plot(time, BMIakt, thalcho)
interaction.plot(time, BMIakt, thalino)
interaction.plot(time, BMIakt, ponaa)
interaction.plot(time, BMIakt, pocho)
interaction.plot(time, BMIakt, poino)


#Using "locator()" would be an alternative, but the text "BMIakt" is
#missing when doing that:
interaction.plot(time, BMIakt, thanaa, legend=FALSE)
legend(locator(1), c("1", "0"), cex=0.8, col=c("black", "black"), lty=1:2)






--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?





Re: [R] odfWeave : problems with odfInsertPlot() and odfFigureCaption()

2009-07-01 Thread Emmanuel Charpentier
Dear Max,

Le mardi 30 juin 2009 à 21:36 -0400, Max Kuhn a écrit :
> Well, on this one, I might bring my record down to commercial software
> bug fix time lines...

Having actually looked at the code before wailing, I had such a hunch...

[ Snip ... ]

> We had to do some interesting things to get figure captions working
> with odfWeave. You would think that the figure caption function just
> writes out some xml and is done (like the insert plot code and the
> table caption code). Unfortunately, it doesn't.
> 
> It writes the caption to a global variable (I know, I know) and other
> code actually writes the caption. We had good reason to do this,
> mostly related to the complexity associated with the context of the
> code chunk (e.g. was it inside a table, at paragraph, etc). I've been
> thinking about this one for a while since the current implementation
> doesn't allow you to pass the figure caption to other functions (it
> writes a global variable and returns NULL). I haven't come up with a
> good solution yet.

ISTR from my Lisp/Scheme days that a good heuristic in this kind of case
is to try to build a closure and return that to a wrapper.
Unfortunately, R doesn't have anything as simple to use as let/let*, and
you have to play fast and loose with environments...

> 
> I've not been as prompt with odfWeave changes in comparison to other
> packages (many of you have figured this out). This is mostly due to
> the necessary complexity of the code. In a lot of ways, I think that
> we let the feature set be too rich and should have put some constraints
> on the availability of code chunks (e.g. only in paragraphs).

I didn't even think of this. A priori, that's hard to do (whereas
(La)TeX needs "\par" or *two* newlines to end a paragraph and
start another, oowriter will take one newline as an end of paragraph and
will emit the corresponding XML).

Come to think of it, you might try to insert a chunk in a table case, or
in a footnote...

And I don't know about other ODF tools...

>   I want
> to add more features (like the one that you are requesting), but
> seemingly small changes end up being a considerable amount of work
> (and I don't have time). I've been thinking about simplifying the code (or
> writing rtfWeave) to achieve the same goals without such a complex
> system.

What about something along the lines of:
foo <- odfTable(something)
bar <- odfInsertPlot(somethingelse)
odfCaptionize(foo, "Something")
odfCaptionize(bar, "Something else")
?

> The ODF format is a double-edged sword. It allows you to do many more
> things than something like RTF, but makes it a lot more difficult to
> program for. Take tables as an example. The last time I looked at the
> RTF specification, this was a simple markup structure. In ODF, the
> table structure is somewhat simple, but the approach to formatting the
> table is complex and not necessarily well documented.

Indeed...

>For example,
> some table style elements go into styles.xml whereas others go into
> content.xml (this might be implementation dependent)
> 
> So, I could more easily expand the functionality of odfWeave by
> imposing some constraints about where the code chunks reside. The
> earlier versions of odfWeave did not even parse the xml; we got a lot
> done via regular expressions and gsub. However, we went to a xml
> parsing approach when we found out that the context of the code chunk
> matters.
> 
> Feedback is welcome: do people actually use code chunks in places
> other than paragraphs?
> 

I didn't even think of it. But now that you mention it, you've given me a
nice idea for a workaround ... :-)

> So, back to your question, the relevant code for captions is in
> odfWeave:::withCaptionXML (odfInsertPlot uses this to write the xml).
> You can try that and I might be able to look at it in the next few
> days.

I did look at that. The problem is that your code has dire warnings not to
call it with a caption more than once in a figure=TRUE (sic...) chunk.

> Thanks,

*I* thank *you*, Max,

Emmanuel Charpentier



[R] Plot cumulative probability of beta-prime distribution

2009-07-01 Thread aledanda

Hallo,

I need your help.
I fitted a beta-prime distribution to my data, and now I need to plot the
cumulative distribution. For other distributions like the Gamma it is easy:

x <- seq (0, 100, 0.5)
plot(x,pgamma(x, shape, scale), type= "l", col="red")

but what about beta-prime? R has only pbeta, which is intended for the
beta distribution (not beta-prime).
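If X follows a beta-prime distribution with shapes (a, b), then X/(1+X) follows a beta distribution with the same shapes, so the beta-prime CDF can be built from pbeta directly (a small sketch; pbetaprime is a made-up name, and the shape values are arbitrary):

```r
# CDF of the beta-prime distribution via the beta CDF:
# if X ~ BetaPrime(shape1, shape2) then X/(1+X) ~ Beta(shape1, shape2)
pbetaprime <- function(q, shape1, shape2) pbeta(q / (1 + q), shape1, shape2)

x <- seq(0, 100, 0.5)
plot(x, pbetaprime(x, 2, 3), type = "l", col = "red")
```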

This is what I used for the estimation of the parameters:

mleBetaprime <- function(x, start = c(1, 1)) {
  mle.Estim <- function(par) {
    shape1 <- par[1]
    shape2 <- par[2]
    # beta-prime density, vectorised over the argument x
    # (the original looped over the global posT instead of x)
    BetaprimeDensity <- x^(shape1 - 1) * (1 + x)^(-shape1 - shape2) /
      beta(shape1, shape2)
    -sum(log(BetaprimeDensity))
  }
  est <- optim(fn = mle.Estim, par = start, method = "Nelder-Mead")
  list(shape1 = est$par[1], shape2 = est$par[2])
}
posbeta1par <- fdp(posT, family= "beta1") 

Hope you can help me.

Thanks a lot!!!

Ale
-- 
View this message in context: 
http://www.nabble.com/Plot-cumulative-probability-of-beta-prime-distribution-tp24285301p24285301.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] xyplot question

2009-07-01 Thread Gabor Grothendieck
Try this:

Key <- list(text = list(c("FDP", "FDL")),
            points = list(pch = c("*", "o"), col = c("red", "blue"), cex = 2),
            space = "right")

xyplot(S ~ t | ID, key = Key, panel = function(...) {
    panel.points(..., col = rep(c("red", "blue"), 3:4),
                 pch = rep(c("*", "o"), 3:4), cex = 2)
    panel.lines(..., col = 1, lty = 2)
})



On Wed, Jul 1, 2009 at 2:12 AM, jlfmssm wrote:
> I have a data set like this
>
>
> ID=c("A","A","A","A","A","A","A","B","B","B","B","B","B","B")
> s=c(1.1,2.2,1.3,1.1,3.1,4.1,4.2,1.1,2.2,1.3,1.1,3.1,4.1,4.2)
> d=c(1,2,3,4,5,6,7,1,2,3,4,5,6,7)
> t=c(-3,-2,-1,0,1,2,3,-3,-2,-1,0,1,2,3)
>
> mydata<-data.frame(cbind(as.character(ID),as.numeric(s),as.integer(d),as.numeric(t)))
> colnames(mydata)=c("ID","S","d","t")
>
> I want to use xyplot in lattice to draw a plot using the following code:
>
> attach(mydata)
> library(lattice)
>
> xyplot(S~t|ID,type="b")
>
> Now I want to label the line from  -3  to  -1 with one type( for example,
> red with "*") , the line from 0 to 3 with another type(blue with"o"),
> and give text label says: red "*"="FDP", blue"o"="FDL"
>
> Does anyone know how to do this?
>
> thanks,
>
> jlfmssm
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] R: fitting in logistic model

2009-07-01 Thread Renzi Fabrizio
Thank you very much for your answer. 

Mark hit the point of my query. Now we need somebody who knows how R
computes the fitted values, and why it does not use the inverse link...
In my humble opinion, R uses a kind of interpolation, using
some standard points (with the minimum value 2.220446e-16)... but it is
just a surmise.
I think it would be useful to investigate the glm function in the
editor; I tried to do it, but I didn't understand anything...
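One hedged observation that may settle the surmise: 2.220446e-16 is exactly .Machine$double.eps, and the compiled inverse-logit that glm uses clamps extreme linear predictors, so fitted probabilities never go below machine epsilon, while the "manual" formula keeps underflowing. A quick check:

```r
.Machine$double.eps          # 2.220446e-16, the floor seen in the fitted values

eta <- -400                  # an extreme negative linear predictor
exp(eta) / (1 + exp(eta))    # "manual" inverse link: vastly smaller than eps
binomial()$linkinv(eta)      # clamped near .Machine$double.eps
```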

Fabrizio Renzi

-Messaggio originale-
Da: ted.hard...@manchester.ac.uk [mailto:ted.hard...@manchester.ac.uk] 
Inviato: 30 June 2009 19:54
A: Marc Schwartz
Cc: r-help@r-project.org; Renzi Fabrizio
Oggetto: Re: [R] fitting in logistic model


On 30-Jun-09 17:41:20, Marc Schwartz wrote:
> On Jun 30, 2009, at 10:44 AM, Ted Harding wrote:
> 
>>
>> On 30-Jun-09 14:52:20, Marc Schwartz wrote:
>>> On Jun 30, 2009, at 4:54 AM, Renzi Fabrizio wrote:
>>>
 I would like to know how R computes the probability of an event
 in a logistic model (P(y=1)) from the score s, linear combination
 of x and beta.

 I noticed that there are differences (small, less than e-16)
between
 the fitting values automatically computed in the glm procedure by
R,
 and the values "manually" computed by me applying the reverse
 formula p=e^s/(1+e^s); moreover I noticed  that the minimum value
 of the fitting values in my estimation is 2.220446e-16, and there
 are many observation with this probability (instead the minimum
 value obtained by "manually" estimation is 2.872636e-152).
>>>
>>> It would be helpful to see at least a subset of the output from your
>>> model and your manual computations so that we can at least visually
>>> compare the results to see where the differences may be.
>>>
>>> The model object returned from using glm() will contain both the
>>> linear predictors on the link scale (model$linear.predictors) and
>>> the fitted values (model$fitted.values). The latter can be accessed
>>> using the fitted() extractor function.
>>>
>>> To use an example, let's create a simple LR model using the infert
>>> data set as referenced in ?glm.
>>>
>>> model1 <- glm(case ~ spontaneous + induced, data = infert,
>>>  family = binomial())
>>>
 model1
>>> Call:  glm(formula = case ~ spontaneous + induced,
>> family = binomial(), data = infert)
>>>
>>> Coefficients:
>>> (Intercept)  spontaneous  induced
>>> -1.7079   1.1972   0.4181
>>>
>>> Degrees of Freedom: 247 Total (i.e. Null);  245 Residual
>>> Null Deviance:316.2
>>> Residual Deviance: 279.6  AIC: 285.6
>>>
>>> # Get the coefficients
 coef(model1)
>>> (Intercept) spontaneous induced
>>>  -1.7078601   1.1972050   0.4181294
>>>
>>> # get the linear predictor values
>>> # log odds scale for binomial glm
 head(model1$linear.predictors)
>>>           1           2           3           4           5           6
>>>  1.10467939 -1.28973068 -0.87160128 -0.87160128 -0.09252564  0.32560375
>>>
>>> You can also get the above by using the coefficients and the model
>>> matrix for comparison:
>>> # the full set of 248
>>> # coef(model1) %*% t(model.matrix(model1))
 head(as.vector(coef(model1) %*% t(model.matrix(model1))))
>>> [1]  1.10467939 -1.28973068 -0.87160128 -0.87160128 -0.09252564  0.32560375
>>>
>>> # get fitted response values (predicted probs)
 head(fitted(model1))
>>> 1 2 3 4 5 6
>>> 0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893
>>>
>>> We can also get the fitted values from the linear predictor values
>>> by using:
>>>
>>> LP <- model1$linear.predictors
 head(exp(LP) / (1 + exp(LP)))
>>>1 2 3 4 5 6
>>> 0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893
>>>
>>> You can also get the above by using the predict.glm() function with
>>> type = "response". The default type of "link" will get you the
>>> linear
>>> predictor values as above. predict.glm() would typically be used to
>>> generate predictions using the model on new data.
>>>
 head(predict(model1, type = "response"))
>>> 1 2 3 4 5 6
>>> 0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893
>>>
>>> In glm.fit(), which is the workhorse function in glm(), the fitted
>>> values returned in the model object are actually computed by using
>>> the inverse link function for the family passed to glm():
>>>
 binomial()$linkinv
>>> function (eta)
>>> .Call("logit_linkinv", eta, PACKAGE = "stats")
>>> 
>>>
>>> Thus:
 head(binomial()$linkinv(model1$linear.predictors))
>>> 1 2 3 4 5 6
>>> 0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893
>>>
>>> So those are the same values as we saw above using the other
methods.
>>> So, all is consistent across the various methods.
>>>
>>> Perhaps the above provides some 

Re: [R] conditional coloring of output text in console or in GUI

2009-07-01 Thread Paolo
Christopher W. Ryan  binghamton.edu> writes:

> 
> suppose I have some logical vector
> 
> x <- as.logical(c(0,0,0,1,0,0,1,1,0))
> x
> 
> How would I make the words TRUE appear on the screen in a different
> color from the words FALSE?
> 
> Thanks.
> 
> --Chris

# install.packages("xterm256")
library(xterm256)

cat(style(x, fg=c(0,0,0,1,0,0,1,1,0)))

HIH

Paolo



Re: [R] hierarchical clustering - variable selection

2009-07-01 Thread Alexander.Herr
Maybe I haven't been sufficiently clear on what I am after:

I am looking for R adaptations of approaches (relevant to hierarchical 
clustering of categorical variables) described in  

Steinley and Brusco 2008 "Selection of variables in cluster analysis: an 
empirical comparison of eight procedures" Psychometrika, 73,1, 125-144


Thanx
Herry

-Original Message-
From: Dylan Beaudette [mailto:dylan.beaude...@gmail.com] 
Sent: Wednesday, July 01, 2009 2:01 PM
To: Herr, Alexander Herr - Herry (CSE, Gungahlin)
Cc: r-help@r-project.org
Subject: Re: [R] hierarchical clustering - variable selection

varclus() in the Hmisc package might be what you are looking for.

Dylan

On Tue, Jun 30, 2009 at 7:27 PM,  wrote:
>
> Hi List,
>
> I am looking for a procedure that allows selection of variables in a 
> clustering attempt.
>
> Specifically I am searching for a way of selecting out noise variables from a 
> set of numeric/categorical variables (or of course selecting "non-noise" 
> variables).
>
> The procedure should work with gower/ward metric/method. So far I have only 
> found procedures that deal with numerical variables.
>
> Any hints to packages/procedure would be appreciated
>
> Thanks
> Herry
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Question about creating lists with functions as elements

2009-07-01 Thread Craig P. Pyrame

Dear Duncan and Rolf,

That's funny!  Thanks a lot.

Best regards,
Craig

Duncan Murdoch wrote:

On 30/06/2009 5:11 PM, Craig P. Pyrame wrote:

Dear Rolf,

What do you mean?


He was talking about the fortunes package.  Install it, type 
fortune(), and you'll get a fortune cookie message.  Maybe one with 
your name on it.


Duncan Murdoch



Best regards,
Craig

Rolf Turner wrote:

On 1/07/2009, at 12:34 AM, Craig P. Pyrame wrote:


... it's probably not a good idea to submit bug
reports just because I misunderstand what R does.


Gotta be a fortune!!!

cheers,

Rolf

##
Attention: This e-mail message is privileged and confidential. If 
you are not the intended recipient please delete the message and 
notify the sender. Any views or opinions presented are solely those 
of the author.


This e-mail has been scanned and cleared by MailMarshal 
www.marshalsoftware.com

##








[R] streamlines/vectors

2009-07-01 Thread Andrea Storto

Dear all,

Does anyone know whether there exists an R function
in some package to compute streamlines
given the two components of a vector field?

Thanks in advance,

Cheers
Andrea



[R] simple question

2009-07-01 Thread David Hugh-Jones
Hello all

I have a fit resulting from a call to glm. Now, I would like to extract the
model frame MF, and add some variables
from the original data frame DF. To do this, I need to know which rows in DF
correspond to rows in MF (since some were dropped by na.omit). How can I do
this? It's probably simple but the information is hard to find.


David Hugh-Jones
Post-doctoral Researcher
Max Planck Institute of Economics, Jena
http://davidhughjones.googlepages.com

[[alternative HTML version deleted]]



Re: [R] simple question

2009-07-01 Thread Peter Dalgaard
David Hugh-Jones wrote:
> Hello all
> 
> I have a fit resulting from a call to glm. 

Now, now, no reason to overreact...

("Women can have fits upstairs" -- sign in Indian tailor shop)

> Now, I would like to extract the
> model frame MF, and add some variables
> from the original data frame DF. To do this, I need to know which rows in DF
> correspond to rows in MF (since some were dropped by na.omit). How can I do
> this? It's probably simple but the information is hard to find.

na.omit leaves a footprint in, e.g.,
attr(na.omit(airquality),"na.action"). This can be used for (negative)
indexing.
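A small sketch on a built-in data set (airquality has NAs in Ozone and Solar.R); na.action() extracts the same footprint from a fitted model:

```r
DF  <- airquality                          # 153 rows, some incomplete
fit <- glm(Ozone ~ Solar.R + Wind, data = DF)
mf  <- model.frame(fit)                    # rows with NAs already dropped

dropped <- as.integer(na.action(fit))      # row numbers removed from DF
DF.kept <- DF[-dropped, ]                  # DF rows aligned with mf

stopifnot(nrow(DF.kept) == nrow(mf))
```

DF.kept can now carry any extra columns from DF alongside the model-frame rows.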

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



[R] A problem on zoo object

2009-07-01 Thread Bogaso

I have a zoo object of daily data covering 10 years. Now I want to create a
list wherein each member holds one month's observations. For example, the
1st member of the list contains the daily observations of the 1st month, the
2nd member contains the daily observations of the 2nd month, etc.

Then for a particular month, I want to divide all observations into 3 parts
(arbitrary) and then want to calculate some statistics on each part for each
month. Therefore for a particular month, I will have 3 means (suppose,
statistic is mean).

Can anyone throw some light on how to do that?
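A sketch of one approach, shown on a 3-month toy series (z stands in for the real 10-year object): split by year-month, then cut each month into 3 parts:

```r
library(zoo)

# toy stand-in for the daily series
z <- zoo(rnorm(90), as.Date("2000-01-01") + 0:89)

# one list element per calendar month
by.month <- split(coredata(z), format(index(z), "%Y-%m"))

# within each month, divide the observations into 3 roughly equal parts
# and compute a statistic (here the mean) on each part
monthly.stats <- lapply(by.month, function(m) {
  parts <- split(m, cut(seq_along(m), 3, labels = FALSE))
  sapply(parts, mean)
})
```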
-- 
View this message in context: 
http://www.nabble.com/A-problem-on-zoo-object-tp24286720p24286720.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-07-01 Thread Mark Heckmann

Hi Gabor,

thanks for this great advice. Just one more question:
I cannot find how to switch off case sensitivity for the regex in the
documentation for gsubfn or strapply, as with e.g. ignore.case = TRUE in
gregexpr. Is there a way?

TIA,
Mark 

---

Mark Heckmann
+ 49 (0) 421 - 1614618
www.markheckmann.de
R-Blog: http://ryouready.wordpress.com
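One base-R route that takes ignore.case directly, for comparison (a sketch; regmatches() extracts the matched consonant runs, and only runs of exactly two are counted, matching the original spec):

```r
count_two_consonant_runs <- function(s) {
  # maximal consonant runs, matched case-insensitively
  m    <- gregexpr("[bcdfghjklmnpqrstvwxyz]+", s, ignore.case = TRUE)
  runs <- regmatches(s, m)[[1]]
  sum(nchar(runs) == 2)          # count only runs of exactly two
}

count_two_consonant_runs("hallo")   # "ll" -> 1
count_two_consonant_runs("ChESs")   # "Ch" and "Ss" -> 2
count_two_consonant_runs("screw")   # "scr" has length 3 -> 0
```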




-Ursprüngliche Nachricht-
Von: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Gesendet: Dienstag, 30. Juni 2009 18:31
An: Mark Heckmann
Cc: r-help@r-project.org
Betreff: Re: [R] Using regular expressions to detect clusters of consonants
in a string

Try this:

library(gsubfn)
s <- "mystring"
strapply(s, "[bcdfghjklmnpqrstvwxyz]+", nchar)[[1]]

which returns a vector of consonant string lengths.
Now apply your algorithm to that.
See http://gsubfn.googlecode.com for more.

On Tue, Jun 30, 2009 at 11:30 AM, Mark Heckmann wrote:
> Hi,
>
> I want to parse a string extracting the number of occurrences where two
> consonants clump together. Consider for example the word "hallo". Here I
> want the algorithm to return 1. For "chess" I want it to return 2. For
the
> word "screw" the result should be negative as it is a clump of three
> consonants not two. Also for word "abstraction" I do not want the
algorithm
> to detect two times a two consonant cluster. In this case the result
should
> be negative as well as it is four consonants in a row.
>
> str <- "hallo"
> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
> extended = TRUE)[[1]]
>
> [1] 3
> attr(,"match.length")
> [1] 3
>
> The result is correct. Now I change the word to "hall"
>
> str <- "hall"
> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
> extended = TRUE)[[1]]
>
> [1] -1
> attr(,"match.length")
> [1] -1
>
> Here my expression fails. How can I write a correct regex to do this? I
> always encounter problems at the beginning or end of a string.
>
> Also:
>
> str <- "abstraction"
> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
> extended = TRUE)[[1]]
>
> [1] 4 7
> attr(,"match.length")
> [1] 3 3
>
> This also fails.
>
> Thanks in advance,
> Mark
>
> ---
> Mark Heckmann
> www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] [R-pkgs] new version of package deSolve on CRAN

2009-07-01 Thread Thomas Petzoldt

Dear R users,

an improved version of package deSolve (version 1.3) is now available on
CRAN. deSolve, the successor of R package odesolve, is a package to
solve initial value problems (IVP) of:

- ordinary differential equations (ODE),
- differential algebraic equations (DAE) and
- partial differential equations (PDE).

The implementation includes stiff integration routines based on the
ODEPACK Fortran codes (Hindmarsh 1983). It also contains fixed and
adaptive time step Runge-Kutta solvers and the Euler method.


Main improvements:

- new introductory tutorial (vignette) "Solving initial value
differential equations in R",

- all integrators are now implemented in compiled languages (Fortran
resp. C). Now all solvers allow to call compiled models directly (see
package vignette "Writing Code in Compiled Language").

- new function ode.3d that supplements the already existing
ode.1d and ode.2d,

- new function "diagnostics",

- many small improvements, especially of the documentation.


Have fun!

Thomas Petzoldt, Karline Soetaert, Woodrow Setzer

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Clearing out or reclaiming memory

2009-07-01 Thread Kenn Konstabel
On Wed, Jul 1, 2009 at 3:02 AM, gug  wrote:

>
>  sapply(ls(), function(x) object.size(get(x)))
> -This lists all objects with the memory each is using (I should be honest
> and say that, never having used "sapply" before, I don't truly understand
> the syntax of this, but it seems to work).
>

In this particular case (getting the size of each object in the global
environment) you can also do:

eapply(.GlobalEnv, object.size)
# or eapply(.GlobalEnv, object.size, all.names=TRUE) , see ?eapply

Eapply applies a function (here:object.size) to every object in an
environment; with `sapply` you first use ls() to get the names of all
objects as a character vector, and then you need `get` because you probably
want the sizes of objects themselves, not their names -- so it can't be
just  sapply(ls(), object.size) .

Regards,

Kenn

[[alternative HTML version deleted]]



[R] Are there any bloggers amoung us going to useR 2009 ?

2009-07-01 Thread Tal Galili
(*note*: This is an R community question, not a statistical nor coding
question. Since this is my first time writing such a post, I hope no one
will take offence at it.)



Hello all,
I will be attending useR 2009 next week, and was wondering whether any
of you are *bloggers* intending to participate and report on useR 2009.
If so - I would love to know your blogs' URLs so as to follow you.

I am planning on bringing a laptop and seeing what I'll be able to report
on: www.r-statistics.com (which is very empty for now)


See you there,
Tal









-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]



[R] (no subject)

2009-07-01 Thread Andriy Fetsun
Hi,

1.) I am trying to calculate the autocorrelation function for returns based
on a rolling window, but it doesn't work.

My code is

rollapply(Returns,20,acf).

2.) My next try is

 rollapply(Returns_2,20,cor)
Error in FUN(cdata[st, i], ...) :  supply both 'x' and 'y' or a matrix-like
'x'

Thank you in advance!

-- 
Best regards,

Andy



Re: [R] grep on vectors?

2009-07-01 Thread Allan Engelhardt

On 30/06/09 17:53, Chuck White wrote:

[...]

Is there a way to avoid the for loop? The following seems to work:
   lapply(density.factor,grep,names(data.df))
However, that produces a list of lists which need to be merged. Note that in 
the above example since we have 2 regular expressions, there will be two lists 
but in the general case there will be many more.
This hides, rather than avoids, the for loop, but if you are happy with the 
lapply() approach then just use unlist() on the result:


unlist(lapply(density.factor, grep, names(data.df)))

I wouldn’t worry about optimizing performance: it isn’t the sort of 
thing you are going to be running a million times per second.  Keep it 
understandable and maintainable.
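A small self-contained sketch of the pattern (the data frame and patterns
here are made-up stand-ins for the poster's objects):

```r
# Hypothetical stand-ins for the thread's objects:
data.df <- data.frame(a_low = 1, b_high = 2, c_low = 3, d_mid = 4)
density.factor <- c("low", "high")   # two regular expressions

# One grep() per pattern, then flatten the list of index vectors:
hits <- unlist(lapply(density.factor, grep, names(data.df)))
hits   # column indices matching "low", then those matching "high"
```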


Hope this helps a little.

Allan.



Re: [R] conditional coloring of output text in console or in GUI

2009-07-01 Thread Jim Lemon

Christopher W. Ryan wrote:

suppose I have some logical vector

x <- as.logical(c(0,0,0,1,0,0,1,1,0))
x

How would I make the words TRUE appear on the screen in a different
color from the words FALSE?

  

tfsample<-as.logical(sample(c(0:1),10,TRUE))
plot(1:10,type="n")
text(1:10,1:10,tfsample,col=ifelse(tfsample,"red","blue"))

Jim



Re: [R] conditional coloring of output text in console or in GUI

2009-07-01 Thread Romain Francois

Hi,

You might want to check package xterm256.
http://cran.r-project.org/web/packages/xterm256/index.html
http://romainfrancois.blog.free.fr/index.php?post/2009/04/18/Colorful-terminal%3A-the-R-package-%22xterm256%22

This works if the terminal you are using recognizes xterm escape 
sequences as defined on this page:

http://frexx.de/xterm-256-notes/

It would be possible for other GUIs to support these escape sequences.

Romain


On 06/30/2009 08:28 PM, Christopher W. Ryan wrote:

suppose I have some logical vector

x<- as.logical(c(0,0,0,1,0,0,1,1,0))
x

How would I make the words TRUE appear on the screen in a different
color from the words FALSE?

Thanks.

--Chris



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [R] (no subject)

2009-07-01 Thread Gabor Grothendieck
On Wed, Jul 1, 2009 at 6:22 AM, Andriy Fetsun wrote:
> Hi,
>
> 1.) I am trying to calculate the autocorrelation function for returns based
> on rolling window, but it doesn't work.
>
> My code is
>
> rollapply(Returns,20,acf).

That's because acf() returns an object (a list), not a plain vector.  Try this:

rollapply(z, 20, function(x) acf(x)$acf)
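To see what is being extracted, here is the same idea on a plain series
(the series itself is made up); the numbers live in the acf object's $acf
slot, and that is what the anonymous function above pulls out:

```r
# acf() returns an "acf" object (a list); $acf holds the numbers.
set.seed(1)
x <- cumsum(rnorm(100))                  # made-up example series
a <- acf(x, lag.max = 5, plot = FALSE)
a$acf                                    # element 1 is lag 0 (always 1)
lag1 <- a$acf[2]                         # the lag-1 autocorrelation
```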



Re: [R] A problem on zoo object

2009-07-01 Thread Gabor Grothendieck
Try this:

z <- zooreg(1:365, start = as.Date("2001-01-01"), freq = 1)
f <- head
tapply(seq_along(z), as.yearmon(time(z)), function(ix) f(z[ix]))

where you should replace f with a function that does whatever
you want with each month's data.  Here we just used head as
an example.
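The second part of the question (three means per month) can be sketched in
base R as well, without zoo; dates and values here are made up:

```r
# Split daily values by month, then cut each month into 3 roughly equal
# parts and compute a mean per part.
dates  <- seq(as.Date("2001-01-01"), as.Date("2001-03-31"), by = "day")
values <- seq_along(dates)                     # made-up data
by.month <- split(values, format(dates, "%Y-%m"))
stats <- lapply(by.month, function(v) {
  part <- cut(seq_along(v), 3, labels = FALSE) # part 1, 2 or 3 within month
  tapply(v, part, mean)
})
stats[["2001-01"]]                             # three means for January
```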


On Wed, Jul 1, 2009 at 5:17 AM, Bogaso wrote:
>
> I have a zoo object on daily data for 10 years. Now I want to create a list,
> wherein each member of that list is the monthly observations. For example,
> 1st member of list contains daily observation of 1st month, 2nd member
> contains daily observation of 2nd month etc.
>
> Then for a particular month, I want to divide all observations into 3 parts
> (arbitrary) and then want to calculate some statistics on each part for each
> month. Therefore for a particular month, I will have 3 means (suppose,
> statistic is mean).
>
> Can anyone throw some light on how to do that?
> --
> View this message in context: 
> http://www.nabble.com/A-problem-on-zoo-object-tp24286720p24286720.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] is there a way to extract data from web pages through some R function ?

2009-07-01 Thread mauede
I deal with a huge amount of biology data stored in different databases.
The databases belonging to the Bioconductor organization can be accessed 
through Bioconductor packages.
Unluckily, some useful data is stored in databases like, for instance, miRDB 
and miRecords, which offer just an
interactive HTML interface. See for instance
 http://mirdb.org/cgi-bin/search.cgi, 
 
http://mirecords.umn.edu/miRecords/interactions.php?species=Homo+sapiens&mirna_acc=Any&targetgene_type=refseq_acc&targetgene_info=&v=yes&search_int=Search

Downloading data manually from the web pages is a painstaking, time-consuming 
and error-prone activity.
I came across a Python script that downloads (dumps) whole web pages into a 
text file that is then parsed.
This is possible because Python has a library to access web pages.
But I have no experience with Python programming, nor do I like a programming 
language whose syntax is indentation-sensitive.

I am *hoping* that there exists some sort of web-page/HTML connection from R 
... is there ??
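[For reference: base R can read a URL directly, e.g.
page <- readLines("http://mirdb.org/cgi-bin/search.cgi"), and the resulting
character vector can be parsed with regular expressions. A sketch of the
parsing step on a small in-memory HTML fragment (the fragment is made up):]

```r
# Base R sketch: readLines() treats a URL like a file; here we skip the
# network step and parse a made-up HTML table fragment the same way.
html <- c("<tr><td>hsa-miR-21</td><td>0.98</td></tr>",
          "<tr><td>hsa-miR-155</td><td>0.91</td></tr>")
# First turn cell separators into tabs, then strip the remaining tags:
rows <- gsub("</?t[rd]>", "", gsub("</td><td>", "\t", html))
parsed <- read.delim(textConnection(rows), header = FALSE)
parsed   # a 2 x 2 data frame: name and score
```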

Thank you very much for any suggestion.
Maura


Re: [R] Using regular expressions to detect clusters of consonants in a string

2009-07-01 Thread Gabor Grothendieck
strapply and gsubfn pass the ... argument on to gsub, so they accept
all the same arguments.  See ?strapply and ?gsubfn.  E.g.

> strapply("MyString", "[bcdfghjklmnpqrstvwxyz]+", nchar, ignore.case = TRUE)
[[1]]
[1] 5 2

> gsubfn("[bcdfghjklmnpqrstvwxyz]+", "X", "MyString", ignore.case = TRUE)
[1] "XiX"
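[As an aside, the original "exactly two consonants in a row" requirement can
also be expressed directly in base R with Perl-style lookarounds, which
forbid a consonant immediately before or after the matched pair; a sketch:]

```r
# Count runs of exactly two consonants: the lookbehind (?<!...) and
# lookahead (?!...) reject pairs embedded in a longer consonant run.
pat <- "(?<![b-df-hj-np-tv-z])[b-df-hj-np-tv-z]{2}(?![b-df-hj-np-tv-z])"
count2 <- function(s) {
  m <- gregexpr(pat, s, ignore.case = TRUE, perl = TRUE)[[1]]
  if (m[1] == -1) 0L else length(m)
}
count2("hallo")  # 1 ("ll")
count2("hall")   # 1 (the trailing "ll" now matches too)
count2("chess")  # 2 ("ch" and "ss")
count2("screw")  # 0 ("scr" is three in a row)
```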


On Wed, Jul 1, 2009 at 5:07 AM, Mark Heckmann wrote:
>
> Hi Gabor,
>
> thanks for this great advice. Just one more question:
> I cannot find how to switch off case sensitivity for the regex in the
> documentation for gsubfn or strapply, like e.g. the ignore.case = TRUE
> argument of gregexpr.  Is there a way?
>
> TIA,
> Mark
>
> ---
>
> Mark Heckmann
> + 49 (0) 421 - 1614618
> www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
>
>
>
>
> -Ursprüngliche Nachricht-
> Von: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Gesendet: Dienstag, 30. Juni 2009 18:31
> An: Mark Heckmann
> Cc: r-help@r-project.org
> Betreff: Re: [R] Using regular expressions to detect clusters of consonants
> in a string
>
> Try this:
>
> library(gsubfn)
> s <- "mystring"
> strapply(s, "[bcdfghjklmnpqrstvwxyz]+", nchar)[[1]]
>
> which returns a vector of consonant string lengths.
> Now apply your algorithm to that.
> See http://gsubfn.googlecode.com for more.
>
> On Tue, Jun 30, 2009 at 11:30 AM, Mark Heckmann wrote:
>> Hi,
>>
>> I want to parse a string extracting the number of occurrences where two
>> consonants clump together. Consider for example the word "hallo". Here I
>> want the algorithm to return 1. For "chess" if want it to return 2. For
> the
>> word "screw" the result should be negative as it is a clump of three
>> consonants not two. Also for word "abstraction" I do not want the
> algorithm
>> to detect two times a two consonant cluster. In this case the result
> should
>> be negative as well as it is four consonants in a row.
>>
>> str <- "hallo"
>> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
>> extended = TRUE)[[1]]
>>
>> [1] 3
>> attr(,"match.length")
>> [1] 3
>>
>> The result is correct. Now I change the word to "hall"
>>
>> str <- "hall"
>> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
>> extended = TRUE)[[1]]
>>
>> [1] -1
>> attr(,"match.length")
>> [1] -1
>>
>> Here my expression fails. How can I write a correct regex to do this? I
>> always encounter problems at the beginning or end of a string.
>>
>> Also:
>>
>> str <- "abstraction"
>> gregexpr("[bcdfghjklmnpqrstvwxyz]{2}[aeiou]{1}" , str, ignore.case =TRUE,
>> extended = TRUE)[[1]]
>>
>> [1] 4 7
>> attr(,"match.length")
>> [1] 3 3
>>
>> This also fails.
>>
>> Thanks in advance,
>> Mark
>>
>> ---
>> Mark Heckmann
>> www.markheckmann.de
>> R-Blog: http://ryouready.wordpress.com
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



[R] getOptions("max.print") in R

2009-07-01 Thread saurav pathak
I am typing the following on the command prompt:

>variab = read.csv(file.choose(), header=T)

>variab

It lists the 900,000 observations in "variab" minus the 797,124 that are
omitted, and prints the following message:

[ reached getOption("max.print") -- omitted 797124 entries ]

Is there a way to see the entire set of data, i.e. all 900,000 obs, and how
can I then save this "variab"?

Thanks
Saurav

-- 
Dr.Saurav Pathak
PhD, Univ.of.Florida
Mechanical Engineering
Doctoral Student
Innovation and Entrepreneurship
Imperial College Business School
s.patha...@imperial.ac.uk
0044-7795321121



Re: [R] getOptions("max.print") in R

2009-07-01 Thread jim holtman
Change the value with 'options':

max.print: integer, defaulting to 99999. print or show methods can
make use of this option, to limit the amount of information that is
printed, to something in the order of (and typically slightly less
than) max.print entries.

Why would you want all 900,000 lines on the console?  You can save the
object with 'save'.  You might try "View" to see all the values.
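[A minimal sketch of both suggestions; 'variab' here is a small stand-in for
the poster's 900,000-row data frame, and the file paths are temporary:]

```r
# Raise the print limit, or better, write the data to files on disk.
variab <- data.frame(x = 1:10)                 # small stand-in
options(max.print = 1e6)                       # allow more printed entries
csv <- tempfile(fileext = ".csv")
write.csv(variab, csv, row.names = FALSE)      # full data as text
save(variab, file = tempfile(fileext = ".RData"))  # binary; reload with load()
```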

On Wed, Jul 1, 2009 at 8:04 AM, saurav pathak  wrote:
>
> I am typing the following on the command prompt:
>
> >variab = read.csv(file.choose(), header=T)
>
> >variab
>
> It lists 900,000 ( this is the total number of observations in "variab" )
> minus 797124 observations and prompts the following message
>
> [ reached getOption("max.print") -- omitted 797124 entries ]]
>
> Is there a way to see the entire set of data, ie all of 900,000 obs, and how
> to then save this "variab"
>
> Thanks
> Saurav
>
> --
> Dr.Saurav Pathak
> PhD, Univ.of.Florida
> Mechanical Engineering
> Doctoral Student
> Innovation and Entrepreneurship
> Imperial College Business School
> s.patha...@imperial.ac.uk
> 0044-7795321121
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] getOptions("max.print") in R

2009-07-01 Thread Duncan Murdoch

On 01/07/2009 8:04 AM, saurav pathak wrote:

I am typing the following on the command prompt:

variab = read.csv(file.choose(), header=T)
variab


It lists 900,000 ( this is the total number of observations in "variab" )
minus 797124 observations and prompts the following message

[ reached getOption("max.print") -- omitted 797124 entries ]]

Is there a way to see the entire set of data, ie all of 900,000 obs,


You can set max.print to a larger value, e.g. options(max.print=Inf), 
and then the whole thing will print.  It won't be very useful to do that 
at the console, because most consoles don't have an infinite buffer, but 
if you're redirecting output to a file, it might be reasonable.
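[Redirecting printed output to a file, as suggested, can be sketched with
sink(); the object here is a small stand-in:]

```r
# Divert console output to a file, print, then restore normal printing.
x <- data.frame(a = 1:5)              # stand-in for a large data frame
out <- tempfile(fileext = ".txt")
sink(out)                             # start diverting output
print(x)
sink()                                # stop diverting
readLines(out)                        # the printed table, now on disk
```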


> and how
> to then save this "variab"

I don't understand this question.  Nothing about variab was lost, it 
just wasn't printed for you.  You can save it just like any other 
variable, using save(variab, file=...) to save in binary format.


Duncan Murdoch




Thanks
Saurav





Re: [R] running count in data.frame

2009-07-01 Thread Mark Knecht
Yes, Jim, thanks. That's what I was looking for. My mistake was in the [pos] block.

Cheers,
Mark

On Tue, Jun 30, 2009 at 8:04 PM, jim holtman wrote:
> Not exactly sure what you want to count.  Does this do what you want (made a
> change in RunningCount)
>

>> RunningCount = function (MyFrame) {
> + ## Running count of p & l events
> +
> +    pos <- (MyFrame$p > 0)
> +    MyFrame$pc <- cumsum(as.integer(pos))
> +    pos <- (MyFrame$l < 0)
> +    MyFrame$lc <- cumsum(as.integer(pos))
> +    MyFrame
> + }

>> F1 <- RunningCount(F1)
>> F1
>     x  y p  l pc lc
> 1   1 -4 0 -4  0  1
> 2   2 -3 0 -3  0  2
> 3   3 -2 0 -2  0  3
> 4   4 -1 0 -1  0  4
> 5   5  0 0  0  0  4
> 6   6  1 1  0  1  4
> 7   7  2 2  0  2  4
> 8   8  3 3  0  3  4
> 9   9  4 4  0  4  4
> 10 10  5 5  0  5  4
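[The core trick can be reproduced standalone: cumsum() on a logical gives a
running event count, since TRUE coerces to 1. A sketch that rebuilds the
posted table from scratch:]

```r
# Running counts of positive-p and negative-l events via cumsum().
F1 <- data.frame(x = 1:10, y = -4:5)
F1$p <- pmax(F1$y, 0)          # positive part of y
F1$l <- pmin(F1$y, 0)          # negative part of y
F1$pc <- cumsum(F1$p > 0)      # how many positive p's so far
F1$lc <- cumsum(F1$l < 0)      # how many negative l's so far
F1[10, c("pc", "lc")]          # 5 and 4, matching the posted output
```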



[R] Rcorr

2009-07-01 Thread James Allsopp
Hi,
I've just run an rcorr on some data in Spearman's mode and it's just
produced the following values;
  [,1]  [,2]
[1,]  1.00 -0.55
[2,] -0.55  1.00

n= 46


P
 [,1] [,2]
[1,]   0
[2,]  0

I presume this means the p-value is lower than 0.05, but is there any
way of increasing the number of significant figures shown? How should I
interpret this value?

Cheers
Jim



Re: [R] Interaction plots (six on one page)

2009-07-01 Thread ukoenig


Now it works; I modified one variable (xleg) in the function.
Thanks a lot!


Quoting jim holtman :


It appears that the legend is fixed in that location within the function.
You could modify the function to put the legend in some other location.

On Wed, Jul 1, 2009 at 3:07 AM,  wrote:


Thank you, Jim!

It looks much better with that new aspect ratio!

Unfortunately the legend is located at the same place,
too far on the right side next to the border.
Any ideas?

Thanks, Udo




Quoting jim holtman :

  add


   par(mar=c(2.5,4,1,1))

just after layout

On Tue, Jun 30, 2009 at 4:20 PM,  wrote:

  #Dear R users,

#I want six interaction plots to be on one page,
#but the following problem occurs: the legend "BMIakt" appears,
#but it is exactly on the border of the plots (too far right).
#My second question is how I can reduce the empty space in the
#y-direction between the plots.

#Please have a look at my syntax below.

#Many thanks!
#Udo


myData <-
("id,BMIakt,time,thanaa,thalcho,thalino,ponaa,pocho,poino
1,0,1,1.63,0.79,0.28,1.63,0.98,0.58
2,0,2,1.7,0.72,0.37,1.83,0.97,0.42
3,1,1,1.73,0.83,0.32,2.24,1,0.88
4,1,2,1.87,0.76,0.49,1.78,0.68,0.61
5,0,1,1.81,0.99,0.47,1.98,0.96,0.6
6,0,2,1.7,0.77,0.38,1.68,0.79,0.51
7,0,1,1.79,0.95,0.48,1.65,0.9,0.73
8,0,2,2.5,0.79,0.56,2.09,0.81,0.64
9,1,1,1.63,0.71,0.36,2.13,0.98,0.68
10,1,2,1.69,0.85,0.43,2,0.94,0.59
11,1,1,1.95,1.04,0.38,1.7,0.92,0.47
12,1,2,2.16,0.84,0.25,2.01,0.73,0.38
13,0,1,1.65,0.8,0.2,1.74,0.95,0.43
14,0,2,1.83,0.96,0.59,1.88,1.2,0.73
15,1,1,2.02,0.79,0.26,2.01,0.94,0.59
16,1,2,1.71,0.57,0.42,1.87,0.73,0.59
17,1,1,1.5,0.78,0.35,1.68,0.84,0.48
18,1,2,1.4,0.66,0.43,1.87,1.02,0.39
19,1,1,1.45,0.69,0.32,1.74,0.67,0.44
20,1,2,1.65,0.88,0.27,1.7,0.87,0.55
21,1,1,1.89,0.66,0.4,1.93,0.88,0.58
22,1,2,1.71,0.81,0.34,1.87,0.8,0.53
23,1,1,1.71,0.87,0.34,1.65,1.16,0.65
24,1,2,1.82,1.29,0.49,1.98,1.31,0.57
25,1,1,1.66,0.86,0.28,1.4,0.8,0.38
26,1,2,1.82,0.82,0.45,1.4,1.1,0.68
27,1,1,1.67,0.71,0.32,1.83,0.84,0.63
28,1,2,2.06,0.69,0.41,1.62,0.9,0.57
29,1,1,1.62,0.81,0.47,1.88,1.11,0.6
30,1,2,1.76,0.77,0.71,1.74,1.2,0.55
31,0,1,1.8,0.78,0.27,1.96,0.86,0.47
32,0,2,1.7,0.63,0.35,2.22,0.83,0.58
33,1,1,1.92,0.8,0.37,1.8,0.98,0.43
34,1,2,1.94,0.84,0.48,1.92,0.86,0.61
35,1,1,1.55,0.6,0.44,1.78,0.86,0.64
36,1,2,1.68,0.61,0.39,1.84,0.85,0.65
37,0,1,1.77,0.84,0.47,1.72,0.84,0.57
38,0,2,1.79,0.85,0.55,1.89,0.85,0.54
39,1,1,1.87,0.9,0.52,2.01,1.01,0.7
40,1,2,1.91,0.72,0.28,1.81,0.78,0.65")

data <- read.table(textConnection(myData),
   header=TRUE, sep=",", row.names="id")

attach(data)
layout(matrix(1:6, 3, 2))


interaction.plot(time, BMIakt, thanaa)
interaction.plot(time, BMIakt, thalcho)
interaction.plot(time, BMIakt, thalino)
interaction.plot(time, BMIakt, ponaa)
interaction.plot(time, BMIakt, pocho)
interaction.plot(time, BMIakt, poino)


#Using "locator()" would be an alternative, but the text "BMIakt" is
missing,
#doing that
interaction.plot(time, BMIakt175, thanaa, legend=FALSE)
legend(locator(1), c("1", "0"), cex=0.8, col=c("black", "black"),
lty=1:2)
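[A common way to control legend placement, without locator(), is to suppress
the built-in legend and draw one at a keyword position; a sketch with
made-up data, variable names mirroring the thread's:]

```r
# Suppress interaction.plot's own legend and place one explicitly.
set.seed(1)
time   <- rep(1:2, 20)
BMIakt <- rep(0:1, each = 20)
thanaa <- rnorm(40, mean = 1.8, sd = 0.2)    # made-up response
interaction.plot(time, BMIakt, thanaa, legend = FALSE, lty = 1:2)
legend("topleft", legend = c("1", "0"), title = "BMIakt",
       lty = 1:2, cex = 0.8, bty = "n")
```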

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?








--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?





[R] Difficulty in calculating MLE through NLM

2009-07-01 Thread Madan Gopal Kundu
Hi R-friends,

Attached is the SAS XPORT file that I have imported into R using following code
library(foreign)
mydata<-read.xport("C:\\ctf.xpt")
print(mydata)

I am trying to maximize logL in order to find the maximum likelihood estimates 
(MLEs) of 5 parameters (alpha1, beta1, alpha2, beta2, p) using the nlm function 
in R, as follows.

# Defining Log likelihood - In the function it is noted as logL
> library(stats)
> loglike1<- function(x)
+ {
+ alpha1<-x[1]
+ beta1<-x[2]
+ alpha2<-x[3]
+ beta2<-x[4]
+ p<-x[5]
+ n<- mydata[3]
+ e<-mydata[4]
+ f1<- 
((1+beta1/e)^(-n))*((1+e/beta1)^(-alpha1))*gamma(alpha1+n)/(gamma(n+1)*gamma(alpha1))
+ f2<- 
((1+beta2/e)^(-n))*((1+e/beta2)^(-alpha2))*gamma(alpha2+n)/(gamma(n+1)*gamma(alpha2))
+ logL=sum(log(p*f1+(1-p)*f2))
+ logL<- -logL
+ }

# Supplying starting parameter values
> theta<-c(.2041,.0582, 1.4150, 1.8380,0.0969)

# Calculating MLE using NLM function
> result<- nlm(loglike1, theta, hessian=TRUE, print.level=1)

Now the problem: this is not working, as there is no improvement of the final 
parameter estimates over the starting values, and nlm stops after just one 
iteration with the gradient of all 5 parameters reported as zero. I have tried 
other sets of starting values, but I still get final parameter estimates 
similar to the starting values and the iteration stops after one step. 
When I check for warnings, R displays the following kinds of warnings:

 *   NA/Inf replaced by maximum positive value
 *   value out of range in 'gammafn'

Please suggest what I should do. I expect the final MLEs of alpha1, 
alpha2, beta1 and beta2 to be greater than 0, and p should lie between 0 and 1.
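[Two editorial asides. First, mydata[3] returns a one-column data frame;
mydata[[3]] extracts the numeric vector, which is usually what a likelihood
needs. Second, a standard way to make nlm() respect such constraints is to
optimize transformed parameters; a generic sketch on a made-up exponential
mixture, not the poster's actual likelihood:]

```r
# Optimize unconstrained parameters; map back inside the objective so the
# rates stay positive (exp) and the mixing weight stays in (0,1) (plogis).
negloglik <- function(theta, x) {
  rate1 <- exp(theta[1])          # > 0
  rate2 <- exp(theta[2])          # > 0
  p     <- plogis(theta[3])       # in (0, 1)
  -sum(log(p * dexp(x, rate1) + (1 - p) * dexp(x, rate2)))
}
set.seed(42)
x <- c(rexp(200, rate = 2), rexp(100, rate = 0.5))   # made-up mixture data
fit <- nlm(negloglik, c(0, -1, 0), x = x, hessian = TRUE)
exp(fit$estimate[1:2])    # back-transformed rate estimates
plogis(fit$estimate[3])   # back-transformed mixing proportion
```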

Thanks & Regards,
Madan Gopal Kundu
Biostatistician, CDM, MACR, Ranbaxy Labs. Ltd.
Tel(O): +91 (0) 1245194045 - Mobile: +91 (0) 9868788406





Re: [R] Rcorr

2009-07-01 Thread Frank E Harrell Jr

James Allsopp wrote:

Hi,
I've just run an rcorr on some data in Spearman's mode and it's just
produced the following values;
  [,1]  [,2]
[1,]  1.00 -0.55
[2,] -0.55  1.00

n= 46


P
 [,1] [,2]
[1,]   0
[2,]  0

I presume this means the p-value is lower than 0.5, but is there any
way of increasing the number of significant figures used? How should I
interpret this value?

Cheers
Jim



Try options(digits=15) before running rcorr().

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] multiple univariate Anova without any loop?

2009-07-01 Thread Sungeun Kim
Hello all,

I have one question about how to speed up running multiple univariate ANOVAs.

I have multiple observations for a group of subjects and want to run a
univariate ANOVA using car::Anova with type 3 sums of squares. The current
implementation uses the apply() function. However, the number of
observations is very large (around 100,000), so I am wondering whether
there is another way to do this without using any loop.

Any idea or suggestion will be very helpful.

Thank you in advance.


-- 
Sungeun Kim



Re: [R] difference between "names", "colnames" and "dimnames"

2009-07-01 Thread Don MacQueen

I think your problem is with plotting, not with naming.
Tell the list what kind of plot you're doing 
(with example code, of course) and where you need 
to see names on the plot.


 (What do you have in mind when you say names for 
the "whole" matrix? There are row names, and 
column names, and that's about it.)


-Don
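[For the record: on a matrix the per-dimension setters are rownames() and
colnames(), or dimnames() for both at once; names() on a matrix instead
tries to label every single element, which is what produced the NA-padded
attribute in the question. A sketch:]

```r
# Label a matrix's rows and columns, then index by name.
m <- matrix(1:6, nrow = 2)
rownames(m) <- c("CLS", "ORD")
colnames(m) <- c("uno", "dos", "tres")
# equivalent: dimnames(m) <- list(c("CLS","ORD"), c("uno","dos","tres"))
m["CLS", "dos"]   # element in row "CLS", column "dos"
```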

At 6:08 PM -0500 6/30/09, Germán Bonilla wrote:


Hi all...

I built a matrix binding vectors with rbind, and have something like this:

[,1] [,2][,3] [,4] [,5] [,6]  [,7] [,8]
CLS 3.877328 4.087636 4.72089 4.038361 3.402942 2.786285  2.671222 3.276419
ORD  NaN  NaN NaN  NaN 5.770780 5.901113 11.888054 7.934823
FAM  NaN  NaN NaN  NaN  NaN 3.699455  4.551196 2.885390
GEN  NaN  NaN NaN  NaN  NaN 3.967411  4.390296 2.885390
SPP  NaN  NaN NaN  NaN  NaN  NaN   NaN 2.885390

Then I tried to assign names to each column with names(), but end up with
the following:


 names(tester) <-

c("uno","dos","tres","cuatro","cinco","seis","siete","ocho")

[,1] [,2][,3] [,4] [,5] [,6]  [,7] [,8]
CLS 3.877328 4.087636 4.72089 4.038361 3.402942 2.786285  2.671222 3.276419
ORD  NaN  NaN NaN  NaN 5.770780 5.901113 11.888054 7.934823
FAM  NaN  NaN NaN  NaN  NaN 3.699455  4.551196 2.885390
GEN  NaN  NaN NaN  NaN  NaN 3.967411  4.390296 2.885390
SPP  NaN  NaN NaN  NaN  NaN  NaN   NaN 2.885390
attr(,"names")
 [1] "uno""dos""tres"   "cuatro" "cinco"  "seis"   "siete"  "ocho"
 [9] NA   NA   NA   NA   NA   NA   NA   NA
[17] NA   NA   NA   NA   NA   NA   NA   NA
[25] NA   NA   NA   NA   NA   NA   NA   NA
[33] NA   NA   NA   NA   NA   NA   NA   NA

I can use colnames(tester), but then I cannot identify the colnames on the
points when I plot them.

How can I set the names(tester) for the whole matrix?

Thanks a lot.

Germán,
UNAM

[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[R] help with superscripts in simple plots

2009-07-01 Thread Steve_Friedman

Hello All,

When I use the following lines of code to create a plot and add labels
with R-squared values, the labels have a properly superscripted R2.

  library(lattice)

 xyplot(PropHatchedNests$Phatched + PropHatchedNests$PropNests +
PropHatchedNests$meanHSI + PropHatchedNests$RelMeanEggsNest ~
PropHatchedNests$Year, type = "b",
scales=list(tick.number=length(PropHatchedNests$Year)), ylab="Score",
xlab="Year", pch=c(1,4,5), col= c("black","blue", "red", "darkgreen"),
   lty = c(1,12,9,16),  main="Shark Slough Alligators")
  trellis.focus()
panel.text(x=1994, y=0.45, labels="Relative Number Nests",
cex=0.75)
   panel.text(1994, y=0.4, label=bquote(R^2 == .(R17)), cex=0.75 )
panel.text(x=1996, y=0.8, labels="Mean API", cex = 0.75)
   panel.text(x=1996, y=0.13, labels="Proportion of Hatched Eggs" ,
cex=0.75)
 panel.text(1996,  y=0.08, label=bquote(R^2 == .(R55)),
cex=0.75)
panel.text(x=1989, y=1.0, labels="Relative Mean Number Eggs Per
Nests Per Year" , cex = 0.75)
 panel.text(x= 1989, y = 0.93, label=bquote(R^2 == .(R44)), cex
= 0.75)
   trellis.unfocus()


But when I use the following in a much simpler plot, I can't seem to get it
to work correctly. Is there a different way of using text with
superscripts that is not associated with lattice plots?

plot(PropHatchedNests$PropNests, PropHatchedNests$meanHSI, ylab="Mean HSI",
xlab="Proportion of Nests", xlim=c(0,1), ylim=c(0,1),
   text(0.25, 0.9, "B   R^2 = 0.17"))
abline(lm(PropHatchedNests$meanHSI ~ PropHatchedNests$PropNests))
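[Editorial note: plotmath works identically in base graphics; the quoted
string "R^2" prints literally, while an expression or bquote() result
renders the superscript. A sketch with made-up data and a stand-in R17:]

```r
# Pass an expression, not a string, to text() for a real superscript.
R17 <- 0.17                     # stand-in for the computed R-squared
plot(1:10, 1:10, xlab = "Proportion of Nests", ylab = "Mean HSI")
text(2.5, 9, bquote(B ~ R^2 == .(R17)))   # renders "B  R-squared = 0.17"
# fixed alternative: text(2.5, 8, expression(R^2 == 0.17))
```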

All suggestions will be appreciated.

Running:  R version 2.8.1 (2008 - 12-22) Windows XP

Thanks
Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147



Re: [R] How do I change which R Graphics Device is active?

2009-07-01 Thread Don MacQueen

My version would be

  newDev <-  function() { dev.new(); invisible( dev.cur() ) }

I agree with Hadley that return() is redundant in this instance. 
Using invisible() suppresses automatic printing of the returned value 
when it is not being assigned to a variable, thus making it more like 
dev.new().


While we're at it, consider

  newDev <-  function(...) { dev.new(...); invisible( dev.cur() ) }

which should allow one to pass through optional arguments (which only 
makes sense if they're valid for dev.new(), of course).


-Don

At 7:18 PM -0500 6/30/09, hadley wickham wrote:

On Tue, Jun 30, 2009 at 2:12 PM, Barry
Rowlingson wrote:

 On Tue, Jun 30, 2009 at 8:05 PM, Mark Knecht wrote:


 You could wrap it in a function of your own making, right?

 AddNewDev = function() {dev.new();AddNewDev=dev.cur()}

 histPlot=AddNewDev()

 Seems to work.


  You leaRn fast :) Probably better style is:


 >  newDev = function(){dev.new();return(dev.cur())}


  - which returns the value explicitly with return().


R isn't C! ;)  I'd claim idiomatic R only uses return for special
cases (i.e. when you can terminate the function early)

Hadley

--
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[R] comparision among multiple subgroups

2009-07-01 Thread sdzhangping
Dear R users:

In my recent work, I compared the cumulative incidences among three 
different treatment groups. The cuminc function (cmprsk package) yielded a 
graph (refer to figure 1) and a p value (p = 0.0007). I don't know how to 
interpret this p value (one p value for three subgroups). 
Similar puzzles arise in the log-rank test of survival data (figure 2). 

   What does the p value mean in these conditions? What packages and 
functions should I apply?

   thanks 

  Ping Zhang




Re: [R] How do I change which R Graphics Device is active?

2009-07-01 Thread Barry Rowlingson
On Wed, Jul 1, 2009 at 3:30 PM, Don MacQueen wrote:
> My version would be
>
>  newDev <-  function() { dev.new(); invisible( dev.cur() ) }
>
> I agree with Hadley that return() is redundant in this instance. Using
> invisible() suppresses automatic printing of the returned value when it is
> not being assigned to a variable, thus making it more like dev.new().

 Hmmm. I really like using explicit return calls in my functions. It
seems, to me, to make it clear that I intend to return a value and
that value is going to be useful. How can others tell (without looking
at the ample documentation, of course) that you intend your function
to have a meaningful return value and it's not just "falling through"?

As The Zen of Python puts it:

 Explicit is better than implicit.
 Readability counts.

but you know, hey, whatever you want to do :)

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I change which R Graphics Device is active?

2009-07-01 Thread Mark Knecht
On Wed, Jul 1, 2009 at 7:45 AM, Barry
Rowlingson wrote:
> On Wed, Jul 1, 2009 at 3:30 PM, Don MacQueen wrote:
>> My version would be
>>
>>  newDev <-  function() { dev.new(); invisible( dev.cur() ) }
>>
>> I agree with Hadley that return() is redundant in this instance. Using
>> invisible() suppresses automatic printing of the returned value when it is
>> not being assigned to a variable, thus making it more like dev.new().
>
>  Hmmm. I really like using explicit return calls in my functions. It
> seems, to me, to make it clear that I intend to return a value and
> that value is going to be useful. How can others tell (without looking
> at the ample documentation, of course) that you intend your function
> to have a meaningful return value and it's not just "falling through"?
>
> As The Zen of Python puts it:
>
>  Explicit is better than implicit.
>  Readability counts.
>
> but you know, hey, whatever you want to do :)
>
> Barry
>

As a newbie and being that I have no formal or job related programming
experience I personally like having a return value as it makes me
think about what I'm doing. That's just me though.

- Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difficulty in calculating MLE through NLM

2009-07-01 Thread Ravi Varadhan
Hi Madan,

You are trying to find the MLE of a binary mixture distribution.  So, there are 
constraints on the parameters, as you have indicated.  The nlm() function 
cannot handle constraints. I would recommend one of the following functions 
(not necessarily in any order), all of which can handle box-constraints:

1. optim(), with method="L-BFGS-B")
2. nlminb()
3. spg() in package "BB"
4. Write an EM algorithm; this automatically imposes all the required 
constraints, and is guaranteed to converge (to a local maximum).

John Nash and I are writing a package called "optimx", which contains a 
function called optimx() that integrates all the different optimization 
functions for smooth nonlinear problems.  This would make it extremely easy for
you to do steps (1) - (3) above, using a single call to optimx().  Let me know 
if you are interested.
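To illustrate option (1), here is a minimal, hedged sketch of a box-constrained optim() call. The objective below is a placeholder standing in for your loglike1(), and the bounds and starting values are assumptions, not tuned to your problem:

```r
## Sketch only: negLogLik stands in for the poster's loglike1()
negLogLik <- function(x) {
  ## alpha1 <- x[1]; beta1 <- x[2]; alpha2 <- x[3]; beta2 <- x[4]; p <- x[5]
  ## ... compute the negative log-likelihood of the mixture here ...
  sum(x^2)  # placeholder objective so the sketch runs
}

theta <- c(0.2041, 0.0582, 1.4150, 1.8380, 0.0969)  # starting values
fit <- optim(theta, negLogLik, method = "L-BFGS-B",
             lower = c(1e-8, 1e-8, 1e-8, 1e-8, 1e-8),      # alphas, betas, p > 0
             upper = c(Inf,  Inf,  Inf,  Inf,  1 - 1e-8),  # and p < 1
             hessian = TRUE)
fit$par
```

Keeping the bounds strictly inside the feasible region (1e-8 rather than 0) avoids evaluating the likelihood where gamma() overflows, which is likely the source of the 'gammafn' warnings.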

Hope this helps,
Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Madan Gopal Kundu 
Date: Wednesday, July 1, 2009 9:25 am
Subject: [R] Difficulty in calculating MLE through NLM
To: r-help 


> Hi R-friends,
>  
>  Attached is the SAS XPORT file that I have imported into R using 
> following code
>  library(foreign)
>  mydata<-read.xport("C:\\ctf.xpt")
>  print(mydata)
>  
>  I am trying to maximize logL in order to find Maximum Likelihood 
> Estimate (MLE) of 5 parameters (alpha1, beta1, alpha2, beta2, p) using 
> NLM function in R as follows.
>  
>  # Defining Log likelihood - In the function it is noted as logL
>  > library(stats)
>  > loglike1<- function(x)
>  + {
>  + alpha1<-x[1]
>  + beta1<-x[2]
>  + alpha2<-x[3]
>  + beta2<-x[4]
>  + p<-x[5]
>  + n<- mydata[3]
>  + e<-mydata[4]
>  + f1<- 
> ((1+beta1/e)^(-n))*((1+e/beta1)^(-alpha1))*gamma(alpha1+n)/(gamma(n+1)*gamma(alpha1))
>  + f2<- 
> ((1+beta2/e)^(-n))*((1+e/beta2)^(-alpha2))*gamma(alpha2+n)/(gamma(n+1)*gamma(alpha2))
>  + logL=sum(log(p*f1+(1-p)*f2))
>  + logL<- -logL
>  + }
>  
>  # Supplying starting parameter values
>  > theta<-c(.2041,.0582, 1.4150, 1.8380,0.0969)
>  
>  # Calculating MLE using NLM function
>  > result<- nlm(loglike1, theta, hessian=TRUE, print.level=1)
>  
>  Now the problem is, this is not working as there is no improvement in 
> final parameter estimate over starting values and NLM just stops just 
> after 1 iteration with gradient value of all the 5 parameters as zero. 
> I have tried other set of starting values, but then also I am getting 
> final parameter estimates similar to starting values and iteration 
> stops just after one step. When I check for warnings, R displays 
> following kind of warnings:
>  
>   *   NA/Inf replaced by maximum positive value
>   *   value out of range in 'gammafn'
>  
>  Please suggest what I should do. I am expecting the final MLE of 
> alpha1, alpha2, beta1 and beta2 greater than 0 and P should lie 
> between 0 to 1.
>  
>  Thanks & Regards,
>  Madan Gopal Kundu
>  Biostatistician, CDM, MACR, Ranbaxy Labs. Ltd.
>  Tel(O): +91 (0) 1245194045 - Mobile: +91 (0) 9868788406
>  
>  
>  
>  (i) The information contained in this e-mail message is intended only 
> 
>  for the confidential use of the recipient(s) named above. This 
> message 
>  is privileged and confidential. If the reader of this message is not 
> 
>  the intended recipient or an agent responsible for delivering it to 
> the 
>  intended recipient, you are hereby notified that you have received 
> this 
>  document in error and that any review, dissemination, distribution, 
> or 
>  copying of this message is strictly prohibited. If you have received 
> this 
>  communication in error, please notify us immediately by e-mail, and 
> delete 
>  the original message. 
>  
>  (ii) madan.ku...@ranbaxy.com confirms that Ranbaxy shall not be 
> responsible if this 
>  email message is used for any indecent, unsolicited or illegal 
> purposes, 
>  which are in violation of any existing laws and the same shall solely 
> be 
>  the responsibility of madan.ku...@ranbaxy.com and that Ranbaxy shall 
> at all times be 
>  indemnified of any civil and/ or criminal liabilities or consequences 
> there. 
> __
>  R-help@r-project.org mailing list
>  
>  PLEASE do read the posting guide 
>  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Is there a way to extract some fields data from HTML pages through any R function ?

2009-07-01 Thread mauede
I deal with a huge amount of biology data stored in different databases.
The databases belonging to the Bioconductor organization can be accessed through
Bioconductor packages.
Unluckily, some useful data is stored in databases such as miRDB,
miRecords, etc., which offer just an
interactive HTML interface. See for instance
 http://mirdb.org/cgi-bin/search.cgi, 
 
http://mirecords.umn.edu/miRecords/interactions.php?species=Homo+sapiens&mirna_acc=Any&targetgene_type=refseq_acc&targetgene_info=&v=yes&search_int=Search

Downloading data manually from the web pages is a painstaking, time-consuming
and error-prone activity.
I came across a Python script that downloads (dumps) whole web pages into a
text file that is then parsed.
This is possible because Python has a library to access web pages.
But I have no experience with Python programming, nor do I like a programming
language whose syntax is indentation-sensitive.

I am *hoping* that there exists some sort of web-page/HTML connection from R
... is there ??

Thank you very much for any suggestion.
Maura




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] is there a way to extract fata from web pages through some R function ?

2009-07-01 Thread Greg Hirson

Maura,

Try the RCurl package, specifically the functions getURL and getForm.

Greg

mau...@alice.it wrote:

I deal with a huge amount of biology data stored in different databases.
The databases belonging to the Bioconductor organization can be accessed through
Bioconductor packages.
Unluckily, some useful data is stored in databases such as miRDB,
miRecords, etc., which offer just an
interactive HTML interface. See for instance
 http://mirdb.org/cgi-bin/search.cgi, 
 http://mirecords.umn.edu/miRecords/interactions.php?species=Homo+sapiens&mirna_acc=Any&targetgene_type=refseq_acc&targetgene_info=&v=yes&search_int=Search

Downloading data manually from the web pages is a painstaking, time-consuming
and error-prone activity.
I came across a Python script that downloads (dumps) whole web pages into a
text file that is then parsed.
This is possible because Python has a library to access web pages.
But I have no experience with Python programming, nor do I like a programming
language whose syntax is indentation-sensitive.

I am *hoping* that there exists some sort of web-page/HTML connection from R
... is there ??

Thank you very much for any suggestion.
Maura





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  


--
Greg Hirson
ghir...@ucdavis.edu

Graduate Student
Agricultural and Environmental Chemistry

1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there a way to extract some fields data from HTML pages through any R function ?

2009-07-01 Thread Martin Morgan
Hi Maura --

mau...@alice.it wrote:
> I deal with a huge amount of biology data stored in different databases.
> The databases belonging to the Bioconductor organization can be accessed through
> Bioconductor packages.
> Unluckily, some useful data is stored in databases such as miRDB,
> miRecords, etc., which offer just an
> interactive HTML interface. See for instance
>  http://mirdb.org/cgi-bin/search.cgi, 
>  
> http://mirecords.umn.edu/miRecords/interactions.php?species=Homo+sapiens&mirna_acc=Any&targetgene_type=refseq_acc&targetgene_info=&v=yes&search_int=Search
> 
> Downloading data manually from the web pages is a painstaking, time-consuming
> and error-prone activity.
> I came across a Python script that downloads (dumps) whole web pages into a
> text file that is then parsed.
> This is possible because Python has a library to access web pages.
> But I have no experience with Python programming, nor do I like a
> programming language whose syntax is indentation-sensitive.
> 
> I am *hoping* that there exists some sort of web-page/HTML connection from
> R ... is there ??

Tools in R for this are the RCurl package and the XML package.

  library(RCurl)
  library(XML)

Typically this involves manual exploration of the web form. Then you
might query the web form

  result <- postForm("http://mirdb.org/cgi-bin/search.cgi",
 searchType="miRNA", species="Human",
 searchBox="hsa-let-7a", submitButton="Go")

and parse the results into a convenient structure

  html <- htmlTreeParse(result, asText=TRUE, useInternalNodes=TRUE)

you can then use XPath (http://www.w3.org/TR/xpath, especially section
2.5) to explore and extract information, e.g.,

  ## second table, first row
  getNodeSet(html, "//table[2]/tr[1]")
  ## second table, makes subsequent paths shorter
  tbl <- getNodeSet(html, "//table[2]")[[1]]
  xget <- function(xml, path) # a helper function
  unlist(xpathApply(xml, path, xmlValue))[-1]
  df <- data.frame(TargetRank=as.numeric(xget(tbl, "./tr/td[2]")),
   TargetScore=as.numeric(xget(tbl, "./tr/td[3]")),
   miRNAName=xget(tbl, "./tr/td[4]"),
   GeneSymbol=xget(tbl, "./tr/td[5]"),
   GeneDescription=xget(tbl, "./tr/td[6]"))

There are many ways through this latter part, probably some much cleaner
than presented above. There are fairly extensive examples on each of the
relevant help pages, e.g., ?postForm.

Martin


> Thank you very much for any suggestion.
> Maura
> 
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Iteratively Reweighted Least Squares of nonlinear regression

2009-07-01 Thread Derek An
Dear all,


When doing nonlinear regression, we normally use nls if the errors e are iid normal.

  I learned that if the form of the variance of e is not completely known,
we can use the IRWLS (Iteratively Reweighted Least Squares)

algorithm:

for example, var(e_i) = g0 + g1*x_i

1. Start with w_i = 1

2. Use least squares to estimate b.

3. Use the residuals to estimate g, perhaps by regressing e^2 on x.

4. Recompute the weights and go to 2.

Continue until convergence.

I was wondering whether there is an R function to do this?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rcorr

2009-07-01 Thread James Allsopp
No, that's made no difference, sorry.
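(One possibly useful detail, sketched under the assumption that the object comes from Hmisc's rcorr: the rounding happens in rcorr's print method, while the full-precision p-values are stored in the returned object and can be pulled out directly.)

```r
library(Hmisc)  # assumed installed

set.seed(1)
x <- matrix(rnorm(92), ncol = 2)  # 46 rows, like the example
rc <- rcorr(x, type = "spearman")
rc$P                              # p-value matrix, full precision
format(rc$P[1, 2], digits = 10)   # show more significant figures
```

So a printed P of 0 just means "small at the print method's rounding"; rc$P gives the actual value.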

Frank E Harrell Jr wrote:
> James Allsopp wrote:
>> Hi,
>> I've just run an rcorr on some data in Spearman's mode and it's just
>> produced the following values;
>>   [,1]  [,2]
>> [1,]  1.00 -0.55
>> [2,] -0.55  1.00
>>
>> n= 46
>>
>>
>> P
>>  [,1] [,2]
>> [1,]   0
>> [2,]  0
>>
>> I presume this means the p-value is lower than 0.5, but is there any
>> way of increasing the number of significant figures used? How should I
>> interpret this value?
>>
>> Cheers
>> Jim
>>
> 
> Try options(digits=15) before running rcorr().
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ?max (so far...)

2009-07-01 Thread Mark Knecht
Hi,
   I have a data.frame that is date ordered by row number - earliest
date first and most current last. I want to create a couple of new
columns that show the max and min values from other columns *so far* -
not for the whole data.frame.

   It seems this sort of question is really coming from my lack of
understanding about how R intends me to limit myself to portions of a
data.frame. I get the impression from the help files that the generic
way is that if I'm on the 500th row of a 1000-row data.frame and want
to limit max's search to rows 1:500, I should use something
like [1:row], but it's not working inside my function. The idea works
outside the function, in the sense that I can create temp1[1:7] and the
max function returns what I expect. How do I do this with row?

   Simple example attached. hp should be 'highest p', ll should be
'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"

Thanks,
Mark

AddCols = function (MyFrame) {
MyFrame$p<-0
MyFrame$l<-0
MyFrame$pc<-0
MyFrame$lc<-0
MyFrame$pwin<-0
MyFrame$hp<-0
MyFrame$ll<-0
return(MyFrame)
}

BinPosNeg = function (MyFrame) {

## Positive y in p column, negative y in l column
pos <- MyFrame$y > 0
MyFrame$p[pos] <- MyFrame$y[pos]
MyFrame$l[!pos] <- MyFrame$y[!pos]
return(MyFrame)
}

RunningCount = function (MyFrame) {
## Running count of p & l events

pos <- (MyFrame$p > 0)
MyFrame$pc <- cumsum(pos)
pos <- (MyFrame$l < 0)
MyFrame$lc <- cumsum(pos)

return(MyFrame)
}

PercentWins = function (MyFrame) {

MyFrame$pwin <- round((MyFrame$pc / (MyFrame$pc+MyFrame$lc)),2)

return(MyFrame)
}

HighLow = function (MyFrame) {
temp1 <- MyFrame$p[1:row]
MyFrame$hp <- max(temp1) ## Highest p
temp1 <- MyFrame$l[1:row]
MyFrame$ll <- min(temp1) ## Lowest l

return(MyFrame)
}

F1 <- data.frame(x=1:10, y=2*(-4:5) )
F1 <- AddCols(F1)
F1 <- BinPosNeg(F1)
F1 <- RunningCount(F1)
F1 <- PercentWins(F1)
F1
F1 <- HighLow(F1)
F1

temp1<-F1$p[1:5]
max(temp1)
temp1<-F1$p[1:7]
max(temp1)
temp1<-F1$p[1:10]
max(temp1)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] garchFit in fGarch fitted values are all the same

2009-07-01 Thread Ron Burns
I tried tseries::garch() and was getting a lot of false convergence, so I
tried fGarch::garchFit.

Looking a bit further I find that from fGarch::garchFit I get

 ..@ fit:List of 17
 
  .. ..$ convergence: int 1
  .. ..$ message: chr "singular convergence (7)"

for any and all fits.  I have tried various real data sets and lots of simulated
data, including a wide range of garch(1,1) simulations.  The @fitted
values are all the same; the @h.t and @sigma.t seem reasonable.  Am I
interpreting @fitted incorrectly, and is the convergence error perhaps
bogus?  It would seem that at least some of my many tries should have
converged.

Ron

Liviu Andronic wrote:
> Hello,
>
> On 7/1/09, Ron Burns  wrote:
>   
>>  In trying to fit garch models in above environment. I am getting
>>  "reasonable" fitted coefficients, but the fitobj...@fitted are all the
>>  same. This is true even for the help page example:
>>
>> 
> There is a better chance of getting good answers if asking finance
> related questions on r-sig-finance. Concerning your question, have you
> tried fitting the same examples with tseries::garch() or
> rgarch::ugarchfit() [1]?
> Liviu
>
> [1] http://rgarch.r-forge.r-project.org/index.html
>
>   


-- 
R. R. Burns
Physicist (Retired)
Oceanside, CA


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RKWard

2009-07-01 Thread Ubirajara

  Has anyone using Linux/KDE heard of RKWard? Is it good?
Is it better than Emacs/ESS? Any thoughts?

Ubirajara Alberton

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] locale changing on Windows

2009-07-01 Thread Ben Bolker
  Dear r-helpers,

  This is a little bit more of a Windows problem than
an R problem, but ...

  any idea how to query the *available* locales from
within R (or otherwise) on a Windows system?  Teaching
in a Spanish-language setting and would like to do
something like

Sys.setlocale("LC_TIME","en_US")

(for example, so that we can convert dates like
"1970-jan-01" with as.Date(x, "%Y-%b-%d"))

but keep getting reports that this is not honored
by the OS.  Does anyone have useful pointers?
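Not a real answer, but a sketch of what sometimes works: Windows uses its own locale naming scheme ("English", "Spanish_Spain", ...) rather than POSIX names like "en_US", so names of that form may be worth trying. The exact names below are assumptions and vary by system:

```r
old <- Sys.getlocale("LC_TIME")              # remember the current locale
Sys.setlocale("LC_TIME", "English")          # Windows-style name (assumption)
as.Date("1970-jan-01", format = "%Y-%b-%d")  # month name should now parse
Sys.setlocale("LC_TIME", old)                # restore
```

Sys.setlocale() returns "" with a warning when the OS rejects the name, which is a quick way to check whether a given name is available.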

  thanks
Ben Bolker

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] double bootstrap

2009-07-01 Thread Seunghee Baek
Hi All,

I would like to do double bootstrapping to estimate 95% CI coverage.

To do that, I need to estimate a 95% confidence interval from each
bootstrapped sample.
Since we don't have a closed form for the 95% CI, in order to get the 95% CI for
each sample, we need to use bootstrapping.

For the outer bootstrapping, I used the boot function in library(boot).
How can I add the inner bootstrap inside the function? Is there a good way,
or a reference to consult?
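One hedged sketch of nesting the calls (the statistic here is a plain mean, and the R values are kept small purely for illustration):

```r
library(boot)  # ships with R

inner_stat <- function(d, i) mean(d[i])

## Outer statistic: run a full inner bootstrap on each outer resample
## and return the percentile CI endpoints.
outer_stat <- function(d, i) {
  samp <- d[i]
  b <- boot(samp, inner_stat, R = 199)
  boot.ci(b, type = "perc")$percent[4:5]  # c(lower, upper)
}

set.seed(1)
x <- rnorm(50)                  # toy data, true mean 0
out <- boot(x, outer_stat, R = 99)
## estimated coverage: fraction of inner CIs containing the true mean
mean(out$t[, 1] <= 0 & 0 <= out$t[, 2])
```

Each row of out$t is one inner CI, so coverage is just the proportion of rows bracketing the true parameter; be warned that the total cost is R_outer * R_inner resamples.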

Thanks,

Becky

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RScaLAPACK package with OpenMPI

2009-07-01 Thread Michela Cameletti
Hi all,

I'm using RScalapack library for parallelizing some heavy matrix
operations required by MCMC methods for spatio-temporal models. The
package reference manuals (dated 2005) states that the library needs
LamMPI to work but we have a Linux Cluster with OpenMPI. We have found
(http://cvs.fedoraproject.org/viewvc/devel/R-RScaLAPACK/) a patch for
OpenMPI but we are wondering if in the meanwhile the package has been
compiled also for OpenMPI. Does anybody use RScalapack and has any
suggestions? Sorry for writing here, but nobody answered in the R-sig-hpc
(High performance computing with R) mailing list.

Thank you very much.
My best,
Michela
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-07-01 Thread Andriy Fetsun
Hi,

I am trying to calculate volatility on a non-overlapping basis. Do you
know of functions for non-overlapping calculations?

The idea is to take the first 20 observations and apply sd() to them, then
take the next 20 observations and calculate their standard deviation, and so on.

I tried the function rollapply(), but it doesn't work.

Thank you in advance.

Regards,

Andy

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ?max (so far...)

2009-07-01 Thread Duncan Murdoch

On 01/07/2009 11:49 AM, Mark Knecht wrote:

Hi,
   I have a data.frame that is date ordered by row number - earliest
date first and most current last. I want to create a couple of new
columns that show the max and min values from other columns *so far* -
not for the whole data.frame.

   It seems this sort of question is really coming from my lack of
understanding about how R intends me to limit myself to portions of a
data.frame. I get the impression from the help files that the generic
way is that if I'm on the 500th row of a 1000 row data.frame and want
to limit the search max does to rows 1:500  I should use something
like [1:row] but it's not working inside my function. The idea works
outside the function, in the sense that I can create temp1[1:7] and the
max function returns what I expect. How do I do this with row?

   Simple example attached. hp should be 'highest p', ll should be
'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"

Thanks,
Mark

AddCols = function (MyFrame) {
MyFrame$p<-0
MyFrame$l<-0
MyFrame$pc<-0
MyFrame$lc<-0
MyFrame$pwin<-0
MyFrame$hp<-0
MyFrame$ll<-0
return(MyFrame)
}

BinPosNeg = function (MyFrame) {

## Positive y in p column, negative y in l column
pos <- MyFrame$y > 0
MyFrame$p[pos] <- MyFrame$y[pos]
MyFrame$l[!pos] <- MyFrame$y[!pos]
return(MyFrame)
}

RunningCount = function (MyFrame) {
## Running count of p & l events

pos <- (MyFrame$p > 0)
MyFrame$pc <- cumsum(pos)
pos <- (MyFrame$l < 0)
MyFrame$lc <- cumsum(pos)

return(MyFrame)
}

PercentWins = function (MyFrame) {

MyFrame$pwin <- round((MyFrame$pc / (MyFrame$pc+MyFrame$lc)),2)

return(MyFrame)
}

HighLow = function (MyFrame) {
temp1 <- MyFrame$p[1:row]
MyFrame$hp <- max(temp1) ## Highest p
temp1 <- MyFrame$l[1:row]
MyFrame$ll <- min(temp1) ## Lowest l

return(MyFrame)
}


You get an error in this function because you didn't define row, so R 
assumes you mean the function in the base package, and 1:row doesn't 
make sense.


What you want for the "highest so far" is the cummax (for "cumulative 
maximum") function.  See ?cummax.
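Applied to the example, the whole thing collapses to a few vectorised lines (a sketch reusing the poster's column names):

```r
F1 <- data.frame(x = 1:10, y = 2 * (-4:5))
F1$p  <- pmax(F1$y, 0)  # positive y in p, else 0
F1$l  <- pmin(F1$y, 0)  # negative y in l, else 0
F1$hp <- cummax(F1$p)   # highest p so far
F1$ll <- cummin(F1$l)   # lowest l so far
F1
```

No explicit row index is needed: cummax()/cummin() compute the running extremum over rows 1..i for every i at once.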


Duncan Murdoch



F1 <- data.frame(x=1:10, y=2*(-4:5) )
F1 <- AddCols(F1)
F1 <- BinPosNeg(F1)
F1 <- RunningCount(F1)
F1 <- PercentWins(F1)
F1
F1 <- HighLow(F1)
F1

temp1<-F1$p[1:5]
max(temp1)
temp1<-F1$p[1:7]
max(temp1)
temp1<-F1$p[1:10]
max(temp1)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Iteratively Reweighted Least Squares of nonlinear regression

2009-07-01 Thread Ravi Varadhan
You are describing a "generalized nonlinear least-squares" estimation procedure.

This is implemented in the gnls() function in "nlme" package.

?gnls
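For instance, a variance that grows with a covariate can be specified through one of the varFunc classes (a sketch on simulated data; varPower is just one of several available structures, and the starting values here happen to match the simulation):

```r
library(nlme)  # ships with R

set.seed(1)
x <- runif(100, 1, 10)
y <- 2 * exp(0.3 * x) + rnorm(100, sd = 0.5 * x)  # error SD grows with x
d <- data.frame(x = x, y = y)

fit <- gnls(y ~ a * exp(b * x), data = d,
            start = c(a = 2, b = 0.3),
            weights = varPower(form = ~ x))  # Var(e_i) ~ |x_i|^(2*delta)
summary(fit)
```

Internally gnls iterates between estimating the mean parameters and the variance parameters, which is essentially the IRWLS scheme you describe.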

Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Derek An 
Date: Wednesday, July 1, 2009 11:50 am
Subject: [R] Iteratively Reweighted Least Squares of nonlinear regression
To: R-help@r-project.org


> Dear all,
>  
>  
>  When doing nonlinear regression, we normally use nls if e are iid normal.
>  
>i learned that if the form of the variance of e is not completely known,
>  we can use the IRWLS (Iteratively Reweighted Least Squares )
>  
>  algorithm:
>  
>  for example, var e*i =*g0+g1*x*1
>  
>  1. Start with *w**i = *1
>  
>  2. Use least squares to estimate b.
>  
>  3. Use the residuals to estimate g, perhaps by regressing e^2 on *x*.
>  
>  4. Recompute the weights and goto 2.
>  
>  Continue until convergence
>  
>  i was wondering whether there is a instruction of R to do this?
>  
>   [[alternative HTML version deleted]]
>  
>  __
>  R-help@r-project.org mailing list
>  
>  PLEASE do read the posting guide 
>  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] productivity tools in R?

2009-07-01 Thread Michael
Hi all,

Could anybody point me to some latest productivity tools in R? I am
interested in speeding up my R programming and improving my efficiency
in terms of debugging and developing R programs.

I saw my friend has a R Console window which has automatic syntax
reminder when he types in the first a few letters of R command. And
he's using R under MAC. Is that a MAC thing, or I could do the same on
my PC Windows?

More pointers about using R for efficiency in development are highly
apprecaited!

Thanks a lot!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sorting question

2009-07-01 Thread Steve Jaffe

I've asked about custom sorting before and it appears that -- in terms of a
user-defined order -- it can only be done either by defining a custom class
or using various tricks with "order"

Just wondering if anyone has a clever way to order "vintages" of the form
2002, 2003H1, 2003H2, 2004, 2005Q1, 2005Q2, etc.
Some have H1 or H2, some have Q1, Q2, Q3, Q4, and some are just plain years. They
should be sorted in the obvious order. I can think of doing something with
'strsplit' and 'order', but does anyone have anything better?

(I still wonder why sorting with a user-defined function isn't supported.  I
guess I should follow the open source philosophy and contribute my own, but
it seems that would involve implementing an explicit, iterative sort
algorithm, whereas it would make more sense for it to be integrated with
the internal sort function, if that were possible)
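One sketch along those lines: map each vintage to a numeric sort key (year plus a within-year fraction) and order on that, so no custom class is needed. Placing H1/H2 at 0.25/0.75 and quarters at 0.125/0.375/0.625/0.875, so that a plain year sorts before its halves and quarters, is my assumption about the intended order:

```r
vintageOrder <- function(v) {
  year <- as.numeric(substr(v, 1, 4))
  frac <- numeric(length(v))  # plain years get 0
  h <- grepl("H", v); q <- grepl("Q", v)
  frac[h] <- (as.numeric(sub(".*H", "", v[h])) - 0.5) / 2
  frac[q] <- (as.numeric(sub(".*Q", "", v[q])) - 0.5) / 4
  order(year + frac)
}

v <- c("2005Q2", "2002", "2003H2", "2005Q1", "2004", "2003H1")
v[vintageOrder(v)]
## "2002" "2003H1" "2003H2" "2004" "2005Q1" "2005Q2"
```

Offsetting by 0.5 keeps H2 (0.75) strictly below the following plain year, avoiding ties.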

Thanks
-- 
View this message in context: 
http://www.nabble.com/sorting-question-tp24293430p24293430.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-07-01 Thread Gabor Grothendieck
See the by= argument.
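i.e. something like this (a sketch on simulated data; width = 20 together with by = 20 gives non-overlapping 20-observation windows):

```r
library(zoo)  # rollapply comes from zoo

set.seed(1)
r <- zoo(rnorm(100))
vol <- rollapply(r, width = 20, FUN = sd, by = 20, align = "left")
vol  # one standard deviation per non-overlapping block of 20

## base-R alternative, no zoo needed:
## sapply(split(coredata(r), ceiling(seq_along(r) / 20)), sd)
```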

On Wed, Jul 1, 2009 at 11:08 AM, Andriy Fetsun wrote:
> Hi,
>
> I am trying to calculate volatility on a non-overlapping basis. Do you
> know of functions for non-overlapping calculations?
>
> The idea is to take the first 20 observations and apply sd() to them, then
> take the next 20 observations and calculate their standard deviation, and so on.
>
> I tried the function rollapply(), but it doesn't work.
>
> Thank you in advance.
>
> Regards,
>
> Andy
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Are there any bloggers amoung us going to useR 2009 ?

2009-07-01 Thread David M Smith
I'm going to both UseR! (in Rennes) and DSC (in Copenhagen), and will
be blogging about the talks and other interesting things I learn here:

http://blog.revolution-computing.com/

# David Smith

-- 
David M Smith 
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events




On Wed, Jul 1, 2009 at 2:58 AM, Tal Galili wrote:
> *(note*: This is an R community question, not a statistical nor coding
> question. Since this is my first time writing such a post, I hope no one
> will take offence of it.)
>
>
>
> Hello all,
> I will be attending useR 2009 next week, and was wondering if there are any
> of you who are *bloggers *intending to participate and report on useR 2009?
> If so - I would love to know your blogs URL so as to follow you.
>
> I am planning on coming with a laptop and see what I'll be able to report
> on: www.r-statistics.com  (which is very empty for now)
>
>
> See you there,
> Tal
>
>
>
>
>
>
>
>
>
> --
> --
>
>
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My Blogs:
> http://www.r-statistics.com/
> http://www.talgalili.com
> http://www.biostatistics.co.il
>
>



[R] Map projections - converting latitude/longitude to eastings and northings

2009-07-01 Thread stephen sefick
I am trying to set up a GRASS project and need to set up the region so
that I can view the map.  I can look at a map and find the lat/lon,
but the map projection is in UTM NAD38 WGS84 and I need to set the
eastings and northings.  Is there a package that will help me
calculate this in R?
thanks
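The usual route with the sp/rgdal packages of the time is spTransform(); a hedged sketch, with a made-up point and a UTM zone that must be chosen for the actual study area:

```r
library(sp)
library(rgdal)                                   # provides spTransform() for sp objects
## hypothetical lon/lat point; replace with your own coordinates
pt <- SpatialPoints(cbind(-82.35, 33.50),
                    proj4string = CRS("+proj=longlat +datum=WGS84"))
## reproject to UTM (zone 17 here is only an example)
utm <- spTransform(pt, CRS("+proj=utm +zone=17 +datum=WGS84"))
coordinates(utm)                                 # easting/northing in metres
```

The resulting eastings/northings can then be used to set the GRASS region.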

-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



[R] abline going out of bounds

2009-07-01 Thread rajesh j
Hi,
I have a multiplot of 6 rows and 1 column. I need to draw vertical lines in
each plot. However, when I use abline(v=locator(1)$x), in some plots the line
only comes up for half the box, and it goes beyond the box in others. I suspect
this has something to do with the margins. Any help?

-- 
Rajesh.J



Re: [R] sorting question

2009-07-01 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Steve Jaffe
> Sent: Wednesday, July 01, 2009 9:59 AM
> To: r-help@r-project.org
> Subject: [R] sorting question
> 
> 
> I've asked about custom sorting before and it appears that -- in terms of a
> user-defined order -- it can only be done either by defining a custom class
> or using various tricks with "order"
> 
> Just wondering if anyone has a clever way to order "vintages" of the form
> 2002, 2003H1, 2003H2, 2004,  2005Q1, 2005Q2, etc
> some have H1 or H2, some have Q1,Q2,Q3,Q4, some are just plain years. They
> should be sorted in the obvious order. I can think of doing something with
> 'strsplit' and 'order' but anyone have anything better?
> 
Steve,

I don't have a solution to your sort problem, but let me ask: what is the
"obvious order" in this situation?

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 
> (I still wonder why sorting with a user-defined function isn't supported.  I
> guess I should follow the open source philosophy and contribute my own, but
> it seems that would involve implementing an explicit, iterative sort
> algorithm, whereas it would make more sense for it to be  integrated with
> the internal sort function, if that were possible)
> 
> Thanks
> --
> View this message in context: http://www.nabble.com/sorting-question-
> tp24293430p24293430.html
> Sent from the R help mailing list archive at Nabble.com.
> 



Re: [R] locale changing on Windows

2009-07-01 Thread Gabor Grothendieck
It can be done without setting locales using chron:

> library(chron)
> as.Date(chron("1970-Jan-01", format = "Year-Month-Day"))
[1] "1970-01-01"

On Wed, Jul 1, 2009 at 10:09 AM, Ben Bolker wrote:
>  Dear r-helpers,
>
>  This is a little bit more of a Windows problem than
> an R problem, but ...
>
>  any idea how to query the *available* locales from
> within R (or otherwise) on a Windows system?  Teaching
> in a Spanish-language setting and would like to do
> something like
>
> Sys.setlocale("LC_TIME","en_US")
>
> (for example so that we can convert dates like
> "1970-jan-01" with as.Date(x,"%Y-%b-%d")
>
> but keep getting reports that this is not honored
> by the OS.  Does anyone have useful pointers?
>
>  thanks
>    Ben Bolker
>
>



Re: [R] abline going out of bounds

2009-07-01 Thread Greg Snow
How are you creating the plots before adding the line?  This sounds like you 
may be mixing graphics types (creating the plots with grid graphics (lattice or 
ggplot2) then using abline from the base graphics system) or using a plot 
function that plays with the graphics settings and leaves them in a different 
state from when the plotting was done (filled.contour is one such function).

If you can show us the code that you are seeing the problem with, then it will 
be easier for us to help.
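If the cause turns out to be a stale clipping region rather than mixed graphics systems, one possible workaround (a sketch only, not tested against the poster's data) is to reset clipping to the current plot region before drawing the line:

```r
## after drawing one of the sub-plots in the layout:
usr <- par("usr")                      # plot-region limits: x1, x2, y1, y2
clip(usr[1], usr[2], usr[3], usr[4])   # confine subsequent drawing to that region
abline(v = locator(1)$x)               # the vertical line now stops at the box
```

This only helps if an earlier call left the clipping region set to the figure or device area.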

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of rajesh j
> Sent: Wednesday, July 01, 2009 11:07 AM
> To: r-help@r-project.org
> Subject: [R] abline going out of bounds
> 
> Hi,
> I have a multiplot of 6 rows and 1 column.I need to draw vertical lines
> in
> each plot.However when I use abline(v=locator(1)$x) in some plots the
> line
> only comes for half the box and it goes beyond the box in others.I
> suspect
> this has something to do with the margins.any help?
> 
> --
> Rajesh.J
> 



Re: [R] productivity tools in R?

2009-07-01 Thread Tal Galili
Hi Michael,
Great topic - I hope to see others respond.

For me there are several big "time savers" when using R (on Windows XP);
search for them on Google:
1) Tinn-R, for syntax highlighting.
2) the "RExcel" package - for getting data from Excel. (BTW, for Excel, I also
recommend the ASAP Utilities)
3) the "debug" package - especially the mtrace command, which allows "live"
debugging of a function
4) www.rseek.org and http://r-project.markmail.org/ , for searching R-related
things in general and on the mailing list.


I hope for more good tips from people,

Tal








On Wed, Jul 1, 2009 at 7:58 PM, Michael  wrote:

> Hi all,
>
> Could anybody point me to some latest productivity tools in R? I am
> interested in speeding up my R programming and improving my efficiency
> in terms of debugging and developing R programs.
>
> I saw my friend has a R Console window which has automatic syntax
> reminder when he types in the first a few letters of R command. And
> he's using R under MAC. Is that a MAC thing, or I could do the same on
> my PC Windows?
>
> More pointers about using R for efficiency in development are highly
> apprecaited!
>
> Thanks a lot!
>
>



-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il



Re: [R] ?max (so far...)

2009-07-01 Thread Mark Knecht
On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdoch wrote:
> On 01/07/2009 11:49 AM, Mark Knecht wrote:
>>
>> Hi,
>>   I have a data.frame that is date ordered by row number - earliest
>> date first and most current last. I want to create a couple of new
>> columns that show the max and min values from other columns *so far* -
>> not for the whole data.frame.
>>
>>   It seems this sort of question is really coming from my lack of
>> understanding about how R intends me to limit myself to portions of a
>> data.frame. I get the impression from the help files that the generic
>> way is that if I'm on the 500th row of a 1000 row data.frame and want
>> to limit the search max does to rows 1:500  I should use something
>> like [1:row] but it's not working inside my function. The idea works
>> outside the function, in the sense I can create tempt1[1:7] and the
>> max function returns what I expect. How do I do this with row?
>>
>>   Simple example attached. hp should be 'highest p', ll should be
>> 'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"
>>
>> Thanks,
>> Mark
>>

>>
>> HighLow = function (MyFrame) {
>>        temp1 <- MyFrame$p[1:row]
>>        MyFrame$hp <- max(temp1) ## Highest p
>>        temp1 <- MyFrame$l[1:row]
>>        MyFrame$ll <- min(temp1) ## Lowest l
>>
>>        return(MyFrame)
>> }
>
> You get an error in this function because you didn't define row, so R
> assumes you mean the function in the base package, and 1:row doesn't make
> sense.
>
> What you want for the "highest so far" is the cummax (for "cumulative
> maximum") function.  See ?cummax.
>
> Duncan Murdoch
>

Duncan,
   OK, thanks. That makes sense, as long as I want the cummax from the
beginning of the data.frame. (Which is exactly what I asked for!)

   How would I do this in the more general case if I was looking for
the cummax of only the most recent 50 rows in my data.frame? What I'm
trying to get down to is that as I fill in my data.frame I need to be
able to get a max, min, or standard deviation of the previous so-many
rows of data - not the whole column - and I'm just not grasping how to
do this. It seems like I should be able to create a data set that's
only a portion of a column while I'm in the function and then take the
cummax on that, or use it as an input to a standard deviation, etc.?

Thanks,
Mark
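One base-R way to get a trailing-window statistic (a sketch; the window size `n`, the data, and the helper name are invented here, and near the start of the series the window simply contains fewer rows):

```r
## apply FUN to the previous n values at each position (max, min, sd, ...)
roll_stat <- function(x, n, FUN) {
  sapply(seq_along(x), function(i) FUN(x[max(1, i - n + 1):i]))
}

p <- c(3, 1, 4, 1, 5, 9, 2, 6)
roll_stat(p, 3, max)   # trailing 3-row maximum: 3 3 4 4 5 9 9 9
```

zoo's rollapply() with `partial = TRUE` offers a ready-made version of the same idea.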



Re: [R] sorting question

2009-07-01 Thread Gabor Grothendieck
This maps each string to one of the form yearQqtr at which point
you can sort them.   Modify the mapping as necessary.

> library(gsubfn)
> dd <- c("2002", "2003H1", "2003H2", "2004", "2005Q1", "2005Q2")

> gsubfn("H.|Q.|$", list(H1 = "Q1", H2 = "Q2", Q2 = "Q2", Q3 = "Q3", Q4 = "Q4", 
> "Q1"), dd)
[1] "2002Q1" "2003Q1" "2003Q2" "2004Q1" "2005Q1" "2005Q2"

See http://gsubfn.googlecode.com for more.

On Wed, Jul 1, 2009 at 12:58 PM, Steve Jaffe wrote:
>
> I've asked about custom sorting before and it appears that -- in terms of a
> user-defined order -- it can only be done either by defining a custom class
> or using various tricks with "order"
>
> Just wondering if anyone has a clever way to order "vintages" of the form
> 2002, 2003H1, 2003H2, 2004,  2005Q1, 2005Q2, etc
> some have H1 or H2, some have Q1,Q2,Q3,Q4, some are just plain years. They
> should be sorted in the obvious order. I can think of doing something with
> 'strsplit' and 'order' but anyone have anything better?
>
> (I still wonder why sorting with a user-defined function isn't supported.  I
> guess I should follow the open source philosophy and contribute my own, but
> it seems that would involve implementing an explicit, iterative sort
> algorithm, whereas it would make more sense for it to be  integrated with
> the internal sort function, if that were possible)
>
> Thanks
> --
> View this message in context: 
> http://www.nabble.com/sorting-question-tp24293430p24293430.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
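For the archives, a plain base-R alternative: map each vintage to a numeric year fraction and hand both keys to order(). The convention below (plain year first, H1/H2 as half-years, Q1-Q4 as quarters) is one assumption about the "obvious order":

```r
v <- c("2004", "2003H2", "2002", "2005Q2", "2003H1", "2005Q1")
year   <- as.numeric(substr(v, 1, 4))
suffix <- substring(v, 5)                        # "", "H1", "Q3", ...
frac <- ifelse(suffix == "", 0,
        ifelse(substr(suffix, 1, 1) == "H",
               as.numeric(substring(suffix, 2)) / 2,   # H1 -> 0.5, H2 -> 1
               as.numeric(substring(suffix, 2)) / 4))  # Q1 -> 0.25, ...
v[order(year, frac)]
# "2002" "2003H1" "2003H2" "2004" "2005Q1" "2005Q2"
```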



Re: [R] productivity tools in R?

2009-07-01 Thread miguel bernal
Emacs or XEmacs with ESS (Emacs Speaks Statistics) is great on Linux and
Mac (it may be the console you saw on the Mac) for syntax highlighting,
programming and debugging. I think there is a package to visualize the links
between functions in a package, but I don't know its name (if anybody knows
it, I would love to know it).


Best, 

Miguel.
-
Current address:
Ocean Modeling group,
Institute of Marine and Coastal Sciences University of Rutgers
71 Dudley Road, New Brunswick,
New Jersey 08901, USA
e-mail: mber...@marine.rutgers.edu
phone: +1 732 932 3692
Fax: +1 732 932 8578
-
Permanent address:
Instituto Español de Oceanografía
Centro Oceanográfico de Cádiz
Puerto Pesquero, Muelle de Levante, s/n, 11006 Cádiz
e-mail: miguel.ber...@cd.ieo.es
Tel.: +34 956 294189
Fax: +34 956 294232

-Mensaje original-
De: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] En
nombre de Tal Galili
Enviado el: Wednesday, July 01, 2009 1:24 PM
Para: Michael
CC: r-help
Asunto: Re: [R] productivity tools in R?

Hi Michael,
Great topic - I hope to see others respond.

For me there are several big "time savers" with using R (on windows XP),
search them on google :
1) tinn-r, for syntax highlighting.
2) "Rexcel" package - for getting data from excel. (BTW, for excel, I also
recommend the ASAP utillities)
3) "debug" package - especially the mtrace command, that allows for "live"
debugging of a function
4) www.rseek.org and http://r-project.markmail.org/ , for searching R
related things in general and in the forum.


I hope for more good tips from people,

Tal








On Wed, Jul 1, 2009 at 7:58 PM, Michael  wrote:

> Hi all,
>
> Could anybody point me to some latest productivity tools in R? I am
> interested in speeding up my R programming and improving my efficiency
> in terms of debugging and developing R programs.
>
> I saw my friend has a R Console window which has automatic syntax
> reminder when he types in the first a few letters of R command. And
> he's using R under MAC. Is that a MAC thing, or I could do the same on
> my PC Windows?
>
> More pointers about using R for efficiency in development are highly
> apprecaited!
>
> Thanks a lot!
>
>



-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il



[R] "Error: cannot allocate vector of size 332.3 Mb"

2009-07-01 Thread Steve Ellis

Dear R-helpers,

I am running R version 2.9.1 on a Mac Quad with 32Gb of RAM running  
Mac OS X version 10.5.6.  With over 20Gb of RAM "free" (according to  
the Activity Monitor) the following happens.


> x <- matrix(rep(0, 6600^2), ncol = 6600)

# So far so good.  But I need 3 matrices of this size.

> y <- matrix(rep(0, 6600^2), ncol = 6600)
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Error: cannot allocate vector of size 332.3 Mb

Now a 6600 x 6600 matrix should take up less than 400Mb of RAM.  So  
the question is, with 20Gb of RAM free how come I can't create more  
than one matrix of this size?  (In fact, sometimes R won't even create  
one of them.)  More to the point, is there some simple remedy?  
(Rewriting all my code to use the "Matrix" library, for example, is  
not a simple remedy.)


I tried launching R in a terminal with

R --min-vsize=10M --max-vsize=5G --min-nsize=500k --max-nsize=900M

and that didn't work either.  Finally, let me remark that I had the  
same problem with an older version of R.


  -- Steve Ellis
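A quick diagnostic for the archives: with 20 Gb free, a repeated ~330 Mb allocation failure usually points to a 32-bit R binary, where the process address space (not physical RAM) is exhausted. One can check which build is running:

```r
.Machine$sizeof.pointer   # 8 = 64-bit build; 4 = 32-bit build (~2-3 Gb usable)
sessionInfo()             # the platform string also reports the architecture
```

If the pointer size is 4, installing a 64-bit build of R is the simple remedy; if it is already 8, the cause lies elsewhere.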



Re: [R] abline going out of bounds

2009-07-01 Thread rajesh j
Here's my code

library(sound);
q1<-loadSample(" path to wav");
q2<-loadSample("path to wav");
q3<-loadSample(" path to wav");
m1<-read.table("txt",header=FALSE);
m2<-read.table("txt",header=FALSE);
m3<-read.table("txt",header=FALSE);
layout(matrix(c(1, 2, 3, 4, 5, 6), 6, 1, byrow = TRUE))
par(mar = c(0.6, 4, 2, 4))
plot.Sample(q1, xlab = "", ylab = "", xaxloc = 3)
title(ylab = "Waveform", cex.lab = 1.5)
par(las = 1)
mtext("(a)", 4, line = 1, cex = 1.5)
mtext("/she//had//your//dark//suit/   /in/  /grea//sy//wash/ /wa//ter/ /all/ /year/",
      1, line = 0.8, cex = 1, adj = 0.3)
par(mar = c(0.6, 4, 2, 4))
plot(m1$V1[1:1200], labels = F, tcl = 0, xlab = "", ylab = "", type = "l",
     xlim = c(50, 1150))
axis(2)
axis(1)
par(las = 1)
mtext("(b)", 4, line = 1, cex = 1.5)
abline(v = 200)
title(ylab = "Group Delay", cex.lab = 1.5)
par(mar = c(0.6, 4, 2, 4))
plot.Sample(q2, xlab = "", ylab = "", xaxloc = 3)
title(ylab = "Waveform", cex.lab = 1.5)
par(las = 1)
mtext("(c)", 4, line = 1, cex = 1.5)
mtext("/she//had//your//dark/   /suit/  /in/  /grea//sy//wash/ /wa//ter/   /all/ /year/",
      1, line = 0.8, cex = 1, adj = 0.5)
par(mar = c(0.6, 4, 2, 4))
plot(m2$V1[1:850], labels = F, tcl = 0, xlab = "", ylab = "", type = "l",
     xlim = c(30, 780))
axis(2)
axis(1)
par(las = 1)
mtext("(d)", 4, line = 1, cex = 1.5)
title(ylab = "Group Delay", cex.lab = 1.5)
abline(v = 200)


On Wed, Jul 1, 2009 at 10:55 PM, Greg Snow  wrote:

> How are you creating the plots before adding the line?  This sounds like
> you may be mixing graphics types (creating the plots with grid graphics
> (lattice or ggplot2) then using abline from the base graphics system) or
> using a plot function that plays with the graphics settings and leaves them
> in a different state from when the plotting was done (filled.contour is one
> such function).
>
> If you can show us the code that you are seeing the problem with, then it
> will be easier for us to help.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> > project.org] On Behalf Of rajesh j
> > Sent: Wednesday, July 01, 2009 11:07 AM
> > To: r-help@r-project.org
> > Subject: [R] abline going out of bounds
> >
> > Hi,
> > I have a multiplot of 6 rows and 1 column.I need to draw vertical lines
> > in
> > each plot.However when I use abline(v=locator(1)$x) in some plots the
> > line
> > only comes for half the box and it goes beyond the box in others.I
> > suspect
> > this has something to do with the margins.any help?
> >
> > --
> > Rajesh.J
> >
>



-- 
Rajesh.J



Re: [R] productivity tools in R?

2009-07-01 Thread Romain Francois
If you are coming to useR! next week, then you might want to check the 
session on "Workbenches":

http://www.agrocampus-ouest.fr/math/useR-2009/abstracts/schedule.html

Romain


On 07/01/2009 06:58 PM, Michael wrote:

> Hi all,
>
> Could anybody point me to some latest productivity tools in R? I am
> interested in speeding up my R programming and improving my efficiency
> in terms of debugging and developing R programs.
>
> I saw my friend has a R Console window which has automatic syntax
> reminder when he types in the first a few letters of R command. And
> he's using R under MAC. Is that a MAC thing, or I could do the same on
> my PC Windows?
>
> More pointers about using R for efficiency in development are highly
> apprecaited!
>
> Thanks a lot!



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



[R] Revolutions blog: June roundup

2009-07-01 Thread David M Smith
I write about R every weekday at http://blog.revolution-computing.com
. In case you missed them, here are some articles from last month of
particular interest to R users.

http://bit.ly/tygLz announced the release of the "foreach" and
"iterators" packages on CRAN, for simple scalable parallel programming
in R.

http://bit.ly/FDS67 linked to Thomas Levin's Joy-Division-esque band
T-shirt made with R

http://bit.ly/16zTph discussed R's involvement in the Netflix Prize

http://bit.ly/dTOkF revealed that the New York Times uses R to create
graphics, including an interactive chart of Michael Jackson's musical
career

http://bit.ly/PZRA2 described how to run batch jobs using R CMD BATCH

http://bit.ly/hE56B reviewed "Data Mashups in R", an extended example
of corralling foreclosure data from various sources

http://bit.ly/h9yfB announced a new "statistical learning web service"
implemented with R

http://bit.ly/Ncmwe was one of a series of posts looking at fraud in
the Iranian election (including a now-famous analysis done in R)

http://bit.ly/wTYAe linked to a widely-reproduced R graph on gay
marriage support

http://bit.ly/11AV4k linked to free source code of many R graphs on
Wikimedia Commons

http://bit.ly/npwoA demonstrated animation of R graphics using Flash

http://bit.ly/VYvnD linked to a comparison of various statistical
analyses done in R, SAS and SPSS

http://bit.ly/gxA3g showed how Twitter users can "tweet" using R code

http://bit.ly/31AOs linked to an analysis of basketball plays done in R

http://bit.ly/18quFi linked to a video walkthrough and code for bagged
decision trees in R

http://bit.ly/BIdNp gave some tips, tricks and pitfalls on working
with dates and time zones in R

http://bit.ly/TwlvR brought the news that R was used in a winning
entry of KDD 2009

(I've provided short URLs above because many mailers break the long
direct URLs.)

Other non-R-specific stories in June covered Simpson's Paradox in
polling data, Google Squared, real random numbers, the chances of a
meteor bringing down an aircraft, amusing article titles from PubMed,
and open-source user interfaces.

June was a record-breaking traffic month for the blog. The "Data
Mashups" article and the discussion of the Iranian Election were
highly visited. Thanks to everyone who provided comments and tips and
please keep them coming to da...@revolution-computing.com .

Regards to all,
# David Smith

-- 
David M Smith 
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events



[R] The step before interfacing to GRASS

2009-07-01 Thread Javi Hidalgo

Dear all,

A very basic terrain surface, calculated as a matrix from a spatial point pattern:

#interpolate using the akima package
library(akima)
terrain=interp(ppoints$x,ppoints$y,ppoints$marks,xo=x0,yo=y0, linear=F)

> class(terrain)
[1] "list"
> class(terrain$x) #these are the x-coord i.e: [1...1000]
[1] "numeric"
> class(terrain$y)#these are the y-coord i.e: [1...1000]
[1] "numeric"
> class(terrain$z)#these are the height
[1] "matrix"

I would like to export this "terrain" (list object) to GRASS.
Reading the documentation I found on interfacing between GRASS 6 and R
(library(spgrass6)), I understood the following (I have already managed to
configure the mapset and so on properly):

1. I have to create a SpatialGridDataFrame object before writing to a raster
file.
2. In order to create a SpatialGridDataFrame I need a GridTopology.
3. After doing that, I can write the file into GRASS using the function
writeRAST6, like: writeRAST6(spterrain, "maps.dem").

How do I create the SpatialGridDataFrame and/or the GridTopology? I must be
doing something wrong.

Thanks,

Javier Hidalgo.
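A sketch of the missing step, assuming akima produced a regular grid (cell size taken from the first two coordinates) and glossing over sp's row ordering, which may require flipping the z matrix:

```r
library(sp)
## terrain$x, terrain$y are the regular grid coordinates; terrain$z the heights
gt <- GridTopology(cellcentre.offset = c(min(terrain$x), min(terrain$y)),
                   cellsize  = c(diff(terrain$x[1:2]), diff(terrain$y[1:2])),
                   cells.dim = c(length(terrain$x), length(terrain$y)))
spterrain <- SpatialGridDataFrame(gt, data.frame(z = as.vector(terrain$z)))
## then, as described above: writeRAST6(spterrain, "maps.dem")
```

If the raster comes out mirrored in GRASS, reorder the z values (sp expects rows from the top of the grid down) before building the data frame.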


Re: [R] Revolutions blog: June roundup

2009-07-01 Thread Tal Galili
Great roundup - thank you David.


On Wed, Jul 1, 2009 at 8:48 PM, David M Smith <
da...@revolution-computing.com> wrote:

> David




-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il



Re: [R] Revolutions blog: June roundup

2009-07-01 Thread Mark Knecht
David,
   Using this mail I think I found a simple solution for something I
knew I was going to have to learn about.

Thanks,
Mark

On Wed, Jul 1, 2009 at 10:48 AM, David M
Smith wrote:
> I write about R every weekday at http://blog.revolution-computing.com
> . In case you missed them, here are some articles from last month of
> particular interest to R users.
>
> http://bit.ly/tygLz announced the release of the "foreach" and
> "iterators" packages on CRAN, for simple scalable parallel programming
> in R.
>
> http://bit.ly/FDS67 linked to Thomas Levin's Joy-Division-esque band
> T-shirt made with R
>
> http://bit.ly/16zTph discussed R's involvement in the Netflix Prize
>
> http://bit.ly/dTOkF revealed that the New York Times uses R to create
> graphics, including an interactive chart of Michael Jackson's musical
> career
>
> http://bit.ly/PZRA2 described how to run batch jobs using R CMD BATCH
>
> http://bit.ly/hE56B reviewed "Data Mashups in R", an extended example
> of corralling foreclosure data from various sources
>
> http://bit.ly/h9yfB announced a new "statistical learning web service"
> implemented with R
>
> http://bit.ly/Ncmwe was one of a series of posts looking at fraud in
> the Iranian election (including a now-famous analysis done in R)
>
> http://bit.ly/wTYAe linked to a widely-reproduced R graph on gay
> marriage support
>
> http://bit.ly/11AV4k linked to free source code of many R graphs on
> Wikimedia Commons
>
> http://bit.ly/npwoA demonstrated animation of R graphics using Flash
>
> http://bit.ly/VYvnD linked to a comparison of various statistical
> analyses done in R, SAS and SPSS
>
> http://bit.ly/gxA3g showed how Twitter users can "tweet" using R code
>
> http://bit.ly/31AOs linked to an analysis of basketball plays done in R
>
> http://bit.ly/18quFi linked to a video walkthrough and code for bagged
> decision trees in R
>
> http://bit.ly/BIdNp gave some tips, tricks and pitfalls on working
> with dates and time zones in R
>
> http://bit.ly/TwlvR brought the news that R was used in a winning
> entry of KDD 2009
>
> (I've provided short URLs above because many mailers break the long
> direct URLs.)
>
> Other non-R-specific stories in June covered Simpson's Paradox in
> polling data, Google Squared, real random numbers, the chances of a
> meteor bringing down an aircraft, amusing article titles from PubMed,
> and open-source user interfaces.
>
> June was a record-breaking traffic month for the blog. The "Data
> Mashups" article and the discussion of the Iranian Election were
> highly visited. Thanks to everyone who provided comments and tips and
> please keep them coming to da...@revolution-computing.com .
>
> Regards to all,
> # David Smith
>
> --
> David M Smith 
> Director of Community, REvolution Computing www.revolution-computing.com
> Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)
>
> Check out our upcoming events schedule at www.revolution-computing.com/events
>
>



Re: [R] productivity tools in R?

2009-07-01 Thread baptiste auguie
2009/7/1 miguel bernal 

> I think there is a package to visualize the links between
> functions in a package, but I don't know its name (if anybody knows it, I
> will love to know it).


That reminds me of roxygen's callgraph (it relies on graphviz) - is that what
you meant?

baptiste

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] productivity tools in R?

2009-07-01 Thread Seeliger . Curt
> ... I saw my friend has an R console window which gives automatic syntax
> reminders when he types the first few letters of an R command. ...

You might be thinking of JGR (Jaguar) at 
http://jgr.markushelbig.org/JGR.html . This editor also prompts you with 
function argument lists, including for functions that you wrote.  It's a 
very nice editor, but currently lacks some of the overall functionality of Tinn-R. Even 
so, I have it on my desktop.  The RUnit package is a good start at a 
standalone test harness, and I'm looking forward to additional 
capabilities as it matures.

There is no IDE for R in the same way that there is for other languages -- 
something that supports integrated versioning, debugging and testing, 
perhaps using Eclipse.  Boy howdee, I hope someone knows otherwise.

cur

-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] match() indexing

2009-07-01 Thread Hane, Christopher A
 
 Hello,

I'm trying to vectorize some assignment statements using match(), but
can't seem to get it correct.

I have 2 data frames each with a key column of unique values. I want to
copy a column from one frame to another where the key values are the
same.  The data frames are not the same length, and the set of keys is
non-overlapping (each frame has keys not in the other).

Example:
x <- data.frame(id=c(1,3,4,6,7,9,10), r=runif(7))
> x
  id         r
1  1 0.4243219
2  3 0.2389127
3  4 0.7094532
4  6 0.2053836
5  7 0.9630027
6  9 0.1218458
7 10 0.9183175
> y <- data.frame(id=c(1,2,4,7,8,9,10,11,12))
> y
  id
1  1
2  2
3  4
4  7
5  8
6  9
7 10
8 11
9 12

I want to copy the x$r values to y to obtain
>y
  id         r
1  1 0.4243219
2  2        NA
3  4 0.7094532
4  7 0.9630027
5  8        NA
6  9 0.1218458
7 10 0.9183175
8 11        NA
9 12        NA

xIdx <- match(x$id, y$id) yields a vector of length nrow(x), so it cannot
be used directly to index into y.

Any help in vectorizing this assignment would be most welcome.

Chris

This e-mail, including attachments, may include confidential and/or
proprietary information, and may be used only by the person or entity
to which it is addressed. If the reader of this e-mail is not the intended
recipient or his or her authorized agent, the reader is hereby notified
that any dissemination, distribution or copying of this e-mail is
prohibited. If you have received this e-mail in error, please notify the
sender by replying to this message and delete this e-mail immediately.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] match() indexing

2009-07-01 Thread Jorge Ivan Velez
Dear Christopher,
Try this:

merge(x,y,all=TRUE)
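
A hedged aside: merge() with all=TRUE keeps the ids from both frames; to
reproduce exactly the desired y in the original post (y's ids only, NA where
x has no match), all.y=TRUE or a match()-based index is closer:

```r
x <- data.frame(id = c(1, 3, 4, 6, 7, 9, 10), r = runif(7))
y <- data.frame(id = c(1, 2, 4, 7, 8, 9, 10, 11, 12))

# match(y$id, x$id) gives, for each y id, its position in x (NA if absent),
# so indexing x$r with it copies the values and leaves NA elsewhere
y$r <- x$r[match(y$id, x$id)]

# merge() equivalent that keeps only y's keys
y2 <- merge(x, y["id"], all.y = TRUE)
```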

HTH,

Jorge


On Wed, Jul 1, 2009 at 2:51 PM, Hane, Christopher A <
christopher.h...@ingenixconsulting.com> wrote:

>
>  Hello,
>
> I'm trying to vectorize some assignment statements using match(), but
> can't seem to get it correct.
>
> I have 2 data frames each with a key column of unique values. I want to
> copy a column from one frame to another where the key values are the
> same.  The data frames are not the same length, and the set of keys is
> non-overlapping (each frame has keys not in the other).
>
> Example:
> x <- data.frame(id=c(1,3,4,6,7,9,10), r=runif(7))
> > x
>   id         r
> 1  1 0.4243219
> 2  3 0.2389127
> 3  4 0.7094532
> 4  6 0.2053836
> 5  7 0.9630027
> 6  9 0.1218458
> 7 10 0.9183175
> > y <- data.frame(id=c(1,2,4,7,8,9,10,11,12))
> > y
>  id
> 1  1
> 2  2
> 3  4
> 4  7
> 5  8
> 6  9
> 7 10
> 8 11
> 9 12
>
> I want to copy the x$r values to y to obtain
> >y
>   id         r
> 1  1 0.4243219
> 2  2        NA
> 3  4 0.7094532
> 4  7 0.9630027
> 5  8        NA
> 6  9 0.1218458
> 7 10 0.9183175
> 8 11        NA
> 9 12        NA
>
> xIdx <- match(x$id,y$id) yields a vector of length x, so it cannot be
> used directly to copy to y.
>
> Any help in vectorizing this assignment would be most welcome.
>
> Chris
>
> This e-mail, including attachments, may include confidential and/or
> proprietary information, and may be used only by the person or entity
> to which it is addressed. If the reader of this e-mail is not the intended
> recipient or his or her authorized agent, the reader is hereby notified
> that any dissemination, distribution or copying of this e-mail is
> prohibited. If you have received this e-mail in error, please notify the
> sender by replying to this message and delete this e-mail immediately.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] productivity tools in R?

2009-07-01 Thread Tobias Verbeke

seeliger.c...@epamail.epa.gov wrote:



There is no IDE for R in the same way that there is for other languages -- 
something that supports integrated versioning, debugging and testing, 
perhaps using Eclipse.  Boy howdee, I hope someone knows otherwise.


There is a feature-rich R plug-in for Eclipse at

http://www.walware.de/goto/statet

see the link below if you'd like to install the latest testing version.

https://lists.r-forge.r-project.org/pipermail/statet-user/2009-May/000147.html

HTH,
Tobias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Select values at random by id value

2009-07-01 Thread Sunil Suchindran
#Highlight the text below (without the header)
# read the data in from clipboard

df <- do.call(data.frame, scan("clipboard", what=list(id=0,
date="", loctype=0, habtype=0)))

# split the data by date, sample 1 observation from each split, and rbind

sampled_df <- do.call(rbind, lapply(split(df,
df$date),function(x)x[sample(1:nrow(x), 1),]))
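
(A reproducible variant of the above, since "clipboard" connections are
Windows-only: the same idea reading from a text string, with a seed so the
draw is repeatable. The four records here are just a stand-in for the real
data.)

```r
txt <- "50022 1/25/2006 0 6
50022 5/23/2006 1 3
50022 5/23/2006 1 6
50022 5/24/2006 1 3"

# read the records into a data frame, as scan("clipboard") would
df <- do.call(data.frame,
              scan(textConnection(txt),
                   what = list(id = 0, date = "", loctype = 0, habtype = 0)))

set.seed(42)  # make the random draw reproducible
sampled_df <- do.call(rbind,
  lapply(split(df, df$date), function(x) x[sample(1:nrow(x), 1), ]))

sampled_df  # one randomly chosen row per date
```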

On Mon, Jun 29, 2009 at 9:11 AM, James Martin wrote:

> All,
>
> I have data that looks like below.  For each id there may be more than one
> value per day.  I want to select a random value for that day for that id.
> The end result would hopefully be a matrix with the id as rows, date as
> columns and populated by the random hab value.  Thanks to someone on here
> (Jim) I know how to do the matrix, but now realize I need to randomly
> select
> some of my values.  All help is appreciated. jm
> id, date, loctype, habtype
> 50022 1/25/2006 0 6
> 50022 1/31/2006 0 6
> 50022 2/8/2006 0 6
> 50022 2/13/2006 0 6
> 50022 2/15/2006 0 6
> 50022 2/24/2006 0 6
> 50022 3/2/2006 0 6
> 50022 3/9/2006 0 6
> 50022 3/16/2006 0 6
> 50022 3/24/2006 0 6
> 50022 4/9/2006 0 3
> 50022 4/18/2006 0 6
> 50022 4/27/2006 0 3
> 50022 5/23/2006 1 3
> 50022 5/23/2006 1 6
> 50022 5/24/2006 1 3
> 50022 5/24/2006 1 3
> 50022 5/24/2006 1 3
> 50022 5/24/2006 1 3
> 50022 5/25/2006 1 3
> 50022 5/25/2006 1 3
> 50022 5/25/2006 1 3
> 50022 5/26/2006 1 5
> 50022 5/26/2006 1 3
> 50022 5/27/2006 1 5
> 50022 5/27/2006 1 3
> 50022 5/28/2006 1 3
> 50022 5/29/2006 1 3
> 50022 5/30/2006 1 5
> 50022 5/30/2006 1 3
> 50022 5/31/2006 1 3
> 50022 5/31/2006 1 3
> 50022 6/1/2006 1 3
> 50022 6/2/2006 1 3
> 50022 6/3/2006 1 3
> 50022 6/4/2006 1 3
> 50022 6/5/2006 1 3
> 50022 6/6/2006 1 5
> 50022 6/6/2006 1 5
> 50022 6/6/2006 1 5
> 50022 6/6/2006 1 3
> 50022 6/6/2006 1 3
> 50022 6/7/2006 1 5
> 50022 6/7/2006 1 3
> 50022 6/7/2006 1 3
> 50022 6/7/2006 1 3
> 50022 6/7/2006 1 3
> 50022 6/8/2006 1 3
> 50022 6/8/2006 1 3
> 50022 6/8/2006 1 3
> 50022 6/9/2006 1 5
> 50022 6/10/2006 1 3
> 50022 6/11/2006 1 5
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How should I denormalise a data frame list of lists column?

2009-07-01 Thread Tim Slidel
Hi,

I have a data frame where one column is a list of lists. I would like to
subset the data frame based on membership of the lists in that column and be
able to 'denormalise' the data frame so that a row is duplicated for each of
its list elements. Example code follows:

# The data is read in in this form, with the c2 list values in single
# strings which I then split to give lists:
> f1 <- data.frame(c1=0:2, c2=c("A,B,C", "A,E", "F,G,H"))
> f1$Split <- strsplit(as.character(f1$c2), ",")
> f1
  c1    c2   Split
1  0 A,B,C A, B, C
2  1   A,E    A, E
3  2 F,G,H F, G, H

# So f1$Split is the list of lists column I want to denormalise or use as
the subject for subsetting

# f2 is data to use to select subsets from f1
> f2 <- data.frame(c1=LETTERS[0:8], c2=c("Apples",
"Badger","Camel","Dog","Elephants","Fish","Goat","Horse"))
> f2
  c1        c2
1  A    Apples
2  B    Badger
3  C     Camel
4  D       Dog
5  E Elephants
6  F      Fish
7  G      Goat
8  H     Horse

# I was able to find which rows of f2 are represented in the f1 lists (not
entirely sure if this is the best way to do this):
> f3 <- f2[f2$c1 %in% unlist(f1$Split),]
> f3
  c1        c2
1  A    Apples
2  B    Badger
3  C     Camel
5  E Elephants
6  F      Fish
7  G      Goat
8  H     Horse

# Note that 'D' is missing from f3 because it is not in any of the f1$Split
lists

# f4 is a subset of f3 and I want to find the rows of f1 where f1$Split
contains any of f4$c1:
> f4 <- f3[c(1,3),]
> f4
  c1     c2
1  A Apples
3  C  Camel

# I tried this and it didn't work, presumably because it's trying to match
against each list object rather than the list elements, but unlist doesn't
do the trick here because I need the individual rows, I need to unlist on a
row by row basis.
> f1[f1$Split %in% f4$c1,]
[1] c1    c2    Split
<0 rows> (or 0-length row.names)
> f1[f4$c1 %in% f1$Split,]
[1] c1    c2    Split
<0 rows> (or 0-length row.names)
> f1[match(f4$c1, f1$Split),]
     c1   c2 Split
NA   NA <NA>  NULL
NA.1 NA <NA>  NULL

I also looked at reshape which I don't think helps. I thought I might be
able to create a new data frame with the f1$Split denormalised and use that,
but couldn't find a way to do this, the result I'd want there is something
like:
> f1_denorm
  c1    c2   Split SplitDenorm
1  0 A,B,C A, B, C           A
2  0 A,B,C A, B, C           B
3  0 A,B,C A, B, C           C
4  1   A,E    A, E           A
5  1   A,E    A, E           E
6  2 F,G,H F, G, H           F
7  2 F,G,H F, G, H           G
8  2 F,G,H F, G, H           H

I thought perhaps for loops would be the next thing to try, but there must
be a better way!

Thanks for any help.

Tim
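
For reference, one base-R sketch of both operations (no packages assumed):
rep() the rows once per element of each Split list, and test membership with
sapply() over the list column.

```r
f1 <- data.frame(c1 = 0:2, c2 = c("A,B,C", "A,E", "F,G,H"))
f1$Split <- strsplit(as.character(f1$c2), ",")

# denormalise: duplicate each row once per element of its Split list
n <- sapply(f1$Split, length)
f1_denorm <- data.frame(c1          = rep(f1$c1, n),
                        c2          = rep(as.character(f1$c2), n),
                        SplitDenorm = unlist(f1$Split))

# subset: rows of f1 whose Split list contains any of a set of keys
keys <- c("A", "C")
hit  <- sapply(f1$Split, function(s) any(s %in% keys))
f1[hit, ]  # rows 1 and 2
```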

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] Parallel programming packages iterators, foreach and doMC released

2009-07-01 Thread David M Smith
REvolution Computing has just released three new packages for R to
CRAN (under the open-source Apache 2.0 license): foreach, iterators,
and doMC. Together, they provide a simple, scalable parallel computing
framework for R that lets you take advantage of your multicore or
multiprocessor workstation to program loops that run faster than
traditional loops in R.

The three packages build on each other to implement a new loop
construct for R -- foreach -- where iterations may run in parallel on
separate cores or processors.

"iterators" implements the iterator object familiar to users of
languages like Java, C# and Python to make it easy to program useful
sequences - from all the prime numbers to the columns of a matrix or
the rows of an external database. Iterator objects are used as the
index variable for the parallel loops.

"foreach" builds on the "iterators" package to introduce a new way of
programming loops in R. Unlike the traditional for loop, a foreach
loop can run multiple iterations simultaneously, in parallel. If you
only have a single-processor machine, foreach runs iterations
sequentially. But with a multiprocessor workstation and a connection
to a parallel programming backend, multiple iterations of the loop
will run in parallel. This means that without any changes to your
code, the loop will run faster (and with a speedup scaled by the
number of available cores or processors, potentially much faster).

"doMC" provides such a link between "foreach" and a parallel
programming backend -- in this case, the multicore package (from Simon
Urbanek). With this connection, foreach loops on MacOS and Unix/Linux
systems will make use of all the available cores/processors on the
local workstation to run iterations in parallel.

More information at blog.revolution-computing.com:
http://bit.ly/tygLz
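
A minimal sketch of the construct described above (assuming the packages
install cleanly; with no backend registered, foreach simply runs the
iterations sequentially and emits a warning):

```r
library(foreach)
library(doMC)            # multicore backend; MacOS / Unix-like systems only

registerDoMC(cores = 2)  # use two cores for %dopar% loops

# each iteration may run on a separate core; .combine collects the results
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2
squares  # 1 4 9 16
```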

-- 
David M Smith 
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Select values at random by id value

2009-07-01 Thread hadley wickham
On Wed, Jul 1, 2009 at 2:10 PM, Sunil
Suchindran wrote:
> #Highlight the text below (without the header)
> # read the data in from clipboard
>
> df <- do.call(data.frame, scan("clipboard", what=list(id=0,
> date="", loctype=0, habtype=0)))
>
> # split the data by date, sample 1 observation from each split, and rbind
>
> sampled_df <- do.call(rbind, lapply(split(df,
> df$date),function(x)x[sample(1:nrow(x), 1),]))

ddply from the plyr package (http://had.co.nz/plyr), makes this sort
of operation a little simpler:

ddply(df, "date", function(df) df[sample(nrow(df), 1), ])

Hadley


-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] timer in R?

2009-07-01 Thread Michael
Hi all,

How could I set a timer in R, so that at fixed interval, the R program
will invoke some other functions to run some tasks?

Thank you very much!
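
One simple approach (a sketch, not the only option) is a blocking loop with
Sys.sleep(); for non-blocking scheduling, an OS-level tool such as cron is a
common alternative. The name run_periodically is just illustrative:

```r
# call `task` every `interval` seconds, `times` times; blocks the R session
run_periodically <- function(task, interval, times) {
  for (i in seq_len(times)) {
    task()
    if (i < times) Sys.sleep(interval)
  }
  invisible(times)
}

counter <- 0
run_periodically(function() counter <<- counter + 1,
                 interval = 0.1, times = 3)
counter  # 3
```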

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Neural Networks

2009-07-01 Thread Brigid Mooney
Hi,

I am starting to play around with neural networks and noticed that there are
several packages on the CRAN website for neural networks (AMORE, grnnR,
neural, neuralnet, maybe more if I missed them).

Are any of these packages more well-suited for newbies to neural networks?
Are there any relative strengths / weaknesses to the different
implementations?

If anyone has any advice before I dive into this project, I'd appreciate
it.

Thanks!
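
One more option not in that list: the nnet package (single-hidden-layer
networks) ships with R as a recommended package, so it can be a gentle place
to start before the CRAN alternatives. A hedged sketch on a built-in dataset:

```r
library(nnet)  # recommended package, installed with R

set.seed(1)
# a small single-hidden-layer network classifying the iris species
fit  <- nnet(Species ~ ., data = iris, size = 4, trace = FALSE)
pred <- predict(fit, iris, type = "class")
mean(pred == iris$Species)  # in-sample accuracy
```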

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] productivity tools in R?

2009-07-01 Thread Dirk Eddelbuettel
On Wed, Jul 01, 2009 at 01:35:39PM -0400, miguel bernal wrote:
> Emacs or X-emacs with ess (Emacs Speaks Statistics) is great on Linux and
> Mac (it may be the console you saw on the Mac) for syntax highlighting,
> programming and debugging.

Also see 

http://vgoulet.act.ulaval.ca/en/ressources/emacs/windows

for a Windows-distribution of Emacs bundled with ESS, AucTeX (for
LaTeX), Aspell and more. Works fine on all recent flavours of Windoze.

Dirk

-- 
Three out of two people have difficulties with fractions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] odd behaviour in quantreg::rq

2009-07-01 Thread Dylan Beaudette
Thanks Roger. Your comments were very helpful. Unfortunately, each of 
the 'groups' in this example is derived from the same set of data, two of 
which were subsets -- so it is not that unlikely that the weighted medians 
were the same in some cases.

This all leads back to an operation attempting to answer the question:

Of the 2 subsetting methods, which one produces a distribution most like the 
complete data set? Since the distributions are not normal, and there are 
area-weights involved, others on the list suggested quantile regression. For a 
fuller picture of how 'different' the distributions are, I have tried 
looking at the differences between the weighted quantiles (0.1, 0.25, 0.5, 
0.75, 0.9) as a more complete 'description' of each distribution.

I imagine that there may be a better way to perform this comparison...

Cheers,
Dylan


On Tuesday 30 June 2009, roger koenker wrote:
> Admittedly this seemed quite peculiar, but if you look at the
> entrails
> of the following code you will see that with the weights the first and
> second levels of your x$method variable have the same (weighted) median
> so the contrast that you are estimating SHOULD be zero.  Perhaps
> there is something fishy about the data construction that would have
> allowed us to anticipate this?  Regarding the "fn" option, and the
> non-uniqueness warning,  this is covered in the (admittedly obscure)
> faq on quantile regression available at:
>
>   http://www.econ.uiuc.edu/~roger/research/rq/FAQ
>
> # example:
> library(quantreg)
>
> # load data
> x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))
>
> # with weights
> summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
> se='ker')
>
> #Reproduction with more convenient notation:
>
> X0 <- model.matrix(~method, data = x)
> y <- x$sand
> w <- x$area_fraction
> f0 <- summary(rq(y ~ X0 - 1, weights = w),se = "ker")
>
> #Second reproduction with orthogonal design:
>
> X1 <- model.matrix(~method - 1, data = x)
> f1 <- summary(rq(y ~ X1 - 1, weights = w),se = "ker")
>
> #Comparing f0 and f1 we see that they are consistent!!  How can that
> be??
> #Since the columns of X1 are orthogonal estimation of the 3 parameters
> are separable
> #so we can check to see whether the univariate weighted medians are
> reproducible.
>
> s1 <- X1[,1] == 1
> s2 <- X1[,2] == 1
> g1 <- rq(y[s1] ~ X1[s1,1] - 1, weights = w[s1])
> g2 <- rq(y[s2] ~ X1[s2,2] - 1, weights = w[s2])
>
> #Now looking at the g1 and g2 objects we see that they are equal and
> agree with f1.
>
>
> url:   www.econ.uiuc.edu/~roger    Roger Koenker
> email: rkoen...@uiuc.edu           Department of Economics
> vox:   217-333-4558                University of Illinois
> fax:   217-244-6678                Urbana, IL 61801
>
> On Jun 30, 2009, at 3:54 PM, Dylan Beaudette wrote:
> > Hi,
> >
> > I am trying to use quantile regression to perform weighted-
> > comparisons of the
> > median across groups. This works most of the time, however I am
> > seeing some
> > odd output in summary(rq()):
> >
> > Call: rq(formula = sand ~ method, tau = 0.5, data = x, weights =
> > area_fraction)
> > Coefficients:
> >   ValueStd. Error t value  Pr(>|t|)
> > (Intercept)45.44262  3.64706   12.46007  0.0
> > methodmukey-HRU 0.0  4.671150.0  1.0
> >   ^
> >
> > When I do not include the weights, I get something a little closer
> > to a
> > weighted comparison of means, along with an error message:
> >
> > Call: rq(formula = sand ~ method, tau = 0.5, data = x)
> > Coefficients:
> >   ValueStd. Error t value  Pr(>|t|)
> > (Intercept)44.91579  2.46341   18.23318  0.0
> > methodmukey-HRU 9.57601  9.293481.03040  0.30380
> > Warning message:
> > In rq.fit.br(x, y, tau = tau, ...) : Solution may be nonunique
> >
> >
> > I have noticed that the error message goes away when specifying
> > method='fn' to
> > rq(). An example is below. Could this have something to do with
> > replication
> > in the data?
> >
> >
> > # example:
> > library(quantreg)
> >
> > # load data
> > x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))
> >
> > # with weights
> > summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
> > se='ker')
> >
> > # without weights
> > # note error message
> > summary(rq(sand ~ method, data=x, tau=0.5), se='ker')
> >
> > # without weights, no error message
> > summary(rq(sand ~ method, data=x, tau=0.5, method='fn'), se='ker')
> >

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ?max (so far...)

2009-07-01 Thread Duncan Murdoch

On 01/07/2009 1:26 PM, Mark Knecht wrote:

On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdoch wrote:

On 01/07/2009 11:49 AM, Mark Knecht wrote:

Hi,
  I have a data.frame that is date ordered by row number - earliest
date first and most current last. I want to create a couple of new
columns that show the max and min values from other columns *so far* -
not for the whole data.frame.

  It seems this sort of question is really coming from my lack of
understanding about how R intends me to limit myself to portions of a
data.frame. I get the impression from the help files that the generic
way is that if I'm on the 500th row of a 1000 row data.frame and want
to limit the search max does to rows 1:500  I should use something
like [1:row] but it's not working inside my function. The idea works
outside the function, in the sense I can create tempt1[1:7] and the
max function returns what I expect. How do I do this with row?

  Simple example attached. hp should be 'highest p', ll should be
'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"

Thanks,
Mark




HighLow = function (MyFrame) {
   temp1 <- MyFrame$p[1:row]
   MyFrame$hp <- max(temp1) ## Highest p
   temp1 <- MyFrame$l[1:row]
   MyFrame$ll <- min(temp1) ## Lowest l

   return(MyFrame)
}

You get an error in this function because you didn't define row, so R
assumes you mean the function in the base package, and 1:row doesn't make
sense.

What you want for the "highest so far" is the cummax (for "cumulative
maximum") function.  See ?cummax.

Duncan Murdoch



Duncan,
   OK, thanks. That makes sense, as long as I want the cummax from the
beginning of the data.frame. (Which is exactly what I asked for!)

   How would I do this in the more general case if I was looking for
the cummax of only the most recent 50 rows in my data.frame? What I'm
trying to get down to is that as I fill in my data.frame I need to be
able get a max or min or standard deviation of the previous so many
rows of data - not the whole column - and I'm just not grasping how to
do this. Is seems like I should be able to create a data set that's
only a portion of a column while I'm in the function and then take the
cummax on that, or use it as an input to a standard deviation, etc.?


What you describe might be called a "running max".  The caTools package 
has a runmax function that probably does what you want.


More generally, you can always write a loop.  They aren't necessarily 
fast or elegant, but they're pretty general.  For example, to calculate 
the max of the previous 50 observations (or fewer near the start of a 
vector), you could do


x <- ... some vector ...

result <- numeric(length(x))
for (i in seq_along(x)) {
  result[i] <- max( x[ max(1, i-49):i ])
}

Duncan Murdoch
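
The loop above, wrapped as a self-contained function for reuse (the window
size k stands in for the 50 in the question; cummax shown alongside for
contrast):

```r
# running max over a trailing window of (at most) k observations
run_max <- function(x, k) {
  result <- numeric(length(x))
  for (i in seq_along(x)) {
    result[i] <- max(x[max(1, i - k + 1):i])
  }
  result
}

x <- c(5, 3, 8, 1, 2, 2)
run_max(x, k = 3)  # 5 5 8 8 8 2 -- the 8 falls out of the window at the end
cummax(x)          # 5 5 8 8 8 8 -- max over the whole history so far
```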

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interaction plots (six on one page)

2009-07-01 Thread Richard M. Heiberger

I would do this as a lattice plot.  Continuing with your data:

tmp <- data.frame(sapply(data, tapply, data[1:2], mean))
tmp$time <- factor(tmp$time)
xyplot(thanaa+thalcho+thalino+ponaa+pocho+poino ~ time,
  group=BMIakt, data=tmp, type="l", scales=list(relation="free"),
  auto.key=list(title="BMIakt", border=TRUE))

Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rcorr

2009-07-01 Thread Frank E Harrell Jr

James Allsopp wrote:

No, that's made no difference, sorry.


Sorry, I forgot to check the print method for rcorr. If P < .0001 it prints 
as 0.  To control the printing yourself, print the $P component of the list 
created by rcorr:


r <- rcorr(. . .)
r$P

Frank
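
A base-R cross-check (an aside, not part of Hmisc): cor.test() reports the
Spearman p-value at full precision, which can confirm what r$P contains:

```r
set.seed(1)
a <- rnorm(46)
b <- -0.5 * a + rnorm(46)  # correlated toy data, n = 46 as in the post

ct <- cor.test(a, b, method = "spearman")
ct$estimate  # Spearman's rho
ct$p.value   # exact p-value, not rounded to 0
```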



Frank E Harrell Jr wrote:

James Allsopp wrote:

Hi,
I've just run an rcorr on some data in Spearman's mode and it's just
produced the following values;
  [,1]  [,2]
[1,]  1.00 -0.55
[2,] -0.55  1.00

n= 46


P
     [,1] [,2]
[1,]         0
[2,]    0

I presume this means the p-value is lower than 0.05, but is there any
way of increasing the number of significant figures used? How should I
interpret this value?

Cheers
Jim


Try options(digits=15) before running rcorr().






--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory issues on a 64-bit debian system (quantreg)

2009-07-01 Thread Jonathan Greenberg
Just wanted to leave a note on this, after I got my new iMac (and 
installed R64 from the AT&T site) -- quantreg did run, after topping out 
at a whopping 12GB of swap space (MacOS X, at least, should theoretically 
have as much swap space as there is free space on the HD -- it will 
dynamically increase it as memory usage goes up).  I did get a "caught 
segfault" error, but not until I did a ?rqss and clicked on a PDF 
vignette in the help browser (I was able to summary(tahoe_rq) with no 
problem).  I don't know if the Mac help browser has some issue under 
64-bit systems; it may be worth looking into.


I figure it's best to work out the parameters (tau) with a random 
subset first, at least for efficiency's sake, and then deploy the algorithm on 
the entire dataset.


--j

roger koenker wrote:
my earlier comment is probably irrelevant since you are fitting only 
one qss component and have no other covariates.
A word of warning though when you go back to this on your new  machine 
-- you are almost surely going to want to specify
a large lambda for the qss component  in the rqss call.  The default 
of 1 is likely to produce something very very rough with

such a large dataset.


url:   www.econ.uiuc.edu/~roger    Roger Koenker
email: rkoen...@uiuc.edu           Department of Economics
vox:   217-333-4558                University of Illinois
fax:   217-244-6678                Urbana, IL 61801



On Jun 24, 2009, at 5:04 PM, Jonathan Greenberg wrote:

Yep, its looking like a memory issue -- we have 6GB RAM and 1GB swap 
-- I did notice that the analysis takes far less memory (and runs) if I:


tahoe_rq <- 
rqss(ltbmu_4_stemsha_30m_exp.img~ltbmu_eto_annual_mm.img,tau=.99,data=boundary_data) 


  (which I assume fits a line to the quantiles)
vs.
tahoe_rq <- 
rqss(ltbmu_4_stemsha_30m_exp.img~qss(ltbmu_eto_annual_mm.img),tau=.99,data=boundary_data) 


  (which is fitting a spline)

Unless anyone else has any hints as to whether or not I'm making a 
mistake in my call (beyond randomly subsetting the data -- I'd like 
to run the analysis on the full dataset to begin with) -- I'd like to 
fit a spline to the upper 1% of the data, I'll just wait until my new 
computer comes in next week which has more RAM.  Thanks!


--j


roger koenker wrote:

Jonathan,

Take a look at the output of sessionInfo(), it should say x86-64 if 
you have a 64bit installation, or at least I think this is the case.


Regarding rqss(), my experience is that (usually) memory problems 
are due to the fact that early on in the processing there is
a call to model.matrix(), which is supposed to create a design, aka 
X, matrix for the problem.  This matrix is then coerced to
matrix.csr sparse format, but the dense form is often too big for 
the machine to cope with.  Ideally, someone would write an
R version of model.matrix that would permit building the matrix in 
sparse form from the get-go, but this is a non-trivial task.
(Or at least so it appeared to me when I looked into it a few years 
ago.)  An option is to roll your own X matrix: take a smaller
version of the data, apply the formula, look at the structure of X, 
and then try to make a sparse version of the full X matrix.
This is usually not that difficult, but "usually" is based on a 
rather small sample that may not be representative of your problems.
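As a starting point for rolling your own, the Matrix package (which ships with R) can build the design matrix sparsely from the start -- though quantreg wants SparseM's matrix.csr, so a conversion step would still be needed.  Just a sketch on toy data:

```r
library(Matrix)

# toy data: a factor plus a numeric covariate
d <- data.frame(g = factor(rep(letters[1:3], 4)), x = rnorm(12))

# build the design matrix in sparse form directly
X <- sparse.model.matrix(~ g + x, data = d)
dim(X)   # 12 rows; intercept + 2 dummies + x = 4 columns
```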


Hope that this helps,

Roger

url:    www.econ.uiuc.edu/~roger        Roger Koenker
email:  rkoen...@uiuc.edu               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Urbana, IL 61801



On Jun 24, 2009, at 4:07 PM, Jonathan Greenberg wrote:


Rers:

 I installed R 2.9.0 from the Debian package manager on our amd64 
system that currently has 6GB of RAM -- my first question is 
whether this installation is a true 64-bit installation (should R 
have access to > 4GB of RAM?)  I suspect so, because I was running 
an rqss() (package quantreg, installed via install.packages() -- I 
noticed it required a compilation of the source) and watched the 
memory usage spike to 4.9GB (my input data contains > 500,000 
samples).


 With this said, after 30 mins or so of processing, I got the 
following error:


tahoe_rq <- 
rqss(ltbmu_4_stemsha_30m_exp.img~qss(ltbmu_eto_annual_mm.img),tau=.99,data=boundary_data) 


Error: cannot allocate vector of size 1.5 Gb

 The dataset is a bit big (300mb or so), so I'm not providing it 
unless necessary to solve this memory problem.


 Thoughts?  Do I need to compile either the main R "by hand" or the 
quantreg package?


--j

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] timer in R?

2009-07-01 Thread Michael
I use Windows. Thank you!

On Wed, Jul 1, 2009 at 12:53 PM, Eduardo Leoni wrote:
> I think you are better off writing the R script and invoke it using a
> OS specific tool. For Unix-like systems there is cron.
>
> hth,
>
> -e
>
> On Wed, Jul 1, 2009 at 3:41 PM, Michael wrote:
>> Hi all,
>>
>> How could I set a timer in R, so that at fixed interval, the R program
>> will invoke some other functions to run some tasks?
>>
>> Thank you very much!
>>
>
>
>
> --
> Science is the art of the soluble. (Peter Medawar)
>



Re: [R] timer in R?

2009-07-01 Thread Barry Rowlingson
On Wed, Jul 1, 2009 at 8:41 PM, Michael wrote:
> Hi all,
>
> How could I set a timer in R, so that at fixed interval, the R program
> will invoke some other functions to run some tasks?
>

 Use timer events in the tcltk package:

> z=function(){cat("Hello you!\n");tcl("after",1000,z)}
> tcl("after",1000,z)

 now every 1000ms it will say "Hello you!". I'm not sure how to stop
this without quitting R. There's probably some tcl function that
clears it. Maybe. Or not. Just make sure you don't *always* reschedule
another event on every event.
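One possible way to stop it -- untested, but Tcl's "after cancel" subcommand does exist -- is to hold on to the id that tcl("after", ...) returns:

```r
library(tcltk)

# reschedule each time, remembering the pending event's id
z <- function() {
  cat("Hello you!\n")
  id <<- tcl("after", 1000, z)
}
id <- tcl("after", 1000, z)

# later, to stop the loop, cancel the pending event:
tcl("after", "cancel", id)
```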

 Barry



Re: [R] ?max (so far...)

2009-07-01 Thread Mark Knecht
On Wed, Jul 1, 2009 at 12:54 PM, Duncan Murdoch wrote:
> On 01/07/2009 1:26 PM, Mark Knecht wrote:
>>
>> On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdoch
>> wrote:
>>>
>>> On 01/07/2009 11:49 AM, Mark Knecht wrote:

 Hi,
  I have a data.frame that is date ordered by row number - earliest
 date first and most current last. I want to create a couple of new
 columns that show the max and min values from other columns *so far* -
 not for the whole data.frame.

  It seems this sort of question is really coming from my lack of
 understanding about how R intends me to limit myself to portions of a
 data.frame. I get the impression from the help files that the generic
 way is that if I'm on the 500th row of a 1000 row data.frame and want
 to limit the search max does to rows 1:500  I should use something
 like [1:row] but it's not working inside my function. The idea works
 outside the function, in the sense I can create tempt1[1:7] and the
 max function returns what I expect. How do I do this with row?

  Simple example attached. hp should be 'highest p', ll should be
 'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"

 Thanks,
 Mark

>> 

 HighLow = function (MyFrame) {
       temp1 <- MyFrame$p[1:row]
       MyFrame$hp <- max(temp1) ## Highest p
       temp1 <- MyFrame$l[1:row]
       MyFrame$ll <- min(temp1) ## Lowest l

       return(MyFrame)
 }
>>>
>>> You get an error in this function because you didn't define row, so R
>>> assumes you mean the function in the base package, and 1:row doesn't make
>>> sense.
>>>
>>> What you want for the "highest so far" is the cummax (for "cumulative
>>> maximum") function.  See ?cummax.
>>>
>>> Duncan Murdoch
>>>
>>
>> Duncan,
>>   OK, thanks. That makes sense, as long as I want the cummax from the
>> beginning of the data.frame. (Which is exactly what I asked for!)
>>
>>   How would I do this in the more general case if I was looking for
>> the cummax of only the most recent 50 rows in my data.frame? What I'm
>> trying to get down to is that as I fill in my data.frame I need to be
>> able get a max or min or standard deviation of the previous so many
>> rows of data - not the whole column - and I'm just not grasping how to
>> do this. It seems like I should be able to create a data set that's
>> only a portion of a column while I'm in the function and then take the
>> cummax on that, or use it as an input to a standard deviation, etc.?
>
> What you describe might be called a "running max".  The caTools package has
> a runmax function that probably does what you want.
>
> More generally, you can always write a loop.  They aren't necessarily fast
> or elegant, but they're pretty general.  For example, to calculate the max
> of the previous 50 observations (or fewer near the start of a vector), you
> could do
>
> x <- ... some vector ...
>
> result <- numeric(length(x))
> for (i in seq_along(x)) {
>  result[i] <- max( x[ max(1, i-49):i ])
> }
>
> Duncan Murdoch
>

Thanks for the pointer. I'll check it out.
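For the record, checking both suggestions on a tiny vector (window width chosen arbitrarily):

```r
x <- c(5, 1, 2, 3, 1, 1, 1)

# "max so far" over the whole vector
cummax(x)    # 5 5 5 5 5 5 5

# running max over the previous 3 observations (Duncan's loop, w = 3)
result <- numeric(length(x))
for (i in seq_along(x)) {
  result[i] <- max(x[max(1, i - 2):i])
}
result       # 5 5 5 3 3 3 1
```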

Today I've managed to get pretty much all of my Excel spreadsheet
built in R except for some of the charts. It took me a week and a half
in Excel. This is my 3rd full day with R. Charts are next.

I appreciate your help and the help I've gotten from others. Thanks so much.

cheers,
Mark



Re: [R] ?max (so far...)

2009-07-01 Thread baptiste auguie
For another generic approach, you might be interested in the Reduce
function,

rolling <- function( x, window=seq_along(x), f=max){

Reduce(f, x[window])

}

x= c(1:10, 2:10, 15, 1)

rolling(x)
#15
rolling(x, 1:10)
#10
rolling(x, 1:12)
#10

Of course this is only part of the solution to the initial problem (where
the window needs to move along).
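A hedged sketch of that missing piece, sliding a width-w window along with sapply() (the helper name is made up):

```r
# slide a window of width w along x, reducing each window with f
rolling_window <- function(x, w, f = max) {
  sapply(seq_along(x), function(i) Reduce(f, x[max(1, i - w + 1):i]))
}

x <- c(5, 1, 2, 3, 1, 1, 1)
rolling_window(x, 3)   # 5 5 5 3 3 3 1
```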

HTH,

baptiste


2009/7/1 Duncan Murdoch 

> On 01/07/2009 1:26 PM, Mark Knecht wrote:
>
>> On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdoch
>> wrote:
>>
>>> On 01/07/2009 11:49 AM, Mark Knecht wrote:
>>>
 Hi,
  I have a data.frame that is date ordered by row number - earliest
 date first and most current last. I want to create a couple of new
 columns that show the max and min values from other columns *so far* -
 not for the whole data.frame.

  It seems this sort of question is really coming from my lack of
 understanding about how R intends me to limit myself to portions of a
 data.frame. I get the impression from the help files that the generic
 way is that if I'm on the 500th row of a 1000 row data.frame and want
 to limit the search max does to rows 1:500  I should use something
 like [1:row] but it's not working inside my function. The idea works
 outside the function, in the sense I can create tempt1[1:7] and the
 max function returns what I expect. How do I do this with row?

  Simple example attached. hp should be 'highest p', ll should be
 'lowest l'. I get an error message "Error in 1:row : NA/NaN argument"

 Thanks,
 Mark

  
>>
>>> HighLow = function (MyFrame) {
   temp1 <- MyFrame$p[1:row]
   MyFrame$hp <- max(temp1) ## Highest p
   temp1 <- MyFrame$l[1:row]
   MyFrame$ll <- min(temp1) ## Lowest l

   return(MyFrame)
 }

>>> You get an error in this function because you didn't define row, so R
>>> assumes you mean the function in the base package, and 1:row doesn't make
>>> sense.
>>>
>>> What you want for the "highest so far" is the cummax (for "cumulative
>>> maximum") function.  See ?cummax.
>>>
>>> Duncan Murdoch
>>>
>>>
>> Duncan,
>>   OK, thanks. That makes sense, as long as I want the cummax from the
>> beginning of the data.frame. (Which is exactly what I asked for!)
>>
>>   How would I do this in the more general case if I was looking for
>> the cummax of only the most recent 50 rows in my data.frame? What I'm
>> trying to get down to is that as I fill in my data.frame I need to be
>> able get a max or min or standard deviation of the previous so many
>> rows of data - not the whole column - and I'm just not grasping how to
>> do this. It seems like I should be able to create a data set that's
>> only a portion of a column while I'm in the function and then take the
>> cummax on that, or use it as an input to a standard deviation, etc.?
>>
>
> What you describe might be called a "running max".  The caTools package has
> a runmax function that probably does what you want.
>
> More generally, you can always write a loop.  They aren't necessarily fast
> or elegant, but they're pretty general.  For example, to calculate the max
> of the previous 50 observations (or fewer near the start of a vector), you
> could do
>
> x <- ... some vector ...
>
> result <- numeric(length(x))
> for (i in seq_along(x)) {
>  result[i] <- max( x[ max(1, i-49):i ])
> }
>
> Duncan Murdoch
>



-- 

_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
__




Re: [R] odd behaviour in quantreg::rq

2009-07-01 Thread roger koenker

It's not clear to me whether you are looking for an exploratory tool
or something more like formal inference.  For the former, it seems
that estimating a few weighted quantiles would be quite useful; at
least it is rather Tukeyesque.  While I'm appealing to authorities, I
can't resist recalling that Galton's "invention of correlation" paper,
Co-relations and their measurement, Proceedings of the Royal Society,
1888-89, used medians and interquartile ranges.


url:    www.econ.uiuc.edu/~roger        Roger Koenker
email:  rkoen...@uiuc.edu               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Urbana, IL 61801



On Jul 1, 2009, at 2:48 PM, Dylan Beaudette wrote:


Thanks Roger. Your comments were very helpful. Unfortunately, each of
the 'groups' in this example is derived from the same set of data, two
of which were subsets -- so it is not that unlikely that the weighted
medians were the same in some cases.

This all leads back to an operation attempting to answer the question:
of the two subsetting methods, which one produces a distribution most
like the complete data set? Since the distributions are not normal,
and there are area-weights involved, others on the list suggested
quantile regression. For a more complete picture of how 'different'
the distributions are, I have tried looking at the differences between
the weighted quantiles (0.1, 0.25, 0.5, 0.75, 0.9) as a more complete
'description' of each distribution.

I imagine that there may be a better way to perform this comparison...

Cheers,
Dylan


On Tuesday 30 June 2009, roger koenker wrote:

Admittedly this seemed quite peculiar, but if you look at the entrails
of the following code you will see that with the weights the first and
second levels of your x$method variable have the same (weighted)
median, so the contrast that you are estimating SHOULD be zero.
Perhaps there is something fishy about the data construction that
would have allowed us to anticipate this?  Regarding the "fn" option
and the non-uniqueness warning, this is covered in the (admittedly
obscure) FAQ on quantile regression available at:

http://www.econ.uiuc.edu/~roger/research/rq/FAQ

# example:
library(quantreg)

# load data
x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))

# with weights
summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
se='ker')

#Reproduction with more convenient notation:

X0 <- model.matrix(~method, data = x)
y <- x$sand
w <- x$area_fraction
f0 <- summary(rq(y ~ X0 - 1, weights = w),se = "ker")

#Second reproduction with orthogonal design:

X1 <- model.matrix(~method - 1, data = x)
f1 <- summary(rq(y ~ X1 - 1, weights = w),se = "ker")

#Comparing f0 and f1 we see that they are consistent!!  How can that be??
#Since the columns of X1 are orthogonal, estimation of the 3 parameters
#is separable, so we can check whether the univariate weighted medians
#are reproducible.

s1 <- X1[,1] == 1
s2 <- X1[,2] == 1
g1 <- rq(y[s1] ~ X1[s1,1] - 1, weights = w[s1])
g2 <- rq(y[s2] ~ X1[s2,2] - 1, weights = w[s2])

#Now looking at the g1 and g2 objects we see that they are equal and
agree with f1.


url:    www.econ.uiuc.edu/~roger        Roger Koenker
email:  rkoen...@uiuc.edu               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Urbana, IL 61801

On Jun 30, 2009, at 3:54 PM, Dylan Beaudette wrote:

Hi,

I am trying to use quantile regression to perform weighted comparisons
of the median across groups. This works most of the time; however, I
am seeing some odd output in summary(rq()):

Call: rq(formula = sand ~ method, tau = 0.5, data = x, weights =
area_fraction)
Coefficients:
 ValueStd. Error t value  Pr(>|t|)
(Intercept)45.44262  3.64706   12.46007  0.0
methodmukey-HRU 0.0  4.671150.0  1.0
  ^

When I do not include the weights, I get something a little closer to
a weighted comparison of means, along with a warning message:

Call: rq(formula = sand ~ method, tau = 0.5, data = x)
Coefficients:
 ValueStd. Error t value  Pr(>|t|)
(Intercept)44.91579  2.46341   18.23318  0.0
methodmukey-HRU 9.57601  9.293481.03040  0.30380
Warning message:
In rq.fit.br(x, y, tau = tau, ...) : Solution may be nonunique


I have noticed that the warning goes away when specifying method='fn'
to rq(). An example is below. Could this have something to do with
replication in the data?


# example:
library(quantreg)

# load data
x <- read.csv(url('http://169.237.35.250/~dylan/temp/test.csv'))

# with weights
summary(rq(sand ~ method, data=x, weights=area_fraction, tau=0.5),
se='ker')

# without weights
# note error message
summary(rq(sand ~ method, data=x, tau=0.5), se='ker')

# without wei

Re: [R] "Error: cannot allocate vector of size 332.3 Mb"

2009-07-01 Thread Jonathan Greenberg
Steve: 


   Are you running R64.app?  If not, grab it from here:

http://r.research.att.com/R-2.9.0.pkg

(http://r.research.att.com/ under "Leopard build") .

   As far as I know (and I actually just tried it this morning), the 
standard R 2.9.1 package off the CRAN website is the 32-bit version, so 
you won't be able to access > 4GB of RAM.
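As a sanity check, the 332.3 Mb in the error message is exactly one more copy of that 6600 x 6600 matrix:

```r
# one 6600 x 6600 matrix of doubles (8 bytes each):
6600^2 * 8          # 348480000 bytes -- cf. mmap(size=348483584) in the log
6600^2 * 8 / 2^20   # ~332.3 Mb, matching the failed allocation
```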


--j

Steve Ellis wrote:

Dear R-helpers,

I am running R version 2.9.1 on a Mac Quad with 32Gb of RAM running 
Mac OS X version 10.5.6.  With over 20Gb of RAM "free" (according to 
the Activity Monitor) the following happens.


> x <- matrix(rep(0, 6600^2), ncol = 6600)

# So far so good.  But I need 3 matrices of this size.

> y <- matrix(rep(0, 6600^2), ncol = 6600)
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Error: cannot allocate vector of size 332.3 Mb

Now a 6600 x 6600 matrix should take up less than 400Mb of RAM.  So 
the question is, with 20Gb of RAM free how come I can't create more 
than one matrix of this size?  (In fact, sometimes R won't even create 
one of them.)  More to the point, is there some simple remedy? 
(Rewriting all my code to use the "Matrix" library, for example, is 
not a simple remedy.)


I tried launching R in a terminal with

R --min-vsize=10M --max-vsize=5G --min-nsize=500k --max-nsize=900M

and that didn't work either.  Finally, let me remark that I had the 
same problem with an older version of R.


  -- Steve Ellis



--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307



Re: [R] "Error: cannot allocate vector of size 332.3 Mb"

2009-07-01 Thread Jonathan Greenberg
By the way, you'll probably have to reinstall some or all of your 
packages (and their dependencies) if you switch to R64.app, possibly 
downgrading them in the process.


--j

Steve Ellis wrote:

Dear R-helpers,

I am running R version 2.9.1 on a Mac Quad with 32Gb of RAM running 
Mac OS X version 10.5.6.  With over 20Gb of RAM "free" (according to 
the Activity Monitor) the following happens.


> x <- matrix(rep(0, 6600^2), ncol = 6600)

# So far so good.  But I need 3 matrices of this size.

> y <- matrix(rep(0, 6600^2), ncol = 6600)
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3219) malloc: *** mmap(size=348483584) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Error: cannot allocate vector of size 332.3 Mb

Now a 6600 x 6600 matrix should take up less than 400Mb of RAM.  So 
the question is, with 20Gb of RAM free how come I can't create more 
than one matrix of this size?  (In fact, sometimes R won't even create 
one of them.)  More to the point, is there some simple remedy? 
(Rewriting all my code to use the "Matrix" library, for example, is 
not a simple remedy.)


I tried launching R in a terminal with

R --min-vsize=10M --max-vsize=5G --min-nsize=500k --max-nsize=900M

and that didn't work either.  Finally, let me remark that I had the 
same problem with an older version of R.


  -- Steve Ellis



--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307


