[R] Introduction to R (in French)

2008-12-16 Thread Julien Barnier
Hi all,

I recently put a new version of my French introduction to R online. It
is aimed specifically at social science students and researchers, but
could also be of interest to beginners who are not really familiar with
statistics and coding.

The document is available (in French) as a PDF, along with the Sweave
source code, from the following page:

http://alea.fr.eu.org/j/intro_R.html

Someone advised me to submit it to CRAN for the contributed
documentation section, but I don't know whom I should contact about
that. Should I just send a mail to c...@r-project.org?

Anyway, I would be happy to receive any feedback on the document.

Sincerely,

Julien

-- 
Julien Barnier
Groupe de recherche sur la socialisation
ENS-LSH - Lyon, France

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sliding window over a large vector

2008-12-16 Thread markleeds
Hi Veslot: I'm too tired to even try to figure out why, but I think 
there is something wrong with your sl function. See below for an 
empirical demonstration of that statement. Or maybe your definition of 
a sliding window is different from rollapply's, but rollapply's answer 
makes more sense to me?

Output


set.seed(1)
x <- rbinom(24, 1, 0.5)
print(x)

 [1] 0 0 1 1 0 1 1 1 1 0 0 0 1 0 1 0 1 1 0 1 1 0 1 0


xx1 <- sl(x,3)
print(xx1)

 [1] 1 1 2 2 1 2 2 2 2 1 1 1 2 1 2 1 2 2 1 2 2


temp <- zoo(x)
ans <- rollapply(temp,3,sum)
print(ans)

 2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 1  2  2  2  2  3  3  2  1  0  1  1  2  1  2  2  2  2  2  2  2  1


On Tue, Dec 16, 2008 at  3:47 AM, Veslot Jacques wrote:

sl <- function(x,z) c(0,cumsum(diff(x)[1:(length(x)-z-1)])) + 
rep(sum(x[1:z]),length(x)-z)

x <- rbinom(10, 1, 0.5)
system.time(xx1 <- slide(x,12))
   user  system elapsed
  36.86    0.45   37.32

system.time(xx2 <- sl(x,12))
   user  system elapsed
   0.01    0.00    0.02

all.equal(xx1,xx2)

[1] TRUE

Jacques VESLOT

CEMAGREF - UR Hydrobiologie

Route de Cézanne - CS 40061  13182 AIX-EN-PROVENCE Cedex 5, France

Tél.   + 0033   04 42 66 99 76
fax+ 0033   04 42 66 99 34
email   jacques.ves...@cemagref.fr


-----Original Message-----
From: r-help-boun...@r-project.org 
[mailto:r-help-boun...@r-project.org] On Behalf Of Chris Oldmeadow
Sent: Tuesday, 16 December 2008 05:20
To: r-help@r-project.org
Subject: [R] sliding window over a large vector

Hi all,

I have a very large binary vector and wish to count the number of
1's over sliding windows.

This is my very slow function:

slide <- function(seq, window) {
  n <- length(seq) - window
  tot <- c()
  tot[1] <- sum(seq[1:window])
  for (i in 2:n) {
    tot[i] <- tot[i-1] - seq[i-1] + seq[i]
  }
  return(tot)
}

This works well for reasonably sized vectors. Does anybody know a
way to do this for large vectors (length = 12 million)? I'm trying to
avoid using C.


Thanks,
Chris



Re: [R] sliding window over a large vector

2008-12-16 Thread Stavros Macrakis
For this particular problem (counting), doesn't cumsum solve it
effectively and efficiently?

vv <- cumsum(v)
vv[n:length(vv)] - c(0, vv)[1:(length(vv)-n+1)]

Of course, this doesn't work for the general case of an arbitrary
sliding window function.
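
As a rough sketch of the cumsum idea above (the helper name window_sum and the zero-padding step are illustrative assumptions, not code from this thread), the count of 1's in every window of width n can be computed without a loop:

```r
# Sliding-window sum via cumulative sums (hypothetical helper name).
window_sum <- function(v, n) {
  stopifnot(n >= 1, n <= length(v))
  cs <- c(0, cumsum(v))   # cs[k + 1] holds the sum of v[1:k]
  # Window starting at i covers v[i:(i + n - 1)], i.e. cs[i + n] - cs[i]
  cs[(n + 1):length(cs)] - cs[1:(length(v) - n + 1)]
}

print(window_sum(c(1, 1, 0, 1), 2))
```

For c(1, 1, 0, 1) with n = 2 this gives 2 1 1. It makes a single pass over the vector, so it should scale to inputs of the size mentioned in the question, memory permitting.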

 -s

On 12/15/08, Chris Oldmeadow c.oldmea...@student.qut.edu.au wrote:
 Hi all,

 I have a very large binary vector, I wish to calculate the number of
 1's  over sliding windows.

 this is my very slow function

 slide <- function(seq, window) {
   n <- length(seq) - window
   tot <- c()
   tot[1] <- sum(seq[1:window])
   for (i in 2:n) {
     tot[i] <- tot[i-1] - seq[i-1] + seq[i]
   }
   return(tot)
 }

 this works well for for reasonably sized vectors. Does anybody know a
 way for large vectors ( length=12 million), im trying to avoid using C.

 Thanks,
 Chris





Re: [R] convert opengis wkt to geometry?

2008-12-16 Thread Roger Bivand
Jeff Hamann jeff.hamann at forestinformatics.com writes:

 
 After writing some code (stupidly without checking to see if there was 
 code to do this already) to generate PostGIS SQL insert statements for 
 simple geometry (WKT), I didn't check to see if there is already something 
 available to convert WKT strings into some R package geometry (sp?). 
 Does anyone have any advice, hints, code (?) for converting the 
 following OpenGIS strings into something useful in R:

If you are thinking of PostGIS, then readOGR() in rgdal will read them if built
with the necessary driver(s). On Linux and/or OSX, the user would configure
OGR to choose the external headers and libraries. On Windows, the user might
choose to install rgdal from source against the FWTools DLLs, which are at:

http://fwtools.maptools.org/

The PostGIS driver is described here:

http://www.gdal.org/ogr/drv_pg.html

There are some notes on using the driver with rgdal here:

http://wiki.intamap.org/index.php/PostGIS

 
 POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, MULTIPOLYGON,
 GEOMETRYCOLLECTION

In principle, only POINT, LINESTRING, and POLYGON (maybe MULTIPOLYGON, but
handled like a shapefile, that is flattened) are supported in sp/rgdal.

Please consider following up on R-sig-geo; there are possibly more eyes with
relevant experience there.

If the OpenGIS strings could rather be treated as GML, OGR has a driver for that
too.

Roger Bivand

 
 Jeff.




[R] Soundex codes

2008-12-16 Thread Doran, Harold
Dear List:

Has anyone done any work developing functions for producing Soundex
codes in R? RSiteSearch('soundex') did not yield any results, nor did my
Google searches.

Harold



Re: [R] sliding window over a large vector

2008-12-16 Thread Whit Armstrong
if you want the speed, you can simply build an fts time series from
it, then apply the moving.sum function and throw away the dates.

this will probably be the fastest implementation of rolling applies
out there unless you do a cumsum difference function.

I got a sample timing of 2 seconds on a 12m-length vector (see bottom of email).

library(fts)

your.data <- c(0,1,1,0,1,1,1,0,0,0,0,1,1,1,1)

## dates generated automatically
fake.fts <- fts(data=your.data)

answer.fts <- moving.sum(fake.fts,10)

## throw away dates
answer.as.vector <- as.numeric(answer.fts)


my timing:
> library(fts)
> big.fts <- fts(data=rep(1,1200))
> system.time(ans.fts <- moving.sum(big.fts,20))
   user  system elapsed
  1.970   0.081   2.051
> nrow(big.fts)
[1] 1200
> nrow(ans.fts)
[1] 1181




-Whit


On Tue, Dec 16, 2008 at 9:12 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Tue, Dec 16, 2008 at 8:23 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 There seems to be something wrong:

 slide(c(1, 1, 0, 1), 2)
 [1] 2 2

 but the output should be c(2, 1, 2)

 That should be c(2, 1, 1)


 At any rate try this:

 library(zoo)
 3 * rollmean(x, 3)


 On Mon, Dec 15, 2008 at 11:19 PM, Chris Oldmeadow
 c.oldmea...@student.qut.edu.au wrote:
 Hi all,

 I have a very large binary vector, I wish to calculate the number of 1's
  over sliding windows.

 this is my very slow function

 slide <- function(seq, window) {
   n <- length(seq) - window
   tot <- c()
   tot[1] <- sum(seq[1:window])
   for (i in 2:n) {
     tot[i] <- tot[i-1] - seq[i-1] + seq[i]
   }
   return(tot)
 }

 this works well for for reasonably sized vectors. Does anybody know a way
 for large vectors ( length=12 million), im trying to avoid using C.

 Thanks,
 Chris



Re: [R] OT: (quasi-?) separation in a logistic GLM

2008-12-16 Thread Ioannis Kosmidis
sorry for reposting. Some code was missing in my previous email...

--
Dear Gavin

glm reported exactly what it noticed, giving a warning that some very  
small fitted probabilities have been found.
However, your data are **not** quasi-separated. The maximum likelihood  
estimates are really those reported by glm.

A first elementary check is to change the tolerance and maximum number  
of iterations in glm and see if you get the same result:
#
  mod1 <- glm(formula = analogs ~ Dij, family = binomial, data = dat,  
control = glm.control(epsilon = 1e-16,  maxit = 1000))
  mod1

Call:  glm(formula = analogs ~ Dij, family = binomial, data = dat,  
control = glm.control(epsilon = 1e-16,  maxit = 1000))

Coefficients:
(Intercept)  Dij
   4.191  -29.388

Degrees of Freedom: 4033 Total (i.e. Null);  4032 Residual
Null Deviance:  1929
Residual Deviance: 613.5    AIC: 617.5
#
This is exactly the same fit as the one you have. If separation  
had occurred, the effects would usually diverge at some point as we  
allow glm more iterations.

**
Secondly, an inspection of the estimated asymptotic standard errors  
reveals nothing to worry about.
#
  summary(mod1)

Call:
glm(formula = analogs ~ Dij, family = binomial, data = dat, control =  
glm.control(epsilon = 1e-16,
 maxit = 1000))

Deviance Residuals:
Min  1Q  Median  3Q Max
-1.676e+00  -1.319e-02  -1.250e-04  -1.958e-06   4.104e+00

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   4.1912     0.3248   12.90   <2e-16 ***
Dij         -29.3875     1.9345  -15.19   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

 Null deviance: 1928.62  on 4033  degrees of freedom
Residual deviance:  613.53  on 4032  degrees of freedom
AIC: 617.53

Number of Fisher Scoring iterations: 11
#
If separation had occurred, the estimated asymptotic standard errors would  
be unnaturally large. This is because, in the case of separation  
(quasi or not), glm would calculate the standard errors by taking the sqrt  
of the diagonal elements of the inverse of minus the Hessian of the  
log-likelihood, at a point where the log-likelihood appears to be flat  
for the given tolerance.
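
To illustrate the point about standard errors, here is a minimal sketch on made-up, perfectly separated data (the vectors x and y are invented for illustration and are not the data discussed in this thread). Under complete separation glm typically stops with very large coefficients and even larger standard errors:

```r
# Toy data: every x <= 3 has y = 0 and every x >= 4 has y = 1,
# so the classes are completely separated (illustrative only).
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

# glm warns about fitted probabilities of 0 or 1; silence that here
fit <- suppressWarnings(glm(y ~ x, family = binomial))

# Estimated asymptotic standard errors from the fitted model
se <- sqrt(diag(vcov(fit)))

print(coef(fit))
print(se)
```

The standard errors printed here are orders of magnitude larger than the modest ones in the summary above, which is the contrast the check relies on.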

**
To be certain, you could also try fitting with brglm, which is  
guaranteed to give finite estimates, that have bias of smaller order  
than the MLE and compare the results.
#
  library(brglm)
  mod.br <- brglm(analogs ~ Dij, data = dat, family = binomial)
  mod.br

Call:  brglm(formula = analogs ~ Dij, family = binomial, data = dat)

Coefficients:
(Intercept)  Dij
   4.161  -29.188

Degrees of Freedom: 4033 Total (i.e. Null);  4032 Residual
Deviance:   613.5448
Penalized Deviance: 610.2794    AIC: 617.5448
#
The estimates are similar, though a bit shrunk towards the origin, which is  
natural for bias removal. If separation had occurred, and given the  
previous discussion, the bias-reduced estimates would be considerably  
different from the estimates that glm reports.

**
Lastly, the most certain way to check for separation is to inspect the  
profiles of the log-likelihood. Vito suggested this, but the chosen  
limits for the xval are not appropriate. If separation had occurred, the  
estimate would be -Inf, so the profiling as done in his email  
should start from, for example, -40 rather than -20. This  
would reveal that the profile deviance starts increasing again, while  
if separation had occurred there would be an asymptote on the left. Below I  
give the correct profiles, as reported by profileModel.
  library(profileModel)
  pp <- profileModel(mod1, quantile = qchisq(0.95, 1), objective =  
"ordinaryDeviance")
Preliminary iteration .. Done

Profiling for parameter (Intercept) ... Done
Profiling for parameter Dij ... Done
  plot(pp)
The profiles are quite quadratic. In the case of separation you would  
have seen asymptotes on the left or on the right (see  
help(profileModel) for an example).

**
It appears that the fitted logistic curve, while steep, still has a  
finite gradient, for example at the LD50 point:
  library(MASS)
  dose.p(mod)
   Dose  SE
p = 0.5: 0.1426167 0.003646903
When separation occurs the LD50 point cannot be identified (computer  
software would return something with enormous estimated standard error).

In conclusion, if you get data sets that result in large estimated  
effects on the log-odds scale, the above checks can be used to  
convince you whether separation occurred or not. If there is  
separation (not the case in the current example) then, you could use  
an alternative to maximum likelihood for estimation, such as  
penalized maximum likelihood in brglm, which always returns finite  
estimates. Though in that case, I suggest you incorporate the  
uncertainty on how large the estimated 

[R] Prediction intervals for zero inflated Poisson regression

2008-12-16 Thread ONKELINX, Thierry
Dear all,

I'm using zeroinfl() from the pscl package for zero-inflated Poisson
regression. I would like to calculate (approximate) prediction intervals
for the fitted values. The package itself does not provide them. Can
they be calculated analytically, or do I have to use the bootstrap?

What I have tried so far is to use the bootstrap to estimate these
intervals. Any comments on the code are welcome. The data and the model
are based on the examples in zeroinfl().

library(pscl)   # zeroinfl() and the bioChemists data
data("bioChemists", package = "pscl")

# approximate prediction intervals with Poisson regression
fm_pois <- glm(art ~ fem, data = bioChemists, family = poisson)
newdata <- na.omit(unique(bioChemists[, "fem", drop = FALSE]))
prediction <- predict(fm_pois, newdata = newdata, se.fit = TRUE)
ci <- data.frame(exp(prediction$fit + matrix(prediction$se.fit, ncol =
1) %*% c(-1.96, 1.96)))
newdata$fit <- exp(prediction$fit)
newdata <- cbind(newdata, ci)
newdata$model <- "Poisson"

# approximate prediction intervals with zero inflated Poisson regression
fm_zip <- zeroinfl(art ~ fem | 1, data = bioChemists)
fit <- predict(fm_zip)
Pearson <- resid(fm_zip, type = "pearson")
VarComp <- resid(fm_zip, type = "response") / Pearson
fem <- bioChemists$fem
# resample Pearson residuals, rebuild counts, refit and predict
bootstrap <- replicate(999, {
yStar <- pmax(round(fit + sample(Pearson) * VarComp, 0), 0)
predict(zeroinfl(yStar ~ fem | 1), newdata = newdata)
})
newdata0 <- newdata
newdata0$fit <- predict(fm_zip, newdata = newdata, type = "response")
newdata0[, 3:4] <- t(apply(bootstrap, 1, quantile, c(0.025, 0.975)))
newdata0$model <- "Zero inflated"

# compare the intervals in a nice plot.
newdata <- rbind(newdata, newdata0)
library(ggplot2)
ggplot(newdata, aes(x = fem, y = fit, ymin = X1, ymax = X2, colour =
model)) + geom_point(position = position_dodge(width = 0.4)) +
geom_errorbar(position = position_dodge(width = 0.4))


Best regards,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

The views expressed in this message and any annex are purely those of
the writer and may not be regarded as stating an official position of
INBO, as long as the message is not confirmed by a duly signed document.



[R] structure Arrays

2008-12-16 Thread Amit Patel
Hi

Does anyone know how I can use structured arrays in R,
similar to a dataframe in MATLAB?



  


Re: [R] Find all numbers in a certain interval

2008-12-16 Thread Antje

Hi,

Sorry, but it shouldn't be different. The result should be the same; I was 
just looking to see whether there is an existing method I can use...


# having a function defined like baptiste proposed:
isIn <-
function (interval, x)
{
(x > min(interval)) & (x < max(interval))
}

#--


a <- rnorm(100)

# it's simply more human readable if I can write

which( isIn( c(-0.5, 0.5), a) )

# instead of

which( a > -0.5 & a < 0.5 )

Thanks to baptiste! So there is no method available doing this and I have to 
define this by myself. That's all I wanted to know :-)


Antje


markle...@verizon.net schrieb:
hi:  could you explain EXACTLY what you want to do with the dataframe 
because it shouldn't be that different ?




On Tue, Dec 16, 2008 at  5:09 AM, Antje wrote:


Hi all,

I'd like to know, if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)

# would give me all indices of numbers greater than -0.5 and smaller 
than +0.5


I have something similar with a dataframe and it produces sometimes 
quite long commands...

I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?


Antje



Re: [R] Sorting a date vector

2008-12-16 Thread David Winsemius
You cannot keep them as strings and still get the benefits of working  
with date-class objects. You should read more documentation regarding  
dates. The as.Date function turns strings into a form that is stored  
internally as the number of days since a reference date (1970-01-01),  
and what you are seeing is the default display format, %Y-%m-%d. Learn  
how to use the output formats so that you see what you desire.


?as.Date
?Dates
?format.Date
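
A minimal sketch of that workflow, using a few strings in the same month-day-year layout as the question (the object names are illustrative): parse with as.Date(), sort on the parsed values, then either reorder the original strings or use format() to display the sorted dates in the input style.

```r
date_strings <- c("10-02-2008", "4-18-2008", "12-15-2008", "9-30-2008")

# Parse as Date objects; %m-%d-%Y matches the "-" separator and 4-digit year
parsed <- as.Date(date_strings, format = "%m-%d-%Y")

# Option 1: keep the original strings, ordered chronologically
sorted_strings <- date_strings[order(parsed)]

# Option 2: keep Date objects and only change how they are displayed
sorted_dates <- sort(parsed)
shown <- format(sorted_dates, "%m-%d-%Y")   # character, zero-padded months

print(sorted_strings)
print(shown)
```

Note that format() returns character strings, so the Date vector itself (sorted_dates here) is the one to keep for further date arithmetic.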

--
David Winsemius


On Dec 16, 2008, at 8:24 AM, RON70 wrote:



Yes, you are right. However, using that code, the format of the date is
altered. I need to maintain the same format as the input data, i.e.
10-02-2008 not 2008-10-02, while still having date-class. Any better idea?


David Winsemius wrote:


You might want to look at your date format more closely. Both the
separator and the year format specs fail to match your input.


as.Date("10-02-2008", format = "%m/%d/%y")

[1] NA

as.Date("10-02-2008", format = "%m-%d-%Y")

[1] "2008-10-02"

--
David Winsemius
On Dec 16, 2008, at 7:54 AM, RON70 wrote:



I have a date-like-vector like :


date_file

10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
10-10-2008 10-13-2008 10-14-2008 10-15-2008
10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
10-23-2008 10-24-2008 10-28-2008 10-29-2008
10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
11-06-2008 11-07-2008 11-10-2008 11-11-2008
11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
11-19-2008 11-20-2008 11-21-2008 11-24-2008
11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
12-03-2008 12-04-2008 12-05-2008 12-08-2008
12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
4-18-2008  4-21-2008  4-22-2008  4-23-2008
4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
5-06-2008  5-07-2008  5-09-2008  5-12-2008
5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
5-20-2008  5-21-2008  5-22-2008  5-23-2008
5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
6-03-2008  6-05-2008  6-06-2008  6-09-2008
6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
6-18-2008  6-19-2008  6-20-2008  6-23-2008
6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
7-02-2008  7-04-2008  7-07-2008  7-08-2008
7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
7-18-2008  7-21-2008  7-22-2008  7-23-2008
7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
8-01-2008  8-04-2008  8-05-2008  8-06-2008
8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
8-15-2008  8-18-2008  8-19-2008  8-20-2008
8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
8-28-2008  8-29-2008  9-03-2008  9-04-2008
9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
9-12-2008  9-15-2008  9-16-2008  9-17-2008
9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
9-25-2008  9-26-2008  9-29-2008  9-30-2008

I wanted to sort this in ascending order. I tried simply using the
sort() function, without altering the format of the dates, but it did
not work. Next I tried to convert that vector to a date-class vector so
that I could sort it, but in vain :(

I used:
as.Date(date_file, format="%m/%d/%y")

However it did not work.

Can anyone please tell me what would be correct approach?


[R] [R-pkgs] R CMD check on Windows XP

2008-12-16 Thread Shu Chen
Hi, there,

I used R CMD check to build my ATGGS package under Windows XP. My R 
version is 2.7.2, but I encountered some problems. The log file looks like this:
**
installing R.css in C:/ATGGS.Rcheck


-- Making package ATGGS 
  adding build stamp to DESCRIPTION
  installing R files
  installing inst files
find: `C:/ATGGS.Rcheck/ATGGS/csvscripts': Permission denied
make[2]: *** [C:/ATGGS.Rcheck/ATGGS/inst] Error 1
make[1]: *** [all] Error 2
make: *** [pkg-ATGGS] Error 2
Can't read C:/ATGGS.Rcheck/ATGGS/auxData: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/auxData: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/csvData: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/csvData: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/csvscripts: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/csvscripts: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/doc: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
*** Installation of ATGGS failed ***

Removing 'C:/ATGGS.Rcheck/ATGGS'



I am not able to delete c:/ATGGS.Rcheck until I change the permissions of the 
folder. I'm the admin of the C drive and have full control of all other 
folders under it.


Thanks for help.

Sue



**
Electronic Mail is not secure, may not be read every day, and should not be 
used for urgent or sensitive issues

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Parameter Estimation - Generalized Extreme Value Distribution

2008-12-16 Thread J. R. M. Hosking

Maithili Shiva wrote:

Dear R helpers,

How do you estimate the (Location, Scale, Shape) parameters of Generalized 
Extreme Value distribution using R?

...


Package lmom, function pelgev.

J. R. M. Hosking



Re: [R] Creating a pdf

2008-12-16 Thread Prof Brian Ripley

On Tue, 16 Dec 2008, Sergi M.Garrido wrote:


Hi guys,



I'm working on a package, and I want to create a new version of its PDF file.

On R 2.6.2 it ran OK with the command: R CMD Rd2dvi.sh --pdf pkg.

But it doesn't run on R 2.8.0.

What am I doing wrong?


Not telling us what the problem was.  But the correct syntax is
R CMD Rd2dvi --pdf pkg (and probably was in 2.6.2).  If that does not 
work, show us a session log including all the errors.





These are my components:

 ActivePerl-5.8.8.822-MSWin32-x86-280952

 basic-miktex-2.7.2960

 htmlhelp

 MinGW-3.2.0-rc-3


(BTW, looks really old.)


 Rtools28



Thanks in advance,

Sergi M.Garrido




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] socket server, textConnection and readLines

2008-12-16 Thread Krishna Dagli
Hello;

This is a bit of a long email, but I hope someone can guide me.

I have questions regarding socket, readLines and textConnection. I am
not sure if my code is efficient (due to textConnection) and how to
handle client disconnect and restart of the socket server in R.

I have a huge(3.5+G) text file on machine 'A', which I want to process
on machine 'B' using read.table (one line or a chunk at a time). On
machine B, I would like to use NWS and multiple R scripts to process
each line/chunk.

To do this I am running netcat (http://netcat.sourceforge.net/) on
machine 'A' and sending data to machine 'B's R socket server.

Here is the data that I have on machine 'A'

---data---
RELIANCE,1200.00,03-NOV-2008,09:00:02:286
RELIANCE,1200.20,03-NOV-2008,09:00:02:287
RELIANCE,1200.10,03-NOV-2008,09:00:02:289
RELIANCE,1201.10,03-NOV-2008,09:00:02:310
INFOSYSTCH,1400.00,03-NOV-2008,09:00:02:286
INFOSYSTCH,1400.20,03-NOV-2008,09:00:02:287
INFOSYSTCH,1400.10,03-NOV-2008,09:00:02:289
INFOSYSTCH,1401.10,03-NOV-2008,09:00:02:310
---end data---


Here is the code that I am using for reading this data on machine 'B'.

---code---
a.connection <- socketConnection(host = 'localhost', 1234,
 server = TRUE,
 blocking = TRUE,
 open = "r",
 encoding = getOption("encoding")
 )
while(1) {
  line.raw <- NULL;
  line.raw <- readLines( a.connection, n = 1, ok = TRUE);
  tConnection <- textConnection(line.raw);
  # try() so that a failed read yields a 'try-error' object to test for
  line.data <- try(read.table(tConnection), silent = TRUE);
  # close before the check so the connection is released on every path
  close(tConnection);
  if ( inherits(line.data, 'try-error') ||
     (length(line.data) <= 0)) {
    print ("may be client is disconnected! ");
    break;
  }
  # validate line.data and store it using
  print (line.data);
}
---end code---

Questions:

1) Is there a way to avoid creating and closing a textConnection in the
   above code? How can I directly read a line over a socket in R?
   If I do not explicitly close the connection I get a warning message
   saying "closing unused connection 7 (line.raw)".

2) What is the best way to detect that the client has disconnected?

3) In C, we can create a socket, bind it, and call accept() inside a
   while loop using select(), but how do I do the same in R?
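
Regarding question 1, one possible sketch (assuming a version of R whose read.table() has the text argument, and assuming the fields are comma-separated as in the sample data) parses each line without creating a textConnection by hand. The sample line stands in for what readLines(a.connection, n = 1) would return, and the helper name parse_line is hypothetical. A zero-length result from readLines is commonly taken as a sign that the peer has closed the connection.

```r
# Stand-in for: line.raw <- readLines(a.connection, n = 1)
line.raw <- "RELIANCE,1200.00,03-NOV-2008,09:00:02:286"

# Hypothetical helper: returns NULL when nothing was read,
# otherwise a one-row data frame with columns V1..V4
parse_line <- function(line) {
  if (length(line) == 0) return(NULL)
  read.table(text = line, sep = ",", stringsAsFactors = FALSE)
}

line.data <- parse_line(line.raw)
print(line.data)
```

In the read loop, a NULL from parse_line could then be used as the condition for leaving the loop instead of testing for a 'try-error'.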


Thanks again for reading such a long email and thanks in advance for
your pointers.

Thanks and Regards
Krishna



[R] Problem assigning NA as a level name in a list

2008-12-16 Thread Cliff Behrens
I want to generate a list (called dataList below) where each of its 
levels is named.  These names are taken from nameList, which contains 
all possible permutations of size two taking letters from a larger 
alphabet, e.g., "aa",...,"Fd",...,"Z1",...  One of these permutations is 
the character string "NA".  It seems that when I try to name one of the 
dataList levels "NA", using names(dataList) <- nameList, the names() 
function assigns the missing character to the level.  Is there some way 
to preserve "NA" as the name of a level in dataList?  Here is the R code 
I have been using to do this.


namePerms <- permutations(ncol(coinMat),2,colnames(coinMat),repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" in nameList is interpreted 
## so that the name "NA" is missing for one level in dataList
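
As a small check of the behaviour described above (a sketch for current versions of R; it may not reproduce what R 2.4.1 did), the character string "NA" produced by paste() is distinct from the missing value NA_character_ and can be used as a list name. The three names below are made up for illustration.

```r
# Build a name vector that contains the two-letter string "NA"
nameList <- c("aa", paste("N", "A", sep = ""), "Z1")
dataList <- vector("list", length(nameList))
names(dataList) <- nameList

# None of the names is a missing value; the second is the string "NA"
print(is.na(names(dataList)))

# Use [[ ]] with the quoted name to reach that element
dataList[["NA"]] <- 42
print(dataList[["NA"]])
```

If is.na() reports TRUE for one of the names, the name vector itself contained a real missing value rather than the string, which would point at the step that built nameList.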


I am running R 2.4.1 in the Windows XP environment.

Thanks for any help that can be offered.

Cliff Behrens




Re: [R] Sorting a date vector

2008-12-16 Thread RON70

Yes, you are right. However, using that code, the format of the date is altered. I need
to maintain the same format as the input data, i.e. "10-02-2008", not "2008-10-02",
while still having Date class. Any better idea?


David Winsemius wrote:
 
 You might want to look at your date format more closely. Both the  
 separator and the year format specs fail to match your input.
 
  > as.Date("10-02-2008", format = "%m/%d/%y")
 [1] NA
  > as.Date("10-02-2008", format = "%m-%d-%Y")
 [1] "2008-10-02"
 
 -- 
 David Winsemius
 On Dec 16, 2008, at 7:54 AM, RON70 wrote:
 

 I have a date-like-vector like :

 date_file
 10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
 10-10-2008 10-13-2008 10-14-2008 10-15-2008
 10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
 10-23-2008 10-24-2008 10-28-2008 10-29-2008
 10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
 11-06-2008 11-07-2008 11-10-2008 11-11-2008
 11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
 11-19-2008 11-20-2008 11-21-2008 11-24-2008
 11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
 12-03-2008 12-04-2008 12-05-2008 12-08-2008
 12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
 4-18-2008  4-21-2008  4-22-2008  4-23-2008
 4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
 5-06-2008  5-07-2008  5-09-2008  5-12-2008
 5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
 5-20-2008  5-21-2008  5-22-2008  5-23-2008
 5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
 6-03-2008  6-05-2008  6-06-2008  6-09-2008
 6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
 6-18-2008  6-19-2008  6-20-2008  6-23-2008
 6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
 7-02-2008  7-04-2008  7-07-2008  7-08-2008
 7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
 7-18-2008  7-21-2008  7-22-2008  7-23-2008
 7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
 8-01-2008  8-04-2008  8-05-2008  8-06-2008
 8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
 8-15-2008  8-18-2008  8-19-2008  8-20-2008
 8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
 8-28-2008  8-29-2008  9-03-2008  9-04-2008
 9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
 9-12-2008  9-15-2008  9-16-2008  9-17-2008
 9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
 9-25-2008  9-26-2008  9-29-2008  9-30-2008

 I wanted to sort this in ascending order. I tried simply using the sort()
 function, without altering the format of the dates, but it did not work.
 Next I tried to convert the vector to a Date-class vector so that I could
 sort it, but in vain :(

 I used:
 as.Date(date_file, format="%m/%d/%y")

 However it did not work.

 Can anyone please tell me what would be correct approach?
 -- 
 View this message in context:
 http://www.nabble.com/Sorting-a-date-vector-tp21032540p21032540.html
 Sent from the R help mailing list archive at Nabble.com.

 
 

-- 
View this message in context: 
http://www.nabble.com/Sorting-a-date-vector-tp21032540p21032997.html
Sent from the R help mailing list archive at Nabble.com.



[R] Parameter Estimation - Generalized Extreme Value Distribution

2008-12-16 Thread Maithili Shiva
Dear R helpers,

How do you estimate the (location, scale, shape) parameters of the Generalized 
Extreme Value distribution using R?

I have tried VGAM but have not been able to write the R script.

Please advise.

With regards

Maithili



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Peter Dalgaard

Cliff Behrens wrote:
I want to generate a list (called dataList below) where each of its 
levels is named.  These names are assigned to nameList, which contains 
all possible permutations of size two taking letters from a larger 
alphabet, e.g., "aa",...,"Fd",..,"Z1",...  One of these permutations is 
the character string "NA".  It seems that when I try to name one of the 
dataList levels "NA", using names(dataList) <- nameList, the names() 
function assigns the missing character to the level.  Is there some way 
to preserve "NA" as the name of a level in dataList?  Here is the R code 
I have been using to do this.


namePerms <- permutations(ncol(coinMat),2,colnames(coinMat),repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" in nameList is interpreted 
## so that the name "NA" is missing for one level in dataList


I am running R 2.4.1 in the Windows XP environment.

Thanks for any help that can be offered.


Your example is not reproducible and self-contained. What is 
permutations and coinMat??


I bet it isn't minimal either.

It doesn't seem to be happening for me with a recent(!) version of R, 
but you could just be misinterpreting the backtick quoting.


-
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



[R] Find all numbers in a certain interval

2008-12-16 Thread Antje

Hi all,

I'd like to know if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)

# would give me all indices of numbers greater than -0.5 and smaller than +0.5

I have something similar with a data frame, and it sometimes produces quite long 
commands...

I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?
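
One possible approach is to wrap the comparison in a small function. The sketch below defines a helper with the name used in the question; it is a hypothetical user-defined function, not an existing base R function, and it assumes both bounds are exclusive.

```r
# Hypothetical helper matching the name used in the question.
# Both bounds are exclusive, as in the original which() call.
within.interval <- function(x, lower, upper) {
  x > lower & x < upper
}

a <- c(-1.2, -0.3, 0, 0.49, 0.5, 2)
which(within.interval(a, -0.5, 0.5))
# [1] 2 3 4
```

The same helper can be used to subset rows of a data frame, for example df[within.interval(df$value, -0.5, 0.5), ], where df and value are placeholder names.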


Antje



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Dr Eberhard W Lisse
Quite irritating to me as the Manager of .NA too, when I
used NA for .NA :-)-O

el

Peter Dalgaard wrote:
 Cliff Behrens wrote:
 One of these permutations
 is the character string "NA".  It seems that when I try to name one of
 the dataList levels "NA", using names(dataList) <- nameList, the
 names() function assigns the missing character to the level.  Is there
 some way to preserve "NA" as the name of a level in dataList?



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Cliff Behrens

Peter,

OK...here is reproducible, self-contained code:

library(gregmisc)
columnNames <- c("A","B","C","D","N","a","b","c")
namePerms <- permutations(length(columnNames),2,columnNames,repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" is interpreted that the name is 
## missing for one list in dataList


If you inspect the contents of dataList, you will find the following 
showing that the name "NA" is treated differently:


..

$Na
NULL

$`NA`
NULL

$Nb
NULL
.



Re: [R] sliding window over a large vector

2008-12-16 Thread c.oldmeadow
The function works for me:

s <- rbinom(1000,1,0.5)

t <- slide(s,50)

It is just too slow.
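
If the window statistic is a sum, as in the rollapply comparison earlier in this thread, a cumulative-sum difference avoids looping over windows. This is only a sketch under that assumption; sliding_sum is a hypothetical name and is not the poster's slide function.

```r
# Hypothetical helper: sums over every window of k consecutive elements,
# computed as differences of the cumulative sum.
sliding_sum <- function(x, k) {
  stopifnot(k >= 1, k <= length(x))
  cs <- c(0, cumsum(x))
  cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]
}

x <- c(0, 0, 1, 1, 0, 1)
sliding_sum(x, 3)
# [1] 1 2 2 2
```

The result has length(x) - k + 1 elements, one per complete window, and the work is proportional to the length of the vector rather than to the length times the window size.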

Thanks.



Re: [R] pwr.prop.test and continuity correction

2008-12-16 Thread Peter Dalgaard

Daniel Brewer wrote:

Hi,

I am trying to sort out a discrepancy between power calculations results
between me and another statistician.  I use R but I am not sure what she
uses.  It is on the proportions test and so I have been using
pwr.prop.test.  I think I have tracked the problem down to pwr.prop.test
not using the continuity correction for the test (I did this by using
the java applet from
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/power.prop.test.html).

So I was wondering whether:
1) Someone could confirm that pwr.prop.test does not use a continuity
correction in its calculation.
2) Someone could tell me either how to use pwr.prop.test or another
function to get the power of a prop.test with continuity correction.
The reason I want this is that I would normally apply the correction
when I actually used the test.

Many thanks

Dan



power.prop.test (sic) is relying heavily on asymptotic normality, as do 
similar formulas. It doesn't use continuity correction, but if you're 
working with such small group sizes, I suspect that the correction term 
is the least of your worries and that direct simulation would be better.


(Another source of discrepancy, sometimes seen in textbooks, is that 
authors use the null variance of p1-p2 also under the alternative. This 
simplifies the formulas considerably, but it does assume that the actual 
difference is rather small.)


R is Open Source. If you want a correction term, it is just a matter of 
figuring out where to modify expressions like


p.body <- quote(pnorm(((sqrt(n) * abs(p1 - p2) -
    (qnorm(sig.level/tside, lower.tail = FALSE) *
        sqrt((p1 + p2) * (1 - (p1 + p2)/2)))) /
    sqrt(p1 * (1 - p1) + p2 * (1 - p2)))))

by adding or subtracting 0.5 or 0.5/n in the appropriate places.
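
The direct simulation suggested above could be sketched as follows. The helper name sim_power, the group size and the two proportions are illustrative assumptions; only prop.test() and its correct argument come from base R.

```r
# Hypothetical helper: estimate the power of prop.test() by simulation,
# with or without the continuity correction.
sim_power <- function(n, p1, p2, correct = TRUE, sig.level = 0.05,
                      nsim = 2000) {
  rejections <- replicate(nsim, {
    x <- c(rbinom(1, n, p1), rbinom(1, n, p2))
    pval <- suppressWarnings(prop.test(x, c(n, n), correct = correct)$p.value)
    isTRUE(pval < sig.level)  # an undefined p-value counts as no rejection
  })
  mean(rejections)
}

# Same seed for both calls, so both tests see the same simulated data.
set.seed(1)
with_cc <- sim_power(50, 0.5, 0.75, correct = TRUE)
set.seed(1)
without_cc <- sim_power(50, 0.5, 0.75, correct = FALSE)
c(with_correction = with_cc, without_correction = without_cc)
```

Because the correction can only shrink the test statistic, the corrected test never rejects where the uncorrected one does not, so on identical data the first estimate cannot exceed the second.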


--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] pwr.prop.test and continuity correction

2008-12-16 Thread Frank E Harrell Jr

In addition to what Peter said, the continuity correction is in effect 
an attempt to make the proportion test behave like Fisher's exact test 
which is known to be conservative.  We don't usually desire P-values 
that are too large, so I don't recommend the continuity correction.


See the bpower.sim function in the Hmisc package for a simulation-based 
method, and the reference below.


Frank

@Article{cra08how,
  author =   {Crans, Gerald G. and Shuster, Jonathan J.},
  title =    {How conservative is {Fisher's} exact test? {A} quantitative evaluation of the two-sample comparative binomial trial},
  journal =  {Stat in Med},
  year =     2008,
  volume =   27,
  pages =    {3598-3611},
  annote =   {Fisher's exact test; $2\times 2$ contingency table; size of test; comparative binomial experiment; first paper to truly quantify the conservativeness of Fisher's test; ``the test size of FET was less than 0.035 for nearly all sample sizes before 50 and did not approach 0.05 even for sample sizes over 100.''; conservativeness of ``exact'' methods}
}


--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] Sorting a date vector

2008-12-16 Thread RON70

I have a date-like-vector like :

 date_file
 10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
10-10-2008 10-13-2008 10-14-2008 10-15-2008
 10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
10-23-2008 10-24-2008 10-28-2008 10-29-2008
 10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
11-06-2008 11-07-2008 11-10-2008 11-11-2008
 11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
11-19-2008 11-20-2008 11-21-2008 11-24-2008
 11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
12-03-2008 12-04-2008 12-05-2008 12-08-2008
 12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
4-18-2008  4-21-2008  4-22-2008  4-23-2008 
 4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008 
5-06-2008  5-07-2008  5-09-2008  5-12-2008 
 5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008 
5-20-2008  5-21-2008  5-22-2008  5-23-2008 
 5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008 
6-03-2008  6-05-2008  6-06-2008  6-09-2008 
 6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008 
6-18-2008  6-19-2008  6-20-2008  6-23-2008 
 6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008 
7-02-2008  7-04-2008  7-07-2008  7-08-2008 
 7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008 
7-18-2008  7-21-2008  7-22-2008  7-23-2008 
 7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008 
8-01-2008  8-04-2008  8-05-2008  8-06-2008 
 8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008 
8-15-2008  8-18-2008  8-19-2008  8-20-2008 
 8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008 
8-28-2008  8-29-2008  9-03-2008  9-04-2008 
 9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008 
9-12-2008  9-15-2008  9-16-2008  9-17-2008 
 9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008 
9-25-2008  9-26-2008  9-29-2008  9-30-2008 

I wanted to sort this in ascending order. I tried simply using the sort()
function, without altering the format of the dates, but it did not work. Next I
tried to convert the vector to a Date-class vector so that I could sort
it, but in vain :(

I used:
as.Date(date_file, format="%m/%d/%y")

However it did not work.

Can anyone please tell me what would be correct approach?
-- 
View this message in context: 
http://www.nabble.com/Sorting-a-date-vector-tp21032540p21032540.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Peter Dalgaard

Cliff Behrens wrote:

Peter,

OK...here is reproducible, self-contained code:

library(gregmisc)


Relying on a 3rd party package is not kosher either... Whatever did
list("NA"=2)  or l <- list(2); names(l) <- "NA" do to you?


columnNames <- c("A","B","C","D","N","a","b","c")
namePerms <- permutations(length(columnNames),2,columnNames,repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" is interpreted that the name is 
## missing for one list in dataList


If you inspect the contents of dataList, you will find the following 
showing that the name "NA" is treated differently:


Anyways... As I thought:

Remember that NA is a reserved word. You get the same kind of reaction 
if you name an element "for" or "in". It denotes that you need to quote 
the name for indexing with $:


> names(l) <- "NA"
> l$NA
Error: unexpected numeric constant in "l$NA"
> l$`NA`
[1] 2
> l$"NA"
[1] 2
> l[["NA"]]
[1] 2
> names(l)
[1] "NA"
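
The point can be checked directly with a short sketch that uses only base R. It rebuilds the one-element list from the transcript and confirms that the stored name is the ordinary string "NA" and not a missing value.

```r
l <- list(2)
names(l) <- "NA"        # the character string "NA", not a missing value

is.na(names(l))         # FALSE: the name is an ordinary string
l[["NA"]]               # 2, string indexing needs no special quoting
l$`NA`                  # 2, backticks are needed because NA is reserved
```

The backticks that print() shows around the name are only display quoting for a reserved word; the stored name is unchanged.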


..

$Na
NULL

$`NA`
NULL

$Nb
NULL
.

--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



[R] problem install modul R-base-2.5.0-2.1.x86_64.rpm on SLES9 64-bit

2008-12-16 Thread Mutti Luca
 Good morning,
 
 I need to install R-2.x on my host: 
 
 Linux, 2.6.5-7.308-smp #1 SMP Mon Dec 10 11:36:40 UTC 2007 x86_64
 x86_64 x86_64 GNU/Linux
 
 I have checked the packages listed in your document:
 http://cran.r-project.org/bin/linux/suse/ReadMe.html
 but I have a problem with xorg-x11-libs: on my host the installed
 package is XFree86-libs-4.3.99.902-43.94,
 and I get a conflict (see my log)
  log1.txt 
 Can you help me, or tell me where to find the right information for
 installing on my host?
 
 Thanks in advance, best regards
 _
 Repubblica e Cantone Ticino, www.ti.ch/csi
 Dipartimento delle finanze e dell'economia
 Centro sistemi informativi/PESC
 Luca Mutti
 Via Pretorio 16   (+41 91) 815.57.49
 6901 Lugano   luca.mu...@ti.ch
 

# rpm -i xorg-x11-libs-6.8.2-0.1.x86_64.rpm
warning: xorg-x11-libs-6.8.2-0.1.x86_64.rpm: V3 DSA signature: NOKEY, key ID 
6143b445
file /usr/X11R6/bin/xauth from install of xorg-x11-libs-6.8.2-0.1 
conflicts with file from package XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/compose.dir from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/en_US.UTF-8/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/en_US.UTF-8/XLC_LOCALE from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/iso8859-15/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/ja_JP.UTF-8/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/ja_JP.UTF-8/XLC_LOCALE from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/ko_KR.UTF-8/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/ko_KR.UTF-8/XLC_LOCALE from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/ximcp.so.2 from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/xlcDef.so.2 from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/xlcUTF8Load.so.2 from 
install of xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/xlibi18n.so.2 from install 
of xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/xlocale.so.2 from install 
of xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/lib64/common/xomGeneric.so.2 from 
install of xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/locale.alias from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/locale.dir from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/th_TH.UTF-8/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/th_TH.UTF-8/XLC_LOCALE from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/zh_CN.UTF-8/Compose from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file /usr/X11R6/lib/X11/locale/zh_CN.UTF-8/XLC_LOCALE from install of 
xorg-x11-libs-6.8.2-0.1 conflicts with file from package 
XFree86-libs-4.3.99.902-43.94
file 

Re: [R] ggplot2 and lattice

2008-12-16 Thread Wayne F


stephen sefick wrote:
 
 yes a parallel coordinates plot- I understand that it is for
 multivariate data, but I am having a hard time figuring out what it is
 telling me.  Thanks for your help.
 
In the lattice book, the author mentions that static parallel plots aren't
very useful, in general.

With a lot of data, they tend to be a spaghetti mess. They're more useful
when you can brush over data to highlight it dynamically, which could show
you common patterns. (E.g. that cars with smaller engines tend to have
better mileage, but poorer acceleration.)

At least that's my limited experience with them.

Wikipedia has a page: http://en.wikipedia.org/wiki/Parallel_coordinates and
the sample graph they have at the top of the page shows data that clusters
on the first 5 features/dimensions, and then goes spaghetti on you. (As the
article says, ordering of the dimensions is important, and they obviously
got a reasonable order... or had boring data.)
-- 
View this message in context: 
http://www.nabble.com/ggplot2-and-lattice-tp19579003p21036174.html
Sent from the R help mailing list archive at Nabble.com.



[R] Parameter estimation - Generalized Extreme value Distribution

2008-12-16 Thread Maithili Shiva
Dear R help,


I have a file named ONS.csv containing 25 observations, as given below.

This is my data. (i.e. the first column of file ONS.csv)

(5.55,4.56,17.82,5.03,5.3,40.28,8.05,27.8,5.85,5.42,14.75,46.13,18.5,4.58,
4.31,9.19,6.61,15.92,96.94,21.63,4.44,4.88,241.74,38592.1,5.24)


I am trying to fit the Generalized Extreme Value distribution to this data.


Following is my R script:

  library(ismev)
  ONS <- read.csv("GEV.csv", header = TRUE)
  gev.fit(ONS[,1])

I get following output

$conv
[1] 0

$nllh
[1] 99.28817

$mle
[1] 5.940866 2.703154 1.425794

$se
[1] 0.6827288 1.1263298 0.2590853

What is the meaning of the mle entries? Do they give me the estimated 
location (5.940866), scale (2.703154) and shape (1.425794) parameters of 
the Generalized Extreme Value distribution?
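
As a sketch, the output above can be tabulated with labels. This assumes that gev.fit() without covariates reports mle and se in the order location, scale, shape. The numbers are copied from the output quoted above, and the interval columns are rough Wald-type approximations that may be unreliable with only 25 observations.

```r
# Values copied from the gev.fit() output quoted above.
mle <- c(5.940866, 2.703154, 1.425794)
se  <- c(0.6827288, 1.1263298, 0.2590853)

# Assumed parameter order: location, scale, shape.
gev_estimates <- data.frame(
  parameter = c("location", "scale", "shape"),
  estimate  = mle,
  std_error = se,
  lower_95  = mle - 1.96 * se,
  upper_95  = mle + 1.96 * se
)
print(gev_estimates)
```

The $conv value of 0 in the output is the optimizer's convergence code, and $nllh is the negative log-likelihood at the estimates.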

Please guide.

Thanking you in advance



Re: [R] Sorting a date vector

2008-12-16 Thread Bingxiang Miao
2008/12/16 RON70 ron_michae...@yahoo.com


 Yes, you are right. However, using that code, the format of the date is altered. I
 need
 to maintain the same format as the input data, i.e. "10-02-2008", not "2008-10-02",
 while still having Date class. Any better idea?


   You may try this:
format(sort(a,decreasing=TRUE),"%m-%d-%Y")
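
A small sketch of two ways to get an ascending result. The four sample dates are taken from the vector in the original post, and the variable names are illustrative. A Date object always prints in ISO form, so the month-day-year layout is available either by reordering the original strings or by formatting the sorted dates for display.

```r
date_file <- c("10-02-2008", "12-15-2008", "4-18-2008", "9-30-2008")
parsed <- as.Date(date_file, format = "%m-%d-%Y")

# Option 1: reorder the original strings; the result is character.
sorted_strings <- date_file[order(parsed)]
sorted_strings
# [1] "4-18-2008"  "9-30-2008"  "10-02-2008" "12-15-2008"

# Option 2: sort as Date, then format for display only.
sorted_dates <- sort(parsed)
format(sorted_dates, "%m-%d-%Y")
# [1] "04-18-2008" "09-30-2008" "10-02-2008" "12-15-2008"
```

Option 1 keeps the input strings exactly as they were, including months without a leading zero; option 2 keeps a Date-class vector for further computation and converts to text only when printing.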




 David Winsemius wrote:
 
  You might want to look at your date format more closely. Both the
  separator and the year format specs fail to match your input.
 
as.Date(10-02-2008, format = %m/%d/%y)
  [1] NA
as.Date(10-02-2008, format = %m-%d-%Y)
  [1] 2008-10-02
 
  --
  David Winsemius
  On Dec 16, 2008, at 7:54 AM, RON70 wrote:
 
 
  I have a date-like-vector like :
 
  date_file
  10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
  10-10-2008 10-13-2008 10-14-2008 10-15-2008
  10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
  10-23-2008 10-24-2008 10-28-2008 10-29-2008
  10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
  11-06-2008 11-07-2008 11-10-2008 11-11-2008
  11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
  11-19-2008 11-20-2008 11-21-2008 11-24-2008
  11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
  12-03-2008 12-04-2008 12-05-2008 12-08-2008
  12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
  4-18-2008  4-21-2008  4-22-2008  4-23-2008
  4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
  5-06-2008  5-07-2008  5-09-2008  5-12-2008
  5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
  5-20-2008  5-21-2008  5-22-2008  5-23-2008
  5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
  6-03-2008  6-05-2008  6-06-2008  6-09-2008
  6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
  6-18-2008  6-19-2008  6-20-2008  6-23-2008
  6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
  7-02-2008  7-04-2008  7-07-2008  7-08-2008
  7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
  7-18-2008  7-21-2008  7-22-2008  7-23-2008
  7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
  8-01-2008  8-04-2008  8-05-2008  8-06-2008
  8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
  8-15-2008  8-18-2008  8-19-2008  8-20-2008
  8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
  8-28-2008  8-29-2008  9-03-2008  9-04-2008
  9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
  9-12-2008  9-15-2008  9-16-2008  9-17-2008
  9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
  9-25-2008  9-26-2008  9-29-2008  9-30-2008
 
  I wanted to sort this in ascending order. I tried simply using the sort()
  function without altering the date format, but it did not work.
  Next I tried to convert the vector to a Date-class vector so that I could
  sort it, but in vain :(

  I used:
  as.Date(date_file, format="%m/%d/%y")

  However, it did not work.

  Can anyone please tell me what the correct approach would be?

Re: [R] Sorting a date vector

2008-12-16 Thread David Winsemius
You might want to look at your date format more closely. Both the  
separator and the year format specs fail to match your input.


> as.Date("10-02-2008", format = "%m/%d/%y")
[1] NA
> as.Date("10-02-2008", format = "%m-%d-%Y")
[1] "2008-10-02"
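
For readers following along, here is a minimal sketch of the parse-then-sort
approach that the corrected format string makes possible. The short vector
dts is a made-up stand-in for date_file, not the poster's actual data.

```r
# Hypothetical stand-in for date_file: month-day-year strings with "-" separators
dts <- c("10-02-2008", "4-18-2008", "12-15-2008", "9-30-2008")

# Parse with a format that matches the input: "-" separators, four-digit year (%Y)
parsed <- as.Date(dts, format = "%m-%d-%Y")

# Either sort the Date objects themselves ...
sorted_dates <- sort(parsed)

# ... or keep the original strings and reorder them by their parsed dates
sorted_strings <- dts[order(parsed)]
print(sorted_strings)
```

Sorting the raw strings fails because they are compared character by
character, so "10-..." sorts before "4-...".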

--
David Winsemius
On Dec 16, 2008, at 7:54 AM, RON70 wrote:



I have a date-like-vector like :


date_file

10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
10-10-2008 10-13-2008 10-14-2008 10-15-2008
10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
10-23-2008 10-24-2008 10-28-2008 10-29-2008
10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
11-06-2008 11-07-2008 11-10-2008 11-11-2008
11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
11-19-2008 11-20-2008 11-21-2008 11-24-2008
11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
12-03-2008 12-04-2008 12-05-2008 12-08-2008
12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
4-18-2008  4-21-2008  4-22-2008  4-23-2008
4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
5-06-2008  5-07-2008  5-09-2008  5-12-2008
5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
5-20-2008  5-21-2008  5-22-2008  5-23-2008
5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
6-03-2008  6-05-2008  6-06-2008  6-09-2008
6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
6-18-2008  6-19-2008  6-20-2008  6-23-2008
6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
7-02-2008  7-04-2008  7-07-2008  7-08-2008
7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
7-18-2008  7-21-2008  7-22-2008  7-23-2008
7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
8-01-2008  8-04-2008  8-05-2008  8-06-2008
8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
8-15-2008  8-18-2008  8-19-2008  8-20-2008
8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
8-28-2008  8-29-2008  9-03-2008  9-04-2008
9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
9-12-2008  9-15-2008  9-16-2008  9-17-2008
9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
9-25-2008  9-26-2008  9-29-2008  9-30-2008

I wanted to sort this in ascending order. I tried simply using the sort()
function without altering the date format, but it did not work.
Next I tried to convert the vector to a Date-class vector so that I could
sort it, but in vain :(

I used:
as.Date(date_file, format="%m/%d/%y")

However, it did not work.

Can anyone please tell me what the correct approach would be?


[R] surface contour plot help

2008-12-16 Thread Brad B
I am trying to do a surface profile plot.
The data look like this:

X          Y      Z
1-jan-02   2002   number   (Y1, Z1)
2-jan-02   2002   number   (Y1, Z1)
...
1-jan-03   2003   number   (Y2, Z2)
2-jan-03   2003   number   (Y2, Z2)
...
and so on until Dec 31, 2007.

I used the plot3d function to build a scatter-point plot:
Call rinterface.rrun("library(rgl)")
Call rinterface.rrun("plot3d(x,y1,z1,xlab='Date',ylab='Year',zlab='Vol',ylim=c(2001,2008))")
Call rinterface.rrun("plot3d(x,y2,z2,add=TRUE)")
Call rinterface.rrun("plot3d(x,y3,z3,add=TRUE)")
Call rinterface.rrun("plot3d(x,y4,z4,add=TRUE)")
Call rinterface.rrun("plot3d(x,y5,z5,add=TRUE)")
Call rinterface.rrun("plot3d(x,y6,z6,add=TRUE)")

Is there a way to lay a surface over this?
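
One possible route, sketched under assumptions: if there is one value per
day and year, the long data can be reshaped into a day-by-year matrix and
drawn as a surface. The data below are simulated, the names vol and zmat are
hypothetical, and base graphics persp() is used so the sketch has no package
dependency. The rgl package has its own surface functions (persp3d() and
surface3d()) for an interactive version.

```r
# Simulated stand-in for the poster's data: one value per day-of-year and year
set.seed(42)
vol <- expand.grid(day = 1:365, year = 2002:2007)
vol$z <- sin(vol$day / 58) + (vol$year - 2002) / 5 + rnorm(nrow(vol), sd = 0.05)

# Reshape long data into a matrix: rows = days, columns = years
zmat <- with(vol, tapply(z, list(day, year), mean))

# Draw the surface; x and y must be increasing and match the matrix dimensions
persp(x = 1:365, y = 2002:2007, z = zmat,
      xlab = "Day of year", ylab = "Year", zlab = "Vol",
      theta = 40, phi = 25)
```

Using day of year rather than the calendar date on the x axis is a design
choice here: it lines the years up side by side, which is what a surface
needs.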
 


  


Re: [R] [R-pkgs] R CMD check on window XP

2008-12-16 Thread Martin Maechler
This message accidentally (list moderator mistake) made it to
R-packages; it clearly should have been R-help only.

Please only reply to R-help if you can help Sue.

Martin Maechler,
R-packages list moderator

 Hi, there,
 I used R CMD check to build my ATGGS package under Windows XP. My R
 version is 2.7.2, but I encountered some problems. The log file looks like this:
 
 ..



[R] Check if data frame column is numeric

2008-12-16 Thread Mark Heckmann
Hi R-users,

I want to apply a function to each column of a data frame that is numeric.
Thus I tried to check it for each column first: 

> apply(df, 2, function(x) is.numeric(x))
  A60   A64  A66a   A67   A71  A75a   A80   A85   A91   A95   A96   A97   A98   A99
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

I get only FALSE results although the variables are numeric. When I try the
following, it works:

> is.numeric(df$A60)
[1] TRUE

What am I doing wrong?

TIA
Mark



Re: [R] Check if data frame column is numeric

2008-12-16 Thread Henrique Dallazuanna
Try:

sapply(df, is.numeric)
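
A short sketch of why the two calls differ, using a made-up data frame (the
column grp is hypothetical). apply() first converts the data frame to a
matrix, and a matrix can hold only one type. As soon as one column is
non-numeric, every column becomes character. sapply() visits the columns as
they are stored.

```r
# Made-up data frame with two numeric columns and one character column
df <- data.frame(A60 = c(1.5, 2, 3), A64 = 1:3, grp = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# apply() coerces df with as.matrix(); the result is a character matrix,
# so is.numeric() is FALSE for every column
via_apply <- apply(df, 2, is.numeric)

# sapply() iterates over the columns of the list-like data frame directly
num_cols <- sapply(df, is.numeric)

# Apply a function only to the numeric columns
col_means <- sapply(df[num_cols], mean)
print(col_means)
```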

On Tue, Dec 16, 2008 at 1:25 PM, Mark Heckmann mark.heckm...@gmx.de wrote:

 Hi R-users,

 I want to apply a function to each column of a data frame that is numeric.
 Thus I tried to check it for each column first:

  apply(df, 2, function(x) is.numeric(x))

 A60   A64  A66a   A67   A71  A75a   A80
 A85   A91   A95   A96   A97   A98   A99
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE

 I get only FALSE results although the variables are numeric. When I try the
 following it works:

  is.numeric(df$A60)
 [1] TRUE

 What am I doing wrong?

 TIA
 Mark





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Find all numbers in a certain interval

2008-12-16 Thread David Winsemius


On Dec 16, 2008, at 7:19 AM, Antje wrote:


Hi David,

thanks a lot for your proposal. I got a lot of useful hints from all  
of you :-)


David Winsemius schrieb:
It's not entirely clear what you are asking for, since
which(within.interval(a, -0.5, 0.5)) is actually longer than
which(a > -0.5 & a < 0.5).


Right but in case 'a' is something with a long name and '0.5' is a  
variable you might end up with something like this (for the data  
frame example):


DF[which( DF$myReallyLongColumnName > -myReallyLongThreshold &
          DF$myReallyLongColumnName < -myReallyLongThreshold ), ]


I see your point, but I must point out that no cases would ever  
satisfy that construction.





instead of:

DF[which( within.interval(DF$myReallyLongColumnName,  
myReallyLongThreshold), ]


That would be a different within.interval function than I suggested,  
but you could certainly create one which accepted a vector.


within.interval <- function(x, y) { min(y) < x & x < max(y) }
--
> within.interval2 <- function(x,y) { min(y) < x & x < max(y)}

> y <- c(-.1, -.2, .1,.2)

> which(within.interval2(DF$a,y))
[1]  7 13 14 17
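
As a hedged illustration of how such a helper shortens the long data frame
expression discussed above, here is a small self-contained sketch. The name
within_interval, the example data, and the NA handling are assumptions made
for this illustration, not part of the thread.

```r
# Hypothetical helper: TRUE where x lies strictly between the smallest and
# largest value in 'bounds'; NA entries in x are treated as "not inside"
within_interval <- function(x, bounds) {
  !is.na(x) & x > min(bounds) & x < max(bounds)
}

# Made-up data frame with a long column name, as in the example above
DF <- data.frame(myReallyLongColumnName = c(-0.7, -0.2, 0, 0.49, 0.5, NA, 1.2),
                 id = letters[1:7], stringsAsFactors = FALSE)
myReallyLongThreshold <- 0.5

# The column name and the threshold each appear only once
hits <- DF[within_interval(DF$myReallyLongColumnName,
                           c(-1, 1) * myReallyLongThreshold), ]
print(hits)
```

Because the helper never returns NA, the result can index rows directly
without wrapping it in which().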






You mention that you want a solution that applies to
dataframes. Using indexing you can get entire rows of dataframes
that satisfy multiple conditions on one of its columns:

> DF <- data.frame(a = rnorm(20), b = LETTERS[1:20], c = letters[20:1],
+                  stringsAsFactors = FALSE)

> DF[which( DF$a > -0.5 & DF$a < 0.5 ), ]
> # note that one needs to avoid DF[which(a > -0.5 & a < 0.5) , ]
> # the "a" vector is not the same as the "a" column vector within DF
             a b c
3  -0.47310672 C r
6  -0.49784460 F o
9   0.02571058 I l
10  0.16893759 J k
11 -0.11963322 K j
12  0.39378887 L i
16  0.03712263 P e

Could get the indices that satisfy more than one condition:

> which(DF$a > 0.5 & DF$b < "K")
[1]  1  2  6 10

Or you can get rows of DF that satisfy conditions on multiple
columns with the subset function:

> subset(DF, a > 0.5 & b < "K")
           a b c
1  2.2500997 A t
2  0.7251357 B s
6  0.7845355 F o
10 1.0685649 J k

Or if you wanted a within.interval function:

> within.interval <- function(x,a,b) { x > a & x < b}
> which(within.interval(DF$a, -0.5, 0.5))
[1]  3  4  7  8  9 13 14 17 20





Re: [R] Is = now the same as <- in assigning values

2008-12-16 Thread Petter Hedberg
Thank you all for the replies. I'll start using <-.

Best regards,

Petter Hedberg
University of Warsaw.


2008/12/16 Gabor Grothendieck ggrothendi...@gmail.com:
 In most cases <- and = are the same, yet it's not always
 true, so it's safest to use <- for assignment.

 Check this out:

 http://tolstoy.newcastle.edu.au/R/e4/help/08/06/12940.html
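
 One case where the two differ, sketched here as an illustration (the
 function name demo_assignment is made up): inside a function call, = names
 an argument, whereas <- performs an assignment in the calling environment
 and then passes the value along.

```r
# Hypothetical demo: compare '=' and '<-' inside a function call
demo_assignment <- function() {
  m1 <- mean(x = c(1, 2, 3))                  # '=' names the argument of mean()
  before <- exists("x", inherits = FALSE)     # no local variable x was created

  m2 <- mean(x <- c(4, 5, 6))                 # '<-' creates x here, then passes it
  after <- exists("x", inherits = FALSE)      # now a local variable x exists

  list(m1 = m1, before = before, m2 = m2, after = after)
}

res <- demo_assignment()
str(res)
```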

 On Mon, Dec 15, 2008 at 4:26 PM, Petter Hedberg ekologkons...@gmail.com 
 wrote:
 I'm a PhD student at the University of Warsaw, and have started using R.
 Many books say to use <- instead of = when assigning
 values, and this is also mentioned in older posts on the R website.

 However, it seems to me that some update has occurred, because I
 always get the same result whether I use <- or =.

 I would be extremely grateful for any answer to this.
 = seems more intuitive, so I assumed that an update had been made due
 to popular demand and that this was why I get the same output whether I use
 <- or =.

 Best regards,

 Petter Hedberg



Re: [R] OT: (quasi-?) separation in a logistic GLM

2008-12-16 Thread Gavin Simpson
On Tue, 2008-12-16 at 13:31 +0100, vito muggeo wrote:
 dear Gavin,
 I do not know whether such comment may be still useful..

Very much so, Thank you.

 
 Why are you unsure about quasi-separation?
 I think that it is quite evident in the plot

Unsure in the sense that I had been unable to ascertain what
quasi-complete separation was ;-)

I'm still not convinced about the quasi-separation issue though. The
coefficients on the glm are large but the standard errors don't indicate
anything much wrong.

I tried brglm() in the package of the same name and this gave
effectively the same coefficients and standard errors as glm() where I
would have expected them to differ considerably if (quasi-)separation
were an issue. I'm not very familiar with the approach behind brglm()
however.

I'll take a look at the profiling you describe below also when our
computing problems here get sorted.

Apologies if people have had problems downloading the file from my web
space - we are having all sorts of filestore problems here this week.

Thanks again Vito for your comments,

G

 
 plot(analogs ~ Dij, data = dat)
 
 Also it may be useful to see the plot of the monotone (profile) deviance 
 (or the log-lik) for the coef of Dij,
 
 xval <- seq(-20, 0, l = 50)
 ll <- vector(length = 50)
 for (i in 1:length(xval)) {
   mod <- glm(analogs ~ offset(xval[i]*Dij), data = dat, family = binomial)
   ll[i] <- mod$dev
 }
 
 plot(xval, ll)
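
 For anyone who cannot download the data file, the same profiling idea can
 be tried on simulated data. This is only a sketch: the data below are
 invented (with a true slope of -10) and are not Gavin's data. With
 overlapping classes the profile deviance has a minimum inside the grid.
 Under complete separation it would instead keep decreasing towards one end.

```r
# Simulated stand-in data: two overlapping classes, true slope -10
set.seed(1)
n <- 200
Dij <- runif(n)
analogs <- rbinom(n, 1, plogis(3 - 10 * Dij))
dat <- data.frame(analogs, Dij)

# Fix the Dij coefficient at each grid value via an offset, re-estimate only
# the intercept, and record the residual deviance
xval <- seq(-20, 0, length.out = 50)
ll <- vapply(xval, function(b) {
  deviance(glm(analogs ~ offset(b * Dij), data = dat, family = binomial))
}, numeric(1))

# Grid value with the smallest profile deviance
best <- xval[which.min(ll)]
print(best)
```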
 
 Hope this helps you,
 
 vito
 
 Gavin Simpson ha scritto:
  Dear List,
  
  Apologies for this off-topic post but it is R-related in the sense that
  I am trying to understand what R is telling me with the data to hand.
  
  ROC curves have recently been used to determine a dissimilarity
  threshold for identifying whether two samples are from the same type
  or not. Given the bashing that ROC curves get whenever anyone asks about
  them on this list (and having implemented the ROC methodology in my
  analogue package) I wanted to try directly modelling the probability
  that two sites are analogues for one another for given dissimilarity
  using glm().
  
  The data I have then are a logical vector ('analogs') indicating whether
  the two sites come from the same vegetation and a vector of the
  dissimilarity between the two sites ('Dij'). These are in a csv file
  currently in my university web space. Each 'row' in this file
  corresponds to single comparison between 2 sites.
  
  When I analyse these data using glm() I get the familiar "fitted
  probabilities numerically 0 or 1 occurred" warning. The data do not look
  linearly separable when plotted (code for which is below). I have read
  Venables and Ripley's discussion of this in MASS4 and other sources that
  discuss this warning and R (Faraway's "Extending the Linear Model with R"
  and John Fox's new "Applied Regression, Generalized Linear Models, and
  Related Methods", 2nd Ed) as well as some of the literature on Firth's
  bias reduction method. But I am still somewhat unsure what
  (quasi-)separation is and if this is the reason for the warnings in this
  case.
  
  My question then is, is this a separation issue with my data, or is it
  quasi-separation that I have read a bit about whilst researching this
  problem? Or is this something completely different?
  
  Code to reproduce my problem with the actual data is given below. I'd
  appreciate any comments or thoughts on this.
  
  ## Begin code snippet ##
  
  ## note data file is ~93Kb in size
  dat <- read.csv(url("http://www.homepages.ucl.ac.uk/~ucfagls/dat.csv"))
  head(dat)
  ## fit model --- produces warning
  mod <- glm(analogs ~ Dij, data = dat, family = binomial)
  ## plot the data
  plot(analogs ~ Dij, data = dat)
  fit.mod <- fitted(mod)
  ord <- with(dat, order(Dij))
  with(dat, lines(Dij[ord], fit.mod[ord], col = "red", lwd = 2))
  
  ## End code snippet ##
  
  Thanks in advance
  
  Gavin
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] [ExternalEmail] Pearson Correlation Speed

2008-12-16 Thread Charles C. Berry

On Tue, 16 Dec 2008, Nathan S. Watson-Haigh wrote:



Charles C. Berry wrote:

On Mon, 15 Dec 2008, Nathan S. Watson-Haigh wrote:



Nathan S. Watson-Haigh wrote:

I'm trying to calculate Pearson correlation coefficients for a large
matrix of size 18563 x 18563. The following function takes about XX
minutes to complete, and I'd like to do this calculation about 15 times,
so speed is somewhat of an issue.


I think you are on the wrong track, Nathan.

The matrix you are starting with is 18563 x 18563 and the result of
finding the correlations amongst the columns of that matrix is also 18563
x 18563. It will require more than 5 Gigabytes of memory to store the
result and the original matrix.


Yes the memory usage is somewhat large - luckily I have the use of a
cluster with lots of shared memory! However, I'm interested to learn how
you came about the calculation to determine the memory requirements.



The original object is

> 18563^2*8/1024^3
[1] 2.567358

Gigabytes, and so is the result. I added them together.




Likely the time needed to do the calc is inflated because of caching
issues and, if your machine does not have enough memory to store the
result and all the intermediate pieces, by swapping as well.

You can finesse these by breaking your problem into smaller pieces, say
computing the correlations between each pair of 19 blocks of columns
(columns 1:977, 977+1:977, ... 18*977+1:977 ), then assembling the
results.
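
The blockwise idea can be sketched on a toy matrix as follows. The sizes
(12 columns in blocks of 4) and all variable names are made up for
illustration. For the real problem the blocks would be the 19 blocks of 977
columns described above.

```r
# Toy data: 50 rows, 12 columns, split into 3 blocks of 4 columns
set.seed(2)
X <- matrix(rnorm(50 * 12), nrow = 50)
blocks <- split(seq_len(ncol(X)), ceiling(seq_len(ncol(X)) / 4))

# Fill the correlation matrix one block pair at a time; by symmetry only
# pairs with j >= i need to be computed
R <- matrix(NA_real_, ncol(X), ncol(X))
for (i in seq_along(blocks)) {
  for (j in i:length(blocks)) {
    bi <- blocks[[i]]
    bj <- blocks[[j]]
    R[bi, bj] <- cor(X[, bi], X[, bj])
    R[bj, bi] <- t(R[bi, bj])
  }
}

# Agrees with the one-shot computation on this small example
print(all.equal(R, cor(X)))
```

Each step touches only two blocks of columns and one block of the result,
so the pieces could also be written to disk rather than held in memory.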


This is possible; however, why is something like this not implemented
internally in the cor() function if it scales poorly due to the large
memory requirements?


Because nobody ever really needed it?

Seriously, optimizing something like this is machine dependent, and R-core 
probably has higher priorities.


cor() provides lots of options - it handles NAs, for example - and it is 
probably not worth the trouble to try to optimize over those options. The 
calculation sans NAs is a simple one and can be done using the built in 
BLAS (as crossprod() does), which BLAS can in turn be tuned to the machine 
used. So, if your environment has a tuned or multithreaded BLAS, you might 
be better off to use crossprod() and scale the result.
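
A minimal sketch of the crossprod() route, assuming a matrix with no missing
values: standardize the columns, take the crossproduct, and divide by n - 1.
The small random matrix is only there to check the result against cor().

```r
# Toy data with no NAs
set.seed(1)
X <- matrix(rnorm(200 * 6), nrow = 200)

# Center each column and scale it by its standard deviation (n - 1 divisor)
Z <- scale(X)

# The crossproduct of standardized columns, divided by n - 1, is the
# Pearson correlation matrix; crossprod() is handled by the BLAS
R <- crossprod(Z) / (nrow(X) - 1)

print(all.equal(R, cor(X), check.attributes = FALSE))
```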






---

BTW, R already has the necessary machinery to calculate the crossproduct
matrix (etc) needed to find the correlations. You can access the low level
linear algebra that R uses. You can marry R to an optimized BLAS if you
like.

So pulling in some other code to do this will not save you anything. If
you ever do decide to import C[++] code there is excellent documentation
in the Writing R Extensions manual, which you should review before
attempting to import C++ code into R.


Thanks, I have seen this and it seemed quite technical to use as a
starting point for someone unfamiliar with both C++ and incorporating
C++ code into R.



Well, in that case the path of least resistance is to start the process 
when you leave for the night and pick up the results the next morning.



HTH,

Chuck

Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] [R-pkgs] R CMD check on window XP

2008-12-16 Thread Prof Brian Ripley

On Mon, 15 Dec 2008, Shu Chen wrote:


Hi, there,

I used R CMD check to build my ATGGS package under Windows XP. My R
version is 2.7.2, but I encountered some problems. The log file looks like this:
**
installing R.css in C:/ATGGS.Rcheck


-- Making package ATGGS 
 adding build stamp to DESCRIPTION
 installing R files
 installing inst files
find: `C:/ATGGS.Rcheck/ATGGS/csvscripts': Permission denied
make[2]: *** [C:/ATGGS.Rcheck/ATGGS/inst] Error 1
make[1]: *** [all] Error 2
make: *** [pkg-ATGGS] Error 2
Can't read C:/ATGGS.Rcheck/ATGGS/auxData: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/auxData: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/csvData: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/csvData: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/csvscripts: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS/csvscripts: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't read C:/ATGGS.Rcheck/ATGGS/doc: Invalid argument at 
c:\R\R-27~1.2/bin/INSTALL line 434
Can't remove directory C:/ATGGS.Rcheck/ATGGS: Directory not empty at 
c:\R\R-27~1.2/bin/INSTALL line 434
*** Installation of ATGGS failed ***

Removing 'C:/ATGGS.Rcheck/ATGGS'



I am not able to delete c:/ATGGS.Rcheck until I change the permissions of
the folder. I'm the admin of the C drive, and I have full control of all
other folders on it.



Please sort out the permissions in your source directory, including under 
ATGGS/inst.  Something there has incorrect permissions that are confusing 
the Cygwin tools used.  (It might be worth checking ownership too: I've 
seen similar problems on a drive shared between XP and Vista where they 
disagreed about ownership.)


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Find all numbers in a certain interval

2008-12-16 Thread Antje

Hi David,

thanks a lot for your proposal. I got a lot of useful hints from all of you :-)

David Winsemius schrieb:


It's not entirely clear what you are asking for, since
which(within.interval(a, -0.5, 0.5)) is actually longer than
which(a > -0.5 & a < 0.5).


Right but in case 'a' is something with a long name and '0.5' is a variable you 
might end up with something like this (for the data frame example):


DF[which( DF$myReallyLongColumnName > -myReallyLongThreshold &
          DF$myReallyLongColumnName < -myReallyLongThreshold ), ]


instead of:

DF[which( within.interval(DF$myReallyLongColumnName, myReallyLongThreshold), ]

You mention that you want a solution that applies to
dataframes. Using indexing you can get entire rows of dataframes that
satisfy multiple conditions on one of its columns:

> DF <- data.frame(a = rnorm(20), b = LETTERS[1:20], c = letters[20:1],
+                  stringsAsFactors = FALSE)

> DF[which( DF$a > -0.5 & DF$a < 0.5 ), ]
> # note that one needs to avoid DF[which(a > -0.5 & a < 0.5) , ]
> # the "a" vector is not the same as the "a" column vector within DF
             a b c
3  -0.47310672 C r
6  -0.49784460 F o
9   0.02571058 I l
10  0.16893759 J k
11 -0.11963322 K j
12  0.39378887 L i
16  0.03712263 P e

Could get the indices that satisfy more than one condition:

> which(DF$a > 0.5 & DF$b < "K")
[1]  1  2  6 10

Or you can get rows of DF that satisfy conditions on multiple columns
with the subset function:

> subset(DF, a > 0.5 & b < "K")
           a b c
1  2.2500997 A t
2  0.7251357 B s
6  0.7845355 F o
10 1.0685649 J k

Or if you wanted a within.interval function:

> within.interval <- function(x,a,b) { x > a & x < b}

> which(within.interval(DF$a, -0.5, 0.5))
[1]  3  4  7  8  9 13 14 17 20







Re: [R] Application b-spline basis for polynomial splines

2008-12-16 Thread Charles C. Berry

On Mon, 15 Dec 2008, ARIF WIJAYA wrote:

Hi everybody, does anyone have a simple application of B-splines in the R
language? I need it to help me understand B-spline bases for
polynomial splines.



Try this:


library(splines)
example(bs)
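
Beyond the packaged example, a small sketch along these lines may help.
The data and the choice of df = 6 are arbitrary assumptions made for
illustration. It builds a cubic B-spline basis with bs(), checks that the
basis functions sum to one at every x (which holds when intercept = TRUE),
and then uses the basis as regressors in lm().

```r
library(splines)

# Evaluation grid on [0, 1]
x <- seq(0, 1, length.out = 101)

# Cubic B-spline basis with 6 basis functions, including the intercept column
B <- bs(x, df = 6, degree = 3, intercept = TRUE)

# With intercept = TRUE the basis functions form a partition of unity
row_sums <- rowSums(B)

# A polynomial spline is a linear combination of the basis columns, so it
# can be fitted by ordinary least squares
y <- sin(2 * pi * x)
fit <- lm(y ~ bs(x, df = 6))

# Plot the basis functions themselves
matplot(x, B, type = "l", lty = 1, ylab = "B-spline basis")
```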



Did you read the posting guide?

There are some terrific hints about how to learn about things -- like R's 
splines capabilities --- in the 'Do Your Homework' section.



HTH,

Chuck


Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Cliff Behrens

Peter,

I've inserted response inline below:

Cliff

Peter Dalgaard wrote:

Cliff Behrens wrote:

Peter,

OK...here is reproducible, self-contained code:

library(gregmisc)


Relying on a 3rd party package is not kosher either... Whatever did
list("NA"=2)  or l <- list(2); names(l) <- "NA" do to you?

I'm not sure what you mean by 3rd party?  I downloaded this package 
from the CRAN site where I get all others.  I don't understand your 
question.

columnNames <- c("A","B","C","D","N","a","b","c")
namePerms <- permutations(length(columnNames),2,columnNames,repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" is interpreted to mean that the
                             ## name is missing for one list in dataList


If you inspect the contents of dataList, you will find the following 
showing that the name NA is treated differently:


Anyways  As I thought:

Remember that NA is a reserved word. You get the same kind of reaction
if you name an element "for" or "in". It denotes that you need to
quote the name for indexing with $:


I thought that since all of the names in nameList were of type character,
there was no need to enclose them in quotation marks.

> names(l) <- "NA"
> l$NA
Error: unexpected numeric constant in "l$NA"
> l$`NA`
[1] 2
> l$"NA"
[1] 2
> l[["NA"]]
[1] 2
> names(l)
[1] "NA"
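
To make the point concrete, here is a small self-contained sketch with three
made-up names. It shows that the string "NA" is stored as an ordinary name
(not as a missing value) and how the element can be reached.

```r
# Three hypothetical two-letter names, one of which is the string "NA"
nameList <- c("Na", "NA", "Nb")
dataList <- vector("list", length(nameList))
names(dataList) <- nameList

# The name is the character string "NA", not a missing value
has_missing_name <- any(is.na(names(dataList)))

# Access by string works as usual; with $ the name must be quoted or
# backticked because NA is a reserved word
dataList[["NA"]] <- 42
via_backtick <- dataList$`NA`

# print() shows the element as $`NA`, which is only quoting, not a sign
# that the name is missing
print(names(dataList))
```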


..

$Na
NULL

$`NA`
NULL

$Nb
NULL
.
Peter Dalgaard wrote:

Cliff Behrens wrote:
I want to generate a list (called dataList below) where each of
its levels is named.  These names are assigned to nameList, which
contains all possible permutations of size two taking letters from
a larger alphabet, e.g., "aa",...,"Fd",..,"Z1",...  One of these
permutations is the character string "NA".  It seems that when I
try to name one of the dataList levels "NA", using
names(dataList) <- nameList, the names() function assigns the
missing character to the level.  Is there some way to preserve "NA"
as the name of a level in dataList?  Here is the R code I have been
using to do this.


namePerms <-
permutations(ncol(coinMat),2,colnames(coinMat),repeats=TRUE)

nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList ## The "NA" in nameList is
## interpreted so that the name "NA" is missing for one level in dataList


I am running R 2.4.1 in the Windows XP environment.

Thanks for any help that can be offerred.


Your example is not reproducible and self-contained. What is 
permutations and coinMat??


I bet it isn't minimal either.

It doesn't seem to be happening for me with a recent(!) version of 
R, but you could just be misinterpreting the backtick quoting.


-
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 
35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 
35327907









[R] Beta Conjugate Prior for Random intercept model -WInBUGS

2008-12-16 Thread Anamika Chaudhuri
I have been using the following random intercept model with non-informative
prior:
model {
for (i in 1:n.samples) {
vomit[i] ~ dbern(p[i])
logit(p[i]) <- beta0 + alpha[siteid[i]]
}
for (j in 1:n.sites) {
alpha[j]~dnorm(0,tau)
}
beta0 ~ dnorm(0.0,1.0E-6)
tau ~ dgamma(0.01,0.01)
}
list(n.samples=3780,n.sites=63)

How could I use a beta conjugate prior for the same model so that
p(i) ~ dbeta(alpha,beta)?

Thanks for your help.




Re: [R] Find all numbers in a certain interval

2008-12-16 Thread Duncan Murdoch

Antje wrote:

Hi,

sorry, but it shouldn't be different. The result should be the same but I was 
looking if there is a method I can use...


# having a function defined like baptiste proposed:
isIn <-
function (interval, x)
{
 (x > min(interval)) & (x < max(interval))
}
  


Along the lines I suggested before, I'd suggest a function ordered(...) 
(or increasing()?) that could be called as


ordered(-0.5, x, 0.5)

If you do write this, be careful about how you handle recycling of values.
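
A minimal sketch of what such a function might look like. The name
increasing() is used instead of ordered() to avoid masking base R's
ordered(), and the recycling rule chosen here (every argument is recycled to
the length of the longest) is an assumption, not something specified in
this thread.

```r
# Hypothetical helper: TRUE where the arguments are strictly increasing,
# compared element by element after recycling to a common length
increasing <- function(...) {
  args <- list(...)
  stopifnot(length(args) >= 2)
  n <- max(lengths(args))
  args <- lapply(args, rep_len, length.out = n)   # explicit recycling
  ok <- rep(TRUE, n)
  for (k in seq_len(length(args) - 1)) {
    ok <- ok & (args[[k]] < args[[k + 1]])
  }
  ok
}

x <- c(-1, -0.2, 0.3, 0.5, 2)
inside <- increasing(-0.5, x, 0.5)
print(which(inside))
```

Recycling silently to the longest length is convenient for scalar bounds but
can hide mistakes when two long vectors differ in length. A stricter version
might refuse lengths other than 1 or n.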

Duncan Murdoch

#--


a <- rnorm(100)

# it's simply more human readable if I can write

which( isIn( c(-0.5, 0.5), a) )

# instead of

which( a > -0.5 & a < 0.5 )

Thanks to baptiste! So there is no method available doing this and I have to 
define this by myself. That's all I wanted to know :-)


Antje


markle...@verizon.net schrieb:
  
hi:  could you explain EXACTLY what you want to do with the dataframe 
because it shouldn't be that different ?




On Tue, Dec 16, 2008 at  5:09 AM, Antje wrote:



Hi all,

I'd like to know, if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)

# would give me all indices of numbers greater than -0.5 and smaller
# than +0.5


I have something similar with a dataframe and it produces sometimes 
quite long commands...

I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?


Antje



Re: [R] Sorting a date vector

2008-12-16 Thread David Winsemius


On Dec 16, 2008, at 8:49 AM, Prof Brian Ripley wrote:


On Tue, 16 Dec 2008, David Winsemius wrote:

You cannot keep them as strings and still get the benefits of  
working with date-class objects. You should read more documentation  
regarding dates. The


You can: order() will work on the Date class and the ordering can be  
applied to the original data.


Got it. Worked examples:

> dts <- c("10-02-2008", "10-03-2008", "10-06-2008", "10-07-2008",
+ "10-09-2008", "12-09-2008", "12-10-2008", "12-11-2008", "12-12-2008",
+ "12-15-2008", "4-18-2008", "4-21-2008", "4-22-2008", "4-23-2008")

> order(as.Date(dts, format = "%m-%d-%Y"))
 [1] 11 12 13 14  1  2  3  4  5  6  7  8  9 10
> rev(order(as.Date(dts, format = "%m-%d-%Y")))
 [1] 10  9  8  7  6  5  4  3  2  1 14 13 12 11

> dts[rev(order(as.Date(dts, format = "%m-%d-%Y")))]
 [1] "12-15-2008" "12-12-2008" "12-11-2008" "12-10-2008" "12-09-2008"
 [6] "10-09-2008" "10-07-2008" "10-06-2008" "10-03-2008" "10-02-2008"
[11] "4-23-2008"  "4-22-2008"  "4-21-2008"  "4-18-2008"
> dts[order(as.Date(dts, format = "%m-%d-%Y"))]
 [1] "4-18-2008"  "4-21-2008"  "4-22-2008"  "4-23-2008"  "10-02-2008"
 [6] "10-03-2008" "10-06-2008" "10-07-2008" "10-09-2008" "12-09-2008"
[11] "12-10-2008" "12-11-2008" "12-12-2008" "12-15-2008"

I still suggest that RON70 educate himself further regarding the Date  
class and formats.


--
David Winsemius





as.Date function turns strings into a form that is stored  
internally as the number of days since some reference date, and what you  
are seeing is the default display format, "%Y-%m-%d". Learn how to  
use the output formats so that you see what you desire.


?as.Date
?Dates
?format.Date
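
A small sketch of the output-format route, using a three-element vector made up for illustration: the values are sorted as Date objects and then printed back with format(). Note that "%m" writes a zero-padded month, so "4-18-2008" comes back as "04-18-2008"; if the original strings must be kept character for character, indexing the input with order() is the safer choice.

```r
# Illustrative input; the real date_file vector is much longer.
dts <- c("10-02-2008", "4-18-2008", "12-15-2008")

# Sort as Date objects, then display in month-day-year form again.
d <- sort(as.Date(dts, format = "%m-%d-%Y"))
shown <- format(d, "%m-%d-%Y")

# Alternative that keeps the original strings untouched.
kept <- dts[order(as.Date(dts, format = "%m-%d-%Y"))]
```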

--
David Winsemius


On Dec 16, 2008, at 8:24 AM, RON70 wrote:

Yes, you are right. However, with that code the format of the date is  
altered. I need to maintain the same format as the input data,  
i.e. 10-02-2008 not 2008-10-02,

while still having a date class. Any better idea?
David Winsemius wrote:

You might want to look at your date format more closely. Both the
separator and the year format specs fail to match your input.

> as.Date("10-02-2008", format = "%m/%d/%y")
[1] NA

> as.Date("10-02-2008", format = "%m-%d-%Y")
[1] "2008-10-02"
--
David Winsemius
On Dec 16, 2008, at 7:54 AM, RON70 wrote:

I have a date-like-vector like :

date_file

10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
10-10-2008 10-13-2008 10-14-2008 10-15-2008
10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
10-23-2008 10-24-2008 10-28-2008 10-29-2008
10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
11-06-2008 11-07-2008 11-10-2008 11-11-2008
11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
11-19-2008 11-20-2008 11-21-2008 11-24-2008
11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
12-03-2008 12-04-2008 12-05-2008 12-08-2008
12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
4-18-2008  4-21-2008  4-22-2008  4-23-2008
4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
5-06-2008  5-07-2008  5-09-2008  5-12-2008
5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
5-20-2008  5-21-2008  5-22-2008  5-23-2008
5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
6-03-2008  6-05-2008  6-06-2008  6-09-2008
6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
6-18-2008  6-19-2008  6-20-2008  6-23-2008
6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
7-02-2008  7-04-2008  7-07-2008  7-08-2008
7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
7-18-2008  7-21-2008  7-22-2008  7-23-2008
7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
8-01-2008  8-04-2008  8-05-2008  8-06-2008
8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
8-15-2008  8-18-2008  8-19-2008  8-20-2008
8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
8-28-2008  8-29-2008  9-03-2008  9-04-2008
9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
9-12-2008  9-15-2008  9-16-2008  9-17-2008
9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
9-25-2008  9-26-2008  9-29-2008  9-30-2008
I wanted to sort this in ascending order. I tried simply using the sort()
function, without altering the format of the dates, but it did not work.
Next I tried to convert the vector to a date-class vector so that I could
sort it, but in vain :(
I used :
as.Date(date_file, format="%m/%d/%y")
However, it did not work.
Can anyone please tell me what would be the correct approach?

[R] stably updating the SD

2008-12-16 Thread John Christie

Hi,

I have some summary data from which I know a few points and I'd like  
to remove them.  Does anyone know if there is an R module that  
implements something like Hanson (1975)?


Hanson (1975).  Stably updating mean and standard deviation of data.   
Communications of the ACM, 18(1), 57-58.
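
For reference, a minimal sketch of removing one known observation from a summary consisting of n, mean and SD. It reverses the usual one-pass (Welford-style) update and is not claimed to be Hanson's exact formulation; the function name remove_point and the example data are made up for illustration.

```r
# Remove a single known value x from a summary (n, mean m, sd s).
# Uses M2 = sum of squared deviations = (n - 1) * s^2 and the identity
#   M2_without_x = M2 - (x - m) * (x - m_without_x).
remove_point <- function(n, m, s, x) {
  stopifnot(n > 2)
  m2 <- (n - 1) * s^2
  m_new <- (n * m - x) / (n - 1)
  m2_new <- m2 - (x - m) * (x - m_new)
  # Guard against tiny negative values caused by rounding.
  list(n = n - 1, mean = m_new, sd = sqrt(max(m2_new, 0) / (n - 2)))
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
res <- remove_point(length(x), mean(x), sd(x), 9)
```

Several points can be removed by calling the function repeatedly, feeding each result into the next call.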




[R] pwr.prop.test and continuity correction

2008-12-16 Thread Daniel Brewer
Hi,

I am trying to sort out a discrepancy in power calculation results
between me and another statistician.  I use R but I am not sure what she
uses.  It is on the proportions test and so I have been using
pwr.prop.test.  I think I have tracked the problem down to pwr.prop.test
not using the continuity correction for the test (I did this by using
the java applet from
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/power.prop.test.html).

So I was wondering whether:
1) Someone could confirm that pwr.prop.test does not use a continuity
correction in its calculation.
2) Someone could tell me either how to use pwr.prop.test or another
function to get the power of a prop.test with continuity correction.
The reason I want this is that I would normally apply the correction
when I actually used the test.
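
One way to look at the effect of the correction, independent of any closed-form power function, is simulation with prop.test() itself. The sketch below is only that: the function name sim_power, the sample size, the proportions, the number of simulations and the seed are all assumptions chosen for illustration.

```r
# Estimate the power of a two-sample prop.test() by simulation,
# with or without the continuity correction.
sim_power <- function(n, p1, p2, correct, nsim = 500, alpha = 0.05, seed = 42) {
  set.seed(seed)  # same seed, so both settings see identical simulated data
  x1 <- rbinom(nsim, n, p1)
  x2 <- rbinom(nsim, n, p2)
  pvals <- mapply(function(a, b) {
    suppressWarnings(prop.test(c(a, b), c(n, n), correct = correct)$p.value)
  }, x1, x2)
  # Treat an undefined p-value (e.g. both counts zero) as a non-rejection.
  mean(!is.na(pvals) & pvals < alpha)
}

power_corrected <- sim_power(50, 0.3, 0.6, correct = TRUE)
power_uncorrected <- sim_power(50, 0.3, 0.6, correct = FALSE)
```

Because the corrected statistic is never larger than the uncorrected one on the same data, the corrected estimate cannot exceed the uncorrected one here; the size of the gap is what the simulation shows.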

Many thanks

Dan

-- 
**
Daniel Brewer, Ph.D.

Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.bre...@icr.ac.uk
**



Re: [R] OT: (quasi-?) separation in a logistic GLM

2008-12-16 Thread vito muggeo

dear Gavin,
I do not know whether such comment may be still useful..

Why are you unsure about quasi-separation?
I think that it is quite evident in the plot

plot(analogs ~ Dij, data = dat)

Also it may be useful to see the plot of the monotone (profile) deviance 
(or the log-lik) for the coef of Dij,


xval <- seq(-20, 0, length.out = 50)
ll <- numeric(50)
for (i in seq_along(xval)) {
  # fix the coefficient of Dij at xval[i] via an offset and refit
  mod <- glm(analogs ~ offset(xval[i]*Dij), data = dat, family = binomial)
  ll[i] <- mod$deviance
}

plot(xval, ll)

Hope this helps you,

vito

Gavin Simpson ha scritto:

Dear List,

Apologies for this off-topic post but it is R-related in the sense that
I am trying to understand what R is telling me with the data to hand.

ROC curves have recently been used to determine a dissimilarity
threshold for identifying whether two samples are from the same type
or not. Given the bashing that ROC curves get whenever anyone asks about
them on this list (and having implemented the ROC methodology in my
analogue package) I wanted to try directly modelling the probability
that two sites are analogues for one another for given dissimilarity
using glm().

The data I have then are a logical vector ('analogs') indicating whether
the two sites come from the same vegetation and a vector of the
dissimilarity between the two sites ('Dij'). These are in a csv file
currently in my university web space. Each 'row' in this file
corresponds to single comparison between 2 sites.

When I analyse these data using glm() I get the familiar "fitted
probabilities numerically 0 or 1 occurred" warning. The data do not look
linearly separable when plotted (code for which is below). I have read
Venables and Ripley's discussion of this in MASS4 and other sources that
discuss this warning and R (Faraway's Extending the Linear Model with R
and John Fox's new Applied Regression, Generalized Linear Models, and
Related Methods, 2nd Ed) as well as some of the literature on Firth's
bias reduction method. But I am still somewhat unsure what
(quasi-)separation is and if this is the reason for the warnings in this
case.

My question then is, is this a separation issue with my data, or is it
quasi-separation that I have read a bit about whilst researching this
problem? Or is this something completely different?

Code to reproduce my problem with the actual data is given below. I'd
appreciate any comments or thoughts on this.

 Begin code snippet 

## note data file is ~93Kb in size
dat <- read.csv(url("http://www.homepages.ucl.ac.uk/~ucfagls/dat.csv"))
head(dat)
## fit model --- produces warning
mod <- glm(analogs ~ Dij, data = dat, family = binomial)
## plot the data
plot(analogs ~ Dij, data = dat)
fit.mod <- fitted(mod)
ord <- with(dat, order(Dij))
with(dat, lines(Dij[ord], fit.mod[ord], col = "red", lwd = 2))

 End code snippet ##

Thanks in advance

Gavin


--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo



Re: [R] Check if data frame column is numeric

2008-12-16 Thread Wacek Kusnierczyk
from ?apply:

 If 'X' is not an array but has a dimension attribute, 'apply' attempts
to coerce it to an array via 'as.matrix' if it is two-dimensional (e.g.,
data frames) or via 'as.array'.

if any of the columns in your dataframe is not numeric, apply will
coerce all of them to a common type (typically character) when it builds
the matrix, and you'll get FALSE for each column;  this is not the case
with sapply, which looks at each column as it is stored.

d1 = data.frame(x=numeric(10), y=numeric(10))
d2 = data.frame(d1, z=character(10))

apply(d1, 2, is.numeric)
# TRUE TRUE
apply(d1, 2, function(x) is.numeric(x))
# same as above, redundant code
sapply(d1, is.numeric)
# TRUE TRUE

apply(d2, 2, is.numeric)
# FALSE FALSE FALSE
sapply(d2, is.numeric)
# TRUE TRUE FALSE
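
To then apply a function only to the numeric columns, the logical vector from sapply can be used as a column index. A minimal sketch with a made-up data frame (the column names and the choice of mean are illustrative only):

```r
# Made-up example data: two numeric columns and one character column.
df <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"), c = c(10, 20, 30))

# Which columns are numeric, checked column by column.
is_num <- sapply(df, is.numeric)

# Apply the function to the numeric columns only.
col_means <- sapply(df[is_num], mean)
```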

vQ




Mark Heckmann wrote:
 Hi R-users,

 I want to apply a function to each column of a data frame that is numeric.
 Thus I tried to check it for each column first: 

   
 apply(df, 2, function(x) is.numeric(x))
 

  A60   A64  A66a   A67   A71  A75a   A80
 A85   A91   A95   A96   A97   A98   A99 
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE

 I get only FALSE results although the variables are numeric. When I try the
 following it works:

   
 is.numeric(df$A60)
 
 [1] TRUE

 What am I doing wrong?





[R] Change in Lattice bwplot?

2008-12-16 Thread Fredrik Karlsson
Dear list,

Sorry for asking this question, but has something changed in the
syntax for bwplot in Lattice? In an old publication, I used

 bwplot( VOTMS ~gender |type * group,
 data=merge(vot,words,by="ord"),
 nint=30,
 horizontal=F,
 layout=c(3,3),
 box.ratio=0.8)

which produced a lovely 3x3 lattice plot with one box/gender in each
panel. Now, I try

 bwplot( SyllableNucleusDiff ~ SourceLanguage,data=Hstar,horizontal=F)

to get just a simple 1x1 panel plot with groups (which I will then of
course make into a panel plot by adding | factor1 +factor2...), but I
get a Syntax error.
So, has anything changed, or am I just doing something very silly?

/Fredrik

-- 
Life is like a trumpet - if you don't put anything into it, you don't
get anything out of it.



Re: [R] sliding window over a large vector

2008-12-16 Thread Gabor Grothendieck
On Tue, Dec 16, 2008 at 8:23 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 There seems to be something wrong:

 slide(c(1, 1, 0, 1), 2)
 [1] 2 2

 but the output should be c(2, 1, 2)

That should be c(2, 1, 1)


 At any rate try this:

 library(zoo)
 3 * rollmean(x, 3)
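
 A base-R alternative that avoids the loop is to difference a cumulative
 sum. This is only a sketch; the function name slide_sum is made up here,
 and it returns one value per full window (length(x) - window + 1 values).

```r
# Sliding-window sums via a cumulative sum: the sum over x[i:(i+window-1)]
# equals cs[i + window] - cs[i], where cs = c(0, cumsum(x)).
slide_sum <- function(x, window) {
  stopifnot(window >= 1, window <= length(x))
  n <- length(x)
  cs <- c(0, cumsum(x))
  cs[(window + 1):(n + 1)] - cs[1:(n - window + 1)]
}

counts <- slide_sum(c(1, 1, 0, 1), 2)
```

 For a 0/1 vector of 12 million elements the cumulative sum stays far
 below the range where doubles lose integer precision, so the differences
 are exact.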


 On Mon, Dec 15, 2008 at 11:19 PM, Chris Oldmeadow
 c.oldmea...@student.qut.edu.au wrote:
 Hi all,

 I have a very large binary vector, I wish to calculate the number of 1's
  over sliding windows.

 this is my very slow function

 slide<-function(seq,window){
  n<-length(seq)-window
  tot<-c()
  tot[1]<-sum(seq[1:window])
  for (i in 2:n) {
     tot[i]<- tot[i-1]-seq[i-1]+seq[i]
  }
  return(tot)
 }

 this works well for reasonably sized vectors. Does anybody know a way
 for large vectors ( length=12 million)? I'm trying to avoid using C.

 Thanks,
 Chris



Re: [R] Check if data frame column is numeric

2008-12-16 Thread Bert Gunter
... and an addendum

Hadley Wickham's plyr package attempts to redress these (nevertheless
documented) apparent inconsistencies in the *apply family of functions by
handling everything in a more consistent, intuitive manner. You may wish to
use its functions instead of the base R *apply functions.

-- Bert Gunter
Genentech



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Henrique Dallazuanna
Sent: Tuesday, December 16, 2008 9:32 AM
To: Mark Heckmann
Cc: r-help@r-project.org
Subject: Re: [R] Check if data frame column is numeric

Try:

sapply(df, is.numeric)

On Tue, Dec 16, 2008 at 1:25 PM, Mark Heckmann mark.heckm...@gmx.de wrote:

 Hi R-users,

 I want to apply a function to each column of a data frame that is numeric.
 Thus I tried to check it for each column first:

  apply(df, 2, function(x) is.numeric(x))

 A60   A64  A66a   A67   A71  A75a   A80
 A85   A91   A95   A96   A97   A98   A99
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE

 I get only FALSE results although the variables are numeric. When I try
the
 following it works:

  is.numeric(df$A60)
 [1] TRUE

 What am I doing wrong?

 TIA
 Mark





-- 
Henrique Dallazuanna
Curitiba-Parana-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] OT: (quasi-?) separation in a logistic GLM

2008-12-16 Thread Ioannis Kosmidis
Dear Gavin,

glm reported exactly what it noticed, giving a warning that some very  
small fitted probabilities have been found.
However, your data are **not** quasi-separated. The maximum likelihood  
estimates are really those reported by glm.

A first elementary way is to change the tolerance and maximum number  
of iterations in glm and see if you get the same result:
#
  mod1

Call:  glm(formula = analogs ~ Dij, family = binomial, data = dat,  
control = glm.control(epsilon = 1e-16,  maxit = 1000))

Coefficients:
(Intercept)  Dij
   4.191  -29.388

Degrees of Freedom: 4033 Total (i.e. Null);  4032 Residual
Null Deviance:  1929
Residual Deviance: 613.5    AIC: 617.5
#
This is exactly the same fit as the one you have. If separation had  
occurred, the estimated effects would usually diverge as glm is allowed  
more iterations.
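
For contrast, a sketch of what this check looks like when the data really are separated. The six-point data set below is made up for illustration (it is not Gavin's data), and the particular maxit and epsilon values are arbitrary.

```r
# Toy data with complete separation: y is 0 for x <= 3 and 1 for x >= 4.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

# Same model, few versus many permitted iterations (warnings are expected).
fit_few <- suppressWarnings(
  glm(y ~ x, family = binomial, control = glm.control(maxit = 10)))
fit_many <- suppressWarnings(
  glm(y ~ x, family = binomial,
      control = glm.control(epsilon = 1e-16, maxit = 100)))

# Under separation the slope keeps growing instead of settling down.
slope_few <- unname(coef(fit_few)["x"])
slope_many <- unname(coef(fit_many)["x"])
```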

**
Secondly, an inspection of the estimated asymptotic standard errors  
reveals nothing to worry about.
#
  summary(mod1)

Call:
glm(formula = analogs ~ Dij, family = binomial, data = dat, control =  
glm.control(epsilon = 1e-16,
 maxit = 1000))

Deviance Residuals:
Min  1Q  Median  3Q Max
-1.676e+00  -1.319e-02  -1.250e-04  -1.958e-06   4.104e+00

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   4.1912     0.3248   12.90   <2e-16 ***
Dij         -29.3875     1.9345  -15.19   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

 Null deviance: 1928.62  on 4033  degrees of freedom
Residual deviance:  613.53  on 4032  degrees of freedom
AIC: 617.53

Number of Fisher Scoring iterations: 11
#
If separation had occurred, the estimated asymptotic standard errors  
would be unnaturally large. This is because, in the case of separation  
(quasi or not), glm would calculate the standard errors by taking the  
sqrt of the diagonal elements of the inverse of minus the Hessian of the  
log-likelihood, at a point where the log-likelihood appears to be flat  
for the given tolerance.

**
To be certain, you could also try fitting with brglm, which is  
guaranteed to give finite estimates, that have bias of smaller order  
than the MLE and compare the results.
#
> library(brglm)
> mod.br <- brglm(analogs ~ Dij, data = dat, family = binomial)
> mod.br

Call:  brglm(formula = analogs ~ Dij, family = binomial, data = dat)

Coefficients:
(Intercept)  Dij
   4.161  -29.188

Degrees of Freedom: 4033 Total (i.e. Null);  4032 Residual
Deviance:   613.5448
Penalized Deviance: 610.2794    AIC: 617.5448
#
The estimates are similar, only a bit shrunk towards the origin, which  
is natural for bias removal. If separation had occurred, and given the  
previous discussion, the bias-reduced estimates would be considerably  
different from the estimates that glm reports.

**
Lastly, the more certain way to check for separation is to inspect the  
profiles of the log-likelihood. Vito suggested this but the chosen  
limits for the xval are not appropriate. If separation had occurred, the  
estimate would be -Inf, so the profiling as done in his email  
should start from, for example, -40 rather than -20. This  
would reveal that the profile deviance starts increasing again, while  
if separation had occurred there would be an asymptote on the left. Below I  
give the correct profiles, as reported by profileModel.
> library(profileModel)
> pp <- profileModel(mod1, quantile = qchisq(0.95, 1), objective =  
+ "ordinaryDeviance")
Preliminary iteration .. Done

Profiling for parameter (Intercept) ... Done
Profiling for parameter Dij ... Done
  plot(pp)
The profiles are quite quadratic. In the case of separation you would  
have seen asymptotes on the left or on the right (see  
help(profileModel) for an example).

**
It appears that the fitted logistic curve, while steep still has a  
finite gradient, for example, at the LD50 point
  library(MASS)
  dose.p(mod)
   Dose  SE
p = 0.5: 0.1426167 0.003646903
When separation occurs the LD50 point cannot be identified (computer  
software would return something with enormous estimated standard error).

In conclusion, if you get data sets that result in large estimated  
effects on the log-odds scale, the above checks can be used to  
establish whether separation occurred or not. If there is  
separation (not the case in the current example) then you could use  
an alternative to maximum likelihood for estimation, such as  
penalized maximum likelihood in brglm, which always returns finite  
estimates. In that case, though, I suggest you reflect the  
uncertainty about how large the estimated effects are by reporting  
confidence intervals with one infinite endpoint, for example  
confidence intervals as in help(profile.brglm).

Hope this helps,

Best wishes,

Ioannis


On 15 Dec 2008, at 18:03, Gavin Simpson wrote:

 Dear List,

 

Re: [R] Find all numbers in a certain interval

2008-12-16 Thread Duncan Murdoch

Antje wrote:

Hi all,

I'd like to know, if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)

# would give me all indices of numbers greater than -0.5 and smaller than +0.5

I have something similar with a dataframe and it produces sometimes quite long 
commands...

I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?

Not in general, but in this particular case abs(a) < 0.5 gives you the 
right result.


By the way, some advice I read many years ago (in Kernighan and 
Plauger):  always use < or <=, avoid > or >= in multiple comparisons.  
It's easier to read


-0.5 < a & a < 0.5

than it is to read the form you used, because it is so much like the 
math notation -0.5 < a < 0.5.
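
A short check of the two forms on a made-up vector. R does not support the chained mathematical form directly, so the two comparisons have to be joined with &; for an interval that is symmetric about zero, abs() gives the same indices.

```r
# Made-up values for illustration.
a <- c(-0.7, -0.2, 0, 0.4, 0.9)

# Both comparisons written with <, in left-to-right order.
idx_pair <- which(-0.5 < a & a < 0.5)

# Shortcut that only works because the interval is symmetric about 0.
idx_abs <- which(abs(a) < 0.5)
```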


Duncan Murdoch



Re: [R] R2winbugs : vectorization

2008-12-16 Thread Vitalie Spinu

I remember having a similar problem with the inprod function. As far as I can 
remember, the sole difference in my models was that I used inprod instead of an 
explicit sum (exactly as you did). In my case the inprod version was faster but 
the results were completely aberrant. So I abandoned inprod as unreliable.

I did use OpenBugs (it's a newer version of WinBugs) and the BRugs interface from R.

On Mon, 15 Dec 2008 18:23:39 +0100, Philip A. Viton vito...@osu.edu wrote:



I'm new to bugs, so please bear with me. Can someone tell me if the  
following two models are doing the same thing? The reason I ask is that  
with the same data, the first (based on 4 separate coeffs a1--a4) takes  
about 50 secs, while the second (based on a vectorized form, a[]) takes  
about 300. The means are about the same, though R-hat's in the second  
version are quite a bit better.



(Also, and completely unrelated: is there any way to get more than 2  
decimal places in the display of the means?)



Thanks!!



Here are the two models: (these are censored regressions, the first is  
essentially a copy of code in Gelman+Hill):


= model 1 : 4 separate a's
model{
  for (i in 1:n){
    z.lo[i] <- C * equals(y[i],C)
    z[i] ~ dnorm(z.hat[i],tau.y)I(z.lo[i],)
    z.hat[i] <- a1*x[i,1]+a2*x[i,2]+a3*x[i,3]+a4*x[i,4]
  }
  a1 ~ dunif(0,100)
  a2 ~ dunif(0,100)
  a3 ~ dunif(0,100)
  a4 ~ dunif(0,100)
  tau.y <- pow(sigma.y,-2)
  sigma.y ~ dunif(0,100)
}


== model 2 : vector of a's
model{
  for (i in 1:n){
    z.lo[i] <- C * equals(y[i],C)
    z[i] ~ dnorm(z.hat[i],tau.y)I(z.lo[i],)
    z.hat[i] <- inprod(a[],x[i,])
  }
  for (j in 1:k){
    a[j] ~ dunif(0,100)
  }
  tau.y <- pow(sigma.y,-2)
  sigma.y ~ dunif(0,100)
}
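
As a plain arithmetic check in R (outside BUGS), the explicit sum in model 1 and the inner product in model 2 define the same linear predictor. The sketch below uses made-up data and coefficients; it says nothing about how the BUGS sampler treats the two forms, which is where the timing and R-hat differences would come from.

```r
# Made-up design matrix (5 rows, 4 predictors) and coefficient vector.
set.seed(1)
x <- matrix(runif(20), nrow = 5, ncol = 4)
a <- c(0.5, 1, 1.5, 2)

# Model 1 style: one term per coefficient.
explicit <- a[1]*x[, 1] + a[2]*x[, 2] + a[3]*x[, 3] + a[4]*x[, 4]

# Model 2 style: inner product of a with each row of x.
vectorised <- as.vector(x %*% a)

same <- isTRUE(all.equal(explicit, vectorised))
```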


and here, for reference, is the R calling code:

x<-as.matrix(iv)
y<-dv
C<-cens
z<-ifelse(y==C,NA,y)
n<-length(z)
data1<-list(x=x,y=y,z=z,n=n,C=C)
inits1<-function(){
   list(a1=runif(1),a2=runif(1),a3=runif(1),a4=runif(1),sigma.y=runif(1))}
params1<-c("a1","a2","a3","a4","sigma.y")

## now the bugs call for model 1
proc.time()
aasho.1<-bugs(data1,inits1,params1,"aasho1.bug",n.iter=1,debug=FALSE)
proc.time()
print(aasho.1,digits=4)

## now we try a vector approach
k<-4 # niv
data2<-list(x=x,y=y,z=z,n=n,C=C,k=k)
inits2<-function(){
   list(a=runif(k),sigma.y=runif(1))}
params2<-c("a","sigma.y")

## now the bugs call for model 2
proc.time()
aasho.2<-bugs(data2,inits2,params2,"aasho2.bug",n.iter=1,debug=FALSE)
proc.time()
print(aasho.2,digits=6)


Philip A. Viton
City Planning, Ohio State University
275 West Woodruff Avenue, Columbus OH 43210
vito...@osu.edu



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Peter Dalgaard

Cliff Behrens wrote:

Peter,

I've inserted response inline below:

Cliff

Peter Dalgaard wrote:

Cliff Behrens wrote:

Peter,

OK...here is reproducible, self-contained code:

library(gregmisc)


Relying on a 3rd party package is not kosher either... Whatever did
list("NA"=2)  or l <- list(2); names(l) <- "NA" do to you?

I'm not sure what you mean by "3rd party"?  I downloaded this package 
from the CRAN site where I get all others.  I don't understand your 
question.


"3rd party" means that you didn't write it and neither did I/we. You are 
requesting people to help you, yet expecting that they go out of their 
way to install a package first. (As it happens, I really don't have 
gregmisc on this machine.) You could easily have created an example of a 
list with "NA" as a name, but that would of course have been work for 
you rather than for people on the list.





columnNames <- c("A","B","C","D","N","a","b","c")
namePerms <- permutations(length(columnNames),2,columnNames,repeats=TRUE)
nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" is interpreted that the name 
## is missing for one list in dataList


If you inspect the contents of dataList, you will find the following 
showing that the name "NA" is treated differently:


Anyways  As I thought:

Remember that NA is a reserved word. You get the same kind of reaction 
if you name an element "for" or "in". It denotes that you need to 
quote the name for indexing with $:


I thought that since all of the names in namesList were type char, there 
was no need to enclose these in quotation marks.


That's not the point. It works fine, it is just that the output is 
showing you how to access the element afterwards.



> names(l) <- "NA"
> l$NA
Error: unexpected numeric constant in "l$NA"
> l$`NA`
[1] 2
> l$"NA"
[1] 2
> l[["NA"]]
[1] 2
> names(l)
[1] "NA"
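
The same point as a small self-contained sketch, extended to the other reserved words mentioned above. The list contents are made up for illustration.

```r
# A list whose element names collide with reserved words.
l <- list(2, 10, 20)
names(l) <- c("NA", "for", "in")

# Backticks (or a quoted string inside [[ ]]) reach the elements.
by_backtick <- l$`NA`
by_string <- l[["NA"]]
for_value <- l$`for`
in_value <- l[["in"]]
```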


..






--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Sorting a date vector

2008-12-16 Thread Prof Brian Ripley

On Tue, 16 Dec 2008, RON70 wrote:



I have a date-like-vector like :


date_file

10-02-2008 10-03-2008 10-06-2008 10-07-2008 10-09-2008
10-10-2008 10-13-2008 10-14-2008 10-15-2008
10-16-2008 10-17-2008 10-20-2008 10-21-2008 10-22-2008
10-23-2008 10-24-2008 10-28-2008 10-29-2008
10-30-2008 10-31-2008 11-03-2008 11-04-2008 11-05-2008
11-06-2008 11-07-2008 11-10-2008 11-11-2008
11-12-2008 11-13-2008 11-14-2008 11-17-2008 11-18-2008
11-19-2008 11-20-2008 11-21-2008 11-24-2008
11-25-2008 11-26-2008 11-28-2008 12-01-2008 12-02-2008
12-03-2008 12-04-2008 12-05-2008 12-08-2008
12-09-2008 12-10-2008 12-11-2008 12-12-2008 12-15-2008
4-18-2008  4-21-2008  4-22-2008  4-23-2008
4-24-2008  4-28-2008  4-29-2008  5-01-2008  5-05-2008
5-06-2008  5-07-2008  5-09-2008  5-12-2008
5-13-2008  5-14-2008  5-15-2008  5-16-2008  5-19-2008
5-20-2008  5-21-2008  5-22-2008  5-23-2008
5-27-2008  5-28-2008  5-29-2008  5-30-2008  6-02-2008
6-03-2008  6-05-2008  6-06-2008  6-09-2008
6-10-2008  6-11-2008  6-12-2008  6-13-2008  6-17-2008
6-18-2008  6-19-2008  6-20-2008  6-23-2008
6-24-2008  6-25-2008  6-26-2008  6-27-2008  7-01-2008
7-02-2008  7-04-2008  7-07-2008  7-08-2008
7-09-2008  7-10-2008  7-11-2008  7-15-2008  7-16-2008
7-18-2008  7-21-2008  7-22-2008  7-23-2008
7-24-2008  7-25-2008  7-28-2008  7-30-2008  7-31-2008
8-01-2008  8-04-2008  8-05-2008  8-06-2008
8-07-2008  8-08-2008  8-11-2008  8-12-2008  8-13-2008
8-15-2008  8-18-2008  8-19-2008  8-20-2008
8-21-2008  8-22-2008  8-25-2008  8-26-2008  8-27-2008
8-28-2008  8-29-2008  9-03-2008  9-04-2008
9-05-2008  9-08-2008  9-09-2008  9-10-2008  9-11-2008
9-12-2008  9-15-2008  9-16-2008  9-17-2008
9-18-2008  9-19-2008  9-22-2008  9-23-2008  9-24-2008
9-25-2008  9-26-2008  9-29-2008  9-30-2008

I wanted to sort this in ascending order. I tried simply using the sort()
function, without altering the format of the dates, but it did not work. Next I
tried to convert the vector to a date-class vector so that I could sort
it, but in vain :(

I used :
as.Date(date_file, format="%m/%d/%y")

However it did not work.


Your separator is '-' not '/', and you have 4-figure years.  Looks like

sort(as.Date(date_file, format="%m-%d-%Y"))

is what you intended.



PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Please do, and remember to be more helpful than 'it did not work'!


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Cliff Behrens
Sorry...I didn't realize that there were such distinct lines drawn 
around core vs contributed packages.  I merely thought that r-help put 
those with questions in touch with others who might have used or 
authored a package and experienced the same problem.  I didn't intend to 
make more work for you or anyone else on this list.  In fact, I was 
merely trying to be thorough and exact, including a note with the 
version of R and the OS I am running.  I have no idea what packages 
others have installed in their R environments.  For future reference, am 
I to assume that no contributed packages should be implicated in 
resolving a problem?


Peter Dalgaard wrote:

Cliff Behrens wrote:

Peter,

I've inserted response inline below:

Cliff

Peter Dalgaard wrote:

Cliff Behrens wrote:

Peter,

OK...here is reproducible, self-contained code:

library(gregmisc)


Relying on a 3rd party package is not kosher either... Whatever did
list("NA"=2)  or l <- list(2); names(l) <- "NA" do to you?

I'm not sure what you mean by 3rd party?  I downloaded this package 
from the CRAN site where I get all others.  I don't understand your 
question.


3rd party means that you didn't write it and neither did I/we. You are 
requesting people to help you, yet expecting that they go out of their 
way to install a package first. (As it happens, I really don't have 
gregmisc on this machine.) You could easily have created an example of 
a list with NA as a name, but that would of course have been work 
for you rather than for people on the list.





columnNames <- c("A","B","C","D","N","a","b","c")
namePerms <- permutations(length(columnNames),2,columnNames,repeats=TRUE)

nameList <- paste(namePerms[,1],namePerms[,2],sep="")
dataList <- lapply(1:length(nameList), function(level) {})
names(dataList) <- nameList  ## The "NA" is interpreted as meaning that the
                             ## name is missing for one list in dataList


If you inspect the contents of dataList, you will find the 
following showing that the name NA is treated differently:


Anyways  As I thought:

Remember that NA is a reserved word. You get the same kind of
reaction if you name an element "for" or "in". It denotes that you
need to quote the name for indexing with $:


I thought that since all of the names in namesList were type char, 
there was no need to enclose these in quotation marks.


That's not the point. It works fine, it is just that the output is 
showing you how to access the element afterwards.



> names(l) <- "NA"
> l$NA
Error: unexpected numeric constant in "l$NA"
> l$`NA`
[1] 2
> l$"NA"
[1] 2
> l[["NA"]]
[1] 2
> names(l)
[1] "NA"
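
As a small self-contained sketch of the distinction discussed here (base R only; the objects l and l2 are illustrative), the string "NA" is an ordinary name, while a genuinely missing name is a different thing and can be told apart with is.na():

```r
# A list whose single element is named with the *string* "NA"
l <- list(2)
names(l) <- "NA"

# The name is a real character string, not a missing value
name_is_missing <- is.na(names(l))

# Indexing by the quoted name works as for any other name
value <- l[["NA"]]

# For contrast: a list where the second name really is missing
l2 <- list(1, 2)
names(l2) <- c("a", NA)
missing_flags <- is.na(names(l2))
```

The backquotes printed around the name in the list output are only a hint that the name needs quoting when used with $.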


..










Re: [R] Help!

2008-12-16 Thread Kathi
Laura,

Try using a different browser for your download. On Mac OS X, Safari quite often
does weird stuff to files I want to download, frequently damaging them.
Downloading the same file from the same site using Firefox usually works fine.

Hope this helps,

Kathi

--
DropNet AG - Das Unternehmen fuer Ihren Internet-Auftritt!



[R] dip test p-values for large sample sizes

2008-12-16 Thread Hufkens Koen

Hi list,

I'm calculating dip statistics using the diptest package for large sample
sizes. For everything under 5000 samples I can use the table qDiptab, but above
5000 I have no reference. Is there any way to extend the table from Hartigan's
paper to larger sample sizes? Other solutions are also welcome.

Kind regards,
Koen



Re: [R] Find all numbers in a certain interval

2008-12-16 Thread David Winsemius


It's not entirely clear what you are asking for, since
which(within.interval(a, -0.5, 0.5)) is actually longer than
which(a > -0.5 & a < 0.5). You mention that you want a solution that applies to
dataframes. Using indexing you can get entire rows of a dataframe that
satisfy multiple conditions on one of its columns:


> DF <- data.frame(a = rnorm(20), b = LETTERS[1:20], c = letters[20:1],
+                  stringsAsFactors=FALSE)


> DF[which( DF$a > -0.5 & DF$a < 0.5 ), ]
  # note that one needs to avoid DF[which(a > -0.5 & a < 0.5) , ]
  # the "a" vector is not the same as the "a" column vector within DF
             a b c
3  -0.47310672 C r
6  -0.49784460 F o
9   0.02571058 I l
10  0.16893759 J k
11 -0.11963322 K j
12  0.39378887 L i
16  0.03712263 P e

Could get the indices that satisfy more than one condition:
> which(DF$a > 0.5 & DF$b < "K")
[1]  1  2  6 10

Or you can get rows of DF that satisfy conditions on multiple columns  
with the subset function:


> subset(DF, a > 0.5 & b < "K")
           a b c
1  2.2500997 A t
2  0.7251357 B s
6  0.7845355 F o
10 1.0685649 J k

Or if you wanted a within.interval function

> within.interval <- function(x,a,b) { x > a & x < b}

> which(within.interval(DF$a, -0.5, 0.5))
[1]  3  4  7  8  9 13 14 17 20
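
A slightly more general variant, offered only as a sketch: the name within_interval and its inclusive argument are my own assumptions, not an existing base R function. It lets the caller choose open or closed endpoints, and which() silently drops NA entries:

```r
# Hypothetical helper: TRUE where x lies between lower and upper.
# inclusive = TRUE treats the endpoints as part of the interval.
within_interval <- function(x, lower, upper, inclusive = FALSE) {
  if (inclusive) {
    x >= lower & x <= upper
  } else {
    x > lower & x < upper
  }
}

a <- c(-1, -0.5, 0, 0.25, 0.5, 2, NA)

open_idx   <- which(within_interval(a, -0.5, 0.5))                    # 3 4
closed_idx <- which(within_interval(a, -0.5, 0.5, inclusive = TRUE))  # 2 3 4 5
```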



--
David Winsemius
Heritage Labs

On Dec 16, 2008, at 5:09 AM, Antje wrote:


Hi all,

I'd like to know, if I can solve this with a shorter command:

a <- rnorm(100)
which(a > -0.5 & a < 0.5)

# would give me all indices of numbers greater than -0.5 and smaller  
than +0.5


I have something similar with a dataframe and it produces sometimes  
quite long commands...

I'd like to have something like:

which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?


Antje



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Ben Bolker



Cliff Behrens-3 wrote:
 
  For future reference, am 
 I to assume that no contributed packages should be implicated in 
 resolving a problem?
 
 

It does bring things one step closer to minimal, reproducible.
If you can identify the problem as specifically involving the package,
then it's still OK to query the general R list, but it's generally a
good idea to Cc: the package maintainer as well.

  Ben Bolker



Re: [R] Problem assigning NA as a level name in a list

2008-12-16 Thread Cliff Behrens
Very good...thanks!  As you can tell, I really haven't made much (READ 
any) previous use of this list.


Cliff

Ben Bolker wrote:


Cliff Behrens-3 wrote:
  
 For future reference, am 
I to assume that no contributed packages should be implicated in 
resolving a problem?






It does bring things one step closer to minimal, reproducible.
If you can identify the problem as specifically involving the package,
then it's still OK to query the general R list, but it's generally a
good idea to Cc: the package maintainer as well.

  Ben Bolker

  




Re: [R] sliding window over a large vector

2008-12-16 Thread Gabor Grothendieck
There seems to be something wrong:

> slide(c(1, 1, 0, 1), 2)
[1] 2 2

but the output should be c(2, 1, 1)

At any rate try this:

library(zoo)
3 * rollmean(x, 3)
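
For reference, a base-R sketch of the same windowed sum using cumulative sums, which needs no packages and no explicit loop. The function name window_sum is made up for illustration and does not come from this thread:

```r
# Sum of x over every window of k consecutive elements.
# Returns length(x) - k + 1 values, one per window start position.
window_sum <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 1, k <= n)
  cs <- c(0, cumsum(x))                      # cs[i + 1] = sum of x[1:i]
  cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]    # difference of prefix sums
}

window_sum(c(1, 1, 0, 1), 2)   # 2 1 1
```

Each window sum is the difference of two prefix sums, so the work is linear in the length of the vector, which should be manageable for vectors of several million elements.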


On Mon, Dec 15, 2008 at 11:19 PM, Chris Oldmeadow
c.oldmea...@student.qut.edu.au wrote:
 Hi all,

 I have a very large binary vector, I wish to calculate the number of 1's
  over sliding windows.

 this is my very slow function

 slide<-function(seq,window){
   n<-length(seq)-window
   tot<-c()
   tot[1]<-sum(seq[1:window])
   for (i in 2:n) {
      tot[i]<- tot[i-1]-seq[i-1]+seq[i]
   }
   return(tot)
 }

 this works well for reasonably sized vectors. Does anybody know a way
 for large vectors (length=12 million)? I'm trying to avoid using C.

 Thanks,
 Chris



Re: [R] sliding window over a large vector

2008-12-16 Thread Veslot Jacques
Hi,

I just wrote a function quicker than slide() function with the same output, but 
I don't know what to do with this function! 

> sl <- function(x,z) c(0,cumsum(diff(x)[1:(length(x)-z-1)])) +
+ rep(sum(x[1:z]),length(x)-z)
>
> sl(c(0,0,1,1,0,1,1,1,1,0,0,0,1,0,1,0,1,1,0,1,1,0,1,0),3)
 [1] 1 1 2 2 1 2 2 2 2 1 1 1 2 1 2 1 2 2 1 2 2
>
> slide<-function(seq,window){
+    n<-length(seq)-window
+    tot<-c()
+    tot[1]<-sum(seq[1:window])
+    for (i in 2:n) {
+       tot[i]<- tot[i-1]-seq[i-1]+seq[i]
+    }
+    return(tot)
+ }
>
> sl(c(0,0,1,1,0,1,1,1,1,0,0,0,1,0,1,0,1,1,0,1,1,0,1,0),3)
 [1] 1 1 2 2 1 2 2 2 2 1 1 1 2 1 2 1 2 2 1 2 2
>
> slide(c(0,0,1,1,0,1,1,1,1,0,0,0,1,0,1,0,1,1,0,1,1,0,1,0),3)
 [1] 1 1 2 2 1 2 2 2 2 1 1 1 2 1 2 1 2 2 1 2 2



> sl <- function(x,z) c(0,cumsum(diff(x)[1:(length(x)-z-1)])) +
+ rep(sum(x[1:z]),length(x)-z)
>
> x <- rbinom(10, 1, 0.5)
>
> system.time(xx1 <- slide(x,12))
   user  system elapsed
  36.86    0.45   37.32
> system.time(xx2 <- sl(x,12))
   user  system elapsed
   0.01    0.00    0.02
> all.equal(xx1,xx2)
[1] TRUE

Jacques VESLOT

CEMAGREF - UR Hydrobiologie

Route de Cézanne - CS 40061  
13182 AIX-EN-PROVENCE Cedex 5, France

Tel.    + 0033   04 42 66 99 76
Fax     + 0033   04 42 66 99 34
Email   jacques.ves...@cemagref.fr


-----Original Message-----
From: markle...@verizon.net [mailto:markle...@verizon.net]
Sent: Tuesday, 16 December 2008 10:25
To: Veslot Jacques
Cc: Chris Oldmeadow; r-help@r-project.org
Subject: Re: [R] sliding window over a large vector

Hi Veslot: I'm too tired to even try to figure out why, but I think
that there is something wrong with your sl function. See below for an
empirical proof of that statement. Or maybe your definition of a sliding
window is different from rollapply's definition, but rollapply's answer
makes more sense to me?

Output

> set.seed(1)
> x <- rbinom(24, 1, 0.5)
> print(x)
 [1] 0 0 1 1 0 1 1 1 1 0 0 0 1 0 1 0 1 1 0 1 1 0 1 0

> xx1 <- sl(x,3)
> print(xx1)
 [1] 1 1 2 2 1 2 2 2 2 1 1 1 2 1 2 1 2 2 1 2 2

> temp <- zoo(x)
> ans<-rollapply(temp,3,sum)
> print(ans)
  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  1  2  2  2  2  3  3  2  1  0  1  1  2  1  2  2  2  2  2  2  2  1




[R] Problem with alignDailySeries in Rmetrics

2008-12-16 Thread John Kerpel
Hi Folks!  I seem to be having a problem with alignDailySeries in Rmetrics:

DTB6<-fredSeries("DTB6",frequency = "daily",from = "1980-01-01")
trying URL 'http://research.stlouisfed.org/fred2/series/DTB6/downloaddata/DTB6.txt'
Content type 'text/plain; charset=UTF-8' length 248392 bytes (242 Kb)
opened URL
downloaded 242 Kb

Read 13060 items

class(DTB6)

[1] "timeSeries"
attr(,"package")
[1] "fSeries"

DTB6<-alignDailySeries(DTB6, method = "interp",include.weekends = FALSE,
units = NULL)
Error in getDataPart(<S4 object of class "timeSeries">) :
  no '.Data' slot defined for class "timeSeries"


What's causing this error?

--John


sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
 [1] strucchange_1.3-5  sandwich_2.1-0     quantmod_0.3-7
     Defaults_1.1-1     xts_0.0-16         FinTS_0.3-6
 [7] fracdiff_1.3-1     fTrading_270.74    fGarch_280.75
     fMultivar_270.74   fBasics_280.74     sn_0.4-8
[13] mnormt_1.3-1       fSeries_270.76.1   fCalendar_270.78.1
     fEcofin_270.75     fUtilities_270.75  MASS_7.2-44
[19] robustbase_0.4-3   dyn_0.2-6          zoo_1.5-4
     fImport_270.74     timeSeries_290.79  timeDate_290.81

loaded via a namespace (and not attached):
[1] grid_2.8.0  lattice_0.17-15 tools_2.8.0



[R] Dotted lines at the end of the KM-curve

2008-12-16 Thread Fredrik Lundgren

R-ers!

Referees demand that the line in the KM-curve should be changed to
dotted at the point where the standard error is >= 10 %. I don't think it's
a good habit but I urgently need to implement such a thing in R with
survfit, survplot or another program. They also want numbers at risk
below the curve.

Some help, please


Fredrik





Fredrik Lundgren
fredrik.bg.lundg...@gmail.com

Engelbrektsgatan 31
582 21 Linköping
tel 013 - 47 30 117
mob 0706 - 86 39 29

Sommarhus: Ljungnäs 158
380 30 Rockneby
0480 - 650 98



Re: [R] ggplot2 and lattice

2008-12-16 Thread Claudia Beleites
On Tuesday 16 December 2008 17:13:33, Wayne F wrote:
 stephen sefick wrote:
  yes a parallel coordinates plot- I understand that it is for
  multivariate data, but I am having a hard time figuring out what it is
  telling me.  Thanks for your help.

 In the lattice book, the author mentions that static parallel plots aren't
 very useful, in general.
While for some data they are just natural: e.g. when spectra are treated as 
multidimensional data. Then the parallel coordinate plot just gives you the 
spectrum. 
Of course, in this situation it is maybe the treatment as high-dimensional 
data that is somewhat weird for spectra. 

However, this offers a way of looking at it that might help in understanding
what's going on.

I have a data set of p dimensions, e.g. spectra measured with p channels.
Now, we can think of such a spectrum as a point in p-dimensional space. E.g. a
spectrum consisting of red, green, blue intensity is at a certain point in rgb-space.

On the other hand, here the p dimensions have something to do with each other 
(e.g. an intrinsic order, let's say, by the wavelength). So it does make sense 
to plot the intensity over the p dimensions. That's the parallel coordinate 
plot. 

What you can tell from such a plot, depends very much on your data, and how 
you treated it. 
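
As a small illustration of this view, here is a sketch with simulated data (the band shapes, noise level and wavelength grid are invented purely for the example). Plotting each row of a samples-by-channels matrix against the channel axis is exactly a parallel coordinate plot, and with ordered channels it simply looks like a set of spectra:

```r
set.seed(42)

# Hypothetical setup: p channels on an ordered wavelength axis
p <- 50
wavelength <- seq(400, 700, length.out = p)

# Five simulated spectra: one Gaussian band each, shifted per sample, plus noise.
# sapply() returns a p x 5 matrix, so transpose to get samples in rows.
spectra <- t(sapply(1:5, function(i) {
  exp(-((wavelength - 500 - 20 * i) / 40)^2) + rnorm(p, sd = 0.02)
}))

# Each sample is one point in p-dimensional space (one row of the matrix) ...
dim(spectra)

# ... and the parallel coordinate plot draws each row across the p axes.
matplot(wavelength, t(spectra), type = "l", lty = 1,
        xlab = "wavelength (channel)", ylab = "intensity")
```

For variables without a natural order (the usual multivariate case), the same plot is harder to read because the order of the axes is arbitrary.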

Claudia



-- 
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste

phone: +39 (0 40) 5 58-34 47
email: cbelei...@units.it



[R] model.tables error from aov

2008-12-16 Thread Harlan Harris
Hi, I'm a new R user, coming from SPSS, and without a particularly strong
stats background.

I've got a data set that I'd like to do a mixed-design ANOVA with. No
missing values. Here's the summary:

summary(learnDat.ae)
 Type      Subject        idio      struct      TrainErrs           cond
 0:20   11     : 3   idio   :28   ae  :58   Min.   : 0.00   idioae   :28
 2:19   12     : 3   nonidio:30   fact: 0   1st Qu.: 6.25   idiofact : 0
 3:19   14     : 3                          Median :11.50   nonidioae:30
        15     : 3                          Mean   :13.40
        18     : 3                          3rd Qu.:16.00
        2      : 3                          Max.   :59.00
        (Other):40

Note that the TrainErrs column is the only numeric column, and I forced
everything else to be a factor. (Is that correct?)

I then do the following:

aov.errs.ae <- aov(TrainErrs ~ (idio*Type) + Error(Subject/Type) + (idio),
                   learnDat.ae)

So, idio is between-subjects and Type is within-subjects. This is based on
examples I've found elsewhere.

summary(aov.errs.ae)

This seems to work fine:

Error: Subject
          Df Sum Sq Mean Sq F value Pr(>F)
idio       1    179     179    0.89   0.36
Type       1    210     210    1.05   0.32
Residuals 17   3401     200

Error: Subject:Type
          Df Sum Sq Mean Sq F value Pr(>F)
Type       2    515     258    2.44  0.103
idio:Type  2    680     340    3.22  0.053 .
Residuals 34   3595     106
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Now the problem:

> model.tables(aov.errs.ae,"means")
Error in outer(rownames(efficiency), colnames(efficiency), paste)[eff.used] :
  invalid subscript type 'list'
In addition: Warning message:
In any(efficiency) : coercing argument of type 'double' to logical

All the examples and manuals I've found said this should work. When I did a
fully between-subjects ANOVA on another data set, I had no problem with
model.tables. I have no idea what to make of this error message. I've tried
a number of variations on things, with no improvement.

This is R version 2.7.2 (2008-08-25), running on Redhat, x86_64.

Suggestions? Thanks!

 -Harlan



Re: [R] model.tables error from aov

2008-12-16 Thread Prof Brian Ripley
Your design seems to be unbalanced: multistratum aov is intended for
balanced designs.  My guess is that one idio subject has two Type=1
observations: in which case try removing one of them.
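
One way to look for such a row is to cross-tabulate the within-subject factor against Subject; in a balanced repeated-measures layout every cell should contain exactly one observation. A minimal sketch on toy data follows (dat is a hypothetical stand-in for learnDat.ae, with one row deliberately duplicated):

```r
# Toy stand-in for the real data: 4 subjects x 3 levels of Type,
# plus one deliberately duplicated row to unbalance the design.
dat <- expand.grid(Subject = factor(1:4), Type = factor(c(0, 2, 3)))
dat <- rbind(dat, dat[1, ])

# Count observations per Subject x Type cell
counts <- xtabs(~ Subject + Type, data = dat)
counts

# Cells that do not hold exactly one observation
bad <- which(counts != 1, arr.ind = TRUE)
bad

# The offending rows themselves (second and later copies of a cell)
dup_rows <- dat[duplicated(dat[c("Subject", "Type")]), ]
dup_rows
```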





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] odfWeave learning resources

2008-12-16 Thread Tubin

In general I try not to post questions to forums until I've tried my best to
read about them in the available documentation.  I recently undertook a
project that used odfWeave and have been very pleased with the package. 
But, the R help documentation suggests that there are more sophisticated
things I can do - for example, with conditionally formatted tables.  

Can anyone point me to resources I could review to educate myself about the
full capabilities of this lovely package?

Thanks!


Re: [R] model.tables error from aov

2008-12-16 Thread Harlan Harris
Ah, that was it. I had a bad row in there that I had forgotten to remove.
Thank you very much for the prompt (and correct!) response.

 -Harlan

On Tue, Dec 16, 2008 at 3:58 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

 Your design seems to be unbalanced: multistratum aov is intended for
 balanced designs.  My guess is that one idio subject has two Type=1
 observations: in which case try removing one of them.




Re: [R] pwr.prop.test and continuity correction

2008-12-16 Thread Thomas Lumley

On Tue, 16 Dec 2008, Peter Dalgaard wrote:


power.prop.test (sic) is relying heavily on asymptotic normality, as do similar 
formulas. It doesn't use continuity correction, but if you're working with such 
small group sizes, I suspect that the correction term is the least of your 
worries and that direct simulation would be better.




In fact, for tests in 2x2 tables, it is fairly straightforward and fast to 
compute the entire sampling distribution explicitly, over a grid of parameter 
values.

This gives the exact power (under alternatives) and the exact Type I error 
(under null). You can also compare different tests and see how much the 
continuity correction moves the actual Type I error rate away from the nominal 
rate.
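
A rough sketch of that calculation for two independent binomial groups, using only base R. The function name exact_power and its arguments are made up for illustration, and prop.test() is used as the test under study; any other 2x2 test could be substituted:

```r
# Exact rejection probability of prop.test() for two binomial groups,
# obtained by enumerating every possible (x1, x2) outcome.
exact_power <- function(n1, n2, p1, p2, alpha = 0.05, correct = FALSE) {
  grid <- expand.grid(x1 = 0:n1, x2 = 0:n2)
  # Probability of each outcome under the assumed true proportions
  prob <- dbinom(grid$x1, n1, p1) * dbinom(grid$x2, n2, p2)
  reject <- mapply(function(x1, x2) {
    # All failures or all successes: the test is undefined, so never reject
    if (x1 + x2 == 0 || x1 + x2 == n1 + n2) return(FALSE)
    pv <- suppressWarnings(
      prop.test(c(x1, x2), c(n1, n2), correct = correct)$p.value
    )
    !is.na(pv) && pv < alpha
  }, grid$x1, grid$x2)
  sum(prob[reject])
}

# Power under an alternative, and actual Type I error under a null,
# with and without the continuity correction (n = 15 per group)
power_alt   <- exact_power(15, 15, 0.2, 0.8)
type1_plain <- exact_power(15, 15, 0.3, 0.3, correct = FALSE)
type1_cc    <- exact_power(15, 15, 0.3, 0.3, correct = TRUE)
c(power = power_alt, type1_plain = type1_plain, type1_cc = type1_cc)
```

Looping this over a grid of p1 and p2 values gives the full picture of how far each version of the test is from the nominal level.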

   -thomas

Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle



Re: [R] Using a covariance matrix as input to relaimpo package

2008-12-16 Thread Ku!Rt

By trial and error I have discovered that it works if I don't use the formula
interface in combination with a covariance matrix as input.

If the covariance matrix has the dependent variable as its left-most
variable as the relaimpo documentation suggests, then the relaimpo package
will run by simply naming the covariance matrix as the first object in the
call and not using a formula.  The downside of this is needing to create
different covariance matrices for different models.

The following will work:

# calculate covariance matrix from survey respondent data using pairwise deletion
covmatrx =
cov(respdata[,c("V0007","V0029","V0031","V0032","V0034","V0035","V0036")],
use = "pairwise")

# try the lmg method of relative importance
imps1 = calc.relimp(covmatrx, type="lmg", rela=TRUE)




Re: [R] Dotted lines at the end of the KM-curve

2008-12-16 Thread Frank E Harrell Jr

Fredrik Lundgren wrote:

R-ers!

Referees demand that the line in the KM-curve should be changed to
dotted at the point where the standard error is >= 10 %. I don't think it's
a good habit but I urgently need to implement such a thing in R with
survfit, survplot or another program. They also want numbers at risk
below the curve.

Some help, please


Fredrik


Numbers at risk can be done with

library(Design)
f <- cph(Surv( ) ~ ..., surv=TRUE)
survplot(f, n.risk=TRUE, ...)
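
For the dotted tail, one possible approach with the survival package is sketched below. It assumes the referees mean the standard error of the survival estimate itself reaching 0.10, which is only one reading of the request. The helper km_split is hypothetical, and the aml data shipped with survival is used purely as an example:

```r
library(survival)  # recommended package shipped with R

# Hypothetical helper: find the first time point of a single-group KM fit
# at which the standard error of S(t) reaches `threshold`.
km_split <- function(fit, threshold = 0.10) {
  # fit$std.err is on the cumulative-hazard (log-survival) scale;
  # multiplying by S(t) gives the standard error of S(t) itself.
  se_surv <- fit$surv * fit$std.err
  hit <- which(is.finite(se_surv) & se_surv >= threshold)
  cut <- if (length(hit)) hit[1] else length(fit$time) + 1L
  list(time = fit$time, surv = fit$surv, se = se_surv, cut = cut)
}

fit <- survfit(Surv(time, status) ~ 1, data = aml)
km  <- km_split(fit, threshold = 0.10)

# Draw the step function: solid up to the change point, dotted afterwards
tt <- c(0, km$time)
ss <- c(1, km$surv)
k  <- min(km$cut, length(km$time)) + 1L   # position of the change point in tt
plot(range(tt), c(0, 1), type = "n", xlab = "Time", ylab = "Survival")
lines(tt[1:k], ss[1:k], type = "s", lty = 1)
if (k < length(tt)) {
  lines(tt[k:length(tt)], ss[k:length(tt)], type = "s", lty = 3)
}
```

With several groups the same split would have to be done per stratum, and the threshold may need to be applied on a different scale depending on what the referees intend.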

Frank











--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] Find all numbers in a certain interval

2008-12-16 Thread Greg Snow
Here are a couple of function definitions that may be more intuitive for some 
people (see the examples below the function defs).  They are not perfect, but 
my tests showed they work left to right, right to left, outside in, but not 
inside out.

`%<%` <- function(x,y) {
    xx <- attr(x,'orig.y')
    yy <- attr(y,'orig.x')

    if(is.null(xx)) {
        xx <- x
        x <- rep(TRUE, length(x))
    }
    if(is.null(yy)) {
        yy <- y
        y <- rep(TRUE, length(y))
    }

    out <- x & y & (xx < yy)
    attr(out, 'orig.x') <- xx
    attr(out, 'orig.y') <- yy

    out
}

`%<=%` <- function(x,y) {
    xx <- attr(x,'orig.y')
    yy <- attr(y,'orig.x')

    if(is.null(xx)) {
        xx <- x
        x <- rep(TRUE, length(x))
    }
    if(is.null(yy)) {
        yy <- y
        y <- rep(TRUE, length(y))
    }

    out <- x & y & (xx <= yy)
    attr(out, 'orig.x') <- xx
    attr(out, 'orig.y') <- yy

    out
}




x <- -3:3

 -2 %<% x %<% 2
c( -2 %<% x %<% 2 )
x[ -2 %<% x %<% 2 ]
x[ -2 %<=% x %<=% 2 ]


x <- rnorm(100)
y <- rnorm(100)

x[ -1 %<% x %<% 1 ]
range( x[ -1 %<% x %<% 1 ] )


cbind(x,y)[ -1 %<% x %<% y %<% 1, ]
cbind(x,y)[ (-1 %<% x) %<% (y %<% 1), ]
cbind(x,y)[ ((-1 %<% x) %<% y) %<% 1, ]
cbind(x,y)[ -1 %<% (x %<% (y %<% 1)), ]
cbind(x,y)[ -1 %<% (x %<% y) %<% 1, ] # oops

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Antje
 Sent: Tuesday, December 16, 2008 3:09 AM
 To: r-h...@stat.math.ethz.ch
 Subject: [R] Find all numbers in a certain interval

 Hi all,

 I'd like to know, if I can solve this with a shorter command:

 a <- rnorm(100)
 which(a > -0.5 & a < 0.5)

 # would give me all indices of numbers greater than -0.5 and smaller
 than +0.5

 I have something similar with a dataframe and it produces sometimes
 quite long
 commands...
 I'd like to have something like:

 which(within.interval(a, -0.5, 0.5))

 Is there anything I could use for this purpose?


 Antje


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Programmatically minimising main R window (on windows)

2008-12-16 Thread hadley wickham
Hi all,

Is it possible to programmatically minimise the main window of the
windows R gui?  I'm designing a small gui with gwidgets & RGtk2 for a
non-statistician to use, and it would be nice if I could easily hide
all the R stuff that they don't need.

Thanks,

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Applying a function to a dataframe

2008-12-16 Thread glenn roberts
Another Newbie Question sorry:

I am trying to apply a function to a dataframe and could use some help:


Assuming, dim(df) = (10,2) say, I would like to apply a function that looks
at each row in turn and returns a list (dim =(10,1)) using the columns as
inputs to the function, but with no INDEX stuff that the by() function
refers to.

For a simple function x+y I know in Mathematica it would be this;

Table[df[[i,1]]+ df[[i,2]],{i,1,10}]

Or, if the function were defined, it would read;

Table[f[df[[i,1]], df[[i,2]]],{i,1,10}]

Thanks for help in advance

Glenn






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Applying a function to a dataframe

2008-12-16 Thread David Winsemius


On Dec 16, 2008, at 6:00 PM, glenn roberts wrote:


Another Newbie Question sorry:

I am trying to apply a function a dataframe and could use some help:


Assuming, dim(df) = (10,2) say, I would like to apply a function that looks
at each row in turn and returns a list (dim =(10,1)) using the columns as
inputs to the function, but with no INDEX stuff that the by() function
refers to.

For a simple function x+y I know in Mathematica it would be this;

Table[df[[i,1]]+ df[[i,2]],{i,1,10}]

Of if the function was defined is would read;

Table[f[df[[i,1]], df[[i,2]]],{i,1,10}]


I don't know Mathematica, but if you just want the sum by rows, see

?apply

When the second argument is 1, the rows are passed one at a time to the
third argument, FUN:


 DF <- data.frame(col1 = 1:10, col2 = 11:20)
 apply(DF,1,sum)
 [1] 12 14 16 18 20 22 24 26 28 30


If you want minimums, then apply with FUN=min would still work:


 apply(DF,1,min)
 [1]  1  2  3  4  5  6  7  8  9 10
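
For a function that takes the two columns as separate arguments, which is closer to the Mathematica Table[f[...]] form in the question, mapply() or Map() may be a better fit than apply(). A minimal sketch, where f is a hypothetical stand-in for whatever function is wanted:

```r
DF <- data.frame(col1 = 1:10, col2 = 11:20)

# f is a hypothetical stand-in for any function of two scalar arguments.
f <- function(a, b) a + b

res_vec  <- mapply(f, DF$col1, DF$col2)  # simplified to a vector
res_list <- Map(f, DF$col1, DF$col2)     # one list element per row

rowSums(DF)  # for plain row sums, the vectorised form is simpler still
```

mapply() walks the two columns in parallel, so f never sees a whole row; that also avoids the conversion to a matrix that apply() performs on a data frame.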





Thanks for help in advance

Glenn





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Programmatically minimising main R window (on windows)

2008-12-16 Thread Prof Brian Ripley

On Tue, 16 Dec 2008, hadley wickham wrote:


Hi all,

Is it possible to programmatically minimise the main window of the
windows R gui?  I'm designing a small gui with gwidgets & RGtk2 for a
non-statistician to use, and it would be nice if I could easily hide
all the R stuff that they don't need.


Not from R itself, but you can by Windows script programming (which you 
can launch by 'system').  It would also be easy to add a small bit of C 
code to do so.


However why are you using Rgui if you don't want a GUI?  That is what 
Rterm or Rscript or embedded R are for.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract Data from a Webpage

2008-12-16 Thread Chuck Cleland
Hi All:
  I would like to extract the provider name, address, and phone number
from multiple webpages like this:

http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490

  Based on searching R-help archives, it seems like the XML package
might have something useful for this task.  I can load the XML package
and supply the url as an argument to htmlTreeParse(), but I don't know
how to go from there.

thanks,

Chuck Cleland

 sessionInfo()
R version 2.8.0 Patched (2008-12-04 r47066)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] XML_1.98-1

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] append lines to a created file

2008-12-16 Thread Jörg Groß

hi,

I try to append a line to a file with;

writeLines("xxx", con = "file.txt", sep = "\n")


but it always overwrites the existing content.


How can I change the mode of writeLines to append ("a") ?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Programmatically minimising main R window (on windows)

2008-12-16 Thread hadley wickham
On Tue, Dec 16, 2008 at 5:40 PM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:
 On Tue, 16 Dec 2008, hadley wickham wrote:

 Hi all,

 Is it possible to programmatically minimise the main window of the
 windows R gui?  I'm designing a small gui with gwidgets & RGtk2 for a
 non-statistician to use, and it would be nice if I could easily hide
 all the R stuff that they don't need.

 Not from R itself, but you can by Windows script programming (which you can
 launch by 'system').  It would also be easy to add a small bit of C code to
 do so.

 However why are you using Rgui if you don't want a GUI?  That is what Rterm
 or Rscript or embedded R are for.

That's a good question.  The main reason is because it's easy for me
to tell my remote user how to load the gui - source("http://...")

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] append lines to a created file

2008-12-16 Thread Rolf Turner


On 17/12/2008, at 1:43 PM, Jörg Groß wrote:


hi,

I try to append a line to a file with;

writeLines("xxx", con = "file.txt", sep = "\n")


but it always overwrites the existing content.


How can I change the mode of writeLines to append ("a") ?


The help on connections says:

    In general functions using connections will open them if they are not
    open, but then close them again, so to leave a connection open call
    open explicitly.


So do something like:

zz <- file("file.txt","w")
writeLines("xxx",con=zz,sep="\n")
writeLines("A load of dingos' kidneys.",con=zz,sep="\n")

etc.
etc.

close(zz)

But why not just use sink() and cat()?  Much simpler, IMHO.
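
If the aim is to add to a file that already exists, rather than to keep one connection open across several writes, opening the connection in append mode or using cat() with append = TRUE should do it. A small sketch, using a temporary file as a stand-in for file.txt:

```r
path <- tempfile(fileext = ".txt")    # hypothetical stand-in for "file.txt"
writeLines("first line", con = path)  # creates (or overwrites) the file

# Opening the connection in mode "a" appends instead of truncating.
zz <- file(path, open = "a")
writeLines("second line", con = zz)
close(zz)

# cat() can append directly, with no explicit connection.
cat("third line\n", file = path, append = TRUE)

readLines(path)
```

Mode "w" truncates the file when the connection is opened, so it only accumulates lines written while that one connection stays open.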

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Programmatically minimising main R window (on windows)

2008-12-16 Thread Berwin A Turlach
G'day Hadley,

On Tue, 16 Dec 2008 18:54:48 -0600
hadley wickham h.wick...@gmail.com wrote:

 On Tue, Dec 16, 2008 at 5:40 PM, Prof Brian Ripley
 rip...@stats.ox.ac.uk wrote:
  On Tue, 16 Dec 2008, hadley wickham wrote:
[...]
  Is it possible to programmatically minimise the main window of the
  windows R gui?  I'm designing a small gui with gwidgets & RGtk2
  for a non-statistician to use, and it would be nice if I could
  easily hide all the R stuff that they don't need.
 
  Not from R itself, but you can by Windows script programming (which
  you can launch by 'system').  It would also be easy to add a small
  bit of C code to do so.
[...]

Not sure if this is what you are after, but I believe we once had a
similar problem.  Client wanted to have a GUI that would read in a file
with the list of people who had bought raffle tickets, select the
winners, and write the winners to a file.  They were just interested in
seeing the GUI stuff and not the underlying R main window &c.

We ended up giving them a USB stick with R on it and the package we
wrote for them together with other packages they needed.  I attach the
instructions that I wrote up for our consultant (sitting in Perth) on
how I (sitting in Singapore) could create such a USB stick.  UWA has
an authenticating proxy while NUS does not, hence the references to
--internet2 in that write-up.

As it turned out, if you do not use the standard way of installing R
but select SDI during the installation (if memory serves correctly,
this choice can also be made after R is installed, but then doing this
change is a bit more involved), then you can start R minimized.  That
is, the R main window does not appear.  You just have to make sure that
your code is executed when you start R (via .onAttach, .First &c.) and
brings up the GUI that the user is supposed to see.

HTH.

Cheers,

Berwin

=== Full address =
Berwin A TurlachTel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability+65 6515 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore
6 Science Drive 2, Blk S16, Level 7  e-mail: sta...@nus.edu.sg
Singapore 117546http://www.stat.nus.edu.sg/~statba
1) Install R on USB stick:
   Start R installer (named something like R-2.x.y-win32.exe).
   During installation:

   i) Select drive corresponding to USB stick for location to which R
  is to be installed (i.e. if USB stick is drive DRV, then install R
  to location DRV:\R-2.x.y).

   ii) Customize startup:
   a) select SDI, everything else per default
   b) may be necessary to select internet2 as internet connection
  if sitting behind a proxy (but it is also possible to do so
  later, i.e o.k. to use the default) 

   iii) Don't create a Start Menu folder
   iv)  Don't create desktop icon or registry entries

2) Create a short cut to Rgui.exe (located in
   DRV:\R-2.x.y\bin\Rgui.exe) and move it to the top folder of the
   USB stick.
   Optionally, rename short cut (e.g. "Raffle Draw")

3) Start R using the short cut.

4) Select "Install package(s) from local zip files" from "Packages" menu:
   select RaffleDraw_1.y.zip (currently y=1, but may change) for installation.

5) Select "Install package(s)..." from "Packages" menu:
   select appropriate CRAN mirror, then select gWidgets and
   gWidgetsrJava to be installed.
   Quit R.

   If this step does not work, then you are probably sitting behind a
   proxy.  In that case, go to the short cut that points to Rgui.exe,
   right click on the short cut and select "Properties" from pop-up
   window; add "--internet2" to the target (i.e. the target should
   read something like "DRV:\R-2.x.y\bin\Rgui.exe --internet2").
   Click "Apply" and then "Ok" and try again.

6) Right-click on short cut and select "Properties" from pop-up window;
   change entry for "Run:" from "Normal window" to "Minimized".  Click
   "Apply" and then "Ok".  (Remove the "--internet2" option if it had
   been added)

7) Go to the folder DRV:\R-2.x.y\etc and edit the Rprofile.site file
   located in that folder:
   add "library(RaffleDraw)" as last line (without the quotation marks)
   save file and quit

8) Go to top folder of USB stick and double click on the short cut.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to iterate dataframe within a hash

2008-12-16 Thread Gundala Viswanath
Dear all,

I have the following data structure

 print(testlib)
$tags
  tag  count.raw  count.adj  err
1 aa   94         93         0.5
2 ac   1          2          0.2
3 ag   3          2          0.1
4 ca   1          1          0.003


I want to iterate the data above and print only
tag, count.raw and count.adj column.

Why does my script below fail to do the task?


for (i in 1:nrow(testlib)) {
   cat(testlib$tags[["count.tag"]],",", testlib$tags[["count.raw"]],
",", testlib$tags[["count.adj"]],"\n")
}



- Gundala Viswanath
Jakarta - Indonesia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to iterate dataframe within a hash

2008-12-16 Thread stephen sefick
I don't know if this is what you want, but it seems that you just want
to print a subset of your columns:

testlib$tags[,c("tag", "count.raw", "count.adj")]

if you want to do something other than just print the columns then
look at the apply family of functions.
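
As for why the posted loop does not work: as far as I can tell, testlib is a list, so nrow(testlib) is NULL; the loop index i is never used inside the loop; and there is no column called count.tag. A sketch of a working version follows, with the data rebuilt by hand from the printout in the question (so the column types are an assumption):

```r
# Rebuild the structure from the question; values copied from the printout.
testlib <- list(tags = data.frame(
  tag       = c("aa", "ac", "ag", "ca"),
  count.raw = c(94, 1, 3, 1),
  count.adj = c(93, 2, 2, 1),
  err       = c(0.5, 0.2, 0.1, 0.003),
  stringsAsFactors = FALSE
))

tags <- testlib$tags                  # the data frame lives inside the list
for (i in seq_len(nrow(tags))) {      # iterate over rows of the data frame
  cat(tags$tag[i], ",", tags$count.raw[i], ",", tags$count.adj[i], "\n")
}

# Without a loop: keep just the three columns of interest.
wanted <- tags[, c("tag", "count.raw", "count.adj")]
```

seq_len(nrow(tags)) is used instead of 1:nrow(tags) so that an empty data frame gives zero iterations.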


On Tue, Dec 16, 2008 at 9:02 PM, Gundala Viswanath gunda...@gmail.com wrote:
 Dear all,

 I have the following data structure

 print(testlib)
 $tags
   tag  count.raw  count.adj  err
 1 aa   94         93         0.5
 2 ac   1          2          0.2
 3 ag   3          2          0.1
 4 ca   1          1          0.003


 I want to iterate the data above and print only
 tag, count.raw and count.adj column.

 Why my script below failed to do the task?


 for (i in 1:nrow(testlib)) {
    cat(testlib$tags[["count.tag"]],",", testlib$tags[["count.raw"]],
 ",", testlib$tags[["count.adj"]],"\n")
 }



 - Gundala Viswanath
 Jakarta - Indonesia





-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Noobie question, regression across levels

2008-12-16 Thread RichardLang


Many thanks! This helped a lot. Another quick one:
In using the lmList function in the nlme package, is it possible to subset
my data according to the number of observations in each level? (i.e., I
want to include only those levels that have enough observations for a
regression.) What is the best way to exclude levels with too few
observations? Can I do it inside the lmList function? I've read the
requisite help files etc. and two hours later am still confused.
Thanks in advance,
Allen


Ben Bolker wrote:
 
 
 
 RichardLang wrote:
 
 I've just started using R last week and am still scratching my head.
 
 I have a data set and want to run a separate regression across each level
 of a factor (treating each one separately). The data right now is
 arranged such that the value of the factor along which I want to split
 my data is one column among many.
 Best way to do this?
 
 Thanks!
 
 
 You can check out lmList function in the nlme package, or more crudely:
 
 lmfun <- function(d) { lm(y~x,data=d) }
 myLmList <- lapply(split(mydata,splitfactor),lmfun)
 
 even more compactly/confusingly:
 
 myLmList <- lapply(split(mydata,splitfactor),lm,formula=y~x)
 
   good luck
Ben Bolker
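
On the follow-up question above (dropping levels with too few observations), one way that builds on the split()/lapply() approach is to filter the pieces before fitting. This is a sketch only: the data are made up, and min_obs is an assumed threshold rather than anything from the thread.

```r
set.seed(42)
# Hypothetical data: level "c" has only two observations.
mydata <- data.frame(
  splitfactor = rep(c("a", "b", "c"), times = c(10, 8, 2)),
  x = rnorm(20)
)
mydata$y <- 1 + 2 * mydata$x + rnorm(20, sd = 0.1)

min_obs <- 5  # assumed threshold; choose what suits the model

pieces     <- split(mydata, mydata$splitfactor)
big_enough <- Filter(function(d) nrow(d) >= min_obs, pieces)
myLmList   <- lapply(big_enough, function(d) lm(y ~ x, data = d))

names(myLmList)  # levels that were kept
```

The same filter can be applied to the data before calling lmList, for example by keeping only the rows whose level appears in names(big_enough).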
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Noobie-question%2C-regression-across-levels-tp21020222p21046298.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] append string to a string

2008-12-16 Thread Jörg Groß

hi,


I want to append a string to a string like;

x <- c("abc")
append(x, "def")


so that I get for x:

[1] "abcdef"


not (!)
[1] "abc" "def"


How can I do that in R?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract Data from a Webpage

2008-12-16 Thread Duncan Temple Lang


Hi Chuck.

 Well, here is one way

theURL = 
"http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490"


doc = htmlParse(theURL, useInternalNodes = TRUE,
                error = function(...) {}) # discard any error messages

  # Find the nodes in the table that are of interest.
x = xpathSApply(doc, "//table//td|//table//th", xmlValue)

Now depending on the regularity of the page, we can do something like

 i = seq(1, by = 2, length = 3)
 structure(x[i + 1], names = x[i])


And we end up with a named character vector with the fields of interest.
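
To make the pairing step concrete without fetching the page, here is the same indexing applied to a hand-made vector. The cell values are invented for illustration; the real page may well lay its cells out differently.

```r
# Hypothetical cell values, in document order: label, value, label, value, ...
x <- c("Provider", "Example Clinic",
       "Address",  "1 Main Street",
       "Phone",    "555-0100")

i <- seq(1, by = 2, length.out = 3)          # positions of the labels: 1, 3, 5
fields <- structure(x[i + 1], names = x[i])  # values, named by their labels

fields[["Phone"]]
```

The approach relies on the labels and values strictly alternating, so it is worth checking the raw x for each page before trusting the names.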

The useInternalNodes is vital so that we can use XPath.  The XPath
language is very convenient for navigating subsets of the resulting
XML tree.

 D.


Chuck Cleland wrote:

Hi All:
  I would like to extract the provider name, address, and phone number
from multiple webpages like this:

http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490

  Based on searching R-help archives, it seems like the XML package
might have something useful for this task.  I can load the XML package
and supply the url as an argument to htmlTreeParse(), but I don't know
how to go from there.

thanks,

Chuck Cleland


sessionInfo()

R version 2.8.0 Patched (2008-12-04 r47066)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] XML_1.98-1



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] append string to a string

2008-12-16 Thread David Winsemius


?paste

On Dec 16, 2008, at 10:39 PM, Jörg Groß wrote:


hi,


I want to append a string to a string like;

x <- c("abc")
append(x, "def")


so that I get for x:

[1] "abcdef"


not (!)
[1] "abc" "def"


How can I do that in R?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] append string to a string

2008-12-16 Thread Aval Sarri
On Wed, Dec 17, 2008 at 9:09 AM, Jörg Groß jo...@licht-malerei.de wrote:
 hi,


 I want to append a string to a string like;

 x <- c("abc")
 append(x, "def")

paste (x, "def", sep="")

see ?paste
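
A short sketch of the difference, since append() works on vector elements while paste() works on the characters within them:

```r
x <- c("abc")

append(x, "def")                # a vector of two strings, not one joined string

x <- paste(x, "def", sep = "")  # overwrite x with the joined string
x

paste0("abc", "def")            # shorthand for sep = "", in later R releases
```

Note that paste() does not modify x in place; the result has to be assigned back.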

HTH
Aval

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] surface contour plot help

2008-12-16 Thread Brad B
I was able to get a surface plot with wireframe; however, I can't rotate it 
around like you can with the plot3d function.
Is there a way to do this in R?
 


 
To: r-help@r-project.org
Date: Tuesday, December 16, 2008, 9:13 AM







I am trying to do a surface profile plot.
data is 
X  Y(1) Z(1)
1-jan-02   2002    number
2-jan-02   2002    number
.
.
.
1-jan-03   2003 (Y2) number Z(2)
2-jan-03   2003 (Y2) number Z(2)
.
.
.
until dec 31 2007.
 
I used the plot3d functions to build a scatter point plot.
Call rinterface.rrun(library(rgl))
Call rinterface.rrun(plot3d(x,y1,z1,xlab='Date',ylab='Year',zlab='Vol',ylim=c(2001,2008)))
Call rinterface.rrun(plot3d(x,y2,z2,add=TRUE))
Call rinterface.rrun(plot3d(x,y3,z3,add=TRUE))
Call rinterface.rrun(plot3d(x,y4,z4,add=TRUE))
Call rinterface.rrun(plot3d(x,y5,z5,add=TRUE))
Call rinterface.rrun(plot3d(x,y6,z6,add=TRUE))
 
Is there a way to lay a surface to this?
 



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] simulate binary markov chain

2008-12-16 Thread Chris Oldmeadow
Hi all, I was hoping somebody may know of a function for simulating a 
large binary sequence (length 10 million) using a (1st order) markov 
model with known (2x2) transition matrix. It needs to be reasonably 
fast. I have tried the following;


mc <- function(sq,P){
  s <- c()
  x <- row.names(P)
  n <- length(sq)
  p1 <- sum(sq)/n
  s[1] <- rbinom(1,1,p1);
  for ( i in 2:n){
    s[i] <- rbinom( 1, 1, P[s[i-1]+1] )
  }
  return(s)
}


P <- c(0.63,0.27)
x <- rbinom(500,1,0.5)
new <- mc(x,P)

thanks in advance!
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] surface contour plot help

2008-12-16 Thread Duncan Murdoch

On 16/12/2008 5:05 PM, Brad B wrote:

I was able to get a surface plot with wireframe; however, I can't rotate it 
around like you can with the plot3d function.
Is there a way to do this in R?


You are making your question impossible to answer, by not giving the 
right details.  If you show us code that uses wireframe to do what you 
want, surely someone could show you how to do the same thing in rgl.


That means putting together a small, self-contained example.  Don't post 
a vague description of your data and what you want, simplify your 
question to something we can see.


There are many ways to add a surface to a plot in rgl.  We have no idea 
which one would be appropriate for you.
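
Whichever surface function ends up being used, most of them want the heights as a matrix over a regular grid rather than as long x/y/z columns. A sketch of that reshaping step in base R follows; the data frame, its column names and its values are all hypothetical, since the real data were not posted.

```r
# Hypothetical long-format data: one reading per day for two years.
set.seed(1)
dat <- data.frame(
  day  = rep(1:5, times = 2),           # day of year (x axis)
  year = rep(c(2002, 2003), each = 5),  # year (y axis)
  vol  = runif(10)                      # measured value (z axis)
)

# Rows indexed by day, columns by year: the grid layout that persp() expects.
z     <- tapply(dat$vol, list(dat$day, dat$year), mean)
days  <- as.numeric(rownames(z))
years <- as.numeric(colnames(z))

# persp(days, years, z)  # base graphics; uncomment to draw
```

rgl's persp3d() takes its arguments in the same x, y, z-matrix form, so the reshaped data should carry over to a rotatable plot.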


Duncan Murdoch

 



 
To: r-help@r-project.org

Date: Tuesday, December 16, 2008, 9:13 AM







I am trying to do a surface profile plot.
data is 
X  Y(1) Z(1)

1-jan-02   2002number
2-jan-02   2002number
.
.
.
1-jan-03   2003 (Y2) number Z(2)
2-jan-03   2003 (Y2) number Z(2)
.
.
.
until dec 31 2007.
 
I used the plot3d functions to build a scatter point plot.

Call rinterface.rrun(library(rgl))
Call rinterface.rrun(plot3d(x,y1,z1,xlab='Date',ylab='Year',zlab='Vol',ylim=c(2001,2008)))
Call rinterface.rrun(plot3d(x,y2,z2,add=TRUE))
Call rinterface.rrun(plot3d(x,y3,z3,add=TRUE))
Call rinterface.rrun(plot3d(x,y4,z4,add=TRUE))
Call rinterface.rrun(plot3d(x,y5,z5,add=TRUE))
Call rinterface.rrun(plot3d(x,y6,z6,add=TRUE))
 
Is there a way to lay a surface to this?
 




  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulate binary markov chain

2008-12-16 Thread Charles C. Berry

On Wed, 17 Dec 2008, Chris Oldmeadow wrote:

Hi all, I was hoping somebody may know of a function for simulating a large 
binary sequence (length 10 million) using a (1st order) markov model with 
known (2x2) transition matrix. It needs to be reasonably fast.


Chris,

The trick is to recognize that the length of each run is a sample from
the geometric distribution (with 1 added to it). rgeom() is vectorized,
so using it provides fast results.

Suppose that your transition matrix is

   |   | 0     | 1     |
   |---+-------+-------|
   | 0 | pi.11 | pi.12 |
   | 1 | pi.21 | pi.22 |
   |---+-------+-------|

where pi.11+pi.12 == 1 and pi.21+pi.22 == 1

This function

 foo <- function(n,pi.12,pi.21) inverse.rle( list(values=rep(0:1,n) ,
        lengths=1+rgeom( 2*n, rep( c( pi.12, pi.21 ), n) )))

will generate a sequence of 0/1's according to that matrix with length 
approximately n/pi.12+n/pi.21

On my macbook I get this timing:


system.time(res <- foo(1205000,.3,.2))

   user  system elapsed
  1.088   0.204   1.291

prop.table(table(head(res,-1),tail(res,-1)),1) # check!


            0         1
  0 0.6999024 0.3000976
  1 0.1997453 0.8002547

length(res) # long enough!

[1] 10048040





So, if this is fast enough, you just choose 'n' to be a bit larger
than desired length divided by (1/pi.12+1/pi.21) and then discard the 
excess.
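
The choose-n-then-trim step can be wrapped up as below. This is a sketch under the same assumptions as foo above; sim_chain and its len argument are illustrative names, and, like foo, the chain always starts in state 0.

```r
# Sketch: simulate a two-state chain of exactly `len` steps via run lengths.
# sim_chain and `len` are illustrative names, not from the thread.
sim_chain <- function(len, pi.12, pi.21) {
  # One 0-run plus one 1-run has expected length 1/pi.12 + 1/pi.21.
  n <- ceiling(1.1 * len / (1 / pi.12 + 1 / pi.21)) + 10
  repeat {
    runs <- 1 + rgeom(2 * n, rep(c(pi.12, pi.21), n))
    if (sum(runs) >= len) break
    n <- 2 * n            # rare: sequence too short, retry with more runs
  }
  s <- inverse.rle(list(values = rep(0:1, n), lengths = runs))
  s[seq_len(len)]         # discard the excess
}

set.seed(123)
res   <- sim_chain(1e5, pi.12 = 0.3, pi.21 = 0.2)
trans <- prop.table(table(head(res, -1), tail(res, -1)), 1)
round(trans, 2)           # rows should be close to (0.7, 0.3) and (0.2, 0.8)
```

The 10% padding on n is an arbitrary safety margin; the repeat loop only matters in the unlikely case that the padded sequence still comes up short.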


Chuck


I have tried the following;

mc <- function(sq,P){
  s <- c()
  x <- row.names(P)
  n <- length(sq)
  p1 <- sum(sq)/n
  s[1] <- rbinom(1,1,p1);
  for ( i in 2:n){
    s[i] <- rbinom( 1, 1, P[s[i-1]+1] )
  }
  return(s)
}


P <- c(0.63,0.27)
x <- rbinom(500,1,0.5)
new <- mc(x,P)

thanks in advance!
Chris





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems with graphical devices, e.g., png(), pdf(): blurry graphical output

2008-12-16 Thread Y-H Chen
On my current home system, I am getting undesirable output from
graphical devices such as png() and pdf(). The graphical output is
blurry. I haven't experienced the problem on other systems. As you
will see from the attached text file (more information on this file
below), the problem does not occur when type='Xlib' is forced. The
blurriness is more severe with bitmap output (yes, I am viewing the
bitmap files at 100%), but occurs with pdf output as well.

Software details: Fedora 10, with at least the following packages:

-- R, R-core, R-devel
-- cairo, cairo-devel
-- pixman, pixman-devel
-- libpng, libpng-devel
-- poppler

Everything is current and updated via Fedora's repository. R was
installed via Fedora's repository.

I've attached some commands and output in a text file. This file includes:

(1) hardware information
(2) information about my R installation
(3) code for simple R graphics, with comments re output, plus URLs for
the corresponding graphical output

Any advice would be really appreciated.
[u...@localhost ~]$ lspci
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 
945GT Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 
943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML 
Express Integrated Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition 
Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 
(rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 
(rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 
(rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 
(rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge 
(rev 02)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller 
(rev 02)
00:1f.2 SATA controller: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA AHCI 
Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet 
Controller
03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network 
Connection (rev 02)
15:00.0 CardBus bridge: Texas Instruments PCI1510 PC card Cardbus Controller

[u...@localhost ~]$ R
 sessionInfo()
R version 2.8.0 (2008-10-20) 
i386-redhat-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 


 version
               _                           
platform       i386-redhat-linux-gnu       
arch           i386                        
os             linux-gnu                   
system         i386, linux-gnu             
status                                     
major          2                           
minor          8.0                         
year           2008                        
month          10                          
day            20                          
svn rev        46754                       
language       R                           
version.string R version 2.8.0 (2008-10-20)


 capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets 
    TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE     TRUE 
  libxml     fifo   cledit    iconv      NLS  profmem    cairo 
    TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE 


 X11.options(reset=TRUE)
 X11.options()
$display
[1] ""

$width
[1] NA

$height
[1] NA

$pointsize
[1] 12

$bg
[1] "transparent"

$canvas
[1] "white"

$gamma
[1] 1

$colortype
[1] "true"

$maxcubesize
[1] 256

$fonts
[1] "-adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*"
[2] "-adobe-symbol-medium-r-*-*-%d-*-*-*-*-*-*-*"

$xpos
[1] NA

$ypos
[1] NA

$title
[1] ""

$type
[1] "cairo"

$antialias
[1] 1

 
 X11.options(reset=TRUE)
 plot(1:10)
 ## result: box lines fuzzy at top and left, and appears 
 ## darker and thicker where the axes are overplotted

 X11.options(reset=TRUE)
 X11.options(antialias=2) # antialias=2 is 'none'
 plot(1:10,