Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Jim Lemon
Wayne Aldo Gavioli wrote:
> 
> Hello all,
> 
> 
> I'm trying to graph a scatterplot of a large data set (5,000 x,y
> coordinates), with the caveat that many of the data points overlap with
> each other (share the same x AND y coordinates).  Using the usual "plot"
> command,
> 
>>plot(education, xlab="etc", ylab="etc")
> 
> it seems that the overlap of points is not shown in the graph.  Namely,
> there are 5,000 points that should be plotted, as I mentioned above, but
> because so many of the points overlap with each other exactly, only about
> 50-60 points are actually plotted on the graph.  Thus, there's no
> indication that Point A shares its coordinates with 200 other pieces of
> data and thus is very common, while Point B doesn't share its coordinates
> with any other pieces of data and thus isn't common at all.  Is there any
> way to indicate the frequency of such points on such a graph?  Should I be
> using a different command than "plot"?
> 
Hi Wayne,
While this is not a really pretty picture, you can get a viewable plot 
with count.overplot if the first two elements of "education" are named 
"x" and "y" and they are the coordinates you want to plot. Otherwise, 
pass the x and y coordinates separately.

library(plotrix)
count.overplot(education,
  tol=c(diff(range(education$x))/10,
  diff(range(education$y))/10))

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Jari Oksanen
Wayne Aldo Gavioli writes:

> 
> 
> Hello all,
> 
> I'm trying to graph a scatterplot of a large data set (5,000 x,y
> coordinates), with the caveat that many of the data points overlap with
> each other (share the same x AND y coordinates).  Using the usual "plot"
> command,
> 
> > plot(education, xlab="etc", ylab="etc")
> 
> it seems that the overlap of points is not shown in the graph.  Namely,
> there are 5,000 points that should be plotted, as I mentioned above, but
> because so many of the points overlap with each other exactly, only about
> 50-60 points are actually plotted on the graph.  Thus, there's no
> indication that Point A shares its coordinates with 200 other pieces of
> data and thus is very common, while Point B doesn't share its coordinates
> with any other pieces of data and thus isn't common at all.  Is there any
> way to indicate the frequency of such points on such a graph?  Should I be
> using a different command than "plot"?
> 
> 
One suggestion seems to be still missing: 'sunflowerplot' of base R. May look
taggy, though, if you have 200 "petals". 
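
A minimal sketch of the sunflowerplot suggestion, using made-up data in place
of "education" (which is not available here). Each distinct (x, y) location is
drawn once, with one ray per coinciding observation; xyTable() gives the same
multiplicities as numbers. The grid size and sample size are assumptions chosen
only for illustration.

```r
# Simulated stand-in for the poster's data: 200 points on a 5 x 5 grid,
# so many observations share exactly the same coordinates.
set.seed(1)
x <- sample(1:5, 200, replace = TRUE)
y <- sample(1:5, 200, replace = TRUE)

# One symbol per distinct location; the number of rays shows the multiplicity.
sunflowerplot(x, y, xlab = "x", ylab = "y")

# The counts behind the plot, one row per distinct (x, y) pair.
counts <- as.data.frame(xyTable(x, y))
head(counts)
```

With multiplicities in the hundreds the rays will likely merge, so the table
of counts may be the more readable summary in that case.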

Actually the documentation of sunflowerplot is wrong in botanical sense.
Sunflowers have composite flowers in capitula, and the things called 'petals' in
documentation are ligulate, sterile ray-florets (each with vestigial petals
which are not easily visible in sunflower, but in some other species you may see
three (occasionally two) teeth). 

cheers, jari oksanen



Re: [R] bar plot colors

2007-12-18 Thread Jim Lemon
Winkel, David wrote:
> All,
> 
>  
> 
> I have a question regarding colors in bar plots.  I want to stack a
> total of 18 cost values in each bar. Basically, it is six cost types and
> each cost type has three components- direct, indirect, and induced
> costs.  I would like to use both solid color bars and bars with the
> slanted lines (using the density parameter).  The colors would
> distinguish cost types and the lines would distinguish
> direct/indirect/induced.  I want the cost types (i.e. colors) to be
> stacked together for each cost type.  In other words, I don't want all
> of the solid bars at the bottom and all of the slanted lines at the top.
> 
> 
> So far, I have made a bar plot with all solid colors and then tried to
> overwrite that bar plot by calling barplot() again and putting the white
> slanted lines across the bars.  However, I can't get this method to work
> while still grouping the cost types together.
> 
>  
Hi David,
This is a real challenge:

heights<-matrix(sample(10:70,54),ncol=3)
bar.colors<-rep(rep(2:7,each=3),3)
bar.densities<-rep(10,54)
bar.angles<-matrix(rep(rep(c(45,90,135),6),3),ncol=3)
barplot(heights,col=bar.colors)
barplot(heights,angle=bar.angles,add=TRUE,density=bar.densities)

Jim



[R] ggplot-How to define fill colours?

2007-12-18 Thread Pedro de Barros
Dear R's (most likely Hadley),

I want to build a stacked bar plot where I would like to define which 
colours will be used for each of the groups. However, I do not seem 
to find a way to do this, even if I've been looking over many places.

I have tried several variations, and my final try was this code, but 
I still do not manage to get the colours as I pre-define. Any hints 
about how to get this?

Thanks in advance,
Pedro

my code:
 >plotdata1<-data.frame(x=rep(factor(1:4),4), y=rep(0.1*(1:4),4), 
+group=as.character(rep(c('white', 'red', 'blue', 'green'),rep(4,4))))
 >plot0<-ggplot()
 >plot3<-plot0+layer(data=plotdata1, mapping=aes_string(x='x',y='y', 
+fill='group'),geom='bar', stat='identity', position='stack')
 >print(plot3)



[R] integration

2007-12-18 Thread [EMAIL PROTECTED]
Dear All,
I need to perform a numerical integration of one-dimensional 
functions. The limits of integration are both finite and the functions 
I'm working on are quite complicated. I have already tried both area() 
and integrate(), but they do not perform well: area() is very slow and 
integrate() does not converge. Are there other functions in R for numerical 
integration of one-dimensional functions?

Thanks in advance 

Davide
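
Before looking for other functions, it may be worth checking whether
integrate() behaves better with more subdivisions, a tighter tolerance, or the
range split into pieces. The sketch below assumes a hypothetical oscillating
integrand (not the poster's function) whose exact integral is known, so both
approaches can be checked; subdivisions and rel.tol are documented arguments of
integrate().

```r
# Hypothetical test integrand: a damped, rapidly oscillating function.
a <- 50
f <- function(x) exp(-x) * sin(a * x)

# Closed-form antiderivative, used only to check the numerical results.
antideriv <- function(x) -exp(-x) * (sin(a * x) + a * cos(a * x)) / (1 + a^2)
exact <- antideriv(10) - antideriv(0)

# Option 1: allow more subintervals and ask for a tighter tolerance.
whole <- integrate(f, lower = 0, upper = 10,
                   subdivisions = 1000L, rel.tol = 1e-8)

# Option 2: split the range into shorter pieces and add up the results.
breaks <- seq(0, 10, by = 1)
pieces <- sapply(seq_len(length(breaks) - 1), function(i)
  integrate(f, breaks[i], breaks[i + 1], rel.tol = 1e-8)$value)
piecewise <- sum(pieces)

c(exact = exact, whole = whole$value, piecewise = piecewise)
```

Splitting tends to help most when the break points can be placed at known
trouble spots of the integrand, such as kinks or sharp peaks.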








[R] hazard ratio of interaction Cox model

2007-12-18 Thread Giulia Barbati
Dear Forum,
I have a question about the interaction estimate in the Cox model: 
why is the hazard ratio of the interaction not produced in the summary of the 
model?
(The estimate of the coefficient, by contrast, is given when the model is printed.) 


# Example:

modINT <-cph( Surv(T_BASE, T_FIN,STATUS)~ NYHA + ASINI + RFP + FE_REC + 
XX_PR*XX_DISF)

print(modINT)


                  coef se(coef)     z        p
NYHA=2           1.254    0.584  2.15 0.031767
ASINI            0.665    0.409  1.62 0.104247
RFP=2            0.725    0.704  1.03 0.302578
FE_REC=2        -1.637    0.810 -2.02 0.043331
XX_PR            2.189    0.649  3.37 0.000748
XX_DISF          3.233    1.000  3.23 0.001222
XX_PR * XX_DISF -2.852    1.280 -2.23 0.025852

summary(modINT)

 Effects  Response : Surv(T_BASE, T_FIN, STATUS) 

 Factor          Low    High Diff.  Effect S.E. Lower 0.95 Upper 0.95
 ASINI           2.0725 2.85 0.7775  0.52  0.32      -0.11       1.14
  Hazard Ratio   2.0725 2.85 0.7775  1.68    NA       0.90       3.13
 XX_PR           0.0000 1.00 1.0000  2.19  0.65       0.92       3.46
  Hazard Ratio   0.0000 1.00 1.0000  8.92    NA       2.50      31.86
 XX_DISF         0.0000 1.00 1.0000  3.23  1.00       1.27       5.19
  Hazard Ratio   0.0000 1.00 1.0000 25.35    NA       3.57     179.88
 NYHA - 2:1      1.0000 2.00     NA  1.25  0.58       0.11       2.40
  Hazard Ratio   1.0000 2.00     NA  3.50    NA       1.12      11.00
 RFP - 2:1       1.0000 2.00     NA  0.73  0.70      -0.65       2.10
  Hazard Ratio   1.0000 2.00     NA  2.07    NA       0.52       8.20
 FE_REC - 2:1    1.0000 2.00     NA -1.64  0.81      -3.23      -0.05
  Hazard Ratio   1.0000 2.00     NA  0.19    NA       0.04       0.95

Adjusted to: XX_PR=0 XX_DISF=0  
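
One possible reading of the output above, offered as a sketch rather than an
authoritative answer: with an interaction in the model, the hazard ratio for
one variable depends on the level of the other, and the summary reports each
effect with the other variable held at the "Adjusted to" value (here 0). The
arithmetic below uses only the coefficients printed above; the variable names
are made up for the illustration.

```r
# Coefficients copied from the printed model above.
b_pr   <- 2.189    # XX_PR
b_disf <- 3.233    # XX_DISF
b_int  <- -2.852   # XX_PR * XX_DISF

# Hazard ratio for XX_PR (1 vs 0) when XX_DISF = 0; this is the kind of
# quantity shown in the summary, which is adjusted to XX_DISF = 0.
hr_pr_disf0 <- exp(b_pr)

# Hazard ratio for XX_PR (1 vs 0) when XX_DISF = 1: the interaction
# coefficient is added on the log-hazard scale before exponentiating.
hr_pr_disf1 <- exp(b_pr + b_int)

# exp() of the interaction coefficient alone is the ratio of those two
# hazard ratios, not a hazard ratio for any single comparison of subjects.
ratio_of_hrs <- exp(b_int)

c(hr_pr_disf0 = hr_pr_disf0, hr_pr_disf1 = hr_pr_disf1,
  ratio_of_hrs = ratio_of_hrs)
```

Confidence limits for the second hazard ratio would need the covariance
between the two coefficients, which is not shown in the post.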




  



Re: [R] R-users

2007-12-18 Thread Peter Dalgaard
Kunio takezawa wrote:
> R-users
> E-mail: r-help@r-project.org
>
>I have a question on "gam()" in "gam" package.
> The help of gam() says:
>   'gam' uses the _backfitting
>  algorithm_ to combine different smoothing or fitting methods.
>
> On the other hand,  lm.wfit(), which is a routine of gam.fit() contains:
>
> z <- .Fortran("dqrls", qr = x * wts, n = n, p = p, y = y *
> wts, ny = ny, tol = as.double(tol), coefficients = mat.or.vec(p,
> ny), residuals = y, effects = mat.or.vec(n, ny), rank = integer(1),
> pivot = 1:p, qraux = double(p), work = double(2 * p),
> PACKAGE = "base")
> It may indicate that QR decomposition is used to derive an additive model
> instead of backfitting.
>I am wondering if my guess is correct, or this "the _backfitting
> algorithm"
> has another meaning.
>   
Please don't ask the same question multiple times!

And no, backfitting and QR are unrelated concepts. You need to read up
on the theory, there are two fundamental books: Hastie & Tibshirani (gam
package) and Simon Wood (mgcv package). Both are a bit much to ask to
have summarized in email.
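
To make the distinction concrete, here is a toy sketch of the backfitting idea
with two smooth terms. It is not the code used by the gam package; the
simulated data, the use of smooth.spline() as the smoother, the fixed df = 5,
and the fixed 20 iterations are all assumptions made for illustration. The
outer loop is backfitting; how each individual smoother or linear fit is
solved internally (for example by a QR decomposition) is a separate matter.

```r
# Simulated data with two additive smooth effects (illustrative only).
set.seed(42)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.1)

alpha <- mean(y)      # intercept estimate
f1 <- rep(0, n)       # current estimate of the smooth in x1
f2 <- rep(0, n)       # current estimate of the smooth in x2

for (iter in 1:20) {
  # Smooth the partial residuals for x1, holding f2 fixed, then centre.
  r1 <- y - alpha - f2
  f1 <- predict(smooth.spline(x1, r1, df = 5), x1)$y
  f1 <- f1 - mean(f1)

  # Smooth the partial residuals for x2, holding f1 fixed, then centre.
  r2 <- y - alpha - f1
  f2 <- predict(smooth.spline(x2, r2, df = 5), x2)$y
  f2 <- f2 - mean(f2)
}

fitted_vals <- alpha + f1 + f2
cor(fitted_vals, y)
```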

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] R command "leap"

2007-12-18 Thread Maura E Monville
After applying the "step" command to a long list of predictors I came up
with the following model, which still has some non-significant
coefficients.
I'd like to try the command "leap". However, I don't quite understand how it
returns the retained predictor variables.
Can someone please help?

Thank you in advance.

Mara EM

>summary(stepmod)

Call:
lm(formula = yy ~ cosmat[, 1] + cosmat[, 2] + cosmat[, 3] + cosmat[,
4] + cosmat[, 5] + cosmat[, 6] + cosmat[, 7] + cosmat[, 8] +
cosmat[, 9] + cosmat[, 11] + cosmat[, 12] + cosmat[, 15] +
cosmat[, 16] + cosmat[, 17] + cosmat[, 22] + cosmat[, 25] +
cosmat[, 29] + cosmat[, 30] + cosmat[, 31] + cosmat[, 38] +
cosmat[, 42] + cosmat[, 44] + cosmat[, 45] + cosmat[, 46] +
cosmat[, 47] + cosmat[, 48] + cosmat[, 49] + cosmat[, 50] +
cosmat[, 51] + cosmat[, 52] + cosmat[, 53] + cosmat[, 54] +
cosmat[, 56] + sinmat[, 1] + sinmat[, 3] + sinmat[, 4] +
sinmat[, 5] + sinmat[, 6] + sinmat[, 8] + sinmat[, 9] + sinmat[,
10] + sinmat[, 13] + sinmat[, 14] + sinmat[, 17] + sinmat[,
19] + sinmat[, 22] + sinmat[, 23] + sinmat[, 26] + sinmat[,
35] + sinmat[, 36] + sinmat[, 39] + sinmat[, 43] + sinmat[,
45] + sinmat[, 46] + sinmat[, 47] + sinmat[, 48] + sinmat[,
49] + sinmat[, 50] + sinmat[, 51] + sinmat[, 52] + sinmat[,
53])

Residuals:
      Min        1Q    Median        3Q       Max
-0.175619 -0.009864  0.001284  0.008596  0.160115

Coefficients:
              Estimate Std. Error  t value Pr(>|t|)
(Intercept)  -0.474379   0.003426 -138.484  < 2e-16 ***
cosmat[, 1]  -0.767037   0.004832 -158.726  < 2e-16 ***
cosmat[, 2]  -0.214150   0.004833  -44.306  < 2e-16 ***
cosmat[, 3]  -0.022436   0.004836   -4.639 2.15e-05 ***
cosmat[, 4]  -0.022201   0.004840   -4.587 2.58e-05 ***
cosmat[, 5]  -0.052807   0.004845  -10.899 1.84e-15 ***
cosmat[, 6]   0.012219   0.004847    2.521 0.014573 *
cosmat[, 7]   0.033468   0.004854    6.895 5.16e-09 ***
cosmat[, 8]   0.007026   0.004857    1.447 0.153564
cosmat[, 9]  -0.009502   0.004858   -1.956 0.055481 .
cosmat[, 11]  0.015284   0.004858    3.146 0.002647 **
cosmat[, 12]  0.006231   0.004856    1.283 0.204778
cosmat[, 15]  0.005461   0.004841    1.128 0.264160
cosmat[, 16]  0.010987   0.004841    2.269 0.027110 *
cosmat[, 17]  0.005367   0.004836    1.110 0.271857
cosmat[, 22] -0.005541   0.004834   -1.146 0.256570
cosmat[, 25] -0.010025   0.004849   -2.068 0.043310 *
cosmat[, 29] -0.006114   0.004861   -1.258 0.213668
cosmat[, 30] -0.005734   0.004860   -1.180 0.243061
cosmat[, 31] -0.005932   0.004860   -1.221 0.227281
cosmat[, 38]  0.006269   0.004831    1.298 0.199735
cosmat[, 42]  0.006167   0.004832    1.276 0.207140
cosmat[, 44] -0.006002   0.004854   -1.237 0.221395
cosmat[, 45]  0.006484   0.004849    1.337 0.186496
cosmat[, 46]  0.010074   0.004851    2.077 0.042425 *
cosmat[, 47] -0.004843   0.004859   -0.997 0.323225
cosmat[, 48] -0.009098   0.004852   -1.875 0.065982 .
cosmat[, 49]  0.008740   0.004865    1.797 0.077797 .
cosmat[, 50]  0.009175   0.004846    1.893 0.063481 .
cosmat[, 51] -0.005664   0.004865   -1.164 0.249310
cosmat[, 52] -0.007255   0.004853   -1.495 0.140536
cosmat[, 53]  0.004950   0.004845    1.022 0.311249
cosmat[, 54]  0.006561   0.004853    1.352 0.181764
cosmat[, 56] -0.005954   0.004847   -1.228 0.224436
sinmat[, 1]  -0.086812   0.004858  -17.871  < 2e-16 ***
sinmat[, 3]  -0.041194   0.004853   -8.488 1.22e-11 ***
sinmat[, 4]  -0.040007   0.004849   -8.250 3.00e-11 ***
sinmat[, 5]   0.012811   0.004844    2.645 0.010593 *
sinmat[, 6]   0.010845   0.004842    2.240 0.029096 *
sinmat[, 8]  -0.018422   0.004833   -3.812 0.000346 ***
sinmat[, 9]   0.008006   0.004831    1.657 0.103063
sinmat[, 10]  0.023215   0.004831    4.805 1.20e-05 ***
sinmat[, 13]  0.013663   0.004837    2.824 0.006550 **
sinmat[, 14]  0.015827   0.004845    3.267 0.001860 **
sinmat[, 17]  0.006820   0.004853    1.405 0.165442
sinmat[, 19]  0.007181   0.004861    1.477 0.145211
sinmat[, 22]  0.010290   0.004855    2.120 0.038492 *
sinmat[, 23]  0.005533   0.004854    1.140 0.259254
sinmat[, 26]  0.014857   0.004839    3.070 0.003298 **
sinmat[, 35] -0.006008   0.004847   -1.240 0.220321
sinmat[, 36] -0.004984   0.004845   -1.029 0.308021
sinmat[, 39] -0.006897   0.004862   -1.419 0.161574
sinmat[, 43] -0.004804   0.004828   -0.995 0.324064
sinmat[, 45]  0.005079   0.004842    1.049 0.298680
sinmat[, 46] -0.008775   0.004839   -1.813 0.075144 .
sinmat[, 47] -0.008346   0.004831   -1.728 0.089575 .
sinmat[, 48]  0.007352   0.004839    1.519 0.134335
sinmat[, 49]  0.007382   0.004827    1.529 0.131870
sinmat[, 50] -0.005618   0.004844   -1.160 0.251083
sinmat[, 51] -0.007934   0.004826   -1.644 0.105786
sinmat[, 52]  0.006890   0.004839    1.424 0.160050
sinmat[, 53]  0.009242   0.004847    1.907 0.061708 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03716 on 56 degrees of freedom
Multiple R-Squared: 0.998,  Adjusted R-square

[R] axis names in triangle.plot

2007-12-18 Thread Thomas Hoffmann
Hi folks,

I am using a data.frame with sediment grain sizes:

 > grain
 sand  silt  clay
OAT 10.03 56.77 18.25
OAT 10.40 57.40 17.94
WG1 50.03 20.68 12.57
WG1 43.20 25.69 13.41
WG1 33.89 31.10 14.48
WG2  2.84 62.81 20.79
WG2  2.79 60.46 19.16
WG2 16.27 33.04  6.48
WG2  1.39 57.90  9.13
WG3  4.54 52.91 17.20
WG3  5.20 50.55 15.65
WG3  7.71 49.13 10.80
WG3  4.43 50.03 11.83
WG3  1.72 57.53 14.20
WG3  1.51 58.99 13.96

I would like to do a triangle.plot with labeled axis-names "sand" "silt" 
and "clay". However, using the command:

triangle.plot(grain)

(from the ade4 package) does not plot the axis names, and there is no parameter 
to set the labels like "xlab" in the plot command. Does anybody have 
good advice?

Thanks in advance
Thomas



Re: [R] read.table() and precision?

2007-12-18 Thread Prof Brian Ripley
On Mon, 17 Dec 2007, Moshe Olshansky wrote:

> Dear List,
>
> Following the below question I have a question of my
> own:
> Suppose that I have large matrices which are produced
> sequentially and must be used sequentially in the
> reverse order. I do not have enough memory to store
> them and so I would like to write them to disk and
> then read them. This raises two questions:
> 1) what is the fastest (and the most economic
> space-wise) way to do this?

Using save/load is the simplest.  Don't worry about finding better 
solutions until you know those are not good enough.  (serialize / 
unserialize is another interface to the same underlying idea.)
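
A minimal sketch of the save/load round trip for one matrix, assuming a
temporary file and a small random matrix as stand-ins for the real data. The
object comes back under its original name and is bit-for-bit identical.

```r
# Stand-in for one of the large matrices.
m <- matrix(rnorm(6), nrow = 2)
m_orig <- m                      # kept only so the round trip can be checked

# Write the object to disk, then drop it from memory.
f <- tempfile(fileext = ".RData")
save(m, file = f)
rm(m)

# load() restores the object under the name it was saved with.
load(f)
identical(m, m_orig)
unlink(f)
```

For matrices produced in sequence, one file per matrix (for example with an
index in the file name) would allow them to be read back in reverse order.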

> 2) functions like write, write.table, etc. write the
> data the way it is printed and this may result in a
> loss of accuracy. Is there any way to prevent this,
> except for setting the "digits" option to a higher
> value or using format prior to writing the data?

Do please read the help before making false claims. ?write.table says

  Real and complex numbers are written to the maximal possible
  precision.

OTOH, ?write says it is a wrapper for cat, whose help says

  'cat' converts numeric/complex elements in the same way as 'print'
  (and not in the same way as 'as.character' which is used by the S
  equivalent), so 'options' '"digits"' and '"scipen"' are relevant.
  However, it uses the minimum field width necessary for each
  element, rather than the same field width for all elements.

so this hints as.character() might be a useful preprocessor.

> Is it possible to write binary files (similar to Fortran)?

See ?writeBin.  save/load by default write binary files, but use of 
writeBin can be faster (and less flexible).
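
A minimal sketch of the writeBin route, assuming a simple ad hoc layout (the
two dimensions as integers, followed by the values as doubles) that is not any
standard file format. Unlike save/load, nothing about the object is stored
unless it is written explicitly, which is the loss of flexibility mentioned
above.

```r
# Stand-in for one of the large matrices.
m <- matrix(rnorm(12), nrow = 3)

# Write the dimensions first, then the data in column-major order.
f <- tempfile(fileext = ".bin")
con <- file(f, "wb")
writeBin(dim(m), con)          # two integers
writeBin(as.vector(m), con)    # prod(dim(m)) doubles
close(con)

# Read back in the same order and rebuild the matrix.
con <- file(f, "rb")
d <- readBin(con, what = "integer", n = 2)
v <- readBin(con, what = "double", n = prod(d))
close(con)
m2 <- matrix(v, nrow = d[1], ncol = d[2])

identical(m, m2)
unlink(f)
```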

> Any suggestion will be greatly appreciated.

Somehow you have missed a great deal of information about R I/O.
Try help.start() and reading the sections the search engine shows you 
that look relevant.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Res: Scatterplot Showing All Points

2007-12-18 Thread S Ellison
?jitter is simpler:

x<-rep(1:10,10)
y<-x
plot(x,y)   #100 points, only 10 show
plot(jitter(x),jitter(y))   #overlap removed.



>>> Milton Cezar Ribeiro <[EMAIL PROTECTED]> 18/12/2007 04:36 >>>
Hi Wayne,

I have two suggestions for you.
1. You add some random noise to both the x and y data, or
2. You graph bubble points, where the size is proportional to the
frequency of the xy combination.

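n<-5000   # number of points; assumed here, the archived post shows 1
x<-sample(1:10,n,replace=TRUE)
y<-sample(1:10,n,replace=TRUE)
xy<-cbind(x,y)
x11(14,8)   # width and height are given in inches
par(mfrow=c(1,3))
plot(xy)   # raw data: at most 100 distinct points are visible
xy.random<-xy+rnorm(2*n,0.1,0.1)   # independent noise for every x and y
plot(xy.random,cex=0.1)
xy.tab<-data.frame(table(x,y))   # frequency of each (x,y) combination
xy.tab$x<-as.numeric(as.character(xy.tab$x))
xy.tab$y<-as.numeric(as.character(xy.tab$y))
min.freq<-min(xy.tab$Freq)
max.freq<-max(xy.tab$Freq)
# bubble plot: symbol size grows with the frequency of the combination
plot(xy.tab$x,xy.tab$y,cex=(xy.tab$Freq-min.freq)/(max.freq-min.freq)*5)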

Kind regards,

Miltinho
Brazil


- Original message -
From: Wayne Aldo Gavioli <[EMAIL PROTECTED]>
To: r-help@r-project.org
Sent: Monday, 17 December 2007 22:14:23
Subject: [R] Scatterplot Showing All Points



Hello all,


I'm trying to graph a scatterplot of a large data set (5,000 x,y
coordinates), with the caveat that many of the data points overlap with
each other (share the same x AND y coordinates).  Using the usual "plot"
command,


> plot(education, xlab="etc", ylab="etc")


it seems that the overlap of points is not shown in the graph.  Namely,
there are 5,000 points that should be plotted, as I mentioned above, but
because so many of the points overlap with each other exactly, only about
50-60 points are actually plotted on the graph.  Thus, there's no
indication that Point A shares its coordinates with 200 other pieces of
data and thus is very common, while Point B doesn't share its coordinates
with any other pieces of data and thus isn't common at all.  Is there any
way to indicate the frequency of such points on such a graph?  Should I be
using a different command than "plot"?


Thanks,


Wayne



Re: [R] ggplot-How to define fill colours?

2007-12-18 Thread ONKELINX, Thierry
Pedro,

I've had a similar problem. See this post for the solution:
http://thread.gmane.org/gmane.comp.lang.r.general/100649/focus=100673

Cheers,

Thierry 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
[EMAIL PROTECTED] 
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

-Original message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On behalf of Pedro de Barros
Sent: Monday 17 December 2007 23:55
To: [EMAIL PROTECTED]
Subject: [R] ggplot-How to define fill colours?
Importance: High

Dear R's (most likely Hadley),

I want to build a stacked bar plot where I would like to define which 
colours will be used for each of the groups. However, I do not seem 
to find a way to do this, even if I've been looking over many places.

I have tried several variations, and my final try was this code, but 
I still do not manage to get the colours as I pre-define. Any hints 
about how to get this?

Thanks in advance,
Pedro

my code:
 >plotdata1<-data.frame(x=rep(factor(1:4),4), y=rep(0.1*(1:4),4), 
+group=as.character(rep(c('white', 'red', 'blue', 'green'),rep(4,4))))
 >plot0<-ggplot()
 >plot3<-plot0+layer(data=plotdata1, mapping=aes_string(x='x',y='y', 
+fill='group'),geom='bar', stat='identity', position='stack')
 >print(plot3)



Re: [R] axis names in triangle.plot

2007-12-18 Thread Stéphane Dray
Hi Thomas,
This looks quite strange. By default, the function uses the column names 
as labels.
What is grain? A data.frame? Why have you duplicated row.names? Please 
return the results of class(grain) and names(grain).

Cheers,
PS: If you have questions related to ade4, you can use the adelist 
(http://listes.univ-lyon1.fr/wws/info/adelist)


Thomas Hoffmann wrote:
> Hi folks,
>
> I am using a data.frame with sediment grain sizes:
>
>  > grain
>  sand  silt  clay
> OAT 10.03 56.77 18.25
> OAT 10.40 57.40 17.94
> WG1 50.03 20.68 12.57
> WG1 43.20 25.69 13.41
> WG1 33.89 31.10 14.48
> WG2  2.84 62.81 20.79
> WG2  2.79 60.46 19.16
> WG2 16.27 33.04  6.48
> WG2  1.39 57.90  9.13
> WG3  4.54 52.91 17.20
> WG3  5.20 50.55 15.65
> WG3  7.71 49.13 10.80
> WG3  4.43 50.03 11.83
> WG3  1.72 57.53 14.20
> WG3  1.51 58.99 13.96
>
> I would like to do a triangle.plot with labeled axis-names "sand" "silt" 
> and "clay". However, using the command:
>
> triangle.plot(grain)
>
> (from the ade4 package) does not plot the axis names, and there is no parameter 
> to set the labels like "xlab" in the plot command. Does anybody have 
> good advice?
>
> Thanks in advance
> Thomas
>
>
>
>   


-- 
Stéphane DRAY ([EMAIL PROTECTED] )
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
Tel: 33 4 72 43 27 57   Fax: 33 4 72 43 13 88
http://biomserv.univ-lyon1.fr/~dray/



Re: [R] Dual Core vs Quad Core

2007-12-18 Thread S Ellison
Hiding in the windows faq is the observation that "R's computation is
single-threaded, and so it cannot use more than one CPU". So multi-core
should make no difference other than allowing R to run with less
interruption from other tasks. That is often a significant advantage,
though.




>>> Andrew Perrin <[EMAIL PROTECTED]> 18/12/2007 01:13 >>>
On Mon, 17 Dec 2007, Kitty Lee wrote:

> Dear R-users,
>
> I use R to run spatial stuff and it takes up a lot of ram. Runs can
> take hours or days. I am thinking of getting a new desktop. Can R take
> advantage of the dual-core system?
>
> I have a dual-core computer at work. But it seems that right now R is
> using only one processor.
>
> The new computers feature quad core with 3GB of RAM. Can R take
> advantage of the 4 chips? Or am I better off getting a dual core with
> faster processing speed per chip?
>
> Thanks! Any advice would be really appreciated!
>
> K.

If I have my information right, R will use dual- or quad-cores if it's
doing two (or four) things at once. The second core will help a little
bit insofar as whatever else your machine is doing won't interfere with
the one core on which it's running, but generally things that take a
single thread will remain on a single core.

As for RAM, if you're doing memory-bound work you should certainly be
using a 64-bit machine and OS so you can utilize the larger memory
space.


--
Andrew J Perrin - andrew_perrin (at) unc.edu -
http://perrin.socsci.unc.edu 
Associate Professor of Sociology; Book Review Editor, _Social Forces_
University of North Carolina - CB#3210, Chapel Hill, NC 27599-3210 USA



Re: [R] gene shaving method

2007-12-18 Thread Turner, Heather
Gene shaving is implemented in the GeneClust package for R, which you can
download from
http://odin.mdacc.tmc.edu/~kim/geneclust/
For more details see "The Analysis of Gene Expression Data: Methods and
Software", edited by Giovanni Parmigiani, Elizabeth S. Garrett, Rafael A.
Irizarry and Scott L. Zeger (you can get a preview on Google Book Search:
http://books.google.com/books?id=r9gROQvdelcC&pg=PA352&lpg=PA352&dq=geneclust&source=web&ots=3FO3jQlfrp&sig=BeOgUK2cgfuv7d12vWsvpRXOkBU#PPA353,M1)

Best wishes,

Heather

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Aimin Yan
Sent: 17 December 2007 19:51
To: r-help@r-project.org
Subject: [R] gene shaving method

Does anyone know if Hastie's gene shaving method is implemented in R

Thanks,

Aimin



[R] accessing dimension names

2007-12-18 Thread born . to . b . wyld
I have a matrix y:

> dimnames(y)
$x93
[1] "1" "2"

$x94
[1] "0" "1" "2"
.. so on  (there are other dimensions as well)



I need to access a particular dimension, but a random mechanism tells me
which dimension it will be. So sometimes I might need to access
dimnames(y)$x93, at other times dimnames(y)$x94, and so on.
Now let that random dimension be idx; then dimnames(y)$paste('x',idx,sep='')
doesn't work.

Can anyone help?

Thanks!
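
A sketch of one way this is usually handled: the $ operator does not evaluate
what follows it, so a name built with paste() has to be used with [[ ]]
instead. The small array below is a made-up stand-in for y, with only the two
dimensions shown in the post.

```r
# Stand-in for y: a 2 x 3 array with named dimnames.
y <- array(0, dim = c(2, 3),
           dimnames = list(x93 = c("1", "2"), x94 = c("0", "1", "2")))

idx <- 94                                 # chosen by the random mechanism
dim_name <- paste("x", idx, sep = "")     # builds the string "x94"

# [[ ]] accepts a character string computed at run time.
levels_idx <- dimnames(y)[[dim_name]]
print(levels_idx)
```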



Re: [R] Two repeated warnings when running gam(mgcv) to analyse my dataset?

2007-12-18 Thread Simon Wood
The model here is just a penalised GLM, and the warnings relate to the GLM 
fitting process. Fitted probabilities of 0 or 1 can be perfectly appropriate, 
but do indicate that the linear predictor is not really uniquely defined, and 
that some care may be needed in interpreting results (for example, if the 
fitted probabilities are zero or one, then a CI for the corresponding linear 
predictor will depend more on the prior assumptions about smoothness than 
anything else). This problem is not really GAM specific, it relates to any 
`logistic regression' model. 

Similarly, the GLM fitting IRLS iterations are not guaranteed to converge, and 
can fail, especially for overly flexible logistic regression models. Try 
this, for example

x <- 1:10
y <- c(0,0,0,0,0,1,1,1,1,1)
glm(y~x,family=binomial)

I get...
...
Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = 
etastart,  :
  algorithm did not converge
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = 
etastart,  :
  fitted probabilities numerically 0 or 1 occurred

...as models become more complex the scope for this sort of thing to happen 
increases, and some simplification may be appropriate. 

That said, mgcv::gam fitting with all smoothing parameters fixed, is slightly 
more likely to fail in this way than `glm' or `mgcv::gam' with some smoothing 
parameters  estimated, because of the steps taken to stabilise divergent fit 
iterations. When all smoothing parameters are fixed, mgcv uses older fitting 
routines that don't try as hard to stabilise a divergent fit as the newer 
fitting routines. This is a bit of an anomaly and I'll try and fix it for a 
future release. 

best,
Simon



On Monday 17 December 2007 11:53, zhijie zhang wrote:
> Dear Simon,
> Sorry for an incomplete listing of the question.
> #mgcv version is  1.3-29, R 2.6.1, windows XP
> #m.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+
>disbinary,family=binomial(logit),data=point) The above program's the core
> codes in my following loop programs.
>  It works well if i run the above codes only one time for my dataset, but
> warnings will occur if i run many times for the following loop.
>
> > while (j<1001) {
>
> +  index=sample(ID, replace=F)
> +  m.data$x=coords[index,]$x
> +  m.data$y=coords[index,]$y
> +  # For each permutation, we run the GAM using the optimal span for the
> above model m.gam
> +  s.gam
> <-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbin
>ary,,sp=c( 5.582647e-07,4.016504e-02,2.300424e-04,1.274065e+03,9.558236e-09,
> 1.868827e-08),family=binomial(logit),data=m.data)
> +  permresults[,i]=predict.gam(s.gam)
> +  i=i+1
> +  if (j%%100==0) print(i)
> +  j=j+1
> +  }
> [1] 101
> [1] 201
> [1] 301
> [1] 401
> [1] 501
> [1] 601
> [1] 701
> [1] 801
> [1] 901
> [1] 1001
> warnings() over 50
>
> > warnings()
>
> 1: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...
> : fitted probabilities numerically 0 or 1 occurred
> ..
> 14: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...
>
>   Algorithm did not converge
> ..
>
> On Dec 17, 2007 4:54 PM, Simon Wood <[EMAIL PROTECTED]> wrote:
> > What mgcv version are you running (and on what platform)?
> >
> > On Thursday 13 December 2007 17:46, zhijie zhang wrote:
> > > Dear all,
> > >  I run the GAMs (generalized additive models) in gam(mgcv) using the
> > > following codes.
> > >
> > > m.gam
> >
> > <-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disb
> >in
> >
> > >ary,family=binomial(logit),data=point)
> > >
> > >  And two repeated warnings appeared.
> > > Warnings:
> > > 1: In gam.fit(G, family = G$family, control = control, gamma = gamma,
> >
> >  ...
> >
> > > : Algorithm did not converge
> > >
> > > 2: In gam.fit(G, family = G$family, control = control, gamma = gamma,
> >
> >  ...
> >
> > > : fitted probabilities numerically 0 or 1 occurred
> > >
> > > Q1: For warning1, could it be solved by changing the value of
> > > mgcv.toloptions for
> > > gam.control(mgcv.tol=1e-7)?
> > >
> > > Q1: For warning2, is there any impact for the results if the "fitted
> > > probabilities numerically 0 or 1 occurred" ?  How can i solve it?
> > >
> > >  I didn't try the possible solutions for them, because it took such a
> > > longer time to run the whole programs.
> > >  Could anybody suggest their solutions?
> > >  Any help or suggestions are greatly appreciated.
> > >   Thanks.
> >
> > --
> >
> > > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> > > +44 1225 386603  www.maths.bath.ac.uk/~sw283

-- 
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283 


Re: [R] clean programming

2007-12-18 Thread cgenolin
Gabor Grothendieck <[EMAIL PROTECTED]> wrote:

> Its a FAQ

Oops... Sorry about that.
Just to close the topic:

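library(codetools)   # provides findGlobals()

# Warn if the function called `name` uses more than `tolerance` global variables
cleanProg <- function(name,tolerance){
  if(length(findGlobals(get(name),FALSE)$variables) > tolerance){
    cat("More than",tolerance,"global variable(s) in ",name,"\a\n")
  }
}
cleanProg(fun,0)   # `fun` must hold the function's name as a character string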



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dual Core vs Quad Core

2007-12-18 Thread Prof Brian Ripley
On Tue, 18 Dec 2007, S Ellison wrote:

> Hiding in the windows faq is the observation that "R's computation is
> single-threaded, and so it cannot use more than one CPU". So multi-core
> should make no difference other than allowing R to run with less
> interruption from other tasks. That is often a significant advantage,
> though.

Yes, but that is Windows-specific.

On most other platforms you can benefit from using a multi-threaded BLAS, 
such as ATLAS, ACML or Dr Goto's.  The speedup for linear algebra can be 
substantial (although sometimes it will slow things down).  Luke Tierney 
has an experimental package to make use of parallel threads for some basic 
R computations which may appear in R 2.7.0.

It should be possible to use a multi-threaded BLAS under Windows, but I 
know no one who has done it.  There is a viable pthreads implementation 
for Windows, and I've tested Luke's experimental package using it.

Some compilers' runtimes will be able to use parallel threads for other 
tasks.  Since all the examples I am aware of are expensive commercial 
compilers, I suspect R will make limited use of them.  (In particular, 
base R does not use the Fortran 9x vector operations at which many of 
these features are targeted: we probably would if we routinely used such 
compilers.)

I've had dual-CPU desktops for more than ten years.  Given how little 
speedup you are likely to get via parallel processing (only under ideal 
conditions do the optimized BLASes run >1.5x faster using two CPUs), the 
most effective way to make use of multiple CPUs has been to run multiple 
jobs: I typically run 3-4 at once to keep the CPUs fully used.

One way to run multiple R processes to cooperate on a single task is to 
use a package such as snow to distribute the load.
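
A minimal sketch of that idea. It uses the `parallel' package, which ships 
with later versions of R and offers a snow-style interface; treating it as a 
stand-in for snow is an assumption made here, and the squared-number task is 
purely illustrative.

```r
library(parallel)

cl <- makeCluster(2)          # two worker R processes; adjust to the cores available
# Each element is handled by whichever worker is free
res <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)               # always release the workers

unlist(res)
```

This only pays off when each job is expensive compared with the cost of 
sending data to the workers.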



 Andrew Perrin <[EMAIL PROTECTED]> 18/12/2007 01:13 >>>
> On Mon, 17 Dec 2007, Kitty Lee wrote:
>
>> Dear R-users,
>>
>> I use R to run spatial stuff and it takes up a lot of ram. Runs can
> take hours or days. I am thinking of getting a new desktop. Can R take
> advantage of the dual-core system?
>>
>> I have a dual-core computer at work. But it seems that right now R is
> using only one processor.
>>
>> The new computers feature quad core with 3GB of RAM. Can R take
> advantage of the 4 chips? Or am I better off getting a dual core with
> faster processing speed per chip?
>>
>> Thanks! Any advice would be really appreciated!
>>
>> K.
>
> If I have my information right, R will use dual- or quad-cores if it's
> doing two (or four) things at once. The second core will help a little
> bit
> insofar as whatever else your machine is doing won't interfere with the
> one core on which it's running, but generally things that take a single
> thread will remain on a single core.
>
> As for RAM, if you're doing memory-bound work you should certainly be
> using a 64-bit machine and OS so you can utilize the larger memory
> space.

They only have 3GB of RAM, which 32-bit OSes can address.  The benefits 
really come with more than that.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] axis names in triangle.plot

2007-12-18 Thread Thomas Hoffmann
Thanks for this advice.

grain was not a data.frame but a matrix. Now it works:-)
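
A small sketch of the difference, using three rows of the data with made-up 
unique row names. That triangle.plot picks its labels up via names() is an 
assumption here, suggested by the request for names(grain) quoted below.

```r
grain <- matrix(c(10.03, 56.77, 18.25,
                  50.03, 20.68, 12.57,
                   2.84, 62.81, 20.79),
                ncol = 3, byrow = TRUE,
                dimnames = list(c("OAT", "WG1", "WG2"),
                                c("sand", "silt", "clay")))

class(grain)                       # a matrix
names(grain)                       # NULL: a matrix has colnames() but no names()

grain.df <- as.data.frame(grain)   # convert before plotting
names(grain.df)                    # "sand" "silt" "clay"

# library(ade4); triangle.plot(grain.df)   # assumes ade4 is installed
```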

Cheers
Thomas


Stéphane Dray wrote:
> Hi Thomas,
> This looks quite strange. By default, the function use the column 
> names as labels.
> What is grain ? A data.frame ? Why have you duplicated row.names? 
> Please return the results of class(grain) and names(grain)
>
> Cheers,
> PS: If you have questions related to ade4, you can use the adelist 
> (http://listes.univ-lyon1.fr/wws/info/adelist)
>
>
> Thomas Hoffmann wrote:
>> Hi folks,
>>
>> I am using a data.frame with sediment grain sizes:
>>
>>  > grain
>>  sand  silt  clay
>> OAT 10.03 56.77 18.25
>> OAT 10.40 57.40 17.94
>> WG1 50.03 20.68 12.57
>> WG1 43.20 25.69 13.41
>> WG1 33.89 31.10 14.48
>> WG2  2.84 62.81 20.79
>> WG2  2.79 60.46 19.16
>> WG2 16.27 33.04  6.48
>> WG2  1.39 57.90  9.13
>> WG3  4.54 52.91 17.20
>> WG3  5.20 50.55 15.65
>> WG3  7.71 49.13 10.80
>> WG3  4.43 50.03 11.83
>> WG3  1.72 57.53 14.20
>> WG3  1.51 58.99 13.96
>>
>> I would like to do a triangle.plot with the axes labeled "sand", 
>> "silt" and "clay". However, using the command:
>>
>> triangle.plot(grain)
>>
>> (from the ade4 package) does not plot the axis names, and there is no 
>> parameter to set the labels like "xlab" in the plot command. Does 
>> anybody have good advice?
>>
>> Thanks in advance
>> Thomas
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>   
>
>



Re: [R] ggplot-How to define fill colours?

2007-12-18 Thread Pedro de Barros
Thanks Thierry,
Right on target.

Cheers,
Pedro
At 10:35 2007/12/18, you wrote:
>Pedro,
>
>I've had a similar problem. See this post for the solution:
>http://thread.gmane.org/gmane.comp.lang.r.general/100649/focus=100673
>
>Cheers,
>
>Thierry
>
>
>
>
>ir. Thierry Onkelinx
>Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>and Forest
>Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
>methodology and quality assurance
>Gaverstraat 4
>9500 Geraardsbergen
>Belgium
>tel. + 32 54/436 185
>[EMAIL PROTECTED]
>www.inbo.be
>
>Do not put your faith in what statistics say until you have carefully
>considered what they do not say.  ~William W. Watt
>A statistical analysis, properly conducted, is a delicate dissection of
>uncertainties, a surgery of suppositions. ~M.J.Moroney
>
>-Original Message-
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>On Behalf Of Pedro de Barros
>Sent: Monday 17 December 2007 23:55
>To: [EMAIL PROTECTED]
>Subject: [R] ggplot-How to define fill colours?
>Importance: High
>
>Dear R's (most likely Hadley),
>
>I want to build a stacked bar plot where I would like to define which
>colours will be used for each of the groups. However, I do not seem
>to find a way to do this, even if I've been looking over many places.
>
>I have tried several variations, and my final try was this code, but
>I still do not manage to get the colours as I pre-define. Any hints
>about how to get this?
>
>Thanks in advance,
>Pedro
>
>my code:
>  >plotdata1<-data.frame(x=rep(factor(1:4),4), y=rep(0.1*(1:4),4),
>+group=as.character(rep(c('white', 'red', 'blue', 'green'),rep(4,4))))
>  >plot0<-ggplot()
>  >plot3<-plot0+layer(data=plotdata1, mapping=aes_string(x='x',y='y',
>+fill='group'),geom='bar', stat='identity', position='stack')
>  >print(plot3)
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
Jari Oksanen wrote:
> Wayne Aldo Gavioli  fas.harvard.edu> writes:
>
>   
>> Hello all,
>>
>> I'm trying to graph a scatterplot of a large (5,000 x,y coordinates) of data
>> with the caveat that many of the data points overlap with each other (share 
>> the
>> same x AND y coordinates).  In using the usual "plot" command,
>>
>> 
>>> plot(education, xlab="etc", ylab="etc")
>>>   
>> it seems that the overlap of points is not shown in the graph.  Namely, there
>> are 5,000 points that should be plotted, as I mentioned above, but because so
>> many of the points overlap with each other exactly, only about 50-60 points 
>> are
>> actually plotted on the graph.  Thus, there's no indication that Point A 
>> shares
>> its coordinates with 200 other pieces of data and thus is very common while
>> Point B doesn't share its coordinates with any other pieces of data and thus
>> isn't common at all.  Is there anyway to indicate the frequency of such 
>> points
>> on such a graph?  Should I be using a different command than "plot"?
>>
>>
>> 
> One suggestion seems to be still missing: 'sunflowerplot' of base R. May look
> taggy, though, if you have 200 "petals". 
>
> Actually the documentation of sunflowerplot is wrong in botanical sense.
> Sunflowers have composite flowers in capitula, and the things called 'petals' 
> in
> documentation are ligulate, sterile ray-florets (each with vestigial petals
> which are not easily visible in sunflower, but in some other species you may 
> see
> three (occasionally two) teeth). 
>   
Could you please put together a patch that replaces "petals" with 
"ligulate, sterile ray-florets" in appropriate places?

;-)

Duncan Murdoch
> cheers, jari oksanen
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Odp: accessing dimension names

2007-12-18 Thread Petr PIKAL
Hi

[EMAIL PROTECTED] wrote on 18.12.2007 12:01:41:

> I have a matrix y:
> 
> > dimnames(y)
> $x93
> [1] "1" "2"
> 
> $x94
> [1] "0" "1" "2"
> .. so on  (there are other dimensions as well)
> 
> 
> 
> I need to access a particular dimension, but a random mechanism tells me
> which dimension it would. So, sometimes I might need to access
> dimnames(y)$x93, some other time it would be dimnames(y)$x94.. and so 
on.
> Now let that random dimension be idx, then 
dimnames(y)$paste('x',idx,sep='')
> doesn't work.

Why not

dimnames(y)[idx]

Regards
Petr


> 
> Can anyone help?
> 
> Thanks!
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Antony Unwin
Wayne,

Try the iplot command in iPlots.  You can then vary both the  
pointsize and the transparency of your scatterplot interactively and  
decide which scatterplot conveys the information best.  Sometimes  
it's helpful to use more than one scatterplot when presenting your  
results.

(I must admit to being very surprised that jittering and sunflower  
plots have been suggested for a dataset of 5000 points.  Do those who  
mentioned these methods have examples on that scale where they are  
effective?)

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg,
Germany
[[alternative HTML version deleted]]



Re: [R] hazard ratio of interaction Cox model

2007-12-18 Thread Frank E Harrell Jr
Giulia Barbati wrote:
> Dear Forum,
> I have a question about interaction estimate in the Cox model: 
> why the hazard ratio of the interaction is not produced in the summary of the 
> model?
> (Instead, the estimate of the coefficient is given in the print of the 
> model.) 

The 'hazard ratio of the interaction' is not well defined.  Decide what 
hazard ratio you want to estimate, then ask summary to compute that, e.g.

summary(modINT, XX_PR=c(1,3), XX_DISF=2)

will estimate the 3:1 XX_PR hazard ratio at XX_DISF=2
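
As a rough check on what that call estimates, the same quantity can be worked 
out by hand from the coefficients printed in the quoted output. This is only a 
sketch: it assumes XX_PR and XX_DISF enter the model linearly, as the single 
coefficient per variable suggests, and the variable names below are introduced 
here for illustration.

```r
b_pr  <-  2.189   # coefficient of XX_PR from print(modINT)
b_int <- -2.852   # coefficient of XX_PR * XX_DISF
disf  <-  2       # value of XX_DISF at which the ratio is wanted

# Log hazard ratio for XX_PR = 3 versus XX_PR = 1 at XX_DISF = 2
log_hr <- (3 - 1) * (b_pr + b_int * disf)   # 2 * (2.189 - 5.704) = -7.03
hr <- exp(log_hr)
hr
```

The effect of XX_PR therefore depends on the chosen XX_DISF, which is why 
there is no single 'hazard ratio of the interaction'.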

Frank Harrell

> 
> 
> # Example:
> 
> modINT <-cph( Surv(T_BASE, T_FIN,STATUS)~ NYHA + ASINI + RFP + FE_REC + 
> XX_PR*XX_DISF)
> 
> print(modINT)
> 
> 
>                    coef se(coef)     z        p
> NYHA=2            1.254    0.584  2.15 0.031767
> ASINI             0.665    0.409  1.62 0.104247
> RFP=2             0.725    0.704  1.03 0.302578
> FE_REC=2         -1.637    0.810 -2.02 0.043331
> XX_PR             2.189    0.649  3.37 0.000748
> XX_DISF           3.233    1.000  3.23 0.001222
> XX_PR * XX_DISF  -2.852    1.280 -2.23 0.025852
> 
> summary(modINT)
> 
>  Effects  Response : Surv(T_BASE, T_FIN, STATUS) 
> 
>  Factor         Low    High Diff.  Effect S.E. Lower 0.95 Upper 0.95
>  ASINI          2.0725 2.85 0.7775  0.52  0.32      -0.11       1.14
>   Hazard Ratio  2.0725 2.85 0.7775  1.68    NA       0.90       3.13
>  XX_PR          0.0000 1.00 1.0000  2.19  0.65       0.92       3.46
>   Hazard Ratio  0.0000 1.00 1.0000  8.92    NA       2.50      31.86
>  XX_DISF        0.0000 1.00 1.0000  3.23  1.00       1.27       5.19
>   Hazard Ratio  0.0000 1.00 1.0000 25.35    NA       3.57     179.88
>  NYHA - 2:1     1.0000 2.00     NA  1.25  0.58       0.11       2.40
>   Hazard Ratio  1.0000 2.00     NA  3.50    NA       1.12      11.00
>  RFP - 2:1      1.0000 2.00     NA  0.73  0.70      -0.65       2.10
>   Hazard Ratio  1.0000 2.00     NA  2.07    NA       0.52       8.20
>  FE_REC - 2:1   1.0000 2.00     NA -1.64  0.81      -3.23      -0.05
>   Hazard Ratio  1.0000 2.00     NA  0.19    NA       0.04       0.95
> 
> Adjusted to: XX_PR=0 XX_DISF=0  
> 
> 
> 
> 
>   
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread S Ellison


>> Antony Unwin <[EMAIL PROTECTED]> >>
>I must admit to being very surprised that jittering and sunflower  
>plots have been suggested for a dataset of 5000 points.  Do those who 

>mentioned these methods have examples on that scale where they are  
>effective?)

You have a point. haha. 
But check the microarray literature; scatterplots have been used -
often - to display microarray data with 1 observations at a time.
And in their defence, even on screen, a 600x600 pixel plot window holds
360000 pixels - 5000 is not a large fraction of that. Jittering has
visible effects on data at that resolution. Compare the two plots in 

library(MASS)
Sigma <- matrix(c(10,4,4,2),2,2)
xy<- round(mvrnorm(n=5000, rep(0, 2), Sigma), 1)
plot(xy,pch=".")
plot(jitter(xy, factor=2),pch=".")

But you're of course right to question how sensible this is. The best
you can get is a visual impression of the 'shape' of the data with a
greater perceived density at multiple observations which otherwise
overlapped. 

S.



Re: [R] Capture warning messages from coxph()

2007-12-18 Thread Terry Therneau

Xinyi Li asked how to keep track of which coxph models, called from within a
loop, were responsible for warning messages.  One solution is to modify the
coxph code so that those models are marked in the return coxph object.  Below
is a set of changes to the final 40 lines of coxph.fit, that will cause the
component "infinite.warn" to be added to the result whenever a warning was
generated; it will be a vector of T/F showing which component(s) of the
coefficient vector generated the warning.
  Change the code for coxph.fit.s, then do
> source('coxph.fit.revised.s')#or whatever you called it
> coxph <- source('coxph.s')
> coxph.wtest <- survival:::coxph.wtest

Line 2 causes you to have a local version of coxph, otherwise, due
to name spaces, the original version of coxph.fit will still be used.  Line 3
is another consequence of name spaces.  Then code such as

for(i in 1:ncol(x)){
    fit <- coxph(TIME~x[,i])
    if (!is.null(fit$infinite.warn)) cat("Warning in fit", i, "\n")
    sfit <- summary(fit)
    results[i,1] <- sfit$coef[1]
    results[i,2] <- sfit$coef[3]
    results[i,3] <- sfit$coef[5]
}
will report the models with a warning.  

Notes
  1. The warning message is not completely reliable

  2. Name spaces protect a package from accidental overrides, when a user or 
some other package reuses a function name.  With its hundreds of packages, they
are a necessity for R.  But sometimes you really do want to override a
function; then they are a bit of a pain.  Others with a better grasp of R
internals may be able to suggest a better override than I have done here.
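
A sketch of one alternative that avoids editing coxph.fit: intercept the 
warnings at the call site with base R's condition handlers. The helper name 
`with_warnings' is hypothetical, and log() stands in for the model fit so that 
the example runs without the survival package.

```r
# Hypothetical helper: evaluate expr, returning its value and any warnings raised
with_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))   # keep the message
      invokeRestart("muffleWarning")          # and resume the computation
    })
  list(value = value, warnings = msgs)
}

# Stand-in for the model fits: log() warns on negative input
inputs <- c(1, -1, 10)
for (i in seq_along(inputs)) {
  res <- with_warnings(log(inputs[i]))
  if (length(res$warnings) > 0)
    cat("Warning in fit", i, ":", res$warnings, "\n")
}
# In the coxph loop one would write: res <- with_warnings(coxph(TIME ~ x[, i]))
```

This identifies which fit produced a warning, though not which coefficient 
triggered it.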

Terry Therneau

-


    infs <- abs(coxfit$u %*% var)
    keep.infs <- FALSE                          # new line
    if (maxiter > 1) {
        if (coxfit$flag == 1000)
            warning("Ran out of iterations and did not converge")
        else {
            infs <- ((infs > control$eps) &
                     infs > control$toler.inf*abs(coef))
            if (any(infs)) {
                warning(paste("Loglik converged before variable ",
                              paste((1:nvar)[infs],collapse=","),
                              "; beta may be infinite. "))
                keep.infs <- TRUE               # new line
            }
        }
    }

names(coef) <- dimnames(x)[[2]]
lp <- c(x %*% coef) + offset - sum(coef*coxfit$means)
score <- exp(lp[sorted])
coxres <- .C("coxmart", as.integer(n),
as.integer(method=='efron'),
stime,
sstat,
newstrat,
as.double(score),
as.double(weights),
resid=double(n))
resid <- double(n)
resid[sorted] <- coxres$resid
names(resid) <- rownames
coef[which.sing] <- NA

temp <- list(coefficients  = coef,  #modified line
var= var,
loglik = coxfit$loglik,
score  = coxfit$sctest,
iter   = coxfit$iter,
linear.predictors = as.vector(lp),
residuals = resid,
means = coxfit$means,
method='coxph')
if (keep.infs) temp$infinite.warn <- infs   #new line
temp#new line
}
}



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 18/12/2007 7:31 AM, Antony Unwin wrote:
> Wayne,
> 
> Try the iplot command in iPlots.  You can then vary both the  
> pointsize and the transparency of your scatterplot interactively and  
> decide which scatterplot conveys the information best.  Sometimes  
> it's helpful to use more than one scatterplot when presenting your  
> results.
> 
> (I must admit to being very surprised that jittering and sunflower  
> plots have been suggested for a dataset of 5000 points.  Do those who  
> mentioned these methods have examples on that scale where they are  
> effective?)

Sure.  The original post said there were about 50-60 unique locations. 
This plot:

x <- rbinom(5000, 20, 0.15)
y <- rbinom(5000, 20, 0.15)
plot(x,y)

has a few more unique locations; tune those probabilities if you want it 
closer.  Due to the overlap, the distribution is very unclear.  But this 
plot

plot(jitter(x), jitter(y))

makes the distribution quite clear.

I wouldn't use the default pch if I had 5 points, but with pch=".", 
it's not so bad even in that case.
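
Another option, sketched here on the same simulated data: count the 
observations at each unique location and let the symbol size reflect the 
count. The scaling factor is an arbitrary choice made for illustration.

```r
set.seed(1)
x <- rbinom(5000, 20, 0.15)
y <- rbinom(5000, 20, 0.15)

# One row per (x, y) location together with its frequency
counts <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x <- as.numeric(counts$x)
counts$y <- as.numeric(counts$y)

# Symbol area roughly proportional to the number of overlapping points
plot(counts$x, counts$y, cex = sqrt(counts$Freq) / 5, pch = 16,
     xlab = "x", ylab = "y")
```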

Duncan Murdoch



[R] GLM and factor in formula

2007-12-18 Thread Knut Krueger
I fitted a GLM with a factor variable and noticed that the first factor 
level is missing from the results; that is, there is no result for Y1.
Is it wrong to use factors in a GLM, or is there a statistical reason why 
there is no Y1 result?

X<-rnorm(31:40)
Y<-factor(c(1:10))
glm(X~Y)

Knut



Re: [R] accessing dimension names

2007-12-18 Thread Petr PIKAL
It is hard to help, as I do not have "y", and it is definitely not a matrix 
as you described it. 

1.  Look at the structure of your y object with str(y).
2.  Learn how to extract parts of objects, e.g. by reading 
?"["
3.  Apply what you learned to your y object.
4.  If you still do not get what you want, make an example 
which can be reproduced and ask again.

> mat<-matrix(rnorm(12),3,4)
> dmat<-data.frame(mat)
> dimnames(dmat)
[[1]]
[1] "1" "2" "3"

[[2]]
[1] "X1" "X2" "X3" "X4"

> dimnames(dmat)[1]
[[1]]
[1] "1" "2" "3"

> dimnames(dmat)[1][1]
[[1]]
[1] "1" "2" "3"

> dimnames(dmat)[[1]][1]
[1] "1"
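
The `$' operator takes its right-hand side literally, so a name built with 
paste() has to go through `[[' instead. A minimal sketch, using a small table 
as a stand-in for the poster's y (its real structure is not known):

```r
# Hypothetical stand-in for y, with dimensions named x93 and x94
y <- table(x93 = c(1, 2, 1, 2), x94 = c(0, 1, 2, 0))

idx <- 94
nm <- paste("x", idx, sep = "")     # "x94"

dimnames(y)[[nm]]                   # level names of that dimension
as.numeric(dimnames(y)[[nm]][1])    # first level as a number
```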

Regards

Petr
[EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 18.12.2007 14:25:06:

> Thanks. Actually, I need something else as well.
> 
> I need to get as.numeric(dimnames(y)$x93[1]), which in this case is 1. I 
tried
> as.numeric(dimnames(y)$paste('x',idx,sep='')[1]), and it did not work.
> 
> Please help.
> 
> 
> 

> On Dec 18, 2007 6:26 AM, Petr PIKAL <[EMAIL PROTECTED]> wrote:
> Hi
> 
> [EMAIL PROTECTED] wrote on 18.12.2007 12:01:41:
> 
> > I have a matrix y:
> >
> > > dimnames(y)
> > $x93
> > [1] "1" "2"
> >
> > $x94
> > [1] "0" "1" "2"
> > .. so on  (there are other dimensions as well)
> >
> >
> >
> > I need to access a particular dimension, but a random mechanism tells 
me
> > which dimension it would. So, sometimes I might need to access
> > dimnames(y)$x93, some other time it would be dimnames(y)$x94.. and so 
> on.
> > Now let that random dimension be idx, then
> dimnames(y)$paste('x',idx,sep='')
> > doesn't work.

> Why not
> 
> dimnames(y)[idx]
> 
> Regards
> Petr
> 
> 
> >
> > Can anyone help?
> >
> > Thanks!
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



[R] Reshape Dataframe

2007-12-18 Thread Bert Jacobs

Hi,

I'm having a bit of problems in creating a new dataframe. 
Below you'll find a description of the current dataframe and of the
dataframe that needs to be created.
Can someone help me out on this one?
Thx in advance.
Bert

Current Dataframe

Var1Var2Var3Var4
A   Fa  W1  1
A   Si  W1  2
A   Fa  W2  3
A   Si  W3  4
B   Si  W1  5
C   La  W2  6
C   Do  W4  7

New Dataframe

Var1Var2W1  W2  W3  W4
A   Fa  1   3   
A   Si  2   4
A   La
A   Do  
B   Fa  
B   Si  5   
B   La
B   Do
C   Fa  
C   Si  
C   La  6
C   Do  7



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread hadley wickham
On 12/17/07, Jim Porzak <[EMAIL PROTECTED]> wrote:
> Wayne,
>
> I am fond of the bagplot (think 2D box plot) to replace scatter plots
> for large N. See
> http://www.wiwi.uni-bielefeld.de/~wolf/software/aplpack/ and aplpack
> in CRAN.

The big drawback of the bagplot, like the boxplot, is that it's
difficult to see multimodality.

Hadley

-- 
http://had.co.nz/



Re: [R] Reshape Dataframe

2007-12-18 Thread hadley wickham
On 12/18/07, Bert Jacobs <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I'm having a bit of problems in creating a new dataframe.
> Below you'll find a description of the current dataframe and of the
> dataframe that needs to be created.
> Can someone help me out on this one?

library(reshape)
dfm <- melt(df, id = 1:3)
cast(dfm, ... ~ Var3)

You can find out more about the reshape package at http://had.co.nz/reshape

Hadley

-- 
http://had.co.nz/



[R] ggplot2 - getting at the grobs

2007-12-18 Thread Pedro de Barros
Dear All,

I continue trying to get several of my plotting functions to use 
ggplot, because I really do like the concept of the graphical 
objects, and working with them in the abstract.
I am now trying to access the grobs to manipulate using grid. 
However, until now all I managed was to get the plot as a gTree 
object, and manipulate it as a gTree from there. The problem is that 
then it is no longer a ggplot, and thus I can no longer use the 
ggplot functions.
How to get at the grobs, without converting the ggplot into a gTree?

Thanks,
Pedro



Re: [R] integration

2007-12-18 Thread Ravi Varadhan
Hi Davide,

It is difficult to say what the problem is without knowing more about the 
nature of the integrand.  So, you should do a couple of preliminary things 
before attempting to compute the integral.

First, is the integral finite? You should establish this.  Second, plot the 
integrand over the entire interval.  Then you need to think about the 
following: Is the integrand unimodal, with the mass concentrated over a small 
region? Or is it multimodal? Does it have thick tails?

Assuming that the integral is finite, you could try a few things:
1.  Divide the interval of integration into several small intervals (say, 10 or 
100), and then use integrate() on each and then add up the results.  You can 
make this process more efficient if you know where the mass is concentrated.
2.  Transform the integrand.
3.  Try a simple trapezoidal rule quadrature.
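
Options 1 and 3 can be sketched as follows. The integrand f is a made-up, 
sharply peaked function on [0, 1], chosen only because its integral is known 
to be very close to sqrt(pi/1000); the number of pieces and grid points are 
arbitrary.

```r
f <- function(x) exp(-1000 * (x - 0.3)^2)   # hypothetical peaked integrand

# 1. Split [0, 1] into 100 pieces, integrate each, and add up the results
breaks <- seq(0, 1, length.out = 101)
pieces <- mapply(function(a, b) integrate(f, a, b)$value,
                 breaks[-length(breaks)], breaks[-1])
total <- sum(pieces)

# 3. Simple trapezoidal rule on a fine grid
xs <- seq(0, 1, length.out = 10001)
ys <- f(xs)
trap <- sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)

c(piecewise = total, trapezoid = trap, reference = sqrt(pi / 1000))
```

If the two numerical answers disagree noticeably, that is a sign the grid is 
too coarse or the integrand needs transforming first.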

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html






-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Monday, December 17, 2007 11:03 AM
To: r-help@r-project.org
Subject: [R] integration

Dear All,
I need to perform a numerical integration of one-dimensional 
functions. The limits of integration are both finite and the functions 
I'm working on are quite complicated. I have already tried both area() 
and integrate(), but they do not perform well: area() is very slow and 
integrate() does not converge. Are there other functions in R for numerical 
integration of one-dimensional functions?

Thanks in advance 

Davide










Re: [R] Reshape Dataframe

2007-12-18 Thread Gabor Grothendieck
On Dec 18, 2007 9:07 AM, Bert Jacobs <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I'm having a bit of problems in creating a new dataframe.
> Below you'll find a description of the current dataframe and of the
> dataframe that needs to be created.
> Can someone help me out on this one?
> Thx in advance.
> Bert
>
> Current Dataframe
>
> Var1  Var2  Var3  Var4
> A     Fa    W1    1
> A     Si    W1    2
> A     Fa    W2    3
> A     Si    W3    4
> B     Si    W1    5
> C     La    W2    6
> C     Do    W4    7
>
> New Dataframe
>
> Var1  Var2  W1  W2  W3  W4
> A     Fa    1   3
> A     Si    2       4
> A     La
> A     Do
> B     Fa
> B     Si    5
> B     La
> B     Do
> C     Fa
> C     Si
> C     La        6
> C     Do                7

Try this:

out <- ftable(xtabs(Var4 ~ Var1 + Var2 + Var3, DF))
out[out == 0] <- NA

Omit the last line if 0 fill is what you wanted.

This will do it except that it will eliminate all rows
without data:

out2 <- reshape(DF, dir = "wide", timevar = "Var3", idvar = c("Var1", "Var2"))
out2[is.na(out2)] <- 0

Omit the last line if NA fill is what you wanted.

The reshape package melt/cast routines (see Hadley's solution in this
thread) can be used to give a similar result to the reshape command
above (i.e. rows with no data are not included), except that cast is
a bit more flexible since it has a fill= argument.
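
To make the difference between the two base R approaches concrete, here is a small sketch. The data frame DF is typed in by hand from Bert's post, so treat it as an assumption about his real data; the intermediate name tab is only for illustration:

```r
# Bert's example data, copied by hand from the post
DF <- data.frame(
  Var1 = c("A", "A", "A", "A", "B", "C", "C"),
  Var2 = c("Fa", "Si", "Fa", "Si", "Si", "La", "Do"),
  Var3 = c("W1", "W1", "W2", "W3", "W1", "W2", "W4"),
  Var4 = 1:7
)

# Cross-tabulation: one row per Var1 x Var2 combination, empty cells are 0
tab <- xtabs(Var4 ~ Var1 + Var2 + Var3, DF)
out <- ftable(tab)

# reshape(): only the Var1/Var2 pairs present in DF, empty cells are NA
out2 <- reshape(DF, direction = "wide", timevar = "Var3",
                idvar = c("Var1", "Var2"))

dim(out)    # rows x columns of the flat table
dim(out2)   # rows x columns of the reshaped data frame
```

The flat table has a row for every combination of the Var1 and Var2 levels, while the reshape() result only has rows for the pairs that occur in DF.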



Re: [R] Analyzing Publications from Pubmed via XML

2007-12-18 Thread Armin Goralczyk
On 12/18/07, David Winsemius <[EMAIL PROTECTED]> wrote:
> David Winsemius <[EMAIL PROTECTED]> wrote in
> news:[EMAIL PROTECTED]:
>
> > "Armin Goralczyk" <[EMAIL PROTECTED]> wrote in
> > news:[EMAIL PROTECTED]:
>
> >> I tried the above function with simple search terms and it worked
> >> fine for me (also more output thanks to Martin's post) but when I use
> >> search terms attributed to certain fields, i.e. with [au] or [ta], I
> >> get the following error message:
> >>> pm.srch()
> >> 1: "laryngeal neoplasms[mh]"
> >> 2:
>
> > I am wondering if you used spaces, rather than "+"'s? If so then you
> > may want your function to do more gsub-processing of the input string.
>
> I tried my theory that one would need "+"'s instead of spaces, but
> disproved it. Spaces in the input string seem to produce acceptable
> results on my WinXP/R.2.6.1/RGui system even with more complex search
> strings.
>
> --
>

It's not the spaces; the problem is the tag (sorry that I didn't
specify this), or maybe the [] characters. I am working on Mac OS X 10.4
with R version 2.6. Could it be a string conversion problem? In the
following error output the two URL strings differ:

Error in .Call("RS_XML_ParseTree", as.character(file), handlers,
as.logical(ignoreBlanks),  :
 error in creating parser for
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=laryngeal neoplasms[mh]
I/O warning : failed to load external entity
"http%3A//eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi%3Fdb=pubmed&term=laryngeal%20neoplasms%5Bmh%5D"
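
One thing visible in the second string is that the scheme and the "?" were percent-encoded along with the search term ("http%3A//", "%3F"). Whether that is what breaks the parser on this setup is only a guess, but a sketch that encodes just the search term and leaves the rest of the URL alone would look like this (the variable names base, term, query and url are illustrative):

```r
base <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
term <- "laryngeal neoplasms[mh]"

# Encode only the search term; reserved = TRUE also escapes "[" and "]"
query <- paste("db=pubmed&term=", URLencode(term, reserved = TRUE), sep = "")

# The scheme, host and "?" stay unencoded
url <- paste(base, "?", query, sep = "")
url
```

The resulting string could then be passed to the parsing function in place of the raw search string.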

-- 
Armin Goralczyk, M.D.
--
Universitätsmedizin Göttingen
Abteilung Allgemein- und Viszeralchirurgie
Rudolf-Koch-Str. 40
39099 Göttingen
--
Dept. of General Surgery
University of Göttingen
Göttingen, Germany
--
http://www.chirurgie-goettingen.de


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread James W. MacDonald
Duncan Murdoch wrote:
> On 18/12/2007 7:31 AM, Antony Unwin wrote:
>> Wayne,
>>
>> Try the iplot command in iPlots.  You can then vary both the  
>> pointsize and the transparency of your scatterplot interactively and  
>> decide which scatterplot conveys the information best.  Sometimes  
>> it's helpful to use more than one scatterplot when presenting your  
>> results.
>>
>> (I must admit to being very surprised that jittering and sunflower  
>> plots have been suggested for a dataset of 5000 points.  Do those who  
>> mentioned these methods have examples on that scale where they are  
>> effective?)
> 
> Sure.  The original post said there were about 50-60 unique locations. 
> This plot:
> 
> x <- rbinom(5000, 20, 0.15)
> y <- rbinom(5000, 20, 0.15)
> plot(x,y)
> 
> has a few more unique locations; tune those probabilities if you want it 
> closer.  Due to the overlap, the distribution is very unclear.  But this 
> plot
> 
> plot(jitter(x), jitter(y))

Another alternative is smoothscatter() in the geneplotter package from 
Bioconductor, which does a pretty reasonable job with these example data.

Best,

Jim


> 
> makes the distribution quite clear.
> 
> I wouldn't use the default pch if I had 5 points, but with pch=".", 
> it's not so bad even in that case.
> 
> Duncan Murdoch
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



Re: [R] Reshape Dataframe

2007-12-18 Thread Gabor Grothendieck
On Dec 18, 2007 9:54 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>
> On Dec 18, 2007 9:07 AM, Bert Jacobs <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> >
> > I'm having a bit of problems in creating a new dataframe.
> > Below you'll find a description of the current dataframe and of the
> > dataframe that needs to be created.
> > Can someone help me out on this one?
> > Thx in advance.
> > Bert
> >
> > Current Dataframe
> >
> > Var1  Var2  Var3  Var4
> > A     Fa    W1    1
> > A     Si    W1    2
> > A     Fa    W2    3
> > A     Si    W3    4
> > B     Si    W1    5
> > C     La    W2    6
> > C     Do    W4    7
> >
> > New Dataframe
> >
> > Var1  Var2  W1  W2  W3  W4
> > A     Fa    1   3
> > A     Si    2       4
> > A     La
> > A     Do
> > B     Fa
> > B     Si    5
> > B     La
> > B     Do
> > C     Fa
> > C     Si
> > C     La        6
> > C     Do                7
>
> Try this:
>
> out <- ftable(xtabs(Var4 ~ Var1 + Var2 + Var3, DF))
> out[out == 0] <- NA
>
> Omit the last line if 0 fill is what you wanted.
>
> This will do it except that it will eliminate all rows
> without data:
>
> out2 <- reshape(DF, dir = "wide", timevar = "Var3", idvar = c("Var1", "Var2"))
> out2[is.na(out2)] <- 0
>
> Omit the last line if NA fill is what you wanted.
>
> The reshape package melt/cast routines (see Hadley's solution in this
> thread) can be used to give a similar result to the reshape command
> above (i.e. rows with no data are not included), except that cast is
> a bit more flexible since it has a fill= argument.

Just one correction.  The cast function in reshape has an add.missing=
argument that can control this, so actually any of the solutions could be
obtained with cast, using the fill= and add.missing= arguments to control
which one you want.



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Antony Unwin

On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:

>> (I must admit to being very surprised that jittering and  
>> sunflower  plots have been suggested for a dataset of 5000  
>> points.  Do those who  mentioned these methods have examples on  
>> that scale where they are  effective?)
>
> Sure.  The original post said there were about 50-60 unique  
> locations. This plot:
>
> x <- rbinom(5000, 20, 0.15)
> y <- rbinom(5000, 20, 0.15)
> plot(x,y)
>
> has a few more unique locations; tune those probabilities if you  
> want it closer.  Due to the overlap, the distribution is very  
> unclear.  But this plot
>
> plot(jitter(x), jitter(y))
>
> makes the distribution quite clear.

No it doesn't!  It makes it moderately clearer than the plot without  
jittering.  One good alternative here is the fluctuation diagram  
variant of a mosaic plot:

xx<-as.factor(x)
yy<-as.factor(y)
imosaic(xx,yy, type="f")

Using jittering for categorical data is really not to be recommended  
and will certainly degrade in performance as the dataset gets bigger.


Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg,
Germany


Re: [R] Reshape Dataframe

2007-12-18 Thread Bert Jacobs

Thx Hadley,
It works, but I need some finetuning.

If I use the following expression:
Newdf <-reshape(df, timevar="Var3", idvar=c("Var1","Var2"),direction="wide")

Newdf
Var1  Var2  Var3.W1  Var3.W2  Var3.W3  Var3.W4
A     Fa    1        3
A     Si    2                 4
B     Si    5
C     La             6
C     Do                               7

Is there an option so that for each Var1 all possible values of Var2
are listed (i.e. blank lines are created for combinations without data)?
Is it possible to name the columns with the values of the original Var3
variable, so that the name Var3.W1 changes to W1?

Var1  Var2  W1  W2  W3  W4
A     Fa    1   3
A     Si    2       4
A     La
A     Do
B     Fa
B     Si    5
B     La
B     Do
C     Fa
C     Si
C     La        6
C     Do                7
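
One possible base R sketch of both requests, assuming df holds exactly the seven rows shown earlier in the thread (typed in by hand here). Note that with this data reshape() prefixes the new columns with the name of the value column, Var4, so that is the prefix stripped below; the names wide, grid and full are only for illustration:

```r
df <- data.frame(
  Var1 = c("A", "A", "A", "A", "B", "C", "C"),
  Var2 = c("Fa", "Si", "Fa", "Si", "Si", "La", "Do"),
  Var3 = c("W1", "W1", "W2", "W3", "W1", "W2", "W4"),
  Var4 = 1:7,
  stringsAsFactors = FALSE
)

wide <- reshape(df, timevar = "Var3", idvar = c("Var1", "Var2"),
                direction = "wide")

# Drop the "Var4." prefix so the columns are named W1 ... W4
names(wide) <- sub("^Var4\\.", "", names(wide))

# All Var1 x Var2 combinations, including those with no data
grid <- expand.grid(Var1 = unique(df$Var1), Var2 = unique(df$Var2),
                    stringsAsFactors = FALSE)

# Left join keeps every combination; cells without data come back as NA
full <- merge(grid, wide, by = c("Var1", "Var2"), all.x = TRUE)
full <- full[order(full$Var1, full$Var2), ]
full
```

The rows come out sorted alphabetically by Var1 and Var2 rather than in the order of the table above; reordering would need an explicit factor level order.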


Thx,
Bert

-Original Message-
From: hadley wickham [mailto:[EMAIL PROTECTED] 
Sent: 18 December 2007 15:16
To: Bert Jacobs
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Reshape Dataframe

On 12/18/07, Bert Jacobs <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I'm having a bit of problems in creating a new dataframe.
> Below you'll find a description of the current dataframe and of the
> dataframe that needs to be created.
> Can someone help me out on this one?

library(reshape)
dfm <- melt(df, id = 1:3)
cast(dfm, ... ~ Var3)

You can find out more about the reshape package at http://had.co.nz/reshape

Hadley

-- 
http://had.co.nz/



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 18/12/2007 10:01 AM, Antony Unwin wrote:
> On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:
> 
>>> (I must admit to being very surprised that jittering and  
>>> sunflower  plots have been suggested for a dataset of 5000  
>>> points.  Do those who  mentioned these methods have examples on  
>>> that scale where they are  effective?)
>> Sure.  The original post said there were about 50-60 unique  
>> locations. This plot:
>>
>> x <- rbinom(5000, 20, 0.15)
>> y <- rbinom(5000, 20, 0.15)
>> plot(x,y)
>>
>> has a few more unique locations; tune those probabilities if you  
>> want it closer.  Due to the overlap, the distribution is very  
>> unclear.  But this plot
>>
>> plot(jitter(x), jitter(y))
>>
>> makes the distribution quite clear.
> 
> No it doesn't!  It makes it moderately clearer than the plot without  
> jittering.  One good alternative here is the fluctuation diagram  
> variant of a mosaic plot:
> 
> xx<-as.factor(x)
> yy<-as.factor(y)
> imosaic(xx,yy, type="f")

That plot is better than jittering, but there's the problem in the 
mosaic plot of understanding the scale of the rectangles:  is it area or 
diameter that encodes the count?  With a jittered plot, you lose 
resolution when the number of points gets too high because you just see 
a mess of ink, but at least you only require the viewer to count in 
order to get a close numerical reading from the plot.

I could also claim that while imperfect, at least jittering is widely 
applicable.  For example, if the data were not on a regular grid, 
perhaps because they had been generated like this:

xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
x <- xloc[index]
y <- yloc[index]

then jittering still works as well (or as poorly), but the imosaic would 
not work at all.  There are better plots than jittering available, but 
jittering is easy.

(Actually, with this dataset, plot(jitter(x), jitter(y)) is really poor, 
because jitter() chooses a bad amount of jittering.  But with manual 
tuning (e.g.  plot(jitter(x, a=0.1), jitter(y, a=0.1), pch=".")) it's 
not too bad.  So I'd say jittering worked, but the R implementation of 
it may need improvement).
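
For anyone who wants to try the manual tuning, a small sketch follows. The seed and the names xj and yj are my additions. As far as I understand the documentation, the default amount is derived from the smallest gap between distinct values, which can be tiny for irregular locations like these, while an explicit amount adds uniform noise in [-amount, amount]:

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable

xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
x <- xloc[index]
y <- yloc[index]

# amount = 0.1 shifts each coordinate by at most 0.1 in either direction
xj <- jitter(x, amount = 0.1)
yj <- jitter(y, amount = 0.1)

plot(xj, yj, pch = ".")
```

The value 0.1 is just the one used above; a sensible amount depends on the spacing of the locations.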

> Using jittering for categorical data is really not to be recommended  
> and will certainly degrade in performance as the dataset gets bigger.

Yes, I probably wouldn't recommend jittering if there were more than a 
few hundred replications at any point, or more than a few hundred unique 
points.

Duncan Murdoch

P.S. iplots 1.1-1 may have an init problem in Windows: in my first 
attempt, the plot made the boxes too large to fit in their cells, but it 
fixed itself when I resized the window, and the bug doesn't seem to be 
repeatable.


> 
> Antony Unwin
> Professor of Computer-Oriented Statistics and Data Analysis,
> University of Augsburg,
> Germany



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 18/12/2007 10:02 AM, James W. MacDonald wrote:
> Duncan Murdoch wrote:
>> On 18/12/2007 7:31 AM, Antony Unwin wrote:
>>> Wayne,
>>>
>>> Try the iplot command in iPlots.  You can then vary both the  
>>> pointsize and the transparency of your scatterplot interactively and  
>>> decide which scatterplot conveys the information best.  Sometimes  
>>> it's helpful to use more than one scatterplot when presenting your  
>>> results.
>>>
>>> (I must admit to being very surprised that jittering and sunflower  
>>> plots have been suggested for a dataset of 5000 points.  Do those who  
>>> mentioned these methods have examples on that scale where they are  
>>> effective?)
>> Sure.  The original post said there were about 50-60 unique locations. 
>> This plot:
>>
>> x <- rbinom(5000, 20, 0.15)
>> y <- rbinom(5000, 20, 0.15)
>> plot(x,y)
>>
>> has a few more unique locations; tune those probabilities if you want it 
>> closer.  Due to the overlap, the distribution is very unclear.  But this 
>> plot
>>
>> plot(jitter(x), jitter(y))
> 
> Another alternative is smoothscatter() in the geneplotter package from 
> Bioconductor, which does a pretty reasonable job with these example data.

Yes, I agree.  (As an aside, there's actually a capital S in 
smoothScatter(), and it's a bit of a pain to install, because 
geneplotter depends on something that depends on DBI, which is not so 
easily available these days.)

Duncan Murdoch



Re: [R] accessing dimension names

2007-12-18 Thread jim holtman
dimnames(y)[[paste('x', idx, sep="")]]
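
The reason the $ form fails is that $ takes a literal name, while [[ ]] evaluates its argument first. A self-contained sketch with a made-up array y (the dimension names x93 and x94 are from the post, the contents are not):

```r
# Made-up array with named dimnames, standing in for the poster's y
y <- array(0, dim = c(2, 3),
           dimnames = list(x93 = c("1", "2"), x94 = c("0", "1", "2")))

idx <- 94                          # chosen by the random mechanism
nm <- paste("x", idx, sep = "")    # builds the string "x94"

# [[ ]] evaluates nm and looks up the list element with that name
dimnames(y)[[nm]]
```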

On Dec 18, 2007 6:01 AM,  <[EMAIL PROTECTED]> wrote:
> I have a matrix y:
>
> > dimnames(y)
> $x93
> [1] "1" "2"
>
> $x94
> [1] "0" "1" "2"
> .. so on  (there are other dimensions as well)
>
>
>
> I need to access a particular dimension, but a random mechanism tells me
> which dimension it would. So, sometimes I might need to access
> dimnames(y)$x93, some other time it would be dimnames(y)$x94.. and so on.
> Now let that random dimension be idx, then dimnames(y)$paste('x',idx,sep='')
> doesn't work.
>
> Can anyone help?
>
> Thanks!
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



[R] [R-pkgs] New version of systemfit (not backward compatible)

2007-12-18 Thread Arne Henningsen
Dear R users,

the systemfit package contains functions for fitting systems of simultaneous 
equations by various estimation methods (e.g. OLS, SUR, 2SLS, 3SLS). 
Currently version 0.8 of systemfit is available on CRAN. However, shortly we 
will upload version 1.0, which is NOT BACKWARD COMPATIBLE. The changes that 
broke backward compatibility were necessary to make systemfit() more similar 
to standard regression tools in R such as lm(). We hope that the usage of 
systemfit() is more intuitive for R users now. We will continue to maintain 
the 0.8 branch so that users can still use the old version if they do not 
want to update their R scripts. Both versions are and will be available for 
download from systemfit's website:
   http://www.systemfit.org/
which is a shortcut to 
   http://www.uni-kiel.de/agrarpol/ahenningsen/systemfit/

A paper that describes the (new version of the) systemfit package is 
forthcoming in the Journal of Statistical Software (JSS). A preprint of this 
paper is available on systemfit's website:
   http://www.systemfit.org/systemfit_paper_1.0.pdf

The following list summarizes the most important changes 
from version 0.8 to 1.0:
- some names of systemfit()'s arguments have changed to make it more 
  similar to standard regression tools in R
- the order of systemfit()'s arguments has changed to make it more 
  similar to standard regression tools in R
- the names of the elements in the object returned by systemfit() have 
  changed to make it more similar to lm()
- added several methods for systemfit objects that are generally 
  available for standard regression tools in R
- restrictions on the coefficients can be specified symbolically now
- the functionality of systemfitClassic() has been integrated into systemfit()
- replaced ftest.systemfit() and waldtest.systemfit() by the method
  linear.hypothesis()
- systemfit now uses the "Matrix" package for matrix calculations (this 
  makes the estimation of large models and large data sets much faster)
- improved checking of the arguments so that error messages are more 
  helpful now

We thank two anonymous referees of the JSS, Achim Zeileis, John Fox, William 
H. Greene, Ott Toomet, Duncan Murdoch, Martin Maechler, Douglas Bates and 
several (other) systemfit users for their answers, comments, and/or 
suggestions that helped us to improve the systemfit package.

Feedback is always welcome!
Arne & Jeff

-- 
Arne Henningsen
Department of Agricultural Economics
University of Kiel
Olshausenstr. 40
D-24098 Kiel (Germany)
Tel: +49-431-880 4445 or +49-4349-914871
Fax: +49-431-880 1397
[EMAIL PROTECTED]
http://www.uni-kiel.de/agrarpol/ahenningsen/

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread James W. MacDonald
Duncan Murdoch wrote:
> Yes, I agree.  (As an aside, there's actually a capital S in 
> smoothScatter(), and it's a bit of a pain to install, because 
> geneplotter depends on something that depends on DBI, which is not so 
> easily available these days.)

Somehow I always forget the capital S and wonder if I have loaded the 
correct package ;-D

As for installing the required dependencies, I believe this is actually 
quite straightforward:

source("http://www.bioconductor.org/biocLite.R")
biocLite("geneplotter")

Should install geneplotter and all required dependencies.

Best,

Jim


> 
> Duncan Murdoch

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



Re: [R] bar plot colors

2007-12-18 Thread John Kane
I think you're going to find that a barchart with that
many values in a bar is going to be pretty well
uninterpretable.

Jim Lemon gives the desired barchart but it is very
difficult to read.

Stealing his code to create the same matrix, I'd
suggest maybe looking at a dotchart.  I'm not sure if
this is even close to an optimal solution but I do
think it's a bit better than a barchart approach.
==
heights<-matrix(sample(10:70,54),ncol=3)
bar.colors<-rep(rep(2:7,each=3),3)
cost.types <- c("Direct", "Indirect", "Induced")
colnames(heights) <-  c("A", "B", "C")
rownames(heights) <- c(rep(cost.types, 6))

dotchart(heights, col=bar.colors, pch=16, cex=.6)

===
--- "Winkel, David" <[EMAIL PROTECTED]> wrote:

> All,
> 
>  
> 
> I have a question regarding colors in bar plots.  I
> want to stack a
> total of 18 cost values in each bar. Basically, it
> is six cost types and
> each cost type has three components- direct,
> indirect, and induced
> costs.  I would like to use both solid color bars
> and bars with the
> slanted lines (using the density parameter).  The
> colors would
> distinguish cost types and the lines would
> distinguish
> direct/indirect/induced.  I want the cost types
> (i.e. colors) to be
> stacked together for each cost type.  In other
> words, I don't want all
> of the solid bars at the bottom and all of the
> slanted lines at the top.
> 
> 
> So far, I have made a bar plot with all solid colors
> and then tried to
> overwrite that bar plot by calling barplot() again
> and putting the white
> slanted lines across the bars.  However, I can't get
> this method to work
> while still grouping the cost types together.
> 
>  
> 
> Thanks in advance for any help you can provide.
> 
>  
> 
> David Winkel
> 
> Applied Biology and Aerosol Technology
> 
> Battelle Memorial Institute
> 
> 505 King Ave.
> 
> Columbus, Ohio 43201
> 
> 614.424.3513
> 
>  
> 
> 



Re: [R] regression towards the mean, AS paper November 2007

2007-12-18 Thread Kevin Wright
On Dec 17, 2007 3:10 PM, hadley wickham <[EMAIL PROTECTED]> wrote:
> > This has nothing to do really with the question that Troels asked,
> > but the exposition quoted from the AA paper is unnecessarily 
> > confusing.
> > The phrase ``Because X0 and X1 have identical marginal
> > distributions ...''
> > throws the reader off the track.  The identical marginal 
> > distributions
> > are irrelevant.  All one needs is that the ***means*** of X0 and X1
> > be the same, and then the null hypothesis tested by a paired t-test
> > is true and so the p-values are (asymptotically) Uniform[0,1].  With
> > a sample size of 100, the ``asymptotically'' bit can be safely 
> > ignored
> > for any ``decent'' joint distribution of X0 and X1.  If one further
> > assumes that X0 - X1 is Gaussian (which has nothing to do with X0 
> > and
> > X1 having identical marginal distributions) then ``asymptotically''
> > turns into ``exactly''.
>
> Another related issue is that uniform distributions don't look very uniform:
>
> hist(runif(100))
> hist(runif(1000))
> hist(runif(10000))
>
> Be sure to calibrate your eyes (and your bin width) before rejecting
> the hypothesis that the distribution is uniform.
>
> Hadley

Thanks for the example, Hadley.  To me, this suggests we should stop
teaching histograms in Stat 101 and instead use quantile plots, which
give excellent results for n=100 and even surprisingly good results
for n=10:

par(mfrow=c(2,2))
for(i in c(10, 100, 1000, 10000)) {
  qqplot(runif(i), qunif(seq(1/i, 1, length=i)), main=i,
 xlim=c(0,1), ylim=c(0,1),
 xlab="runif", ylab="Uniform distribution quantiles")
  abline(0,1,col="lightgray")
}

Kevin (drifting even further off topic)



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Antony Unwin

On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:

>> One good alternative here is the fluctuation diagram  variant of a  
>> mosaic plot:
>> xx<-as.factor(x)
>> yy<-as.factor(y)
>> imosaic(xx,yy, type="f")
>
> That plot is better than jittering, but there's the problem in the  
> mosaic plot of understanding the scale of the rectangles:  is it  
> area or diameter that encodes the count?

Area is used.

> With a jittered plot, you lose resolution when the number of points  
> gets too high because you just see a mess of ink, but at least you  
> only require the viewer to count in order to get a close numerical  
> reading from the plot.

If someone needs a count, they should be given a table.   Graphics  
are for qualitative conclusions not details.  Anyway, counting will  
only work for really small datasets.

> I could also claim that while imperfect, at least jittering is  
> widely applicable.  For example, if the data were not on a regular  
> grid, perhaps because they had been generated like this:
>
> xloc <- rnorm(50)
> yloc <- rnorm(50)
> index <- sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
> x <- xloc[index]
> y <- yloc[index]
>
> then jittering still works as well (or as poorly), but the imosaic  
> would not work at all.

That's right and that's (almost) the sort of example I was thinking  
of.  For a limited number of locations like this a bubble plot would  
be best (which has already been suggested in this thread, I think).   
For many locations and few replications I would still go for varying  
pointsize and transparency.
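
A minimal base R sketch of such a bubble plot for the generated data, where the circle area is proportional to the number of replications. The seed and the name counts are my additions, and symbols() is only one of several ways to draw it:

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable

xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
x <- xloc[index]
y <- yloc[index]

# One row per distinct (x, y) location with its replication count n
counts <- aggregate(list(n = rep(1, length(x))),
                    by = list(x = x, y = y), FUN = sum)

# Radius proportional to sqrt(n), so the area is proportional to n
symbols(counts$x, counts$y, circles = sqrt(counts$n), inches = 0.2,
        xlab = "x", ylab = "y")
```

With many distinct locations the circles would start to overlap, which is where varying point size and transparency would probably do better.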

Incidentally, to check your suggestion I ran your code and discovered  
that the transparency in iplot does not seem to like replications.   
Very strange, we'll have to check why.  I then looked closely at the  
numbers of replications generated and discovered that case 25 was  
picked 325 times and case 40 only once.  Rather too extreme for my  
liking!  Running it again gave very similar results, though not  
exactly the same: this time it was 325 times for case 25 and case 40  
was not picked at all.  Other numbers varied slightly.  This is not  
what I expected, any ideas?
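
If I read the generating code correctly, this may simply be what prob = abs(xloc) asks for: each case is drawn with probability proportional to abs(xloc), so cases with xloc near zero are hardly ever picked and the counts stay similar between runs as long as xloc is not regenerated. A sketch to compare observed and expected counts (the seed and the names p, expected and observed are my additions):

```r
set.seed(42)   # arbitrary seed, only so the sketch is repeatable

xloc <- rnorm(50)

# Selection probabilities implied by prob = abs(xloc)
p <- abs(xloc) / sum(abs(xloc))
expected <- 5000 * p

index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
observed <- tabulate(index, nbins = 50)

# Cases with the smallest and largest expected counts
cbind(case = 1:50, expected = round(expected, 1), observed)[
  c(which.min(expected), which.max(expected)), ]
```

Using equal probabilities (leaving out prob=) would give roughly 100 replications per case instead.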

> P.S. iplots 1.1-1 may have an init problem in Windows: in my first  
> attempt, the plot made the boxes too large to fit in their cells,  
> but it fixed itself when I resized the window, and the bug doesn't  
> seem to be repeatable.

Thanks.  This happens occasionally on the Mac too.  Refreshing solves  
it in practice, but we need to find out why it can happen (and stop  
it happening!).

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg,
Germany


Re: [R] calculating the number of days from dates

2007-12-18 Thread bogdan romocea
> Sorry for using library instead package, but
> library() is one command for using packages.

... which is why all efforts to make folks say "package" instead of >>
"library" << are doomed to fail, IMHO. Besides, in English, "library"
also means "a collection of software or data usually reflecting a
specific theme or application" (#9 on the list from
http://dictionary.reference.com/ ). Therefore:
   > "library" == "package"
   [1] TRUE!
and just about the only way to clear up the "confusion" would be to
rename library() to package(), and replace "library" with "folder" or
"directory".
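
For what it's worth, the distinction as R itself uses the words can be seen from the prompt. A small sketch (stats is used only because it ships with every R installation; the name libs is illustrative):

```r
# Libraries: the directories in which installed packages live
libs <- .libPaths()
libs

# library() attaches a package found in one of those libraries
library(stats)

# The attached package shows up on the search path as "package:stats"
"package:stats" %in% search()
```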



> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Knut Krueger
> Sent: Monday, December 17, 2007 2:11 AM
> To: 'R R-help'
> Subject: Re: [R] calculating the number of days from dates
>
>
> > it's  a  >> package <<  , not a library, please!
> >
> >
> Sorry for using library instead package, but
>
> library() is one command for using packages.
>
> Therefore I (and it seems that i am not the only one) used
> library instead package.
>
> Knut
>



[R] Sweave and Scientific Workplace

2007-12-18 Thread Dietrich Trenkler
Dear HelpeRs,

a colleague of mine uses Scientific Workplace to write his LaTeX documents.
I made his mouth water mentioning the advantages of using Sweave.

Not using SW myself, I wonder if anyone out there has gathered some
experience in using the combination of both.

Thank you in advance

Dietrich

-- 
Dietrich Trenkler c/o Universitaet Osnabrueck 
Rolandstr. 8; D-49069 Osnabrueck, Germany
email: [EMAIL PROTECTED]



Re: [R] regression towards the mean, AS paper November 2007

2007-12-18 Thread hadley wickham
> Thanks for the example, Hadley.  To me, this suggests we should stop
> teaching histograms in Stat 101 and instead use quantile plots, which
> give excellent results for n=100 and even surprisingly good results
> for n=10:

It all depends on what you're trying to do - I don't think histograms
are particularly good as density estimators, but that's not what
you're using them for most of the time! You're using them as an
exploratory tool to try and understand what's going on in your data -
often you'll need to use very small bin widths which help find
unexpected gaps and patterns in your data.   It's helpful to have some
feel for what common distributions look like.
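
A quick way to build that feel is to look at the same uniform sample at two bin widths. This is only a sketch: the seed, the sample size and the names coarse and fine are arbitrary choices of mine:

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable
x <- runif(1000)

# Same data, two bin widths; plot = FALSE returns the counts without drawing
coarse <- hist(x, breaks = seq(0, 1, by = 0.1), plot = FALSE)
fine   <- hist(x, breaks = seq(0, 1, by = 0.01), plot = FALSE)

range(coarse$counts)   # counts per bin of width 0.1
range(fine$counts)     # counts per bin of width 0.01

# Draw both for a visual comparison
par(mfrow = c(1, 2))
plot(coarse, main = "bin width 0.1")
plot(fine, main = "bin width 0.01")
```

The narrow bins look much more ragged even though the data are uniform, which is the calibration point made earlier in the thread.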

Hadley

-- 
http://had.co.nz/



Re: [R] ggplot2 - getting at the grobs

2007-12-18 Thread hadley wickham
Hi Pedro,

Could you be a bit more explicit about what you're trying to do?  Have
you read the last chapter of the draft ggplot book?

Hadley

On Dec 18, 2007 8:41 AM, Pedro de Barros <[EMAIL PROTECTED]> wrote:
> Dear All,
>
> I continue trying to get several of my plotting functions to use
> ggplot, because I really do like the concept of the graphical
> objects, and working with them in the abstract.
> I am now trying to access the grobs to manipulate using grid.
> However, until now all I managed was to get the plot as a gTree
> object, and manipulate it as a gTree from there. The problem is that
> then it is no longer a ggplot, and thus I can no longer use the
> ggplot functions.
> How to get at the grobs, without converting the ggplot into a gTree?
>
> Thanks,
> Pedro
>



-- 
http://had.co.nz/



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 12/18/2007 11:21 AM, James W. MacDonald wrote:
> Duncan Murdoch wrote:
>> Yes, I agree.  (As an aside, there's actually a capital S in 
>> smoothScatter(), and it's a bit of a pain to install, because 
>> geneplotter depends on something that depends on DBI, which is not so 
>> easily available these days.)
> 
> Somehow I always forget the capital S and wonder if I have loaded the 
> correct package ;-D
> 
> As for installing the required dependencies, I believe this is actually 
> quite straightforward:
> 
> source("http://www.bioconductor.org/biocLite.R")
> biocLite("geneplotter")
> 
> Should install geneplotter and all required dependencies.

Yes, that works.  Not sure why DBI was unavailable for a simple install 
of geneplotter from the Windows Rgui; when I try it now (on a different 
PC, maybe using a different mirror) it's there.

Duncan Murdoch



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 12/18/2007 12:44 PM, Antony Unwin wrote:
> On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:
> 
>>> One good alternative here is the fluctuation diagram  variant of a  
>>> mosaic plot:
>>> xx<-as.factor(x)
>>> yy<-as.factor(y)
>>> imosaic(xx,yy, type="f")
>>
>> That plot is better than jittering, but there's the problem in the  
>> mosaic plot of understanding the scale of the rectangles:  is it  
>> area or diameter that encodes the count?
> 
> Area is used.
> 
>> With a jittered plot, you lose resolution when the number of points  
>> gets too high because you just see a mess of ink, but at least you  
>> only require the viewer to count in order to get a close numerical  
>> reading from the plot.
> 
> If someone needs a count, they should be given a table.   Graphics  
> are for qualitative conclusions not details.  Anyway, counting will  
> only work for really small datasets.
> 
>> I could also claim that while imperfect, at least jittering is  
>> widely applicable.  For example, if the data were not on a regular  
>> grid, perhaps because they had been generated like this:
>>
>> xloc <- rnorm(50)
>> yloc <- rnorm(50)
>> index <- sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
>> x <- xloc[index]
>> y <- yloc[index]
>>
>> then jittering still works as well (or as poorly), but the imosaic  
>> would not work at all.
> 
> That's right and that's (almost) the sort of example I was thinking  
> of.  For a limited number of locations like this a bubble plot would  
> be best (which has already been suggested in this thread, I think).   
> For many locations and few replications I would still go for varying  
> pointsize and transparency.
> 
> Incidentally, to check your suggestion I ran your code and discovered  
> that the transparency in iplot does not seem to like replications.   
> Very strange, we'll have to check why.  I then looked closely at the  
> numbers of replications generated and discovered that case 25 was  
> picked 325 times and case 40 only once.  Rather too extreme for my  
> liking!  Running it again gave very similar results, though not  
> exactly the same: this time it was 325 times for case 25 and case 40  
> was not picked at all.  Other numbers varied slightly.  This is not  
> what I expected, any ideas?

abs(xloc) typically varies by a factor of about 100 from smallest to 
largest, but sometimes the small end is really small, and so the ratio 
is really big.
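
A small sketch of that effect, reusing the sampling scheme quoted below. The seed is an arbitrary assumption, so the numbers will differ from those Antony reports.

```r
set.seed(123)                 # arbitrary seed, only for reproducibility
xloc <- rnorm(50)
w <- abs(xloc)                # sampling weights, as in the quoted example

max(w) / min(w)               # ratio of the largest to the smallest weight

# Draw 5000 locations with probability proportional to w and count the draws.
index  <- sample(1:50, 5000, replace = TRUE, prob = w)
counts <- tabulate(index, nbins = 50)
range(counts)                 # fewest and most replications of any location
```

Locations whose weight is close to zero are drawn rarely or not at all, which is consistent with the very uneven replication counts described in the quoted message.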

Duncan Murdoch

> 
>> P.S. iplots 1.1-1 may have an init problem in Windows: in my first  
>> attempt, the plot made the boxes too large to fit in their cells,  
>> but it fixed itself when I resized the window, and the bug doesn't  
>> seem to be repeatable.
> 
> Thanks.  This happens occasionally on the Mac too.  Refreshing solves  
> it in practice, but we need to find out why it can happen (and stop  
> it happening!).
> 
> Antony Unwin
> Professor of Computer-Oriented Statistics and Data Analysis,
> University of Augsburg,
> Germany



[R] R brakes when submitting a query to MySQL

2007-12-18 Thread Marc Moragues
Hello,

I would like to retrieve data stored in a MySQL database, so I installed
the RMySQL package.
I can successfully connect to my database using the following code:

> dvr<-dbDriver("MySQL")
> con2<-dbConnect(dvr,group="exbardiv")
> mysqlDescribeConnection(con2)

 
  User: mmorag 
  Host: localhost 
  Dbname: exbardiv 
  Connection type: localhost via TCP/IP 
  No resultSet available

I can even see the tables in the database

> dbListTables(con2)
[1] "agoueb"    "high_ld"   "rescue"    "sjlc_info" "sjlc_ld"   "temp"
[7] "temp_snp1" "temp_snp2"

However, when I try to query the database, R breaks.

res<-dbSendQuery(con,'select * from sjlc_ld')

Can anyone help me tune up the connection between R and MySQL?

Thank you,
Marc.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by 
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:

This email is from the Scottish Crop Research Institute, but the views 
expressed by the sender are not necessarily the views of SCRI and its 
subsidiaries.  This email and any files transmitted with it are confidential 
to the intended recipient at the e-mail address to which it has been 
addressed.  It may not be disclosed or used by any other than that addressee.
If you are not the intended recipient you are requested to preserve this 
confidentiality and you must not use, disclose, copy, print or rely on this 
e-mail in any way. Please notify [EMAIL PROTECTED] quoting the 
name of the sender and delete the email from your system.

Although SCRI has taken reasonable precautions to ensure no viruses are 
present in this email, neither the Institute nor the sender accepts any 
responsibility for any viruses, and it is your responsibility to scan the email 
and the attachments (if any).




[R] PCA - "cov.wt(z) : 'x' must contain finite values only"

2007-12-18 Thread Johnson, Bethany
I am trying to run PCA on a matrix (the first column and row are
headers).  There are several cells with NA's.  When I run PCA with the
following code:
__
setwd("I:/PCA")
AsianProp<-read.csv("Matrix.csv", sep=",", header=T, row.names=1)
attach(AsianProp)
AsianProp
AsianProp.pca<-princomp(AsianProp, na.omit)
_

I get the error message:

cov.wt(z) : 'x' must contain finite values only

What am I doing wrong?  

Thanks very much!



Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Zembower, Kevin
Is it your use of 'con' rather than 'con2' in dbSendQuery? -Kevin



Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Emmanuel Charpentier
Marc Moragues a écrit :
> Hello,
> 
> I would like to retrieve data stored in MySQL database, so I installed
> RMySQL package.
> I can successfully connect with the my database using the following code
> 
>> dvr<-dbDriver("MySQL")
>> con2<-dbConnect(dvr,group="exbardiv")
>> mysqlDescribeConnection(con2)
> 
>  
>   User: mmorag 
>   Host: localhost 
>   Dbname: exbardiv 
>   Connection type: localhost via TCP/IP 
>   No resultSet available
> 
> I can even see the tables in the database
> 
>> dbListTables(con2)
> [1] "agoueb"    "high_ld"   "rescue"    "sjlc_info" "sjlc_ld"   "temp"
> [7] "temp_snp1" "temp_snp2"
> 
> However, when I try to query the database, R breaks.

What does *that* mean? You should be a bit more descriptive...

> res<-dbSendQuery(con,'select * from sjlc_ld')

require(MindeReaderAlpha).

Hmm ... Isn't the "breakage" just a long wait with no answer? Or
maybe a timeout?

In which case I'd try to put a semicolon (";")  at the end of the SQL
query, thus making it syntactically valid SQL...

HTH,

Emmanuel Charpentier



Re: [R] Scatterplot Showing All Points

2007-12-18 Thread bogdan romocea
Another approach which I'm pleased with but was not suggested so far
is jitter + kde2d from MASS:

plot(jitter(x), jitter(y))
if (!exists("kde2d")) require(MASS)
kdesamp <- 2  #depending on your RAM
forkde <- if (kdesamp < length(x)) sample(1:length(x), kdesamp,
replace=FALSE) else 1:length(x)
d <- kde2d(x[forkde], y[forkde])
contour(d, add=TRUE)
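
For data that really do sit on a small number of exact locations, a bubble plot (suggested elsewhere in this thread) is another option. The following is only a sketch on simulated coordinates, which are an assumption of this example: xyTable() from grDevices counts how often each (x, y) pair occurs, and the symbol area is scaled by that count.

```r
# Simulated overlapping coordinates (an assumption for illustration).
set.seed(1)
x <- sample(1:8, 5000, replace = TRUE)
y <- sample(1:6, 5000, replace = TRUE)

# One row per distinct (x, y) location, with the number of points there.
tab <- xyTable(x, y)

# cex scales the symbol radius, so use the square root of the relative
# count to make the symbol area proportional to the frequency.
plot(tab$x, tab$y,
     cex = 4 * sqrt(tab$number / max(tab$number)),
     pch = 16, xlab = "x", ylab = "y")
```

The factor 4 is an arbitrary choice for the largest symbol size and would need adjusting for a real data set.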





Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Marc Moragues
You are right, it was a mistake in copying and pasting the code. There is
no error message from R when I run the query with con2. Instead I get a
Windows error message saying "R for Windows terminal front-end has
encountered a problem and needs to close".

The error signature is:

AppName: rterm.exe   AppVer: 2.60.43063.0   ModName: msvcrt.dll
ModVer: 7.0.2600.2180   Offset: 000378c0

Marc. 



Re: [R] PCA - "cov.wt(z) : 'x' must contain finite values only"

2007-12-18 Thread Ravi Varadhan
The problem is the missing values.  The argument "na.action" is not active
in princomp(), which I think is a bug, even though the help page claims that
"factory fresh" default is na.omit.

So, you need to either get rid of the rows with any missing values in them,
or use a PCA code that can deal with missing values by somehow imputing
them.
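
A minimal sketch of the first option, dropping incomplete rows. The data frame below is made up for illustration and merely stands in for AsianProp. Either remove the rows before calling princomp(), or use the formula interface, whose method does accept an na.action argument.

```r
# Toy data with missing values (hypothetical, standing in for AsianProp).
df <- data.frame(
  a = c(1.0, 2.1,  NA, 4.2, 5.1,  6.3,  7.0,  8.4),
  b = c(2.1, 3.9, 6.2,  NA, 9.8, 12.1, 13.7, 16.2),
  c = c(5,   3,   4,   1,   2,    0,    6,    2)
)

# Option A: remove incomplete rows first, then run the PCA.
complete <- na.omit(df)
pca1 <- princomp(complete)

# Option B: the formula method of princomp() honours na.action.
pca2 <- princomp(~ ., data = df, na.action = na.omit)

summary(pca1)
```

Both calls use the same six complete rows. Note that in princomp(AsianProp, na.omit) the second positional argument is matched to cor, not to na.action, so it does not remove anything.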

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 







[R] Scatterplot3d model reporting question

2007-12-18 Thread Max
I've used the scatterplot3d function to graph some data and had it 
graph a "smooth" fit. Is there a way to actually find out the function 
of the surface? I've looked through the help and figured out how to get 
it to report the following:

Family: gaussian
Link function: identity

Formula:
y ~ s(x, z)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.20750    0.01223   16.97   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
         edf Est.rank     F  p-value
s(x,z) 8.403       17 9.729 1.76e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.684   Deviance explained = 70.9%
GCV score = 0.017692   Scale est. = 0.016151  n = 108

But I'm still not really sure what I'm looking at, either that or 
"smooth" means something different than I thought. Any help would be 
great!

thanks,

-Max



[R] comparing poisson distributions

2007-12-18 Thread Mark Gosink
Hello all,

I would like to compare two sets of count data which form
Poisson distributions. I'd like to generate some sort of p-value for the
likelihood that the distributions are the same. Thanks in advance for
your advice.
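
One possible starting point, offered only as a sketch: if each set can be treated as independent Poisson counts with a single rate and equal exposure per observation (both assumptions, and the counts below are made up), poisson.test() in the stats package compares the two rates. It tests equality of the rates, not of the whole distributions.

```r
# Made-up count data for two groups (hypothetical values).
a <- c(3, 5, 2, 4, 6, 3, 4)
b <- c(6, 7, 5, 9, 4, 8, 6)

# Total events in each group, with the number of observations as exposure.
res <- poisson.test(x = c(sum(a), sum(b)), T = c(length(a), length(b)))

res$p.value     # p-value for H0: the two Poisson rates are equal
res$estimate    # estimated ratio of the two rates
```

If the counts are overdispersed relative to a Poisson model, this comparison would be too optimistic and a different model would be needed.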

 

Cheers,

Mark

 

Mark Gosink, Ph.D.

Head of Computational Biology
Scripps Florida
5353 Parkside Drive - RFA
Jupiter, FL  33458
tel: 561-799-8921
fax: 561-799-8952
[EMAIL PROTECTED]

 




[R] 3d plotting

2007-12-18 Thread Brad B
I am trying to do a 3d plot.
  I tried persp but my data is not a matrix.
  The plot3d function returns this error: (list) object cannot be coerced to 
'double'
  Here's my code:
  td<-read.csv("td.csv", header=TRUE)
price<-read.csv("price.csv", header=TRUE)
contractdate<-read.csv("contractdate.csv", header=TRUE)
library(rgl)
plot3d(td,contractdate,price)
   
  the 3 csv data files have the following format,
  1
  2
  3
  4
  5
  6
  ...
  60,000
   
  basically I have 3 columns, x y and z that have 6 rows (data points) I 
want to plot.
   

   


[R] Forestplot

2007-12-18 Thread francogrex

I know there is a function forestplot from rmeta package and also the
plot.meta from the meta package and maybe others, but they are rather
complicated with extra plot parameters that I do not need and also they
process only objects created with other package functions. 
But I wonder if anyone has a much simpler function using the basic plot to
make a forestplot with only a median (or mean) and just the confidence
intervals, something like the data below in graphics. thanks

Events  2.50%   50%     97.50%
A       0.33    2.49    24.96
B       0.25    1.9     19.56
C       0.34    1.28    5.35
D       1.58    2.94    5.54
E       0.82    1.94    4.71
F       1.04    3.18    10.32
G       0.58    1.44    3.72
H       0.04    0.48    3.79
I       0.17    0.67    2.52
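
A minimal base-graphics sketch of such a plot, using the values from the table above. The function name simple_forest, the log-scaled axis, and the reference line at 1 are assumptions of this example and not part of the original post.

```r
# The estimates from the table: lower limit, median, upper limit.
est <- data.frame(
  event  = LETTERS[1:9],
  lower  = c(0.33, 0.25, 0.34, 1.58, 0.82, 1.04, 0.58, 0.04, 0.17),
  median = c(2.49, 1.90, 1.28, 2.94, 1.94, 3.18, 1.44, 0.48, 0.67),
  upper  = c(24.96, 19.56, 5.35, 5.54, 4.71, 10.32, 3.72, 3.79, 2.52)
)

# Hypothetical helper: one row per event, a point at the median and a
# horizontal segment from the lower to the upper limit.
simple_forest <- function(d, ref = 1) {
  n    <- nrow(d)
  ypos <- rev(seq_len(n))                     # first event at the top
  plot(d$median, ypos, log = "x", pch = 15,
       xlim = range(d$lower, d$upper), ylim = c(0.5, n + 0.5),
       yaxt = "n", xlab = "Estimate (log scale)", ylab = "")
  segments(d$lower, ypos, d$upper, ypos)      # the intervals
  axis(2, at = ypos, labels = d$event, las = 1)
  abline(v = ref, lty = 2)                    # reference line
  invisible(ypos)
}

ypos <- simple_forest(est)
```

The log scale is a guess based on the skewed intervals; removing log = "x" gives a linear axis.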

-- 
View this message in context: 
http://www.nabble.com/Forestplot-tp14404133p14404133.html
Sent from the R help mailing list archive at Nabble.com.



[R] Import GAUSS .FMT files

2007-12-18 Thread Pedro.Rodriguez
Dear All,

 

Is it possible to import GAUSS .FMT files into R? 

 

Thanks for your time.

 

Kind Regards,

 

Pedro N. Rodriguez

 

 




Re: [R] 3d plotting

2007-12-18 Thread Scionforbai
> 60,000

I hope that you actually haven't got any comma to separate the
thousands... it separates fields in a CSV file (as the "Comma
Separated Values" name may suggest). If so, get rid of the commas.

>   the 3dplot function returns this error,
>  (list) object cannot be coerced to 'double'

> td<-read.csv("td.csv", header=TRUE)
> price<-read.csv("price.csv", header=TRUE)
> contractdate<-read.csv("contractdate.csv", header=TRUE)

You have to coerce the 1-column dataframes created by read.csv to
numeric vectors or to a 6x3 dataframe.
solution 1:

myData <- cbind(td,contractdate,price)

> library(rgl)

plot3d(myData)

solution 2:

td <- td[[1]]   # extract the single column as a numeric vector
...
price <- price[[1]]
plot3d(td,contractdate,price)
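
The coercion step can be checked without rgl. This is only a sketch: the three small inline data sets are made up and stand in for the CSV files, which are read here from text so that the example is self-contained.

```r
# Stand-ins for the three one-column CSV files (hypothetical values).
td           <- read.csv(text = "td\n1\n2\n3", header = TRUE)
contractdate <- read.csv(text = "contractdate\n10\n20\n30", header = TRUE)
price        <- read.csv(text = "price\n1.5\n2.5\n3.5", header = TRUE)

is.data.frame(td)   # read.csv returns a data frame (a list of columns)
is.numeric(td)      # the data frame itself is not a numeric vector

# [[1]] extracts the first column as a plain numeric vector.
xyz <- cbind(td           = td[[1]],
             contractdate = contractdate[[1]],
             price        = price[[1]])
str(xyz)            # a numeric matrix with one column per coordinate
```

A numeric matrix or data frame with three columns like xyz is the kind of object that can be handed to a 3d plotting function in place of three separate data frames.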

Bye,

ScionForbai



[R] Specifying starting values in lme (nlme package) using msScale

2007-12-18 Thread Catherine A. Holt

I am using package nlme and would like to specify initial values for a linear 
mixed-effects model to help with convergence. I am trying to specify those 
initial values using the msScale option under ‘control’ in the lme() function:

lme(Y ~ X1,  random= ~ X1|X2, control=list(msScale=lmeScale))

where, (as far as I understand), lmeScale is a function that can take initial 
values for parameters. However, I am unsure about how to input those starting 
values (e.g., what names lme will recognize for fixed and random effects, in 
what format, and if a partial list of initial values would be acceptable?).

Any advice or examples of code inputting starting values would be extremely 
helpful. I have been unable to find examples myself online. Note, although it 
may be easier to do this in the lme4 package, I would prefer to use nlme.

Thank you for your attention.
Sincerely, 
Carrie Holt, Ph.D., M.Sc., B.Sc.(Honours)
University of Washington
School of Aquatic & Fishery Sciences
Box 355020
Seattle, WA 98195



Re: [R] Scatterplot3d model reporting question

2007-12-18 Thread John Fox
Dear Max,

I'm guessing that you're actually using scatter3d() in the Rcmdr package rather 
than scatterplot3d(), since the latter, I believe, doesn't fit regression 
surfaces.

If I'm right, then as it says in ?scatter3d, the smooth surface is fit by the 
gam() function in the mgcv package using a smoothing spline and there is no 
explicit equation to examine. See ?gam for more information.
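
Although there is no closed-form equation, the fitted surface can still be evaluated at any point with predict(). The following is a sketch on simulated data, which are an assumption of this example; it uses the same model formula as in the reported summary.

```r
library(mgcv)

# Simulated data (hypothetical), with n = 108 as in the reported output.
set.seed(1)
n <- 108
x <- runif(n)
z <- runif(n)
y <- 0.2 + 0.3 * sin(2 * pi * x) * z + rnorm(n, sd = 0.1)

# Same model formula as in the summary: a smooth function of x and z.
fit <- gam(y ~ s(x, z))

# Evaluate the fitted surface on a regular grid of (x, z) values.
grid <- expand.grid(x = seq(0, 1, length.out = 5),
                    z = seq(0, 1, length.out = 5))
grid$fitted <- as.numeric(predict(fit, newdata = grid))
head(grid)
```

The fitted values in grid are the heights of the smooth surface at those coordinates, which is usually what is needed in place of an explicit formula.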

I hope this helps,
 John


John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox



[R] plotting magnitude

2007-12-18 Thread dkowalske
I am plotting fishing vessel positions and want these points to be
sized relative to the catch at that point.  Is this possible? I am just
beginning to use R, and my search of the help section didn't help in this
area.  Here's what I'm using so far:

xyplot(data$latdeg~data$londeg |vessek , groups=vessek,
xlim=rev(range(69:77)),ylim=(range(35:42)), data=data,
main=list ("Mackerel catches", cex=1.0),
ylab="latitude", notch=T, varwidth=T,
xlab="longitude", cex.axis=0.5)
Any info would be appreciated.
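
One way to do this in base graphics, as a sketch only: the positions and catches below are simulated, and the column name catch is an assumption because the post does not name the catch variable. The symbol size is passed through cex, scaled so that the symbol area grows with the catch.

```r
# Simulated positions and catches (hypothetical values and column names).
set.seed(42)
catches <- data.frame(
  londeg = runif(30, 69, 77),
  latdeg = runif(30, 35, 42),
  catch  = rexp(30, rate = 1 / 500)
)

# cex scales the symbol radius; the square root makes the area roughly
# proportional to the catch. 0.5 keeps the smallest catches visible.
cex_by_catch <- 0.5 + 3 * sqrt(catches$catch / max(catches$catch))

plot(catches$londeg, catches$latdeg, cex = cex_by_catch, pch = 1,
     xlim = c(77, 69), ylim = c(35, 42),   # reversed x axis, as in the post
     xlab = "longitude", ylab = "latitude", main = "Mackerel catches")
```

The constants 0.5 and 3 are arbitrary choices for the smallest and largest symbol sizes.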



Re: [R] 3d plotting

2007-12-18 Thread Brad B
That worked.
  However, I'm trying to get a surface contour like persp() would show.
  Since I don't have a matrix data set, I assumed that the wireframe function 
would do.
  Since I get an error using wireframe,
  no applicable method for "wireframe"
  I am using plot3d instead. I was under the impression that plot3d would do 
something similar to wireframe or persp, but it does not.
   
  Any other advice would be great.
  



Re: [R] Scatterplot3d model reporting question

2007-12-18 Thread John Fox
Dear Max,

I'm guessing that you're actually using scatter3d() in the Rcmdr package rather 
than scatterplot3d(), since the latter, I believe, doesn't fit regression 
surfaces.

If I'm right, then as it says in ?scatter3d, the smooth surface is fit by the 
gam() function in the mgcv package using a regression spline and there is no 
explicit equation to examine.

I hope this helps,
 John


John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Max
> Sent: December-18-07 2:02 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Scatterplot3d model reporting question
> 
> I've used the scatterplot3d function to graph some data and had it
> graph a "smooth" fit. Is there a way to actually find out the function
> of the surface? I've looked through the help and figured out how to get
> it to report the following:
> 
> Family: gaussian
> Link function: identity
> 
> Formula:
> y ~ s(x, z)
> 
> Parametric coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.20750    0.01223   16.97   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Approximate significance of smooth terms:
>          edf Est.rank     F  p-value
> s(x,z) 8.403       17 9.729 1.76e-14 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> R-sq.(adj) =  0.684   Deviance explained = 70.9%
> GCV score = 0.017692   Scale est. = 0.016151  n = 108
> 
> But I'm still not really sure what I'm looking at, either that or
> "smooth" means something different than I thought. Any help would be
> great!
> 
> thanks,
> 
> -Max
> 



Re: [R] Analyzing Publications from Pubmed via XML

2007-12-18 Thread David Winsemius
"Armin Goralczyk" <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> On 12/18/07, David Winsemius <[EMAIL PROTECTED]> wrote:
>> David Winsemius <[EMAIL PROTECTED]> wrote in
>> news:[EMAIL PROTECTED]:
>>
>> > "Armin Goralczyk" <[EMAIL PROTECTED]> wrote in
>> > news:[EMAIL PROTECTED]:
>>
>> >> I tried the above function with simple search terms and it worked
>> >> fine for me (also more output thanks to Martin's post) but when I
>> >> use search terms attributed to certain fields, i.e. with [au] or
>> >> [ta], I get the following error message:
>> >>> pm.srch()
>> >> 1: "laryngeal neoplasms[mh]"
>> >> 2:
>>
>> > I am wondering if you used spaces, rather than "+"'s? If so then
>> > you may want your function to do more gsub-processing of the input
>> > string. 
>>
>> I tried my theory that one would need "+"'s instead of spaces, but
>> disproved it. Spaces in the input string seem to produce acceptable
>> results on my WinXP/R.2.6.1/RGui system even with more complex search
>> strings.
>>
>> --
>>
> It's not the spaces; the problem is the tag (sorry that I didn't
> specify this), or maybe the string []. I am working on Mac OS X 10.4
> with R version 2.6. Is it maybe a string conversion problem? In the
> following error and warning, the strings in the URL appear to differ:
> Error in .Call("RS_XML_ParseTree", as.character(file), handlers,
> as.logical(ignoreBlanks),  :
>  error in creating parser for
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=laryngeal neoplasms[mh]
> I/O warning : failed to load external entity
> "http%3A//eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi%3Fdb=pubmed&term=laryngeal%20neoplasms%5Bmh%5D"

I do not have an up-to-date version of R on my Mac, since I have not yet 
upgraded to OSX10.4. I can try with my older version of R, but failure 
(or even success) with versions OSX-10.2/R-2.0 is not likely to be very 
informative. If you will post an example of the input that is resulting 
in the error, I can try it on my WinXP machine. If we cannot reproduce it 
there, then it may be more appropriate to take further questions to the 
Mac-R mailing list. The error message suggests to me that the fault lies 
in the connection phase of the task. 
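
One possible reading of the quoted warning is that the whole URL, including the "http:" scheme and the "?", was percent-encoded rather than only the search term. That is an assumption, not something the thread confirms. Under that assumption, a sketch that encodes only the term and leaves the rest of the URL alone could look like this (the variable names are made up; URLencode() is from the utils package):

```r
base <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
term <- "laryngeal neoplasms[mh]"

# Percent-encode the search term only; reserved = TRUE also escapes
# the square brackets of the field tag.
term_enc <- URLencode(term, reserved = TRUE)

# Leave the scheme, the "?" and the "&" untouched.
url <- paste0(base, "?db=pubmed&term=", term_enc)
print(url)
```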

-- 
David Winsemius



[R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Hi all,

What may be a smart, efficient way to get the following result:

myvector <- c("A","B","C","D","E")
myseries <- miracle(myvector)
myseries
[1]
[[1]] "A"
[2]
[[1]] "A" "B"
[3]
[[1]] "A" "B"
[4]
[[1]] "A" "B" "C"
[5]
[[1]] "A" "B" "C" "D"
[6]
[[1]] "A" "B" "C" "D" "E"

Thanks for any hints,

Joh



Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Should have been:
> myvector <- c("A","B","C","D","E")
> myseries <- miracle(myvector)
> myseries
> [1]
> [[1]] "A"
> [2]
> [[1]] "A" "B"
> [3]
> [[1]] "A" "B" "C"
> [4]
> [[1]] "A" "B" "C" "D"
> [5]
> [[1]] "A" "B" "C" "D" "E"

Sorry, Joh



Re: [R] All anchored series from a vector?

2007-12-18 Thread markleeds
>From: Johannes Graumann <[EMAIL PROTECTED]>
>Date: 2007/12/18 Tue PM 04:40:37 CST
>To: [EMAIL PROTECTED]
>Subject: [R] All anchored series from a vector?

lapply(1:length(myvector) function(.length) {
c(myvector[1}:myvector[.length])
})

but test it because i didn't.





Re: [R] All anchored series from a vector?

2007-12-18 Thread markleeds
>From: [EMAIL PROTECTED]
>Date: 2007/12/18 Tue PM 02:50:52 CST
>To: Johannes Graumann <[EMAIL PROTECTED]>
>Cc: r-help@r-project.org
>Subject: Re: [R] All anchored series from a vector?

i'm sorry. i tested it afterwards and of course
it had some problems. below is the working version.


myvector<-c("A","B","C","D","E")

result<- lapply(1:length(myvector), function(.length) {
myvector[1:.length]
})


print(result)






Re: [R] All anchored series from a vector?

2007-12-18 Thread Gabor Csardi
miracle <- function(x) { lapply(seq(along=x), function(y) x[1:y]) }

Gabor


-- 
Csardi Gabor <[EMAIL PROTECTED]>    MTA RMKI, ELTE TTK



Re: [R] plotting magnitude

2007-12-18 Thread Dylan Beaudette
On Tuesday 18 December 2007, [EMAIL PROTECTED] wrote:
> I am plotting fishing vessel positions and want these points to be
> relative in size to the catch at that point.  Is this possible? I am just
> begining to use R and my search of the help section didnt help in this
> area.  Heres what Im using so far
>
> xyplot(data$latdeg~data$londeg |vessek , groups=vessek,
> xlim=rev(range(69:77)),ylim=(range(35:42)), data=data,
>   main=list ("Mackerel catches", cex=1.0),
>   ylab="latitude", notch=T, varwidth=T,
>   xlab="longitude", cex.axis=0.5,)
> any info would be appreciated
>

How about scaling your plotting symbols by the sqrt() of their value? Or
see ?bubble in the gstat package.
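
A rough sketch of the square-root scaling in base graphics, with made-up data. The column names londeg and latdeg follow the original post; catch and the numbers are assumptions for illustration only.

```r
# Hypothetical catches at five positions (the numbers are made up).
d <- data.frame(londeg = c(70, 72, 73, 75, 76),
                latdeg = c(36, 38, 37, 40, 41),
                catch  = c(5, 20, 45, 80, 125))

# cex sets the symbol's linear size, so taking the square root makes
# the symbol area proportional to the catch. The largest gets cex = 3.
d$cex <- 3 * sqrt(d$catch / max(d$catch))

pdf(NULL)  # draw off-screen; remove this line to see the plot
plot(d$londeg, d$latdeg, cex = d$cex, pch = 1,
     xlim = rev(range(69:77)), ylim = range(35:42),
     xlab = "longitude", ylab = "latitude", main = "Mackerel catches")
invisible(dev.off())
```

The factor 3 is arbitrary; it only fixes the size of the largest symbol.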

cheers,

Dylan


-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341



Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Debugged version:
lapply(1:length(myvector), function(.length) {
myvector[1:.length]
})

Thanks for showing the direction!

Joh

[EMAIL PROTECTED] wrote:

>>From: Johannes Graumann <[EMAIL PROTECTED]>
>>Date: 2007/12/18 Tue PM 04:40:37 CST
>>To: [EMAIL PROTECTED]
>>Subject: [R] All anchored series from a vector?
> 
> lapply(1:length(myvector) function(.length) {
> c(myvector[1}:myvector[.length])
> })
> 
> but test it because i didn't.


Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Elegant. Thanks to you too.

Joh

Gabor Csardi wrote:

> miracle <- function(x) { lapply(seq(along=x), function(y) x[1:y]) }
> 
> Gabor


Re: [R] All anchored series from a vector?

2007-12-18 Thread Gabor Csardi
On Wed, Dec 19, 2007 at 12:01:25AM +0100, Johannes Graumann wrote:
> Debugged version:
> lapply(1:length(myvector), function(.length) {
> myvector[1:.length]
> })
> 
> Thanks for showing the direction!
> 
> Joh

Note that this fails if length(myvector)==0. 
Good to know the corner cases.
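
To make the corner case concrete: with an empty vector, 1:length(x) is c(1, 0), so the lapply() version returns two spurious NA elements instead of an empty list. A small sketch; the function names prefixes_naive and prefixes_safe are made up for illustration.

```r
# Version from the thread: iterates over 1:length(x).
prefixes_naive <- function(x) lapply(1:length(x), function(n) x[1:n])

# seq_along(x) is empty for an empty vector, so no iteration happens.
prefixes_safe <- function(x) lapply(seq_along(x), function(n) x[seq_len(n)])

empty <- character(0)
length(prefixes_naive(empty))  # 2: the loop runs over c(1, 0)
length(prefixes_safe(empty))   # 0

str(prefixes_safe(c("A", "B", "C")))
```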

Gabor


-- 
Csardi Gabor <[EMAIL PROTECTED]>    MTA RMKI, ELTE TTK



Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Nothing to be sorry about. You suggested a viable solution untested ... my
job to figure it out ;0)

Joh

[EMAIL PROTECTED] wrote:

>>From: [EMAIL PROTECTED]
>>Date: 2007/12/18 Tue PM 02:50:52 CST
>>To: Johannes Graumann <[EMAIL PROTECTED]>
>>Cc: r-help@r-project.org
>>Subject: Re: [R] All anchored series from a vector?
> 
> i'm sorry. i tested it afterwards and of course
> it had some problems. below is the working version.
> 
> 
> myvector<-c("A","B","C","D","E")
> 
> result<- lapply(1:length(myvector), function(.length) {
> myvector[1:.length]
> })
> 
> 
> print(result)


Re: [R] Dual Core vs Quad Core

2007-12-18 Thread Luke Tierney
On Tue, 18 Dec 2007, Prof Brian Ripley wrote:

> On Tue, 18 Dec 2007, S Ellison wrote:
>
>> Hiding in the windows faq is the observation that "R's computation is
>> single-threaded, and so it cannot use more than one CPU". So multi-core
>> should make no difference other than allowing R to run with less
>> interruption from other tasks. That is often a significant advantage,
>> though.
>
> Yes, but that is Windows-specific.
>
> On most other platforms you can benefit from using a multi-threaded BLAS,
> such as ATLAS, ACML or Dr Goto's.  The speedup for linear algebra can be
> substantial (although sometimes it will slow things down).  Luke Tierney
> has an experimental package to make use of parallel threads for some basic
> R computations which may appear in R 2.7.0.

There are two experimental packages available in
http://www.stat.uiowa.edu/~luke/R/experimental: pnmath, based on
OpenMP, and pnmath0, based on basic pthreads. These packages provide
parallelized versions of many of the R vectorized math functions. The
README files in these packages give more details.  OpenMP is I think
the way we want to go in the longer term; there are a few configuration
issues that need sorting out and so in the interim a non OpenMP
version might be useful.

Best,

luke

>
> It should be possible to use a multi-threaded BLAS under Windows, but I
> know no one who has done it.  There is a viable pthreads implementation
> for Windows, and I've tested Luke's experimental package using it.
>
> Some compilers' runtimes will be able to use parallel threads for other
> tasks.  Since all the examples I am aware of are expensive commercial
> compilers, I suspect R will make limited use of them.  (In particular,
> base R does not use the Fortran 9x vector operations at which many of
> these features are targeted: we probably would if we routinely used such
> compilers.)
>
> I've had dual-CPU desktops for more than ten years.  Given how little
> speedup you are likely to get via parallel processing (only under ideal
> conditions do the optimized BLASes run >1.5x faster using two CPUs), the
> most effective way to make use of multiple CPUs has been to run multiple
> jobs: I typically run 3-4 at once to keep the CPUs fully used.
>
> One way to run multiple R processes to cooperate on a single task is to
> use a package such as snow to distribute the load.
>
>
>
> Andrew Perrin <[EMAIL PROTECTED]> 18/12/2007 01:13 >>>
>> On Mon, 17 Dec 2007, Kitty Lee wrote:
>>
>>> Dear R-users,
>>>
>>> I use R to run spatial stuff and it takes up a lot of ram. Runs can
>>> take hours or days. I am thinking of getting a new desktop. Can R take
>>> advantage of the dual-core system?
>>>
>>> I have a dual-core computer at work. But it seems that right now R is
>>> using only one processor.
>>>
>>> The new computers feature quad core with 3GB of RAM. Can R take
>>> advantage of the 4 chips? Or am I better off getting a dual core with
>>> faster processing speed per chip?
>>>
>>> Thanks! Any advice would be really appreciated!
>>>
>>> K.
>>
>> If I have my information right, R will use dual- or quad-cores if it's
>> doing two (or four) things at once. The second core will help a little
>> bit insofar as whatever else your machine is doing won't interfere with
>> the one core on which it's running, but generally things that take a
>> single thread will remain on a single core.
>>
>> As for RAM, if you're doing memory-bound work you should certainly be
>> using a 64-bit machine and OS so you can utilize the larger memory
>> space.
>
> They only have 3GB of RAM, which 32-bit OSes can address.  The benefits
> really come with more than that.
>
>
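
The suggestion quoted above of spreading one task over several R processes with snow can be sketched as follows. Note the substitution: this uses the parallel package, which is bundled with current versions of R and incorporates snow's cluster interface, rather than snow itself. The worker count and the toy function are assumptions.

```r
library(parallel)  # bundled with R since 2.14.0; includes snow's cluster API

# Start two worker R processes on the local machine.
cl <- makeCluster(2)

# Each element is handled by one of the workers.
res <- parLapply(cl, 1:4, function(i) i^2)

stopCluster(cl)
print(unlist(res))
```

For long-running spatial jobs the toy function would be replaced by one independent unit of the real computation.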

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



[R] [R-pkgs] Update of the np package (version 0.14-1)

2007-12-18 Thread Jeffrey S. Racine
Dear R users,

An updated version of the np package has recently been uploaded to CRAN
(version 0.14-1). 

The package is briefly described in a recent issue of Rnews (October,
2007, http://cran.r-project.org/doc/Rnews/Rnews_2007-2.pdf) for those
who might be interested.

A somewhat more detailed paper that describes the np package is
forthcoming in the Journal of Statistical Software
(http://www.jstatsoft.org) for those who might be interested.

A much more thorough treatment of the subject matter can be found in Li,
Q. and J. S. Racine (2007), Nonparametric Econometrics: Theory and
Practice, Princeton University Press, ISBN: 0691121613 (768 Pages) for
those who might be interested
(http://press.princeton.edu/titles/8355.html)

Information on the np package:

This package provides a variety of nonparametric (and semiparametric)
kernel methods that seamlessly handle a mix of continuous, unordered,
and ordered factor datatypes. We would like to gratefully acknowledge
support from  the Natural Sciences and Engineering Research Council of
Canada (NSERC:www.nserc.ca), the Social Sciences and Humanities Research
Council of Canada (SSHRC:www.sshrc.ca), and the Shared Hierarchical
Academic Research Computing Network (SHARCNET:www.sharcnet.ca).

Changes from version 0.13-1 to 0.14-1:

* now use optim rather than nlm for minimisation in single index and
  smooth coefficient models
* fixed bug in klein-spady objective function
* regression standard errors are now available in the case of no
  continuous variables
* summary should look prettier, print additional information
* tidied up lingering issues with out-of-sample data and conditional
  modes
* fixed error when plotting asymptotic errors with conditional densities
* fixed a bug in npplot with partially linear regressions and
  plot.behavior='data' or 'plot-data'
* maximum default number of multistarts is now set to 5
* least-squares cross-validation of conditional densities uses a new,
  faster algorithm
* new, faster algorithm for least-squares cross-validation for both
  local-constant and local linear regressions.
  The estimator has changed somewhat: both cross-validation and
  the estimator use a method of shrinking towards the local constant
  estimator rather than the standard ridge approach that shrinks
  towards zero
* optimised smooth coefficient code, added ridging
* fixed bug in uniform CDF kernel
* fixed bug where npindexbw would ignore bandwidth.compute = FALSE and
  compute bandwidths when supplied with a preexisting bw object
* now can handle estimation out of discrete support
* summary would misreport the values of discrete scale factors which
  were computed with bwscaling = TRUE

We are grateful to John Fox, Achim Zeileis, Roger Koenker, and numerous
users for their valuable feedback which resulted in an improved version
of the package.

-- Jeffrey Racine & Tristen Hayfield.

-- 
Professor J. S. Racine         Phone:  (905) 525 9140 x 23825
Department of Economics        FAX:    (905) 521-8232
McMaster University            e-mail: [EMAIL PROTECTED]
1280 Main St. W., Hamilton,    URL:    http://www.economics.mcmaster.ca/racine/
Ontario, Canada. L8S 4M4

`The generation of random numbers is too important to be left to chance'

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] How can I extract the AIC score from a mixed model object produced using lmer?

2007-12-18 Thread Peter H Singleton

I am running a series of candidate mixed models using lmer (package lme4)
and I'd like to be able to compile a list of the AIC scores for those
models so that I can quickly summarize and rank the models by AIC. When I
do logistic regression, I can easily generate this kind of list by creating
the model objects using glm, and doing:

> md <- c("md1.lr", "md2.lr", "md3.lr")
> aic <- c(md1.lr$aic, md2.lr$aic, md3.lr$aic)
> aic2 <- cbind(md, aic)

but when I try to extract the AIC score from the model object produced by
lmer I get:

> md1.lme$aic
NULL
Warning message:
In md1.lme$aic : $ operator not defined for this S4 class, returning NULL

So... How do I query the AIC value out of a mixed model object created by
lmer?

<<->><<->><<->><<->><<->><<->><<->>
Peter Singleton
USFS Pacific Northwest Research Station
1133 N. Western Ave.
Wenatchee WA 98801
Phone: (509)664-1732
Fax: (509)665-8362
E-mail: [EMAIL PROTECTED]



[R] Random forests

2007-12-18 Thread Naiara Pinto
Dear all,

I would like to use a tree regression method to analyze my dataset. I
am interested in the fact that random forests creates in-bag and
out-of-bag datasets, but I also need an estimate of support for each
split. That seems hard to do in random forests since each tree is
grown using a subset of the predictor variables.

I was thinking of setting mtry = number of predictor variables,
growing several trees, and computing the support for each node as the
number of times that a certain predictor variable was chosen for that
node. Can this be implemented using random forests?

Thanks!

Naiara.

-- 
Naiara Pinto
PhD Candidate
Ecology, Evolution and Behavior
University of Texas Austin



Re: [R] How can I extract the AIC score from a mixed model object produced using lmer?

2007-12-18 Thread David Barron
You can calculate the AIC as follows:

(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
aic1 <- AIC(logLik(fm1))
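
To build the ranked list of AIC scores asked about, the same extraction can be applied over a named list of models. This is only a sketch: it uses lm() fits on the built-in mtcars data as stand-ins so that it runs without lme4, and the names models and aic_table are made up. With lmer fits, the list would hold objects such as fm1 instead.

```r
# Stand-in models so the sketch runs without lme4; with lmer fits the
# list would hold objects such as fm1 instead.
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ wt + hp + qsec, data = mtcars)
)

# Same extraction as in the reply: AIC() applied to the log-likelihood.
aic <- sapply(models, function(m) AIC(logLik(m)))

aic_table <- data.frame(model = names(aic), aic = unname(aic))
aic_table <- aic_table[order(aic_table$aic), ]  # best (lowest) first
print(aic_table)
```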

Hope this helps.
Dave



-- 
David Barron
Said Business School          Jesus College
Park End Street               Oxford
Oxford OX1 1HP                OX1 3DW
01865 288906                  01865 279684



Re: [R] calculating the number of days from dates

2007-12-18 Thread Rolf Turner

On 19/12/2007, at 6:46 AM, bogdan romocea wrote:

>> Sorry for using library instead package, but
>> library() is one command for using packages.
>
> ... which is why all efforts to make folks say "package" instead of >>
> "library" << are doomed to fail, IMHO.

Yes, but it gives so much pleasure to those who appreciate
the distinction to rail at those who don't! :-)

cheers,

Rolf Turner



Re: [R] plotting magnitude

2007-12-18 Thread hadley wickham
On Dec 18, 2007 2:06 PM,  <[EMAIL PROTECTED]> wrote:
> I am plotting fishing vessel positions and want these points to be
> relative in size to the catch at that point.  Is this possible? I am just
> begining to use R and my search of the help section didnt help in this
> area.  Heres what Im using so far
>
> xyplot(data$latdeg~data$londeg |vessek , groups=vessek,
> xlim=rev(range(69:77)),ylim=(range(35:42)), data=data,
> main=list ("Mackerel catches", cex=1.0),
> ylab="latitude", notch=T, varwidth=T,
> xlab="longitude", cex.axis=0.5,)
> any info would be appreciated

This is pretty easy to do with the ggplot2 package:

library(ggplot2)
qplot(londeg, latdeg, data = data, facets = . ~ vessek, size = catch)

or maybe
qplot(londeg, latdeg, data = data, facets = . ~ vessek, size = catch) +
  scale_area()

if you want the area of the points proportional to the catch, rather
than their radius

Hadley

-- 
http://had.co.nz/



[R] Clustering Question (Support Vector Clustering)

2007-12-18 Thread Ionut Florescu
I am currently designing a clustering algorithm in collaboration with one of
my colleagues. For comparison purposes we would like to contrast it with the
Support Vector Clustering algorithm of (A. Ben-Hur, D. Horn, H.T.
Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine
Learning Research, 2:125-137, 2001). 

 

This is supposedly the most powerful unsupervised clustering algorithm
available. Unfortunately, we cannot find code for it anywhere. We have tried
emailing the authors and they cannot find the code either (one of them has
not answered yet). I was wondering whether someone in the R community has
implemented it or knows of an implementation. We would rather not
implement it ourselves because that would probably delay our paper
considerably.

One more thing: I am asking about support vector clustering, not
classification; algorithms for the latter we could find in abundance.

 

Thank you for any help.

Ionut Florescu

 




[R] Multiple plots with single box

2007-12-18 Thread Giovanni Petris

Hello,

I am trying to display some harmonic functions in a plot. The kind of
display I have in mind is like the one that can be obtained by a call
to plot.ts with plot.type = "multiple". The only difference is that I
want a single box containing all the plots instead of one box per
plot. I thought box(which = "outer") would have done the job, but it
didn't. 

Below is the code I have used so far. (R 2.5.1, I know, I know...)

Any help is greatly appreciated. 

Thank you in advance,
Giovanni

=
### Plot harmonic functions
n <- 6 # even
omega <- 2 * pi / n

par(mfrow = c(n - 1, 1), mar = c(0, 5.1, 0, 5.1), oma = c(3, 1, 2, 1))
for (i in 1:(n/2 - 1)) {
    curve(cos(x * i * omega), 0, n, ylim = c(-1.1, 1.1), ylab = "",
          axes = FALSE)
    points(1:n, cos(i * omega * 1:n))
    axis(2); abline(h = 0, col = "lightgrey")
    curve(sin(x * i * omega), 0, n, ylim = c(-1.1, 1.1), ylab = "",
          axes = FALSE)
    points(1:n, sin(i * omega * 1:n))
    axis(4); abline(h = 0, col = "lightgrey")
}
curve(cos(x * (n/2) * omega), 0, n, ylim = c(-1.1, 1.1), ylab = "",
      axes = FALSE)
points(1:n, rep(c(-1,1), n/2))
axis(1); axis(2); abline(h = 0, col = "lightgrey")

-- 

Giovanni Petris  <[EMAIL PROTECTED]>
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] Factor Madness

2007-12-18 Thread Tony Plate
 From ?cbind:

Data frame methods
The cbind data frame method is just a wrapper for data.frame(..., 
check.names = FALSE). This means that it will split matrix columns in data 
frame arguments, and convert character columns to factors unless 
stringsAsFactors = TRUE is passed.

(I'm guessing 'spectrum' is a data.frame before the code fragment you've shown)

hope this helps,

Tony Plate

Johannes Graumann wrote:
> Why is class(spectrum[["Ion"]]) after this "factor"?
> 
> spectrum <- cbind(spectrum,Ion=rep("",
> nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))
> 
> slowly going crazy ...
> 
> Joh



[R] Factor Madness

2007-12-18 Thread Johannes Graumann
Why is class(spectrum[["Ion"]]) after this "factor"?

spectrum <- cbind(spectrum,Ion=rep("",
nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))

slowly going crazy ...

Joh



Re: [R] "gam()" in "gam" package

2007-12-18 Thread Kunio takezawa
R-users
E-mail: r-help@r-project.org

>Please don't ask the same question multiple times!

   I am really sorry about that. I thought that my first mail had not
gone through.

>And no, backfitting and QR are unrelated concepts. You need to read up
>on the theory,

   To derive an additive model, we have two methods: (1) backfitting,
(2) solving a system of linear equations using QR decomposition.

>If i read the code correctly, lm.wfit is called iteratively in gam.fit,
>via the line
>   fit <- eval(bf.call)
>The iteration is necessary to the backfitting algorithm.

   This iteration seems to be for "iteratively reweighted least squares", not
for backfitting. And lm.wfit may solve a system of linear equations using
QR decomposition, but I am not sure.
-- 
*[EMAIL PROTECTED]*
http://cse.naro.affrc.go.jp/takezawa/intro.html



Re: [R] leaps

2007-12-18 Thread Maura E Monville
Thank you very much for the example. I think I could get
somewhere interactively.
But my obstacle is writing an R script that processes my data set
automatically.
My difficulty is extracting, from a script, the information that appears on
the screen when R is used interactively.

Let me go over some steps to make sure I am doing things right.
Assume my data have been read into the matrix xx. Such a matrix contains a
number of curve samples. I call each curve a cycle.
I am processing one cycle at a time and using regression analysis to find
the frequencies in the single cycle. Since I have many, I have to come up
with an automatic way to do that. The main phases are:

1) run a regression analysis including all possible frequencies
2) use "step" to cut off the non significant frequencies
3) input remaining frequencies from phase (2) to "regsubsets"

In the following I have cut and pasted some code and the information I get
from phase (2) and phase (3). To start with, I would like to make sure that I
have read the output from (3) correctly. The output of (3) tells me that the
highest adjusted R^2 value was reached after 8 iterations, so there are only
8 significant predictors in this model? In addition, the only significant
frequencies (predictors) left are:
 cos1, cos2, cos4, cos7, sin1, sin2, sin3, sin5
I got this information interactively, but I am having trouble extracting it
automatically.
Any suggestions?

Question: Do I have to run "step" before "regsubsets" for a
first-pass model pruning, or may I run "regsubsets" on the original model
containing all possible frequencies (88 in all, as sums of sinusoids)?

Thank you so much.

Kind regards,
Maura



T <- xx$timestamp[end] - xx$timestamp[start]
nsamples <- end + 1 - start
nfr <- ceiling(nsamples/2)
yy <- xx[start:end, "amplitude"]
tt <- xx[start:end, "timestamp"]
cosmat <- matrix(nrow = nsamples, ncol = nfr)
coscol <- NULL
sinmat <- matrix(nrow = nsamples, ncol = nfr)
sincol <- NULL
for (i in 1:nfr) {
    cosmat[, i] <- cos(tt*2*pi*i/T)
    coscol <- c(coscol, paste("cos", i, sep = ""))
    sinmat[, i] <- sin(tt*2*pi*i/T)
    sincol <- c(sincol, paste("sin", i, sep = ""))
}
colnames(cosmat) <- coscol
colnames(sinmat) <- sincol
xnam1 <- paste(sep = "", "cosmat[,", 1:nfr, "]", collapse = "+")
xnam2 <- paste(sep = "", "sinmat[,", 1:nfr, "]", collapse = "+")
fmla <- as.formula(paste("yy ~ ", paste(xnam1, "+", xnam2, sep = "")))
FTmod <- lm(fmla)
stepmod <- step(FTmod, direction = "both")
summary(stepmod)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.237808   0.008728  27.248  < 2e-16 ***
cosmat[, 1]  -1.011932   0.012390 -81.675  < 2e-16 ***
cosmat[, 2]  -0.417265   0.012329 -33.844  < 2e-16 ***
cosmat[, 3]   0.020143   0.012318   1.635 0.106604
cosmat[, 4]   0.081425   0.012397   6.568 8.43e-09 ***
cosmat[, 6]  -0.042804   0.012388  -3.455 0.000951 ***
cosmat[, 7]  -0.044927   0.012340  -3.641 0.000526 ***
cosmat[, 9]   0.039322   0.012411   3.168 0.002298 **
cosmat[, 11] -0.020710   0.012375  -1.673 0.098831 .
cosmat[, 14]  0.016393   0.012390   1.323 0.190245
cosmat[, 17] -0.016482   0.012374  -1.332 0.187319
cosmat[, 19]  0.016537   0.012387   1.335 0.186306
sinmat[, 1]  -0.460705   0.012296 -37.469  < 2e-16 ***
sinmat[, 2]  -0.289316   0.012356 -23.416  < 2e-16 ***
sinmat[, 3]  -0.049610   0.012367  -4.011 0.000153 ***
sinmat[, 4]   0.023937   0.012289   1.948 0.01 .
sinmat[, 5]   0.045542   0.012406   3.671 0.000476 ***
sinmat[, 6]  -0.017782   0.012298  -1.446 0.152790
sinmat[, 12] -0.016093   0.012325  -1.306 0.196038
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08135 on 68 degrees of freedom
Multiple R-Squared: 0.9932, Adjusted R-squared: 0.9914
F-statistic: 549.8 on 18 and 68 DF,  p-value: < 2.2e-16
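
The terms that survive "step" do not have to be read off the printed summary; they can be pulled out of the fitted object. The following is a minimal sketch on simulated data, not the data from this thread: the data frame `dat`, its column names and the 0.05 cut-off are assumptions made for illustration.

```r
# Simulated stand-in for one cycle (hypothetical data)
set.seed(1)
tt <- seq(0, 1, length.out = 60)
dat <- data.frame(
  cos1 = cos(2 * pi * 1 * tt), sin1 = sin(2 * pi * 1 * tt),
  cos2 = cos(2 * pi * 2 * tt), sin2 = sin(2 * pi * 2 * tt)
)
dat$yy <- 0.2 - 1.0 * dat$cos1 - 0.4 * dat$cos2 + rnorm(60, sd = 0.1)

# Full model on named columns, then stepwise pruning without screen output
full    <- lm(yy ~ ., data = dat)
stepmod <- step(full, direction = "both", trace = 0)

# Terms kept by step(), as a character vector
kept <- attr(terms(stepmod), "term.labels")

# Terms whose p-value is below the (assumed) 0.05 threshold
coefs <- coef(summary(stepmod))
signif_terms <- setdiff(rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05],
                        "(Intercept)")

print(kept)
print(signif_terms)
```

Putting the sinusoids into named data frame columns, rather than referring to `cosmat[, i]` inside the formula, also yields term labels such as `cos1` that are easier to reuse in a script.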

newmat <- cosmat[,-c(5,8,10,12,13,15,16,18,20:44)]
newmat <- cbind(newmat,sinmat[,-c(7:11,13:44)])
Regmod <- regsubsets(newmat, yy)
rs <- summary(Regmod)
which.max(rs$adjr2)

[1] 8

rs$which[which.max(rs$adjr2), ]

(Intercept)        cos1        cos2        cos3        cos4        cos6
       TRUE        TRUE        TRUE       FALSE        TRUE       FALSE
       cos7        cos9       cos11       cos14       cos17       cos19
       TRUE       FALSE       FALSE       FALSE       FALSE       FALSE
       sin1        sin2        sin3        sin4        sin5        sin6
       TRUE        TRUE        TRUE       FALSE        TRUE       FALSE
      sin12
      FALSE
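
The logical row printed above can be turned into a vector of predictor names inside a script. The sketch below uses a small hand-made list `rs` as a stand-in for the object returned by summary() on a regsubsets fit; its values are invented, and only the component names `which` and `adjr2` follow the leaps summary object.

```r
# Hypothetical stand-in for summary(regsubsets(...)); toy values only
rs <- list(
  which = rbind(
    "1" = c("(Intercept)" = TRUE, cos1 = TRUE, cos2 = FALSE, sin1 = FALSE),
    "2" = c("(Intercept)" = TRUE, cos1 = TRUE, cos2 = FALSE, sin1 = TRUE),
    "3" = c("(Intercept)" = TRUE, cos1 = TRUE, cos2 = TRUE,  sin1 = TRUE)
  ),
  adjr2 = c(0.80, 0.95, 0.94)
)

# Row of the model with the highest adjusted R^2
best <- which.max(rs$adjr2)

# Column names flagged TRUE in that row, without the intercept
selected <- colnames(rs$which)[rs$which[best, ]]
selected <- setdiff(selected, "(Intercept)")

print(best)
print(selected)
```

One thing that may explain the count of exactly 8: regsubsets() only examines subsets of up to `nvmax` variables, and `nvmax` defaults to 8, so the maximum of the adjusted R^2 can land on the largest size examined rather than on a true optimum. Raising `nvmax` is worth trying before concluding that only 8 predictors matter.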



Re: [R] Factor Madness

2007-12-18 Thread Tony Plate
Whoops, it looks like there's a typo in ?cbind (R version 2.6.0 Patched 
(2007-10-11 r43143)), and I blindly copied it into my message.

That should read (emphasis added):

"and convert character columns to factors unless stringsAsFactors = 
***FALSE***"

Here's an example:

 > x <- data.frame(X=1:3)
 > sapply(cbind(x, letters[1:3]), class)
           X letters[1:3]
   "integer"     "factor"
 > sapply(cbind(x, letters[1:3], stringsAsFactors=FALSE), class)
           X letters[1:3]
   "integer"  "character"

Thanks to Mark Leeds for pointing that out to me in a private message!

(I see this still in the source at 
https://svn.r-project.org/R/trunk/src/library/base/man/cbind.Rd -- is that 
the right place to look for the latest source to make sure it hasn't been 
fixed already?)
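
Applied to the call from the original question, the same idea might look like the sketch below. The `spectrum` data frame here is a made-up stand-in (the real one was not posted); the column names `Ion` and `Deviation.AMU` are taken from the question.

```r
# Hypothetical stand-in for the poster's data frame
spectrum <- data.frame(mz = c(100.1, 200.2, 300.3))

# Option 1: pass stringsAsFactors = FALSE through cbind() to data.frame()
a <- cbind(spectrum,
           Ion = rep("", nrow(spectrum)),
           Deviation.AMU = rep(0.0, nrow(spectrum)),
           stringsAsFactors = FALSE)

# Option 2: assign the new columns directly; no conversion takes place
b <- spectrum
b$Ion <- rep("", nrow(b))
b$Deviation.AMU <- rep(0.0, nrow(b))

print(class(a[["Ion"]]))
print(class(b[["Ion"]]))
```

Both routes should leave `Ion` as a character column. Since R 4.0.0 the default of stringsAsFactors is FALSE, so on current versions the explicit argument is redundant but harmless.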

-- Tony Plate


Tony Plate wrote:
>  From ?cbind:
> 
> Data frame methods
> The cbind data frame method is just a wrapper for data.frame(..., 
> check.names = FALSE). This means that it will split matrix columns in data 
> frame arguments, and convert character columns to factors unless 
> stringsAsFactors = TRUE is passed.
> 
> (I'm guessing 'spectrum' is a data.frame before the code fragment you've 
> shown)
> 
> hope this helps,
> 
> Tony Plate
> 
> Johannes Graumann wrote:
>> Why is class(spectrum[["Ion"]]) after this "factor"?
>>
>> spectrum <- cbind(spectrum,Ion=rep("",
>> nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))
>>
>> slowly going crazy ...
>>
>> Joh



[R] strange timings in convolve(x,y,type="open")

2007-12-18 Thread Art Owen
Dear R-ophiles,

I've found something very odd when I apply convolve
to ever larger vectors.  Here is an example below
with vectors ranging from 2^11 to 2^17.   There is
a funny bump up at 2^12.  Then it gets very slow at 2^16.


 >  for( i in 11:20 )print( system.time(convolve(1:2^i,1:2^i,type="o")))
   user  system elapsed
  0.002   0.000   0.002
   user  system elapsed
  0.373   0.002   0.375
   user  system elapsed
  0.014   0.001   0.016
   user  system elapsed
  0.031   0.002   0.034
   user  system elapsed
  0.126   0.004   0.130
   user  system elapsed
194.095   0.013 194.185
   user  system elapsed
  0.345   0.011   0.356

This example was run on a 64-bit Fedora machine.  I hit the same
wall at 2^16 on a MacBook (Intel processor, I think).  The Fedora machine
is running R 2.5.0.  That's a bit old (April 07), but I saw no mention of
this speed problem in some web searching, and it's not mentioned in the
2.6 what's new notes.

I've rerun it and found the same bump at 12 and wall at 16.
The timing at 2^16  can change appreciably.  In one
other case it was about 270 user, 271 elapsed.
The 2^18 case ran for hours without ever finishing.

At first I thought that this was a memory latency issue.  Maybe it
is.  But that makes it hard to explain why 2^17 works better than
2^16.  I've seen that three times now, so I'm almost ready to call it
reproducible.
Also, one of the machines I'm using has lots of memory.  Maybe it's
a cache issue ... but that still does not explain why 2^12 is slower
than 2^13 or 2^16 is slower than 2^17.

I've checked by running convolve without type="o" and I don't
see the wall.  Similarly fft does not have that problem.

Here's an example without type="open"
 > for( k in 11:20)print(system.time( convolve( 1:2^k,1:2^k)))
   user  system elapsed
  0.001   0.000   0.000
   user  system elapsed
  0.001   0.000   0.001
   user  system elapsed
  0.002   0.000   0.002
   user  system elapsed
  0.004   0.000   0.004
   user  system elapsed
  0.009   0.001   0.010
   user  system elapsed
  0.017   0.001   0.018
   user  system elapsed
  0.138   0.005   0.143
   user  system elapsed
  0.368   0.012   0.389
   user  system elapsed
  1.010   0.032   1.051
   user  system elapsed
  1.945   0.069   2.015

This is more what I expected: something like N or N log(N), with
the difference hard to discern amid granularity and noise.

The convolve function is not very big (see below).  When type is
not specified, it defaults to "circular".  So my guess is that something
mysterious might be happening inside the first else clause below,
at least on some architectures.

-Art Owen



 > convolve
function (x, y, conj = TRUE, type = c("circular", "open", "filter"))
{
type <- match.arg(type)
n <- length(x)
ny <- length(y)
Real <- is.numeric(x) && is.numeric(y)
if (type == "circular") {
if (ny != n)
stop("length mismatch in convolution")
}
else {
n1 <- ny - 1
x <- c(rep.int(0, n1), x)
n <- length(y <- c(y, rep.int(0, n - 1)))
}
x <- fft(fft(x) * (if (conj)
Conj(fft(y))
else fft(y)), inv = TRUE)
if (type == "filter")
(if (Real)
Re(x)
else x)[-c(1:n1, (n - n1 + 1):n)]/n
else (if (Real)
Re(x)
else x)/n
}
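
One possible explanation, offered as a hypothesis rather than something established in this thread: in the else branch above, both vectors are zero-padded to length n + ny - 1 before fft() is called, and R's fft() is documented to be fastest when the length is highly composite. The sketch below factors the padded lengths for the sizes tried above; the helper names `padded_length` and `largest_prime_factor` are made up for this illustration.

```r
# Length handed to fft() by convolve(x, y, type = "open"), read off the
# function body quoted above: x and y are both padded to n + ny - 1.
padded_length <- function(n, ny = n) n + ny - 1

# Largest prime factor by trial division (adequate for numbers this small)
largest_prime_factor <- function(m) {
  largest <- 1
  d <- 2
  while (d * d <= m) {
    if (m %% d == 0) {
      largest <- d
      m <- m %/% d
    } else {
      d <- d + 1
    }
  }
  if (m > 1) largest <- m  # whatever remains is prime and the largest factor
  largest
}

i   <- 11:18
len <- padded_length(2^i)
res <- data.frame(i = i,
                  padded_length = len,
                  largest_prime_factor = sapply(len, largest_prime_factor))
print(res)
```

For i = 12, 16 and 18 the padded lengths are 8191, 131071 and 524287, which are all prime, and these are exactly the sizes reported as slow. If this is indeed the cause, padding both inputs by hand to a highly composite length (for example one returned by nextn()) before transforming might avoid the slow cases, though that was not tested here.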



Re: [R] "gam()" in "gam" package

2007-12-18 Thread Kunio takezawa
R-users
E-mail: r-help@r-project.org

> This iteration seems to be for "iteratively reweighted least squares" not
>for backfitting. And lm.wfit may solve multiple linear equation using
>QR decomposition; but I am not sure.

   Let me explain the reasoning behind my guess above.
The iteration below runs from 1 to maxit.

  for (iter in 1:maxit) {
--
fit <- eval(bf.call)
--
  }

The help of "gam.control()" says:
   maxit: maximum number of local scoring iterations
   bf.maxit: maximum number of backfitting iterations

Therefore the iteration above is the local scoring iteration, not the
backfitting iteration.

   The first line of "lm.wfit()" is:
function (x, y, w, offset = NULL, method = "qr", tol = 1e-07,
 singular.ok = TRUE, ...)
There is no "bf.maxit" or its equivalent among these arguments. And they
include 'method = "qr"'; this may mean that this routine solves
a system of linear equations using QR decomposition.
-- 
*[EMAIL PROTECTED]*
http://cse.naro.affrc.go.jp/takezawa/intro.html
