Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Jim Lemon
Wayne Aldo Gavioli wrote:
 
 Hello all,
 
 
 I'm trying to graph a scatterplot of a large (5,000 x,y coordinates) of data
 with the caveat that many of the data points overlap with each other (share 
 the
 same x AND y coordinates).  In using the usual plot command,
 
 
 
plot(education, xlab=etc, ylab=etc)
 
 
 
 it seems that the overlap of points is not shown in the graph.  Namely, there
 are 5,000 points that should be plotted, as I mentioned above, but because so
 many of the points overlap with each other exactly, only about 50-60 points 
 are
 actually plotted on the graph.  Thus, there's no indication that Point A 
 shares
 its coordinates with 200 other pieces of data and thus is very common while
 Point B doesn't share its coordinates with any other pieces of data and thus
 isn't common at all.  Is there anyway to indicate the frequency of such points
 on such a graph?  Should I be using a different command than plot?
 
Hi Wayne,
While this is not a really pretty picture, you can get a viewable plot 
with count.overplot if the first two elements of education are named 
x and y and they are the coordinates you want to plot. Otherwise, 
pass the x and y coordinates separately.

library(plotrix)
count.overplot(education,
  tol=c(diff(range(education$x))/10,
  diff(range(education$y))/10))

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Jari Oksanen
Wayne Aldo Gavioli wgavioli at fas.harvard.edu writes:

 
 
 Hello all,
 
 I'm trying to graph a scatterplot of a large (5,000 x,y coordinates) of data
 with the caveat that many of the data points overlap with each other (share 
 the
 same x AND y coordinates).  In using the usual plot command,
 
  plot(education, xlab=etc, ylab=etc)
 
 it seems that the overlap of points is not shown in the graph.  Namely, there
 are 5,000 points that should be plotted, as I mentioned above, but because so
 many of the points overlap with each other exactly, only about 50-60 points 
 are
 actually plotted on the graph.  Thus, there's no indication that Point A 
 shares
 its coordinates with 200 other pieces of data and thus is very common while
 Point B doesn't share its coordinates with any other pieces of data and thus
 isn't common at all.  Is there anyway to indicate the frequency of such points
 on such a graph?  Should I be using a different command than plot?
 
 
One suggestion seems to be still missing: 'sunflowerplot' of base R. May look
taggy, though, if you have 200 petals. 

Actually the documentation of sunflowerplot is wrong in botanical sense.
Sunflowers have composite flowers in capitula, and the things called 'petals' in
documentation are ligulate, sterile ray-florets (each with vestigial petals
which are not easily visible in sunflower, but in some other species you may see
three (occasionally two) teeth). 

cheers, jari oksanen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bar plot colors

2007-12-18 Thread Jim Lemon
Winkel, David wrote:
 All,
 
  
 
 I have a question regarding colors in bar plots.  I want to stack a
 total of 18 cost values in each bar. Basically, it is six cost types and
 each cost type has three components- direct, indirect, and induced
 costs.  I would like to use both solid color bars and bars with the
 slanted lines (using the density parameter).  The colors would
 distinguish cost types and the lines would distinguish
 direct/indirect/induced.  I want the cost types (i.e. colors) to be
 stacked together for each cost type.  In other words, I don't want all
 of the solid bars at the bottom and all of the slanted lines at the top.
 
 
 So far, I have made a bar plot with all solid colors and then tried to
 overwrite that bar plot by calling barplot() again and putting the white
 slanted lines across the bars.  However, I can't get this method to work
 while still grouping the cost types together.
 
  
Hi David,
This is a real challenge:

heights-matrix(sample(10:70,54),ncol=3)
bar.colors-rep(rep(2:7,each=3),3)
bar.densities-rep(10,54)
bar.angles-matrix(rep(rep(c(45,90,135),6),3),ncol=3)
barplot(heights,col=bar.colors)
barplot(heights,angle=bar.angles,add=TRUE,density=bar.densities)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot-How to define fill colours?

2007-12-18 Thread Pedro de Barros
Dear R's (most likely Hadley),

I want to build a stacked bar plot where I would like to define which 
colours will be used for each of the groups. However, I do not seem 
to find a way to do this, even if I've been looking over many places.

I have tried several variations, and my final try was this code, but 
I still do not manage to get the colours as I pre-define. Any hints 
about how to get this?

Thanks in advance,
Pedro

my code:
 plotdata1-data.frame(x=rep(factor(1:4),4), y=rep(0.1*(1:4),4), 
+group=as.character(rep(c('white', 'red', 'blue', 'green'),rep(4,4
 plot0-ggplot()
 plot3-plot0+layer(data=plotdata1, mapping=aes_string(x='x',y='y', 
+fill='group'),geom='bar', stat='identity', position='stack')
 print(plot3)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] integration

2007-12-18 Thread [EMAIL PROTECTED]
Dear All,
I need to perform a numerical integration of one dimensional 
fucntions. The extrems of integration are both finite and the functions 
I'm working on are quite complicated. I have already tried both area() 
and integrate(), but they do not perform well: area() is very slow and 
integrate() does not converge. Are in R other functions for numerical 
integration of one dimentional functions?

Thanks in advance 

Davide






Tiscali.Fax: il tuo fax online in promo fino al 31 dicembre, 
paghi 15€ e ricarichi 20€ 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hazard ratio of interaction Cox model

2007-12-18 Thread Giulia Barbati
Dear Forum,
I have a question about interaction estimate in the Cox model: 
why the hazard ratio of the interaction is not produced in the summary of the 
model?
(Instead, the estimate of the coefficient is given in the print of the model.) 


# Example:

modINT -cph( Surv(T_BASE, T_FIN,STATUS)~ NYHA + ASINI + RFP + FE_REC + 
XX_PR*XX_DISF)

print(modINT)


  coef se(coef) zp
NYHA=2   1.2540.584  2.15 0.031767
ASINI   0.6650.409  1.62 0.104247
RFP=20.7250.704  1.03 0.302578
FE_REC=2-1.6370.810 -2.02 0.043331
XX_PR2.1890.649  3.37 0.000748
XX_DISF  3.2331.000  3.23 0.001222
XX_PR * XX_DISF -2.8521.280 -2.23 0.025852

summary(modINT)

 Effects  Response : Surv(T_BASE, T_FIN, STATUS) 

 Factor LowHigh Diff.  Effect S.E. Lower 0.95 Upper 0.95
 ASINI 2.0725 2.85 0.7775  0.52  0.32 -0.111.14
  Hazard Ratio  2.0725 2.85 0.7775  1.68NA  0.903.13
 XX_PR  0. 1.00 1.  2.19  0.65  0.923.46
  Hazard Ratio  0. 1.00 1.  8.92NA  2.50   31.86
 XX_DISF0. 1.00 1.  3.23  1.00  1.275.19
  Hazard Ratio  0. 1.00 1. 25.35NA  3.57  179.88
 NYHA - 2:1 1. 2.00 NA  1.25  0.58  0.112.40
  Hazard Ratio  1. 2.00 NA  3.50NA  1.12   11.00
 RFP - 2:1  1. 2.00 NA  0.73  0.70 -0.652.10
  Hazard Ratio  1. 2.00 NA  2.07NA  0.528.20
 FE_REC - 2:1   1. 2.00 NA -1.64  0.81 -3.23   -0.05
  Hazard Ratio  1. 2.00 NA  0.19NA  0.040.95

Adjusted to: XX_PR=0 XX_DISF=0  




  

Be a better friend, newshound, and 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-users

2007-12-18 Thread Peter Dalgaard
Kunio takezawa wrote:
 R-users
 E-mail: r-help@r-project.org

I have a quenstion on gam() in gam package.
 The help of gam() says:
   'gam' uses the _backfitting
  algorithm_ to combine different smoothing or fitting methods.

 On the other hand,  lm.wfit(), which is a routine of gam.fit() contains:

 z - .Fortran(dqrls, qr = x * wts, n = n, p = p, y = y *
 wts, ny = ny, tol = as.double(tol), coefficients = mat.or.vec(p,
 ny), residuals = y, effects = mat.or.vec(n, ny), rank = integer(1),
 pivot = 1:p, qraux = double(p), work = double(2 * p),
 PACKAGE = base)
 It may indicate that QR decomposition is used to derive an additive model
 instead of backfitting.
I am wondering if my guess is correct, or this the _backfitting
 algorithm
 has another meaning.
   
Please don't ask the same question multiple times!

And no, backfitting and QR are unrelated concepts. You need to read up
on the theory, there are two fundamental books: Hastie  Tibshirani (gam
package) and Simon Wood (mgcv package). Both are a bit much to ask to
have summarized in email.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] axis names in triangle.plot

2007-12-18 Thread Thomas Hoffmann
Hi folks,

I am using a data.frame with sediment grain sizes:

  grain
 sand  silt  clay
OAT 10.03 56.77 18.25
OAT 10.40 57.40 17.94
WG1 50.03 20.68 12.57
WG1 43.20 25.69 13.41
WG1 33.89 31.10 14.48
WG2  2.84 62.81 20.79
WG2  2.79 60.46 19.16
WG2 16.27 33.04  6.48
WG2  1.39 57.90  9.13
WG3  4.54 52.91 17.20
WG3  5.20 50.55 15.65
WG3  7.71 49.13 10.80
WG3  4.43 50.03 11.83
WG3  1.72 57.53 14.20
WG3  1.51 58.99 13.96

I would like to do a trinagle.plot with labeled axis-names sand silt 
and clay. However using the command:

tringle.plot(grain)

from ade4-apckage) does not plot the axis names and there is no paramter 
to set the labels like xlab in the plot command. Does anybody has a 
good advice?

Thanks in advance
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table() and precision?

2007-12-18 Thread Prof Brian Ripley
On Mon, 17 Dec 2007, Moshe Olshansky wrote:

 Dear List,

 Following the below question I have a question of my
 own:
 Suppose that I have large matrices which are produced
 sequentially and must be used sequentially in the
 reverse order. I do not have enough memory to store
 them and so I would like to write them to disk and
 then read them. This raises two questions:
 1) what is the fastest (and the most economic
 space-wise) way to do this?

Using save/load is the simplest.  Don't worry about finding better 
solutions until you know those are not good enough.  (serialize / 
unserialize is another interface to the same underlying idea.)

 2) functions like write, write.table, etc. write the
 data the way it is printed and this may result in a
 loss of accuracy. Is there any way to prevent this,
 except for setting the digits option to a higher
 value or using format prior to writing the data?

Do please read the help before making false claims. ?write.table says

  Real and complex numbers are written to the maximal possible
  precision.

OTOH, ?write says it is a wrapper for cat, whose help says

  'cat' converts numeric/complex elements in the same way as 'print'
  (and not in the same way as 'as.character' which is used by the S
  equivalent), so 'options' 'digits' and 'scipen' are relevant.
  However, it uses the minimum field width necessary for each
  element, rather than the same field width for all elements.

so this hints as.character() might be a useful preprocessor.

 Is it possible to write binary files (similar to Fortran)?

See ?writeBin.  save/load by default write binary files, but use of 
writeBin can be faster (and less flexible).

 Any suggestion will be greatly appreciated.

Somehow you have missed a great deal of information about R I/O.
Try help.start() and reading the sections the search engine shows you 
that look relevant.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Res: Scatterplot Showing All Points

2007-12-18 Thread S Ellison
?jitter is simpler:

x-rep(1:10,10)
y-x
plot(x,y)   #100 points, only 10 show
plot(jitter(x),jitter(y))   #overlap removed.



 Milton Cezar Ribeiro [EMAIL PROTECTED] 18/12/2007 04:36

Hi Wayne,

I have two suggestion to you.
1. You add some random noise on both x and y data or
2. You graph bubble points, where the size is proportional to the
frequence of the xy combination.

x-sample(1:10,1,replace=T)
y-sample(1:10,1,replace=T)
xy-cbind(x,y)
x11(1400,800)
par(mfrow=c(1,3))
plot(xy)
xy.random-xy+rnorm(2,0.1,0.1) 
plot(xy.random,cex=0.1)
xy.tab-data.frame(table(x,y))
xy.tab$x-as.numeric(as.character(xy.tab$x))
xy.tab$y-as.numeric(as.character(xy.tab$y))
min.freq-min(xy.tab$Freq)
max.freq-max(xy.tab$Freq)
plot(xy.tab$x,xy.tab$y,cex=(xy.tab$Freq-min.freq)/(max.freq-min.freq)*5)

Kind regards,

Miltinho
Brazil


- Mensagem original 
De: Wayne Aldo Gavioli [EMAIL PROTECTED]
Para: r-help@r-project.org
Enviadas: Segunda-feira, 17 de Dezembro de 2007 22:14:23
Assunto: [R] Scatterplot Showing All Points



Hello all,


I'm trying to graph a scatterplot of a large (5,000 x,y coordinates) of
data
with the caveat that many of the data points overlap with each other
(share the
same x AND y coordinates).  In using the usual plot command,


 plot(education, xlab=etc, ylab=etc)


it seems that the overlap of points is not shown in the graph.  Namely,
there
are 5,000 points that should be plotted, as I mentioned above, but
because so
many of the points overlap with each other exactly, only about 50-60
points are
actually plotted on the graph.  Thus, there's no indication that Point
A shares
its coordinates with 200 other pieces of data and thus is very common
while
Point B doesn't share its coordinates with any other pieces of data and
thus
isn't common at all.  Is there anyway to indicate the frequency of such
points
on such a graph?  Should I be using a different command than plot?


Thanks,


Wayne

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.



 para armazenamento!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] axis names in triangle.plot

2007-12-18 Thread Stéphane Dray
Hi Thomas,
This looks quite strange. By default, the function use the column names 
as labels.
What is grain ? A data.frame ? Why have you duplicated row.names? Please 
return the results of class(grain) and names(grain)

Cheers,
PS: If you have questions related to ade4, you can use the adelist 
(http://listes.univ-lyon1.fr/wws/info/adelist)


Thomas Hoffmann wrote:
 Hi folks,

 I am using a data.frame with sediment grain sizes:

   grain
  sand  silt  clay
 OAT 10.03 56.77 18.25
 OAT 10.40 57.40 17.94
 WG1 50.03 20.68 12.57
 WG1 43.20 25.69 13.41
 WG1 33.89 31.10 14.48
 WG2  2.84 62.81 20.79
 WG2  2.79 60.46 19.16
 WG2 16.27 33.04  6.48
 WG2  1.39 57.90  9.13
 WG3  4.54 52.91 17.20
 WG3  5.20 50.55 15.65
 WG3  7.71 49.13 10.80
 WG3  4.43 50.03 11.83
 WG3  1.72 57.53 14.20
 WG3  1.51 58.99 13.96

 I would like to do a trinagle.plot with labeled axis-names sand silt 
 and clay. However using the command:

 tringle.plot(grain)

 from ade4-apckage) does not plot the axis names and there is no paramter 
 to set the labels like xlab in the plot command. Does anybody has a 
 good advice?

 Thanks in advance
 Thomas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


   


-- 
Stéphane DRAY ([EMAIL PROTECTED] )
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
Tel: 33 4 72 43 27 57   Fax: 33 4 72 43 13 88
http://biomserv.univ-lyon1.fr/~dray/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dual Core vs Quad Core

2007-12-18 Thread S Ellison
Hiding in the windows faq is the observation that R's computation is
single-threaded, and so it cannot use more than one CPU. So multi-core
should make no difference other than allowing R to run with less
interruption from other tasks. That is often a significant advantage,
though.




 Andrew Perrin [EMAIL PROTECTED] 18/12/2007 01:13 
On Mon, 17 Dec 2007, Kitty Lee wrote:

 Dear R-users,

 I use R to run spatial stuff and it takes up a lot of ram. Runs can
take hours or days. I am thinking of getting a new desktop. Can R take
advantage of the dual-core system?

 I have a dual-core computer at work. But it seems that right now R is
using only one processor.

 The new computers feature quad core with 3GB of RAM. Can R take
advantage of the 4 chips? Or am I better off getting a dual core with
faster processing speed per chip?

 Thanks! Any advice would be really appreciated!

 K.

If I have my information right, R will use dual- or quad-cores if it's

doing two (or four) things at once. The second core will help a little
bit 
insofar as whatever else your machine is doing won't interfere with the

one core on which it's running, but generally things that take a single

thread will remain on a single core.

As for RAM, if you're doing memory-bound work you should certainly be 
using a 64-bit machine and OS so you can utilize the larger memory
space.


--
Andrew J Perrin - andrew_perrin (at) unc.edu -
http://perrin.socsci.unc.edu 
Associate Professor of Sociology; Book Review Editor, _Social Forces_
University of North Carolina - CB#3210, Chapel Hill, NC 27599-3210 USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gene shaving method

2007-12-18 Thread Turner, Heather
Gene shaving is implemented in the GeneClust package for R which you can
download from
http://odin.mdacc.tmc.edu/~kim/geneclust/
For more details see The Analysis of Gene Expression Data: Methods and
Software edited by Giovanni Parmiagiani, Elizabeth S. Garett, Rafael A.
Irizarry and Scott L. Zeger (you can get a preview on Google Book Search
http://books.google.com/books?id=r9gROQvdelcCpg=PA352lpg=PA352dq=gene
clustsource=webots=3FO3jQlfrpsig=BeOgUK2cgfuv7d12vWsvpRXOkBU#PPA353,M
1)

Best wishes,

Heather

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Aimin Yan
Sent: 17 December 2007 19:51
To: r-help@r-project.org
Subject: [R] gene shaving method

Does anyone know if Hastie's gene shaving method is implemented in R

Thanks,

Aimin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] accessing dimension names

2007-12-18 Thread born . to . b . wyld
I have a matrix y:

 dimnames(y)
$x93
[1] 1 2

$x94
[1] 0 1 2
.. so on  (there are other dimensions as well)



I need to access a particular dimension, but a random mechanism tells me
which dimension it would. So, sometimes I might need to access
dimnames(y)$x93, some other time it would be dimnames(y)$x94.. and so on.
Now let that random dimension be idx, then dimnames(y)$paste('x',idx,sep='')
doesn't work.

Can anyone help?

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two repeated warnings when running gam(mgcv) to analyse my dataset?

2007-12-18 Thread Simon Wood
The model here is just a penalised GLM, and the warnings relate to the GLM 
fitting process. Fitted probabilities of 0 or 1 can be perfectly appropriate, 
but do indicate that the linear predictor is not really uniquely defined, and 
that some care may be needed in interpreting results (for example, if the 
fitted probabilities are zero or one, then a CI for the corresponding linear 
predictor will depend more on the prior assumptions about smoothness than 
anything else). This problem is not really GAM specific, it relates to any 
`logistic regression' model. 

Similarly, the GLM fitting IRLS iterations are not guaranteed to converge, and 
can fail, especially for overly flexible logistic regression models. Try 
this, for example

x - 1:10
y - c(0,0,0,0,0,1,1,1,1,1)
glm(y~x,family=binomial)

I get...
...
Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = 
etastart,  :
  algorithm did not converge
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = 
etastart,  :
  fitted probabilities numerically 0 or 1 occurred

...as models become more complex the scope for this sort of thing to happen 
increases, and some simplification may be appropriate. 

That said, mgcv::gam fitting with all smoothing parameters fixed, is slightly 
more likely to fail in this way than `glm' or `mgcv::gam' with some smoothing 
parameters  estimated, because of the steps taken to stabilise divergent fit 
iterations. When all smoothing parameters are fixed, mgcv uses older fitting 
routines that don't try as hard to stabilise a divergent fit as the newer 
fitting routines. This is a bit of an anomaly and I'll try and fix it for a 
future release. 

best,
Simon



On Monday 17 December 2007 11:53, zhijie zhang wrote:
 Dear Simon,
 Sorry for an incomplete listing of the question.
 #mgcv version is  1.3-29, R 2.6.1, windows XP
 #m.gam-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+
disbinary,family=binomial(logit),data=point) The above program's the core
 codes in my following loop programs.
  It works well if i run the above codes only one time for my dataset, but
 warnings will occur if i run many times for the following loop.

  while (j1001) {

 +  index=sample(ID, replace=F)
 +  m.data$x=coords[index,]$x
 +  m.data$y=coords[index,]$y
 +  # For each permutation, we run the GAM using the optimal span for the
 above model m.gam
 +  s.gam
 -gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbin
ary,,sp=c( 5.582647e-07,4.016504e-02,2.300424e-04,1.274065e+03,9.558236e-09,
 1.868827e-08),family=binomial(logit),data=m.data)
 +  permresults[,i]=predict.gam(s.gam)
 +  i=i+1
 +  if (j%%100==0) print(i)
 +  j=j+1
 +  }
 [1] 101
 [1] 201
 [1] 301
 [1] 401
 [1] 501
 [1] 601
 [1] 701
 [1] 801
 [1] 901
 [1] 1001
 warnings() over 50

  warnings()

 1: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...
 : fitted probabilities numerically 0 or 1 occurred
 ..
 14: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...

   Algorithm did not converge
 ..

 On Dec 17, 2007 4:54 PM, Simon Wood [EMAIL PROTECTED] wrote:
  What mgcv version are you running (and on what platform)?
 
  n Thursday 13 December 2007 17:46, zhijie zhang wrote:
   Dear all,
I run the GAMs (generalized additive models) in gam(mgcv) using the
   following codes.
  
   m.gam
 
  -gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disb
 in
 
  ary,family=binomial(logit),data=point)
  
And two repeated warnings appeared.
   Warnings:
   1: In gam.fit(G, family = G$family, control = control, gamma = gamma,
 
   ...
 
   : Algorithm did not converge
  
   2: In gam.fit(G, family = G$family, control = control, gamma = gamma,
 
   ...
 
   : fitted probabilities numerically 0 or 1 occurred
  
   Q1: For warning1, could it be solved by changing the value of
   mgcv.toloptions for
   gam.control(mgcv.tol=1e-7)?
  
   Q1: For warning2, is there any impact for the results if the fitted
   probabilities numerically 0 or 1 occurred ?  How can i solve it?
  
I didn't try the possible solutions for them, because it took such a
   longer time to run the whole programs.
Could anybody suggest their solutions?
Any help or suggestions are greatly appreciated.
 Thanks.
 
  --
 
   Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
   +44 1225 386603  www.maths.bath.ac.uk/~sw283

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] axis names in triangle.plot

2007-12-18 Thread Thomas Hoffmann
Thanks for this advice.

grain was not a data.frame but a matrix. Now it works:-)

Cheers
Thomas


Stéphane Dray schrieb:
 Hi Thomas,
 This looks quite strange. By default, the function use the column 
 names as labels.
 What is grain ? A data.frame ? Why have you duplicated row.names? 
 Please return the results of class(grain) and names(grain)

 Cheers,
 PS: If you have questions related to ade4, you can use the adelist 
 (http://listes.univ-lyon1.fr/wws/info/adelist)


 Thomas Hoffmann wrote:
 Hi folks,

 I am using a data.frame with sediment grain sizes:

   grain
  sand  silt  clay
 OAT 10.03 56.77 18.25
 OAT 10.40 57.40 17.94
 WG1 50.03 20.68 12.57
 WG1 43.20 25.69 13.41
 WG1 33.89 31.10 14.48
 WG2  2.84 62.81 20.79
 WG2  2.79 60.46 19.16
 WG2 16.27 33.04  6.48
 WG2  1.39 57.90  9.13
 WG3  4.54 52.91 17.20
 WG3  5.20 50.55 15.65
 WG3  7.71 49.13 10.80
 WG3  4.43 50.03 11.83
 WG3  1.72 57.53 14.20
 WG3  1.51 58.99 13.96

 I would like to do a trinagle.plot with labeled axis-names sand 
 silt and clay. However using the command:

 tringle.plot(grain)

 from ade4-apckage) does not plot the axis names and there is no 
 paramter to set the labels like xlab in the plot command. Does 
 anybody has a good advice?

 Thanks in advance
 Thomas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


   



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reshape Dataframe

2007-12-18 Thread Bert Jacobs

Hi,

I'm having a bit of problems in creating a new dataframe. 
Below you'll find a description of the current dataframe and of the
dataframe that needs to be created.
Can someone help me out on this one?
Thx in advance.
Bert

Current Dataframe

Var1Var2Var3Var4
A   Fa  W1  1
A   Si  W1  2
A   Fa  W2  3
A   Si  W3  4
B   Si  W1  5
C   La  W2  6
C   Do  W4  7

New Dataframe

Var1Var2W1  W2  W3  W4
A   Fa  1   3   
A   Si  2   4
A   La
A   Do  
B   Fa  
B   Si  5   
B   La
B   Do
C   Fa  
C   Si  
C   La  6
C   Do  7

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reshape Dataframe

2007-12-18 Thread hadley wickham
On 12/18/07, Bert Jacobs [EMAIL PROTECTED] wrote:

 Hi,

 I'm having a bit of problems in creating a new dataframe.
 Below you'll find a description of the current dataframe and of the
 dataframe that needs to be created.
 Can someone help me out on this one?

library(reshape)
dfm - melt(df, id = 1:3)
cast(dfm, ... ~ Var3)

You can find out more about the reshape package at http://had.co.nz/reshape

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2 - getting at the grobs

2007-12-18 Thread Pedro de Barros
Dear All,

I continue trying to get several of my plotting functions to use 
ggplot, because I really do like the concept of the graphical 
objects, and working with them in the abstract.
I am now trying to access the grobs to manipulate using grid. 
However, until now all I managed was to get the plot as a gTree 
object, and manipulate it as a gTree from there. The problem is that 
then it is no longer a ggplot, and thus I can no longer use the 
ggplot functions.
How to get at the grobs, without converting the ggplot into a gTree?

Thanks,
Pedro

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] integration

2007-12-18 Thread Ravi Varadhan
Hi Davide,

It is difficult to say what the problem is without knowing more about the 
nature of the integrand.  So, you should do a couple of preliminary things 
before attempting compute the integral.

First, is the integral is finite? You should establish this.  Second, plot the 
integrand over the entire interval.  Then you need to think about the 
following: Is the integrand unimodal, with the mass concentrated over a small 
region? Or is it multimodal? Does it have thick tail?

Assuming that the integral is finite, you could try a few things:
1.  Divide the interval of integration into several small intervals (say, 10 or 
100), and then use integrate() on each and then add up the results.  You can 
make this process more efficient if you know where the mass is concentrated.
2.  Transform the integrand.
3.  Try a simple trapezoidal rule quadrature.

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html






-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Monday, December 17, 2007 11:03 AM
To: r-help@r-project.org
Subject: [R] integration

Dear All,
I need to perform a numerical integration of one dimensional 
fucntions. The extrems of integration are both finite and the functions 
I'm working on are quite complicated. I have already tried both area() 
and integrate(), but they do not perform well: area() is very slow and 
integrate() does not converge. Are in R other functions for numerical 
integration of one dimentional functions?

Thanks in advance 

Davide






Tiscali.Fax: il tuo fax online in promo fino al 31 dicembre, 
paghi 15€ e ricarichi 20€ 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reshape Dataframe

2007-12-18 Thread Gabor Grothendieck
On Dec 18, 2007 9:07 AM, Bert Jacobs [EMAIL PROTECTED] wrote:

 Hi,

 I'm having a bit of problems in creating a new dataframe.
 Below you'll find a description of the current dataframe and of the
 dataframe that needs to be created.
 Can someone help me out on this one?
 Thx in advance.
 Bert

 Current Dataframe

 Var1Var2Var3Var4
 A   Fa  W1  1
 A   Si  W1  2
 A   Fa  W2  3
 A   Si  W3  4
 B   Si  W1  5
 C   La  W2  6
 C   Do  W4  7

 New Dataframe

 Var1Var2W1  W2  W3  W4
 A   Fa  1   3
 A   Si  2   4
 A   La
 A   Do
 B   Fa
 B   Si  5
 B   La
 B   Do
 C   Fa
 C   Si
 C   La  6
 C   Do  7

Try this:

out - ftable(xtabs(Var4 ~ Var1 + Var2 + Var3, DF))
out[out == 0] - NA

Omit the last line is 0 fill is what you had wanted.

This will do it except that it will eliminate all rows
without data:

out2 - reshape(DF, dir = wide, timevar = Var3, idvar = c(Var1, Var2))
out2[is.na(out2)] - 0

Omit the last line if NA fill is what you wanted.

The reshape package melt/cast routines (see Hadley's solution in this
thread) can be used
to give a similar result to the reshape command above (i.e. all
missing rows are not
included) except that cast is a bit more flexible since it has a fill= argument.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread James W. MacDonald
Duncan Murdoch wrote:
 On 18/12/2007 7:31 AM, Antony Unwin wrote:
 Wayne,

 Try the iplot command in iPlots.  You can then vary both the  
 pointsize and the transparency of your scatterplot interactively and  
 decide which scatterplot conveys the information best.  Sometimes  
 it's helpful to use more than one scatterplot when presenting your  
 results.

 (I must admit to being very surprised that jittering and sunflower  
 plots have been suggested for a dataset of 5000 points.  Do those who  
 mentioned these methods have examples on that scale where they are  
 effective?)
 
 Sure.  The original post said there were about 50-60 unique locations. 
 This plot:
 
 x - rbinom(5000, 20, 0.15)
 y - rbinom(5000, 20, 0.15)
 plot(x,y)
 
 has a few more unique locations; tune those probabilities if you want it 
 closer.  Due to the overlap, the distribution is very unclear.  But this 
 plot
 
 plot(jitter(x), jitter(y))

Another alternative is smoothscatter() in the geneplotter package from 
Bioconductor, which does a pretty reasonable job with these example data.

Best,

Jim


 
 makes the distribution quite clear.
 
 I wouldn't use the default pch if I had 5 points, but with pch=., 
 it's not so bad even in that case.
 
 Duncan Murdoch
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reshape Dataframe

2007-12-18 Thread Gabor Grothendieck
On Dec 18, 2007 9:54 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote:

 On Dec 18, 2007 9:07 AM, Bert Jacobs [EMAIL PROTECTED] wrote:
 
  Hi,
 
  I'm having a bit of problems in creating a new dataframe.
  Below you'll find a description of the current dataframe and of the
  dataframe that needs to be created.
  Can someone help me out on this one?
  Thx in advance.
  Bert
 
  Current Dataframe
 
  Var1Var2Var3Var4
  A   Fa  W1  1
  A   Si  W1  2
  A   Fa  W2  3
  A   Si  W3  4
  B   Si  W1  5
  C   La  W2  6
  C   Do  W4  7
 
  New Dataframe
 
  Var1Var2W1  W2  W3  W4
  A   Fa  1   3
  A   Si  2   4
  A   La
  A   Do
  B   Fa
  B   Si  5
  B   La
  B   Do
  C   Fa
  C   Si
  C   La  6
  C   Do  7

 Try this:

 out - ftable(xtabs(Var4 ~ Var1 + Var2 + Var3, DF))
 out[out == 0] - NA

 Omit the last line is 0 fill is what you had wanted.

 This will do it except that it will eliminate all rows
 without data:

 out2 - reshape(DF, dir = wide, timevar = Var3, idvar = c(Var1, Var2))
 out2[is.na(out2)] - 0

 Omit the last line if NA fill is what you wanted.

 The reshape package melt/cast routines (see Hadley's solution in this
 thread) can be used
 to give a similar result to the reshape command above (i.e. all
 missing rows are not
 included) except that cast is a bit more flexible since it has a fill= 
 argument.

Just one correction.  The cast function in reshape has an add.missing= argument
that can control this so actually any of the solutions could be
obtained with cast
using the fill= and add.missing= arguments to control which one you want.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Antony Unwin

On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:

 (I must admit to being very surprised that jittering and  
 sunflower  plots have been suggested for a dataset of 5000  
 points.  Do those who  mentioned these methods have examples on  
 that scale where they are  effective?)

 Sure.  The original post said there were about 50-60 unique  
 locations. This plot:

 x - rbinom(5000, 20, 0.15)
 y - rbinom(5000, 20, 0.15)
 plot(x,y)

 has a few more unique locations; tune those probabilities if you  
 want it closer.  Due to the overlap, the distribution is very  
 unclear.  But this plot

 plot(jitter(x), jitter(y))

 makes the distribution quite clear.

No it doesn't!  It makes it moderately clearer than the plot without  
jittering.  One good alternative here is the fluctuation diagram  
variant of a mosaic plot:

xx-as.factor(x)
yy-as.factor(y)
imosaic(xx,yy, type=f)

Using jittering for categorical data is really not to be recommended  
and will certainly degrade in performance as the dataset gets bigger.


Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg,
Germany
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 18/12/2007 10:01 AM, Antony Unwin wrote:
 On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:
 
 (I must admit to being very surprised that jittering and  
 sunflower  plots have been suggested for a dataset of 5000  
 points.  Do those who  mentioned these methods have examples on  
 that scale where they are  effective?)
 Sure.  The original post said there were about 50-60 unique  
 locations. This plot:

 x - rbinom(5000, 20, 0.15)
 y - rbinom(5000, 20, 0.15)
 plot(x,y)

 has a few more unique locations; tune those probabilities if you  
 want it closer.  Due to the overlap, the distribution is very  
 unclear.  But this plot

 plot(jitter(x), jitter(y))

 makes the distribution quite clear.
 
 No it doesn't!  It makes it moderately clearer than the plot without  
 jittering.  One good alternative here is the fluctuation diagram  
 variant of a mosaic plot:
 
 xx-as.factor(x)
 yy-as.factor(y)
 imosaic(xx,yy, type=f)

That plot is better than jittering, but there's the problem in the 
mosaic plot of understanding the scale of the rectangles:  is it area or 
diameter that encodes the count?  With a jittered plot, you lose 
resolution when the number of points gets too high because you just see 
a mess of ink, but at least you only require the viewer to count in 
order to get a close numerical reading from the plot.

I could also claim that while imperfect, at least jittering is widely 
applicable.  For example, if the data were not on a regular grid, 
perhaps because they had been generated like this:

xloc - rnorm(50)
yloc - rnorm(50)
index - sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
x - xloc[index]
y - yloc[index]

then jittering still works as well (or as poorly), but the imosaic would 
not work at all.  There are better plots than jittering available, but 
jittering is easy.

(Actually, with this dataset, plot(jitter(x), jitter(y)) is really poor, 
because jitter() chooses a bad amount of jittering.  But with manual 
tuning (e.g.  plot(jitter(x, a=0.1), jitter(y, a=0.1), pch=.)) it's 
not too bad.  So I'd say jittering worked, but the R implementation of 
it may need improvement).

 Using jittering for categorical data is really not to be recommended  
 and will certainly degrade in performance as the dataset gets bigger.

Yes, I probably wouldn't recommend jittering if there were more than a 
few hundred replications at any point, or more than a few hundred unique 
points.

Duncan Murdoch

P.S. iplots 1.1-1 may have an init problem in Windows: in my first 
attempt, the plot made the boxes too large to fit in their cells, but it 
fixed itself when I resized the window, and the bug doesn't seem to be 
repeatable.


 
 Antony Unwin
 Professor of Computer-Oriented Statistics and Data Analysis,
 University of Augsburg,
 Germany

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 18/12/2007 10:02 AM, James W. MacDonald wrote:
 Duncan Murdoch wrote:
 On 18/12/2007 7:31 AM, Antony Unwin wrote:
 Wayne,

 Try the iplot command in iPlots.  You can then vary both the  
 pointsize and the transparency of your scatterplot interactively and  
 decide which scatterplot conveys the information best.  Sometimes  
 it's helpful to use more than one scatterplot when presenting your  
 results.

 (I must admit to being very surprised that jittering and sunflower  
 plots have been suggested for a dataset of 5000 points.  Do those who  
 mentioned these methods have examples on that scale where they are  
 effective?)
 Sure.  The original post said there were about 50-60 unique locations. 
 This plot:

 x - rbinom(5000, 20, 0.15)
 y - rbinom(5000, 20, 0.15)
 plot(x,y)

 has a few more unique locations; tune those probabilities if you want it 
 closer.  Due to the overlap, the distribution is very unclear.  But this 
 plot

 plot(jitter(x), jitter(y))
 
 Another alternative is smoothscatter() in the geneplotter package from 
 Bioconductor, which does a pretty reasonable job with these example data.

Yes, I agree.  (As an aside, there's actually a capital S in 
smoothScatter(), and it's a bit of a pain to install, because 
geneplotter depends on something that depends on DBI, which is not so 
easily available these days.)

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread James W. MacDonald
Duncan Murdoch wrote:
 Yes, I agree.  (As an aside, there's actually a capital S in 
 smoothScatter(), and it's a bit of a pain to install, because 
 geneplotter depends on something that depends on DBI, which is not so 
 easily available these days.)

Somehow I always forget the capital S and wonder if I have loaded the 
correct package ;-D

As for installing the required dependencies, I believe this is actually 
quite straightforward:

source(http://www.bioconductor.org/biocLite.R;)
biocLite(geneplotter)

Should install geneplotter and all required dependencies.

Best,

Jim


 
 Duncan Murdoch

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bar plot colors

2007-12-18 Thread John Kane
I think you're going to find that barchart with that
many values in a bar is going to be pretty well
uninterpretable.  

Jim Lemon gives the desired barchart but it is very
difficult to read.  

Stealing his code to create the same matrix I'd
suggest may be looking at a dotchart.  I'm not sure if
this is even close to an optimal solution but I do
think it's a bit better than a barchart approach
==
heights-matrix(sample(10:70,54),ncol=3)
bar.colors-rep(rep(2:7,each=3),3)
cost.types - c(Direct, Indirec, Induced)
colnames(heights) -  c(A, B, C)
rownames(heights) - c(rep(cost.types, 6))

dotchart(heights, col=bar.colors, pch=16, cex=.6)

===
--- Winkel, David [EMAIL PROTECTED] wrote:

 All,
 
  
 
 I have a question regarding colors in bar plots.  I
 want to stack a
 total of 18 cost values in each bar. Basically, it
 is six cost types and
 each cost type has three components- direct,
 indirect, and induced
 costs.  I would like to use both solid color bars
 and bars with the
 slanted lines (using the density parameter).  The
 colors would
 distinguish cost types and the lines would
 distinguish
 direct/indirect/induced.  I want the cost types
 (i.e. colors) to be
 stacked together for each cost type.  In other
 words, I don't want all
 of the solid bars at the bottom and all of the
 slanted lines at the top.
 
 
 So far, I have made a bar plot with all solid colors
 and then tried to
 overwrite that bar plot by calling barplot() again
 and putting the white
 slanted lines across the bars.  However, I can't get
 this method to work
 while still grouping the cost types together.
 
  
 
 Thanks in advance for any help you can provide.
 
  
 
 David Winkel
 
 Applied Biology and Aerosol Technology
 
 Battelle Memorial Institute
 
 505 King Ave.
 
 Columbus, Ohio 43201
 
 614.424.3513
 
  
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Antony Unwin

On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:

 One good alternative here is the fluctuation diagram  variant of a  
 mosaic plot:
 xx-as.factor(x)
 yy-as.factor(y)
 imosaic(xx,yy, type=f)

 That plot is better than jittering, but there's the problem in the  
 mosaic plot of understanding the scale of the rectangles:  is it  
 area or diameter that encodes the count?

Area is used.

 With a jittered plot, you lose resolution when the number of points  
 gets too high because you just see a mess of ink, but at least you  
 only require the viewer to count in order to get a close numerical  
 reading from the plot.

If someone needs a count, they should be given a table.   Graphics  
are for qualitative conclusions not details.  Anyway, counting will  
only work for really small datasets.

 I could also claim that while imperfect, at least jittering is  
 widely applicable.  For example, if the data were not on a regular  
 grid, perhaps because they had been generated like this:

 xloc - rnorm(50)
 yloc - rnorm(50)
 index - sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
 x - xloc[index]
 y - yloc[index]

 then jittering still works as well (or as poorly), but the imosaic  
 would not work at all.

That's right and that's (almost) the sort of example I was thinking  
of.  For a limited number of locations like this a bubble plot would  
be best (which has already been suggested in this thread, I think).   
For many locations and few replications I would still go for varying  
pointsize and transparency.

Incidentally, to check your suggestion I ran your code and discovered  
that the transparency in iplot does not seem to like replications.   
Very strange, we'll have to check why.  I then looked closely at the  
numbers of replications generated and discovered that case 25 was  
picked 325 times and case 40 only once.  Rather too extreme for my  
liking!  Running it again gave very similar results, though not  
exactly the same: this time it was 325 times for case 25 and case 40  
was not picked at all.  Other numbers varied slightly.  This is not  
what I expected, any ideas?

 P.S. iplots 1.1-1 may have an init problem in Windows: in my first  
 attempt, the plot made the boxes too large to fit in their cells,  
 but it fixed itself when I resized the window, and the bug doesn't  
 seem to be repeatable.

Thanks.  This happens occasionally on the Mac too.  Refreshing solves  
it in practice, but we need to find out why it can happen (and stop  
it happening!).

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
University of Augsburg,
Germany
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calculating the number of days from dates

2007-12-18 Thread bogdan romocea
 Sorry for using library instead package, but
 library() is one command for using packages.

... which is why all efforts to make folks say package instead of 
library  are doomed to fail, IMHO. Besides, in English, library
also means a collection of software or data usually reflecting a
specific theme or application (#9 on the list from
http://dictionary.reference.com/ ). Therefore:
library == package
   [1] TRUE!
and just about the only way to clear up the confusion would be to
rename library() to package(), and replace library with folder or
directory.



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Knut Krueger
 Sent: Monday, December 17, 2007 2:11 AM
 To: 'R R-help'
 Subject: Re: [R] calculating the number of days from dates


  it's  a   package   , not a library, please!
 
 
 Sorry for using library instead package, but

 library() is one command for using packages.

 Therefore I (and it seems that i am not the only one) used
 library instead package.

 Knut

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave and Scientific Workplace

2007-12-18 Thread Dietrich Trenkler
Dear HelpeRs,

a colleague of mine uses Scientific Workplace to write his LaTeX documents.
I made his mouth water mentioning the advantages of using Sweave.

Not using SW myself I wonder if anyone out there has gathered some 
experiences
in using the combination of both.

Thank you in advance

Dietrich

-- 
Dietrich Trenkler c/o Universitaet Osnabrueck 
Rolandstr. 8; D-49069 Osnabrueck, Germany
email: [EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 12/18/2007 11:21 AM, James W. MacDonald wrote:
 Duncan Murdoch wrote:
 Yes, I agree.  (As an aside, there's actually a capital S in 
 smoothScatter(), and it's a bit of a pain to install, because 
 geneplotter depends on something that depends on DBI, which is not so 
 easily available these days.)
 
 Somehow I always forget the capital S and wonder if I have loaded the 
 correct package ;-D
 
 As for installing the required dependencies, I believe this is actually 
 quite straightforward:
 
 source(http://www.bioconductor.org/biocLite.R;)
 biocLite(geneplotter)
 
 Should install geneplotter and all required dependencies.

Yes, that works.  Not sure why DBI was unavailable for a simple install 
of geneplotter from the Windows Rgui; when I try it now (on a different 
PC, maybe using a different mirror) it's there.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread Duncan Murdoch
On 12/18/2007 12:44 PM, Antony Unwin wrote:
 On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:
 
 One good alternative here is the fluctuation diagram  variant of a  
 mosaic plot:
 xx-as.factor(x)
 yy-as.factor(y)
 imosaic(xx,yy, type=f)

 That plot is better than jittering, but there's the problem in the  
 mosaic plot of understanding the scale of the rectangles:  is it  
 area or diameter that encodes the count?
 
 Area is used.
 
 With a jittered plot, you lose resolution when the number of points  
 gets too high because you just see a mess of ink, but at least you  
 only require the viewer to count in order to get a close numerical  
 reading from the plot.
 
 If someone needs a count, they should be given a table.   Graphics  
 are for qualitative conclusions not details.  Anyway, counting will  
 only work for really small datasets.
 
 I could also claim that while imperfect, at least jittering is  
 widely applicable.  For example, if the data were not on a regular  
 grid, perhaps because they had been generated like this:

 xloc - rnorm(50)
 yloc - rnorm(50)
 index - sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
 x - xloc[index]
 y - yloc[index]

 then jittering still works as well (or as poorly), but the imosaic  
 would not work at all.
 
 That's right and that's (almost) the sort of example I was thinking  
 of.  For a limited number of locations like this a bubble plot would  
 be best (which has already been suggested in this thread, I think).   
 For many locations and few replications I would still go for varying  
 pointsize and transparency.
 
 Incidentally, to check your suggestion I ran your code and discovered  
 that the transparency in iplot does not seem to like replications.   
 Very strange, we'll have to check why.  I then looked closely at the  
 numbers of replications generated and discovered that case 25 was  
 picked 325 times and case 40 only once.  Rather too extreme for my  
 liking!  Running it again gave very similar results, though not  
 exactly the same: this time it was 325 times for case 25 and case 40  
 was not picked at all.  Other numbers varied slightly.  This is not  
 what I expected, any ideas?

abs(xloc) typically varies by a factor of about 100 from smallest to 
largest, but sometimes the small end is really small, and so the ratio 
is really big.

Duncan Murdoch

 
 P.S. iplots 1.1-1 may have an init problem in Windows: in my first  
 attempt, the plot made the boxes too large to fit in their cells,  
 but it fixed itself when I resized the window, and the bug doesn't  
 seem to be repeatable.
 
 Thanks.  This happens occasionally on the Mac too.  Refreshing solves  
 it in practice, but we need to find out why it can happen (and stop  
 it happening!).
 
 Antony Unwin
 Professor of Computer-Oriented Statistics and Data Analysis,
 University of Augsburg,
 Germany

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R brakes when submitting a query to MySQL

2007-12-18 Thread Marc Moragues
Hello,

I would like to retrieve data stored in MySQL database, so I installed
RMySQL package.
I can successfully connect with the my database using the following code

 dvr-dbDriver(MySQL)
 con2-dbConnect(dvr,group=exbardiv)
 mysqlDescribeConnection(con2)

MySQLConnection:(972,0) 
  User: mmorag 
  Host: localhost 
  Dbname: exbardiv 
  Connection type: localhost via TCP/IP 
  No resultSet available

I can even see the tables in the database

 dbListTables(con2)
[1] agouebhigh_ld   rescuesjlc_info sjlc_ld   temp

[7] temp_snp1 temp_snp2

However, when I try to query the database, R breakes.

res-dbSendQuery(con,'select * from sjlc_ld')

Can anyone help me tune up the connection between R and MySQL?

Thank you,
Marc.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by 
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:

This email is from the Scottish Crop Research Institute, but the views 
expressed by the sender are not necessarily the views of SCRI and its 
subsidiaries.  This email and any files transmitted with it are confidential 
to the intended recipient at the e-mail address to which it has been 
addressed.  It may not be disclosed or used by any other than that addressee.
If you are not the intended recipient you are requested to preserve this 
confidentiality and you must not use, disclose, copy, print or rely on this 
e-mail in any way. Please notify [EMAIL PROTECTED] quoting the 
name of the sender and delete the email from your system.

Although SCRI has taken reasonable precautions to ensure no viruses are 
present in this email, neither the Institute nor the sender accepts any 
responsibility for any viruses, and it is your responsibility to scan the email 
and the attachments (if any).


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Zembower, Kevin
Is it your use of 'con' rather than 'con2' in dbSendQuery? -Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Marc Moragues
Sent: Tuesday, December 18, 2007 1:14 PM
To: r-help@r-project.org
Subject: [R] R brakes when submitting a query to MySQL

Hello,

I would like to retrieve data stored in MySQL database, so I installed
RMySQL package.
I can successfully connect with the my database using the following code

 dvr-dbDriver(MySQL)
 con2-dbConnect(dvr,group=exbardiv)
 mysqlDescribeConnection(con2)

MySQLConnection:(972,0) 
  User: mmorag 
  Host: localhost 
  Dbname: exbardiv 
  Connection type: localhost via TCP/IP 
  No resultSet available

I can even see the tables in the database

 dbListTables(con2)
[1] agouebhigh_ld   rescuesjlc_info sjlc_ld   temp

[7] temp_snp1 temp_snp2

However, when I try to query the database, R breakes.

res-dbSendQuery(con,'select * from sjlc_ld')

Can anyone help me tune up the connection between R and MySQL?

Thank you,
Marc.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:\ \ This email is from the Scottish Crop Rese...{{dropped:30}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot Showing All Points

2007-12-18 Thread bogdan romocea
Another approach which I'm pleased with but was not suggested so far
is jitter + kde2d from MASS:

plot(jitter(x), jitter(y))
if (!exists(kde2d)) require(MASS)
kdesamp - 2  #depending on your RAM
forkde - if (kdesamp  length(x)) sample(1:length(x), kdesamp,
replace=FALSE) else 1:length(x)
d - kde2d(x[forkde], y[forkde])
contour(d, add=TRUE)



 -Original Message-
 From: [EMAIL PROTECTED]
 Subject: Re: [R] Scatterplot Showing All Points


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Marc Moragues
You are right, it was a mistake copying and pasting the code. There is
no error message from R when I run con2. I get a Windows error message
saying R for windows terminal front-end has encountered a problem and
need to close.

The error signature is:

AppName: rterm.exe   AppVer: 2.60.43063.0ModName: msvcrt.dll
ModVer: 7.0.2600.2180Offset: 000378c0

Marc. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Zembower, Kevin
Sent: 18 December 2007 18:18
To: r-help@r-project.org
Subject: Re: [R] R brakes when submitting a query to MySQL

Is it your use of 'con' rather than 'con2' in dbSendQuery? -Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Marc Moragues
Sent: Tuesday, December 18, 2007 1:14 PM
To: r-help@r-project.org
Subject: [R] R brakes when submitting a query to MySQL

Hello,

I would like to retrieve data stored in MySQL database, so I installed
RMySQL package.
I can successfully connect with the my database using the following code

 dvr-dbDriver(MySQL)
 con2-dbConnect(dvr,group=exbardiv)
 mysqlDescribeConnection(con2)

MySQLConnection:(972,0)
  User: mmorag
  Host: localhost
  Dbname: exbardiv
  Connection type: localhost via TCP/IP
  No resultSet available

I can even see the tables in the database

 dbListTables(con2)
[1] agouebhigh_ld   rescuesjlc_info sjlc_ld   temp

[7] temp_snp1 temp_snp2

However, when I try to query the database, R breakes.

res-dbSendQuery(con,'select * from sjlc_ld')

Can anyone help me tune up the connection between R and MySQL?

Thank you,
Marc.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:\ \ This email is from the Scottish Crop
Rese...{{dropped:30}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by 
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:

This email is from the Scottish Crop Research Institute, but the views 
expressed by the sender are not necessarily the views of SCRI and its 
subsidiaries.  This email and any files transmitted with it are confidential 
to the intended recipient at the e-mail address to which it has been 
addressed.  It may not be disclosed or used by any other than that addressee.
If you are not the intended recipient you are requested to preserve this 
confidentiality and you must not use, disclose, copy, print or rely on this 
e-mail in any way. Please notify [EMAIL PROTECTED] quoting the 
name of the sender and delete the email from your system.

Although SCRI has taken reasonable precautions to ensure no viruses are 
present in this email, neither the Institute nor the sender accepts any 
responsibility for any viruses, and it is your responsibility to scan the email 
and the attachments (if any).


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PCA - cov.wt(z) : 'x' must contain finite values only

2007-12-18 Thread Ravi Varadhan
The problem is the missing values.  The argument na.action is not active
in princomp(), which I think is a bug, even though the help page claims that
factory fresh default is na.omit.

So, you need to either get rid of the rows with any missing values in them,
or use a PCA code that can deal with missing values by somehow imputing
them.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Johnson, Bethany 
Sent: Tuesday, December 18, 2007 1:14 PM
To: r-help@r-project.org
Subject: [R] PCA - cov.wt(z) : 'x' must contain finite values only

I am trying to run PCA on a matrix (the first column and row are
headers).  There are several cells with NA's.  When I run PCA with the
following code:
__
setwd(I:/PCA)
AsianProp-read.csv(Matrix.csv, sep=,, header=T, row.names=1)
attach(AsianProp)
AsianProp
AsianProp.pca-princomp(AsianProp, na.omit)
_

I get the error message:

cov.wt(z) : 'x' must contain finite values only

What am I doing wrong?  

Thanks very much!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Scatterplot3d model reporting question

2007-12-18 Thread Max
I've used the scatterplot3d function to graph some data and had it 
graph a smooth fit. Is there a way to actualy find out the function 
of the surface? I've looked through the help and figured out how to get 
it to report the following:

Family: gaussian
Link function: identity

Formula:
y ~ s(x, z)

Parametric coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  0.207500.01223   16.97   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
 edf Est.rank F  p-value
s(x,z) 8.403   17 9.729 1.76e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.684   Deviance explained = 70.9%
GCV score = 0.017692   Scale est. = 0.016151  n = 108

But I'm still not really sure what I'm looking at, either that or 
smooth means something different than I thought. Any help would be 
great!

thanks,

-Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] comparing poisson distributions

2007-12-18 Thread Mark Gosink
Hello all,

I would like to compare two sets of count data which form
Poisson distributions. I'd like to generate some sort of p-value of the
likely-hood that the distributions are the same. Thanks in advance for
your advice.

 

Cheers,

Mark

 

Mark Gosink, Ph.D.

Head of Computational Biology
Scripps Florida
5353 Parkside Drive - RFA
Jupiter, FL  33458
tel: 561-799-8921
fax: 561-799-8952
[EMAIL PROTECTED]

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3d plotting

2007-12-18 Thread Brad B
I am trying to dp a 3d plot.
  I tried persp but my data is not a matrix.
  the 3dplot function returns this error, (list) object cannot be coerced to 
'double'
  heres my code,
  td-read.csv(td.csv, header=TRUE)
price-read.csv(price.csv, header=TRUE)
contractdate-read.csv(contractdate.csv, header=TRUE)
library(rgl)
plot3d(td,contractdate,price)
   
  the 3 csv data files have the following format,
  1
  2
  3
  4
  5
  6
  ...
  60,000
   
  basically I have 3 columns, x y and z that have 6 rows (data points) I 
want to plot.
   

   
-

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Import GAUSS .FMT files

2007-12-18 Thread Pedro.Rodriguez
Dear All,

 

Is it possible to import GAUSS .FMT files into R? 

 

Thanks for your time.

 

Kind Regards,

 

Pedro N. Rodriguez

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3d plotting

2007-12-18 Thread Scionforbai
 60,000

I hope that you actually haven't got any comma to separate the
thousands... it separates fields in a csv files (as the Comma
Separated Values name may suggest). If so, get rid of the commas.

   the 3dplot function returns this error,
  (list) object cannot be coerced to 'double'

 td-read.csv(td.csv, header=TRUE)
 price-read.csv(price.csv, header=TRUE)
 contractdate-read.csv(contractdate.csv, header=TRUE)

You have to coerce the 1-column dataframes created by read.csv to
numeric vectors or to a 6x3 dataframe.
solution 1:

myData - cbind(td,contractdate,price)

 library(rgl)

plot3d(mydata)

solution 2:

td - as.numeric(td)
...
price - as.numeric(price)
plot3d(td,contractdate,price)

Bye,

ScionForbai

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Specifying starting values in lme (nlme package) using msScale

2007-12-18 Thread Catherine A. Holt

I am using package nlme and would like to specify initial values for a linear 
mixed-effects model to help with convergence. I am trying to specify those 
initial values using the msScale option under ‘control’ in the lme() function:

lme(Y ~ X1,  random= ~ X1|X2, control=list(msScale=lmeScale))

where, (as far as I understand), lmeScale is a function that can take initial 
values for parameters. However, I am unsure about how to input those starting 
values (e.g., what names lme will recognize for fixed and random effects, in 
what format, and if a partial list of initial values would be acceptable?).

Any advice or examples of code inputting starting values would be extremely 
helpful. I have been unable to find examples myself online. Note, although it 
may be easier to do this in the lme4 package, I would prefer to use nlme.

Thank you for your attention.
Sincerely, 
Carrie Holt, Ph.D., M.Sc., B.Sc.(Honours)
University of Washington
School of Aquatic  Fishery Sciences
Box 355020
Seattle, WA 98195

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Analyzing Publications from Pubmed via XML

2007-12-18 Thread David Winsemius
Armin Goralczyk [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 On 12/18/07, David Winsemius [EMAIL PROTECTED] wrote:
 David Winsemius [EMAIL PROTECTED] wrote in
 news:[EMAIL PROTECTED]:

  Armin Goralczyk [EMAIL PROTECTED] wrote in
  news:[EMAIL PROTECTED]:

  I tried the above function with simple search terms and it worked
  fine for me (also more output thanks to Martin's post) but when I
  use search terms attributed to certain fields, i.e. with [au] or
  [ta], I get the following error message:
  pm.srch()
  1: laryngeal neoplasms[mh]
  2:

  I am wondering if you used spaces, rather than +'s? If so then
  you may want your function to do more gsub-processing of the input
  string. 

 I tried my theory that one would need +'s instead of spaces, but
 disproved it. Spaces in the input string seems to produce acceptable
 results on my WinXP/R.2.6.1/RGui system even with more complex search
 strings.

 --

 It's not the spaces, the problem is the tag (sorry that I didn't
 specify this), or maybe the string []. I am working on a Mac OS X 10.4
 with R version 2.6. Is it maybe a string conversion problem? In the
 following warning strings in the html adress seem to be different:
 Fehler in .Call(RS_XML_ParseTree, as.character(file), handlers,
 as.logical(ignoreBlanks),  :
  error in creating parser for
 http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmedter
 m=laryngeal neoplasms[mh]
 I/O warning : failed to load external entity
 http%3A//eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi%3Fdb=pubme
 dterm=laryngeal%20neoplasms%5Bmh%5D 

I do not have an up-to-date version of R on my Mac, since I have not yet 
upgraded to OSX10.4. I can try with my older version of R, but failure 
(or even success) with versions OSX-10.2/R-2.0 is not likely to be very 
informative. If you will post an example of the input that is resulting 
in the error, I can try it on my WinXP machine. If we cannot reproduce it 
there, then it may be more appropriate to take further questions to the 
Mac-R mailing list. The error message suggests to me that the fault lies 
in the connection phase of the task. 

-- 
David Winsemius

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Hi all,

What may be a smart, efficient way to get the following result:

myvector - c(A,B,C,D,E)
myseries - miracle(myvector)
myseries
[1]
[[1]] A
[2]
[[1]] A B
[3]
[[1]] A B
[4]
[[1]] A B C
[5]
[[1]] A B C D
[6]
[[1]] A B C D E

Thanks for any hints,

Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Debugged version:
lapply(1:length(myvector), function(.length) {
myvector[1:.length]
})

Thanks for showing the direction!

Joh

[EMAIL PROTECTED] wrote:

From: Johannes Graumann [EMAIL PROTECTED]
Date: 2007/12/18 Tue PM 04:40:37 CST
To: [EMAIL PROTECTED]
Subject: [R] All anchored series from a vector?
 
 lapply(1:length(myvector) function(.length) {
 c(myvector[1}:myvector[.length])
 })
 
 but test it because i didn't.
 
 
 
Hi all,

What may be a smart, efficient way to get the following result:

myvector - c(A,B,C,D,E)
myseries - miracle(myvector)
myseries
[1]
[[1]] A
[2]
[[1]] A B
[3]
[[1]] A B
[4]
[[1]] A B C
[5]
[[1]] A B C D
[6]
[[1]] A B C D E

Thanks for any hints,

Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] All anchored series from a vector?

2007-12-18 Thread Gabor Csardi
On Wed, Dec 19, 2007 at 12:01:25AM +0100, Johannes Graumann wrote:
 Debugged version:
 lapply(1:length(myvector), function(.length) {
 myvector[1:.length]
 })
 
 Thanks for showing the direction!
 
 Joh

Note that this fails if length(myvector)==0. 
Good to know the corner cases.

Gabor

 
 [EMAIL PROTECTED] wrote:
 
 From: Johannes Graumann [EMAIL PROTECTED]
 Date: 2007/12/18 Tue PM 04:40:37 CST
 To: [EMAIL PROTECTED]
 Subject: [R] All anchored series from a vector?
  
  lapply(1:length(myvector) function(.length) {
  c(myvector[1}:myvector[.length])
  })
  
  but test it because i didn't.
  
  
  
 Hi all,
 
 What may be a smart, efficient way to get the following result:
 
 myvector - c(A,B,C,D,E)
 myseries - miracle(myvector)
 myseries
 [1]
 [[1]] A
 [2]
 [[1]] A B
 [3]
 [[1]] A B
 [4]
 [[1]] A B C
 [5]
 [[1]] A B C D
 [6]
 [[1]] A B C D E
 
 Thanks for any hints,
 
 Joh
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
  minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] All anchored series from a vector?

2007-12-18 Thread Johannes Graumann
Nothing to be sorry about. You suggested a viable solution untested ... my
job to figure it out ;0)

Joh

[EMAIL PROTECTED] wrote:

From: [EMAIL PROTECTED]
Date: 2007/12/18 Tue PM 02:50:52 CST
To: Johannes Graumann [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] All anchored series from a vector?
 
 i'm sorry. i tested it afterwards and of course
 it had some problems. below is the working version.
 
 
 myvector-c(A,B,C,D,E)
 
 result- lapply(1:length(myvector), function(.length) {
 myvector[1:.length]
 })
 
 
 print(result)
 
 
 
 
From: Johannes Graumann [EMAIL PROTECTED]
Date: 2007/12/18 Tue PM 04:40:37 CST
To: [EMAIL PROTECTED]
Subject: [R] All anchored series from a vector?

lapply(1:length(myvector) function(.length) {
c(myvector[1}:myvector[.length])
})

but test it because i didn't.



Hi all,

What may be a smart, efficient way to get the following result:

myvector - c(A,B,C,D,E)
myseries - miracle(myvector)
myseries
[1]
[[1]] A
[2]
[[1]] A B
[3]
[[1]] A B
[4]
[[1]] A B C
[5]
[[1]] A B C D
[6]
[[1]] A B C D E

Thanks for any hints,

Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] Update of the np package (version 0.14-1)

2007-12-18 Thread Jeffrey S. Racine
Dear R users,

An updated version of the np package has recently been uploaded to CRAN
(version 0.14-1). 

The package is briefly described in a recent issue of Rnews (October,
2007, http://cran.r-project.org/doc/Rnews/Rnews_2007-2.pdf) for those
who might be interested.

A somewhat more detailed paper that describes the np package is
forthcoming in the Journal of Statistical Software
(http://www.jstatsoft.org) for those might be interested.

A much more thorough treatment of the subject matter can be found in Li,
Q. and J. S. Racine (2007), Nonparametric Econometrics: Theory and
Practice, Princeton University Press, ISBN: 0691121613 (768 Pages) for
those who might be interested
(http://press.princeton.edu/titles/8355.html)

Information on the np package:

This package provides a variety of nonparametric (and semiparametric)
kernel methods that seamlessly handle a mix of continuous, unordered,
and ordered factor datatypes. We would like to gratefully acknowledge
support from  the Natural Sciences and Engineering Research Council of
Canada (NSERC:www.nserc.ca), the Social Sciences and Humanities Research
Council of Canada (SSHRC:www.sshrc.ca), and the Shared Hierarchical
Academic Research Computing Network (SHARCNET:www.sharcnet.ca).

Changes from version 0.13-1 to 0.14-1:

* now use optim rather than nlm for minimisation in single index and
smooth coefficient models
* fixed bug in klein-spady objective function
* regression standard errors are now available in the case of no
  continuous variables
* summary should look prettier, print additional information
* tidied up lingering issues with out-of-sample data and conditional
modes
* fixed error when plotting asymptotic errors with conditional densities
* fixed a bug in npplot with partially linear regressions and 
plot.behavior='data' or 'plot-data'
* maximum default number of multistarts is now set to 5
* least-squares cross-validation of conditional densities uses a new,
 faster algorithm
* new, faster algorithm for least-squares cross-validation for both 
local-constant and local linear regressions.
   The estimator has changed somewhat: both cross-validation and
   the estimator use a method of shrinking towards the local constant
   estimator rather than the standard ridge approach that shrinks
   towards zero
* optimised smooth coefficient code, added ridging
* fixed bug in uniform CDF kernel
* fixed bug where npindexbw would ignore bandwidth.compute = FALSE and
   compute bandwidths when supplied with a preexisting bw object
* now can handle estimation out of discrete support.
* summary would misreport the values of discrete scale factors which
   were computed with bwscaling = TRUE

We are grateful to John Fox, Achim Zeilies, Roger Koenker, and numerous
users for their valuable feedback which resulted in an improved version
of the package.

-- Jeffrey Racine  Tristen Hayfield.

-- 
Professor J. S. Racine Phone:  (905) 525 9140 x 23825
Department of EconomicsFAX:(905) 521-8232
McMaster Universitye-mail: [EMAIL PROTECTED]
1280 Main St. W.,Hamilton, URL:
http://www.economics.mcmaster.ca/racine/
Ontario, Canada. L8S 4M4

`The generation of random numbers is too important to be left to chance'

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I extract the AIC score from a mixed model object produced using lmer?

2007-12-18 Thread Peter H Singleton

I am running a series of candidate mixed models using lmer (package lme4)
and I'd like to be able to compile a list of the AIC scores for those
models so that I can quickly summarize and rank the models by AIC. When I
do logistic regression, I can easily generate this kind of list by creating
the model objects using glm, and doing:

 md - c(md1.lr, md2.lr, md3.lr)
 aic - c(md1.lr$aic, md2.lr$aic, md3.lr$aic)
 aic2 - cbind(md, aic)

but when I try to extract the AIC score from the model object produced by
lmer I get:

 md1.lme$aic
NULL
Warning message:
In md1.lme$aic : $ operator not defined for this S4 class, returning NULL

So... How do I query the AIC value out of a mixed model object created by
lmer?

---
Peter Singleton
USFS Pacific Northwest Research Station
1133 N. Western Ave.
Wenatchee WA 98801
Phone: (509)664-1732
Fax: (509)665-8362
E-mail: [EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random forests

2007-12-18 Thread Naiara Pinto
Dear all,

I would like to use a tree regression method to analyze my dataset. I
am interested in the fact that random forests creates in-bag and
out-of-bag datasets, but I also need an estimate of support for each
split. That seems hard to do in random forests since each tree is
grown using a subset of the predictor variables.

I was thinking of setting mtry = number of predictor variables,
growing several trees, and computing the support for each node as the
number of times that a certain predictor variable was chosen for that
node. Can this be implemented using random forests?

Thanks!

Naiara.

-- 
Naiara Pinto
PhD Candidate
Ecology, Evolution and Behavior
University of Texas Austin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I extract the AIC score from a mixed model object produced using lmer?

2007-12-18 Thread David Barron
You can calculate the AIC as follows:

(fm1 - lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
aic1 - AIC(logLik(fm1))

Hope this helps.
Dave

On 12/18/07, Peter H Singleton [EMAIL PROTECTED] wrote:

 I am running a series of candidate mixed models using lmer (package lme4)
 and I'd like to be able to compile a list of the AIC scores for those
 models so that I can quickly summarize and rank the models by AIC. When I
 do logistic regression, I can easily generate this kind of list by creating
 the model objects using glm, and doing:

  md - c(md1.lr, md2.lr, md3.lr)
  aic - c(md1.lr$aic, md2.lr$aic, md3.lr$aic)
  aic2 - cbind(md, aic)

 but when I try to extract the AIC score from the model object produced by
 lmer I get:

  md1.lme$aic
 NULL
 Warning message:
 In md1.lme$aic : $ operator not defined for this S4 class, returning NULL

 So... How do I query the AIC value out of a mixed model object created by
 lmer?

 ---
 Peter Singleton
 USFS Pacific Northwest Research Station
 1133 N. Western Ave.
 Wenatchee WA 98801
 Phone: (509)664-1732
 Fax: (509)665-8362
 E-mail: [EMAIL PROTECTED]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
David Barron
Said Business School Jesus College
Park End Street  Oxford
Oxford OX1 1HP  OX1 3DW
01865 288906  01865 279684

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calculating the number of days from dates

2007-12-18 Thread Rolf Turner

On 19/12/2007, at 6:46 AM, bogdan romocea wrote:

 Sorry for using library instead package, but
 library() is one command for using packages.

 ... which is why all efforts to make folks say package instead of 
 library  are doomed to fail, IMHO.

Yes, but it gives so much pleasure to those who appreciate
the distinction to rail at those who don't! :-)

cheers,

Rolf Turner

##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting magnitude

2007-12-18 Thread hadley wickham
On Dec 18, 2007 2:06 PM,  [EMAIL PROTECTED] wrote:
 I am plotting fishing vessel positions and want these points to be
 relative in size to the catch at that point.  Is this possible? I am just
 begining to use R and my search of the help section didnt help in this
 area.  Heres what Im using so far

 xyplot(data$latdeg~data$londeg |vessek , groups=vessek,
 xlim=rev(range(69:77)),ylim=(range(35:42)), data=data,
 main=list (Mackerel catches, cex=1.0),
 ylab=latitude, notch=T, varwidth=T,
 xlab=longitude, cex.axis=0.5,)
 any info would be appreciated

This is pretty easy to do with the ggplot2 package:

library(ggplot2)
qplot(longdeg, latdeg, data = data, facets = . ~ vessek, size = catch)

or maybe
qplot(longdeg, latdeg, data = data, facets = . ~ vessek, size = catch)
+ scale_area()

if you want the area of the points proportional to the catch, rather
than their radius

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Clustering Question (Support Vector Clustering)

2007-12-18 Thread Ionut Florescu
I am currently designing a clustering algorithm in collaboration with one of
my colleagues. For comparison purposes we would like to contrast it with the
Support Vector Clustering algorithm of (A. Ben-Hur, D. Horn, H.T.
Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine
Learning Research, 2:125-137, 2001). 

 

This is supposedly the most powerful unsupervised clustering algorithm
available. Unfortunately, we cannot find code for it anywhere. We have tried
emailing the authors and they cannot find the code either (one of them has
not answer yet). I was wondering, perhaps someone in the R community has
implemented it or knows about some implementation for it. We really do not
want to implement it ourselves because this will probably delay our paper
considerably. 

 

One more thing, I am talking about Support vector clustering not
classification, algorithms for that we could find in abundance.

 

Thank you for any help.

Ionut Florescu

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multiple plots with single box

2007-12-18 Thread Giovanni Petris

Hello,

I am trying to display some harmonic functions in a plot. The kind of
display I have in mind is like the one that cn be obtained by a call
to plot.ts with plot.type = multiple. The only difference is that I
want a single box containing all the plots instead of one box per
plot. I thought box(which = outer) would have done the job, but it
didn't. 

Below is the code I have used so far. (R 2.5.1, I know, I know...)

Any help is greatly appreciated. 

Thank you in advance,
Giovanni

=
### Plot harmonic functions
n - 6 # even
omega - 2 * pi / n

par(mfrow = c(n - 1, 1), mar = c(0, 5.1, 0, 5.1), oma = c(3, 1, 2, 1))
for (i in 1:(n/2 - 1)) {
curve(cos(x * i * omega), 0, n, ylim = c(-1.1, 1.1), ylab = , axes = 
FALSE)
points(1:n, cos(i * omega * 1:n))
axis(2); abline(h = 0, col = lightgrey)
curve(sin(x * i * omega), 0, n, ylim = c(-1.1, 1.1), ylab = , axes = 
FALSE)
points(1:n, sin(i * omega * 1:n))
axis(4); abline(h = 0, col = lightgrey)
}
curve(cos(x * (n/2) * omega), 0, n, ylim = c(-1.1, 1.1), ylab = , axes = 
FALSE)
points(1:n, rep(c(-1,1), n/2))
axis(1); axis(2); abline(h = 0, col = lightgrey)

-- 

Giovanni Petris  [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Factor Madness

2007-12-18 Thread Tony Plate
 From ?cbind:

Data frame methods
The cbind data frame method is just a wrapper for data.frame(..., 
check.names = FALSE). This means that it will split matrix columns in data 
frame arguments, and convert character columns to factors unless 
stringsAsFactors = TRUE is passed.

(I'm guessing 'spectrum' is a data.frame before the code fragment you've shown)

hope this helps,

Tony Plate

Johannes Graumann wrote:
 Why is class(spectrum[[Ion]]) after this factor?
 
 spectrum - cbind(spectrum,Ion=rep(,
 nrow(spectrum)),Deviation.AMU=rep(0.0, nrow(spectrum)))
 
 slowly going crazy ...
 
 Joh
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gam() in gam package

2007-12-18 Thread Kunio takezawa
R-users
E-mail: r-help@r-project.org

 This iteration seems to be for iteratively reweighted least squares not
for backfitting. And lm.wfit may solve multiple linear equation using
QR decomposition; but I am not sure.

   Let me tell you something about my guess above.
The iteration below is from 1 to maxit.

  for (iter in 1:maxit) {
--
fit - eval(bf.call)
--
  }

The help of gam.control() says:
   maxit: maximum number of local scoring iterations
   bf.maxit: maximum number of backfitting iterations

Therefore the iteration above is local scoring iteration not
backfitting iteration.

   The first line of lm.wfit() is:
function (x, y, w, offset = NULL, method = qr, tol = 1e-07,
 singular.ok = TRUE, ...)
There is no bf.maxit or its equivalent in these arguments. And it
contains 'method = qr'; it may mean that this routine solves
a multiple linear equation using QR decomposition.
-- 
*[EMAIL PROTECTED]*
http://cse.naro.affrc.go.jp/takezawa/intro.html

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] accessing dimension names

2007-12-18 Thread Moshe Olshansky
To access your dimension idx you could do either
assign(a,paste(dimnames(y)$x,idx,sep=))
or
eval(parse(text=paste(a-dimnames(y)$x,idx,sep=)))

--- [EMAIL PROTECTED] wrote:

 I have a matrix y:
 
  dimnames(y)
 $x93
 [1] 1 2
 
 $x94
 [1] 0 1 2
 .. so on  (there are other
 dimensions as well)
 
 
 
 I need to access a particular dimension, but a
 random mechanism tells me
 which dimension it would. So, sometimes I might need
 to access
 dimnames(y)$x93, some other time it would be
 dimnames(y)$x94.. and so on.
 Now let that random dimension be idx, then
 dimnames(y)$paste('x',idx,sep='')
 doesn't work.
 
 Can anyone help?
 
 Thanks!
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a database

2007-12-18 Thread elw


 Can a simple matrix be used for this or would it be better and more 
 efficient to create an external database to hold the data.  If so, 
 should the database be created using C and how would I do this (seeing 
 as that I have never programmed in C)?

You don't want to be down at the C level, most likely:  it would be much 
more straightforward and programmer-efficient to use one of the available 
bindings to one of the available open-source databases.

R has useful / usable bindings to postgresql, sqlite, and mysql, among 
many others.

These are, however, more generally useful when you reach the point that 
you simply can't manage the volume of your data in R objects or in data 
frames. [And, well, you can go a LONG way with intelligently named R 
objects.  :-)]

--elijah

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] assigning and saving datasets in a loop, with names changing with i

2007-12-18 Thread Marie Pierre Sylvestre
Dear R users,

I am analysing a very large data set and I need to perform several data
manipulations. The dataset is so big that the only way I can play with it
without having memory problems (E.g. cannot allocate vectors of size...)
is to write a batch script to:

1. cut the data into pieces 
2. save the pieces in seperate .RData files
3. Remove everything from the environment
4. load one of the piece
5. perform the manipulations on it
6. save it and remove it from the environment
7. Redo 4-6 for every piece
8. Merge everything together at the end

It works if coded line by line but since I'll have to perform these tasks
on other data sets, I am trying to automate this as much as I can. 

I am using a loop in which I used 'assign' and 'get' (pseudo code below).
My problem is when I use 'get', it prints the whole object on the screen.
I am wondering whether there is a more efficient way to do what I need to
do. Any help would be appreciated. Please keep in mind that the whole
process is quite computer-intensive, so I can't keep everything in the
environment while R performs calculations.

Say I have 1 big dataframe called data. I use 'split' to divide it into a
list of 12 dataframes (call this list my.list)

my.fun is a function that takes a dataframe, performs several
manipulations on it and returns a dataframe.


for (i in 1:12){
  assign( paste( data, i, sep=),  my.fun(my.list[i]))   # this works
  # now I need to save this new object as a RData. 

  # The following line does not work
  save(paste(data, i, sep = ),  file = paste(  paste(data, i, sep =
), RData, sep=.))
}

  # This works but it is a bit convoluted!!!
  temp - get(paste(data, i, sep = ))
  save(temp,  file = lala.RData)
}


I am *sure* there is something more clever to do but I can't find it. Any
help would be appreciated.

best regards,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assigning and saving datasets in a loop, with names changing with i

2007-12-18 Thread Benilton Carvalho

you want to use:

save(list=paste(data, i, sep=), file=paste(data, i, .Rdata,  
sep=))


b

On Dec 18, 2007, at 9:24 PM, Marie Pierre Sylvestre wrote:


Dear R users,

I am analysing a very large data set and I need to perform several  
data
manipulations. The dataset is so big that the only way I can play  
with it
without having memory problems (E.g. cannot allocate vectors of  
size...)

is to write a batch script to:

1. cut the data into pieces
2. save the pieces in seperate .RData files
3. Remove everything from the environment
4. load one of the piece
5. perform the manipulations on it
6. save it and remove it from the environment
7. Redo 4-6 for every piece
8. Merge everything together at the end

It works if coded line by line but since I'll have to perform these  
tasks

on other data sets, I am trying to automate this as much as I can.

I am using a loop in which I used 'assign' and 'get' (pseudo code  
below).
My problem is when I use 'get', it prints the whole object on the  
screen.
I am wondering whether there is a more efficient way to do what I  
need to

do. Any help would be appreciated. Please keep in mind that the whole
process is quite computer-intensive, so I can't keep everything in the
environment while R performs calculations.

Say I have 1 big dataframe called data. I use 'split' to divide it  
into a

list of 12 dataframes (call this list my.list)

my.fun is a function that takes a dataframe, performs several
manipulations on it and returns a dataframe.


for (i in 1:12){
 assign( paste( data, i, sep=),  my.fun(my.list[i]))   # this  
works

 # now I need to save this new object as a RData.

 # The following line does not work
 save(paste(data, i, sep = ),  file = paste(  paste(data, i,  
sep =

), RData, sep=.))
}

 # This works but it is a bit convoluted!!!
 temp - get(paste(data, i, sep = ))
 save(temp,  file = lala.RData)
}


I am *sure* there is something more clever to do but I can't find  
it. Any

help would be appreciated.

best regards,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table() and precision?

2007-12-18 Thread Moshe Olshansky
Thank you for your response!

'write.table' writes up to 15 decimal digits which is
not the machine (double) precision but not far from
that - sorry for the misleading comments!

After all I found a way to do what I needed without
using disk or much memory and doing only twice as much
work as I could with unlimited memory, so I will stick
to this approach.

--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

 On Mon, 17 Dec 2007, Moshe Olshansky wrote:
 
  Dear List,
 
  Following the below question I have a question of
 my
  own:
  Suppose that I have large matrices which are
 produced
  sequentially and must be used sequentially in the
  reverse order. I do not have enough memory to
 store
  them and so I would like to write them to disk and
  then read them. This raises two questions:
  1) what is the fastest (and the most economic
  space-wise) way to do this?
 
 Using save/load is the simplest.  Don't worry about
 finding better 
 solutions until you know those are not good enough. 
 (serialize / 
 unserialize is another interface to the same
 underlying idea.)
 
  2) functions like write, write.table, etc. write
 the
  data the way it is printed and this may result in
 a
  loss of accuracy. Is there any way to prevent
 this,
  except for setting the digits option to a higher
  value or using format prior to writing the data?
 
 Do please read the help before making false claims.
 ?write.table says
 
   Real and complex numbers are written to the
 maximal possible
   precision.
 
 OTOH, ?write says it is a wrapper for cat, whose
 help says
 
   'cat' converts numeric/complex elements in the
 same way as 'print'
   (and not in the same way as 'as.character'
 which is used by the S
   equivalent), so 'options' 'digits' and
 'scipen' are relevant.
   However, it uses the minimum field width
 necessary for each
   element, rather than the same field width for
 all elements.
 
 so this hints as.character() might be a useful
 preprocessor.
 
  Is it possible to write binary files (similar to
 Fortran)?
 
 See ?writeBin.  save/load by default write binary
 files, but use of 
 writeBin can be faster (and less flexible).
 
  Any suggestion will be greatly appreciated.
 
 Somehow you have missed a great deal of information
 about R I/O.
 Try help.start() and reading the sections the search
 engine shows you 
 that look relevant.
 
 
 -- 
 Brian D. Ripley, 
 [EMAIL PROTECTED]
 Professor of Applied Statistics, 
 http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865
 272861 (self)
 1 South Parks Road, +44 1865
 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865
 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strange timings in convolve(x,y,type=open)

2007-12-18 Thread Charles C. Berry
On Tue, 18 Dec 2007, Art Owen wrote:

 Dear R-ophiles,

 I've found something very odd when I apply convolve
 to ever larger vectors.  Here is an example below
 with vectors ranging from 2^11 to 2^17.   There is
 a funny bump up at 2^12.  Then it gets very slow at 2^16.

The time is consumed by fft, which is called with vectors of length 
2*2^i-1 when type = 'o'

 system.time( fft(seq_len(2^13-1)) )
user  system elapsed
0.310.000.32
 system.time( fft(seq_len(2^14-1)) )
user  system elapsed
   0   0   0



There are no factors of 2^13-1 or 2^17-1 or 2^18-1

 for (i in 11:20 ) print( c(index=i, nfact=length(which( 0 == 
(2^(i+1)-1)%%(2:trunc(sqrt(2^(i+1)-1)) )
index nfact
1111
index nfact
12 0
index nfact
13 3
index nfact
14 3
index nfact
15 7
index nfact
16 0
index nfact
1715
index nfact
18 0
index nfact
1923
index nfact
20 5


It looks like the code in fft.c tries to find factors of the series length 
and works from there.

So, the size of the problem evidently depends on its factors.

HTH,

Chuck




   for( i in 11:20 )print( system.time(convolve(1:2^i,1:2^i,type=o)))
   user  system elapsed
  0.002   0.000   0.002
   user  system elapsed
  0.373   0.002   0.375
   user  system elapsed
  0.014   0.001   0.016
   user  system elapsed
  0.031   0.002   0.034
   user  system elapsed
  0.126   0.004   0.130
   user  system elapsed
 194.095   0.013 194.185
   user  system elapsed
  0.345   0.011   0.356

 This example is run on a fedora machine with 64 bits.  I hit the same
 wall at 2^16 on a Macbook (Intel processor I think).  The fedora machine
 is running R 2.5.0.  That's a bit old (April 07) but I saw no mention of
 this speed
 problem in some web searching, and it's not mentioned in the 2.6
 what's new notes.

 I've rerun it and found the same bump at 12 and wall at 16.
 The timing at 2^16  can change appreciably.  In one
 other case it was about 270 user, 271 elapsed.
 The 2^18 case ran for hours without ever finishing.

 At first I thought that this was a memory latency issue.  Maybe it
 is.  But that makes it hard to explain why 2^17 works better than
 2^16.  I've seen that three times now, so I'm almost ready to call it
 reproducible.
 Also, one of the machines I'm using has lots of memory.  Maybe it's
 a cache issue ... but that still does not explain why 2^12 is slower
 than 2^13 or 2^16 is slower than 2^17.

 I've checked by running convolve without type=o and I don't
 see the wall.  Similarly fft does not have that problem.

 Here's an example without type=open
  for( k in 11:20)print(system.time( convolve( 1:2^k,1:2^k)))
   user  system elapsed
  0.001   0.000   0.000
   user  system elapsed
  0.001   0.000   0.001
   user  system elapsed
  0.002   0.000   0.002
   user  system elapsed
  0.004   0.000   0.004
   user  system elapsed
  0.009   0.001   0.010
   user  system elapsed
  0.017   0.001   0.018
   user  system elapsed
  0.138   0.005   0.143
   user  system elapsed
  0.368   0.012   0.389
   user  system elapsed
  1.010   0.032   1.051
   user  system elapsed
  1.945   0.069   2.015

 This is more what I expected.  Something like N or N log(N) , with
 the difference hard to discern in granularity and noise.

 The convolve function is not very big (see below).  When type is
 not specified, it defaults to circular.  So my guess is that something
 mysterious might be happening inside the first else clause below,
 at least on some architectures.

 -Art Owen



  convolve
 function (x, y, conj = TRUE, type = c(circular, open, filter))
 {
type - match.arg(type)
n - length(x)
ny - length(y)
Real - is.numeric(x)  is.numeric(y)
if (type == circular) {
if (ny != n)
stop(length mismatch in convolution)
}
else {
n1 - ny - 1
x - c(rep.int(0, n1), x)
n - length(y - c(y, rep.int(0, n - 1)))
}
x - fft(fft(x) * (if (conj)
Conj(fft(y))
else fft(y)), inv = TRUE)
if (type == filter)
(if (Real)
Re(x)
else x)[-c(1:n1, (n - n1 + 1):n)]/n
else (if (Real)
Re(x)
else x)/n
 }

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, 

[R] plot cummulative sum from calendar time

2007-12-18 Thread gallon li
I have the following list of observations of calendar time:

 [1] 03-Nov-1997 09-Oct-1991 27-Aug-1992 01-Jul-1994 19-Jan-1990 12-Nov-1993
 [7] 08-Oct-1993 10-Nov-1982 08-Dec-1986 23-Dec-1987 02-Aug-1995 20-Oct-1998
[13] 29-Apr-1991 16-Mar-1994 20-May-1991 28-Dec-1987 14-Jul-1999 27-Nov-1998
[19] 09-Sep-1999 26-Aug-1999 20-Jun-1997 05-May-1995 26-Mar-1998 15-Aug-1994
[25] 24-Jun-1996 02-Oct-1996 12-Aug-1985 24-Jun-1992 09-Jan-1991 04-Mar-1988
3230 Levels: 01-Apr-1987 01-Apr-1990 01-Apr-1991 01-Apr-1996 ... 31-Oct-1998

I want to make a plot where x-axis is the calendar time and y-axix is the
cumulative sum of observations observed on or before that dates.

Can anyone give some suggestions?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 4 questions regarding hypothesis testing, survey package, ts on samples, plotting

2007-12-18 Thread eugen pircalabelu

Good morning!

I have 4 questions which trouble me:

1. I want to test the hypothesis that the 2 proportions (the mean of a 
binomial)  which come from 2 different samples are equal. I want to use the 
following function
z= (p1-p2)/ sqrt((p1(1-p1)/n1)+(p2(1-p2)/n2)) which is one of the standard 
formulas  for  this case. Is there such a function in R?

p1=the proportion from the first sample
n1=the first sample size
p1=the proportion from the second sample
n2=the second sample size

2. How can i find the median of a variable in survey package?
a-c(1:10)
b-sample(1:20, 10, replace=T)
b1-sample(0:1, 10, replace=T)
c-data.frame(a,b, b1)
library(survey)
design-svydesign(id=~1, data=c)
svymean(~b, design)

svymean gives me the mean, but what function gives me the svymedian, and what 
function gives me the svyproportion  for b1? 
I want to know the median for a metric variable and the proportion for a 
binomial variable ?

3. Could time series analysis (arima/arma)  be applied on data that are 
obtained from a sample?. Eg: Say I have data from the same sample in 40 
moments. Could I use arima/arma on the series? Or must i have data for the 
whole population like data from the Statistical Office on investment, 
inflation, demographics and so on? 

4. I want to modify the scale of my axes within a plot but i really could not 
find  this option. I think  there is such an option, but i can not find it.  
 x-rnorm(100, 100, 3)
 x-ts(x, frequency=12)
 acf(x)
 plot(x)
On the above example, i want to a scale like this 95,96,97,98,104,105  (on 
y) and 1,2,3,4,...7,8 (on x).

Thank you very much and have a great day!
  

   
-

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.