Re: [R] accessing and preserving list names in lapply
Hi, This might be the trick you are looking for: http://tolstoy.newcastle.edu.au/R/e4/help/08/04/8720.html Romain Alexy Khrabrov wrote: res <- lapply(1:length(L), do.one) Actually, I do res <- lapply(1:length(L), function(x) do.one(L[x])) -- this is the price of needing the element's name: I have to make do.one extract the name and the meat separately inside, and the lapply call becomes ugly. Yet the obvious alternatives -- extracting the names separately, attaching them back onto list elements, etc. -- are even uglier. Something pretty? :) Cheers, Alexy -- Romain Francois Independent R Consultant +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with RBloomberg (not the usual one)
Hello, everyone! I have a problem with RBloomberg and this is not the usual no-administrator-rights problem. I have R 2.7.2, RBloomberg 0.1-10, RDCOMClient 0.92-0. RDCOMClient, chron, zoo, stats: these packages load OK. Then, trying to connect, I get the following error message: conn <- blpConnect(show.days="week", na.action="previous.days", periodicity="daily") Warning messages: 1: In getCOMInstance(name, force = TRUE, silent = TRUE) : Couldn't get clsid from the string 2: In blpConnect(show.days = "week", na.action = "previous.days", periodicity = "daily") : Seems like this is not a Bloomberg Workstation: Error : Invalid class string Has anyone encountered this problem? What is wrong and how can I solve it? Online, I found just one instance of this problem discussed, and it was in Chinese: http://cos.name/bbs/read.php?tid=12821&fpage=3 Thank you for your help! Sergey
[R] combining identify() and locator()
Hi, I am wondering if there might be a way to combine the two functions identify() and locator() such that if I use identify() and then click on a point outside the set tolerance, the x,y coordinates are returned as in locator(). Does anyone know of a way to do this? Thanks in advance for any help -brian
Re: [R] Installing different versions of R simultaneously on Linux
G'day Rainer, On Fri, 27 Feb 2009 09:34:11 +0200 Rainer M Krug r.m.k...@gmail.com wrote: I want to install some versions of R simultaneously from source on a computer (running Linux). [...] What flavour of Linux are we talking about? If it is not, how is it possible to have several versions of R on one computer, or is the only way to compile them and then call R in the directory of the version where it was compiled (~/R-2.7.2/bin/R)? For Debian-based machines (I first used Debian, nowadays Kubuntu), I got into the following habit: 1) Unpack the R sources in /opt/src 2) Enter /opt/src/R-x.y.z and run configure with --prefix=/opt/R/R-x.y.z (and other options) 3) Build R with checks and documentation from source and install. 4) Run in /opt/src a script that uses update-alternatives --install to register the new version and creates a link from /opt/R/R-x.y.z/bin/R to /opt/bin/R-x.y.z I have /opt/bin in my PATH, thus I can call any R version explicitly by R-x.y.z. Typing R alone will usually start the most recently installed version (as this will have the highest priority), but I can configure that via sudo update-alternatives --config R. I.e., I can make R run a particular version. Since the update-alternatives step above also registers all the *.info files and man pages, I will also access the documentation of that particular R version (e.g., C-h i in emacs will give me access to the info version of the manuals of the version of R which is run by the R command). Over time, typically when the Linux system is upgraded, libraries on which old R-x.y.z binaries relied vanish. At that time I usually delete /opt/R/R-x.y.z and remove that version from the available alternatives. HTH. Let me know if you need more details. Cheers, Berwin
Re: [R] survival::survfit,plot.survfit
At 15:28 26.02.2009, Terry Therneau wrote: plot(survfit(fit)) should plot the survival function for x=0 or equivalently beta'x=0. This curve is independent of any covariates. This is not correct. It plots the curve for a hypothetical subject with x = mean of each covariate. Does this mean the curve corresponds to the one you would get based on the baseline hazard? Heinz This is NOT the average survival of the data set. Imagine a cohort made up of 60-year-old men and their 10-year-old grandsons: the expected survival of this cohort does not look like that of a 35-year-old male. Terry T
Re: [R] combining identify() and locator()
2009/2/27 Brian Bolt bb...@kalypsys.com: Hi, I am wondering if there might be a way to combine the two functions identify() and locator() such that if I use identify() and then click on a point outside the set tolerance, the x,y coordinates are returned as in locator(). Does anyone know of a way to do this? Thanks in advance for any help Since identify will only return the indexes of selected points, and it only takes on-screen clicks for coordinates, you'll have to leverage locator and duplicate some of the identify work. So call locator(1), then compute the distances to your points, and if any are below your tolerance mark them using text(), otherwise keep the coordinates of the click. You can use dist() to compute a distance matrix, but if you want to totally replicate identify's tolerance behaviour I think you'll have to convert from your data coordinates to device coordinates. The grconvertX() and grconvertY() functions look like they'll do that for you. Okay, that's the flatpack delivered, I think you've got all the parts, some assembly required! Barry
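Barry's recipe could be assembled roughly as follows. This is a minimal sketch, not a tested solution: the function name and the `tol` default (0.25 inches, roughly mimicking identify()'s default) are my own choices.

```r
# Sketch: click once; if the click is within `tol` inches of a data
# point, label that point like identify(); otherwise return the click
# coordinates like locator().
identify.or.locate <- function(x, y, labels = seq_along(x), tol = 0.25) {
  p <- locator(1)
  if (is.null(p)) return(NULL)
  # convert user coordinates to inches so the tolerance is device-based
  dx <- grconvertX(x, "user", "inches") - grconvertX(p$x, "user", "inches")
  dy <- grconvertY(y, "user", "inches") - grconvertY(p$y, "user", "inches")
  d <- sqrt(dx^2 + dy^2)
  i <- which.min(d)
  if (d[i] < tol) {
    text(x[i], y[i], labels[i], pos = 4)  # mark the point, as identify() does
    list(index = i)
  } else {
    p                                     # fall back to locator()'s output
  }
}
```

After plot(x, y), a call to identify.or.locate(x, y) would then behave like identify() near a point and like locator(1) elsewhere.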
Re: [R] Download daily weather data
Dear Thomas, more for the sake of completeness and as an alternative to R. There are GRIB data sets [1] available (some for free) and there is the GPL software GrADS [2]. Because the GRIB format is well documented, it should be possible to get it into R easily and make up your own plots/weather analysis. I do not know and have not checked whether somebody has already done so. I use this information and these tools, among others, during longer offshore sailing trips. Best, Bernhard [1] http://www.grib.us/ [2] http://www.iges.org/grads/ -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Scillieri, John Sent: Thursday, 26 February 2009 22:58 To: 'James Muller'; 'r-help@r-project.org' Subject: Re: [R] Download daily weather data Looks like you can sign up to get XML feed data from Weather.com http://www.weather.com/services/xmloap.html Hope it works out! -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Muller Sent: Thursday, February 26, 2009 3:57 PM To: r-help@r-project.org Subject: Re: [R] Download daily weather data Thomas, Have a look at the source code for the webpage (Ctrl-U in Firefox; I don't know the shortcut in Internet Explorer, etc.). That is what you'd have to parse in order to get the forecast from this page. Typically when I parse webpages such as this I use regular expressions to do so (and I would never downplay the usefulness of regular expressions, but they take a little getting used to). There are two parts to the task: find patterns that allow you to pull out the datum/data you're after; and then write a program to pull it/them out. Also, of course, download the webpage (but that's no issue). I bet you'd be able to find a comma-separated value (CSV) file containing the weather report somewhere, which would probably involve a little less labor in order to produce your automatic wardrobe advice.
James On Thu, Feb 26, 2009 at 3:47 PM, Thomas Levine thomas.lev...@gmail.com wrote: I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions. The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them. Someone must have needed to do this before. Suggestions? Thomas Levine!
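James's download-then-regex approach could be sketched like this. This is purely illustrative: the pattern and the idea of grepping for "Temperature" are my assumptions, and the real NOAA page would need its own pattern, found by inspecting the page source.

```r
# Sketch: download a forecast page and pull numbers out with regular
# expressions.  The URL and pattern below are placeholders.
url  <- "http://forecast.weather.gov/MapClick.php"  # plus query parameters
page <- readLines(url, warn = FALSE)                # fetch the raw HTML
# keep only lines that look like they carry a temperature value
temp.lines <- grep("Temperature", page, value = TRUE)
# strip everything but the digits (illustrative, not robust HTML parsing)
temps <- as.numeric(gsub("[^0-9]", "", temp.lines))
```

A proper HTML or XML parser would be more robust than regular expressions, but for a single well-understood page this kind of grep/gsub pipeline is often enough.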
[R] Axis-question
Hi there, I was wondering whether it's possible to generate an axis with groups (like in Excel), so that you can have something like this as the x-axis (for example for the levelplot method of the lattice package):

| X1 | X2 | X3 | X1 | X2 | X3 | X1 | ...
|   group1    |   group2    | group3 ...

I hope you understand what I'm looking for?
Re: [R] accessing and preserving list names in lapply
Hi, Perhaps Hadley's plyr package can help: library(plyr) temp <- list(x=2, y=3, x=4) llply(temp, function(x) x^2) $x [1] 4 $y [1] 9 $x [1] 16 baptiste On 27 Feb 2009, at 03:07, Alexy Khrabrov wrote: Sometimes I'm iterating over a list where names are keys into another data structure, e.g. a related list. Then I can't use lapply as it does [[]] and loses the name. Then I do something like this: do.one <- function(ldf) { # list-dataframe item key <- names(ldf) meat <- ldf[[1]] mydf <- some.df[[key]] # related data structure r.df <- cbind(meat, new.column=computed) r <- list(xxx=r.df) names(r) <- key r } then if I operate on the list L of those ldf's not as lapply(L,...), but res <- lapply(1:length(L), do.one) Can this procedure be simplified so that names are preserved? Specifically, can the xxx=..., names(r) <- key part be eliminated -- how can we have a variable on the left-hand side of list(lhs=value)? Cheers, Alexy _ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
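Another idiomatic way to keep the names is to pass the name and the element as separate arguments, which avoids the single-element-list gymnastics in Alexy's do.one entirely. A minimal sketch (the toy list and the body of do.one are placeholders for the real data and computation):

```r
L <- list(a = 1, b = 2)          # toy list standing in for the real data
do.one <- function(key, meat) {  # operate on name and value directly
  meat + 1                       # placeholder for the real computation
}
# Map pairs each name with its element; the result keeps the names
res <- Map(do.one, names(L), L)
# res$a is 2, res$b is 3, and names(res) is c("a", "b")
```

Since Map() names its result after its first (character) argument, the names survive without ever constructing list(xxx=...) and renaming it.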
[R] Balanced design, differences in results using anova and lmer/anova
Hi, I am trying to do an analysis of variance for an unbalanced design. As a toy example, I use a dataset presented by K. Hinkelmann and O. Kempthorne in Design and Analysis of Experiments (p. 353-356). This example is very similar to my own dataset, with one difference: it is balanced. Thus it is possible to do an analysis using both: (1) anova, and (2) lmer. Furthermore, I can compare my results with the results presented in the book (the book uses SAS). In short: using anova, I can reproduce the results presented in the book; using lmer, I fail to reproduce the results. However, for my real analysis, I need lmer - what do I do wrong? The example uses a randomized complete block design (RCBD) with a nested blocking structure and subsampling. response: height (of some trees) covariates: HSF (type of the trees) nested covariates: loc (location), block (block is nested in location)
# the data (file: pine.txt) looks like this:
loc block HSF height
1 1 1 210
1 1 1 221
1 1 2 252
1 1 2 260
1 1 3 197
1 1 3 190
1 2 1 222
1 2 1 214
1 2 2 265
1 2 2 271
1 2 3 201
1 2 3 210
1 3 1 220
1 3 1 225
1 3 2 271
1 3 2 277
1 3 3 205
1 3 3 204
1 4 1 224
1 4 1 231
1 4 2 270
1 4 2 283
1 4 3 211
1 4 3 216
2 1 1 178
2 1 1 175
2 1 2 191
2 1 2 193
2 1 3 182
2 1 3 179
2 2 1 180
2 2 1 184
2 2 2 198
2 2 2 201
2 2 3 183
2 2 3 190
2 3 1 189
2 3 1 183
2 3 2 200
2 3 2 195
2 3 3 197
2 3 3 205
2 4 1 184
2 4 1 192
2 4 2 197
2 4 2 204
2 4 3 192
2 4 3 190
#
# then I load the data
#
read.data = function() {
  d = read.table( "pines.txt", header=TRUE )
  d$loc = as.factor( d$loc )
  d$block.tmp = as.factor( d$block )
  d$block = ( d$loc:d$block.tmp )[drop=TRUE] # lme4 does not support implicit nesting
  d$HSF = as.factor( d$HSF )
  return( d )
}
d = read.data()
#
# using anova:
#
m.aov = aov( height ~ HSF*loc + Error(loc/block + HSF:loc/block), data=d )
summary( m.aov )
#
# I get:
#
Error: loc
    Df Sum Sq Mean Sq
loc  1  20336   20336

Error: loc:block
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  6 1462.33  243.72

Error: loc:HSF
        Df  Sum Sq Mean Sq
HSF      2 12170.7  6085.3
HSF:loc  2  6511.2  3255.6

Error: loc:block:HSF
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals 12 301.167  25.097

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 24 529.00   22.04
#
# which is what I expected; however, using lmer
#
m.lmer = lmer( height ~ HSF*loc + HSF*(loc|block), data=d )
anova( m.lmer )
#
# I get:
#
Analysis of Variance Table
        Df  Sum Sq Mean Sq
HSF      2 12170.7  6085.3
loc      1  1924.6  1924.6
HSF:loc  2  6511.2  3255.6
#
# which is, at least, not what I expected...
#
Thanks for your help, Lars
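For what it's worth, the random-effects specification that would usually mirror the aov() error strata above is something like the following. This is a sketch under my own reading of the design (block and the block-by-HSF stratum as random intercepts, HSF, loc and their interaction fixed), not a tested answer to Lars's question:

```r
library(lme4)
# random intercepts for block and for the block:HSF stratum,
# mirroring the Error(loc/block + HSF:loc/block) terms in the aov() call
m.lmer <- lmer(height ~ HSF * loc + (1 | block) + (1 | block:HSF), data = d)
anova(m.lmer)
```

The HSF*(loc|block) term in the original call instead asks for correlated random loc slopes within block for each HSF level, which is a quite different (and much larger) model.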
Re: [R] Using package ROCR
Just an update concerning an error message in using the ROCR package: Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' I have changed the sequence of loading the packages and the problem has gone: library(ROCR) library(randomForest) The loading sequence that caused the error was: library(randomForest) library(ROCR) Maybe this info could be useful for somebody else who is getting the same error. wiener30 wrote: Thank you very much for the response! The plot(1,1) helped to resolve the first problem. But I am still getting a second error message when running demo(ROCR): Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' It seems it has something to do with compatibility of S4 objects. My versions of R and the ROCR package are the same as you listed. But it seems something else is missing in my installation. William Doane wrote: Responding to question 1... it seems the demo assumes you already have a plot window open. library(ROCR) plot(1,1) demo(ROCR) seems to work. For question 2, my environment produces the expected results... plot doesn't generate an error: * R 2.8.1 GUI 1.27 Tiger build 32-bit (5301) * OS X 10.5.6 * ROCR 1.0-2 -Wil wiener30 wrote: I am trying to use the ROCR package to analyze classification accuracy; unfortunately there are some problems right at the beginning. Question 1) When I try to run the demo I am getting the following error message: library(ROCR) demo(ROCR) if (dev.cur() <= 1) [TRUNCATED] Error in get(getOption("device")) : wrong first argument When I issue the command dev.cur() it returns null device 1 It seems something is wrong with my R environment? Could somebody provide a hint what is wrong. Question 2) When I run example commands from the manual library(ROCR) data(ROCR.simple) pred <- prediction( ROCR.simple$predictions, ROCR.simple$labels ) perf <- performance( pred, "tpr", "fpr" ) plot( perf ) the plot command issues the following error message: Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' How could this be fixed? Thanks for the support
Re: [R] bottom legends in ggplot2 ?
I would think that the lines below should work, but they give an error. Hadley, can you clarify this? Cheers, Thierry library(ggplot2) qplot(mpg, wt, data=mtcars, colour=cyl) + opts(legend.position = "bottom") Error in grid.Call.graphics(L_setviewport, pvp, TRUE) : Non-finite location and/or size for viewport ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl)) + geom_point() + opts(legend.position = "bottom") Error in grid.Call.graphics(L_setviewport, pvp, TRUE) : Non-finite location and/or size for viewport sessionInfo() R version 2.8.1 (2008-12-22) i386-pc-mingw32 locale: LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252 attached base packages: [1] grid stats graphics grDevices datasets utils methods [8] base other attached packages: [1] ggplot2_0.8.1 reshape_0.8.2 plyr_0.1.5 proto_0.3-8 ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Avram Aelony Sent: Thursday, 26 February 2009 20:34 To: r-h...@stat.math.ethz.ch Subject: [R] bottom legends in ggplot2 ? Has anyone had success with producing legends to a qplot graph such that the legend is placed on the bottom, under the abscissa rather than to the right-hand side? The following doesn't move the legend: library(ggplot2) qplot(mpg, wt, data=mtcars, colour=cyl, gpar(legend.position="bottom") ) I am using ggplot2_0.8.2. Thanks in advance, Avram
Re: [R] gplot problems with faceting
Dear Pascal, I think you need to define the facets as facets = ~ Par instead of facets = Par ~ . The Par ~ . syntax can be used with facet_grid and not with facet_wrap. HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of BOISSON, Pascal Sent: Thursday, 26 February 2009 17:08 To: r-help@r-project.org Subject: [R] gplot problems with faceting Dear R-listers, I am very confused with what seems to be a misuse of the faceting options with the qplot function, and I hope you might help me on this. z contains various simulation results from simulations with different sets of parameters. I melt my data to have the following data.frame structure: str(z) 'data.frame': 12383 obs. of 5 variables: $ vID : num 1 2 3 4 5 6 7 8 9 10 ... $ Var : Factor w/ 61 levels ".t",".ASU_1.Biofilm_C",..: 1 1 1 1 1 1 $ Var.Value: num 317 318 319 320 319 ... $ Par : Factor w/ 7 levels ".Biostyr0d.t_K",..: 1 1 1 1 1 1 1 1 $ Par.Value: num 5 5 5 5 5 5 5 5 5 5 ... I would like to plot, for each couple (Parameter(i), Variable(j)), the plot Variable(j).Value = f(Parameter(i).Value). I would like to do it stepwise and have one set of graphs per Variable. Then I subset z based on a single variable name, e.g. .ASU_1.Biofilm_C. Then I try the following, but I get an error message: qp <- qplot(Par.Value, Var.Value, data = z[z$Var==v,], ylab=v, geom=c("point","smooth"), method="lm") qp <- qp + facet_wrap( facets = Par ~ ., scales = "free_x", ncol=length(vPar)) qp Erreur dans `[.data.frame`(plot$data, , setdiff(cond, names(df)), drop = FALSE) : colonnes non définies sélectionnées [i.e., undefined columns selected] I can have this working by modifying the facets argument to Par~Var, and it does what I want, but it is not satisfying, and I am confused by this error message. The same error message happens when I use the full data frame, or when I try other mappings like colours = Par. Any idea of what I am doing wrong? Best regards Pascal Boisson
[R] rounding problem
hi i am creating some variables from the same data, but somewhere there is different rounding. look: P = abs(fft(d.zlato)/480)^2 hladane = sort(P, decreasing=T)[1:10]/480 pozicia = c(0,0,0,0,0) for (j in 1:5){ for (i in 2:239){ if (P[i]/480==hladane[2*j-1]){pozicia[j]=i-1}}} period = 479/pozicia P[2]/334 [1] 0.0001279107 hladane[1] [1] 0.0001279107 P[2]/334==hladane[1] [1] FALSE abs(P[2]/334 - hladane[1]) < 0.001 [1] TRUE Is it possible to avoid this? I know in this example I can use 2x if to eliminate this rounding, but I need to fix it in general.
Re: [R] Installing different versions of R simultaneously on Linux
This is really an R-devel question. On Fri, 27 Feb 2009, Rainer M Krug wrote: Hi I want to install some versions of R simultaneously from source on a computer (running Linux). Some programs have an option to specify a suffix for the executable (e.g. R would become R-2.7.2 when the suffix is specified as -2.7.2). I did not find this option for R - did I overlook it? If it is not there, how is it possible to have several versions of R on one computer, or is the only way to compile them and then call R in the directory of the version where it was compiled (~/R-2.7.2/bin/R)? If this is the case, would it be possible to add this option to specify the suffix for the executables? 'R' is not an executable, but a shell script. You can use 'prefix' to install R anywhere, or other variables for more precise control (see the R-admin manual). For example, we use rhome to have R 2.8.x under /usr/local/lib64/R-2.8 etc. And you can rename $prefix/bin/R to, say, R-2.7.2, or link R_HOME/bin/R to anywhere in your path, under any name you choose. Thanks Rainer -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Faculty of Science Natural Sciences Building Private Bag X1 University of Stellenbosch Matieland 7602 South Africa -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] rounding problem
Hi, you probably want to use ?all.equal instead of ==. I couldn't run your example, though. Hope this helps, baptiste On 27 Feb 2009, at 10:32, Peterko wrote: hi i am creating some variables from the same data, but somewhere there is different rounding. look: P = abs(fft(d.zlato)/480)^2 hladane = sort(P, decreasing=T)[1:10]/480 pozicia = c(0,0,0,0,0) for (j in 1:5){ for (i in 2:239){ if (P[i]/480==hladane[2*j-1]){pozicia[j]=i-1}}} period = 479/pozicia P[2]/334 [1] 0.0001279107 hladane[1] [1] 0.0001279107 P[2]/334==hladane[1] [1] FALSE abs(P[2]/334 - hladane[1]) < 0.001 [1] TRUE Is it possible to avoid this? I know in this example I can use 2x if to eliminate this rounding, but I need to fix it in general. _ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
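A minimal illustration of the == versus all.equal difference, using a standard floating-point example rather than Peterko's data:

```r
x <- 0.1 + 0.2
x == 0.3                   # FALSE: binary floating point is not exact
isTRUE(all.equal(x, 0.3))  # TRUE: all.equal compares within a tolerance
```

Wrapping all.equal in isTRUE() matters inside if(): when the values differ, all.equal returns a character description of the difference, not FALSE.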
Re: [R] Installing different versions of R simultaneously on Linux
Prof Brian Ripley wrote: This is really an R-devel question. On Fri, 27 Feb 2009, Rainer M Krug wrote: Hi I want to install some versions of R simultaneously from source on a computer (running Linux). Some programs have an option to specify a suffix for the executable (e.g. R would become R-2.7.2 when the suffix is specified as -2.7.2). I did not find this option for R - did I overlook it? If it is not there, how is it possible to have several versions of R on one computer, or is the only way to compile them and then call R in the directory of the version where it was compiled (~/R-2.7.2/bin/R)? If this is the case, would it be possible to add this option to specify the suffix for the executables? 'R' is not an executable, but a shell script. That depends on what is meant by 'executable'. Files that contain instructions for an interpreter http://en.wikipedia.org/wiki/Interpreter_%28computing%29 or virtual machine http://en.wikipedia.org/wiki/Virtual_machine may be considered executables [1]. "The term might also be, but generally isn't, applied to scripts which are interpreted by a command line interpreter http://foldoc.org/index.cgi?command+line+interpreter." [2] Try also file `which R`, which is likely, system-dependently, to say that it's *executable* (independently of the access mode). vQ [1] http://en.wikipedia.org/wiki/Executable [2] http://foldoc.org/index.cgi?query=executable&action=Search
Re: [R] rounding problem
all.equal is what I need - many thanks for the help.

baptiste auguie-2 wrote:
> Hi, you probably want to use ?all.equal instead of ==. I couldn't run your example, though. Hope this helps, baptiste

-- View this message in context: http://www.nabble.com/rounding-problem-tp22243179p22243567.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
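The failure discussed in this thread is ordinary floating-point rounding rather than anything specific to fft(); a minimal, self-contained illustration of the `==` vs `all.equal()` distinction (made-up numbers, not the poster's data):

```r
## Two ways of computing what is mathematically the same value
a <- sqrt(2)^2        # 2 plus a tiny floating-point error
b <- 2

a == b                      # FALSE: exact binary comparison
isTRUE(all.equal(a, b))     # TRUE: comparison within a tolerance
abs(a - b) < 1e-8           # TRUE: an explicit tolerance test

## In the poster's loop, the exact test  P[i]/480 == hladane[2*j - 1]
## could be replaced by a tolerance test such as
##   isTRUE(all.equal(P[i]/480, hladane[2*j - 1]))
```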
Re: [R] Inefficiency of SAS Programming
Frank, I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS: it's a powerful and flexible language, but unfortunately very expensive. Your example mentions doing a vector product in the macro language - this only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices - macros are only an efficient code-copying device. Doing matrix computations this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognised that, but a lot of SAS users remain unfamiliar with IML. In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

b = solve(X`*X, X`*y); rss = sqrt(ssq(y - X*b));

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW, SPSS also has a matrix language.

Gerard

Frank E Harrell Jr f.harr...@vanderbilt.edu wrote to the R list (r-h...@stat.math.ethz.ch) on 26/02/2009 22:57:

> If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.
Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
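For readers comparing the two systems, the IML one-liner above has a close base-R analogue via the normal equations (a sketch with toy data; IML's `ssq()` sum-of-squares function is spelled out explicitly):

```r
## OLS via the normal equations, mirroring IML's  b = solve(X`*X, X`*y)
set.seed(1)
X <- cbind(1, matrix(rnorm(20), 10, 2))   # toy design matrix with intercept
y <- rnorm(10)                            # toy response

b   <- solve(crossprod(X), crossprod(X, y))  # solve(t(X) %*% X, t(X) %*% y)
rss <- sqrt(sum((y - X %*% b)^2))            # IML: rss = sqrt(ssq(y - X*b))

## Same coefficients as R's QR-based fitter:
all.equal(drop(b), unname(coef(lm.fit(X, y))))
```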
Re: [R] bottom legends in ggplot2 ?
Yes, this is a known bug which will (hopefully) be addressed in the next release. Hadley

On Fri, Feb 27, 2009 at 4:15 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:
> I would think that the lines below should work, but they give an error. Hadley, can you clarify this? Cheers, Thierry
>
> library(ggplot2)
> qplot(mpg, wt, data = mtcars, colour = cyl) + opts(legend.position = "bottom")
> Error in grid.Call.graphics(L_setviewport, pvp, TRUE) :
>   Non-finite location and/or size for viewport
>
> ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl)) + geom_point() +
>   opts(legend.position = "bottom")
> Error in grid.Call.graphics(L_setviewport, pvp, TRUE) :
>   Non-finite location and/or size for viewport
>
> sessionInfo()
> R version 2.8.1 (2008-12-22) i386-pc-mingw32
> locale: LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252
> attached base packages: [1] grid stats graphics grDevices datasets utils methods [8] base
> other attached packages: [1] ggplot2_0.8.1 reshape_0.8.2 plyr_0.1.5 proto_0.3-8
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
> Gaverstraat 4, 9500 Geraardsbergen, Belgium
> tel. + 32 54/436 185
> thierry.onkel...@inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] on behalf of Avram Aelony
Sent: Thursday, 26 February 2009 20:34
To: r-h...@stat.math.ethz.ch
Subject: [R] bottom legends in ggplot2 ?

Has anyone had success producing legends for a qplot graph such that the legend is placed at the bottom, under the abscissa, rather than at the right-hand side? The following doesn't move the legend:

library(ggplot2)
qplot(mpg, wt, data = mtcars, colour = cyl, gpar(legend.position = "bottom"))

I am using ggplot2_0.8.2. Thanks in advance, Avram

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.

--
http://had.co.nz/

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
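For reference, a sketch of how the same request is spelled once the bug no longer applies: in ggplot2 releases after the 0.8.x series discussed here, `opts()` was replaced by `theme()` (this assumes a reasonably current ggplot2 is installed, not the versions in the thread):

```r
library(ggplot2)

## opts(legend.position = "bottom") was the 0.8.x spelling that triggered
## the viewport error; later releases use theme() and draw this correctly.
p <- ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl)) +
  geom_point() +
  theme(legend.position = "bottom")
p
```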
Re: [R] survival::predict.coxph
Hello Terry, it's really great to receive some feedback from a pro. I'm not sure if I've got the point right: you suppose that the Cox model isn't good at forecasting an expected survival time because of the issues with predicting the survival function at the right tail, and one should rather use parametric models like an exponential model? Or what do you mean by a smooth parametric estimate? Anyway, I just ordered your book at the library; hopefully I'll get some more insights from reading it.

Maybe I should point out why I even tried to do such forecasts. Following the article "Quantifying climate-related risks and uncertainties using Cox regression models" by Maia and Meinke, I try to deduce winter precipitation from lagged sea-surface temperatures (SSTs). So precipitation is my "survival time" and the SST observations at different lags are my covariates. The sample size is only 55 and I've got 11 covariates (lag = 0 months to lag = 10 months) to choose from. My first goal is to identify the optimal time lag(s) between SST-anomaly observation and precipitation observation. The expectation was that the lag should be some months. I thought a Cox model would easily provide such a selection. At first I used the covariates individually. Coefficients for lags between 0 and 5 months were all quite big and then decreasing from 6 to 10 months. So I think 5 months could be the lag of the process, and high persistence of the SST accounts for the big coefficients for 0-4 months. As the next step I used all 11 covariates at once, hoping to get similar results. Instead the sign of the coefficients jumps randomly from plus to minus, and the magnitudes look random as well. I also tried using sets of three covariates, e.g. with lags 4, 5, 6, but even then the sign of the coefficients varies. So my thought was that maybe I overfitted the model. In fact I did not find any literature on whether that is even possible.
As far as my limited knowledge goes, overfitted models should reproduce the training period very well but other periods very poorly. So I first tried to reproduce the training period - but so far with no success, whether using 11 covariates or just 1. Regards Bernhard

Terry Therneau wrote:
> You are mostly correct. Because of the censoring issue, there is no good estimate of the mean survival time. The survival curve either does not go to zero, or gets very noisy near the right-hand tail (large standard error); a smooth parametric estimate is what is really needed to deal with this. For this reason the mean survival, though computed (but see the survfit.print.mean option, help(print.survfit)), is not highly regarded. It is not an option in predict.coxph. Terry T.

begin included message --
Hi, if I got it right then the survival time we expect for a subject is the integral of the subject-specific survival function from 0 to t_max. If I have a trained Cox model and want to predict the survival time for a new subject, I could use survfit(coxmodel, newdata = newSubject) to estimate a new survival function, which I then have to integrate. Actually I thought predict(coxmodel, newSubject) would do this for me, but I'm confused about which type I have to declare. If I understand the little pieces of documentation right, none of the available types is exactly the predicted survival time. I think I have to use the mean survival time of the baseline function times exp(the result of type "linear predictor"). Am I right?

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
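A sketch of the machinery under discussion, using the survival package's bundled lung data rather than the poster's precipitation data; the restricted mean printed by survfit is the "integral of the survival curve" mentioned in the included message, and the quantity Terry cautions about:

```r
library(survival)

## Cox model on the packaged lung data (illustration only)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

## Predicted survival curve for one hypothetical new subject...
new <- data.frame(age = 60, sex = 1)
sf  <- survfit(fit, newdata = new)

## ...and its restricted mean survival time: the area under the predicted
## curve up to the largest follow-up time.
print(sf, print.rmean = TRUE)
```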
Re: [R] Inefficiency of SAS Programming
I would like to know if we can create a package in which R functions are renamed to be closer to the SAS language. Doing so would help people familiar with SAS take to R straight away for their work, thus decreasing the threshold for acceptance - and then get into deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R coming from SAS. Do we have such a package right now? It basically masks R functions as the equivalent function in another language, just for the ease of users/beginners. For example:

# creating a function for means
procmeans <- function(x, y) {
  summary(subset(x, select = c(x, y)))
}

# creating a function for importing csv
procimport <- function(x, y) {
  read.csv(textConnection(x), row.names = y, na.strings = "")
}

# creating a function for describing data
procunivariate <- function(x) {
  summary(x)
}

regards, ajay www.decisionstats.com

On Fri, Feb 27, 2009 at 4:27 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
> If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
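As a concrete sketch of the proposal (the wrapper name is hypothetical; no such package is claimed to exist), a PROC MEANS-style function could even mimic SAS's BY-statement grouping via aggregate():

```r
## Hypothetical SAS-flavoured wrapper around base R (illustration only)
proc_means <- function(data, var, by = NULL) {
  if (is.null(by)) {
    summary(data[[var]])                       # overall summary
  } else {
    ## BY-group processing, as PROC MEANS does with a BY statement
    aggregate(data[[var]], by = list(by = data[[by]]),
              FUN = function(v) c(mean = mean(v), sd = sd(v)))
  }
}

proc_means(mtcars, "mpg")              # overall summary
proc_means(mtcars, "mpg", by = "cyl")  # grouped, SAS BY-style
```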
Re: [R] Inefficiency of SAS Programming
2009/2/27 Peter Dalgaard p.dalga...@biostat.ku.dk:
> Presumably, something like
>
> IF N. = 1 THEN SUB_N = 1;
> ELSE IF N. < 5 THEN SUB_N = N. - 1;
> ELSE IF N. < 16 THEN SUB_N = N. - 2;
> ELSE SUB_N = N. - 3;
>
> would work, provided that 2, 5, 16 are impossible values. Problem is that it actually makes the code harder to grasp, so experienced SAS programmers go for the dumb but readable code like the above.

I'm not sure which is easier to grasp. When I first saw the original version I thought it was an odd way of doing SUB_N = N.. Only then did I have a closer look and spot the missing 2, 5, and 16. A comment would have been very enlightening. But there was nothing relevant.

In R, the cleanest I can think of is

subn <- match(n, setdiff(1:19, c(2, 5, 16)))

or maybe just

subn <- match(n, c(1, 3:4, 6:15, 17:19))

although

subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))

might be what is really wanted.

I think the important thing with any programming is to make sure what you want is expressed in words somewhere. If not in the code, then in the comments. And operations like this should be abstracted into functions. All the examples of SAS code I've seen seem to fall into the old practice of writing great long 'scripts', with minimal code reuse and encapsulation of useful functionality. If these SAS scripts are then given to new SAS programmers, the chances are they will follow these bad practices. Show them well-written R code (or C, or Python) and maybe they can implement those good practices in their SAS work. Assuming SAS can do that. I'm not sure. Barry

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
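The equivalence between the SAS recode and the match()/setdiff() one-liner can be checked mechanically (a quick sanity test, not part of the original exchange):

```r
## The SAS recode maps N over 1..19, treating 2, 5 and 16 as impossible
sas_recode <- function(n) {
  ifelse(n == 1, 1,
  ifelse(n < 5,  n - 1,      # n in {3, 4}
  ifelse(n < 16, n - 2,      # n in {6, ..., 15}
                 n - 3)))    # n in {17, 18, 19}
}

n    <- c(1, 3:4, 6:15, 17:19)               # the possible values
subn <- match(n, setdiff(1:19, c(2, 5, 16))) # rank among possible values

all(subn == sas_recode(n))   # TRUE: both give the same compacted index
```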
Re: [R] Installing different versions of R simultaneously on Linux
On Fri, Feb 27, 2009 at 12:37 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
> This is really an R-devel question.

Sorry about the wrong list.

> On Fri, 27 Feb 2009, Rainer M Krug wrote:
>> Hi I want to install some versions of R simultaneously from source on a computer (running Linux). Some programs have an option to specify a suffix for the executable (e.g. R would become R-2.7.2 when the suffix is specified as -2.7.2). I did not find this option for R - did I overlook it? If not, how is it possible to have several versions of R on one computer - or is the only way to compile them and then call R in the directory of the version where it was compiled (~/R-2.7.2/bin/R)? If this is the case, would it be possible to add this option to specify the suffix for the executables?
>
> 'R' is not an executable, but a shell script. You can use 'prefix' to install R anywhere, or other variables for more precise control (see the R-admin manual). For example, we use rhome to have R 2.8.x under /usr/local/lib64/R-2.8 etc. And you can rename $prefix/bin/R to, say, R-2.7.2, or link R_HOME/bin/R to anywhere in your path, under any name you choose.

OK - so the procedure will be: if I want to install R 2.7.2 without impacting my existing installation of R (which is done by a package manager), I use

./configure --prefix=/usr/R-2.7.2
make
make install
ln -s /usr/R-2.7.2/bin/R /usr/bin/R-2.7.2

and when I use R-2.7.2 it will start R 2.7.2. I can continue with as many installed versions as I want. Thanks a lot, that was what I was looking for. Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Faculty of Science
Natural Sciences Building
Private Bag X1
University of Stellenbosch
Matieland 7602
South Africa

-- Brian D.
Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Faculty of Science Natural Sciences Building Private Bag X1 University of Stellenbosch Matieland 7602 South Africa __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Installing different versions of R simultaneously on Linux
G'day Rainer,

On Fri, 27 Feb 2009 10:53:12 +0200 Rainer M Krug r.m.k...@gmail.com wrote:

>> What flavour of Linux are we talking about?
> Sorry - I am running SuSE on the machine where I need it.

Sorry, I am not familiar with that flavour; before switching to Debian (and Debian-based distributions), I was using RedHat. And before that Slackware.

>> 4) Run in /opt/src a script that uses update-alternatives --install to install the new version and creates a link from /opt/R/R-x.y.z/bin/R to /opt/bin/R-x.y.z
> How do I do this? I usually call sudo make install. Do I have to use update-alternatives --install R-2.7.1 R 2 if I want to have R-2.7.1 as the second-priority install?

I do the make install step manually; the script just alerts the system that another alternative for the R command was installed. If memory serves correctly, the alternatives mechanism was developed by Debian and adopted by RedHat (or the other way round). I am not sure whether SuSE has adopted this, or a similar system. Essentially, a command, say foo, for which several alternatives exist, is installed on the system in, say, /usr/bin/, as a link to /etc/alternatives/foo, and /etc/alternatives/foo is a link to the actual program that is called. E.g. on my machine I have

ber...@berwin-nus1:~$ update-alternatives --list wish
/usr/bin/wish8.5
/usr/bin/wish8.4

which tells me that wish8.5 and wish8.4 are installed and I could call them explicitly. /usr/bin/wish is a link to /etc/alternatives/wish, and /etc/alternatives/wish will point to either of these two programs (depending on what the system admin decided should be the default, i.e. should be used if a user just types 'wish'). A command like update-alternatives --config wish allows one to configure whether wish should mean wish8.5 or wish8.4. And all that is necessary is to change the link in /etc/alternatives/wish to point at the desired program.
> That is what I need - but I can't find update-alternatives in SuSE

As I said, I do not know whether SuSE offers this alternatives system or a similar one. If it does, perhaps it is just a matter of installing some additional packages? If it offers a different but similar system, then you would have to ask on a SuSE list how that system is maintained and configured. On my machine I would say

apt-file search update-alternatives

to find out which package provides that command, and install that package if it is not yet installed. I am afraid I do not know what the equivalent command on SuSE is.

>> Typing R alone will usually start the most recently installed version (as this will have the highest priority), but I can configure that via sudo update-alternatives --config R. I.e., I can make R run a particular version. Since the update-alternatives step above also registers all the *.info files and man pages, I will also access the documentation of that particular R version (e.g., C-h i in emacs will give me access to the info version of the manuals of the version of R which is run by the R command).
> Exactly what I would like to have.

Well, if you ever use a system that has the alternatives set up and the update-alternatives command, I am happy to share my script with you.

Cheers, Berwin

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Installing different versions of R simultaneously on Linux
On Fri, Feb 27, 2009 at 1:49 PM, Berwin A Turlach ber...@maths.uwa.edu.au wrote:
> [...]
> Well, if you ever use a system that has the alternatives set up and the update-alternatives command, I am happy to share my script with you.

Thanks a lot for the offer - that would be great. I will set it up the same way on my PC with Xubuntu.

Cheers Rainer

> Cheers, Berwin
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys.
(Germany) Centre of Excellence for Invasion Biology Faculty of Science Natural Sciences Building Private Bag X1 University of Stellenbosch Matieland 7602 South Africa __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Advice on graphics to design circle with density-shaded sectors
Hello, I am looking for some general advice on which graphics package to use to make a figure demonstrating my experimental design. I want to design a circle with 7 sectors inside. Then I will want to shade the sectors depending on densities of observations in the sectors. I will also want to draw horizontal lines at increments along the sectors to demonstrate different distances out to the end of the sector. Given this sparse description, does anyone have advice on what package or functions to use in R? Thanks for your help, John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
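One option needing no extra packages is base graphics with polygon(); a rough sketch of the described figure, with invented densities standing in for the real observation counts (higher-level alternatives exist in packages such as plotrix or grid):

```r
## Sketch: a circle split into 7 sectors, shaded by (made-up) densities,
## with dashed arcs marking distance increments -- base graphics only.
dens  <- c(5, 12, 3, 8, 20, 1, 9)        # hypothetical counts per sector
shade <- gray(1 - dens / max(dens))      # darker = denser
ang   <- seq(0, 2 * pi, length.out = 8)  # boundaries of the 7 sectors

plot.new()
plot.window(xlim = c(-1, 1), ylim = c(-1, 1), asp = 1)
for (i in 1:7) {                         # filled sector wedges
  th <- seq(ang[i], ang[i + 1], length.out = 50)
  polygon(c(0, cos(th)), c(0, sin(th)), col = shade[i])
}
for (r in c(0.25, 0.5, 0.75)) {          # distance increments (arcs)
  th <- seq(0, 2 * pi, length.out = 200)
  lines(r * cos(th), r * sin(th), lty = 2)
}
```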
[R] How to get input-data of ROCR
Hi, I have a problem while using the ROCR package in R. I can understand the three main commands, but I can't understand the input format, including ROCR.hiv, ROCR.simple and ROCR.xval (actually, not only the format, but also how to get this data).

## vectors (scores: numeric; labels: 0 or 1), multiple runs (cross-validation, bootstrapping, ...)

What are the scores? I use randomForest on Windows XP, but can't obtain such data. Would you please give me some details about the data? It would be even better if you could show me some examples. Versions: R 2.8.0, ROCR 1.0-2, randomForest 4.5-28. Best wishes, Jiamin Shaw 2009.2.28

2009-02-27 bioshaw

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
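To illustrate what ROCR means by "scores": they are any continuous classifier output, e.g. a randomForest class probability, and prediction() pairs them with the true labels. A sketch with simulated data (it assumes the ROCR and randomForest packages are installed):

```r
## What ROCR wants: one numeric score per case, plus the true 0/1 label.
library(randomForest)
library(ROCR)

set.seed(42)
train   <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$y <- factor(ifelse(train$x1 + train$x2 + rnorm(200) > 0, 1, 0))

fit    <- randomForest(y ~ x1 + x2, data = train)
scores <- predict(fit, type = "prob")[, "1"]  # out-of-bag class-1 probabilities
labels <- train$y                             # true classes

pred <- prediction(scores, labels)            # ROCR's input object
perf <- performance(pred, "tpr", "fpr")       # ROC curve
plot(perf)
```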
[R] add absolute value to bars in barplot
Hello,

barplot(twcons.area, beside = TRUE, col = c("green4", "blue", "red3", "gray"),
        xlab = "estate", ylab = "number of persons", ylim = c(0, 110),
        legend.text = c("treated", "mix", "untreated", NA))

produces a barplot very nicely. In addition, I'd like to get the bars' absolute values on top of the bars. How can I produce this in an easy way? Thanks Sören

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
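One easy approach: barplot() invisibly returns the x-coordinates of the bar midpoints, which text() can reuse (a sketch with toy data, not the poster's twcons.area; with beside = TRUE the same idea works because both the return value and the heights are matrices):

```r
## Label each bar with its value, placed just above the bar top
heights <- c(30, 55, 80, 42)
bp <- barplot(heights, ylim = c(0, 100),
              names.arg = c("A", "B", "C", "D"))
text(bp, heights, labels = heights, pos = 3)  # pos = 3: above the point
```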
[R] help with correct use of function lsfit
To the purpose of fitting a 2nd-order polynomial (a + b*x + c*x^2) to the chunk of signal falling in a window of 17 consecutive samples, I wrote the following very crude script. Since I have no previous experience of using least-squares fitting with R, I would appreciate your supervision and suggestions. I guess the returned coefficients of the polynomial are: a = -1.3191398, b = 0.1233055, c = 0.9297401. Thank you very much in advance, Maura

## Main
tms <- t(read.table("signal877cycle1.txt"))
J <- ilogb(length(tms), base = 2) + 1
y <- c(tms, rep(0, 2^J - length(tms)))
y.win <- tms.ext[1:17]
ls.mat <- matrix(nrow = length(y.win), ncol = 3, byrow = TRUE)
dt <- 0.033
ls.mat[, 1] <- 1
ls.mat[, 2] <- seq(0, dt*(length(y.win) - 1), dt)
ls.mat[, 3] <- ls.mat[, 2]^2

> y
[1] -1.29882462 -1.29816465 -1.29175902 -1.33508315 -1.31905086 -1.30246447 -1.25496640 -1.25858566 -1.19862868
[10] -1.16985809 -1.15755035 -1.15627040 -1.10929231 -1.09324296 -1.07202676 -1.03543530 -1.00609649 -0.96931799
[19] -0.96014189 -0.93879923 -0.89472101 -0.86568807 -0.86394226 -0.83804684 -0.79226517 -0.74804696 -0.69506558
[28] -0.63984135 -0.57677266 -0.52376371 -0.48793752 -0.44261935 -0.37505621 -0.30538492 -0.19309771 -0.07859412
[37] -0.01879655 0.04247391 0.09565881 0.17329566 0.29132263 0.38380712 0.45016443 0.50107765 0.57413940
[46] 0.68835476 0.78369090 0.83756871 0.87753415 0.92834503 0.99560230 1.08055356 1.17121517 1.22967280
[55] 1.25791166 1.28749046 1.31672692 1.33188866 1.35420775 1.37356226 1.38792638 1.40398573 1.41558702
[64] 1.39204622 1.39848595 1.39902593 1.40604565 1.42092504 1.41436531 1.3843 1.36012986 1.32950875
[73] 1.26507137 1.25315597 1.18249472 1.08857029 0.98782261 0.90470599 0.83081192 0.77709116 0.65228917
[82] 0.51844166
0.44530462 0.39562664 0.30153281 0.17979539 0.09895985 0.04306094 -0.03937571 -0.14150334
[91] -0.25936679 -0.31480454 -0.38806157 -0.47389691 -0.50785671 -0.58179371 -0.67538285 -0.74246719 -0.78380551
[100] -0.83894328 -0.86450224 -0.90614055 -0.93751928 -0.99679687 -1.03205956 -1.06616465 -1.06651404 -1.14997066
[109] -1.18338930 -1.21335809 -1.20208854 -1.22370767 -1.23488486 -1.25112655 -1.26942581 -1.26792234 -1.28838504
[118] -1.28799329 -1.27326566 -1.28502518 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[127] 0.00000000 0.00000000

> y.win
[1] -1.298825 -1.298165 -1.291759 -1.335083 -1.319051 -1.302464 -1.254966 -1.258586 -1.198629 -1.169858 -1.157550
[12] -1.156270 -1.109292 -1.093243 -1.072027 -1.035435 -1.006096

> ls.mat
      [,1]  [,2]     [,3]
 [1,]    1 0.000 0.000000
 [2,]    1 0.033 0.001089
 [3,]    1 0.066 0.004356
 [4,]    1 0.099 0.009801
 [5,]    1 0.132 0.017424
 [6,]    1 0.165 0.027225
 [7,]    1 0.198 0.039204
 [8,]    1 0.231 0.053361
 [9,]    1 0.264 0.069696
[10,]    1 0.297 0.088209
[11,]    1 0.330 0.108900
[12,]    1 0.363 0.131769
[13,]    1 0.396 0.156816
[14,]    1 0.429 0.184041
[15,]    1 0.462 0.213444
[16,]    1 0.495 0.245025
[17,]    1 0.528 0.278784

## usage: lsfit(x, y, wt = NULL, intercept = TRUE, tolerance = 1e-07, yname = NULL)

> lsfit(ls.mat, y.win, wt = NULL, intercept = TRUE, tolerance = 1e-07, yname = NULL)
$coefficients
 Intercept         X1         X2         X3
-1.3191398  0.1233055  0.9297401  0.0000000

$residuals
[1] 0.020315146 0.015893550 0.015192628 -0.037263015 -0.032387216 -0.028982296 0.003309337 -0.017541342
[9] 0.023159250 0.030648485 0.019649885 -0.004401476 0.015220334 0.001888425 -0.008301609 -0.005141358
[17] -0.011258729

$intercept
[1] TRUE

$qr
$qt
[1] 4.937370523 0.409411205 -0.089144866 -0.041892736 -0.035696706 -0.031176843 0.002024443 -0.018121872
[9] 0.023077794 0.030860815 0.019950712 -0.004217443 0.015082286 0.001223006 -0.009699688 -0.007477386
[17] -0.014737995

$qr
      Intercept          X2          X3            X1
[1,] -4.1231056 -1.08849989 -0.39512546 -4.123106e+00
[2,]  0.2425356  0.66656733  0.35194755  1.558035e-17
[3,]  0.2425356  0.21973588 -0.09588149  1.787189e-17
[4,]  0.2425356  0.17022850
-0.10350966 -2.990539e-17 [5,] 0.2425356 0.12072112 -0.19811319 2.906411e-01 [6,] 0.2425356 0.07121375 -0.27000118 2.654896e-01 [7,] 0.2425356 0.02170637 -0.31917362 2.457966e-01 [8,] 0.2425356 -0.02780101 -0.34563052 2.315620e-01 [9,] 0.2425356 -0.07730838 -0.34937188 2.227859e-01 [10,] 0.2425356 -0.12681576 -0.33039769 2.194681e-01 [11,] 0.2425356 -0.17632314 -0.28870796 2.216089e-01 [12,] 0.2425356 -0.22583052 -0.22430269 2.292080e-01 [13,] 0.2425356 -0.27533789
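For reference, a more idiomatic route for the same quadratic fit (a sketch, not from the original post) is to let lm() build the design matrix, so no hand-made matrix of ones, t and t^2 is needed; the simulated signal below stands in for the poster's data file:

```r
# Quadratic least-squares fit over a 17-sample window, using lm() instead
# of building the design matrix by hand. The data here are simulated.
set.seed(42)
dt <- 0.033
t  <- seq(0, dt * 16, by = dt)                 # 17 sample times
y.win <- -1.32 + 0.12 * t + 0.93 * t^2 + rnorm(17, sd = 0.02)

fit <- lm(y.win ~ poly(t, 2, raw = TRUE))
coef(fit)   # (Intercept) = a, first-order term = b, second-order term = c
```

raw = TRUE gives coefficients on the plain a + b*t + c*t^2 scale; without it, poly() uses orthogonal polynomials, which are numerically safer but harder to read off directly.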
Re: [R] Installing different versions of R simultaneously on Linux
G'day Rainer, On Fri, 27 Feb 2009 14:06:20 +0200 Rainer M Krug r.m.k...@gmail.com wrote: Thanks a lot for the offer - that would be great. I will set it up the same way on my PC with Xubuntu. Script is attached. Ignore the comments at the beginning; they are there just to remind me what ./configure line I usually use, possible variations, and whether to edit config.site or work with environment variables. After the make install step, I edit the variables VERSION and PRIORITY in this file and then run the script as root. Note that VERSION should be the same number as the one specified in the ./configure line. As long as the configuration of a command is set to 'auto', the alternative with the highest priority is used. So make sure that the newest version of R has the highest priority; I usually set the priority to xyz for R-x.y.z (and keep my fingers crossed that there will never be a release with either y or z larger than 9, otherwise I will have to refine my scheme). To use this on a new machine, you have to create /opt/info, /opt/man/man1 and /opt/bin before running the script the first time (IIRC). It also helps to copy /opt/R/R-$VERSION/share/info/dir to /opt/info/dir so that emacs will include the info files in the list that you get with C-h i (this has to be done only once; the dir file does not seem to change between R versions). Prior to 2.5.0 the man and info files were installed in R-$VERSION/man and R-$VERSION/info instead of R-$VERSION/share/man and R-$VERSION/share/info, respectively. I have a separate script for those versions (but don't install such old versions anymore). How far do you want to go back? Also, much earlier, if memory serves correctly, R-exts.info came in 2 parts instead of 3; but I don't seem to have my script from that time anymore. I think that's all. Let me know if you run into trouble or need more help.
Cheers, Berwin #!/bin/bash ##Configure with the following options: ## ## ./configure --prefix=/opt/R/R-2.8.1 --with-blas --with-lapack --enable-R-shlib r_arch=32 ## ## other possible options: ## r_arch=32 and r_arch=64 ## --enable-R-shlib ## ## export JAVA_HOME=/where/is/sun/java (/usr/lib/jvm/java-1.6-sun) ## above not necessary, use config.site instead. ## ##Then as root: ## VERSION=devel ## PRIORITY=100 VERSION=2.8.1 PRIORITY=281 update-alternatives --install /opt/bin/R R /opt/R/R-$VERSION/bin/R $PRIORITY \ --slave /opt/man/man1/R.1 R.1 /opt/R/R-$VERSION/share/man/man1/R.1 \ --slave /opt/info/R-FAQ.info.gz R-FAQ.info /opt/R/R-$VERSION/share/info/R-FAQ.info.gz \ --slave /opt/info/R-admin.info.gz R-admin.info /opt/R/R-$VERSION/share/info/R-admin.info.gz \ --slave /opt/info/R-data.info.gz R-data.info /opt/R/R-$VERSION/share/info/R-data.info.gz \ --slave /opt/info/R-exts.info.gz R-exts.info /opt/R/R-$VERSION/share/info/R-exts.info.gz \ --slave /opt/info/R-exts.info-1.gz R-exts.info-1 /opt/R/R-$VERSION/share/info/R-exts.info-1.gz \ --slave /opt/info/R-exts.info-2.gz R-exts.info-2 /opt/R/R-$VERSION/share/info/R-exts.info-2.gz \ --slave /opt/info/R-intro.info.gz R-intro.info /opt/R/R-$VERSION/share/info/R-intro.info.gz \ --slave /opt/info/R-lang.info.gz R-lang.info /opt/R/R-$VERSION/share/info/R-lang.info.gz \ --slave /opt/info/R-ints.info.gz R-ints.info /opt/R/R-$VERSION/share/info/R-ints.info.gz ln -sf /opt/R/R-$VERSION/bin/R /opt/bin/R-$VERSION __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sweave doesn't do csv.get()
Hi Everybody. I use R 2.8.0 on Mac OS X. I set up LyX 1.6.1 to use Sweave today. I can compile the test file I found on CRAN (http://cran.r-project.org/contrib/extra/lyx/) without a problem and the output looks very nice. In the test file the following R code is used:

<<myFirstChunkInLyX>>=
xObs <- 100; xMean <- 10; xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)
@

that should be the same as:

xObs <- 100
xMean <- 10
xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)

in the R console. My problem is that I want to import data to use in my report. In the R source I currently use to analyse my data, I import it through csv.get(). I have found that I cannot use csv.get(), or write.csv() for that matter. I don't seem to be able to use load() to get a .rda file either. Is this issue related to LyX, LaTeX or R? Thanks in advance, Christiaan
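Two things worth checking here (a sketch, not a diagnosis from the thread): csv.get() lives in the Hmisc package, so the chunk must load it first, and Sweave evaluates chunks in the directory it is run from, so relative file paths can fail. The file name below is invented for illustration; the block simply verifies that CSV I/O itself works:

```r
# Minimal self-contained check that CSV round-tripping works in a chunk:
# write a small file, then read it back. "demo.csv" is a made-up name.
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))
write.csv(df, "demo.csv", row.names = FALSE)

back <- read.csv("demo.csv")   # csv.get() would also work after library(Hmisc)
back
```

If this chunk compiles but the real one fails, the problem is the path or a missing library() call, not Sweave itself.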
Re: [R] add absolute value to bars in barplot
On Fri, Feb 27, 2009 at 01:32:45PM +0100, soeren.vo...@eawag.ch wrote:

barplot(twcons.area, beside = TRUE, col = c("green4", "blue", "red3", "gray"),
        xlab = "estate", ylab = "number of persons", ylim = c(0, 110),
        legend.text = c("treated", "mix", "untreated", NA))

produces a barplot very fine. In addition, I'd like to get the bars' absolute values on top of the bars. How can I produce this in an easy way?

barplot() returns a vector of midpoints, so you can use text() to add the annotation. There is an example in the manual page of barplot:

mp <- barplot(VADeaths)
tot <- colMeans(VADeaths)
text(mp, tot + 3, format(tot), xpd = TRUE, col = "blue")

cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan 85350 Freising, Germany http://mips.gsf.de/staff/pagel
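The same trick carries over to the grouped (beside = TRUE) case from the question: barplot() then returns a matrix of midpoints with the same shape as the data, so text() can label every bar. A sketch with made-up data standing in for twcons.area:

```r
# Label every bar of a grouped barplot with its value.
m  <- matrix(c(30, 45, 12, 50, 70, 25), nrow = 3,
             dimnames = list(c("treated", "mix", "untreated"), c("A", "B")))
mp <- barplot(m, beside = TRUE, ylim = c(0, 110), legend.text = rownames(m))
text(mp, m + 3, labels = m, xpd = TRUE)   # mp and m have matching shapes
```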
Re: [R] Axis-question
solved by grouping... (see my next mail) Antje wrote: Hi there, I was wondering whether it's possible to generate an axis with groups (like in Excel), so that you can have something like this as the x-axis (for example for the levelplot method of the lattice package):

| X1 | X2 | X3 | X1 | X2 | X3 | X1 | ...
|   group1   |   group2   |  group3 ...

I hope you understand what I'm looking for?
Re: [R] Inefficiency of SAS Programming
Wensui Liu wrote: Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use so much hard-coding, which is ad hoc and difficult to maintain. I agree with you that in the SAS code it is a little too much to evaluate predictions; such a complex data step can actually be replaced by simpler IML code.

Agreed that the SAS code could have been much better. I programmed in SAS for 23 years and would have done it much differently. But you will find that the most elegant SAS program re-write will still be a far cry from the elegance of R. Frank

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
[R] levelplot help needed
Hi there, I'm looking for someone who can give me some hints on how to make a nice levelplot. As an example, I have the following code:

# create some example data
# --
library(lattice)   # needed for levelplot()
xl <- 4
yl <- 10
my.data <- sapply(1:xl, FUN = function(x) { rnorm(yl, mean = x) })
x_label <- rep(c("X Label 1", "X Label 2", "X Label 3", "X Label 4"), each = yl)
y_label <- rep(paste("Y Label ", 1:yl, sep = ""), xl)
df <- data.frame(x_label = factor(x_label), y_label = factor(y_label), values = as.vector(my.data))
df1 <- data.frame(df, group = rep("Group 1", xl*yl))
df2 <- data.frame(df, group = rep("Group 2", xl*yl))
df3 <- data.frame(df, group = rep("Group 3", xl*yl))
mdf <- rbind(df1, df2, df3)

# plot
# --
graph <- levelplot(mdf$values ~ mdf$x_label * mdf$y_label | mdf$group,
                   aspect = "xy", layout = c(3,1),
                   scales = list(x = list(labels = substr(levels(factor(mdf$x_label)), 0, 5), rot = 45)))
print(graph)
# --

(I need to put these strange x-labels because in my real data the values of the x-labels are too long, and I just want to display the first 10 characters as the label.)

My questions: * I'd like to start with Y Label 1 in the upper row (that's a more general issue: how can I influence the order of x, y, and groups?) * I'd like to put the groups at the bottom. Can anybody give me some help? Antje
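On the ordering question: lattice draws axis values and panels in the order of the underlying factor levels, counting from the bottom-left, so reordering the levels is the usual lever. A sketch on a toy factor (not from the thread):

```r
# Control display order in lattice by setting factor levels explicitly.
y_label <- paste("Y Label", 1:10)
f <- factor(y_label, levels = rev(y_label))   # reversed levels: "Y Label 1"
                                              # becomes the last level, i.e. it
                                              # would land in the top row
levels(f)[1]
```

The same idea applies to the group factor (and index.cond can reorder panels without touching the data).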
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote: Sometimes, for the sake of simplicity, SAS coding is created like that. One can use the concatenate function and drag-and-drop in a simple Excel sheet to create elaborate SAS code like the one mentioned, in hardly any time at all.

A system that requires Excel for its success is not a complete system.

There are multiple ways to do this in SAS, much better, and similarly in R. There are many areas where SAS programmers would find R not so useful --- for example, the equivalent of PROC LOGISTIC for creating a logistic model.

Really? Try this in SAS:

library(Design)
f <- lrm(death ~ rcs(age, 5) * sex)
anova(f)    # get test of nonlinearity of interactions among other things
nomogram(f) # depict model graphically

The restricted cubic spline in age, i.e., assuming the age relationship is smooth but not much else, is very easy to code in R. There are many other automatic transformations available. The lack of generality of the SAS language makes many SAS users assume linearity far more often than R users do. Also note that PROC LOGISTIC, without invocation of a special option, would make the user believe that older subjects have lower chances of dying, as SAS by default takes the event being predicted to be death=0. Frank

On Fri, Feb 27, 2009 at 10:21 AM, Wensui Liu liuwen...@gmail.com wrote: Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use so much hard-coding, which is ad hoc and difficult to maintain. I agree with you that in the SAS code it is a little too much to evaluate predictions; such a complex data step can actually be replaced by simpler IML code.
On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University

-- === WenSui Liu Acquisition Risk, Chase Blog: statcompute.spaces.live.com "I can calculate the motion of heavenly bodies, but not the madness of people." -- Isaac Newton ===

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Gerard M. Keogh wrote: Frank, I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS - it's a powerful and flexible language but unfortunately very expensive. Your example mentions doing a vector product in the macro language - this only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices - macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML. In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

b = solve(X`X, X`y) ; rss = sqrt(ssq(y - X*b)) ;

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW, SPSS also has a matrix language. Gerard

But try this:

PROC IML;
... some custom user code ...
... loop over j=1 to 10 ...
... PROC GENMOD, output results back to IML ...

IML is only a partial solution since it is not integrated with the PROC step. Frank

Frank E Harrell Jr wrote on 26/02/2009 22:57: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model.
I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote: I would like to know if we can create a package in which R functions are renamed closer to the SAS language. Doing so would help people familiar with SAS to take to R straight away for their work, thus decreasing the threshold for acceptance - and then get into deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS. Do we have such a package right now? It would basically mask R functions with the equivalent function in another language, just for user ease / beginners. For example:

# creating a function for means
procmeans <- function(x, y) {
  summary(subset(x, select = c(x, y)))
}
# creating a function for importing csv
procimport <- function(x, y) {
  read.csv(textConnection(x), row.names = y, na.strings = "")
}
# creating a function for describing data
procunivariate <- function(x) {
  summary(x)
}

regards, ajay

Ajay, This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby. Frank

www.decisionstats.com

On Fri, Feb 27, 2009 at 4:27 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.
Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
[R] Ordinal Mantel-Haenszel type inference
Hello, I am searching for an R package that implements an extension of the Mantel-Haenszel test for ordinal data, as described in Liu and Agresti (1996), "A Mantel-Haenszel type inference for cumulative odds ratios", Biometrics. I see packages such as Epi that perform it for binary data and derive a variance for it using the Robins and Breslow variance method, as well as another package that derives it for nominal variables but does not provide a variance or confidence limit. Does a package exist that does this? I have searched the list archives and can't seem to find such a package, but I could be missing something. Thank you. Yours sincerely, Jourdan
[R] how can I compare two vector by a factor
Hi, I used wilcox.test to carry out a Mann-Whitney test with paired=FALSE. However, I want to compare two variables, e.g. pre and post, grouped by treatment. Does anyone have experience with this? Thanks! Xin
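One common pattern for a within-group comparison like this (a sketch with simulated data, not from the thread) is to split the data frame by the treatment factor and run wilcox.test() inside each group:

```r
# Mann-Whitney test (wilcox.test, paired = FALSE) of pre vs post,
# run separately within each treatment group. Data are simulated.
set.seed(7)
df <- data.frame(pre       = rnorm(40, mean = 10),
                 post      = rnorm(40, mean = 11),
                 treatment = rep(c("ctrl", "drug"), each = 20))

results <- lapply(split(df, df$treatment),
                  function(d) wilcox.test(d$pre, d$post, paired = FALSE))
results[["drug"]]$p.value   # one p-value per treatment group
```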
Re: [R] Singularity in a regression?
If collinearity exists, one of the solutions is a regularized version of regression. There are different types of regularization methods, like ridge, LASSO, elastic net, etc. For example, the MASS package provides ridge regression. Alex

On Thu, Feb 26, 2009 at 1:58 PM, Bob Gotwals gotw...@ncssm.edu wrote: R friends, In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?

lm(formula = activity ~ metaF + metaCl + metaBr + metaI + metaMe + paraF + paraCl + paraBr + paraI + paraMe)

Residuals:
       Min         1Q     Median         3Q        Max
-4.573e-01 -7.884e-02  3.469e-17  6.616e-02  2.427e-01

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   7.9173     0.1129  70.135  < 2e-16 ***
metaF        -0.3973     0.2339  -1.698 0.115172
metaCl            NA         NA      NA       NA
metaBr        0.3454     0.1149   3.007 0.010929 *
metaI         0.4827     0.2339   2.063 0.061404 .
metaMe        0.3654     0.1149   3.181 0.007909 **
paraF         0.7675     0.1449   5.298 0.000189 ***
paraCl        0.3400     0.1449   2.347 0.036925 *
paraBr        1.0200     0.1449   7.040 1.36e-05 ***
paraI         1.3327     0.2339   5.697 9.96e-05 ***
paraMe        1.2191     0.1573   7.751 5.19e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2049 on 12 degrees of freedom
Multiple R-squared: 0.9257, Adjusted R-squared: 0.8699
F-statistic: 16.61 on 9 and 12 DF, p-value: 1.811e-05
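The NA row for metaCl in output like the above means that its column is an exact linear combination of the other columns, so lm() cannot estimate it; alias() names the dependency. A toy reproduction with made-up 0/1 data:

```r
# x3 is an exact linear combination of x1 and x2, so lm() reports NA for
# its coefficient (like metaCl above); alias() shows the dependency.
x1 <- rep(0:1, each = 10)
x2 <- rep(0:1, times = 10)
x3 <- x1 + x2                    # exact collinearity, by construction
set.seed(1)
y <- 1 + x1 + 2 * x2 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3)
coef(fit)    # x3 comes back NA
alias(fit)   # reports x3 = x1 + x2
```

With 0/1 indicators this usually comes from the dummy-variable trap: if the indicators for one position sum to a constant, one of them is redundant and should be dropped.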
Re: [R] question about 3-d plot
Hi Deepankar, The code on the following page looks kind of cool, and also seems to produce something of the type of graph you are after perhaps: https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/rgl/demo/regression.r?rev=702root=rglsortby=dateview=auto [below is a copy of the code...]

library(rgl)
# demo: regression
# author: Daniel Adler
rgl.demo.regression <- function(n=100, xa=3, za=8, xb=0.02, zb=0.01,
                                xlim=c(0,100), zlim=c(0,100)) {
  rgl.clear("all")
  rgl.bg(sphere = TRUE, color = c("black", "green"), lit = FALSE,
         size = 2, alpha = 0.2, back = "lines")
  rgl.light()
  rgl.bbox()
  x  <- runif(n, min=xlim[1], max=xlim[2])
  z  <- runif(n, min=zlim[1], max=zlim[2])
  ex <- rnorm(n, sd=3)
  ez <- rnorm(n, sd=2)
  esty <- (xa + xb*x) * (za + zb*z) + ex + ez
  rgl.spheres(x, esty, z, color="gray", radius=1, specular="green",
              texture=system.file("textures/bump_dust.png", package="rgl"),
              texmipmap=TRUE, texminfilter="linear.mipmap.linear")
  regx <- seq(xlim[1], xlim[2], len=100)
  regz <- seq(zlim[1], zlim[2], len=100)
  regy <- (xa + regx*xb) %*% t(za + regz*zb)
  rgl.surface(regx, regz, regy, color="blue", alpha=0.5, shininess=128)
  lx <- c(xlim[1], xlim[2], xlim[2], xlim[1])
  lz <- c(zlim[1], zlim[1], zlim[2], zlim[2])
  f  <- function(x, z) { (xa + x*xb) * t(za + z*zb) }
  ly <- f(lx, lz)
  rgl.quads(lx, ly, lz, color="red", size=5, front="lines", back="lines", lit=FALSE)
}
rgl.open()
rgl.demo.regression()

On Feb 27, 5:28 am, Dipankar Basu basu...@gmail.com wrote: Hi R Users, I have produced a simulated scatter plot of y versus x tightly clustered around the 45 degree line through the origin with the following code:

x <- seq(1, 100)
y <- x + rnorm(100, 0, 10)
plot(x, y, col = "blue")
abline(0, 1)

Is there some way to generate a 3-dimensional analogue of this? Can I get a similar simulated scatter plot of points in 3 dimensions where the points are clustered around a plane through the origin, where the plane in question is the 3-dimensional analogue of the 45 degree line through the origin?
Deepankar
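For the original question, the 3-D analogue can be simulated directly without rgl: scatter points around the plane y = x + z through the origin and, if desired, draw them with scatterplot3d or lattice's cloud(). A base-R sketch (the plotting line is commented out, the fit just verifies the construction):

```r
# Points scattered around the plane y = x + z through the origin,
# the 3-D analogue of points around the 45-degree line.
set.seed(123)
x <- runif(200, 0, 100)
z <- runif(200, 0, 100)
y <- x + z + rnorm(200, sd = 10)

fit <- lm(y ~ x + z)
coef(fit)   # intercept near 0, both slopes near 1
# library(scatterplot3d); scatterplot3d(x, y, z)   # one way to draw them
```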
Re: [R] Sweave doesn't do csv.get()
christiaan pauw wrote: Hi Everybody. I use R 2.8.0 on Mac OS X. I set up LyX 1.6.1 to use Sweave today. I can compile the test file I found on CRAN (http://cran.r-project.org/contrib/extra/lyx/) without a problem and the output looks very nice. In the test file the following R code is used:

<<myFirstChunkInLyX>>=
xObs <- 100; xMean <- 10; xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)
@

that should be the same as:

xObs <- 100
xMean <- 10
xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)

in the R console. My problem is that I want to import data to use in my report. In the R source I currently use to analyse my data, I import it through csv.get(). I have found that I cannot use csv.get(), or write.csv() for that matter. I don't seem to be able to use load() to get a .rda file either. Is this issue related to LyX, LaTeX or R? Thanks in advance, Christiaan

I didn't see the library(Hmisc) statement in your code that would give you access to csv.get. This should be unrelated to LyX, Sweave, etc. Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Yes Frank, I accept your point, but nevertheless IML is the proper place for matrix work in SAS - mixing macro-level logic and computation is another question - R is certainly more seamless in this respect. Gerard

Frank E Harrell Jr wrote on 27/02/2009 13:55: Gerard M. Keogh wrote: Frank, I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS - it's a powerful and flexible language but unfortunately very expensive. Your example mentions doing a vector product in the macro language - this only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices - macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML. In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

b = solve(X`X, X`y) ; rss = sqrt(ssq(y - X*b)) ;

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW, SPSS also has a matrix language. Gerard

But try this:

PROC IML;
... some custom user code ...
... loop over j=1 to 10 ...
... PROC GENMOD, output results back to IML ...

IML is only a partial solution since it is not integrated with the PROC step.
Frank

Frank E Harrell Jr wrote on 26/02/2009 22:57: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
[R] Will ctv package work on ubuntu?
Hi ho: I had used the ctv package on a Windows setup of R, and I was wondering about Ubuntu. Certainly under Windows it has an easy time of it because there is only one library folder to scan for existing packages. Would its install.views and update.views functions work in Ubuntu, where the packages are split between the library established by r-cran downloads from Synaptic and the default library used by 'conventional' downloads using install.packages? If it can't handle that distinction between a Windows and a Linux situation, is it a package I should remove for now? Regards... -- Brian Lunergan Nepean, Ontario Canada
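One relevant detail (a sketch, not a tested answer for ctv): R itself already searches every directory on .libPaths() when it looks for installed packages, so a split between a distribution-managed library and a personal one is normal on Linux. A quick way to inspect what R sees (output differs per machine):

```r
# Inspect the library search path; installed.packages() scans all of its
# entries, so packages in distro-managed and personal libraries both count.
.libPaths()                              # every library tree R will search
libs <- installed.packages()[, "LibPath"]
head(unique(libs))                       # which trees actually hold packages
```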
Re: [R] Inefficiency of SAS Programming
on 02/27/2009 07:57 AM Frank E Harrell Jr wrote: Ajay ohri wrote: I would like to know if we can create a package in which R functions are renamed closer to the SAS language. Doing so would help people familiar with SAS to take to R straight away for their work, thus decreasing the threshold for acceptance - and then get into deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS. Do we have such a package right now? It would basically mask R functions with the equivalent function in another language, just for user ease / beginners. For example:

# creating a function for means
procmeans <- function(x, y) {
  summary(subset(x, select = c(x, y)))
}
# creating a function for importing csv
procimport <- function(x, y) {
  read.csv(textConnection(x), row.names = y, na.strings = "")
}
# creating a function for describing data
procunivariate <- function(x) {
  summary(x)
}

regards, ajay

Ajay, This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby. Frank

I wholeheartedly agree with Frank here. It may be one thing to have a translation process in place based upon some form of logical mapping between the two languages (as Bob's book provides). But it is another thing entirely to actually start writing functions that provide wrappers modeled on SAS-based PROCs. If you do this, then you only serve to obfuscate the fundamental philosophical and functional differences between the two languages and doom a new useR to missing all of R's benefits. They will continue to try to figure out how to use R based upon their SAS intuition rather than developing a new set of coding and even statistical paradigms. Having been through the SAS to S/R transition myself, having used SAS for much of the 90's and now having used R for over 7 years, I can speak from personal experience and state that the only way to achieve the requisite proficiency with R is immersion therapy.
Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Download daily weather data
Geonames unfortunately doesn't have weather forecasts. This is a problem. GRIB looks better. There is an interface between GRIB and R.

On Fri, Feb 27, 2009 at 4:14 AM, Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com wrote: Dear Thomas, more for the sake of completeness and as an alternative to R: there are GRIB data sets [1] available (some for free) and there is the GPL software Grads [2]. Because the GRIB format is well documented, it should be possible to get it into R easily and make up your own plots/weather analysis. I do not know and have not checked whether somebody has already done so. I use this information/these tools alongside others during longer-dated off-shore sailing. Best, Bernhard [1] http://www.grib.us/ [2] http://www.iges.org/grads/

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Scillieri, John Sent: Thursday, 26 February 2009 22:58 To: 'James Muller'; 'r-help@r-project.org' Subject: Re: [R] Download daily weather data Looks like you can sign up to get XML feed data from Weather.com http://www.weather.com/services/xmloap.html Hope it works out!

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Muller Sent: Thursday, February 26, 2009 3:57 PM To: r-help@r-project.org Subject: Re: [R] Download daily weather data Thomas, Have a look at the source code for the webpage (ctrl-u in Firefox; don't know in Internet Explorer, etc.). That is what you'd have to parse in order to get the forecast from this page. Typically when I parse webpages such as this I use regular expressions to do so (and I would never downplay the usefulness of regular expressions, but they take a little getting used to). There are two parts to the task: find patterns that allow you to pull out the datum/data you're after; and then write a program to pull it/them out. Also, of course, download the webpage (but that's no issue). 
I bet you'd be able to find a comma separated value (CSV) file containing the weather report somewhere, which would probably involve a little less labor in order to produce your automatic wardrobe advice. James

On Thu, Feb 26, 2009 at 3:47 PM, Thomas Levine thomas.lev...@gmail.com wrote: I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions. The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them. Someone must have needed to do this before. Suggestions? Thomas Levine!
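[Editor's note] James's two-part recipe (find a pattern, then write a program to pull the value out) can be sketched in base R. The HTML fragment and pattern below are invented for illustration only; they are not NOAA's actual markup, and a real page would need its own pattern.

```r
# Hedged sketch of the regular-expression approach described above, run on a
# made-up fragment of HTML rather than a live download.
page <- c("<td>Temperature</td>", "<td>41 F</td>", "<td>Dew point</td>")

# Step 1: find lines that look like a temperature reading.
hit <- grep("[0-9]+ F", page, value = TRUE)

# Step 2: extract the number itself with a capture group.
temp <- as.numeric(sub(".*>([0-9]+) F.*", "\\1", hit))
temp
# [1] 41
```

In practice the page would first be fetched, e.g. with readLines() on the URL, before applying the same grep/sub steps.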
Re: [R] Sweave doesn't do csv.get()
It works now. Your help is much appreciated Christiaan

2009/2/27 Frank E Harrell Jr f.harr...@vanderbilt.edu christiaan pauw wrote: Hi Everybody I use R 2.8.0 on Mac OS X. I set up LyX 1.6.1 to use Sweave today. I can compile the test file I found on CRAN (http://cran.r-project.org/contrib/extra/lyx/) without a problem and the output looks very nice. In the test file the following R code is used:

<<myFirstChunkInLyX>>=
xObs <- 100; xMean <- 10; xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)
@

that should be the same as:

xObs <- 100
xMean <- 10
xVar <- 9
x <- rnorm(n=xObs, mean=xMean, sd=sqrt(xVar))
mean(x)

in the R console. My problem is that I want to import data to use in my report. In the R source I currently use to analyse my data, I import it through csv.get(). I have found that I cannot use csv.get(), or write.csv() for that matter. I don't seem to be able to use load() to get a .rda file in either. Is this issue related to LyX, LaTeX or R? Thanks in advance Christiaan

I didn't see the library(Hmisc) statement in your code that would give you access to csv.get. This should be unrelated to LyX, Sweave, etc. Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
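[Editor's note] Frank's diagnosis can be made concrete with a minimal chunk sketch. This assumes the Hmisc package is installed; the file name "mydata.csv" is a placeholder, not from the thread.

```r
# Sketch of the fix Frank describes: csv.get() is provided by the Hmisc
# package, so any Sweave chunk using it must load Hmisc first.
# "mydata.csv" is a hypothetical file name.
library(Hmisc)
mydata <- csv.get("mydata.csv")
summary(mydata)
```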
Re: [R] Download daily weather data
Can I just say, it's great to see the R community really come out in support of such a noble and worthy cause as this :). Downfall of civilization, all that. Not here, no! James

On Thu, Feb 26, 2009 at 3:47 PM, Thomas Levine thomas.lev...@gmail.com wrote: I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions. The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them. Someone must have needed to do this before. Suggestions? Thomas Levine!
Re: [R] Inefficiency of SAS Programming
I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports.

To be fair, though, there are a lot of other factors (besides efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error-prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high-level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate-level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year.

There are other points that could be made, but the main idea is I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.
Re: [R] ftp fetch using RCurl?
I am using RCurl, version 0.9-4, under Windows. I cannot find the function getURLContent(). Has it been renamed? Or is it in a different version? Also, in the reference manual on CRAN under package RCurl, I found a function getBinaryURL() documented, but it cannot be found in the package either.

I would use something like

content = getURLContent("ftp://./foo.zip")
attributes(content) = NULL
writeBin(content, "/tmp/foo.zip")

and that should be sufficient. (You have to strip the attributes or writeBin() complains.)
[R] cross tabulation: convert frequencies to percentages
Hello, might be rather easy for R pros, but I've been searching to a dead end ...

twsource.area <- table(twsource, area, useNA="ifany")

gives me a nice cross tabulation of frequencies of two factors, but now I want to convert those absolute values to percentages. In addition I'd like an extra column and an extra row with absolute sums. I know, Excel or the likes will produce it more easily, but how would the procedure look in R? Thanks, Sören
Re: [R] Inefficiency of SAS Programming
I had enrolled in a statistics course this semester, but after the first class I dropped it because it uses SAS. This thread makes me quite glad. Tom!

On Fri, Feb 27, 2009 at 8:48 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: Wensui Liu wrote: Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use much hard-coding, which is very ad hoc and difficult to maintain. I agree with you that in the SAS code it is a little too much to evaluate predictions; such a complex data step can actually be replaced by simpler IML code.

Agreed that the SAS code could have been much better. I programmed in SAS for 23 years and would have done it much differently. But you will find that the most elegant SAS program re-write will still be a far cry from the elegance of R. Frank

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank
-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Immersion therapy can be done at a later stage, after the newly baptized R corporate user is happy with the fact that he can do most of his legacy code in R easily now. I have been treading water in the immersion for over a year now. Most SAS consultants and corporate users are eager to try out R, but they are scared of immersion, especially in these cut-back times ... so this could be a middle step ... let me go ahead and create the wrapper SAS package as middleware between R and SAS, and we will let the invisible hands of the free market decide :))

regards, ajay www.decisionstats.com I am not a Marxist. Karl Marx http://www.brainyquote.com/quotes/quotes/k/karlmarx131048.html

On Fri, Feb 27, 2009 at 8:01 PM, Marc Schwartz marc_schwa...@comcast.net wrote: on 02/27/2009 07:57 AM Frank E Harrell Jr wrote: Ajay ohri wrote: I would like to know if we can create a package in which R functions are renamed closer to the SAS language. Doing so will help people familiar with SAS to take to R straight away for their work, thus decreasing the threshold for acceptance, and then get into deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS. Do we have such a package right now? It basically masks R functions to the equivalent function in another language, just for user ease / beginners. For example, creating a function for means:

procmeans <- function(x, y) {
  summary(subset(x, select = c(x, y)))
}

creating a function for importing CSV:

procimport <- function(x, y) {
  read.csv(textConnection(x), row.names = y, na.strings = "")
}

creating a function for describing data:

procunivariate <- function(x) {
  summary(x)
}

regards, ajay

Ajay, This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby. Frank

I wholeheartedly agree with Frank here. It may be one thing to have a translation process in place based upon some form of logical mapping between the two languages (as Bob's book provides). But it is another thing entirely to actually start writing functions that provide wrappers modeled on SAS-based PROCs. If you do this, then you only serve to obfuscate the fundamental philosophical and functional differences between the two languages and doom a new useR to missing all of R's benefits. They will continue to try to figure out how to use R based upon their SAS intuition rather than developing a new set of coding and even statistical paradigms. Having been through the SAS to S/R transition myself, having used SAS for much of the 90's and now having used R for over 7 years, I can speak from personal experience and state that the only way to achieve the requisite proficiency with R is immersion therapy.

Regards, Marc Schwartz
[R] Making tapply code more efficient
Previously, I posed the question pasted down below to the list and received some very helpful responses. While the code suggestions provided in response indeed work, they seem to only work with *very* small data sets, and so I wanted to follow up and see if anyone had ideas for better efficiency. I was quite embarrassed on this, as our SAS programmers cranked out programs that did this in the blink of an eye (with a few variables), but R was spinning for days on my Ubuntu machine and ultimately I saw a message that R was killed. The data I am working with has 800967 total rows and 31 total columns. The ID variable I use as the index variable in tapply() has 326397 unique cases.

> length(unique(qq$student_unique_id))
[1] 326397

To give a sense of what my data look like and the actual problem, consider the following:

qq <- data.frame(student_unique_id = factor(c(1,1,2,2,2)),
                 teacher_unique_id = factor(c(10,10,20,20,25)))

This is a student achievement database where students occupy multiple rows in the data and the variable teacher_unique_id denotes the class the student was in. What I am doing is looking to see if the teacher is the same for each instance of the unique student ID. So, if I implement the following:

same <- function(x) length(unique(x)) == 1
results <- data.frame(
  freq = tapply(qq$student_unique_id, qq$student_unique_id, length),
  tch  = tapply(qq$teacher_unique_id, qq$student_unique_id, same))

I get the following results. I can see that student 1 appears in the data twice and the teacher is always the same. However, student 2 appears three times and the teacher is not always the same.

> results
  freq   tch
1    2  TRUE
2    3 FALSE

Now, implementing this same procedure on a large data set with the characteristics described above seems to be problematic in this implementation. Does anyone have reactions on how this could be more efficient such that it can run with large data as I described?

Harold

> sessionInfo()
R version 2.8.1 (2008-12-22)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

# Original question posted on 1/13/09

Suppose I have a dataframe as follows:

dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25),
                  var2 = c('foo', 'foo', 'foo', 'foobar', 'foo'))

Now, if I were to subset by id, such as:

> subset(dat, id==1)
  id var1 var2
1  1   10  foo
2  1   10  foo

I can see that the elements in var1 are exactly the same and the elements in var2 are exactly the same. However,

> subset(dat, id==2)
  id var1   var2
3  2   20    foo
4  2   20 foobar
5  2   25    foo

shows the elements are not the same for either variable in this instance. So, what I am looking to create is a data frame that would be like this:

id freq  var1  var2
 1    2  TRUE  TRUE
 2    3 FALSE FALSE

Where freq is the number of times the ID is repeated in the dataframe. A TRUE appears in the cell if all elements in the column are the same for the ID and FALSE otherwise. It is insignificant which values differ for my problem. The way I am thinking about tackling this is to loop through the ID variable and compare the values in the various columns of the dataframe. The problem I am encountering is that I don't think all.equal or identical are the right functions in this case. So, say I was wanting to compare the elements of var1 for id == 1. I would have

x <- c(10,10)

Of course, the following works:

> all.equal(x[1], x[2])
[1] TRUE

As would a similar call to identical. However, what if I only have a vector of values (or if the column consists of names) that I want to assess for equality when I am trying to automate a process over thousands of cases? As in the example above, the vector may contain only two values or it may contain many more. The number of values in the vector differs by id. Any thoughts?

Harold
[R] Setting initial starting conditions in scripts
Hello, I'm writing a variety of R scripts and want to code the loading of the history and workspace from within the script. I found the loadhistory function but do not see a comparable function for loading a workspace. Is there one? Working with R 2.8.1 (2008-12-22) on a Windows platform. Thanks for any and all suggestions. Steve

Steve Friedman Ph. D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147
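[Editor's note] The workspace counterpart of loadhistory() is plain load(), which restores objects written by save.image() (or save()). A minimal self-contained sketch, using a temporary file in place of the default ".RData":

```r
# Sketch: restore saved workspace objects from within a script.
# A temporary file stands in for the default ".RData" that save.image()
# or q("yes") would write in the working directory.
tmp <- tempfile(fileext = ".RData")
x <- 42
save(x, file = tmp)   # what save.image() does for the whole workspace
rm(x)
load(tmp)             # x is restored into the current environment
x
# [1] 42
```

In a script one would typically just call load(".RData") alongside loadhistory(".Rhistory").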
[R] Adjusting confidence intervals for paired t-tests of multiple endpoints
Dear R-users, In a randomized placebo-controlled within-subject design, subjects received a psycho-active drug and placebo. Subjects filled out a questionnaire containing 15 scales at four different time points after drug administration. In order to detect drug effects at each time point, I compared scale values between placebo and drug for all time conditions and scales, which sums up to 4*15=60 comparisons. I have summarized the results in a data.frame with columns for t test results including confidence intervals and mean-differences:

df1 <- data.frame(trt=gl(2,35), matrix(rnorm(4200),70,60))
df2 <- as.data.frame(matrix(NA,60,6))
names(df2) <- c('t','df','p','lower','upper','mean.diff')
for (i in 1:60) {
  df2[i,1:6] <- as.numeric(unlist(t.test(df1[,i+1]~df1$trt, paired=T))[1:6])
}

Now, I want to adjust the confidence intervals for multiple comparisons. For a Bonferroni-adjustment, I did the following:

df2$std.error.of.diff <- df2$mean.diff/df2$t
ci <- qt(p=1-(0.05/nrow(df2)), df=df2$df)*df2$std.error.of.diff
ci.bonf <- data.frame(lower=df2$mean.diff-ci, upper=df2$mean.diff+ci)

I hope this is the correct method. However, I think the Bonferroni-adjustment would be much too conservative. I need a less conservative approach, perhaps something like Holm's method, which I can easily apply to the p-value with p.adjust(df2$p, method='holm'). Is there a package which can do this for the confidence interval, or could someone provide a simple script to calculate this? Thanks a lot! Erich
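[Editor's note] As a cross-check on the hand computation above (an aside, not from the thread): t.test() can produce a Bonferroni-adjusted two-sided interval directly if the confidence level is widened to 1 - 0.05/m, here with m = 60 comparisons and simulated data.

```r
# Hedged sketch: Bonferroni adjustment of one paired-t confidence interval
# via the conf.level argument rather than a hand-rolled qt() computation.
set.seed(1)
m <- 60                          # number of comparisons in the design
x <- rnorm(35); y <- rnorm(35)   # simulated placebo/drug scores
unadj <- t.test(x, y, paired = TRUE)$conf.int
bonf  <- t.test(x, y, paired = TRUE, conf.level = 1 - 0.05/m)$conf.int
diff(bonf) > diff(unadj)         # the adjusted interval is wider
# [1] TRUE
```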
[R] Changing Ylab and scale in hclust plots
Hello, Running R 2.8.1 (2008-12-22) on Windows. I'm running a series (25) of clustering procedures using the hclust function and would like each of the plots to have the same y-axis label and scale. Is there a procedure to change the scale on these plots? Or is there an alternative clustering function that can give me broader control? Here is my very simple code:

par(mfrow=c(2,1))
NSM5172004 <- read.csv("H:\\HRH-Data_Files\\FrequencyScenarios\\NMS.csv", header=TRUE, sep=",")
NMS <- NSM5172004[-(1)]
NMS.dist <- dist(NMS)
plot(hclust(NMS.dist, method="ward"), xlab="", labels=NMS$Year,
     main="Cape Sable Seaside Sparrow", sub="Hydro Scenario NMS5172004")
ECB2_65_01 <- read.csv("H:\\HRH-Data_Files\\FrequencyScenarios\\ECB2_65_01.csv", header=TRUE, sep=",")
ECB2 <- ECB2_65_01[-(1)]
ECB2.dist <- dist(ECB2)
plot(hclust(ECB2.dist, method="ward"), xlab="", labels=ECB2$Year,
     main="Cape Sable Seaside Sparrow", sub="Hydro Scenario ECB2_65-01")

Thanks Steve

Steve Friedman Ph. D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147
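[Editor's note] One way to force a common y-axis scale across dendrogram plots (a sketch, not from the thread: USArrests stands in for the poster's CSV files, and the default complete linkage stands in for the poster's ward method):

```r
# Hedged sketch: converting an hclust result to a dendrogram object gives
# access to the usual ylim/ylab graphics parameters, so several plots can
# share one scale. In practice the common maximum would be taken over all
# 25 cluster solutions.
hc1 <- hclust(dist(USArrests[1:25, ]),  method = "complete")
hc2 <- hclust(dist(USArrests[26:50, ]), method = "complete")
top <- max(hc1$height, hc2$height)   # common upper limit for both plots
par(mfrow = c(2, 1))
plot(as.dendrogram(hc1), ylim = c(0, top), ylab = "Distance")
plot(as.dendrogram(hc2), ylim = c(0, top), ylab = "Distance")
```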
Re: [R] cross tabulation: convert frequencies to percentages
on 02/27/2009 08:43 AM soeren.vo...@eawag.ch wrote: Hello, might be rather easy for R pros, but I've been searching to a dead end ... twsource.area <- table(twsource, area, useNA="ifany") gives me a nice cross tabulation of frequencies of two factors, but now I want to convert those absolute values to percentages. In addition I'd like an extra column and an extra row with absolute sums. I know, Excel or the likes will produce it more easily, but how would the procedure look in R?

See ?prop.table, which is referenced in the See Also section of ?table. This will give you proportions, so if you want percentages, just multiply by 100. To add row and column totals, see ?addmargins, which is also in the See Also for ?table.

> TAB <- table(state.division, state.region)
> TAB
                    state.region
state.division       Northeast South North Central West
  New England                6     0             0    0
  Middle Atlantic            3     0             0    0
  South Atlantic             0     8             0    0
  East South Central         0     4             0    0
  West South Central         0     4             0    0
  East North Central         0     0             5    0
  West North Central         0     0             7    0
  Mountain                   0     0             0    8
  Pacific                    0     0             0    5

# Overall table proportions
> prop.table(TAB)
                    state.region
state.division       Northeast South North Central West
  New England             0.12  0.00          0.00 0.00
  Middle Atlantic         0.06  0.00          0.00 0.00
  South Atlantic          0.00  0.16          0.00 0.00
  East South Central      0.00  0.08          0.00 0.00
  West South Central      0.00  0.08          0.00 0.00
  East North Central      0.00  0.00          0.10 0.00
  West North Central      0.00  0.00          0.14 0.00
  Mountain                0.00  0.00          0.00 0.16
  Pacific                 0.00  0.00          0.00 0.10

# Column proportions
> prop.table(TAB, 2)
                    state.region
state.division       Northeast South North Central      West
  New England            0.667 0.000         0.000 0.0000000
  Middle Atlantic        0.333 0.000         0.000 0.0000000
  South Atlantic         0.000 0.500         0.000 0.0000000
  East South Central     0.000 0.250         0.000 0.0000000
  West South Central     0.000 0.250         0.000 0.0000000
  East North Central     0.000 0.000         0.417 0.0000000
  West North Central     0.000 0.000         0.583 0.0000000
  Mountain               0.000 0.000         0.000 0.6153846
  Pacific                0.000 0.000         0.000 0.3846154

> addmargins(TAB)
                    state.region
state.division       Northeast South North Central West Sum
  New England                6     0             0    0   6
  Middle Atlantic            3     0             0    0   3
  South Atlantic             0     8             0    0   8
  East South Central         0     4             0    0   4
  West South Central         0     4             0    0   4
  East North Central         0     0             5    0   5
  West North Central         0     0             7    0   7
  Mountain                   0     0             0    8   8
  Pacific                    0     0             0    5   5
  Sum                        9    16            12   13  50

HTH, Marc Schwartz
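[Editor's note] Putting Marc's two pieces together for the original request (percentages plus marginal sums), a short sketch on the same example data:

```r
# Sketch combining the tools above for Sören's request: percentages of the
# whole table, plus the raw counts with row/column sums from addmargins().
TAB <- table(state.division, state.region)
pct <- round(100 * prop.table(TAB), 1)   # percentages of the grand total
tot <- addmargins(TAB)                   # counts with marginal "Sum" row/column
```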
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote: Immersion therapy can be done at a later stage, after the newly baptized R corporate user is happy with the fact that he can do most of his legacy code in R easily now. I have been treading water in the immersion for over a year now. Most SAS consultants and corporate users are eager to try out R, but they are scared of immersion, especially in these cut-back times ... so this could be a middle step ... let me go ahead and create the wrapper SAS package as middleware between R and SAS, and we will let the invisible hands of the free market decide :))

This is futile and will make it more difficult for other R users to help you in the future. As Marc said, this is really a bad idea and will backfire. Frank

regards, ajay www.decisionstats.com I am not a Marxist. Karl Marx http://www.brainyquote.com/quotes/quotes/k/karlmarx131048.html

On Fri, Feb 27, 2009 at 8:01 PM, Marc Schwartz marc_schwa...@comcast.net wrote: on 02/27/2009 07:57 AM Frank E Harrell Jr wrote: Ajay ohri wrote: I would like to know if we can create a package in which R functions are renamed closer to the SAS language. Doing so will help people familiar with SAS to take to R straight away for their work, thus decreasing the threshold for acceptance, and then get into deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS. Do we have such a package right now? It basically masks R functions to the equivalent function in another language, just for user ease / beginners. For example, creating a function for means:

procmeans <- function(x, y) {
  summary(subset(x, select = c(x, y)))
}

creating a function for importing CSV:

procimport <- function(x, y) {
  read.csv(textConnection(x), row.names = y, na.strings = "")
}

creating a function for describing data:

procunivariate <- function(x) {
  summary(x)
}

regards, ajay

Ajay, This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby. Frank

I wholeheartedly agree with Frank here. It may be one thing to have a translation process in place based upon some form of logical mapping between the two languages (as Bob's book provides). But it is another thing entirely to actually start writing functions that provide wrappers modeled on SAS-based PROCs. If you do this, then you only serve to obfuscate the fundamental philosophical and functional differences between the two languages and doom a new useR to missing all of R's benefits. They will continue to try to figure out how to use R based upon their SAS intuition rather than developing a new set of coding and even statistical paradigms. Having been through the SAS to S/R transition myself, having used SAS for much of the 90's and now having used R for over 7 years, I can speak from personal experience and state that the only way to achieve the requisite proficiency with R is immersion therapy.

Regards, Marc Schwartz

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Making tapply code more efficient
Hi Harold, What about this? You one have to make the crosstabulation once. qq - data.frame(student = factor(c(1,1,2,2,2)), teacher = factor(c(10,10,20,20,25))) tab - table(qq$student, qq$teacher) data.frame(Student = rownames(tab), Freq = rowSums(tab), tch = rowSums(tab 0) == 1) Student Freq tch 1 12 TRUE 2 23 FALSE HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Doran, Harold Verzonden: vrijdag 27 februari 2009 15:47 Aan: r-help@r-project.org Onderwerp: [R] Making tapply code more efficient Previously, I posed the question pasted down below to the list and received some very helpful responses. While the code suggestions provided in response indeed work, they seem to only work with *very* small data sets and so I wanted to follow up and see if anyone had ideas for better efficiency. I was quite embarrased on this as our SAS programmers cranked out programs that did this in the blink of an eye (with a few variables), but R was spinning for days on my Ubuntu machine and ultimately I saw a message that R was killed. The data I am working with has 800967 total rows and 31 total columns. The ID variable I use as the index variable in tapply() has 326397 unique cases. 
length(unique(qq$student_unique_id)) [1] 326397 To give a sense of what my data look like and the actual problem, consider the following: qq <- data.frame(student_unique_id = factor(c(1,1,2,2,2)), teacher_unique_id = factor(c(10,10,20,20,25))) This is a student achievement database where students occupy multiple rows in the data and the variable teacher_unique_id denotes the class the student was in. What I am doing is looking to see if the teacher is the same for each instance of the unique student ID. So, if I implement the following: same <- function(x) length(unique(x)) == 1 results <- data.frame(freq = tapply(qq$student_unique_id, qq$student_unique_id, length), tch = tapply(qq$teacher_unique_id, qq$student_unique_id, same)) I get the following results. I can see that student 1 appears in the data twice and the teacher is always the same. However, student 2 appears three times and the teacher is not always the same. results freq tch 1 2 TRUE 2 3 FALSE Now, implementing this same procedure to a large data set with the characteristics described above seems to be problematic in this implementation. Does anyone have reactions on how this could be made more efficient such that it can run with large data as I described?
Harold sessionInfo() R version 2.8.1 (2008-12-22) x86_64-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base # Original question posted on 1/13/09 Suppose I have a dataframe as follows: dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25), var2 = c('foo', 'foo', 'foo', 'foobar', 'foo')) Now, if I were to subset by id, such as: subset(dat, id==1) id var1 var2 1 1 10 foo 2 1 10 foo I can see that the elements in var1 are exactly the same and the elements in var2 are exactly the same. However, subset(dat, id==2) id var1 var2 3 2 20 foo 4 2 20 foobar 5 2 25 foo shows the elements are not the same for either variable in this instance. So, what I am looking to create is a data frame that would be like this: id freq var1 var2 1 2 TRUE TRUE 2 3 FALSE FALSE where freq is the number of times the ID is repeated in the dataframe. A TRUE appears in the cell if all elements in the column are the same for the ID and FALSE otherwise. It is insignificant which values differ for my problem. The way I am thinking about tackling this is to loop through the ID variable and compare the values in the various columns of the dataframe. The problem I am encountering is that I don't think all.equal or identical are the right functions in
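To make the two approaches in this thread concrete, here is a minimal base-R sketch on the toy data from the thread (the object names res1/res2 are mine), contrasting the tapply() version with the single-crosstabulation version:

```r
# Toy data from the thread
qq <- data.frame(student = factor(c(1, 1, 2, 2, 2)),
                 teacher = factor(c(10, 10, 20, 20, 25)))

# tapply() version: two passes over the grouping variable
same <- function(x) length(unique(x)) == 1
res1 <- data.frame(freq = as.vector(tapply(qq$student, qq$student, length)),
                   tch  = as.vector(tapply(qq$teacher, qq$student, same)))

# table() version (Thierry's suggestion): one crosstabulation,
# then cheap row operations on the resulting matrix
tab  <- table(qq$student, qq$teacher)
res2 <- data.frame(freq = rowSums(tab),          # rows per student
                   tch  = rowSums(tab > 0) == 1) # exactly one teacher?
```

On large data the table() route does the expensive grouping only once, which is where the repeated tapply() calls spend most of their time.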
Re: [R] Inefficiency of SAS Programming
Three comments I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions. a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better. b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor. Terry Therneau __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ordinal Mantel-Haenszel type inference
I suspect that what you need will be in S-PLUS (and R) Manual to Accompany Agresti's Categorical Data Analysis (2002) 2nd edition by Laura A. Thompson, 2007, which I have always been able to find with a Google search. Yep, it's still there: https://home.comcast.net/~lthompson221/Splusdiscrete2.pdf Its Chapter 7, Logit Models for Multinomial Responses, discusses various cumulative logit models. The polr function (proportional odds logistic regression) in MASS will return the regression equivalent of what you are asking for. Thompson says the lrm in the Desing library will also do it, by which she really means that the lrm in the Design package by Harrell will do it. The link she offers is outdated, but it doesn't really matter for obtaining the Hmisc/Design packages, since they are on CRAN; online documentation is currently available at: http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/StatComp She then also mentions lcr (library ordinal) and nordr (library gnlm). Later in the chapter she illustrates the use of the vglm function in the VGAM package. -- David Winsemius On Feb 27, 2009, at 9:04 AM, Jourdan Gold wrote: Hello, I am searching for an R package that does an extension of the Mantel-Haenszel test for ordinal data as described in Liu and Agresti (1996), A Mantel-Haenszel type inference for cumulative odds ratios, in Biometrics. I see packages such as Epi that perform it for binary data and derive a variance for it using the Robbins and Breslow variance method, as well as another package that derives it for nominal variables but does not provide a variance or confidence limit. Does a package exist that does this? I have searched the list archives and can't seem to see such a package, but I could be missing something. Thank you.
yours sincerely, Jourdan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
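For the cumulative-logit route David mentions, here is a minimal sketch with polr() from MASS, using the housing data from MASS itself (note this fits a proportional-odds model, which is the regression analogue he describes, not the Liu-Agresti Mantel-Haenszel statistic itself):

```r
library(MASS)  # ships with R as a recommended package

# Proportional odds logistic regression on the classic housing data:
# satisfaction (Low < Medium < High) modelled from perceived influence,
# dwelling type and contact, with cell frequencies as weights.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
summary(fit)  # cumulative log-odds coefficients and intercepts
```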
Re: [R] Inefficiency of SAS Programming
Terry Therneau wrote: Three comments I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions. a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better. b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. If I were writing S code for this it would be dramatically different. I would try to be efficient and elegant but would need to remember to be a teacher at the same time. For example, this kind of recode is super efficient and quick to program but would need good comments or a handbook to all of my code: c(cat=1, dog=2, giraffe=3)[animal] But I think the code is quite intuitive once you have used that construct once. There is also a lot of factoring of code that could be done, as others have pointed out. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth.
S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor. Well put. I am amazed there hasn't been a revolt among SAS users decades ago. The S approach is also easier to debug one line at a time. Cheers, Frank Terry Therneau -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
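The lookup-vector recode Frank shows can be seen in action on a small hypothetical vector (the animal data here is invented for illustration):

```r
# A named vector acts as a lookup table: indexing it by a character
# vector recodes each value in one step.
animal <- c("dog", "cat", "giraffe", "cat")
code <- c(cat = 1, dog = 2, giraffe = 3)[animal]
unname(code)  # 2 1 3 1
```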
Re: [R] Inefficiency of SAS Programming
Terry's remarks (see below) are well received; however, I take issue with one part of his comments. As a long-time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-language macros in R. While macros can make for difficult-to-read code, when used properly they can also make for flexible code that, if properly written (including good documentation, which should be a part of any code), can be easy to read. Finally, everyone must remember that SAS code can be difficult to understand or inefficient just as R code can be difficult to understand or inefficient. In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is not fair, nor correct, to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language; other things are more clearly and more easily done in another language. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be. John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Terry Therneau thern...@mayo.edu 2/27/2009 10:23 AM Three comments I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions. a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better. b.
If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor. Terry Therneau __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R crash on Mac
If I define this function R> ask <- function(message = "Type in datum") eval(parse(prompt = paste(message, ": ", sep = ""))) the following is produced as expected on a Linux/debian machine R> ask("input") input: 3 [1] 3 R> ask("input") input: 3:6 [1] 3 4 5 6 R> ask("input") input: c(3,6) [1] 3 6 If I run exactly the same on a Mac (OS X 10.5.6), it still works provided R is run in a Terminal window. The outcome changes if R is run in its own window, started by clicking on its icon; the first two examples are still OK, the third one produces: *** caught segfault *** address 0x4628c854, cause 'memory not mapped' R> sessionInfo() # before crash! R version 2.8.1 (2008-12-22) i386-apple-darwin8.11.1 locale: en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats utils datasets grDevices graphics methods base R> R.version _ platform i386-apple-darwin8.11.1 arch i386 os darwin8.11.1 system i386, darwin8.11.1 status major 2 minor 8.1 year 2008 month 12 day 22 svn rev 47281 language R version.string R version 2.8.1 (2008-12-22) -- Adelchi Azzalini azzal...@stat.unipd.it Dipart. Scienze Statistiche, Università di Padova, Italia tel. +39 049 8274147, http://azzalini.stat.unipd.it/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
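One possible workaround, assuming the crash is tied to parse(prompt = ...) in the R.app GUI: read the line with readline() and parse the returned string instead. This is a sketch, not a verified fix for the segfault.

```r
# Hypothetical variant of ask(): prompt via readline(), then parse the
# string that comes back, rather than letting parse() drive the prompt.
ask <- function(message = "Type in datum") {
  eval(parse(text = readline(paste(message, ": ", sep = ""))))
}
# At an interactive prompt, ask("input") with input c(3,6) should return
# the vector 3 6, as in the Linux session above.
```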
[R] formula formatting/grammar for regression
Hi all, I am doing some basic regression analysis, and am getting a bit confused on how to enter non-polynomial formulas to be used. For example, consider that I want to find A and r such that the formula y = A*exp(r*x) provides the best fit to the line y=x on the interval [0,50]. I can set: xpts <- seq(0, 50, by=0.1) ypts <- seq(0, 50, by=0.1) I know I can find a fitted polynomial of a given degree using lm(ypts ~ poly(xpts, degree=5, raw=TRUE)) But am confused on what the formula should be for trying to find a fit to y = A*exp(r*x). If anyone knows of a resource that describes the grammar behind assembling these formulas, I would really appreciate being pointed in that direction, as I can't seem to find much beyond basic polynomials. Thanks for the help! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] formula formatting/grammar for regression
Brigid Mooney bkmooney at gmail.com writes: I am doing some basic regression analysis, and am getting a bit confused on how to enter non-polynomial formulas to be used. .. But am confused on what the formula should be for trying to find a fit to y = A*exp(r*x). If this example is just a placeholder for something more complex than poly, you should check the function nls, which works for non-linear models. However, if you really want to solve only this problem, taking the log of your data and fitting the log of the above function with lm() is the easiest way out. Results can be a bit different from the nonlinear case depending on the noise, because in one case the errors are weighted on the log scale, in the other linearly. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
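A minimal sketch of Dieter's two routes, on simulated exponential data (the data, seed and start values are mine; note that the original y = x example includes x = 0, where the log route would need the zero handled first):

```r
set.seed(1)
x <- seq(0.1, 5, by = 0.1)
y <- 2 * exp(0.5 * x) * exp(rnorm(length(x), sd = 0.05))  # true A = 2, r = 0.5

# Route 1: linearise with a log transform, then lm()
fit.lm <- lm(log(y) ~ x)
A.lm <- exp(unname(coef(fit.lm)[1]))
r.lm <- unname(coef(fit.lm)[2])

# Route 2: fit y ~ A * exp(r * x) directly with nls(), seeded from lm()
fit.nls <- nls(y ~ A * exp(r * x), start = list(A = A.lm, r = r.lm))
coef(fit.nls)
```

The two fits differ slightly because lm() minimises squared error on the log scale while nls() minimises it on the original scale, exactly the weighting difference Dieter describes.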
[R] [R-pkgs] Package DAKS for knowledge space theory, on CRAN now
Version 1.0-0 of DAKS (Data Analysis and Knowledge Spaces) has been released to CRAN. Knowledge space theory is a recent psychometric test theory based on combinatorial mathematical structures (order and lattice theory). Solvability dependencies between dichotomous test items play an important role in knowledge space theory. Utilizing hypothesized dependencies between items, knowledge space theory has been successfully applied for the computerized, adaptive assessment and training of knowledge. The package DAKS implements inductive item tree analysis methods for deriving surmise relations from binary data. It provides functions for computing population and estimated asymptotic variances of the used fit measures, and for switching between test item and knowledge state representations. Other features are a Hasse diagram drawing device, a data simulation tool based on a finite mixture latent variable model, and a function for computing response pattern and knowledge state frequencies. Best regards, Anatol Sargin Ali Uenlue -- Department of Computer-Oriented Statistics and Data Analysis Institute of Mathematics University of Augsburg http://stats.math.uni-augsburg.de/ [[alternative HTML version deleted]] ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using package ROCR
For question 1: Can you please report this to the package maintainer (well, I am CCing Tobias now), who will certainly be happy to improve the package (particularly the demo behaviour). For question 2 (and your latest message): it does not happen for me. Which versions are you using, i.e. have you updated to the most recent ones? In any case, using namespaces is another thing that might be worth considering for Tobias as the ROCR maintainer. Tobias, a last point for you: your package has given WARNINGs in the checks for ages now; can you please fix that also. Thank you, Uwe Ligges wiener30 wrote: Just an update concerning an error message in using the ROCR package. Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' I have changed the sequence of loading the packages and the problem has gone away: library(ROCR) library(randomForest) The loading sequence that caused an error was: library(randomForest) library(ROCR) Maybe this info could be useful for somebody else who is getting the same error. wiener30 wrote: Thank you very much for the response! The plot(1,1) helped to resolve the first problem. But I am still getting a second error message when running demo(ROCR) Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' It seems it has something to do with compatibility of S4 objects. My versions of R and the ROCR package are the same as you listed, but it seems something else is missing in my installation. William Doane wrote: Responding to question 1... it seems the demo assumes you already have a plot window open. library(ROCR) plot(1,1) demo(ROCR) seems to work. For question 2, my environment produces the expected results... plot doesn't generate an error: * R 2.8.1 GUI 1.27 Tiger build 32-bit (5301) * OS X 10.5.6 * ROCR 1.0-2 -Wil wiener30 wrote: I am trying to use the ROCR package to analyze classification accuracy; unfortunately there are some problems right at the beginning.
Question 1) When I try to run the demo I am getting the following error message: library(ROCR) demo(ROCR) if (dev.cur() <= 1) [TRUNCATED] Error in get(getOption("device")) : wrong first argument When I issue the command dev.cur() it returns: null device 1 It seems something is wrong with my R environment? Could somebody provide a hint about what is wrong. Question 2) When I run example commands from the manual library(ROCR) data(ROCR.simple) pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels) perf <- performance(pred, "tpr", "fpr") plot(perf) the plot command issues the following error message: Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double' How could this be fixed? Thanks for the support __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] levelplot help needed
To reorder the y-labels, simply reorder the factor levels: df <- data.frame(x_label = factor(x_label), y_label = factor(y_label, rev(y_label)), values = as.vector(my.data)) Not sure about putting the strips at the bottom. A quick scan of ?xyplot and ?strip.default suggests that this is not possible, but I'm sure Deepayan will correct me if I'm wrong (he often does). --sundar On Fri, Feb 27, 2009 at 5:51 AM, Antje niederlein-rs...@yahoo.de wrote: Hi there, I'm looking for someone who can give me some hints how to make a nice levelplot. As an example, I have the following code: # create some example data # -- xl <- 4 yl <- 10 my.data <- sapply(1:xl, FUN = function(x) { rnorm(yl, mean = x) }) x_label <- rep(c("X Label 1", "X Label 2", "X Label 3", "X Label 4"), each = yl) y_label <- rep(paste("Y Label ", 1:yl, sep=""), xl) df <- data.frame(x_label = factor(x_label), y_label = factor(y_label), values = as.vector(my.data)) df1 <- data.frame(df, group = rep("Group 1", xl*yl)) df2 <- data.frame(df, group = rep("Group 2", xl*yl)) df3 <- data.frame(df, group = rep("Group 3", xl*yl)) mdf <- rbind(df1,df2,df3) # plot # -- graph <- levelplot(mdf$values ~ mdf$x_label * mdf$y_label | mdf$group, aspect = "xy", layout = c(3,1), scales = list(x = list(labels = substr(levels(factor(mdf$x_label)), 0, 5), rot = 45))) print(graph) # -- (I need to use these strange x-labels because in my real data the values of the x-labels are too long and I just want to display the first 10 characters as label) My questions: * I'd like to start with Y Label 1 in the upper row (that's a more general issue: how can I influence the order of x, y, and groups?) * I'd like to put the groups at the bottom Can anybody give me some help? Antje __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
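The level-reversal trick in Sundar's reply works because lattice lays out panels and axis labels in factor-level order; a minimal base-R sketch on unique labels (the variable names below are mine):

```r
y_label <- paste("Y Label", 1:4)
f_default  <- factor(y_label)                        # alphabetical level order
f_reversed <- factor(y_label, levels = rev(y_label)) # reversed display order
levels(f_reversed)
```

Note that Antje's y_label contains repeated values, so on the real data the levels argument should be built from the unique labels, e.g. factor(y_label, levels = rev(unique(y_label))).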
Re: [R] combining identify() and locator()
awesome. Thank you very much for the quick response. I think this is exactly what I was looking for. -Brian On Feb 27, 2009, at 1:10 AM, Barry Rowlingson wrote: 2009/2/27 Brian Bolt bb...@kalypsys.com: Hi, I am wondering if there might be a way to combine the two functions identify() and locator() such that if I use identify() and then click on a point outside the set tolerance, the x,y coordinates are returned as in locator(). Does anyone know of a way to do this? Thanks in advance for any help Since identify will only return the indexes of selected points, and it only takes on-screen clicks for coordinates, you'll have to leverage locator and duplicate some of the identify work. So call locator(1), then compute the distances to your points, and if any are below your tolerance mark them using text(), otherwise keep the coordinates of the click. You can use dist() to compute a distance matrix, but if you want to totally replicate identify's tolerance behaviour I think you'll have to convert from your data coordinates to device coordinates. The grconvertX and grconvertY functions look like they'll do that for you. Okay, that's the flatpack delivered, I think you've got all the parts, some assembly required! Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
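Barry's recipe can be sketched with the interactive locator(1) call factored out, so the matching logic stands on its own (the function name and tolerance value are made up; the tolerance here is in user coordinates, whereas replicating identify() exactly would need grconvertX()/grconvertY(), as he notes):

```r
# Return the index of the nearest point if the click lands within 'tol'
# of it; otherwise return the click coordinates themselves.
nearest_or_coords <- function(click, px, py, tol = 0.25) {
  d <- sqrt((px - click$x)^2 + (py - click$y)^2)
  if (min(d) <= tol) which.min(d) else c(click$x, click$y)
}
# Interactive use: click <- locator(1); nearest_or_coords(click, x, y)
```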
Re: [R] levelplot help needed
Try using the alternating=FALSE option. -- David Winsemius On Feb 27, 2009, at 12:07 PM, Sundar Dorai-Raj wrote: To reorder the y-labels, simply reorder the factor levels: df <- data.frame(x_label = factor(x_label), y_label = factor(y_label, rev(y_label)), values = as.vector(my.data)) Not sure about putting the strips at the bottom. A quick scan of ?xyplot and ?strip.default suggests that this is not possible, but I'm sure Deepayan will correct me if I'm wrong (he often does). --sundar On Fri, Feb 27, 2009 at 5:51 AM, Antje niederlein-rs...@yahoo.de wrote: Hi there, I'm looking for someone who can give me some hints how to make a nice levelplot. As an example, I have the following code: # create some example data # -- xl <- 4 yl <- 10 my.data <- sapply(1:xl, FUN = function(x) { rnorm(yl, mean = x) }) x_label <- rep(c("X Label 1", "X Label 2", "X Label 3", "X Label 4"), each = yl) y_label <- rep(paste("Y Label ", 1:yl, sep=""), xl) df <- data.frame(x_label = factor(x_label), y_label = factor(y_label), values = as.vector(my.data)) df1 <- data.frame(df, group = rep("Group 1", xl*yl)) df2 <- data.frame(df, group = rep("Group 2", xl*yl)) df3 <- data.frame(df, group = rep("Group 3", xl*yl)) mdf <- rbind(df1,df2,df3) # plot # -- graph <- levelplot(mdf$values ~ mdf$x_label * mdf$y_label | mdf$group, aspect = "xy", layout = c(3,1), scales = list(x = list(labels = substr(levels(factor(mdf$x_label)), 0, 5), rot = 45))) print(graph) # -- (I need to use these strange x-labels because in my real data the values of the x-labels are too long and I just want to display the first 10 characters as label) My questions: * I'd like to start with Y Label 1 in the upper row (that's a more general issue: how can I influence the order of x, y, and groups?) * I'd like to put the groups at the bottom Can anybody give me some help?
Antje __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Re : Have a function like the _n_ in R ? (Automatic count function )
If you are in the context of a data frame (which is closest to the concept of a data set in SAS), then 1:nrow(df) is closest to what you are looking for. For instance: data(iris) .n. <- 1:nrow(iris) You may notice that such a number is not very idiomatic in R. If you have something like: if (_N_ > 50) then output; in R you can simply put iris[-(1:50),] without using an explicit counter variable. In the context of a matrix, the row() and col() functions may do what you want. On 25.02.2009 at 15:34, justin bem wrote: R is more flexible than SAS. You have several looping constructs, e.g. for, while, repeat. You also have the dim and length functions to get objects' dimensions. i <- 0 dat <- matrix(c(1, runif(1), .Random.seed[1]), nr=1) repeat { i <- i+1 dat <- rbind(dat, matrix(c(1+i, runif(1), .Random.seed[1]), nr=1)) if (i==4) break } colnames(dat) <- c("counter", "x", "seed") dat Justin BEM BP 1917 Yaoundé Tél (237) 99597295 (237) 22040246 From: Nash morri...@ibms.sinica.edu.tw To: r-help r-help@r-project.org Sent: Wednesday, 25 February 2009, 13:25:18 Subject: [R] Have a function like the _n_ in R? (Automatic count function) Is there a counter function in R? If we use the software SAS: /*** SAS Code **/ data tmp(drop= i); retain seed x 0; do i = 1 to 5; call ranuni(seed,x); output; end; run; data new; counter=_n_; * this keyword _n_ ; set tmp; run; /* _n_ (automatic variables) are created automatically by the DATA step or by DATA step statements. */ /*** Output counter seed x 1 584043288 0.27197 2 935902963 0.43581 3 301879523 0.14057 4 753212598 0.35074 5 1607264573 0.74844 ***/ Is there a function like the _n_ in R? -- Nash - morri...@ibms.sinica.edu.tw __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
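The base-R equivalents discussed above, collected into one runnable sketch using the built-in iris data from the reply:

```r
data(iris)
n <- seq_len(nrow(iris))  # SAS _N_ analogue: an explicit row counter

# SAS: if (_N_ > 50) then output;  -- in R, subset instead of counting:
after50 <- iris[-(1:50), ]
nrow(after50)  # 100 of iris's 150 rows remain
```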
[R] Filtering a dataset's columns by another dataset's column names
Hello all, I hope some of you can come to my rescue, yet again. I have two genetic datasets, and I want one of the datasets to have only the columns that are in common with the other dataset. Here is a toy example (my real datasets have hundreds of columns):

Dataset 1:
Individual SNP1 SNP2 SNP3 SNP4 SNP5
1 A G T C A
2 T C A G T
3 A C T C A

Dataset 2:
Individual SNP1 SNP3 SNP5 SNP6 SNP7
4 A T T G C
5 T A A G G
6 A A T C G

I want Dataset 1 to have only columns that are also represented in Dataset 2, i.e., I want to generate a new Dataset 3 that looks like this:

Individual SNP1 SNP3 SNP5
1 A T A
2 T A T
3 A T A

Does anyone know how I could do this? Keep in mind that this is not a simple merge, as in the merge function. Thanks very much for your help everyone. Josh B. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Filtering a dataset's columns by another dataset's column names
Try this:

d1[, intersect(names(d1), names(d2))]

HTH,
Brian

-----Original Message-----
From: r-help-boun...@r-project.org On Behalf Of Josh B
Sent: Friday, February 27, 2009 12:28 PM
To: R Help
Subject: [R] Filtering a dataset's columns by another dataset's column names

[original question elided]
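The intersect() one-liner above can be checked on small stand-in frames. The d1 and d2 below are made-up toy frames matching the poster's example; the real data are not shown in the thread:

```r
# Toy stand-ins for the poster's two datasets
d1 <- data.frame(Individual = 1:3,
                 SNP1 = c("A", "T", "A"), SNP2 = c("G", "C", "C"),
                 SNP3 = c("T", "A", "T"), SNP4 = c("C", "G", "C"),
                 SNP5 = c("A", "T", "A"))
d2 <- data.frame(Individual = 4:6,
                 SNP1 = c("A", "T", "A"), SNP3 = c("T", "A", "A"),
                 SNP5 = c("T", "A", "T"), SNP6 = c("G", "G", "C"),
                 SNP7 = c("C", "G", "G"))

# Keep only the columns of d1 whose names also occur in d2;
# intersect() preserves the column order of its first argument
d3 <- d1[, intersect(names(d1), names(d2))]
names(d3)  # "Individual" "SNP1" "SNP3" "SNP5"
```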
Re: [R] Filtering a dataset's columns by another dataset's column names
on 02/27/2009 11:27 AM Josh B wrote:
[original question elided]

Same.Cols <- intersect(names(DF1), names(DF2))
Same.Cols
[1] "Individual" "SNP1"       "SNP3"       "SNP5"

rbind(DF1[, Same.Cols], DF2[, Same.Cols])
  Individual SNP1 SNP3 SNP5
1          1    A    T    A
2          2    T    A    T
3          3    A    T    A
4          4    A    T    T
5          5    T    A    A
6          6    A    A    T

See ?intersect, which gives you the common column names, which you can then use in rbind().

HTH,
Marc Schwartz
Re: [R] Filtering a dataset's columns by another dataset's column names
Dear Josh,

Try this:

dataset1[, colnames(dataset1) %in% colnames(dataset2)]

Take a look at ?colnames and ?"%in%" for more information.

HTH,
Jorge

On Fri, Feb 27, 2009 at 12:28 PM, Josh B josh...@yahoo.com wrote:
[original question elided]
Re: [R] Filtering a dataset's columns by another dataset's column names
So you want the data that is in Dataset 1, but only the column names that are also in Dataset 2. How about:

subset(DS1, select = names(DS1) %in% names(DS2))

DS1 <- read.table(textConnection("Individual SNP1 SNP2 SNP3 SNP4 SNP5
1 A G T C A
2 T C A G T
3 A C T C A"), header = TRUE)
DS2 <- read.table(textConnection("Individual SNP1 SNP3 SNP5 SNP6 SNP7
4 A T T G C
5 T A A G G
6 A A T C G"), header = TRUE)
subset(DS1, select = names(DS1) %in% names(DS2))
  Individual SNP1 SNP3 SNP5
1          1    A    T    A
2          2    T    A    T
3          3    A    T    A

Tested!

--
David Winsemius
Heritage Labs

On Feb 27, 2009, at 12:27 PM, Josh B wrote:
[original question elided]
Re: [R] combining identify() and locator()
2009/2/27 Brian Bolt bb...@kalypsys.com:
awesome. Thank you very much for the quick response. I think this is exactly what I was looking for.

Here's a basic framework:

`idloc` <- function(xy, n = 1, tol = 0.25){
  tol2 = tol^2
  icoords = cbind(grconvertX(xy[, 1], to = "inches"),
                  grconvertY(xy[, 2], to = "inches"))
  hit = c()
  missed = matrix(ncol = 2, nrow = 0)
  for(i in 1:n){
    ptU = locator(1)
    pt = c(grconvertX(ptU$x, to = "inches"), grconvertY(ptU$y, to = "inches"))
    d2 = (icoords[, 1] - pt[1])^2 + (icoords[, 2] - pt[2])^2
    if (any(d2 < tol2)){
      print("clicked")
      hit = c(hit, (1:dim(xy)[1])[d2 < tol2])
    }else{
      print("missed")
      missed = rbind(missed, c(ptU$x, ptU$y))
    }
  }
  return(list(hit = hit, missed = missed))
}

Test:

xy = cbind(1:10, runif(10))
plot(xy)
idloc(xy, 10)

now click ten times, on points or off points. You get back:

$hit
[1]  4  6  7 10

$missed
         [,1]      [,2]
[1,] 5.698940 0.6835392
[2,] 6.216171 0.6144229
[3,] 5.877982 0.5752569
[4,] 6.773190 0.2895761
[5,] 7.210847 0.3126149
[6,] 9.239985 0.5614337

$hit is the indices of the points you hit (in order, including duplicates) and $missed are the coordinates of the misses. It crashes out if you hit the middle button for the locator, but that should be easy enough to fix up. It doesn't label hit points, but that's also easy enough to do.

Barry
Re: [R] Filtering a dataset's columns by another dataset's column names
Hi Josh B,

this looks like homework to me. Please obey the posting rules, i.e., provide self-contained code/examples and show the point at which you are stuck. To solve your problem, you need the which and names functions as well as the %in% operator. It is then easy to rbind the two datasets once you have figured out what the common column names are. Please try on your own first and report back if and where you are stuck, along with self-contained code. If this is indeed homework, please ask your professor or teacher.

Example for two simulated datasets:

x = rnorm(30)
dim(x) = c(5, 6)
x = data.frame(x)
names(x) = c("a", "b", "c", "x", "y", "z")
y = rnorm(30)
dim(y) = c(5, 6)
y = data.frame(y)
names(y) = c("a", "b", "d", "v", "w", "x")

Daniel
- cuncta stricte discussurus -

-----Original Message-----
From: r-help-boun...@r-project.org On Behalf Of Josh B
Sent: Friday, February 27, 2009 12:28 PM
To: R Help
Subject: [R] Filtering a dataset's columns by another dataset's column names

[original question elided]
Re: [R] Inefficiency of SAS Programming
spam me wrote:
I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 to 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time, and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports. To be fair, though, there are a lot of other factors (besides efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error-prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high-level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate-level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency: you're only doing this process once a year. There are other points that could be made, but the main idea is that I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.

Dear Anonymous,

Nice points. I would just add that it would be better if government-sponsored projects resulted in software that could be run without expensive licenses.

Thanks,
Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] Inefficiency of SAS Programming
John Sorkin wrote:
Terry's remarks (see below) are well received; however, I take issue with one part of his comments. As a long-time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-language macros in R. While macros can make for difficult-to-read code, when used properly they can also make flexible code that, if properly written (including good documentation, which should be a part of any code), can be easy to read. Finally, everyone must remember that SAS code can be difficult to understand or inefficient, just as R code can be difficult to understand or inefficient. In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is not fair, nor correct, to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, and other things are more clearly and more easily done in another. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be.
John

Nice points John. My only response is that I learned SAS in 1969 and used it intensively until 1991. I wrote some of the first user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, PHGLM) and wrote extensively in the macro language. After using S-Plus for only one month my productivity was far ahead of my productivity using SAS.

Frank

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Terry Therneau thern...@mayo.edu 2/27/2009 10:23 AM

Three comments:

1. I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language.

2. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions.
a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better.
b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code.

3. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting, to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once headed down the same road. One of the best things ever to happen to the language was documented in the blue book, The New S Language, where Becker et al. had the wisdom to scrap the macro processor.

Terry Therneau
[R] help with projection pursuit
Hi all,

I have some difficulties with the function ppr for projection pursuit regression. I obtained the results for a projection pursuit regression and now I would like to compute some predictions for new data. I tried the function predict in the following way:

predict(res.ppr, newdata)

but it seems that it is not right. The data rock is given for illustration of the function ppr:

attach(rock)
rock.ppr <- ppr(log(perm) ~ area1 + peri1 + shape, data = rock, nterms = 2, max.terms = 5)

So suppose I want to make a prediction for the point area1=10, peri1=3 and shape=2. I tried the command

predict(rock.ppr, c(10, 3, 2))

but it returns an error message. So, could you indicate to me the right way to make this prediction?

Thanks for your help.
Olivier.

--
Martin Olivier
INRA - Unité Biostatistique Processus Spatiaux
Domaine St Paul, Site Agroparc
84914 Avignon Cedex 9, France
Tel : 04 32 72 21 57  Fax : 04 32 72 21 82
[R] Help: locfit (local logistic regression)
Hi,

I am running a local logistic regression using locfit. Now I want to choose the bandwidth using cross-validation. I don't know if there is an additional command to do so, or if I can do it within locfit itself. I would appreciate any help on this matter.

Thank you.
Regards,
Re: [R] add absolute value to bars in barplot
Note that putting numbers near the top of the bars (either inside or outside) tends to create 'fuzzy' tops to the bars that make it harder for the viewer to quickly interpret the graph. If the numbers are important, put them in a table. If you really need to have the numbers and graph together, then look at alternatives (some type of combined table/graph) or put the numbers in a margin of the graph where they will not distract from the graph itself.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org On Behalf Of soeren.vo...@eawag.ch
Sent: Friday, February 27, 2009 5:33 AM
To: r-help@r-project.org
Subject: [R] add absolute value to bars in barplot

Hello,

barplot(twcons.area, beside = TRUE, col = c("green4", "blue", "red3", "gray"),
        xlab = "estate", ylab = "number of persons", ylim = c(0, 110),
        legend.text = c("treated", "mix", "untreated", NA))

produces a barplot very fine. In addition, I'd like to get the bars' absolute values on top of the bars. How can I produce this in an easy way?

Thanks
Sören
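For completeness, if one does decide to print the values despite the caveat above, barplot() returns the x-coordinates of the bar midpoints, which can be passed to text(). The matrix below is a made-up stand-in, since twcons.area is not shown in the post:

```r
# Made-up stand-in for the poster's twcons.area matrix
m <- matrix(c(40, 25, 18, 7,
              55, 30, 12, 9), nrow = 4)

# barplot() invisibly returns the bar midpoints (same shape as m)
bp <- barplot(m, beside = TRUE, ylim = c(0, 110))

# pos = 3 places each label just above the corresponding bar top
text(bp, m, labels = m, pos = 3)
```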
[R] [R-pkgs] mefa 3.0-0
Dear R Community,

I am pleased to announce that a new version of the mefa R package is available on CRAN. mefa is a package for multivariate data handling in ecology and biogeography. It provides object classes to represent data coded by samples, taxa and segments (i.e., subpopulations, repeated measures). It supports easy processing of the data along with relational data tables for samples and taxa. An object of class mefa is a project-specific compendium of the dataset and can easily be used in further analyses. Methods are provided for extraction, aggregation, conversion, plotting, summary and reporting of mefa objects. Reports can be generated in plain text or LaTeX. The current version has been published in JSS ( http://www.jstatsoft.org/v29/i08 ). The paper presents worked examples on a variety of ecological analyses.

Best wishes,
Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1 Canada
email <- paste("solymos", "ualberta.ca", sep = "@")

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
[R] testing two-factor anova effects using model comparison approach with lm() and anova()
I wonder if someone could explain the behavior of the anova() and lm() functions in the following situation. I have a standard 3x2 factorial design: factorA has 3 levels, factorB has 2 levels, and they are fully crossed. I have a dependent variable DV. Of course I can do the following to get the usual anova table:

anova(lm(DV ~ factorA + factorB + factorA:factorB))
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This is perfectly satisfactory for my situation, but as a pedagogical exercise I wanted to demonstrate the model comparison approach to analysis of variance by using anova() to compare a full model that contains all effects to restricted models that contain all effects save for the effect of interest. The test of the interaction effect seems to be as I expected:

fullmodel <- lm(DV ~ factorA + factorB + factorA:factorB)
restmodel <- lm(DV ~ factorA + factorB)
anova(fullmodel, restmodel)
Analysis of Variance Table

Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorA + factorB
  Res.Df     RSS Df Sum of Sq      F   Pr(>F)
1     24 18.0000
2     26 27.8667 -2   -9.8667 6.5778 0.005275 **

As you can see, the value of F (6.5778) is the same as in the anova table above. All is well. However, if I try to test a main effect, e.g. factorA, by testing the full model against a restricted model that doesn't contain the main effect factorA, I get something strange:

restmodel <- lm(DV ~ factorB + factorA:factorB)
anova(fullmodel, restmodel)
Analysis of Variance Table

Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorB + factorA:factorB
  Res.Df RSS Df Sum of Sq F Pr(>F)
1     24  18
2     24  18  0         0

Upon inspection of each model, I see that the Residuals are identical, which is not what I was expecting:

anova(fullmodel)
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This looks fine, but the restricted model is where things are not as I expected:

anova(restmodel)
Analysis of Variance Table

Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorB          1  2.1333  2.1333  2.8444 0.104648
factorB:factorA  4 17.3333  4.3333  5.7778 0.002104 **
Residuals       24 18.0000  0.7500

I was expecting the Residuals in the restricted model (the one not containing the main effect of factorA) to be larger than in the full model containing all three effects. In other words, the variance accounted for by the main effect factorA should be added to the Residuals. Instead, it looks like the variance accounted for by the main effect of factorA is being soaked up by the factorA:factorB interaction term. Strangely, the degrees of freedom are also affected. I must be misunderstanding something here. Can someone point out what is happening?

Thanks,
-Paul

--
Paul L. Gribble, Ph.D.
Associate Professor
Dept. Psychology
The University of Western Ontario
London, Ontario
Canada N6A 5C2
Tel. +1 519 661 2111 x82237
Fax. +1 519 661 3961
pgrib...@uwo.ca
http://gribblelab.org
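The behaviour the poster describes can be reproduced on simulated data: with the interaction retained, the formula ~ factorB + factorA:factorB spans the same set of six cell means as the full model, so the residual sum of squares cannot change. A minimal sketch (the data here are simulated, not the poster's):

```r
set.seed(42)
# Fully crossed 3x2 design with 5 replicates per cell
d <- expand.grid(factorA = factor(1:3), factorB = factor(1:2))
d <- d[rep(seq_len(nrow(d)), each = 5), ]
d$DV <- rnorm(nrow(d))

full <- lm(DV ~ factorA + factorB + factorA:factorB, data = d)
rest <- lm(DV ~ factorB + factorA:factorB, data = d)

# Both design matrices have rank 6 (one parameter per cell), so
# dropping the factorA main effect while keeping the interaction
# leaves the residual sum of squares unchanged and the F test is 0/0
c(full = deviance(full), rest = deviance(rest))
```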
Re: [R] formula formatting/grammar for regression
This is just (or should be) a simple example of what I would like to extend to further regression, which is why I was looking for a resource on the grammar. If I try lm(ypts ~ exp(xpts)), I only get an intercept and one coefficient. And for the coefficient, I am not sure where that should go (i.e., is that A or r in the formula y = A*exp(r*x)?). Also, when I tried to use nls, I get an error:

nls(ypts ~ exp(xpts))
Error in getInitial.default(func, data, mCall = as.list(match.call(func, :
  no 'getInitial' method found for function objects

If someone could please point out what I am doing wrong, or point me to a good resource on this, I would greatly appreciate it. Thanks!

Dieter Menne wrote:
Brigid Mooney bkmooney at gmail.com writes:
I am doing some basic regression analysis, and am getting a bit confused on how to enter non-polynomial formulas to be used. ... But am confused on what the formula should be for trying to find a fit to y = A*exp(r*x).

If this example is just a placeholder for something more complex than poly, you should check the function nls, which works for non-linear functions. However, if you really want to solve this problem only, taking the log of your data and fitting the log of the above function with lm() is the easiest way out. Results can be a bit different from the nonlinear case depending on noise, because in one case the weights are log-weighted, and in the other linear.

Dieter

--
View this message in context: http://www.nabble.com/formula-formatting-grammar-for-regression-tp22249014p22251094.html
Sent from the R help mailing list archive at Nabble.com.
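Putting the two suggestions side by side: the formula given to nls() must name the parameters (A and r), and nls() needs starting values; with lm() the same model is fit on the log scale. A sketch on simulated data (the true values A = 2 and r = 0.8 are made up for illustration):

```r
set.seed(1)
xpts <- seq(0, 2, length.out = 50)
# Simulated y = A*exp(r*x) with multiplicative noise
ypts <- 2 * exp(0.8 * xpts) * exp(rnorm(50, sd = 0.05))

# Non-linear least squares: name A and r in the formula, give start values
fit_nls <- nls(ypts ~ A * exp(r * xpts), start = list(A = 1, r = 0.5))
coef(fit_nls)  # estimates of A and r

# Equivalent linear fit on the log scale: log(y) = log(A) + r*x
fit_lm <- lm(log(ypts) ~ xpts)
exp(coef(fit_lm)[[1]])  # estimate of A
coef(fit_lm)[[2]]       # estimate of r
```

As Dieter notes, the two fits weight the noise differently, so the estimates will agree only approximately.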
Re: [R] Inefficiency of SAS Programming
A further example of software-pricing dynamics is the complete lack of awareness of WPS, a UK-based software package which is basically a Base SAS clone with all the features of SAS (coding, read/write and data read/write) and priced at only $660 per desktop and $1400 for server licenses: very, very cheap compared to SAS Base. And it has a Bridge to R for higher-level statistics. You would think a corporate user would not have any hesitation to switch to a clone priced at 10%, yet there are hardly any takers for it in the federal government. :))

People worried about their government's spending should use the new website http://www.recovery.gov/?q=content/contact ; it is supposed to chronicle this, and it would be a good test and control for the Web 2.0 initiatives.

On Fri, Feb 27, 2009 at 11:18 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
[quoted discussion elided]
Re: [R] help with projection pursuit
In my experience (and per the help pages, now that I look) the predict functions need named arguments that match up with the column names in the model, and generally these need to be supplied as a dataframe or a list. (Note: at least on my machine the rock dataframe does *not* have the names you offered.)

predict(rock.ppr, list(area = 10, peri = 3, shape = 2))
# or...
predict(rock.ppr, data.frame(area = 10, peri = 3, shape = 2))

predict(rock.ppr, list(area = 10, peri = 3, shape = 2))
       1
7.118094

--
David Winsemius

On Feb 27, 2009, at 10:09 AM, Olivier MARTIN wrote:
[original question elided]
Re: [R] Inefficiency of SAS Programming
Frank, A programming language's efficiency is a function of several items, including what you are trying to program. Without using SAS PROC IML, I have found that it is more efficient to code algorithms (e.g. a least squares linear regression) using R than SAS; we all know that matrix notation leads to more compact syntax than can be had when using non-matrix notation, and R implements matrix notation. On the other hand, searching, sub-setting, merging, etc. can at times be coded more efficiently, more easily, and in a more easily understood fashion in SAS. I am sure there are people who use SAS to set up their datasets and then use R when they are developing an algorithm. Just as French may be a better language to express love, Italian a better language in which to write opera, and English the most efficient language for communication (at least for the last 50 years), so too do both R and SAS have a place in the larger world. John

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Frank E Harrell Jr f.harr...@vanderbilt.edu 2/27/2009 12:52 PM John Sorkin wrote: Terry's remarks (see below) are well received; however, I take issue with one part of his comments. As a long time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-language macros in R. While macros can make for difficult-to-read code, when used properly they can also make flexible code that, if properly written (including good documentation, which should be a part of any code), can be easy to read. Finally, everyone must remember that SAS code can be difficult to understand or inefficient just as R code can be difficult to understand or inefficient.
In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is not fair, nor correct, to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, and other things are more clearly and more easily done in another language. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be. John

Nice points John. My only response is that I learned SAS in 1969 and used it intensively until 1991. I wrote some of the first user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, PHGLM) and wrote extensively in the macro language. After using S-Plus for only one month my productivity was far ahead of my productivity using SAS. Frank

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Terry Therneau thern...@mayo.edu 2/27/2009 10:23 AM Three comments: I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions. a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better. b. If I were writing the S code for such an application, it would look much the same.
I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor. Terry Therneau
Re: [R] testing two-factor anova effects using model comparison approach with lm() and anova()
Notice the degrees of freedom as well in the different models. With factors A and B, the two models

A + B + A:B

and

A + A:B

are actually the same overall model, just different parameterizations (you can also see this by using x=TRUE in the call to lm and looking at the x matrix used). Testing whether the main effect A should be in the model, given that the interaction is in the model, does not make sense in most cases; therefore the notation gives a different parameterization rather than the generally uninteresting test.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Paul Gribble Sent: Friday, February 27, 2009 11:01 AM To: r-help@r-project.org Subject: [R] testing two-factor anova effects using model comparison approach with lm() and anova()

I wonder if someone could explain the behavior of the anova() and lm() functions in the following situation: I have a standard 3x2 factorial design, factorA has 3 levels, factorB has 2 levels, they are fully crossed. I have a dependent variable DV. Of course I can do the following to get the usual anova table:

anova(lm(DV ~ factorA + factorB + factorA:factorB))

Analysis of Variance Table
Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This is perfectly satisfactory for my situation, but as a pedagogical exercise, I wanted to demonstrate the model comparison approach to analysis of variance by using anova() to compare a full model that contains all effects to restricted models that contain all effects save for the effect of interest.
The test of the interaction effect seems to be as I expected:

fullmodel <- lm(DV ~ factorA + factorB + factorA:factorB)
restmodel <- lm(DV ~ factorA + factorB)
anova(fullmodel, restmodel)

Analysis of Variance Table
Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorA + factorB
  Res.Df     RSS Df Sum of Sq      F   Pr(>F)
1     24 18.0000
2     26 27.8667 -2   -9.8667 6.5778 0.005275 **

As you can see, the value of F (6.5778) is the same as in the anova table above. All is well. However, if I try to test a main effect, e.g. factorA, by testing the full model against a restricted model that doesn't contain the main effect factorA, I get something strange:

restmodel <- lm(DV ~ factorB + factorA:factorB)
anova(fullmodel, restmodel)

Analysis of Variance Table
Model 1: DV ~ factorA + factorB + factorA:factorB
Model 2: DV ~ factorB + factorA:factorB
  Res.Df RSS Df Sum of Sq F Pr(>F)
1     24  18
2     24  18  0         0

Upon inspection of each model I see that the residuals are identical, which is not what I was expecting:

anova(fullmodel)

Analysis of Variance Table
Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorA          2  7.4667  3.7333  4.9778 0.015546 *
factorB          1  2.1333  2.1333  2.8444 0.104648
factorA:factorB  2  9.8667  4.9333  6.5778 0.005275 **
Residuals       24 18.0000  0.7500

This looks fine, but then the restricted model is where things are not as I expected:

anova(restmodel)

Analysis of Variance Table
Response: DV
                Df  Sum Sq Mean Sq F value   Pr(>F)
factorB          1  2.1333  2.1333  2.8444 0.104648
factorB:factorA  4 17.3333  4.3333  5.7778 0.002104 **
Residuals       24 18.0000  0.7500

I was expecting the residuals in the restricted model (the one not containing the main effect of factorA) to be larger than in the full model containing all three effects. In other words, the variance accounted for by the main effect factorA should be added to the residuals. Instead, it looks like the variance accounted for by the main effect of factorA is being soaked up by the factorA:factorB interaction term. Strangely, the degrees of freedom are also affected.
I must be misunderstanding something here. Can someone point out what is happening? Thanks, -Paul

-- Paul L. Gribble, Ph.D. Associate Professor Dept. Psychology The University of Western Ontario London, Ontario Canada N6A 5C2 Tel. +1 519 661 2111 x82237 Fax. +1 519 661 3961 pgrib...@uwo.ca http://gribblelab.org

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
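Greg's point about equivalent parameterizations can be checked directly. This is a hedged sketch with simulated data (the original DV was never posted, so `rnorm` stands in): both formulas span the same column space, so they give identical fitted values and identical residual sums of squares, which is exactly why the model comparison shows a zero-df, zero-SS difference.

```r
set.seed(1)
# Fully crossed 3x2 design, 5 replicates per cell (simulated stand-in data)
d <- expand.grid(A = factor(1:3), B = factor(1:2), rep = 1:5)
d$DV <- rnorm(nrow(d))

full <- lm(DV ~ A + B + A:B, data = d)  # main effects plus interaction
rest <- lm(DV ~ B + A:B, data = d)      # drops A, but the A:B term absorbs it

# Same column space, hence identical fits and residual SS
all.equal(fitted(full), fitted(rest))
c(deviance(full), deviance(rest))
```

Looking at `model.matrix(full)` and `model.matrix(rest)` (per Greg's x=TRUE suggestion) shows two rank-6 matrices spanning the same space, just with different columns.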
Re: [R] Inefficiency of SAS Programming
My apologies; this obviously doubles as my 'for registration purposes' account, so I don't often send from it - I was not intentionally being so secretive :) At any rate, I completely agree, but of course it's a reciprocal relationship. The software is written in SAS because that's what the organizations use; the organizations use SAS because that's what the programs are written in... For better or worse, SAS's integration in big bureaucracies is the main thing that keeps it competitive in the marketplace and viable. There aren't a lot of other contexts in which their pricing structure would work. Bryan

On Fri, Feb 27, 2009 at 12:48 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: spam me wrote: I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports. To be fair, though, there are a lot of other factors (besides efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error-prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high-level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate-level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year.
There are other points that could be made, but the main idea is that I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.

Dear Anonymous, Nice points. I would just add that it would be better if government-sponsored projects resulted in software that could be run without expensive licenses. Thanks Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inefficiency of SAS Programming
Also because no one wants to put their neck out on a chopping block to suggest R without technical support and the like. If you use SAS, there's a cascade of blame available, but it's not immediately available for R. On Fri, Feb 27, 2009 at 10:36 AM, Bryan thespamho...@gmail.com wrote: My apologies, this obviously doubles as my for registration purposes account and so I don't often send from it - I was not intentionally being so secretive : ) At any rate, I completely agree, but of course it's a reciprocal relationship. The software is written in SAS because that's what the organizations use, the organizations use SAS because that's what the programs are written in... For better or worse, SAS's integration in big bureaucracies is the main thing that keeps it competitive in the marketplace and viable. There aren't a lot of other contexts in which their pricing structure would work. Bryan On Fri, Feb 27, 2009 at 12:48 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: spam me wrote: I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports. To be fair though, there are a lot of other factors (beside efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high level programmers on staff. 
Writing code so that it is accessible, editable, and interpretable by intermediate-level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year. There are other points that could be made, but the main idea is that I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.

Dear Anonymous, Nice points. I would just add that it would be better if government-sponsored projects resulted in software that could be run without expensive licenses. Thanks Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] select Intercept coefficients only
Hi friends, Is there a function to select intercept coefficients only? When I use coefficients() it shows me all the coefficients, but I only want a specific coefficient.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making tapply code more efficient
On something the size of your data it took about 30 seconds to determine the number of unique teachers per student.

x <- cbind(sample(326397, 800967, TRUE), sample(20, 800967, TRUE))

# split the data so you have the number of teachers per student
system.time(t.s <- split(x[,2], x[,1]))
   user  system elapsed
   0.92    0.01    0.94

t.s[1:7] # sample data
$`1`
[1] 16
$`2`
[1] 3
$`3`
[1] 1
$`4`
[1] 17
$`6`
[1]  9  9 19
$`7`
[1] 20
$`9`
[1]  3 16 16 10  8 17

# count number of unique teachers per student
system.time(t.a <- sapply(t.s, function(x) length(unique(x))))
   user  system elapsed
  20.17    0.10   20.26

t.a[1:10]
 1  2  3  4  6  7  9 10 11 12
 1  1  1  1  2  1  5  1  1  1

On Fri, Feb 27, 2009 at 9:46 AM, Doran, Harold hdo...@air.org wrote:

Previously, I posed the question pasted down below to the list and received some very helpful responses. While the code suggestions provided in response indeed work, they seem to only work with *very* small data sets and so I wanted to follow up and see if anyone had ideas for better efficiency. I was quite embarrassed by this as our SAS programmers cranked out programs that did this in the blink of an eye (with a few variables), but R was spinning for days on my Ubuntu machine and ultimately I saw a message that R was killed. The data I am working with has 800967 total rows and 31 total columns. The ID variable I use as the index variable in tapply() has 326397 unique cases.

length(unique(qq$student_unique_id))
[1] 326397

To give a sense of what my data look like and the actual problem, consider the following:

qq <- data.frame(student_unique_id = factor(c(1,1,2,2,2)), teacher_unique_id = factor(c(10,10,20,20,25)))

This is a student achievement database where students occupy multiple rows in the data and the variable teacher_unique_id denotes the class the student was in. What I am doing is looking to see if the teacher is the same for each instance of the unique student ID.
So, if I implement the following:

same <- function(x) length(unique(x)) == 1
results <- data.frame(
  freq = tapply(qq$student_unique_id, qq$student_unique_id, length),
  tch  = tapply(qq$teacher_unique_id, qq$student_unique_id, same))

I get the following results. I can see that student 1 appears in the data twice and the teacher is always the same. However, student 2 appears three times and the teacher is not always the same.

results
  freq   tch
1    2  TRUE
2    3 FALSE

Now, implementing this same procedure on a large data set with the characteristics described above seems to be problematic in this implementation. Does anyone have reactions on how this could be made more efficient such that it can run with large data as I described? Harold

sessionInfo()
R version 2.8.1 (2008-12-22)
x86_64-pc-linux-gnu
locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

# Original question posted on 1/13/09

Suppose I have a dataframe as follows:

dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25), var2 = c('foo', 'foo', 'foo', 'foobar', 'foo'))

Now, if I were to subset by id, such as:

subset(dat, id==1)
  id var1 var2
1  1   10  foo
2  1   10  foo

I can see that the elements in var1 are exactly the same and the elements in var2 are exactly the same. However,

subset(dat, id==2)
  id var1   var2
3  2   20    foo
4  2   20 foobar
5  2   25    foo

shows the elements are not the same for either variable in this instance. So, what I am looking to create is a data frame that would be like this:

  id freq  var1  var2
1  1    2  TRUE  TRUE
2  2    3 FALSE FALSE

Where freq is the number of times the ID is repeated in the dataframe. A TRUE appears in the cell if all elements in the column are the same for the ID and FALSE otherwise. It is insignificant which values differ for my problem.
The way I am thinking about tackling this is to loop through the ID variable and compare the values in the various columns of the dataframe. The problem I am encountering is that I don't think all.equal or identical are the right functions in this case. So, say I was wanting to compare the elements of var1 for id == 1. I would have

x <- c(10, 10)

Of course, the following works:

all.equal(x[1], x[2])
[1] TRUE

As would a similar call to identical. However, what if I only have a vector of values (or if the column consists of names) that I want to assess for equality when I am trying to automate a process over thousands of cases? As in the example above, the vector may contain only two values or it may contain many more. The number of values in the vector differs by id. Any thoughts? Harold
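As a footnote to Jim's timing above, one way to avoid applying a function over 326,397 splits entirely is to deduplicate the student/teacher pairs first and then tabulate. This is a sketch on Harold's small example data (column names follow his post); the same vectorized idea should scale to the full data set, though the timings here are not benchmarked:

```r
qq <- data.frame(student_unique_id = factor(c(1, 1, 2, 2, 2)),
                 teacher_unique_id = factor(c(10, 10, 20, 20, 25)))

# Keep each distinct (student, teacher) pair exactly once
u <- unique(qq)

# Distinct teachers per student, in one vectorized pass
n.teachers <- table(u$student_unique_id)

results <- data.frame(freq = as.vector(table(qq$student_unique_id)),
                      tch  = as.vector(n.teachers) == 1)
results
#   freq   tch
# 1    2  TRUE
# 2    3 FALSE
```

A student has a single teacher exactly when the count of distinct pairs for that student is 1, so no per-group `length(unique(...))` call is needed.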
Re: [R] select Intercept coefficients only
choonhong ang wrote: Hi friends, Is there a function to select intercept coefficients only? When I use coefficients() it shows me all the coefficients, but I only want a specific coefficient.

What about indexing, e.g. as in:

coefficients(some_lm_object)["(Intercept)"]

Uwe Ligges

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
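A concrete illustration of Uwe's suggestion on a built-in dataset (the model here is only an example, not from the original post): the coefficient vector returned by coef()/coefficients() is named, so any single coefficient can be pulled out by name.

```r
fit <- lm(dist ~ speed, data = cars)  # example model on a builtin dataset

# Coefficients form a named vector, so the intercept is extractable by name
coef(fit)["(Intercept)"]

# Or drop the name entirely with unname()
unname(coef(fit)[1])
```

The same name-based indexing works for any other coefficient, e.g. coef(fit)["speed"].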