Re: [R] [OT] Important stat dates

2006-09-07 Thread Marc Schwartz (via MN)
On Thu, 2006-09-07 at 11:57 -0500, Erin Hodgess wrote:
> Dear R People:
> 
> Way Off Topic:
> 
> Is anyone aware of a website that contains important dates
> in statistics history, please?
> 
> Maybe a sort of "This Day in Statistics", please?
> 
> I thought that my students might get a kick out of that.
> 
> (actually I will probably enjoy it more than them!)
> 
> Thanks for any help!
> 
> I tried (via Google) "today in statistics" and "today in statistics
> history" but nothing worthwhile appeared.


Here are two pages that you might find helpful:

  http://www.york.ac.uk/depts/maths/histstat/welcome.htm

  http://www.economics.soton.ac.uk/staff/aldrich/Figures.htm

Both have additional references and reciprocal links.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coerce matrix to number

2006-09-12 Thread Marc Schwartz (via MN)
On Tue, 2006-09-12 at 18:42 +0200, Simone Gabbriellini wrote:
> Dear List,
> 
> how can I coerce a matrix like this
> 
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] "0"  "1"  "1"  "0"  "0"  "0"
> [2,] "1"  "0"  "1"  "0"  "0"  "0"
> [3,] "1"  "1"  "0"  "0"  "0"  "0"
> [4,] "0"  "0"  "0"  "0"  "1"  "0"
> [5,] "0"  "0"  "0"  "1"  "0"  "0"
> [6,] "0"  "0"  "0"  "0"  "0"  "0"
> 
> to be filled with numbers?
> 
> this is the result of replacing some character ("v", "d") with 0 and  
> 1, using the code I found with RSiteSearch()
> 
> z[] <- lapply(z, factor, levels = c("d", "v"), labels = c(0, 1));
> 
> thank you,
> Simone


I reverse engineered your (presumably) original data frame:

> z
  1 2 3 4 5 6
1 d v v d d d
2 v d v d d d
3 v v d d d d
4 d d d d v d
5 d d d v d d
6 d d d d d d


> str(z)
`data.frame':   6 obs. of  6 variables:
 $ 1: Factor w/ 2 levels "d","v": 1 2 2 1 1 1
 $ 2: Factor w/ 2 levels "d","v": 2 1 2 1 1 1
 $ 3: Factor w/ 2 levels "d","v": 2 2 1 1 1 1
 $ 4: Factor w/ 2 levels "d","v": 1 1 1 1 2 1
 $ 5: Factor w/ 2 levels "d","v": 1 1 1 2 1 1
 $ 6: Factor w/ 2 levels "d","v": 1 1 1 1 1 1



If that is correct, then the following should yield what you want in one
step:

> z.num <- sapply(z, function(x) as.numeric(x) - 1)

> z.num
     1 2 3 4 5 6
[1,] 0 1 1 0 0 0
[2,] 1 0 1 0 0 0
[3,] 1 1 0 0 0 0
[4,] 0 0 0 0 1 0
[5,] 0 0 0 1 0 0
[6,] 0 0 0 0 0 0

> str(z.num)
 num [1:6, 1:6] 0 1 1 0 0 0 1 0 1 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:6] "1" "2" "3" "4" ...
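
Note that as.numeric() on a factor returns the internal level codes, so
the "- 1" above relies on "d" being the first level and "v" the second.
A minimal sketch that makes the mapping explicit instead (the small data
frame here is made up for illustration, it is not the poster's data):

```r
# Hypothetical two-column data frame of factors with levels "d" and "v"
z <- data.frame(a = factor(c("d", "v", "v")),
                b = factor(c("v", "d", "d")))

# Map "v" to 1 and everything else to 0, independent of level order
z.num <- sapply(z, function(x) as.numeric(x == "v"))

z.num
str(z.num)
```

This gives the same kind of numeric matrix as the sapply() call above,
but would not silently change if the levels were ordered differently.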



Alternatively, if you were starting out with the character matrix:

> z.char
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "0"  "1"  "1"  "0"  "0"  "0"
[2,] "1"  "0"  "1"  "0"  "0"  "0"
[3,] "1"  "1"  "0"  "0"  "0"  "0"
[4,] "0"  "0"  "0"  "0"  "1"  "0"
[5,] "0"  "0"  "0"  "1"  "0"  "0"
[6,] "0"  "0"  "0"  "0"  "0"  "0"


You could do:

> storage.mode(z.char) <- "numeric"

> z.char
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    1    1    0    0    0
[2,]    1    0    1    0    0    0
[3,]    1    1    0    0    0    0
[4,]    0    0    0    0    1    0
[5,]    0    0    0    1    0    0
[6,]    0    0    0    0    0    0

> str(z.char)
 num [1:6, 1:6] 0 1 1 0 0 0 1 0 1 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : NULL



Yet another alternative:

> matrix(as.numeric(z.char), dim(z.char))
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    1    1    0    0    0
[2,]    1    0    1    0    0    0
[3,]    1    1    0    0    0    0
[4,]    0    0    0    0    1    0
[5,]    0    0    0    1    0    0
[6,]    0    0    0    0    0    0
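
For anyone who wants to try the character-matrix route without the
original data, here is a small self-contained sketch. The 2 x 2 matrix
is made up for illustration, and I use "double" as the storage mode,
which is the underlying type of a numeric matrix:

```r
# Hypothetical character matrix holding "0" and "1" strings
z.small <- matrix(c("0", "1", "1", "0"), nrow = 2)

# Work on a copy so the character version stays available
z.small.num <- z.small
storage.mode(z.small.num) <- "double"

# Dimensions are kept; only the element type changes
is.numeric(z.small.num)
dim(z.small.num)
```

Any cell that does not look like a number would become NA (with a
warning), so it is worth checking with anyNA() or is.na() afterwards.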



HTH,

Marc Schwartz



Re: [R] group bunch of lines in a data.frame, an additional requirement

2006-09-13 Thread Marc Schwartz (via MN)
Try something like this:

# Initial data frame
> DF
  V1  V2  V3  V4
1  A 1.0 200 ID1
2  A 3.0 800 ID1
3  A 2.0 200 ID1
4  B 0.5  20 ID2
5  B 0.9  50 ID2
6  C 5.0  70 ID1


# Now do the aggregation to get the means
DF.1 <- aggregate(DF[, 2:3], list(V1 = DF$V1), mean)


> DF.1
  V1  V2  V3
1  A 2.0 400
2  B 0.7  35
3  C 5.0  70


# Now get the unique combinations of letters and IDs in DF
DF.U <- unique(DF[, c("V1", "V4")])

> DF.U
  V1  V4
1  A ID1
4  B ID2
6  C ID1


# Now merge the two data frames together, matching the letters
DF.NEW <- merge(DF.1, DF.U, by = "V1")

> DF.NEW
  V1  V2  V3  V4
1  A 2.0 400 ID1
2  B 0.7  35 ID2
3  C 5.0  70 ID1


See ?unique and ?merge for more information.

Also, for the sake of clarification, these are not matrices, but data
frames. A matrix may contain only one data type, whereas data frames are
specifically designed to contain multiple data types as you have here.
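
Since each letter always has the same ID, another possible approach is
to group by both columns in a single aggregate() call. This is only a
sketch, and it assumes the one-ID-per-letter rule really holds in the
full data; the data frame is rebuilt here so the example stands alone:

```r
# Rebuild the example data frame from the post
DF <- data.frame(V1 = c("A", "A", "A", "B", "B", "C"),
                 V2 = c(1.0, 3.0, 2.0, 0.5, 0.9, 5.0),
                 V3 = c(200, 800, 200, 20, 50, 70),
                 V4 = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID1"))

# Group by letter and ID together; only combinations that occur are kept
DF.BOTH <- aggregate(DF[, c("V2", "V3")],
                     list(V1 = DF$V1, V4 = DF$V4), mean)

DF.BOTH
```

The column order differs from the merge() result (the grouping columns
come first), and the rows are sorted by the grouping variables rather
than in the original order.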

HTH,

Marc Schwartz

On Wed, 2006-09-13 at 17:38 +0100, Emmanuel Levy wrote:
> Thanks for pointing me out "aggregate", that works fine!
> 
> There is one complication though: I have mixed types (numerical and 
> character),
> 
> So the matrix is of the form:
> 
> A 1.0 200 ID1
> A 3.0 800 ID1
> A 2.0 200 ID1
> B 0.5 20   ID2
> B 0.9 50   ID2
> C 5.0 70   ID1
> 
> One letter always has the same ID but one ID can be shared by many
> letters (like ID1)
> 
> I just want to keep track of the ID, and get a matrix like:
> 
> A 2.0 400 ID1
> B 0.7 35 ID2
> C 5.0 70 ID1
> 
> Any idea on how to do that without a loop?
> 
>   Many thanks,
> 
>  Emmanuel
> 
> On 9/12/06, Emmanuel Levy <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I'd like to group the lines of a matrix so that:
> > A 1.0 200
> > A 3.0 800
> > A 2.0 200
> > B 0.5 20
> > B 0.9 50
> > C 5.0 70
> >
> > Would give:
> > A 2.0 400
> > B 0.7 35
> > C 5.0 70
> >
> > So all lines corresponding to a letter (level), become a single line
> > where all the values of each column are averaged.
> >
> > I've done that with a loop but it doesn't sound right (it is very
> > slow). I imagine there is a
> > sort of "apply" shortcut but I can't figure it out.
> >
> > Please note that it is not exactly a matrix I'm using, the function
> > "typeof" tells me it's a list, however I access to it like it was a
> > matrix.
> >
> > Could someone help me with the right function to use, a help topic or
> > a piece of code?
> >
> > Thanks,
> >
> >   Emmanuel
> >



Re: [R] acos(0.5) == pi/3 FALSE

2006-09-18 Thread Marc Schwartz (via MN)
On Mon, 2006-09-18 at 19:31 +0200, Iñaki Murillo Arcos wrote:
> Hello,
> 
>   I don't know if the result of
> 
>   acos(0.5) == pi/3
> 
> is a bug or not. It looks strange to me.
> 
>Inaki Murillo

Seems reasonable to me:

> acos(0.5) == pi/3
[1] FALSE

> print(acos(0.5), 20)
[1] 1.0471975511965978534

> print(pi/3, 20)
[1] 1.0471975511965976313


See R FAQ 7.31 Why doesn't R think these numbers are equal?
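
As a short sketch of what the FAQ recommends, compare floating point
values with a tolerance rather than with ==. The explicit tolerance of
1e-12 in the last line is just an illustrative choice:

```r
# Exact comparison fails because of rounding in the last bits
exact <- acos(0.5) == pi/3

# all.equal() compares within a small tolerance; wrap it in isTRUE()
# because it returns a character description when values differ
close.enough <- isTRUE(all.equal(acos(0.5), pi/3))

# Or test the absolute difference against a tolerance you choose
manual <- abs(acos(0.5) - pi/3) < 1e-12

c(exact = exact, close.enough = close.enough, manual = manual)
```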

HTH,

Marc Schwartz



Re: [R] R CMD check fails at package dependencies check on Fedora Core 5, works on other systems

2006-09-19 Thread Marc Schwartz (via MN)
On Tue, 2006-09-19 at 22:16 +1000, Robert King wrote:
> Here is another thing that might help work out what is happening.  If I 
> use --no-install, ade4 actually fails as well, in the same way as zipfR.
> 
>   [Desktop]$ R CMD check --no-install ade4
>   * checking for working latex ... OK
>   * using log directory '/home/rak776/Desktop/ade4.Rcheck'
>   * using Version 2.3.1 (2006-06-01)
>   * checking for file 'ade4/DESCRIPTION' ... OK
>   * this is package 'ade4' version '1.4-1'
>   * checking if this is a source package ... OK
>   * checking package directory ... OK
>   * checking for portable file names ... OK
>   * checking for sufficient/correct file permissions ... OK
>   * checking DESCRIPTION meta-information ... ERROR
> 
>   [Desktop]$ R CMD check --no-install zipfR
>   * checking for working latex ... OK
>   * using log directory '/home/rak776/Desktop/zipfR.Rcheck'
>   * using Version 2.3.1 (2006-06-01)
>   * checking for file 'zipfR/DESCRIPTION' ... OK
>   * checking extension type ... Package
>   * this is package 'zipfR' version '0.6-0'
>   * checking if this is a source package ... OK
>   * checking package directory ... OK
>   * checking for portable file names ... OK
>   * checking for sufficient/correct file permissions ... OK
>   * checking DESCRIPTION meta-information ... ERROR



Robert,

I tried the process last night (my time) using the initial instructions
on my FC5 system with:

$ R --version
R version 2.3.1 Patched (2006-08-06 r38829)
Copyright (C) 2006 R Development Core Team


I could not replicate the problem.

However, this morning, with your additional communication:

$ R CMD check --no-install zipfR_0.6-0.tar.gz
* checking for working latex ... OK
* using log directory '/home/marcs/Downloads/zipfR.Rcheck'
* using Version 2.3.1 Patched (2006-08-06 r38829)
* checking for file 'zipfR/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'zipfR' version '0.6-0'
* checking if this is a source package ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking Rd cross-references ... WARNING
Warning in grep(pattern, x, ignore.case, extended, value, fixed,
useBytes) :
 input string 70 is invalid in this locale
* checking for missing documentation entries ... WARNING
Warning in grep(pattern, x, ignore.case, extended, value, fixed,
useBytes) :
 input string 70 is invalid in this locale
All user-level objects in a package should have documentation entries.
See chapter 'Writing R documentation files' in manual 'Writing R
Extensions'.
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking DVI version of manual ... OK

WARNING: There were 2 warnings, see
  /home/marcs/Downloads/zipfR.Rcheck/00check.log
for details



So I am wondering if this raises the possibility of a locale issue on
your FC5 system resulting in a problem reading DESCRIPTION files?  It
may be totally unrelated, but one never knows I suppose. Mine is:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
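
If it helps with the comparison, the locale can also be checked from
inside R itself, since that is what R CMD check ends up running under.
A minimal sketch using base functions only:

```r
# Full locale string as seen by the running R session
loc <- Sys.getlocale()

# Just the character-type category, which governs encoding handling
ctype <- Sys.getlocale("LC_CTYPE")

# Whether R thinks it is running in a UTF-8 / multibyte locale
info <- l10n_info()

print(loc)
print(ctype)
print(info[["UTF-8"]])
```

Comparing this output between a system that works and one that fails
might show whether the locale is involved at all.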


HTH,

Marc Schwartz



Re: [R] R CMD check fails at package dependencies check on Fedora Core 5, works on other systems

2006-09-20 Thread Marc Schwartz (via MN)
On Wed, 2006-09-20 at 13:20 +0200, Kurt Hornik wrote:
> >>>>> Marc Schwartz (via MN) writes:
> 
> > On Tue, 2006-09-19 at 22:16 +1000, Robert King wrote:
> >> Here is another thing that might help work out what is happening.  If I 
> >> use --no-install, ade4 actually fails as well, in the same way as zipfR.
> >> 
> >> [Desktop]$ R CMD check --no-install ade4
> >> * checking for working latex ... OK
> >> * using log directory '/home/rak776/Desktop/ade4.Rcheck'
> >> * using Version 2.3.1 (2006-06-01)
> >> * checking for file 'ade4/DESCRIPTION' ... OK
> >> * this is package 'ade4' version '1.4-1'
> >> * checking if this is a source package ... OK
> >> * checking package directory ... OK
> >> * checking for portable file names ... OK
> >> * checking for sufficient/correct file permissions ... OK
> >> * checking DESCRIPTION meta-information ... ERROR
> >> 
> >> [Desktop]$ R CMD check --no-install zipfR
> >> * checking for working latex ... OK
> >> * using log directory '/home/rak776/Desktop/zipfR.Rcheck'
> >> * using Version 2.3.1 (2006-06-01)
> >> * checking for file 'zipfR/DESCRIPTION' ... OK
> >> * checking extension type ... Package
> >> * this is package 'zipfR' version '0.6-0'
> >> * checking if this is a source package ... OK
> >> * checking package directory ... OK
> >> * checking for portable file names ... OK
> >> * checking for sufficient/correct file permissions ... OK
> >> * checking DESCRIPTION meta-information ... ERROR
> 
> > 
> 
> > Robert,
> 
> > I tried the process last night (my time) using the initial instructions
> > on my FC5 system with:
> 
> > $ R --version
> > R version 2.3.1 Patched (2006-08-06 r38829)
> > Copyright (C) 2006 R Development Core Team
> 
> 
> > I could not replicate the problem.
> 
> > However, this morning, with your additional communication:
> 
> > $ R CMD check --no-install zipfR_0.6-0.tar.gz
> > * checking for working latex ... OK
> > * using log directory '/home/marcs/Downloads/zipfR.Rcheck'
> > * using Version 2.3.1 Patched (2006-08-06 r38829)
> > * checking for file 'zipfR/DESCRIPTION' ... OK
> > * checking extension type ... Package
> > * this is package 'zipfR' version '0.6-0'
> > * checking if this is a source package ... OK
> > * checking package directory ... OK
> > * checking for portable file names ... OK
> > * checking for sufficient/correct file permissions ... OK
> > * checking DESCRIPTION meta-information ... OK
> > * checking top-level files ... OK
> > * checking index information ... OK
> > * checking package subdirectories ... OK
> > * checking R files for syntax errors ... OK
> > * checking R files for library.dynam ... OK
> > * checking S3 generic/method consistency ... OK
> > * checking replacement functions ... OK
> > * checking foreign function calls ... OK
> > * checking Rd files ... OK
> > * checking Rd cross-references ... WARNING
> > Warning in grep(pattern, x, ignore.case, extended, value, fixed,
> > useBytes) :
> >  input string 70 is invalid in this locale
> > * checking for missing documentation entries ... WARNING
> > Warning in grep(pattern, x, ignore.case, extended, value, fixed,
> > useBytes) :
> >  input string 70 is invalid in this locale
> > All user-level objects in a package should have documentation entries.
> > See chapter 'Writing R documentation files' in manual 'Writing R
> > Extensions'.
> > * checking for code/documentation mismatches ... OK
> > * checking Rd \usage sections ... OK
> > * checking DVI version of manual ... OK
> 
> > WARNING: There were 2 warnings, see
> >   /home/marcs/Downloads/zipfR.Rcheck/00check.log
> > for details
> 
> 
> 
> > So I am wondering if this raises the possibility of a locale issue on
> > your FC5 system resulting in a problem reading DESCRIPTION files?  It
> > may be totally unrelated, but one never knows I suppose. Mine is:
> 
> > $ locale
> > LANG=en_US.UTF-8
> > LC_CTYPE="en_US.UTF-8"
> > LC_NUMERIC="en_US.UTF-8"
> > LC_TIME="en_US.UTF-8"
> > LC_COLLATE="en_US.UTF-8"
> > LC_MONETARY="en_US.UTF-8"
> > LC_MESSAGES="en_US.UTF-8"
> > LC_PAPER="en_US.UTF-8"
> > LC_NAME="en_US.UTF-8"
> > LC_ADDRESS="en_US.UTF-8"
> > LC_TELEPHONE="en_US.UTF-8"
> > LC_MEASUREMENT="en_US.UTF-8"
> > LC_IDENTIFICATION="en_US.UTF-8"
> > LC_ALL=
> 
> 
> > HTH,
> 
> > Marc Schwartz
> 
> That's a bug in tools:::Rd_aliases (it needs to preprocess the Rd lines,
> which re-encodes if necessary and possible).
> 
> I'll commit a fix later today.
> 
> Thanks for spotting this.
> 
> Best
> -k

Thanks for noting this Kurt!

Regards,

Marc



Re: [R] Beginners manual for emacs and ess

2006-09-20 Thread Marc Schwartz (via MN)
On Wed, 2006-09-20 at 17:03 +0200, Rainer M Krug wrote:
> Hi
> 
> I heard so much about Emacs and ESS that I decided to try it out - but I 
>   am stuck at the beginning.
> 
> Is there anywhere a beginners manual for Emacs & ESS to be used with R? 
> even M-x S tells me it can't start S-Plus - obviously - but I want it to 
> start R...
> 
> Any help welcome (otherwise I will be stuck with Eclipse and R)
> 
> Rainer


There are some reference materials on the main ESS site at:

  http://ess.r-project.org/

In addition, there is a dedicated ESS mailing list, with more info here:

  https://stat.ethz.ch/mailman/listinfo/ess-help

HTH,

Marc Schwartz



Re: [R] [Rd] Sweave processes \Sexpr in commented LaTeX source (2.3.1patched and 2.4.0)

2006-09-20 Thread Marc Schwartz (via MN)
On Wed, 2006-09-20 at 09:09 +0200, Antonio, Fabio Di Narzo wrote:
> Hi.
> 
> 2006/9/20, Marc Schwartz <[EMAIL PROTECTED]>:
> > Hi all,
> >
> > On FC5, using:
> >
> >   Version 2.3.1 Patched (2006-08-06 r38829)
> >
> > and today's
> >
> >   R version 2.4.0 alpha (2006-09-19 r39397)
> >
> > with the following .Rnw file:
> >
> >
> > \documentclass[10pt]{article}
> > \begin{document}
> >
> >This line should print '2': \Sexpr{1 + 1}
> > %% This line should NOT print '2': \Sexpr{1 + 1}
> 
> If it's just a comment, why don't use something like:
> % \ Sexpr (del the space)
> or
> %\sexpr (change 'sexpr' with 'Sexpr')
> or
> %...the 'Sexpr' command (add a backslash in latex code)
> ?
> 
> Antonio.

See my comments in this post on r-devel, where this thread was
originally started:

  https://stat.ethz.ch/pipermail/r-devel/2006-September/039416.html

HTH,

Marc



Re: [R] [ROracle] error loading (undefined symbol: sqlclu)

2006-09-20 Thread Marc Schwartz (via MN)
On Wed, 2006-09-20 at 15:15 -0400, Mathieu Drapeau wrote:
> I have this error when I load the library ROracle:
> > library(ROracle)
> Loading required package: DBI
> Error in dyn.load(x, as.logical(local), as.logical(now)) :
> unable to load shared library
> '/usr/local/lib/R/site-library/ROracle/libs/ROracle.so':
>   /usr/local/lib/R/site-library/ROracle/libs/ROracle.so: undefined
> symbol: sqlclu
> Error in library(ROracle) : .First.lib failed for 'ROracle'
> >
> 
> Also, my LD_LIBRARY_PATH seems to be set correctly:
> drapeau:~> echo $LD_LIBRARY_PATH
> /home/drapeau/lib:/opt/oracle/xe/app/oracle/product/10.2.0/client/lib
> 
> I installed the big database applications (10g) and ROracle 0.5-7
> 
> Your help will be very appreciated to help me solve this error,
> Thank you,
> Mathieu

Where did you set LD_LIBRARY_PATH?

If in one of your shell config files, it is likely that it is being
overridden during your login and thus not being seen within the R
session. 

I had this problem previously and fixed it by adding the library path
to /etc/ld.so.conf (though I am using RODBC instead).

Edit that file (as root), add the path:

/opt/oracle/xe/app/oracle/product/10.2.0/client/lib

and then run [/sbin/]ldconfig to update the current settings.

Be sure also that $ORACLE_HOME is set
to /opt/oracle/xe/app/oracle/product/10.2.0/client

Then try to load ROracle in a new R session.
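
To see what the R session itself inherits, you can query the
environment from within R before calling library(ROracle). This is only
a diagnostic sketch; the variable names are the ones mentioned in this
thread:

```r
# Values as seen by the running R process (empty string if unset)
ld.path <- Sys.getenv("LD_LIBRARY_PATH")
oracle.home <- Sys.getenv("ORACLE_HOME")

# Split the search path into its components for easier reading
ld.dirs <- strsplit(ld.path, ":", fixed = TRUE)[[1]]

print(ld.dirs)
print(oracle.home)

# FALSE here means the variable did not reach the R session
print(nzchar(oracle.home))
```

If the Oracle client directory is missing here even though it shows up
in your shell, that would be consistent with the variable being lost
before R starts.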

HTH,

Marc Schwartz



Re: [R] extract data from lm object and then use again?

2006-09-22 Thread Marc Schwartz (via MN)
On Fri, 2006-09-22 at 10:45 -0400, Mike Wolfgang wrote:
> Hi list,
> 
> I want to write a general function so that it would take an lm object,
> extract its data element, then use the data at another R function (eg, glm).
> I searched R-help list, and found this would do the trick of the first part:
> a.lm$call$data
> this would return a name object but could not be recognized as a
> data.frame by glm. I also tried
> call(as.character(a.lm$call$data))
> or
> eval(call(as.character(a.lm$call$data)))
> neither works.
> 
> By eval(call(...)), it acts as evaluating of a function, but what I want is
> just a data frame object which could be inserted into glm function. Anyone
> could help? Thanks,
> 
> Mike

If the 'data' argument in lm() is used, then this approach could work:

> Iris2 <- eval(lm(Sepal.Length ~ Species, data = iris)$call$data)

> str(Iris2)
`data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1
1 1 1 1 1 ...


However, note that you do not get the actual data used within the lm()
function (the model frame) but the entire source data frame.

What you likely want instead is the model frame containing the columns
actually used in the model formula:

> Iris3 <- lm(Sepal.Length ~ Species, data = iris)$model

> str(Iris3)
`data.frame':   150 obs. of  2 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1
1 1 1 1 1 ...
 - attr(*, "terms")=Classes 'terms', 'formula' length 3 Sepal.Length ~
Species
  .. ..- attr(*, "variables")= language list(Sepal.Length, Species)
  .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:2] "Sepal.Length" "Species"
  .. .. .. ..$ : chr "Species"
  .. ..- attr(*, "term.labels")= chr "Species"
  .. ..- attr(*, "order")= int 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=length 15 
  .. ..- attr(*, "predvars")= language list(Sepal.Length, Species)
  .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
  .. .. ..- attr(*, "names")= chr [1:2] "Sepal.Length" "Species"
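
Putting that together, a general function along the lines you describe
might look like the sketch below. The function name refit.as.glm is
made up for illustration, and it assumes the formula uses plain
variable names (with terms such as log(x) the model frame columns are
named differently and this simple approach would need more care):

```r
# Hypothetical helper: refit an lm object with glm() on its model frame
refit.as.glm <- function(fit, ...) {
  mf <- model.frame(fit)           # data actually used in the fit
  glm(formula(fit), data = mf, ...)
}

fit.lm <- lm(Sepal.Length ~ Species, data = iris)

# With the default gaussian family the coefficients should agree
fit.glm <- refit.as.glm(fit.lm)

coef(fit.lm)
coef(fit.glm)
```

Extra arguments such as family are passed through to glm() via "...".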


HTH,

Marc Schwartz



Re: [R] inequality with NA

2006-09-22 Thread Marc Schwartz (via MN)
On Fri, 2006-09-22 at 20:16 +0200, Mag. Ferri Leberl wrote:
> Dear everybody!
> take a<-c(5,3,NA,6).
> 
> if(a[1]!=NA){b<-7}
> if(a[3]!=5){b<-7}
> if(a[3]!=NA){b<-7}
> if(a[3]==NA){b<-7}
> 
> will altogether return
> 
> Error in if (a[1] != NA) { : missing value where TRUE/FALSE needed
> 
> (or similarly). Somehow this is logical. But how else should I find out
> whether a certain vector component has an existing value?
> Thank you in advance!
> Yours,
> Mag. Ferri Leberl

NA marks a missing (unknown) value, so any comparison with it using ==
or != returns NA rather than TRUE or FALSE, which is why if() fails.
There are specific functions in place for dealing with this.

See ?is.na and ?na.omit

> a
[1]  5  3 NA  6

> a[is.na(a)]
[1] NA

> a[!is.na(a)]
[1] 5 3 6


You can also use which() to find the indices:

> which(is.na(a))
[1] 3

> which(!is.na(a))
[1] 1 2 4


Finally, use na.omit() to remove all NA's:

> na.omit(a)
[1] 5 3 6
attr(,"na.action")
[1] 3
attr(,"class")
[1] "omit"

Note that the object attribute 'na.action' shows that a[3] was removed:

> a.omit <- na.omit(a)

> as.vector(attr(a.omit, "na.action"))
[1] 3
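
For the if() statements in your post, a minimal sketch of the usual
pattern is to test with is.na() first, so the condition is always TRUE
or FALSE:

```r
a <- c(5, 3, NA, 6)

# Comparing with NA gives NA, which if() cannot use
cmp <- a[3] != 5

# Test for missingness explicitly instead
third.missing <- is.na(a[3])

# Guard the comparison: && stops at the first FALSE
b <- 0
if (!is.na(a[3]) && a[3] != 5) b <- 7

# a[1] is present and equals 5, so this does not change b either
if (!is.na(a[1]) && a[1] != 5) b <- 7

print(cmp)
print(third.missing)
print(b)
```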

HTH,

Marc Schwartz



Re: [R] proj4R library will not install

2006-09-22 Thread Marc Schwartz (via MN)
On Fri, 2006-09-22 at 17:16 -0400, Philip Bermingham wrote:
> I'm hoping someone can help me.  I have downloaded the proj4R.zip and
> under my version of R (2.3.1) I install the package from local zip
> file. This worked great.  I then type library(proj4R) to load the
> library and I get the error: Error in library(proj4R) : 'proj4R' is
> not a valid package -- installed < 2.0.0?  I have read through the
> install documentation and have downloaded and unpacked PROJ.4 to c:
> \proj\ so the bin is located at C:\proj\bin.  I then set the
> environmental variables PATH which now looks like : %SystemRoot%
> \system32;%SystemRoot%;%SystemRoot%\System32\Wbem;C:\Program Files
> \Intel\DMIX;C:\Program Files\UltraEdit;C:\proj and I created a new
> user variable PROJ_LIB to c:\proj\nad.  I'm not sure if I am missing
> anything here but I still get the <2.0.0 error.  If you can help me in
> any way I would truly appreciate it.
> 
> Thanks in advance,
> 
> Philip Bermingham


proj4R is not a base or CRAN R package. Some Googling suggests that the
R package might be deprecated, as it has not been updated for some time
(hence the error msgs) based upon a review of the R related archive
files available at:

http://spatial.nhh.no/R/Devel/

I would suggest communicating with Roger Bivand (who I have cc'd here)
as to the status of the package and any subsequent replacements.

HTH,

Marc Schwartz



Re: [R] Splitting a character variable into a numeric one and a character one?

2006-09-25 Thread Marc Schwartz (via MN)
On Mon, 2006-09-25 at 11:30 -0500, Marc Schwartz (via MN) wrote:
> On Mon, 2006-09-25 at 11:04 -0500, Frank Duan wrote:
> > Hi All,
> > 
> > I have a data with a variable like this:
> > 
> > Column 1
> > 
> > "123abc"
> > "12cd34"
> > "1e23"
> > ...
> > 
> > Now I want to do an operation that can split it into two variables:
> > 
> > Column 1  Column 2  Column 3
> > 
> > "123abc"  123       "abc"
> > "12cd34"  12        "cd34"
> > "1e23"    1         "e23"
> > ...
> > 
> > So basically, I want to split the original variable into a numeric one and a
> > character one, while the splitting element is the first character in Column
> > 1.
> > 
> > I searched the forum with key words "strsplit" and "substr", but still can't
> > solve this problem. Can anyone give me some hints?
> > 
> > Thanks in advance,
> > 
> > FD
> 
> 
> Something like this using gsub() should work I think:
> 
> > DF
>   V1
> 1 123abc
> 2 12cd34
> 3   1e23
> 
> 
> # Replace letters and any following chars with ""
> DF$V2 <- gsub("[A-Za-Z]+.*", "", DF$V1)

Quick typo correction here. It should be:

DF$V2 <- gsub("[A-Za-z]+.*", "", DF$V1)

The second 'z' should be lower case.


Marc



Re: [R] Splitting a character variable into a numeric one and a character one?

2006-09-25 Thread Marc Schwartz (via MN)
On Mon, 2006-09-25 at 11:04 -0500, Frank Duan wrote:
> Hi All,
> 
> I have a data with a variable like this:
> 
> Column 1
> 
> "123abc"
> "12cd34"
> "1e23"
> ...
> 
> Now I want to do an operation that can split it into two variables:
> 
> Column 1  Column 2  Column 3
> 
> "123abc"  123       "abc"
> "12cd34"  12        "cd34"
> "1e23"    1         "e23"
> ...
> 
> So basically, I want to split the original variable into a numeric one and a
> character one, while the splitting element is the first character in Column
> 1.
> 
> I searched the forum with key words "strsplit" and "substr", but still can't
> solve this problem. Can anyone give me some hints?
> 
> Thanks in advance,
> 
> FD


Something like this using gsub() should work I think:

> DF
  V1
1 123abc
2 12cd34
3   1e23


# Replace letters and any following chars with ""
DF$V2 <- gsub("[A-Za-z]+.*", "", DF$V1)


# Replace any initial numbers with ""
DF$V3 <- gsub("^[0-9]+", "", DF$V1)


> DF
  V1  V2   V3
1 123abc 123  abc
2 12cd34  12 cd34
3   1e23   1  e23

See ?gsub and ?regex for more information.
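
Note that gsub() returns character strings, so V2 above is still
character. If a truly numeric column is wanted, a small self-contained
sketch (rebuilding DF from the three example values) would be:

```r
# Rebuild the example data as a character column
DF <- data.frame(V1 = c("123abc", "12cd34", "1e23"),
                 stringsAsFactors = FALSE)

# Drop everything from the first letter onward, then convert to numeric
DF$V2 <- as.numeric(sub("[A-Za-z].*$", "", DF$V1))

# Drop the leading digits to keep the character part
DF$V3 <- sub("^[0-9]+", "", DF$V1)

str(DF)
```

A value with no leading digits at all would come out as NA in V2, so
that case may need separate handling.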

HTH,

Marc Schwartz



Re: [R] Can't mix high level and low level plot functions.

2006-09-25 Thread Marc Schwartz (via MN)
On Mon, 2006-09-25 at 19:56 +0200, Lothar Botelho-Machado wrote:
> Hey R-Community,
> 
> 
> I'd like to print out an histogram of some experimental data and add a
> smooth curve of a normal distribution with an ideally generated
> population having the same mean and standard deviation like the
> experimental data.
> 
> 
> The experimental data is set as vector x and its name is set to
> group.name. I paint the histogram as follows:
> 
> hist(data, freq=FALSE, col="lightgrey", ylab="Density", xlab=group.name)
> 
> 
> 
> First I did the normal distribution curve this way:
> 
> lines(x, dnorm(x, mean=mean(x), sd=sd(x)), type="l", lwd=2)
> 
> This curve just uses as many values as there are in x. When using small
> amounts of sample populations the curve looks really shaky.
> 
> 
> 
> I tried this one using a high level plot function as well:
> 
> curve(dnorm, n=1, add=TRUE, xlim=range(x))
> 
> The advantage is, now I can set an ideal population of 1 to get the
> ideal curve really smooth. But the big disadvantage is, I don't know how
> to add "mean=mean(x),  sd=sd(x)" arguments to it? It says that it can't
> mix high level with low level plot functions when I try to set some kind
> of parameter like "n=1" to the low level function, it says that
> there ain't enough x values.
> 
> So my question is, how to get a smooth curve placed of dnorm over an
> histogram of sample data, ideally by using the curve method?
> 
> 
> TIA,
> Lothar Rubusch

This almost seems like it should be a FAQ. I also checked the R Graphics
Gallery (http://addictedtor.free.fr/graphiques/index.php) and didn't see
an example there either, unless I missed it.

In either case:

x <- rnorm(50)

hist(x, freq = FALSE)

# Create a sequence of x axis values with small
# increments over the range of 'x' to smooth the lines
x.hypo <- seq(min(x), max(x), length = 1000)

# Now use lines()
lines(x.hypo, dnorm(x.hypo, mean=mean(x), sd=sd(x)), type="l", lwd=2)
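
Since you asked about curve() specifically, here is a sketch of that
route as well. Inside curve() the name 'x' refers to the plotting
variable, so the data are stored under a different name (obs, m and s
are just names chosen for this example):

```r
set.seed(1)          # only so the example is repeatable
obs <- rnorm(50)

# Compute the parameters once, outside of curve()
m <- mean(obs)
s <- sd(obs)

hist(obs, freq = FALSE)

# 'x' here is curve()'s own variable; n controls the smoothness
curve(dnorm(x, mean = m, sd = s), from = min(obs), to = max(obs),
      n = 1000, add = TRUE, lwd = 2)
```

Passing the mean and sd inside the expression avoids the problem of not
being able to hand extra arguments to dnorm through curve().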

HTH,

Marc Schwartz



Re: [R] paste? 'cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt 1"'

2006-09-25 Thread Marc Schwartz (via MN)
On Mon, 2006-09-25 at 18:58 +0200, Boks, M.P.M. wrote:
> Dear R users,
>  
> This command works (calling a program called whap with file specifiers
> etc.):
>  
> >system('cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt 1 --perm 500"', 
> >intern=TRUE)
>  
> Now I need to call it from a loop to replace the "1" by different number, 
> however I get lost using the quotes:
>  
> I tried numerous versions of:
>  
> >i<-1
> >system(paste(c("'cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt", i, " 
> >--perm 500"'", sep="" )), intern=TRUE)
>  
> However no luck! I would be grateful for any help.
>  
> Thanks,
>  
> Marco

You need to escape the quote (") chars in the paste()d string so that
they get passed to your command properly. Also, you don't want to use
c() within the paste() function, as the paste() function already
concatenates the component vectors.

Note:

i <- 1

> paste("'cmd /c "c:\\pheno\\whap --file c:\\pheno\\smri --alt", i,  " --perm 500"'", sep="")
Error: syntax error in "paste("'cmd /c "c"

R sees the double quote before the second 'c' as the end of the string:

  "'cmd /c "


Now use "\" to escape the internal quotes:

> paste("'cmd /c \"c:\\pheno\\whap --file c:\\pheno\\smri --alt ", i,
+       " --perm 500\"'", sep="")
[1] "'cmd /c \"c:\\pheno\\whap --file c:\\pheno\\smri --alt 1 --perm 500\"'"


Use '\' to escape each of the double quotes within the string, so that R
can differentiate string delimiters versus characters within the string.
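
As a hedged sketch of the loop itself: the command string can be built once per iteration and handed to system(). One assumption here goes beyond the example above: the outer single quotes in the original call were only R's string delimiters, so they are left out of the pasted string. 'build.cmd' is a made-up helper name, and the system() call is commented out because it only makes sense on a machine where whap is installed.

```r
# Hypothetical helper: build the command line for a given --alt value
build.cmd <- function(alt) {
  paste("cmd /c \"c:\\pheno\\whap --file c:\\pheno\\smri --alt ", alt,
        " --perm 500\"", sep = "")
}

for (i in 1:3) {
  cmd <- build.cmd(i)
  cat(cmd, "\n")                        # inspect the string first
  # res <- system(cmd, intern = TRUE)   # uncomment where whap exists
}
```

Printing the string with cat() before running it shows exactly what the shell will receive, without R's escaping.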

HTH,

Marc Schwartz



Re: [R] recode problem - unexplained values

2006-09-28 Thread Marc Schwartz (via MN)
On Thu, 2006-09-28 at 12:27 +1000, [EMAIL PROTECTED] wrote:
> I am hoping for some advice regarding the difficulties I have been having
> recoding variables which are contained in a csv file.  Table 1 (below) 
> shows there are two types of blanks - as reported in the first two
> columns. I am using windows XP & the latets version of R.
> 
> When blanks cells are replaced with a value of n using syntax: > affect
> [affect==""] <- "n"
> there are still 3 blank values (Table 2).   When as.numeric is applied,
> this also causes problems because values of 2,3 & 4 are generated rather
> than just 1 & 2.
> 
> TABLE 1
> 
> table(group,actions)
>      actions
> group           n   y
>     1 100   2   0   3
>     2  30   1   1   0
>     3  24   0   0   0
> 
> 
> 
> TABLE 2
> 
> >  table(group,actions)
>      actions
> group           n   y
>     1   0   2 100   3
>     2   0   1  31   0
>     3   0   0  24   0
> 
> 
> Below is another example - for some reason there are 2 types of 'aobh'
> values.
> 
> 
> > table(group, type)
>      type
> group aobh aobh   gbh   m  uw
>     1  104      1   0   0   0
>     2    0      0  15   0  17
>     3    0      0   0  24   0
> 
> 
> Any assistance is much appreciated,
> 
> 
> Bob Green

Bob,

A quick heads up: I presume that "aobh" and "aobh  " are being treated
as different values simply as a consequence of leading/trailing spaces
within the delimited fields of the source data file. This is also the
likely reason for there being more than one kind of missing/blank
value in your imported data set.

Presuming that you used one of the read.table() family functions (e.g.
read.csv()), take note of the 'strip.white' argument in ?read.table,
which defaults to FALSE. If you change it to TRUE, the function will
strip leading and trailing blanks from the fields, likely resolving
this issue.
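
A small sketch of the effect, using made-up CSV content rather than the original file (the 'text' argument and stringsAsFactors = TRUE are choices made for this illustration):

```r
# Made-up CSV content; note the trailing blanks after the second "aobh"
csv <- "group,type\n1,aobh\n1,aobh  \n2,gbh\n"

raw   <- read.csv(text = csv, stringsAsFactors = TRUE)
clean <- read.csv(text = csv, strip.white = TRUE, stringsAsFactors = TRUE)

levels(raw$type)     # "aobh" and "aobh  " are kept as separate levels
levels(clean$type)   # the padded value collapses into "aobh"

# If the data are already in R, trimws() (R >= 3.2.0) does the same job
trimmed <- factor(trimws(as.character(raw$type)))
```

Extra levels of this kind are also a plausible reason for as.numeric() on a factor returning unexpected codes, since it returns the internal level numbers.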

HTH,

Marc Schwartz



Re: [R] Evaluation of defaults in functions

2006-09-28 Thread Marc Schwartz (via MN)
On Thu, 2006-09-28 at 21:49 +0200, Ulrich Keller wrote:
> Hello,
> 
> and sorry if this is already explained somewhere. I couldn't find anything.
> 
> R (2.3.1, Windows) seems to perform some kind of lazy evaluation when 
> evaluating defaults in function calls that, at least for me, leads to 
> unexpected results. Consider the following, seemingly equivalent functions:
> 
>  > foo1 <- function(x, y=x) {
> +   x <- 0
> +   y
> + }
>  > foo1(1)
> [1] 0
>  > foo2 <- function(x, y=x) {
> +   y <- y
> +   x <- 0
> +   y
> + }
>  > foo2(1)
> [1] 1
> 
> Obviously, y is not evaluated until it is used in some way. I would 
> expect it to be evaluated where it is defined. Is this intended behavior?
> Thanks for clarifying,
> 
> Uli

Yep. This is documented in the R Language Definition Manual, which is
available via the GUI in the Windows version and/or online here:

  http://cran.r-project.org/doc/manuals/R-lang.html

Specifically in section 4.3.3 Argument Evaluation:

"R has a form of lazy evaluation of function arguments. Arguments are
not evaluated until needed. It is important to realize that in some
cases the argument will never be evaluated. Thus, it is bad style to use
arguments to functions to cause side-effects. While in C it is common to
use the form, foo(x = y) to invoke foo with the value of y and
simultaneously to assign the value of y to x this same style should not
be used in R. There is no guarantee that the argument will ever be
evaluated and hence the assignment may not take place."

You might also want to read section 2.1.8 Promise objects and section
6.2 Substitutions.
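
If the default should be fixed at the point of the call, force() evaluates the promise explicitly. A minimal sketch ('foo3' is a made-up name that follows the numbering in the question):

```r
foo1 <- function(x, y = x) {
  x <- 0
  y          # the promise for y is evaluated here, after x has changed
}

foo3 <- function(x, y = x) {
  force(y)   # evaluate the promise for y now, while x is still 1
  x <- 0
  y
}

foo1(1)
foo3(1)
```

This is equivalent to the 'y <- y' line in foo2, but states the intent more clearly.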

HTH,

Marc Schwartz



Re: [R] help on plots

2006-09-28 Thread Marc Schwartz (via MN)
On Thu, 2006-09-28 at 23:55 +0800, zhijie zhang wrote:
> Dear friends,
>  I met a problem on plotting.
> My dataset is :
> year   MHBC   LHBC   MHRC   LURC
> 1993  11.75   4.50   0.43   0.46
> 1994   7.25   1.25   0.35   0.51
> 1995   8.67   2.17   0.54   0.44
> 1996   2.67   1.33   0.78   0.47
> 1997   3.42   4.92   0.69   0.48
> 1998   1.92   3.08   0.72   0.54
> 1999   2.33   2.58   0.74   0.41
> 2000   5.75   4.50   0.45   0.50
> 2001   3.75   4.42   0.52   0.47
> 2002   2.33   1.83   0.58   0.45
> 2003   0.25   2.83   0.50   0.39
> I want to get a plot -line with scatters, the requirement is :
> x-axis is year;
> two y-axis:
>   y1 corresponds to MHBC and LHBC;
>   y2 corresponds to MHRC and LURC;
> hope to use different symbols to differentiate the MHBC,LHBC,MHRC and  LURC.
> 
> The following is my program, but  very bad ,:
> *plot(a$year,a$MHBC,type='b')  #line1
> par(new=T)
> plot(a$year,a$LHBC,type='b')  #line2
> par(new=T)
> plot(a$year,a$MHRC,type='b')  #line3
> par(new=T)
> plot(a$year,a$LURC,type='b')   #line4
> axis(4, at=pretty(range(a$MHRC)))*
> In the figure, the labels and scales of X-axis are vague, the scale of
> y-axis is not very good.
> The better figure should be like the line1 and 2 are in the upper, and line3
> and 4 are in the bottom.
> Any suggestion are welcome!

It's not entirely clear to me what you want, so let me offer three
possibilities.


1. Do all four lines in a single plot with a common y axis:

matplot(a$year, a[, -1], type = "o", pch = 15:18)



2. Do all four lines in a single plot with the first two having a
separate left hand y axis and the second two having a separate right
hand y axis:

# Draw the first pair of lines
matplot(a$year, a[, 2:3], type = "o", pch = c(19, 20),
lty = "solid", ann = FALSE)

# Get the current plot region boundaries
usr <- par("usr")

# Get the range of the second set of columns
range.y2 <- range(a[, 4:5])

# Change the plot region y axis range for the second
# set of columns. Extend the range by 4% at each end,
# as per the default (see 'yaxs' in ?par)
par(usr = c(usr[1], usr[2],
            range.y2 + c(-1, 1) * 0.04 * diff(range.y2)))

# Add the second pair of lines
matlines(a$year, a[, 4:5], type = "o", pch = c(15, 18), 
 lty = "dashed", col = c("blue", "green"))

# Add the second y axis
axis(4)



3. Do the first two lines in an upper plot and the second two lines in a
lower plot, each has its own y axis range:

# Set plot region to have two rows
par(mfrow = c(2, 1))

# Adjust the plot margins
par(mar = c(2, 5, 2, 2))

# Draw the first pair of lines
matplot(a$year, a[, 2:3], type = "o", pch = c(19, 20),
lty = "solid", ylab = "First Pair")


par(mar = c(3, 5, 2, 2))

# Draw the second pair of lines in the lower plot
matplot(a$year, a[, 4:5], type = "o", pch = c(15, 18), 
lty = "dashed", col = c("blue", "green"), 
ylab = "Second Pair")



See ?matplot, ?par and ?points for more information.
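
The examples assume a data frame called 'a'. A sketch that rebuilds it from the posted values and adds a legend to the single-plot version (the legend position, colours and axis labels are arbitrary choices for this illustration):

```r
a <- data.frame(
  year = 1993:2003,
  MHBC = c(11.75, 7.25, 8.67, 2.67, 3.42, 1.92, 2.33, 5.75, 3.75, 2.33, 0.25),
  LHBC = c(4.50, 1.25, 2.17, 1.33, 4.92, 3.08, 2.58, 4.50, 4.42, 1.83, 2.83),
  MHRC = c(0.43, 0.35, 0.54, 0.78, 0.69, 0.72, 0.74, 0.45, 0.52, 0.58, 0.50),
  LURC = c(0.46, 0.51, 0.44, 0.47, 0.48, 0.54, 0.41, 0.50, 0.47, 0.45, 0.39)
)

# All four series on a common y axis, one symbol and colour per series
matplot(a$year, a[, -1], type = "o", pch = 15:18, lty = 1, col = 1:4,
        xlab = "Year", ylab = "Value")

# Identify the series by column name
legend("topright", legend = names(a)[-1], pch = 15:18, lty = 1, col = 1:4)
```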

HTH,

Marc Schwartz



Re: [R] a decimal aligned column

2006-09-28 Thread Marc Schwartz (via MN)
On Thu, 2006-09-28 at 12:31 -0700, BBands wrote:
> Hello,
> 
> For numbers in the range 100 to 100,000,000 I'd like to decimal align
> a right-justified comma-delineated column of numbers, but I haven't
> been able to work out the proper format statement. format(num,
> justify=right, width=15, big.mark=",") gets me close, but numbers
> larger than 1,000,000 project a digit beyond the right edge of the
> column, which I really don't understand. I gather I can get the
> decimal alignment from sprintf(), but I am not sure about the
> interaction of the two functions.
> 
> TIA,
> 
> jab

Is this what you want?:

Nums <- 10 ^ (2:8)

Nums.Pretty <- format(Nums, width = 20, justify = "right", 
  big.mark = ",", nsmall = 4, scientific = FALSE)


> Nums.Pretty
[1] "            100.0000" "          1,000.0000"
[3] "         10,000.0000" "        100,000.0000"
[5] "      1,000,000.0000" "     10,000,000.0000"
[7] "    100,000,000.0000"


> cat(Nums.Pretty, sep = "\n")
            100.0000
          1,000.0000
         10,000.0000
        100,000.0000
      1,000,000.0000
     10,000,000.0000
    100,000,000.0000
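
A quick way to confirm the alignment without eyeballing it, sketched under the assumption that the whole vector is formatted in one format() call as above:

```r
Nums <- 10 ^ (2:8)

Nums.Pretty <- format(Nums, width = 20, justify = "right",
                      big.mark = ",", nsmall = 4, scientific = FALSE)

# Every string should have the same width ...
widths <- nchar(Nums.Pretty)

# ... and the decimal point should sit at the same position in each
dec.pos <- as.integer(regexpr(".", Nums.Pretty, fixed = TRUE))

length(unique(widths)) == 1
length(unique(dec.pos)) == 1
```

Formatting the numbers one at a time loses the common width, so keeping them in a single vector is the safer route.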



HTH,

Marc Schwartz



Re: [R] index vector

2006-09-29 Thread Marc Schwartz (via MN)
On Fri, 2006-09-29 at 16:13 -0400, bertrand toupin wrote:
> Hi!  1st time I'm posting here.  I'm beginning to learn R and I've
> encountered a problem that I'm unable to solve so far.
> 
> I have  a 20 000 x 5 matrix.  In the 5th column, I have elevation.
> Missing value are actually put to -9.  I want to track down the
> index of those values and replace them with NA.  I've read that to
> replace, the command "replace" is enough.  I just don't know how to
> construct the index vector that contains the index of -9 values.
> 
> Hope this makes sense,
> Thanks!
> Philippe


See ?is.na and note the use of:

  is.na(x) <- value


Example:

> mat <- matrix(sample(50), 10, 5)

> mat
      [,1] [,2] [,3] [,4] [,5]
 [1,]   24   39   40   30    5
 [2,]    8   44    3   34   47
 [3,]   23   12   16   14   45
 [4,]   35   26    2   11    6
 [5,]   13   15   42   33   19
 [6,]    7   36   31   49   37
 [7,]   29   41    9   27    4
 [8,]   48    1   22   25   17
 [9,]   43   32   28   38   20
[10,]   18   50   46   21   10


# Set some values in column 5 to -9
> mat[sample(10, 3), 5] <- -9


> mat
      [,1] [,2] [,3] [,4] [,5]
 [1,]   24   39   40   30    5
 [2,]    8   44    3   34   47
 [3,]   23   12   16   14   45
 [4,]   35   26    2   11    6
 [5,]   13   15   42   33   -9
 [6,]    7   36   31   49   -9
 [7,]   29   41    9   27    4
 [8,]   48    1   22   25   17
 [9,]   43   32   28   38   20
[10,]   18   50   46   21   -9

# Use which to get the indices within column 5
# of those values which are -9
# See ?which
> which(mat[, 5] == -9)
[1]  5  6 10


# Now extend that and set those to NA
> is.na(mat[, 5]) <- which(mat[, 5] == -9)

> mat
      [,1] [,2] [,3] [,4] [,5]
 [1,]   24   39   40   30    5
 [2,]    8   44    3   34   47
 [3,]   23   12   16   14   45
 [4,]   35   26    2   11    6
 [5,]   13   15   42   33   NA
 [6,]    7   36   31   49   NA
 [7,]   29   41    9   27    4
 [8,]   48    1   22   25   17
 [9,]   43   32   28   38   20
[10,]   18   50   46   21   NA



Note one other possibility, which is that if you used one of the
read.table() family functions to read in a delimited ASCII file
containing the data set, you can set the 'na.strings' argument to
"-9" and have it set these to NA upon importing.  See ?read.table
for more information.
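
For comparison, a sketch of the same replacement done with plain indexed assignment and with replace(), which the question mentioned. The seed, the planted row numbers and the 'elev' vector are made up for this illustration.

```r
set.seed(1)                        # arbitrary seed for reproducibility
mat <- matrix(sample(50), 10, 5)
mat[c(5, 6, 10), 5] <- -9          # plant some -9 codes in column 5

# The index vector asked about: rows of column 5 holding -9
idx <- which(mat[, 5] == -9)

# Indexed assignment sets exactly those cells to NA
mat[idx, 5] <- NA

# replace() returns a modified copy instead of changing its input
elev <- c(120, -9, 95, -9)
elev.na <- replace(elev, elev == -9, NA)
```

Note that replace() leaves 'elev' itself untouched, so its result has to be assigned to something.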

HTH,

Marc Schwartz



Re: [R] Barplot

2006-10-02 Thread Marc Schwartz (via MN)
On Mon, 2006-10-02 at 11:14 -0400, Mohsen Jafarikia wrote:
> Hello,
> 
> I have used the following data to draw my barplot:
> 
>    BL     LR      Q
> 
> 36.35   1.00   1.92
> 36.91   4.00   0.00
> 25.70   6.00   0.00
> 34.38   3.00   1.92
> 05.32   0.50   0.00
> 
>  BL<-c(36.35, 36.91, 25.70, 34.38, 05.32)
> LR<-c(1.00, 4.00, 6.00, 3.00, 0.50)
> Q<-<(1.92, 0.00, 0.00, 1.92, 0.00)
> 
> barplot(dt$LR, main='LR Value',  col='orange', border='black', space=0.05,
> width=(dt$BL), xlab='Length', ylab='LR')
> 
>  axis(1)
> 
> I would like to do the following things that I don't know how to do it:
> 
>   1)  Writing the value of each 'BL' on my X axis.
> 2)  Writing the value of 'Q' on the bottom of  X axis.
> 3)  Draw a line on the bars which connects the 'LR' values.
> 
>  I appreciate your comments.
> 
>  Thanks,
> Mohsen


I'm not sure if I am getting this completely correct, but is this what
you want?


BL <- c(36.35, 36.91, 25.70, 34.38, 5.32)
LR <- c(1.00, 4.00, 6.00, 3.00, 0.50)
Q <- c(1.92, 0.00, 0.00, 1.92, 0.00)


# Get the bar midpoints in 'mp'
mp <- barplot(LR, main='LR Value',  col='orange', border='black',
  space=0.05, width=(BL), xlab='Length', ylab='LR')

# Write the LR and Q values below the bar midpoints
mtext(1, at = mp, text = sprintf("%.1f", LR), line = 1)
mtext(1, at = mp, text = sprintf("%.1f", Q), line = 0)

# Now connect the LR values across the bars
lines(mp, LR)


See ?barplot, ?mtext, ?sprintf and ?lines

HTH,

Marc Schwartz



Re: [R] X-axis labels in histograms drawn by the "truehist" function

2006-10-02 Thread Marc Schwartz (via MN)
On Mon, 2006-10-02 at 14:58 -0400, Ravi Varadhan wrote:
>  
> Hi,
> 
>  
> 
> I had sent this email last week, but received no reply.  So, I am resending
> it - please excuse me for the redundant email.
> 
>  
> 
> I have a simple problem that I would appreciate getting some tips.  I am
> using the "truehist" function within an "apply" call to plot multiple
> histograms.  I can't figure out how to get truehist to use the column names
> of the matrix as the labels for the x-axis of the histograms.  
> 
>  
> 
> Here is a simple example:
> 
>  
> 
> library(MASS)  # this contains the truehist function
> 
> X <- matrix(runif(4000),ncol=4)
> 
> colnames(X) <- c("X1","X2","X3","X4")
> 
> par(mfrow=c(2,2))
> 
> apply(X, 2, function(x)truehist(x))
> 
>  
> 
> In this example, I would like the x-labels of the histograms to be "X1",
> "X2", etc.
> 
>  
> 
> Any help is appreciated.
> 
>  
> 
> Best,
> 
> Ravi

Ravi,

Gabor did reply:

https://stat.ethz.ch/pipermail/r-help/2006-September/114019.html
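
For readers without access to the archive, one possible approach is sketched below. It is not necessarily what the linked reply suggests; it simply loops over the column names, so that each name is available to pass as 'xlab'.

```r
library(MASS)  # for truehist()

X <- matrix(runif(4000), ncol = 4)
colnames(X) <- c("X1", "X2", "X3", "X4")

par(mfrow = c(2, 2))

# apply() hands the function only the column values, not the names,
# so iterate over the names and index the matrix with each one
for (nm in colnames(X)) {
  truehist(X[, nm], xlab = nm)
}
```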

HTH,

Marc Schwartz



Re: [R] how ot replace the diagonal of a matrix

2006-10-03 Thread Marc Schwartz (via MN)
On Tue, 2006-10-03 at 17:03 -0400, Duncan Murdoch wrote:
> On 10/3/2006 4:59 PM, roger bos wrote:
> > Dear useRs,
> > 
> > Trying to replace the diagonal of a matrix is not working for me.  I
> > want a matrix with .6 on the diag and .4 elsewhere.  The following
> > code looks like it should work--when I lookk at mps and idx they look
> > how I want them too--but it only replaces the first element, not each
> > element on the diagonal.
> > 
> > mps <- matrix(rep(.4, 3*3), nrow=n, byrow=TRUE)
> > idx <- diag(3)
> > mps
> > idx
> > mps[idx] <- rep(.6,3)
> > 
> > I also tried something along the lines of diag(mps=.6, ...) but it
> > didn't know what mps was.
> 
> Matrix indexing can use a two column matrix, giving row and column 
> numbers.  So you could get what you want by
> 
> mps[cbind(1:n,1:n)] <- 0.6

What's wrong with:

> mps <- matrix(rep(.4, 3*3), nrow = 3, byrow=TRUE)
> mps
     [,1] [,2] [,3]
[1,]  0.4  0.4  0.4
[2,]  0.4  0.4  0.4
[3,]  0.4  0.4  0.4

> diag(mps) <- 0.6

> mps
     [,1] [,2] [,3]
[1,]  0.6  0.4  0.4
[2,]  0.4  0.6  0.4
[3,]  0.4  0.4  0.6
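
As to why the original attempt touched only the first element: diag(3) is a numeric matrix of 0s and 1s, so mps[idx] is read as numeric indexing, where the zeros are dropped and every 1 points at element 1. A minimal sketch of a logical index that does work ('n' is set here because the question left it undefined):

```r
n <- 3
mps <- matrix(0.4, nrow = n, ncol = n)

idx <- diag(n)          # numeric 0/1 matrix, not a logical mask

# Comparing with 1 turns it into a logical mask of the diagonal
mps[idx == 1] <- 0.6

mps
```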

HTH,

Marc Schwartz



Re: [R] Barplot

2006-10-03 Thread Marc Schwartz (via MN)
Mohsen,

I had not seen a reply to your follow up yet and I have been consumed in
meetings and on phone calls.

On your first question, add two additional lines of code:

BL <- c(36.35, 36.91, 25.70, 34.38, 5.32)
LR <- c(1.00, 4.00, 6.00, 3.00, 0.50)
Q <- c(1.92, 0.00, 0.00, 1.92, 0.00)


# Get the bar midpoints in 'mp'
mp <- barplot(LR, main='LR Value',  col='orange', border='black',
  space=0.05, width=(BL), xlab='Length', ylab='LR')

# Write the LR and Q values below the bar midpoints
mtext(1, at = mp, text = sprintf("%.1f", LR), line = 1)
mtext(1, at = mp, text = sprintf("%.1f", Q), line = 0)

# Write labels at minimum value of X axis
mtext(1, at = par("usr")[1], text = "LR", line = 1)
mtext(1, at = par("usr")[1], text = "Q", line = 0)


See ?par for more information.

With respect to adding some sort of curve fit/density plot to your data,
it is not clear to me what the data represents, as the x axis does not
appear to be monotonic in Q (other than the bar midpoints) and the y
axis values do not appear to be counts.

If you have the original vector of data, you may be better off with a
histogram rather than a barplot, since the histogram will enable a
common density area within the bars (ie. the area of the bars = 1.0)
over which you can then draw a normal density curve. This general
approach was recently covered here:

https://stat.ethz.ch/pipermail/r-help/2006-September/113686.html

and there are similar examples in the archives.

See ?hist and ?truehist in the MASS package.

HTH,

Marc Schwartz

On Mon, 2006-10-02 at 16:42 -0400, Mohsen Jafarikia wrote:
> Thanks for your response. I just have two more questions:
> 1) I don't know how to write the titles of the LR and Q behind their
> lines of values (at the bottom of the graph). I tried to write like
> text = sprintf("LR%.1f", LR)...
>   but it writes 'LR' behind all values while I only want it once at the
> beginning of the line while all the LR and Q values are still in the mid
> points of bars.
> 
> 2) I would like a line which connects the mid points of each bar to be
> like a density function (or regression) line which is not sharp like what I
> have now. I tried to write density in the code but it tells "Error in
> xy.coords(x, y) : 'x' and 'y' lengths differ"
>  I appreciate any comment about these questions
> 
> Thanks,
> Mohsen
> 
> 


Re: [R] lmer output

2006-10-06 Thread Marc Schwartz (via MN)
On Fri, 2006-10-06 at 17:05 +0100, Mike Ford wrote:
> When I do lmer models I only get Estimate, Standard Error and t value in 
> the output for the fixed effects.
> 
> Is there a way I get degrees of freedom and p values as well?
> 
> I'm a very new to R, so sorry if this a stupid question.
> 
> Thank you
> 
> - Mike

See R FAQ 7.35 Why are p-values not displayed when using lmer()?

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-p_002dvalues-not-displayed-when-using-lmer_0028_0029_003f

HTH,

Marc Schwartz



Re: [R] Stopping ctrl-\ from qutting R

2006-10-06 Thread Marc Schwartz (via MN)
On Fri, 2006-10-06 at 14:46 -0400, Martin C. Martin wrote:
> Hi,
> 
> In the Linux (FC3) version of R, ctrl-\ quits R.  This wouldn't be so 
> bad, but on my keyboard, it's right next to ctrl-p and I tend to hit it 
> by accident.
> 
> Is there any way to turn that off?

Open your favorite terminal emulator (e.g. gnome-terminal, xterm,
konsole) and type:

  stty quit undef

then type:

  R

The first command will disable the QUIT signal within the tty session,
which by default is set to CTRL-\. This will not change other console
sessions.

Of course this behavior may introduce other problems.  :-)

BTW, you might want to consider updating your FC distro, as FC3 is now
EOL and only supported by the Fedora Legacy folks. That support will end
on December 31.  At this point of course, you might just want to wait
until FC 6 is out sometime in the next week or so.

HTH,

Marc Schwartz



Re: [R] is it possible to fill with a color or transparency gradient?

2006-10-06 Thread Marc Schwartz (via MN)
On Fri, 2006-10-06 at 16:15 -0400, Eric Harley wrote:
> Hi all,
> 
> Is there a way to fill a rectangle or polygon with a color and/or
> transparency gradient?  This would be extremely useful for me in terms
> of adding some additional information to some plots I'm making,
> especially if I could define the gradient on my own by putting
> functions into rgb something like rgb( r=f(x,y), g=f(x,y), b=f(x,y),
> alpha=f(x,y) ).  Not so important whether the coordinates are in terms
> of the plot axes or normalized to the polygon itself somehow.  Ideally
> it would work not only for a fill color but also for shading lines.
> 
> I haven't been using R very long, so it's possible that I'm just
> missing something, but I haven't found anything like this in the help
> files.  I've tried to poke around in graphics, grid, and ggplot,
> without any luck so far.  I really like some of the functionality in
> ggplot, and it does some nice things with continuous gradients for the
> color of scatter plot points, for example, but it each individual
> point (or grob) is always one solid color as far as I can tell.
> 
> Thanks,
> Eric

Take a look at the gradient.rect() function in Jim Lemon's 'plotrix'
CRAN package.
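
If installing a package is not an option, the same idea can be approximated in base graphics by drawing many thin, borderless rectangles, each in the next colour of a ramp. This is a rough sketch of that technique, not the plotrix implementation; the colours, coordinates and slice count are arbitrary.

```r
plot(0:1, 0:1, type = "n", xlab = "", ylab = "")

n.slices <- 100
cols <- colorRampPalette(c("white", "steelblue"))(n.slices)

# Slice boundaries along x; each adjacent pair bounds one thin rectangle
xs <- seq(0.2, 0.8, length.out = n.slices + 1)

rect(xs[-(n.slices + 1)], 0.3, xs[-1], 0.7, col = cols, border = NA)

# Outline the whole gradient box
rect(0.2, 0.3, 0.8, 0.7)
```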

HTH,

Marc Schwartz



Re: [R] lines at margin?

2006-10-09 Thread Marc Schwartz (via MN)
On Mon, 2006-10-09 at 10:56 -0400, Mike Wolfgang wrote:
> Hi list,
> 
> I want to add some lines at margin area of one figure. mtext could add text
> to these margins, can I add lines with different lty parameters? Thanks,
> 
> mike

You can do it, but it will take some fiddling to get the coordinates
right:

 # Do a generic plot
 plot(1:10)

 # Get the current plot region axis ranges
 # x1, x2, y1, y2
 par("usr")
 [1]  0.64 10.36  0.64 10.36

 # Draw a vertical line in the right hand margin
 # Set 'xpd = TRUE' so that plotting is not
 # clipped at the plot region boundary
 segments(10.75, 4, 10.75, 6, xpd = TRUE)

See ?par for more information.
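
To cut down on the fiddling, the x position can be derived from par("usr") instead of being typed in. A sketch, where the 4% offset and the line types are arbitrary choices:

```r
plot(1:10)

usr <- par("usr")

# Place the lines a little to the right of the plot region:
# 4% of the x axis range beyond the right hand edge
x.margin <- usr[2] + 0.04 * diff(usr[1:2])

# Two margin lines with different line types
segments(x.margin, 4, x.margin, 6, lty = "dashed", xpd = TRUE)
segments(x.margin, 7, x.margin, 9, lty = "dotted", xpd = TRUE)
```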

HTH,

Marc Schwartz



Re: [R] How can I delete components in a column ?

2006-10-09 Thread Marc Schwartz (via MN)
On Mon, 2006-10-09 at 17:15 +0200, Yen Ngo wrote:
> Hi all R-helpers,
>
>   i am a new R-user and have problem with deleting some components in a 
> column. I have a dataset like
>
>   Name  Id     x
>         empty  2
>         empty  3
>   a     none   2
>   b     none   3
>   d     none   2
>   ad    cfh    4
>   bf    cdt    5
>         empty  2
>         empty  2
>   gf    cdh    4
>   d     none   5
>
>   and want to eliminate all components that have id=none and empty . The 
> remaining data should be
>
>   Name  Id     x
>   ad    cfh    4
>   bf    cdt    5
>   gf    cdh    4
>
>   How can I do this ? The components with id=empty have no name.
>
>   Thanks in advance,
>   Regards,
>   Yen

The easiest way is to use the subset() function. Presuming that your
data frame is called 'DF':

  NewDF <- subset(DF, !Id %in% c("empty", "none"))

The second argument, using a logical negation of the "%in%" function,
tells subset to only select those rows where the "Id" column does not
contain either "empty" or "none".

See ?subset and ?"%in%"
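
A self-contained sketch using the posted values. The column names 'Name', 'Id' and 'x' are assumed from the table in the question, and the nameless rows are entered as empty strings.

```r
DF <- data.frame(
  Name = c("", "", "a", "b", "d", "ad", "bf", "", "", "gf", "d"),
  Id   = c("empty", "empty", "none", "none", "none", "cfh", "cdt",
           "empty", "empty", "cdh", "none"),
  x    = c(2, 3, 2, 3, 2, 4, 5, 2, 2, 4, 5)
)

# Keep only the rows whose Id is neither "empty" nor "none"
NewDF <- subset(DF, !(Id %in% c("empty", "none")))

NewDF
```

The parentheses around the %in% test are optional, since %in% binds more tightly than '!', but they make the intent easier to read.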

HTH,

Marc Schwartz



Re: [R] Transcript of "Conservative ANOVA tables"

2006-10-09 Thread Marc Schwartz (via MN)
On Mon, 2006-10-09 at 15:43 +, Gregor Gorjanc wrote:
> Dieter Menne  menne-biomed.de> writes:
> 
> > 
> > Dear friends of lmer,
> > 
> > http://wiki.r-project.org/rwiki/doku.php?id=guides:lmer-tests
> > 
> > I have put a transcript of the long thread on lmer/lme4 statistical test
> > into the Wiki. For all those who missed it life, and for those like me, who
> > don't like the special style of the R-list to keep full length quotes.
> > 
> > Creating the text there was not much fun, waiting times are simply
> > unacceptable and the Wiki only give an empty page when syntax errors (for
> > example from quotes) are detected.
> 
> I agree that there is a problem about waiting times with large pages. This
> might be of interest for r-sig-wiki list, but I can not CC from Gmane - I will
> send separate mail. I think there was discussion about this and that wiki is 
> optimized for many small pages. I do agree though that having several pages 
> for 
> this transcript is not acceptable.
> 
> Gregor

If the content of this particular transcript is likely to be static,
consider an alternative of making it available in a PDF document that is
linked on that page.

Then others can perhaps contribute by providing other relevant content
in that section as may be desired/required.

HTH,

Marc Schwartz



Re: [R] Tranferring R results to word prosessors

2006-02-09 Thread Marc Schwartz (via MN)
In follow up to Harold's thought of using LaTeX, I have an approach when
the use of nicely formatted tables is required in a document where LaTeX
is not being used for the entire document. In other words, where you
need to use Word, OO.org's Writer or similar application for the
majority of the document body.

This involves outputting R results to LaTeX table code in a text file,
processing the file with 'latex' and 'dvips' and creating an EPS file.
Of course, the LaTeX text file is fully complete with preamble, etc.

One can then import the EPS file to a page in the document processor
file. The most recent versions of the aforementioned applications will
generate a bitmapped preview of the table content to aid in placement
and review.

You can then print the document to a PS printer or file for subsequent
use. OO.org's Writer can also use Ghostscript to print to a PDF file
using a "PDF Converter" in the printer selection dialog. This,
importantly, is different than the "Export to PDF" function. The latter
does not properly print embedded EPS images and prints the bitmapped
preview instead.

The advantage of this approach is that you don't have to mess around in
the word processing program doing a 'text to table' conversion and then
go through the formatting of the resultant columns, borders, etc.
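
A rough sketch of the first step, writing a complete LaTeX file for a coefficient table using only base R. The model, the file name 'table.tex' and the column headings are made up for this illustration, and the shell commands in the closing comment are an assumption about a typical TeX installation.

```r
fit <- lm(dist ~ speed, data = cars)      # 'cars' ships with R
tab <- round(coef(summary(fit)), 3)

# One LaTeX table row per coefficient: cells joined by '&'
rows <- paste(rownames(tab), apply(tab, 1, paste, collapse = " & "),
              sep = " & ")

# A complete document, preamble included, holding just the table
tex <- c("\\documentclass{article}",
         "\\pagestyle{empty}",
         "\\begin{document}",
         "\\begin{tabular}{lrrrr}",
         "\\hline",
         " & Estimate & Std. Error & t value & p value \\\\",
         "\\hline",
         paste(rows, "\\\\"),
         "\\hline",
         "\\end{tabular}",
         "\\end{document}")

tex.file <- file.path(tempdir(), "table.tex")
writeLines(tex, tex.file)

# Outside R, something along these lines would then produce the EPS:
#   latex table.tex
#   dvips -E -o table.eps table.dvi
```

The column headings are written by hand because the default name of the p value column contains characters that need escaping in LaTeX.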

HTH,

Marc Schwartz

On Thu, 2006-02-09 at 09:47 -0500, Doran, Harold wrote:
> Well, I don't know if it can be used with Word or not, but you might
> consider Sweave for use with LaTeX. Maybe if you use the sink() command
> this might work, but I haven't tried it. 
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Tom Backer
> Johnsen
> Sent: Thursday, February 09, 2006 9:41 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Tranferring R results to word prosessors
> 
> I have just started looking at R, and are getting more and more
> irritated at myself for not having done that before.
> 
> However, one of the things I have not found in the documentation is some
> way of preparing output from R for convenient formatting into something
> like MS Word.  An example:  If you use summary(lm()) you get nice
> output.  However, if you try to paste that output into the word
> processor, all the text elements are separated by blanks, and that is
> not optimal for the creation of a table (in the word processing sense).
> 
> Is there an option to generate tab-separated output in R ? That would
> solve the problem.
> 
> Tom



Re: [R] putting text in the corner

2006-02-09 Thread Marc Schwartz (via MN)
On Thu, 2006-02-09 at 17:18 +0100, Thomas Steiner wrote:
> I want to write some text in a corner of my plot.
> Is it possible to get the xlim and ylim of an open window?
> Or is there anything similar like
> legend(x="bottomright", inset=0.01,legend=...)
> for
> text(x=1,y=2, "test")
> 
> Thomas


Try this:

 plot(1:10)

 # par("usr") defines the x/y coordinates of the plot region
 usr <- par("usr")

 # Upper Left Hand Corner
 text(usr[1], usr[4], "test", adj = c(0, 1))

 # Lower Left Hand Corner
 text(usr[1], usr[3], "test", adj = c(0, 0))

 # Lower Right Hand Corner
 text(usr[2], usr[3], "test", adj = c(1, 0))

 # Upper Right Hand Corner
 text(usr[2], usr[4], "test", adj = c(1, 1))



See ?par and ?text for more information including the 'adj' argument to
text().
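
The four calls above can be folded into a small helper. This is only a sketch: corner_spec() and corner_text() are names made up here, not base R functions, and the code assumes linear axes, since par("usr") holds log10 values when an axis is log-scaled.

```r
# Hypothetical helper: pick x/y from par("usr") plus a matching 'adj'
corner_spec <- function(usr, corner = c("topleft", "bottomleft",
                                        "bottomright", "topright")) {
  corner <- match.arg(corner)
  right <- grepl("right", corner)
  top   <- grepl("top", corner)
  list(x = usr[if (right) 2 else 1],
       y = usr[if (top) 4 else 3],
       adj = c(as.numeric(right), as.numeric(top)))
}

# Hypothetical wrapper: draw 'label' in the requested corner of the
# current plot region
corner_text <- function(label, corner = "topleft", ...) {
  s <- corner_spec(par("usr"), corner)
  text(s$x, s$y, label, adj = s$adj, ...)
}

plot(1:10)
corner_text("test", "bottomright")
```

Keeping the coordinate lookup separate from the drawing makes it easy to check the corner logic without opening a graphics device.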

HTH,

Marc Schwartz



Re: [R] Tranferring R results to word prosessors

2006-02-09 Thread Marc Schwartz (via MN)
There is some documentation online at:

http://www.latex-project.org/guides/

which would be a good starting place.

If you prefer a good book, The LaTeX Companion (aka TLC) is the place to
begin:

http://www.amazon.com/gp/product/0201362996


There is also a boxed set (expensive) of several books (including TLC)
available:

http://www.amazon.com/gp/product/0321269446


Finally, for dealing with EPS (or PDF) graphics (ie. R plots), the
online document "Using Imported Graphics in LaTeX and pdfLaTeX" is
excellent:

http://www.ctan.org/tex-archive/info/epslatex.pdf


HTH,

Marc Schwartz

On Thu, 2006-02-09 at 12:33 -0500, roger bos wrote:
> Yeah, but I don't understand LaTeX at all.  Can you point me to a good
> beginners guide?
> 
> Thanks,
> 
> Roger
> 
> 
> On 2/9/06, Barry Rowlingson <[EMAIL PROTECTED]> wrote:
> >
> > Tom Backer Johnsen wrote:
> > > I have just started looking at R, and am getting more and more
> > > irritated at myself for not having done that before.
> > >
> > > However, one of the things I have not found in the documentation is some
> > > way of preparing output from R for convenient formatting into something
> > > like MS Word.
> >
> > Well whatever you do, don't start looking at LaTeX, because that will
> > get you even more irritated at yourself for not having done it before.
> >
> > LaTeX is to Word as R is to what? SPSS?
> >
> > I've still not seen a pretty piece of mathematics - or even text - in
> > Word.
> >
> > Barry
> >



Re: [R] showing the integrated number by point size

2006-02-17 Thread Marc Schwartz (via MN)
On Fri, 2006-02-17 at 17:33 +0100, Knut Krueger wrote:
> Is there any function to show the points like this example of SPSS?
> 
> http://biostatistic.de/temp/reg.jpg
> 
> The point size should represent the number of data at this point.
> 
> with regards
> Knut Krueger

I believe there are a couple of functions in CRAN packages that will do
bubble plots.

You might want to do:

  RSiteSearch("Bubble Plot")

which should help.

A better option from a visualization perspective would be 

  ?sunflowerplot
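
A small sketch of both ideas using base R only. The data are made up, and the bubble version via xyTable() and symbols() is one possible way to do it, not necessarily what the CRAN bubble plot functions do.

```r
# Made-up data with several observations on the same point
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
y <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)

# One "petal" per observation at each location
sunflowerplot(x, y)

# Bubble version: count the duplicates, then scale circle area by count
tab <- xyTable(x, y)
symbols(tab$x, tab$y, circles = sqrt(tab$number), inches = 0.25,
        xlab = "x", ylab = "y")
```

Using sqrt() of the count for the radius makes the circle area, rather than the radius, proportional to the number of observations.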

HTH,

Marc Schwartz



Re: [R] sprintf

2006-02-17 Thread Marc Schwartz (via MN)
On Fri, 2006-02-17 at 12:01 -0800, J.M. Breiwick wrote:
> Hi,
> 
> I want to use sprintf with vectors whose lengths vary.
> As an example: x = c(2,4,6,10)
> sprintf("%i%5f%5f%5f",x[1],x[2],x[3],x[4]) works. But if I have to compute 
> the length of x within a function then I cannot list all for format codes 
> and sprintf apparently will not accept just "x" - it wants one value for 
> each format code as in the above example. Does anyone know a way to handle 
> this? And is there a way to repeat the format code like in Fortran (e.g. 
> 5F4.1)? Thanks.
> 
> Jeff B.

Is the format of the vector 'x' predictable?  In other words, will the
first element always be printed as an integer with the rest (of unknown
length) printed as floats?

Keep in mind that sprintf() is vectorized.

So:

x <- c(2, 4, 6, 10)

> c(sprintf("%i", x[1]), sprintf("%5.2f", x[-1]))
[1] "2"     " 4.00" " 6.00" "10.00"


> x <- seq(2, 20, 2)
> x
 [1]  2  4  6  8 10 12 14 16 18 20

# Same code here
> c(sprintf("%i", x[1]), sprintf("%5.2f", x[-1]))
 [1] "2"     " 4.00" " 6.00" " 8.00" "10.00"
 [6] "12.00" "14.00" "16.00" "18.00" "20.00"


Does that get what you want?
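
If a single string is wanted rather than a vector of pieces, two base R idioms come close to a Fortran-style repeat count. This is a sketch under the assumption that the first element is an integer and the rest are floats; the field widths used here are only illustrative.

```r
x <- c(2, 4, 6, 10)

# Repeat one format over all elements, then glue the pieces together
# (roughly what a Fortran 4F4.1 edit descriptor would do)
s1 <- paste(sprintf("%4.1f", x), collapse = "")
s1

# Build the format string to match length(x), then pass the
# elements of x as separate arguments via do.call()
fmt <- paste0("%i", paste(rep("%5.2f", length(x) - 1), collapse = ""))
s2 <- do.call(sprintf, c(list(fmt), as.list(x)))
s2
```

The first form relies on sprintf() being vectorized; the second builds the call for a vector whose length is only known at run time.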

BTW, please do not create a new post by responding to a different
thread. It plays havoc with the list archive making it difficult to
search for your post and any replies.

HTH,

Marc Schwartz



Re: [R] Boxplot Help for Neophyte

2006-02-20 Thread Marc Schwartz (via MN)
On Mon, 2006-02-20 at 20:27 +, Alex Park wrote:
> R helpers
> 
> I am getting to grips with R but came across a small problem today that I 
> could not fix by myself.
> 
> I have 3 text files, each with a single column of data. I read them in 
> using:
> 
> myData1<-scan("C:/Program Files/R/myData1.txt")
> myData2<-scan("C:/Program Files/R/myData2.txt")
> myData3<-scan("C:/Program Files/R/myData3.txt")
> 
> I wanted to produce a chart with 3 boxplots of the data and used:
> 
> boxplot(myData1, myData2, myData3)
> 
> This worked fine so I consulted R [help(bxp)] to add some format and labels 
> e.g. title= , xlab =, ylab= , notch=TRUE etc. I managed to figure that ok.
> 
> However, I could not figure out how to get the labels myData1, myData2, and 
> myData3 on the boxplot x-axis to denote which box was which (though I knew 
> by looking). Can anybody help with this?
> 
> I trawled through my downloaded R pdfs but could not find a way.
> 
> Regards
> 
> 
> Alex Park


Alex,

You can use the 'names' argument to boxplot():

  boxplot(myData1, myData2, myData3,
          names = c("myData1", "myData2", "myData3"))

If you want additional flexibility, note that by default (unless you
change the 'at' argument), the group plots are drawn at integer axis
values of 1:n, where 'n' is the number of groups. See the 'at' argument
in ?boxplot.  You can then use mtext() to draw further text if desired.
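
A minimal sketch of the mtext() idea, assuming the default box positions of 1:n. The data are random stand-ins for the vectors read with scan(), and the "n =" annotation is just an example of extra text.

```r
set.seed(1)
# Stand-in data; the original vectors were read with scan()
myData1 <- rnorm(20)
myData2 <- rnorm(30, mean = 1)
myData3 <- rnorm(25, mean = 2)
grp <- c("myData1", "myData2", "myData3")

# boxplot() invisibly returns the statistics it used, including
# the group names and the number of observations per group
bp <- boxplot(myData1, myData2, myData3, names = grp)

# Boxes sit at x = 1, 2, 3 by default, so extra text can be
# placed above them with mtext()
mtext(paste("n =", bp$n), side = 3, at = seq_along(grp), line = 0.5)
```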

HTH,

Marc Schwartz



Re: [R] 2 barplots in the same graph

2006-02-22 Thread Marc Schwartz (via MN)
On Wed, 2006-02-22 at 14:31 +0100, jia ding wrote:
> Hello,
> 
> I have a very simple question about "2 barplots in the same graph".
> 
> It seems quite easy, but I searched Google for a long time and haven't
> found a solution.
> 
> For example, I want one graph like:
> x1=seq(0,2,by=0.3)
> x2=seq(3,0,by=-0.1)
> barplot(x1,col="red")
> barplot(x2,col="green")
> 
> It means if it's on the same graph, some bars are overlapped.
> So if the bars are hollow, instead of filled with color, it will be better.
> 
> Actually, I think it's something similar with matlab's "hold on" command.
> 
> Thanks!
> 
> Nina


I may be misinterpreting your question, but do you want something like
this?

 x1 <- seq(0, 2, by = 0.3)
 x2 <- seq(3, 0, by = -0.1)

 # Set bar fill to white, border to green
 barplot(x2, col = "white", border ="green")

 # Set bar fill to white, border to red and add to prior plot
 barplot(x1, col = "white", border = "red", add = TRUE)


See the 'add' and 'border' arguments in ?barplot.


HTH,

Marc Schwartz



Re: [R] OT Futility Analysis

2006-02-22 Thread Marc Schwartz (via MN)
On Wed, 2006-02-22 at 11:57 -0500, Kevin E. Thorpe wrote:
> Thank you Spencer and Steve for your helpful comments.  If I may, I
> would like to elaborate on some of the points you raise.

Kevin,

I am not sure if you received any offlist replies to your post. Given
the subject matter, I had considered that you might have.

You might find the following thread over in the MedStats group to be of
interest:

http://groups.google.com/group/MedStats/browse_frm/thread/144c97dc5cfc4f00?tvc=1

It discusses some of the issues of early stopping, in this particular
case due to "running out of funds". Some of the points raised below are
addressed in that thread.

MedStats, BTW, would be a good forum to consider for your query.

> Stephen A Roberts wrote:
> > I would take the line that if they hadn't pre-specified any stopping
> > rules, the only reason to stop is safety or new external data. I
> > would be very suspicious of requests from the steering committee to
> > stop for futility - they should be blinded so why are they thinking
> > futility unless results have leaked? I would argue that they are
> > obliged to finish the trial once they start.
> 
> In general I agree with this.  In this case the request for a futility
> analysis came from the sponsor (a drug company).  It is a classic case
> of company B buys company A and wants to stop R&D on company A's drugs.
> Therefore the company was looking for a reason to stop.  Now that they
> will stop producing the drug used in the trial, recruitment will end
> before reaching its target.  Now the Steering Committee's point of
> view is that if there is any reasonable hope, they would find some
> other way to continue recruitment.  I am confident that results have
> not leaked.  I am well acquainted with the data management and blinding
> procedures in place for the trial.

Has the decision to cease production already been made or is the sponsor
still open to being "sold" on the idea of keeping the study going,
pending the outcome of your further work?

If production of the study treatment has already ceased, the ability of
the SC to make a business case to the sponsor may be a foregone
conclusion if there is insufficient product available to continue.

> > This is an example of the need to sort out these things in advance -
> > look up the stuff from the UK DAMOCLES project. The recent book
> > edited by DeMets et al (Data Monitoring in Clinical Trials: A Case
> > Studies Approach) is a good read on these sorts of issues and I think
> > there is a more statistical book from the same group of authors.
> 
> Thanks for the reference.  My library has it, so will give it a look.
> 
> > As far as software is concerned, futility analysis and conditional
> > power are simply standard analyses with made up data and more-or-less
> > justifiable assumptions.
> 
> I am also interested if there are good alternatives to conditional
> power for this type of scenario.
> 
> > Steve.
> > 
> > 
> > 
> >> 
> >> What does this particular Steering Committee think a "futility 
> >> analysis" is?  Do they have any particular reference(s)?  What do
> >> you find in your own literature review?
> >> 
> >> If it were my problem, I think I'd start with questions like that. 
> >> Your comments suggested to me a confounding of technical and
> >> political problems.  The politics suggests the language you need to
> >> use in your response.  Beyond that, I've never heard before of a
> >> "futility analysis", but I think I could do one by just trying to
> >> be clear about the options the Steering Committee might consider
> >> plausible and then comparing them with appropriate simulations --
> >> summarized as confidence intervals, as you suggest.
> 
> I did ask REPEATEDLY for guidelines from the steering committee, but
> none came or are likely to come.  In fact, they wanted me to come up
> with the recommendation, which I find entirely inappropriate, but here
> I am.  So, I don't think I'm confounded between technical and political.

From what I have seen of the regulatory guidance documents, the SC
should not provide you with any guidelines and the analysis should be
done independent of their input, since their input may be biased in
favor of the new drug. As I note below, the mere fact that the SC is
arguing in favor of continuing the study would suggest the possibility
of a priori bias. This is critical to consider, since there are no
pre-specified stopping rules.

In addition, these should not be done by you in isolation either and the
other clinical members of the DMC/DSMB should materially contribute to
the process. They should be just as clinically competent as any members
of the SC relative to putting forth reasonable assumptions upon which to
base any analyses.

> Basically, they want to stop if there is a low chance of rejecting the
> null hypothesis.  This is often referred to as conditional power or
> stochastic curtailment.  I recently saw a paper by Scott Emerson
> pointing out some pr

Re: [R] Changing the x-axis labels in plot()

2006-02-23 Thread Marc Schwartz (via MN)
On Thu, 2006-02-23 at 15:35 +, michael watson (IAH-C) wrote:
> Hi
> 
> Hopefully this one isn't in the manual or I am about to get shot :-S

Bang  ;-)

> One of my colleagues wants a slightly strange graph.  We basically have
> a data matrix, and she wants to plot, for each row, the values in the
> row as points on the graph.  The following code draws the graph just
> fine:
> 
> plot(row(d)[,3:9],d[,3:9])


If I am understanding correctly what you want, you could alternatively
use:

  boxplot(as.data.frame(t(d[, 3:9])))

which provides a somewhat different approach to visualizing the data.
There are other methods as well of course.

> So as there are 12 rows in my matrix, there are 12 columns of points,
> which is what she wants.
> 
> However, she wants the x-axis labelled with the row names, not with
> 1,2,3,4,5 etc
> 
> I can figure out from reading par() how to turn off the default drawing
> of the numerical labels, but how do I use the row names instead?
> 
> Thanks
> Mick


Try:

  plot(row(d)[,3:9], d[,3:9], xaxt = "n")


You can then use the axis() function to specify the labels and tick mark
positions that you want. See ?axis for more information.

In ?par, see 'xaxt' and 'yaxt', which are also referred to in the
description of the 'axes' argument in ?plot.default.
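
Putting the two steps together might look like the following sketch. The matrix 'd' and its row names are invented stand-ins for the real data, and 'las = 2' is an optional extra to keep the labels from overlapping.

```r
# Stand-in for the 12-row matrix 'd'; the row names are made up
set.seed(42)
d <- matrix(rnorm(12 * 9), nrow = 12,
            dimnames = list(paste0("gene", 1:12), NULL))

# Suppress the default x axis, then label each column of points
# with the corresponding row name
plot(row(d)[, 3:9], d[, 3:9], xaxt = "n", xlab = "", ylab = "value")
axis(1, at = 1:nrow(d), labels = rownames(d), las = 2)
```

Because row(d) gives the row index of every cell, the tick positions 1:nrow(d) line up with the columns of points.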

HTH,

Marc Schwartz



Re: [R] Summarize by two-column factor, retaining original factors

2006-02-24 Thread Marc Schwartz (via MN)
On Fri, 2006-02-24 at 08:18 -0800, Matt Crawford wrote:
> I am having trouble doing the following.  I have a data.frame like
> this, where x and y are a variable that I want to do calculations on:
> 
> Name Year x y
> ab   2001  15 3
> ab   2001  10 2
> ab   2002  12 8
> ab   2003  7 10
> dv   2002  10 15
> dv   2002  3 2
> dv   2003  1 15
> 
> Before I do all the other things I need to do with this data, I need
> to summarize or collapse the data by name and year.  I've found that I
> can do things like
> nameyear<-interaction(name,year)
> dataframe$nameyear<-nameyear
> tapply(dataframe$x,dataframe$nameyear,sum)
> tapply(dataframe$y,dataframe$nameyear,sum)
> and then bind those together.
> 
> But my problem is that I need to somehow retain the original Names in
> my collapsed dataset, so that later I can do analyses with the Name
> factors.  All I can think of is something like
> tapply(dataframe$Name,dataframe$nameyear, somefunction?)
> but nothing seems to work.
> 
> I'm actually trying to convert a SAS program, and I can't get out of
> that mindset.  There, it's a simple Proc Means, By Name Year.
> 
> Thanks for any help or suggestions on the right way to go about this.
> 
> Matt Crawford

Matt,

Just use aggregate():

> aggregate(MyDF[, 3:4], list(Name = MyDF$Name, Year = MyDF$Year), sum)
  Name Year  x  y
1   ab 2001 25  5
2   ab 2002 12  8
3   dv 2002 13 17
4   ab 2003  7 10
5   dv 2003  1 15


See ?aggregate for more information.
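
For a self-contained version, the data from the question can be rebuilt as below. The formula interface shown here is an alternative spelling of the same summary; it is available in more recent versions of R, so check ?aggregate on your installation.

```r
# The data from the original post, rebuilt as 'MyDF'
MyDF <- data.frame(
  Name = c("ab", "ab", "ab", "ab", "dv", "dv", "dv"),
  Year = c(2001, 2001, 2002, 2003, 2002, 2002, 2003),
  x    = c(15, 10, 12, 7, 10, 3, 1),
  y    = c(3, 2, 8, 10, 15, 2, 15)
)

# Formula interface: sum x and y within each Name/Year combination
res <- aggregate(cbind(x, y) ~ Name + Year, data = MyDF, FUN = sum)
res
```

The Name and Year columns are kept in the result, so later analyses by Name can work directly on 'res'.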

HTH,

Marc Schwartz



Re: [R] Sorting alphanumerically

2006-02-24 Thread Marc Schwartz (via MN)
On Fri, 2006-02-24 at 12:54 -0600, mtb954 mtb954 wrote:
> I'm trying to sort a DATAFRAME by a column "ID" that contains
> alphanumeric data. Specifically, "ID" contains integers all preceded
> by the character "g" as in:
> 
> g1, g6, g3, g19, g100, g2, g39
> 
> I am using the following code:
> 
> DATAFRAME=DATAFRAME[order(DATAFRAME1$ID),]
> 
> and was hoping it would sort the dataframe by ID in the following manner
> 
> g1, g2, g3, g6, g19, g39, g100
> 
> but it doesn't sort at all. Could anyone point out my mistake?
> 
> Thank you.
> 
> Mark

The values are being sorted by character-based ordering, which may be
affected by your locale.

Thus, on my system, you end up with something like the following:

> ID[order(ID)]
[1] "g1"   "g100" "g19"  "g2"   "g3"   "g39"  "g6"


What you can do, based upon the presumption that the prefix of 'g' is
present as you describe above, is:

> ID[order(as.numeric(gsub("g", "", ID)))]
[1] "g1"   "g2"   "g3"   "g6"   "g19"  "g39"  "g100"


What this does is to use gsub() to strip the 'g' and then order by
numeric value.
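
Applied to a whole data frame, the same idea might look like this sketch. The 'value' column is invented for illustration, and sub("^g", ...) is used so that only a leading 'g' is removed.

```r
# Stand-in data frame with the IDs from the question
DATAFRAME <- data.frame(ID = c("g1", "g6", "g3", "g19", "g100", "g2", "g39"),
                        value = 1:7, stringsAsFactors = FALSE)

# Strip the leading "g", convert to numeric, and order the rows by that
num <- as.numeric(sub("^g", "", DATAFRAME$ID))
sorted <- DATAFRAME[order(num), ]
sorted$ID
```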


HTH,

Marc Schwartz



Re: [R] subsetting a list of matrices

2006-02-28 Thread Marc Schwartz (via MN)
On Tue, 2006-02-28 at 17:14 +, Federico Calboli wrote:
> Hi All,
> 
> I have a list of matrices:
> 
> > x
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
> > y
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]   18   21   24   27   30   33
> [2,]   19   22   25   28   31   34
> [3,]   20   23   26   29   32   35
> > z =list(x,y)
> 
> I want to create a second list that holds a subset of each matrix in the
> first list, namely the 2nd and 3rd rows of each (and all columns).
> 
> How could I do that (apart from looping)?
> 
> Regards,
> 
> Federico Calboli


Here is one approach:

> lapply(z, function(x) x[2:3, ])
[[1]]
     [,1] [,2]
[1,]    2    5
[2,]    3    6

[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   19   22   25   28   31   34
[2,]   20   23   26   29   32   35
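
One thing to watch if the same pattern is used to pull out a single row: R drops the result to a plain vector unless told otherwise. A short sketch, rebuilding the matrices from the question:

```r
x <- matrix(1:6, nrow = 3)
y <- matrix(18:35, nrow = 3)
z <- list(x, y)

# Rows 2 and 3 of every matrix, as in the reply above
sub23 <- lapply(z, function(m) m[2:3, ])

# Selecting a single row would normally drop to a plain vector;
# drop = FALSE keeps each result a one-row matrix
sub2 <- lapply(z, function(m) m[2, , drop = FALSE])
```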


HTH,

Marc Schwartz



Re: [R] jpeg and pixels

2006-02-28 Thread Marc Schwartz (via MN)
On Tue, 2006-02-28 at 16:10 -0600, Erin Hodgess wrote:
> Dear R People:
> 
> When using the jpeg function for plotting,
> is there a way to set the size in inches, please?
> 
> There is an option for width and height in pixels, but
> not inches.
> 
> 
> Any suggestions would be welcome!

The problem is that the size of the resultant image when using bitmaps
is entirely dependent upon the resolution (in pixels per inch) of the
device upon which it is displayed. This is also referred to as dpi or
dots per inch.

Thus, for example, on my system I have a dual display configuration. 

The laptop internal LCD (15 inch diag.) is running at 1600x1200 with a
dpi of 133.

My external LCD display is a 20.1 inch diag., also at 1600x1200, with a
dpi of 98.

Thus, a JPEG image that is 400 pixels x 400 pixels will be roughly 3
inches square on my laptop, but roughly 4 inches square on the external
display.

You need to know the target dpi of the display device and then calculate
the required pixels from there.

An alternative is to use the bitmap() function, where you can specify
height and width arguments, but also need to define the 'res' setting,
which is the dpi desired. Even here, the basic calculation process is
the same:

  Inches = Pixels / DPI
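
Turning the formula around gives the pixel dimensions to request. A minimal sketch, where inches_to_pixels() is a made-up helper name and 98 dpi is the external display mentioned above:

```r
# Pixels needed for a given physical size at a target resolution
# (hypothetical helper, not part of base R)
inches_to_pixels <- function(inches, dpi) round(inches * dpi)

w_px <- inches_to_pixels(4, dpi = 98)   # 4 inches wide at 98 dpi
h_px <- inches_to_pixels(3, dpi = 98)   # 3 inches tall at 98 dpi
c(width = w_px, height = h_px)

# These values would then be passed on to the device, e.g.
# jpeg("plot.jpg", width = w_px, height = h_px)
```

Later versions of R may also let you give the size in inches directly through additional arguments to jpeg(); check ?jpeg on your installation.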


HTH.

Marc Schwartz



Re: [R] Width of bars in barplot2

2006-03-01 Thread Marc Schwartz (via MN)
On Wed, 2006-03-01 at 12:01 -0500, Jamieson Cobleigh wrote:
> I'm using barplot2 to plot some data.  Is there any way to determine
> the width of the bars in the generated plot?  I know that barplot2
> returns a list of the coordinates of the center of each bar, but since
> there is some white space between each bar, I don't know how to get
> the width of each bar.
> 
> Jamie


Unless you have modified the 'width' argument, the default width is 1.

Thus the sides of the bars are the centers +/- 0.5.

If you modified the width argument, then the widths are set to the
vector assigned.

This behavior is the same in barplot() and barplot2() by design.

As an example:

 # Draw a barplot. adjust the y axis to make some room
 # above the bars
 mp <- barplot2(1:10, ylim = c(0, 12))

 # Draw lines showing the bar width limits
 arrows(mp - 0.5, 2:11, mp + 0.5, 2:11, angle = 90, code = 3)
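
When 'width' is set, the same arithmetic applies with half of each bar's own width. The sketch below uses base barplot() so that it runs without extra packages; barplot2() is assumed to behave the same way, as noted above. The heights and widths are made up.

```r
# Bars with unequal widths
h  <- c(3, 5, 4)
w  <- c(1, 2, 3)
mp <- barplot(h, width = w, ylim = c(0, 6))

# Left and right edge of each bar: midpoint -/+ half the width
left  <- mp - w / 2
right <- mp + w / 2

# Mark the edges just above each bar
arrows(left, h + 0.3, right, h + 0.3,
       angle = 90, code = 3, length = 0.05)
```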


HTH,

Marc Schwartz



Re: [R] barplot names.arg

2006-03-06 Thread Marc Schwartz (via MN)
On Mon, 2006-03-06 at 15:40 +0100, Roland Kaiser wrote:
> How can i set a rotation for the names.arg in barplot?


See R FAQ 7.27 How can I create rotated axis labels?:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f


That provides the basic concept, which is easy to use with barplot()
along with knowing that barplot() returns the bar midpoints. See the
Value section of ?barplot.
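
A sketch of how the FAQ approach combines with the returned midpoints. The values, labels, margin size and the 0.25 offset below the axis are all made up and would need adjusting for real data.

```r
vals <- c(5, 3, 8, 6)
labs <- c("first long name", "second long name",
          "third long name", "fourth long name")

# Leave extra room at the bottom for the slanted labels
op <- par(mar = c(8, 4, 2, 2) + 0.1)

# barplot() returns the bar midpoints; no names.arg, so no labels yet
mp <- barplot(vals)

# Place the labels just below the axis, rotated 45 degrees.
# xpd = TRUE allows drawing outside the plot region.
text(mp, par("usr")[3] - 0.25, labels = labs,
     srt = 45, adj = 1, xpd = TRUE)

par(op)
```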

HTH,

Marc Schwartz



Re: [R] is there a way to let R do smart matrix-vector operation?

2006-03-06 Thread Marc Schwartz (via MN)
On Mon, 2006-03-06 at 15:10 -0800, Michael wrote:
> Hi all,
> 
> I want to subtract vector B from each of A's columns... how can R do that
> smartly without a loop?
> 
> > A=matrix(c(2:7), 2, 3)
> > A
>      [,1] [,2] [,3]
> [1,]    2    4    6
> [2,]    3    5    7
> > B=matrix(c(1, 2), 2, 1)
> > B
>      [,1]
> [1,]    1
> [2,]    2
> > A-B
> Error in A - B : non-conformable arrays


> apply(A, 2, "-", B)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    1    3    5


You can use apply() on column-wise operations such as this.

See ?apply for more information.
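
Two other base R spellings give the same result here, shown as a sketch alongside the apply() call. Both rely on B having one value per row of A.

```r
A <- matrix(2:7, 2, 3)
B <- matrix(c(1, 2), 2, 1)

r_apply <- apply(A, 2, "-", B)

# A plain vector as long as nrow(A) is recycled down each column
r_recycle <- A - as.vector(B)

# sweep() with MARGIN = 1 subtracts B[i] from every element of row i
r_sweep <- sweep(A, 1, as.vector(B), "-")
```

The "non-conformable arrays" error in the question comes from B being a 2 x 1 matrix; once it is a plain vector, ordinary recycling applies.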

HTH,

Marc Schwartz



Re: [R] Adding polygons to a barplot

2006-03-08 Thread Marc Schwartz (via MN)
On Wed, 2006-03-08 at 14:02 -0500, Jamieson Cobleigh wrote:
> I have a barplot I have created using barplot2 and I have been able to
> add points and lines (using the points and lines methods,
> respectively).  I now need to add some polygons (triangles in
> particular), that I want to be shaded to match bars in the plot.  I
> can get the coordinates of the corners of the triangles, but don't
> know how to draw the triangles.  I know there is the grid.polygon
> method, but I don't know how to get it to draw on my plot.  Any help
> would be appreciated.
> 
> Thanks!
> 
> Jamie

There is a polygon() function in base R graphics. It supports a
'density' argument in the same fashion as barplot()/barplot2().

See ?polygon

help.search("polygon") would have gotten you there.
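
As a rough sketch with base barplot() (barplot2() should accept the same polygon() call afterwards), using invented bar heights and an invented triangle on top of the second bar:

```r
vals <- c(4, 7, 5)
mp <- barplot(vals, density = 20, col = "blue", ylim = c(0, 10))

# Corner coordinates of a triangle sitting on top of the second bar
# (made-up geometry, purely for illustration)
tri_x <- c(mp[2] - 0.5, mp[2] + 0.5, mp[2])
tri_y <- c(vals[2], vals[2], vals[2] + 2)

# Shade the triangle with the same line density and color as the bars
polygon(tri_x, tri_y, density = 20, col = "blue")
```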

HTH,

Marc Schwartz



Re: [R] How to plot the xaxis label at 45 degree angle?

2006-03-08 Thread Marc Schwartz (via MN)
On Wed, 2006-03-08 at 16:53 -0500, Lisa Wang wrote:
> Hello there,
> 
> I would like to plot a graph with the x axis's label displayed at a 45
> angle to the x axis instead of horizontal to it as the label is very
> long. What should I do?
> 
> Thank you for your help in advance


See R FAQ 7.27 How can I create rotated axis labels?

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f


Using:

 > RSiteSearch("rotated axis labels")

would be helpful as well.

HTH,

Marc Schwartz



Re: [R] different values of a vector

2006-03-14 Thread Marc Schwartz (via MN)
On Tue, 2006-03-14 at 18:45 +0100, Arnau Mir wrote:
> Hello.
> 
> I have a vector of length 2771 but it has only 87 different values.
> How can I obtain them?
> 
> Thanks,
> 
> Arnau.

If you just want the unique values themselves, you can use:

  unique(vector)

For example:

> v
 [1] "b" "b" "c" "a" "a" "a" "c" "c" "c" "c"

> unique(v)
[1] "b" "c" "a"


If you wanted counts of each unique value, you can use table():

> table(v)
v
a b c 
3 2 5 


HTH,

Marc Schwartz



Re: [R] Hodges-lehmann test and CI/significance

2006-03-14 Thread Marc Schwartz (via MN)
On Tue, 2006-03-14 at 14:07 -0500, Sean Davis wrote:
> Does anyone know of an implementation in R of the Hodges-Lehmann
> nonparametric difference between two groups?  I am interested in the
> estimate of the difference and the CI or significance of that difference.  I
> did some quick searching and didn't see it, but I may not have been looking
> for the right name, etc.
> 
> Thanks,
> Sean

Sean,

See the Details section of ?wilcox.test and/or the wilcox.exact()
function in the exactRankTests CRAN package by Torsten Hothorn and Kurt
Hornik.

BTW:

  RSiteSearch("Hodges-Lehmann")

would get you to both functions.
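
A short sketch with wilcox.test(), using small made-up samples without ties so that the exact method applies. In that case the reported "difference in location" should agree with the median of all pairwise differences, which is computed by hand at the end for comparison.

```r
# Small made-up samples without ties, so the exact method is used
x <- c(1.1, 2.3, 3.7, 4.2, 5.9)
y <- c(0.4, 1.6, 2.8, 3.1)

wt <- wilcox.test(x, y, conf.int = TRUE)
wt$estimate   # "difference in location"
wt$conf.int   # confidence interval for that difference

# Hodges-Lehmann estimate by hand: median of all pairwise differences
hl <- median(outer(x, y, "-"))
hl
```

With ties or larger samples wilcox.test() switches to a normal approximation, and the estimate may then differ slightly from the hand calculation; see the Details section of ?wilcox.test.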

HTH,

Marc Schwartz



Re: [R] matrix indexing

2006-03-15 Thread Marc Schwartz (via MN)
On Wed, 2006-03-15 at 06:03 -0500, tom wright wrote:
> Can someone please give me a pointer here.
> I have two matrices
> 
> matA
>   A   B   C
> 1 5   2   4
> 2 2   4   3
> 3 1   2   4
> 
> matB
>   A     B     C
> 1 TRUE  FALSE TRUE
> 2 FALSE TRUE  TRUE
> 3 FALSE FALSE FALSE
> 
> how do I extract all the values from matA where the corresponding entry
> in matB == TRUE (or FALSE), preferably in vector form.
> 
> Many thanks
> tom

The subsetting/indexing is premised on the index values being TRUE,
thus:

> matA[matB]
[1] 5 4 4 3

> matA[!matB]
[1] 2 1 2 2 4
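
For a self-contained version, the two matrices can be rebuilt as below. The which() call is an optional extra, in case the row and column positions of the extracted values are also needed.

```r
matA <- matrix(c(5, 2, 1, 2, 4, 2, 4, 3, 4), nrow = 3,
               dimnames = list(as.character(1:3), c("A", "B", "C")))
matB <- matrix(c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE),
               nrow = 3, dimnames = dimnames(matA))

# Values are extracted in column-major order
# (down column A, then B, then C)
vals <- matA[matB]

# Row/column positions of the TRUE entries, in the same order
pos <- which(matB, arr.ind = TRUE)
```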

HTH,

Marc Schwartz



Re: [R] Question about 'lables' & ect.

2006-03-15 Thread Marc Schwartz (via MN)
On Wed, 2006-03-15 at 17:54 +0100, jia ding wrote:
> Hi,
> 
> I have a file named:
> test_R.txt
> aaa  2
> bbb  5
> ccc  7
> sss  3
> xxx  8
> 
> I want to have a plot:
> test<-read.table("test_R.txt",col.name=c("Name","Score"))

> par(mfrow=c(1,2))

It's not clear what the purpose is here, at least in this example. Do
you plan on creating a second plot?

> barplot(test$Score)
> name<-test$Name
> axis(1,at=1:length(test$Name),labels=paste(name))
> 
> Q1, if you try the script above,you will get 5 bars, the axis only shows
> "aaa", "ccc","xxx", but where are "bbb"&"sss"?

The easiest way to do this is to use the 'names.arg' argument in
barplot():

barplot(test$Score, names.arg = as.character(test$Name))

Note that the 'Name' column in the 'test' data frame will be a factor by
default, so you need to convert it to a character vector here.

> Q2, pls have a look this x-axis again, you will find the middle of the bars
> are not pointing to the x-axes.

Note that in the Value section of ?barplot, it indicates that barplot()
returns the bar midpoints, which are not at integer values along the x
axis.

You would need to do something like:

mp <- barplot(test$Score)

axis(1, at = mp, labels = as.character(test$Name))


> Q3, how can i change the width of the bars? I feel they are too "fat".

You can use the 'space' argument:

barplot(test$Score, names.arg = as.character(test$Name), space = 0.5)


See the descriptions of the 'width' and 'space' arguments in ?barplot
for some of the subtleties here.

See ?barplot for more information and further examples.

HTH,

Marc Schwartz



Re: [R] which.minimums not which.min

2006-03-15 Thread Marc Schwartz (via MN)
On Wed, 2006-03-15 at 11:32 -0800, Fred J. wrote:
>   Hi
>
>   Is there a function which determines the location, i.e., index of
> the all minimums or maximums of a numeric vector.
>   Which.min(x) only finds the (first) of such.
>
>   > x <- c(1:4,0:5, 4, 11)
>   > x
>[1]  1  2  3  4  0  1  2  3  4  5 4 11
>   > which.min(x)
>   [1] 5
>   > which.max(x)
>   [1] 11
>   >
>
>   but I need 
>   which.min(x)  to be 5 11
>   which.max(x) to be 4 10
>
>   thanks
>

There is something wrong with your example code versus data here, since:

> x
 [1]  1  2  3  4  0  1  2  3  4  5  4 11

> which.min(x)
[1] 5

> which.max(x)
[1] 12


There is only one minimum value of 0 in that vector and only one maximum
value of 11.

If you had a vector 'x':

> x <- c(1:4, 0:5, 4, 0, 5)

> x
 [1] 1 2 3 4 0 1 2 3 4 5 4 0 5


You could then do the following to get the indices of the multiple
min/max values:

> which(x == min(x))
[1]  5 12

> which(x == max(x))
[1] 10 13


The only other thing I can think of is that you might be looking for
local minima/maxima in the vector. If that is what you want, using:

  RSiteSearch("local minima")

or

  RSiteSearch("peaks")


should lead you to some solutions that have been discussed previously.
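
If local minima/maxima are indeed what is wanted, a base R sketch along the following lines may be enough for simple cases. It assumes there are no flat stretches (equal neighbouring values) and it never reports the first or last element; it is not taken from any of the previously discussed solutions.

```r
x <- c(1:4, 0:5, 4, 11)

# Sign of the slope between neighbours: +1 rising, -1 falling
s <- sign(diff(x))

# A peak is a rise followed by a fall; a pit is a fall followed by a rise
peaks <- which(diff(s) == -2) + 1
pits  <- which(diff(s) ==  2) + 1

peaks
pits
```

For the vector in the question this gives positions 4 and 10 for the peaks and 5 and 11 for the pits, which matches the output the poster asked for.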

HTH,

Marc Schwartz



Re: [R] which.minimums not which.min

2006-03-15 Thread Marc Schwartz (via MN)
On Wed, 2006-03-15 at 21:45 +0100, Philippe Grosjean wrote:
> What Fred is looking for is local minima/maxima, also known as turning 
> points, or pits/peaks in a series.  You can look at ?turnpoints in 
> pastecs library.
> 
>  > x <- c(1:4,0:5, 4, 11)
>  > x
>   [1]  1  2  3  4  0  1  2  3  4  5  4 11
>  > tp <- turnpoints(x)
>  > summary(tp)
> Turning points for: x
> 
> nbr observations  : 12
> nbr ex-aequos : 0
> nbr turning points: 4 (first point is a peak)
> E(p) = 6.67 Var(p) = 1.81 (theoretical)
> 
>    point type       proba      info
> 1      4 peak 0.1         3.3219281
> 2      5  pit 0.002380952 8.7142455
> 3     10 peak 0.005952381 7.3923174
> 4     11  pit 0.7         0.5849625
>  > plot(tp) # Only useful for a longer and more complex series!
>  > # Get the position of peaks
>  > (1:length(x))[extract(tp, no.tp = FALSE, peak = TRUE, pit = FALSE)]
> [1]  4 10
> Warning message:
> arguments after the first two are ignored in: UseMethod("extract", e, n, 
> ...)
>  > (1:length(x))[extract(tp, no.tp = FALSE, peak = FALSE, pit = TRUE)]
> [1]  5 11
> Warning message:
> arguments after the first two are ignored in: UseMethod("extract", e, n, 
> ...)
>  > # By the way, there are warnings although it works well (I ask on R-Help)
> 
> Now, you can easily code your which.minima() function using turnpoints:
> 
> x <- c(1:4,0:5, 4, 11)
> x
> tp <- turnpoints(x)
> summary(tp)
> plot(tp) # Only useful for a longer and more complex series!
> # Get the position of peaks
> (1:length(x))[extract(tp, no.tp = FALSE, peak = TRUE, pit = FALSE)]
> (1:length(x))[extract(tp, no.tp = FALSE, peak = FALSE, pit = TRUE)]
> # By the way, there are warnings although it works well (I ask on R-Help)
> 
> which.minima <- function(x) {
>   if (!require(pastecs)) stop("pastecs library is required!")
>   x <- as.vector(x)
>   (1:length(x))[extract(turnpoints(x), no.tp = FALSE, peak = FALSE, pit = 
> TRUE)]
> }
> 
> which.minima(x)
> 
> Of course, you could optimize this code. This is just a rough solution!
> Best,
> 
> Philippe Grosjean


Philippe,

Thanks for the clarification. As with Andy's reply, it seems that my
closing thoughts were correct.

I was confused since the actual result of which.max() in Fred's post did
not match the data provided.
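
For what it is worth, the pit and peak positions in your example can also be located in base R without pastecs. This is only a rough sketch and not how turnpoints() works internally: the two function names are made up here, and the approach assumes that no two neighbouring values are equal (plateaus are not handled).

```r
# Hypothetical helpers: positions of interior local minima / maxima.
# Assumes no equal neighbouring values (plateaus are not handled).
which.minima2 <- function(x) which(diff(sign(diff(x))) == 2) + 1
which.maxima2 <- function(x) which(diff(sign(diff(x))) == -2) + 1

x <- c(1:4, 0:5, 4, 11)
which.minima2(x)   # pits
which.maxima2(x)   # peaks
```

On the series above this gives the same pit and peak positions as the extract() calls.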

Best regards,

Marc



Re: [R] exporting graphics

2006-03-20 Thread Marc Schwartz (via MN)
On Mon, 2006-03-20 at 17:33 -0500, [EMAIL PROTECTED] wrote:
> Hi,
> 
> This is a very fundamental question. I want to export graphical results 
> so that I can place them in an openoffice document.

>  I use Fedora 5. 

That was quick... ;-)

The best way to do this under Linux is to use the R postscript()
function to create an Encapsulated PostScript file (EPS) and then import
that into OO.org.

See ?postscript and pay particular attention to the Details section,
which defines that you should use the following arguments to
successfully create an EPS file for use in this fashion:

postscript(file = "YourEPSFile.eps", horizontal = FALSE, onefile = FALSE,
           paper = "special", ...)

The additional key arguments that you will need to pay attention to are
the 'height' and 'width' arguments to specify the dimensions of the
graphic that you desire. These should be set "close to", if not exactly,
the actual size that you require in your document.
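
A complete call might look like the following sketch. The 6 x 4 inch size, the temporary file location and the plot itself are only placeholders, so adjust them to your document:

```r
# Write a 6 x 4 inch EPS file; the file name and plot are placeholders.
eps_file <- file.path(tempdir(), "YourEPSFile.eps")

postscript(file = eps_file, horizontal = FALSE, onefile = FALSE,
           paper = "special", width = 6, height = 4)
plot(1:10, main = "Example plot")
dev.off()   # close the device so the file is completely written

file.exists(eps_file)
```

The dev.off() call matters: until the device is closed, the EPS file is incomplete.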

Note that when you import the file into OO.org, a bitmapped preview of
the R graphic image will be created. This preview will be low quality
and suitable for aiding placement, but not really for viewing the
graphic.

However when the document is printed to a Postscript (PS) file or
printer, using a PS compatible printer driver, the output will be of
high quality. 

If you need a PDF file created, you can then use 'ps2pdf' to create it,
or if you have set up the 'PDF Converter' printer in OO.org using
'spadmin', you can use this as well (presuming that they kept this in
FC5). 

At least through FC4, the PDF Export function (the PDF icon on the
toolbar) did not properly print embedded EPS files. The lower quality
bitmapped preview image is what gets printed.

HTH,

Marc Schwartz



Re: [R] ROWNAMES error message

2006-03-21 Thread Marc Schwartz (via MN)
On Tue, 2006-03-21 at 14:26 -0500, mark salsburg wrote:
> I am getting an error message, which I do not know the source to.
> 
> I have a matrix SAMPLES that has preexisting rownames that I would like to
> change.

SAMPLES is not a matrix, it is a data frame, as your output shows below.

> GENE_NAMES contains these rownames.
> 
> 
> > rownames(SAMPLES) = GENE_NAMES
> Error in "dimnames<-.data.frame"(`*tmp*`, value = list(list(V1 = c(3843,  :
> invalid 'dimnames' given for data frame
> > dim(SAMPLES)
> [1] 12626    20
> > dim(GENE_NAMES)
> [1] 12626 1
> > is.data.frame(SAMPLES)
> [1] TRUE
> > is.data.frame(GENE_NAMES)
> [1] TRUE
> 
> I have tried converting GENE_NAMES to a factor, R will not allow me because
> its says "x must be atomic"
> 
> ANY IDEAS??

GENE_NAMES is presumably a data frame with a single column. You need to
properly access the single column by name or index and use that as the
RHS of the assignment. So something like one of the following should work:

  rownames(SAMPLES) <- GENE_NAMES$V1

or 

  rownames(SAMPLES) <- GENE_NAMES[, 1]

Use:

  str(GENE_NAMES)

which will display the structure of GENE_NAMES.

'atomic' means that the data in question is one of the primary data
types defined for R. See ?is.atomic for more information.
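
Here is a small self-contained illustration. The two data frames are toy stand-ins for yours (the column names s1, s2 and the gene labels are invented), and the as.character() call is only a precaution in case the column is a factor:

```r
# Toy stand-ins for the poster's objects (the real ones have 12626 rows).
SAMPLES    <- data.frame(s1 = c(1.2, 3.4, 5.6), s2 = c(7.8, 9.0, 1.1))
GENE_NAMES <- data.frame(V1 = c("geneA", "geneB", "geneC"))

str(GENE_NAMES)                 # a data frame with one column
is.atomic(GENE_NAMES)           # FALSE: a data frame is a list
is.atomic(GENE_NAMES[, 1])      # TRUE: the column itself is a vector

# as.character() guards against the column being a factor
rownames(SAMPLES) <- as.character(GENE_NAMES[, 1])
SAMPLES
```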

HTH,

Marc Schwartz



Re: [R] Date in dataframe manipulation

2006-03-24 Thread Marc Schwartz (via MN)
On Fri, 2006-03-24 at 15:29 -0500, Dan Chan wrote:
> Hi,
> 
> I have a dataframe with many columns, including date and I want to keep
> only a few of the columns including date column.
> 
> I used the following command: 
> with(FireDataAppling, cbind(STARTDATE, County, TOTAL, CAUSE))
> 
> It works, but the date becomes days from Jan 1, 2001.  
> 
> FireDataAppling$STARTDATE[1] gives
> [1] 2001-01-04 00:00:00  
> 1703 Levels: .

This output suggests that STARTDATE is a factor, rather than a Date
related data type. Did you read this data in via one of the read.table()
family of functions? If these values are quoted character fields in the
imported text file, they will be converted to factors by default.

> After the cbind command, the entry becomes a 4.  
> 
> I want to get 2001-01-04.  What command should I use?  
> 
> Thank you. 

You might want to review the "Note" section in ?cbind, relative to the
result of cbind()ing vectors of differing data types. By using with(),
you are effectively taking the data frame columns as individual vectors
and the resultant _matrix_ will be coerced to a single data type, in
this case, presumably numeric. I am guessing that 'County' and 'CAUSE'
are also factors, whereas 'TOTAL' is numeric.

Using str(FireDataAppling) will give you some insight into the structure
of your data frame.

The '4' that you are getting is the factor level numeric code for the
entry above, not the number of days since Jan 1, 2001, which is not a
default 'origin' date in R. Jan 1, 1970 is.

You might want to look at ?factor for more insight here.

If you want to retain only a _subset_ of the columns in a data frame,
use the subset() function:

  subset(FireDataAppling, select = c(STARTDATE, County, TOTAL, CAUSE))

This will return a data frame and retain the original data types. If you
want to then perform actual Date based operations on those values, take
a look at ?DateTimeClasses, paying attention to the "See Also" section
relative to associated functions.
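
To make the above concrete, here is a sketch on made-up data. The values in the toy data frame are invented, and I am assuming that STARTDATE is a factor whose labels start with a YYYY-MM-DD date, as your output suggests:

```r
# Toy data; STARTDATE is deliberately a factor, as in the original post.
FireDataAppling <- data.frame(
  STARTDATE = factor(c("2001-01-04 00:00:00", "2001-01-02 00:00:00")),
  County    = factor(c("Appling", "Appling")),
  TOTAL     = c(2.5, 1.0),
  CAUSE     = factor(c("Debris", "Lightning"))
)

# cbind() builds a numeric matrix, so factors appear as their level codes
m <- with(FireDataAppling, cbind(STARTDATE, County, TOTAL, CAUSE))
m

# subset() keeps a data frame and the original column types
sel <- subset(FireDataAppling, select = c(STARTDATE, County, TOTAL, CAUSE))
str(sel)

# Convert the factor labels (not the codes) to Date; trailing characters
# after the matched format are ignored by as.Date()
sel$STARTDATE <- as.Date(as.character(sel$STARTDATE), format = "%Y-%m-%d")
sel$STARTDATE
```

Going through as.character() first is the important step: converting the factor directly would operate on the level codes rather than on the date text.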

HTH,

Marc Schwartz



Re: [R] can R be run without installation on to a computer

2006-03-31 Thread Marc Schwartz (via MN)
On Sat, 2006-04-01 at 06:45 +1000, Bob Green wrote:
> I have been trying for a year to get approval to install R on a work 
> computer and am not optimistic of a positive reply in the near future. I 
> was considering whether an option might be to run R from a CD/USB stick.  I 
> looked through the installation manual but could see no mention of this 
> option.
> 
> If it is possible to run R from a CD or a USB stick without installation to 
> a computer, I would appreciate direction to information or advice on how 
> this might be done,
> 
> Any assistance is appreciated,
> 
> regards
> 
> Bob Green

Your e-mail headers suggest that you are on Windows, thus you might want
to review this post by Prof. Ripley in a thread from 2004 addressing
this issue:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/41937.html

Alternatively, if you are comfortable running Linux, Dirk Eddelbuettel
has kindly provided Quantian:

http://dirk.eddelbuettel.com/quantian.html

which is a Debian Linux based distribution that can run as a Live CD and
can be booted from the CD without having to actually install it.

HTH,

Marc Schwartz



Re: [R] can R be run without installation on to a computer

2006-03-31 Thread Marc Schwartz (via MN)
On Fri, 2006-03-31 at 22:26 +0100, Prof Brian Ripley wrote:
> On Fri, 31 Mar 2006, Marc Schwartz (via MN) wrote:
> 
> > On Sat, 2006-04-01 at 06:45 +1000, Bob Green wrote:
> >> I have been trying for a year to get approval to install R on a work
> >> computer and am not optimistic of a positive reply in the near future. I
> >> was considering whether an option might be to run R from a CD/USB stick.  I
> >> looked through the installation manual but could see no mention of this 
> >> option.
> >>
> >> If it is possible to run R from a CD or a USB stick without installation to
> >> a computer, I would appreciate direction to information or advice on how
> >> this might be done,
> >>
> >> Any assistance is appreciated,
> >>
> >> regards
> >>
> >> Bob Green
> >
> > Your e-mail headers suggest that you are on Windows, thus you might want
> > to review this post by Prof. Ripley in a thread from 2004 addressing
> > this issue:
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/41937.html
> 
> There is an updated version of that in the rw-FAQ, Q2.6.
> 
> I run R from a USB drive quite often, both for demos in talks (on other 
> people's Windows systems) and to test R on different Windows versions 
> (e.g. 98) with minimal effort.

Prof. Ripley, thanks for the pointer. I should review the Windows FAQ
more often, despite essentially not using Windows any longer.

I think that there is a typo in the second sentence in that FAQ:

...so you can burn an image on the R installation on your hard disc...

looks like it perhaps should read:

...so you can burn an image _of_ the R installation on your hard disc...

Best regards,

Marc



Re: [R] printing output to a file from the command line

2006-04-13 Thread Marc Schwartz (via MN)
On Thu, 2006-04-13 at 15:11 -0500, Chad Reyhan Bhatti wrote:
> Hello,
> 
> I have been looking for a way to print output to a file from the command
> line.  I have looked at write(), dump(), dput(), etc and none of these
> seem to have the capability I am needing.  Imagine that you have the
> output of lm(), glm(), or optim().
> 
> out <- lm();
> out <- glm();
> out <- optim();
> 
> I would like to be able to write(out, file="out.txt",replace=TRUE),
> write2file(out,file="out.txt",replace=TRUE) or
> print(out, file="out.txt",replace=TRUE).  I have several outputs to be
> printed so I
> would prefer it if I could write them to file from the command line as
> opposed to pasting them by hand.
> 
> Thanks,
> 
> Chad R. Bhatti

See ?sink and ?capture.output
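
A minimal sketch of both approaches follows. The model and the file location are just examples (the 'cars' data set ships with R), so substitute your own lm(), glm() or optim() results:

```r
out <- lm(dist ~ speed, data = cars)   # 'cars' ships with R
out_file <- file.path(tempdir(), "out.txt")

# Option 1: capture the printed output of one expression
capture.output(summary(out), file = out_file)

# Option 2: divert everything printed until sink() is called again
sink(out_file, append = TRUE)
print(out)
print(anova(out))
sink()                                  # restore output to the console

readLines(out_file)[1:5]
```

capture.output() is convenient for a single object, while sink() suits a longer stretch of code where several results should land in the same file.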

HTH,

Marc Schwartz



Re: [R] Working directory

2006-04-13 Thread Marc Schwartz (via MN)
On Thu, 2006-04-13 at 16:28 -0400, Gong, Yanyan wrote:
> Hi,
> 
> I am a new user of "R", I am trying to read my data in.. "Cervixhc.dat" used
> to be in a different directory, now it has been moved to "O:\E&s\APC cervix
> FINAL (YG,MC,MD)\Manuscript\Data", but when I ran the following program (in
> red) I got an error message "Error in setwd(dir) : cannot change working
> directory", and "Error in file(file, "r", encoding = encoding) : 
> unable to open connection
> In addition: Warning message:
> cannot open file 'O:E&sAPC cervix FINAL (YG,MC,MD)ManuscriptPrograms
> t.R', reason 'Invalid argument' "
> 
> Here is my program:
> 
> setwd("O:\E&s\APC cervix FINAL (YG,MC,MD)\Manuscript\Data")
> 
> library(Epi)
> source("O:\E&s\APC cervix FINAL (YG,MC,MD)\Manuscript\Programs\tt.R")
> cervix_all<-read.table("cervixhc.dat",header=T)
> 
> Wonder whether you can help me to solve the problem? Thanks very much!
> 
> Sincerely,
> 
> Yanyan


See R for Windows FAQ 2.16 "R can't find my file, but I know it is
there!"

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#R-can_0027t-find-my-file
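
In short, and as far as I recall that FAQ entry, the backslash is an escape character inside R character strings, so a Windows path needs either forward slashes or doubled backslashes. A sketch using your directory name (the setwd() call is commented out, since the directory only exists on your machine):

```r
# Forward slashes work on Windows too
data_dir <- "O:/E&s/APC cervix FINAL (YG,MC,MD)/Manuscript/Data"

# Alternatively, double each backslash inside the string
data_dir2 <- "O:\\E&s\\APC cervix FINAL (YG,MC,MD)\\Manuscript\\Data"

# Both spell the same path once the separators are normalised
identical(gsub("\\", "/", data_dir2, fixed = TRUE), data_dir)

# A single backslash starts an escape: "\t" is one tab character
nchar("\t")

# setwd(data_dir)   # uncomment on a machine where the directory exists
```

This is also why the path in your warning message shows 'Programs' followed by a gap and 't.R': the "\t" in "\tt.R" was read as a tab.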

HTH,

Marc Schwartz



Re: [R] assignment to a symbol created by paste

2006-04-13 Thread Marc Schwartz (via MN)
On Thu, 2006-04-13 at 15:04 -0500, Chad Reyhan Bhatti wrote:
> Hello,
> 
> I am creating a number of objects that I wish to have a common name with
> an index such as x1, x2, x3, ... I would like to do everything in a loop to
> make the code compact and minimize the probability of an error by typo.
> A test problem may look like
> 
> for (j in 1:10){
>   as.symbol(paste("x",j,sep="")) <- j;
> }
> 
> which ideally would produce x1 = 1, ... x10 = 10.  However, this does not
> work.
> 
> > as.symbol(paste("x",1,sep="")) <- 2
> Error: Target of assignment expands to non-language object
> >
> 
> Any help?
> 
> Thanks,
> 
> Chad R. Bhatti


See the first example in ?assign
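
Applied to your test problem, a sketch would be as follows. The named list at the end (called 'xs' here, my own choice of name) is an alternative that is often easier to handle than many separate objects:

```r
for (j in 1:10) {
  assign(paste("x", j, sep = ""), j)
}
x1
x10

# get() retrieves an object by its constructed name
get(paste("x", 3, sep = ""))

# A named list is often easier to work with than many separate objects
xs <- as.list(1:10)
names(xs) <- paste("x", 1:10, sep = "")
xs$x7
```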

HTH,

Marc Schwartz


P.S. To R Core:

The comment in that example currently reads:

#-- Create objects  'r1', 'r2', ... 'r6' --

It should be (note periods in vector names, as sep = "."):

#-- Create objects  'r.1', 'r.2', ... 'r.6' --



Re: [R] help

2006-04-17 Thread Marc Schwartz (via MN)
On Mon, 2006-04-17 at 10:17 -0400, Gong, Yanyan wrote:
> Hi, I am trying to run an age-period-cohort model, but here is what I am
> having problem with, hope you can help me!
> 
> This is what I am trying to do:
> sumzero_a<-((A-min(A))/5+1) - mean((A-min(A))/5+1)  where A is my age
> variable (numeric, the mid-point of a five-year age group), but I got the
> following error:
> 
> Error in min(..., na.rm = na.rm) : invalid 'mode' of argument
> 
> I am pretty sure I have the variable "A" in my data, can you see why this
> happens?
> 
> Thanks.
> 
> Yanyan

'A' does exist, otherwise you would get something like the following:

> min(A)
Error in min(A) : object "A" not found


More than likely, 'A' is not a numeric vector. It could be a character
vector, or a list or some other non-atomic data type.  It is not a factor,
otherwise you would get something like the following:

Error in Summary.factor(..., na.rm = na.rm) :
min not meaningful for factors


See ?is.atomic for some additional information.

Without your code, it would be difficult to ascertain how 'A' is
created. However, consider the following example:

> L <- LETTERS

> L
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q"
[18] "R" "S" "T" "U" "V" "W" "X" "Y" "Z"

> min(L)
Error in min(..., na.rm = na.rm) : invalid 'mode' of argument

> mode(L)
[1] "character"


I would use:

str(A)

or

mode(A)

to see what 'A' is. That should give you a hint.
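
If it turns out that 'A' was read in as text, a conversion along these lines may be all that is needed. This is a sketch under that assumption only: the three age mid-points are invented and 'A_num' is just a name I picked:

```r
# Hypothetical 'A': age mid-points that were read in as text
A <- c("22.5", "27.5", "32.5")

str(A)
mode(A)          # "character", so arithmetic on A will fail
is.numeric(A)    # FALSE

# Convert; as.character() first also covers the case where A is a factor
A_num <- as.numeric(as.character(A))

sumzero_a <- ((A_num - min(A_num)) / 5 + 1) - mean((A_num - min(A_num)) / 5 + 1)
sumzero_a
```

If as.numeric() returns NA values with a warning, some entries are not valid numbers and the import step is the place to look.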

HTH,

Marc Schwartz



Re: [R] lambda, uncertainty coefficient (& Somers D)

2006-04-18 Thread Marc Schwartz (via MN)
On Tue, 2006-04-18 at 16:40 +0300, Antti Arppe wrote:
> Dear colleagues in R,
> 
> Has anybody implemented the
> 
> 1) (Goodman & Kruskal) lambda
> 
> or the
> 
> 2) (Theil's) uncertainty coefficient
> 
> statistics (in the asymmetric and symmetric forms), or is anyone aware 
> that they might reside in some package? A search in the R archives 
> does indicate that they are (somehow) part of the CoCo package, but I 
> would rather not start learning how to transform my data into 
> CoCo-format in order to access these functions, regardless of whether 
> the CoCo versions are actually intended for calculating the actual 
> statistic or for some other package internal purposes, as may 
> sometimes be the case.
> 
> Furthermore, it appears to me that the 'somers2' function in the Hmisc 
> package applies Somers' D only to 2x2 and not larger tables. Am I 
> mistaken, or does there exist somewhere else an implementation of the 
> Somers' D statistic for the more general RxC tables? This was queried 
> in 1999, but no response seemed then to be forthcoming.
> 
> Thanks and regards,
> 
>   -Antti Arppe


Antti,

Under a separate e-mail to you, I have sent a text file containing R
code for the above and other related association measures.

If anyone else is interested in these, please let me know and I will be
more than happy to forward them on. I did not want to consume a lot of
bandwidth here.

HTH,

Marc Schwartz



Re: [R] lambda, uncertainty coefficient (& Somers D)

2006-04-18 Thread Marc Schwartz (via MN)
On Tue, 2006-04-18 at 10:30 -0500, Marc Schwartz (via MN) wrote:
> On Tue, 2006-04-18 at 16:40 +0300, Antti Arppe wrote:
> > Dear colleagues in R,
> > 
> > Has anybody implemented the
> > 
> > 1) (Goodman & Kruskal) lambda
> > 
> > or the
> > 
> > 2) (Theil's) uncertainty coefficient
> > 
> > statistics (in the asymmetric and symmetric forms), or is anyone aware 
> > that they might reside in some package? A search in the R archives 
> > does indicate that they are (somehow) part of the CoCo package, but I 
> > would rather not start learning how to transform my data into 
> > CoCo-format in order to access these functions, regardless of whether 
> > the CoCo versions are actually intended for calculating the actual 
> > statistic or for some other package internal purposes, as may 
> > sometimes be the case.
> > 
> > Furthermore, it appears to me that the 'somers2' function in the Hmisc 
> > package applies Somers' D only to 2x2 and not larger tables. Am I 
> > mistaken, or does there exist somewhere else an implementation of the 
> > Somers' D statistic for the more general RxC tables? This was queried 
> > in 1999, but no response seemed then to be forthcoming.
> > 
> > Thanks and regards,
> > 
> > -Antti Arppe
> 
> 
> Antti,
> 
> Under a separate e-mail to you, I have sent a text file containing R
> code for the above and other related association measures.
> 
> If anyone else is interested in these, please let me know and I will be
> more than happy to forward them on. I did not want to consume a lot of
> bandwidth here.
> 
> HTH,
> 
> Marc Schwartz


[Replying to r-help only]

Antti,

Each of my e-mails directly to you, including the code file, using
either:

  [EMAIL PROTECTED] (located on your web page)

or 

  [EMAIL PROTECTED] (used here)

has bounced, with access denied/user unknown errors.

Can you please confirm a correct e-mail address?

Thanks,

Marc



Re: [R] store levels in a string

2006-04-18 Thread Marc Schwartz (via MN)
On Tue, 2006-04-18 at 17:52 +0200, Daniele Medri wrote:
> Dear R-users,
> 
> i need to store in a variable a string made from levels of a factor
> 
> e.g.
> 
> a<-("a","a","b","b")

The above should be:

a <- c("a","a","b","b")
     ^

> af<-factor(a)
> 
> mylevels<- ...a string with all the levels(af)
> 
> Thanks
> --
> DM

> mylevels <- levels(af)

> mylevels
[1] "a" "b"


See ?levels, which is listed in the "See Also" for ?factor.

HTH,

Marc Schwartz



Re: [R] store levels in a string

2006-04-18 Thread Marc Schwartz (via MN)
On Tue, 2006-04-18 at 18:07 +0200, Daniele Medri wrote:
> Il giorno mar, 18/04/2006 alle 11.00 -0500, Marc Schwartz (via MN):
> > > mylevels<- ...a string with all the levels(af)
> > > mylevels <- levels(af)
> > 
> > > mylevels
> > [1] "a" "b"
> 
> I don't need to store these two levels, but a string with the values
> "ab".
> 
> Thanks
> 
> Cheers
> --
> DM

> paste(levels(af), collapse = "")
[1] "ab"

See ?paste.

HTH,

Marc Schwartz



Re: [R] store levels in a string

2006-04-18 Thread Marc Schwartz (via MN)
On Tue, 2006-04-18 at 20:53 +0200, Daniele Medri wrote:
> Il giorno mar, 18/04/2006 alle 11.29 -0500, Marc Schwartz (via MN):
> > > paste(levels(af), collapse = "")
> > [1] "ab"
> > 
> > See ?paste.
> 
> :) more simple: toString(levels(x))
> 
> Thanks,
> 
> Cheers
> --
> DM

Actually, you don't quite get the same result:

> toString(levels(af))
[1] "a, b"

Note the inclusion of the ', '.  

See the first line in toString.default() which is:

  string <- paste(x, collapse = ", ")

HTH,

Marc



Re: [R] Creating a .txt file from an Oracle DB without creating an R object

2006-04-19 Thread Marc Schwartz (via MN)
On Wed, 2006-04-19 at 17:02 +0200, Paul wrote:
> Dear R-helpers,
>
> I am dealing with an Oracle database (using package RODBC). I use
> R in order to transform some Oracle tables into .txt files (using
> function sqlFetch from package RODBC and then function write.table).
> However, I cannot do it without creating an R object, which is rather
> restrictive for very big Oracle tables. Indeed, any R Object is stored
> into RAM, which can be of limited size.
> Do you know if it is possible to directly create a .txt file,
> without creating an R object ?
> Thank you in advance.
>
>   P. Poncet
>

Somebody else may have a better idea, but you could probably use either
sink() or capture.output() to send the data to a text file instead of
the console, thus not creating an R object. For example:

  capture.output(sqlFetch(channel, "YourTableName", colnames = TRUE),
                 file = "OutputFile.txt")

You will need to adjust options("width"), which defaults to 80 and would
cause the typical in-console line wrapping to occur. You would not want
this in your text file of course.  'width' can be set up to 10,000 by
default and could go higher, if you want to adjust the value in print.h
and re-compile R.

See ?options, ?sink and ?capture.output for more information.

Another reasonable point: if you are just taking data from an Oracle
table and pumping it into a text file, you could do this in other ways
outside of R, including using the Oracle SQL*Plus Instant Client (ie. via
the SPOOL command).

Finally, there is an R e-mail list focused on databases:

  https://stat.ethz.ch/mailman/listinfo/r-sig-db

HTH,

Marc Schwartz



Re: [R] prop.table on three-way table?

2006-04-19 Thread Marc Schwartz (via MN)
On Wed, 2006-04-19 at 16:39 +0200, Fredrik Karlsson wrote:
> Dear list,
> 
> I am trying to create a three-way table with percent occurrence
> instead of raw frequencies. However, I cannot get the results I
> expected:
> 
> I have the following table:
> 
> > ftable(table( mannerDF$agem, mannerDF$target, mannerDF$manner ))
> 
> 50 bak   0 0 0 0 1 0
>    pak   0 0 0 0 3 0
>    sak   0 1 0 0 0 0
>    spak  0 0 0 0 0 0
> 
> Now, if I use the prop.table function, I never get a ratio of 1 in any cell:
> 
> 
> 
> With 'margin=1':
> 
> 50 bak   0. 0.     0. 0. 0.2000 0.
>    pak   0. 0.     0. 0. 0.6000 0.
>    sak   0. 0.2000 0. 0. 0.     0.
>    spak  0. 0.     0. 0. 0.     0.
> 
> With 'margin=2':
> 
> 50 bak   0.0 0.0         0.0 0.0 0.004347826 0.0
>    pak   0.0 0.0         0.0 0.0 0.010752688 0.0
>    sak   0.0 0.005747126 0.0 0.0 0.0         0.0
>    spak  0.0 0.0         0.0 0.0 0.0         0.0
> 
> With 'margin=3':
> 
> 50 bak   0.0 0.0         0.0 0.0 0.001373626 0.0
>    pak   0.0 0.0         0.0 0.0 0.004120879 0.0
>    sak   0.0 0.008695652 0.0 0.0 0.0         0.0
>    spak  0.0 0.0         0.0 0.0 0.0         0.0
> 
> What I was looking for is this:
> 
> 
> 50 bak   0 0 0 0 1 0
>    pak   0 0 0 0 1 0
>    sak   0 1 0 0 0 0
>    spak  0 0 0 0 0 0
> 
> (With more digits)
> 
> Am I doing something stupid?

I may be missing what you are trying to do, since we don't have your
data to reproduce the output. However, you might want to look at the
ctab() function in the 'catspec' package on CRAN by John Hendrickx.

It builds on the ftable() and prop.table() functions to generate
formatted n-way percentage tables.
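
If what you are after is proportions that sum to 1 within each row of the flattened table (each agem by target combination), then passing two margins to prop.table() may already do it. This is a guess at your intent, shown on invented data; the 'manner' labels below are made up:

```r
# Toy data loosely modelled on the posted table; the 'manner' labels are
# invented for illustration.
mannerDF <- data.frame(
  agem   = c(50, 50, 50, 50, 50),
  target = c("bak", "pak", "pak", "pak", "sak"),
  manner = c("stop", "stop", "stop", "stop", "fricative")
)

tab <- table(mannerDF$agem, mannerDF$target, mannerDF$manner)

# margin = c(1, 2): proportions sum to 1 within each agem x target row
p <- prop.table(tab, margin = c(1, 2))
ftable(p)
```

Note that any agem by target combination with no observations at all will show NaN rather than 0, since it is a 0/0 division.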

HTH,

Marc Schwartz



Re: [R] Creating a .txt file from an Oracle DB without creating an R object

2006-04-19 Thread Marc Schwartz (via MN)
On Wed, 2006-04-19 at 17:14 +0100, Prof Brian Ripley wrote:
> On Wed, 19 Apr 2006, Marc Schwartz (via MN) wrote:
> 
> > On Wed, 2006-04-19 at 17:02 +0200, Paul wrote:
> >> Dear R-helpers,
> >>
> >> I am dealing with an Oracle database (using package RODBC). I use
> >> R in order to transform some Oracle tables into .txt files (using
> >> function sqlFetch from package RODBC and then function write.table).
> >> However, I cannot do it without creating an R object, which is rather
> >> restrictive for very big Oracle tables. Indeed, any R Object is stored
> >> into RAM, which can be of limited size.
> >> Do you know if it is possible to directly create a .txt file,
> >> without creating an R object ?
> >> Thank you in advance.
> >>
> >>   P. Poncet
> >>
> >
> > Somebody else may have a better idea, but you could probably use either
> > sink() or capture.output() to send the data to a text file instead of
> > the console, thus not creating an R object. For example:
> >
> >  capture.output(sqlFetch(channel, "YourTableName", colnames = TRUE),
> >                 file = "OutputFile.txt")
> >
> > You will need to adjust options("width"), which defaults to 80 and would
> > cause the typical in-console line wrapping to occur. You would not want
> > this in your text file of course.  'width' can be set up to 10,000 by
> > default and could go higher, if you want to adjust the value in print.h
> > and re-compile R.
> 
> I don't think this helps: sqlFetch will create an (unnamed) R object 
> containing the whole table and hence have the memory issues.  What you can 
> do is use is a limit on the number of rows and use sqlFetchMore in a loop.

Ah...yes indeed. Now that I have looked at the function code, it does
create an internal data frame called 'ans', which is the result of using:

  ans <- sqlQuery(channel, paste("SELECT * FROM", dbname),
                  ...)

Looking at the internal code for sqlQuery(), which in turn leads one to
lower level RODBC functions, there is not a "row by row" query result
being returned. The query results in each case appear to be fully stored
in an internal R object first before being returned to the caller.

Thus, Prof. Ripley's loop approach (or one of the myriad external
mechanisms) would be the way to go.
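
The general shape of such a loop is sketched below. To keep it runnable without a database, fetch_chunk() is a hypothetical stand-in that hands back a built-in data set a few rows at a time; with RODBC that role would be played by sqlFetch() for the first block and sqlFetchMore() for the rest, each with a row limit, and I have not tested that part here:

```r
# Hypothetical chunk source standing in for the database cursor.
make_fetcher <- function(df, chunk_rows) {
  pos <- 0
  function() {
    if (pos >= nrow(df)) return(NULL)      # no more rows
    idx <- (pos + 1):min(pos + chunk_rows, nrow(df))
    pos <<- max(idx)
    df[idx, , drop = FALSE]
  }
}
fetch_chunk <- make_fetcher(iris, chunk_rows = 40)

out_file <- file.path(tempdir(), "OutputFile.txt")
first <- TRUE
repeat {
  chunk <- fetch_chunk()
  if (is.null(chunk)) break
  # Header only once; later chunks are appended below it
  write.table(chunk, file = out_file, append = !first, col.names = first,
              row.names = FALSE, sep = "\t", quote = FALSE)
  first <- FALSE
}
```

Only one chunk is held in memory at a time, which is the point of the exercise.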

Thanks for the clarification.

Marc





Re: [R] Basic vector operations was: Function to approximate complex integral

2006-04-19 Thread Marc Schwartz (via MN)
On Wed, 2006-04-19 at 15:25 -0400, Doran, Harold wrote:
> Dear List
> 
> I apologize for the multiple postings. After being in the weeds on this
> problem for a while I think my original post may have been a little
> cryptic. I think I can be clearer. Essentially, I need the following
> 
> a <- c(2,3)
> b <- c(4,5,6)
> 
> (2*4) + (2*5) + (2*6) + (3*4) + (3*5) +(3*6)
> 
> But I do not know of a built in function that would do this. Any
> suggestions?



Unless I am missing something, how about:

> sum(a %x% b)
[1] 75

See ?"%x%"

You could also use outer():

> sum(outer(a, b, "*"))
[1] 75

See ?outer
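
As an aside, since the sum runs over every pairwise product, it factorises into the product of the two sums, so no cross product needs to be formed at all. A quick check on your vectors:

```r
a <- c(2, 3)
b <- c(4, 5, 6)

sum(outer(a, b))          # "*" is the default function for outer()
sum(a %o% b)              # %o% is shorthand for outer(a, b)
sum(a) * sum(b)           # the double sum factorises
```

All three give the same value; the last form avoids building the length(a) by length(b) matrix, which matters for long vectors.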

HTH,

Marc Schwartz



Re: [R] Conditional Row Sum

2006-04-20 Thread Marc Schwartz (via MN)
On Thu, 2006-04-20 at 11:46 -0700, Sachin J wrote:
> Hi,
>
>   How can I accomplish this in R. Example:
>
>   R1  R2
>    3 101
>    4 102
>    3 102
>   18 102
>   11 101
>
>   I want to find Sum(101) = 14, i.e. SUM(R1) where R2 = 101
>   Sum(102) = 25, i.e. SUM(R1) where R2 = 102
>
>   TIA
>   Sachin

Presuming that your data is in a data frame called DF:

> DF
  R1  R2
1  3 101
2  4 102
3  3 102
4 18 102
5 11 101

At least three options:

> with(DF, tapply(R1, R2, sum))
101 102
 14  25


> aggregate(DF$R1, list(R2 = DF$R2), sum)
   R2  x
1 101 14
2 102 25


> by(DF$R1, DF$R2, sum)
INDICES: 101
[1] 14
--
INDICES: 102
[1] 25


See ?by, ?aggregate, ?tapply and ?with.

HTH,

Marc Schwartz



Re: [R] forcing apply() to return data frame

2006-04-21 Thread Marc Schwartz (via MN)
On Fri, 2006-04-21 at 07:37 -0700, Thomas Lumley wrote:
> On Fri, 21 Apr 2006, Federico Calboli wrote:
> 
> > Hi All,
> >
> > I am (almost) successfully using apply() to apply a function recursively
> > on a data matrix. The function in question is as.genotype() from the
> > library 'genetics'
> >
> > apply(subset(chr1, names$breed == 'lab'),2,as.genotype,sep ="")
> >
> > Unfortunately apply puts its results into a matrix object rather than a
> > data frame, transforming my factors into numerics and making the results
> > useless.
> >
> > Is there a way of forcing apply() to return a data frame rather than a
> > matrix?
> >
> 
> The conversion to a matrix happens on the way in to apply, not on the way 
> out, so no.

This may be a naive example, as I don't work in this domain, but based
upon reviewing the online help at:

  http://finzi.psych.upenn.edu/R/library/genetics/html/genotype.html

and presuming that the intent of the code above is referenced by the
first bullet in the Details section of the function, would the following
work?

This presumes that 'chr1' is a data frame or can be coerced to one as
in:

  chr1 <- as.data.frame(chr1)

Thus:

  data.frame(lapply(subset(chr1, names$breed == 'lab'),
                    as.genotype, sep = ""))
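
The difference between the two approaches can be seen on a toy example that does not need the 'genetics' package. Here factor() is only a stand-in for as.genotype(), and the marker columns are invented:

```r
# Toy marker data; factor() is only a stand-in for as.genotype()
chr1 <- data.frame(m1 = c("A/B", "B/B", "A/A"),
                   m2 = c("C/C", "C/D", "D/D"),
                   stringsAsFactors = FALSE)

# apply() goes through as.matrix(), so the result is a matrix
via_apply <- apply(chr1, 2, factor)
class(via_apply)

# lapply() works column by column and keeps each column's class
via_lapply <- data.frame(lapply(chr1, factor))
sapply(via_lapply, class)
```

Since a data frame is a list of columns, lapply() hands each column to the function unchanged, which is why the classes survive.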

HTH,

Marc Schwartz



Re: [R] Store results of for loop

2006-04-24 Thread Marc Schwartz (via MN)
On Mon, 2006-04-24 at 16:31 -0400, Doran, Harold wrote:
> I have what I'm sure will turn out to be straightforward. I want to
> store the results of a loop for some operations from a patterned vector.
> For example, the following doesn't give what I would hope for
> 
> ss <- c(2,3,9)
> results <- numeric(length(ss))
> for (i in seq(along=ss)){
>results[i] <- i + 1
>}

Harold,

Here you are getting:

> results
[1] 2 3 4

because 'i' is 1:3, thus:

> 1:3 + 1
[1] 2 3 4


> The following does give what I expect, but creates a vector of length 9.
> 
> ss <- c(2,3,9)
> results <- numeric(length(ss))
> for (i in ss){
>results[i] <- i + 1
>}

Here you are getting:

> results
[1]  0  3  4 NA NA NA NA NA 10

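because 'i' takes each value of 'ss' in turn, which is c(2, 3, 9). Thus,
'results' is effectively being indexed as results[c(2, 3, 9)], and the
assignment to element 9 extends the vector, filling elements 4 to 8 with
NA.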

You are adding 1 to 'ss' in the loop, thus:

> ss + 1
[1]  3  4 10

In short:

  results[ss] <- ss + 1

which yields:

> results
[1]  0  3  4 NA NA NA NA NA 10


> What I am hoping for is that results should be a vector of length 3.

I suspect what you want is:

 ss <- c(2, 3, 9)
 results <- numeric(length(ss))

 for (i in seq(along = ss))
 {
   results[i] <- ss[i] + 1
 }

> results
[1]  3  4 10


You might also want to look at ?sapply, where you could do something
like this:

> sapply(ss, function(x) x + 1)
[1]  3  4 10


HTH,

Marc Schwartz



Re: [R] by() and CrossTable()

2006-04-25 Thread Marc Schwartz (via MN)
On Tue, 2006-04-25 at 11:07 -0400, Chuck Cleland wrote:
>I am attempting to produce crosstabulations between two variables for 
> subgroups defined by a third factor variable.  I'm using by() and 
> CrossTable() in package gmodels.  I get the printing of the tables first 
> and then a printing of each level of the INDICES.  For example:
> 
> library(gmodels)
> 
> by(warpbreaks, warpbreaks$tension, function(x){CrossTable(x$wool, 
> x$breaks > 30, format="SPSS", fisher=TRUE)})
> 
>Is there a way to change this so that the CrossTable() output is 
> labeled by the levels of the INDICES variable?  I think this has to do 
> with how CrossTable returns output, because the following does what I want:
> 
> by(warpbreaks, warpbreaks$tension, function(x){summary(lm(breaks ~ wool, 
> data = x))})
> 
> thanks,
> 
> Chuck

Chuck,

Thanks for your e-mail.

Without digging deeper, I suspect that the problem here is that
CrossTable() has embedded formatted output within the body of the
function using cat(), as opposed to a two step process of creating a
results object, which then has a print method associated with it. This
would be the case in the lm() example that you have as well as many
other functions in R.
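
To illustrate the two step process in general terms, here is a minimal
sketch. It is not the CrossTable() code; the function 'simple_xtab' and
the class "SimpleXtab" are hypothetical names used only for this
example. The function returns a classed object and leaves all formatted
output to a print method, which by() can then call for each subgroup:

```r
# Step 1: build and return a results object; no output is produced here.
# 'simple_xtab' and class "SimpleXtab" are illustrative names only.
simple_xtab <- function(x, y) {
  structure(list(counts = table(x, y)), class = "SimpleXtab")
}

# Step 2: a print method that produces the formatted output
print.SimpleXtab <- function(x, ...) {
  cat("Cell counts:\n")
  print(x$counts)
  invisible(x)
}

# Assignment alone produces no console output
res <- by(warpbreaks, warpbreaks$tension,
          function(d) simple_xtab(d$wool, d$breaks > 30))

# Printing the 'by' object prints each table under its tension level
print(res)
```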

I had not anticipated this particular use of CrossTable(), since it was
really focused on creating nicely formatted 2d tables using fixed width
fonts.

That being said, I have had recent requests to enhance CrossTable()'s
functionality to:

1. Be able to assign the results of the internal processing to an object
and be able to assign that object without any other output. For example:

  Results <- CrossTable(...)

yielding no further output in the console.


2. Facilitate LaTeX markup of the CrossTable() formatted output for
inclusion in LaTeX documents.


Both of the above would require me to fundamentally alter CrossTable()
to create a "CrossTable" class object, as opposed to the current
embedded output. I would then create a print.CrossTable() method
yielding the current output, as well as one to create LaTeX markup for
that application. The LaTeX output would likely need to support the
regular 'table' style as well as 'ctable' and 'longtable' styles, the
latter given the potential for long multi-page output.

These changes should then support the type of use that you are
attempting here.

These are on my TODO list for CrossTable() (along with the inclusion of
the measures of association recently discussed) and now that the dust
has settled from some recent abstract submission deadlines I can get
back to some of these things. I don't have a timeline yet, but will
forge ahead with these enhancements.

One possible suggestion for you as an interim, at least in terms of some
nicely formatted n-way tables is the ctab() function in the 'catspec'
package by John Hendrickx.

A possible example call would be:

ctab(warpbreaks$tension, warpbreaks$wool, warpbreaks$breaks > 30, 
 type = c("n", "row", "column", "total"), addmargins = TRUE)


Unlike CrossTable() which is strictly 2d (though that may change in the
future), ctab() directly supports the creation of n-way tables, with
counts and percentages/proportions interleaved in the output. There are
no statistical tests applied and these would need to be done separately
using by().


Chuck, feel free to contact me offlist as other related issues may arise
or as you have other comments on this.

Again, thanks for the e-mail.

Best regards,

Marc Schwartz



Re: [R] by() and CrossTable()

2006-04-25 Thread Marc Schwartz (via MN)
That does appear to work.

Thanks for the workaround Gabor.

I'll still be working on the other changes of course to make this more
"natural".

Regards,

Marc

On Tue, 2006-04-25 at 12:34 -0400, Gabor Grothendieck wrote:
> At least for this case I think you could get the effect without modifying
> CrossTable like this:
> 
> as.CrossTable <- function(x) structure(x, class = c("CrossTable", class(x)))
> print.CrossTable <- function(x, ...) for(L in x) cat(L, "\n")
> 
> by(warpbreaks, warpbreaks$tension, function(x)
>   as.CrossTable(capture.output(CrossTable(x$wool, x$breaks > 30,
>       format="SPSS", fisher=TRUE))))
> 
> 

Re: [R] Polygon-like interactive selection of plotted points

2006-04-26 Thread Marc Schwartz (via MN)
On Wed, 2006-04-26 at 18:13 +0100, Florian Nigsch wrote:
> [Please CC me for all replies, since I am not currently subscribed to  
> the list.]
> 
> Hi all,
> 
> I have the following problem/question: Imagine you have a two- 
> dimensional plot, and you want to select a number of points, around  
> which you could draw a polygon. The points of the polygon are defined  
> by clicking in the graphics window (locator()/identify()), all points  
> inside the polygon are returned as an object.
> 
> Is something like this already implemented?
> 
> Thanks a lot in advance,
> 
> Florian

I don't know if anyone has created a single function to do this (though
it is always possible). 

However, using:

  RSiteSearch("points inside polygon")

brings up several function hits that, if put together with the above
interactive functions, could be used to do what you wish. That is, input
the matrix of x,y coords of the interactively selected polygon and the
x,y coords of the underlying points set to return the points inside or
outside the polygon boundaries.
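
As a rough sketch of that idea in base R only, the following uses a
simple ray casting test. The helper 'in_polygon' is a hypothetical
function written for this example and is not one of the functions
returned by the search. Points lying exactly on the boundary are not
handled specially here. The polygon is hard coded so the example runs
non-interactively; in an interactive session the vertices could come
from locator() instead:

```r
# Hypothetical helper: TRUE for each point (px, py) that falls inside the
# polygon with vertices (vx, vy), using the ray casting (even-odd) rule.
in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- rep(FALSE, length(px))
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i cross the horizontal ray
    # running to the right of each point?
    crosses <- ((vy[i] > py) != (vy[j] > py)) &
      (px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i])
    inside <- xor(inside, crosses)
    j <- i
  }
  inside
}

set.seed(1)
xy <- cbind(x = runif(100), y = runif(100))

# A fixed square polygon standing in for interactively clicked vertices
poly_x <- c(0.25, 0.75, 0.75, 0.25)
poly_y <- c(0.25, 0.25, 0.75, 0.75)

sel <- in_polygon(xy[, "x"], xy[, "y"], poly_x, poly_y)
selected_points <- xy[sel, , drop = FALSE]

# Interactive variant (only runs at the console with an open plot):
if (interactive()) {
  plot(xy)
  clicked <- locator(type = "l")  # click the vertices, then finish
  sel_clicked <- in_polygon(xy[, "x"], xy[, "y"], clicked$x, clicked$y)
  points(xy[sel_clicked, , drop = FALSE], pch = 16, col = "blue")
}
```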

Just as an FYI, you might also want to look at ?chull, which is in the
base R distribution and returns the indices of the points lying on the
convex hull of the underlying point set. This is, to some extent, the
inverse of what you wish to do.

HTH,

Marc Schwartz



Re: [R] Code for "Screenshots" graphics (following on from ease-of-use issues on www.r-project.org)

2006-04-26 Thread Marc Schwartz (via MN)
On Wed, 2006-04-26 at 11:05 -0700, John McHenry wrote:
> Does anyone know where the code for the graphics on: 
>
>   http://www.r-project.org/screenshots/screenshots.html 
>
>   lives?


demo(graphics)

demo(image)

demo(persp)


These should cover each of the screen shots and then some.

If you know how to navigate the R package subdirectories of your R
installation, the R code files for the demos are located in the 'demo'
subdir of the 'graphics' package.

Typically this would be:

  $RHOME/library/graphics/demo
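
If you would rather not work out the path by hand, the following sketch
should locate the same directory from within R. The variable name
'demo_dir' is just for illustration:

```r
# Directory holding the demo scripts of the 'graphics' package
demo_dir <- system.file("demo", package = "graphics")
print(demo_dir)

# The demo source files found there
print(list.files(demo_dir))

# The top level of the R installation itself
print(R.home())
```

demo(package = "graphics") will also list the available demos by name.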


HTH,

Marc Schwartz



Re: [R] Polygon-like interactive selection of plotted points

2006-04-26 Thread Marc Schwartz (via MN)
Clint,

Not sure why I would be surprised (unless you meant Florian), as I have
not actually used these functions. I simply referenced them as part of
the search to assist Florian.  :-)

However, your reply did make me curious, so I installed the 'splancs'
package along with 'sp' as a dependency. Just for reference, this is on
R 2.3.0 under FC4 compiled from source.

I did run Roger's code, the result of which is attached here as a PDF,
which should come through the list.

While my 7 sided polygon is likely different than yours, the result
seems to be the same. That is, only 3 of the vertices are considered
within the boundary and this is unaffected by the use of the 'bound'
argument to inout(). 

The 'bound' argument in inout() does need to be set to TRUE in order for
the boundary points to be considered 'within' the polygon. Otherwise,
with the default NULL, the assignment is arbitrary. Thus, Roger's code,
which has the default NULL value in the call to inout(), could
reasonably result in your/our finding.

One quick guess here might be that the values returned by getpoly() are
not the exact center points of the original dataset, but are the x,y
coords of where the clicks occur. There is a
"hand-to-eyeball-coordination" margin of error relative to the actual
coords.

Thus, the coordinate values returned by this approach are very slightly
outside the polygon and not picked up by inout() as being on the
boundary.

I tried the following, using identify() instead:

  library(splancs)
  set.seed(20060426)
  xy <- cbind(x=runif(100), y=runif(100))
  plot(xy)

  # Select the SAME 7 vertices here
  pts <- identify(xy, n = 7)
  
  # These are the row indices into 'xy'
  # for the coords of the vertices
  > pts
  [1] 18 44 51 61 73 89 91
  
  > io <- inout(xy, xy[pts, ], bound = TRUE, quiet = FALSE)
  Points on boundary:
  [1] 18 44 51 61 73 89 91

Note that the same indices of the points are now returned by inout() as
were selected when using identify(). Note also the setting of 'bound =
TRUE'.

You can now use:

  points(xy[io,], pch=16, col="blue")

to draw BOTH the points inside the polygon and the boundary points as
well.

Thus, this would support my thoughts above, since identify() uses a
'tolerance' argument to identify the actual data points closest to the
mouse click, much like all.equal() does with respect to floating point
comparisons.

In a case like this, identify() would seem to be a better choice
relative to selecting the polygon boundary if one wants the exact
boundary points to be considered in or out of the polygon.

HTH,

Marc Schwartz

On Wed, 2006-04-26 at 13:01 -0700, Clint Bowman wrote:
> Roger,
> 
> Just for fun I tried your script--nothing wrong with the script, but I
> created a seven sided polygon by clicking on seven points and letting
> getpoly complete the figure.  At the end I notice that only three of the
> seven vertices are coded as being inside the polygon (the blue points.)
> 
> I'd send you a screen dump but I haven't gotten xwd to work with Exceed.
> Also I haven't checked any docs to see whether this is a known problem but
> suspect that Marc could be surprised by the behavior.
> 
> Clint

> On Wed, 26 Apr 2006, Roger Bivand wrote:
> 

Re: [R] losing x-label when exporting to PNG

2006-04-27 Thread Marc Schwartz (via MN)
On Thu, 2006-04-27 at 08:43 -0400, John Kane wrote:
> I have a simple barplot that looks fine in the R
> graphics device window. However when I export it to
> png I am losing the x-label.  It must be an obvious
> problem but I cannot see it. Trying to resize the plot
> does not seem to  help.  Code is below.
> 
> Any help gratefully received.
> 
> ## Start Code
> 
> Groups <- c(21.8, 45, 43, 17.2, 8.3, 18)
> names(Groups) <- c("Exeter", "Halifax", "Moosonee", "Ottawa",
>                    "Montréal", "Saskatoon")
> 
> # Here we are setting the font to bold and adding some lines to the
> # bottom margin of the graph, 6 vs. the default of 5, to give us more
> # room with the angled labels.
> par(font = 2, mar = c(6, 4, 4, 2) + 0.1)
> 
> Mycolours <- c("red", "blue", "green", "yellow", "orange", "purple")
> 
> # -- plot to screen --
> mp <- barplot(Groups, beside = TRUE, horiz = FALSE, las = 1,
>               ylim = c(0, 60), axisnames = FALSE, font.lab = 2,
>               col = Mycolours)
> text(mp, par("usr")[3] - 1.5, srt = 45, adj = 1,
>      labels = names(Groups), xpd = TRUE, cex = 0.75)
> 
> mtext(side = 1, line = 5, text = "% of Services above thresholds")
> mtext(side = 2, line = 2.5, text = "Percent")
> title(main = "")
> box()
> 
> # -- plot to png file --
> png('gr_%04d.png', width = 600, height = 400)
> mp <- barplot(Groups, beside = TRUE, horiz = FALSE, las = 1,
>               ylim = c(0, 60), axisnames = FALSE, font.lab = 2,
>               col = Mycolours)
> text(mp, par("usr")[3] - 1.5, srt = 45, adj = 1,
>      labels = names(Groups), xpd = TRUE, cex = 0.75)
> 
> mtext(side = 1, line = 5, text = "% of Services above thresholds")
> mtext(side = 2, line = 2.5, text = "Percent")
> title(main = "")
> box()
> 
> dev.off()
> 
> ## End Code

John,

I suspect that you might want to find a cushioned wall first, in case
you might want to bang your head against it...   ;-)

If your PNG file is created using exactly the code you have above, you
have neglected to include the par() statement to adjust the margins
AFTER calling png().  Note that the prior par() statement is apropos to
your screen plot device and will not affect the png() device.

So insert the same par() call AFTER the png() call and you should be
good to go.
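
A shortened sketch of the ordering I mean is below. It uses only three
of your groups and writes to a temporary file; the names 'out' and
'mar_in_png' are mine and just for illustration. The key point is that
par() is called after png() has opened the device:

```r
Groups <- c(Exeter = 21.8, Halifax = 45, Moosonee = 43)

# Temporary output file (illustrative; use your own file name)
out <- file.path(tempdir(), "barplot_sketch.png")

png(out, width = 600, height = 400)

# par() settings apply to the currently active device, so set the
# margins AFTER png() has been called
par(font = 2, mar = c(6, 4, 4, 2) + 0.1)
mar_in_png <- par("mar")

mp <- barplot(Groups, las = 1, ylim = c(0, 60), axisnames = FALSE)
text(mp, par("usr")[3] - 1.5, srt = 45, adj = 1,
     labels = names(Groups), xpd = TRUE, cex = 0.75)
mtext(side = 1, line = 5, text = "% of Services above thresholds")
mtext(side = 2, line = 2.5, text = "Percent")
box()

dev.off()
```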

HTH,

Marc Schwartz



Re: [R] Error in rm.outlier method

2006-04-28 Thread Marc Schwartz (via MN)
On Fri, 2006-04-28 at 11:17 -0700, Sachin J wrote:
> Hi,
>
>   I am trying to use rm.outlier method but encountering following error:
>
>   > y <- rnorm(100)
>   > rm.outlier(y)
>
>   Error: 
>   Error in if (nrow(x) != ncol(x)) stop("x must be a square matrix") : 
> argument is of length zero
>
>   Whats wrong here?
>
>   TIA
>   Sachin

It would be helpful to know which rm.outlier() function you are using
and from which package it comes.

The only one that I noted in a search is in the 'outliers' CRAN package
and it can take a vector as the 'x' argument.

The above square matrix test and resultant error message is not in the
tarball R code for either outlier() or rm.outlier() in that package, so
the source of the error is unclear.

As an aside, you may wish to consider robust analytic methods rather
than doing post hoc outlier removal.  A search of the list archives will
provide some insights here. RSiteSearch("outlier") will get you there.

HTH,

Marc Schwartz



Re: [R] Error in rm.outlier method

2006-04-28 Thread Marc Schwartz (via MN)
Sachin,

I don't have a definitive thought, but some possibilities might be a
conflict somewhere in your environment with a local function or with one
in the searchpath.

Use ls() to review the current objects in your environment to see if
something looks suspicious. It did not look like 'outliers' is using a
namespace, so a conflict of some nature is a little more possible here.

Also use searchpaths() to get a feel for where R is searching for the
function. See what is getting searched "above" the outliers package in
the search order, which might provide a clue.

Also, try to start R from the command line using 'R --vanilla', which
should give you a clean working environment. Then use library(outliers)
and your code below to see if the same behavior is present. If so,
perhaps there was a corruption in the package installation. If not, it
would support some type of conflict or perhaps a corruption in your
default working environment.
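
As a sketch of how such a conflict can be spotted, the following
deliberately creates one using sd() from the 'stats' package rather
than rm.outlier(), so that it runs without the 'outliers' package being
installed. The conflicting definition is hypothetical and exists only
for this example:

```r
# Hypothetical conflict: a user-defined sd() in the workspace hides
# stats::sd(). (At the prompt this would simply be: sd <- function(x) ...)
assign("sd", function(x) stop("local sd() called by mistake"),
       envir = globalenv())

# find() lists every attached location that defines the name, in search
# order; the first entry is the one that gets called
where_found <- find("sd")
print(where_found)

# The environment of a function shows where a definition comes from
print(environmentName(environment(stats::sd)))

# Remove the conflicting copy so the package version is visible again
rm("sd", envir = globalenv())
print(find("sd"))
```

Substituting "rm.outlier" for "sd" in the find() call, with the
'outliers' package attached, should show whether more than one
definition is visible in your session.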

HTH,

Marc

On Fri, 2006-04-28 at 11:57 -0700, Sachin J wrote:
> Hi Marc:
>  
> I am using rm.outlier() function from outliers package (reference:
> CRAN package help).
> You are right. I too couldn't find this error message in rm.outlier
> function. Thats why I am unable to understand the cause of error. Any
> further thoughts? I will take a look at the robust analytic methods as
> suggested.
>  
> Thanx
> Sachin 
>  



Re: [R] code for latex function in Hmisc

2006-05-01 Thread Marc Schwartz (via MN)
On Mon, 2006-05-01 at 17:21 -0400, Brian Quinif wrote:
> Forgive my ignorance, but how I can take a look at the code for the
> latex function in the Hmisc library?
> 
> I tried just typing "latex" but all I got was this:
> 
> > latex
> function (object, title = first.word(deparse(substitute(object))),
> ...)
> {
> if (!length(oldClass(object)))
> oldClass(object) <- data.class(object)
> UseMethod("latex")
> }
> 
> What should I do?
> 
> Thanks,
> 
> BQ

Brian,

latex() is the generic function, which in turn dispatches one of several
methods, based upon the class of the 'object' as the first argument.

To see the specific methods and thus the individual functions that are
actually called, use:

> methods(latex)
 [1] latex.bystats  latex.bystats2
 [3] latex.default  latex.describe
 [5] latex.describe.single  latex.function
 [7] latex.list latex.summary.formula.cross
 [9] latex.summary.formula.response latex.summary.formula.reverse


See ?methods

You can then type the name of any of the above functions at the prompt
to see the code for that method as it exists within R. Note that this is
not technically the
source code, which would be contained in the Hmisc tarball on CRAN. The
latter would have author comments and other content that will not appear
within the R session. However, the code as seen within R is more often
than not, sufficient to understand what is happening.
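
Here is a brief sketch of the same steps using the summary() generic
from base R, so that it runs without Hmisc being installed. The
variable names are mine. With Hmisc loaded, I would expect the
equivalent call to be getS3method("latex", "default"):

```r
# List the methods available for a generic
all_methods <- methods(summary)
print(length(all_methods))

# Retrieve one specific method as a function object
m <- getS3method("summary", "data.frame")

# Its arguments and code can then be inspected
print(args(m))
print(class(body(m)))

# getAnywhere() also finds methods that are not exported from a namespace
found <- getAnywhere("summary.data.frame")
print(found$where)
```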

HTH,

Marc Schwartz



Re: [R] Is there a bug in CrossTable (gmodels)

2006-05-02 Thread Marc Schwartz (via MN)
On Tue, 2006-05-02 at 17:21 +0200, Albert Sorribas wrote:
> Library gmodels include a function CrossTable that is useful for
> crosstabulation. In the help, it is indicated that one can call this
> function as CrossTable(data), where data is a matrix. However, when I try
> to use this option, it doesn't help. Any idea? Is there a bug?
> 
> Thanks for your help.

Prof. Sorribas,

Can you please provide an example of the error and/or output you are
getting?

I am reviewing the code and think that I may see the problem, but want
to be sure that we are seeing the same thing, which may be an error such
as:

Error in cat(SpaceSep1, "|", ColData, "\n") :
object "ColData" not found


This would appear to occur when 'x' is a matrix. The code is not picking
up the second dimname for the matrix/table if present or otherwise
setting a default value for 'ColData'. It does get set if one explicitly
sets the 'dnn' argument or in the case of 'x' and 'y' being vectors.

Let me know on the above and if correct, a fix would not be difficult to
provide expediently here.

Thanks and regards,

Marc Schwartz



Re: [R] Listing Variables

2006-05-03 Thread Marc Schwartz (via MN)
On Wed, 2006-05-03 at 10:46 -0400, Farrel Buchinsky wrote:
> How does one create a vector whose contents is the list of variables in a
> dataframe pertaining to a particular pattern?
> This is so simple but I cannot find a straightforward answer.
> I want to be able to pass the contents of that list to a "for" loop.
> 
> So let us assume that one has a dataframe whose name is Data. And let us
> assume one had the height of a group of people measured at various ages.
> 
> It could be made up of vectors Data$PersonalID, Data$FirstName,
> Data$LastName, Data$Height.1, Data$Height.5, Data$Height.9,
> Data$Height.10, Data$Height.12, Data$Height.20 ... many many more variables.
> 
> How would one create a vector of all the Height variable names.
> 
> The simple workaround is to not bother creating the vector "Data$Height.1"
> "Data$Height.5" "Data$Height.9" "Data$Height.10"
> "Data$Height.12" "Data$Height.20" ... but rather just to use the sapply
> function. However with some functions the sapply will not work and it is
> necessary to supply each variable name to a function (see thread at 
> Repeating tdt function on thousands of variables)
> 
> 
> This is such a core capability. I would like to see it in the R-Wiki but 
> could not find it there.


I may be misunderstanding what you want to do, but to simply get the
names of the columns in Data that contain "Height", you can do this:

>  grep("Height", names(Data), value = TRUE)
[1] "Height.1"  "Height.5"  "Height.9"  "Height.10" "Height.12"
[6] "Height.20"


Now you could use something like the following:

  for (i in grep("Height", names(Data), value = TRUE))
    YourFunctionHere(Data[[i]])

If it makes for easier reading, you could first assign the subset of the
column names to a vector and then use that in the for() loop, rather
than the above.
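
A small self-contained sketch of that second form follows. The data
frame contents are made up for illustration, mean() stands in for
YourFunctionHere(), and the names 'height_cols' and 'col_means' are
mine. I have anchored the pattern with "^Height\\." so that only names
beginning with "Height." are matched:

```r
# Made-up data for illustration only
Data <- data.frame(PersonalID = 1:3,
                   FirstName  = c("A", "B", "C"),
                   Height.1   = c(70, 72, 75),
                   Height.5   = c(105, 108, 110),
                   Height.10  = c(135, 138, 140))

# Column names that start with "Height."
height_cols <- grep("^Height\\.", names(Data), value = TRUE)
print(height_cols)

# Loop over the names, passing each column to a function
col_means <- numeric(0)
for (nm in height_cols) {
  col_means[nm] <- mean(Data[[nm]])
}
print(col_means)
```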

HTH,

Marc Schwartz



Re: [R] large data set, error: cannot allocate vector

2006-05-05 Thread Marc Schwartz (via MN)
On Fri, 2006-05-05 at 17:56 +0200, Uwe Ligges wrote:
> Robert Citek wrote:
> 
> > Why am I getting the error "Error: cannot allocate vector of size  
> > 512000 Kb" on a machine with 6 GB of RAM?
> 
> 1. The message means that you cannot allocate *further* 512Mb of RAM 
> right now for the next step, but not what is required nor what R is 
> currently consuming.
> 
> 2. This seems to be a 32-bit OS. It limits the maximal allocation for 
> the *single* R process to < 4Gb (if all goes very well).
> 
> 
> > I'm playing with some large data sets within R and doing some simple  
> > statistics.  The data sets have 10^6 and 10^7 rows of numbers.  R  
> 
> 3. 10^7 rows is not large, if you have one column...
> 
> 4. 10^7 needs 10 times what is needed for 10^6. Hence comparing 10^6 and 
> 10^7 is quite a difference.
> 
> Uwe Ligges
> 
> > reads in and performs summary() on the 10^6 set just fine.  However,  
> > on the 10^7 set, R halts with the error.  My hunch is that somewhere  
> > there's an setting to limit some memory size to 500 MB.  What setting  
> > is that, can it be increased, and if so how?  Googling for the error  
> > has produced lots of hits but none with answers, yet.  Still browsing.
> > 
> > Below is a transcript of the session.
> > 
> > Thanks in advance for any pointers in the right direction.
> > 
> > Regards,
> > - Robert
> > 
> > $ uname -sorv ; rpm -q R ; R --version
> > Linux 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 GNU/Linux

  


I might throw out one more pointer in addition to Uwe's comments above,
which _should_ not affect this issue, but as an FYI. Note that I said
"should not" versus "will not".

You are about 17 kernel versions behind. 2.6.11-1.1369_FC4smp was the
original FC4 SMP kernel.

The current FC4 kernel version is 2.6.16-1.2107_FC4smp.

This might suggest that your system in general may require some
substantial updating, which may more generally affect system behavior.

FC4 was rather unstable when first released and has improved notably
since then. It is one of the reasons that some folks are still running
FC3, even though it is EOL.

The current FC4 kernel release (noted above) has some issues with it at
present. A new kernel release version by Dave Jones, 2111, should be out
"any time now", but in the mean time, I would suggest updating your
kernel to version 2.6.16-1.2096smp and doing a full system update
generally.

You may very well find that some behaviors (related and/or unrelated to
this issue) do change for the better.

HTH,

Marc Schwartz



Re: [R] Graphics window always overlaps console window!

2005-10-25 Thread Marc Schwartz (via MN)
On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote:
> Does anyone know how I can set up R so that when I make a graphic, the
> graphics window remains behind the console window? It's annoying to
> have to reach for the mouse every time I want to type another line of
> code (e.g., to add another line to the plot). Thanks.

What operating system?

Default window focus behavior is highly OS and even window manager
specific and is not an R issue.

Depending upon your OS and window manager, you may need to check the
documentation and/or do a Google search on "window focus" for further
information.

Another alternative, if you are on Windows, is to review Windows FAQ 5.4
"How do I move focus to a graphics window or the console?", but this is
a programmatic approach and not a means to affect default behavior.

HTH,

Marc Schwartz



Re: [R] Graphics window always overlaps console window!

2005-10-25 Thread Marc Schwartz (via MN)
On Tue, 2005-10-25 at 13:07 -0500, Marc Schwartz (via MN) wrote:
> On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote:
> > Does anyone know how I can set up R so that when I make a graphic, the
> > graphics window remains behind the console window? It's annoying to
> > have to reach for the mouse every time I want to type another line of
> > code (e.g., to add another line to the plot). Thanks.
> 
> What operating system?
> 
> Default window focus behavior is highly OS and even window manager
> specific and is not an R issue.
> 
> Depending upon your OS and window manager, you may need to check the
> documentation and/or do a Google search on "window focus" for further
> information.
> 
> Another alternative, if you are on Windows, is to review Windows FAQ 5.4
> "How do I move focus to a graphics window or the console?", but this is
> a programmatic approach and not a means to affect default behavior.

Yet another approach which I just remembered is that (if on Windows) MS
offers a program called Tweak UI:

http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx

as part of their Power Toys add-ons.

You might want to review that to see if there is a setting for the
handling of new window focus behavior.

HTH,

Marc



Re: [R] encrypted RData file?

2005-10-27 Thread Marc Schwartz (via MN)
On Thu, 2005-10-27 at 16:15 -0500, Na Li wrote:
> On 27 Oct 2005, Duncan Temple Lang wrote:
> 
> > Yes, it is of interest and was sitting on my todo list at
> > some time.  If you want to go ahead and provide code to do it,
> > that would be terrific.  There are other areas where encryption
> > would be good to have, so a general mechanism would be nice.
> > 
> > D.
> > 
> > Na Li wrote:
> > > Hi, I wonder if there is interest/intention to allow for encrypted .RData
> > > files?  One can certainly do that outside R manually but that will leave a
> > > decrypted RData file somewhere which one has to remember to delete.
> > > 
> 
> I was hoping someone has already done it.  ;-(
> 
> One possibility is to implement an interface package to gpgme library which
> itself is an interface to GnuPG.  
> 
> But I'm not sure how the input of passphrase can be handled without using
> clear text.
> 
> Michael

Seems to me that a better option would be to encrypt the full partition
such that (unless you write the files to a non-encrypted partition)
these issues are transparent. This would include the use of save(),
save.image() and write() type functions to save what was an encrypted
dataset/object to an unencrypted file.

Of course, you would also have to encrypt the swap and tmp partitions
(as appropriate) for similar reasons.

On Linuxen/Unixen, full encryption of partitions is available via
loopback devices and other mechanisms and some distros have this
available as a built-in option. I believe that the FC folks finally have
this on their list of functional additions for FC5. Windows of course
can do something similar.

The other consideration here is that if R Core builds in some form of
encryption, there is the potential for import/export restrictions on
such technology, since R is available via international CRAN mirrors. It
may be best to provide for a plug-in "encryption black box" of sorts, so
that folks can use a particular encryption scheme that meets various
legal/regulatory requirements.

Of course, simply encrypting the file or even a complete partition has
to be considered within a larger security strategy (ie. network
security, physical access control, etc.) that meets a particular
functional requirement (such as HIPAA here in the U.S.)

HTH,

Marc Schwartz



Re: [R] multiple boxplots

2005-10-28 Thread Marc Schwartz (via MN)
On Fri, 2005-10-28 at 10:26 -0700, J.M. Breiwick wrote:
> Hello,
> 
> I want to plot 3 boxplots [ par(mfrow=c(3,1)) ] but the first one has 8 
> groups, the 2nd has 7 and the third has 6. But I want the groups to line up:
> 
> 1 2 3 4 5 6 7 8
>   2 3 4 5 6 7 8
>     3 4 5 6 7 8
> 
> where the numbers actually refer to years. I suspect I have to manipulate 
> the function bxp(). Is there a relatively simple way to do this? Thanks for 
> any help.
> 
> Jeff Breiwick
> NMFS, Seattle


Here is one approach. It is based upon using 2 principal concepts:

1. Create the first plot, save the plot region ranges from par("usr")
and then use this information for the two subsequent plots, where we use
the 'add = TRUE' argument.

2. Use the 'at' argument to specify the placement of the boxes in the
second and third plots to line up with the first.


So:

# Create our data.
dat <- rnorm(80)
years <- rep(1991:1998, each = 10)

# MyDF will be a data frame with two columns and we will use 
# this in the formula method for boxplot.
# The second and third plots will use subset()s of MyDF
MyDF <- data.frame(dat, years)

# Set the plot matrix
par(mfrow = c(3, 1))

# Create the first boxplot and save par("usr")
boxplot(dat ~ years, MyDF)
usr <- par("usr")

# Now open a new plot, setting its par("usr") to 
# match the first plot.
# Then use boxplot and set the boxes 'at' x pos 2:8
# and add it to the open plot
plot.new()
par(usr = usr)
boxplot(dat ~ years, subset(MyDF, years %in% 1992:1998),
        at = 2:8, add = TRUE)


# Rinse and repeat  ;-)
# Different subset and use 3:8 for 'at'
plot.new()
par(usr = usr)
boxplot(dat ~ years, subset(MyDF, years %in% 1993:1998),
        at = 3:8, add = TRUE)



Replace MyDF in the above with your actual datasets of course.

This would of course be a bit easier if one could set an 'xlim' argument
in boxplot(), but this is ignored by bxp().
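
A possible alternative, offered only as a sketch: if the year is stored
as a factor that carries all eight levels, boxplot() should reserve a
slot for every level, including the empty ones, so the three panels
share the same x positions without touching par("usr"). This assumes
the default handling of empty groups in current R versions. The column
name 'year.f' and the simulated data are made up for this example.

```r
set.seed(42)
MyDF <- data.frame(dat = rnorm(80), years = rep(1991:1998, each = 10))

# Hypothetical helper column: the year as a factor with all 8 levels
MyDF$year.f <- factor(MyDF$years, levels = 1991:1998)

op <- par(mfrow = c(3, 1))

# Each call keeps 8 slots; years missing from the subset stay empty
b1 <- boxplot(dat ~ year.f, data = MyDF)
b2 <- boxplot(dat ~ year.f, data = subset(MyDF, years >= 1992))
b3 <- boxplot(dat ~ year.f, data = subset(MyDF, years >= 1993))

par(op)
```

The returned objects (b1, b2, b3) hold the group names and counts, so
b2$n can be used to check which slots are empty.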

HTH,

Marc Schwartz



Re: [R] R Graphs in Powerpoint

2005-11-01 Thread Marc Schwartz (via MN)
One other option, just to throw it out there, though it involves a few
more steps.

1. Generate the R plots as EPS files.

2. Import them into Powerpoint onto the required slides. Resize and/or
place as required. Recent versions of Powerpoint will auto-generate a
bitmapped preview image upon import.

3. Print the full Powerpoint presentation to a PS file, using a PS
printer driver. This will result in high quality images.

4. Convert the PS file to PDF, using Ghostscript (ps2pdf) or similar.

5. Display the presentation using Acrobat Reader in full screen mode to
your audience.


This works well, as long as you are not using complex object/slide
transitions, animations and the like in Powerpoint and takes advantage
of the higher quality vector format of EPS graphics as opposed to the
bitmapped graphic formats.
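
For step 1, a minimal sketch of writing a single plot as an EPS file
from R. The file name, the plot dimensions and the example plot are
placeholders; the combination of horizontal = FALSE, onefile = FALSE
and paper = "special" is what ?postscript describes for producing EPS
output suitable for inclusion in other documents.

```r
# Placeholder output file; use a real path for your own slides
eps.file <- tempfile(fileext = ".eps")

# Width and height are in inches and are only illustrative
postscript(eps.file, width = 6, height = 4,
           horizontal = FALSE, onefile = FALSE, paper = "special")

plot(1:10, (1:10)^2, type = "b",
     xlab = "x", ylab = "x squared", main = "Example plot")

# Close the device so that the file is completely written
dev.off()
```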

HTH,

Marc Schwartz


On Tue, 2005-11-01 at 13:43 -0800, Smith, Daniel (DHS-DEODC-EHIB) wrote:
> I've tried several methods in OS X, and here's what works best for me.
> Save the R graphic as a PDF file.  Open it with Apple's "Preview"
> application, and save it as a PNG file.  The resulting .png file can
> be inserted into MS Word or PowerPoint, can be resized, and looks good
> on either OS X or Windows.  There are other programs available for
> translating the pdf file to png (like the shareware application
> Graphic Converter), but I've found that Preview produces the best
> results.
> 
> Daniel Smith
> Environmental Health Investigations Branch
> California Dept of Health Services
> 
> 
> -Original Message-
> Date: Mon, 31 Oct 2005 15:14:06 -0800
> From: Jarrett Byrnes <[EMAIL PROTECTED]>
> Subject: [R] R Graphs in Powerpoint
> To: r-help@stat.math.ethz.ch
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> Hey, all.  Quick question.  I'm attempting to use some of the great 
> graphs generated in R for an upcoming talk that I'm writing in 
> Powerpoint.  Copying and pasting (I'm using OSX) yields graphs that 
> look great in Powerpoint - until I resize them.  Then fonts, points, 
> and lines all become quite pixelated and blurry.  Even if I size the 
> window properly first, and then copy and paste in the graph, when I 
> then view the slideshow, the graphs come out pixelated and blurry.
> 
> Is there any good solution to this, or is this some fundamental 
> incompatibility that I can't get around?
> 
> -Jarrett



Re: [R] Orientation of tickmarks labels in boxplot/plot

2005-11-02 Thread Marc Schwartz (via MN)
On Wed, 2005-11-02 at 16:06 -0600, Michal Lijowski wrote:
> Hi,
> 
> I have been trying draw tickmark labels along
> the y - axis perpendicular to the y axis while
> labels along the x - axis parallel to x axis
> while making box plot.
> 
> Here is my test dataset.
> 
>  TData
>    ID Ratio
> 1   0 7.075
> 2   0 7.414
> 3   0 7.403
> 4   0 7.168
> 5   0 6.820
> 6   0 7.294
> 7   0 7.238
> 8   0 7.938
> 9   1 7.708
> 10  1 8.691
> 11  1 8.714
> 12  1 8.066
> 13  1 8.949
> 14  1 8.590
> 15  1 8.714
> 16  1 8.601
> 
>  boxplot(Ratio ~ ID, data=TData)
> 
> makes box plot with tickmark labels parallel to the y - axis.
> So I try 
> 
> boxplot(Ratio ~ ID, data=TData, axes=FALSE)
> par(las=0)
> axis(1)
> and I get x - axis ranging from 0.5 to 2.5 (why?) and 
> boxes at 1 and 2.
> par(las=2)
> axis(2)
> box()
> So, if I set tickmark labels parallel to y - axis
> somehow the x - axis range is not what I expect even
> if I use xlim = c(0.0, 3.0)
>  in boxplot(Ratio ~ Id, data=TData, axes=FALSE, xlim=c(0.0, 3.0))
>  par(las=0)
>  axis(1)
> 
> Plots are in the attachments in pdf format.
> 
> I appreciate any tips.
> 
> I am using R 2.2.0 (2005-10-06) on FC4.
> 
> Michal

I suspect that you want this:

  boxplot(Ratio ~ ID, data=TData, las = 1)

so that the x and y axis labels are horizontal. See ?par and review the
options for 'las'. '1' is for axis tick mark labels to be horizontal.

The x axis (horizontal axis if the boxes are vertical, the vertical axis
if the boxes are horizontal) is set by default to the number of boxes
(groups) +/- 0.5.  'xlim' is ignored in both cases.

The boxes themselves are placed at integer values from 1:N, where N is
the number of groups.

There is the 'at' argument and there is an example of its use
in ?boxplot. You can also search the archives for posts where variations
on the use of 'at' have been posted (recently, in fact...)

The actual plotting of the boxplots is done by bxp(), so review the help
for that function as well.
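
To make the above concrete, here is a short sketch using the data from
the original post. The first call is the simple route; the second shows
one way to draw the axes by hand, labelling the boxes at their actual
positions (1:N) rather than letting axis(1) pick ticks over the default
plot range.

```r
TData <- data.frame(
  ID = rep(0:1, each = 8),
  Ratio = c(7.075, 7.414, 7.403, 7.168, 6.820, 7.294, 7.238, 7.938,
            7.708, 8.691, 8.714, 8.066, 8.949, 8.590, 8.714, 8.601)
)

# Simple route: horizontal tick mark labels on both axes
b <- boxplot(Ratio ~ ID, data = TData, las = 1)

# Manual route: suppress the axes, then label the boxes at 1:N
boxplot(Ratio ~ ID, data = TData, axes = FALSE)
axis(1, at = seq_along(b$names), labels = b$names)
axis(2, las = 1)
box()
```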

HTH,

Marc Schwartz



Re: [R] bug/feature with barplot?

2005-11-14 Thread Marc Schwartz (via MN)
On Mon, 2005-11-14 at 15:55 +0100, Karin Lagesen wrote:
> I have found a bug/feature with barplot that at least to me shows
> undesirable behaviour. When using barplot and plotting fewer
> groups/levels/factors(I am unsure what they are called) than the number
> of colors stated in a col statement, the colors wrap around such that
> the colors are not fixed to one group. This is mostly problematic when
> I make R figures using scripts, since I sometimes have empty input
> groups. I have in these cases experienced labeling the empty group as
> red, and then seeing a bar being red when that bar is actually from a
> different group.
> 
> Reproducible example (I hope):
> 
> barplot(VADeaths, beside=TRUE, col=c("red", "green", "blue", "yellow", 
> "black"))
> barplot(VADeaths[1:4,], beside=TRUE, col=c("red", "green", "blue", "yellow", 
> "black"))
> 
> Now, I don't know if this is a bug or a feature, but it sure bugged me...:)
> 
> Karin

Most definitely not a bug.

As with many vectorized function arguments, they will be recycled as
required to match the length of other appropriate arguments.

In this case, the number of colors (5) does not match the number of
groups (4). Thus, they are "out of synch" with each other and you get
the result you have.

Not unexpected behavior.

You should adjust your code and the function call so that the number of
groups matches the number of colors. Something along the lines of the
following:

col <- c("red", "green", "blue", "yellow", "black")
no.groups <- 4
barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups])


Now try:

 no.groups <- 5
 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups])

 no.groups <- 3
 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups])
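
For scripts where an arbitrary group may be empty (not just the last
ones), another option is to key the colors by group name instead of by
position. This is only a sketch: the color assignments and the object
names 'group.cols' and 'present' are made up here, and the rows that
are kept stand in for whatever groups your data actually contain.

```r
# One fixed color per group, keyed by the row names of VADeaths
group.cols <- c("50-54" = "red", "55-59" = "green", "60-64" = "blue",
                "65-69" = "yellow", "70-74" = "black")

# Pretend that the "60-64" group is missing from the input
present <- VADeaths[c(1, 2, 4, 5), ]

# Look the colors up by name so each group keeps its own color
use.cols <- group.cols[rownames(present)]
barplot(present, beside = TRUE, col = use.cols)
```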


HTH,

Marc Schwartz



Re: [R] Coercion of percentages by as.numeric

2005-11-14 Thread Marc Schwartz (via MN)
On Mon, 2005-11-14 at 19:07 +0200, Brandt, T. (Tobias) wrote:
>  
> >-Original Message-
> >From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
> >Sent: 14 November 2005 06:21 PM
> >
> >On 11/14/05, Brandt, T. (Tobias) <[EMAIL PROTECTED]> wrote:
> >> Hi
> >>
> >> Given that things like the following work
> >>
> >>  > a <- c("-.1"," 2.7 ","B")
> >> > a
> >> [1] "-.1"   " 2.7 " "B"
> >> > as.numeric(a)
> >> [1] -0.1  2.7   NA
> >> Warning message:
> >> NAs introduced by coercion
> >> >
> >>
> >> I naively expected that the following would behave differently.
> >>
> >>  > b <- c('10%', '-20%', '30.0%', '.40%')
> >> > b
> >> [1] "10%"   "-20%"  "30.0%" ".40%"
> >> > as.numeric(b)
> >> [1] NA NA NA NA
> >> Warning message:
> >> NAs introduced by coercion
> >
> >Try this:
> >
> >as.numeric(sub("%", "e-2", b))
> >
> 
> Thank you, that accomplishes what I had intended.
> 
> I would have thought though that the expression "53%" would be a fairly
> standard representation of the number 0.53 and might be handled as such.  Is
> there a specific reason for avoiding this behaviour?  

"53%" is a 'shorthand' character representation of a mathematical
concept. To wit, the specific representation of a fraction using 100 as
the denominator (ie. 53 / 100). The symbol '%' can be replaced by the
word "percent", such as "53 percent", which is also a character
representation.

0.53, in context, is a numeric representation of a proportion in the
range of 0 - 1.0.

> I can imagine that it might add unnecessary overhead to routines like
> "as.numeric" which one would like to keep as fast as possible.
> 
> Perhaps there are other areas though where it might be desirable?  For
> example I'm thinking of the read.table function for reading in csv files
> since I have many of these that have been saved from excel and now contain
> numbers in the "%" format.

In Excel, numbers displayed with a '%' are what you see visually.
However, the internal representation (how the value is actually stored
in the program) is still as a floating point value, without the '%'. 

For example:

> a <- 53
> a
[1] 53

> sprintf("%.0f%%", a)
[1] "53%"

> is.numeric(a)
[1] TRUE

> is.numeric(sprintf("%.0f%%", a))
[1] FALSE


Unfortunately (depending upon your perspective), Excel, and other
similar programs, tend to export the visually displayed values and not
the internal representations of them. Thus, as Gabor pointed out, you
will need to do some 'editing' of the values before using them in R. You
can either do this in Excel, by removing the "%" formatting, or
post-import in R as Gabor has described.

You need to keep separate the internal representation of a value and its
printed or displayed representation for human readable consumption.

as.numeric() does basically one thing and it does it well and properly.
It is up to the user to ensure that it is passed the proper values. When
that is not the case, it issues an appropriate warning message and
returns NA.

Of course, using Gabor's hint, you can also write your own variation of
as.numeric(), creating a function that takes percent formatted values
and converts them as you require. One of the many strengths of R, is
that you can extend it to meet your own specific requirements when the
base functions do not.
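
As a rough sketch of such a function (the name 'pct2num' is made up
here, and it takes a different route from Gabor's sub("%", "e-2", b)
trick): strip a trailing "%" and divide only those entries by 100,
leaving plain numbers alone.

```r
pct2num <- function(x) {
  x <- trimws(x)

  # Which entries carry a trailing percent sign?
  is.pct <- grepl("%$", x)

  # Drop the sign and coerce; non-numeric entries become NA
  out <- suppressWarnings(as.numeric(sub("%$", "", x)))

  # Only the percent formatted entries are rescaled
  out[is.pct] <- out[is.pct] / 100
  out
}

pct2num(c("10%", "-20%", "30.0%", ".40%", "0.5", "B"))
```

Note that the warning from as.numeric() is suppressed here, so entries
that cannot be converted are returned silently as NA.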

HTH,

Marc Schwartz



Re: [R] as.integer with base other than ten.

2005-11-14 Thread Marc Schwartz (via MN)
On Mon, 2005-11-14 at 19:01 +, William Astle wrote:
> Is there an R function analogous to the C function strtol? 
> I would like to convert a binary string into an integer.
> 
> Cheers for advice
> 
> Will


There was some discussion in the past and you might want to search the
archive for a more generic solution for any base to any base, but for
binary to decimal specifically, something like the following will work:

bin2dec <- function(x)
{
  b <- as.numeric(unlist(strsplit(x, "")))
  pow <- 2 ^ ((length(b) - 1):0)
  sum(pow[b == 1])
}


The function takes the binary string and splits it up into individual
digits ('b'). It then creates a vector of powers of 2 ('pow'), with
exponents running from length(b) - 1 down to 0. Finally, it takes the
sum of the values of 'pow', indexed by 'b == 1'.


> bin2dec("101")
[1] 5

> bin2dec("1111")
[1] 15

> bin2dec("1011111")
[1] 95
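
If your version of R provides it, strtoi() in base R is a close
analogue of C's strtol() and may be all that is needed. A short sketch
(the input strings are just examples):

```r
# Binary string to integer
strtoi("101", base = 2)
# [1] 5

# Vectorized over the input
strtoi(c("1111", "1011111"), base = 2)
# [1] 15 95

# Other bases work the same way
strtoi("ff", base = 16)
# [1] 255
```

Per ?strtoi, strings that cannot be interpreted in the given base, or
that would overflow an integer, come back as NA.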


HTH,

Marc Schwartz



Re: [R] invert y-axis in barplot

2005-11-16 Thread Marc Schwartz (via MN)
On Wed, 2005-11-16 at 16:46 +, Jörg Schlingemann wrote:
> Hi!
> 
>  
> 
> This is probably a very trivial question. Is there an easy way to invert the
> y-axis (low values on top) when using the function barplot()? 
> 
>  
> 
> Thanks,
> 
> Jörg

You mean something like this?:

  barplot(1:10, ylim = rev(c(0, 12)))

or, something like this?:

  barplot(1:10, yaxt = "n") 
  axis(2, labels = rev(seq(0, 10, 2)), at = seq(0, 10, 2))


Note the use of rev() in each case.

HTH,

Marc Schwartz


Re: [R] [Rd] Scan data from a .txt file

2005-11-17 Thread Marc Schwartz (via MN)
I have a feeling that Vasu wants (mistakenly) this:

dat <- read.table("clipboard", header = FALSE)

> dat
      V1     V2     V3     V4
1   Name Weight Height Gender
2   Anne    150     65      F
3    Rob    160     68      M
4 George    180     65      M
5   Greg    205     69      M

> str(dat)
`data.frame':   5 obs. of  4 variables:
 $ V1: Factor w/ 5 levels "Anne","George",..: 4 1 5 2 3
 $ V2: Factor w/ 5 levels "150","160","180",..: 5 1 2 3 4
 $ V3: Factor w/ 4 levels "65","68","69",..: 4 1 2 1 3
 $ V4: Factor w/ 3 levels "F","Gender","M": 2 1 3 3 3

> dat$V1
[1] Name   Anne   Rob    George Greg
Levels: Anne George Greg Name Rob

> dat$V2
[1] Weight 150    160    180    205
Levels: 150 160 180 205 Weight

> dat$V3
[1] Height 65     68     65     69
Levels: 65 68 69 Height

> dat$V4
[1] Gender F      M      M      M
Levels: F Gender M


So that the colnames are actually part of the data frame columns.

Vasu, note, however, that all of the columns become factors, which you
can convert to character, for example:

> as.character(dat$V1)
[1] "Name"   "Anne"   "Rob"    "George" "Greg"

neither of which I suspect is what you really want.


You can access the column names of the data frame using colnames():

> dat <- read.table("clipboard", header = TRUE)

> dat
    Name Weight Height Gender
1   Anne    150     65      F
2    Rob    160     68      M
3 George    180     65      M
4   Greg    205     69      M

> colnames(dat)
[1] "Name"   "Weight" "Height" "Gender"


This keeps the column names separate from the actual data, which unless
we are missing something here, is the proper way to do this. Think of a
data frame as a rectangular data set, which can contain more than one
data type across the columns, much like a spreadsheet.  The difference
here (unlike a spreadsheet) is that the first row does not contain the
column names/labels. These are separate from the data itself, which in a
typical spreadsheet would start on row 2.

Note as Andy pointed out, that in this case, you should use
read.table(), not scan().
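
For anyone who wants to try this without the clipboard, here is a
self-contained sketch that reads the same data from a character string.
It assumes a version of R in which read.table() accepts the 'text'
argument; stringsAsFactors = FALSE is used so that the names come in as
character rather than factor. The object name 'weight.col' is made up.

```r
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

dat <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Column names are kept separate from the data
colnames(dat)

# The weights stay numeric
dat$Weight

# If the name really is wanted alongside the values, build that
# explicitly; note that everything becomes character
weight.col <- c(colnames(dat)[2], dat$Weight)
```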

Review "An Introduction To R" and the "R Data Import/Export" manuals for
more information. Both are available with your installation and/or from
the main R web site under Documentation.

HTH,

Marc Schwartz


On Thu, 2005-11-17 at 10:41 -0500, Liaw, Andy wrote:
> [Re-directing to R-help, as this is more appropriate there.]
> 
> I tried copying the snippet of data into the windows clipboard and tried it:
> 
> > dat <- read.table("clipboard", header=T)
> > dat
>     Name Weight Height Gender
> 1   Anne    150     65      F
> 2    Rob    160     68      M
> 3 George    180     65      M
> 4   Greg    205     69      M
> > str(dat)
> `data.frame':   4 obs. of  4 variables:
>  $ Name  : Factor w/ 4 levels "Anne","George",..: 1 4 2 3
>  $ Weight: int  150 160 180 205
>  $ Height: int  65 68 65 69
>  $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat <- read.table("clipboard", header=T, row=1)
> > str(dat)
> `data.frame':   4 obs. of  3 variables:
>  $ Weight: int  150 160 180 205
>  $ Height: int  65 68 65 69
>  $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat
>        Weight Height Gender
> Anne      150     65      F
> Rob       160     68      M
> George    180     65      M
> Greg      205     69      M
> 
> Don't see how it "doesn't work".  Please give more detail on what "doesn't
> work" means.
> 
> Andy
> 
> > From: Vasundhara Akkineni
> > 
> > Hi all,
> > Am trying to read data from a .txt file in such a way that i 
> > can access the
> > column names too. For example, the data in the table.txt file 
> > is as below:
> >  Name Weight Height Gender
> > Anne 150 65 F
> > Rob 160 68 M
> > George 180 65 M
> > Greg 205 69 M
> >  i used the following commands:
> >  data<-scan("table.txt",list("",0,0,0),sep="")
> > a<-data[[1]]
> > b<-data[[2]]
> > c<-data[[3]]
> > d<-data[[4]]
> >  But this doesn't work because of type mismatch. I want to 
> > pull the col
> > names also into the respective lists. For example i want 'b' to have
> > (weight,150,160,180,205) so that i can access the col name 
> > and also the
> > individual weights. I tried using the read.table method too, 
> > but couldn't
> > get this working. Can someone suggest a way to do this.
> > Thanks,
> > Vasu.
> >



Re: [R] dev.copy legend problem

2005-11-17 Thread Marc Schwartz (via MN)
On Thu, 2005-11-17 at 17:03 +0100, Florence Combes wrote:
> Dear all,
> 
> We are facing this problem for long, and so ask for your help.
> 
> We are plotting 2 graphs in a postscript device (left part -layout
> function-), and the common legend for these graphs on the right part.
> The legend in the postscript device looks ok: this is color lines with
> numbers on the right (6 columns) , see the code below:
> 
> > nblock<-c(1:48)
> > leg<-paste(c(1:npin)," ",sep=" ")
> > legend(0,19,legend = leg, col=rainbow(nblock), lty=1,
> merge=TRUE,ncol=6,bty="n",cex=0.6)
> 
> The problem we are facing is that we dev.copy to a pdf device and then, the
> legend doesn't look the same: numbers overlap a little lines.

The problem with using dev.copy() is that it can result in slightly (or
notably) different plotting characteristics, which are device dependent.

A plot that is viewed on the screen, for example, may or may not (most
likely not) look the same when created using the same code to a
postscript device or a pdf device. This can be further exacerbated if
the plot device dimensions and pointsizes are not explicitly defined, as
"device shrinkage" may occur. I suspect that this is what you are
experiencing.

If you want to end up with a PDF plot, use pdf() and set the various
plotting parameters (ie. font size, etc.) based upon what you see there,
not what you see on the screen or in another device.
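
A minimal sketch of that workflow, writing straight to a pdf() device
with the size and pointsize set explicitly. The data, the number of
lines, the layout widths and the legend settings are all invented for
illustration and would need tuning against the actual PDF output.

```r
# Placeholder output file
pdf.file <- tempfile(fileext = ".pdf")

# Fix the device size (inches) and pointsize up front
pdf(pdf.file, width = 8, height = 6, pointsize = 10)

# Plot on the left, legend on the right
layout(matrix(1:2, nrow = 1), widths = c(3, 1))

n.lines <- 12
cols <- rainbow(n.lines)
y <- sapply(seq_len(n.lines), function(i) (1:10) * i)
matplot(1:10, y, type = "l", lty = 1, col = cols,
        xlab = "x", ylab = "y")

# Empty panel that only holds the legend
par(mar = c(0, 0, 0, 0))
plot.new()
legend("center", legend = seq_len(n.lines), col = cols, lty = 1,
       ncol = 2, bty = "n", cex = 0.8)

dev.off()
```

Adjust 'cex', 'ncol' and the layout widths while looking at the
resulting PDF file itself, rather than at a screen device.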

One other possibility, just to throw it out there since you are on
Linux, is to use ps2pdf from a console to convert the PS file to PDF.
This uses Ghostscript to do the conversion and will generally work well,
but testing with your particular plot, given possible issues with fonts,
etc. would still be a good idea. See 'man ps2pdf' for more information.

> Has someone already encountered such a thing ?
> 
> Any help appreciated, thanks a lot.
> 
> Florence.
> 
> 
> R 2.1.0 on a Linux Debian.

Have not been able to get your SysAdmin to upgrade you yet?

:-)

HTH,

Marc Schwartz


