[Rd] duplicated() variation that goes both ways to capture all duplicates

2012-07-23 Thread Liviu Andronic
Dear all
The trouble with the current duplicated() function in R is that it can
report duplicates while searching fromFirst _or_ fromLast, but not
both ways. Often users will want to identify and extract all the
copies of the item that has duplicates, not only the duplicates
themselves.

To take the example from the man page:
 data(iris)
 iris[duplicated(iris), ]  ##duplicates while searching fromFirst
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
143          5.8         2.7          5.1         1.9 virginica
 iris[duplicated(iris, fromLast=TRUE), ]  ##duplicates while searching fromLast
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
102          5.8         2.7          5.1         1.9 virginica


To extract all the copies of the concerned items (original and
duplicates) one would need to do something like this:
 ##duplicates while searching bothWays
 iris[(duplicated(iris) | duplicated(iris, fromLast=TRUE)), ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
102          5.8         2.7          5.1         1.9 virginica
143          5.8         2.7          5.1         1.9 virginica


Unfortunately this is unnecessarily long and convoluted. Short of a
'bothWays' argument in duplicated(), I came up with a small wrapper
that simplifies the above:
duplicated2 <-
function(x, bothWays=TRUE, ...)
{
    ## bothWays=FALSE falls back to the plain duplicated() behaviour
    if(!bothWays) {
        return(duplicated(x, ...))
    }
    ## flag an element if it repeats an earlier OR a later element
    duplicated(x, ...) | duplicated(x, fromLast=TRUE, ...)
}


Now the above can be achieved simply via:
 iris[duplicated2(iris), ]  ##duplicates while searching bothWays
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
102          5.8         2.7          5.1         1.9 virginica
143          5.8         2.7          5.1         1.9 virginica


So here's my inquiry: Would the R Core consider adding such
functionality in 'base' R? Either the---suitably cleaned
up---duplicated2() function above, or a bothWays argument in
duplicated() itself? Either of the two would improve user convenience
and reduce confusion. (In my case it took some time before I
understood the correct approach to this problem.)
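
For what it is worth, the idiom is easiest to see on a plain vector. The snippet below is only a minimal sketch: all_copies() is a hypothetical helper name used for illustration, not something that exists in base R.

```r
# Sketch only: all_copies() is a hypothetical name, not a base R function.
# It flags every element that occurs more than once, first occurrence included.
all_copies <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

v <- c("a", "b", "b", "c", "a")
duplicated(v)                   # FALSE FALSE  TRUE FALSE  TRUE
duplicated(v, fromLast = TRUE)  #  TRUE  TRUE FALSE FALSE FALSE
all_copies(v)                   #  TRUE  TRUE  TRUE FALSE  TRUE
v[all_copies(v)]                # "a" "b" "b" "a"
```

Neither direction alone marks both "a" and both "b"; only the union of the two searches does.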

Regards
Liviu


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] 'data.frame' method for base::rep()

2011-08-03 Thread Liviu Andronic
Hello David


On Tue, Aug 2, 2011 at 4:14 PM, David Winsemius dwinsem...@comcast.net wrote:
 x <- data.frame(a = as.Date('2000-01-01'), b=as.Date('2001-01-01'))
 x$d <- x$a - x$b
 require(mefa)
 rep(x, 2)
           a          b    d
 1 2000-01-01 2001-01-01 -366
 2 2000-01-01 2001-01-01 -366
 str(rep(x,2))
 'data.frame':   2 obs. of  3 variables:
  $ a: Date, format:  ...
  $ b: Date, format:  ...
  $ d: num  -366 -366   # notice that a difftime object has lost its class

Nice catch. Thanks for pointing it out.


 # Whereas using the [rep(. , .) , ] approach does preserve the difftime
 class.
 str(x[rep(1,2) , ])
 'data.frame':   2 obs. of  3 variables:
  $ a: Date, format:  ...
  $ b: Date, format:  ...
  $ d:Class 'difftime'  atomic [1:2] -366 -366   # leap year
  .. ..- attr(*, "units")= chr "days"

The above is nice. I wouldn't have thought of it.


 Since that works out of the box with fewer potential side-effects, I am not
 sure a new method is needed.

Your solution still seems more like an obscure side-effect of
subsetting than an intuitive feature: before discovering it, the
average user who needs to replicate a df would probably turn first to
base::rep(), and then (perhaps) to mefa:::rep.data.frame() (with all
the associated confusion and pitfalls). I tend to believe that a
clean, R-ish base::rep.data.frame() could still be useful.
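
To make the idea concrete, here is a minimal sketch of what such a method might look like if it were built on the subsetting approach quoted above. rep_rows() is a hypothetical name of my own; it is neither the mefa implementation nor anything in base R.

```r
# Sketch only: rep_rows() is a hypothetical helper, not part of base R or mefa.
# It repeats rows by indexing, so each column keeps its class and attributes.
rep_rows <- function(df, times = 1) {
  stopifnot(is.data.frame(df))
  idx <- rep(seq_len(nrow(df)), times = times)
  out <- df[idx, , drop = FALSE]
  rownames(out) <- NULL   # replace "1", "1.1", ... with 1, 2, ...
  out
}

x <- data.frame(a = as.Date("2000-01-01"), b = as.Date("2001-01-01"))
x$d <- x$a - x$b
str(rep_rows(x, 2))   # column d should still be of class 'difftime'
```

Because the work is done by `[`, the difftime column is not flattened to a plain numeric the way it is when each column goes through lapply(x, rep, ...).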

Best regards
Liviu



[Rd] 'data.frame' method for base::rep()

2011-08-02 Thread Liviu Andronic
Dear R developers
Would you consider adding a 'data.frame' method for the base::rep
function? The need to replicate a df row-wise can easily arise while
programming, and rep() is unable to handle such a case. See below.
 x <- iris[1, ]
 x
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
 rep(x, 2)
$Sepal.Length
[1] 5.1

$Sepal.Width
[1] 3.5

$Petal.Length
[1] 1.4

$Petal.Width
[1] 0.2

$Species
[1] setosa
Levels: setosa versicolor virginica

$Sepal.Length
[1] 5.1

$Sepal.Width
[1] 3.5

$Petal.Length
[1] 1.4

$Petal.Width
[1] 0.2

$Species
[1] setosa
Levels: setosa versicolor virginica


I found a 'rep.data.frame' function in package 'mefa' [2], but I think
it would be nice to have it in base R. In any case, the code used by
the method is very simple.
 require(mefa)
Loading required package: mefa
mefa 3.2-1   2011-05-13
 mefa:::rep.data.frame
function (x, ...)
as.data.frame(lapply(x, rep, ...))
<environment: namespace:mefa>

And here's the example above:
 rep(x, 2)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1  5.1 3.5  1.4 0.2  setosa
2  5.1 3.5  1.4 0.2  setosa


Please let me know what you think. Regards
Liviu

[1] http://finzi.psych.upenn.edu/R/library/mefa/html/rep.data.frame.html
[2] http://cran.at.r-project.org/web/packages/mefa/index.html




Re: [Rd] No RTFM?

2010-08-22 Thread Liviu Andronic
On Mon, Aug 23, 2010 at 7:04 AM, Paul Johnson pauljoh...@gmail.com wrote:
 I can tell. I wish sessionInfo would just grab the locale information.

Here it does so by default: locale info is included in sessionInfo
output. Regards
Liviu


 sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fortunes_1.3-7 IPSUR_1.0

 Sys.getlocale()
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"



[Rd] fortune? (was: Re: How do you make a formal feature request?)

2010-08-21 Thread Liviu Andronic
Dear all
I was wondering whether such a long post could be fortune-ed. What do you think?

Regards
Liviu


On Sat, Aug 21, 2010 at 9:33 PM, Sharpie ch...@sharpsteen.net wrote:
 Well, I can think of three ways it can go down:


 1.  You want a shiny new pony.

 You ask about it on the mailing list and it seems that everyone else in the
 world responds "Hell yeah! I want to ride that too!".  In this case the
 natives are restless enough that someone on R-Core may personally implement
 the feature, especially if they want to ride the pony as well.

 In this case, you need to provide a detailed specification of what kind of
 pony you want, how it should be groomed and the exact pitch at which you
 want it to whinny.  A good template for such a spec would be a Python
 Enhancement Proposal (PEP), which is the way community-suggested core changes
 are implemented in Python.  An example is:
 http://www.python.org/dev/peps/pep-0389/

 However, going this route is extremely rare.  You have to have a significant
 amount of the user community rallying behind your idea and buy-in from core
 developers who are interested in implementing and, most importantly,
 maintaining and supporting the code.


 2. You want a shiny new pony but not many other people in the world seem
 interested.

 In this situation you can do the work yourself, or with a group of other
 like-minded pony enthusiasts, to bring your idea into the world.  Perhaps
 the genetic material you are looking for is already present in the vast
 herds of other ponies running wild on CRAN and elsewhere and you just have
 to do a little breeding to get what you want.  Other times, the only way to
 do right is to write everything from scratch.  Either way, in the end you
 will have a pony that shines exactly the way you want it to that you can
 enjoy for the rest of your life.

 In this case, getting your new pony into R Core is unlikely.  The best
 response you can hope for is something along the lines of "That is a mighty
 fine pony you have there, but we really don't want it crapping all over our
 stable".  They are not trying to be rude; the facts of life are that the
 members of R Core have a limited amount of time and a lot of other ponies to
 clean up after.  Add to that the fact that shoveling pony shit is a
 thankless job that does not pay well and it is understandable why R Core may
 be conservative about the number of ponies they let into the official
 stable.

 However, they will be more than happy to provide your pony with a stall at
 CRAN so that everyone else in the world can take it out for a spin.  I have
 never had a problem with installing and using packages from CRAN, even on
 Windows machines that have been locked down and then shot in both kneecaps
 by the friendly neighborhood IT gestapo.  All in all, this option is
 actually a pretty sweet deal; you will just have to drop by the CRAN stall
 every once in a while and deal with the pony droppings yourself or people
 will start to avoid it because of the smell.


 3. You want a shiny new pony, but don't have the time or energy to pick out
 or put together the exact one you want.

 In this case, you can still get the pony you want but it will cost you
 money.  There are R programmers out there who can write you a package if you
 pay them the right price.  Supporting your local grad student population can
 also work; hunger is a great motivator.

 In the end you can also pay a corporate pony breeder like SAS for a trusty
 thoroughbred that is well respected by people in high places.  However, you
 may notice that these ponies bear some telltale signs of inbreeding: one of
 their eyes may not point in the same direction as the other or the pony
 becomes confused easily when put in an unfamiliar situation.  Given there is
 not a lot you can do about these defects, you may suffer a crippling case of
 buyer's remorse, especially when you see the bill.


 Ok, I think I've thoroughly beaten this horse analogy to death and I'm going
 to stop now.

 -Charlie


 -
 Charlie Sharpsteen
 Undergraduate-- Environmental Resources Engineering
 Humboldt State University






[Rd] Make a donation via PayPal? (was: [R] Monetary support to the R-project)

2010-04-01 Thread Liviu Andronic
Dear R developers
I understand that this is not a proper r-devel message, but it still
touches on the organisation of the R project.

I would like to make a small donation to the project, but I am not
comfortable with sending my credit card details via post or mail and,
as echoed elsewhere on r-help, would prefer to go through a generally
accepted service such as PayPal.

Would there be any interest from the R project to implement this? Thank you
Liviu



On 3/8/10, Henrik Bengtsson h...@stat.berkeley.edu wrote:
  For companies and others wondering how to give something back, it is
  possible to support R and the R Foundation either through a donation:

  http://www.r-project.org/ -> Foundation -> Donations
  [http://www.r-project.org/foundation/donations.html]

  or via a membership:

  http://www.r-project.org/ -> Foundation -> Membership
  [http://www.r-project.org/foundation/membership.html]

  or both.




[Rd] checking Rd cross-references ... WARNING

2009-12-21 Thread Liviu Andronic
Dear all
I am getting this strange error when checking my package. Would you
have an idea what causes it?

Thank you
Liviu

* checking Rd cross-references ... WARNING
Error in .find.package(package, lib.loc) :
  there is no package called 'KernSmooth'
Calls: <Anonymous> -> lapply -> FUN -> .find.package
Execution halted



 sessionInfo ()
R version 2.10.0 (2009-10-26)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
 [1] datasets  grid      splines   graphics  stats     utils     grDevices tcltk     methods   base

other attached packages:
 [1] fortunes_1.3-6           RcmdrPlugin.HH_1.1-25    HH_2.1-32                leaps_2.9
 [5] multcomp_1.1-2           mvtnorm_0.9-8            lattice_0.17-26          RcmdrPlugin.sos_0.1-0
 [9] tcltk2_1.1-1             RcmdrPlugin.Export_0.3-0 Hmisc_3.7-0              survival_2.35-7
[13] xtable_1.5-6             Rcmdr_1.5-4              car_1.2-16               relimp_1.0-1
[17] sos_1.2-4                brew_1.0-3               hints_1.0.1-1

loaded via a namespace (and not attached):
[1] cluster_1.12.1





Re: [Rd] checking Rd cross-references ... WARNING

2009-12-21 Thread Liviu Andronic
Hello

On 12/21/09, Duncan Murdoch murd...@stats.uwo.ca wrote:
  R version 2.10.0 (2009-10-26)

  Do you get the same message in 2.10.1?

I no longer get the warning after I installed r-recommended and
r-cran-kernsmooth, without upgrading to 2.10.1. Perhaps this is a
Debian specific issue.
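
In case it helps someone who hits the same warning, a quick way to see whether a package the check needs is missing might look like the sketch below. The list of names is an assumption: it only contains the package named in the error message.

```r
# Sketch only: report which of the packages a check needs are not installed.
# 'needed' holds just the package named in the error message above.
needed <- c("KernSmooth")
is_installed <- vapply(needed,
                       function(p) nzchar(system.file(package = p)),
                       logical(1))
needed[!is_installed]   # character(0) if everything is available
```

system.file(package = ) returns an empty string when a package cannot be found, so nothing has to be loaded for this test.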

Regards
Liviu



Re: [Rd] options(width=100) ignored on start-up: bug or feature?

2009-12-07 Thread Liviu Andronic
On 12/7/09, Liviu Andronic landronim...@gmail.com wrote:
   Is it normal that R ignores options(width=100) at start-up? Although
   li...@debian-liv:~$ cat /usr/lib/R/etc/Rprofile.site | grep width
   options(width = 100)

Found the issue. In the config, Rcmdr was being loaded after the
options() call. Putting the two calls in the following order
library(Rcmdr)
options(width = 100)

solved the issue.
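
An alternative that does not depend on the order of the calls might be to register a hook that sets the width again whenever the package is attached. This is only a sketch under the assumption that Rcmdr changes the width while it is being attached; I have not tested it with Rcmdr itself. setHook() and packageEvent() are base R functions.

```r
# Sketch of an Rprofile.site fragment; an assumption, not tested with Rcmdr.
# Set the width now ...
options(width = 100)
# ... and set it again whenever Rcmdr is attached, in case attaching the
# package changes the value.
setHook(packageEvent("Rcmdr", "attach"),
        function(...) options(width = 100))
```

The hook is only registered here; it does nothing until library(Rcmdr) or require(Rcmdr) is called.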

Liviu



[Rd] options(width=100) ignored on start-up: bug or feature?

2009-12-06 Thread Liviu Andronic
Dear developers
I've tried this a couple of days ago on r-help, unfortunately with no
feedback. Could you please take a look and confirm whether it's a bug,
feature, or bad eye-sight when reading Help:

 Is it normal that R ignores options(width=100) at start-up? Although
 li...@debian-liv:~$ cat /usr/lib/R/etc/Rprofile.site | grep width
 options(width = 100)

 , R will start with
 [Previously saved workspace restored]

  options()$width
 [1] 80

 Am I doing something wrong?
 Liviu


  sessionInfo()
 R version 2.10.0 (2009-10-26)
 x86_64-pc-linux-gnu

 locale:
  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
  [1] datasets  grid  splines   graphics  stats utils grDevices
  [8] tcltk methods   base

 other attached packages:
  [1] fortunes_1.3-6   RcmdrPlugin.HH_1.1-25HH_2.1-32
  [4] leaps_2.9multcomp_1.1-2   mvtnorm_0.9-8
  [7] lattice_0.17-26  RcmdrPlugin.sos_0.1-0RcmdrPlugin.Export_0.3-0
 [10] Hmisc_3.7-0  survival_2.35-7  xtable_1.5-6
 [13] Rcmdr_1.5-4  car_1.2-16   relimp_1.0-1
 [16] sos_1.1-7brew_1.0-3   hints_1.0.1-1

 loaded via a namespace (and not attached):
 [1] cluster_1.12.1



[Rd] strange crashes caused by 'cairoDevice' and 'tcltk' dialogues

2009-11-19 Thread Liviu Andronic
Dear developers
I get some strange crashes when 'cairoDevice' and 'tcltk' are both
loaded in the same R vanilla session.

When executing the following in that order
require(relimp)
require(cairoDevice)
showData (iris)

I get a crash with the following message (see R-relimp-cairoDevice.txt):
The program 'R' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadWindow (invalid Window parameter)'.
  (Details: serial 2832 error_code 3 request_code 15 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)


When I reverse the order of the calls, I no longer seem to get issues.
(See R-cairoDevice-relimp.txt)
require(cairoDevice)
require(relimp)
showData (iris)

Initially I got issues with 'playwith' and 'Rcmdr', but managed to pin
them down to 'cairoDevice' and 'tcltk' (although it could still be
something else). At this point I'm stuck. I tried
R -d gdb --vanilla

but
 require(relimp)
Loading required package: relimp
Loading Tcl/Tk interface ... [Thread debugging using libthread_db enabled]
Cannot find new threads: generic error
(gdb) quit

I get the same error when trying to load Rcmdr in gdb.

Could someone point me in the right direction for investigating these
crashes? Thank you
Liviu



li...@debian-liv:~$ uname -a
Linux debian-liv 2.6.30-1-amd64 #1 SMP Sat Aug 15 18:09:19 UTC 2009
x86_64 GNU/Linux
li...@debian-liv:~$ cat /etc/apt/sources.list | grep cran2deb
deb http://debian.cran.r-project.org/cran2deb/debian-amd64/ testing/
 sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] relimp_1.0-1 cairoDevice_2.10

loaded via a namespace (and not attached):
[1] tcltk_2.10.0
li...@debian-liv:~$ R --vanilla

R version 2.10.0 (2009-10-26)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

 require(relimp)
Loading required package: relimp
Loading Tcl/Tk interface ... done
 showData (iris)
 require(cairoDevice)
Loading required package: cairoDevice

Warning message:
package 'cairoDevice' was built under R version 2.9.0 and help may not work 
correctly 
 
 showData (iris)
 The program 'R' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadWindow (invalid Window parameter)'.
  (Details: serial 2832 error_code 3 request_code 15 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
li...@debian-liv:~$ R --vanilla

R version 2.10.0 (2009-10-26)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

 require(cairoDevice)
Loading required package: cairoDevice
Warning message:
package 'cairoDevice' was built under R version 2.9.0 and help may not work 
correctly 
 require(relimp)
Loading required package: relimp
Loading Tcl/Tk interface ... done
 showData(iris)
 sessionInfo()
R version 2.10.0 (2009-10-26) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C  
 [3] 

[Rd] improving ?RweaveLatex

2009-08-22 Thread Liviu Andronic
Dear developers
Please read below.

On 6/25/09, Marc Schwartz marc_schwa...@me.com wrote:
  You can use the following *after* the \begin{document} directive:
   \setkeys{Gin}{width=0.8\textwidth}

  The above is the default. Reset it to what you would like.

  Note, as per that manual page, that the Sweave options 'height' and 'width'
 affect the size of the PDF and EPS files created, but it is the above
 command that controls the size of the image in the document itself.

Could this information be incorporated into the RweaveLatex help page?
Concerning the sizes of graphs, it contains only the following
information:

 width: numeric (6), width of figures in inches.
 height: numeric (6), height of figures in inches.

Unfortunately this concise information can easily mislead readers into
believing that these two options control the size of the dynamically
produced graphs as they appear in the final .pdf file. It might help
for these two entries to specify which figures they affect, and also to
include some info about \setkeys{Gin}{width=0.8\textwidth}.
Thank you
Liviu



[Rd] sessionInfo() fails to correctly detect locale settings

2009-08-21 Thread Liviu Andronic
Dear R devels
Yesterday I was slightly surprised to notice that R incorrectly
detected some of the locale settings. I am not sure whether this is
important, but I preferred to drop a message. In the R output below,
some entries that should have been en_GB.UTF-8 are presented as C.
Regards
Liviu

 sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] fortunes_1.3-6 relimp_1.0-1

loaded via a namespace (and not attached):
[1] tcltk_2.9.1


li...@debian-liv:~$ locale
LANG=en_GB.UTF-8
LC_CTYPE=en_GB.UTF-8
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
LC_COLLATE=en_GB.UTF-8
LC_MONETARY=en_GB.UTF-8
LC_MESSAGES=en_GB.UTF-8
LC_PAPER=en_GB.UTF-8
LC_NAME=en_GB.UTF-8
LC_ADDRESS=en_GB.UTF-8
LC_TELEPHONE=en_GB.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_IDENTIFICATION=en_GB.UTF-8
LC_ALL=
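
To compare the two listings category by category from within R, something like the following sketch could be used. The choice of categories is mine, and the sketch only reports values; it makes no claim about which of them R is supposed to inherit from the environment.

```r
# Sketch only: show what R reports for each locale category next to the
# matching environment variable (empty string if the variable is not set).
categories <- c("LC_CTYPE", "LC_NUMERIC", "LC_TIME", "LC_COLLATE", "LC_MONETARY")
locale_report <- data.frame(
  category    = categories,
  r_session   = vapply(categories, Sys.getlocale, character(1)),
  environment = Sys.getenv(categories),
  row.names   = NULL,
  stringsAsFactors = FALSE
)
locale_report
```

Rows where the two columns differ are the ones in question, such as LC_NUMERIC in the output above.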






[Rd] Printing the null hypothesis

2009-08-16 Thread Liviu Andronic
Dear R developers,
Currently many (all?) test functions in R describe the alternative
hypothesis, but not the null hypothesis being tested. For example,
cor.test:
 require(boot)
 data(mtcars)
 with(mtcars, cor.test(mpg, wt, met="kendall"))

Kendall's rank correlation tau

data:  mpg and wt
z = -5.7981, p-value = 6.706e-09
alternative hypothesis: true tau is not equal to 0
sample estimates:
 tau
-0.72783

Warning message:
In cor.test.default(mpg, wt, met = "kendall") :
  Cannot compute exact p-value with ties


In this example,
H0: (not printed)
Ha: true tau is not equal to 0

This should be fine for the advanced users and expert statisticians,
but not for beginners. The help page will also often not explicitly
state the null hypothesis. Personally, I often find myself in front of
an htest object guessing what the null should have reasonably sounded
like.
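
As a rough illustration of what I have in mind, the sketch below rebuilds a null statement from the 'null.value' component that many (though not all) htest objects carry. null_statement() is a hypothetical helper, not part of the stats package, and it only mirrors the alternative line that print already shows, so it says nothing about the assumptions behind the test.

```r
# Sketch only: null_statement() is a hypothetical helper, not part of stats.
# It builds a point-null sentence from the 'null.value' component of an
# htest object and returns NA when the object has no such component.
null_statement <- function(x) {
  stopifnot(inherits(x, "htest"))
  if (is.null(x$null.value)) return(NA_character_)
  paste0("true ", names(x$null.value), " is equal to ", format(x$null.value))
}

# exact = FALSE avoids the warning about ties
res <- cor.test(mtcars$mpg, mtcars$wt, method = "kendall", exact = FALSE)
null_statement(res)                       # "true tau is equal to 0"
null_statement(shapiro.test(mtcars$wt))   # NA
```

The second call shows the limit of this approach: where the object stores no null value, the statement would have to be supplied by the test function itself.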

Are there compelling reasons for not printing out the null being
tested, along with the rest of the results? Thank you
Liviu







Re: [Rd] Printing the null hypothesis

2009-08-16 Thread Liviu Andronic
Hello,

On 8/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote:
 I don't know about *compelling* reasons! But (as a general rule)
  if the Alternative Hypothesis is stated, then the Null Hypothesis
  is simply its negation. So, in your example, you can infer

   H0: true tau equals 0
   Ha: true tau is not equal to 0.

Oh, I had a slightly different H0 in mind. In the given example,
cor.test(..., met="kendall") would test "H0: x and y are independent",
but cor.test(..., met="pearson") would test "H0: x and y are not
correlated" (or `are linearly independent').

To take a different example, a test of normality.
 shapiro.test(mtcars$wt)

Shapiro-Wilk normality test

data:  mtcars$wt
W = 0.9433, p-value = 0.09265

Here both "H0: x is normal" and "Ha: x is not normal" are missing. At
least to beginners, these things are not always perfectly clear (even
after reading the documentation), and when interpreting the results it
can prove useful to have on-screen information about the null.


Thank you for answering
Liviu



Re: [Rd] Printing the null hypothesis

2009-08-16 Thread Liviu Andronic
On 8/16/09, Ted Harding ted.hard...@manchester.ac.uk wrote:
   Oh, I had a slightly different H0 in mind. In the given example,
   cor.test(..., met="kendall") would test "H0: x and y are independent",
   but cor.test(..., met="pearson") would test "H0: x and y are not
   correlated" (or `are linearly independent').


 Ah, now you are playing with fire! What the Pearson, Kendall and
  Spearman coefficients in cor.test measure is *association*. OK, if
 the results clearly indicate association, then the variables are
  not independent. But it is possible to have two variables x, y
  which are definitely not independent (indeed one is a function of
  the other) which yield zero association by any of these measures.

  Example:
   x <- (-10:10) ; y <- x^2 - mean(x^2)
   cor.test(x,y,method="pearson")
   #   Pearson's product-moment correlation
   # t = 0, df = 19, p-value = 1
   # alternative hypothesis: true correlation is not equal to 0
   # sample estimates: cor 0
   cor.test(x,y,method="kendall")
   #   Kendall's rank correlation tau
   # z = 0, p-value = 1
   # alternative hypothesis: true tau is not equal to 0
   # sample estimates: tau 0
   cor.test(x,y,method="spearman")
   #  Spearman's rank correlation rho
   # S = 1540, p-value = 1
   # alternative hypothesis: true rho is not equal to 0
   # sample estimates: rho 0

  If you wanted, for instance, that the method="kendall" should
  announce that it is testing "H0: x and y are independent" then
  it would seriously mislead the reader!

I did take the null statement from the description of
Kendall::Kendall() ("Computes the Kendall rank correlation and its
p-value on a two-sided test of H0: x and y are independent."). Here,
perhaps "monotonically independent" (as opposed to "functionally
independent") would have been more appropriate.

Still, this very example seems to support my original idea: users can
easily get confused about the exact null of a test. Does it test
for "association" or for "no association", for "normality" or for
"lack of normality"? Printing a precise and appropriate statement of
the null would help in interpreting the results, and in
avoiding misinterpretation.



   Here both "H0: x is normal" and "Ha: x is not normal" are missing. At
   least to beginners, these things are not always perfectly clear (even
   after reading the documentation), and when interpreting the results it
   can prove useful to have on-screen information about the null.

 This is possibly a more discussable point, in that even if you know
  what the Shapiro-Wilk statistic is, it is not obvious what it is
  sensitive to, and hence what it might be testing for. But I doubt
  that someone would be led to try the Shapiro-Wilk test in the
  first place unless they were aware that it was a test for normality,
  and indeed this is announced in the first line of the response.
  The alternative, therefore, is non-normality.

To be particularly picky, as statistics is, this is not so obvious
from the print-out. For the Shapiro-Wilk test one could indeed deduce
that since it is a "test of normality", then the null tested is "H0:
data is normal". This would not hold for, say, the Pearson
correlation. In loose language, it would estimate and test for
"correlation"; in more statistically appropriate language, it will
test for "no correlation" (or for "no association"). It feels to me
that without appropriate indicators, one can easily end up "playing
with fire".



  As to the contrast between absence of an Ha statement for the
  Shapiro-Wilk, and its presence in cor.test(), this comes back to
  the point I made earlier: cor.test() offers you three alternatives
  to choose from: "two-sided" (default), "greater", "less". This
  distinction can be important, and when cor.test() reports Ha it
  tells you which one was used.

  On the other hand, as far as Shapiro-Wilk is concerned there is
  no choice of alternatives (nor of anything else except the data x).
  So there is nothing to tell you! And, further, departure from
  normality has so many dimensions that alternatives like "two
  sided", "greater" or "less" would make no sense. One can think of
  tests targeted at specific kinds of alternative such as "Distribution
  is excessively skew" or "distribution has excessive kurtosis" or
  "distribution is bimodal" or "distribution is multimodal", and so on.
  But any of these can be detected by Shapiro-Wilk, so it is not
  targeted at any specific alternative.

Thank you for these explanations. Best
Liviu



Re: [Rd] Floating point precision / guard digits? (PR#13771)

2009-06-22 Thread Liviu Andronic
On 6/20/09, Dr. D. P. Kreil dpkr...@gmail.com wrote:
  you can suggest an online resource to help me use the right vocabulary
  and better understand the fundamental concepts, I am of course

There is the accuracy [1] package for R. It comes with a vignette (and
a paper) dealing with various kinds of computational error (in R).
Liviu

[1] http://cran.r-project.org/web/packages/accuracy/index.html



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liviu Andronic
Dear Jonathan,

On Thu, May 7, 2009 at 4:18 PM, Jonathan Baron ba...@psych.upenn.edu wrote:
 can't imagine that someone would want to search just vignettes and not
 help pages, or the reverse.

Searching vignettes only can be of interest to users. If someone is
interested in (full-fledged) code examples, and not in various
descriptions of functions, a vignette search facility would come in
handy.
As a personal example, recently I wanted to search all vignettes for
"mle" examples, but could find no way to do this. I had already
searched the help pages and was unable to find anything of obvious
use to me.
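
The closest I can get locally is the sketch below, which searches the titles of the vignettes installed on the machine. find_vignettes() is a hypothetical helper of mine; it does not look inside the vignettes and knows nothing about packages that are not installed, so it is no substitute for a proper search on the site.

```r
# Sketch only: search the titles of installed vignettes for a keyword.
# This looks at titles, not at the full text of the vignettes.
find_vignettes <- function(pattern) {
  v <- vignette(all = TRUE)$results
  hits <- grepl(pattern, v[, "Title"], ignore.case = TRUE)
  v[hits, c("Package", "Item", "Title"), drop = FALSE]
}

find_vignettes("likelihood")
```

The result is a character matrix with one row per matching vignette, which can then be opened with vignette(item, package = pkg).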

Best regards,
Liviu



Re: [Rd] Google Summer of Code 2009

2009-02-19 Thread Liviu Andronic
On Thu, Feb 19, 2009 at 3:47 PM, Sklyar, Oleg (London)
oskl...@maninvestments.com wrote:
 I do think there is a need for an interactive graphics package for R.

There are also the GTK-based playwith, and latticist; unsure though
whether they fit your requirements.
Liviu





Re: [Rd] Identifying graphics files produced by R

2009-02-15 Thread Liviu Andronic
On Sun, Feb 15, 2009 at 8:48 PM, Paul Murrell p.murr...@auckland.ac.nz wrote:
 I know that pdf() adds similar Creator information.  I don't recall
 seeing anything like this for the raster devices, but I've worked less
 with them so I don't know for sure.

By default PDF vector graphs get:
 pdf.options()
[..]
$title
[1] "R Graphics Output"
[..]

Perhaps .svg gets something similar, but dunno.
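
For the PDF case, the title can also be set per file, which might be one way of tagging files produced by a particular script. A minimal sketch follows; the title text is made up, and compress = FALSE is used only so that the file can be read back as text.

```r
# Sketch only: the default title, and a per-file override via pdf(title = ).
pdf.options()$title          # "R Graphics Output" unless changed

f <- tempfile(fileext = ".pdf")
pdf(f, title = "Scatterplot made by my script", compress = FALSE)
plot(1:10)
dev.off()

# With compress = FALSE the file is mostly plain text, so the title can be
# searched for directly.
has_title <- any(grepl("Scatterplot made by my script",
                       readLines(f, warn = FALSE),
                       fixed = TRUE, useBytes = TRUE))
has_title
```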
Liviu

