Re: [Rd] unique.matrix issue [Was: Anomaly with unique and match]

2011-03-12 Thread Petr Savicky
On Thu, Mar 10, 2011 at 01:19:48AM -0800, Henrik Bengtsson wrote:
> It should be possible to run unique()/duplicated() column by column
> and incrementally update the set of unique/duplicated rows.  This
> would avoid any coercing.  The benefit should be even greater for
> data.frame():s.

This is a good point. An implementation of this using sorting can
be done as follows:

  Sort the data frame using function order().
  Determine the groups of consecutive equal rows in the sorted df.
  Map the first row of each group to the original order of the rows.
  Since sorting by the function order() is stable, we obtain the first
  in each group of equal rows also in the original order.

The coercion approach uses hashing for string comparison, but the
efficiency of hashing seems to be outweighed by the inefficiency
of the coercion. So, we get the following comparison.

  a <- matrix(sample(c(1234, 5678), 12*1, replace=TRUE), ncol=12)
  df <- data.frame(a)

  do.unique.sort <- function(df)
  {
      i <- do.call(order, df)
      n <- nrow(df)
      u <- c(TRUE, rowSums(df[i[2:n], ] == df[i[1:(n-1)], ]) < ncol(df))
      df[u[order(i)], ]
  }

  system.time(out1 <- do.unique.sort(df))
  system.time(out2 <- unique(df))
  identical(out1, out2)

The result may be, for example

 user  system elapsed 
0.279   0.000   0.273 
 user  system elapsed 
0.514   0.000   0.468 
  [1] TRUE

On another computer

 user  system elapsed 
0.058   0.000   0.058 
 user  system elapsed 
0.187   0.000   0.188 
  [1] TRUE
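
For readers who want to try the sorting approach on a small input, here is a minimal self-contained sketch. The function name unique_by_sort and the sample data are hypothetical and not from the thread. It assumes the columns contain no NA values and that order() leaves ties in their original order, and it compares neighbouring rows column by column rather than with rowSums().

```r
# Sketch of sort-based row deduplication (hypothetical helper, not base R).
unique_by_sort <- function(df) {
  n <- nrow(df)
  if (n < 2L) return(df)                      # nothing to compare
  # Sort rows by all columns; unname() avoids clashes with order()'s arguments.
  i <- do.call(order, unname(as.list(df)))
  sorted <- df[i, , drop = FALSE]
  # A sorted row starts a new group if it differs from its predecessor
  # in at least one column (assumes no NA values).
  differs <- Reduce(`|`, lapply(sorted, function(col) col[-1L] != col[-n]))
  first <- c(TRUE, differs)
  # Map the "first in group" flags back to the original row positions.
  keep <- logical(n)
  keep[i] <- first
  df[keep, , drop = FALSE]
}

set.seed(42)
small <- data.frame(matrix(sample(c(1234, 5678), 3 * 100, replace = TRUE),
                           ncol = 3))
identical(unique_by_sort(small), unique(small))
```

Because ties keep their original order after sorting, the first row of each group of equal rows is also the first occurrence in the original data, which is what unique() keeps.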

On Thu, Mar 10, 2011 at 01:39:56PM -0600, Terry Therneau wrote:
> Simon pointed out that the issue I observed was due to internal
> behaviour of unique.matrix.
>
>   I had looked carefully at the manual pages before posting the question
> and this was not mentioned.  Perhaps an addition could be made?

According to the description of unique(), the user may expect that if
b is obtained using

  b <- unique(a)

then for every i there is j, such that

  all(a[i, ] == b[j, ])

This is usually true, but not always, because among several numbers
in a with the same as.character() value only one remains in b. If this
is intended, then I support the suggestion to include a note in the
documentation.

Let me add an argument against using as.character() to determine
whether two numbers are close. The maximum relative difference between
numbers that have the same 15-digit decimal representation varies
by a factor of up to 10 across different ranges. Due to this, we have

  x <- 1 + c(1.1, 1.3, 1.7, 1.9)*1e-14

  unique(as.character(x))
  [1] "1.00000000000001" "1.00000000000002"

  unique(as.character(9*x))
  [1] "9.0000000000001"  "9.00000000000012" "9.00000000000015"
  [4] "9.00000000000017"

The relative differences between components of 9*x are the same as the
relative differences in x, but if the mantissa begins with 9, then
a smaller relative difference is sufficient to change the 15th digit.

In terms of unique(), this implies

  nrow(unique(cbind(x)))
  [1] 2

  nrow(unique(cbind(9*x)))
  [1] 4
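
The effect can be reproduced without unique.matrix() by rounding to 15 significant digits directly. The sketch below uses sprintf("%.15g", ...) as a stand-in for the 15-digit conversion; the helper name sig15 is hypothetical, and using sprintf() in place of as.character() is an assumption made for illustration.

```r
# Hypothetical helper: decimal representation with 15 significant digits.
sig15 <- function(v) sprintf("%.15g", v)

x <- 1 + c(1.1, 1.3, 1.7, 1.9) * 1e-14

# Number of distinct 15-digit representations in each range.
n_distinct_1 <- length(unique(sig15(x)))      # mantissa starts with 1
n_distinct_9 <- length(unique(sig15(9 * x)))  # mantissa starts with 9

n_distinct_1
n_distinct_9
```

Numbers near 1 collapse into fewer 15-digit strings than the same numbers scaled by 9, although the relative spacing is essentially the same in both cases.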
  
Petr Savicky.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Run script automatically when package is loaded

2011-03-12 Thread Janko Thyson
Dear list,

is it possible to specify a script that is executed automatically when my
package is mounted via 'require(my.pkg)' or 'library(my.pkg)'?

I'd like to execute a small init function that creates some crucial
environment structures. As it's always the first thing to do when using the
package, I wanted to hide it from the user so he won't have to think about
this step.

Can I use the lazy-loading functionality of packages for that (Writing R
Extensions, Section 1.1.5 Data in packages, pp. 10)?

Thanks for any suggestions,
Janko



Re: [Rd] WARNING Undocumented S4 methods 'initialize' - why?

2011-03-12 Thread cstrato

Dear all,

Meanwhile I found my mistake.
I forgot to add class QualTreeSet to 'initialize-methods.Rd'.
However, I am not sure if I should make 'initialize' public at all.

Best regards
Christian

On 3/11/11 9:58 PM, cstrato wrote:

Dear all,

I am just writing the documentation file for S4 class 'QualTreeSet' and
get the following warning with R CMD check:

* checking for missing documentation entries ... WARNING
Undocumented S4 methods:
generic 'initialize' and siglist 'QualTreeSet'
All user-level objects in a package (including S4 classes and methods)
should have documentation entries.
See the chapter 'Writing R documentation files' in manual 'Writing R
Extensions'.

All my S4 classes have a method 'initialize' and R CMD check has never
complained. Thus, do you have any idea why I suddenly get this warning
for method 'initialize'?


Here is the code for 'QualTreeSet':

setClass("QualTreeSet",
   representation(qualtype = "character",
                  qualopt  = "character"
   ),
   contains=c("ProcesSet"),
   prototype(qualtype = "rlm",
             qualopt  = "raw"
   )
)#QualTreeSet

setMethod("initialize", "QualTreeSet",
   function(.Object,
            qualtype = "rlm",
            qualopt  = "raw",
            ...)
   {
      if (qualtype == "") qualtype <- "rlm";
      if (qualopt  == "") qualopt  <- "raw";

      .Object <- callNextMethod(.Object,
                                qualtype = qualtype,
                                qualopt  = qualopt,
                                ...);
      .Object@qualtype = qualtype;
      .Object@qualopt  = qualopt;
      .Object;
   }
)#initialize


However, here is my code for a similar class 'ExprTreeSet' (which is the
class from where I have copied the code):

setClass("ExprTreeSet",
   representation(exprtype = "character",
                  normtype = "character"
   ),
   contains=c("ProcesSet"),
   prototype(exprtype = "none",
             normtype = "none"
   )
)#ExprTreeSet

setMethod("initialize", "ExprTreeSet",
   function(.Object,
            exprtype = "none",
            normtype = "none",
            ...)
   {
      if (exprtype == "") exprtype <- "none";
      if (normtype == "") normtype <- "none";

      .Object <- callNextMethod(.Object,
                                exprtype = exprtype,
                                normtype = normtype,
                                ...);
      .Object@exprtype = exprtype;
      .Object@normtype = normtype;
      .Object;
   }
)#initialize

In this case R CMD check does not complain, so why does it in the case
of 'QualTreeSet'?

Here is my:
  sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] xps_1.11.4

Thank you in advance
Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._






Re: [Rd] Run script automatically when package is loaded

2011-03-12 Thread Dirk Eddelbuettel

On 12 March 2011 at 14:06, Janko Thyson wrote:
| Dear list,
| 
| is it possible to specify a script that is executed automatically when my
| package is mounted via 'require(my.pkg)' or 'library(my.pkg)'?

That has been possible all along. See help(.onLoad) if you use a NAMESPACE
(as you should) or help(.First.lib) if you don't.
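
As a rough illustration of the .onLoad route, a minimal sketch follows. The environment name .pkg_env and the objects stored in it are hypothetical; in a real package this code would sit in a file under R/ and R would call .onLoad() itself when the namespace is loaded.

```r
# Hypothetical package-private environment that holds the package's state.
.pkg_env <- new.env(parent = emptyenv())

# Load hook: R calls this with the library path and the package name.
.onLoad <- function(libname, pkgname) {
  # Create the structures the rest of the package relies on.
  assign("registry", list(), envir = .pkg_env)
  assign("initialised", TRUE, envir = .pkg_env)
  invisible(NULL)
}
```

With this in place the user only calls library(my.pkg); the init step runs without any further action on their part.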

Dirk
 
| I'd like to execute a small init function that creates some crucial
| environment structures. As it's always the first thing to do when using the
| package, I wanted to hide it from the user so he won't have to think about
| this step.
| 
| Can I use the lazy-loading functionality of packages for that (Writing R
| Extensions, Section 1.1.5 Data in packages, pp. 10)?
| 
| Thanks for any suggestions,
| Janko
| 

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com



[Rd] par(ask=TRUE) in R CMD check?

2011-03-12 Thread Spencer Graves

Hello:


  What happens in the auto-checks on R-Forge and CRAN with code 
using par(ask=TRUE)?



  Is this routine, or can it create problems?


  The fda package uses ask=TRUE to provide the user with a way to 
examine a group of plots.  In the past, I've marked those tests in 
\examples with \dontrun.  However, I wonder if that is necessary.  I 
tried it on Windows using R 2.12.0 and R Tools from that version, and R 
Tools seemed to supply all the mouse clicks required.  However, before I 
SVN Commit to R-Forge, I felt a need to ask.



  Thanks,
  Spencer



Re: [Rd] par(ask=TRUE) in R CMD check?

2011-03-12 Thread Prof Brian Ripley

On Sat, 12 Mar 2011, Spencer Graves wrote:


> Hello:
>
>  What happens in the auto-checks on R-Forge and CRAN with code using
> par(ask=TRUE)?
>
>  Is this routine, or can it create problems?
>
>  The fda package uses ask=TRUE to provide the user with a way to examine
> a group of plots.  In the past, I've marked those tests in \examples with
> \dontrun.  However, I wonder if that is necessary.  I tried it on Windows
> using R 2.12.0 and R Tools from that version, and R Tools seemed to supply
> all the mouse clicks required.  However, before I SVN Commit to R-Forge, I
> felt a need to ask.


No mouse clicks are needed!  par(ask=TRUE) is ignored on a pdf device 
(which is the non-interactive default device).  In any case, the help says


 ‘ask’ logical.  If ‘TRUE’ (and the R session is interactive) the
...
  This not really a graphics parameter, and its use is
  deprecated in favour of ‘devAskNewPage’.

See also the 'ask' argument in ?example.  What you say you want to do 
is the default behaviour of example(), and has been for some years.
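
A minimal sketch of paging through a group of plots with devAskNewPage() instead of par(ask=TRUE) might look as follows. The function name show_plots and the plotted curves are hypothetical; the prompt is requested only when the session and the device are interactive, so nothing waits for input under R CMD check.

```r
# Hypothetical helper: draw n plots, prompting between pages only when
# the R session and the graphics device are both interactive.
show_plots <- function(n = 3,
                       ask = interactive() && dev.interactive(orNone = TRUE)) {
  old <- devAskNewPage(ask)                 # returns the previous setting
  on.exit(devAskNewPage(old), add = TRUE)   # restore it on exit
  x <- seq(0, 1, length.out = 50)
  for (k in seq_len(n)) {
    plot(x, x^k, type = "l", main = paste("Plot", k))
  }
  invisible(n)
}
```

In an interactive session show_plots() waits for confirmation before each new page; on a non-interactive device such as pdf() the plots are written one after another.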



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[Rd] Excited about the near future...

2011-03-12 Thread Henrik Bengtsson
Some already know, but I think it deserves a bit of attention here as well:

It looks like we're about to get new features in R that will be very powerful!

That should be a good enough teaser for now...

/Henrik

PS ...and thanks for making it available plus credits to similar
efforts by others.
