On Wed, 30 Apr 2008, Duncan Murdoch wrote:

On 30/04/2008 2:44 AM, Martin Maechler wrote:
"DM" == Duncan Murdoch <[EMAIL PROTECTED]>
    on Sat, 26 Apr 2008 17:21:06 -0400 writes:

    DM> On 25/04/2008 2:47 PM, Prof Brian Ripley wrote:
    >> On Fri, 25 Apr 2008, Deepayan Sarkar wrote:
    >>
    >>> For what it's worth, I use ?foo mostly to look up usage of functions
    >>> that I know I want to use, and find it perfect for that (one benefit
    >>> over help() is that completion works for ?). The only thing I miss is
    >>> the ability to do the equivalent of help("foo", package = "bar");
    >>> ?bar::foo gives the help page for "::". Perhaps that would be
    >>> something to consider for addition.
    >>
    >> That fits most naturally with the (somewhat technical) idea that bar::foo
    >> becomes a symbol and not a function call. I believe that several of us think
    >> that is in principle a better idea, but no one has as yet (AFAIK) explored
    >> the ramifications.
    >>
    >> However, 5 mins looking at the sources suggests that it is easy to do.


    DM> And you already did.  Thanks!
indeed.

    DM> I'm going to make the following change soon (in R-devel).

    DM> ??foo

    DM> will now be like help.search("foo"). This will work with your change,
    DM> so ??utils::foo will limit the search to the utils package. This is
    DM> also quite easy. A more difficult thing I'd like to do is to broaden
    DM> the search to look outside the man pages, but that's a lot harder, and I
    DM> haven't started on it.

    DM> I will also follow Hadley's suggestion and change the format of the
    DM> help.search results, so you can just cut and paste after a question mark
    DM> to look up the particular topic, e.g. ??foo gives

    DM> utils::citEntry         Writing Package CITATION Files

    DM> Type '?PKG::FOO' to inspect entry 'PKG::FOO TITLE'.

    DM> I haven't touched the case of ?foo failing; I'll want to try it for a
    DM> while to decide whether I like it best as is:

    >> ?foo
    DM> No documentation for 'foo' in specified packages and libraries:
    DM> you could try '??foo'

    DM> or whether it should just automatically call help.search, or something
    DM> in between.

Please the former, at least by default!
[The case of 1500 installed packages was mentioned before...]

Note one thing that hasn't been mentioned before:

help() has had the optional argument
       ' try.all.packages = getOption("help.try.all.packages") '
for many years now, and I have been involved in its history as
well but don't recall all details. IIRC,
help() {and hence "?"} used to *default* to 'try.all.packages = TRUE'
for a while, and later it was the default for me (and our whole
statistics departmental unit).
But we found that it *was* inconvenient that a big search was
started, often just because of a typo.
So I think   ?<non-existing>  should ``answer quickly'' by
default.
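
For reference, the two behaviours can be compared with a short sketch like
the one below. It assumes a stock R installation; the topic name
"nosuchtopic_xyz" is made up for illustration, and the result is assigned
rather than printed so that no help viewer is opened.

```r
# Sketch only: "nosuchtopic_xyz" is a hypothetical topic that should not exist.
# With the option FALSE, help() looks only in the attached packages and
# returns at once when nothing matches.
old <- options(help.try.all.packages = FALSE)
res <- help("nosuchtopic_xyz")
n_found <- length(res)   # number of help files found for the topic
options(old)             # restore the previous setting

# The wider (and slower) search can still be requested per call:
#   help("nosuchtopic_xyz", try.all.packages = TRUE)
```

Keeping the option FALSE and asking for the wide search explicitly matches
the "answer quickly by default" preference described above.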

Have you tried help.search() lately? It is now very fast. I haven't checked if help() makes use of the same search mechanism, but presumably it could do so, if speed is an issue.

So I would say the speed is a solvable or solved problem.

There are some possible improvements as yet. Hadley mentioned keeping binary indices -- we do per-package and could per-library. Just opening 1700 files can be quite slow on some systems -- this is one of the areas where you see the benefits of Unix-alike file systems.

A lot of the speed ups are generic, e.g. internal file.path.  I get

system.time(help("linear", try.all.packages = TRUE))
   user  system elapsed
 10.948   2.620  37.808
system.time(help.search("linear"))
   user  system elapsed
  8.219   0.432  28.358

so there is room for improvement in help().  However, the re-run

system.time(help.search("linear"))
   user  system elapsed
  1.951   0.003   1.960

shows the benefits of caching.
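
A minimal way to repeat this comparison in a single session is sketched
below. The variable names are only illustrative, and the timings depend
entirely on the machine and on how many packages are installed, so no
particular numbers should be expected.

```r
# Sketch only: time the same search twice in one session.
# The first call may have to build the search database; the second call
# can reuse what was built in this session.
t_first  <- system.time(first  <- help.search("linear"))
t_second <- system.time(second <- help.search("linear"))

print(t_first)
print(t_second)

# Both calls should report the same matches; only the time taken differs.
same_matches <- identical(first$matches, second$matches)
```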

(This is on a not particularly fast machine with all of CRAN and BioC installed, in UTF-8: and I know of some ways to improve performance in UTF-8.)

It's all a question of resources and who is prepared to contribute.
I sped help.search() up ca 3x because 100s was too slow for me -- 30s the first time in a session is OK. (And incidentally disc caching means that the next session got

system.time(help.search("linear"))
   user  system elapsed
  7.180   0.246   7.627

, so the main issue is disc access.)

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
