Re: [Rd] proposed changes to RSiteSearch

2009-06-05 Thread Liaw, Andy
From: spencerg
 
   Thank you all for your suggestions.  My goal with this is to make
 it as easy as possible for R users to find what they want in
 contributed packages.  A referee for our R Journal manuscript
 complained that RSiteSearch.function was too much to type, suggesting
 we consider masking RSiteSearch.  From the discussion, I do not see a
 strong consensus for doing that.  I like Romain's suggestion to shorten
 the name further to, e.g., web.search or doc.search.  Another colleague
 suggested RSearch.

   What do you think about renaming the current
 RSiteSearch.function{RSiteSearch} to RSearch{RSearch}?

   I'm happy to support the consensus of this group on a name (and
 even enhancements) that seems likely to maximize its utility to R
 users.  I ask because a rose by any other name would smell as sweet,
 but one named prettysweetsmellingthingamabob might not sell as well.

   Thanks,
   Spencer

[I've removed those on cc since I believe everyone will get this through
R-devel anyway...]

I'd suggest something like findFunction() or some such, if the main goal
is to look for functions (not manuals, vignettes, mailing lists, etc.).
RSiteSearch() was named what it was because it was meant as an interface
to Jon's search site that has lots of things related to R.  

It seems to me that the recent discussion has been about including other
alternative search engines, etc.  Recall that when we were discussing
including RSiteSearch() into base R, Jon basically had to commit to
maintaining the site, as well as documenting how to replicate the site
if and when he could no longer maintain it, before R Core accepted the
function.  I think it would be wonderful to have a search facility
that's all encompassing (Roogle?), but for inclusion into base R we
really need the sites being searched to be essentially permanent.

Perhaps a bit OT, but what would really be nice is if a search facility
can not only find functions related to some search phrase, but
also indicate whether the packages the functions belong to have already
been installed on the user's system.  Sort of like yum info or yum
search for those on RedHat-based Linux...

Best,
Andy


 
 

Re: [Rd] proposed changes to RSiteSearch

2009-06-05 Thread spencerg

Liaw, Andy wrote:


I'd suggest something like findFunction() or some such, if the main goal
is to look for functions (not manuals, vignettes, mailing lists, etc.).
  
 findFunction sounds to me like the best name I've heard so far. 


RSiteSearch() was named what it was because it was meant as an interface
to Jon's search site that has lots of things related to R.  


It seems to me that the recent discussion has been about including other
alternative search engines, etc.  Recall that when we were discussing
including RSiteSearch() into base R, Jon basically had to commit to
maintaining the site, as well as documenting how to replicate the site
if and when he could no longer maintain it, before R Core accepted the
function.  I think it would be wonderful to have a search facility
that's all encompassing (Roogle?), but for inclusion into base R we
really need the sites being searched to be essentially permanent.

Perhaps a bit OT, but what would really be nice is if a search facility
can not only find functions related to some search phrase, but
also indicate whether the packages the functions belong to have already
been installed on the user's system.  Sort of like yum info or yum
search for those on RedHat-based Linux...
  
 The current RSiteSearch package adds other information from
packageDescription to the package summary: by default Title, Version,
Author, Maintainer, and (date) Packaged.  If the package is not
installed, these fields are left blank.  I've used this to prioritize
which packages (and then which functions) I should consider first.
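
The behaviour described above (DESCRIPTION fields for installed packages, blanks otherwise) could be sketched roughly as follows. This is only an illustration, not the package's code: the helper name packageInfo and the layout of its result are assumptions, and only system.file() and packageDescription() are existing functions.

```r
# Hypothetical helper (not part of the RSiteSearch package): report whether
# each package is installed and, if so, selected DESCRIPTION fields.
packageInfo <- function(pkgs, fields = c("Title", "Version", "Maintainer")) {
  rows <- lapply(pkgs, function(p) {
    # system.file() returns "" when the package cannot be found
    installed <- nzchar(system.file(package = p))
    # start with blank fields; fill them in only for installed packages
    vals <- stats::setNames(rep("", length(fields)), fields)
    if (installed) {
      desc <- utils::packageDescription(p, fields = fields, drop = FALSE)
      for (f in fields) {
        v <- desc[[f]]
        if (!is.null(v) && !is.na(v)) vals[[f]] <- as.character(v)
      }
    }
    data.frame(Package = p, Installed = installed, as.list(vals),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# "noSuchPackage12345" is a deliberately nonexistent name
info <- packageInfo(c("utils", "noSuchPackage12345"))
print(info[, c("Package", "Installed", "Version")])
```

Sorting such a table on the Installed column would give the "installed first" view Andy compares to yum info.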



 Best Wishes,
 Spencer



Re: [Rd] proposed changes to RSiteSearch

2009-06-05 Thread Duncan Murdoch

On 6/5/2009 9:41 AM, spencerg wrote:

Liaw, Andy wrote:


I'd suggest something like findFunction() or some such, if the main goal
is to look for functions (not manuals, vignettes, mailing lists, etc.).
  
  findFunction sounds to me like the best name I've heard so far. 


But it isn't looking for functions, it's looking for help pages about 
functions.  Another possibility is ???, e.g.


???topic

This is done by masking the utils function `?`, and you'd have to be 
careful to pass along requests with one or two (or more than three?) 
question marks to the original; it also feels a bit strange to type


hits <- ???topic

though I think it's syntactic and well-defined.  I'm not sure how you'd 
include your optional arguments, it would be really weird (but again 
well defined) to say


z <- ???spline(maxPages = 2)

(Your first example in ?RSiteSearch.function, translated).
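
To make the parsing point concrete: R reads ???topic as three nested calls to `?`, so a masking function would have to count them before deciding whether to pass the request on. The sketch below only inspects the parsed expression; the helper name peelQuestionMarks is an assumption for illustration, and nothing here masks or calls the real `?`.

```r
# Peel leading unary `?` calls off an unevaluated expression and report
# how many there were and what remains.
peelQuestionMarks <- function(expr) {
  n <- 0L
  while (is.call(expr) && length(expr) == 2L &&
         identical(expr[[1L]], as.name("?"))) {
    n <- n + 1L
    expr <- expr[[2L]]
  }
  list(marks = n, topic = expr)
}

simple   <- peelQuestionMarks(parse(text = "???topic")[[1L]])
withArgs <- peelQuestionMarks(parse(text = "???spline(maxPages = 2)")[[1L]])

print(simple$marks)                      # number of question marks found
print(withArgs$topic)                    # the remaining call
print(as.list(withArgs$topic)$maxPages)  # optional argument carried along
```

A replacement for `?` could dispatch on the count: forward one or two marks to the utils version and handle three itself, treating the arguments of the remaining call as options.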

Duncan Murdoch




Re: [Rd] proposed changes to RSiteSearch

2009-06-05 Thread spencerg
Dear Andy, Duncan, et al. 



 Based on comments from Andy and Duncan, I'd like to revise my 
proposal as follows: 



  1.  Rename the current RSiteSearch.function to 
findFunction and the package name from RSiteSearch.function to 
findFunction, with findFun being an alias for findFunction. 



  2.  Try to write code so '???"differential equation"(999)' 
works the same as 'RSiteSearch.function("differential equation", 999)' 
does now. 



 What do you think?  I've made this as two steps, because I can do 
1 myself, but I may need help to develop 2. 



 Thanks again for all your suggestions.
 Best Wishes,
 Spencer


Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread Duncan Murdoch

spencerg wrote:
Hello All: 



  What do you think of adding a function RSiteSearch to the package 
of that name, masking the RSiteSearch function in utils, trapping 
any call RSiteSearch('searchstring', 'function') to the current 
RSiteSearch.function and passing all others to utils:::RSiteSearch?  
This was suggested by a referee to a manuscript on this new capability 
submitted to R Journal.  The current version of this manuscript is 
available via system.file('doc', 'RSiteSearch.pdf', 
package='RSiteSearch') if you have the RSiteSearch package installed. 
  
I suppose this depends on your long term plans for the function and 
package.  If you think it should eventually replace the utils function, 
then it makes sense to use the same name:  users won't get used to a new 
name in the meantime.  But if you think it will diverge from that 
function, then you might as well pick a separate name now.


I disagree with Gabor about this being heavy handed, at least while it 
is the only significant export in the package.  If people don't want it, 
don't attach the package.


Duncan Murdoch
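
The masking proposal under discussion can be sketched as a wrapper that traps restrict = "function" and forwards every other call. This is an illustration under stated assumptions, not the package's code: functionSearch stands in for RSiteSearch.function, makeSiteSearch and stubFallback are hypothetical names, and the stub replaces the real web search so the example runs offline.

```r
# Hypothetical stand-in for RSiteSearch.function(); it returns a string so
# the sketch runs without network access.
functionSearch <- function(string, ...) paste("function search:", string)

# Build a wrapper that traps restrict = "function" and forwards the rest.
# `fallback` defaults to the real utils::RSiteSearch.
makeSiteSearch <- function(fallback = utils::RSiteSearch) {
  function(string, restrict, ...) {
    if (!missing(restrict) && identical(restrict, "function")) {
      functionSearch(string, ...)
    } else if (missing(restrict)) {
      fallback(string, ...)
    } else {
      fallback(string, restrict = restrict, ...)
    }
  }
}

# Demonstration with a stub fallback instead of the real web search.
stubFallback <- function(string, ...) paste("utils search:", string)
mySearch <- makeSiteSearch(fallback = stubFallback)

trapped <- mySearch("spline", "function")             # handled by the new code
passed  <- mySearch("spline", restrict = "vignettes") # forwarded unchanged
print(c(trapped, passed))
```

A package exporting such a wrapper under the name RSiteSearch would mask the utils version only while attached, which is the trade-off debated in this thread.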


  Thanks,
  Best Wishes,
  Spencer


Liaw, Andy wrote:
  

 I agree!  Recall, though, I had added the RSiteSearch() functionality
to the Rgui under Windows (Help / search.r-project.org...), so if
RSiteSearch() is taken out, this needs to go, too.

Best,
Andy

From: Jonathan Baron
  


There is something to be said for taking all of these functions,
including the original RSiteSearch, out of utils and putting them in
the new RSiteSearch package.  These are the sorts of things that will
get revised frequently, and this way (I think) we won't have to bother
whoever takes care of utils, which is part of the regular R
distribution.

I'm adding Spencer Graves to the cc list.  Maybe he is interested in
doing this.

Jon

On 05/07/09 20:54, Romain Francois wrote:

  
We could have a few functions similar to RSiteSearch or gmaneSearch I
just posted and then cook a summary html page with R ...

Here is a function that grabs relevant groups from gmane:

gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
    url <- URLencode( sprintf(
        "http://dir.gmane.org/index.php?prefix=%s", prefix) )
    # keep only the table rows that contain a group link
    txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ), value = TRUE )

    # capture the link target, the group name and the description
    rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'

    out <- data.frame(
        url = gsub( rx, "\\1", txt ),
        group = gsub( rx, "\\2", txt ),
        description = gsub( rx, "\\3", txt ),
        stringsAsFactors = FALSE
    )
    out$group <- sub( "...", ".*", out$group, fixed = TRUE )
    out
}
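
The extraction step can be tried offline on a single made-up table row. Everything in this sketch is an assumption for illustration: the row markup and the example.org link are invented, the pattern is one plausible form of the regular expression, and perl = TRUE is added here so the non-greedy quantifiers behave predictably.

```r
# Made-up table row; the real gmane markup may have differed.
row <- paste0(
  '<tr class="odd"><td align=right>',
  '<a href="http://example.org/gmane.comp.lang.r.general">',
  'gmane.comp.lang.r.general</a></td>',
  '<td>General R discussion</td></tr>'
)

# Three capture groups: link target, group name, description.
rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'

parsed <- data.frame(
  url         = gsub(rx, "\\1", row, perl = TRUE),
  group       = gsub(rx, "\\2", row, perl = TRUE),
  description = gsub(rx, "\\3", row, perl = TRUE),
  stringsAsFactors = FALSE
)
print(parsed)
```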

I'll clean this up and write a man page if there is interest in adding
this to R, but this might be more appropriate in a package, for example:
http://r-forge.r-project.org/projects/rsitesearch/

Romain

Liaw, Andy wrote:

From: Jonathan Baron

On 05/07/09 13:48, Liaw, Andy wrote:

From: Duncan Murdoch

I'll incorporate the changes if you like

Yes.  Please do.  I understand that it won't take effect for a while.
When it does, I'll change my site.

What do you think of the idea of adding a gmane (or other archive)
search to your results page?  Then if someone doesn't like what the man
pages show, you can send them somewhere else, rather than leaving them
to find out the other resources themselves.

gmane has sample code for this on their search page search.gmane.org, so
it looks reasonably easy.  I'd suggest following their last example,
with a drop-down box to select mailing lists, with comp.lang.r.* as an
option for all lists.

Duncan Murdoch

Good idea.  I will do this.  But there are also two other good search
engines.  Maybe I'll add all three search alternatives.  But then,
according to Sheena Iyengar, people won't choose any!  Hmm.

Actually, I was thinking about a possible RHelpSearch() in addition, if
Jon is no longer going to include the R-help archive in the search.  I
used the current RSiteSearch() a lot more for searching R-help archive
than functions in packages.  Ideas?  comments?

This is OK with me, but I don't want to do it.  I guess it would
search gmane.  MarkMail is also 
Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread Gabor Grothendieck
On Thu, Jun 4, 2009 at 1:58 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:
 spencerg wrote:

 Hello All:

      What do you think of adding a function RSiteSearch to the package of
 that name, masking the RSiteSearch function in utils, trapping any call
 RSiteSearch('searchstring', 'function') to the current RSiteSearch.function
 and passing all others to utils:::RSiteSearch?  This was suggested by a
 referee to a manuscript on this new capability submitted to R Journal.
  The current version of this manuscript is available via system.file('doc',
 'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
 package installed.

 I suppose this depends on your long term plans for the function and package.
  If you think it should eventually replace the utils function, then it makes
 sense to use the same name:  users won't get used to a new name in the
 meantime.  But if you think it will diverge from that function, then you
 might as well pick a separate name now.

 I disagree with Gabor about this being heavy handed, at least while it is
 the only significant export in the package.  If people don't want it, don't
 attach the package.


The last sentence only gives you a choice of clobbering the existing
function or not using it and that is not very nice.   What is wanted is
both to be able to use it and allow it to coexist in a nice way.

How about R changing its RSiteSearch to be an S3 generic with the
main functionality being placed into RSiteSearch.default?   Then
RSiteSearch.function can become RSiteSearch.character and

- RSiteSearch will give the new functionality when the package is
loaded and the old functionality if not.
- RSiteSearch.character can be used in place of RSiteSearch.function
to force only the new functionality (or an error if not present)
- RSiteSearch.default will give the old functionality whether or not the
package is loaded

(If there is a NAMESPACE then it's assumed here that both methods are
exported.)
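
The dispatch behaviour in the three bullets above can be seen in a small self-contained sketch. To avoid touching utils::RSiteSearch, it uses a hypothetical generic named siteSearch; the method bodies just return strings and are assumptions for illustration only.

```r
# Hypothetical generic standing in for an S3-generic RSiteSearch.
siteSearch <- function(x, ...) UseMethod("siteSearch")

# "Old" functionality, as base R would provide it.
siteSearch.default <- function(x, ...) paste("old search:", format(x))

# Before the add-on package is loaded, only the default method exists.
before <- siteSearch("spline")

# "New" functionality, as the add-on package would supply it.
siteSearch.character <- function(x, ...) paste("new search:", x)

# After loading, a character argument dispatches to the new method ...
after <- siteSearch("spline")
# ... while the old behaviour stays reachable by calling the default directly.
forcedOld <- siteSearch.default("spline")

print(c(before, after, forcedOld))
```

Since search strings are always character, the character method takes over every ordinary call once it is defined, which is the same user-visible effect as masking.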

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread Gabor Grothendieck
On Thu, Jun 4, 2009 at 12:13 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:
 Gabor Grothendieck wrote:

 On Thu, Jun 4, 2009 at 1:58 AM, Duncan Murdoch murd...@stats.uwo.ca
 wrote:


 spencerg wrote:


 Hello All:

     What do you think of adding a function RSiteSearch to the package
 of
 that name, masking the RSiteSearch function in utils, trapping any
 call
 RSiteSearch('searchstring', 'function') to the current
 RSiteSearch.function
 and passing all others to utils:::RSiteSearch?  This was suggested by
 a
 referee to a manuscript on this new capability submitted to R Journal.
  The current version of this manuscript is available via
 system.file('doc',
 'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
 package installed.


 I suppose this depends on your long term plans for the function and
 package.
  If you think it should eventually replace the utils function, then it
 makes
 sense to use the same name:  users won't get used to a new name in the
 meantime.  But if you think it will diverge from that function, then you
 might as well pick a separate name now.

 I disagree with Gabor about this being heavy handed, at least while it is
 the only significant export in the package.  If people don't want it,
 don't
 attach the package.



 The last sentence only gives you a choice of clobbering the existing
 function or not using it and that is not very nice.   What is wanted is
 both to be able to use it and allow it to coexist in a nice way.


 It is essentially a rename of the existing one to utils::RSiteSearch.  I
 would only suggest this if RSiteSearch::RSiteSearch expanded on its
 capabilities (which I think was Spencer's proposal), rather than replacing
 them with something different.

 How about R changing its RSiteSearch to be an S3 generic with the
 main functionality being placed into RSiteSearch.default?   Then
 RSiteSearch.function can become RSiteSearch.character and
  - RSiteSearch will give the new functionality when the package is
 loaded and the old functionality if not.
 - RSiteSearch.character can be used in place of RSiteSearch.function
 to force only the new functionality (or an error if not present)
 - RSiteSearch.default will give the old functionality whether or not the
 package is loaded

 (If there is a NAMESPACE then it's assumed here that both methods are
 exported.)


 How is that an improvement?  Just replace your (RSiteSearch,
 RSiteSearch.character, RSiteSearch.default) with (RSiteSearch,
 RSiteSearch::RSiteSearch, utils::RSiteSearch) in my proposal and you get the
 same behaviour.  The point isn't that Spencer has invented a way for
 RSiteSearch to handle character vectors, it already knows that.  The point
 is that he has enhanced it.  Or maybe he has written something similar but
 different, in which case he should pick a new name.
 Duncan Murdoch


He simply renames it RSiteSearch.character (and possibly some other
changes depending on arguments). Then if the core cooperates
by making RSiteSearch a generic with a default method then everything
works as one would expect based on an understanding of S3.

No conflicts in names are involved.



Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread spencerg

Gabor Grothendieck wrote:

On Thu, Jun 4, 2009 at 12:13 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:
  

Gabor Grothendieck wrote:


On Thu, Jun 4, 2009 at 1:58 AM, Duncan Murdoch murd...@stats.uwo.ca
wrote:

  

spencerg wrote:



Hello All:

What do you think of adding a function RSiteSearch to the package
of
that name, masking the RSiteSearch function in utils, trapping any
call
RSiteSearch('searchstring', 'function') to the current
RSiteSearch.function
and passing all others to utils:::RSiteSearch?  This was suggested by
a
referee to a manuscript on this new capability submitted to R Journal.
 The current version of this manuscript is available via
system.file('doc',
'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
package installed.

  

I suppose this depends on your long term plans for the function and
package.
 If you think it should eventually replace the utils function, then it
makes
sense to use the same name:  users won't get used to a new name in the
meantime.  But if you think it will diverge from that function, then you
might as well pick a separate name now.

I disagree with Gabor about this being heavy handed, at least while it is
the only significant export in the package.  If people don't want it,
don't
attach the package.




The last sentence only gives you a choice of clobbering the existing
function or not using it and that is not very nice.   What is wanted is
both to be able to use it and allow it to coexist in a nice way.

  

It is essentially a rename of the existing one to utils::RSiteSearch.  I
would only suggest this if RSiteSearch::RSiteSearch expanded on its
capabilities (which I think was Spencer's proposal), rather than replacing
them with something different.



How about R changing its RSiteSearch to be an S3 generic with the
main functionality being placed into RSiteSearch.default?   Then
RSiteSearch.function can become RSiteSearch.character and
 - RSiteSearch will give the new functionality when the package is
loaded and the old functionality if not.
- RSiteSearch.character can be used in place of RSiteSearch.function
to force only the new functionality (or an error if not present)
- RSiteSearch.default will give the old functionality whether or not the
package is loaded

(If there is a NAMESPACE then it's assumed here that both methods are
exported.)

  

How is that an improvement?  Just replace your (RSiteSearch,
RSiteSearch.character, RSiteSearch.default) with (RSiteSearch,
RSiteSearch::RSiteSearch, utils::RSiteSearch) in my proposal and you get the
same behaviour.  The point isn't that Spencer has invented a way for
RSiteSearch to handle character vectors, it already knows that.  The point
is that he has enhanced it.  Or maybe he has written something similar but
different, in which case he should pick a new name.
Duncan Murdoch




He simply renames it RSiteSearch.character (and possibly some other
changes depending on arguments). Then if the core cooperates
by making RSiteSearch a generic with a default method then everything
works as one would expect based on an understanding of S3.

No conflicts in names are involved.

  
To clarify:  RSiteSearch.function{RSiteSearch} accesses Jonathan Baron's 
RSiteSearch database for functions only, returns the result as a 
data.frame, and sorts it to put the most frequently cited package first 
and then the help pages within each package. 


 Spencer

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread Gabor Grothendieck
On Thu, Jun 4, 2009 at 1:38 PM, spencerg spencer.gra...@prodsyse.com wrote:
 Gabor Grothendieck wrote:

 On Thu, Jun 4, 2009 at 12:13 PM, Duncan Murdoch murd...@stats.uwo.ca
 wrote:


 Gabor Grothendieck wrote:


 On Thu, Jun 4, 2009 at 1:58 AM, Duncan Murdoch murd...@stats.uwo.ca
 wrote:



 spencerg wrote:



 Hello All:

    What do you think of adding a function RSiteSearch to the package
 of
 that name, masking the RSiteSearch function in utils, trapping any
 call
 RSiteSearch('searchstring', 'function') to the current
 RSiteSearch.function
 and passing all others to utils:::RSiteSearch?  This was suggested
 by
 a
 referee to a manuscript on this new capability submitted to R
 Journal.
  The current version of this manuscript is available via
 system.file('doc',
 'RSiteSearch.pdf', package='RSiteSearch') if you have the
 RSiteSearch
 package installed.



 I suppose this depends on your long term plans for the function and
 package.
  If you think it should eventually replace the utils function, then it
 makes
 sense to use the same name:  users won't get used to a new name in the
 meantime.  But if you think it will diverge from that function, then
 you
 might as well pick a separate name now.

 I disagree with Gabor about this being heavy handed, at least while it
 is
 the only significant export in the package.  If people don't want it,
 don't
 attach the package.




 The last sentence only gives you a choice of clobbering the existing
 function or not using it and that is not very nice.   What is wanted is
 both to be able to use it and allow it to coexist in a nice way.



 It is essentially a rename of the existing one to utils::RSiteSearch.  I
 would only suggest this if RSiteSearch::RSiteSearch expanded on its
 capabilities (which I think was Spencer's proposal), rather than
 replacing
 them with something different.



 How about R changing its RSiteSearch to be an S3 generic with the
 main functionality being placed into RSiteSearch.default?   Then
 RSiteSearch.function can become RSiteSearch.character and
  - RSiteSearch will give the new functionality when the package is
 loaded and the old functionality if not.
 - RSiteSearch.character can be used in place of RSiteSearch.function
 to force only the new functionality (or an error if not present)
 - RSiteSearch.default will give the old functionality whether or not the
 package is loaded

 (If there is a NAMESPACE then it is assumed here that both methods are
 exported.)



 How is that an improvement?  Just replace your (RSiteSearch,
 RSiteSearch.character, RSiteSearch.default) with (RSiteSearch,
 RSiteSearch::RSiteSearch, utils::RSiteSearch) in my proposal and you get
 the
 same behaviour.  The point isn't that Spencer has invented a way for
 RSiteSearch to handle character vectors, it already knows that.  The
 point
 is that he has enhanced it.  Or maybe he has written something similar
 but
 different, in which case he should pick a new name.
 Duncan Murdoch



 He simply renames it RSiteSearch.character (and possibly some other
 changes depending on arguments). Then if the core cooperates
 by making RSiteSearch a generic with a default method then everything
 works as one would expect based on an understanding of S3.

 No conflicts in names are involved.



 To clarify:  RSiteSearch.function{RSiteSearch} accesses Jonathan Baron's
 RSiteSearch database for functions only, returns the result as a
 data.frame, and sorts it to put the most frequently cited package first
 and then the help pages within each package.
     Spencer

Consider this:

> f <- function(x) UseMethod("f")
> f.character <- function(x) { if (nchar(x) > 1) NextMethod() else x }
> f.default <- function(x) nchar(x)
> f("xx")
[1] 2
> f("x")
[1] "x"

In this case f takes a single character string argument and passes it
to f.character.

f.character can handle a single character (it just returns
it) but if x consists of a string of multiple characters it hands
it off to f.default (which returns the number of characters).
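
Applied to the search function, the same pattern might look roughly like
the sketch below.  The names siteSearch, siteSearch.default and
siteSearch.character are hypothetical stand-ins (not the real
utils::RSiteSearch), and the bodies only return strings instead of
opening a browser, so this illustrates the dispatch and nothing more.

```r
# Hypothetical generic; in the proposal this role would be played by
# RSiteSearch itself once R core made it generic.
siteSearch <- function(string, ...) UseMethod("siteSearch")

# Stand-in for the existing behaviour (the real function would open a browser).
siteSearch.default <- function(string, ...) {
  paste("old search for:", string)
}

# Stand-in for the package's method: it handles restrict = "function" itself
# and hands every other request on to the default method.
siteSearch.character <- function(string, restrict = NULL, ...) {
  if (identical(restrict, "function")) {
    paste("function-only search for:", string)
  } else {
    NextMethod()
  }
}

siteSearch("spline", restrict = "function")  # handled by the character method
siteSearch("spline")                         # falls through to the default
```

With the package not loaded, only the generic and the default method
would exist, so every call would get the old behaviour.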



Re: [Rd] proposed changes to RSiteSearch

2009-06-04 Thread spencerg
 Thank you all for your suggestions.  My goal with this is to make 
it as easy as possible for R users to find what they want in contributed 
packages.  A referee for our R Journal manuscript complained that 
RSiteSearch.function was too much to type, suggesting we consider 
masking RSiteSearch.  From the discussion, I do not see a strong 
consensus for doing that.  I like Romain's suggestion to shorten the 
name further to, e.g., web.search or doc.search.  Another colleague 
suggested RSearch. 



 What do you think about renaming the current 
RSiteSearch.function{RSiteSearch} to RSearch{RSearch}? 



 I'm happy to support the consensus of this group on a name (and 
even enhancements) that seems likely to maximize its utility to R 
users.  I ask, because a rose by any other name would smell as sweet, 
but one named prettysweetsmellingthingamabob might not sell as well. 



 Thanks,
 Spencer 



Gabor Grothendieck wrote:

On Thu, Jun 4, 2009 at 12:13 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:
  

Gabor Grothendieck wrote:


On Thu, Jun 4, 2009 at 1:58 AM, Duncan Murdoch murd...@stats.uwo.ca
wrote:

  

spencerg wrote:



Hello All:

What do you think of adding a function RSiteSearch to the package
of
that name, masking the RSiteSearch function in utils, trapping any
call
RSiteSearch('searchstring', 'function') to the current
RSiteSearch.function
and passing all others to utils:::RSiteSearch?  This was suggested by
a
referee to a manuscript on this new capability submitted to R Journal.
 The current version of this manuscript is available via
system.file('doc',
'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
package installed.

  

I suppose this depends on your long term plans for the function and
package.
 If you think it should eventually replace the utils function, then it
makes
sense to use the same name:  users won't get used to a new name in the
meantime.  But if you think it will diverge from that function, then you
might as well pick a separate name now.

I disagree with Gabor about this being heavy handed, at least while it is
the only significant export in the package.  If people don't want it,
don't
attach the package.




The last sentence only gives you a choice of clobbering the existing
function or not using it and that is not very nice.   What is wanted is
both to be able to use it and allow it to coexist in a nice way.

  

It is essentially a rename of the existing one to utils::RSiteSearch.  I
would only suggest this if RSiteSearch::RSiteSearch expanded on its
capabilities (which I think was Spencer's proposal), rather than replacing
them with something different.



How about R changing its RSiteSearch to be an S3 generic with the
main functionality being placed into RSiteSearch.default?   Then
RSiteSearch.function can become RSiteSearch.character and
 - RSiteSearch will give the new functionality when the package is
loaded and the old functionality if not.
- RSiteSearch.character can be used in place of RSiteSearch.function
to force only the new functionality (or an error if not present)
- RSiteSearch.default will give the old functionality whether or not the
package is loaded

(If there is a NAMESPACE then it is assumed here that both methods are
exported.)

  

How is that an improvement?  Just replace your (RSiteSearch,
RSiteSearch.character, RSiteSearch.default) with (RSiteSearch,
RSiteSearch::RSiteSearch, utils::RSiteSearch) in my proposal and you get the
same behaviour.  The point isn't that Spencer has invented a way for
RSiteSearch to handle character vectors, it already knows that.  The point
is that he has enhanced it.  Or maybe he has written something similar but
different, in which case he should pick a new name.
Duncan Murdoch




He simply renames it RSiteSearch.character (and possibly some other
changes depending on arguments). Then if the core cooperates
by making RSiteSearch a generic with a default method then everything
works as one would expect based on an understanding of S3.

No conflicts in names are involved.



Re: [Rd] proposed changes to RSiteSearch

2009-06-03 Thread spencerg
Hello All: 

 What do you think of adding a function RSiteSearch to the package 
of that name, masking the RSiteSearch function in utils, trapping 
any call RSiteSearch('searchstring', 'function') to the current 
RSiteSearch.function and passing all others to utils:::RSiteSearch?  
This was suggested by a referee to a manuscript on this new capability 
submitted to R Journal.  The current version of this manuscript is 
available via system.file('doc', 'RSiteSearch.pdf', 
package='RSiteSearch') if you have the RSiteSearch package installed. 
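
A minimal sketch of the masking idea, under stated assumptions:
make_masked_search, new_stub and old_stub are hypothetical names used
only for illustration.  In the real package the two handlers would be
RSiteSearch.function and utils::RSiteSearch; stubs are used here so the
example runs without a network connection or a browser.

```r
# Build a function that traps restrict = "function" and passes every
# other call through to the old search unchanged.
make_masked_search <- function(new_search, old_search) {
  function(string, restrict = NULL, ...) {
    if (identical(restrict, "function")) {
      new_search(string, ...)
    } else if (is.null(restrict)) {
      old_search(string, ...)
    } else {
      old_search(string, restrict = restrict, ...)
    }
  }
}

# Stubs standing in for the two real search functions.
new_stub <- function(string, ...) paste("new:", string)
old_stub <- function(string, restrict = "all", ...) {
  paste("old:", string, "/", restrict)
}

masked <- make_masked_search(new_stub, old_stub)

masked("spline", "function")   # routed to the new search
masked("spline")               # routed to the old search, default restrict
masked("spline", "Rhelp")      # routed to the old search, restrict kept
```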



 Thanks,
 Best Wishes,
 Spencer


Liaw, Andy wrote:

 I agree!  Recall, though, I had added the RSiteSearch() functionality
to the Rgui under Windows (Help / search.r-project.org...), so if
RSiteSearch() is taken out, this needs to go, too.

Best,
Andy

From: Jonathan Baron
  

There is something to be said for taking all of these functions,
including the original RSiteSearch, out of utils and putting them in
the new RSiteSearch package.  These are the sorts of things that will
get revised frequently, and this way (I think) we won't have to bother
whoever takes care of utils, which is part of the regular R
distribution.

I'm adding Spencer Graves to the cc list.  Maybe he is interested in
doing this.

Jon

On 05/07/09 20:54, Romain Francois wrote:

We could have a few functions similar to RSiteSearch or gmaneSearch I
just posted and then cook a summary html page with R ...

Here is a function that grabs relevant groups from gmane:

gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
    url <- URLencode( sprintf(
        "http://dir.gmane.org/index.php?prefix=%s", prefix) )
    txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ),
        value = TRUE )

    rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'
    out <- data.frame(
        url = gsub( rx, "\\1", txt ),
        group = gsub( rx, "\\2", txt ),
        description = gsub( rx, "\\3", txt ),
        stringsAsFactors = FALSE
        )
    out$group <- sub( "...", ".*", out$group, fixed = TRUE )
    out
}

I'll clean this up and write a man page if there is interest in adding
this to R, but this might be more appropriate in a package, for example:
http://r-forge.r-project.org/projects/rsitesearch/

Romain

Liaw, Andy wrote:
  

From: Jonathan Baron
  


On 05/07/09 13:48, Liaw, Andy wrote:

  
From: Duncan Murdoch 
  


I'll incorporate the changes if you like

  
Yes.  Please do.  I understand that it won't take effect 
  

for a while.


When it does, I'll change my site.

  What do you think 

  
of the idea 
of adding a gmane (or other archive) search to your results 
page?  Then 
if someone doesn't like what the man pages show, you can 

  
send them 

  
somewhere else, rather than leaving them to find out the 
other resources 
themselves.


gmane has sample code for this on their search page 
search.gmane.org, so 
it looks reasonably easy.  I'd suggest following their 

  
last example, 

  
with a drop-down box to select mailing lists, with 
comp.lang.r.* as an 
option for all lists.


Duncan Murdoch

  
Good idea.  I will do this.  But there are also two 
  

other good search

engines.  Maybe I'll add all three search alternatives.  
  

But then,


according to Sheena Iyengar, people won't choose any!  Hmm.


  
Actually, I was thinking about a possible RHelpSearch() in 
  


addition, if

  
Jon is no longer going to include the R-help archive in the 
  


search.  I

  
used the current RSiteSearch() a lot more for searching 
  


R-help archive

  

than functions in packages.  Ideas?  comments?
  


This is OK with me, but I don't want to do it.  I guess it would
search gmane.  MarkMail is also pretty good, as is
http://tolstoy.newcastle.edu.au/R/ All these are much better than
Namazu for searching the R-help list.

  
Sorry I didn't make it clear:  I meant something like the gmaneSearch()
that Romain posted, not hitting your site.

Best,
Andy
 
  


Jon

  

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr

  

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)





Re: [Rd] proposed changes to RSiteSearch

2009-06-03 Thread Gabor Grothendieck
Having RSiteSearch.function be a strict superset of RSiteSearch might
make sense but giving them the same name seems too heavy
handed unless done via OO which seems not applicable here since
R's version is not generic and the two use the same class, character,
anyways.

On Wed, Jun 3, 2009 at 5:37 PM, spencerg spencer.gra...@prodsyse.com wrote:
 Hello All:

     What do you think of adding a function RSiteSearch to the package of
 that name, masking the RSiteSearch function in utils, trapping any call
 RSiteSearch('searchstring', 'function') to the current RSiteSearch.function
 and passing all others to utils:::RSiteSearch?  This was suggested by a
 referee to a manuscript on this new capability submitted to R Journal.
  The current version of this manuscript is available via system.file('doc',
 'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
 package installed.

     Thanks,
     Best Wishes,
     Spencer


 Liaw, Andy wrote:

  I agree!  Recall, though, I had added the RSiteSearch() functionality
 to the Rgui under Windows (Help / search.r-project.org...), so if
 RSiteSearch() is taken out, this needs to go, too.

 Best,
 Andy

 From: Jonathan Baron


 There is something to be said for taking all of these functions,
 including the original RSiteSearch, out of utils and putting them in
 the new RSiteSearch package.  These are the sorts of things that will
 get revised frequently, and this way (I think) we won't have to bother
 whoever takes care of utils, which is part of the regular R
 distribution.

 I'm adding Spencer Graves to the cc list.  Maybe he is interested in
 doing this.

 Jon

 On 05/07/09 20:54, Romain Francois wrote:


 We could have a few functions similar to RSiteSearch or

 gmaneSearch I

 just posted and then cook a summary html page with R ...

 Here is a function that grabs relevant groups from gmane:

 gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
     url <- URLencode( sprintf(
         "http://dir.gmane.org/index.php?prefix=%s", prefix) )
     txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ),
         value = TRUE )
     rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'
     out <- data.frame(
         url = gsub( rx, "\\1", txt ),
         group = gsub( rx, "\\2", txt ),
         description = gsub( rx, "\\3", txt ),
         stringsAsFactors = FALSE
         )
     out$group <- sub( "...", ".*", out$group, fixed = TRUE )
     out
 }

 I'll clean this up and write a man page if there is

 interest in adding

 this to R, but this might be more appropriate in a package,

 for example:

 http://r-forge.r-project.org/projects/rsitesearch/

 Romain

 Liaw, Andy wrote:


 From: Jonathan Baron


 On 05/07/09 13:48, Liaw, Andy wrote:


 From: Duncan Murdoch

 I'll incorporate the changes if you like


 Yes.  Please do.  I understand that it won't take effect

 for a while.


 When it does, I'll change my site.

  What do you think

 of the idea of adding a gmane (or other archive) search to your
 results page?  Then if someone doesn't like what the man pages show, 
 you can


 send them

 somewhere else, rather than leaving them to find out the other
 resources themselves.

 gmane has sample code for this on their search page
 search.gmane.org, so it looks reasonably easy.  I'd suggest following 
 their


 last example,

 with a drop-down box to select mailing lists, with comp.lang.r.* as
 an option for all lists.

 Duncan Murdoch


 Good idea.  I will do this.  But there are also two

 other good search


 engines.  Maybe I'll add all three search alternatives.

 But then,


 according to Sheena Iyengar, people won't choose any!  Hmm.



 Actually, I was thinking about a possible RHelpSearch() in


 addition, if


 Jon is no longer going to include the R-help archive in the


 search.  I


 used the current RSiteSearch() a lot more for searching


 R-help archive


 than functions in packages.  Ideas?  comments?


 This is OK with me, but I don't want to do it.  I guess it would
 search gmane.  MarkMail is also pretty good, as is
 http://tolstoy.newcastle.edu.au/R/ All these are much better than
 Namazu for searching the R-help list.


 Sorry I didn't make it clear:  I meant something like the gmaneSearch()
 that Romain posted, not hitting your site.

 Best,
 Andy


 Jon


 --
 Romain Francois
 Independent R Consultant
 +33(0) 6 28 91 30 30
 http://romainfrancois.blog.free.fr



 --
 Jonathan Baron, Professor of Psychology, University of Pennsylvania
 Home page: http://www.sas.upenn.edu/~baron
 Editor: Judgment and Decision Making (http://journal.sjdm.org)





Re: [Rd] proposed changes to RSiteSearch

2009-06-03 Thread Gabor Grothendieck
If it were an entry on the Rgui menu on Windows then at
least Windows users could get to it quickly regardless of
the name.

2009/6/4 Romain François francoisrom...@free.fr:
 One other comment was that the function name is a pain to type, which I
 believe is also true for RSiteSearch. Considering we (I) might add other
 engines, would it make sense to have a more generic name, ... something like
 web.search, doc.search, ...


 Gabor Grothendieck wrote:

 Having RSiteSearch.function be a strict superset of RSiteSearch might
 make sense but giving them the same name seems too heavy
 handed unless done via OO which seems not applicable here since
 R's version is not generic and the two use the same class, character,
 anyways.

 On Wed, Jun 3, 2009 at 5:37 PM, spencerg spencer.gra...@prodsyse.com
 wrote:


 Hello All:

    What do you think of adding a function RSiteSearch to the package of
 that name, masking the RSiteSearch function in utils, trapping any
 call
 RSiteSearch('searchstring', 'function') to the current
 RSiteSearch.function
 and passing all others to utils:::RSiteSearch?  This was suggested by a
 referee to a manuscript on this new capability submitted to R Journal.
  The current version of this manuscript is available via
 system.file('doc',
 'RSiteSearch.pdf', package='RSiteSearch') if you have the RSiteSearch
 package installed.

    Thanks,
    Best Wishes,
    Spencer


 Liaw, Andy wrote:


  I agree!  Recall, though, I had added the RSiteSearch() functionality
 to the Rgui under Windows (Help / search.r-project.org...), so if
 RSiteSearch() is taken out, this needs to go, too.

 Best,
 Andy

 From: Jonathan Baron



 There is something to be said for taking all of these functions,
 including the original RSiteSearch, out of utils and putting them in
 the new RSiteSearch package.  These are the sorts of things that will
 get revised frequently, and this way (I think) we won't have to bother
 whoever takes care of utils, which is part of the regular R
 distribution.

 I'm adding Spencer Graves to the cc list.  Maybe he is interested in
 doing this.

 Jon

 On 05/07/09 20:54, Romain Francois wrote:



 We could have a few functions similar to RSiteSearch or


 gmaneSearch I


 just posted and then cook a summary html page with R ...

 Here is a function that grabs relevant groups from gmane:

 gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
     url <- URLencode( sprintf(
         "http://dir.gmane.org/index.php?prefix=%s", prefix) )
     txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ),
         value = TRUE )
     rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'
     out <- data.frame(
         url = gsub( rx, "\\1", txt ),
         group = gsub( rx, "\\2", txt ),
         description = gsub( rx, "\\3", txt ),
         stringsAsFactors = FALSE
         )
     out$group <- sub( "...", ".*", out$group, fixed = TRUE )
     out
 }

 I'll clean this up and write a man page if there is


 interest in adding


 this to R, but this might be more appropriate in a package,


 for example:


 http://r-forge.r-project.org/projects/rsitesearch/

 Romain

 Liaw, Andy wrote:



 From: Jonathan Baron



 On 05/07/09 13:48, Liaw, Andy wrote:



 From: Duncan Murdoch


 I'll incorporate the changes if you like



 Yes.  Please do.  I understand that it won't take effect


 for a while.



 When it does, I'll change my site.

  What do you think


 of the idea of adding a gmane (or other archive) search to your
 results page?  Then if someone doesn't like what the man pages
 show, you can



 send them


 somewhere else, rather than leaving them to find out the other
 resources themselves.

 gmane has sample code for this on their search page
 search.gmane.org, so it looks reasonably easy.  I'd suggest
 following their



 last example,


 with a drop-down box to select mailing lists, with comp.lang.r.*
 as
 an option for all lists.

 Duncan Murdoch



 Good idea.  I will do this.  But there are also two


 other good search



 engines.  Maybe I'll add all three search alternatives.


 But then,



 according to Sheena Iyengar, people won't choose any!  Hmm.




 Actually, I was thinking about a possible RHelpSearch() in



 addition, if



 Jon is no longer going to include the R-help archive in the



 search.  I



 used the current RSiteSearch() a lot more for searching



 R-help archive



 than functions in packages.  Ideas?  comments?



 This is OK with me, but I don't want to do it.  I guess it would
 search gmane.  MarkMail is also pretty good, as is
 http://tolstoy.newcastle.edu.au/R/ All these are much better than
 Namazu for searching the R-help list.



 Sorry I didn't make it clear:  I meant something like the gmaneSearch()
 that Romain posted, not hitting your site.

 Best,
 Andy



 Jon



 --
 Romain Francois
 Independent R Consultant
 +33(0) 6 28 91 30 30
 http://romainfrancois.blog.free.fr




 --
 Jonathan Baron, Professor of Psychology, University of Pennsylvania
 Home page: http://www.sas.upenn.edu/~baron
 

Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Jonathan Baron
On 05/07/09 13:48, Liaw, Andy wrote:
 From: Duncan Murdoch 
  I'll incorporate the changes if you like.

Yes.  Please do.  I understand that it won't take effect for a while.
When it does, I'll change my site.

  What do you think 
  of the idea 
  of adding a gmane (or other archive) search to your results 
  page?  Then 
  if someone doesn't like what the man pages show, you can send them 
  somewhere else, rather than leaving them to find out the 
  other resources 
  themselves.
  
  gmane has sample code for this on their search page 
  search.gmane.org, so 
  it looks reasonably easy.  I'd suggest following their last example, 
  with a drop-down box to select mailing lists, with 
  comp.lang.r.* as an 
  option for all lists.
  
  Duncan Murdoch

Good idea.  I will do this.  But there are also two other good search
engines.  Maybe I'll add all three search alternatives.  But then,
according to Sheena Iyengar, people won't choose any!  Hmm.

 Actually, I was thinking about a possible RHelpSearch() in addition, if
 Jon is no longer going to include the R-help archive in the search.  I
 used the current RSiteSearch() a lot more for searching R-help archive
 than functions in packages.  Ideas?  comments?

This is OK with me, but I don't want to do it.  I guess it would
search gmane.  MarkMail is also pretty good, as is
http://tolstoy.newcastle.edu.au/R/ All these are much better than
Namazu for searching the R-help list.

Jon
 
 Andy 


Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Jonathan Baron
There is something to be said for taking all of these functions,
including the original RSiteSearch, out of utils and putting them in
the new RSiteSearch package.  These are the sorts of things that will
get revised frequently, and this way (I think) we won't have to bother
whoever takes care of utils, which is part of the regular R
distribution.

I'm adding Spencer Graves to the cc list.  Maybe he is interested in
doing this.

Jon

On 05/07/09 20:54, Romain Francois wrote:
 We could have a few functions similar to RSiteSearch or gmaneSearch I 
 just posted and then cook a summary html page with R ...
 
 Here is a function that grabs relevant groups from gmane:
 
 gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
     url <- URLencode( sprintf(
         "http://dir.gmane.org/index.php?prefix=%s", prefix) )
     txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ),
         value = TRUE )

     rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'
     out <- data.frame(
         url = gsub( rx, "\\1", txt ),
         group = gsub( rx, "\\2", txt ),
         description = gsub( rx, "\\3", txt ),
         stringsAsFactors = FALSE
         )
     out$group <- sub( "...", ".*", out$group, fixed = TRUE )
     out
 }
 
 I'll clean this up and write a man page if there is interest in adding 
 this to R, but this might be more appropriate in a package, for example: 
 http://r-forge.r-project.org/projects/rsitesearch/
 
 Romain
 
 Liaw, Andy wrote:
  From: Jonathan Baron

  On 05/07/09 13:48, Liaw, Andy wrote:
  
  From: Duncan Murdoch 

  I'll incorporate the changes if you like
  
  Yes.  Please do.  I understand that it won't take effect for a while.
  When it does, I'll change my site.
 
What do you think 
  
  of the idea 
  of adding a gmane (or other archive) search to your results 
  page?  Then 
  if someone doesn't like what the man pages show, you can 
  
  send them 
  
  somewhere else, rather than leaving them to find out the 
  other resources 
  themselves.
 
  gmane has sample code for this on their search page 
  search.gmane.org, so 
  it looks reasonably easy.  I'd suggest following their 
  
  last example, 
  
  with a drop-down box to select mailing lists, with 
  comp.lang.r.* as an 
  option for all lists.
 
  Duncan Murdoch
  
  Good idea.  I will do this.  But there are also two other good search
  engines.  Maybe I'll add all three search alternatives.  But then,
  according to Sheena Iyengar, people won't choose any!  Hmm.
 
  
  Actually, I was thinking about a possible RHelpSearch() in 

  addition, if
  
  Jon is no longer going to include the R-help archive in the 

  search.  I
  
  used the current RSiteSearch() a lot more for searching 

  R-help archive
  
  than functions in packages.  Ideas?  comments?

  This is OK with me, but I don't want to do it.  I guess it would
  search gmane.  MarkMail is also pretty good, as is
  http://tolstoy.newcastle.edu.au/R/ All these are much better than
  Namazu for searching the R-help list.
  
 
  Sorry I didn't make it clear:  I meant something like the gmaneSearch()
  that Romain posted, not hitting your site.
 
  Best,
  Andy
   

  Jon
  
 
 
 -- 
 Romain Francois
 Independent R Consultant
 +33(0) 6 28 91 30 30
 http://romainfrancois.blog.free.fr
 

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Wacek Kusnierczyk
Romain Francois wrote:

txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ), value = TRUE )
rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'
out <- data.frame(
url = gsub( rx, "\\1", txt ),
group = gsub( rx, "\\2", txt ),
description = gsub( rx, "\\3", txt ),

looking at this bit of your code, i wonder why gsub is not vectorized
for the pattern and replacement arguments, although it is for the x
argument.  the three lines above could be collapsed to just one with a
vectorized gsub:

gsubm = function(pattern, replacement, x, ...)
   mapply(USE.NAMES=FALSE, SIMPLIFY=FALSE,
   gsub, pattern=pattern, replacement=replacement, x=x, ...)

for example, given the sample data

txt = '<foo>foo</foo><bar>bar</bar>'
rx = '<(.*?)>(.*?)</(.*?)>'

the sequence

open = gsub(rx, '\\1', txt, perl=TRUE)
content = gsub(rx, '\\2', txt, perl=TRUE)
close = gsub(rx, '\\3', txt, perl=TRUE)

print(list(open, content, close))
   
could be replaced with

data = structure(names=c('open', 'content', 'close'),
gsubm(rx, paste('\\', 1:3, sep=''), txt, perl=TRUE))

print(data)

surely, a call to mapply does not improve performance, but a
source-level fix should not be too difficult;  unfortunately, i can't
find myself willing to struggle with r sources right now.
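
a self-contained check of the idea, as a sketch only: the wrapper below
follows the gsubm definition above but passes extra arguments through
MoreArgs, and the sample string and pattern are assumptions chosen for
illustration.  it compares three separate gsub calls with one
vectorized call.

```r
# Vectorised wrapper around gsub, following the gsubm idea above.
gsubm <- function(pattern, replacement, x, ...)
  mapply(gsub, pattern = pattern, replacement = replacement, x = x,
         MoreArgs = list(...), USE.NAMES = FALSE, SIMPLIFY = FALSE)

txt <- "<b>bold</b>"              # assumed sample input
rx  <- "<(.*?)>(.*?)</(.*?)>"     # opening tag, content, closing tag

# Three separate calls, one per capture group ...
separate <- list(gsub(rx, "\\1", txt, perl = TRUE),
                 gsub(rx, "\\2", txt, perl = TRUE),
                 gsub(rx, "\\3", txt, perl = TRUE))

# ... versus a single call vectorised over the replacement argument.
together <- gsubm(rx, paste0("\\", 1:3), txt, perl = TRUE)

identical(separate, together)
```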


note also that .*? does not work as a non-greedy .* with the default
regex engine, e.g.,

txt = "foo='FOO' bar='BAR'"
gsub("(.*?)='(.*?)'", '\\1', txt)
# foo='FOO' bar
gsub("(.*?)='(.*?)'", '\\2', txt)
# BAR

because the first .*? matches everything up to and exclusive of the
second, *not* the first, '='.  for a non-greedy match, you'd need pcre
(and using pcre generally improves performance anyway):

txt = "foo='FOO' bar='BAR'"
gsub("(.*?)='(.*?)'", '\\1', txt, perl=TRUE)
# foo bar
gsub("(.*?)='(.*?)'", '\\2', txt, perl=TRUE)
# FOO BAR
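
one way to sidestep the greedy/non-greedy question, sketched here as an
assumption rather than something proposed in this thread, is to avoid
lazy quantifiers and use a negated character class, which should behave
the same with either regex engine.  the variable names are
illustrative only.

```r
txt <- "foo='FOO' bar='BAR'"

# Pull out each key='value' pair; [^']* cannot run past the closing quote,
# so no lazy quantifier is needed.
pairs <- regmatches(txt, gregexpr("[a-z]+='[^']*'", txt))[[1]]

# Split each pair into its key and its quoted value.
keys   <- sub("='.*$", "", pairs)
values <- sub("^[^']*'([^']*)'$", "\\1", pairs)

keys
values
```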

vQ



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Jonathan Baron
After reading all this, I favor doing one of two things:

1. Put all the search stuff, including the proposed gmane function, in
   Spencer's new package but make it one of the default packages, like
   utils, etc., or,

2. Put everything in utils, including Spencer's new package and the
   gmane function.

I do not know enough to choose between these.

On 05/07/09 14:42, spencerg wrote:
   1.  Whatever we do with the RSiteSearch function, it should 
 still be available every time R starts.  If we put it in its own 
 package, it should still be autoloaded with base, utils, stats, etc. 

Good point.

   2.  Sundar indicated to me that, if Jonathan would like to remove 
 the search capability, it would be rather simple to move RSiteSearch to 
 nabble for the listserve archives.  The RSiteSearch function could be 
 modified to combine that with a separate search of only the help pages 
 on Jonathan's server. 

I do not understand "rather simple" at all.  For those who are
interested, I've put my notes on how to manage my site (which still
need a bit of revision, but this will give you some idea of what is
involved) in

http://finzi.psych.upenn.edu/~baron/notes.namazu.txt

The problem is that I have not found a way to automate this, so I
still spend several hours each month doing it by hand.  Too many
little glitches come up along the way, and the main problem is the
mailing lists.  Moreover, Namazu just doesn't work all that well for
mailing lists of this size, because of the page footers in each post.
(Now I remove them.  That was a bad idea.  But if we're going to get
rid of this anyway I will not take the time to figure out how to put
them back properly.)

Also, Liviu Andronic argued that vignettes should be searchable
separately from help pages.  This makes sense, but I would strongly
prefer to move ahead on other changes and leave this until later.
The need for this sort of modification is what makes me favor option
#1 at the beginning (separate package) on the theory that it would be
easier for me to make changes than if it were part of utils, but I
don't know how this works.

So, if someone can make a decision about how to proceed, I'll do what
I can, as soon as I can.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Romain Francois

Jonathan Baron wrote:
> After reading all this, I favor doing one of two things:
>
> 1. Put all the search stuff, including the proposed gmane function, in
>    Spencer's new package but make it one of the default packages, like
>    utils, etc., or,
>
> 2. Put everything in utils, including Spencer's new package and the
>    gmane function.
>
> I do not know enough to choose between these.
  
I would tend to prefer #1 so that the functionality can incubate in a 
separate package, and then when it is mature enough, we can make a call 
about what to do with it.


Something like this:
- a generic abstract function that sets up the interface to query a 
search engine.


- implementations of this, here are what I can think of:
+ jon's RSiteSearch for help pages
+ r graphical manuals
+ gmane, markmail for mail archives
+ classic help.search
+ R news (not clear how to do this right now)
+ vignettes (not clear how to do this right now)
+ JSS articles (not clear how to do this right now)
+ FAQ (not clear how to do this right now)
+ ... add your own by simply registering your implementation

The point about having some sort of central generic function is that it 
can be responsible for asking all engines and bring all results back in 
a single format.
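
To make the idea concrete, here is a rough sketch of such a central
function. It is an illustration under stated assumptions, not a proposed
API: register_engine, search_all and the two canned engines are
hypothetical names, each engine is assumed to return a data frame with
title and url columns, and nothing here queries a real site.

```r
# Hypothetical registry of search engines; all names are illustrative.
engines <- new.env()

register_engine <- function(name, fun) {
    stopifnot(is.function(fun))
    assign(name, fun, envir = engines)
    invisible(name)
}

# Ask every registered engine and return all hits in one common format.
search_all <- function(query) {
    hits <- lapply(sort(ls(engines)), function(name) {
        res <- get(name, envir = engines)(query)
        data.frame(engine = rep(name, nrow(res)),
                   title = res$title, url = res$url,
                   stringsAsFactors = FALSE)
    })
    do.call(rbind, hits)
}

# Two stand-in engines that return canned results instead of going online.
register_engine("helppages", function(query)
    data.frame(title = paste("help page about", query),
               url = "http://example.org/help",
               stringsAsFactors = FALSE))
register_engine("mailarchive", function(query)
    data.frame(title = character(0), url = character(0),
               stringsAsFactors = FALSE))

results <- search_all("logistic regression")
print(results)
```

A new source (wiki, graph gallery, ...) would then only need one more
register_engine() call, and callers of search_all() would not change.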


This somehow duplicates work I have been doing with the rsitesearch 
firefox extension, but doing it in R has several advantages.


This I think is enough design to be a separate package.

I am not sure what the requirements are for a package to be shipped with 
the distribution of R (QA, documentation, ...), but I am sure whoever 
steps up (maybe me) can make it compliant.


There is precedent for functionality that was in a package and was 
merged into utils afterwards (rcompgen), but I think it was included 
because this was necessary; I don't think these search engines __have__ to 
be in utils.



  



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Philippe Grosjean

Don't forget R wiki in the list.
Best,

Philippe
Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Belgium

  







Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Romain Francois

strapply in package gsubfn brings elegance here:

> txt <- '<foo>bar</foo>'
> rx <- "<(.*?)>(.*?)</(.*?)>"
> strapply( txt, rx, c , perl = T )
[[1]]
[1] "foo" "bar" "foo"

Too bad you have to pay for this in performance:

> txt <- rep( '<foo>bar</foo>', 1000 )
> rx <- "<(.*?)>(.*?)</(.*?)>"
> system.time( out <- strapply( txt, rx, c , perl = T ) )
   user  system elapsed
  2.923   0.005   3.063
> system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
+ gsub(rx, x, txt, perl=TRUE)
+ } ) )
   user  system elapsed
  0.011   0.000   0.011

Not sure what the right play is
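
For comparison, the capture groups can also be pulled out with base
functions only. This is just a sketch: it assumes a reasonably recent R
in which regexec() accepts perl=, and the names m and groups are
illustrative.

```r
txt <- "<foo>bar</foo>"
rx <- "<(.*?)>(.*?)</(.*?)>"

# regexec() records the position of the full match and of every capture
# group; regmatches() turns those positions into the matched substrings.
m <- regmatches(txt, regexec(rx, txt, perl = TRUE))

# Drop the first element (the full match) to keep only the groups.
groups <- lapply(m, function(x) x[-1])
print(groups[[1]])  # "foo" "bar" "foo"
```

The result is one character vector per input string, so it has the same
shape as the strapply output without a dependency outside base R.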




  



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Romain Francois
yes, and r graph gallery. those two would be easy to implement once the 
system is up.


  









--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread hadley wickham
On Fri, May 8, 2009 at 10:11 AM, Romain Francois
romain.franc...@dbmail.com wrote:
> strapply in package gsubfn brings elegance here:
>
> > txt <- '<foo>bar</foo>'
> > rx <- "<(.*?)>(.*?)</(.*?)>"
> > strapply( txt, rx, c , perl = T )
> [[1]]
> [1] "foo" "bar" "foo"
>
> Too bad you have to pay this on performance:
>
> > txt <- rep( '<foo>bar</foo>', 1000 )
> > rx <- "<(.*?)>(.*?)</(.*?)>"
> > system.time( out <- strapply( txt, rx, c , perl = T ) )
>    user  system elapsed
>   2.923   0.005   3.063
> > system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
> + gsub(rx, x, txt, perl=TRUE)
> + } ) )
>    user  system elapsed
>   0.011   0.000   0.011
>
> Not sure what the right play is

For me:

> system.time( out <- strapply( txt, rx, c , perl = T ) )
   user  system elapsed
  0.004   0.000   0.004

> system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
+ gsub(rx, x, txt, perl=TRUE)
+ } ) )
   user  system elapsed
      0       0       0

Hadley

-- 
http://had.co.nz/



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Philippe Grosjean



Romain Francois wrote:
> strapply in package gsubfn brings elegance here:

Don't! If you write functions to be used in a package that is to be
included in the base or recommended packages, then your package should
depend only on base (preferably) or on the recommended packages themselves!


So, forget about gsubfn, unless it is itself incorporated in base or utils.
Best,

Philippe




  







Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Wacek Kusnierczyk
Romain Francois wrote:
> strapply in package gsubfn brings elegance here:
>
> > txt <- '<foo>bar</foo>'
> > rx <- "<(.*?)>(.*?)</(.*?)>"
> > strapply( txt, rx, c , perl = T )
> [[1]]
> [1] "foo" "bar" "foo"


sure, but this does not, in any way, make it less strange that gsub is
not vectorized. 
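
Short of a source-level change, one base-only stopgap is Vectorize().
This is a sketch rather than a recommendation: vgsub is a hypothetical
name, and the wrapper is still an mapply loop underneath, so it brings
convenience and not speed.

```r
# Wrap gsub so that pattern and replacement are recycled against each
# other, while x and the remaining arguments are passed through unchanged.
vgsub <- Vectorize(gsub, vectorize.args = c("pattern", "replacement"),
                   USE.NAMES = FALSE)

txt <- "<foo>bar</foo>"
rx <- "<(.*?)>(.*?)</(.*?)>"

# One call returns all three capture groups.
parts <- vgsub(rx, paste0("\\", 1:3), txt, perl = TRUE)
print(parts)  # "foo" "bar" "foo"
```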


> Too bad you have to pay this on performance:
>
> > txt <- rep( '<foo>bar</foo>', 1000 )
> > rx <- "<(.*?)>(.*?)</(.*?)>"
> > system.time( out <- strapply( txt, rx, c , perl = T ) )
>    user  system elapsed
>   2.923   0.005   3.063
> > system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
> + gsub(rx, x, txt, perl=TRUE)
> + } ) )
>    user  system elapsed
>   0.011   0.000   0.011

strapply

and you know why.


vQ



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Wacek Kusnierczyk
hadley wickham wrote:
> For me:
>
> > system.time( out <- strapply( txt, rx, c , perl = T ) )
>    user  system elapsed
>   0.004   0.000   0.004
>
> > system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
> + gsub(rx, x, txt, perl=TRUE)
> + } ) )
>    user  system elapsed
>       0       0       0

for me:

txt <- '<foo>bar</foo>'
rx <- '<(.*?)>(.*?)</(.*?)>'

library(gsubfn)      # provides strapply
library(rbenchmark)  # provides benchmark
benchmark(replications=1000, columns=c('test', 'elapsed'), order='elapsed',
   sapply=sapply(paste('\\', 1:3, sep=''),
                 function(x) gsub(rx, x, txt, perl=TRUE)),
   mapply=mapply(gsub, rx, paste('\\', 1:3, sep=''), txt, perl=TRUE),
   strapply=strapply(txt, rx, c, perl=TRUE))
# 2   mapply   0.151
# 1   sapply   0.166
# 3 strapply   1.917
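
If rbenchmark is not installed, a similar comparison of the two base
variants can be sketched with system.time() alone. The repetition count
and the variable names below are arbitrary choices for illustration, and
the timings will differ from machine to machine, so none are shown.

```r
txt <- rep("<foo>bar</foo>", 1000)
rx <- "<(.*?)>(.*?)</(.*?)>"
refs <- paste0("\\", 1:3)

# Variant 1: loop over the back-references with sapply.
t_sapply <- system.time(for (i in 1:20)
    out1 <- sapply(refs, function(r) gsub(rx, r, txt, perl = TRUE)))

# Variant 2: let mapply pair the pattern with each back-reference.
t_mapply <- system.time(for (i in 1:20)
    out2 <- mapply(gsub, pattern = rx, replacement = refs,
                   MoreArgs = list(x = txt, perl = TRUE),
                   USE.NAMES = FALSE))

print(rbind(sapply = t_sapply, mapply = t_mapply)[, "elapsed"])

# Both variants should produce the same 1000 x 3 character matrix.
stopifnot(all(out1 == out2))
```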

vQ



Re: [Rd] proposed changes to RSiteSearch

2009-05-08 Thread Romain Francois

Philippe Grosjean wrote:
> Romain Francois wrote:
> > strapply in package gsubfn brings elegance here:
>
> Don't! If you write functions to be used in a package that is to be
> included in the base or recommended packages, then your package should
> depend only on base (preferably) or on the recommended packages themselves!

Definitely.

> So, forget about gsubfn, unless it is itself incorporated in base or
> utils.
>
> Best,
>
> Philippe




  









--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liaw, Andy
From: Liaw, Andy
 
> Can someone in R Core please take a look at the attached patches to
> RSiteSearch() and its help page?  I guess Jon is planning some changes
> on his site.

Apparently the attachments were stripped off the first time.  Here's a
second try.  

I've already set format to plain text in Outlook, even in that first
post.  If this still doesn't work, can someone explain to me what I
have to do in Outlook to get the attachment through?

Best,
Andy
Notice:  This e-mail message, together with any attachments, contains
information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
New Jersey, USA 08889), and/or its affiliates (which may be known
outside the United States as Merck Frosst, Merck Sharp & Dohme or
MSD and in Japan, as Banyu - direct contact information for affiliates is
available at http://www.merck.com/contact/contacts.html) that may be
confidential, proprietary copyrighted and/or legally privileged. It is
intended solely for the use of the individual or entity named on this
message. If you are not the intended recipient, and have received this
message in error, please notify us immediately by reply e-mail and
then delete it from your system.


Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Jonathan Baron
On 05/07/09 10:05, Liaw, Andy wrote:
> Can someone in R Core please take a look at the attached patches to
> RSiteSearch() and its help page?  I guess Jon is planning some changes
> on his site.  Jon:  could you elaborate on what the patch does?

The idea is simply to remove the mail archives, so the search will be
only of functions' help pages.  Eventually I will also add package
vignettes, but I don't think we need anything special for that.  I
can't imagine that someone would want to search just vignettes and not
help pages, or the reverse.

The reasons are: 1. The mail archives are becoming increasingly
difficult and time consuming for me to maintain.  2. There are now
three other ways of searching mail archives, all of which seem much
better than mine, but there seem to be no other good ways to search
help pages for functions, and, indeed, the new RSiteSearch package
does only functions.  3. With only functions it would be much easier
for someone to set up a complete mirror of my site, which seems like a
good idea.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liaw, Andy

OK, as suggested by Bill Dunlap and Spencer Graves, I've renamed .diff
to .diff.txt.  Hopefully the third time is the charm...

Apologies for the wasted bandwidth.
--- RSiteSearch.Rd  2009-04-18 03:28:08.0 -0400
+++ /home/liawand/RSiteSearch.Rd  2009-05-07 09:54:02.0 -0400
@@ -6,17 +6,15 @@
 \name{RSiteSearch}
 \alias{RSiteSearch}
 \title{
-  Search for Key Words or Phrases in the R-help Mailing List Archives
-  or Documentation
+  Search for Key Words or Phrases in the function help pages
 }
 \description{
-  Search for key words or phrases in the R-help mailing list
-  archives, or \R manuals and help pages, using the search engine at
-  \url{http://search.r-project.org} and view them in a web browser.
+  Search for key words or phrases in the function help pages, using the
+  search engine at \url{http://search.r-project.org} and view them in a
+  web browser.
 }
 \usage{
 RSiteSearch(string,
-            restrict = c("Rhelp02a", "functions", "docs"),
             format = c("normal", "short"),
             sortby = c("score", "date:late", "date:early",
               "subject", "subject:descending",
@@ -27,14 +25,6 @@
 \arguments{
   \item{string}{word(s) or phrase to search.  If the words are to be
 searched as one entity, enclose all words in braces (see example).}
-  \item{restrict}{a character vector, typically of length larger than one:
-What areas to search in:
-\code{Rhelp02a} for R-help mailing list archive since 2002,
-\code{Rhelp01} for mailing list archive before 2002,
-\code{docs} for R manuals,
-\code{functions} for help pages.
-\code{R-devel} for R-devel mailing list.
-Use \code{c()} to specify more than one.}
   \item{format}{\code{normal} or \code{short} (no excerpts); can be
 abbreviated.}
   \item{sortby}{character string (can be abbreviated) indicating how to
@@ -60,6 +50,11 @@
 
   Unique partial matches will work for all arguments.  Each new
   browser window will stay open unless you close it.
+
+  Mailing lists may be searched at several other sites, including
+  \url{http://tolstoy.newcastle.edu.au/R/}, and
+  \url{http://markmail.org/search/list:r-project}.  See
+  \url{http://search.r-project.org} for a full list.
 }
 \author{Andy Liaw and Jonathan Baron}
 \seealso{
@@ -70,15 +65,8 @@
 \examples{\donttest{ # need Internet connection
 RSiteSearch("{logistic regression}") # matches exact phrase
 Sys.sleep(5) # allow browser to open, take a quick look
-RSiteSearch("Baron Liaw", restrict = "Rhelp02a")
-## Search in R-devel archive and documents  (and store the query-string):
-Sys.sleep(5)
-fullquery <- RSiteSearch("S4", restrict = c("R-dev", "docs"))
+fullquery <- RSiteSearch("S4", sortby = "date:late")
 fullquery # a string of ~ 116 characters
-## the latest purported bug reports, responses ...
-%% FIXME: /bug/ and other reg.exp.s seem to fail
-Sys.sleep(5)
-RSiteSearch("bug", restrict = "R-devel", sortby = "date:late")
 }}
 \keyword{utilities}
 \keyword{documentation}
--- RSiteSearch.R   2009-04-18 03:28:06.0 -0400
+++ /home/liawand/RSiteSearch.R 2009-05-07 09:53:59.0 -0400
@@ -14,8 +14,7 @@
 #  A copy of the GNU General Public License is available at
 #  http://www.r-project.org/Licenses/
 
-RSiteSearch <- function(string, restrict = c("Rhelp02a", "functions", "docs"),
-   format = c("normal", "short"),
+RSiteSearch <- function(string, format = c("normal", "short"),
 sortby = c("score", "date:late", "date:early",
 "subject", "subject:descending",
 "from", "from:descending", "size", "size:descending"),
@@ -27,10 +26,6 @@
 mpp <- paste0("max=", matchesPerPage)
 format <- paste0("result=", match.arg(format))
 
-restrictVALS <- c("Rhelp02a", "Rhelp01", "functions", "docs", "R-devel")
-

Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Duncan Murdoch

On 5/7/2009 10:18 AM, Jonathan Baron wrote:
> On 05/07/09 10:05, Liaw, Andy wrote:
> > Can someone in R Core please take a look at the attached patches to
> > RSiteSearch() and its help page?  I guess Jon is planning some changes
> > on his site.  Jon:  could you elaborate on what the patch does?
>
> The idea is simply to remove the mail archives, so the search will be
> only of functions' help pages.  Eventually I will also add package
> vignettes, but I don't think we need anything special for that.  I
> can't imagine that someone would want to search just vignettes and not
> help pages, or the reverse.
>
> The reasons are: 1. The mail archives are becoming increasingly
> difficult and time consuming for me to maintain.  2. There are now
> three other ways of searching mail archives, all of which seem much
> better than mine, but there seem to be no other good ways to search
> help pages for functions, and, indeed, the new RSiteSearch package
> does only functions.  3. With only functions it would be much easier
> for someone to set up a complete mirror of my site, which seems like a
> good idea.

I'll incorporate the changes if you like.  What do you think of the idea
of adding a gmane (or other archive) search to your results page?  Then
if someone doesn't like what the man pages show, you can send them
somewhere else, rather than leaving them to find out the other resources
themselves.

gmane has sample code for this on their search page search.gmane.org, so
it looks reasonably easy.  I'd suggest following their last example,
with a drop-down box to select mailing lists, with comp.lang.r.* as an
option for all lists.

Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liaw, Andy

Actually, I was thinking about a possible RHelpSearch() in addition, if
Jon is no longer going to include the R-help archive in the search.  I
used the current RSiteSearch() a lot more for searching R-help archive
than functions in packages.  Ideas?  comments?

Andy 


Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Romain Francois

About this:

gmaneSearch <- function( string,
   group = "gmane.comp.lang.r.*", author = "",
   sort = c("relevance", "date", "revdate"),
   op = c("and", "or") ){

   sort <- match.arg(sort)
   op <- match.arg( op )

   url <- sprintf(
      'http://search.gmane.org/?query=%s&author=%s&group=%s&sort=%s&DEFAULTOP=%s',
      gsub( ' +', '+', string),  author,  group,  sort, op )
   url <- URLencode( url )
   browseURL( url )
}
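
A possible variant, as a minimal sketch only: the helper name
gmaneSearchURL() is hypothetical, and the query parameter names are simply
the ones used in the function above. It returns the URL instead of opening
a browser, and it encodes each value separately so that a reserved
character inside a value (such as "&") cannot break the query string.

```r
# Hypothetical helper (not part of utils): build the gmane search URL
# as a string; parameter names are taken from gmaneSearch() above.
gmaneSearchURL <- function(string,
                           group = "gmane.comp.lang.r.*",
                           author = "",
                           sort = c("relevance", "date", "revdate"),
                           op = c("and", "or")) {
  sort <- match.arg(sort)
  op <- match.arg(op)
  # Percent-encode every value on its own, including reserved characters.
  enc <- function(x) utils::URLencode(x, reserved = TRUE)
  params <- c(query = string, author = author, group = group,
              sort = sort, DEFAULTOP = op)
  paste0("http://search.gmane.org/?",
         paste0(names(params), "=", vapply(params, enc, ""),
                collapse = "&"))
}

url <- gmaneSearchURL("logistic regression", sort = "date")
print(url)
# To view the results: browseURL(url)
```

Keeping URL construction apart from browseURL() also makes the function
easy to check without an Internet connection.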



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liaw, Andy
From: Jonathan Baron
> On 05/07/09 13:48, Liaw, Andy wrote:
> > From: Duncan Murdoch
> > > I'll incorporate the changes if you like.
>
> Yes.  Please do.  I understand that it won't take effect for a while.
> When it does, I'll change my site.
>
> > > What do you think of the idea of adding a gmane (or other archive)
> > > search to your results page?  Then if someone doesn't like what the
> > > man pages show, you can send them somewhere else, rather than
> > > leaving them to find out the other resources themselves.
> > >
> > > gmane has sample code for this on their search page
> > > search.gmane.org, so it looks reasonably easy.  I'd suggest
> > > following their last example, with a drop-down box to select
> > > mailing lists, with comp.lang.r.* as an option for all lists.
> > >
> > > Duncan Murdoch
>
> Good idea.  I will do this.  But there are also two other good search
> engines.  Maybe I'll add all three search alternatives.  But then,
> according to Sheena Iyengar, people won't choose any!  Hmm.
>
> > Actually, I was thinking about a possible RHelpSearch() in addition,
> > if Jon is no longer going to include the R-help archive in the
> > search.  I used the current RSiteSearch() a lot more for searching
> > R-help archive than functions in packages.  Ideas?  comments?
>
> This is OK with me, but I don't want to do it.  I guess it would
> search gmane.  MarkMail is also pretty good, as is
> http://tolstoy.newcastle.edu.au/R/.  All these are much better than
> Namazu for searching the R-help list.

Sorry I didn't make it clear:  I meant something like the gmaneSearch()
that Romain posted, not hitting your site.

Best,
Andy

> Jon


Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Romain Francois
We could have a few functions similar to RSiteSearch or gmaneSearch I 
just posted and then cook a summary html page with R ...


Here is a function that grabs relevant groups from gmane:

gmaneGroups <- function( prefix = "gmane.comp.lang.r." ){
   url <- URLencode( sprintf(
      "http://dir.gmane.org/index.php?prefix=%s", prefix) )
   txt <- grep( '^<tr.*<td align=right.*<a', readLines( url ), value = TRUE )

   rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'

   out <- data.frame(
      url = gsub( rx, "\\1", txt ),
      group = gsub( rx, "\\2", txt ),
      description = gsub( rx, "\\3", txt ),
      stringsAsFactors = FALSE
   )
   out$group <- sub( "...", ".*", out$group, fixed = TRUE )
   out
}
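
To see what the back-reference extraction does without hitting the
network, here is a small sketch run on made-up lines. The sample HTML and
the example.org URLs are invented for illustration (the real page layout
may differ), and perl = TRUE is an added assumption to make the lazy
quantifiers explicit.

```r
# Invented sample lines standing in for readLines(url).
txt <- c(
  '<tr><td align=right>1</td><td><a href="http://example.org/g/r.general">gmane.comp.lang.r.general</a></td><td>General R discussion</td></tr>',
  '<p>not a table row</p>',
  '<tr><td align=right>2</td><td><a href="http://example.org/g/r.devel">gmane.comp.lang.r.devel</a></td><td>R development</td></tr>'
)

# Keep only the table rows that contain a link, as in gmaneGroups().
rows <- grep('^<tr.*<td align=right.*<a', txt, value = TRUE)

# One pattern, three capture groups: link target, link text, description.
rx <- '^.*?<a href="(.*?)">(.*?)</a>.*<td>(.*?)</td>.*$'

groups <- data.frame(
  url         = gsub(rx, "\\1", rows, perl = TRUE),
  group       = gsub(rx, "\\2", rows, perl = TRUE),
  description = gsub(rx, "\\3", rows, perl = TRUE),
  stringsAsFactors = FALSE
)
print(groups)
```

Because the pattern is anchored with ^ and $, each gsub() call replaces
the whole line with a single captured field.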

I'll clean this up and write a man page if there is interest in adding 
this to R, but this might be more appropriate in a package, for example: 
http://r-forge.r-project.org/projects/rsitesearch/


Romain



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liaw, Andy

I agree!  Recall, though, I had added the RSiteSearch() functionality
to the Rgui under Windows (Help / search.r-project.org...), so if
RSiteSearch() is taken out, this needs to go, too.

Best,
Andy

From: Jonathan Baron
> There is something to be said for taking all of these functions,
> including the original RSiteSearch, out of utils and putting them in
> the new RSiteSearch package.  These are the sorts of things that will
> get revised frequently, and this way (I think) we won't have to bother
> whoever takes care of utils, which is part of the regular R
> distribution.
>
> I'm adding Spencer Graves to the cc list.  Maybe he is interested in
> doing this.
>
> Jon
 
> -- 
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
> Editor: Judgment and Decision Making (http://journal.sjdm.org)
 


Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread Liviu Andronic
Dear Jonathan,

On Thu, May 7, 2009 at 4:18 PM, Jonathan Baron ba...@psych.upenn.edu wrote:
> can't imagine that someone would want to search just vignettes and not
> help pages, or the reverse.

Searching only vignettes can be of interest to users. If someone is
interested in (full-fledged) code examples, and not in various
descriptions of functions, a vignette search facility would come in
handy.
As a personal example, recently I wanted to search all vignettes for
mle examples, but could find no way to do this. I had already
searched the help pages and was unable to find something of obvious
use to me.
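
For vignettes of packages that are already installed, something along
these lines might be a starting point. This is only a sketch: the name
findVignettes() is hypothetical, it matches vignette names and titles
rather than their full text, and it assumes the index has the columns
found in vignette(all = TRUE)$results. The small index used in the demo
is made up.

```r
# Hypothetical helper: filter a vignette index by keyword.
# `index` is expected to look like vignette(all = TRUE)$results, i.e. a
# character matrix with columns "Package", "LibPath", "Item", "Title".
findVignettes <- function(pattern,
                          index = utils::vignette(all = TRUE)$results) {
  hits <- grepl(pattern, index[, "Title"], ignore.case = TRUE) |
          grepl(pattern, index[, "Item"], ignore.case = TRUE)
  as.data.frame(index[hits, c("Package", "Item", "Title"), drop = FALSE],
                stringsAsFactors = FALSE)
}

# Made-up index so the example runs without any particular package.
fake <- matrix(
  c("pkgA", "/lib", "mle-intro", "Maximum likelihood (MLE) by example",
    "pkgB", "/lib", "plots",     "Drawing plots"),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("Package", "LibPath", "Item", "Title"))
)
res <- findVignettes("mle", index = fake)
print(res)
# On a real installation: findVignettes("mle")
```

This does not cover vignettes of packages that are not installed, which
is where a site-wide search would still be needed.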

Best regards,
Liviu



Re: [Rd] proposed changes to RSiteSearch

2009-05-07 Thread spencerg
 1.  Whatever we do with the RSiteSearch function, it should 
still be available every time R starts.  If we put it in its own 
package, it should still be autoloaded with base, utils, stats, etc. 



 2.  Sundar indicated to me that, if Jonathan would like to remove 
the search capability, it would be rather simple to move RSiteSearch to 
nabble for the listserve archives.  The RSiteSearch function could be 
modified to combine that with a separate search of only the help pages 
on Jonathan's server. 



 3.  However, I can't volunteer to do much more on this at least 
until late June and probably not before late August.  If you wanted to 
move the RSiteSearch function to the RSiteSearch package on R-Forge, 
Romain, Sundar and I would be happy to have other developers and let 
them implement the group consensus. 
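
As a rough sketch of what point 2 could look like, one function could
build a search URL per site from a set of templates. Everything here is
an assumption for illustration: the name searchURLs() is hypothetical,
and the two templates are placeholders, not the documented query formats
of those sites.

```r
# Hypothetical sketch: one encoded query, one URL per search site.
searchURLs <- function(string, templates) {
  q <- utils::URLencode(string, reserved = TRUE)
  # Each template contains a single %s where the query goes.
  vapply(templates, function(tpl) sprintf(tpl, q), character(1))
}

# Placeholder templates (assumed formats, to be replaced by real ones).
templates <- c(
  helpPages = "http://search.r-project.org/?query=%s",
  lists     = "http://markmail.org/search/list:r-project+%s"
)

urls <- searchURLs("mixed models", templates)
print(urls)
# To open every result page: for (u in urls) browseURL(u)
```

Adding or dropping a site then only means editing the template vector,
which fits the idea of keeping such code in a package that can be revised
often.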



 Best Wishes,
 Spencer

Gabor Grothendieck wrote:
> But help really needs to be delivered with R, not an addon.
> It should not be necessary to know how to install packages
> just to get this level of help. I think it needs to be where it
> is now.
>
> On Thu, May 7, 2009 at 4:02 PM, Liaw, Andy andy_l...@merck.com wrote:
> > I agree!  Recall, though, I had added the RSiteSearch() functionality
> > to the Rgui under Windows (Help / search.r-project.org...), so if
> > RSiteSearch() is taken out, this needs to go, too.
> >
> > Best,
> > Andy
> >
> > From: Jonathan Baron
> > > There is something to be said for taking all of these functions,
> > > including the original RSiteSearch, out of utils and putting them in
> > > the new RSiteSearch package.  These are the sorts of things that will
> > > get revised frequently, and this way (I think) we won't have to
> > > bother whoever takes care of utils, which is part of the regular R
> > > distribution.
> > >
> > > I'm adding Spencer Graves to the cc list.  Maybe he is interested in
> > > doing this.
> > >
> > > Jon


--
Jonathan Baron, Professor of