[Bioc-devel] writeVcf performance
Hi guys (Val, Martin, Herve): Anyone have an itch for optimization? The writeVcf function is currently a bottleneck in our WGS genotyping pipeline. For a typical 50 million row gVCF, it was taking 2.25 hours prior to yesterday's improvements (pasteCollapseRows) that brought it down to about 1 hour, which is still too long by my standards ( 0). Only takes 3 minutes to call the genotypes (and associated likelihoods etc) from the variant calls (using 80 cores and 450 GB RAM on one node), so the output is an issue. Profiling suggests that the running time scales non-linearly in the number of rows. Digging a little deeper, it seems to be something with R's string/memory allocation. Below, pasting 1 million strings takes 6 seconds, but 10 million strings takes over 2 minutes. It gets way worse with 50 million. I suspect it has something to do with R's string hash table. set.seed(1000) end - sample(1e8, 1e6) system.time(paste0(END, =, end)) user system elapsed 6.396 0.028 6.420 end - sample(1e8, 1e7) system.time(paste0(END, =, end)) user system elapsed 134.714 0.352 134.978 Indeed, even this takes a long time (in a fresh session): set.seed(1000) end - sample(1e8, 1e6) end - sample(1e8, 1e7) system.time(as.character(end)) user system elapsed 57.224 0.156 57.366 But running it a second time is faster (about what one would expect?): system.time(levels - as.character(end)) user system elapsed 23.582 0.021 23.589 I did some simple profiling of R to find that the resizing of the string hash table is not a significant component of the time. So maybe something to do with the R heap/gc? No time right now to go deeper. But I know Martin likes this sort of thing ;) Michael [[alternative HTML version deleted]] ___ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Re: [Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq
Hi Valerie, I got really busy around May and never got a chance to thank you for adding this option to summarizeOverlaps! So thank you! -Ryan On Thu 01 May 2014 04:25:33 PM PDT, Valerie Obenchain wrote: GenomicAlignments 1.1.8 has a 'preprocess.reads' argument. This should be a function where the first argument is 'reads' and the return value is compatible with 'reads' in the pre-defined count modes. I've used your ResizeReads() as an example in the man page. I think the ability to pre-filter and used a pre-defined mode will be useful. Thanks for the suggestion. Valerie On 05/01/2014 02:29 PM, Valerie Obenchain wrote: On 05/01/2014 02:05 PM, Ryan wrote: Hi Valerie, On Thu May 1 13:27:16 2014, Valerie Obenchain wrote: I have some concerns about the *ExtraArgs() functions. Passing flexible args to findOverlaps in the existing mode functions fundamentally changes the documented behavior. The modes were created to capture specific overlap situations pertaining to gene features which are graphically depicted in the vignette. Changing 'maxgap' or 'minoverlap' will produce a variety of results inconsistent with past behavior and difficult to document (e.g., under what circumstances will IntersectionNotEmpty now register a hit). Well, I wasn't so sure about those functions either. Obviously you can pass arguments that break things. They were mostly designed to be constructors for specific counting modes involving the minoverlap/maxgap arguments, but I decided I didn't need those modes after all. They're certainly not designed to be exposed to the user. I haven't carefully considered the interaction between the counting mode and maxgap/minoverlap, but I believe that it would be roughly equivalent to extending/shrinking the features/reads by the specified amount (with some differences for e.g. a feature/read smaller than 2*minoverlap). For example, with a read length of 100 and a minoverlap of 10 in Union counting mode, this would be the same as truncating the first and last 10 (or mabe 9?) bases and operating in normal Union mode. As I said, though, there may be edge cases that I haven't thought of where unexpected things happen. I agree that controlling the overlap args is appealing and I like the added ability to resize. I've created a 'chipseq' mode that combines these ideas and gives what ResizeReads() was doing but now in 'mode' form. If this implementation gives you the flexibility you were looking for I'll check it into devel. This sounds nice, but if I use the 'chipseq' mode, how do I specify whether I want Union, IntersectionNotEmpty, or IntersectionStrict? It looks like it just does Union? IntersectionStrict would be useful for specifying that the read has to occur entirely within the bounds of a called peak, for example. This is why I implemented it as a wrapper that takes another mode as an argument, so that the resizing logic and the counting logic were independent. 'chipseq' didn't implement the standard modes b/c I wanted to avoid the clash of passing unconventional findOverlaps() args to those modes. The assumption was that if the user specified a mode they would expect a certain behavior ... Maybe summarizeOverlaps could accept an optional read modification function, and if this is provided, it will pass the reads through this before passing them to the counting function. The read modification function would have to take any valid reads argument and return another valid reads argument. It could be used for modifying the reads as well as filtering them. This would allow resizing without the awkward nesting method that I've used. Good idea. Maybe a 'preprocess' or 'prefilter' arg to allow massaging before counting. I'll post back when it's done. Valerie A couple of questions: - Do you want to handle paired-end reads? You coerce to a GRanges to resize but don't coerce back. For paired end reads, there is no need to estimate the fragment length, because the pair gives you both ends of the fragment. So if I had paired-end ChIP-Seq data, I would use it as is with no resizing. I can't personally think of a reason to resize a paired-end fragment, but I don't know if others might need that. I corece to GRanges because I know how GRanges work, but I'm not as familiar with GAlignments so I don't know how the resize function works on GAlignments and other classes. I'm sure you know better than I do how these work. If the coercion is superfluous, feel free to eliminate it. - Do you want to require strand info for all reads? Is this because of how resize() anchors * to 'start'? Yes, I require strand info for all reads because the reads must be directionally extended, which requires strand info. Ditto for counting the 5-prime and 3-prime ends. -Ryan chipseq - function(features, reads, ignore.strand=FALSE, inter.feature=TRUE, type=any, maxgap=0L, minoverlap=1L, width=NULL, fix=start, use.names=TRUE) { reads - as(reads, GRanges) if (any(strand(reads) == *)) stop(all reads
Re: [Bioc-devel] Additional summarizeOverlaps counting modes for ChIP-Seq
Hi again, I'm looking at the examples in the summarizeOverlaps help page here: http://www.bioconductor.org/packages/devel/bioc/manuals/GenomicAlignments/man/GenomicAlignments.pdf And the examples fod preprocess.reads are a little confusing. One of the examples passes some additional ... options to summarizeOverlaps, and implies that these will be passed along as the ... arguments to the proprocess.reads function: summarizeOverlaps(grl, reads, mode=Union, preprocess.reads=ResizeReads, width=1, fix=end) The width and fix arguments are implied to be passed through to ResizeReads, but I don't see it documented anywhere how this would be done. The summarizeOverlaps documentation for ... says Additional arguments for BAM file methods such as singleEnd, fragments or param that apply to the reading of records from a file (see below). I don't see anything about passing through to preprocess.reads. Incidentally, this is why my original example used a function that constructed a closure with these arguments already bound. I would write the example like this to ensure no ambiguity in argument passing (pay attention to parens): ResizeReads - function(mode, ...) { resize.args - list(...) function(reads) { reads - as(reads, GRanges) ## Need strandedness stopifnot(all(strand(reads) != *)) do.call(resize, c(list(x=reads), resize.args)) } } ## By default ResizeReads() counts reads that overlap on the 5 end: summarizeOverlaps(grl, reads, mode=Union, preprocess.reads=ResizeReads()) ## Count reads that overlap on the 3 end by passing new values ## for width and fix: summarizeOverlaps(grl, reads, mode=Union, preprocess.reads=ResizeReads(width=1, fix=end)) Anyway, I don't have the devel version of R handy to test this out, so I don't know if what I've described is a problem in practice. But I think that either the preprocess.reads function should be required to only take one argument, or else the method of passing through additional arguments to it should be documented. -Ryan On Tue 05 Aug 2014 05:12:41 PM PDT, Ryan C. Thompson wrote: Hi Valerie, I got really busy around May and never got a chance to thank you for adding this option to summarizeOverlaps! So thank you! -Ryan On Thu 01 May 2014 04:25:33 PM PDT, Valerie Obenchain wrote: GenomicAlignments 1.1.8 has a 'preprocess.reads' argument. This should be a function where the first argument is 'reads' and the return value is compatible with 'reads' in the pre-defined count modes. I've used your ResizeReads() as an example in the man page. I think the ability to pre-filter and used a pre-defined mode will be useful. Thanks for the suggestion. Valerie On 05/01/2014 02:29 PM, Valerie Obenchain wrote: On 05/01/2014 02:05 PM, Ryan wrote: Hi Valerie, On Thu May 1 13:27:16 2014, Valerie Obenchain wrote: I have some concerns about the *ExtraArgs() functions. Passing flexible args to findOverlaps in the existing mode functions fundamentally changes the documented behavior. The modes were created to capture specific overlap situations pertaining to gene features which are graphically depicted in the vignette. Changing 'maxgap' or 'minoverlap' will produce a variety of results inconsistent with past behavior and difficult to document (e.g., under what circumstances will IntersectionNotEmpty now register a hit). Well, I wasn't so sure about those functions either. Obviously you can pass arguments that break things. They were mostly designed to be constructors for specific counting modes involving the minoverlap/maxgap arguments, but I decided I didn't need those modes after all. They're certainly not designed to be exposed to the user. I haven't carefully considered the interaction between the counting mode and maxgap/minoverlap, but I believe that it would be roughly equivalent to extending/shrinking the features/reads by the specified amount (with some differences for e.g. a feature/read smaller than 2*minoverlap). For example, with a read length of 100 and a minoverlap of 10 in Union counting mode, this would be the same as truncating the first and last 10 (or mabe 9?) bases and operating in normal Union mode. As I said, though, there may be edge cases that I haven't thought of where unexpected things happen. I agree that controlling the overlap args is appealing and I like the added ability to resize. I've created a 'chipseq' mode that combines these ideas and gives what ResizeReads() was doing but now in 'mode' form. If this implementation gives you the flexibility you were looking for I'll check it into devel. This sounds nice, but if I use the 'chipseq' mode, how do I specify whether I want Union, IntersectionNotEmpty, or IntersectionStrict? It looks like it just does Union? IntersectionStrict would be useful for specifying that the read has to occur entirely within the bounds of a called peak, for example. This is why I implemented it
[Rd] Minor documentation change for assign
Good afternoon, Today I was reading the documentation , | help('-'); ` . The passage (I added the asterisks) The operators '' are normally only used in functions, and cause a search *to* made through parent environments for an existing definition of the variable being assigned. If such a looks like it might have wanted to have been phrased as The operators '' are normally only used in functions, and cause a search *to be* made through parent environments for an existing definition of the variable being assigned. If such a . Wanted to share that with the devs, and if that is a wanted change do people usually submit patches or what is the preferred mechanism? , | sessionInfo() ` , | R version 3.1.1 (2014-07-10) | Platform: x86_64-apple-darwin13.2.0 (64-bit) | | locale: | [1] en_US/en_US/en_US/C/en_US/en_US | | attached base packages: | [1] stats graphics grDevices utils datasets methods base | | loaded via a namespace (and not attached): | [1] compiler_3.1.1 tools_3.1.1 ` Kind regards, Grant Rettke | ACM, ASA, FSF, IEEE, SIAM g...@wisdomandwonder.com | http://www.wisdomandwonder.com/ “Wisdom begins in wonder.” --Socrates ((λ (x) (x x)) (λ (x) (x x))) “Life has become immeasurably better since I have been forced to stop taking it seriously.” --Thompson __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Is it a good idea or even possible to redefine attach?
Hi, Today I got curious about whether or not I /could/ remove `attach' from my system so: - Backed it up - Implemented a new one - Like this , | attach.old - attach | attach - function(...) {stop(NEVER USE ATTACH)} ` I got the error: , | Error: cannot change value of locked binding for 'attach' ` If I unlock `attach' I assume that I could stomp on it... however is that even a good idea? What will I break? My goal was never to allow `attach' in my system, but it was just an idea. Kind regards, Grant Rettke | ACM, ASA, FSF, IEEE, SIAM g...@wisdomandwonder.com | http://www.wisdomandwonder.com/ “Wisdom begins in wonder.” --Socrates ((λ (x) (x x)) (λ (x) (x x))) “Life has become immeasurably better since I have been forced to stop taking it seriously.” --Thompson __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it a good idea or even possible to redefine attach?
On Tue, Aug 5, 2014 at 2:49 PM, Grant Rettke g...@wisdomandwonder.com wrote: Hi, Today I got curious about whether or not I /could/ remove `attach' from my system so: - Backed it up - Implemented a new one - Like this , | attach.old - attach | attach - function(...) {stop(NEVER USE ATTACH)} ` Just masking it with your own function, e.g., attach - function(...) {stop(NEVER USE ATTACH)} should be enough to discourage you from using it. I got the error: , | Error: cannot change value of locked binding for 'attach' ` If I unlock `attach' I assume that I could stomp on it... however is that even a good idea? What will I break? Anything that uses attach. My goal was never to allow `attach' in my system, but it was just an idea. Kind regards, Grant Rettke | ACM, ASA, FSF, IEEE, SIAM g...@wisdomandwonder.com | http://www.wisdomandwonder.com/ “Wisdom begins in wonder.” --Socrates ((λ (x) (x x)) (λ (x) (x x))) “Life has become immeasurably better since I have been forced to stop taking it seriously.” --Thompson __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it a good idea or even possible to redefine attach?
On Tue, Aug 5, 2014 at 1:49 PM, Grant Rettke g...@wisdomandwonder.com wrote: Hi, Today I got curious about whether or not I /could/ remove `attach' from my system so: - Backed it up - Implemented a new one - Like this , | attach.old - attach | attach - function(...) {stop(NEVER USE ATTACH)} ` I got the error: , | Error: cannot change value of locked binding for 'attach' ` If I unlock `attach' I assume that I could stomp on it... however is that even a good idea? What will I break? If you change the base package environment's copy of `attach` (via `as.environment('package:base')`) , probably not much will break, except for your own code. If, on the other hand, you change the base namespace's copy of `attach` (via `asNamespace('base')`, any package that's subsequently loaded and uses `attach` would run into problems. Still, either way is probably not a good idea. I agree with Ista: assigning it yourself in the the global environment is a better idea. If you want to keep your global env clear of stuff like this, you can put it in an environment and attach that environment as a parent of global: e - new.env() e$attach - function(...) {stop(NEVER USE ATTACH)} base::attach(e, name = my_stuff, warn.conflicts = FALSE) # Test it out: attach() # You can see that the my_stuff env is the parent of global env search() -Winston __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it a good idea or even possible to redefine attach?
That is delightful. When I run it like this: • Start R • Nothing in .Rprofile • Paste in your code ╭ │ gcrenv - new.env() │ gcrenv$attach.old - attach │ gcrenv$attach - function(...){stop(NEVER USE ATTACH)} │ base::attach(gcrenv, name=gcr, warn.conflicts = FALSE) ╰ • I get exactly what is expected, I think ╭ │ search() ╰ ╭ │ [1] .GlobalEnvgcr ESSR │ [4] package:stats package:graphics package:grDevices │ [7] package:utils package:datasets package:methods │ [10] Autoloads package:base ╰ Just to be sure: • Is that what is expected? • I am surprised because I thought that `gcr' would come first before `.GlobalEnv' • Perhaps I mis understand, as `.GlobalEnv' is actually the REPL? My goal is to move that to my .Rprofile so that it is always run and I can forget about it more or less. Reading [this] I felt like `.First' would be the right place to put it, but then read further to find that packages are only loaded /after/ `.First' has completed. Curious, I tried it just to be sure. I am now :). This is the .Rprofile file: ╭ │ cat(.Rprofile: Setting CMU repository\n) │ r = getOption(repos) │ r[CRAN] = http://lib.stat.cmu.edu/R/CRAN/; │ options(repos = r) │ rm(r) │ │ .First - function() { │«same code as above» │ } ╰ (I included the repository load, and understand it should not impact things here) This is run after normal startup of R: ╭ │ search() ╰ ╭ │ [1] .GlobalEnvpackage:stats package:graphics │ [4] package:grDevices package:utils package:datasets │ [7] gcr package:methods Autoloads │ [10] package:base ╰ When I read this, I read it as: • My rebind of `attach' occurs • Then all of the packages are loaded and they are referring to my-rebound `attach' • That is a problem because it *will* break package code • Clearly me putting that code in `.Rprofile' is the wrong place. What I am looking for now is the right way to achieve what is demonstarted but automatically via a startup file. My ideas: • Manually paste the step each time • Always use Emacs and ESS to run R and add a hook so that the code will be run after ESS loads • Something I am missing [this] http://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html Grant Rettke | ACM, ASA, FSF, IEEE, SIAM g...@wisdomandwonder.com | http://www.wisdomandwonder.com/ “Wisdom begins in wonder.” --Socrates ((λ (x) (x x)) (λ (x) (x x))) “Life has become immeasurably better since I have been forced to stop taking it seriously.” --Thompson On Tue, Aug 5, 2014 at 2:47 PM, Winston Chang winstoncha...@gmail.com wrote: On Tue, Aug 5, 2014 at 1:49 PM, Grant Rettke g...@wisdomandwonder.com wrote: Hi, Today I got curious about whether or not I /could/ remove `attach' from my system so: - Backed it up - Implemented a new one - Like this , | attach.old - attach | attach - function(...) {stop(NEVER USE ATTACH)} ` I got the error: , | Error: cannot change value of locked binding for 'attach' ` If I unlock `attach' I assume that I could stomp on it... however is that even a good idea? What will I break? If you change the base package environment's copy of `attach` (via `as.environment('package:base')`) , probably not much will break, except for your own code. If, on the other hand, you change the base namespace's copy of `attach` (via `asNamespace('base')`, any package that's subsequently loaded and uses `attach` would run into problems. Still, either way is probably not a good idea. I agree with Ista: assigning it yourself in the the global environment is a better idea. If you want to keep your global env clear of stuff like this, you can put it in an environment and attach that environment as a parent of global: e - new.env() e$attach - function(...) {stop(NEVER USE ATTACH)} base::attach(e, name = my_stuff, warn.conflicts = FALSE) # Test it out: attach() # You can see that the my_stuff env is the parent of global env search() -Winston __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] More than one package document with the same name
Hi, I sent this to the Bioconductor mailing list (q.v.), replies recommended this forum. Here it is: A suggestion. Typically a package provides documentation of two types, one or more vignettes and a reference manual; and they have the same file name, PackageName.pdf. When downloaded some file name manipulation is required, or accept PackageName(1).pdf. I suggest the documents be given different names, for example the reference manual could be named PackageNameRefMan.pdf. Package checking could enforce the rule. Jerry [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] More than one package document with the same name
On Tue, Aug 5, 2014 at 5:47 PM, Davison, Jerry jdavi...@fhcrc.org wrote: Hi, I sent this to the Bioconductor mailing list (q.v.), replies recommended this forum. Here it is: A suggestion. Typically a package provides documentation of two types, one or more vignettes and a reference manual; and they have the same file name, PackageName.pdf. When downloaded some file name manipulation is required, or accept PackageName(1).pdf. I suggest the documents be given different names, for example the reference manual could be named PackageNameRefMan.pdf. Package checking could enforce the rule. Jerry Sounds like you want to put all the pdf files in a single directory. I like that too. I cheat. I run Linux. I have a ~/Documents/R-pdfs in which I keep symlinks to all the pdf files. And I do it similar to the way that you indicate. I make the name of the pdf be ${enclosing_directory}_${original_pdf_name.pdf}. I do something like: cd ~/Documents/R-pdf find /usr/lib64/R -name '*.pdf'|\ while read i;do file=${i##*/}; dir=${i%%/doc/*.pdf}; package=${dir##*/}; ln -s $i ${package}_${file}; done Above won't work if anything has a blank in it. Windows people tend to do this. Most UNIX people are better trained. It would be nice if the packagers did this. But the above works for me. On Linux and other UNIX like systems. Won't work for the poor, benighted Windows people. But I think something similar is possible. But I don't know Windows well enough. Maranatha! John McKown __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it a good idea or even possible to redefine attach?
On Tue, Aug 5, 2014 at 4:37 PM, Grant Rettke g...@wisdomandwonder.com wrote: That is delightful. When I run it like this: • Start R • Nothing in .Rprofile • Paste in your code ╭ │ gcrenv - new.env() │ gcrenv$attach.old - attach │ gcrenv$attach - function(...){stop(NEVER USE ATTACH)} │ base::attach(gcrenv, name=gcr, warn.conflicts = FALSE) ╰ • I get exactly what is expected, I think ╭ │ search() ╰ ╭ │ [1] .GlobalEnvgcr ESSR │ [4] package:stats package:graphics package:grDevices │ [7] package:utils package:datasets package:methods │ [10] Autoloads package:base ╰ Just to be sure: • Is that what is expected? • I am surprised because I thought that `gcr' would come first before `.GlobalEnv' • Perhaps I mis understand, as `.GlobalEnv' is actually the REPL? My goal is to move that to my .Rprofile so that it is always run and I can forget about it more or less. Reading [this] I felt like `.First' would be the right place to put it, but then read further to find that packages are only loaded /after/ `.First' has completed. Curious, I tried it just to be sure. I am now :). This is the .Rprofile file: ╭ │ cat(.Rprofile: Setting CMU repository\n) │ r = getOption(repos) │ r[CRAN] = http://lib.stat.cmu.edu/R/CRAN/; │ options(repos = r) │ rm(r) │ │ .First - function() { │«same code as above» │ } ╰ (I included the repository load, and understand it should not impact things here) This is run after normal startup of R: ╭ │ search() ╰ ╭ │ [1] .GlobalEnvpackage:stats package:graphics │ [4] package:grDevices package:utils package:datasets │ [7] gcr package:methods Autoloads │ [10] package:base ╰ When I read this, I read it as: • My rebind of `attach' occurs • Then all of the packages are loaded and they are referring to my-rebound `attach' • That is a problem because it *will* break package code • Clearly me putting that code in `.Rprofile' is the wrong place. That order for search path should actually be fine. To understand why, you first have to know the difference between the _binding_ environment for an object, and the _enclosing_ environment for a function. The binding environment is where you can find an object. For example, in the global env, you have a bunch bindings (we often call them variables), that point to various objects - vectors, data frames, other environments, etc. The enclosing environment for a function is where the function runs in when it's called. Most R objects have just a binding environment (a variable or reference that points to the object); functions also have an enclosing environment. These two environments aren't necessarily the same. When you run search(), it shows the set of environments where R will look for an object of a given name, when you run stuff at the console (and are in the global env). The trick is that, although you can find a function (they are bound bound) in one of these _package_ environments, those functions run in (are enclosed by) a different environment: the a corresponding _namespace_ environment. The way that a namespace environment is set up with the arrangement of its ancestor environments, it will find the base namespace version of `attach` before it finds yours, even if your personal gcr environment comes early in the search path. = # Here's an example to illustrate. The `utils::alarm` function calls `cat`, which is in base. alarm # function () # { # cat(\a) # flush.console() # } # environment: namespace:utils # Running it makes the screen flash or beep alarm() # [screen flashes] # We'll put a replacement version of cat early in the search path, between utils and base my_stuff - new.env() my_stuff$cat - function(...) stop(Tried to call cat) base::attach(my_stuff, pos=length(search()) - 1, name=my_stuff) search() # [1] .GlobalEnvtools:rstudio package:stats package:graphics # [5] package:grDevices package:utils package:datasets package:methods # [9] my_stuff Autoloads package:base # Calling cat from the console gives the error, as expected cat() # Error in cat() : Tried to call cat # But when we run alarm(), it still gets the real version of `cat()`, # because it finds the the original base namespace version of cat # before it finds yours. alarm() # [screen flashes] == You can even alter package environments without affecting the corresponding namespace environment. The exception to the package and namespace environments being distinct is the base environment; change one and you change the other. (I just realized this and have to retract my earlier statement about the behavior being different if change attach in the base package env vs. the base namespace env.) -Winston __ R-devel@r-project.org mailing list