date:20130212

Re: [Bioc-devel] ShortRead: optional custom labeling of samples in QA report

2013-02-12 Thread Julian Gehring


Hi,

Since the attached file didn't make it all the way through to the 
mailing list, you can find it at 
http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/0001-Example-patch-for-naming-samples-in-BAMQA.patch.



Best wishes
Julian


On 02/12/2013 03:23 PM, Julian Gehring wrote:

Hi,

In the QA report of the 'ShortRead' package, a short sequential integer
labeling for referencing the samples/files throughout the report is
created by default.  Would it be reasonable/possible to allow for other
optional names to label the samples to make the results of the report
easier to understand?

In general, I have three ideas for what would be handy to have:

1. Derive a label from the file names.  This is probably hard to
generalize and implement in a way that it actually helps.

2. In case the 'dirPath' argument in the 'qa' function call is a named
vector, such as

 qa(dirPath=c(p1="bam_file1.bam", p2="bam_file2.bam"))

use the names [p1, p2] for the labeling later on.  This would
require storing the names in the object returned by 'qa', but should not
be too hard to implement.

3. Optionally, pass a named vector to the 'report' method, matching file
names to sample labels.  In case the file names do not match or
'samples' is missing, default to the sequential labeling.


For option 3, I have created a simple example patch to illustrate how
this could be implemented (see attached).  So, later this may look like
this:


 library(ShortRead)
 files = c(p1="bam_file1.bam", p2="bam_file2.bam")
 qa = qa(files, type="BAM")

 ## default sequential labeling ##
 ShortRead:::.report_html_BAMQA(qa, dest="report_normal")

 ## samples named according to names(files) ##
 ShortRead:::.report_html_BAMQA(qa, dest="report_named", samples=files)
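
The fallback described in option 3 could be sketched in base R roughly as follows. This is only an illustration under assumptions: resolve_sample_labels is a hypothetical helper, not part of ShortRead and not taken from the patch, and it assumes 'samples' is a named character vector whose values are file names, as in the example above.

```r
## Hypothetical helper (not part of ShortRead): map file names to labels.
## 'files'   : file names as stored in the QA object
## 'samples' : optional named character vector; names = labels, values = files
resolve_sample_labels <- function(files, samples = NULL) {
    sequential <- as.character(seq_along(files))
    ## no usable 'samples' argument: keep the default sequential labels
    if (is.null(samples) || is.null(names(samples)))
        return(sequential)
    idx <- match(basename(files), basename(samples))
    ## any unmatched file: fall back to sequential labeling for all
    if (anyNA(idx))
        return(sequential)
    names(samples)[idx]
}

files <- c(p1 = "bam_file1.bam", p2 = "bam_file2.bam")
resolve_sample_labels(unname(files))                   # sequential labels
resolve_sample_labels(unname(files), samples = files)  # labels from names(files)
```

Falling back for all samples as soon as one file is unmatched keeps the labeling consistent within a report; labeling only the matched files would be another reasonable choice.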


I'd be happy to hear any input or thoughts regarding this.


Best wishes
Julian


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




Re: [Bioc-devel] ShortRead: optional custom labeling of samples in QA report

2013-02-12 Thread Martin Morgan

On 02/12/2013 06:29 AM, Julian Gehring wrote:

Hi,

Since the attached file didn't make it all the way through to the mailing list,
you can find it at
http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/0001-Example-patch-for-naming-samples-in-BAMQA.patch.

Thanks, Julian. The request seems reasonable, and I'll try to get to this in the
next week.

Martin


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


Re: [Bioc-devel] R CMD build not copying PDF vignettes to inst\doc

2013-02-12 Thread Norman Pavelka

Hi Dan,

I actually just installed the latest version of R-Devel (r61902) and
used biocLite("plgem") to download and install the latest version of
my package from the server. Although there are no errors or warnings
on the Bioc build/check report, my package still lacks the PDF version
of the vignette. I checked the source tarball in
http://www.bioconductor.org/packages/2.12/bioc/src/contrib/plgem_1.31.1.tar.gz
and in fact cannot see any PDFs in inst/doc. You can also notice the
vignette is not listed anymore in
http://www.bioconductor.org/packages/2.12/bioc/html/plgem.html
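
A check like the one described above can be scripted. The following is a minimal sketch, assuming a locally downloaded copy of the source tarball; find_vignette_pdfs and tarball_vignette_pdfs are hypothetical helper names, while utils::untar() with list = TRUE is the standard way to list an archive without extracting it.

```r
## Hypothetical helper: keep only PDF paths that sit directly under inst/doc.
find_vignette_pdfs <- function(paths) {
    grep("(^|/)inst/doc/[^/]+\\.pdf$", paths, value = TRUE, ignore.case = TRUE)
}

## List the tarball contents without extracting, then filter.
tarball_vignette_pdfs <- function(tarball) {
    find_vignette_pdfs(utils::untar(tarball, list = TRUE))
}

## Illustration on a made-up listing (not the real tarball contents).
paths <- c("plgem/DESCRIPTION",
           "plgem/inst/doc/plgem.Rnw",
           "plgem/inst/doc/plgem.pdf")
find_vignette_pdfs(paths)
```

Calling tarball_vignette_pdfs() on the downloaded file and getting an empty result would correspond to the missing-PDF situation reported here.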

I then rebuilt the package from source myself from a freshly
checked-out version from the Bioc-devel repository (plgem version
1.31.1) using R-Devel r61902. I get no errors, no warnings and most
importantly the PDF is being built and included in the tarball
correctly.

So it appears that R-Devel r61868 (the version currently on the build
machine) is still not copying the vignette PDF into the package. Could
you please try to update R-Devel to r61902 and see if it solves the
problem?

Thanks!
Norman

P.S.: For full disclosure, I should probably mention that I recently
moved the .Rnw file from inst/doc to /vignettes following the latest R
recommendations, but I am unsure if this has anything to do with the
problem, as the package builds just fine on my machine using the
latest version of R-Devel.

On Wed, Feb 13, 2013 at 12:11 PM, Norman Pavelka
normanpave...@gmail.com wrote:
 Hi Dan,

 I can see the issue is resolved now! I will update my version of R-devel, too.

 Thanks,
 Norman

 On Fri, Feb 8, 2013 at 1:19 PM, Dan Tenenbaum dtene...@fhcrc.org wrote:
 On Thu, Feb 7, 2013 at 8:48 PM, Dan Tenenbaum dtene...@fhcrc.org wrote:
 Hi Norman,


 On Thu, Feb 7, 2013 at 6:59 PM, Norman Pavelka normanpave...@gmail.com 
 wrote:
 Hi,

 I am sure many of you may have noticed already, but basically every
 package in Bioc-devel that has a vignette (i.e. almost every package)
 is currently issuing warnings in R CMD check:
 http://www.bioconductor.org/checkResults/2.12/bioc-LATEST/

 I ran some tests myself and it appears that in the latest version of
 R-devel a change has been introduced in R CMD build that causes
 it not to copy the compiled PDF vignettes to inst\doc. R CMD build
 returns only a silent warning such as:

 * creating vignettes ... OK
 Warning in file.copy(c(vigns$docs, outfiles), doc_dir) :
   problem copying
 E:\biocbld\bbs-2.12-bioc\tmpdir\Rtmpq4jjoR\Rbuild93c202b5ca6\plgem\vignettes\plgem.pdf
 to inst\doc\plgem.pdf: No such file or directory

 R CMD check then issues the following user-visible warning:

 * checking package vignettes in 'inst/doc' ... WARNING
 Package vignette without corresponding PDF:
'plgem.Rnw'

 Compiling my package from the same source but using the previous
 version of R CMD build does not cause any problems, i.e. the vignette
 PDF is correctly copied to inst/doc and R CMD check does not issue any
 warning.

 Should we bring this up to R-Devel mailing list?


 I'm not sure (checking right now) but I think this was fixed in r61843.
 The build machines are running r61836. The nightly build is underway
 but I will update R-devel tomorrow if doing so indeed fixes the
 problem.


 I can confirm that pdfs are properly copied into source tarballs with
 R-devel r61868.

 I will update to the latest R-devel tomorrow.
 Dan


 Thanks!
 Dan


 Cheers,
 Norman


Re: [Rd] stringsAsFactors

2013-02-12 Thread Ista Zahn

FWIW my view is that for data cleaning and organizing, factors just get
in the way. For modeling I like them because they make it easier to
understand what is happening. For example, I can look at the levels()
to see what the reference group will be. With characters one has to
know a) that levels are created in alphabetical order and b) the
alphabetical order of the unique values in the character vector.
Ugh. So my habit is to turn off stringsAsFactors, then explicitly
convert to factors before modeling. (I also use factors to change the
order in which things are displayed in tables and graphs, another
place where converting to factors myself is useful but creating
them in alphabetical order by default is not.)

All this is to say that I would like options(stringsAsFactors=FALSE) to
be the default, but I like the warning about converting to factors in
modeling functions because it reminds me that I forgot to convert them,
which I like to do anyway...
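
To illustrate the point about level order, here is a small sketch with made-up values (the vector x is an assumption for illustration only). It shows the alphabetical default next to an explicit conversion, using only factor(), levels() and relevel() from base R.

```r
x <- c("low", "high", "medium", "low")

## Default: levels are sorted alphabetically, so "high" becomes the
## reference group in a model with treatment contrasts.
f_default <- factor(x)
levels(f_default)

## Explicit: choose the order (and hence the reference group) yourself.
f_explicit <- factor(x, levels = c("low", "medium", "high"))
levels(f_explicit)

## Change only the reference level of an existing factor.
f_releveled <- relevel(f_default, ref = "medium")
levels(f_releveled)
```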

Best,
Ista

On Mon, Feb 11, 2013 at 12:50 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 11/02/2013 12:13 PM, William Dunlap wrote:

 Note that changing this does not just mean getting rid of silly
 warnings.
 Currently, predict.lm() can give wrong answers when stringsAsFactors is
 FALSE.

 > d <- data.frame(x=1:10, f=rep(c("A","B","C"), c(4,3,3)),
 +                 y=c(1:4, 15:17, 28.1,28.8,30.1))
 > fit_ab <- lm(y ~ x + f, data = d, subset = f!="B")
 Warning message:
 In model.matrix.default(mt, mf, contrasts) :
   variable 'f' converted to a factor
 > predict(fit_ab, newdata=d)
  1  2  3  4  5  6  7  8  9 10
  1  2  3  4 25 26 27  8  9 10
 Warning messages:
 1: In model.matrix.default(Terms, m, contrasts.arg = object$contrasts) :
   variable 'f' converted to a factor
 2: In predict.lm(fit_ab, newdata = d) :
   prediction from a rank-deficient fit may be misleading

 fit_ab is not rank-deficient and the predict should report
  1  2  3  4 NA NA NA 28 29 30


 In R-devel, the two warnings about factor conversions are no longer given,
 but the predictions are the same and the warning about rank deficiency still
 shows up.  If f is set to be a factor, an error is generated:

 Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
 object$xlevels) :
   factor f has new levels B

 I think both the warning and error are somewhat reasonable responses.  The
 fit is rank deficient relative to the model that includes f == "B", because
 the column of the design matrix corresponding to f level B would be
 completely zero.  In this particular model, we could still do predictions
 for the other levels, but it also seems reasonable to quit, given that
 clearly something has gone wrong.

 I do think that it's unfortunate that we don't get the same result in both
 cases, and I'd like to have gotten the predictions you suggested, but I
 don't think that's going to happen.  The reason for the difference is that
 the subsetting is done before the conversion to a factor, but I think that
 is unavoidable without really big changes.
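
The order-of-operations point can be seen in a small sketch. It reuses the example data from above, with stringsAsFactors = FALSE passed explicitly as an assumption so that f starts out as character. The last part is not something predict() does by itself; it is one possible user-side workaround that predicts only for rows whose level was seen in the fit and leaves NA elsewhere.

```r
d <- data.frame(x = 1:10,
                f = rep(c("A", "B", "C"), c(4, 3, 3)),
                y = c(1:4, 15:17, 28.1, 28.8, 30.1),
                stringsAsFactors = FALSE)

## Subset the character vector first, then convert: level "B" is gone.
levels(factor(d$f[d$f != "B"]))
## Convert first, then subset: the unused level "B" is retained.
levels(factor(d$f)[d$f != "B"])

## Fit with an explicit factor, then predict only for levels the fit has seen.
d$f <- factor(d$f)
fit_ab <- lm(y ~ x + f, data = d, subset = f != "B")
seen <- d$f %in% fit_ab$xlevels$f
pred <- rep(NA_real_, nrow(d))
pred[seen] <- predict(fit_ab, newdata = d[seen, , drop = FALSE])
pred
```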

 Duncan Murdoch




 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

  -----Original Message-----
  From: r-devel-boun...@r-project.org
  [mailto:r-devel-boun...@r-project.org] On Behalf
  Of Terry Therneau
  Sent: Monday, February 11, 2013 5:50 AM
  To: r-devel@r-project.org; Duncan Murdoch
  Subject: Re: [Rd] stringsAsFactors
 
  I think your idea to remove the warnings is excellent, and a good
  compromise.  Characters already work fine in modeling functions except
  for the silly warning.

  It is interesting how often the defaults for a program reflect the data
  sets in use at the time the defaults were chosen.  There are some such
  in my own survival package whose proper value is no longer as obvious as
  it was when I chose them.  Factors are very handy for variables which
  have only a few levels and will be used in modeling.  Every character
  variable of every dataset in Statistical Models in S, which introduced
  factors, is of this type, so auto-transformation made a lot of sense.
  The solder data set there is one for which Helmert contrasts are proper,
  so guess what the default contrast option was?  (I think there are only
  a few data sets in the world for which Helmert makes sense, however, and
  R eventually changed the default.)

  For character variables that should not be factors, such as a street
  address, stringsAsFactors can be a real PITA, and I expect that people's
  preference for the option depends almost entirely on how often these
  arise in their own work.  As long as there is an option that can be
  overridden I'm okay.  Yes, I'd prefer FALSE as the default, partly
  because the current value is a tripwire in the hallway that eventually
  catches every new user.
 
  Terry Therneau
 
  On 02/11/2013 05:00 AM, r-devel-requ...@r-project.org wrote:
   Both of these were discussed by R Core.  I think it's unlikely the
   default for stringsAsFactors will be

[Rd] Private environments and/or assignInMyNamespace

2013-02-12 Thread Ulrike Grömping


Dear DevelopeRs,

I've been struggling with the new regulations regarding modifications to 
the search path, regarding my Rcmdr plugin package RcmdrPlugin.DoE. John 
Fox made Rcmdr comply with the new policy by removing the environment 
RcmdrEnv from the search path. For the time being, he developed an 
option that allows users to put the environment from Rcmdr (RcmdrEnv) on 
the search path, like in earlier versions of Rcmdr (thanks John!), which 
rescues my package for the immediate future; however, in the long run it 
would be nice to be able to make it work without that.


The reason why I currently need the environment on the search path (may 
be due to my lack of understanding how tcltk widgets are handled): I 
have quite elaborate notebook widgets on which users can make many 
entries. Some entries are only checked after clicking OK, and if an 
error is found at that point, the user receives a small message window 
that has to be confirmed and is subsequently returned to the notebook 
widget in the state it was in when pressing OK. These widgets are 
currently held in the environment RcmdrEnv; they work when RcmdrEnv is 
on the search path; however, it is not sufficient to retrieve them with 
John's function getRcmdr, which works fine for objects other than widgets.


Here is my question: Would it be an option to place the widgets in a 
private environment of my plugin package (then I would have to learn how 
to create one and work with it), or won't they be found that way? 
Alternatively, I could have unexported objects of all required names in 
my namespace and modify these via assignInMyNamespace (I don't think 
that anybody from somewhere else would import that namespace, it's not 
that kind of package). Would that be a viable alternative, and would the 
widgets be found that way? Any further ideas?
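
For reference, a package-private environment of the kind asked about is usually set up along the following lines. This is a generic base R sketch, not Rcmdr code: .pluginEnv, putPlugin and getPlugin are hypothetical names, and whether tcltk widgets stored this way are found again in the required state is exactly the open question above, which the sketch does not settle.

```r
## Package-private environment; in a package this would sit at the top
## level of a file under R/, so it is created when the namespace is built.
.pluginEnv <- new.env(parent = emptyenv())

## Hypothetical accessors, similar in spirit to a getter such as getRcmdr.
putPlugin <- function(name, value) assign(name, value, envir = .pluginEnv)
getPlugin <- function(name, default = NULL) {
    if (exists(name, envir = .pluginEnv, inherits = FALSE))
        get(name, envir = .pluginEnv, inherits = FALSE)
    else
        default
}

## Store and retrieve some state (a plain list stands in for a widget here).
putPlugin("notebookState", list(tab = 2L, entries = c(a = "1", b = "x")))
getPlugin("notebookState")$tab
```

Because an environment is a mutable reference object, its contents can be changed after the namespace is locked, so this route does not need assignInMyNamespace and does not touch the search path.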


Best regards,
Ulrike

--
Ulrike Groemping
BHT Berlin - University of Applied Sciences
+49 (30) 39404863 (Home Office)
+49 (30) 4504 5127 (BHT)
http://prof.beuth-hochschule.de/groemping
groemp...@bht-berlin.de

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
