Re: [Bioc-devel] VcfFile and VcfFileList

2013-07-17 Thread Robert Castelo

hi Valerie,

sounds great, and i think it will do, however, i've encountered a 
problem when trying to obtain the tabix index file, required to build 
this TabixFile object, with a toy VCF file i'm playing with:


library(Rsamtools)
indexTabix("x.vcf.gz")
Error in value[[3L]](cond) : 'seq' must be integer(1)
  file: x.vcf.gz

the message sounds to me like it is complaining about the sequence names (?)

library(VariantAnnotation)
vcf <- readVcf("x.vcf.gz", genome="hg19")
seqnames(vcf)
factor-Rle of length 5000 with 5 runs
  Lengths:  704  993 1911 1370   22
  Values :   20   21   22    X    Y
Levels(5): 20 21 22 X Y

so could the problem be that sequence names are not integers because of 
X and Y ?


thanks!
robert.
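
For readers who hit the same error: one plausible cause, offered here as an assumption rather than a diagnosis confirmed in this thread, is that indexTabix() was called without a format argument, so it expected explicit seq/start/end column numbers that were never supplied. The sketch below assumes the format = "vcf" preset and the bgzip() helper in Rsamtools; index_vcf is a hypothetical wrapper name.

```r
# Hypothetical helper: block-compress a plain-text VCF, build its tabix
# index, and return a TabixFile. Assumes Rsamtools is installed; the
# package is only touched when the function is actually called.
index_vcf <- function(vcf_path) {
  # bgzip() writes a block-compressed copy and returns its path
  gz_path <- Rsamtools::bgzip(vcf_path, overwrite = TRUE)
  # format = "vcf" tells tabix which columns hold the sequence name and
  # position, so non-numeric names such as "X" and "Y" should not matter
  idx_path <- Rsamtools::indexTabix(gz_path, format = "vcf")
  Rsamtools::TabixFile(gz_path, index = idx_path)
}

# Possible usage (file name taken from the message above, minus ".gz"):
#   tbx <- index_vcf("x.vcf")
#   vcf <- VariantAnnotation::readVcf(tbx, genome = "hg19")
```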

On 7/17/13 6:56 PM, Valerie Obenchain wrote:

Hi Robert,

Have you seen TabixFile and TabixFileList? Both scanVcf() and 
readVcf() have methods for TabixFile.


> showMethods(readVcf)
Function: readVcf (package VariantAnnotation)
file="character", genome="character", param="missing"
file="character", genome="character", param="ScanVcfParam"
file="character", genome="missing", param="missing"
file="TabixFile", genome="character", param="GRanges"
file="TabixFile", genome="character", param="GRangesList"
file="TabixFile", genome="character", param="missing"
file="TabixFile", genome="character", param="RangedData"
file="TabixFile", genome="character", param="RangesList"
file="TabixFile", genome="character", param="ScanVcfParam"

Does this fit your need/use case?

Valerie

On 07/17/2013 03:22 AM, Robert Castelo wrote:

hi,

i'm interested in having classes 'VcfFile' and 'VcfFileList', analogous
to 'BamFile'/'BamFileList' or 'BcfFile'/'BcfFileList', with their
corresponding functionality to ease manipulating, in this case, VCF
files. i thought i could sort of copypaste code from Rsamtools but the
existing definitions of classes and methods rely on the Rsamtools
internal functions '.RsamtoolsFile' and '.RsamtoolsFileList' which are
not exported, and thus i cannot use them.

i might be following the wrong path, so i'd like to ask how should i
proceed to have this kind of functionality to handle VCF files,
analogous to the existing one in Rsamtools to handle BAM or BCF files.

thanks!
robert.

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel






Re: [Rd] SweaveParseOptions, quoted commas, and knitr vignettes

2013-07-17 Thread Ben Bolker
Yihui Xie xie at yihui.name writes:

 
  [snip]

  thanks, that all makes sense.

 Two approaches to solve the problem:
 
 1. either you Depends on knitr,
 2. or make install.packages() also install VignetteBuilder (specified
 in DESCRIPTION) when the user chooses to install from source, i.e.
 install.packages(..., type = 'source')
 
  [snip]

 Or if you want a less cryptic error message, put a code chunk like
 this in your Rnw document:
 
 <<setup, include=FALSE>>=
 library(knitr)
 @
 
 I think R will emit an error that knitr was not installed, which can
 be more helpful for the users to realize the real problem.

  I like the third option here. It might be nice if this were documented
in the Writing R extensions manual, although I guess I shouldn't complain
until I volunteer to write a documentation patch ...

  Ben
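
The code-chunk idea can be made slightly more explicit. The sketch below is only an assumption about how such a check might be worded, not something proposed in the thread; it relies on base R's requireNamespace(), and check_builder is a hypothetical helper name.

```r
# Hypothetical helper: stop with a readable message when the package
# needed to build a vignette is not installed.
check_builder <- function(pkg = "knitr") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is needed to build this vignette; please install it first.",
                 pkg),
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Calling check_builder() at the top of the setup chunk would then play the same role as the bare library(knitr) call, with a message that names the missing package directly.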

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread R. Michael Weylandt
On Wed, Jul 17, 2013 at 9:58 AM, Brian Rowe r...@muxspace.com wrote:
 Hello,

 Section 4.3.2 of the R language definition [1] states that argument matching 
 to formal arguments is a 3-pass process to match arguments to a function. An 
 error is generated if any (supplied) arguments are left unmatched. 
 Interestingly the opposite is not true, as unmatched formals do not 
 generate an error.

 f <- function(x,y,z) x
 f(2)
 [1] 2
 f(2,3)
 [1] 2

 Since R is lazily evaluated, I understand that it is not an error for an 
 unused argument to be unassigned. However, it is surprising to me that a 
 function need not be called with all its required arguments. I guess in this 
 situation technically "required arguments" means "required and referenced 
 arguments".

 f()
 Error in f() : argument "x" is missing, with no default

 Can anyone shed light on the reasoning for this design choice?

I'm not sure I can, but I'd look around at how the missing() function is used.

Cheers,
MW
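
To make the pointer concrete, here is a minimal sketch of how missing() lets a function body test whether a formal was supplied before touching it. describe_args is a hypothetical name used only for illustration.

```r
# Hypothetical example: a formal without a default may be left unsupplied
# as long as the body checks it with missing() before using it.
describe_args <- function(x, y) {
  if (missing(y)) {
    "y was not supplied"
  } else {
    paste("y =", y)
  }
}

describe_args(1)      # "y was not supplied"
describe_args(1, 2)   # "y = 2"
```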



[Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread Brian Rowe
Hello,

Section 4.3.2 of the R language definition [1] states that argument matching to 
formal arguments is a 3-pass process to match arguments to a function. An error 
is generated if any (supplied) arguments are left unmatched. Interestingly the
opposite is not true, as unmatched formals do not generate an error.

> f <- function(x,y,z) x
> f(2)
[1] 2
> f(2,3)
[1] 2

Since R is lazily evaluated, I understand that it is not an error for an unused
argument to be unassigned. However, it is surprising to me that a function need
not be called with all its required arguments. I guess in this situation
technically "required arguments" means "required and referenced arguments".

> f()
Error in f() : argument "x" is missing, with no default

Can anyone shed light on the reasoning for this design choice?

Warm Regards,
Brian Rowe


[1] 
http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Argument-matching
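
The behaviour described in this message can be sketched as follows. This is an illustration under the usual description of R arguments as lazily evaluated promises, not an explanation of the design rationale being asked about.

```r
f <- function(x, y, z) x

# y is supplied as an expression that would fail, but the body never
# forces it, so the call succeeds.
f(2, stop("never evaluated"))   # 2

# The error only appears when a formal that was not supplied is forced.
msg <- tryCatch(f(), error = function(e) conditionMessage(e))
msg   # reports that argument "x" is missing, with no default
```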




Re: [Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread Brian Rowe
Thanks for the lead. Given the example in ?missing though, wouldn't it be safer 
to explicitly define a default value of NULL:

myplot <- function(x, y=NULL) {
  if(is.null(y)) {
    y <- x
    x <- 1:length(y)
  }
  plot(x, y)
}





Re: [Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread Ben Bolker
Brian Rowe rowe at muxspace.com writes:

 
 Thanks for the lead. Given the example in ?missing though,
  wouldn't it be safer to explicitly define a
 default value of NULL:
 
 myplot <- function(x, y=NULL) {
   if(is.null(y)) {
     y <- x
     x <- 1:length(y)
   }
   plot(x, y)
 }
 

 [snip]

 In my opinion the missing() functionality can indeed be
fragile (for example, I don't know how I can manipulate an
existing call to make an argument be 'missing' when it was
previously 'non-empty') and using an explicit NULL is often
a good idea.  This makes the documentation a tiny bit less
wieldy if you have lots of parameters ...



Re: [Rd] Problem following an R bug fix to integrate()

2013-07-17 Thread Martyn Plummer
On Tue, 2013-07-16 at 13:55 +0200, Hans W Borchers wrote:
 I have been told by the CRAN administrators that the following code generated
 an error on 64-bit Fedora Linux (gcc, clang) and on Solaris machines (sparc,
 x86), but runs well on all other systems:
 
  fn <- function(x, y) ifelse(x^2 + y^2 <= 1, 1 - x^2 - y^2, 0)
 
  tol <- 1.5e-8
  fy <- function(x) integrate(function(y) fn(x, y), 0, 1,
                              subdivisions = 300, rel.tol = tol)$value
  Fy <- Vectorize(fy)
 
  xa <- -1; xb <- 1
  Q  <- integrate(Fy, xa, xb,
                  subdivisions = 300, rel.tol = tol)$value
 
 Error in integrate(Fy, xa, xb, subdivisions = 300, rel.tol = tol) :
 roundoff error was detected
 
 Obviously, this realizes a double integration, split up into two 1-dimensional
 integrations, and the result shall be pi/4. I wonder what a 'roundoff error'
 means in this situation.
 
 In my package, this test worked well, w/o error or warnings, since July 2011,
 on Windows, Mac OS X, and Ubuntu Linux. I have no chance to test it on one of
 the above mentioned systems. Of course, I can simply disable these tests, but
 I would not like to do so w/o good reason.
 
 If there is a connection to a bug fix to integrate(), with NEWS item
 
 integrate() reverts to the pre-2.12.0 behaviour.  (PR#15219),
 
 then I do not understand what this pre-2.12.0 behavior really means.
 
 Thanks for any help or a hint to what shall be changed.

You can see the bug report here:

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15219

It concerns the behaviour of integrate with a small error tolerance.
From 2.12.0 to 3.0.1 integrate was not working correctly with small
error tolerance values, in the sense that small values did not improve
accuracy and the accuracy was mis-reported.

The tolerance in your example (1.5e-8) is considerably smaller than the
default (1.2e-4). My guess is that the rounding error always existed but
was not detected due to the bug.  You might try a larger tolerance. I
have tested your example and increasing the tolerance to 1.5e-7 removes
the error.

Martyn
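
The effect of the tolerance can be explored with a small sketch like the one below. It assumes the integrand from the report with the comparison written as <=, and wraps the nested integration in a hypothetical helper, double_integral, that returns NA instead of stopping when integrate() signals an error. Which tolerances succeed may depend on the platform and R version.

```r
# Integrand from the report: 1 - x^2 - y^2 inside the unit disc, 0 outside
fn <- function(x, y) ifelse(x^2 + y^2 <= 1, 1 - x^2 - y^2, 0)

# Hypothetical wrapper: nested 1-D integration at one relative tolerance.
# Any error from the inner or outer integrate() call is reported as a
# message and turned into NA.
double_integral <- function(tol) {
  fy <- function(x) integrate(function(y) fn(x, y), 0, 1,
                              subdivisions = 300, rel.tol = tol)$value
  Fy <- Vectorize(fy)
  tryCatch(
    integrate(Fy, -1, 1, subdivisions = 300, rel.tol = tol)$value,
    error = function(e) {
      message("rel.tol = ", tol, ": ", conditionMessage(e))
      NA_real_
    }
  )
}

# Compare the tolerance from the report, the larger one suggested above,
# and a value close to the default.
results <- sapply(c(1.5e-8, 1.5e-7, 1e-4), double_integral)
print(results)
print(pi / 4)   # the exact value, for comparison
```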


 Hans W Borchers
 
 PS:
 This kind of tricky definition in function 'fn' has caused some discussion on
 this list in July 2009. I still think it should be allowed to proceed in this
 way.
 


Re: [Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread Peter Meilstrup
On Wed, Jul 17, 2013 at 10:20 AM, Ben Bolker bbol...@gmail.com wrote:

 Brian Rowe rowe at muxspace.com writes:

 
  Thanks for the lead. Given the example in ?missing though,
   wouldn't it be safer to explicitly define a
  default value of NULL:
 
  myplot <- function(x, y=NULL) {
    if(is.null(y)) {
      y <- x
      x <- 1:length(y)
    }
    plot(x, y)
  }
 

  [snip]

  In my opinion the missing() functionality can indeed be
 fragile (for example, I don't know how I can manipulate an
 existing call to make an argument be 'missing' when it was
 previously 'non-empty')


Like so:

> thecall <- quote(x[i,j])
> thecall[[3]] <- quote(expr=)
> thecall
x[, j]
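
The same trick can be checked against missing() directly. In this sketch has_y is a hypothetical function introduced only to report whether its second argument was supplied.

```r
# Hypothetical function: TRUE when y was supplied, FALSE when it is missing
has_y <- function(x, y) !missing(y)

thecall <- quote(has_y(1, 2))
eval(thecall)                    # TRUE: y was supplied

# Replace the second argument with the empty symbol, as in the example above
thecall[[3]] <- quote(expr = )
print(thecall)
eval(thecall)                    # FALSE: y now counts as missing
```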


 and using an explicit NULL is often
 a good idea.  This makes the documentation a tiny bit less
 wieldy if you have lots of parameters ...


I could certainly imagine a variant of R in which missing and NULL are
unified, and NULL is the default for any binding that exists but was not
given a value. I would probably prefer it on the grounds of being smaller
and more consistent.  (At the C implementation level, R_MissingArg and
R_NilValue are just two parallel uses of the null object pattern with
different behavior, which is a bit silly)

But one advantage the missing value behavior can have is that it fails
early, i.e. it generates an error closer to where a function wants to use
a value it was not provided, rather than failing late, where a NULL
propagates through your data and you have to do more debugging work to find
out where it came from. This kind of fragility can be a good thing as it's
easier to debug problems that happen closer to the point of failure.

For instance,

> myplot <- function(y, x=1:length(y)) plot(x,y)
> myplot()
Error in plot(x, y) (from #1) :
  error in evaluating the argument 'x' in selecting a method for function
'plot': Error in length(y) (from #1) : 'y' is missing

I didn't think about what myplot should do with no arguments. As it turns
out it is an error, as R refuses to pass a missing value along to length()
or plot(), which is reasonable.

Compare with a default-NULL version.
> myplot <- function(y=NULL, x=1:length(y)) plot(x,y)
> myplot()

Instead of failing early and generating a stack trace pointing you at the
problem, myplot() now generates a graph with points at (0,0) and (1,1) --
most surprising! This is because R happily forwards NULL to length() and
plot() where it refused to earlier. In more complicated code nulls can pass
along several layers before causing problems, making those problems more of
a headache to debug.

Peter
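
The way NULL slips through can be seen without plotting anything. The sketch below shows the 1:length(y) behaviour and one possible way to restore an early failure; index_for is a hypothetical helper, and the explicit check is an assumption about how one might guard a NULL default, not a recommendation made in the thread.

```r
# With a NULL default, length(y) is 0, and 1:0 counts down rather than
# producing an empty sequence.
y <- NULL
1:length(y)     # 1 0
seq_along(y)    # integer(0)

# Hypothetical stricter variant: validate up front so the failure happens
# early and names the real problem.
index_for <- function(y = NULL) {
  if (is.null(y)) stop("'y' must be supplied", call. = FALSE)
  seq_along(y)
}
```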



Re: [Rd] On the mechanics of function evaluation and argument matching

2013-07-17 Thread Brian Rowe
I agree that failing fast is a good principle. My initial point led the other 
way though, i.e. any unmatched formal arguments without default values should 
be handled in one of two ways:

1. Fail the function call. This is what most non-functional languages do e.g. 
Python
>>> def f(x,y,z): x
...
>>> f(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes exactly 3 arguments (1 given)

2. Perform partial application, like some functional languages e.g. Haskell
f :: Int -> Int -> Int -> Int
f x y z = x

*Main> let a = f 2
*Main> :t a
a :: Int -> Int -> Int

Otherwise if an argument is truly optional, I don't see why a default value 
cannot be assigned to the formal argument when defining the function (excepting 
the edge cases you pointed out earlier).

Brian
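
For comparison, partial application can be emulated in R with a closure. This is a sketch of one way to do it by hand, not built-in language behaviour; partial is a hypothetical helper, and the example function sums its arguments so that every formal is actually used.

```r
# Hypothetical helper: fix the leading arguments of f and return a
# function that accepts the remaining ones.
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

f <- function(x, y, z) x + y + z
a <- partial(f, 2)   # roughly analogous to "let a = f 2" above
a(3, 4)              # 9
```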

