my current reason to prefer a CompressedVRangesList object over a SimpleVRangesList object is that creating the former is at least one order of magnitude faster than creating the latter:

library(VariantAnnotation)

fl <- system.file("extdata", "CEUtrio.vcf.bgz",
                  package="VariantFiltering")

vcf <- readVcf(fl, genome="hg19")
vr <- as(vcf, "VRanges")
length(vr)
[1] 15000

## create a VRangesList object
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.247   0.004   0.252

## create a CompressedVRangesList object
system.time(cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.019   0.000   0.019

0.252/0.019
[1] 13.26316

with a larger vcf the difference increases:

[... load vcf, coerce to VRanges ...]
length(vr)
[1] 25916

system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  2.672   0.000   2.676

system.time(cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.014   0.000   0.014

2.676 / 0.014
[1] 191.1429
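
since a single system.time() call can be noisy, the comparison above could also be repeated a few times and summarized by the median. what follows is only a minimal sketch under assumptions: median_elapsed() is a hypothetical helper that does not exist in any package, and the two workloads are base R stand-ins, so that the code runs without Bioconductor, rather than the actual constructors.

```r
## hypothetical helper, not part of VariantAnnotation: median elapsed time
## (in seconds) of calling f() 'reps' times
median_elapsed <- function(f, reps = 5L) {
  times <- vapply(seq_len(reps),
                  function(i) system.time(f())[["elapsed"]],
                  numeric(1))
  median(times)
}

## stand-in workloads so that the sketch runs without Bioconductor; with the
## objects above they would be, e.g.,
##   function() do.call("VRangesList", split(vr, sampleNames(vr)))
##   function() new("CompressedVRangesList", split(vr, sampleNames(vr)))
x <- split(runif(1e5), rep_len(1:10, 1e5))
slow <- function() lapply(x, sort)
fast <- function() lapply(x, length)

t_slow <- median_elapsed(slow)
t_fast <- median_elapsed(fast)
t_slow / t_fast  ## may be Inf or NaN if a timing rounds to zero
```

wrapping each constructor call in a function keeps the timing code separate from what is being timed, so the same helper can be reused for both classes.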


so maybe i'm constructing the VRangesList object the wrong way, but according to our last conversation about this, there was no obvious fast default way to do it starting from a VRanges object:

https://stat.ethz.ch/pipermail/bioc-devel/2015-January/006905.html

it would be great if there's a fast way to do this kind of construction.

thanks,

robert.

On 02/25/2015 04:42 PM, Michael Lawrence wrote:
If you're storing data on a relatively small number of individuals (say,
hundreds), you should use SimpleVRangesList, not CompressedVRangesList.

On Wed, Feb 25, 2015 at 7:10 AM, Robert Castelo <robert.cast...@upf.edu> wrote:

    i see your point. the logic i was thinking about is to use a list of
    VRanges objects to hold separately the variants of multiple
    individuals, with one VRanges object per individual.

    if i type the name of such a list object on the R shell and it uses the
    GRangesList show method, i feel i do not see much information
    because the screen just scrolls up tens or hundreds of lines
    specifying variants per individual. however, the concise appearance
    of something like a VRangesList:

     > vrl
    VRangesList of length 10
    names(10): S1 S2 S3 S4 ... S7 S8 S9 S10

    at least suggests to the user that the object holding the variants has
    information for 10 samples and belongs to the class 'VRangesList'.

    i thought this made general sense but i'm fine if you feel this
    interpretation does not warrant such a change.

    cheers,

    robert.

    On 02/25/2015 01:25 AM, Michael Lawrence wrote:

        Why not have the SimpleVRangesList be shown like
        CompressedVRangesList, for consistency with GRangesList? In other
        words, the opposite of what you propose. A strong argument could
        also be made that a SimpleGenomicRangesList should be shown like a
        GRangesList. Unless there is some aversion to the more verbose
        output....

        On Tue, Feb 24, 2015 at 2:36 PM, Robert Castelo
        <robert.cast...@upf.edu> wrote:

             so, yes, but IMO rather than inheriting the show method from a
             GRangesList, i think that the show method for
             CompressedVRangesList objects should be inherited from a
             VRangesList object. right now this is the situation:

             library(VariantAnnotation)

             example(VRangesList)
             vrl
             VRangesList of length 2
             names(2): sampleA sampleB

             cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr)))
             cvrl
             CompressedVRangesList object of length 2:
             $a
             VRanges object with 1 range and 1 metadata column:
                   seqnames    ranges strand         ref              alt     totalDepth       refDepth       altDepth
                      <Rle> <IRanges>  <Rle> <character> <characterOrRle> <integerOrRle> <integerOrRle> <integerOrRle>
               [1]     chr1    [1, 5]      +           T                C             12              5              7
                     sampleNames softFilterMatrix | tumorSpecific
                   <factorOrRle>         <matrix> |     <logical>
               [1]             a             TRUE |         FALSE

             $b
             VRanges object with 1 range and 1 metadata column:
                   seqnames   ranges strand ref alt totalDepth refDepth altDepth sampleNames softFilterMatrix |
               [1]     chr2 [10, 20]      +   A   T         17       10        6           b            FALSE |
                   tumorSpecific
               [1]          TRUE

             -------
             seqinfo: 2 sequences from an unspecified genome; no seqlengths

             would it be possible to have the VRangesList show method for
             CompressedVRangesList objects?

             robert.



             On 2/24/15 7:24 PM, Michael Lawrence wrote:

                 I think you might be missing an import. It should
                 inherit the method for GRangesList.

                 On Tue, Feb 24, 2015 at 9:53 AM, Robert Castelo
                 <robert.cast...@upf.edu> wrote:

                     hi,

                     i'm using the CompressedVRangesList class in
                     VariantFiltering to hold variants and their annotations
                     across multiple samples and found that there was no show
                     method for this class (unless i'm missing the right
                     import here) so i made one within VariantFiltering by
                     copying&pasting from other similar classes:

                     setMethod("show", signature(object="CompressedVRangesList"),
                               function(object) {
                                 lo <- length(object)
                                 cat(classNameForDisplay(object), " of length ",
                                     lo, "\n", sep = "")
                                 if (!is.null(names(object)))
                                   cat(BiocGenerics:::labeledLine("names",
                                                                  names(object)))
                               })

                     i guess, however, that the right home for this would be
                     VariantAnnotation. let me know if you consider adding it
                     there (or somewhere else) and i'll remove it from
                     VariantFiltering.

                     thanks,

                     robert.

                     _______________________________________________
                     Bioc-devel@r-project.org mailing list
                     https://stat.ethz.ch/mailman/listinfo/bioc-devel





    --
    Robert Castelo, PhD
    Associate Professor
    Dept. of Experimental and Health Sciences
    Universitat Pompeu Fabra (UPF)
    Barcelona Biomedical Research Park (PRBB)
    Dr Aiguader 88
    E-08003 Barcelona, Spain
    telf: +34.933.160.514
    fax: +34.933.160.550



--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
