I have now implemented VCF tracks for IGV, supporting both 

   a local VCF object read and filtered by the VariantAnnotation package, and
   a remote webserver-hosted vcf file.   

In normal use I expect (and recommend) that the local VCF object will be 
relatively small (< 1Mb, < 50 samples - or some tradeoff of those approximate 
numbers), and that the genome scale vcf file is accompanied by an index.  

I am now turning to annotation tracks: bed, bed9, gff, gff3, gtf.  rtracklayer 
provides a good set of importers for these formats, and S4 classes to represent 
them (apparently all are subclasses of GenomicRanges): 
 
   BEDFile (3 required fields, up to 9 optional fields - 
https://genome.ucsc.edu/FAQ/FAQformat.html#format1)
   GFFFile (includes gff, gff3, gtf)

I propose to support four different representations of these data in R:

   data.frame
   the two rtracklayer classes
   a url pointing to a web-hosted and indexed annotation

The AnnotationTrack constructor accepts all four in the “annotation” 
parameter, a simple version of which (with many parameters defaulted) is:

  track <- AnnotationTrack(trackName, annotation, color, displayMode)

The annotation parameter will be inspected by the constructor: is it a 
data.frame? a BEDFile?  a GFFFile?  a url?
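
As a rough sketch of that inspection step (the helper name classifyAnnotation 
is hypothetical, and the real constructor may well do this differently), the 
check could be written with base R alone. inherits() is used so the sketch 
runs even when rtracklayer is not loaded:

```r
# Hypothetical helper, not part of any package: report which of the four
# proposed representations the `annotation` argument holds.
classifyAnnotation <- function(annotation) {
  if (is.data.frame(annotation)) return("data.frame")
  # inherits() also matches subclasses, so rtracklayer's GFF3File and
  # GTFFile objects would be reported as "GFFFile".
  if (inherits(annotation, "BEDFile")) return("BEDFile")
  if (inherits(annotation, "GFFFile")) return("GFFFile")
  # Assumption: a remote annotation is given as a single http(s) string.
  if (is.character(annotation) && length(annotation) == 1 &&
      grepl("^https?://", annotation)) return("url")
  stop("unsupported annotation type: ", class(annotation)[1])
}

classifyAnnotation(data.frame(chrom = "chr5", start = 1L, end = 10L))
classifyAnnotation("https://example.org/genes.bed.gz")  # placeholder url
```

Anything that matches none of the four cases fails early with an error 
rather than being passed on to igv.js.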

The local data is reformatted as needed into a file with a format igv.js 
understands - native bed and gff text files - then passed to igv as a local 
url. Remote urls are transmitted without change.
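
For the data.frame case, the reformatting could look something like the 
following sketch. The function name writeBedFile and the use of a temporary 
file are assumptions made for illustration; how the file is then served to 
igv as a local url is not shown.

```r
# Hypothetical helper: write a data.frame whose first three columns are
# chrom, start, end as a headerless, tab-delimited BED text file.
writeBedFile <- function(tbl, path = tempfile(fileext = ".bed")) {
  stopifnot(is.data.frame(tbl), ncol(tbl) >= 3)
  # Coerce coordinates to integer: write.table may otherwise print a
  # double such as 100000 in scientific notation.
  tbl[[2]] <- as.integer(tbl[[2]])
  tbl[[3]] <- as.integer(tbl[[3]])
  write.table(tbl, file = path, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  path
}

tbl <- data.frame(chrom = "chr5", start = 100000, end = 200000,
                  name = "example", stringsAsFactors = FALSE)
bedPath <- writeBedFile(tbl)
readLines(bedPath)
```

The sketch assumes the start column is already 0-based, as BED requires; it 
does no coordinate conversion.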

Does this sound right?  If you have a minute to comment, now is a good time to 
offer critique and suggestions on annotation tracks.

Next up after the AnnotationTrack class will be alignment (bam) tracks and, if 
I get to it before package submission date, a “seg” track for segmented copy 
number data.

Last week Gabe asked:

> If myigv represents the  IGV session/state, then add_track(myigv, vcfobj) 
> could call down to add_track(myigv,VariantTrack(vcf)) so you'd get the 
> default behaviors. you could also support add_track(myigv, vcf, title = 
> "bla", homVarColor = "whateverman") which would call down to add_track(myigv, 
> VariantTrack(vcf, title = "bla", homVarColor = "whateverman"))
> 
> This is easy to do (I'm assuming the IGVSession class name but replace it with 
> whatever class add_track is endomorphic in...):
>   setMethod("add_track", signature = c("IGVSession", "VCF"),
>             function(igv, track, ...) add_track(igv, VariantTrack(track, ...)))
>   setMethod("add_track", signature = c("IGVSession", "BAM"),
>             function(igv, track, ...) add_track(igv, AlignmentTrack(track, ...)))
> 
> This would, as Michael points out, give you the default values of the 
> parameter when you just call add_track(myigv, vcfobj)

I hope I don’t sound disrespectful by describing these shorter methods as only 
syntactic simplifications with a little S4 dispatch thrown in.    They have 
value, for sure, but are they not just a relatively thin layer on top of the 
classes I am writing now?   *If* that description is accurate, then I’d rather 
consider adding them later, after the nuts and bolts and basic operations are 
all written, tested, and subjected to a few months of user QC.  (I admit that I 
also prefer the greater operational clarity which, for me, with my plodding 
brain, comes from using explicit data types and explicit constructors.)  
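
To make the "thin layer" description concrete, here is a minimal sketch 
using stand-in classes. ToySession, ToyVCF and ToyVariantTrack are 
hypothetical names invented for this example, not the classes in the 
package. The explicit constructor owns the defaults, and the convenience 
method only forwards to it:

```r
library(methods)

# Stand-in classes, hypothetical and for illustration only.
setClass("ToySession", representation(tracks = "list"))
setClass("ToyVCF", representation(nVariants = "integer"))
setClass("ToyVariantTrack",
         representation(vcf = "ToyVCF", title = "character",
                        homVarColor = "character"))

# Explicit constructor: the one place where defaults are defined.
ToyVariantTrack <- function(vcf, title = "variants", homVarColor = "darkRed") {
  new("ToyVariantTrack", vcf = vcf, title = title, homVarColor = homVarColor)
}

setGeneric("add_track", function(igv, track, ...) standardGeneric("add_track"))

# Core method: takes a fully constructed track object.
setMethod("add_track", signature("ToySession", "ToyVariantTrack"),
          function(igv, track, ...) {
            igv@tracks <- c(igv@tracks, list(track))
            igv
          })

# Convenience method: builds the track, then calls the core method.
setMethod("add_track", signature("ToySession", "ToyVCF"),
          function(igv, track, ...) add_track(igv, ToyVariantTrack(track, ...)))

session <- add_track(new("ToySession"), new("ToyVCF", nVariants = 3L),
                     title = "bla")
```

In this arrangement the second method can be added at any time without 
touching the track classes or the core method.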

 - Paul




> On Mar 14, 2018, at 1:05 PM, Michael Lawrence <lawrence.mich...@gene.com> 
> wrote:
> 
> Agreed about encapsulating plot parameters. I was thinking in terms of user 
> convenience, relying on defaults.
> 
> On Wed, Mar 14, 2018 at 12:40 PM, Paul Shannon 
> <paul.thurmond.shan...@gmail.com> wrote:
> Hi Michael,
> 
> Set me straight if I got this wrong.   You suggest:
> 
> > There should be no need to explicitly construct a track; just rely on 
> > dispatch and class semantics, i.e., passing a VCF object to add_track() 
> > would create a variant track automatically.
> 
> But wouldn’t
> 
>    displayTrack(vcf)
> 
> preclude any easy specification of options - which vary across track types - 
> which are straightforward, easily managed and checked, by a set of track 
> constructors?
> 
> Two examples:
> 
>    displayTrack(VariantTrack(vcf, title="mef2c eqtl", height="300",
>                              homrefColor="lightGray",
>                              homVarColor="darkRed", hetVarColor="lightRed"))
> 
>    displayTrack(AlignmentTrack(x, title="bam 32", viewAsPairs=TRUE,
>                                insertionColor="black"))
> 
> 
> So I suggest that the visualization of tracks has lots of track-type-specific 
> settings which the user will want to control, and which would be messy to 
> handle with an open-ended set of optional “…” args to a dispatch-capable 
> single “displayTrack” method.
> 
>  - Paul
> 

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
