Hi Kasper,

On 02/16/2016 07:05 AM, Kasper Daniel Hansen wrote:
upstream / downstream is what we have previously used for strand awareness.

These names are good for specifying *relative* positions in a
way that is strand-aware. In the case of restrict() though, where we
need to be able to specify *absolute* positions, I don't think that
works. How would you call restrict() with these arguments to perform
Dario's strand-specific trimming?

H.


Kasper

On Tue, Feb 16, 2016 at 3:41 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

    Hi Dario,

    AFAIK 'start' and 'end' are strand-independent concepts, so it
    wouldn't be a good idea to let the user specify a strand-specific
    window through these arguments. That means a strand-aware restrict()
    would need to have 2 additional arguments. But how should we name them?

    My preference would be to support negative values for 'start' and 'end'
    like we do for subseq(). When negative, the position is counted from
    the end of the sequence (-1 being the last nucleotide). If we had this,
    then you could do your strand-specific trimming with:

       restrict(gr, start=ifelse(strand(gr) == "-", 51, 1),
                    end=ifelse(strand(gr) == "-", -1, -51))

    Note that using negative values is convenient but not strictly needed:

       start <- ifelse(strand(gr) == "-", 51, 1)
       end <- extractROWS(seqlengths(gr), seqnames(gr))
       end <- ifelse(strand(gr) == "-", end, end - 50)
       restrict(gr, start=start, end=end)
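
To make the negative-position convention concrete, here is a minimal
base-R sketch showing that the two spellings above describe the same
window. It is not GenomicRanges code: resolve_pos() is a hypothetical
helper, the strands and chromosome lengths are made up, and restrict()
itself is assumed not to accept negative values.

```r
# Hypothetical helper (not part of GenomicRanges): turn subseq()-style
# negative positions into absolute ones, with -1 meaning the last base.
resolve_pos <- function(pos, seqlen) {
  ifelse(pos < 0L, seqlen + pos + 1L, pos)
}

# Made-up example: three ranges on chromosomes of known length.
strand <- c("+", "-", "+")
seqlen <- c(1000L, 1000L, 500L)

# Strand-specific window written with negative positions ...
start_neg <- ifelse(strand == "-", 51L, 1L)
end_neg   <- ifelse(strand == "-", -1L, -51L)

# ... and the same window written with absolute positions only.
start_abs <- ifelse(strand == "-", 51L, 1L)
end_abs   <- ifelse(strand == "-", seqlen, seqlen - 50L)

# Both spellings resolve to the same window.
stopifnot(identical(resolve_pos(start_neg, seqlen), start_abs),
          identical(resolve_pos(end_neg, seqlen), end_abs))
```

For a "+" range on a 1000-base chromosome, -51 resolves to 950, so
bases 951 to 1000 (the last 50) fall outside the window.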

    Cheers,
    H.



    On 02/14/2016 09:00 PM, Dario Strbenac wrote:

        Hello,

        The restrict function currently has no strand settings. These
        would be useful when creating fixed-size windows, 50 bases
        wide, by sampling the start positions of the windows from a
        GRanges object. I'd like to restrict the GRanges object being
        sampled from, but only on the negative strand when removing
        positions 1 to 50, and only on the positive strand when removing
        positions within the last 50 bases of the chromosome. So, a
        strand-aware version would be useful, to avoid sampling start
        positions too close to the ends of chromosomes. I suppose that
        setdiff will be a suitable alternative, if the first and last
        50 bases of each chromosome are calculated.

        --------------------------------------
        Dario Strbenac
        PhD Student
        University of Sydney
        Camperdown NSW 2050
        Australia

        _______________________________________________
        Bioc-devel@r-project.org mailing list
        https://stat.ethz.ch/mailman/listinfo/bioc-devel


    --
    Hervé Pagès

    Program in Computational Biology
    Division of Public Health Sciences
    Fred Hutchinson Cancer Research Center
    1100 Fairview Ave. N, M1-B514
    P.O. Box 19024
    Seattle, WA 98109-1024

    E-mail: hpa...@fredhutch.org
    Phone: (206) 667-5791
    Fax: (206) 667-1319





