Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-19 Thread Hervé Pagès

Hi Michael,

On 10/17/2015 12:01 PM, Michael Lawrence wrote:

Cool, thanks. So does it make sense to have that as the coercion from
GAlignmentsList to GAlignmentPairs? I can add it, if it seems
reasonable. I'm kind of torn on whether a coercion should discard records.


Right, the coercion won't return an object parallel to the input but you
don't have much choice. You could emit a warning to mitigate this.

2 more things:

1) We should refrain from propagating discordant pairs for now. Even
  though the GAlignmentPairs container can store them, the API is not
  ready. It will be easy to propagate them later. For now, the coercion
  method could emit the same kind of warning emitted by
  readGAlignmentPairs() about the discordant pairs being dropped.

2) You might want to propagate pairs with

mate_status %in% c("mated",  "ambiguous")

  instead of "mated" pairs only.

Thanks,
H.



On Sat, Oct 17, 2015 at 8:37 AM, Morgan, Martin
<martin.mor...@roswellpark.org <mailto:martin.mor...@roswellpark.org>>
wrote:

Val and I discussed this a bit, with the resolution that calling the
GAlignmentPairs() constructor on the first and the second element of
the GAlignmentsList was the 'easiest' way to go.

   ga = unlist(gal[mcols(gal)$mate_status == "mated"])
   GAlignmentPairs(ga[c(TRUE, FALSE)], ga[c(FALSE, TRUE)])

(completely speculative code). If I understand correctly, there's a
check in for discordant pairs in readGAlignmentPairs, but not in
GAlignmentPairs itself; could be mistaken though...

Martin

From: Michael Lawrence [lawrence.mich...@gene.com
<mailto:lawrence.mich...@gene.com>]
Sent: Saturday, October 17, 2015 9:48 AM
To: Hervé Pagès
Cc: Michael Lawrence; Morgan, Martin; bioc-devel@r-project.org
    <mailto:bioc-devel@r-project.org>
    Subject: Re: [Bioc-devel] readGAlignmentPairs with discordant strand

This might have been lost in this thread. If there is an easy way to
coerce a GAlignmentsList where all(lengths(x) == 2) to a
GAlignmentPairs, please let me know. Otherwise, I'll add a coerce
method. Debating whether it should discard elements of length != 2,
or if it should fail.

On Fri, Oct 16, 2015 at 9:58 AM, Michael Lawrence <micha...@gene.com
<mailto:micha...@gene.com><mailto:micha...@gene.com
<mailto:micha...@gene.com>>> wrote:
Sure, "*" makes more sense for strand, given the precedent.

On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès <hpa...@fredhutch.org
<mailto:hpa...@fredhutch.org><mailto:hpa...@fredhutch.org
<mailto:hpa...@fredhutch.org>>> wrote:
> On 10/16/2015 09:28 AM, Michael Lawrence wrote:
>>
>> I kind of wish it would return NA for things like seqnames and strand,
>> but yes that would be very useful.
>
>
> Could do this for seqnames() but I'm hesitant to do this for strand().
> If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> strand of the location is unknown, or irrelevant, or when the "feature"
> at that location belongs to both strands. A pair with discordant strand
> belongs to both strands. Also there is a lot of code around that
> assumes strand() never returns NAs.
>
>
> H.
>
>>
>> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès <hpa...@fredhutch.org
<mailto:hpa...@fredhutch.org><mailto:hpa...@fredhutch.org
<mailto:hpa...@fredhutch.org>>> wrote:
>>>
>>> Hi Michael, Martin,
>>>
>>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
>>>>
>>>>
>>>> It does seem like starting with the more general data structure is the
>>>> better approach, but I couldn't find an easy way to move the paired
>>>> subset
>>>> of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
>>>> not
>>>> obvious to me, unfortunately.
>>>>
>>>> Another approach would be a GAlignmentPairs where the unpaired reads
>>>> have
>>>> "missing" mates. I know GAlignments has no concept of missing, but it
>>>> would
>>>> get everything into a single data structure that is convenient for
>>>> computing on pairs.
>>>
>>>
>>>
>>> I could modify readGAlignmentPairs() to have the discordant and/or
>>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
>>> could be marked as such thru a metadata col of the object or thru
>>> a proper slot. The

Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-17 Thread Michael Lawrence
This might have been lost in this thread. If there is an easy way to coerce
a GAlignmentsList where all(lengths(x) == 2) to a GAlignmentPairs, please
let me know. Otherwise, I'll add a coerce method. Debating whether it
should discard elements of length != 2, or if it should fail.

On Fri, Oct 16, 2015 at 9:58 AM, Michael Lawrence  wrote:

> Sure, "*" makes more sense for strand, given the precedent.
>
> On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès  wrote:
> > On 10/16/2015 09:28 AM, Michael Lawrence wrote:
> >>
> >> I kind of wish it would return NA for things like seqnames and strand,
> >> but yes that would be very useful.
> >
> >
> > Could do this for seqnames() but I'm hesitant to do this for strand().
> > If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> > strand of the location is unknown, or irrelevant, or when the "feature"
> > at that location belongs to both strands. A pair with discordant strand
> > belongs to both strands. Also there is a lot of code around that
> > assumes strand() never returns NAs.
> >
> >
> > H.
> >
> >>
> >> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès 
> wrote:
> >>>
> >>> Hi Michael, Martin,
> >>>
> >>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
> 
> 
>  It does seem like starting with the more general data structure is the
>  better approach, but I couldn't find an easy way to move the paired
>  subset
>  of GAlignmentsList to GAlignmentPairs. You mention a coercion, but
> it's
>  not
>  obvious to me, unfortunately.
> 
>  Another approach would be a GAlignmentPairs where the unpaired reads
>  have
>  "missing" mates. I know GAlignments has no concept of missing, but it
>  would
>  get everything into a single data structure that is convenient for
>  computing on pairs.
> >>>
> >>>
> >>>
> >>> I could modify readGAlignmentPairs() to have the discordant and/or
> >>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
> >>> could be marked as such thru a metadata col of the object or thru
> >>> a proper slot. The seqnames() and strand() accessors will return
> >>> * on discordant pairs. Does that sound reasonable?
> >>>
> >>> H.
> >>>
> >>>
> 
>  On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
>  martin.mor...@roswellpark.org> wrote:
> 
> >
> >
> >> -Original Message-
> >> From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On
> Behalf
> >> Of
> >> Michael Lawrence
> >> Sent: Friday, October 16, 2015 7:41 AM
> >> To: bioc-devel@r-project.org
> >> Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
> >>
> >> Now that GAlignmentPairs supports discordant strand between mates,
> how
> >> hard would it be to relax that restriction on readGAlignmentPairs()?
> >>
> >> Also, would be nice if getDumpedAlignments() returned those dumped
> by
> >> readGAlignmentPairs(). Right now, I'm reading a GAlignments (with
> the
> >
> >
> > extra
> >>
> >>
> >> mcols) and calling makeGAlignmentPairs(). Not so convenient.
> >
> >
> >
> > I'm not sure whether this is relevant to your use case but
> > readGAlignmentsList returns a list of paired mates, and if
> appropriate
> > (based on ScanBamParam) list elements with solo travelers. The paired
> > portion of the list can be coerced to GAlignmentPairs if the
> additional
> > structure of that class is required.
> >
> > Martin
> >
> >>
> >> Michael
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ___
> >> Bioc-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> >
> >
> >
> > This email message may contain legally privileged and/or confidential
> > information.  If you are not the intended recipient(s), or the
> employee
> > or
> > agent responsible for the delivery of this message to the intended
> > recipient(s), you are hereby notified that any disclosure, copying,
> > distribution, or use of this email message is prohibited.  If you
> have
> > received this message in error, please notify the sender immediately
> by
> > e-mail and delete this email message from your computer. Thank you.
> >
> 
>   [[alternative HTML version deleted]]
> 
>  ___
>  Bioc-devel@r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
> >>>
> >>> --
> >>> Hervé Pagès
> >>>
> >>> Program in Computational Biology
> >>> Division of Public Health Sciences
> >>> Fred Hutchinson Cancer Research Center
> >>> 1100 Fairview Ave. N, M1-B514
> >>> P.O. Box 19024
> >>> Seattle, WA 98109-1024
> >>>
> >>> E-mail: hpa...@fredhutch.org
> >>> Phone:  (206) 667-5791
> 

Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-17 Thread Michael Lawrence
Cool, thanks. So does it make sense to have that as the coercion from
GAlignmentsList to GAlignmentPairs? I can add it, if it seems reasonable.
I'm kind of torn on whether a coercion should discard records.

On Sat, Oct 17, 2015 at 8:37 AM, Morgan, Martin <
martin.mor...@roswellpark.org> wrote:

> Val and I discussed this a bit, with the resolution that calling the
> GAlignmentPairs() constructor on the first and the second element of the
> GAlignmentsList was the 'easiest' way to go.
>
>   ga = unlist(gal[mcols(gal)$mate_status == "mated"])
>   GAlignmentPairs(ga[c(TRUE, FALSE)], ga[c(FALSE, TRUE)])
>
> (completely speculative code). If I understand correctly, there's a check
> in for discordant pairs in readGAlignmentPairs, but not in GAlignmentPairs
> itself; could be mistaken though...
>
> Martin
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Saturday, October 17, 2015 9:48 AM
> To: Hervé Pagès
> Cc: Michael Lawrence; Morgan, Martin; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] readGAlignmentPairs with discordant strand
>
> This might have been lost in this thread. If there is an easy way to
> coerce a GAlignmentsList where all(lengths(x) == 2) to a GAlignmentPairs,
> please let me know. Otherwise, I'll add a coerce method. Debating whether
> it should discard elements of length != 2, or if it should fail.
>
> On Fri, Oct 16, 2015 at 9:58 AM, Michael Lawrence <micha...@gene.com
> <mailto:micha...@gene.com>> wrote:
> Sure, "*" makes more sense for strand, given the precedent.
>
> On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès <hpa...@fredhutch.org hpa...@fredhutch.org>> wrote:
> > On 10/16/2015 09:28 AM, Michael Lawrence wrote:
> >>
> >> I kind of wish it would return NA for things like seqnames and strand,
> >> but yes that would be very useful.
> >
> >
> > Could do this for seqnames() but I'm hesitant to do this for strand().
> > If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> > strand of the location is unknown, or irrelevant, or when the "feature"
> > at that location belongs to both strands. A pair with discordant strand
> > belongs to both strands. Also there is a lot of code around that
> > assumes strand() never returns NAs.
> >
> >
> > H.
> >
> >>
> >> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès <hpa...@fredhutch.org
> <mailto:hpa...@fredhutch.org>> wrote:
> >>>
> >>> Hi Michael, Martin,
> >>>
> >>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
> >>>>
> >>>>
> >>>> It does seem like starting with the more general data structure is the
> >>>> better approach, but I couldn't find an easy way to move the paired
> >>>> subset
> >>>> of GAlignmentsList to GAlignmentPairs. You mention a coercion, but
> it's
> >>>> not
> >>>> obvious to me, unfortunately.
> >>>>
> >>>> Another approach would be a GAlignmentPairs where the unpaired reads
> >>>> have
> >>>> "missing" mates. I know GAlignments has no concept of missing, but it
> >>>> would
> >>>> get everything into a single data structure that is convenient for
> >>>> computing on pairs.
> >>>
> >>>
> >>>
> >>> I could modify readGAlignmentPairs() to have the discordant and/or
> >>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
> >>> could be marked as such thru a metadata col of the object or thru
> >>> a proper slot. The seqnames() and strand() accessors will return
> >>> * on discordant pairs. Does that sound reasonable?
> >>>
> >>> H.
> >>>
> >>>
> >>>>
> >>>> On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
> >>>> martin.mor...@roswellpark.org<mailto:martin.mor...@roswellpark.org>>
> wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>>> -Original Message-
> >>>>>> From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org bioc-devel-boun...@r-project.org>] On Behalf
> >>>>>> Of
> >>>>>> Michael Lawrence
> >>>>>> Sent: Friday, October 16, 2015 7:41 AM
> >>>>>> To: bioc-devel@r-project.org<mailto:bioc-devel@r-project.org>
> >>>>>> Subject: [Bioc-devel] readGAlignmentPa

Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-17 Thread Morgan, Martin
Val and I discussed this a bit, with the resolution that calling the 
GAlignmentPairs() constructor on the first and the second element of the 
GAlignmentsList was the 'easiest' way to go.

  ga = unlist(gal[mcols(gal)$mate_status == "mated"])
  GAlignmentPairs(ga[c(TRUE, FALSE)], ga[c(FALSE, TRUE)])

(completely speculative code). If I understand correctly, there's a check in 
for discordant pairs in readGAlignmentPairs, but not in GAlignmentPairs itself; 
could be mistaken though...

Martin

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Saturday, October 17, 2015 9:48 AM
To: Hervé Pagès
Cc: Michael Lawrence; Morgan, Martin; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] readGAlignmentPairs with discordant strand

This might have been lost in this thread. If there is an easy way to coerce a 
GAlignmentsList where all(lengths(x) == 2) to a GAlignmentPairs, please let me 
know. Otherwise, I'll add a coerce method. Debating whether it should discard 
elements of length != 2, or if it should fail.

On Fri, Oct 16, 2015 at 9:58 AM, Michael Lawrence 
<micha...@gene.com<mailto:micha...@gene.com>> wrote:
Sure, "*" makes more sense for strand, given the precedent.

On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès 
<hpa...@fredhutch.org<mailto:hpa...@fredhutch.org>> wrote:
> On 10/16/2015 09:28 AM, Michael Lawrence wrote:
>>
>> I kind of wish it would return NA for things like seqnames and strand,
>> but yes that would be very useful.
>
>
> Could do this for seqnames() but I'm hesitant to do this for strand().
> If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> strand of the location is unknown, or irrelevant, or when the "feature"
> at that location belongs to both strands. A pair with discordant strand
> belongs to both strands. Also there is a lot of code around that
> assumes strand() never returns NAs.
>
>
> H.
>
>>
>> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès 
>> <hpa...@fredhutch.org<mailto:hpa...@fredhutch.org>> wrote:
>>>
>>> Hi Michael, Martin,
>>>
>>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
>>>>
>>>>
>>>> It does seem like starting with the more general data structure is the
>>>> better approach, but I couldn't find an easy way to move the paired
>>>> subset
>>>> of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
>>>> not
>>>> obvious to me, unfortunately.
>>>>
>>>> Another approach would be a GAlignmentPairs where the unpaired reads
>>>> have
>>>> "missing" mates. I know GAlignments has no concept of missing, but it
>>>> would
>>>> get everything into a single data structure that is convenient for
>>>> computing on pairs.
>>>
>>>
>>>
>>> I could modify readGAlignmentPairs() to have the discordant and/or
>>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
>>> could be marked as such thru a metadata col of the object or thru
>>> a proper slot. The seqnames() and strand() accessors will return
>>> * on discordant pairs. Does that sound reasonable?
>>>
>>> H.
>>>
>>>
>>>>
>>>> On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
>>>> martin.mor...@roswellpark.org<mailto:martin.mor...@roswellpark.org>> wrote:
>>>>
>>>>>
>>>>>
>>>>>> -Original Message-
>>>>>> From: Bioc-devel 
>>>>>> [mailto:bioc-devel-boun...@r-project.org<mailto:bioc-devel-boun...@r-project.org>]
>>>>>>  On Behalf
>>>>>> Of
>>>>>> Michael Lawrence
>>>>>> Sent: Friday, October 16, 2015 7:41 AM
>>>>>> To: bioc-devel@r-project.org<mailto:bioc-devel@r-project.org>
>>>>>> Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
>>>>>>
>>>>>> Now that GAlignmentPairs supports discordant strand between mates, how
>>>>>> hard would it be to relax that restriction on readGAlignmentPairs()?
>>>>>>
>>>>>> Also, would be nice if getDumpedAlignments() returned those dumped by
>>>>>> readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the
>>>>>
>>>>>
>>>>> extra
>>>>>>
>>>>>>
>>>>>> mcols) and calling makeGAlignmentPairs(). Not so convenient.
>>>>>
>>>>

Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Michael Lawrence
Sure, "*" makes more sense for strand, given the precedent.

On Fri, Oct 16, 2015 at 9:55 AM, Hervé Pagès  wrote:
> On 10/16/2015 09:28 AM, Michael Lawrence wrote:
>>
>> I kind of wish it would return NA for things like seqnames and strand,
>> but yes that would be very useful.
>
>
> Could do this for seqnames() but I'm hesitant to do this for strand().
> If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
> strand of the location is unknown, or irrelevant, or when the "feature"
> at that location belongs to both strands. A pair with discordant strand
> belongs to both strands. Also there is a lot of code around that
> assumes strand() never returns NAs.
>
>
> H.
>
>>
>> On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès  wrote:
>>>
>>> Hi Michael, Martin,
>>>
>>> On 10/16/2015 06:48 AM, Michael Lawrence wrote:


 It does seem like starting with the more general data structure is the
 better approach, but I couldn't find an easy way to move the paired
 subset
 of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
 not
 obvious to me, unfortunately.

 Another approach would be a GAlignmentPairs where the unpaired reads
 have
 "missing" mates. I know GAlignments has no concept of missing, but it
 would
 get everything into a single data structure that is convenient for
 computing on pairs.
>>>
>>>
>>>
>>> I could modify readGAlignmentPairs() to have the discordant and/or
>>> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
>>> could be marked as such thru a metadata col of the object or thru
>>> a proper slot. The seqnames() and strand() accessors will return
>>> * on discordant pairs. Does that sound reasonable?
>>>
>>> H.
>>>
>>>

 On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
 martin.mor...@roswellpark.org> wrote:

>
>
>> -Original Message-
>> From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf
>> Of
>> Michael Lawrence
>> Sent: Friday, October 16, 2015 7:41 AM
>> To: bioc-devel@r-project.org
>> Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
>>
>> Now that GAlignmentPairs supports discordant strand between mates, how
>> hard would it be to relax that restriction on readGAlignmentPairs()?
>>
>> Also, would be nice if getDumpedAlignments() returned those dumped by
>> readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the
>
>
> extra
>>
>>
>> mcols) and calling makeGAlignmentPairs(). Not so convenient.
>
>
>
> I'm not sure whether this is relevant to your use case but
> readGAlignmentsList returns a list of paired mates, and if appropriate
> (based on ScanBamParam) list elements with solo travelers. The paired
> portion of the list can be coerced to GAlignmentPairs if the additional
> structure of that class is required.
>
> Martin
>
>>
>> Michael
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
>
> This email message may contain legally privileged and/or confidential
> information.  If you are not the intended recipient(s), or the employee
> or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited.  If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your computer. Thank you.
>

  [[alternative HTML version deleted]]

 ___
 Bioc-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/bioc-devel

>>>
>>> --
>>> Hervé Pagès
>>>
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M1-B514
>>> P.O. Box 19024
>>> Seattle, WA 98109-1024
>>>
>>> E-mail: hpa...@fredhutch.org
>>> Phone:  (206) 667-5791
>>> Fax:(206) 667-1319
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Hervé Pagès

On 10/16/2015 09:28 AM, Michael Lawrence wrote:

I kind of wish it would return NA for things like seqnames and strand,
but yes that would be very useful.


Could do this for seqnames() but I'm hesitant to do this for strand().
If you look at ?strand in BiocGenerics, ‘*’ is used when the exact
strand of the location is unknown, or irrelevant, or when the "feature"
at that location belongs to both strands. A pair with discordant strand
belongs to both strands. Also there is a lot of code around that
assumes strand() never returns NAs.

H.



On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès  wrote:

Hi Michael, Martin,

On 10/16/2015 06:48 AM, Michael Lawrence wrote:


It does seem like starting with the more general data structure is the
better approach, but I couldn't find an easy way to move the paired subset
of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
not
obvious to me, unfortunately.

Another approach would be a GAlignmentPairs where the unpaired reads have
"missing" mates. I know GAlignments has no concept of missing, but it
would
get everything into a single data structure that is convenient for
computing on pairs.



I could modify readGAlignmentPairs() to have the discordant and/or
ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
could be marked as such thru a metadata col of the object or thru
a proper slot. The seqnames() and strand() accessors will return
* on discordant pairs. Does that sound reasonable?

H.




On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
martin.mor...@roswellpark.org> wrote:





-Original Message-
From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf Of
Michael Lawrence
Sent: Friday, October 16, 2015 7:41 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] readGAlignmentPairs with discordant strand

Now that GAlignmentPairs supports discordant strand between mates, how
hard would it be to relax that restriction on readGAlignmentPairs()?

Also, would be nice if getDumpedAlignments() returned those dumped by
readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the


extra


mcols) and calling makeGAlignmentPairs(). Not so convenient.



I'm not sure whether this is relevant to your use case but
readGAlignmentsList returns a list of paired mates, and if appropriate
(based on ScanBamParam) list elements with solo travelers. The paired
portion of the list can be coerced to GAlignmentPairs if the additional
structure of that class is required.

Martin



Michael

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




This email message may contain legally privileged and/or confidential
information.  If you are not the intended recipient(s), or the employee
or
agent responsible for the delivery of this message to the intended
recipient(s), you are hereby notified that any disclosure, copying,
distribution, or use of this email message is prohibited.  If you have
received this message in error, please notify the sender immediately by
e-mail and delete this email message from your computer. Thank you.



 [[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Morgan, Martin


> -Original Message-
> From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf Of
> Michael Lawrence
> Sent: Friday, October 16, 2015 7:41 AM
> To: bioc-devel@r-project.org
> Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
> 
> Now that GAlignmentPairs supports discordant strand between mates, how
> hard would it be to relax that restriction on readGAlignmentPairs()?
> 
> Also, would be nice if getDumpedAlignments() returned those dumped by
> readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the extra
> mcols) and calling makeGAlignmentPairs(). Not so convenient.

I'm not sure whether this is relevant to your use case but readGAlignmentsList 
returns a list of paired mates, and if appropriate (based on ScanBamParam) list 
elements with solo travelers. The paired portion of the list can be coerced to 
GAlignmentPairs if the additional structure of that class is required.

Martin

>   
> Michael
> 
>   [[alternative HTML version deleted]]
> 
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Michael Lawrence
It does seem like starting with the more general data structure is the
better approach, but I couldn't find an easy way to move the paired subset
of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's not
obvious to me, unfortunately.

Another approach would be a GAlignmentPairs where the unpaired reads have
"missing" mates. I know GAlignments has no concept of missing, but it would
get everything into a single data structure that is convenient for
computing on pairs.

On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
martin.mor...@roswellpark.org> wrote:

>
>
> > -Original Message-
> > From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf Of
> > Michael Lawrence
> > Sent: Friday, October 16, 2015 7:41 AM
> > To: bioc-devel@r-project.org
> > Subject: [Bioc-devel] readGAlignmentPairs with discordant strand
> >
> > Now that GAlignmentPairs supports discordant strand between mates, how
> > hard would it be to relax that restriction on readGAlignmentPairs()?
> >
> > Also, would be nice if getDumpedAlignments() returned those dumped by
> > readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the
> extra
> > mcols) and calling makeGAlignmentPairs(). Not so convenient.
>
> I'm not sure whether this is relevant to your use case but
> readGAlignmentsList returns a list of paired mates, and if appropriate
> (based on ScanBamParam) list elements with solo travelers. The paired
> portion of the list can be coerced to GAlignmentPairs if the additional
> structure of that class is required.
>
> Martin
>
> >
> > Michael
> >
> >   [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
> This email message may contain legally privileged and/or confidential
> information.  If you are not the intended recipient(s), or the employee or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited.  If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your computer. Thank you.
>

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Hervé Pagès

Hi Michael, Martin,

On 10/16/2015 06:48 AM, Michael Lawrence wrote:

It does seem like starting with the more general data structure is the
better approach, but I couldn't find an easy way to move the paired subset
of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's not
obvious to me, unfortunately.

Another approach would be a GAlignmentPairs where the unpaired reads have
"missing" mates. I know GAlignments has no concept of missing, but it would
get everything into a single data structure that is convenient for
computing on pairs.


I could modify readGAlignmentPairs() to have the discordant and/or
ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
could be marked as such thru a metadata col of the object or thru
a proper slot. The seqnames() and strand() accessors will return
* on discordant pairs. Does that sound reasonable?

H.



On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
martin.mor...@roswellpark.org> wrote:





-Original Message-
From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf Of
Michael Lawrence
Sent: Friday, October 16, 2015 7:41 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] readGAlignmentPairs with discordant strand

Now that GAlignmentPairs supports discordant strand between mates, how
hard would it be to relax that restriction on readGAlignmentPairs()?

Also, would be nice if getDumpedAlignments() returned those dumped by
readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the

extra

mcols) and calling makeGAlignmentPairs(). Not so convenient.


I'm not sure whether this is relevant to your use case but
readGAlignmentsList returns a list of paired mates, and if appropriate
(based on ScanBamParam) list elements with solo travelers. The paired
portion of the list can be coerced to GAlignmentPairs if the additional
structure of that class is required.

Martin



Michael

   [[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



This email message may contain legally privileged and/or confidential
information.  If you are not the intended recipient(s), or the employee or
agent responsible for the delivery of this message to the intended
recipient(s), you are hereby notified that any disclosure, copying,
distribution, or use of this email message is prohibited.  If you have
received this message in error, please notify the sender immediately by
e-mail and delete this email message from your computer. Thank you.



[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] readGAlignmentPairs with discordant strand

2015-10-16 Thread Michael Lawrence
I kind of wish it would return NA for things like seqnames and strand,
but yes that would be very useful.

On Fri, Oct 16, 2015 at 9:15 AM, Hervé Pagès  wrote:
> Hi Michael, Martin,
>
> On 10/16/2015 06:48 AM, Michael Lawrence wrote:
>>
>> It does seem like starting with the more general data structure is the
>> better approach, but I couldn't find an easy way to move the paired subset
>> of GAlignmentsList to GAlignmentPairs. You mention a coercion, but it's
>> not
>> obvious to me, unfortunately.
>>
>> Another approach would be a GAlignmentPairs where the unpaired reads have
>> "missing" mates. I know GAlignments has no concept of missing, but it
>> would
>> get everything into a single data structure that is convenient for
>> computing on pairs.
>
>
> I could modify readGAlignmentPairs() to have the discordant and/or
> ambiguous pairs end up in th GAlignmentPairs. The ambiguous pairs
> could be marked as such thru a metadata col of the object or thru
> a proper slot. The seqnames() and strand() accessors will return
> * on discordant pairs. Does that sound reasonable?
>
> H.
>
>
>>
>> On Fri, Oct 16, 2015 at 5:21 AM, Morgan, Martin <
>> martin.mor...@roswellpark.org> wrote:
>>
>>>
>>>
 -Original Message-
 From: Bioc-devel [mailto:bioc-devel-boun...@r-project.org] On Behalf Of
 Michael Lawrence
 Sent: Friday, October 16, 2015 7:41 AM
 To: bioc-devel@r-project.org
 Subject: [Bioc-devel] readGAlignmentPairs with discordant strand

 Now that GAlignmentPairs supports discordant strand between mates, how
 hard would it be to relax that restriction on readGAlignmentPairs()?

 Also, would be nice if getDumpedAlignments() returned those dumped by
 readGAlignmentPairs(). Right now, I'm reading a GAlignments (with the
>>>
>>> extra

 mcols) and calling makeGAlignmentPairs(). Not so convenient.
>>>
>>>
>>> I'm not sure whether this is relevant to your use case but
>>> readGAlignmentsList returns a list of paired mates, and if appropriate
>>> (based on ScanBamParam) list elements with solo travelers. The paired
>>> portion of the list can be coerced to GAlignmentPairs if the additional
>>> structure of that class is required.
>>>
>>> Martin
>>>

 Michael

[[alternative HTML version deleted]]

 ___
 Bioc-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>>
>>> This email message may contain legally privileged and/or confidential
>>> information.  If you are not the intended recipient(s), or the employee
>>> or
>>> agent responsible for the delivery of this message to the intended
>>> recipient(s), you are hereby notified that any disclosure, copying,
>>> distribution, or use of this email message is prohibited.  If you have
>>> received this message in error, please notify the sender immediately by
>>> e-mail and delete this email message from your computer. Thank you.
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel