[Bioc-devel] thanks

2014-09-19 Thread Valerie Obenchain
Just a note to say thanks to those who worked on (1) the new biocViews 
search capabilities and (2) seqlevelsStyle<-. These are great 
improvements that have made tasks easier / faster time and time again.


Yea!

Val

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] need for consistent coordinate mapping API

2014-09-19 Thread Hervé Pagès

Hi,

It's an interesting problem. Right now mapCoords() has some
limitations. For example I can use it to map from reference sequence
to read cycle but not the other way around. Or from reference genome
to transcriptome but not the other way around (this reverse mapping
is actually what low level util transcriptLocs2RefLocs() does).
Ideally we should be able to easily go back and forth between 2
coordinate spaces.

We also need to think about how more complex use cases (like Laurent's)
could be handled by mapCoords(). That is not clear yet. We might need to
change mapCoords's design a little bit, or maybe we'll need something
else.

H.

On 09/19/2014 10:40 AM, Laurent Gatto wrote:
>
> On 19 September 2014 18:07, Michael Lawrence wrote:
>
>> Hi guys,
>>
>> This is the problem of mapping back and forth between coordinate spaces,
>> such as between genomic and transcript space. I think there was some
>> progress this release cycle (introduction of mapCoords generic, etc), but I
>> think there is yet more to do. For example, transcriptLocs2RefLocs could be
>> given a ranges-based wrapper that conforms to the mapCoords API somehow.
>> Could we please put this on the TODO list of someone (in Seattle) for the
>> next release cycle?
>
> And I would be very interested in (and slowly working towards)
> generalising this to proteomics data. For now, there is a rather long
> description in [1], but eventually, it should be standardised. I was not
> aware of mapCoords and will read about it.
>
> Laurent
>
> [1] http://bioconductor.org/packages/devel/bioc/vignettes/Pbase/inst/doc/mapping.html
>
>> Michael



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



Re: [Bioc-devel] need for consistent coordinate mapping API

2014-09-19 Thread Laurent Gatto

On 19 September 2014 18:07, Michael Lawrence wrote:

> Hi guys,
>
> This is the problem of mapping back and forth between coordinate spaces,
> such as between genomic and transcript space. I think there was some
> progress this release cycle (introduction of mapCoords generic, etc), but I
> think there is yet more to do. For example, transcriptLocs2RefLocs could be
> given a ranges-based wrapper that conforms to the mapCoords API somehow.
> Could we please put this on the TODO list of someone (in Seattle) for the
> next release cycle?

And I would be very interested in (and slowly working towards)
generalising this to proteomics data. For now, there is a rather long
description in [1], but eventually, it should be standardised. I was not
aware of mapCoords and will read about it.

Laurent

[1] http://bioconductor.org/packages/devel/bioc/vignettes/Pbase/inst/doc/mapping.html

> Michael


[Bioc-devel] need for consistent coordinate mapping API

2014-09-19 Thread Michael Lawrence
Hi guys,

This is the problem of mapping back and forth between coordinate spaces,
such as between genomic and transcript space. I think there was some
progress this release cycle (introduction of mapCoords generic, etc), but I
think there is yet more to do. For example, transcriptLocs2RefLocs could be
given a ranges-based wrapper that conforms to the mapCoords API somehow.
Could we please put this on the TODO list of someone (in Seattle) for the
next release cycle?

Michael



Re: [Bioc-devel] GenomicRanges::findOverlaps() ignoring chromosome information?

2014-09-19 Thread Kevin Rue-Albrecht
Hi all, here is the concluding email!

I found the problem in my code:
Everything was in the right place, except that I initialised the column
meant to store the chromosome name with NA values (DMRs without hits are
left with this NA if the user requires all DMRs in the return value).
A column initialised with NA is of class "logical", so when I subsequently
inserted the chromosome name for the DMRs hitting an annotated gene, the
chromosome name was silently converted to a numeric value, often
different from the original chromosome name. When I then prefixed
that value with "chr", which converted the column to class "character",
no trace of the undesired conversion was left.
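
A minimal sketch of the conversion described above, in base R. It is not
the code from the attached functions: it assumes the chromosome names were
held as a factor (for example after converting a GRanges object to a
data.frame), and the names gene_chr, buggy and fixed are made up for
illustration.

```r
# Chromosome names stored as a factor (an assumption; levels are made up).
gene_chr <- factor(c("5", "X"), levels = c("1", "5", "X"))

# Buggy pattern: a bare NA creates a logical vector.
buggy <- rep(NA, 3)
buggy[c(1, 3)] <- gene_chr   # the factor's integer codes go in, not its labels
buggy_chr <- paste0("chr", buggy)

# Safer pattern: start from a character NA and insert the labels explicitly.
fixed <- rep(NA_character_, 3)
fixed[c(1, 3)] <- as.character(gene_chr)
fixed_chr <- ifelse(is.na(fixed), NA_character_, paste0("chr", fixed))

print(buggy_chr)
print(fixed_chr)
```

Under these assumptions the first vector ends up holding the positions of
the names within the factor levels rather than the names themselves, which
would explain a chromosome number that does not exist in the genome.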

Anyway, for those interested, I attach the two functions I wrote (and
corrected):

   - OverlapDmrs.Gene
      - Takes the output data.frame "dmrs" from bsseq, a GRanges object
      obtained from a UCSC gene track, and some optional arguments
      - Finds DMRs overlapping annotated genes, and returns a table with
      the coordinates and Ensembl identifier of that gene
   - OverlapDmrs.Cpg
      - Same as above, except that it expects a GRanges object from a UCSC
      CpG track
      - Annotates with the coordinates of an overlapping CpG island

I also attached an example data.frame of dmrs obtained using bsseq, as
described in my first email. I believe all the code needed for testing is
there. Feel free to give me feedback on this.

Apologies for the spam and the relatively obvious mistake on my part.

Cheers
Kevin

-- 
Kévin RUE-ALBRECHT
Wellcome Trust Computational Infection Biology PhD Programme
University College Dublin
Ireland
http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en

Re: [Bioc-devel] GenomicRanges::findOverlaps() ignoring chromosome information?

2014-09-19 Thread Kevin Rue-Albrecht
Hi again,

Update on my issue, although I haven't found the source of the error yet.
I get correct overlaps in one scenario, but not in another. This suggests
that the findOverlaps() command works as expected on my data, but I don't
yet see where the error is in the second scenario. Let me explain:

   - When I use my function OverlapDmrs.Gene with argument only.hits=TRUE,
   all the hits make perfect sense
      - Full command: dmrs_gene = OverlapDmrs.Gene(dmrs=dmrs,
      gene_track=ensGene.asFeatures, only.hits=TRUE, prefix.chr=TRUE)
   - When I use my function OverlapDmrs.Gene with argument only.hits=FALSE,
   the correct DMRs are annotated with the right start and stop positions,
   but with an incorrect chromosome value (the strangest thing is that
   chromosome 30 should not exist in *Bos taurus*, yet some hits state this
   value in the chromosome column)
      - Full command: dmrs_gene.all = OverlapDmrs.Gene(dmrs=dmrs,
      gene_track=ensGene.asFeatures, only.hits=FALSE, prefix.chr=T)


...
Now that I wrote that "out loud", I just got an idea where to look for the
source of the problem. Apologies for the spam, but if I find the solution,
I'll definitely bring a conclusion to this thread.

Kevin

-- 
Kévin RUE-ALBRECHT
Wellcome Trust Computational Infection Biology PhD Programme
University College Dublin
Ireland
http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en



[Bioc-devel] GenomicRanges::findOverlaps() ignoring chromosome information?

2014-09-19 Thread Kevin Rue-Albrecht
Dear maintainer, Dear all,

*Situation*
I have used the findOverlaps() function to annotate differentially
methylated regions (DMRs) obtained using the bsseq Bioconductor package in
the *Bos taurus* genome. (No, you won't steal my experimental design :-P )
I used the genome UMD3.1.75 as a reference for my analysis.

*Problem*
The genes found to overlap the DMR genomic ranges are often on a different
chromosome than the DMR, although the start and end coordinates of DMR and
gene do overlap in all cases.
This leads me to believe that the chromosome information is ignored in
findOverlaps(). Is this the case, or am I using the function incorrectly?
Note that a "true hit" is sometimes returned, i.e. the
overlapping gene is present on the same chromosome, with start and end
overlapping the coordinates of the DMR.
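
For reference, a minimal sketch of how findOverlaps() treats sequence names
on GRanges objects. It assumes the GenomicRanges package is installed; the
coordinates and the object names dmr and genes are toy values, not the
attached data.

```r
library(GenomicRanges)

# One toy DMR on chr1.
dmr <- GRanges(seqnames = "chr1", ranges = IRanges(start = 100, end = 200))

# Two toy genes with identical coordinates but on different chromosomes.
genes <- GRanges(
  seqnames = c("chr2", "chr1"),
  ranges   = IRanges(start = c(150, 150), end = c(250, 250))
)

# Overlaps are only reported between ranges on the same sequence.
hits <- findOverlaps(dmr, genes)
print(subjectHits(hits))

# With select = "first", an integer vector is returned, one entry per query.
first_hit <- findOverlaps(dmr, genes, select = "first")
print(first_hit)
```

In this sketch only the chr1 gene is reported, even though the chr2 gene
has the same start and end.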


*Attached for your use/testing:*

   - dmrs variable
   - script used to annotate dmrs with information about overlapping gene
      - Note that I have tried setting select to "arbitrary", "first" and
      "last", always with the same issue. I would prefer to get a single
      hit at this stage rather than filter afterwards, but the latter
      remains a possible option if necessary.


Any help / solution / feedback welcome!

Best regards,
Kevin

-- 
Kévin RUE-ALBRECHT
Wellcome Trust Computational Infection Biology PhD Programme
University College Dublin
Ireland
http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en