Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-20 Thread Hervé Pagès

Hi,

On 04/17/2015 09:42 AM, Hervé Pagès wrote:

Hi,

I think we should just expand the Rle internally. That will produce
a numeric vector of the length of the GRanges i.e. it will be the
same size as the start and end components of the GRanges object itself.
No big deal at all.

I'll make that change.


Done in GenomicRanges 1.20.2 (release) and 1.21.1 (devel).

Cheers,
H.



H.

On 04/17/2015 09:00 AM, Michael Lawrence wrote:

Ideally it should be supported, but it would take some work as the
coverage
stuff is all in C. Could you give more details on your use case? For
example, if you already have a range for every position on the
chromosome,
you could just extract the score column. I'm guessing it's more
complicated
than that. If the zeros are the problem, you could just subset the
GRanges
to remove the ranges with zero score, and then coerce the score to
numeric
before calling coverage.

Michael

2015-04-17 8:00 GMT-07:00 Philip Lijnzaad p.lijnz...@umcutrecht.nl:


Dear all,  I'm puzzled by the following behaviour:

Given

 n - 10
 gr - GRanges(seqnames=Rle('A', n),
   ranges=IRanges(1:n, width=1),
   score=Rle(5,n))
If I do

 coverage(gr,weight='score')

I get

 Error in .normarg_shift_or_weight(weight, weight, x) :
   'weight' must be a numeric vector, a single string, or a
list-like
object

Surely 'score' should be allowed to be an Rle? Especially given the
fact that the return value of coverage(x,weight=score) when score is
plain numeric vector is always an Rle ! Is this the expected behaviour?
If so, I would argue that violates the principle of least suprise :-)

The background to this is that I do numerical analysis on derived
numerical data along my chromosomes. It contains many
contiguous zeroes so it would be wasteful to cast
everything down using as.numeric().

This is R version 3.01 on x86_64 Linux, Bioconductor version 2.13,


package.version(IRanges)

[1] 1.20.7

package.version(GenomicRanges)

[1] 1.14.4

Regards,


Philip


--
Philip Lijnzaad, PhD
Molecular Cancer Research
University Medical Center (UMC), Utrecht
Stratenum room 2.211
IM: plijnz...@jabber.org , philip.lijnz...@gmail.com
P.O. Box 85060, 3508 AB Utrecht
(Universiteitsweg 100, 3584 CG Utrecht)
The Netherlands
tel: +31 (0)8875 68464


--


De informatie opgenomen in dit bericht kan vertrouwelijk zijn en is
uitsluitend bestemd voor de geadresseerde. Indien u dit bericht
onterecht
ontvangt, wordt u verzocht de inhoud niet te gebruiken en de afzender
direct
te informeren door het bericht te retourneren. Het Universitair Medisch
Centrum Utrecht is een publiekrechtelijke rechtspersoon in de zin van de
W.H.W.
(Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat
geregistreerd
bij
de Kamer van Koophandel voor Midden-Nederland onder nr. 30244197.

Denk s.v.p aan het milieu voor u deze e-mail afdrukt.


--


This message may contain confidential information and ...{{dropped:10}}


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel





--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-17 Thread Hervé Pagès

On 04/17/2015 10:00 AM, Michael Lawrence wrote:

Is that the case here? He has an Rle as an mcol in the GRanges, so in
general expanding it will not align with the other components.


Not sure what you mean. Can you give an example?

H.



On Fri, Apr 17, 2015 at 9:42 AM, Hervé Pagès hpa...@fredhutch.org
mailto:hpa...@fredhutch.org wrote:

Hi,

I think we should just expand the Rle internally. That will produce
a numeric vector of the length of the GRanges i.e. it will be the
same size as the start and end components of the GRanges object itself.
No big deal at all.

I'll make that change.

H.


On 04/17/2015 09:00 AM, Michael Lawrence wrote:

Ideally it should be supported, but it would take some work as
the coverage
stuff is all in C. Could you give more details on your use case? For
example, if you already have a range for every position on the
chromosome,
you could just extract the score column. I'm guessing it's more
complicated
than that. If the zeros are the problem, you could just subset
the GRanges
to remove the ranges with zero score, and then coerce the score
to numeric
before calling coverage.

Michael

2015-04-17 8:00 GMT-07:00 Philip Lijnzaad
p.lijnz...@umcutrecht.nl mailto:p.lijnz...@umcutrecht.nl:

Dear all,  I'm puzzled by the following behaviour:

Given

  n - 10
  gr - GRanges(seqnames=Rle('A', n),
ranges=IRanges(1:n, width=1),
score=Rle(5,n))
If I do

  coverage(gr,weight='score')

I get

  Error in .normarg_shift_or_weight(weight, weight, x) :
'weight' must be a numeric vector, a single string,
or a list-like
object

Surely 'score' should be allowed to be an Rle? Especially
given the
fact that the return value of coverage(x,weight=score)
when score is
plain numeric vector is always an Rle ! Is this the expected
behaviour?
If so, I would argue that violates the principle of least
suprise :-)

The background to this is that I do numerical analysis on
derived
numerical data along my chromosomes. It contains many
contiguous zeroes so it would be wasteful to cast
everything down using as.numeric().

This is R version 3.01 on x86_64 Linux, Bioconductor version
2.13,

package.version(IRanges)

[1] 1.20.7

package.version(GenomicRanges)

[1] 1.14.4

Regards,


Philip


--
Philip Lijnzaad, PhD
Molecular Cancer Research
University Medical Center (UMC), Utrecht
Stratenum room 2.211
IM: plijnz...@jabber.org mailto:plijnz...@jabber.org ,
philip.lijnz...@gmail.com mailto:philip.lijnz...@gmail.com
P.O. Box 85060, 3508 AB Utrecht
(Universiteitsweg 100, 3584 CG Utrecht)
The Netherlands
tel: +31 (0)8875 68464 tel:%2B31%20%280%298875%2068464



--

De informatie opgenomen in dit bericht kan vertrouwelijk
zijn en is
uitsluitend bestemd voor de geadresseerde. Indien u dit
bericht onterecht
ontvangt, wordt u verzocht de inhoud niet te gebruiken en de
afzender
direct
te informeren door het bericht te retourneren. Het
Universitair Medisch
Centrum Utrecht is een publiekrechtelijke rechtspersoon in
de zin van de
W.H.W.
(Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat
geregistreerd
bij
de Kamer van Koophandel voor Midden-Nederland onder nr.
30244197.

Denk s.v.p aan het milieu voor u deze e-mail afdrukt.



--

This message may contain confidential information and
...{{dropped:10}}


___
Bioc-devel@r-project.org mailto:Bioc-devel@r-project.org
mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org mailto:hpa...@fredhutch.org
Phone: (206) 667-5791 tel:%28206%29%20667-5791
Fax: (206) 667-1319 

Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-17 Thread Hervé Pagès

Hi,

I think we should just expand the Rle internally. That will produce
a numeric vector of the length of the GRanges i.e. it will be the
same size as the start and end components of the GRanges object itself.
No big deal at all.

I'll make that change.

H.

On 04/17/2015 09:00 AM, Michael Lawrence wrote:

Ideally it should be supported, but it would take some work as the coverage
stuff is all in C. Could you give more details on your use case? For
example, if you already have a range for every position on the chromosome,
you could just extract the score column. I'm guessing it's more complicated
than that. If the zeros are the problem, you could just subset the GRanges
to remove the ranges with zero score, and then coerce the score to numeric
before calling coverage.

Michael

2015-04-17 8:00 GMT-07:00 Philip Lijnzaad p.lijnz...@umcutrecht.nl:


Dear all,  I'm puzzled by the following behaviour:

Given

 n - 10
 gr - GRanges(seqnames=Rle('A', n),
   ranges=IRanges(1:n, width=1),
   score=Rle(5,n))
If I do

 coverage(gr,weight='score')

I get

 Error in .normarg_shift_or_weight(weight, weight, x) :
   'weight' must be a numeric vector, a single string, or a list-like
object

Surely 'score' should be allowed to be an Rle? Especially given the
fact that the return value of coverage(x,weight=score) when score is
plain numeric vector is always an Rle ! Is this the expected behaviour?
If so, I would argue that violates the principle of least suprise :-)

The background to this is that I do numerical analysis on derived
numerical data along my chromosomes. It contains many
contiguous zeroes so it would be wasteful to cast
everything down using as.numeric().

This is R version 3.01 on x86_64 Linux, Bioconductor version 2.13,


package.version(IRanges)

[1] 1.20.7

package.version(GenomicRanges)

[1] 1.14.4

Regards,


Philip


--
Philip Lijnzaad, PhD
Molecular Cancer Research
University Medical Center (UMC), Utrecht
Stratenum room 2.211
IM: plijnz...@jabber.org , philip.lijnz...@gmail.com
P.O. Box 85060, 3508 AB Utrecht
(Universiteitsweg 100, 3584 CG Utrecht)
The Netherlands
tel: +31 (0)8875 68464


--

De informatie opgenomen in dit bericht kan vertrouwelijk zijn en is
uitsluitend bestemd voor de geadresseerde. Indien u dit bericht onterecht
ontvangt, wordt u verzocht de inhoud niet te gebruiken en de afzender
direct
te informeren door het bericht te retourneren. Het Universitair Medisch
Centrum Utrecht is een publiekrechtelijke rechtspersoon in de zin van de
W.H.W.
(Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat geregistreerd
bij
de Kamer van Koophandel voor Midden-Nederland onder nr. 30244197.

Denk s.v.p aan het milieu voor u deze e-mail afdrukt.


--

This message may contain confidential information and ...{{dropped:10}}


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-17 Thread Michael Lawrence
Is that the case here? He has an Rle as an mcol in the GRanges, so in
general expanding it will not align with the other components.

On Fri, Apr 17, 2015 at 9:42 AM, Hervé Pagès hpa...@fredhutch.org wrote:

 Hi,

 I think we should just expand the Rle internally. That will produce
 a numeric vector of the length of the GRanges i.e. it will be the
 same size as the start and end components of the GRanges object itself.
 No big deal at all.

 I'll make that change.

 H.


 On 04/17/2015 09:00 AM, Michael Lawrence wrote:

 Ideally it should be supported, but it would take some work as the
 coverage
 stuff is all in C. Could you give more details on your use case? For
 example, if you already have a range for every position on the chromosome,
 you could just extract the score column. I'm guessing it's more
 complicated
 than that. If the zeros are the problem, you could just subset the GRanges
 to remove the ranges with zero score, and then coerce the score to numeric
 before calling coverage.

 Michael

 2015-04-17 8:00 GMT-07:00 Philip Lijnzaad p.lijnz...@umcutrecht.nl:

  Dear all,  I'm puzzled by the following behaviour:

 Given

  n - 10
  gr - GRanges(seqnames=Rle('A', n),
ranges=IRanges(1:n, width=1),
score=Rle(5,n))
 If I do

  coverage(gr,weight='score')

 I get

  Error in .normarg_shift_or_weight(weight, weight, x) :
'weight' must be a numeric vector, a single string, or a list-like
 object

 Surely 'score' should be allowed to be an Rle? Especially given the
 fact that the return value of coverage(x,weight=score) when score is
 plain numeric vector is always an Rle ! Is this the expected behaviour?
 If so, I would argue that violates the principle of least suprise :-)

 The background to this is that I do numerical analysis on derived
 numerical data along my chromosomes. It contains many
 contiguous zeroes so it would be wasteful to cast
 everything down using as.numeric().

 This is R version 3.01 on x86_64 Linux, Bioconductor version 2.13,

  package.version(IRanges)

 [1] 1.20.7

 package.version(GenomicRanges)

 [1] 1.14.4

 Regards,


 Philip


 --
 Philip Lijnzaad, PhD
 Molecular Cancer Research
 University Medical Center (UMC), Utrecht
 Stratenum room 2.211
 IM: plijnz...@jabber.org , philip.lijnz...@gmail.com
 P.O. Box 85060, 3508 AB Utrecht
 (Universiteitsweg 100, 3584 CG Utrecht)
 The Netherlands
 tel: +31 (0)8875 68464



 --

 De informatie opgenomen in dit bericht kan vertrouwelijk zijn en is
 uitsluitend bestemd voor de geadresseerde. Indien u dit bericht onterecht
 ontvangt, wordt u verzocht de inhoud niet te gebruiken en de afzender
 direct
 te informeren door het bericht te retourneren. Het Universitair Medisch
 Centrum Utrecht is een publiekrechtelijke rechtspersoon in de zin van de
 W.H.W.
 (Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat
 geregistreerd
 bij
 de Kamer van Koophandel voor Midden-Nederland onder nr. 30244197.

 Denk s.v.p aan het milieu voor u deze e-mail afdrukt.



 --

 This message may contain confidential information and ...{{dropped:10}}


 ___
 Bioc-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/bioc-devel


 --
 Hervé Pagès

 Program in Computational Biology
 Division of Public Health Sciences
 Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N, M1-B514
 P.O. Box 19024
 Seattle, WA 98109-1024

 E-mail: hpa...@fredhutch.org
 Phone:  (206) 667-5791
 Fax:(206) 667-1319


[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-17 Thread Michael Lawrence
Ideally it should be supported, but it would take some work as the coverage
stuff is all in C. Could you give more details on your use case? For
example, if you already have a range for every position on the chromosome,
you could just extract the score column. I'm guessing it's more complicated
than that. If the zeros are the problem, you could just subset the GRanges
to remove the ranges with zero score, and then coerce the score to numeric
before calling coverage.

Michael

2015-04-17 8:00 GMT-07:00 Philip Lijnzaad p.lijnz...@umcutrecht.nl:

 Dear all,  I'm puzzled by the following behaviour:

 Given

 n - 10
 gr - GRanges(seqnames=Rle('A', n),
   ranges=IRanges(1:n, width=1),
   score=Rle(5,n))
 If I do

 coverage(gr,weight='score')

 I get

 Error in .normarg_shift_or_weight(weight, weight, x) :
   'weight' must be a numeric vector, a single string, or a list-like
 object

 Surely 'score' should be allowed to be an Rle? Especially given the
 fact that the return value of coverage(x,weight=score) when score is
 plain numeric vector is always an Rle ! Is this the expected behaviour?
 If so, I would argue that violates the principle of least suprise :-)

 The background to this is that I do numerical analysis on derived
 numerical data along my chromosomes. It contains many
 contiguous zeroes so it would be wasteful to cast
 everything down using as.numeric().

 This is R version 3.01 on x86_64 Linux, Bioconductor version 2.13,

  package.version(IRanges)
 [1] 1.20.7
  package.version(GenomicRanges)
 [1] 1.14.4

 Regards,


 Philip


 --
 Philip Lijnzaad, PhD
 Molecular Cancer Research
 University Medical Center (UMC), Utrecht
 Stratenum room 2.211
 IM: plijnz...@jabber.org , philip.lijnz...@gmail.com
 P.O. Box 85060, 3508 AB Utrecht
 (Universiteitsweg 100, 3584 CG Utrecht)
 The Netherlands
 tel: +31 (0)8875 68464


 --

 De informatie opgenomen in dit bericht kan vertrouwelijk zijn en is
 uitsluitend bestemd voor de geadresseerde. Indien u dit bericht onterecht
 ontvangt, wordt u verzocht de inhoud niet te gebruiken en de afzender
 direct
 te informeren door het bericht te retourneren. Het Universitair Medisch
 Centrum Utrecht is een publiekrechtelijke rechtspersoon in de zin van de
 W.H.W.
 (Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat geregistreerd
 bij
 de Kamer van Koophandel voor Midden-Nederland onder nr. 30244197.

 Denk s.v.p aan het milieu voor u deze e-mail afdrukt.


 --

 This message may contain confidential information and ...{{dropped:10}}

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] coverage(gr, weight='score') does not work when score(gr) is an Rle

2015-04-17 Thread Michael Lawrence
Sorry, I was confused. You're right, there's not much point in keeping it
compressed.

On Fri, Apr 17, 2015 at 10:14 AM, Hervé Pagès hpa...@fredhutch.org wrote:

 On 04/17/2015 10:00 AM, Michael Lawrence wrote:

 Is that the case here? He has an Rle as an mcol in the GRanges, so in
 general expanding it will not align with the other components.


 Not sure what you mean. Can you give an example?

 H.


 On Fri, Apr 17, 2015 at 9:42 AM, Hervé Pagès hpa...@fredhutch.org
 mailto:hpa...@fredhutch.org wrote:

 Hi,

 I think we should just expand the Rle internally. That will produce
 a numeric vector of the length of the GRanges i.e. it will be the
 same size as the start and end components of the GRanges object
 itself.
 No big deal at all.

 I'll make that change.

 H.


 On 04/17/2015 09:00 AM, Michael Lawrence wrote:

 Ideally it should be supported, but it would take some work as
 the coverage
 stuff is all in C. Could you give more details on your use case?
 For
 example, if you already have a range for every position on the
 chromosome,
 you could just extract the score column. I'm guessing it's more
 complicated
 than that. If the zeros are the problem, you could just subset
 the GRanges
 to remove the ranges with zero score, and then coerce the score
 to numeric
 before calling coverage.

 Michael

 2015-04-17 8:00 GMT-07:00 Philip Lijnzaad
 p.lijnz...@umcutrecht.nl mailto:p.lijnz...@umcutrecht.nl:


 Dear all,  I'm puzzled by the following behaviour:

 Given

   n - 10
   gr - GRanges(seqnames=Rle('A', n),
 ranges=IRanges(1:n, width=1),
 score=Rle(5,n))
 If I do

   coverage(gr,weight='score')

 I get

   Error in .normarg_shift_or_weight(weight, weight, x) :
 'weight' must be a numeric vector, a single string,
 or a list-like
 object

 Surely 'score' should be allowed to be an Rle? Especially
 given the
 fact that the return value of coverage(x,weight=score)
 when score is
 plain numeric vector is always an Rle ! Is this the expected
 behaviour?
 If so, I would argue that violates the principle of least
 suprise :-)

 The background to this is that I do numerical analysis on
 derived
 numerical data along my chromosomes. It contains many
 contiguous zeroes so it would be wasteful to cast
 everything down using as.numeric().

 This is R version 3.01 on x86_64 Linux, Bioconductor version
 2.13,

 package.version(IRanges)

 [1] 1.20.7

 package.version(GenomicRanges)

 [1] 1.14.4

 Regards,


 Philip


 --
 Philip Lijnzaad, PhD
 Molecular Cancer Research
 University Medical Center (UMC), Utrecht
 Stratenum room 2.211
 IM: plijnz...@jabber.org mailto:plijnz...@jabber.org ,
 philip.lijnz...@gmail.com mailto:philip.lijnz...@gmail.com
 P.O. Box 85060, 3508 AB Utrecht
 (Universiteitsweg 100, 3584 CG Utrecht)
 The Netherlands
 tel: +31 (0)8875 68464 tel:%2B31%20%280%298875%2068464



 --

 De informatie opgenomen in dit bericht kan vertrouwelijk
 zijn en is
 uitsluitend bestemd voor de geadresseerde. Indien u dit
 bericht onterecht
 ontvangt, wordt u verzocht de inhoud niet te gebruiken en de
 afzender
 direct
 te informeren door het bericht te retourneren. Het
 Universitair Medisch
 Centrum Utrecht is een publiekrechtelijke rechtspersoon in
 de zin van de
 W.H.W.
 (Wet Hoger Onderwijs en Wetenschappelijk Onderzoek) en staat
 geregistreerd
 bij
 de Kamer van Koophandel voor Midden-Nederland onder nr.
 30244197.

 Denk s.v.p aan het milieu voor u deze e-mail afdrukt.



 --

 This message may contain confidential information and
 ...{{dropped:10}}


 ___
 Bioc-devel@r-project.org mailto:Bioc-devel@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/bioc-devel


 --
 Hervé Pagès

 Program in Computational Biology
 Division of Public Health Sciences
 Fred Hutchinson