Re: [Bioc-devel] Need help figuring out GeometryDoesNotContainImage-error on machv2-build for chimeraviz

2020-04-23 Thread Bemis, Kylie
I’m now seeing the same "semi-transparency" error on my Mac builds for 
Cardinal. My vignettes have used transparency for years now and this has never 
been an issue before (on merida1 or otherwise).

I can reproduce the error locally with an X11() device, but not with quartz(), 
png(), etc.

(Note that my Cardinal 2.5.9 builds are currently failing due to an unrelated 
issue that I’ve since fixed, but the build system hasn’t gotten to 2.5.11 yet.)

~~~
Kylie Ariel Bemis (she/her)
Khoury College of Computer Sciences
Northeastern University
kuwisdelu.github.io










On Apr 23, 2020, at 11:28 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Ok so I'm changing my mind about this. I suspect that the error is actually 
related to the warning. The error comes from the magick package (a wrapper 
around the ImageMagick software) and it indicates a failure to crop an empty 
image. It can easily be reproduced with:

 ## Generate an empty image.
 png("myplot.png", bg="transparent")
 plot.new()
 dev.off()

 ## Try to crop it.
 magick::image_trim(magick::image_read("myplot.png"))
 # Error in magick_image_trim(image, fuzz) :
 #   R: GeometryDoesNotContainImage `/Users/biocbuild/myplot.png' @
 #   warning/attribute.c/GetImageBoundingBox/247

So I suspect that what happens is that the images generated on Mac by the code 
in the vignette are empty (because of the semi-transparency problem on Mac) 
which would explain why later knitr fails to crop them (it uses 
magick::image_trim() for that).
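
A minimal sketch of how the failing crop could be guarded, assuming the magick package is installed. The helper name `safe_trim` is hypothetical and is not part of knitr or magick; it only shows the idea of falling back to the untrimmed image when ImageMagick cannot find a bounding box.

```r
library(magick)

# Hypothetical helper (not part of knitr or magick): trim an image, but fall
# back to the untrimmed image if ImageMagick cannot compute a bounding box.
safe_trim <- function(path) {
  img <- image_read(path)
  tryCatch(
    image_trim(img),
    error = function(e) {
      message("could not trim ", basename(path), ": ", conditionMessage(e))
      img
    }
  )
}

# Reproduce the empty, transparent plot from the message above.
f <- tempfile(fileext = ".png")
png(f, bg = "transparent")
plot.new()
dev.off()

res <- safe_trim(f)
print(image_info(res))
```

Whether `image_trim()` errors on such an image may depend on the installed ImageMagick version; the fallback returns a usable image either way.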

I don't exactly understand why we wouldn't have seen the problem on merida1 
though (same version of knitr (1.28) and magick (2.3) on both machines) but it 
seems that chimeraviz has changed significantly between BioC 3.10 and 3.11. Did 
you start using semi-transparency recently in your plots?

Best,
H.


On 4/23/20 19:42, Hervé Pagès wrote:
Hi Stian,
I went on machv2 and gave this a shot. I can reproduce the 
GeometryDoesNotContainImage error in an interactive session. I don't have an 
answer yet but I was curious about the "semi-transparency is not supported on 
this device" warning and was wondering if it could somehow be related with the 
error.
It turns out that the warning is actually easy to reproduce on a Mac with 
something like this:
  plot.new()
  lines(c(0.1, 0.22), c(0.5, 0.44), type = "l", lwd = 1, col = "#FF80")
I think that the 4th byte (80) in the color specification ("#FF80") is the 
level of transparency.
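
The role of the last byte can be checked in base R. This is a small sketch that assumes an 8-digit colour string of the form "#RRGGBBAA"; the string used here is illustrative and is not claimed to be the exact one from the message.

```r
# Assumption: an 8-digit colour string "#RRGGBBAA"; this value is illustrative
# and is not the exact string from the message above.
semi_red <- "#FF000080"

# col2rgb() with alpha = TRUE returns a matrix with rows
# red, green, blue and alpha (each 0..255).
channels <- col2rgb(semi_red, alpha = TRUE)
print(channels)

# Alpha as a fraction of full opacity (255).
alpha_fraction <- channels["alpha", 1] / 255
print(alpha_fraction)
```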
I can get this warning on machv2 **and** merida1. Some googling indicates that 
this is a pretty common warning on Mac. Since we don't get the vignette error 
on merida1 I think it's unlikely that the warning is related to the error.
I'll keep investigating the GeometryDoesNotContainImage error...
H.
On 4/22/20 01:59, Stian Lågstad wrote:
I'm still unable to reproduce this error on my end. If anyone with a mac
could try building locally I would be very grateful. Thanks.

On Sat, Apr 18, 2020 at 4:06 PM Stian Lågstad <stianlags...@gmail.com> wrote:

Hi,

I haven't been able to figure out this error for the latest machv2 build
for chimeraviz:

```
Warning in doTryCatch(return(expr), name, parentenv, handler) :
   semi-transparency is not supported on this device: reported only once per page
Quitting from lines 108-126 (chimeraviz-vignette.Rmd)
Error: processing vignette 'chimeraviz-vignette.Rmd' failed with diagnostics:
R: GeometryDoesNotContainImage
`/private/tmp/RtmpdBrrvk/Rbuild9ed5154894cc/chimeraviz/vignettes/chimeraviz-vignette_files/figure-html/unnamed-chunk-7-1.png'
@ warning/attribute.c/GetImageBoundingBox/247
--- failed re-building ‘chimeraviz-vignette.Rmd’
```

The build in question:
http://bioconductor.org/checkResults/3.11/bioc-LATEST/chimeraviz/machv2-buildsrc.html

If anyone has seen something like this before then I'd appreciate some
help. Thank you!

--
Stian Lågstad
+47 41 80 80 25




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Need help figuring out GeometryDoesNotContainImage-error on machv2-build for chimeraviz

2020-04-23 Thread Bemis, Kylie
That’s interesting. I did:

> BiocManager::install("Cardinal", type="mac.binary.el-capitan")
> browseVignettes("Cardinal")

from R 3.6.3, and the figures using transparency in the vignettes look fine to 
me.

When I use X11() to reproduce the warning locally, the transparent colors get 
truncated, so that the higher-alpha colors appear opaque and the lower-alpha 
colors don’t appear at all.

However, in the merida1 vignette, the figures appear as I’d normally get from 
quartz() or pdf() locally, which don’t produce warnings for me on macOS 10.15.3.
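
One way to see in advance whether a device will handle alpha is to ask the device itself. The sketch below uses `dev.capabilities()` from grDevices; the helper name `device_supports_alpha` is hypothetical, and the PDF device is used only because it can be opened without a display.

```r
# Hypothetical helper: TRUE or FALSE if the current device reports its
# capability, NA if the device cannot say.
device_supports_alpha <- function() {
  dev.capabilities("semiTransparency")$semiTransparency
}

pdf(file = NULL)   # PDF device with no file written; it can still be queried
pdf_alpha <- device_supports_alpha()
dev.off()

print(pdf_alpha)
```

Running the same helper right after opening X11(), quartz() or png() would show how each device describes itself on a given machine.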

-Kylie






On Apr 24, 2020, at 12:39 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi Kylie,

I get the warnings on merida1 for Cardinal too e.g. when I run the code in the 
Cardinal-2-stats vignette:

   merida1:vignettes biocbuild$ pwd
   /Users/biocbuild/bbs-3.10-bioc/meat/Cardinal/vignettes

   merida1:vignettes biocbuild$ R CMD Stangle Cardinal-2-stats.Rmd
   Output file:  Cardinal-2-stats.R

   merida1:vignettes biocbuild$ R
   ...
   > source("Cardinal-2-stats.R", echo=TRUE)
   ...
   There were 14 warnings (use warnings() to see them)
   > warnings()
   Warning messages:
   1: In rect(left, top, r, b, angle = angle, density = density,  ... :
     semi-transparency is not supported on this device: reported only once per page
   2: In rect(left, top, r, b, angle = angle, density = density,  ... :
     semi-transparency is not supported on this device: reported only once per page
   ...

The thing is that 'R CMD build' does not display warnings, unless there is an 
error. Maybe that's why you've never seen them until now because of the error 
you have on machv2 (and other platforms).
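
Since warnings stay hidden during a successful build, they can be surfaced by running the tangled vignette code under a calling handler. This is a sketch only: `collect_warnings` is a hypothetical helper, and the two warnings below are made up to stand in for `source("Cardinal-2-stats.R")`.

```r
# Hypothetical helper: evaluate `expr`, report each warning as it happens,
# and return all warning messages as a character vector.
collect_warnings <- function(expr) {
  seen <- character()
  withCallingHandlers(
    expr,
    warning = function(w) {
      seen <<- c(seen, conditionMessage(w))
      message("warning: ", conditionMessage(w))
      invokeRestart("muffleWarning")  # continue evaluating after the warning
    }
  )
  seen
}

# Stand-in for source("Cardinal-2-stats.R"); both warnings are made up.
msgs <- collect_warnings({
  warning("first made-up warning")
  warning("second made-up warning")
})
print(length(msgs))
```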

It'd be interesting to know if the plots included in the vignette are actually 
OK. Have you checked them? You can do this by installing the Mac binary for 
Cardinal in BioC 3.10 with:

   BiocManager::install("Cardinal", type="mac.binary.el-capitan")

(make sure you do this in R 3.6). This will install the vignette generated on 
merida1. Then open the vignette via browseVignettes("Cardinal") and check the 
plots. Do they look ok despite the "semi-transparency" problem?

Thanks,
H.




Re: [Bioc-devel] Need help figuring out GeometryDoesNotContainImage-error on machv2-build for chimeraviz

2020-04-23 Thread Bemis, Kylie
Worked for me without errors or warnings:

kuwisdelu@Eva-02-Dash Projects % R CMD build chimeraviz
* checking for file ‘chimeraviz/DESCRIPTION’ ... OK
* preparing ‘chimeraviz’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
Removed empty directory ‘chimeraviz/docker’
* building ‘chimeraviz_1.13.8.tar.gz’

kuwisdelu@Eva-02-Dash Projects % R CMD INSTALL chimeraviz_1.13.8.tar.gz
* installing to library ‘/Users/kuwisdelu/Library/R/4.0/library’
* installing *source* package ‘chimeraviz’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (chimeraviz)

Under:

> sessionInfo()
R version 4.0.0 RC (2020-04-18 r78249)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.3

Matrix products: default
BLAS:   
/Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: 
/Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.0 tools_4.0.0

The vignette looks okay as far as I can tell.

-Kylie






On Apr 24, 2020, at 1:40 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Interesting indeed. Thanks for checking this.

Even though I'm not sure what conclusion to draw from all this.

Since you are on a Mac, can I ask you another big favor? Do you think you could 
run 'R CMD build' on chimeraviz and see if you can reproduce the error we see 
on the build report here:

https://bioconductor.org/checkResults/3.11/bioc-LATEST/chimeraviz/machv2-buildsrc.html

Get the source with

 git clone https://git.bioconductor.org/packages/chimeraviz

Thanks,
H.



Re: [Bioc-devel] Updating author/maintainer info

2016-09-21 Thread Bemis, Kylie
Thanks, but when I follow the directions, it doesn’t work:

svn: E175013: Commit failed (details follow):
svn: E175013: Unable to connect to a repository at URL 
'https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs/CardinalWorkflows'
svn: E175013: Access to 
'https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs/CardinalWorkflows' 
forbidden

Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University





On Sep 21, 2016, at 10:37 PM, Monther Alhamdoosh <m.hamdo...@gmail.com> wrote:

Hi Kyle,

See the Experiment Data Packages section here

https://bioconductor.org/developers/how-to/source-control/

Cheers,
Monther


On Thu, Sep 22, 2016 at 12:16 PM, Kyle Dwayne Bemis <kbe...@purdue.edu> wrote:
Hello,

My personal information has changed and I need to update my author/maintainer 
name and email address on my packages.

For Cardinal, this is easy, but CardinalWorkflows is a data package, so I 
cannot update it myself. How can I update the data package info?

Besides subscribing to the mailing list on my new institution's email address, 
is there anything else I need to do?

Thank you,
Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University


Re: [Bioc-devel] Updating author/maintainer info

2016-09-22 Thread Bemis, Kylie
It works now, thanks!

Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io




On Sep 22, 2016, at 1:45 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi Kylie,

It seems that we somehow forgot to grant you write access to your
CardinalWorkflows package. This should work now. Please try again.

Sorry for the inconvenience,
H.


Re: [Bioc-devel] Reading and storing single cell ATAC data

2016-09-23 Thread Bemis, Kylie
Hi Caleb,

Per your question on sparse on-disk matrices: is your experimental data coming 
in some pre-defined text or binary file format, or are you looking to convert 
to a new, custom format to take advantage of the sparsity?

I need to start working on an on-disk sparse matrix implementation sometime 
myself for our MS imaging datasets, so I am wondering what others’ needs are in 
this regard.

Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io




On Sep 23, 2016, at 5:37 PM, Caleb Lareau <caleblar...@g.harvard.edu> wrote:

Hi everyone—

I’m working with a team that’s generating single cell ATAC data in large 
amounts and am designing the framework of an S4 object to facilitate analyses 
in R. I have a couple of high-level questions that I wanted to pose early to 
hopefully attain some community guidance in the implementation of these data 
structures.


Question on S4 scATAC Structure--
It’s easy to imagine scATAC data as a matrix where the rows are particular 
peaks and the columns are individual samples. We already have such an 
impressive volume of data, such that if stored in an ordinary matrix, we run 
into ~20 GB objects. As these data are very sparse, we store the peak values in 
a sparse matrix (through the Matrix library). I wanted to collate the peak 
information (probably in GRanges object) and sample information (in a data 
frame) as well as some potential meta data in an S4 object.

Easy enough, sure, but after looking at the scRNA structure (e.g. scater), 
I feel like I should be considering how to inherit some of the nice properties 
from the canonical `ExpressionSet` structure. However, since my constraints 
aren’t directly compatible (namely the featureData slot really needs to be a 
GRanges and the exprs slot must be an object from Matrix), it wasn’t clear to 
me how to maximize the inheritance properties while adjusting to my unique 
constraints. Also, it wasn’t clear to me whether or not I could inherit from 
`SummarizedExperiment` due to the different nature of the sparse matrix. Does 
anyone have any advice on this structure?


Question on reading sparse matrices from disk--
I’m trying to work out the best way to selectively read certain rows and columns 
from a sparse matrix on disk into memory. I anticipate a time fairly soon that 
loading our full scATAC data, even in sparse matrices, is going to be 
untenable. Any matrix reading/slicing implementations that I’ve seen don’t play 
friendly with sparse matrices. So, I hacked together two solutions— 1) reads 
and subsets a gzipped matrix with 3 columns (row index; column index; non-zero 
value) through a system call to awk. 2) converts that same 3 column matrix into 
an SQLite object and send queries to read values based on indices. The hiccups 
are that 1) doesn’t play friendly on non-unix platforms and always scans the 
full file, and 2) is faster for querying, but the binary object is ~7x larger 
than the gzipped object. I’ve played around with hdf5 as well, but it didn’t 
seem to give me much back in terms of speed or storage benefits comparatively. 
Has anyone found an implementation that achieves a decent lookup time and 
compression, or am I essentially needing to choose between the two?
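
The 3-column (row index; column index; non-zero value) layout described above maps directly onto a sparse matrix constructor. The sketch below uses the Matrix package and made-up triplets; `read_sparse_subset` is a hypothetical helper, and it filters an in-memory data frame rather than a gzipped file or SQLite table, so it illustrates only the subsetting step, not on-disk access.

```r
library(Matrix)

# Made-up triplets standing in for the 3-column (row, column, value) file.
triplets <- data.frame(
  row   = c(1, 3, 3, 5),
  col   = c(2, 1, 4, 4),
  value = c(7, 2, 9, 1)
)

# Hypothetical helper: keep only the requested rows and columns before
# building the sparse matrix, so the full matrix is never materialised.
read_sparse_subset <- function(trip, rows, cols) {
  keep <- trip$row %in% rows & trip$col %in% cols
  sub <- trip[keep, ]
  sparseMatrix(
    i = match(sub$row, rows),   # re-index into the requested row order
    j = match(sub$col, cols),   # re-index into the requested column order
    x = sub$value,
    dims = c(length(rows), length(cols))
  )
}

m <- read_sparse_subset(triplets, rows = c(3, 5), cols = c(1, 4))
print(m)
```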



Thanks and have a great weekend!
-Caleb

Re: [Bioc-devel] any interest in a BiocMatrix core package?

2017-02-24 Thread Bemis, Kylie
It’s not there yet, but I plan to expose a C++ API for my disk-backed matrix 
objects in the next version of my ‘matter’ package.

It’s getting easier to interchange matter/HDF5Array/bigmemory/etc. objects at 
the R level, especially if using a frontend like DelayedArray on top of them, 
but it would be nice to have a common C++ API that I could hook into as well (a 
la Rcpp), so new C/C++ could be re-used across various backends more easily.

Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io




On Feb 24, 2017, at 4:50 PM, Aaron Lun <a...@wehi.edu.au> wrote:

It's a good place to start, though it would be very handy to have a C(++) API 
that can be linked against. I'm not sure how much work that would entail but it 
would give downstream developers a lot more options. Sort of like how we can 
link to Rhtslib, which speeds up a lot of BAM file processing, instead of just 
relying on Rsamtools.


-Aaron


From: Tim Triche, Jr. <tim.tri...@gmail.com>
Sent: Saturday, 25 February 2017 8:34:58 AM
To: Aaron Lun
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] any interest in a BiocMatrix core package?

yes

the DelayedArray framework that handles HDF5Array, etc. seems like the right 
choice?

--t

On Fri, Feb 24, 2017 at 1:26 PM, Aaron Lun <a...@wehi.edu.au> wrote:
Hi everyone,

I just attended the Human Cell Atlas meeting in Stanford, and people were 
talking about gene expression matrices for >1 million cells. If we assume that 
we can get non-zero expression profiles for ~5000 genes, we'd be talking about 
a 5000 x 1 million matrix for the raw count data. This would be 20-40 GB in 
size, which would clearly benefit from sparse (via Matrix) or disk-backed 
representations (bigmatrix, BufferedMatrix, rhdf5, etc.).

I'm wondering whether there is any appetite amongst us for making a consistent 
BioC API to handle these matrices, sort of like what BiocParallel does for 
multicore and snow. It goes without saying that the different matrix 
representations should have consistent functions at the R level (rbind/cbind, 
etc.) but it would also be nice to have an integrated C/C++ API (accessible via 
LinkingTo). There's many non-trivial things that can be done with this type of 
data, and it is often faster and more memory efficient to do these complex 
operations in compiled code.

I was thinking of something that you could supply any supported matrix 
representation to a registered function via .Call; the C++ constructor would 
recognise the type of matrix during class instantiation; and operations 
(row/column/random read access, also possibly various ways of writing a matrix) 
would be overloaded and behave as required for the class. Only the 
implementation of the API would need to care about the nitty gritty of each 
representation, and we would all be free to write code that actually does the 
interesting analytical stuff.
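
The proposal above is for a C++ API, but the dispatch idea can be sketched at the R level with an S4 generic. Everything here is hypothetical: `read_row` and `row_sum` are illustrative names, not an existing Bioconductor interface, and only an ordinary matrix and a Matrix-package sparse matrix are covered.

```r
library(Matrix)

# Hypothetical generic, not an existing Bioconductor API: one entry point
# for row access, with a method per matrix representation.
setGeneric("read_row", function(x, i) standardGeneric("read_row"))

setMethod("read_row", "matrix", function(x, i) x[i, ])

# For any Matrix-package sparse matrix, return the row as a dense numeric vector.
setMethod("read_row", "sparseMatrix", function(x, i) as.numeric(x[i, ]))

dense  <- matrix(c(0, 1, 4, 0, 0, 2), nrow = 2)
sparse <- Matrix(dense, sparse = TRUE)

# Analysis code is written once against the generic.
row_sum <- function(x, i) sum(read_row(x, i))

print(row_sum(dense, 2))
print(row_sum(sparse, 2))
```

Only the methods need to know about each representation; `row_sum` stays unchanged when a new backend gets its own `read_row` method.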

Anyway, just throwing some thoughts out there. Any comments appreciated.

Cheers,

Aaron


Re: [Bioc-devel] Generate valid SSH keys for the bioc-git server!

2017-08-18 Thread Bemis, Kylie
Dear Bioc-Devel

I am still having trouble fetching from the repository. I’ve tried submitting 
both my Github username and copy-pasting public SSH key to the Google Form.

My public key in ~/.ssh/id_rsa.pub and on Github look similar to the example, 
so it appears valid to me, but I still get an access rights error trying to 
fetch.

Thanks,
Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io




On Aug 17, 2017, at 10:07 AM, Turaga, Nitesh <nitesh.tur...@roswellpark.org> wrote:

Hi Maintainers,

Some of you have submitted keys which look like this,

```3d:9a:10:86:19:3d:ac:b4:a1:fe:c2:a9:8c:28:55:49```  ##wrong, this is a SSH 
key fingerprint. This is NOT an ssh key.

A valid ssh key looks like,

```
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAklOUpkDHrfHY17SbrmTIpNLTGK9Tjom/BWDSU
GPl+nafzlHDTYW7hdI4yZ5ew18JH4JW9jbhUFrviQzM7xlELEVf4h9lFX5QVkbPppSwg0cda3
Pbv7kOdJ/MTyBlWXFCR+HAo3FXRitBqxiX1nKhXpHAZsMciLq8V6RjsNAQwdsdMFvSlVK/7XA
t3FaoJoAsncM1Q9x5+3V0Ww68/eIFmb1zuUFljQJKprrX88XypNDvjYNby6vw/Pb0rwert/En
mZ+AW4OZPnTPI89ZPmVMLuayrD2cE86Z/il8b+gw3r3+1nKatmIkjn2so1d01QraTlMqVSsbx
NrRFi9wrf+M7Q==
```

Please check this link to learn how to generate keys,

https://git-scm.com/book/en/v2/Git-on-the-Server-Generating-Your-SSH-Public-Key

https://www.bioconductor.org/developers/how-to/git/.

https://www.bioconductor.org/developers/how-to/git/faq/

If you want to validate your key before submitting it, please try,

```
ssh-keygen -l -f id_rsa.pub
```
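
The difference between the two formats shown above can also be checked mechanically before pasting anything into a form. This is a rough sketch in R: both helper names are hypothetical, and the patterns are assumptions based on the two examples in this message, not the validation performed by the Bioconductor git server.

```r
# Hypothetical checks; the patterns are assumptions about the two formats
# shown above, not the validation used by the Bioconductor git server.

# An MD5-style fingerprint: 16 colon-separated pairs of hex digits.
is_md5_fingerprint <- function(x) {
  grepl("^([0-9a-f]{2}:){15}[0-9a-f]{2}$", trimws(x))
}

# A public key line: key type, base64 body, optional trailing comment.
looks_like_public_key <- function(x) {
  grepl("^(ssh-rsa|ssh-ed25519) [A-Za-z0-9+/]+={0,2}( .*)?$", trimws(x))
}

fingerprint <- "3d:9a:10:86:19:3d:ac:b4:a1:fe:c2:a9:8c:28:55:49"
print(is_md5_fingerprint(fingerprint))
print(looks_like_public_key(fingerprint))
```

A key split over several lines would need to be joined into one line before being passed to `looks_like_public_key()`.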

Another point I’d like to make is, please submit your queries to the bioc-devel 
mailing list. The bioc-devel mailing list email is bioc-devel@r-project.org. It 
is very useful for the community to know about these issues.


Thanks,


Best,

Nitesh


On Aug 16, 2017, at 9:05 PM, Turaga, Nitesh wrote:

You have a malformed key. Please check the key you submitted.

If you do not know what an SSH key is,

https://www.bioconductor.org/developers/how-to/git/.

https://www.bioconductor.org/developers/how-to/git/faq/

Please follow the links to submit a SSH key which is valid.

Please also reply to bioc-devel, the reply will be much faster and there are 
many people who might have the same question.

Nitesh

On Aug 16, 2017, at 9:00 PM, Dario Strbenac  wrote:

Good day,

I used the Google Form a couple of weeks ago to submit the public key for 
ClassifyR. I still can't fetch the repository. What is the status of it being 
registered?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia





Re: [Bioc-devel] Generate valid SSH keys for the bioc-git server!

2017-08-18 Thread Bemis, Kylie
Thanks! It works now.

Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io




On Aug 18, 2017, at 1:36 PM, Turaga, Nitesh <nitesh.tur...@roswellpark.org> wrote:

Hi Kylie,

Your key shows as accepted.

Nitesh


On Aug 18, 2017, at 1:28 PM, Bemis, Kylie 
mailto:k.be...@northeastern.edu>> wrote:

Dear Bioc-Devel

I am still having trouble fetching from the repository. I’ve tried submitting 
both my Github username and copy-pasting public SSH key to the Google Form.

My public key in ~/.ssh/id_rsa.pub and on Github look similar to the example, 
so it appears valid to me, but I still get an access rights error trying to 
fetch.

Thanks,
Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io<http://kuwisdelu.github.io/><https://kuwisdelu.github.io<https://kuwisdelu.github.io/>>




On Aug 17, 2017, at 10:07 AM, Turaga, Nitesh 
mailto:nitesh.tur...@roswellpark.org><mailto:nitesh.tur...@roswellpark.org>>
 wrote:

Hi Maintainers,

Some of you have submitted keys which look like this,

```3d:9a:10:86:19:3d:ac:b4:a1:fe:c2:a9:8c:28:55:49```  ##wrong, this is a SSH 
key fingerprint. This is NOT an ssh key.

A valid ssh key looks like,

```
ssh-rsa B3NzaC1yc2EBIwAAAQEAklOUpkDHrfHY17SbrmTIpNLTGK9Tjom/BWDSU
GPl+nafzlHDTYW7hdI4yZ5ew18JH4JW9jbhUFrviQzM7xlELEVf4h9lFX5QVkbPppSwg0cda3
Pbv7kOdJ/MTyBlWXFCR+HAo3FXRitBqxiX1nKhXpHAZsMciLq8V6RjsNAQwdsdMFvSlVK/7XA
t3FaoJoAsncM1Q9x5+3V0Ww68/eIFmb1zuUFljQJKprrX88XypNDvjYNby6vw/Pb0rwert/En
mZ+AW4OZPnTPI89ZPmVMLuayrD2cE86Z/il8b+gw3r3+1nKatmIkjn2so1d01QraTlMqVSsbx
NrRFi9wrf+M7Q==
```

Please check this link to learn how to generate keys,

https://git-scm.com/book/en/v2/Git-on-the-Server-Generating-Your-SSH-Public-Key

https://www.bioconductor.org/developers/how-to/git/.

https://www.bioconductor.org/developers/how-to/git/faq/

If you want to validate your key before sub, please try,

```
ssh-keygen -l -f id_rsa.pub
```

Another point I’d like to make is, please submit your queries to the bioc-devel 
mailing list. The bioc-devel mailing list email is bioc-devel@r-project.org. It 
is very useful for the community to know about these issues.


Thanks,


Best,

Nitesh


On Aug 16, 2017, at 9:05 PM, Turaga, Nitesh  
wrote:

You have a malformed key. Please check the key you submitted.

If you do not know what an SSH key is,

https://www.bioconductor.org/developers/how-to/git/.

https://www.bioconductor.org/developers/how-to/git/faq/

Please follow the links to submit a SSH key which is valid.

Please also reply to bioc-devel, the reply will be much faster and there are 
many people who might have the same question.

Nitesh

On Aug 16, 2017, at 9:00 PM, Dario Strbenac  wrote:

Good day,

I used the Google Form a couple of weeks ago to submit the public key for 
ClassifyR. I still can't fetch the repository. What is the status of it being 
registered?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia




This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[[alternative HTML version deleted]]


[Bioc-devel] Cardinal duplicate commits

2017-10-17 Thread Bemis, Kylie
Hi all,

I have a problem with duplicate commits in my package “Cardinal”. So far I have 
avoided making more commits to my other package “matter” until I am sure how to 
avoid this again.

Soon after the git transition, I made a few small commits to the Bioconductor 
git repo, which I notice now included duplicate commits in the history. They 
were not rejected at the time. New commits now are rejected for me.

Locally and on my own Github repo (https://github.com/kuwisdelu/Cardinal) I 
have reverted to a commit from before I merged the Bioconductor master, to get 
a clean history without duplicate commits.

I have been trying to push this with


git push -f upstream

from the “Abandon changes” help page.

This is being rejected with:

kuwisdelu$ git push --force upstream master
Total 0 (delta 0), reused 0 (delta 0)
remote: FATAL: + refs/heads/master packages/Cardinal k.bemis DENIED by fallthru
remote: error: hook declined to update refs/heads/master
To git.bioconductor.org:packages/Cardinal.git
 ! [remote rejected] master -> master (hook declined)
error: failed to push some refs to 
'g...@git.bioconductor.org:packages/Cardinal.git'

What is the best way to get a clean history and avoid this happening again?

Thank you,
Kylie

~~~
Kylie Ariel Bemis
Future Faculty Fellow
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io






Re: [Bioc-devel] any interest in a BiocMatrix core package?

2017-11-01 Thread Bemis, Kylie
 approach for a given set
of tasks.  Should we devise a set of bioinformatic benchmark problems to
foster comparison and informed
decisionmaking?  @becker.gabe: is ALTREP far enough along that one could
contemplate benchmarking with it?

On Fri, Feb 24, 2017 at 7:08 PM, Bemis, Kylie <k.be...@northeastern.edu> wrote:

> It’s not there yet, but I plan to expose a C++ API for my disk-backed
> matrix objects in the next version of my ‘matter’ package.
>
> It’s getting easier to interchange matter/HDF5Array/bigmemory/etc.
> objects at the R level, especially if using a frontend like DelayedArray on
> top of them, but it would be nice to have a common C++ API that I could
> hook into as well (a la Rcpp), so new C/C++ could be re-used across various
> backends more easily.
>
> Kylie
>
> ~~~
> Kylie Ariel Bemis
> Future Faculty Fellow
> College of Computer and Information Science
> Northeastern University
> kuwisdelu.github.io
>
>
>
>
> On Feb 24, 2017, at 4:50 PM, Aaron Lun <a...@wehi.edu.au> wrote:
>
> It's a good place to start, though it would be very handy to have a C(++)
> API that can be linked against. I'm not sure how much work that would
> entail but it would give downstream developers a lot more options. Sort of
> like how we can link to Rhtslib, which speeds up a lot of BAM file
> processing, instead of just relying on Rsamtools.
>
>
> -Aaron
>
> 
> From: Tim Triche, Jr. <tim.tri...@gmail.com>
> Sent: Saturday, 25 February 2017 8:34:58 AM
> To: Aaron Lun
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] any interest in a BiocMatrix core package?
>
> yes
>
> the DelayedArray framework that handles HDF5Array, etc. seems like the
> right choice?
>
> --t
>
> On Fri, Feb 24, 2017 at 1:26 PM, Aaron Lun <a...@wehi.edu.au> wrote:
> Hi everyone,
>
> I just attended the Human Cell Atlas meeting in Stanford, and people were
> talking about gene expression matrices for >1 million cells. If we assume
> that we can get non-zero expression profiles for ~5000 genes, we'd be
> talking about a 5000 x 1 million matrix for the raw count data. This would
> be 20-40 GB in size, which would clearly benefit from sparse (via Matrix)
> or disk-backed representations (bigmatrix, BufferedMatrix, rhdf5, etc.).
>
> I'm wondering whether there is any appetite amongst us for making a
> consistent BioC API to handle these matrices, sort of like what
> BiocParallel does for multicore and snow. It goes without saying that the
> different matrix representations should have consistent functions at the R
> level (rbind/cbind, etc.) but it would also be nice to have an integrated
> C/C++ API (accessible via LinkingTo). There are many non-trivial things that
> can be done with this type of data, and it is often faster and more memory
> efficient to do these complex operations in compiled code.
>
> I was thinking of something that you could supply any supported matrix
> representation to a registered function via .Call; the C++ constructor
> would recognise the type of matrix during class instantiation; and
> operations (row/column/random read access, also possibly various ways of
> writing a matrix) would be overloaded and behave as required for the class.
> Only the implementation of the API would need to care about the nitty
> gritty of each representation, and we would all be free to write code that
> actually does the interesting analytical stuff.
>
> Anyway, just throwing some thoughts out there. Any comments appreciated.
>
> Cheers,
>
> Aaron
>

Re: [Bioc-devel] any interest in a BiocMatrix core package?

2017-11-01 Thread Bemis, Kylie
Yes, the ideal solution seems rather unlikely, but I feel like there must be a 
solution better than the current situation.

I’d like to implement some more of the functionality from matrixStats for 
‘matter’ matrices, but importing DelayedArray and DelayedMatrixStats solely for 
the generic functions seems like a bit much. Is that the best thing to do 
though?
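
To make the idea concrete, here is a minimal sketch of what a shared generics package could provide, using only base R and the methods package. The generic's signature, the toy class `ToyBackedMatrix`, and both methods are assumptions made for illustration; this is not the API of matrixStats, DelayedMatrixStats, or matter.

```r
library(methods)

# Hypothetical shared generic, as a common "MatrixGenerics"-style package
# might define it (the signature here is an assumption).
setGeneric("rowSds", function(x, ...) standardGeneric("rowSds"))

# Default method for ordinary in-memory matrices, written in base R.
setMethod("rowSds", "matrix", function(x, ...) {
    apply(x, 1L, stats::sd)
})

# Toy stand-in for a backend-specific class (e.g. a disk-backed matrix).
setClass("ToyBackedMatrix", representation(data = "matrix"))

# A backend package would import the generic and add only its own method.
setMethod("rowSds", "ToyBackedMatrix", function(x, ...) {
    rowSds(x@data, ...)
})

m <- matrix(c(1, 2, 3, 4, 6, 8), nrow = 2, byrow = TRUE)
rowSds(m)                                   # 1 2
rowSds(new("ToyBackedMatrix", data = m))    # same values via the toy backend
```

With this layout each backend depends on one small package for the generic rather than on another backend's full implementation.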

Any suggestions?

-Kylie

> On Nov 1, 2017, at 4:59 PM, Hervé Pagès  wrote:
> 
> That's probably a good idea but a clean solution would need to
> involve all players, including the Matrix package. Right now there
> are conflicts for some S4 generics defined in Matrix and in
> BiocGenerics (e.g. rowSums). I'm not sure that moving rowSums from
> BiocGenerics to a new MatrixGenerics package would address this.
> Unless MatrixGenerics is on CRAN and Matrix depends on it ;-)
> 
> How likely is this to happen?
> 
> H.
> 
> On 11/01/2017 01:44 PM, Peter Hickey wrote:
>> I think that's a good idea, Kylie.
>> Pete (DelayedMatrixStats developer)
>> 
>> On Thu., 2 Nov. 2017, 6:09 am Kasper Daniel Hansen, <
>> kasperdanielhan...@gmail.com> wrote:
>> 
>>> I think it makes sense. A lot of sense. Might be useful to involve Henrik
>>> (matrixStats) as well.
>>> 
>>> Who are the players, apart from DelayedArray/DelayedMatrixStats and matter?
>>> (and some very old stuff in Biobase which should really be deprecated in
>>> favor of matrixStats).
>>> 
>>> Best,
>>> Kasper
>>> 
>>> On Wed, Nov 1, 2017 at 3:03 PM, Bemis, Kylie 
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> To continue a variant of this conversation, with the latest BioC release,
>>>> we now have quite a few packages that are implementing various
>>>> matrix-related S4 generic functions, many of them relying on matrixStats
>>>> as a template.
>>>> 
>>>> I was wondering if there is any interest or intention to create a common
>>>> MatrixGenerics/ArrayGenerics package on which we can depend to import the
>>>> relevant S4 generic functions. Although BiocGenerics has a few like
>>>> ‘rowSums()’ and ‘colMeans()’, etc., there are many more that are
>>>> implemented across ‘DelayedArray’, ‘DelayedMatrixStats’, my own package
>>>> ‘matter’, etc., including ‘apply()’, ‘rowSds()’, ‘colVars()’, and so
>>>> forth.
>>>> 
>>>> It would be nice to have a single package with minimal additional
>>>> dependencies (a la BiocGenerics) where we could import the various S4
>>>> generics and avoid unwanted namespace collisions.
>>>> 
>>>> Have there been any thoughts on this?
>>>> 
>>>> Many thanks,
>>>> Kylie
>>>> 
>>>> ~~~
>>>> Kylie Ariel Bemis
>>>> Future Faculty Fellow
>>>> College of Computer and Information Science
>>>> Northeastern University
>>>> kuwisdelu.github.io
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Mar 3, 2017, at 11:27 AM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:
>>>> 
>>>> On Fri, Mar 3, 2017 at 10:22 AM, Vincent Carey <st...@channing.harvard.edu> wrote:
>>>> 
>>>> On Fri, Mar 3, 2017 at 10:07 AM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:
>>>> Some comment on Aaron's stuff
>>>> 
>>>> One possibility for doing things like this is if your code can be done in
>>>> C++ using a subset of rows or columns.  That can sometimes give the
>>>> necessary speed up.  What I mean is this
>>>> 
>>>> Say you can safely process 1000 cells (not matrix cells, but biological
>>>> cells, aka columns) at a time in RAM
>>>> 
>>>> iterate in R:
>>>>   get chunk i containing 1000 cells from the backend data storage
>>>>   do something on this sub matrix where everything is in a normal matrix
>>>> and 

[Bioc-devel] Possible to export coerce2() from S4Vectors?

2018-11-13 Thread Bemis, Kylie
Dear all,

Are there any plans to export coerce2() from the S4Vectors namespace, like 
other exported internal utilities such as showAsCell() and setListElement()?

I have a couple classes that inherit from DataFrame, and some inherited methods 
(like [[<-) break in certain situations due to calls to coerce2() that coerce 
arguments to a regular DataFrame (instead of my subclass). This could be fixed 
if I were able to implement a coerce2() method for my subclass.

Any suggestions on how to approach problems like this when inheriting from 
DataFrame and other Vector derivatives?

Many thanks,
Kylie

~~~
Kylie Ariel Bemis
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io








Re: [Bioc-devel] Possible to export coerce2() from S4Vectors?

2018-11-14 Thread Bemis, Kylie
Hi Herve,

Thanks for the detailed reply. Using as() makes sense. Unfortunately my use 
case makes it a little more complicated.

The issue comes from a combination of factors:

- My DataFrame subclasses track additional metadata for each row, separate from 
the typical user-defined columns
- This metadata is checked to decide how to do cbind(...) or if cbind(...) 
makes sense for those objects
- cbind(...) ends up being called internally by some inherited assignment 
methods like [[<-
- Coercing to my subclass with as() results in incompatible metadata, causing 
cbind(...) to fail

I see a few solutions:

1. Using coerce2() works where as() doesn’t, because it takes an example of the 
“to” object rather than just the class, so compatible metadata can be copied 
directly from the “to” object, allowing cbind(...) to work as intended.

2. Create an exception to my class logic that allows the metadata to be 
missing, and change my cbind(...) implementation to ignore the metadata in the 
case that it is missing.

3. Supply my own version of methods like [[<-. I don’t like this one, since it 
should be unnecessary.

I can do (2), but I would need to rethink some of my other methods that expect 
that metadata to exist, so I wanted to check on the plans for coerce2() before 
making those changes.
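
The difference between the two kinds of coercion in (1) can be sketched with toy S4 classes in base R. The classes `Base` and `Indexed` and the helper `coerce_like()` are hypothetical stand-ins for DataFrame, a DataFrame subclass, and a coerce2()-style function; the real S4Vectors internals are not reproduced here.

```r
library(methods)

# Toy stand-ins: "Base" plays the role of DataFrame, "Indexed" of a subclass
# that carries extra per-object metadata in its own slot.
setClass("Base", representation(values = "list"))
setClass("Indexed", contains = "Base", representation(ids = "numeric"))

plain <- new("Base", values = list(a = 1:3))

# Class-only coercion: as() knows the target class but not which ids to use,
# so the result carries the prototype (empty) ids.
by_class <- as(plain, "Indexed")
length(by_class@ids)   # 0

# Coercion by example (hypothetical helper): the extra metadata is copied
# from an existing object of the target class.
coerce_like <- function(from, to) {
    out <- as(from, class(to))
    out@ids <- to@ids
    out
}

template <- new("Indexed", values = list(), ids = c(0.1, 0.2, 0.3))
by_example <- coerce_like(plain, template)
by_example@ids   # 0.1 0.2 0.3
```

The second form is what lets a later ID check succeed, because the coerced object inherits the IDs of the object it is about to be combined with.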

What are your thoughts?

Thanks!
Kylie

~~~
Kylie Ariel Bemis
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io





On Nov 13, 2018, at 8:55 PM, Pages, Herve <hpa...@fredhutch.org> wrote:


Hi Kylie,

I've modified coerce2() in S4Vectors 0.21.5 so that `coerce2(from, to)` should 
now do the right thing when 'to' is a DataFrame derivative:

  
  https://github.com/Bioconductor/S4Vectors/commit/48e11dd2c8d474c63e09a69ee7d2d2ec35d7307a

With the following gotcha: this will work only if coercion (with as()) from 
DataFrame to the DataFrame derivative does the right thing. So I'm assuming 
that this coercion makes sense and can be supported. There are 2 possible 
situations:

1) The automatic coercion method from DataFrame to your DataFrame derivative 
(i.e. the coercion method automatically defined by the methods package) does 
the right thing. In this case coerce2() (and therefore [[<-) will also do the 
right thing on your DataFrame derivatives. For example:

  library(S4Vectors)
  setClass("MyDataFrameExtension", contains="DataFrame")

  ## WARNING: Don't trust selectMethod() here!
  selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
  # Error in selectMethod("coerce", c("DataFrame", "MyDataFrameExtension")) :
  #  no method found for signature DataFrame, MyDataFrameExtension

  as(DataFrame(), "MyDataFrameExtension")
  # MyDataFrameExtension with 0 rows and 0 columns

  ## The automatic coercion method is only created the 1st time it's used!
  ## So now selectMethod() shows it:
  selectMethod("coerce", c("DataFrame", "MyDataFrameExtension"))
  # Method Definition:
  #
  # function (from, to = "MyDataFrameExtension", strict = TRUE)
  # {
  # obj <- new("MyDataFrameExtension")
  # as(obj, "DataFrame") <- from
  # obj
  # }
  # 
  #
  # Signatures:
  #         from        to
  # target  "DataFrame" "MyDataFrameExtension"
  # defined "DataFrame" "MyDataFrameExtension"


  MDF <- new("MyDataFrameExtension")
  S4Vectors:::coerce2(list(aa=1:3, bb=21:23), MDF)
  # MyDataFrameExtension with 3 rows and 2 columns
  #          aa        bb
  #   <integer> <integer>
  # 1         1        21
  # 2         2        22
  # 3         3        23


2) The automatic coercion method from DataFrame to your DataFrame derivative 
doesn't do the right thing (e.g. it returns an invalid object). In this case 
you need to define this coercion (with a setAs() statement). This will allow 
coerce2() (and therefore [[<-) to do the right thing on your DataFrame 
derivatives.
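
A minimal sketch of situation 2, using toy classes rather than DataFrame so that it runs with the methods package alone. The classes `Base` and `Indexed`, the validity rule, and the choice of default ids are all assumptions made for illustration.

```r
library(methods)

setClass("Base", representation(values = "list"))

# The subclass requires one id per element, so an object with values but
# empty ids (what the automatic coercion would produce) is invalid.
setClass("Indexed", contains = "Base", representation(ids = "numeric"),
    validity = function(object) {
        if (length(object@ids) != length(object@values))
            "'ids' must have one entry per element of 'values'"
        else
            TRUE
    })

# Explicit coercion method: build a valid object by supplying default ids
# (here simply 1, 2, ...; a real class would pick something meaningful).
setAs("Base", "Indexed", function(from) {
    new("Indexed",
        values = from@values,
        ids = as.numeric(seq_along(from@values)))
})

x <- as(new("Base", values = list(a = 1, b = 2)), "Indexed")
validObject(x)   # passes silently
x@ids            # 1 2
```

Once such a method exists, anything that coerces through as() picks it up without further changes.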

There is no plan at the moment to export coerce2() because this should not be 
needed. The idea is that developers should not need to define "coerce2" methods 
but instead make it work via the addition of the appropriate coercion methods. 
The only purpose of coerce2() is to support things like [[<- and endoapply(). 
Once coerce2() works properly, these things work out-of-the-box.

So to summarize: just make sure that a DataFrame can be coerced to your

Re: [Bioc-devel] Possible to export coerce2() from S4Vectors?

2018-11-14 Thread Bemis, Kylie
Hi Michael,

Here is a simple example of what I’m trying to do:

library(S4Vectors)

setClass("IndexedDataFrame",
    contains="DataFrame",
    slots=c(ids="numeric"))

# track additional ID metadata w/ special rules
IndexedDataFrame <- function(ids, ...) {
    x <- DataFrame(...)
    new("IndexedDataFrame",
        ids=ids,
        rownames=rownames(x),
        nrows=nrow(x),
        listData=x@listData,
        elementMetadata=mcols(x))
}

# check for matching IDs before cbind-ing
setMethod("cbind", "IndexedDataFrame",
    function(...) {
        args <- list(...)
        ids <- args[[1L]]@ids
        ok <- vapply(args, function(a) {
            # check for compatible IDs
            identical(a@ids, ids)
        }, logical(1))
        if ( !all(ok) )
            stop("ids must match")
        x <- callNextMethod(...)
        new(class(args[[1L]]),
            ids=ids,
            rownames=rownames(x),
            nrows=nrow(x),
            listData=x@listData,
            elementMetadata=mcols(x))
    })

set.seed(1)
idf <- IndexedDataFrame(ids=runif(10), a=1:10, b=11:20)
idf$c <- 21:30

Error in identical(a@ids, ids) :
  no slot of name "ids" for this object of class "DataFrame"
In addition: Warning message:
In methods:::.selectDotsMethod(classes, .MTable, .AllMTable) :
  multiple direct matches: "IndexedDataFrame", "DataFrame"; using the first of 
these

Specific examples where I use this pattern are new MassDataFrame and 
PositionDataFrame classes in Cardinal, which require associated m/z-values and 
pixel coordinates as additional metadata. Current source code is here:

https://github.com/kuwisdelu/Cardinal/blob/master/R/methods2-MassDataFrame.R
https://github.com/kuwisdelu/Cardinal/blob/master/R/methods2-PositionDataFrame.R

In older versions of Cardinal, similar versions of these classes extended 
AnnotatedDataFrame and used regular columns for this metadata, while requiring 
those columns to follow a specific naming scheme. This proved fragile, 
difficult to maintain, and easily broken, so I am now using slots to contain 
this metadata so they can be validated independently of whatever user-supplied 
columns exist.

Kylie

~~~
Kylie Ariel Bemis
College of Computer and Information Science
Northeastern University
kuwisdelu.github.io





On Nov 14, 2018, at 10:12 AM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

I don't want to derail this thread, but why is coerce2() necessary? Would it be 
possible to fold its logic into as() without breaking too much?

Kylie,

It would help to see your code, with some pointers to where things break.

Michael


Re: [Bioc-devel] Possible to export coerce2() from S4Vectors?

2018-11-14 Thread Bemis, Kylie
Yes, I will make sure my cbind() implementation coerces to the correct subclass.

That could solve my error as well, but the warnings about S4 dispatch on “...” 
are still a problem.

-Kylie

On Nov 14, 2018, at 12:38 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

The use of c() in the implementation of [[<- is problematic, since [[<- has the 
semantic of insertion, preserving the overall structure of x, while c() is a 
combination of two or more peer data structures, and it is difficult to define 
the correct logic through dispatch.

The dispatch on ... is not well documented. I will try to improve that, as soon 
as I understand it myself. But no matter what, your cbind() method will need to 
uplift ordinary DataFrames to IndexedDataFrame.
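
One way such an uplift step could look, sketched with toy classes so that it runs without S4Vectors. `Base`, `Indexed`, and `uplift_args()` are hypothetical names; a real cbind() method would apply the same idea to its `list(...)` before checking ids.

```r
library(methods)

setClass("Base", representation(values = "list"))
setClass("Indexed", contains = "Base", representation(ids = "numeric"))

# Hypothetical helper: promote plain "Base" arguments to "Indexed" using the
# ids of the first indexed argument, and check that existing ids agree.
uplift_args <- function(args) {
    indexed <- vapply(args, function(a) is(a, "Indexed"), logical(1))
    if (!any(indexed))
        stop("at least one argument must be 'Indexed'")
    ids <- args[[which(indexed)[1L]]]@ids
    lapply(args, function(a) {
        if (!is(a, "Indexed")) {
            # plain object: adopt the shared ids instead of failing
            a <- as(a, "Indexed")
            a@ids <- ids
        } else if (!identical(a@ids, ids)) {
            stop("ids must match")
        }
        a
    })
}

a <- new("Indexed", values = list(x = 1), ids = c(0.1, 0.2))
b <- new("Base", values = list(y = 2))
out <- uplift_args(list(a, b))
class(out[[2L]])   # "Indexed", now carrying the same ids as 'a'
```

The ID check then only ever sees objects that have an `ids` slot, which avoids the "no slot of name" error for ordinary arguments.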

Michael


[Bioc-devel] set.seed and BiocParallel

2019-03-12 Thread Bemis, Kylie
Hi all,

I remember similar questions coming up before, but couldn’t track any down that 
directly pertain to my situation.

Suppose I want to use bplapply() in a function to fit models to many features, 
and I am applying over features. The models are stochastic, and I want the 
results to be reproducible, and preferably use the same RNG seed for each 
feature. So I could do:

fitModels <- function(object, seed=1, BPPARAM=bpparam()) {
    bplapply(object, function(x) {
        set.seed(seed)
        fitModel(x)
    }, BPPARAM=BPPARAM)
}

But the BioC guidelines say not to use set.seed() inside function code, and 
I’ve seen other questions answered saying not to use “seed” as a function 
parameter in this way.

Is it preferable to check and modify .Random.seed directly, or is there some 
other standard way of doing this?
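
For reference, a minimal sketch of what "check and modify .Random.seed directly" could look like: evaluate the stochastic code under a fixed seed, then put the caller's RNG state back. The helper `with_local_seed()` and the stand-in `fit_one()` are hypothetical, and plain lapply() is used instead of bplapply() to keep the example self-contained. Whether this pattern satisfies the guideline is a separate question, since it still calls set.seed() internally.

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then restore the
# previous global RNG state so the caller's random stream is unaffected.
with_local_seed <- function(seed, expr) {
    has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
    if (has_seed)
        old <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
    on.exit({
        if (has_seed) {
            assign(".Random.seed", old, envir = globalenv())
        } else {
            rm(".Random.seed", envir = globalenv())
        }
    })
    set.seed(seed)
    expr   # lazily evaluated here, i.e. after set.seed()
}

# Stand-in for a stochastic fitModel().
fit_one <- function(x) mean(x + rnorm(length(x)))

features <- list(a = 1:5, b = 6:10)
fits <- lapply(features, function(x) with_local_seed(1, fit_one(x)))
```

Each feature is fit from the same seed, repeated calls give identical results, and the session's own RNG stream continues as if the function had never been called.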

Thanks,
Kylie

~~~
Kylie Ariel Bemis
Khoury College of Computer Sciences
Northeastern University
kuwisdelu.github.io








