Re: [Bioc-devel] BiocParallel: fine-grained progress bar

2017-12-31 Thread Martin Morgan

On 12/30/2017 04:08 PM, Ludwig Geistlinger wrote:

Hi,


I'm currently playing around with progress bars in BiocParallel - which is a 
great package! ;-)


For demonstration, I'm using the example code from DESeq2::DESeq.


library(DESeq2)
library(BiocParallel)

f <- function(mu)
{
    cnts <- matrix(rnbinom(n=1000, mu=mu, size=1/0.5), ncol=10)
    cond <- factor(rep(1:2, each=5))

    # object construction
    suppressMessages({
        dds <- DESeqDataSetFromMatrix(cnts, DataFrame(cond), ~ cond)
        dds <- DESeq(dds)
    })
    res <- results(dds)

    return(res)
}


and apply 'f' to a range of 'mu' values using 'bplapply'.

mu.grid <- 90:120
x <- bplapply(mu.grid, f)


Now, switching to serial execution and enabling the progress bar

bp <- registered()$SerialParam
bpprogressbar(bp) <- TRUE
register(bp)

x <- bplapply(mu.grid, f)

gives me no progress bar at all.


probably a limitation (aka bug)...



Furthermore, switching to multi-core execution (2 cores) and enabling the progress bar

bp <- registered()$MulticoreParam
bpprogressbar(bp) <- TRUE
register(bp)

x <- bplapply(mu.grid, f)
  |                                                  |   0%
  |=========================                         |  50%
  |==================================================| 100%

gives me only a very coarse-grained progress bar (it updates when 50% of the job 
is done, and again when the complete job, i.e. 100%, is done).

What I actually want to have is a fine-grained progress bar that updates 
whenever f finishes execution on an element of the vector I am applying over.


For four workers and a job with X = 1:100, bplapply by default divides 
the job into 4 equally sized tasks (1:25, 26:50, ...) and sends them off 
to the workers. It reports progress as each task (e.g., 1:25) completes, so 
there are at most four ticks. If fine-grained progress trumps all other 
concerns, then setting the number of tasks equal to length(X) will 
report progress after each element.
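
A minimal sketch of the suggestion above, assuming a platform where forked workers are available. The function 'g' is a hypothetical, lightweight stand-in for the DESeq2-based 'f' in this thread, used only so the example runs quickly. It relies on the 'tasks' and 'progressbar' arguments of MulticoreParam().

```r
library(BiocParallel)

# Hypothetical stand-in for 'f': a short pause, then a cheap computation.
g <- function(mu) {
    Sys.sleep(0.05)
    mu^2
}

mu.grid <- 90:120

# One task per element of mu.grid, so progress is reported each time a
# single element finishes rather than once per worker-sized chunk.
bp <- MulticoreParam(workers = 2, tasks = length(mu.grid),
                     progressbar = TRUE)

x <- bplapply(mu.grid, g, BPPARAM = bp)
```

The trade-off is more communication between the manager and the workers, since each element is dispatched as its own task; for very short-running functions this overhead may outweigh the benefit of a smoother progress bar.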


It's not impossible to arrange for more fine-grained progress in all 
cases, and it's a reasonable feature request.


Martin




In "normal" serial R execution, the desired behavior can be illustrated via

pb <- txtProgressBar(90, 120, style=3, width=length(mu.grid))
r <- vector(mode="list", length=length(mu.grid))
for(i in mu.grid)
{
    setTxtProgressBar(pb, i)
    r[[i-89]] <- f(i)
}
close(pb)


Is there a way to obtain something similar using BiocParallel?

Thanks,
Ludwig


--
Dr. Ludwig Geistlinger
CUNY School of Public Health


sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.1

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BiocParallel_1.12.0        DESeq2_1.18.1
 [3] SummarizedExperiment_1.8.0 DelayedArray_0.4.1
 [5] matrixStats_0.52.2         Biobase_2.38.0
 [7] GenomicRanges_1.30.0       GenomeInfoDb_1.14.0
 [9] IRanges_2.12.0             S4Vectors_0.16.0
[11] BiocGenerics_0.24.0

loaded via a namespace (and not attached):
 [1] genefilter_1.60.0       locfit_1.5-9.1          splines_3.4.2
 [4] lattice_0.20-35         colorspace_1.3-2        htmltools_0.3.6
 [7] base64enc_0.1-3         blob_1.1.0              survival_2.41-3
[10] XML_3.98-1.9            rlang_0.1.4             DBI_0.7
[13] foreign_0.8-69          bit64_0.9-7             RColorBrewer_1.1-2
[16] GenomeInfoDbData_0.99.1 plyr_1.8.4              stringr_1.2.0
[19] zlibbioc_1.24.0         munsell_0.4.3           gtable_0.2.0
[22] htmlwidgets_0.9         memoise_1.1.0           latticeExtra_0.6-28
[25] knitr_1.17              geneplotter_1.56.0      AnnotationDbi_1.40.0
[28] htmlTable_1.9           Rcpp_0.12.14            acepack_1.4.1
[31] xtable_1.8-2            scales_0.5.0            backports_1.1.1
[34] checkmate_1.8.5         Hmisc_4.0-3             annotate_1.56.1
[37] XVector_0.18.0          bit_1.1-12              gridExtra_2.3
[40] ggplot2_2.2.1           digest_0.6.12           stringi_1.1.6
[43] grid_3.4.2              tools_3.4.2             bitops_1.0-6
[46] magrittr_1.5            RSQLite_2.0             lazyeval_0.2.1
[49] RCurl_1.95-4.8          tibble_1.3.4            Formula_1.2-2
[52] cluster_2.0.6           Matrix_1.2-12           data.table_1.10.4-3
[55] rpart_4.1-11            nnet_7.3-12             compiler_3.4.2




[Bioc-devel] BiocParallel: fine-grained progress bar

2017-12-30 Thread Ludwig Geistlinger
Hi,


I'm currently playing around with progress bars in BiocParallel - which is a 
great package! ;-)


For demonstration, I'm using the example code from DESeq2::DESeq.


library(DESeq2)
library(BiocParallel)

f <- function(mu)
{
    cnts <- matrix(rnbinom(n=1000, mu=mu, size=1/0.5), ncol=10)
    cond <- factor(rep(1:2, each=5))

    # object construction
    suppressMessages({
        dds <- DESeqDataSetFromMatrix(cnts, DataFrame(cond), ~ cond)
        dds <- DESeq(dds)
    })
    res <- results(dds)

    return(res)
}


and apply 'f' to a range of 'mu' values using 'bplapply'.

mu.grid <- 90:120
x <- bplapply(mu.grid, f)


Now, switching to serial execution and enabling the progress bar

bp <- registered()$SerialParam
bpprogressbar(bp) <- TRUE
register(bp)

x <- bplapply(mu.grid, f)

gives me no progress bar at all.

Furthermore, switching to multi-core execution (2 cores) and enabling the progress bar

bp <- registered()$MulticoreParam
bpprogressbar(bp) <- TRUE
register(bp)

x <- bplapply(mu.grid, f)
  |                                                  |   0%
  |=========================                         |  50%
  |==================================================| 100%

gives me only a very coarse-grained progress bar (it updates when 50% of the job 
is done, and again when the complete job, i.e. 100%, is done).

What I actually want to have is a fine-grained progress bar that updates 
whenever f finishes execution on an element of the vector I am applying over.


In "normal" serial R execution, the desired behavior can be illustrated via

pb <- txtProgressBar(90, 120, style=3, width=length(mu.grid))
r <- vector(mode="list", length=length(mu.grid))
for(i in mu.grid)
{
    setTxtProgressBar(pb, i)
    r[[i-89]] <- f(i)
}
close(pb)


Is there a way to obtain something similar using BiocParallel?

Thanks,
Ludwig


--
Dr. Ludwig Geistlinger
CUNY School of Public Health

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.1

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BiocParallel_1.12.0        DESeq2_1.18.1
 [3] SummarizedExperiment_1.8.0 DelayedArray_0.4.1
 [5] matrixStats_0.52.2         Biobase_2.38.0
 [7] GenomicRanges_1.30.0       GenomeInfoDb_1.14.0
 [9] IRanges_2.12.0             S4Vectors_0.16.0
[11] BiocGenerics_0.24.0

loaded via a namespace (and not attached):
 [1] genefilter_1.60.0       locfit_1.5-9.1          splines_3.4.2
 [4] lattice_0.20-35         colorspace_1.3-2        htmltools_0.3.6
 [7] base64enc_0.1-3         blob_1.1.0              survival_2.41-3
[10] XML_3.98-1.9            rlang_0.1.4             DBI_0.7
[13] foreign_0.8-69          bit64_0.9-7             RColorBrewer_1.1-2
[16] GenomeInfoDbData_0.99.1 plyr_1.8.4              stringr_1.2.0
[19] zlibbioc_1.24.0         munsell_0.4.3           gtable_0.2.0
[22] htmlwidgets_0.9         memoise_1.1.0           latticeExtra_0.6-28
[25] knitr_1.17              geneplotter_1.56.0      AnnotationDbi_1.40.0
[28] htmlTable_1.9           Rcpp_0.12.14            acepack_1.4.1
[31] xtable_1.8-2            scales_0.5.0            backports_1.1.1
[34] checkmate_1.8.5         Hmisc_4.0-3             annotate_1.56.1
[37] XVector_0.18.0          bit_1.1-12              gridExtra_2.3
[40] ggplot2_2.2.1           digest_0.6.12           stringi_1.1.6
[43] grid_3.4.2              tools_3.4.2             bitops_1.0-6
[46] magrittr_1.5            RSQLite_2.0             lazyeval_0.2.1
[49] RCurl_1.95-4.8          tibble_1.3.4            Formula_1.2-2
[52] cluster_2.0.6           Matrix_1.2-12           data.table_1.10.4-3
[55] rpart_4.1-11            nnet_7.3-12             compiler_3.4.2



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel