Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?

2015-07-14 Thread Leonardo Collado Torres
Hi Valerie,

I added a short explanation to
http://lcolladotor.github.io/SnowParam-memory/ about how the files are
linked (see commit
https://github.com/lcolladotor/SnowParam-memory/commit/602e42dd60e4a988425bf41b774263c058a7551a).

The entry you selected,
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt#L27-L38,
is linked to the stderr log file
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.e6463239
and stdout log file
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239
The key is the job id, which is on the first line of each entry in the
emails from the cluster and is used at the end of the names for the
log files.

And yes, like you figured out, the Max vmem line
(https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt#L37)
is the maximum memory used as reported by my cluster. It's normally in
GB, but could be in MB as in
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/der-serial-3.1.x.txt#L11

Cheers,
Leo


On Tue, Jul 14, 2015 at 3:12 PM, Valerie Obenchain
voben...@fredhutch.org wrote:
 Hi,

 Thanks for sending the updated information. To make sure I know how the
 files relate, let's take snow-3.2.x, BiocParallel 1.3.34 run on July 14 as
 an example.

 memory files:

 The mem_email files report max virtual memory (Max vmem) on the master, for
 the whole job. I see 3 outputs here - one run from July 9 and two from July
 14:

 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt

 This last entry corresponds to BiocParallel 1.3.34 from July 14:

 Job 6463239 (snow-3.2.x) Complete
  User = lcollado
  Queue= share...@compute-082.cm.cluster
  Host = compute-082.cm.cluster
  Start Time   = 07/14/2015 12:20:48
  End Time = 07/14/2015 12:22:27
  User Time= 00:00:18
  System Time  = 00:00:04
  Wallclock Time   = 00:01:39
  CPU  = 00:10:14
  Max vmem = 7.286G
  Exit Status  = 0



 log files:

 Corresponding log files for BiocParallel 1.3.34 on July 14 show memory used
 on workers:

 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239



 On 07/14/2015 10:23 AM, Leonardo Collado Torres wrote:

 Hi Valerie,

 My other recent thread about SnowParam
 (https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html)
 allowed me to see a small difference.

 In R 3.1.x, using 'outfile' as in

 https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83
 showed more info than in R 3.2.0 and 3.2.1 as shown below:

 * R 3.1.x
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599
 * R 3.2.0
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601
 * R 3.2.1
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603

 using BiocParallel versions

 * 1.0.3
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57
 * 1.2.8
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58
 * 1.3.33
 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58

 Basically, only some of the message() calls get printed in stderr in R
 3.2.0 and 3.2.1 which is something I wasn't expecting. However, using
 log = TRUE, all the message() calls get printed in stdout, as
 expected given the docs, in which case using 'outfile' is kind of
 useless.


 The idea was to phase out the snow-centric 'outfile' and replace it with
 'log' and 'logdir'. You're right, previous versions of BiocParallel did not
 capture stdout and stderr on the workers; the current release and devel do.
 As you've now seen, when log = TRUE stderr and stdout messages print to the
 console, if logdir is given they go to a file. One thing I have not done yet
 is capture stdout and stderr from the master to a 'logdir' file (workers
 captured only).

Nice =)

I also noticed 
https://github.com/Bioconductor-mirror/BiocParallel/commit/6f816112e2d136d789ed1045a7fc67fd2a91eb25
after sending my previous email.



 See below for the output under R 3.2.0 when log = TRUE

 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201

 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201

 Assuming that it's all working as intended, I'll switch to using log =
 TRUE in some of my code when users specify 'verbose = TRUE'.


 Sounds good.


 Cheers,
 Leo

 PS In page 1 of

 http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf
 there's a typo. It reads SnowParma instead of SnowParam.


 Thanks, I'll 

Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?

2015-07-14 Thread Valerie Obenchain

Hi,

Thanks for sending the updated information. To make sure I know how the 
files relate, let's take snow-3.2.x, BiocParallel 1.3.34 run on July 14 
as an example.


memory files:

The mem_email files report max virtual memory (Max vmem) on the master, 
for the whole job. I see 3 outputs here - one run from July 9 and two 
from July 14:


https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt

This last entry corresponds to BiocParallel 1.3.34 from July 14:


Job 6463239 (snow-3.2.x) Complete
 User = lcollado
 Queue= share...@compute-082.cm.cluster
 Host = compute-082.cm.cluster
 Start Time   = 07/14/2015 12:20:48
 End Time = 07/14/2015 12:22:27
 User Time= 00:00:18
 System Time  = 00:00:04
 Wallclock Time   = 00:01:39
 CPU  = 00:10:14
 Max vmem = 7.286G
 Exit Status  = 0



log files:

Corresponding log files for BiocParallel 1.3.34 on July 14 show memory 
used on workers:


https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239



On 07/14/2015 10:23 AM, Leonardo Collado Torres wrote:

Hi Valerie,

My other recent thread about SnowParam
(https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html)
allowed me to see a small difference.

In R 3.1.x, using 'outfile' as in
https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83
showed more info than in R 3.2.0 and 3.2.1 as shown below:

* R 3.1.x 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599
* R 3.2.0 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601
* R 3.2.1 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603

using BiocParallel versions

* 1.0.3 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57
* 1.2.8 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58
* 1.3.33 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58

Basically, only some of the message() calls get printed in stderr in R
3.2.0 and 3.2.1 which is something I wasn't expecting. However, using
log = TRUE, all the message() calls get printed in stdout, as
expected given the docs, in which case using 'outfile' is kind of
useless.


The idea was to phase out the snow-centric 'outfile' and replace it with 
'log' and 'logdir'. You're right, previous versions of BiocParallel did 
not capture stdout and stderr on the workers; the current release and 
devel do. As you've now seen, when log = TRUE stderr and stdout messages 
print to the console, if logdir is given they go to a file. One thing I 
have not done yet is capture stdout and stderr from the master to a 
'logdir' file (workers captured only).




See below for the output under R 3.2.0 when log = TRUE
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201

Assuming that it's all working as intended, I'll switch to using log =
TRUE in some of my code when users specify 'verbose = TRUE'.


Sounds good.



Cheers,
Leo

PS In page 1 of
http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf
there's a typo. It reads SnowParma instead of SnowParam.


Thanks, I'll fix that.

Valerie



On Fri, May 22, 2015 at 5:33 PM, Valerie Obenchain
voben...@fredhutch.org wrote:

Hi,

Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2
(release).


bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL))

starting worker for localhost:11031
starting worker for localhost:11031
Type: EXEC
[1] 1
Type: EXEC
[1] 2
Type: DONE
Type: DONE
[[1]]
[1] 1

[[2]]
[1] 2


Some new features have been added to BiocParallel - logging with
futile.logger, writing out log and result files and control over how a job
is divided into tasks.

## send log file to 'logdir':
bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir())

## write results to 'resdir':
bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir())

## by default jobs are divided evenly over the workers
bplapply(1:100, print, BPPARAM=SnowParam(workers=4))

## force the job to be run in 2 tasks
## (useful when you know runtime or memory requirements)
bplapply(1:100, print, BPPARAM=SnowParam(tasks=2))


It would be great if you had a chance to try any of these out - I'd be
interested in the feedback. Logging was intended to take the idea of the
'outfile' from snow further with the ability to add messages that can be
filtered by threshold.

The package also has a new errors/logging vignette:


Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?

2015-07-14 Thread Leonardo Collado Torres
Hi Valerie,

My other recent thread about SnowParam
(https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html)
allowed me to see a small difference.

In R 3.1.x, using 'outfile' as in
https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83
showed more info than in R 3.2.0 and 3.2.1 as shown below:

* R 3.1.x 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599
* R 3.2.0 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601
* R 3.2.1 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603

using BiocParallel versions

* 1.0.3 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57
* 1.2.8 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58
* 1.3.33 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58

Basically, only some of the message() calls get printed in stderr in R
3.2.0 and 3.2.1 which is something I wasn't expecting. However, using
log = TRUE, all the message() calls get printed in stdout, as
expected given the docs, in which case using 'outfile' is kind of
useless.

See below for the output under R 3.2.0 when log = TRUE
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201

Assuming that it's all working as intended, I'll switch to using log =
TRUE in some of my code when users specify 'verbose = TRUE'.

Cheers,
Leo

PS In page 1 of
http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf
there's a typo. It reads SnowParma instead of SnowParam.

On Fri, May 22, 2015 at 5:33 PM, Valerie Obenchain
voben...@fredhutch.org wrote:
 Hi,

 Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2
 (release).

 bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL))
 starting worker for localhost:11031
 starting worker for localhost:11031
 Type: EXEC
 [1] 1
 Type: EXEC
 [1] 2
 Type: DONE
 Type: DONE
 [[1]]
 [1] 1

 [[2]]
 [1] 2


 Some new features have been added to BiocParallel - logging with
 futile.logger, writing out log and result files and control over how a job
 is divided into tasks.

 ## send log file to 'logdir':
 bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir())

 ## write results to 'resdir':
 bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir())

 ## by default jobs are divided evenly over the workers
 bplapply(1:100, print, BPPARAM=SnowParam(workers=4))

 ## force the job to be run in 2 tasks
 ## (useful when you know runtime or memory requirements)
 bplapply(1:100, print, BPPARAM=SnowParam(tasks=2))


 It would be great if you had a chance to try any of these out - I'd be
 interested in the feedback. Logging was intended to take the idea of the
 'outfile' from snow further with the ability to add messages that can be
 filtered by threshold.

 The package also has a new errors/logging vignette:

 http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf

 Valerie




 On 05/21/2015 07:58 AM, Leonardo Collado Torres wrote:

 Hi,

 This might be a BioC support website question, but maybe it's a bug.

 In previous versions of BiocParallel, you could specify where to
 direct the output from the workers by using the 'outfile' argument.
 For example, SnowParam(outfile = Sys.getenv('SGE_STDERR_PATH')). I'm
 not finding how to do so with the current version (1.3.12).

 I understand that SnowParam() has a ... argument that gets passed to
 snow::makeCluster(), according to the SnowParam() docs. I also see
 that when log = TRUE, a script is used instead of snow. But log =
 FALSE by default.

 If I use snow, the 'outfile' argument does work as shown below:

 ## Nothing gets printed by default

 cl - makeCluster(2, type = 'SOCK')
 y - clusterApply(cl, 1:2, print)



 ## Use outfile now, print works

 cl - makeCluster(2, type = 'SOCK', outfile = NULL)

 starting worker for localhost:11671
 starting worker for localhost:11671

 y - clusterApply(cl, 1:2, print)

 Type: EXEC
 Type: EXEC
 [1] 1
 [1] 2

 packageVersion('snow')

 [1] ‘0.3.13’


 However, I can't use 'outfile' with SnowParam:

 ## SerialParam works

 x - bplapply(1:2, print, BPPARAM = SerialParam())

 [1] 1
 [1] 2

 ## No printing by default with SnowParam, as expected

 x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1))


 ## Can't pass 'outfile' argument

 x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile =
 NULL))

 Error in bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile
 = NULL)) :
error in evaluating the argument 'BPPARAM' in selecting a method for
 function 'bplapply': Error in envRefSetField(.Object, field, classDef,
 selfEnv, 

Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?

2015-05-22 Thread Valerie Obenchain

Hi,

Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2 
(release).


 bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL))
starting worker for localhost:11031
starting worker for localhost:11031
Type: EXEC
[1] 1
Type: EXEC
[1] 2
Type: DONE
Type: DONE
[[1]]
[1] 1

[[2]]
[1] 2


Some new features have been added to BiocParallel - logging with 
futile.logger, writing out log and result files and control over how a 
job is divided into tasks.


## send log file to 'logdir':
bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir())

## write results to 'resdir':
bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir())

## by default jobs are divided evenly over the workers
bplapply(1:100, print, BPPARAM=SnowParam(workers=4))

## force the job to be run in 2 tasks
## (useful when you know runtime or memory requirements)
bplapply(1:100, print, BPPARAM=SnowParam(tasks=2))


It would be great if you had a chance to try any of these out - I'd be 
interested in the feedback. Logging was intended to take the idea of the 
'outfile' from snow further with the ability to add messages that can be 
filtered by threshold.


The package also has a new errors/logging vignette:

http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf

Valerie



On 05/21/2015 07:58 AM, Leonardo Collado Torres wrote:

Hi,

This might be a BioC support website question, but maybe it's a bug.

In previous versions of BiocParallel, you could specify where to
direct the output from the workers by using the 'outfile' argument.
For example, SnowParam(outfile = Sys.getenv('SGE_STDERR_PATH')). I'm
not finding how to do so with the current version (1.3.12).

I understand that SnowParam() has a ... argument that gets passed to
snow::makeCluster(), according to the SnowParam() docs. I also see
that when log = TRUE, a script is used instead of snow. But log =
FALSE by default.

If I use snow, the 'outfile' argument does work as shown below:

## Nothing gets printed by default

cl - makeCluster(2, type = 'SOCK')
y - clusterApply(cl, 1:2, print)



## Use outfile now, print works

cl - makeCluster(2, type = 'SOCK', outfile = NULL)

starting worker for localhost:11671
starting worker for localhost:11671

y - clusterApply(cl, 1:2, print)

Type: EXEC
Type: EXEC
[1] 1
[1] 2


packageVersion('snow')

[1] ‘0.3.13’


However, I can't use 'outfile' with SnowParam:

## SerialParam works

x - bplapply(1:2, print, BPPARAM = SerialParam())

[1] 1
[1] 2

## No printing by default with SnowParam, as expected

x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1))


## Can't pass 'outfile' argument

x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile = NULL))

Error in bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile
= NULL)) :
   error in evaluating the argument 'BPPARAM' in selecting a method for
function 'bplapply': Error in envRefSetField(.Object, field, classDef,
selfEnv, elements[[field]]) :
   ‘outfile’ is not a field in class “SnowParam”


Digging at the code, I see that in SnowParam() the ... argument is
saved in .clusterargs

args - c(list(spec = workers, type = type), list(...))
 .clusterargs - lapply(args, force)

However, ... is still passed to .SnowParam(), and .SnowParam() fields are:

fields=list(
 cluster=cluster,
 .clusterargs=list,
 .controlled=logical,
 log=logical,
 threshold=ANY,
 logdir=character,
 resultdir=character)


So, I'm wondering if ... should not be passed to .SnowParam().



## Trying to pass outfile to .clusterargs directly doesn't work
## Actually, I'm surprised it didn't crash .SnowParam()

x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, 
.clusterargs=list(outfile = NULL)))



Full log: https://gist.github.com/9ac957c6cad1c07f4ea4


Thanks,
Leo

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: voben...@fredhutch.org
Phone: (206) 667-3158

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel