Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?
Hi Valerie, I added a short explanation to http://lcolladotor.github.io/SnowParam-memory/ about how the files are linked (see commit https://github.com/lcolladotor/SnowParam-memory/commit/602e42dd60e4a988425bf41b774263c058a7551a). The entry you selected, https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt#L27-L38, is linked to the stderr log file https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.e6463239 and stdout log file https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239 The key is the job id, which is on the first line of each entry in the emails from the cluster and is used at the end of the names for the log files. And yes, like you figured out, the Max vmem line (https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt#L37) is the maximum memory used as reported by my cluster. It's normally in GB, but could be in MB as in https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/der-serial-3.1.x.txt#L11 Cheers, Leo On Tue, Jul 14, 2015 at 3:12 PM, Valerie Obenchain voben...@fredhutch.org wrote: Hi, Thanks for sending the updated information. To make sure I know how the files relate, let's take snow-3.2.x, BiocParallel 1.3.34 run on July 14 as an example. memory files: The mem_email files report max virtual memory (Max vmem) on the master, for the whole job. I see 3 outputs here - one run from July 9 and two from July 14: https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt This last entry corresponds to BiocParallel 1.3.34 from July 14: Job 6463239 (snow-3.2.x) Complete User = lcollado Queue= share...@compute-082.cm.cluster Host = compute-082.cm.cluster Start Time = 07/14/2015 12:20:48 End Time = 07/14/2015 12:22:27 User Time= 00:00:18 System Time = 00:00:04 Wallclock Time = 00:01:39 CPU = 00:10:14 Max vmem = 7.286G Exit Status = 0 log files: Corresponding log files for BiocParallel 1.3.34 on July 14 show memory used on workers: https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239 On 07/14/2015 10:23 AM, Leonardo Collado Torres wrote: Hi Valerie, My other recent thread about SnowParam (https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html) allowed me to see a small difference. In R 3.1.x, using 'outfile' as in https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83 showed more info than in R 3.2.0 and 3.2.1 as shown below: * R 3.1.x https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599 * R 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601 * R 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603 using BiocParallel versions * 1.0.3 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57 * 1.2.8 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58 * 1.3.33 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58 Basically, only some of the message() calls get printed in stderr in R 3.2.0 and 3.2.1 which is something I wasn't expecting. However, using log = TRUE, all the message() calls get printed in stdout, as expected given the docs, in which case using 'outfile' is kind of useless. The idea was to phase out the snow-centric 'outfile' and replace it with 'log' and 'logdir'. You're right, previous versions of BiocParallel did not capture stdout and stderr on the workers; the current release and devel do. As you've now seen, when log = TRUE stderr and stdout messages print to the console, if logdir is given they go to a file. One thing I have not done yet is capture stdout and stderr from the master to a 'logdir' file (workers captured only). Nice =) I also noticed https://github.com/Bioconductor-mirror/BiocParallel/commit/6f816112e2d136d789ed1045a7fc67fd2a91eb25 after sending my previous email. See below for the output under R 3.2.0 when log = TRUE https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201 Assuming that it's all working as intended, I'll switch to using log = TRUE in some of my code when users specify 'verbose = TRUE'. Sounds good. Cheers, Leo PS In page 1 of http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf there's a typo. It reads SnowParma instead of SnowParam. Thanks, I'll
Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?
Hi, Thanks for sending the updated information. To make sure I know how the files relate, let's take snow-3.2.x, BiocParallel 1.3.34 run on July 14 as an example. memory files: The mem_email files report max virtual memory (Max vmem) on the master, for the whole job. I see 3 outputs here - one run from July 9 and two from July 14: https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt This last entry corresponds to BiocParallel 1.3.34 from July 14: Job 6463239 (snow-3.2.x) Complete User = lcollado Queue= share...@compute-082.cm.cluster Host = compute-082.cm.cluster Start Time = 07/14/2015 12:20:48 End Time = 07/14/2015 12:22:27 User Time= 00:00:18 System Time = 00:00:04 Wallclock Time = 00:01:39 CPU = 00:10:14 Max vmem = 7.286G Exit Status = 0 log files: Corresponding log files for BiocParallel 1.3.34 on July 14 show memory used on workers: https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239 On 07/14/2015 10:23 AM, Leonardo Collado Torres wrote: Hi Valerie, My other recent thread about SnowParam (https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html) allowed me to see a small difference. In R 3.1.x, using 'outfile' as in https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83 showed more info than in R 3.2.0 and 3.2.1 as shown below: * R 3.1.x https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599 * R 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601 * R 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603 using BiocParallel versions * 1.0.3 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57 * 1.2.8 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58 * 1.3.33 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58 Basically, only some of the message() calls get printed in stderr in R 3.2.0 and 3.2.1 which is something I wasn't expecting. However, using log = TRUE, all the message() calls get printed in stdout, as expected given the docs, in which case using 'outfile' is kind of useless. The idea was to phase out the snow-centric 'outfile' and replace it with 'log' and 'logdir'. You're right, previous versions of BiocParallel did not capture stdout and stderr on the workers; the current release and devel do. As you've now seen, when log = TRUE stderr and stdout messages print to the console, if logdir is given they go to a file. One thing I have not done yet is capture stdout and stderr from the master to a 'logdir' file (workers captured only). See below for the output under R 3.2.0 when log = TRUE https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201 Assuming that it's all working as intended, I'll switch to using log = TRUE in some of my code when users specify 'verbose = TRUE'. Sounds good. Cheers, Leo PS In page 1 of http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf there's a typo. It reads SnowParma instead of SnowParam. Thanks, I'll fix that. Valerie On Fri, May 22, 2015 at 5:33 PM, Valerie Obenchain voben...@fredhutch.org wrote: Hi, Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2 (release). bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL)) starting worker for localhost:11031 starting worker for localhost:11031 Type: EXEC [1] 1 Type: EXEC [1] 2 Type: DONE Type: DONE [[1]] [1] 1 [[2]] [1] 2 Some new features have been added to BiocParallel - logging with futile.logger, writing out log and result files and control over how a job is divided into tasks. ## send log file to 'logdir': bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir()) ## write results to 'resdir': bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir()) ## by default jobs are divided evenly over the workers bplapply(1:100, print, BPPARAM=SnowParam(workers=4)) ## force the job to be run in 2 tasks ## (useful when you know runtime or memory requirements) bplapply(1:100, print, BPPARAM=SnowParam(tasks=2)) It would be great if you had a chance to try any of these out - I'd be interested in the feedback. Logging was intended to take the idea of the 'outfile' from snow further with the ability to add messages that can be filtered by threshold. The package also has a new errors/logging vignette:
Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?
Hi Valerie, My other recent thread about SnowParam (https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html) allowed me to see a small difference. In R 3.1.x, using 'outfile' as in https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83 showed more info than in R 3.2.0 and 3.2.1 as shown below: * R 3.1.x https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599 * R 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601 * R 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603 using BiocParallel versions * 1.0.3 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57 * 1.2.8 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58 * 1.3.33 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58 Basically, only some of the message() calls get printed in stderr in R 3.2.0 and 3.2.1 which is something I wasn't expecting. However, using log = TRUE, all the message() calls get printed in stdout, as expected given the docs, in which case using 'outfile' is kind of useless. See below for the output under R 3.2.0 when log = TRUE https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201 Assuming that it's all working as intended, I'll switch to using log = TRUE in some of my code when users specify 'verbose = TRUE'. Cheers, Leo PS In page 1 of http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf there's a typo. It reads SnowParma instead of SnowParam. On Fri, May 22, 2015 at 5:33 PM, Valerie Obenchain voben...@fredhutch.org wrote: Hi, Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2 (release). bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL)) starting worker for localhost:11031 starting worker for localhost:11031 Type: EXEC [1] 1 Type: EXEC [1] 2 Type: DONE Type: DONE [[1]] [1] 1 [[2]] [1] 2 Some new features have been added to BiocParallel - logging with futile.logger, writing out log and result files and control over how a job is divided into tasks. ## send log file to 'logdir': bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir()) ## write results to 'resdir': bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir()) ## by default jobs are divided evenly over the workers bplapply(1:100, print, BPPARAM=SnowParam(workers=4)) ## force the job to be run in 2 tasks ## (useful when you know runtime or memory requirements) bplapply(1:100, print, BPPARAM=SnowParam(tasks=2)) It would be great if you had a chance to try any of these out - I'd be interested in the feedback. Logging was intended to take the idea of the 'outfile' from snow further with the ability to add messages that can be filtered by threshold. The package also has a new errors/logging vignette: http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf Valerie On 05/21/2015 07:58 AM, Leonardo Collado Torres wrote: Hi, This might be a BioC support website question, but maybe it's a bug. In previous versions of BiocParallel, you could specify where to direct the output from the workers by using the 'outfile' argument. For example, SnowParam(outfile = Sys.getenv('SGE_STDERR_PATH')). I'm not finding how to do so with the current version (1.3.12). I understand that SnowParam() has a ... argument that gets passed to snow::makeCluster(), according to the SnowParam() docs. I also see that when log = TRUE, a script is used instead of snow. But log = FALSE by default. If I use snow, the 'outfile' argument does work as shown below: ## Nothing gets printed by default cl - makeCluster(2, type = 'SOCK') y - clusterApply(cl, 1:2, print) ## Use outfile now, print works cl - makeCluster(2, type = 'SOCK', outfile = NULL) starting worker for localhost:11671 starting worker for localhost:11671 y - clusterApply(cl, 1:2, print) Type: EXEC Type: EXEC [1] 1 [1] 2 packageVersion('snow') [1] ‘0.3.13’ However, I can't use 'outfile' with SnowParam: ## SerialParam works x - bplapply(1:2, print, BPPARAM = SerialParam()) [1] 1 [1] 2 ## No printing by default with SnowParam, as expected x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1)) ## Can't pass 'outfile' argument x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile = NULL)) Error in bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile = NULL)) : error in evaluating the argument 'BPPARAM' in selecting a method for function 'bplapply': Error in envRefSetField(.Object, field, classDef, selfEnv,
Re: [Bioc-devel] Redirect workers output to STDERR, how to do so with current BiocParallel::SnowParam()?
Hi, Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2 (release). bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL)) starting worker for localhost:11031 starting worker for localhost:11031 Type: EXEC [1] 1 Type: EXEC [1] 2 Type: DONE Type: DONE [[1]] [1] 1 [[2]] [1] 2 Some new features have been added to BiocParallel - logging with futile.logger, writing out log and result files and control over how a job is divided into tasks. ## send log file to 'logdir': bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir()) ## write results to 'resdir': bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir()) ## by default jobs are divided evenly over the workers bplapply(1:100, print, BPPARAM=SnowParam(workers=4)) ## force the job to be run in 2 tasks ## (useful when you know runtime or memory requirements) bplapply(1:100, print, BPPARAM=SnowParam(tasks=2)) It would be great if you had a chance to try any of these out - I'd be interested in the feedback. Logging was intended to take the idea of the 'outfile' from snow further with the ability to add messages that can be filtered by threshold. The package also has a new errors/logging vignette: http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf Valerie On 05/21/2015 07:58 AM, Leonardo Collado Torres wrote: Hi, This might be a BioC support website question, but maybe it's a bug. In previous versions of BiocParallel, you could specify where to direct the output from the workers by using the 'outfile' argument. For example, SnowParam(outfile = Sys.getenv('SGE_STDERR_PATH')). I'm not finding how to do so with the current version (1.3.12). I understand that SnowParam() has a ... argument that gets passed to snow::makeCluster(), according to the SnowParam() docs. I also see that when log = TRUE, a script is used instead of snow. But log = FALSE by default. If I use snow, the 'outfile' argument does work as shown below: ## Nothing gets printed by default cl - makeCluster(2, type = 'SOCK') y - clusterApply(cl, 1:2, print) ## Use outfile now, print works cl - makeCluster(2, type = 'SOCK', outfile = NULL) starting worker for localhost:11671 starting worker for localhost:11671 y - clusterApply(cl, 1:2, print) Type: EXEC Type: EXEC [1] 1 [1] 2 packageVersion('snow') [1] ‘0.3.13’ However, I can't use 'outfile' with SnowParam: ## SerialParam works x - bplapply(1:2, print, BPPARAM = SerialParam()) [1] 1 [1] 2 ## No printing by default with SnowParam, as expected x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1)) ## Can't pass 'outfile' argument x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile = NULL)) Error in bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile = NULL)) : error in evaluating the argument 'BPPARAM' in selecting a method for function 'bplapply': Error in envRefSetField(.Object, field, classDef, selfEnv, elements[[field]]) : ‘outfile’ is not a field in class “SnowParam” Digging at the code, I see that in SnowParam() the ... argument is saved in .clusterargs args - c(list(spec = workers, type = type), list(...)) .clusterargs - lapply(args, force) However, ... is still passed to .SnowParam(), and .SnowParam() fields are: fields=list( cluster=cluster, .clusterargs=list, .controlled=logical, log=logical, threshold=ANY, logdir=character, resultdir=character) So, I'm wondering if ... should not be passed to .SnowParam(). ## Trying to pass outfile to .clusterargs directly doesn't work ## Actually, I'm surprised it didn't crash .SnowParam() x - bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, .clusterargs=list(outfile = NULL))) Full log: https://gist.github.com/9ac957c6cad1c07f4ea4 Thanks, Leo ___ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, Seattle, WA 98109 Email: voben...@fredhutch.org Phone: (206) 667-3158 ___ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel