Dear Simon,

Thank you for your response! I could not provide the requested information 
sooner, since I am not a full-time academic / researcher.

An example of a bam call that may result in an error is:

bam(formula = Di ~ 1 + Gender + I(L_Dis == 0) +
      s(DisPerc, by = as.numeric(L_Dis == 2), bs = 'cr'),
    offset = log(Ei * Mi), family = poisson, data = dtPF,
    method = "fREML", discrete = TRUE, gc.level = 2)

Here, dtPF is a data.table object with 22 million rows and 21 columns 
(variables). Gender is a factor variable. L_Dis is an integer variable that 
equals 0 if DisPerc is missing (in which case DisPerc is manually set to 0.1), 
1 if DisPerc==0, and 2 if DisPerc>0. DisPerc ranges from 0 to 0.25.
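
A scaled-down, simulated stand-in for dtPF may make the call easier to try 
without the real data. The following is only a sketch: the sample size, the 
distributions of Ei and Mi, and the column names NoDis and PosDis (precomputed 
versions of I(L_Dis==0) and as.numeric(L_Dis==2)) are assumptions made for 
illustration, and a plain data.frame replaces the data.table.

```r
library(mgcv)

set.seed(1)
n <- 20000  # hypothetical size; the real dtPF has about 22 million rows

# L_Dis coding as described above:
# 0 = DisPerc missing, 1 = DisPerc == 0, 2 = DisPerc > 0
L_Dis <- sample(0:2, n, replace = TRUE, prob = c(0.2, 0.5, 0.3))
DisPerc <- ifelse(L_Dis == 0L, 0.1,                       # missing, set to 0.1
           ifelse(L_Dis == 1L, 0, runif(n, 0.001, 0.25))) # positive values

dtSim <- data.frame(
  Gender  = factor(sample(c("F", "M"), n, replace = TRUE)),
  L_Dis   = L_Dis,
  DisPerc = DisPerc,
  NoDis   = as.numeric(L_Dis == 0L),  # stand-in for I(L_Dis == 0)
  PosDis  = as.numeric(L_Dis == 2L),  # stand-in for as.numeric(L_Dis == 2)
  Ei      = runif(n, 0.5, 1),         # invented exposure
  Mi      = runif(n, 0.01, 0.05)      # invented baseline rate
)
dtSim$Di <- rpois(n, dtSim$Ei * dtSim$Mi)  # simulated counts

# Same structure and options as the call above, on the simulated data
fit <- bam(Di ~ 1 + Gender + NoDis + s(DisPerc, by = PosDis, bs = "cr"),
           offset = log(Ei * Mi), family = poisson, data = dtSim,
           method = "fREML", discrete = TRUE, gc.level = 2)
summary(fit)
```

Increasing n step by step towards the real row count would be one way to see 
whether memory use on a node grows as expected.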

The sessionInfo() provides the following output:
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS/LAPACK: 
/sara/eb/Debian9/OpenBLAS/0.2.20-GCC-6.4.0-2.28/lib/libopenblas_sandybridgep-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
 [4] LC_COLLATE=en_US     LC_MONETARY=en_US    LC_MESSAGES=en_US
 [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets  base

other attached packages:
[1] mgcv_1.8-27       nlme_3.1-137      data.table_1.12.0

loaded via a namespace (and not attached):
[1] compiler_3.4.3  Matrix_1.2-16   tools_3.4.3     splines_3.4.3
[5] grid_3.4.3      lattice_0.20-38
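
The sizes of the failed allocations in the original message quoted below can 
be put in perspective with some arithmetic. This base R sketch copies the 
element counts and element sizes from the three 'Calloc' messages and, 
assuming that 64Gb means 64 * 2^30 bytes, expresses each request as a share of 
the memory of one node. The data.frame and its column names are made up for 
this illustration.

```r
# Element counts and element sizes (bytes) copied from the 'Calloc' errors
requests <- data.frame(
  routine = c("XWyd", "Xbd", "XWXd0"),
  count   = c(22624897, 18590685, 55315650),
  size    = c(8, 8, 24)
)

# Size of each failed request in bytes and in MiB
requests$bytes <- requests$count * requests$size
requests$MiB   <- requests$bytes / 2^20

# Share of a node's memory, assuming 64Gb = 64 * 2^30 bytes
node_bytes     <- 64 * 2^30
requests$share <- requests$bytes / node_bytes

print(requests)
```

By this arithmetic each individual request is a small fraction of the nominal 
memory of a node. It says nothing about how much memory was already in use at 
the moment a request failed.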

Thank you for your help!

Frank

________________________________
From: R-help <r-help-boun...@r-project.org> on behalf of 
r-help-requ...@r-project.org <r-help-requ...@r-project.org>
Sent: Saturday, March 16, 2019 11:00 AM
To: r-help@r-project.org
Subject: R-help Digest, Vol 193, Issue 16

Date: Fri, 15 Mar 2019 12:31:31 +0000
From: Simon Wood <simon.w...@bath.edu>
To: r-help@r-project.org
Subject: Re: [R] [mgcv] Memory issues with bam() on computer cluster
Message-ID: <d8e2643a-d960-0d86-4296-f0c7fcf14...@bath.edu>
Content-Type: text/plain; charset="utf-8"

Can you supply the results of sessionInfo() please, and the full bam
call that causes this.

best,

Simon (mgcv maintainer)

On 15/03/2019 09:09, Frank van Berkum wrote:
> Dear Community,
>
> In our current research we are trying to fit Generalized Additive Models to a 
> large dataset. We are using the package mgcv in R.
>
> Our dataset contains about 22 million records with less than 20 risk factors 
> for each observation, so in our case n>>p. The dataset covers the period 2006 
> until 2011, and we analyse both the complete dataset and datasets in which we 
> leave out a single year. The latter part is done to analyse robustness of the 
> results. We understand k-fold cross validation may seem more appropriate, but 
> our approach is closer to what is done in practice (how will one additional 
> year of information affect your estimates?).
>
> We use the function bam as advocated in Wood et al. (2017), and we apply the 
> following options: bam(..., discrete=TRUE, chunk.size=10000, gc.level=1). We 
> run these analyses on a computer cluster (see 
> https://userinfo.surfsara.nl/systems/lisa/description for details), and the 
> job is allocated to a node within the computer cluster. A node has at least 
> 16 cores and 64Gb memory.
>
> We had expected 64Gb of memory to be sufficient for these analyses, 
> especially since the bam function is built specifically for large datasets. 
> However, when applying this function to the different datasets described 
> above with different regression specifications (different risk factors 
> included in the linear predictor), we sometimes obtain errors of the 
> following form.
>
> Error in XWyd(G$Xd, w, z, G$kd, G$ks, G$ts, G$dt, G$v, G$qc, G$drop, ar.stop,  :
>    'Calloc' could not allocate memory (22624897 of 8 bytes)
> Calls: fnEstimateModel_bam -> bam -> bgam.fitd -> XWyd
> Execution halted
> Warning message:
> system call failed: Cannot allocate memory
>
> Error in Xbd(G$Xd, coef, G$kd, G$ks, G$ts, G$dt, G$v, G$qc, G$drop) :
>    'Calloc' could not allocate memory (18590685 of 8 bytes)
> Calls: fnEstimateModel_bam -> bam -> bgam.fitd -> Xbd
> Execution halted
> Warning message:
> system call failed: Cannot allocate memory
>
> Error: cannot allocate vector of size 1.7 Gb
> Timing stopped at: 2 0.556 4.831
>
> Error in system.time(oo <- .C(C_XWXd0, XWX = as.double(rep(0, (pt + nt)^2)),  :
>    'Calloc' could not allocate memory (55315650 of 24 bytes)
> Calls: fnEstimateModel_bam -> bam -> bgam.fitd -> XWXd -> system.time -> .C
> Timing stopped at: 1.056 1.396 2.459
> Execution halted
> Warning message:
> system call failed: Cannot allocate memory
>
> The errors seem to arise at different stages in the optimization process. We 
> have analysed whether these errors disappear if different settings are used 
> (different chunk.size, different gc.level), but this does not resolve our 
> problem. Also, the errors occur on different datasets when using different 
> settings, and even with identical settings an error that occurred on 
> dataset X in one run does not necessarily recur on dataset X in a 
> different run. When using the discrete=TRUE option, 
> optimization can be parallelized, but we have chosen to not employ this 
> feature to ensure memory does not have to be shared between parallel 
> processes.
>
> Naturally I cannot share our dataset with you which makes the problem 
> difficult to analyse. However, based on your collective knowledge, could you 
> point us to where the problem may occur? Is it something within the C-code 
> used within the package (as the last error seems to indicate), or is it 
> related to the computer cluster?
>
> Any help or insights are much appreciated.
>
> Kind regards,
>
> Frank
>
>        [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Simon Wood, School of Mathematics, University of Bristol, BS8 1TW UK
https://people.maths.bris.ac.uk/~sw15190/
