Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Chris Evans
[Damn: I keep forgetting that r-help sets reply-to to the individual respondent.
Sorry for the duplicate Email, Charles, but I'm sure this should be in the archives.]

Excellent: I'm sure that's it. I hadn't noticed that I'd loaded libraries in a 
cached code block. I thought I'd learned not to do that: can't believe I didn't 
check that. 

Thanks. Another hole in my foot: (re)read the pertinent manual before assuming
something is broken, Christopher!

Very best all, 

C 


-- 
Chris Evans (he/him)  Visiting Professor, University of 
Sheffield and UDLA, Quito, Ecuador 
I do some consultation work for the University of Roehampton 
 and other places 
but  remains my main Email address. I have a work web site 
at: 
https://www.psyctc.org/psyctc/ 
and a site I manage for CORE and CORE system trust at: 
http://www.coresystemtrust.org.uk/ 
I have "semigrated" to France, see: 
https://www.psyctc.org/pelerinage2016/semigrating-to-france/ 
https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/
 

If you want an Emeeting, I am trying to keep them to Thursdays and my diary is 
at: 
https://www.psyctc.org/pelerinage2016/ceworkdiary/ 
Beware: French time, generally an hour ahead of UK. 


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Jeff Newmiller
You should Google "r cache" yourself, but I have used memoise, R.cache, drake, 
and targets, and I rate targets as #1 and R.cache as #2.

If you try to retrieve old cache objects (more than a few weeks old, say) you
are likely to run into package/class changes that cause the kind of issues you
are having. As a separate task from caching, try to archive results in an
interchange format like csv, parquet, or feather to future-proof your work.
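
To make the archiving suggestion concrete, here is a minimal sketch using only base R and csv. The data frame and the file path are hypothetical placeholders, not anything from the original analysis; parquet or feather would need an extra package and are not shown.

```r
# Hypothetical result object standing in for the output of a slow analysis.
results <- data.frame(
  ID    = c("a", "b", "c"),
  score = c(1.5, 2.0, 3.25),
  stringsAsFactors = FALSE
)

# Archive the result in a plain-text interchange format.
# The path is a placeholder; a project folder would be used in practice.
archive_path <- file.path(tempdir(), "results.csv")
write.csv(results, archive_path, row.names = FALSE)

# Later, possibly under different package versions, read it back.
restored <- read.csv(archive_path, stringsAsFactors = FALSE)

# Check that the round trip preserved the values.
all.equal(restored, results)
```

A csv file stores only values, not R classes, so columns such as factors or dates would need to be converted back explicitly after reading.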

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Chris Evans
Can you point me to an example of this?  I definitely need cacheing for this
work but I don't know about data cacheing packages.  Might be one of those
things where my learning time might outweigh time saved but I lost a fair bit
of time by being stupid with this so perhaps not.



Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Jeff Newmiller
I avoid knitr (Rmarkdown uses knitr) caching like the plague. If I want
caching, I do it myself (with or without the aid of a data caching package).
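
One way "doing it myself" could look, as a minimal sketch in base R only. The helper name `cache_rds`, the file name, and the toy `slow_summary` function are all hypothetical; this is not a description of what any of the packages mentioned in this thread do internally.

```r
# Hypothetical helper: return the saved value if the cache file exists,
# otherwise evaluate `expr`, save the result, and return it.
cache_rds <- function(path, expr) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  value <- expr  # the argument is only evaluated here, on a cache miss
  saveRDS(value, path)
  value
}

cache_file <- file.path(tempdir(), "slow_summary.rds")
unlink(cache_file)  # start from an empty cache for this demonstration

# Toy stand-in for an expensive computation; n_runs counts real executions.
n_runs <- 0
slow_summary <- function() {
  n_runs <<- n_runs + 1
  Sys.sleep(0.1)
  summary(c(1, 2, 3, 4))
}

first  <- cache_rds(cache_file, slow_summary())  # computed and saved
second <- cache_rds(cache_file, slow_summary())  # read back from disk
n_runs
```

Nothing here detects that the code or the data changed, so the cache file has to be deleted by hand when the result should be recomputed.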



Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Berry, Charles
Chris,


> On Sep 18, 2021, at 12:26 PM, Chris Evans  wrote:
> 
> This question may belong somewhere else, if so, please signpost me and accept 
> apologies.
> 
> What is happening is that I have a large (for me, > 3k lines) Rmarkdown file 
> with many R code blocks (no other code or 
> engine is used) working on some large datasets.  I have some inline r like 
> 
>   There are `r n_distinct(tibDat$ID)` participants and `r nrow(tibDat)` rows 
> of data.
> 
> What I am finding is that even if one knit has worked fine and I change 
> something somewhere and knit again, the second
> knit is often failing with an error like
> 
>   n_distinct(tibDat$ID) : could not find function "n_distinct"
> 
> This is not happening for functions like nrow() from base R and it mostly 
> seems to happen to functions from the tidyverse.
> 
> I think what is happening is some sort of cache corruption presumably caused 
> by the memory demands.  I am pretty sure I've
> seen this before but a long time ago and dealt with it by deleting the files 
> and cache folders created by the knit. 

Caching things that depend on libraries is known to be tricky.

Specifically, it is advised that "loading packages via library() in a cached 
chunk and these packages will be used by uncached chunks" is something you 
should not do.  I suspect that this is the problem with your inline chunk.

I have to reread things like:

https://yihui.org/knitr/demo/cache/

and relevant parts of the manual to be sure I didn't mess something up and 
maybe you should look at that and the manual yet another time. 
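
A small sketch of the mechanism suspected above, assuming the failure comes from a package that is installed but not attached when the inline code runs. The base package tools and its function file_ext() stand in for dplyr and n_distinct() so that the example runs without extra packages; the detach() call only imitates a knit in which the cached chunk holding library() was skipped.

```r
# Imitate a knit where the chunk containing library() was not re-run:
# the package is installed but not on the search path.
if ("package:tools" %in% search()) detach("package:tools")

# An unqualified call is looked up on the search path, so it fails.
unqualified <- tryCatch(
  file_ext("report.Rmd"),
  error = function(e) conditionMessage(e)
)
unqualified

# A namespace-qualified call does not need the package to be attached.
qualified <- tools::file_ext("report.Rmd")

# Attaching the package in code that runs on every knit restores the
# unqualified call.
library(tools)
after_attach <- file_ext("report.Rmd")
```

In an Rmarkdown file the equivalent of the last step would be to keep all library() calls in a chunk with the chunk option cache=FALSE, or to write the inline code as dplyr::n_distinct(tibDat$ID).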

HTH,

Chuck



Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-19 Thread Chris Evans
Ah, I completely agree, I should have said! Definitely interested to hear from
others here though.

I cannot say how much I have learned here over a now rather frightening length
of time: hm, at least 16 years to judge from the oldest Email I've kept!  Ouch,
getting old.

I've put a plea about polymode and Rmd to the ESS help list as equipping my
Emacs/ESS to knit whole Rmd files may give me an alternative and a bit more
information.

Thanks Bert (and all for > 16 years of knowledge and occasional high drama
here!)

Chris


Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-18 Thread Bert Gunter
I think you should post on the RStudio help forums. They have specific
areas to ask for help on their stuff, at least for some of it. You may wish
to wait a bit before doing so, though, just to see if someone here responds.

Bert



[R] Cacheing of functions from libraries other than the base in Rmarkdown

2021-09-18 Thread Chris Evans
This question may belong somewhere else, if so, please signpost me and accept
apologies.

What is happening is that I have a large (for me, > 3k lines) Rmarkdown file
with many R code blocks (no other code or engine is used) working on some large
datasets.  I have some inline r like

   There are `r n_distinct(tibDat$ID)` participants and `r nrow(tibDat)` rows of data.

What I am finding is that even if one knit has worked fine and I change
something somewhere and knit again, the second knit is often failing with an
error like

   n_distinct(tibDat$ID) : could not find function "n_distinct"

This is not happening for functions like nrow() from base R and it mostly seems
to happen to functions from the tidyverse.

I think what is happening is some sort of cache corruption presumably caused by
the memory demands.  I am pretty sure I've seen this before but a long time ago
and dealt with it by deleting the files and cache folders created by the knit.
That works now too but as knitting the whole file now takes over 20 minutes, I
really don't want to have to do that.

I have found that replacing things with base functions fixes the problem every
time, e.g. replacing `r n_distinct(tibDat$ID)` with
`r length(unique(tibDat$ID))` works fine.  The other workaround is to compute
what you need for the inline computation at the end of the preceding code
block, e.g.:

   n_distinct(tibDat$ID) -> tmpN

and then

  `r tmpN` 

that works fine so I have my workarounds but I guess I have three questions:

1) do others see this?
2) is there some setting that might, assuming my guess about the cause is 
correct, increase some storage somewhere and avert this?
3) if it is a bug, where should I report it (as I'm not sure what is causing 
it!)?
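
The two workarounds described above can be sketched in plain R as follows. The small data frame is a made-up stand-in for tibDat, and the requireNamespace() fallback is only there so the sketch also runs on a machine without dplyr; neither is part of the original file.

```r
# Made-up stand-in for the real data: four rows, three distinct IDs.
tibDat <- data.frame(
  ID    = c("p1", "p2", "p2", "p3"),
  value = c(10, 20, 30, 40)
)

# Workaround 1: a base-R expression that needs no attached package.
nBase <- length(unique(tibDat$ID))

# Workaround 2: compute the value inside a code chunk and store it,
# so the inline code only has to print an existing object.
tmpN <- if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::n_distinct(tibDat$ID)   # namespace-qualified, no library() needed
} else {
  length(unique(tibDat$ID))      # fallback when dplyr is not installed
}

# What the inline sentence would then render as.
sprintf("There are %d participants and %d rows of data.",
        as.integer(tmpN), nrow(tibDat))
```

Both routes give the same count here; they differ only in whether the inline code itself has to find a function from a contributed package.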

Thanks in advance,

Chris



> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
     LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
     LC_PAPER=en_GB.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] boot_1.3-28 CECPfuns_0.0.0.9041 janitor_2.1.0 lubridate_1.7.10
     forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.0.1
     tidyr_1.1.3 tibble_3.1.4
[12] ggplot2_3.3.5 tidyverse_1.3.1 english_1.2-6 pander_0.6.4

loaded via a namespace (and not attached):
 [1] fs_1.5.0 bit64_4.0.5 RColorBrewer_1.1-2 httr_1.4.2 tools_4.1.1
     backports_1.2.1 utf8_1.2.2 R6_2.5.1 rpart_4.1-15 Hmisc_4.5-0 DBI_1.1.1
[12] colorspace_2.0-2 nnet_7.3-16 withr_2.4.2 tidyselect_1.1.1 gridExtra_2.3
     bit_4.0.4 compiler_4.1.1 cli_3.0.1 rvest_1.0.1 htmlTable_2.2.1 xml2_1.3.2
[23] labeling_0.4.2 scales_1.1.1 checkmate_2.0.0 corrr_0.4.3 odbc_1.3.2
     digest_0.6.27 readODS_1.7.0 foreign_0.8-81 rmarkdown_2.11 base64enc_0.1-3
     jpeg_0.1-9
[34] pkgconfig_2.0.3 htmltools_0.5.2 dbplyr_2.1.1 fastmap_1.1.0 RJDBC_0.2-8
     htmlwidgets_1.5.4 rlang_0.4.11 readxl_1.3.1 rstudioapi_0.13 farver_2.1.0
     generics_0.1.0
[45] jsonlite_1.7.2 magrittr_2.0.1 Formula_1.2-4 Matrix_1.3-4 Rcpp_1.0.7
     munsell_0.5.0 fansi_0.5.0 lifecycle_1.0.0 stringi_1.7.4 yaml_2.2.1
     snakecase_0.11.0
[56] grid_4.1.1 blob_1.2.2 crayon_1.4.1 lattice_0.20-44 haven_2.4.3
     splines_4.1.1 hms_1.1.0 knitr_1.34 pillar_1.6.2 reprex_2.0.1 glue_1.4.2
[67] evaluate_0.14 latticeExtra_0.6-29 data.table_1.14.0 modelr_0.1.8 png_0.1-7
     vctrs_0.3.8 tzdb_0.1.2 psy_1.1 cellranger_1.1.0 gtable_0.3.0
     assertthat_0.2.1
[78] xfun_0.26 broom_0.7.9 rsconnect_0.8.24 viridisLite_0.4.0 survival_3.2-13
     rJava_1.0-4 cluster_2.1.2 ellipsis_0.3.2


-- 
Chris Evans (he/him)  Visiting Professor, University of 
Sheffield and UDLA, Quito, Ecuador
I do some consultation work for the University of Roehampton 
 and other places
but  remains my main Email address.  I have a work web site 
at:
   https://www.psyctc.org/psyctc/
and a site I manage f