Re: [Rd] RProfmem output format

2017-05-30 Thread Henrik Bengtsson
UPDATE: Tomas Kalibera has fixed this bug (on missing newlines in
Rprofmem output) in R-devel r72747
(https://github.com/wch/r-source/commit/ba6665deace4a8dc239aebec977c17d0975fbc27).
  Using the example of this thread, Rprofmem() of R-devel now outputs
with newlines:

$ R Under development (unstable) (2017-05-30 r72747) -- "Unsuffered
Consequences"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
[...]

> Rprofmem(); x <- raw(2000); Rprofmem("")
> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n")
232 :
472 :
472 :
1064 :
2040 :"raw"

/Henrik

On Sat, Jun 4, 2016 at 12:40 PM, Henrik Bengtsson
 wrote:
> I'm picking up this 5-year old thread.
>
> 1. About the four memory allocations without a stacktrace
>
> I think the four memory allocations without a stacktrace reported by 
> Rprofmem():
>
>> Rprofmem(); x <- raw(2000); Rprofmem("")
>> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n")
> 192 :360 :360 :1064 :2040 :"raw"
>
> are due to some initialization of R that is independent of Rprofmem(),
> because they can be avoided if one allocates some memory before (in a
> fresh R session):
>
>> z <- raw(1000); dummy <- gc()
>> Rprofmem(); x <- raw(2000); Rprofmem("")
>> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n")
> 2040 :"raw"
>
>
> 2. About missing newlines when stacktrace is empty
>
> As a refresher, the problem is that memory allocations an empty
> stracktrace are reported without newlines, i.e.
>
> 192 :360 :360 :1064 :2040 :"raw"
>
> The question is why this is not reported as:
>
> 192 :
> 360 :
> 360 :
> 1064 :
> 2040 :"raw"
>
> This was/is because C function R_OutputStackTrace() - part of
> src/main/memory.c  - looks like:
>
> static void R_OutputStackTrace(FILE *file)
> {
> int newline = 0;
> RCNTXT *cptr;
>
> for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
> if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
>&& TYPEOF(cptr->call) == LANGSXP) {
>SEXP fun = CAR(cptr->call);
>if (!newline) newline = 1;
>fprintf(file, "\"%s\" ",
>TYPEOF(fun) == SYMSXP ? CHAR(PRINTNAME(fun)) :
>"");
> }
> }
> if (newline) fprintf(file, "\n");
> }
>
>
> Thomas, your last comment was:
>
>> Yes. It's obviously better to always print a newline, and so clearly
>> deliberate not to, that I suspect there may have been a good reason.
>> If I can't work it out (after my grant deadline this week) I will just
>> assume it's wrong.
>
> When I search the code and the commit history
> (https://github.com/wch/r-source/commit/3d5eb2a09f2d75893efdc8bbf1c72d17603886a0),
> it appears that this was there from the very first commit.  Also,
> searching the code for usages of R_OutputStackTrace(), I only find
> R_ReportAllocation() and R_ReportNewPage(), both part of of
> src/main/memory.c (see below).
>
> static void R_ReportAllocation(R_size_t size)
> {
> if (R_IsMemReporting) {
> if(size > R_MemReportingThreshold) {
>fprintf(R_MemReportingOutfile, "%lu :", (unsigned long) size);
>R_OutputStackTrace(R_MemReportingOutfile);
> }
> }
> return;
> }
>
> static void R_ReportNewPage(void)
> {
> if (R_IsMemReporting) {
> fprintf(R_MemReportingOutfile, "new page:");
> R_OutputStackTrace(R_MemReportingOutfile);
> }
> return;
> }
>
>
> Could it be that when you wrote it you had another usage for
> R_OutputStackTrace() in mind as well?  If so, it makes sense that
> R_OutputStackTrace() shouldn't output a newline if the stack trace was
> empty.  But if the above is the only usage, to me it looks pretty safe
> to always add a newline.
>
>> sessionInfo()
> R version 3.3.0 Patched (2016-05-26 r70682)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 7 x64 (build 7601) Service Pack 1
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> /Henrik
>
>
> On Sun, May 15, 2011 at 1:16 PM, Thomas Lumley  wrote:
>> On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham  wrote:
>>> So what causes allocations when the call stack is empty?  Something
>>> internal?  Does the garbage collector trigger allocations (i.e. could
>>> it be caused by moving data to contiguous memory)?
>>
>> The garbage collector doesn't move anything, it just swaps pointers in
>> a linked list.
>>
>> The lexer, parser, and evaluator all have  to do some work before a
>> function context is set up for the top-level function, so I assume
>> that's where it is happening.
>>
>>> Any ideas what the correct thing to do with these memory allocations?
>>> Ignore them because they're not really related to the function they're
>>> attributed 

Re: [Rd] RProfmem output format

2016-06-04 Thread Henrik Bengtsson
I'm picking up this 5-year old thread.

1. About the four memory allocations without a stacktrace

I think the four memory allocations without a stacktrace reported by Rprofmem():

> Rprofmem(); x <- raw(2000); Rprofmem("")
> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n")
192 :360 :360 :1064 :2040 :"raw"

are due to some initialization of R that is independent of Rprofmem(),
because they can be avoided if one allocates some memory before (in a
fresh R session):

> z <- raw(1000); dummy <- gc()
> Rprofmem(); x <- raw(2000); Rprofmem("")
> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n")
2040 :"raw"


2. About missing newlines when stacktrace is empty

As a refresher, the problem is that memory allocations an empty
stracktrace are reported without newlines, i.e.

192 :360 :360 :1064 :2040 :"raw"

The question is why this is not reported as:

192 :
360 :
360 :
1064 :
2040 :"raw"

This was/is because C function R_OutputStackTrace() - part of
src/main/memory.c  - looks like:

static void R_OutputStackTrace(FILE *file)
{
int newline = 0;
RCNTXT *cptr;

for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
   && TYPEOF(cptr->call) == LANGSXP) {
   SEXP fun = CAR(cptr->call);
   if (!newline) newline = 1;
   fprintf(file, "\"%s\" ",
   TYPEOF(fun) == SYMSXP ? CHAR(PRINTNAME(fun)) :
   "");
}
}
if (newline) fprintf(file, "\n");
}


Thomas, your last comment was:

> Yes. It's obviously better to always print a newline, and so clearly
> deliberate not to, that I suspect there may have been a good reason.
> If I can't work it out (after my grant deadline this week) I will just
> assume it's wrong.

When I search the code and the commit history
(https://github.com/wch/r-source/commit/3d5eb2a09f2d75893efdc8bbf1c72d17603886a0),
it appears that this was there from the very first commit.  Also,
searching the code for usages of R_OutputStackTrace(), I only find
R_ReportAllocation() and R_ReportNewPage(), both part of of
src/main/memory.c (see below).

static void R_ReportAllocation(R_size_t size)
{
if (R_IsMemReporting) {
if(size > R_MemReportingThreshold) {
   fprintf(R_MemReportingOutfile, "%lu :", (unsigned long) size);
   R_OutputStackTrace(R_MemReportingOutfile);
}
}
return;
}

static void R_ReportNewPage(void)
{
if (R_IsMemReporting) {
fprintf(R_MemReportingOutfile, "new page:");
R_OutputStackTrace(R_MemReportingOutfile);
}
return;
}


Could it be that when you wrote it you had another usage for
R_OutputStackTrace() in mind as well?  If so, it makes sense that
R_OutputStackTrace() shouldn't output a newline if the stack trace was
empty.  But if the above is the only usage, to me it looks pretty safe
to always add a newline.

> sessionInfo()
R version 3.3.0 Patched (2016-05-26 r70682)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

/Henrik


On Sun, May 15, 2011 at 1:16 PM, Thomas Lumley  wrote:
> On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham  wrote:
>> So what causes allocations when the call stack is empty?  Something
>> internal?  Does the garbage collector trigger allocations (i.e. could
>> it be caused by moving data to contiguous memory)?
>
> The garbage collector doesn't move anything, it just swaps pointers in
> a linked list.
>
> The lexer, parser, and evaluator all have  to do some work before a
> function context is set up for the top-level function, so I assume
> that's where it is happening.
>
>> Any ideas what the correct thing to do with these memory allocations?
>> Ignore them because they're not really related to the function they're
>> attributed to?  Sum them up?
>>
>>> I don't see why this is done, and I may well be the person who did it
>>> (I don't have svn on this computer to check), but it is clearly
>>> deliberate.
>>
>> It seems like it would be more consistent to always print a newline,
>> and then it would obvious those allocations occurred when the call
>> stack was empty.  This would make parsing the file a little bit
>> easier.
>
> Yes. It's obviously better to always print a newline, and so clearly
> deliberate not to, that I suspect there may have been a good reason.
> If I can't work it out (after my grant deadline this week) I will just
> assume it's wrong.
>
>
>-thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RProfmem output format

2011-05-15 Thread Hadley Wickham
 In the subsequence lines I'm assuming the structure is bytes allocated : 
 call.

 I think the five numbers come from four memory allocations before
 example() is called.  Looking at the code in src/main/memory.c, it
 prints newline only when the call stack is not empty.

Looking into that example in more detail, here's the distribution of
allocation numbers:

   134
4621   302

(with a threshold of 5k)

So it happens ~30 times in total.

So what causes allocations when the call stack is empty?  Something
internal?  Does the garbage collector trigger allocations (i.e. could
it be caused by moving data to contiguous memory)?

Any ideas what the correct thing to do with these memory allocations?
Ignore them because they're not really related to the function they're
attributed to?  Sum them up?

 I don't see why this is done, and I may well be the person who did it
 (I don't have svn on this computer to check), but it is clearly
 deliberate.

It seems like it would be more consistent to always print a newline,
and then it would obvious those allocations occurred when the call
stack was empty.  This would make parsing the file a little bit
easier.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RProfmem output format

2011-05-15 Thread Hadley Wickham
Also, would you mind commenting how RProfmem is misleading?

There are three ways to profile memory use over time in R code. ...
All can be misleading, for different reasons.
--- 
http://cran.r-project.org/doc/manuals/R-exts.html#Profiling-R-code-for-memory-use

The other two ways describe why they are misleading.

Hadley

On Sun, May 15, 2011 at 8:02 AM, Hadley Wickham had...@rice.edu wrote:
 In the subsequence lines I'm assuming the structure is bytes allocated : 
 call.

 I think the five numbers come from four memory allocations before
 example() is called.  Looking at the code in src/main/memory.c, it
 prints newline only when the call stack is not empty.

 Looking into that example in more detail, here's the distribution of
 allocation numbers:

   1    3    4
 4621   30    2

 (with a threshold of 5k)

 So it happens ~30 times in total.

 So what causes allocations when the call stack is empty?  Something
 internal?  Does the garbage collector trigger allocations (i.e. could
 it be caused by moving data to contiguous memory)?

 Any ideas what the correct thing to do with these memory allocations?
 Ignore them because they're not really related to the function they're
 attributed to?  Sum them up?

 I don't see why this is done, and I may well be the person who did it
 (I don't have svn on this computer to check), but it is clearly
 deliberate.

 It seems like it would be more consistent to always print a newline,
 and then it would obvious those allocations occurred when the call
 stack was empty.  This would make parsing the file a little bit
 easier.

 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/




-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RProfmem output format

2011-05-15 Thread Thomas Lumley
On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham had...@rice.edu wrote:
 So what causes allocations when the call stack is empty?  Something
 internal?  Does the garbage collector trigger allocations (i.e. could
 it be caused by moving data to contiguous memory)?

The garbage collector doesn't move anything, it just swaps pointers in
a linked list.

The lexer, parser, and evaluator all have  to do some work before a
function context is set up for the top-level function, so I assume
that's where it is happening.

 Any ideas what the correct thing to do with these memory allocations?
 Ignore them because they're not really related to the function they're
 attributed to?  Sum them up?

 I don't see why this is done, and I may well be the person who did it
 (I don't have svn on this computer to check), but it is clearly
 deliberate.

 It seems like it would be more consistent to always print a newline,
 and then it would obvious those allocations occurred when the call
 stack was empty.  This would make parsing the file a little bit
 easier.

Yes. It's obviously better to always print a newline, and so clearly
deliberate not to, that I suspect there may have been a good reason.
If I can't work it out (after my grant deadline this week) I will just
assume it's wrong.


   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RProfmem output format

2011-05-14 Thread Thomas Lumley
On Sat, May 14, 2011 at 5:00 AM, Hadley Wickham had...@rice.edu wrote:
 Hi all,

 When I run the example in RProfmem, I get:


     Rprofmem(Rprofmem.out, threshold=1000)
     example(glm)
     Rprofmem(NULL)
     noquote(readLines(Rprofmem.out, n=5))

 ...

 [1] 1384 :5416 :5416 :1064 :1064 :readRDS index.search example
 [2] 1064 :readRDS index.search example
 [3] 4712 :readRDS index.search example
 [4] 4712 :readRDS index.search example
 [5] 1064 :readRDS index.search example

 What do 5 the numbers in the first line mean?

 In the subsequence lines I'm assuming the structure is bytes allocated : call.

I think the five numbers come from four memory allocations before
example() is called.  Looking at the code in src/main/memory.c, it
prints newline only when the call stack is not empty.

I don't see why this is done, and I may well be the person who did it
(I don't have svn on this computer to check), but it is clearly
deliberate.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel