Re: [Rd] [External] Re: Operations with long altrep vectors cause segfaults on Windows

Tomas Kalibera Tue, 08 Sep 2020 08:21:02 -0700

On 9/8/20 4:48 PM, Hugh Parsonage wrote:

Unfortunately I only get


[Thread 21752.0x4aa8 exited with code 3221225477]
[Thread 21752.0x4514 exited with code 3221225477]
[Thread 21752.0x3f10 exited with code 3221225477]
[Inferior 1 (process 21752) exited with code 030000000005]

(I'm guessing I would need to build an instrumented version of R, or
can R be debugged using gdb with an off-the-shelf installation?)

No, the default build lacks debug symbols. You need a build with debug symbols, and if you can reproduce in a build without compiler optimizations (-O0), the backtrace may be easier to interpret. Some bugs, however, "disappear" when optimizations are disabled. You can build R from source (and there may be debug builds provided by someone else (Jeroen?)).


Tomas


On Wed, 9 Sep 2020 at 00:32, <luke-tier...@uiowa.edu> wrote:

On Tue, 8 Sep 2020, Hugh Parsonage wrote:

Thanks Martin.  On further testing, it seems that the segmentation
fault can only occur when the amount of obtainable memory is
sufficiently high. On my machine (admittedly with other processes
running):

$ R --vanilla --max-mem-size=30G -e "x <- c(0L, -2e9:2e9)"
Segmentation fault

$ R --vanilla --max-mem-size=29G -e "x <- c(0L, -2e9:2e9)"
Error: cannot allocate vector of size 14.9 Gb
Execution halted

Unfortunately I don't have access to a Windows machine with enough
memory to get to the point of failure. If you have rtools and gdb
installed can you run in gdb and see where the segfault is happening?

Best,

luke

On Tue, 8 Sep 2020 at 18:52, Martin Maechler <maech...@stat.math.ethz.ch> wrote:

Martin Maechler
     on Tue, 8 Sep 2020 10:40:24 +0200 writes:
Hugh Parsonage
     on Tue, 8 Sep 2020 18:08:11 +1000 writes:

    >> I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):

    >> $> R --vanilla
    >> x <- c(0L, -2e9:2e9)

    >> # > Segmentation fault

    >> Tried to reproduce on Linux but the above worked as expected. Not an
    >> issue merely with the length of the vector; for example, x <-
    >> rep_len(1:10, 1e10) works, though the altrep vector must be long to
    >> reproduce:

    >> x <- c(0L, -1e9:1e9)  #ok

    >> Segmentation faults occur with the following too:

    >> x <- (-2e9:2e9) + 1L

    > Your operation would "need" (not in theory, but in practice)
    > to go from altrep to regular vectors.
    > I guess the segfault occurs because of something like this :

    > R asks Windows to hand it a huge amount of memory and Windows replies
    > "ok, here is the memory pointer"
    > and then R tries to write to there, but illegally (because
    > Windows should have told R that it does not really have enough
    > memory for that ..).

    > I cannot reproduce the segmentation fault .. but I can confirm
    > there is a bug there that shows for me on Windows but not on
    > Linux:

    > "My" Windows is on a terminal server without many GB of memory
    > (but then in a version of Windows that recognizes that it cannot
    > get so much memory):

    > ------------------------- Here some transcript (thanks to
    > using Emacs w/ ESS also on Windows) ------------------

    > R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
    > Copyright (C) 2020 The R Foundation for Statistical Computing
    > Platform: x86_64-w64-mingw32/x64 (64-bit)

    > R is free software and comes with ABSOLUTELY NO WARRANTY.
    > You are welcome to redistribute it under certain conditions.
    > Type 'license()' or 'licence()' for distribution details.

    > R is a collaborative project with many contributors.
    > Type 'contributors()' for more information and
    > 'citation()' on how to cite R or R packages in publications.

    > Type 'demo()' for some demos, 'help()' for on-line help, or
    > 'help.start()' for an HTML browser interface to help.
    > Type 'q()' to quit R.

    >> x <- (-2e9:2e9) + 1L
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren  [Error: cannot allocate vector of size 14.9 Gb]
    >> y <- c(0L, -2e9:2e9)
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren  [Error: cannot allocate vector of size 14.9 Gb]
    >> Sys.setenv(LANGUAGE="en")
    >> y <- c(0L, -2e9:2e9)
    > Error: cannot allocate vector of size 14.9 Gb
    >> y <- -1e9:4e9
    >> .Internal(inspect(y))
    > @0x00000000195a6808 14 REALSXP g0c0 [REF(65535)]  -1000000000 : -294967296 (compact)
    >> .Machine$integer.max / 1e9
    > [1] 2.147484
    >> y <- -1e6:2.2e9
    >> .Internal(inspect(y))
    > @0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)]  -1000000 : -2094967296 (compact)
    >> y <- -1e6:2e9
    >> .Internal(inspect(y))
    > @0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)]  -1000000 : 2000000000 (compact)
    >>
    > ------------------------- end of transcript -----------------------------------

    > So indeed, no seg.fault, R notices that it can't get 15 GB of
    > memory.

    > But the bug is bad news:  We have *silent* integer overflow happening
    > according to what  .Internal(inspect(y)) shows...

    > .... less bad news: probably the bug is only in the 'internal inspect' code,
    > where a format specifier is used in C's printf() that does not work
    > correctly on Windows, at least the way it is currently compiled ..


    > On (64-bit) Linux, I get

    >> y <- -1e9:4e9 ; .Internal(inspect(y))
    > @7d86388 14 REALSXP g0c0 [REF(65535)]  -1000000000 : 4000000000 (compact)

    >> y <- c(0L, y)
    > Error: cannot allocate vector of size 37.3 Gb

    > which seems much better ... until I do find a bug, maybe again
    > only in the C code underlying .Internal(inspect(.)) :

    >> y <- -1e9:2e9 ; .Internal(inspect(y))
    > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
    >>

Indeed, the purported "integer overflow" (above) does not
happen.
It is "only" a  'printf' related bug inside .Internal(inspect(.)) on Windows.

*interestingly*, the above bug I've noticed on (64-bit) Linux
does *not* show on Windows (64-bit), at least not for that case:

On Windows, things are fine as long as they remain (compacted
aka 'ALTREP') INTSXP:

  > y <- -1e3:2e9 ;.Internal(inspect(y))
   @0x000000000a285648 13 INTSXP g0c0 [REF(65535)]  -1000 : 2000000000 (compact)
  > y <- -1e3:2.1e9 ;.Internal(inspect(y))
   @0x0000000019925930 13 INTSXP g0c0 [REF(65535)]  -1000 : 2100000000 (compact)

and here, y is correct, just the printing from
.Internal(inspect(y)) is buggy (probably prints the double as an integer):

  > y <- -1e3:2.2e9 ; .Internal(inspect(y))
   @0x00000000195c0178 14 REALSXP g0c0 [REF(65535)]  -1000 : -2094967296 (compact)
  > length(y)
   [1] 2200001001
  > tail(y)
   [1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
  > tail(y) - 2.2e9
   [1] -5 -4 -3 -2 -1  0
  >

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
     Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
