Hello.

I'm still working on my OCaml-R binding and I get a segfault in the GetNewPage() function of memory.c.

For the record, the OCaml-R binding seems to work fine with OCaml bytecode. The segfault described here is the main issue I have with OCaml native code. OCaml-R can be found at the following links.

Source code:

        http://yziquel.homelinux.org/gitweb/?p=ocaml-r.git;a=summary
        http://svn.gna.org/viewcvs/ocaml-r/branches/yziquel/

Debian packages for amd64:

        http://yziquel.homelinux.org/debian/pool/main/o/ocaml-r/

Some documentation (not entirely up to date...):

        http://yziquel.homelinux.org/topos/api/ocaml-r/R.html
        http://yziquel.homelinux.org/topos/debian-ocamlr.html

Back to my segfault:

yziq...@seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
GetNewPage (node_class=1) at memory.c:657
657             SNAP_NODE(s, base);
(gdb) backtrace
#0  GetNewPage (node_class=1) at memory.c:657
#1  0x00007ffff7993c24 in Rf_allocVector (type=16, length=1) at memory.c:2030
#2  0x00007ffff7981070 in Rf_mkString (s=0x6ae548 "require(quantmod)") at ../../src/include/Rinlinedfuns.h:582
#3  0x000000000047d63f in parse_sexp ()
#4  0x0000000000498990 in caml_c_call ()
#5  0x00007ffff7fb37e8 in ?? ()
#6  0x0000000000423aa0 in camlQuantmod__entry ()
#7  0x00007ffff7fb5820 in ?? ()
#8  0x0000000000421649 in caml_program ()
#9  0x000000000012697e in ?? ()
#10 0x00000000004989e6 in caml_start_program ()
#11 0x0000000000000000 in ?? ()
(gdb)

OCaml native code uses its own calling convention, which is why the backtrace shows so little on the OCaml side.

The segfault happens at the moment we try to evaluate "require(quantmod)". The R interpreter is already up and running when we execute this R command.

I wish to point out that this same piece of code works fine in OCaml bytecode, which suggests that my C glue is mostly correct.

In the source code, execution proceeds as follows:

  let () = ignore (R.eval_string "require(quantmod)")

We're simply trying to evaluate the "require(quantmod)" string in R.

let eval_string s = eval_langsxp (parse_sexp s)

eval_string calls

external parse_sexp : string -> sexp = "parse_sexp"

which calls into the C glue code wrapping R_ParseVector.

CAMLprim value parse_sexp (value s) {
  CAMLparam1(s);
  SEXP text ;
  SEXP pr ;
  ParseStatus status;
  PROTECT(text = mkString(String_val(s)));
  PROTECT(pr=R_ParseVector(text, 1, &status, R_NilValue));
  UNPROTECT(2);
  switch (status) {
    case PARSE_OK:
     break;
    case PARSE_INCOMPLETE:
    case PARSE_EOF:
      caml_raise_with_string(*caml_named_value("Parse_incomplete"), (String_val(s)));
    case PARSE_NULL:
    case PARSE_ERROR:
      caml_raise_with_string(*caml_named_value("Parse_error"), (String_val(s)));
      }
  CAMLreturn(Val_sexp(VECTOR_ELT(pr,0)));
}

But before calling R_ParseVector, it allocates an R string holding the command.

  PROTECT(text = mkString(String_val(s)));

It is this call to mkString that segfaults. String_val is essentially a macro that casts an OCaml value to a char *.

yziq...@seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
(gdb) set breakpoint pending on
(gdb) break Rf_mkString
Breakpoint 1 at 0x420858
(gdb) run
Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
[Thread debugging using libthread_db enabled]

Breakpoint 1, Rf_mkString (s=0x6ae548 "require(quantmod)") at ../../src/include/Rinlinedfuns.h:582
582         PROTECT(t = allocVector(STRSXP, 1));
(gdb) step
579     {
(gdb)
582         PROTECT(t = allocVector(STRSXP, 1));
(gdb)
Rf_allocVector (type=16, length=1) at memory.c:1916
1916    {
(gdb) next
1924        if (length < 0 )
(gdb)
1928        switch (type) {
(gdb)
1978        if (length <= 0)
(gdb)
1984        size = PTR2VEC(length);
(gdb)
2000        if (size <= NodeClassSize[1]) {
(gdb)
2017        old_R_VSize = R_VSize;
(gdb)
2020        if (FORCE_GC || NO_FREE_NODES() || VHEAP_FREE() < alloc_size) {
(gdb)
2017        old_R_VSize = R_VSize;
(gdb)
2020        if (FORCE_GC || NO_FREE_NODES() || VHEAP_FREE() < alloc_size) {
(gdb)
2028        if (size > 0) {
(gdb)
2029        if (node_class < NUM_SMALL_NODE_CLASSES) {
(gdb)
2030        CLASS_GET_FREE_NODE(node_class, s);
(gdb)
Program received signal SIGSEGV, Segmentation fault.
GetNewPage (node_class=1) at memory.c:657
657             SNAP_NODE(s, base);
(gdb)

So CLASS_GET_FREE_NODE is #defined in memory.c as:

#define CLASS_GET_FREE_NODE(c,s) do { \
  SEXP __n__ = R_GenHeap[c].Free; \
  if (__n__ == R_GenHeap[c].New) { \
    GetNewPage(c); \
    __n__ = R_GenHeap[c].Free; \
  } \
  R_GenHeap[c].Free = NEXT_NODE(__n__); \
  R_NodesInUse++; \
  (s) = __n__; \
} while (0)

and this is where GetNewPage gets called.

yziq...@seldon:~/git/ocaml-finquote$ gdb -silent -d /home/yziquel/src/r-base-2.10.1/src/main/ _build/test/test.native
Reading symbols from /home/yziquel/git/ocaml-finquote/_build/test/test.native...(no debugging symbols found)...done.
(gdb) set breakpoint pending on
(gdb) break GetNewPage
Function "GetNewPage" not defined.
Breakpoint 1 (GetNewPage) pending.
(gdb) run
Starting program: /home/yziquel/git/ocaml-finquote/_build/test/test.native
[Thread debugging using libthread_db enabled]

Breakpoint 1, GetNewPage (node_class=1) at memory.c:629
629     {
(gdb) n
635         node_size = NODE_SIZE(node_class);
(gdb)
638         page = malloc(R_PAGE_SIZE);
(gdb)
639         if (page == NULL) {
(gdb)
638         page = malloc(R_PAGE_SIZE);
(gdb)
639         if (page == NULL) {
(gdb)
646         R_ReportNewPage();
(gdb)
648         page->next = R_GenHeap[node_class].pages;
(gdb)
653         base = R_GenHeap[node_class].New;
(gdb)
648         page->next = R_GenHeap[node_class].pages;
(gdb)
653         base = R_GenHeap[node_class].New;
(gdb)
648         page->next = R_GenHeap[node_class].pages;
(gdb)
653         base = R_GenHeap[node_class].New;
(gdb)
648         page->next = R_GenHeap[node_class].pages;
(gdb)
650         R_GenHeap[node_class].PageCount++;
(gdb)
654         for (i = 0; i < page_count; i++, data += node_size) {
(gdb)
649         R_GenHeap[node_class].pages = page;
(gdb)
653         base = R_GenHeap[node_class].New;
(gdb)
654         for (i = 0; i < page_count; i++, data += node_size) {
(gdb)
652         data = PAGE_DATA(page);
(gdb)
669         SET_NODE_CLASS(s, node_class);
(gdb)
657         SNAP_NODE(s, base);
(gdb)
Program received signal SIGSEGV, Segmentation fault.
GetNewPage (node_class=1) at memory.c:657
657             SNAP_NODE(s, base);
(gdb)

and SNAP_NODE is:

/* snap in node s before node t */
#define SNAP_NODE(s,t) do { \
  SEXP sn__n__ = (s); \
  SEXP next = (t); \
  SEXP prev = PREV_NODE(next); \
  SET_NEXT_NODE(sn__n__, next); \
  SET_PREV_NODE(next, sn__n__); \
  SET_NEXT_NODE(prev, sn__n__); \
  SET_PREV_NODE(sn__n__, prev); \
} while (0)

I do not know how to track the segfault any further except by looking at the disassembled machine code. However, the machine code appears to have been optimised and is not easy to read, so I would appreciate it if someone could take the time to look for any obvious reason why there may be a segfault. Any background information or pointers that would help me understand precisely what the memory allocation code is supposed to be doing would also be very much appreciated.

All the best,

--
     Guillaume Yziquel
http://yziquel.homelinux.org/

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel