Package: r-cran-hdf5
Version: 1.6.10-3+b2
Severity: normal

Attached to this bug report are two HDF5 files, 'title.hdf' and
'no-title.hdf', and the Python script that generated them (using pytables).
According to h5dump, the only difference between the two is:

--- title.hdf
+++ no-title.hdf
...
       ATTRIBUTE "TITLE" {
          DATATYPE  H5T_STRING {
             STRSIZE 1;
             STRPAD H5T_STR_NULLTERM;
             CSET H5T_CSET_UTF8;
             CTYPE H5T_C_S1;
          }
-         DATASPACE  SCALAR
+         DATASPACE  NULL
          DATA {
-         (0): "M"
          }
       }

Calling hdf5load() on 'title.hdf' works fine, but calling it on
'no-title.hdf' crashes the R interpreter with a segmentation fault.
Here is the session under gdb:

(gdb) r
Starting program: /usr/lib/R/bin/exec/R 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

R version 3.2.4 Revised (2016-03-16 r70336) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(hdf5)
> hdf5load('no-title.hdf', verbosity=99)
hdf5_global_verbosity=99 load=1
Processing object: M ...... its a dataset...Dataset has ID83886080
Dataset has tid 50331742
Dataset has space id 67108866
Dataset has rank 1
Dataset has dims/maxdims: 1 / 1 
Allocating vector with rank=1 dim=1
calling vector_io. Hangs here with big datsets
in vector_io: rank=1
in vector_io:size 0 = 1 into n_elements......=1
 Setting buffer size in plist
About to read with bufsize = 8
 Done read
in vector_io: permuting
in vector_io: tidying
Phew. Done it. calling iinfo->add
Rank > 1 or not VECSXP
Calling  hdf5_load_attributes 
Processing attribute 1 called CLASS
attribute CLASS has rank 0 
Rank 0 attribute treated as rank 1 size 1
Attribute is a string
in string_ref: count=1, size=5 srcbf=5
leaving string_ref
string length of new name =6
Done processing attribute CLASS
Processing attribute 2 called VERSION
attribute VERSION has rank 0 
Rank 0 attribute treated as rank 1 size 1
Attribute is a string
in string_ref: count=1, size=3 srcbf=3
leaving string_ref
string length of new name =8
Done processing attribute VERSION
Processing attribute 3 called TITLE
attribute TITLE has rank 0 
Rank 0 attribute treated as rank 1 size 1
Attribute is a string

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff70b606a in strlen () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff70b606a in strlen () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7915af9 in Rf_mkChar () from /usr/lib/R/lib/libR.so
#2  0x00007ffff453148a in hdf5_process_attribute ()
   from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#3  0x00007ffff3130c3b in H5A_attr_iterate_table ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#4  0x00007ffff32000bc in H5O_attr_iterate_real ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#5  0x00007ffff32007c7 in H5O_attr_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#6  0x00007ffff312e661 in H5Aiterate1 ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#7  0x00007ffff452e466 in ?? () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#8  0x00007ffff452ff9b in ?? () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#9  0x00007ffff31b07ed in ?? ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#10 0x00007ffff31b6449 in H5G__node_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#11 0x00007ffff3135763 in ?? ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#12 0x00007ffff3136c26 in H5B_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#13 0x00007ffff31bbc4f in H5G__stab_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#14 0x00007ffff31b8d90 in H5G__obj_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#15 0x00007ffff31b17e8 in H5G_iterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#16 0x00007ffff31ae4fa in H5Giterate ()
   from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#17 0x00007ffff4531cdf in do_hdf5load ()
   from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#18 0x00007ffff78f7331 in ?? () from /usr/lib/R/lib/libR.so
#19 0x00007ffff792f7af in Rf_eval () from /usr/lib/R/lib/libR.so
#20 0x00007ffff7931ca8 in ?? () from /usr/lib/R/lib/libR.so
#21 0x00007ffff792f5a1 in Rf_eval () from /usr/lib/R/lib/libR.so
#22 0x00007ffff7930c65 in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#23 0x00007ffff792f37d in Rf_eval () from /usr/lib/R/lib/libR.so
#24 0x00007ffff7956bc2 in Rf_ReplIteration () from /usr/lib/R/lib/libR.so
#25 0x00007ffff7956f41 in ?? () from /usr/lib/R/lib/libR.so
#26 0x00007ffff7956ff4 in run_Rmainloop () from /usr/lib/R/lib/libR.so
#27 0x00000000004007db in main ()
(gdb) disas
Dump of assembler code for function strlen:
   0x00007ffff70b6040 <+0>:     pxor   %xmm8,%xmm8
   0x00007ffff70b6045 <+5>:     pxor   %xmm9,%xmm9
   0x00007ffff70b604a <+10>:    pxor   %xmm10,%xmm10
   0x00007ffff70b604f <+15>:    pxor   %xmm11,%xmm11
   0x00007ffff70b6054 <+20>:    mov    %rdi,%rax
   0x00007ffff70b6057 <+23>:    mov    %rdi,%rcx
   0x00007ffff70b605a <+26>:    and    $0xfff,%rcx
   0x00007ffff70b6061 <+33>:    cmp    $0xfcf,%rcx
   0x00007ffff70b6068 <+40>:    ja     0x7ffff70b60d0 <strlen+144>
=> 0x00007ffff70b606a <+42>:    movdqu (%rax),%xmm12
[...etc...]
(gdb) info registers
rax            0x0      0
rbx            0x0      0
rcx            0x0      0
rdx            0x1      1
rsi            0x1      1
rdi            0x0      0
rbp            0x7fffffffbb00   0x7fffffffbb00
rsp            0x7fffffffba48   0x7fffffffba48
r8             0x0      0
r9             0x20     32
r10            0x7ffff7fd0780   140737353942912
r11            0x0      0
r12            0x1      1
r13            0x11d8880        18712704
r14            0xaa3488 11154568
r15            0x4000003        67108867
rip            0x7ffff70b606a   0x7ffff70b606a <strlen+42>
eflags         0x10293  [ CF AF SF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

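rdi (the first argument register) and rax are both 0 at the faulting
instruction, so it looks like Rf_mkChar(), called from
hdf5_process_attribute(), tried to call strlen() on a null pointer.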

I believe that this is *not* a bug in pytables: if I'm reading the HDF5
spec correctly, DATASPACE NULL in this context is a valid way to spell
the empty string. But even if it does turn out to be a pytables bug, under
no circumstances should loading an HDF5 file, no matter how ill-formed, be
able to crash the R interpreter.
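
The behaviour I would expect can be modelled in a few lines. This is only a
sketch, written in Python rather than the package's C, and attr_string() is a
hypothetical helper, not anything from the r-cran-hdf5 source. It assumes that
an attribute with DATASPACE NULL yields no data buffer at all (None here, a
null pointer in C), and treats that case as the empty string instead of
dereferencing it.

```python
from typing import Optional


def attr_string(buf: Optional[bytes]) -> str:
    """Decode a string attribute buffer, treating a missing buffer as ''.

    `buf` stands in for the raw bytes read from the attribute; None models
    the null pointer that (by assumption) results from DATASPACE NULL.
    """
    if buf is None:
        # No data to read: return an empty string rather than crashing.
        return ""
    # STRPAD H5T_STR_NULLTERM: the string ends at the first NUL byte, if any.
    return buf.split(b"\x00", 1)[0].decode("utf-8")


print(repr(attr_string(b"M")))   # the TITLE attribute in title.hdf
print(repr(attr_string(None)))   # the TITLE attribute in no-title.hdf
```

An equivalent check before the call to Rf_mkChar() would presumably be enough
to avoid the segfault, but I have not looked at the package source to confirm
where the pointer comes from.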

zw

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.4.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages r-cran-hdf5 depends on:
ii  libc6                  2.22-5
ii  libhdf5-10             1.8.16+docs-7
ii  r-base-core [r-api-3]  3.2.4-revised-1
ii  zlib1g                 1:1.2.8.dfsg-2+b1

r-cran-hdf5 recommends no packages.

r-cran-hdf5 suggests no packages.

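-- no debconf information

# Script that generated the two attached HDF5 files (uses pytables).
import tables

# Empty title: TITLE attribute is written with DATASPACE NULL.
with tables.open_file('no-title.hdf', 'w') as f:
    f.create_array(f.root, 'M', [1.0], title='')

# One-character title: TITLE attribute is written with DATASPACE SCALAR.
with tables.open_file('title.hdf', 'w') as f:
    f.create_array(f.root, 'M', [1.0], title='M')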

Attachment: title.hdf
Description: HDF file

Attachment: no-title.hdf
Description: HDF file
