Package: r-cran-hdf5
Version: 1.6.10-3+b2
Severity: normal

Attached to this bugreport are two HDF5 files named 'title.hdf' and
'no-title.hdf', and the Python script that generated them (using
pytables).  According to h5dump, the only difference between the two is

--- title.hdf
+++ no-title.hdf
...
       ATTRIBUTE "TITLE" {
          DATATYPE  H5T_STRING {
             STRSIZE 1;
             STRPAD H5T_STR_NULLTERM;
             CSET H5T_CSET_UTF8;
             CTYPE H5T_C_S1;
          }
-         DATASPACE  SCALAR
+         DATASPACE  NULL
          DATA {
-         (0): "M"
          }
       }

Calling hdf5load() on 'title.hdf' works fine, but calling it on
'no-title.hdf' crashes the R interpreter.

(gdb) r
Starting program: /usr/lib/R/bin/exec/R
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

R version 3.2.4 Revised (2016-03-16 r70336) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(hdf5)
> hdf5load('no-title.hdf', verbosity=99)
hdf5_global_verbosity=99 load=1
Processing object: M ...... its a dataset...Dataset has ID83886080
Dataset has tid 50331742
Dataset has space id 67108866
Dataset has rank 1
Dataset has dims/maxdims: 1 / 1
Allocating vector with rank=1 dim=1
calling vector_io. Hangs here with big datsets
in vector_io: rank=1
in vector_io:size 0 = 1
into n_elements......=1
Setting buffer size in plist
About to read with bufsize = 8
Done read
in vector_io: permuting
in vector_io: tidying
Phew. Done it.
calling iinfo->add
Rank > 1 or not VECSXP
Calling hdf5_load_attributes
Processing attribute 1 called CLASS
attribute CLASS has rank 0
Rank 0 attribute treated as rank 1 size 1
Attribute is a string
in string_ref: count=1, size=5 srcbf=5
leaving string_ref
string length of new name =6
Done processing attribute CLASS
Processing attribute 2 called VERSION
attribute VERSION has rank 0
Rank 0 attribute treated as rank 1 size 1
Attribute is a string
in string_ref: count=1, size=3 srcbf=3
leaving string_ref
string length of new name =8
Done processing attribute VERSION
Processing attribute 3 called TITLE
attribute TITLE has rank 0
Rank 0 attribute treated as rank 1 size 1
Attribute is a string

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff70b606a in strlen () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff70b606a in strlen () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7915af9 in Rf_mkChar () from /usr/lib/R/lib/libR.so
#2  0x00007ffff453148a in hdf5_process_attribute () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#3  0x00007ffff3130c3b in H5A_attr_iterate_table () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#4  0x00007ffff32000bc in H5O_attr_iterate_real () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#5  0x00007ffff32007c7 in H5O_attr_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#6  0x00007ffff312e661 in H5Aiterate1 () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#7  0x00007ffff452e466 in ?? () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#8  0x00007ffff452ff9b in ?? () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#9  0x00007ffff31b07ed in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#10 0x00007ffff31b6449 in H5G__node_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#11 0x00007ffff3135763 in ?? () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#12 0x00007ffff3136c26 in H5B_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#13 0x00007ffff31bbc4f in H5G__stab_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#14 0x00007ffff31b8d90 in H5G__obj_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#15 0x00007ffff31b17e8 in H5G_iterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#16 0x00007ffff31ae4fa in H5Giterate () from /usr/lib/x86_64-linux-gnu/libhdf5_serial.so.10
#17 0x00007ffff4531cdf in do_hdf5load () from /usr/lib/R/site-library/hdf5/libs/hdf5.so
#18 0x00007ffff78f7331 in ?? () from /usr/lib/R/lib/libR.so
#19 0x00007ffff792f7af in Rf_eval () from /usr/lib/R/lib/libR.so
#20 0x00007ffff7931ca8 in ?? () from /usr/lib/R/lib/libR.so
#21 0x00007ffff792f5a1 in Rf_eval () from /usr/lib/R/lib/libR.so
#22 0x00007ffff7930c65 in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#23 0x00007ffff792f37d in Rf_eval () from /usr/lib/R/lib/libR.so
#24 0x00007ffff7956bc2 in Rf_ReplIteration () from /usr/lib/R/lib/libR.so
#25 0x00007ffff7956f41 in ?? () from /usr/lib/R/lib/libR.so
#26 0x00007ffff7956ff4 in run_Rmainloop () from /usr/lib/R/lib/libR.so
#27 0x00000000004007db in main ()
(gdb) disas
Dump of assembler code for function strlen:
   0x00007ffff70b6040 <+0>:     pxor   %xmm8,%xmm8
   0x00007ffff70b6045 <+5>:     pxor   %xmm9,%xmm9
   0x00007ffff70b604a <+10>:    pxor   %xmm10,%xmm10
   0x00007ffff70b604f <+15>:    pxor   %xmm11,%xmm11
   0x00007ffff70b6054 <+20>:    mov    %rdi,%rax
   0x00007ffff70b6057 <+23>:    mov    %rdi,%rcx
   0x00007ffff70b605a <+26>:    and    $0xfff,%rcx
   0x00007ffff70b6061 <+33>:    cmp    $0xfcf,%rcx
   0x00007ffff70b6068 <+40>:    ja     0x7ffff70b60d0 <strlen+144>
=> 0x00007ffff70b606a <+42>:    movdqu (%rax),%xmm12
[...etc...]
(gdb) info registers
rax            0x0                 0
rbx            0x0                 0
rcx            0x0                 0
rdx            0x1                 1
rsi            0x1                 1
rdi            0x0                 0
rbp            0x7fffffffbb00      0x7fffffffbb00
rsp            0x7fffffffba48      0x7fffffffba48
r8             0x0                 0
r9             0x20                32
r10            0x7ffff7fd0780      140737353942912
r11            0x0                 0
r12            0x1                 1
r13            0x11d8880           18712704
r14            0xaa3488            11154568
r15            0x4000003           67108867
rip            0x7ffff70b606a      0x7ffff70b606a <strlen+42>
eflags         0x10293             [ CF AF SF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

Looks like Rf_mkChar() tried to call strlen() on a null pointer.

I believe that this is *not* a bug in pytables -- if I'm reading the
HDF5 spec correctly, DATASPACE NULL in this context is a valid way to
spell the empty string -- but even if it is, under no circumstances
should loading an HDF file, no matter how ill-formed, be able to crash
the R interpreter.

zw

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.4.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages r-cran-hdf5 depends on:
ii  libc6                    2.22-5
ii  libhdf5-10               1.8.16+docs-7
ii  r-base-core [r-api-3]    3.2.4-revised-1
ii  zlib1g                   1:1.2.8.dfsg-2+b1

r-cran-hdf5 recommends no packages.

r-cran-hdf5 suggests no packages.

-- no debconf information

import tables

with tables.open_file('no-title.hdf', 'w') as f:
    M = f.create_array(f.root, 'M', [1.0], '')

with tables.open_file('title.hdf', 'w') as f:
    M = f.create_array(f.root, 'M', [1.0], 'M')

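For triaging other files, a rough check along the following lines may help. It scans text in the form h5dump prints and lists the attributes whose dataspace is NULL, which is the condition that distinguishes 'no-title.hdf' in this report. This is only a sketch and is not part of r-cran-hdf5 or pytables: the function name and the two sample strings are made up for illustration (the samples are abbreviated from the h5dump excerpt in this report), and it assumes that each ATTRIBUTE block prints its own DATASPACE line before the next ATTRIBUTE begins.

```python
import re

# Hypothetical samples, abbreviated from the h5dump excerpt in this report.
TITLE_DDL = '''
ATTRIBUTE "TITLE" {
   DATATYPE  H5T_STRING { STRSIZE 1; }
   DATASPACE  SCALAR
   DATA { (0): "M" }
}
'''

NO_TITLE_DDL = '''
ATTRIBUTE "TITLE" {
   DATATYPE  H5T_STRING { STRSIZE 1; }
   DATASPACE  NULL
   DATA { }
}
'''

# Matches either the start of an attribute block or a dataspace line.
_TOKEN = re.compile(r'ATTRIBUTE\s+"([^"]*)"|DATASPACE\s+(\w+)')


def attributes_with_null_dataspace(ddl):
    """Return the names of attributes whose DATASPACE is NULL.

    `ddl` is h5dump-style text. A DATASPACE line that is not preceded
    by an ATTRIBUTE header (for example the dataset's own dataspace)
    is ignored.
    """
    names = []
    current = None  # attribute whose dataspace we are still waiting for
    for match in _TOKEN.finditer(ddl):
        attr_name, space_kind = match.group(1), match.group(2)
        if attr_name is not None:
            current = attr_name
        elif current is not None:
            if space_kind == "NULL":
                names.append(current)
            current = None
    return names


print(attributes_with_null_dataspace(TITLE_DDL))     # []
print(attributes_with_null_dataspace(NO_TITLE_DDL))  # ['TITLE']
```

Any attribute this reports is one that a loader has to treat as having no data buffer at all, rather than as a one-element string.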
title.hdf
Description: HDF file
no-title.hdf
Description: HDF file