On 15/07/2017 11:33 AM, Anthony Damico wrote:
hi, i realized that the segfault happens on the text file in a new R
session.  so, creating the segfault-generating text file requires a
contributed package, but prompting the actual segfault does not --
pretty sure that means this is a base R bug?  submitted here:
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311  hopefully i
am not doing something remarkably stupid.  the text file itself is 4GB
so cannot upload it to bugzilla, and from the R_AllocStringBuffer error
in the previous message, i think most or all of it needs to be there to
trigger the segfault.  thanks!

I don't want to download the big file or install the archive package. Could you run the code below on the bad file? If you're right and it's only nulls that matter, this might allow me to create a file that triggers the bug.

f <- ""  # put the filename of the bad file here, between the quotes

con <- file(f, open="rb")
zeros <- numeric()
count <- 0
repeat {
  bytes <- readBin(con, "int", 1000000, size=1)
  zeros <- c(zeros, count + which(bytes == 0))
  count <- count + length(bytes)
  if (length(bytes) < 1000000) break
}
close(con)
cat("File length=", count, "\n")
cat("Nulls:\n")
zeros

Here's some code to recreate a file of the same length with nulls in the same places, and spaces everywhere else:

size <- count
f2 <- tempfile()
con <- file(f2, open="wb")
count <- 0
while (count < size) {
  nonzeros <- min(c(size - count, 1000000, zeros - 1))
  if (nonzeros) {
    writeBin(rep(32L, nonzeros), con, size = 1)
    count <- count + nonzeros
  }
  zeros <- zeros - nonzeros
  if (length(zeros) && min(zeros) == 1) {
    writeBin(0L, con, size = 1)
    count <- count + 1
    zeros <- zeros[-1] - 1
  }
}
close(con)
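
Before running either snippet on the 4GB file, it may be worth trying the same idea on a small toy file. The following is only a sketch under stated assumptions: the helper names find_nulls and write_nulls, the chunk size of 7, and the toy null positions are hypothetical and chosen for illustration. The toy writer builds the whole file in memory, so it is only suitable for small sizes.

```r
# Hypothetical helper: scan a file in chunks and return its length plus the
# 1-based positions of all null bytes (same approach as the scan loop above).
find_nulls <- function(path, chunk = 1000000) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  zeros <- numeric()
  count <- 0
  repeat {
    bytes <- readBin(con, "int", chunk, size = 1)
    zeros <- c(zeros, count + which(bytes == 0))
    count <- count + length(bytes)
    if (length(bytes) < chunk) break
  }
  list(size = count, zeros = zeros)
}

# Hypothetical helper for small files only: spaces everywhere, nulls at `zeros`.
write_nulls <- function(path, size, zeros) {
  bytes <- rep(as.raw(32), size)
  bytes[zeros] <- as.raw(0)
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeBin(bytes, con)
}

# Toy round trip: 25 bytes, nulls at assumed positions, read back in chunks of 7
# so that some chunk boundaries fall between the nulls.
toy <- tempfile()
toy_zeros <- c(3, 10, 11, 25)
write_nulls(toy, 25, toy_zeros)
res <- find_nulls(toy, chunk = 7)
stopifnot(res$size == 25, isTRUE(all.equal(res$zeros, toy_zeros)))
unlink(toy)
```

If the stopifnot() call passes silently, the chunked scan recovers the positions that were written, including a null in the last byte of the file.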

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
