On 16/07/2017 6:17 AM, Anthony Damico wrote:
thank you for taking the time to write this.  i set it running last
night and it's still going -- if it doesn't finish by tomorrow, i will
try to find a site to host the problem file and add that link to the bug
report so the archive package can be avoided at least.  i'm sorry for
the bother


How big is that text file? I wouldn't expect my script to take more than a few minutes even on a huge file.

My script might have a bug...

Duncan Murdoch

On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch
<murdoch.dun...@gmail.com> wrote:

    On 15/07/2017 11:33 AM, Anthony Damico wrote:

        hi, i realized that the segfault happens on the text file in a new R
        session.  so, creating the segfault-generating text file requires a
        contributed package, but prompting the actual segfault does not --
        pretty sure that means this is a base R bug?  submitted here:
        https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311
        hopefully i am not doing something remarkably stupid.  the text file
        itself is 4GB so cannot upload it to bugzilla, and from the
        R_AllocStringBuffer error in the previous message, i think most or
        all of it needs to be there to trigger the segfault.  thanks!


    I don't want to download the big file or install the archive
    package. Could you run the code below on the bad file?  If you're
    right and it's only nulls that matter, this might allow me to create
    a file that triggers the bug.

    f <-  # put the filename of the bad file here

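    con <- file(f, open="rb")
    count <- 0          # bytes read so far; must be set before the loop uses it
    zeros <- numeric()  # 1-based offsets of the null bytes found so far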
    repeat {
      bytes <- readBin(con, "int", 1000000, size=1)
      zeros <- c(zeros, count + which(bytes == 0))
      count <- count + length(bytes)
      if (length(bytes) < 1000000) break
    }
    close(con)
    cat("File length=", count, "\n")
    cat("Nulls:\n")
    zeros
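
A minimal sketch of the same chunked scan, wrapped in a function so it can be tried on a small file before the 4GB one. The helper name `find_nulls` and the 10-byte test file are assumptions made for illustration; they are not part of the thread.

```r
# Hypothetical helper (not from the thread): return the file length and the
# 1-based offsets of every null byte, reading the file in fixed-size chunks.
find_nulls <- function(path, chunk = 1000000) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  count <- 0            # bytes read so far (a double, so >2GB files are fine)
  zeros <- numeric()    # offsets of null bytes found so far
  repeat {
    bytes <- readBin(con, "integer", chunk, size = 1)
    zeros <- c(zeros, count + which(bytes == 0))
    count <- count + length(bytes)
    if (length(bytes) < chunk) break   # short read means end of file
  }
  list(size = count, zeros = zeros)
}

# Small test file: 10 bytes, with nulls at positions 3 and 8.
tmp <- tempfile()
writeBin(as.raw(c(32, 32, 0, 32, 32, 32, 32, 0, 32, 32)), tmp)
res <- find_nulls(tmp, chunk = 4)  # tiny chunk to exercise chunk boundaries
print(res$size)
print(res$zeros)
unlink(tmp)
```

Using a chunk size smaller than the file checks that offsets stay correct across chunk boundaries, which is where a missing or stale `count` would show up.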

    Here's some code to recreate a file of the same length with nulls in
    the same places, and spaces everywhere else:

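    size <- count                 # total file length from the scan above
    f2 <- tempfile()
    con <- file(f2, open="wb")
    count <- 0                    # bytes written so far
    while (count < size) {
      # zeros holds null offsets relative to the current write position, so
      # zeros - 1 is the number of non-null bytes before each remaining null
      nonzeros <- min(c(size - count, 1000000, zeros - 1))
      if (nonzeros) {
        writeBin(rep(32L, nonzeros), con, size = 1)   # 32 is an ASCII space
        count <- count + nonzeros
      }
      zeros <- zeros - nonzeros
      if (length(zeros) && min(zeros) == 1) {
        writeBin(0L, con, size = 1)                   # the null byte itself
        count <- count + 1
        zeros <- zeros[-1] - 1
      }
    }
    close(con)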
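
The loop above subtracts from the whole `zeros` vector on every pass, which may become slow if the file contains a great many nulls. One possible alternative, sketched here under that assumption, builds each chunk of spaces and then overwrites the nulls that fall inside it. The helper name `write_skeleton` and the small test values are hypothetical, not from the thread.

```r
# Hypothetical helper (not from the thread): write `size` bytes to `path`,
# spaces everywhere except nulls at the 1-based offsets listed in `zeros`.
write_skeleton <- function(path, size, zeros, chunk = 1000000) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  start <- 0                                 # bytes written so far
  while (start < size) {
    n <- min(chunk, size - start)
    bytes <- rep(as.raw(32), n)              # a chunk of ASCII spaces
    hit <- zeros[zeros > start & zeros <= start + n]
    bytes[hit - start] <- as.raw(0)          # nulls that fall in this chunk
    writeBin(bytes, con)
    start <- start + n
  }
  invisible(path)
}

# Round trip on a small file: 10 bytes with nulls at positions 3 and 8.
f2 <- tempfile()
write_skeleton(f2, size = 10, zeros = c(3, 8), chunk = 4)
back <- readBin(f2, "raw", 20)               # ask for more than was written
print(length(back))
print(which(back == as.raw(0)))
unlink(f2)
```

Reading the file back and comparing the null positions with the input is a cheap way to confirm the recreated file matches before using it to reproduce the segfault.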

    Duncan Murdoch





______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
