Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-04 Thread Tomas Kalibera
As of R-devel 72925 one gets a proper error message instead of the crash.
Tomas
On 09/04/2017 08:46 AM, rh...@eoos.dds.nl wrote:
Although the problem can apparently be avoided in this case, readLines causing a segfault still seems unwanted behaviour to me. I can replicate this with the

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-04 Thread rhelp
Although the problem can apparently be avoided in this case, readLines causing a segfault still seems unwanted behaviour to me. I can replicate this with the example below (sessionInfo is further down):
# Generate an example file
l <- paste0(sample(c(letters, LETTERS), 1E6, replace = TRUE),

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-03 Thread Jennifer Lyon
Jeroen: Thank you for pointing me to ndjson, which I had not heard of and is exactly my case. My experience:
jsonlite::stream_in - segfaults
ndjson::stream_in - my fault, I am running Ubuntu 14.04 and it is too old so it won't compile the package
corpus::read_ndjson - works!!!
Of course it

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-03 Thread Jeroen Ooms
On Sat, Sep 2, 2017 at 8:58 PM, Jennifer Lyon wrote:
> I have a 2.1GB JSON file. Typically I use readLines() and
> jsonlite::fromJSON() to extract data from a JSON file.
If your data consists of one JSON object per line, this is called 'ndjson'. There are several
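For readers unfamiliar with the format, here is a minimal sketch of what reading ndjson can look like in R. It assumes the jsonlite package is installed; the sample records and the temporary file are made up for illustration and are not taken from the thread. Note that another poster in this thread reported jsonlite::stream_in segfaulting on her file at the time, so treat this as an illustration of the format rather than a guaranteed fix.

```r
# Sketch only: assumes jsonlite is installed; the records and the
# temporary file below are illustrative, not from the original post.
library(jsonlite)

# Write a tiny ndjson file: one complete JSON object per line.
path <- tempfile(fileext = ".ndjson")
writeLines(c(
  '{"id": 1, "name": "a"}',
  '{"id": 2, "name": "b"}',
  '{"id": 3, "name": "c"}'
), path)

# stream_in() reads the connection in pages of `pagesize` lines, so the
# file is never handed to the parser as one giant string.
records <- stream_in(file(path), pagesize = 2, verbose = FALSE)

print(records)
unlink(path)
```

With no handler supplied, stream_in() combines the pages into a single data frame; passing a `handler` function instead lets each page be processed and discarded, which matters more for files in the gigabyte range.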

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Suzen, Mehmet
Jennifer, Why don't you try SparkR? https://spark.apache.org/docs/1.6.1/api/R/read.json.html
On 2 September 2017 at 23:15, Jennifer Lyon wrote:
> Thank you for your suggestion. Unfortunately, while R doesn't segfault
> calling readr::read_file() on the test file I

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Iñaki Úcar
2017-09-02 20:58 GMT+02:00 Jennifer Lyon:
> Hi:
>
> I have a 2.1GB JSON file. Typically I use readLines() and
> jsonlite::fromJSON() to extract data from a JSON file.
>
> When I try and read in this file using readLines() R segfaults.
>
> I believe the two salient issues

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Jennifer Lyon
Thank you for your suggestion. Unfortunately, while R doesn't segfault calling readr::read_file() on the test file I described, I get the error message:
Error in read_file_(ds, locale) : negative length vectors are not allowed
Jen
On Sat, Sep 2, 2017 at 1:38 PM, Ista Zahn

Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Ista Zahn
As a work-around I suggest readr::read_file.
--Ista
On Sep 2, 2017 2:58 PM, "Jennifer Lyon" wrote:
> Hi:
>
> I have a 2.1GB JSON file. Typically I use readLines() and
> jsonlite::fromJSON() to extract data from a JSON file.
>
> When I try and read in this file using

[Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Jennifer Lyon
Hi:
I have a 2.1GB JSON file. Typically I use readLines() and jsonlite::fromJSON() to extract data from a JSON file.
When I try and read in this file using readLines() R segfaults.
I believe the two salient issues with this file are
1). Its size
2). It is a single line (no line breaks)
I can
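As a rough illustration of the situation described (one very long line, no line breaks), the sketch below reads a file in fixed-size raw chunks with base R's readBin(), so that no single string the size of the whole file is ever created. The tiny temporary file and the `chunk_size` value are stand-ins chosen for the example, not details from the post, and the sketch only counts bytes; actually parsing the JSON would still need a parser that can consume input incrementally.

```r
# Sketch only: `path` and `chunk_size` are illustrative stand-ins for a
# multi-gigabyte file and a realistic buffer size.
path <- tempfile(fileext = ".json")
cat('{"a":[1,2,3],"b":"xyz"}', file = path)  # one line, no trailing newline

chunk_size <- 8L
con <- file(path, open = "rb")   # binary mode: no line splitting, no re-encoding
total <- 0
repeat {
  # readBin() returns at most chunk_size bytes, and raw(0) at end of file.
  chunk <- readBin(con, what = "raw", n = chunk_size)
  if (length(chunk) == 0L) break
  total <- total + length(chunk)  # a real workflow would process `chunk` here
}
close(con)

# The chunked read should account for every byte in the file.
stopifnot(total == file.size(path))
unlink(path)
```

Reading raw bytes rather than text sidesteps line handling entirely, which is the part of readLines() that a file without line breaks stresses.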