I've spent some time trying to wrap my head around reading in large CSV
files with the ff package.  I think I know how to do it, but I am running
into some problems.  I've tried to recreate the issues as best I can
with a smaller example, and maybe someone can help explain the problems.

The following code just creates a CSV file with an integer column,
a character column and a logical column.
-------------------------------------------------
library(ff)
#Create data
size = 2000
fake.data = data.frame("Integer"=round(100000*runif(size)),
                       "Character"=sample(LETTERS,size,replace=TRUE),
                       "Logical"=sample(c(TRUE,FALSE),size,replace=TRUE))

#Write to csv
write.csv(fake.data,"data.csv",row.names=F)
-------------------------------------------------

Now to read it in as a 'ffdf' class, I can do the following:

-------------------------------------------------
data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
next.rows = 1005,sep=",")
-------------------------------------------------

That works.  But with my current large data set, read.csv.ffdf is debating
with me about the classes it's importing. I was also messing around with
the first.rows/next.rows, but that's a question for another time. So I'll
try to load the data in, specifying the column types (same exact command,
except with specifying colClasses):

-------------------------------------------------

> data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
+ next.rows = 1005,sep=",",colClasses = c("integer","integer","logical"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'an integer', got '"J"'

> data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
+ next.rows = 1005,sep=",",colClasses = c("integer","character","logical"))
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered,  :
  vmode 'character' not implemented

> data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
+ next.rows = 1005,sep=",",colClasses = rep("character",3))
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered,  :
  vmode 'character' not implemented

> data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
+ next.rows = 1005,sep=",",colClasses = rep("raw",3))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'a raw', got '8601'

-------------------------------------------------
I just can't find a combination of classes that lets this file be read
in.  I really don't understand why the class 'character' won't work for
all of the columns.  Any thoughts as to why?  I appreciate the help and time.
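
One thing that may be worth trying, offered as a sketch rather than a confirmed fix: the second and third errors say that ff has no 'character' vmode, so the text column would presumably have to be declared as "factor" (stored as integer codes plus levels) instead. The snippet below rebuilds a file like the one above in a temporary location (csv.file, col.types and check are names made up for this illustration), first checks the colClasses against the file with base read.csv, and then, only if ff is installed, passes the same colClasses to read.csv.ffdf.

```r
# Sketch: declare the text column as "factor" instead of "character".
# The column layout mirrors the example above; the data are random.
set.seed(1)
size <- 2000
fake.data <- data.frame(
  Integer   = sample.int(99999, size, replace = TRUE),
  Character = sample(LETTERS, size, replace = TRUE),
  Logical   = sample(c(TRUE, FALSE), size, replace = TRUE)
)
csv.file <- tempfile(fileext = ".csv")   # hypothetical temporary file
write.csv(fake.data, csv.file, row.names = FALSE)

col.types <- c("integer", "factor", "logical")

# Base R accepts these colClasses, which confirms they match the file.
check <- read.csv(csv.file, colClasses = col.types)
str(check)

# With ff installed, the same colClasses are passed to read.csv.ffdf.
# Assumption: a factor column avoids the "vmode 'character'" error.
if (requireNamespace("ff", quietly = TRUE)) {
  library(ff)
  data <- read.csv.ffdf(x = NULL, file = csv.file, nrows = 1001,
                        first.rows = 500, next.rows = 1005,
                        colClasses = col.types)
  print(dim(data))
  print(vmode(data$Character))   # storage mode ff chose for the letters
}
```

If that is right, it would also explain why rep("character",3) fails: no column at all can be stored as character, whatever the file contains.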

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help