Greetings -- I want to read in a huge table and slice it, plotting
pieces of it with R. The table t looks like this:
run ord unit words new
1 1 1497 1009 697
1 2 2112 1009 538
1 3 2035 1004 484
1 4 1492 1035 463
In R, my workflow is as follows:
t <- read.table("<table>", header=TRUE)
save(t, file="<table>.rda")
Then I read the table, select blocks of it and plot them with:
for (run in 1:maxruns) {
  ...
  r <- t[t$run == run,]    # rows belonging to this run
  x <- r$ord               # take the columns from the block r, not from t
  y <- r$new
  xy <- xy.coords(x, y)
  plot(xy, ...)
}
In R, r <- t does not create a copy unless t or r is later modified.
That is essential for inspecting and plotting pieces of the huge t.
Now I'm trying to replicate this in rpy. First off, if I load and
pickle t, the pickle is 7 times bigger than the .rda file when I do it with:
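import sys
from pickle import Pickler

def main():
    fr, to = sys.argv[1:]
    matrix = []
    count = 0
    # Read the whitespace-separated table; every field stays a string.
    with open(fr) as f:
        for line in f:
            count += 1
            row = line.split()
            matrix.append(row)
    print("read %d rows" % count)
    # Pickle the list of rows with the highest available protocol (-1).
    with open(to, "wb") as f:
        pikl = Pickler(f, -1)
        pikl.dump(matrix)

if __name__ == "__main__":
    main()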
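One likely contributor to the size difference, offered as a guess and not as a measurement of the real data: the code above pickles every field as a separate Python string, while a numeric array stores packed numbers. A minimal sketch, assuming numpy is installed and every column fits in a 32-bit integer. The sample is just the four rows shown earlier, repeated, and both helper names are made up for illustration:

```python
import io
import pickle

import numpy as np

# The four sample rows from the table above, repeated to get a measurable size.
SAMPLE = """1 1 1497 1009 697
1 2 2112 1009 538
1 3 2035 1004 484
1 4 1492 1035 463
""" * 1000


def pickled_size_of_string_rows(text):
    """Bytes needed to pickle a list of rows of strings (the approach above)."""
    matrix = [line.split() for line in text.splitlines()]
    return len(pickle.dumps(matrix, -1))


def saved_size_of_int_array(text):
    """Bytes needed to store the same table as a packed int32 numpy array."""
    arr = np.loadtxt(io.StringIO(text), dtype=np.int32)
    buf = io.BytesIO()
    np.save(buf, arr)  # numpy's own binary .npy format
    return len(buf.getvalue())


list_bytes = pickled_size_of_string_rows(SAMPLE)
array_bytes = saved_size_of_int_array(SAMPLE)
print("pickled list of strings: %d bytes" % list_bytes)
print("numpy int32 array:       %d bytes" % array_bytes)
```

The ratio will depend on the real data and on the pickle protocol, so this only shows the direction of the effect, not the factor of 7 reported above.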
-- obviously I need to use a numpy array for the matrix. How would I
load and save t with numpy, select blocks of the matrix without
copying, and pass them on to R? How can one create a data.frame in R from
a numpy array read in Python -- or should I rather read.table and
save the .rda in R?
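For the numpy half of the question, here is a minimal sketch, assuming the rows are sorted by run so that each run forms one contiguous block. The helper run_block and the file name table.npy are hypothetical, and the two run-2 rows are invented to have a second block. Basic slicing (a[start:stop]) returns a view that shares memory with the table, whereas boolean-mask indexing (a[a[:, 0] == run]) always makes a copy. Handing the block over to R is left out, since the conversion calls depend on the rpy version in use.

```python
import io
import os
import tempfile

import numpy as np

# Stand-in for the real file: columns are run, ord, unit, words, new.
# The run-1 rows come from the table above; the run-2 rows are made up.
TEXT = """1 1 1497 1009 697
1 2 2112 1009 538
1 3 2035 1004 484
1 4 1492 1035 463
2 1 1500 1010 700
2 2 2100 1010 540
"""

t = np.loadtxt(io.StringIO(TEXT), dtype=np.int64)

# Save in numpy's binary format and load it back memory-mapped, so that a
# huge table is paged in on demand instead of being read in full.
path = os.path.join(tempfile.mkdtemp(), "table.npy")
np.save(path, t)
t = np.load(path, mmap_mode="r")


def run_block(table, run):
    """Return the rows of one run as a view (no copy).

    Assumes column 0 holds the run number and is sorted ascending.
    """
    runs = table[:, 0]
    start = np.searchsorted(runs, run, side="left")
    stop = np.searchsorted(runs, run, side="right")
    return table[start:stop]


r = run_block(t, 1)
x, y = r[:, 1], r[:, 4]        # ord and new columns, still views of t

# The slice shares memory with the table; the boolean-mask result does not.
mask_copy = t[t[:, 0] == 1]
print("slice shares memory:", np.shares_memory(r, t))
print("mask shares memory: ", np.shares_memory(mask_copy, t))
```

If the runs are not stored contiguously, a view is not possible with plain slicing, and sorting the table by run once before saving would be one way around that.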
Cheers,
Alexy
_______________________________________________
rpy-list mailing list
https://lists.sourceforge.net/lists/listinfo/rpy-list