[Rpy] NumPy Slicing without much Copying, Compact Pickling vs. .rda

Alexy Khrabrov Wed, 07 Nov 2007 14:43:27 -0800

Greetings -- I want to read in a huge table and slice it, plotting
pieces of it with R.  The table t looks like this:


run     ord     unit    words   new
1       1       1497    1009    697
1       2       2112    1009    538
1       3       2035    1004    484
1       4       1492    1035    463

In R, my workflow is as follows:

read.table
save("<table>.rda")

Then I read the table, select blocks of it and plot them with:

for (run in 1:maxruns) {
   ...
   r <- t[t$run == run,]

   x <- r$ord
   y <- r$new
   xy <- xy.coords(x, y)

   plot(xy,...)
}

In R, r <- t does not create a copy unless t or r is modified.  This is
essential for inspecting and plotting pieces of the huge t.
Now I'm trying to replicate this in rpy.  First off, if I load and
pickle t, the result is 7 times bigger than the .rda when I do it with:

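import sys
from pickle import Pickler

def main():
        fr, to = sys.argv[1:]

        matrix = []
        count = 0
        with open(fr) as f:
                for line in f:
                        count += 1
                        # each row is kept as a list of strings, not numbers
                        row = line.split()
                        matrix.append(row)

        print("read", count, "rows")
        with open(to, "wb") as f:
                # protocol -1 selects the highest pickle protocol available
                pikl = Pickler(f, -1)
                pikl.dump(matrix)

if __name__ == "__main__":
        main()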

-- obviously I need to use a numpy array for the matrix.  How would I
load and save t with numpy, select blocks of the matrix without
copying, and pass them on to R?  How can one create a data.frame in R from
a numpy array read in Python -- or should I rather read.table and
save the .rda in R?
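
For concreteness, here is a rough sketch of the NumPy side of what I am asking about, not a tested solution. It assumes the rows are sorted by the run column, so that each run is one contiguous block that a plain slice can select as a view. The helper names load_table and run_block are made up for illustration, the two run-2 rows in the sample are invented, and the hand-off to R through rpy is left out entirely.

```python
import io
import numpy as np

# Hypothetical sample in the same layout as the table above (the run-2 rows
# are invented). A real script would pass a file name to np.loadtxt instead.
SAMPLE = """run ord unit words new
1 1 1497 1009 697
1 2 2112 1009 538
1 3 2035 1004 484
1 4 1492 1035 463
2 1 1500 1010 700
2 2 2100 1008 540
"""

def load_table(source):
    # Skip the header row and store every column as a 32-bit integer.
    return np.loadtxt(source, dtype=np.int32, skiprows=1)

def run_block(table, run):
    # Assumes rows are sorted by the run column (column 0), so each run is
    # one contiguous block and a basic slice (a view, not a copy) selects it.
    runs = table[:, 0]
    start = np.searchsorted(runs, run, side="left")
    stop = np.searchsorted(runs, run, side="right")
    return table[start:stop]

table = load_table(io.StringIO(SAMPLE))
block = run_block(table, 2)
x, y = block[:, 1], block[:, 4]   # ord and new columns, still views of table

buf = io.BytesIO()
np.save(buf, table)               # binary .npy format; np.save also accepts a file name
```

Selecting with a boolean mask such as table[table[:, 0] == 2] would copy the matching rows, which is why the sketch relies on sorted rows and plain slices. For a table too large for memory, np.load with mmap_mode="r" on a saved .npy file may be worth trying, though I have not measured it against the .rda.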

Cheers,
Alexy

