Hi

Something to get you started
? as.list
A data.frame is stored as a list of equal-length column vectors, so as.list() returns its columns as a plain list

df = data.frame(a=1:2,b=2:1,c=4:5,d=9:10)
as.list(df[,1:3])
$a
[1] 1 2

$b
[1] 2 1

$c
[1] 4 5
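
For the nested structure you describe, one possible sketch (not tested on data of your size) is to split the rows by A once and then split each piece's C by its B. The column names A, B and C follow your message; the toy values and the name list_structure are only assumptions for illustration.

```r
# Toy data: column names follow the question, the values are made up
DF <- data.frame(A = factor(c("x", "x", "y", "y", "x")),
                 B = factor(c("p", "q", "p", "p", "p")),
                 C = c(1, 2, 3, 4, 5))

# Split the rows by level of A, then split each piece's C by level of B.
# Each inner list keeps an entry for every level of B, possibly empty.
list_structure <- lapply(split(DF[c("B", "C")], DF$A),
                         function(d) split(d$C, d$B))

# Values of C on rows where A == "x" and B == "p"
list_structure$x$p
```

This visits each row a fixed number of times rather than scanning the whole data for every pair of levels, so it should scale better than nested lapply calls, though that is worth checking on your own data.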

see also
http://cran.ms.unimelb.edu.au/doc/contrib/Burns-unwilling_S.pdf

Regards

Duncan


Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email: home mac...@northnet.com.au

At 10:58 10/08/2011, you wrote:
Hello,

This is my first project in R, so I'm trying to work 'the R way', but it
still feels awkward sometimes.

The problem that I'm facing right now is that I need to convert a data.frame
into a structure of lists. The data.frame has columns in the order of tens
(I need to focus on only three of them) and rows in the order of millions.
So it's quite a big dataset.
Let's say that the columns of interest are A, B and C. I need to take the
data.frame and construct a structure of lists where I have a list for every
level of A, those lists each contain a list for every level of B, and the
'b-lists' contain all the values of C that match the corresponding levels
of A and B.
So, I should be able to write something like this:
> MyData@list_structure$x_level_of_A$y_level_of_B
and get a vector of the values of C that were on rows where A=x_level_of_A
and B=y_level_of_B.

My first attempt was to use two nested "lapply" calls running
something like this:

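list_structure <- lapply(levels(A), function(x) {
  # one named list per level of A, holding a vector of C for each level of B
  b_lists <- lapply(levels(B), function(y) C[A == x & B == y])
  names(b_lists) <- levels(B)
  b_lists
})
names(list_structure) <- levels(A)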

The real code was not quite as simple, but I managed to get it working, and it
worked well on my first dataset (where A and B had only a few levels). I was
quite happy... but the nested loops killed me on a second dataset where
A had several thousand levels. So I tried something else.

My second attempt was to go through every row of the data.frame and append
the value to the appropriate vector.

I first initialized a structure of lists ending in NULL vectors, then I did
something like this:

for (i in seq_len(nrow(DF))) {
  a_value <- as.character(DF$A[i])
  b_value <- as.character(DF$B[i])
  # append() returns a new vector, so the result has to be assigned back
  MyData@list_structure[[a_value]][[b_value]] <-
    append(MyData@list_structure[[a_value]][[b_value]],
           as.character(DF$C[i]))
}

This works... but way too slowly for my purpose.

I would like to know if there is a better road to take to do this
transformation, or if there is a way of speeding up one of the two solutions
that I have tried.

Thank you very much for your help!

(And in your replies, please remember that this is my first project in R, so
don't hesitate to state the obvious if it seems like I am missing it!)

Frederic

--
View this message in context: http://r.789695.n4.nabble.com/How-to-quickly-convert-a-data-frame-into-a-structure-of-lists-tp3731746p3731746.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
