On 02/28/2015 06:48 PM, Alexandre F. Souza wrote:
Dear friends,

I need to write code to look up data using one variable as a reference. The
code I wrote, however, is not working and I can't figure out why. Could
anyone help me?

Imagine a data set with two variables, B and C. Now I have variable A,
which is the same variable as B, but the data are not in the same
order and do not necessarily have the same length as B (it may be a sample
of B, for example).

I want to find the values of variable C that match each line in variable A
using B as the association criterion. So the code should perform a loop in
which it would take the first line in A, search B until it finds it there,
then copy the corresponding value of C and store it in a new variable D. Do
it until all lines in A have been associated to a C value.

starting with...

df<-data.frame(B=sample(letters[1:10],replace=FALSE), C=rnorm(10), stringsAsFactors=FALSE)
A=letters[1:10]

two thoughts spring to mind:
(a) would merge() do what you want? e.g. df2 <- merge(df,data.frame(A=A), by.x="B", by.y="A"), and then extract the values of C with df2$C[df2$B=="f"], for example.
(b) sapply(A, function(lt, DF) DF$C[DF$B==lt], DF=df)

R's looping is generally more efficient when it's done internally, so it will be easier for you if you understand the R mentality, in particular vectorisation. Usually, if you have a for() loop, you're not writing R code efficiently.
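
As a minimal sketch of the vectorised route, assuming a df like the one above (the fixed seed, the shortened A and the name D are only for illustration): match() returns, for each element of A, the position of its first match in df$B, and indexing df$C by those positions gives D in the order of A, with NA wherever an element of A is not found in B.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible
df <- data.frame(B = sample(letters[1:10], replace = FALSE),
                 C = rnorm(10),
                 stringsAsFactors = FALSE)
A <- c("c", "a", "z", "f")   # hypothetical: a sample of B plus one value not in B

idx <- match(A, df$B)        # position in B of each element of A (NA if absent)
D <- df$C[idx]               # values of C, in the order of A

data.frame(A = A, D = D)
```

Unlike merge(), which by default sorts the result on the key column and drops unmatched rows, this keeps one entry per element of A in the original order.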

Bob


Here is the code I wrote:


# Considering that matrices data.ref and data.assoc have already been read,
# containing the

# User-defined number of columns to be associated with A (I imagined that
# more than one variable could be associated at once)
col.assoc = 20

# Make sure the data are stored in a usable type (matrix)
ref = as.matrix(data.ref)
assoc = as.matrix(data.assoc)


# Table where results will be stored
#  Number of columns = n associated variables plus one column
#  Reserved to receive the initial data (example column A)

result = matrix(nrow = nrow(ref), ncol = col.assoc + 1)

# Fill the first column of the result table with the original reference
# variable

result[,1] = ref[,1]


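for (i in 1:nrow(ref)) {
  for (j in 1:nrow(assoc)) {
    # Compare the reference key with each key in the association table
    if (ref[i, 1] == assoc[j, 1]) {
      # Copy the matching value into the result table
      result[i, 2] <- assoc[j, 2]
    }
  }
}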
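
The same lookup can also be written without the nested loops. This is only a sketch, assuming the first column of ref and of assoc holds the key and the remaining columns of assoc hold the values to copy; the toy matrices below are made up and merely stand in for data.ref and data.assoc.

```r
# Hypothetical toy data standing in for data.ref and data.assoc
ref   <- cbind(key = c("s3", "s1", "s9"))
assoc <- cbind(key = c("s1", "s2", "s3"),
               v1  = c("10", "20", "30"),
               v2  = c("x", "y", "z"))

# Row of assoc that matches each row of ref (NA when the key is absent)
idx <- match(ref[, 1], assoc[, 1])

# First column: the reference keys; remaining columns: the associated values
result <- cbind(ref[, 1], assoc[idx, -1, drop = FALSE])
result
```

Note that as.matrix() on a data frame with mixed column types coerces everything to character, so numeric columns copied this way may need converting back with as.numeric().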



col = ncol(dados)

####

Any thoughts?

Thanks in advance,

Alexandre



--
Bob O'Hara

Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main,
Germany

Tel: +49 69 7542 1863
Mobile: +49 1515 888 5440
WWW:   http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
