On 02/28/2015 06:48 PM, Alexandre F. Souza wrote:
Dear friends,
I need to write a code to find data using one variable as reference. The
code I wrote, however, is not working and I can't figure out why. Could
anyone help me?
Imagine a data set with two variables, B and C. Now I have variable A,
which is the same variable as variable B, but the data are not in the same
order and are not necessarily of the same length as B (it may be a sample of
B, for example).
I want to find the values of variable C that match each line in variable A
using B as the association criterion. So the code should perform a loop in
which it would take the first line in A, search B until it finds it there,
then copy the corresponding value of C and store it in a new variable D. Do
it until all lines in A have been associated to a C value.
Starting with...
df<-data.frame(B=sample(letters[1:10],replace=FALSE), C=rnorm(10),
stringsAsFactors=FALSE)
A=letters[1:10]
Two thoughts spring to mind:
(a) Would merge() do what you want? E.g.
df2 <- merge(df, data.frame(A=A), by.x="B", by.y="A")
and then extract the values of C with df2$C[df2$B=="f"], for example.
(b) sapply(A, function(lt, DF) DF$C[DF$B==lt], DF=df)
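As a rough sketch of how the two suggestions behave on the example data (the set.seed() call, the three-letter A, and the name D_sapply are added here purely for illustration): by default merge() returns its rows sorted by the key column, so the result does not follow the order of A, whereas sapply() returns one value per element of A, in the order of A.

```r
set.seed(1)  # assumption: only here to make the example reproducible
df <- data.frame(B = sample(letters[1:10], replace = FALSE), C = rnorm(10),
                 stringsAsFactors = FALSE)
A <- c("f", "b", "i")  # hypothetical sample of B, in a different order

# (a) merge(): keeps the rows of df whose B occurs in A, sorted by B
df2 <- merge(df, data.frame(A = A), by.x = "B", by.y = "A")
df2$C[df2$B == "f"]  # value of C associated with "f"

# (b) sapply(): one lookup per element of A, returned in the order of A
D_sapply <- sapply(A, function(lt, DF) DF$C[DF$B == lt], DF = df)
names(D_sapply)  # the elements of A, in their original order
```

Note that if some element of A does not occur in B, the sapply() version returns a list with an empty element for it, while merge() silently drops that row.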
R's looping is generally more efficient when it's done internally, so it
will be easier for you if you understand the R mentality, in particular
vectorisation. Usually, if you have a for() loop, you're not writing R
code efficiently.
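One vectorised way to do the whole lookup without an explicit loop is match(), which gives, for each element of its first argument, the position of its first match in the second. A minimal sketch on the example data (set.seed() and the names idx and out are added here; D is the new variable described in the question):

```r
set.seed(1)  # assumption: only here to make the example reproducible
df <- data.frame(B = sample(letters[1:10], replace = FALSE), C = rnorm(10),
                 stringsAsFactors = FALSE)
A <- letters[1:10]

idx <- match(A, df$B)  # position in B of each element of A (NA if absent)
D <- df$C[idx]         # corresponding values of C, in the order of A
out <- data.frame(A = A, D = D)
```

Elements of A that do not occur in B end up as NA in D rather than causing an error, which makes unmatched cases easy to spot afterwards.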
Bob
Here is the code I wrote:
# Considering that matrices data.ref and data.assoc have already been read,
# containing the
# User-defined number of columns to be associated with A (I imagined that
# more than one variable could be associated at once)
col.assoc = 20
# To assure that data will not be in a non-usable data category
ref = as.matrix(data.ref)
assoc = as.matrix(data.assoc)
# Table where results will be stored
# Number of columns = n associated variables plus one column
# Reserved to receive the initial data (example column A)
result = matrix(nrow = nrow(ref), ncol = col.assoc + 1)
# Fill the first column of the result table with the original reference
# variable
result[,1] = ref[,1]
for (i in 1:nrow(ref)) {
  for (j in 1:nrow(assoc)) {
    if (ref[i, 1] == assoc[j, 1]) {
      resultado[i, 2] == assoc[j, 2]
    }
  }
}
col = ncol(dados)
####
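Two things in the loop above look like the likely cause of the problem: the values are written to `resultado` although the table was created as `result`, and `==` compares values instead of assigning them (assignment is `<-` or `=`). The object `dados` in the last line is also never defined in the posted code. A minimal corrected sketch follows, assuming two-column inputs. The small data.ref and data.assoc below are made-up stand-ins for the real data, and col.assoc is set to 1 to match them.

```r
# Hypothetical stand-in data (not from the original post)
data.ref   <- data.frame(A = c("c", "a", "b"), stringsAsFactors = FALSE)
data.assoc <- data.frame(B = c("a", "b", "c", "d"), C = c(1.5, 2.5, 3.5, 4.5),
                         stringsAsFactors = FALSE)
col.assoc <- 1

ref   <- as.matrix(data.ref)
assoc <- as.matrix(data.assoc)  # mixed columns: this is a character matrix

result <- matrix(nrow = nrow(ref), ncol = col.assoc + 1)
result[, 1] <- ref[, 1]

for (i in seq_len(nrow(ref))) {
  for (j in seq_len(nrow(assoc))) {
    if (ref[i, 1] == assoc[j, 1]) {
      result[i, 2] <- assoc[j, 2]  # assign into `result` with <-, not ==
    }
  }
}
```

Because as.matrix() turns a data frame with mixed column types into a character matrix, the associated values come back as text and need as.numeric() before any calculation.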
Any thoughts?
Thanks in advance,
Alexandre
--
Bob O'Hara
Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main,
Germany
Tel: +49 69 7542 1863
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology