Re: [R] Conditional looping over a set of variables in R
On 2010-10-27 06:21, David Herzberg wrote:
> Peter, thanks for this elegant solution that works well and handles the
> empty cases. However, the vector it returns includes both the row (case)
> numbers and the target result (number of column of first "1"). How can I
> strip out the row numbers and leave only the target result?

Use unname(x) or as.vector(x) on the result.

-Peter Ehlers
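A minimal sketch of the unname() / as.vector() suggestion, assuming a small made-up data frame. The object names d, res and plain and the row names case1 to case3 are illustrative only and do not come from the thread.

```r
# Hypothetical toy data: 3 cases (rows) by 3 items (columns); NA = missing.
d <- data.frame(X1 = c(0, NA, 1),
                X2 = c(1, NA, 0),
                X3 = c(1, NA, 0),
                row.names = c("case1", "case2", "case3"))

# Column number of the first 1 in each row; the row names come along as names.
res <- apply(d, 1, match, x = 1)
names(res)            # "case1" "case2" "case3"

# Drop the names and keep only the column numbers.
plain <- unname(res)  # as.vector(res) gives the same values here
plain                 # 2 NA 1
```

Either call leaves the values untouched; only the names attribute is removed.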
Re: [R] Conditional looping over a set of variables in R
Peter, thanks for this elegant solution that works well and handles the empty cases. However, the vector it returns includes both the row (case) numbers and the target result (number of column of first "1"). How can I strip out the row numbers and leave only the target result?

Regards,

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

-----Original Message-----
From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
Sent: Tuesday, October 26, 2010 9:23 AM
To: David Herzberg
Cc: Petr PIKAL; r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

> I would still recommend
>
>   vector_of_column_number <- apply(yourdata, 1, match, x=1)
>
> as the simplest way if you only want the number of the column that has
> the first 1 or "1" (the call works as is for both numeric and character
> data). Rows which have no 1s will return a value of NA.
>
> Anything wrong with it?
>
> -Peter Ehlers
Re: [R] Conditional looping over a set of variables in R
I would still recommend

  vector_of_column_number <- apply(yourdata, 1, match, x=1)

as the simplest way if you only want the number of the column that has the first 1 or "1" (the call works as is for both numeric and character data). Rows which have no 1s will return a value of NA.

Anything wrong with it?

-Peter Ehlers

On 2010-10-26 07:50, David Herzberg wrote:
> Thank you - I will try this solution as well.
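The claim that the same call handles both numeric and character data can be checked with a short sketch. The two matrices below are made up for illustration; the "." coding for missing responses follows the original question.

```r
# Hypothetical character-coded responses: "1" correct, "0" incorrect, "." missing.
chr <- matrix(c(".", "0", "1",
                "0", "0", "0",
                "1", ".", "0"), nrow = 3, byrow = TRUE)

# The same responses stored as numbers, with NA for missing.
num <- matrix(c(NA, 0, 1,
                0,  0, 0,
                1, NA, 0), nrow = 3, byrow = TRUE)

# match() coerces its arguments to a common type, so x = 1 also finds "1".
first_chr <- apply(chr, 1, match, x = 1)
first_num <- apply(num, 1, match, x = 1)

first_chr   # 3 NA 1
first_num   # 3 NA 1
```

The second row contains no 1, so both calls return NA for it.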
Re: [R] Conditional looping over a set of variables in R
Thank you - I will try this solution as well.

-----Original message-----
From: Petr PIKAL
To: David Herzberg
Cc: Adrienne Wootten, "r-help@r-project.org"
Sent: Tue, Oct 26, 2010 06:43:09 GMT+00:00
Subject: Re: [R] Conditional looping over a set of variables in R

> If you really only want to know which column in each row has first
> occurrence of 1 (or any other value) you can get rid of looping and use
> other R capabilities.
Re: [R] Conditional looping over a set of variables in R
Hi

r-help-boun...@r-project.org wrote on 25.10.2010 20:41:55:

> Adrienne, there's one glitch when I implement your solution below. When the
> loop encounters a case with no data at all (that is, all 140 item responses
> are missing), it aborts and prints this error message: " ERROR: argument is
> of length zero".
>
> I wonder if there's a logical condition I could add that would enable R to
> skip these empty cases and continue executing on the next case that contains data.
>
> Thanks, Dave

If you really only want to know which column in each row has the first occurrence of 1 (or any other value), you can get rid of looping and use other R capabilities.

  > set.seed(111)
  > mat<-matrix(sample(1:3, 20, replace=T),5,4)
  > mat
       [,1] [,2] [,3] [,4]
  [1,]    2    2    2    2
  [2,]    3    1    2    1
  [3,]    2    2    1    3
  [4,]    2    2    1    1
  [5,]    2    1    1    2
  > mat.w<-which(mat==1, arr.ind=T)
  > tapply(mat.w[,2], mat.w[,1], min)
  2 3 4 5
  2 3 3 2
  > mat[2, ]<-NA
  > mat
       [,1] [,2] [,3] [,4]
  [1,]    2    2    2    2
  [2,]   NA   NA   NA   NA
  [3,]    2    2    1    3
  [4,]    2    2    1    1
  [5,]    2    1    1    2

and this approach works smoothly with NA values too

  > mat.w<-which(mat==1, arr.ind=T)
  > tapply(mat.w[,2], mat.w[,1], min)
  3 4 5
  3 3 2

You can then modify this output as needed, since it carries both the row and the column information. I am sure there are other, maybe better, options, e.g.

  > lll<-as.list(as.data.frame(t(mat)))
  > unlist(lapply(lll, function(x) min(which(x==1))))
   V1  V2  V3  V4  V5
  Inf Inf   3   3   2

Regards
Petr
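The which()/tapply() result only has entries for rows that contain a 1. A sketch of one way to expand it to one value per row, using the 5 x 4 matrix from the post after its second row was set to NA; the names hits, first_by_row and first are illustrative and not from the thread.

```r
# The matrix from the post, after row 2 was replaced by NA.
mat <- matrix(c( 2,  2,  2,  2,
                NA, NA, NA, NA,
                 2,  2,  1,  3,
                 2,  2,  1,  1,
                 2,  1,  1,  2), nrow = 5, byrow = TRUE)

# Row/column positions of every 1; NA comparisons are treated as FALSE by which().
hits <- which(mat == 1, arr.ind = TRUE)

# Smallest column index per row; only rows that contain a 1 appear here.
first_by_row <- tapply(hits[, "col"], hits[, "row"], min)

# Spread the result over all rows, leaving NA where no 1 was found.
first <- rep(NA_integer_, nrow(mat))
first[as.integer(names(first_by_row))] <- first_by_row
first   # NA NA 3 3 2
```

Rows 1 and 2 stay NA, which matches what apply(mat, 1, match, x = 1) would report for them.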
Re: [R] Conditional looping over a set of variables in R
Adrienne, there's one glitch when I implement your solution below. When the loop encounters a case with no data at all (that is, all 140 item responses are missing), it aborts and prints this error message: " ERROR: argument is of length zero".

I wonder if there's a logical condition I could add that would enable R to skip these empty cases and continue executing on the next case that contains data.

Thanks, Dave

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

here I'm referring to your data as testmat, a matrix of 140 columns and 1500 rows, but the same or similar notation can be applied to data frames in R. If I understand correctly, you are looking for the first response (column) where you got a value of 1. I'm assuming also that since your missing values are characters then your two numeric values are also characters. Keeping all this in mind, try something like this.

  first = c() # your extra variable which will eventually contain the first correct response for each case

  for(i in 1:nrow(testmat)){
    c = 1
    while( c<=ncol(testmat) | testmat[i,c] != "1" ){
      if( testmat[i,c] == "1"){
        first[i] = c
        break # will exit the while loop once it finds the first correct answer, and then jump to the next case
      } else {
        c=c+1 # proceed to the next column if not
      }
    }
  }

Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <dav...@wpspublish.com> wrote:

Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case. Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

  * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
  numeric LCfirst1.
  comp LCfirst1 = 0

  * DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
  vector x=LC1a_score to LC140a_score.

  * SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
  loop #i=1 to 140 if (LCfirst1 = 0).

  * SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
  + do if x(#i) = 1.

  * WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
  + comp x(#i) = 99.

  * AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
  + comp LCfirst1 = #i.
  + end if.
  end loop.
  exe.

After several hours of trying to translate this procedure to R, I'm stumped. I played around with creating a list to hold the item responses variables (analogous to 'vector' in SPSS), but when I tried to use the list in an R procedure, I kept getting a warning along the lines of 'the list contains > 1 element, only the first element will be used'. So perhaps a list is not the appropriate class to 'hold' these variables? It seems that some nested arrangement of 'for' 'while' and/or 'lapply' wil
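A sketch of one way the explicit loop could be made to tolerate cases with no correct response: let the loop condition do nothing but bound the column index, and test for a missing value before comparing. This is not the code as posted in the thread, and testmat below is made-up data in which missing responses appear both as "." and as NA.

```r
# Hypothetical character matrix: "1" correct, "0" incorrect, "." or NA missing.
testmat <- matrix(c("0", NA,  "1", "0",
                    ".", ".", NA,  ".",
                    "1", "0", "0", "1"), nrow = 3, byrow = TRUE)

# One slot per case, pre-filled with NA so empty cases need no special handling.
first <- rep(NA_integer_, nrow(testmat))

for (i in seq_len(nrow(testmat))) {
  j <- 1L
  while (j <= ncol(testmat)) {          # never index past the last column
    if (!is.na(testmat[i, j]) && testmat[i, j] == "1") {
      first[i] <- j                     # record the first correct item
      break                             # move on to the next case
    }
    j <- j + 1L                         # otherwise try the next column
  }
}

first   # 3 NA 1
```

Pre-allocating first with NA means a case that never hits a "1" simply keeps its NA, so no extra condition is needed to skip it.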
Re: [R] Conditional looping over a set of variables in R
On Sun, Oct 24, 2010 at 2:54 PM, Peter Ehlers wrote:
> Whoops, got an extra comma in there somehow; should be:
>
>   apply(d, 1, function(x) match(1, x))
>

A slight variation on this would be:

  apply(d, 1, match, x = 1)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Conditional looping over a set of variables in R
Whoops, got an extra comma in there somehow; should be:

  apply(d, 1, function(x) match(1, x))

-Peter Ehlers

On 2010-10-24 08:17, Peter Ehlers wrote:
> This won't be as quick as Bill's elegant solution, but it's a one-liner:
>
>   apply(d, 1, function(x), match(1, x))
>
> See ?match.
>
> -Peter Ehlers
Re: [R] Conditional looping over a set of variables in R
This won't be as quick as Bill's elegant solution, but it's a one-liner:

   apply(d, 1, function(x) match(1, x))

See ?match.

-Peter Ehlers

On 2010-10-22 10:36, David Herzberg wrote:
> Bill, thanks so much for this. I'll get a chance to test it later
> today, and will post the outcome.
>
> David S. Herzberg, Ph.D.
> Vice President, Research and Development
> Western Psychological Services
> 12031 Wilshire Blvd.
> Los Angeles, CA 90025-1251
> Phone: (310)478-2061 x144
> FAX: (310)478-7838
> email: dav...@wpspublish.com
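To see what the match()-based one-liner returns, here is a minimal sketch on a made-up data frame. The object `d` below and its values are assumptions for illustration, not data from the thread. match(1, x) gives the position of the first 1 in a row and NA when the row has none; unname() drops any row names that apply() may carry over.

```r
# Hypothetical 4-case, 3-item example (not data from the thread)
d <- data.frame(X1 = c(1, 0, NA, 0),
                X2 = c(0, 1, NA, 0),
                X3 = c(1, 1, NA, NA))

# match(1, x) returns the position of the first 1 in x, or NA if there is none
first1 <- apply(d, 1, function(x) match(1, x))

# Strip any names so only the column numbers remain
first1 <- unname(first1)
first1   # 1 2 NA NA
```

Rows 3 and 4 contain no 1 (all missing, or only 0 and missing), so they come back as NA rather than stopping the computation.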
Re: [R] Conditional looping over a set of variables in R
Adrienne - this solves the problem nicely. Thanks for your help.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

here I'm referring to your data as testmat, a matrix of 140 columns and 1500 rows, but the same or similar notation can be applied to data frames in R. If I understand correctly, you are looking for the first response (column) where you got a value of 1. I'm assuming also that since your missing values are characters then your two numeric values are also characters. Keeping all this in mind, try something like this.

first = c() # your extra variable which will eventually contain the first correct response for each case

for(i in 1:nrow(testmat)){
  c = 1
  while( c<=ncol(testmat) | testmat[i,c] != "1" ){
    if( testmat[i,c] == "1"){
      first[i] = c
      break # will exit the while loop once it finds the first correct answer, and then jump to the next case
    } else {
      c=c+1 # proceed to the next column if not
    }
  }
}

Hope this helps you out a bit.

Adrienne Wootten
NCSU
Re: [R] Conditional looping over a set of variables in R
Bill, thanks so much for this. I'll get a chance to test it later today, and will post the outcome.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Friday, October 22, 2010 9:52 AM
To: David Herzberg; r-help@r-project.org
Subject: RE: [R] Conditional looping over a set of variables in R

You were a bit vague about the format of your data. I'm assuming all columns were numeric and the entries are one of 0, 1, and NA (missing value). I made a little function to generate random data of that format for testing purposes:

makeData <- function (nrow = 1500, ncol = 140, pMissing = 0.1) {
    # pMissing is proportion of missing values
    m <- matrix(sample(c(1, 0), size = nrow * ncol, replace = TRUE), nrow, ncol)
    m[runif(nrow * ncol) < pMissing] <- NA
    data.frame(m)
}

E.g.,

> set.seed(168)
> d <- makeData(15,3)
> d
   X1 X2 X3
1   1  1  1
2   0  0 NA
3   0  1  0
4   0  0 NA
5   0  1  1
6   0  0 NA
7   1  0  0
8   0  1  1
9   0  0  1
10  1  1 NA
11  0  0  1
12  0  0  0
13 NA NA NA
14  0  0  0
15  1  0  0

I think the following function does what you want. The algorithm is pretty similar to what you showed.

columnOfFirstOne <- function(data) {
    # col will be return value, one entry per row of data.
    # Fill it with NA's: NA in output will mean there were no 1's in row
    col <- rep(as.integer(NA), nrow(data))
    for (j in seq_len(ncol(data))) { # loop over columns
        # For each entry in 'col', if it has not been set yet
        # and this entry in the j'th column of data is 1 (and not missing)
        # then set to the column number.
        col[is.na(col) & !is.na(data[, j]) & data[, j] == 1] <- j
    }
    col # return this from function
}

With the above data we get

> columnOfFirstOne(d)
 [1]  1 NA  2 NA  2 NA  1  2  3  1  3 NA NA NA  1

It seems quick enough for a dataset of your size

> dd <- makeData(nrow=1500, ncol=140)
> system.time(columnOfFirstOne(dd)) # time in seconds
   user  system elapsed
   0.08    0.00    0.08

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [R] Conditional looping over a set of variables in R
You were a bit vague about the format of your data. I'm assuming all columns were numeric and the entries are one of 0, 1, and NA (missing value). I made a little function to generate random data of that format for testing purposes:

makeData <- function (nrow = 1500, ncol = 140, pMissing = 0.1) {
    # pMissing is proportion of missing values
    m <- matrix(sample(c(1, 0), size = nrow * ncol, replace = TRUE), nrow, ncol)
    m[runif(nrow * ncol) < pMissing] <- NA
    data.frame(m)
}

E.g.,

> set.seed(168)
> d <- makeData(15,3)
> d
   X1 X2 X3
1   1  1  1
2   0  0 NA
3   0  1  0
4   0  0 NA
5   0  1  1
6   0  0 NA
7   1  0  0
8   0  1  1
9   0  0  1
10  1  1 NA
11  0  0  1
12  0  0  0
13 NA NA NA
14  0  0  0
15  1  0  0

I think the following function does what you want. The algorithm is pretty similar to what you showed.

columnOfFirstOne <- function(data) {
    # col will be return value, one entry per row of data.
    # Fill it with NA's: NA in output will mean there were no 1's in row
    col <- rep(as.integer(NA), nrow(data))
    for (j in seq_len(ncol(data))) { # loop over columns
        # For each entry in 'col', if it has not been set yet
        # and this entry in the j'th column of data is 1 (and not missing)
        # then set to the column number.
        col[is.na(col) & !is.na(data[, j]) & data[, j] == 1] <- j
    }
    col # return this from function
}

With the above data we get

> columnOfFirstOne(d)
 [1]  1 NA  2 NA  2 NA  1  2  3  1  3 NA NA NA  1

It seems quick enough for a dataset of your size

> dd <- makeData(nrow=1500, ncol=140)
> system.time(columnOfFirstOne(dd)) # time in seconds
   user  system elapsed
   0.08    0.00    0.08

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of David Herzberg
> Sent: Friday, October 22, 2010 8:34 AM
> To: r-help@r-project.org
> Subject: [R] Conditional looping over a set of variables in R
>
> Here's the problem I'm trying to solve in R: I have a data
> frame that consists of about 1500 cases (rows) of data from
> kids who took a test of listening comprehension. The columns
> are their scores (1 = correct, 0 = incorrect, . = missing)
> on 140 test items. The items are numbered sequentially and
> are ordered by increasing difficulty as you go from left to
> right across the columns. I want R to go through the data and
> find the first correct response for each case. Because of
> basal and ceiling rules, many cases have missing data on many
> items before the first correct response appears.
>
> For each case, I want R to evaluate the item responses
> sequentially starting with item 1. If the score is 0 or
> missing, proceed to the next item and evaluate it. If the
> score is 1, stop the operation for that case, record the item
> number of that first correct response in a new variable,
> proceed to the next case, and restart the operation.
>
> In SPSS, this operation would be carried out with LOOP,
> VECTOR, and DO IF, as follows (assuming the data set is
> already loaded):
>
> * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST
> CORRECT RESPONSE, SET IT EQUAL TO 0.
> numeric LCfirst1.
> comp LCfirst1 = 0
>
> * DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
> vector x=LC1a_score to LC140a_score.
>
> * SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS
> LCfirst1 = 0. "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1
> EACH TIME THE LOOP RUNS.
> loop #i=1 to 140 if (LCfirst1 = 0).
>
> * SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR
> EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE
> EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT
> IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS
> AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED.
> THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH
> THE VECTOR UNTIL A '1' IS ENCOUNTERED.
> + do if x(#i) = 1.
>
> * WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT
> STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
> + comp x(#i) = 99.
>
> * AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH
> RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE,
> THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE
> FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES
> THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM
> MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
> + comp LCfirst1 = #i.
> + end if.
> end loop.
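Bill's columnOfFirstOne() loops over columns rather than cases. For comparison, a loop-free variant can be sketched with max.col(). This is a swapped-in technique, not something proposed in the thread, and the matrix `m` below is hypothetical test data in the same 0/1/NA format that Bill assumes.

```r
# Hypothetical test matrix: entries are 0, 1 or NA
set.seed(1)
m <- matrix(sample(c(0, 1, NA), 20 * 5, replace = TRUE), nrow = 20, ncol = 5)

# TRUE where a response is a non-missing 1
hit <- !is.na(m) & m == 1

# Column of the first row maximum; with 0/1 entries that is the first 1
first1 <- max.col(hit * 1, ties.method = "first")

# Rows with no 1 at all have an all-zero row, so overwrite them with NA
first1[rowSums(hit) == 0] <- NA

# Cross-check against the row-by-row match() approach
stopifnot(isTRUE(all.equal(first1, apply(m, 1, function(x) match(1, x)))))
```

The NA overwrite is needed because max.col() always returns some column, even for a row that contains no 1.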
Re: [R] Conditional looping over a set of variables in R
David,

here I'm referring to your data as testmat, a matrix of 140 columns and 1500 rows, but the same or similar notation can be applied to data frames in R. If I understand correctly, you are looking for the first response (column) where you got a value of 1. I'm assuming also that since your missing values are characters then your two numeric values are also characters. Keeping all this in mind, try something like this.

first = c() # your extra variable which will eventually contain the first correct response for each case

for(i in 1:nrow(testmat)){
  c = 1
  while( c<=ncol(testmat) | testmat[i,c] != "1" ){
    if( testmat[i,c] == "1"){
      first[i] = c
      break # will exit the while loop once it finds the first correct answer, and then jump to the next case
    } else {
      c=c+1 # proceed to the next column if not
    }
  }
}

Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg wrote:

> Here's the problem I'm trying to solve in R: I have a data frame that
> consists of about 1500 cases (rows) of data from kids who took a test of
> listening comprehension. The columns are their scores (1 = correct, 0 =
> incorrect, . = missing) on 140 test items. The items are numbered
> sequentially and are ordered by increasing difficulty as you go from left to
> right across the columns. I want R to go through the data and find the first
> correct response for each case. Because of basal and ceiling rules, many
> cases have missing data on many items before the first correct response
> appears.
>
> For each case, I want R to evaluate the item responses sequentially
> starting with item 1. If the score is 0 or missing, proceed to the next item
> and evaluate it. If the score is 1, stop the operation for that case, record
> the item number of that first correct response in a new variable, proceed to
> the next case, and restart the operation.
>
> In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF,
> as follows (assuming the data set is already loaded):
>
> * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT
> RESPONSE, SET IT EQUAL TO 0.
> numeric LCfirst1.
> comp LCfirst1 = 0
>
> * DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
> vector x=LC1a_score to LC140a_score.
>
> * SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i"
> IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
> loop #i=1 to 140 if (LCfirst1 = 0).
>
> * SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF
> THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT
> OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP
> RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if
> STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1'
> IS ENCOUNTERED.
> + do if x(#i) = 1.
>
> * WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH
> RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
> + comp x(#i) = 99.
>
> * AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF
> LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE
> FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO
> CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO
> THE NEXT CASE AND RESTARTS THE LOOP.
> + comp LCfirst1 = #i.
> + end if.
> end loop.
> exe.
>
> After several hours of trying to translate this procedure to R, I'm
> stumped. I played around with creating a list to hold the item responses
> variables (analogous to 'vector' in SPSS), but when I tried to use the list
> in an R procedure, I kept getting a warning along the lines of 'the list
> contains > 1 element, only the first element will be used'. So perhaps a
> list is not the appropriate class to 'hold' these variables?
>
> It seems that some nested arrangement of 'for' 'while' and/or 'lapply' will
> allow me to recreate the operation described above? How do I set up the
> indexing operation analogous to 'loop #i' in SPSS?
>
> Any help is appreciated, and I'm happy to provide more information if
> needed.
>
> David S. Herzberg, Ph.D.
> Vice President, Research and Development
> Western Psychological Services
> 12031 Wilshire Blvd.
> Los Angeles, CA 90025-1251
> Phone: (310)478-2061 x144
> FAX: (310)478-7838
> email: dav...@wpspublish.com
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
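One caveat on the loop in Adrienne's message: its while condition joins the two tests with `|`, so `testmat[i, c]` is still evaluated after `c` has moved past the last column, and a case with no "1" at all will most likely stop with an indexing error instead of moving on. A minimal sketch of a variant that skips such cases is given below. It is an illustration under stated assumptions, not the code from the thread: the small character matrix is made up, and the counter is renamed `j` so it does not mask the c() function.

```r
# Hypothetical character matrix in the assumed format: "1", "0", "." (missing)
testmat <- matrix(c("0", ".", "1",
                    ".", ".", ".",
                    "1", "0", "0",
                    "0", "0", "."),
                  nrow = 4, byrow = TRUE)

# Pre-filled with NA, so cases without any "1" simply stay NA
first <- rep(NA_integer_, nrow(testmat))

for (i in seq_len(nrow(testmat))) {
  j <- 1L
  # && short-circuits: testmat[i, j] is only read while j is a valid column
  while (j <= ncol(testmat) && !identical(testmat[i, j], "1")) {
    j <- j + 1L
  }
  # j is still a valid column only if the scan stopped on a "1"
  if (j <= ncol(testmat)) first[i] <- j
}
first   # 3 NA 1 NA
```

Pre-allocating `first` also keeps it the same length as the number of cases, which growing it with `first[i] = c` does not guarantee when the last cases have no correct response.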
[R] Conditional looping over a set of variables in R
Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case. Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.

After several hours of trying to translate this procedure to R, I'm stumped. I played around with creating a list to hold the item responses variables (analogous to 'vector' in SPSS), but when I tried to use the list in an R procedure, I kept getting a warning along the lines of 'the list contains > 1 element, only the first element will be used'. So perhaps a list is not the appropriate class to 'hold' these variables?

It seems that some nested arrangement of 'for' 'while' and/or 'lapply' will allow me to recreate the operation described above? How do I set up the indexing operation analogous to 'loop #i' in SPSS?

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
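The procedure described in this post can be written almost literally in R with two nested for loops and break. The sketch below is only an illustration under stated assumptions: the four-case data set is made up, the column names are borrowed from the SPSS vector declaration, "." is read in as NA through na.strings, and cases with no correct response are recorded as NA rather than the 0 used in the SPSS version.

```r
# Hypothetical miniature of the data: 3 items, "." marks a missing score
raw <- "LC1a_score,LC2a_score,LC3a_score
.,0,1
1,.,.
0,0,0
.,.,."
scores <- read.csv(text = raw, na.strings = ".")

# One entry per case; NA means no correct response was found
LCfirst1 <- rep(NA_integer_, nrow(scores))

for (case in seq_len(nrow(scores))) {
  for (i in seq_len(ncol(scores))) {       # analogous to 'loop #i = 1 to 140'
    if (isTRUE(scores[case, i] == 1)) {    # isTRUE() treats a missing score as "not a 1"
      LCfirst1[case] <- i
      break                                # stop at the first correct response
    }
  }
}
LCfirst1   # 3 1 NA NA
```

Unlike the SPSS code, this sketch leaves the scores themselves unchanged (no recoding to 99) and only records the item number.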