I am hoping someone can help me with a bus stop sequencing problem in R, where I need to match counts of people getting on and off a bus to the correct stop in the bus route stop sequence. I have tried searching online and in forums for sequence matching, but the results seem to refer to numeric sequences or DNA matching and are over my head. I am after a simple example, if anyone can please help.
I have two data series (from a database), shown below, that I want to combine. In this example “stop_sequence” holds the sequence (seq) of bus stops, and “stop_onoff” is a count of people getting on and off at certain stops (there is no entry if no one gets on or off).

stop_sequence <- data.frame(seq=c(10,20,30,40,50,60), ref=c('A','B','C','D','B','A'))
##   seq ref
## 1  10   A
## 2  20   B
## 3  30   C
## 4  40   D
## 5  50   B
## 6  60   A

stop_onoff <- data.frame(ref=c('A','D','B','A'), on=c(5,0,10,0), off=c(0,2,2,6))
##   ref on off
## 1   A  5   0
## 2   D  0   2
## 3   B 10   2
## 4   A  0   6

I need to match the stop_onoff numbers to the right stops in the stop sequence, with the correctly matched output as follows (load is a cumulative count of on minus off):

desired_output <- data.frame(seq=c(10,20,30,40,50,60), ref=c('A','B','C','D','B','A'), on=c(5,'-','-',0,10,0), off=c(0,'-','-',2,2,6), load=c(5,0,0,3,11,5))
##   seq ref on off load
## 1  10   A  5   0    5
## 2  20   B  -   -    0
## 3  30   C  -   -    0
## 4  40   D  0   2    3
## 5  50   B 10   2   11
## 6  60   A  0   6    5

In this example the stop “B” is matched to the second stop “B” in the stop sequence and not the first, because the on/off record for “B” comes after stop “D”.

Any guidance much appreciated.

Regards
Adam

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
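To make the matching rule above concrete, here is a minimal sketch in base R of one possible approach. It assumes the rows of stop_onoff are already in travel order, and it matches each row to the first stop with the same ref that comes after the previously matched stop. The helper name match_in_order and the variables idx and result are made up for illustration and are not from any package.

```r
# Data as given in the post
stop_sequence <- data.frame(seq = c(10, 20, 30, 40, 50, 60),
                            ref = c("A", "B", "C", "D", "B", "A"),
                            stringsAsFactors = FALSE)
stop_onoff <- data.frame(ref = c("A", "D", "B", "A"),
                         on  = c(5, 0, 10, 0),
                         off = c(0, 2, 2, 6),
                         stringsAsFactors = FALSE)

# Hypothetical helper: for each on/off record, return the row index of the
# first stop in the route with the same ref that lies after the previous match.
match_in_order <- function(route_ref, onoff_ref) {
  idx  <- integer(length(onoff_ref))
  last <- 0L
  for (i in seq_along(onoff_ref)) {
    candidates <- which(route_ref == onoff_ref[i])
    candidates <- candidates[candidates > last]
    if (length(candidates) == 0L) {
      stop("no stop '", onoff_ref[i], "' found after route position ", last)
    }
    last   <- candidates[1]
    idx[i] <- last
  }
  idx
}

idx <- match_in_order(stop_sequence$ref, stop_onoff$ref)

# Stops without an on/off record stay NA rather than '-', so the columns
# remain numeric.
result     <- stop_sequence
result$on  <- NA_real_
result$off <- NA_real_
result$on[idx]  <- stop_onoff$on
result$off[idx] <- stop_onoff$off

# Running load: treat missing records as zero boardings and alightings.
on0  <- ifelse(is.na(result$on), 0, result$on)
off0 <- ifelse(is.na(result$off), 0, result$off)
result$load <- cumsum(on0 - off0)

print(result)
```

Two caveats on this sketch. First, the running load carries forward through stops with no record, so it gives 5 at seq 20 and 30 rather than the 0 shown in the desired output; if 0 is really wanted there, those rows would need to be overwritten afterwards. Second, the "earliest possible match" rule is a greedy choice: if the first on/off record were for “B” with nothing before it, it would be assigned to the first “B”, so the data alone cannot always resolve repeated stops.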