Id    cat1  location  item_values  p-values  sequence
a111  1     3002737   0.196504377  0.01      1
a112  1     3017821   0.196504377  0.05      2
a113  1     3027730   0.196504377  0.02      3
a114  1     3036220   0.196504377  0.04      4
a115  1     3053984   0.196504377  0.03      5
a116  1     3063892   0.196504377  0.07      6
a117  1     3076333   0.196504377  0.08      7
a118  1     3090500   0.196504377  0.02      8
a119  1     3103304   0.196504377  0.03      9
a120  1     3119350   0.196504377  0.05      10
a121  1     3129884   0.196504377  0.01      11
a122  1     3154598   0.196504377  0.03      12
a123  1     3170910   0.196504377  0.05      13
a124  1     3180712   0.196504377  0.06      14
a125  1     3186519   0.196504377  0.07      15
a126  1     3192256   0.196504377  0.09      16
a127  1     3198441   0.196504377  0.01      17
a128  1     3205784   0.196504377  0.02      18
a129  1     3210685   0.196504377  0.03      19
a130  1     3218542   0.196504377  0.04      20
a131  1     3234318   0.196504377  0.05      21
a132  1     3239972   0.196504377  0.09      22
a133  1     3245663   0.196504377  0.05      23
a134  1     3257997   0.196504377  0.02      24
a135  1     3273226   0.196504377  0.03      26
a136  1     3285404   0.196504377  0.04      27
a137  1     3290332   0.196504377  0.05      28
a138  1     3300679   0.196504377  0.03      29
a139  1     3310164   0.196504377  0.09      30

First of all, please pay attention to the p-values. All consecutive rows with a p-value <= 0.05 are considered one region, and the region ends when a row with a p-value > 0.05 is found. For instance, REGION 1 is the rows from Id a111 to Id a115, REGION 2 is the rows from Id a118 to Id a123, and so on.

What I am trying to accomplish is to pick the start location, the end location, and the peak value of item_values for each region.

Option 1: loop through the rows until a p-value > 0.05 is found, then

  start_location = the first location value in the region
  end_location   = the location value of the last row before the p-value > 0.05
  peak_value     = the maximum of item_values in the region

Option 2: create a sequence number for each row, subset the raw data frame by p-value <= 0.05, and identify the regions by the gaps in the sequence number. For instance, sequence 1 to 5 would be considered one region:

Id    cat1  location  item_values  p-values  sequence
a111  1     3002737   0.196504377  0.01      1
a112  1     3017821   0.196504377  0.05      2
a113  1     3027730   0.196504377  0.02      3
a114  1     3036220   0.196504377  0.04      4
a115  1     3053984   0.196504377  0.03      5
a118  1     3090500   0.196504377  0.02      8
a119  1     3103304   0.196504377  0.03      9

Which of these approaches would you recommend, or is there a different way to implement this?

Thanks,
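
For illustration, here is a minimal sketch of a run-based variant of option 2 that avoids an explicit loop. It uses only the first ten rows of the posted data, and it makes several assumptions that are not in the original post: the data frame name `dat`, the column name `p_values`, the helper name `find_regions`, and a cutoff of p <= 0.05 (taken from the example regions, which include rows with p exactly 0.05). It relies on row order rather than on the sequence column, and it assumes there are no missing p-values.

```r
# First ten rows of the posted data. The names `dat` and `p_values` are
# assumptions made for this sketch.
dat <- data.frame(
  Id          = paste0("a", 111:120),
  location    = c(3002737, 3017821, 3027730, 3036220, 3053984,
                  3063892, 3076333, 3090500, 3103304, 3119350),
  item_values = 0.196504377,
  p_values    = c(0.01, 0.05, 0.02, 0.04, 0.03, 0.07, 0.08, 0.02, 0.03, 0.05),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise each run of consecutive rows with p <= cutoff.
find_regions <- function(d, cutoff = 0.05) {
  sig <- d$p_values <= cutoff                 # TRUE for rows inside a region
  if (!any(sig)) return(NULL)                 # no region at all
  runs   <- rle(sig)                          # consecutive TRUE/FALSE runs
  run_id <- rep(seq_along(runs$lengths), runs$lengths)  # run number per row
  keep   <- split(which(sig), run_id[sig])    # row indices, one element per region
  out <- do.call(rbind, lapply(keep, function(i) {
    data.frame(start_id       = d$Id[i[1]],
               end_id         = d$Id[i[length(i)]],
               start_location = d$location[i[1]],
               end_location   = d$location[i[length(i)]],
               peak_value     = max(d$item_values[i]),
               stringsAsFactors = FALSE)
  }))
  rownames(out) <- NULL
  out
}

regions <- find_regions(dat)
print(regions)
```

On this ten-row subset the second region stops at a120 only because the subset ends there. On the full table the same call should extend that region to a123, as described above.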