Please learn to use dput() to post example data.

# This is your data:
data <- structure(c(1232, 0, 43, 357, 71, 919, 23, 9, 1111,
                    0, 811, 0, 9871, 795, 76, 72, 743, 14),
                  .Dim = c(3L, 6L),
                  .Dimnames = list(NULL, c("X1", "X2", "X3", "Y1", "Y2", "Y3")))
data

# define groups and threshold explicitly
groupA <- c(1, 2, 3)
groupB <- c(4, 5, 6)
thrsh <- 100

# Here's how you evaluate your condition on the member elements of your group.
# Using >= keeps a group whose sum equals the threshold; only sums below
# thrsh are cleared.
rowSums(data[ , groupA]) >= thrsh

# note that you can cast a logical TRUE/FALSE into a numeric 0/1
as.numeric(rowSums(data[ , groupA]) >= thrsh)

# ... which you can multiply with your data (*)
data[ , groupA] * as.numeric(rowSums(data[ , groupA]) >= thrsh)

# now you could write this into your matrix
data[ , groupA] <- data[ , groupA] * as.numeric(rowSums(data[ , groupA]) >= thrsh)
# data[ , groupB] etc ...
data

# ... but you would be repeating code, therefore better to write this
# as a function:
clearReadsBelowThreshold <- function(m, g, t) {
    # m: numeric matrix, g: column indices of one group, t: threshold
    m[ , g] <- m[ , g] * as.numeric(rowSums(m[ , g]) >= t)
    return(m)
}

data <- clearReadsBelowThreshold(data, groupA, thrsh)
data <- clearReadsBelowThreshold(data, groupB, thrsh)
data

(*) Note that R would do this conversion implicitly, but omitting it can
confuse those who read the code later.

Cheers,
Boris


On Nov 6, 2015, at 8:53 AM, Assa Yeroslaviz <fry...@gmail.com> wrote:

> Sorry for the misunderstanding. Here is a more elaborate description of
> what I would like to achieve.
>
> I have a data set of counts from an RNA-Seq experiment and would like to
> filter reads with low counts. I don't want to set everything to 0
> automatically.
>
> I would like to set each categorical group (e.g. condition) to 0, if and
> only if all replicates in the group together have less than 100 reads.
> In my examples I used X and Y to represent the categories. Usually they
> have more distinct names like "control", "knockout1", "dKo" etc.
>
> So what I would really like to do is to check if the sum of all the "control"
> samples is lower than 100. If so, set all control samples to 0. This I would
> like to check *for each category* in every row of the data set.
>
> I hope it is clearer now.
>
> thanks
> Assa
>
>
> On Fri, Nov 6, 2015 at 2:29 PM, jim holtman <jholt...@gmail.com> wrote:
>
>> Is this what you want:
>>
>> > x <- read.table(text = "X1 X2 X3 Y1 Y2 Y3
>> + 1232 357 23 0 9871 72
>> + 0 71 9 811 795 743
>> + 43 919 1111 0 76 14", header = TRUE)
>> > x
>>     X1  X2   X3  Y1   Y2  Y3
>> 1 1232 357   23   0 9871  72
>> 2    0  71    9 811  795 743
>> 3   43 919 1111   0   76  14
>> >
>> > # create indices of columns that start with the same character
>> > indx <- split(seq(ncol(x)), substring(colnames(x), 1, 1))
>> > names(indx) <- NULL  # remove names so output not messed up
>> >
>> > result <- lapply(indx, function(a){
>> +     row_sum <- rowSums(x[, a])
>> +     x[row_sum < 100, a] <- 0
>> +     x[, a]
>> + })
>> > # combine back together
>> > do.call(cbind, result)
>>     X1  X2   X3  Y1   Y2  Y3
>> 1 1232 357   23   0 9871  72
>> 2    0   0    0 811  795 743
>> 3   43 919 1111   0    0   0
>>
>>
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>> On Fri, Nov 6, 2015 at 5:40 AM, Assa Yeroslaviz <fry...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have a data frame with multiple columns, which belong to several
>>> groups, like this:
>>>
>>>   X1  X2   X3  Y1   Y2  Y3
>>> 1232 357   23   0 9871  72
>>>    0  71    9 811  795 743
>>>   43 919 1111   0   76  14
>>>
>>> I would like to filter out such rows where the sum in one group is lower
>>> than a specific value. For example, I would like to set all the values in a
>>> group of columns to zero if the sum in that group is less than 100.
>>> In my example table I would like to set the values in the second row for
>>> the three X-columns to 0, so that the table looks like this:
>>>
>>>   X1  X2   X3  Y1   Y2  Y3
>>> 1232 357   23   0 9871  72
>>>    0   0    0 811  795 743
>>>   43 919 1111   0    0   0
>>>
>>> The same applies to the Y-values in the last row.
>>> Is there a more efficient way of doing it than going row by row and using
>>> the apply function on each of the subgroups I have in the columns?
>>>
>>> thanks
>>> Assa
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
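
A possible follow-up to the thread: with real category names such as "control" or "knockout1", writing out groupA, groupB, ... by hand gets tedious. The sketch below derives the groups from the column names and applies the same row-sum test to each. It is only an illustration under stated assumptions: it assumes a group label is the column name with its trailing replicate digits removed, the names groupLabels, groups and filtered are made up for this example, and drop = FALSE is an addition to the function from the thread so that a group with a single column also works.

```r
# Example data from the thread: 3 rows, columns X1..X3 and Y1..Y3
data <- structure(c(1232, 0, 43, 357, 71, 919, 23, 9, 1111,
                    0, 811, 0, 9871, 795, 76, 72, 743, 14),
                  .Dim = c(3L, 6L),
                  .Dimnames = list(NULL, c("X1", "X2", "X3", "Y1", "Y2", "Y3")))

# Function from the thread; drop = FALSE is added here (an assumption of
# this sketch) so that rowSums() still receives a matrix for 1-column groups
clearReadsBelowThreshold <- function(m, g, t) {
    m[ , g] <- m[ , g] * as.numeric(rowSums(m[ , g, drop = FALSE]) >= t)
    return(m)
}

# Assumption: the group label is the column name minus trailing digits,
# e.g. "X1" -> "X"; adapt the pattern to your own naming scheme
groupLabels <- sub("[0-9]+$", "", colnames(data))

# List of column indices per group: list(X = 1:3, Y = 4:6) for this data
groups <- split(seq_len(ncol(data)), groupLabels)

# Apply the threshold to each group in turn
filtered <- data
for (g in groups) {
    filtered <- clearReadsBelowThreshold(filtered, g, 100)
}
filtered
```

For the example data this clears the X columns in row 2 (sum 80) and the Y columns in row 3 (sum 90), and leaves everything else unchanged. If the replicate suffix is not numeric, an explicit vector of labels in column order can be passed to split() instead of the sub() result.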