Re: [R] Finding unique terms
Yes, you are right, I want the sum. I will change the formula accordingly.

On Fri, Oct 12, 2018 at 10:36 AM Jeff Newmiller wrote:

> You said "add up"... so you did not mean to say that? Denes computed the
> mean...

--
Roslinazairimah Zakaria
Tel: +609-5492370; Fax No.: +609-5492766
Email: roslinazairi...@ump.edu.my ; roslina...@gmail.com
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding unique terms
Here is a base R solution. "dat" is the data frame as in Robert's solution.

> aggregate(dat[,3:6], by = dat[1], FUN = sum, na.rm = TRUE)
  STUDENT_ID   PO1M PO1T PO2M PO2T
1    AA15285 287.80  350   37   50
2    AA15286 240.45  330   41   50

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Oct 15, 2018 at 6:42 PM Robert Baer wrote:

> oops! Forgot to clean up after my cut and paste. Solution with dplyr
> looks like this:
>
> # Create sums by student ID
> library(dplyr)
> dat %>%
>   group_by(STUDENT_ID) %>%
>   summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
>             sum.PO1T = sum(PO1T, na.rm = TRUE),
>             sum.PO2M = sum(PO2M, na.rm = TRUE),
>             sum.PO2T = sum(PO2T, na.rm = TRUE))
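A self-contained variant of the base R call above, offered as a sketch rather than as anyone's posted solution. It rebuilds only the columns needed (the compact data.frame() construction and the names score_cols, by_name and by_formula are illustrative and do not appear in the thread) and selects the score columns by name instead of by position 3:6. It also shows the formula interface, whose default na.action drops every row containing an NA; na.action = na.pass is assumed to be what is wanted for these data.

```r
# Compact rebuild of the posted data (only the columns used here)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)
score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

# Data-frame method: na.rm = TRUE is passed through to sum()
by_name <- aggregate(dat[score_cols], by = dat["STUDENT_ID"],
                     FUN = sum, na.rm = TRUE)

# Formula method: keep rows with NAs so that sum(..., na.rm = TRUE) sees them
by_formula <- aggregate(cbind(PO1M, PO1T, PO2M, PO2T) ~ STUDENT_ID,
                        data = dat, FUN = sum, na.rm = TRUE,
                        na.action = na.pass)

print(by_name)
```

Selecting by name keeps the call correct if columns are later reordered or added.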
Re: [R] Finding unique terms
On 10/11/2018 5:12 PM, roslinazairimah zakaria wrote:

> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.
>
> How do I do that?
> Thank you for any help given.

Oops! Forgot to clean up after my cut and paste. Solution with dplyr looks like this:

# Create sums by student ID
library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
            sum.PO1T = sum(PO1T, na.rm = TRUE),
            sum.PO2M = sum(PO2M, na.rm = TRUE),
            sum.PO2T = sum(PO2T, na.rm = TRUE))
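Writing one sum() per column invites cut-and-paste slips. A possible alternative, sketched here under the assumption that dplyr 1.0.0 or later is installed (across() is not available in older versions), applies the same function to all four score columns at once. The compact data.frame() rebuild and the name sums are illustrative only.

```r
library(dplyr)

# Compact rebuild of the posted data (only the columns used here)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# One function applied to every listed column; .names reproduces the
# sum.<column> naming used in the hand-written version
sums <- dat %>%
  group_by(STUDENT_ID) %>%
  summarise(across(c(PO1M, PO1T, PO2M, PO2T),
                   ~ sum(.x, na.rm = TRUE),
                   .names = "sum.{.col}"))

print(sums)
```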
Re: [R] Finding unique terms
> Dear r-users,
>
> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.
>
> How do I do that?
> Thank you for any help given.

# load data
# Enter dataframe by hand
dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame",
row.names = c(NA, -11L))

# Create sums by student ID (each output column sums its own input column)
library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
            sum.PO1T = sum(PO1T, na.rm = TRUE),
            sum.PO2M = sum(PO2M, na.rm = TRUE),
            sum.PO2T = sum(PO2T, na.rm = TRUE))
Re: [R] Finding unique terms
On 10/12/2018 08:58 AM, Dénes Tóth wrote:

> On 10/12/2018 04:36 AM, Jeff Newmiller wrote:
>> You said "add up"... so you did not mean to say that? Denes computed the
>> mean...
>
> Nice catch, Jeff. Of course I wanted to use 'sum' instead of 'mean'.

Oh, and one more note: if you have NAs in your columns, 'sum' is rarely the aggregate statistic that you are after. This is probably why my subconscious statistician suggested 'mean'.

--
Dr. Tóth Dénes
Managing Director
Kogentum Kft.
Tel.: 06-30-2583723
Web: www.kogentum.hu
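One concrete reason for caution with NAs, shown as a small sketch: sum(x, na.rm = TRUE) returns 0 when every value in a group is missing, so "no data" and "a total of zero" become indistinguishable. The helper sum_or_na below is hypothetical (it is not from the thread) and returns NA in that case instead. Whether NA or 0 is the right answer depends on what a missing score means in these data.

```r
# Compact rebuild of two posted columns: PO1M has some NAs, X is all NA
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  X = rep(NA, 11)
)

# Hypothetical helper: NA when nothing was observed, otherwise the sum
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Plain sum reports 0 for the all-NA column X
plain <- aggregate(dat[c("PO1M", "X")], by = dat["STUDENT_ID"],
                   FUN = sum, na.rm = TRUE)

# The helper keeps X as NA while PO1M is summed as before
guarded <- aggregate(dat[c("PO1M", "X")], by = dat["STUDENT_ID"],
                     FUN = sum_or_na)

print(plain)
print(guarded)
```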
Re: [R] Finding unique terms
On 10/12/2018 04:36 AM, Jeff Newmiller wrote:

> You said "add up"... so you did not mean to say that? Denes computed the
> mean...

Nice catch, Jeff. Of course I wanted to use 'sum' instead of 'mean'.
Re: [R] Finding unique terms
Yes, I thought that as well and had worked this out but didn't send it:

add_Pscores <- function(x) {
  return(sum(unlist(x), na.rm = TRUE))
}
by(rzdf[, c("PO1M", "PO1T", "PO2M", "PO2T")], rzdf$STUDENT_ID, FUN = add_Pscores)

rzdf$STUDENT_ID: AA15285
[1] 724.8
rzdf$STUDENT_ID: AA15286
[1] 661.45

Jim

On Fri, Oct 12, 2018 at 1:37 PM Jeff Newmiller wrote:
>
> You said "add up"... so you did not mean to say that? Denes computed the
> mean...
>
> On October 11, 2018 3:56:23 PM PDT, roslinazairimah zakaria wrote:
> >Hi Denes,
> >
> >It works perfectly as I want!
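The by() call above collapses all four score columns into a single grand total per student. If both views are wanted, one possible base R sketch uses rowsum() for per-column sums and rowSums() on that result for the grand totals. The compact data.frame() rebuild and the names per_column and per_student are illustrative; the group is converted to character here only as a conservative choice.

```r
# Compact rebuild of the posted data (only the columns used here)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)
score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

# Per-column sums; the row names of the result are the student IDs
per_column <- rowsum(dat[score_cols],
                     group = as.character(dat$STUDENT_ID),
                     na.rm = TRUE)

# Grand total per student, matching the totals printed by by()
per_student <- rowSums(per_column)

print(per_column)
print(per_student)
```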
Re: [R] Finding unique terms
You said "add up"... so you did not mean to say that? Denes computed the mean...

On October 11, 2018 3:56:23 PM PDT, roslinazairimah zakaria wrote:
>Hi Denes,
>
>It works perfectly as I want!
>
>Thanks a lot.

--
Sent from my phone. Please excuse my brevity.
Re: [R] Finding unique terms
Hi Denes,

It works perfectly as I want!

Thanks a lot.

On Fri, Oct 12, 2018 at 6:29 AM Dénes Tóth wrote:

> # I assume you would like to add up the values with na.rm = TRUE
> meanFn <- function(x) mean(x, na.rm = TRUE)
>
> # see ?aggregate
> aggregate(dat[, c("PO1M", "PO1T", "PO2M")],
>           by = dat["STUDENT_ID"],
>           FUN = meanFn)
Re: [R] Finding unique terms
On 10/12/2018 12:12 AM, roslinazairimah zakaria wrote:
> Dear r-users,
>
> I have this data:
>
> structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
>     COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
>     4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
>     "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
>     PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
>     82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
>     100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
>     41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
>     X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
>     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
> "COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame",
> row.names = c(NA, -11L))
>
> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.

dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame",
row.names = c(NA, -11L))

# I assume you would like to add up the values with na.rm = TRUE
meanFn <- function(x) mean(x, na.rm = TRUE)

# see ?aggregate
aggregate(dat[, c("PO1M", "PO1T", "PO2M")],
          by = dat["STUDENT_ID"],
          FUN = meanFn)

# if you have largish or large data
library(data.table)
dat2 <- as.data.table(dat)
dat2[, lapply(.SD, meanFn),
     by = STUDENT_ID,
     .SDcols = c("PO1M", "PO1T", "PO2M")]

Regards,
Denes

> How do I do that?
> Thank you for any help given.
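The code in this message computes group means, while the replies in this thread establish that sums were wanted. A minimal sketch of the data.table call with sum() substituted and all four score columns listed follows. It assumes the data.table package is installed; the compact data.frame() rebuild and the names score_cols and sums are illustrative rather than taken from the thread.

```r
library(data.table)

# Compact rebuild of the posted data (only the columns used here)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)
score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

dat2 <- as.data.table(dat)

# .SD holds the columns named in .SDcols for each STUDENT_ID group;
# extra arguments to lapply() (here na.rm) are passed on to sum()
sums <- dat2[, lapply(.SD, sum, na.rm = TRUE),
             by = STUDENT_ID,
             .SDcols = score_cols]

print(sums)
```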