Hello,There was a bug in the way I copied&pasted your data to my R session, hence the NA's.
Here is a tidyverse way of doing what you want. Its output matches the expected output in your last post. The column names don't start at zero because there was no Starting_SoC_of_12 equal to 0.
library(tidyverse) result <- dt_2014 %>% mutate(Hour = lubridate::hour(BatteryChargeStartDate)) %>% group_by(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>% mutate(Count = n()) %>% ungroup() %>% arrange(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>% pivot_wider( id_cols = Hour, names_from = c(Starting_SoC_of_12, Ending_SoC_of_12), names_sep = "-", values_from = Count, values_fill = 0L ) i <- str_order(names(result)[-1], numeric = TRUE) result <- cbind(result[1], result[-1][i]) result # Hour 1-11 2-10 2-11 4-4 4-12 5-8 8-12 #1 7 0 0 0 0 0 1 0 #2 8 0 0 0 0 1 0 1 #3 15 0 0 0 1 0 0 0 #4 16 1 1 0 0 0 0 0 #5 18 0 0 1 0 0 0 1 #6 21 0 0 0 0 1 0 1 Hope this helps, Rui Barradas Às 16:10 de 19/07/2022, roslinazairimah zakaria escreveu:
Hi Rui, I try to run your code, but all data became NA. Not sure why... # these columns need to be fixed cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate") dt[cols] <- lapply(dt[cols], \(x) sub("\n", " ", x)) # use package lubridate to coerce to a datetime class library(lubridate) dt <- lapply(dt, lubridate::dmy_hm) dt dt[cols] <- lapply(dt[cols], lubridate::dmy_hm) h <- lubridate::hour(dt[["BatteryChargeStartDate"]]) aggregate(Starting_SoC_of_12 ~ h, dt, length) $BCStartTime [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [33] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [65] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [97] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [129] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [161] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [193] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [225] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [257] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [289] NA NA NA NA NA NA NA NA [929] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [961] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [993] NA NA NA NA NA NA NA NA [ reached 'max' / getOption("max.print") -- omitted 34418 entries ]dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)Warning messages: 1: All formats failed to parse. No formats found. 2: All formats failed to parse. No formats found.h <- lubridate::hour(dt[["BatteryChargeStartDate"]]) aggregate(Starting_SoC_of_12 ~ h, dt, length)Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) : no rows to aggregate On Tue, Jul 19, 2022 at 4:53 PM roslinazairimah zakaria < roslina...@gmail.com> wrote:Hi Rui, Yes, I would like to count for each hour, how many in the state of charge start 0 and SOC 12, then SOC 1 and SOC 12 and so on. Thank you for your help. On Tue, Jul 19, 2022 at 1:11 AM Rui Barradas <ruipbarra...@sapo.pt> wrote:Hello, I'm not sure I understand the problem. Do you want counts of how many rows are there per hour? # these columns need to be fixed cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate") dt_2014[cols] <- lapply(dt_2014[cols], \(x) sub("\n", " ", x)) # use package lubridate to coerce to a datetime class dt_2014[cols] <- lapply(dt_2014[cols], lubridate::dmy_hm) h <- lubridate::hour(dt_2014[["BatteryChargeStartDate"]]) aggregate(Starting_SoC_of_12 ~ h, dt_2014, length) It would be better if you post the expected output corresponding to the posted data set. Hope this helps, Rui Barradas Às 05:04 de 18/07/2022, roslinazairimah zakaria escreveu:Dear all, I have data of Battery Electric vehicle (BEV). I would like to extractdatafrom every hour starting from 0.00 to 0.59, 1:00-1:59 for SOC(state of charge) start to end. Some examples: I can extract data from SOC=0 and SOC=12 dt_2014[which(dt_2014$Starting_SoC_of_12==0 & dt_2014$Ending_SoC_of_12==12),] I can extract data from SOC=1 and SOC=12 dt_2014[which(dt_2014$Starting_SoC_of_12==1 & dt_2014$Ending_SoC_of_12==12),] and I would like to further categorise the data by hour and count howmanycars from 0 state charge to 12 state charge at in that particular hour. Thank you so much for any help given. Some datadput(dt_2014[1:10,])structure(list(ï..CarID = c("GC10", "GC10", "GC10", "GC10", "GC10", "GC10", "GC10", "GC10", "GC10", "GC10"), BatteryChargeStartDate = c("16/2/2014 16:05", "16/2/2014 18:20", "17/2/2014 8:10", "18/2/2014 7:41", "18/2/201415:36","18/2/2014 16:36", "18/2/2014 21:26", "19/2/2014 8:57", "19/2/201421:08","20/2/2014 18:11"), BCStartTime = c("16:05", "18:20", "8:10", "7:41", "15:36", "16:36", "21:26", "8:57", "21:08", "18:11"), Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L), Month = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day = c(16L, 16L, 17L, 18L, 18L, 18L, 18L, 19L, 19L, 20L), BatteryChargeStopDate = c("16/2/2014 17:05", "16/2/2014 19:00", "17/2/2014 15:57", "18/2/2014 9:52", "18/2/2014 15:39", "18/2/2014 17:36", "19/2/2014 1:55", "19/2/2014 14:25", "20/2/2014 5:17", "20/2/201423:20"), BCStopTime = c("17:05", "19:00", "15:57", "9:52", "15:39", "17:36", "1:55", "14:25", "5:17", "23:20"), Year2 = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L ), Month2 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day2 =c(16L,16L, 17L, 18L, 18L, 18L, 19L, 19L, 20L, 20L), Starting_SoC_of_12 = c(1L, 2L, 4L, 5L, 4L, 2L, 8L, 8L, 4L, 8L), Ending_SoC_of_12 = c(11L, 11L, 12L, 8L, 4L, 10L, 12L, 12L, 12L, 12L)), row.names = c(NA, 10L), class = "data.frame")-- *Roslinazairimah Zakaria* *Tel: +609-5492370; Fax. No.+609-5492766* *Email: roslinazairi...@ump.edu.my <roslinazairi...@ump.edu.my>; roslina...@gmail.com <roslina...@gmail.com>* Faculty of Industrial Sciences & Technology University Malaysia Pahang Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.