Hello,

There was a bug in the way I copied&pasted your data to my R session, hence the NA's.

Here is a tidyverse way of doing what you want. Its output matches the expected output in your last post. The column names don't start at zero because there was no Starting_SoC_of_12 equal to 0.


library(tidyverse)

result <- dt_2014 %>%
  mutate(Hour = lubridate::hour(BatteryChargeStartDate)) %>%
  group_by(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
  mutate(Count = n()) %>%
  ungroup() %>%
  arrange(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
  pivot_wider(
    id_cols = Hour,
    names_from = c(Starting_SoC_of_12, Ending_SoC_of_12),
    names_sep = "-",
    values_from = Count,
    values_fill = 0L
  )

i <- str_order(names(result)[-1], numeric = TRUE)
result <- cbind(result[1], result[-1][i])
result
#  Hour 1-11 2-10 2-11 4-4 4-12 5-8 8-12
#1    7    0    0    0   0    0   1    0
#2    8    0    0    0   0    1   0    1
#3   15    0    0    0   1    0   0    0
#4   16    1    1    0   0    0   0    0
#5   18    0    0    1   0    0   0    1
#6   21    0    0    0   0    1   0    1


Hope this helps,

Rui Barradas


Às 16:10 de 19/07/2022, roslinazairimah zakaria escreveu:
Hi Rui,
I try to run your code, but all data became NA. Not sure why...
# these columns need to be fixed
cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
dt[cols] <- lapply(dt[cols], \(x) sub("\n", " ", x))

# use package lubridate to coerce to a datetime class
library(lubridate)

dt <- lapply(dt, lubridate::dmy_hm)
dt
dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)

h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
aggregate(Starting_SoC_of_12 ~ h, dt, length)


$BCStartTime
    [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
   [33] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
   [65] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
   [97] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [129] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [161] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [193] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [225] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [257] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [289] NA NA NA NA NA NA NA NA

  [929] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [961] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
NA NA NA NA NA NA NA NA NA
  [993] NA NA NA NA NA NA NA NA
  [ reached 'max' / getOption("max.print") -- omitted 34418 entries ]

dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)
Warning messages:
1: All formats failed to parse. No formats found.
2: All formats failed to parse. No formats found.

h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
aggregate(Starting_SoC_of_12 ~ h, dt, length)
Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) :
   no rows to aggregate

On Tue, Jul 19, 2022 at 4:53 PM roslinazairimah zakaria <
roslina...@gmail.com> wrote:

Hi Rui,

Yes, I would like to count for each hour, how many in the state of charge
start 0 and SOC 12, then SOC 1 and SOC 12 and so on.

Thank you for your help.

On Tue, Jul 19, 2022 at 1:11 AM Rui Barradas <ruipbarra...@sapo.pt> wrote:

Hello,

I'm not sure I understand the problem. Do you want counts of how many
rows are there per hour?


# these columns need to be fixed
cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
dt_2014[cols] <- lapply(dt_2014[cols], \(x) sub("\n", " ", x))
# use package lubridate to coerce to a datetime class
dt_2014[cols] <- lapply(dt_2014[cols], lubridate::dmy_hm)

h <- lubridate::hour(dt_2014[["BatteryChargeStartDate"]])
aggregate(Starting_SoC_of_12 ~ h, dt_2014, length)



It would be better if you post the expected output corresponding to the
posted data set.

Hope this helps,

Rui Barradas

Às 05:04 de 18/07/2022, roslinazairimah zakaria escreveu:
Dear all,

I have data of Battery Electric vehicle (BEV). I would like to extract
data
from every hour starting from 0.00 to 0.59, 1:00-1:59 for SOC(state of
charge) start to end.

Some examples:
I can extract data from SOC=0 and SOC=12
dt_2014[which(dt_2014$Starting_SoC_of_12==0 &
dt_2014$Ending_SoC_of_12==12),]

I can extract data from SOC=1 and SOC=12
dt_2014[which(dt_2014$Starting_SoC_of_12==1 &
dt_2014$Ending_SoC_of_12==12),]

and I would like to further categorise the data by hour and count how
many
cars from 0 state charge to 12 state charge at in that particular hour.

Thank you so much for any help given.

Some data
dput(dt_2014[1:10,])
structure(list(ï..CarID = c("GC10", "GC10", "GC10", "GC10", "GC10",
"GC10", "GC10", "GC10", "GC10", "GC10"), BatteryChargeStartDate =
c("16/2/2014 16:05",
"16/2/2014 18:20", "17/2/2014 8:10", "18/2/2014 7:41", "18/2/2014
15:36",
"18/2/2014 16:36", "18/2/2014 21:26", "19/2/2014 8:57", "19/2/2014
21:08",
"20/2/2014 18:11"), BCStartTime = c("16:05", "18:20", "8:10",
"7:41", "15:36", "16:36", "21:26", "8:57", "21:08", "18:11"),
      Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
      2014L, 2014L, 2014L), Month = c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
      2L, 2L, 2L), Day = c(16L, 16L, 17L, 18L, 18L, 18L, 18L, 19L,
      19L, 20L), BatteryChargeStopDate = c("16/2/2014 17:05", "16/2/2014
19:00",
      "17/2/2014 15:57", "18/2/2014 9:52", "18/2/2014 15:39", "18/2/2014
17:36",
      "19/2/2014 1:55", "19/2/2014 14:25", "20/2/2014 5:17", "20/2/2014
23:20"
      ), BCStopTime = c("17:05", "19:00", "15:57", "9:52", "15:39",
      "17:36", "1:55", "14:25", "5:17", "23:20"), Year2 = c(2014L,
      2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L
      ), Month2 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day2 =
c(16L,
      16L, 17L, 18L, 18L, 18L, 19L, 19L, 20L, 20L), Starting_SoC_of_12 =
c(1L,
      2L, 4L, 5L, 4L, 2L, 8L, 8L, 4L, 8L), Ending_SoC_of_12 = c(11L,
      11L, 12L, 8L, 4L, 10L, 12L, 12L, 12L, 12L)), row.names = c(NA,
10L), class = "data.frame")





--
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

*Email: roslinazairi...@ump.edu.my <roslinazairi...@ump.edu.my>;
roslina...@gmail.com <roslina...@gmail.com>*
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia




______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to