Dear Jeff,

On 2021-12-21, 11:59 AM, "R-help on behalf of Jeff Newmiller" <r-help-boun...@r-project.org on behalf of jdnew...@dcn.davis.ca.us> wrote:
Intuitive, perhaps, but noticeably slower.

I think that in most applications, one wouldn't notice the difference; for example:

> D <- data.frame(matrix(rnorm(1000*1e6), 1e6, 1000))
> microbenchmark(D[, 1])
Unit: microseconds
   expr   min    lq    mean median     uq    max neval
 D[, 1] 3.321 3.362 3.98561  3.444 3.5875 51.291   100
> microbenchmark(D[[1]])
Unit: microseconds
   expr   min    lq    mean median     uq    max neval
 D[[1]] 1.722 1.763 1.99137  1.804 1.8655 17.876   100

Best,
John

And it doesn't work on tibbles by design. Data frames are lists of columns.

On December 21, 2021 8:38:35 AM PST, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>> Thanks for the reply.
>>>
>>> sort(unique(Data[1]))
>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>> decreasing)) :
>>>     undefined columns selected
>>
>> That's the wrong syntax: Data[1] is not "column one of Data". Use
>> Data[[1]] for that, so
>>
>>     sort(unique(Data[[1]]))
>
>Actually, I'd probably recommend
>
>    sort(unique(Data[, 1]))
>
>instead. This treats Data as a matrix rather than as a list.
>Dataframes are lists that look like matrices, but to me the matrix
>aspect is usually more intuitive.
>
>Duncan Murdoch
>
>>
>> I think Rui already pointed out the typo in the quoted text below...
>>
>> Duncan Murdoch
>>
>>>
>>> The recommended syntax did not work, as listed above.
>>>
>>> What I want is the sort of distinct column output. Again, the column may
>>> be text or numbers. This is a huge analysis effort with data coming at
>>> me from many different sources.
>>>
>>>
>>> *Stephen Dawson, DSL*
>>> /Executive Strategy Consultant/
>>> Business & Technology
>>> +1 (865) 804-3454
>>> http://www.shdawson.com <http://www.shdawson.com>
>>>
>>>
>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>>> Thanks everyone for the replies.
>>>>>
>>>>> It is clear one either needs to write a function or put the unique
>>>>> entries into another dataframe.
>>>>>
>>>>> It seems odd R cannot sort a list of unique column entries with ease.
>>>>> Python and SQL can do it with ease.
>>>>
>>>> I've seen several responses that looked pretty simple. It's hard to
>>>> beat sort(unique(x)), though there's a fair bit of confusion about
>>>> what you actually want. Maybe you should post an example of the code
>>>> you'd use in Python?
>>>>
>>>> Duncan Murdoch
>>>>
>>>>>
>>>>> QUESTION
>>>>> Is there a simpler means other than the unique function to capture
>>>>> distinct column entries, then sort that list?
>>>>>
>>>>>
>>>>> *Stephen Dawson, DSL*
>>>>> /Executive Strategy Consultant/
>>>>> Business & Technology
>>>>> +1 (865) 804-3454
>>>>> http://www.shdawson.com <http://www.shdawson.com>
>>>>>
>>>>>
>>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Inline.
>>>>>>
>>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>> Thanks.
>>>>>>>
>>>>>>> sort(unique(Data[[1]]))
>>>>>>>
>>>>>>> This syntax provides row numbers, not column values.
>>>>>>
>>>>>> This is not right.
>>>>>> The syntax Data[1] extracts a sub-data.frame; the syntax Data[[1]]
>>>>>> extracts the column vector.
>>>>>>
>>>>>> As for my previous answer, it was not addressing the question; I
>>>>>> misinterpreted it as being a question on how to sort by numeric order
>>>>>> when the data is not numeric. Here is a, hopefully, complete answer.
>>>>>> Still with package stringr.
>>>>>>
>>>>>>
>>>>>> cols_to_sort <- 1:4
>>>>>>
>>>>>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>>>>>     stringr::str_sort(unique(x), numeric = TRUE)
>>>>>> })
>>>>>>
>>>>>>
>>>>>> Or using Avi's suggestion of writing a function to do all the work and
>>>>>> simplify the lapply loop later,
>>>>>>
>>>>>>
>>>>>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>>>>>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>>>>>
>>>>>>
>>>>>> Hope this helps,
>>>>>>
>>>>>> Rui Barradas
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> *Stephen Dawson, DSL*
>>>>>>> /Executive Strategy Consultant/
>>>>>>> Business & Technology
>>>>>>> +1 (865) 804-3454
>>>>>>> http://www.shdawson.com <http://www.shdawson.com>
>>>>>>>
>>>>>>>
>>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>
>>>>>>>> Running a simple syntax set to review entries in dataframe columns.
>>>>>>>> Here is the working code.
>>>>>>>>
>>>>>>>> Data <- read.csv("./input/Source.csv", header=TRUE)
>>>>>>>> describe(Data)
>>>>>>>> summary(Data)
>>>>>>>> unique(Data[1])
>>>>>>>> unique(Data[2])
>>>>>>>> unique(Data[3])
>>>>>>>> unique(Data[4])
>>>>>>>>
>>>>>>>> I would like to sort the unique entries. The data in the various
>>>>>>>> columns are not all defined as numbers; some are text. I realize 1 and
>>>>>>>> 10 will not sort properly, as the column is not defined as a number,
>>>>>>>> but want to see what I have in the columns viewed as sorted.
>>>>>>>>
>>>>>>>> QUESTION
>>>>>>>> What is the best process to sort unique output, please?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
--
Sent from my phone. Please excuse my brevity.
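
For anyone reading this thread later, the indexing distinction discussed above can be checked with a small base-R sketch. The data frame below is hypothetical and only stands in for the Source.csv file from the original post; the column names `id` and `label` and the helper name `unisort_base` are assumptions for illustration, and `sort()` is used in place of `stringr::str_sort()` so that the example has no package dependencies.

```r
# Hypothetical stand-in for the data read from Source.csv in the thread
Data <- data.frame(
  id    = c("10", "2", "1", "2"),
  label = c("b", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# Three ways of indexing "the first column"
class(Data[1])     # "data.frame": a one-column sub-data-frame
class(Data[[1]])   # "character":  list-style, the column vector itself
class(Data[, 1])   # "character":  matrix-style, dropped to a vector

# Sorted distinct values of a single column (character ordering)
sort(unique(Data[[1]]))

# If the column holds numbers stored as text, convert before sorting
sort(unique(as.numeric(Data[[1]])))

# Illustrative helper (not from the thread) applied to several columns
unisort_base <- function(vec, ...) sort(unique(vec), ...)
cols_to_sort <- 1:2
Data2 <- lapply(Data[cols_to_sort], unisort_base)
Data2$label
```

`sort(unique(Data[1]))` fails because `Data[1]` is still a data frame, whereas `Data[[1]]` and `Data[, 1]` both hand `unique()` a plain vector. As noted earlier in the thread, the `Data[, 1]` form does not return a vector for tibbles, so `Data[[1]]` is the more portable choice.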