Re: [R] Splitting a data column randomly into 3 groups

2021-09-02 Thread Avi Gross via R-help
Abou, I am not trying to be negative. Assuming you are a professor of Statistics, your request seems odd as what you are asking about is very routine in much of statistical work where you want to make a model or something using just part of your data and need to reserve some to check if you

Re: [R] Splitting a data column randomly into 3 groups

2021-09-02 Thread Jim Lemon
Hi Abou, One way is to shuffle the original data frame using sample() and split up the result into three equal parts. I was going to provide example code, but Avi's response popped up and I kind of agree with him. Jim On Fri, Sep 3, 2021 at 11:31 AM AbouEl-Makarim Aboueissa wrote: > > Dear
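A minimal sketch of the shuffle-and-split idea Jim describes. The attached file is not shown in the thread, so the data frame `dat` and its contents are hypothetical stand-ins; only the column name "Data" comes from the original question.

```r
# Hypothetical stand-in for the attached data (assumption: 12 rows, numeric "Data" column)
set.seed(42)                                  # only to make the sketch reproducible
dat <- data.frame(ID = 1:12, Data = round(rnorm(12, mean = 50, sd = 10), 1))

# Shuffle the rows with sample(), then deal them into three groups
shuffled <- dat[sample(nrow(dat)), ]
grp <- rep(1:3, length.out = nrow(shuffled))  # 1, 2, 3, 1, 2, 3, ...
groups <- split(shuffled$Data, grp)

lengths(groups)                               # group sizes
```

When the number of rows is not a multiple of three, `rep(..., length.out = )` makes the group sizes differ by at most one.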

Re: [R] Splitting a data column randomly into 3 groups

2021-09-02 Thread AbouEl-Makarim Aboueissa
Sorry, please forget about it. I believe that I am very serious when I posted my question. with thanks abou __ *AbouEl-Makarim Aboueissa, PhD* *Professor, Statistics and Data Science* *Graduate Coordinator* *Department of Mathematics and Statistics* *University of Southern

Re: [R] Splitting a data column randomly into 3 groups

2021-09-02 Thread Avi Gross via R-help
What is stopping you Abou? Some of us here start wondering if we have better things to do than homework for others. Help is supposed to be after they try and encounter issues that we may help with. So think about your problem. You supplied data in a file that is NOT in CSV format but is in

[R] Splitting a data column randomly into 3 groups

2021-09-02 Thread AbouEl-Makarim Aboueissa
Dear All: How to split a column data *randomly* into three groups. Please see the attached data. I need to split column #2 titled "Data" with many thanks abou __ *AbouEl-Makarim Aboueissa, PhD* *Professor, Statistics and Data Science* *Graduate Coordinator* *Department of

Re: [R] plotting some rows in different color

2021-09-02 Thread Jim Lemon
Hi Eliza This seems to work: plot(BFA3[,1],BFA3[,4], pch=16, xlab = "", ylab = "",col=(BFA3[,2]==BFA3[,3])+2,axes=FALSE) but I have no idea what you are trying to do with the as.numeric(as.Date(...)) business. Jim On Fri, Sep 3, 2021 at 8:44 AM Eliza Botto wrote: > > Dear useRs, > > For

[R] plotting some rows in different color

2021-09-02 Thread Eliza Botto
Dear useRs, For the following dataset, dput(BFA3) structure(c(17532, 17533, 17534, 17535, 17536, 17537, 17538, 17539, 17540, 17541, 17542, 17543, 17544, 17545, 17546, 17547, 17548, 17549, 17550, 17551, 17552, 17553, 17554, 17555, 17556, 17557, 17558, 17559, 17560, 17561, 17562, 17563, 17564,

Re: [R] Show only header of str() function

2021-09-02 Thread Luigi Marongiu
Thanks, that is perfect! On Thu, Sep 2, 2021 at 7:02 PM Deepayan Sarkar wrote: > > On Thu, Sep 2, 2021 at 9:26 PM Enrico Schumann > wrote: > > > > On Thu, 02 Sep 2021, Luigi Marongiu writes: > > > > > Hello, is it possible to show only the header (that is: `'data.frame': > > > x obs. of y

Re: [R] Show only header of str() function

2021-09-02 Thread Rui Barradas
Hello, I believe but do not have references that str was meant for interactive use, not for use in a script or package. If this is the case, then it should be rare to have to output to an object such as a character vector. As for my solution, it is far from perfect, I try to avoid

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Duncan Murdoch
On 02/09/2021 3:20 p.m., Greg Minshall wrote: Andrew, x[] <- lapply(x, function(xx) { xx[is.nan(xx)] <- NA_real_ xx }) is different from x <- lapply(x, function(xx) { xx[is.nan(xx)] <- NA_real_ xx }) indeed, the two are different -- but some ignorance of mine is

Re: [R] Calculate daily means from 5-minute interval data

2021-09-02 Thread Jeff Newmiller
Regardless of whether you use the lower-level split function, or the higher-level aggregate function, or the tidyverse group_by function, the key is learning how to create the column that is the same for all records corresponding to the time interval of interest. If you convert the sampdate to
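One possible way to build the grouping column Jeff mentions, sketched with made-up readings. The names `discharge`, `sampdate` and `cfs` are taken from other messages in this thread; the values, the UTC time zone and the use of `aggregate()` are assumptions for illustration.

```r
# Hypothetical 5-minute readings covering three full days (288 readings per day)
discharge <- data.frame(
  sampdate = seq(as.POSIXct("2016-03-03 00:00", tz = "UTC"),
                 by = "5 min", length.out = 3 * 288),
  cfs = rep(c(10, 20, 30), each = 288)
)

# The key step: a column that is identical for every record of the same day
discharge$day <- as.Date(discharge$sampdate, tz = "UTC")

# Any grouping tool can then be used; aggregate() is one of them
daily <- aggregate(cfs ~ day, data = discharge, FUN = mean)
daily
```

The same `day` column would work equally well with `split()` or a tidyverse `group_by()`.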

Re: [R] Calculate daily means from 5-minute interval data

2021-09-02 Thread Rich Shepard
On Thu, 2 Sep 2021, Andrew Simmons wrote: You could use 'split' to create a list of data frames, and then apply a function to each to get the means and sds. cols <- "cfs" # add more as necessary S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m")) means <-

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Greg Minshall
Andrew, > x[] <- lapply(x, function(xx) { > xx[is.nan(xx)] <- NA_real_ > xx > }) > > is different from > > x <- lapply(x, function(xx) { > xx[is.nan(xx)] <- NA_real_ > xx > }) indeed, the two are different -- but some ignorance of mine is exposed. i wonder, can you explain why

Re: [R] Calculate daily means from 5-minute interval data

2021-09-02 Thread Andrew Simmons
You could use 'split' to create a list of data frames, and then apply a function to each to get the means and sds. cols <- "cfs" # add more as necessary S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m")) means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE)) sds

Re: [R] Calculate daily means from 5-minute interval data

2021-09-02 Thread Rich Shepard
On Thu, 2 Sep 2021, Rich Shepard wrote: If I correctly understand the output of as.POSIXlt each date and time element is separate, so input such as 2016-03-03 12:00 would now be 2016 03 03 12 00 (I've not read how the elements are separated). (The TZ is not important because all data are either

Re: [R] Calculate daily means from 5-minute interval data

2021-09-02 Thread Rich Shepard
On Mon, 30 Aug 2021, Richard O'Keefe wrote: x <- rnorm(samples.per.day * 365) length(x) [1] 105120 Reshape the fake data into a matrix where each row represents one 24-hour period. m <- matrix(x, ncol=samples.per.day, byrow=TRUE) Richard, Now I understand the need to keep the date and
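The quoted reshape can be completed into a runnable sketch along these lines. The value of `samples.per.day` is not shown in the preview; 288 is inferred from 5-minute sampling and matches the printed length of 105120. The summary step with `rowMeans()` and `apply()` is an assumption about what follows, and the approach only holds when every day is complete and in order.

```r
samples.per.day <- 24 * 60 / 5            # 288 five-minute readings per day
set.seed(1)                               # only to make the sketch reproducible
x <- rnorm(samples.per.day * 365)         # fake data, as in the quoted message
length(x)

# One row per 24-hour period, one column per 5-minute slot
m <- matrix(x, ncol = samples.per.day, byrow = TRUE)

daily.means <- rowMeans(m)                # mean of each day
daily.sds   <- apply(m, 1, sd)            # standard deviation of each day
```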

Re: [R] read.csv() error

2021-09-02 Thread Rich Shepard
On Thu, 2 Sep 2021, Enrico Schumann wrote: There is no column 'ht'. Enrico, New eyeballs caught my change in variable name that I kept missing. Thanks very much, Rich __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see

Re: [R] Show only header of str() function

2021-09-02 Thread Deepayan Sarkar
On Thu, Sep 2, 2021 at 9:26 PM Enrico Schumann wrote: > > On Thu, 02 Sep 2021, Luigi Marongiu writes: > > > Hello, is it possible to show only the header (that is: `'data.frame': > > x obs. of y variables:` part) of the str function? > > Thank you > > Perhaps one more solution. You could limit

Re: [R] read.csv() error

2021-09-02 Thread Enrico Schumann
On Thu, 02 Sep 2021, Rich Shepard writes: > The first three commands in the script are: > stage <- read.csv('../data/water/gauge-ht.dat', header > = TRUE, sep = ',', stringsAsFactors = FALSE) > stage$sampdate <- as.Date(stage$sampdate) > stage$ht <- as.numeric(stage$ht, length = 6) > > Running

Re: [R] Show only header of str() function

2021-09-02 Thread Avi Gross via R-help
Thanks for the interesting method Rui. So that is a way to do a redirect of output not to a sinkfile but to an in-memory variable as a textConnection. Of course, one has to wonder why the makers of str thought it would be too inefficient to have an option that returns the output in a form that

[R] read.csv() error

2021-09-02 Thread Rich Shepard
The first three commands in the script are: stage <- read.csv('../data/water/gauge-ht.dat', header = TRUE, sep = ',', stringsAsFactors = FALSE) stage$sampdate <- as.Date(stage$sampdate) stage$ht <- as.numeric(stage$ht, length = 6) Running the script produces this error: source('stage.R')

Re: [R] Show only header of str() function

2021-09-02 Thread Enrico Schumann
On Thu, 02 Sep 2021, Luigi Marongiu writes: > Hello, is it possible to show only the header (that is: `'data.frame': > x obs. of y variables:` part) of the str function? > Thank you Perhaps one more solution. You could limit the number of list components to be printed, though it will leave a

Re: [R] Show only header of str() function

2021-09-02 Thread Avi Gross via R-help
Luigi, If you are sure you are looking at something like a data.frame, and all you want to know is how many rows and how many columns are in it, then str() is perhaps too detailed a tool. The functions nrow() and ncol() tell you what you want and you can get both together with dim(). You can, of
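For illustration, the three functions Avi names, applied to the built-in `iris` data set. The `sprintf()` line is an added assumption showing one way to print a str()-like summary.

```r
n_rows <- nrow(iris)      # number of observations
n_cols <- ncol(iris)      # number of variables
both   <- dim(iris)       # rows and columns together

# A str()-like one-line summary built from dim()
sprintf("%d obs. of %d variables", both[1], both[2])
```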

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Luigi Marongiu
Thank you! On Thu, Sep 2, 2021 at 4:17 PM Andrew Simmons wrote: > > It seems like you might've missed one more thing, you need the brackets next > to 'x' to get it to work. > > > x[] <- lapply(x, function(xx) { > xx[is.nan(xx)] <- NA_real_ > xx > }) > > is different from > > x <-

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Andrew Simmons
It seems like you might've missed one more thing, you need the brackets next to 'x' to get it to work. x[] <- lapply(x, function(xx) { xx[is.nan(xx)] <- NA_real_ xx }) is different from x <- lapply(x, function(xx) { xx[is.nan(xx)] <- NA_real_ xx }) Also, if all of your data is
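A small sketch of the difference Andrew points out, using a made-up two-column data frame. The helper name `nan_to_na` is hypothetical; its body is the function from the message.

```r
x <- data.frame(a = c(1, NaN, 3), b = c(NaN, 5, 6))   # made-up data

nan_to_na <- function(xx) {
  xx[is.nan(xx)] <- NA_real_
  xx                      # return the modified column
}

y <- x
y[] <- lapply(y, nan_to_na)   # fills the existing data frame: still a data.frame
z <- lapply(x, nan_to_na)     # replaces the object: now a plain list

class(y)
class(z)
dim(z)                        # NULL, which is what Luigi saw
```

With the empty brackets, the list returned by `lapply()` is assigned into the columns of the existing object, so its class, dimensions and row names are kept.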

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Luigi Marongiu
Sorry, still I don't get it: ``` > dim(df) [1] 302 626 > # clean > df <- lapply(x, function(xx) { + xx[is.nan(xx)] <- NA + xx + }) > dim(df) NULL ``` On Thu, Sep 2, 2021 at 3:47 PM Andrew Simmons wrote: > > You removed the second line 'xx' from the function, put it back and it should > work

Re: [R] Loop over columns of dataframe and change values condtionally

2021-09-02 Thread Rui Barradas
Hello, In the particular case you have, to change to NA based on condition, use `is.na<-`. Here is some test data, 3 times the same df. set.seed(2021) df3 <- df2 <- df1 <- data.frame( x = c(0, 0, 1, 2, 3), y = c(1, 2, 3, 0, 0), z = rbinom(5, 1, prob = c(0.25, 0.75)), a =
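A hedged sketch of the `is.na<-` approach. The preview cuts off before the condition is stated, so "value equals zero" is an assumption made here for illustration, and only the `x` and `y` columns of the test data are reproduced.

```r
df1 <- data.frame(
  x = c(0, 0, 1, 2, 3),
  y = c(1, 2, 3, 0, 0)
)

# Assumed condition: set every zero to NA, column by column
df1[] <- lapply(df1, function(col) {
  is.na(col) <- col == 0    # `is.na<-` marks the selected elements as NA
  col
})
df1
```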

Re: [R] Loop over columns of dataframe and change values condtionally

2021-09-02 Thread PIKAL Petr
Hi you could operate with whole data frame (sometimes) head(iris) Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Andrew Simmons
You removed the second line 'xx' from the function, put it back and it should work On Thu, Sep 2, 2021, 09:45 Luigi Marongiu wrote: > `data[sapply(data, is.nan)] <- NA` is a nice compact command, but I > still get NaN when using the summary function, for instance one of the > columns give: >

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Luigi Marongiu
`data[sapply(data, is.nan)] <- NA` is a nice compact command, but I still get NaN when using the summary function, for instance one of the columns gives: ``` Min. : NA 1st Qu.: NA Median : NA Mean :NaN 3rd Qu.: NA Max. : NA NA's :110 ``` I tried to implement the second solution but: ``` df

[R] Loop over columns of dataframe and change values condtionally

2021-09-02 Thread Luigi Marongiu
Hello, it is possible to select the columns of a dataframe in sequence with: ``` for(i in 1:ncol(df)) { df[ , i] } # or for(i in 1:ncol(df)) { df[ i] } ``` And change all values with, for instance: ``` for(i in 1:ncol(df)) { df[ , i] <- df[ , i] + 10 } ``` Is it possible to apply a

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Andrew Simmons
Hello, I would use something like: x <- c(1:5, NaN) |> sample(100, replace = TRUE) |> matrix(10, 10) |> as.data.frame() x[] <- lapply(x, function(xx) { xx[is.nan(xx)] <- NA_real_ xx }) This prevents attributes from being changed in 'x', but accomplishes the same thing as you have

Re: [R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread PIKAL Petr
Hi what about data[sapply(data, is.nan)] <- NA Cheers Petr > -Original Message- > From: R-help On Behalf Of Luigi Marongiu > Sent: Thursday, September 2, 2021 3:18 PM > To: r-help > Subject: [R] How to globally convert NaN to NA in dataframe? > > Hello, > I have some NaN values in

[R] How to globally convert NaN to NA in dataframe?

2021-09-02 Thread Luigi Marongiu
Hello, I have some NaN values in some elements of a dataframe that I would like to convert to NA. The command `df1$col[is.nan(df1$col)]<-NA` allows to work column-wise. Is there an alternative for the global modification at once of all instances? I have seen from

Re: [R] Show only header of str() function

2021-09-02 Thread Luigi Marongiu
Thank you! better than dim() anyway. Best regards Luigi On Thu, Sep 2, 2021 at 1:31 PM Rui Barradas wrote: > > Hello, > > Not perfect but works for data.frames: > > > header_str <- function(x){ >capture.output(str(x))[[1]] > } > header_str(iris) > header_str(AirPassengers) > header_str(1:10)

Re: [R] Show only header of str() function

2021-09-02 Thread Rui Barradas
Hello, Not perfect but works for data.frames: header_str <- function(x){ capture.output(str(x))[[1]] } header_str(iris) header_str(AirPassengers) header_str(1:10) Hope this helps, Rui Barradas Às 12:02 de 02/09/21, Luigi Marongiu escreveu: Hello, is it possible to show only the header
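Rui's function laid out over several lines, with the result stored so it can be inspected. The variable name `hdr` is an addition for illustration.

```r
header_str <- function(x) {
  # capture.output() returns the printed lines of str() as a character vector;
  # the first element is the header line
  capture.output(str(x))[[1]]
}

hdr <- header_str(iris)
hdr
```

For objects that are not data frames, such as `AirPassengers` or `1:10`, `str()` prints a single line, so the whole description is returned rather than a separate header.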

[R] Show only header of str() function

2021-09-02 Thread Luigi Marongiu
Hello, is it possible to show only the header (that is: `'data.frame': x obs. of y variables:` part) of the str function? Thank you -- Best regards, Luigi

Re: [R] What if there's nothing to dispatch on?

2021-09-02 Thread Rolf Turner
On Wed, 1 Sep 2021 19:29:32 -0400 Duncan Murdoch wrote: > I don't know the header of your foo() method, but let's suppose foo() > is > >foo <- function(x, data, ...) { > UseMethod("foo") >} > > with > >foo.formula <- function(x, data, ...) { > # do something with the
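A self-contained sketch of the S3 setup Duncan describes. The generic and the `foo.formula` signature follow the quoted code; the method bodies are placeholders, and `foo.default` is an assumption added so that non-formula input has somewhere to go.

```r
foo <- function(x, data, ...) {
  UseMethod("foo")            # dispatch on the class of the first argument
}

foo.formula <- function(x, data, ...) {
  "formula method"            # placeholder body
}

foo.default <- function(x, data, ...) {
  "default method"            # placeholder body (not in the quoted message)
}

foo(y ~ x)
foo(1:3)
```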

Re: [R] conditional replacement of elements of matrix with another matrix column

2021-09-02 Thread Rui Barradas
Hello, With the new data, here are two ways. The first with a for loop. I find it simple and readable. for(b in unique(B[,1])){ A[which(A[,1] == b), 2] <- B[which(B[,1] == b), 2] } na <- is.na(A[,2]) A[!na, 2] sum(!na) # [1] 216 sum(A[,1] %in% B[,1]) # [1] 216 # Another way,
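The second approach is cut off in this preview. One vectorised alternative to the loop, offered here as a guess rather than as Rui's actual code, uses `match()`. The matrices `A` and `B` below are small made-up examples, and the sketch assumes the keys in the first column of `B` are unique.

```r
A <- cbind(id = c(1, 2, 3, 4, 5), value = NA_real_)   # made-up target matrix
B <- cbind(id = c(2, 4), value = c(20, 40))           # made-up lookup matrix

idx <- match(A[, 1], B[, 1])    # position in B of each key of A (NA if absent)
hit <- !is.na(idx)
A[hit, 2] <- B[idx[hit], 2]     # copy column 2 of B where the keys match

A
sum(hit)                        # number of rows that were filled
```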

Re: [R] how to install npsm package

2021-09-02 Thread caghpm
Thank you, Eric. Very useful. From: Eric Berger Sent: Wednesday, September 1, 2021 12:31 PM To: cag...@gmail.com Cc: R mailing list Subject: Re: [R] how to install npsm package Instructions can be found at https://github.com/kloke/npsm On Wed, Sep 1, 2021 at 6:27 PM

[R] combining geom_boxplot and geom_point with jitter

2021-09-02 Thread Ivan Calandra
Dear useRs, I'm having a problem to combine geom_boxplot and geom_point with jitter. It is difficult to explain but the code and result should make it clear (the example dataset is long so I copy it at the end of the email): p <- ggplot(my_data, aes(x = Diet, y = value, color = Software)) p

Re: [R] ISO Code for Namibia ('NA')

2021-09-02 Thread Dr Eberhard Lisse
Thank you. el On 02/09/2021 00:41, Bill Dunlap wrote: z <- tibble(Code=c("NA","NZ",NA), Name=c("Namibia","New Zealand","?")) z # A tibble: 3 x 2 Code Name 1 NA Namibia 2 NZ New Zealand 3 ? subset(z, Code=="NA") # A tibble: 1 x 2 Code Name 1 NA Namibia
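A related sketch in base R, assuming the country codes arrive from a CSV file: by default `read.csv()` treats the string "NA" as missing, and setting `na.strings = ""` is one way to keep it as text. The inline CSV text is made up for illustration.

```r
csv <- "Code,Name\nNA,Namibia\nNZ,New Zealand\n"     # made-up input

default <- read.csv(text = csv)                      # "NA" becomes a missing value
kept    <- read.csv(text = csv, na.strings = "")     # "NA" stays a string

is.na(default$Code[1])
as.character(kept$Code[1])
subset(kept, Code == "NA")                           # finds Namibia
```

With `na.strings = ""`, empty fields are read as missing instead, which may or may not suit a given file.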