Re: [R] Removing NAs from dataframe (for use in Vioplot)
Bert Gunter, on Mon, 2 May 2016 06:20:52 -0700, writes:

> Martin et al.:
>
> na.omit(frame) will remove all rows/cases in which an NA occurs. I'm
> not sure that this is what the OP wanted, which seemed to be to
> separately remove NAs from each column and plot the resulting column.
> This is what the lapply (and the OP's provided code) does, anyway.
>
> Also, lapply() produces a single list (of vectors), not a "series of lists".
>
> Corrections happily accepted if I'm in error.

No corrections needed. You were right ... and indeed I was wrong
in assuming that "one would want" a complete na.omit() here.

Martin

> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Mon, May 2, 2016 at 1:49 AM, Martin Maechler wrote:
>> Mike Smith, on Sun, 1 May 2016 08:15:44 +0100, writes:
>>
>> > On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
>> > Hi
>> > First post and a relative R newbie
>> > I am using the vioplot library to produce some violin plots.
>>
>> DW> It's a package, not a library.
>>
>> [yes!]
>>
>> > 1. Is there a more elegant way of automatically
>> > stripping the NAs, passing the columns to the function
>> > along with the header names??
>>
>> >>> ds2 <- lapply( ds1, na.omit)
>>
>> > Fantastic - that does the trick! Easy when you know how!!
>>
>> > Follow-on: is there a way to feed all the lists from ds2 to
>> > vioplot? It is now a series of lists (rather than a
>> > dataframe - is that right?).
>>
>> Yes, that's right. So after all the above was not really perfect:
>>
>> na.omit() has been designed as a generic function and has always
>> had a method for "data.frame"; so, really
>>
>>     ds.noNA <- na.omit(ds1)
>> or  ds0NA   <- na.omit(ds1)
>>
>> (choosing "expressive names")
>>
>> is what you want.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing NAs from dataframe (for use in Vioplot)
Martin et al.:

na.omit(frame) will remove all rows/cases in which an NA occurs. I'm
not sure that this is what the OP wanted, which seemed to be to
separately remove NAs from each column and plot the resulting column.
This is what the lapply (and the OP's provided code) does, anyway.

Also, lapply() produces a single list (of vectors), not a "series of lists".

Corrections happily accepted if I'm in error.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, May 2, 2016 at 1:49 AM, Martin Maechler wrote:
> Mike Smith, on Sun, 1 May 2016 08:15:44 +0100, writes:
>
> > On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
> > Hi
> > First post and a relative R newbie
> > I am using the vioplot library to produce some violin plots.
>
> DW> It's a package, not a library.
>
> [yes!]
>
> > 1. Is there a more elegant way of automatically
> > stripping the NAs, passing the columns to the function
> > along with the header names??
>
> >>> ds2 <- lapply( ds1, na.omit)
>
> > Fantastic - that does the trick! Easy when you know how!!
>
> > Follow-on: is there a way to feed all the lists from ds2 to
> > vioplot? It is now a series of lists (rather than a
> > dataframe - is that right?).
>
> Yes, that's right. So after all the above was not really perfect:
>
> na.omit() has been designed as a generic function and has always
> had a method for "data.frame"; so, really
>
>     ds.noNA <- na.omit(ds1)
> or  ds0NA   <- na.omit(ds1)
>
> (choosing "expressive names")
>
> is what you want.
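To make the distinction Bert draws concrete, here is a minimal sketch on a made-up data frame. The name `toy` and its values are assumptions for illustration only and are not the OP's spelling data. It contrasts row-wise removal with na.omit() on the whole frame against column-wise removal with lapply().

```r
# Hypothetical toy data, not the OP's spelling.csv
toy <- data.frame(y1 = c(1, 2, NA, 4),
                  y2 = c(NA, 5, 6, 7))

# Row-wise: drops every row that has an NA in any column
rowwise <- na.omit(toy)

# Column-wise: drops the NAs from each column separately; the result is
# a single list of vectors, which may end up with different lengths
colwise <- lapply(toy, na.omit)

nrow(rowwise)            # number of rows with no NA at all
sapply(colwise, length)  # number of non-NA values per column
```

In this toy case only two complete rows survive the row-wise version, while each column keeps three values in the column-wise version, which is why the two approaches can give visibly different violin plots.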
Re: [R] Removing NAs from dataframe (for use in Vioplot)
Mike Smith, on Sun, 1 May 2016 08:15:44 +0100, writes:

> On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
> Hi
> First post and a relative R newbie
> I am using the vioplot library to produce some violin plots.

DW> It's a package, not a library.

[yes!]

> 1. Is there a more elegant way of automatically
> stripping the NAs, passing the columns to the function
> along with the header names??

>>> ds2 <- lapply( ds1, na.omit)

> Fantastic - that does the trick! Easy when you know how!!

> Follow-on: is there a way to feed all the lists from ds2 to
> vioplot? It is now a series of lists (rather than a
> dataframe - is that right?).

Yes, that's right. So after all the above was not really perfect:

na.omit() has been designed as a generic function and has always
had a method for "data.frame"; so, really

    ds.noNA <- na.omit(ds1)
or  ds0NA   <- na.omit(ds1)

(choosing "expressive names")

is what you want.
Re: [R] Removing NAs from dataframe (for use in Vioplot)
> On May 1, 2016, at 12:15 AM, Mike Smith wrote:
>
>>>> On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
>>>> Hi
>>>> First post and a relative R newbie
>>>> I am using the vioplot library to produce some violin plots.
>
> DW> It's a package, not a library.
>
>>>> I have an input CSV with columns of irregular length that contain NAs. I
>>>> want to strip the NAs out and produce a multiple violin plot automatically
>>>> labelled using the headers. At the moment I do this
>>>>
>>>> Code:
>>>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>>>> library(vioplot)
>>>> y6<-na.omit(ds1$y6)
>>>> y5<-na.omit(ds1$y5)
>>>> y4<-na.omit(ds1$y4)
>>>> y3<-na.omit(ds1$y3)
>>>> y2<-na.omit(ds1$y2)
>>>> y1<-na.omit(ds1$y1)
>>>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
>>>> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>>>>
>>>> Two queries:
>>>> 1. Is there a more elegant way of automatically stripping the NAs, passing
>>>> the columns to the function along with the header names??
>
>>> ds2 <- lapply( ds1, na.omit)
>
> Fantastic - that does the trick! Easy when you know how!!
>
> Follow-on: is there a way to feed all the lists from ds2 to vioplot? It is now a
> series of lists (rather than a dataframe - is that right?). So this works,
>
> library(vioplot)
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> ds2 <- lapply( ds1, na.omit)
> vioplot(ds2$y1,ds2$y2)
>
> but this doesn't
>
> library(vioplot)
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> ds2 <- lapply( ds1, na.omit)
> vioplot(ds2)
> Error in min(data) : invalid 'type' (list) of argument

I had trouble, too. I thought, "Oh, this is easy, just use `do.call`", but I
failed in getting any successful argument passing that way.

    > do.call('vioplot', list(x=ds2[[6]], ds2[-6]) )
    Error in min(data) : invalid 'type' (list) of argument
    > do.call('vioplot', c(x=ds2[[6]], ds2[-6]) )
    Error in vioplot(x1 = 5L, x2 = 10L, x3 = 6L, x4 = 7L, x5 = 7L, x6 = 6L,  :
      argument "x" is missing, with no default

Eventually I re-wrote the first line of vioplot's body to behave the way I
thought made the most sense:

    vioplot <- function (x, ..., range = 1.5, h = NULL, ylim = NULL, names = NULL,
        horizontal = FALSE, col = "magenta", border = "black", lty = 1,
        lwd = 1, rectCol = "black", colMed = "white", pchMed = 19,
        at, add = FALSE, wex = 1, drawRect = TRUE)
    {
        datas <- c(list(x), ...)
        # but keep the rest the same.

    # I then get success with:
    vioplot(ds2[['y1']], ds2[-6])                      # success
    do.call('vioplot', list(x=ds2[[6]], ds2[-6]) )     # also succeeds
    do.call('vioplot', list(x=ds2[['y1']], ds2[-6]) )

This is retracing a route explored 8 years ago:

http://markmail.org/search/?q=list%3Aorg.r-project.r-help+list+argument+to+vioplot#query:list%3Aorg.r-project.r-help%20list%20argument%20to%20vioplot+page:1+mid:j6lapgri46utcod7+state:results

It's probably easier to use that helper-function approach than my efforts at
hacking.

Best of luck;
David

>>>> 2. Can I easily add the sample size to each violin plotted??
>
>>> ?violplot
>>> No documentation for ‘violplot’ in specified packages and libraries:
>>> you could try ‘??violplot’
>
> DW> I see that I misspelled that _package_ name. However, after loading
> DW> it I realized that I had no way of replicating what you are
> DW> seeing, because you didn't provide that file (or even something
> DW> that resembles it). It's rather unclear how you wanted this information
> DW> presented.
>
> The original code *should* have worked as the csv was online. There doesn't
> seem to be any option in vioplot to add the sample size (these are all small
> samples which I wanted to highlight) so I don't know if this is easily done
> elsewhere.
>
> Thanks again!!
>
> ---
> Mike Smith

David Winsemius
Alameda, CA, USA
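The second error above comes from how c() combines a named vector with a list: the vector is flattened into elements x1, x2, and so on, so nothing named x reaches the function. The sketch below illustrates this, along with one way to build the argument list for do.call(). The function `count_groups` is a hypothetical stand-in that only mimics the `x, ...` shape of the vioplot() signature quoted above, and the values in `ds2` are invented so that the example runs without the vioplot package.

```r
# Hypothetical stand-in with the same `x, ...` shape as vioplot();
# it only reports how many values each group received.
count_groups <- function(x, ..., names = NULL) {
  datas <- list(x, ...)
  n <- vapply(datas, length, integer(1))
  if (!is.null(names)) names(n) <- names
  n
}

# Toy list of NA-free vectors (illustrative values only)
ds2 <- list(y1 = c(5, 10, 6), y2 = c(7, 7), y3 = c(6, 8, 9, 3))

# c(x = <vector>, <list>) flattens the vector into elements x1, x2, ...
flat_names <- names(c(x = ds2[[1]], ds2[-1]))
flat_names

# Dropping the element names makes every vector a positional argument:
# the first is matched to `x`, the rest fall into `...`
args <- c(unname(ds2), list(names = names(ds2)))
res <- do.call(count_groups, args)
res
```

Assuming the package version of vioplot() keeps the signature shown above, the same pattern would presumably be `do.call(vioplot, c(unname(ds2), list(names = names(ds2), horizontal = TRUE, col = "lightblue")))`. That call is not tested here.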
Re: [R] Removing NAs from dataframe (for use in Vioplot)
>>> On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
>>> Hi
>>> First post and a relative R newbie
>>> I am using the vioplot library to produce some violin plots.

DW> It's a package, not a library.

>>> I have an input CSV with columns of irregular length that contain NAs. I
>>> want to strip the NAs out and produce a multiple violin plot automatically
>>> labelled using the headers. At the moment I do this
>>>
>>> Code:
>>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>>> library(vioplot)
>>> y6<-na.omit(ds1$y6)
>>> y5<-na.omit(ds1$y5)
>>> y4<-na.omit(ds1$y4)
>>> y3<-na.omit(ds1$y3)
>>> y2<-na.omit(ds1$y2)
>>> y1<-na.omit(ds1$y1)
>>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
>>> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>>>
>>> Two queries:
>>> 1. Is there a more elegant way of automatically stripping the NAs, passing
>>> the columns to the function along with the header names??

>> ds2 <- lapply( ds1, na.omit)

Fantastic - that does the trick! Easy when you know how!!

Follow-on: is there a way to feed all the lists from ds2 to vioplot? It is now a
series of lists (rather than a dataframe - is that right?). So this works,

    library(vioplot)
    ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
    ds2 <- lapply( ds1, na.omit)
    vioplot(ds2$y1,ds2$y2)

but this doesn't

    library(vioplot)
    ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
    ds2 <- lapply( ds1, na.omit)
    vioplot(ds2)

>>> 2. Can I easily add the sample size to each violin plotted??

>>> ?violplot
>> No documentation for ‘violplot’ in specified packages and libraries:
>> you could try ‘??violplot’

DW> I see that I misspelled that _package_ name. However, after loading
DW> it I realized that I had no way of replicating what you are
DW> seeing, because you didn't provide that file (or even something
DW> that resembles it). It's rather unclear how you wanted this information
DW> presented.

The original code *should* have worked as the csv was online. There doesn't
seem to be any option in vioplot to add the sample size (these are all small
samples which I wanted to highlight) so I don't know if this is easily done
elsewhere.

Thanks again!!

---
Mike Smith
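One possible way to show the sample sizes, sketched here under assumptions, is to fold the counts into the labels that are passed through vioplot's `names` argument. The list `ds2` below is a made-up stand-in for the result of `lapply(ds1, na.omit)`, and the label format is an arbitrary choice.

```r
# Toy stand-in for ds2 <- lapply(ds1, na.omit); values are illustrative
ds2 <- list(y1 = c(5, 10, 6), y2 = c(7, 7), y3 = c(6, 8, 9, 3))

# Number of non-NA observations in each column
n <- vapply(ds2, length, integer(1))

# Labels such as "y1 (n=3)" that could then be supplied as names=
labs <- paste0(names(ds2), " (n=", n, ")")
labs
```

The character vector `labs` could then be given to the plotting call in place of the hand-typed `names=c("Y6", ...)` vector, so that each violin is labelled with its column name and its count.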
Re: [R] Removing NAs from dataframe (for use in Vioplot)
But require() should not be used interchangeably with library()... the return
value from require() should always be tested.
--
Sent from my phone. Please excuse my brevity.

On May 1, 2016 3:03:59 AM GMT+01:00, Tom Wright wrote:
>Never let it be said there's only one way to do a thing:
>
>require(ggplot2)
>require(dplyr)
>
>#create a sample dataset
>dat <- data.frame(y1=sample(c(1:10,NA),20,replace=TRUE),
>                  y2=sample(c(1:10,NA),20,replace=TRUE),
>                  y3=sample(c(1:10,NA),20,replace=TRUE))
>
># convert from wide to long
>dat <- melt(dat)
>
># add the counts as a label
>dat <- merge(dat,
>             group_by(dat,variable) %>%
>               summarise(lab=paste0('n=',length(na.omit(value)))))
>
># do the plot
>ggplot(dat,aes(x=variable,y=value)) +
>  geom_violin() +
>  geom_text(aes(y=max(value,na.rm=TRUE)/2,label=lab))
>
># apologies to David Winsemius for directing this answer to him, I'll work
># out how to use email one day.
>
>On Sat, Apr 30, 2016 at 12:58 PM, Mike Smith wrote:
>
>> Hi
>>
>> First post and a relative R newbie
>>
>> I am using the vioplot library to produce some violin plots. I have an
>> input CSV with columns of irregular length that contain NAs. I want to
>> strip the NAs out and produce a multiple violin plot automatically labelled
>> using the headers. At the moment I do this
>>
>> Code:
>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>> library(vioplot)
>> y6<-na.omit(ds1$y6)
>> y5<-na.omit(ds1$y5)
>> y4<-na.omit(ds1$y4)
>> y3<-na.omit(ds1$y3)
>> y2<-na.omit(ds1$y2)
>> y1<-na.omit(ds1$y1)
>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
>> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>>
>> Two queries:
>>
>> 1. Is there a more elegant way of automatically stripping the NAs, passing
>> the columns to the function along with the header names??
>>
>> 2. Can I easily add the sample size to each violin plotted??
>>
>> thanks
>>
>> mike
>>
>> ---
>> Mike Smith
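The point about require() can be sketched as follows. The package name is deliberately fictitious (an assumption chosen so that it is almost certainly not installed), which makes it possible to see both behaviours without touching any real package.

```r
# Hypothetical package name, chosen so that it is almost certainly not installed
pkg <- "notARealPackage12345"

# require() returns FALSE (with a warning) instead of stopping,
# so its result has to be checked explicitly
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

if (!ok) {
  message("Package '", pkg, "' is not available; install it first.")
}

# library() signals an error right away, which is usually what a script wants
res <- tryCatch(library(pkg, character.only = TRUE),
                error = function(e) "library() failed with an error")
res
```

If the return value of require() is ignored, a script carries on and fails later at the first call into the missing package, which tends to produce a less helpful error message.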
Re: [R] Removing NAs from dataframe (for use in Vioplot)
Never let it be said there's only one way to do a thing:

require(ggplot2)
require(dplyr)
require(reshape2)  # provides melt(), which the code as posted did not load

# create a sample dataset
dat <- data.frame(y1=sample(c(1:10,NA),20,replace=TRUE),
                  y2=sample(c(1:10,NA),20,replace=TRUE),
                  y3=sample(c(1:10,NA),20,replace=TRUE))

# convert from wide to long
dat <- melt(dat)

# add the counts as a label
dat <- merge(dat,
             group_by(dat,variable) %>%
               summarise(lab=paste0('n=',length(na.omit(value)))))

# do the plot
ggplot(dat,aes(x=variable,y=value)) +
  geom_violin() +
  geom_text(aes(y=max(value,na.rm=TRUE)/2,label=lab))

# apologies to David Winsemius for directing this answer to him, I'll work
# out how to use email one day.

On Sat, Apr 30, 2016 at 12:58 PM, Mike Smith wrote:

> Hi
>
> First post and a relative R newbie
>
> I am using the vioplot library to produce some violin plots. I have an
> input CSV with columns of irregular length that contain NAs. I want to
> strip the NAs out and produce a multiple violin plot automatically labelled
> using the headers. At the moment I do this
>
> Code:
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> library(vioplot)
> y6<-na.omit(ds1$y6)
> y5<-na.omit(ds1$y5)
> y4<-na.omit(ds1$y4)
> y3<-na.omit(ds1$y3)
> y2<-na.omit(ds1$y2)
> y1<-na.omit(ds1$y1)
> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>
> Two queries:
>
> 1. Is there a more elegant way of automatically stripping the NAs, passing
> the columns to the function along with the header names??
>
> 2. Can I easily add the sample size to each violin plotted??
>
> thanks
>
> mike
>
> ---
> Mike Smith
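The wide-to-long step and the per-column counts in the message above can also be done in base R alone. The sketch below is an alternative, not Tom's method: it swaps in stack() for melt() and tapply() for the group_by()/summarise() pipeline. The data frame `wide` is invented for illustration.

```r
# Illustrative wide data with NAs (not the OP's file)
wide <- data.frame(y1 = c(1, NA, 3),
                   y2 = c(4, 5, NA),
                   y3 = c(NA, NA, 9))

# Wide -> long: stack() returns the columns `values` and `ind`
long <- stack(wide)

# Non-NA count per original column, formatted as a label
n <- tapply(!is.na(long$values), long$ind, sum)
labs <- data.frame(ind = names(n), lab = paste0("n=", n))

# Attach the matching label to every row of the long data
long <- merge(long, labs, by = "ind")
head(long)
```

The resulting `long` data frame has one row per original cell, with `ind` playing the role of `variable` and `values` the role of `value` in the ggplot2 call.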
Re: [R] Removing NAs from dataframe (for use in Vioplot)
> On Apr 30, 2016, at 4:16 PM, David Winsemius wrote:
>
>> On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
>>
>> Hi
>>
>> First post and a relative R newbie
>>
>> I am using the vioplot library to produce some violin plots.

It's a package, not a library.

>> I have an input CSV with columns of irregular length that contain NAs. I
>> want to strip the NAs out and produce a multiple violin plot automatically
>> labelled using the headers. At the moment I do this
>>
>> Code:
>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>> library(vioplot)
>> y6<-na.omit(ds1$y6)
>> y5<-na.omit(ds1$y5)
>> y4<-na.omit(ds1$y4)
>> y3<-na.omit(ds1$y3)
>> y2<-na.omit(ds1$y2)
>> y1<-na.omit(ds1$y1)
>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
>> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>>
>> Two queries:
>>
>> 1. Is there a more elegant way of automatically stripping the NAs, passing
>> the columns to the function along with the header names??
>
> ds2 <- lapply( ds1, na.omit)
>
>> 2. Can I easily add the sample size to each violin plotted??
>
>    > ?violplot
>    No documentation for ‘violplot’ in specified packages and libraries:
>    you could try ‘??violplot’

I see that I misspelled that _package_ name. However, after loading it I
realized that I had no way of replicating what you are seeing, because you
didn't provide that file (or even something that resembles it). It's rather
unclear how you wanted this information presented.

--
David.

David Winsemius
Alameda, CA, USA
Re: [R] Removing NAs from dataframe (for use in Vioplot)
> On Apr 30, 2016, at 12:58 PM, Mike Smith wrote:
>
> Hi
>
> First post and a relative R newbie
>
> I am using the vioplot library to produce some violin plots. I have an input
> CSV with columns of irregular length that contain NAs. I want to strip the
> NAs out and produce a multiple violin plot automatically labelled using the
> headers. At the moment I do this
>
> Code:
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> library(vioplot)
> y6<-na.omit(ds1$y6)
> y5<-na.omit(ds1$y5)
> y4<-na.omit(ds1$y4)
> y3<-na.omit(ds1$y3)
> y2<-na.omit(ds1$y2)
> y1<-na.omit(ds1$y1)
> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
> "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
>
> Two queries:
>
> 1. Is there a more elegant way of automatically stripping the NAs, passing
> the columns to the function along with the header names??

ds2 <- lapply( ds1, na.omit)

> 2. Can I easily add the sample size to each violin plotted??

    > ?violplot
    No documentation for ‘violplot’ in specified packages and libraries:
    you could try ‘??violplot’

> thanks
>
> mike
>
> ---
> Mike Smith

David Winsemius
Alameda, CA, USA
[R] Removing NAs from dataframe (for use in Vioplot)
Hi

First post and a relative R newbie

I am using the vioplot library to produce some violin plots. I have an
input CSV with columns of irregular length that contain NAs. I want to
strip the NAs out and produce a multiple violin plot automatically
labelled using the headers. At the moment I do this

Code:
ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
library(vioplot)
y6<-na.omit(ds1$y6)
y5<-na.omit(ds1$y5)
y4<-na.omit(ds1$y4)
y3<-na.omit(ds1$y3)
y2<-na.omit(ds1$y2)
y1<-na.omit(ds1$y1)
vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6",
"Y5","Y4","Y3","Y2","Y1"), col = "lightblue")

Two queries:

1. Is there a more elegant way of automatically stripping the NAs, passing
the columns to the function along with the header names??

2. Can I easily add the sample size to each violin plotted??

thanks

mike

---
Mike Smith