Re: [R] violin plot help

Abdelrahman, Omar (RER) Wed, 17 May 2017 15:13:01 -0700

Thank you, curly quotes got me! I was able to subset the data and produce the 
violin plot. Now, is there a way to generate multiple plots separately (no 
facets)? With so many levels of each variable, I am trying to avoid doing it 
iteratively. Neither ggplot2 books nor web searches have yielded anything (so 
far). 
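
One possible way to get separate plots without writing each subset by hand is to split the data by the grouping variables and build one plot per piece. The sketch below is not from this thread: the `merged` data frame is a made-up stand-in with the same column names, and `groups` and `plots` are hypothetical names.

```r
library(ggplot2)

# Made-up stand-in for the poster's 'merged' data frame (values are illustrative)
merged <- data.frame(
  Wshed     = rep(c("AC", "C-100"), each = 8),
  PARAMETER = rep(rep(c("Chlorophyll-A", "Phosphorus, Total (TP)"), each = 4),
                  times = 2),
  Geo       = rep(c("Bay", "Mouth"), times = 8),
  RESULT    = (1:16) / 100,
  stringsAsFactors = FALSE
)

# Split into one data frame per PARAMETER-Wshed combination;
# drop = TRUE skips combinations that have no rows
groups <- split(merged, list(merged$PARAMETER, merged$Wshed), drop = TRUE)

# Build one plot object per group; names look like "Chlorophyll-A.AC"
plots <- lapply(names(groups), function(nm) {
  ggplot(groups[[nm]], aes(x = Geo, y = RESULT, fill = Geo)) +
    geom_violin() +
    ggtitle(nm)
})
names(plots) <- names(groups)

# To write each plot to its own file, something like this should work:
# for (nm in names(plots)) ggsave(paste0(make.names(nm), ".png"), plots[[nm]])
```

Printing `plots[["Chlorophyll-A.AC"]]` draws that single plot on the current device.
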
Also I want a violin for each year within Geo. I did try to specify year with 
the following:
ggplot () +
facet_grid (PARAMETER ~Wshed~year, scales="free_y") +
geom_violin (data=subdf, aes(x=Geo, y=RESULT, fill=Geo))


which yielded
Error in combine_vars(data, params$plot_env, cols, drop = params$drop) : 
  At least one layer must contain all variables used for faceting

Also tried:
ggplot () +
facet_grid (PARAMETER ~Wshed, scales="free_y") +
geom_violin (data=subdf, aes(x=Geo~year, y=RESULT, fill=Geo))

Do I need to specify "year(date)"? I have loaded lubridate.
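
A likely cause of the error above is that `year` is not a column of `subdf`, since faceting and aesthetics can only refer to columns that exist in the data. The sketch below is an assumption, not a confirmed fix from this thread: it uses a made-up `subdf`, derives `year` from the `DATE` strings with base R, and dodges the violins by year within each Geo.

```r
library(ggplot2)

# Made-up stand-in for 'subdf' (values are illustrative only)
subdf <- data.frame(
  Geo       = rep(c("Bay", "Central"), each = 6),
  Wshed     = "C-100",
  DATE      = rep(c("1/10/2013", "2/7/2013", "3/7/2014"), times = 4),
  PARAMETER = rep(c("Chlorophyll-A", "Phosphorus, Total (TP)"), times = 6),
  RESULT    = c(0.2, 0.004, 0.39, 0.003, 0.64, 0.017,
                0.18, 0.002, 0.31, 0.002, 0.4, 0.004),
  stringsAsFactors = FALSE
)

# Create the year column first. Base R is used here;
# lubridate::year(lubridate::mdy(subdf$DATE)) should give the same numbers.
subdf$year <- as.integer(format(as.Date(subdf$DATE, format = "%m/%d/%Y"), "%Y"))

# Geo on the x axis, one violin per year within each Geo (fill = year)
p <- ggplot(subdf, aes(x = Geo, y = RESULT, fill = factor(year))) +
  geom_violin() +
  facet_grid(PARAMETER ~ Wshed, scales = "free_y")
```

If year should be a facet instead, the formula would be written `PARAMETER ~ Wshed + year` rather than with a second `~`.
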


-----Original Message-----
From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us] 
Sent: Wednesday, May 17, 2017 10:05 AM
To: Abdelrahman, Omar (RER) <omar.abdelrah...@miamidade.gov>
Cc: R-help <r-help@r-project.org>
Subject: RE: [R] violin plot help

Here is an example that works... a reproducible example always includes code 
AND enough sample data to exercise the code:

########
dta <- read.table( text=
"STATION        Geo     Wshed   DATE            PARAMETER                 RESULT
BB36            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.004
BB36            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.2
BB52            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.003
BB52            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.39
CD01A           Mouth   C-100   1/10/2013       'Phosphorus, Total (TP)'  0.017
CD01A           Mouth   C-100   1/10/2013       'Chlorophyll-A'           0.64
CD02            East    C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
CD05            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.005
CD06            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
CD09            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.007
BB36            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.18
BB36            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
BB52            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
BB52            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.31
CD01A           Mouth   C-100   2/7/2013        'Phosphorus, Total (TP)'  0.004
CD01A           Mouth   C-100   2/7/2013        'Chlorophyll-A'           0.4
CD02            East    C-100   2/7/2013        'Phosphorus, Total (TP)'  0.011
CD05            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.007
CD06            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.015
CD09            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.008
CD01A           Mouth   C-100   3/7/2013        'Phosphorus, Total (TP)'  0.007
", header=TRUE)
# prints result to console without assigning it to a new variable
subset( dta, Geo == "East" )
########

Note that [1] and [2] suggest the use of the dput function to help create R 
code that creates the object just as you have it before the troublesome line of 
code:

########
dta <- structure(list(STATION = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 
7L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 3L) , .Label = c("BB36", "BB52", 
"CD01A", "CD02", "CD05", "CD06", "CD09")
      , class = "factor"),
     Geo = structure(c(1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L,
     1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L, 4L), .Label = c("Bay",
     "Central", "East", "Mouth"), class = "factor"),
     Wshed = structure(c(1L,
     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
     1L, 1L, 1L, 1L, 1L), .Label = "C-100", class = "factor"),
     DATE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("1/10/2013",
     "2/7/2013", "3/7/2013"), class = "factor"),
     PARAMETER = structure(c(2L,
     1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
     2L, 2L, 2L, 2L, 2L), .Label = c("Chlorophyll-A",
     "Phosphorus, Total (TP)"
     ), class = "factor"), RESULT = c(0.004, 0.2, 0.003, 0.39,
     0.017, 0.64, 0.01, 0.005, 0.01, 0.007, 0.18, 0.002, 0.002,
     0.31, 0.004, 0.4, 0.011, 0.007, 0.015, 0.008, 0.007)),
     .Names = c("STATION", "Geo", "Wshed", "DATE", "PARAMETER", "RESULT"),
     class = "data.frame", row.names = c(NA, -21L))
subset( dta, Geo == "East" )
########

Note that the "structure" function created by dput is mostly insensitive to 
extra newlines, except inside quotes.
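
As a small illustration of that point, the sketch below round-trips a made-up data frame through dput and then adds extra newlines after the commas; `small`, `code_text`, and `wrapped` are hypothetical names, not objects from this thread.

```r
# A tiny made-up data frame (not the poster's data)
small <- data.frame(Geo = c("Bay", "East"), RESULT = c(0.004, 0.01),
                    stringsAsFactors = FALSE)

# dput() writes R code that recreates the object; capture it as text
code_text <- paste(capture.output(dput(small)), collapse = "\n")
cat(code_text, "\n")

# Evaluating that text rebuilds an identical object
rebuilt <- eval(parse(text = code_text))

# Extra newlines after commas (none of them inside quotes here) do no harm
wrapped  <- gsub(", ", ",\n", code_text, fixed = TRUE)
rebuilt2 <- eval(parse(text = wrapped))
```
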

So the above examples work for me. What doesn't work for you?

One thought: Are you editing your R code with a plain text editor or are you 
editing it with a word processor that might replace your plain quotes with 
curly quotes?
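
A quick way to check a line of code for curly quotes is sketched below, assuming the code is available as a character string; `bad` is a made-up example written with Unicode escapes so the snippet itself stays plain text.

```r
# Code as a word processor might mangle it (curly double quotes around AC)
bad <- "subset(merged, Wshed == \u201cAC\u201d)"

# TRUE if any left/right curly single or double quote is present
has_curly <- grepl("[\u2018\u2019\u201c\u201d]", bad)

# Replace curly quotes with plain ASCII quotes
fixed <- gsub("[\u201c\u201d]", "\"", bad)
fixed <- gsub("[\u2018\u2019]", "'", fixed)
```
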

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

[2] http://adv-r.had.co.nz/Reproducibility.html

On Wed, 17 May 2017, Abdelrahman, Omar (RER) wrote:

> Thanks again
> RE: "so all the more reason to give us an example that we can run to trigger 
> the same error." Are you asking for an example of the data? Below is a 
> "small" example, but with so many levels of the different variables I am not 
> sure it can be useful.
>
> STATION       Geo     Wshed   DATE            PARAMETER               RESULT
> BB36          Bay     C-100   1/10/2013       Phosphorus, Total (TP)  0.004
> BB36          Bay     C-100   1/10/2013       Chlorophyll-A           0.2
> BB52          Bay     C-100   1/10/2013       Phosphorus, Total (TP)  0.003
> BB52          Bay     C-100   1/10/2013       Chlorophyll-A           0.39
> CD01A         Mouth   C-100   1/10/2013       Phosphorus, Total (TP)  0.017
> CD01A         Mouth   C-100   1/10/2013       Chlorophyll-A   0.64
> CD02          East    C-100   1/10/2013       Phosphorus, Total (TP)  0.01
> CD05          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.005
> CD06          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.01
> CD09          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.007
> BB36          Bay     C-100   2/7/2013        Chlorophyll-A           0.18
> BB36          Bay     C-100   2/7/2013        Phosphorus, Total (TP)  0.002
> BB52          Bay     C-100   2/7/2013        Phosphorus, Total (TP)  0.002
> BB52          Bay     C-100   2/7/2013        Chlorophyll-A           0.31
> CD01A         Mouth   C-100   2/7/2013        Phosphorus, Total (TP)  0.004
> CD01A         Mouth   C-100   2/7/2013        Chlorophyll-A           0.4
> CD02          East    C-100   2/7/2013        Phosphorus, Total (TP)  0.011
> CD05          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.007
> CD06          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.015
> CD09          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.008
> CD01A         Mouth   C-100   3/7/2013        Phosphorus, Total (TP)  0.007
>
> Hope this is not too much
>
> -----Original Message-----
> From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us]
> Sent: Tuesday, May 16, 2017 12:30 PM
> To: Abdelrahman, Omar (RER) <omar.abdelrah...@miamidade.gov>; R-help 
> <r-help@r-project.org>
> Subject: RE: [R] violin plot help
>
> Please use reply-all or equivalent to keep the list in the conversation. I 
> don't do private online consultation.
>
> Your example suggested you did not know the difference, but your error 
> suggests a completely different expression triggered the error, so all the 
> more reason to give us an example that we can run to trigger the same error.
>
> Items B and C are recommendations to read the help pages for those syntax 
> elements. You should already have read enough of an introduction to R to have 
> encountered the use of the question mark to bring up the help pages. If not, 
> please do.
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 16, 2017 9:00:09 AM PDT, "Abdelrahman, Omar (RER)" 
> <omar.abdelrah...@miamidade.gov> wrote:
>> Thanks Jeff. I will send plain text from now on. I am not sure what B 
>> or C mean; is there a guide that I can reference? I know the 
>> difference between "=" and "==" , they work the same in Stata and SAS.
>>
>> Omar
>> -----Original Message-----
>> From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us]
>> Sent: Tuesday, May 16, 2017 11:43 AM
>> To: r-help@r-project.org; Abdelrahman, Omar (RER) 
>> <omar.abdelrah...@miamidade.gov>; 'r-help@r-project.org'
>> <r-help@r-project.org>
>> Subject: Re: [R] violin plot help
>>
>> Read
>> A) the Posting Guide (re plain text only... your emails may be 
>> damaged by the mailing list if you send html-formatted email... only 
>> you can solve this by figuring out how to use your email software)
>> B) Help on assignment (?`=`)
>> C) Help on logical tests (?`==`)
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On May 16, 2017 7:06:40 AM PDT, "Abdelrahman, Omar (RER)"
>> <omar.abdelrah...@miamidade.gov> wrote:
>>> I am trying to produce multiple violin plots by 3 categorical
>>> variables, each violin representing 1 year worth of data. The variables
>>> are:
>>>
>>> Watershed (7 levels: county canals)
>>>
>>> Geography (5 levels: west; central; east; mouth; bay)
>>>
>>> Parameter (8 levels: water quality chemical parameters)
>>>
>>> Year (25 levels: 1992-2017)
>>>
>>> I want to produce 1 plot for each Parameter-Watershed subdivided
>>> into Geography with a violin for each year. I used facets with the following
>>> code (not by year):
>>>
>>> ggplot () +
>>>
>>> facet_grid (PARAMETER ~Wshed, scales="free_y") +
>>>
>>> geom_violin (data=merged, aes(x=Geo, y=RESULT))
>>>
>>>
>>>
>>> I do not want facets, they crowd the information so it is unreadable. I
>>> just started with R this week and have not been able to figure out the
>>> foreach protocol, or any other loop protocol. I tried to subset the
>>> data to do it iteratively with the following code:
>>>
>>>
>>>
>>> subdf<-subset (merged, Wshed = "AC")
>>>
>>>
>>>
>>> but got an error: Error: unexpected input in "subdf=subset (merged, 
>>> Wshed == ""
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thanks,
>>>
>>> Omar Abdelrahman, Biologist II
>>> Miami-Dade County, Department of Regulatory and Economic Resources
>>> Division of Environmental Resources Management (DERM)
>>> Overtown Transit Village
>>> 701 NW 1st Court, 5th Floor
>>> Miami, FL 33136-3912
>>> (305) 372-6872
>>> abd...@miamidade.gov<mailto:abd...@miamidade.gov>
>>> www.miamidade.gov/environment<http://www.miamidade.gov/environment/>
>>>
>>>
>>>     [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

