[R] drop levels problem

2010-11-29 Thread Felipe Carrillo
Hi all:
I am having trouble dropping levels; I found a few hints online, but without success.
Please consider the dataset below.
I was under the impression that subset(..., drop = TRUE) would work, but it doesn't.

library(ggplot2)
library(Hmisc)
library(reshape2)  # for melt(), in case ggplot2 does not attach it

x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056, 
34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894, 
42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766, 
40.8822, 42.0165, 49.2055), third = c(42.24, 42.992, 37.7419, 
42.3448, 41.9131, 44.385, 42.7811, 44.1963, 40.8088, 43.9634, 
38.7079, 38.0791, 44.3136, 39.5333)), .Names = c("first", "second",
"third"), class = "data.frame", row.names = c(NA, -14L))

head(x); str(x)
xmelt <- melt(x)
names(xmelt) <- c("year", "fatPerc")

# Year variable is a factor with three levels
# Subset to plot only the 'first' year
firstyear <- subset(xmelt, year == 'first'); str(firstyear)
# The plot still shows three levels after I made the subset
ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()

# Try to drop the levels, but dropUnusedLevels() doesn't seem to work here
dropUnusedLevels()
ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()

# The code below should also drop levels, but it doesn't
# data.frame(lapply(firstyear, function(x) if (is.factor(x)) factor(x) else x))
str(firstyear)
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] drop levels problem

2010-11-29 Thread Joshua Wiley
Hi Felipe,

On Mon, Nov 29, 2010 at 11:01 AM, Felipe Carrillo
<mazatlanmex...@yahoo.com> wrote:
> Hi all:
> I am having trouble dropping levels, got a few hints online without success.
> Please consider the dataset below:
> I was under the impression that subset(..drop=TRUE) would work but it
> doesn't

Here drop refers to dropping dimensions when indexing, as in:

data.frame(1:10)[, 1]
data.frame(1:10)[, 1, drop = FALSE]

not to dropping unused levels of a factor.
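
As a minimal sketch of this distinction (the toy data frame `d` and its columns are made up for illustration and are not from the original post): in subset(), `drop` only decides whether a one-column result collapses to a plain vector, and the levels of a factor column that is kept are left alone.

```r
# Hypothetical toy data, not the poster's dataset
d <- data.frame(g = factor(c("a", "b", "c")), v = 1:3)

# drop = TRUE collapses a one-column result to a plain vector ...
s1 <- subset(d, g == "a", select = g, drop = TRUE)
is.data.frame(s1)   # FALSE

# ... but unused levels of a factor column in a data frame result are kept
s2 <- subset(d, g == "a")
nlevels(s2$g)       # 3
```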


> library(ggplot2)
> library(Hmisc)
>
> x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
> 46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056,
> 34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894,
> 42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
> 40.8822, 42.0165, 49.2055), third = c(42.24, 42.992, 37.7419,
> 42.3448, 41.9131, 44.385, 42.7811, 44.1963, 40.8088, 43.9634,
> 38.7079, 38.0791, 44.3136, 39.5333)), .Names = c("first", "second",
> "third"), class = "data.frame", row.names = c(NA, -14L))

Thanks for the nice example!


> head(x); str(x)
> xmelt <- melt(x)
> names(xmelt) <- c("year", "fatPerc")
>
> # Year variable is a factor with three levels
> # Subset to plot only 'first' year
> firstyear <- subset(xmelt, year == 'first'); str(firstyear)
> # Plot showing three levels still after I made the subset
> ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()

Right, because a factor can legitimately have levels with no
observations, and sometimes these are the most interesting ones (e.g., if
you subset by smoking and found that there were no instances of lung
cancer in non-smokers; not that extreme, but you get the point).
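
A small sketch of why empty levels can be worth keeping (the factor below is made up for illustration): the level with no observations still shows up in a tabulation, with a count of zero.

```r
# Hypothetical toy factor: no "smoker" observations happen to be present
status <- factor(c("non-smoker", "non-smoker", "non-smoker"),
                 levels = c("non-smoker", "smoker"))

levels(status)            # both levels are still known to the factor
counts <- table(status)   # the empty level is reported with a count of 0
counts
```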


> # Try to drop the levels but dropUnusedLevels() doesn't seem to work here
> dropUnusedLevels()

Sorry, I have had some difficulty installing Hmisc on my Linux system
and have never gotten around to working it out.

> ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()
>
> # code below also should drop levels but it doesn't
> #data.frame(lapply(firstyear, function(x) if (is.factor(x)){ factor(x)}
> else{x}))

It would if you assigned the result back to firstyear.  As written, the
result is just printed to the screen and the changed data goes off to
oblivion.

firstyear <- data.frame(lapply(firstyear, function(x) {
  if (is.factor(x)) factor(x) else x
}))
str(firstyear)  # should now have just one level

Cheers,

Josh



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] drop levels problem

2010-11-29 Thread Henrique Dallazuanna
Take a look at the droplevels() function (R >= 2.12).
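
A minimal sketch of that suggestion, assuming R 2.12 or later and using a made-up stand-in for the melted data frame (`d` is illustrative only): droplevels() returns a copy with unused levels removed, so the result has to be assigned back.

```r
# Hypothetical toy data standing in for the melted data frame
d <- data.frame(year = factor(c("first", "second", "third")),
                fatPerc = c(38.2, 43.3, 42.2))

firstyear <- subset(d, year == "first")
nlevels(firstyear$year)              # 3: unused levels are kept

firstyear <- droplevels(firstyear)   # assign the result back
nlevels(firstyear$year)              # 1
```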


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] drop levels problem

2010-11-29 Thread Felipe Carrillo
Thanks Joshua, I get it now, levels sometimes drive me loco
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA





Re: [R] drop levels problem

2010-11-29 Thread Joshua Wiley
Just to follow up on my own post a bit:

xmelt$year[xmelt$year == "first", drop = TRUE]

will do what you want.  I think the reason is that the subset is a data
frame with multiple columns, not all of which are factors, so the '['
method being used is the data frame one rather than the factor one that
can drop unused levels.  I did not make that clear at all the first time
around (and have probably still butchered it, which some knowledgeable
soul may correct me on).  Also, I did get Hmisc installed, but I think
dropUnusedLevels() does not work in this case for a similar reason.
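
To make the contrast concrete, a minimal sketch with a made-up factor (`year` below is illustrative and not the poster's column): indexing a factor directly accepts `drop = TRUE`, which discards levels that no longer occur.

```r
# Hypothetical toy factor, not the poster's data
year <- factor(c("first", "second", "third", "first"))

kept    <- year[year == "first"]               # levels preserved
dropped <- year[year == "first", drop = TRUE]  # unused levels removed

nlevels(kept)     # 3
nlevels(dropped)  # 1
```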

Henrique's solution is, as usual, the shortest :)

Josh

[snip]
