On 12-02-29 8:16 AM, R. Michael Weylandt wrote:
Factors are internally stored as integers (enums if you have used
other programming languages) with a special label set -- it's more
memory efficient than storing the whole string over and over.

That was one of the original justifications, but character vectors are just as memory efficient these days.

The other justifications are still valid: sometimes you have a vector which only takes on a subset of the possible values it could take, and when you tabulate it, you'd like to see those zero counts. You may also want to control the display order, and a factor allows that.

For example:

x <- c("a", "a", "b")
table(x)
x <- factor(x, levels=c("c", "b", "a"))
table(x)
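
A rough sketch of what the example above shows, and of how it relates to the as.numeric() question quoted further down. The names f and p and the values in p are made up for illustration; they are not from the original data.

```r
x <- c("a", "a", "b")
f <- factor(x, levels = c("c", "b", "a"))

counts <- table(f)         # levels in the order c, b, a; "c" has count 0
codes  <- as.integer(f)    # positions within levels(f): 3 3 2
labels <- as.character(f)  # the original strings: "a" "a" "b"

# Illustrative factor whose labels merely look like numbers.
p <- factor(c("68.0", "72.0", "68.0"))
p_codes  <- as.numeric(p)                # internal codes: 1 2 1
p_values <- as.numeric(as.character(p))  # the values: 68 72 68

print(counts)
print(p_codes)
print(p_values)
```

Converting a factor with as.numeric() returns the integer codes, so going through as.character() first is the usual way to recover the numbers.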

Duncan Murdoch


Michael

On Wed, Feb 29, 2012 at 5:49 AM, Aniruddha Mukherjee
<aniruddha.mukher...@tcs.com>  wrote:
Hello Berend.

Many thanks for your prompt reply; it helped me a lot. One more
thing: if you could please explain it, I would be highly obliged.
Why in my case (i.e. when stringsAsFactors was TRUE by default),
as.numeric(matr1$Pulse_rate)
displays the following
  [1]  4  5  7  5  9  8  6 10  3  2  5  1 10 10
?

Best regards.


From: Berend Hasselman <b...@xs4all.nl>
To: Aniruddha Mukherjee <aniruddha.mukher...@tcs.com>
Cc: R-help <r-help@r-project.org>
Date: 02/29/2012 03:57 PM
Subject: Re: [R] Error occurred during mean calculation of a column of a data frame, which is apparently contents numeric data




On 29-02-2012, at 09:45, Aniruddha Mukherjee wrote:

Hello R people,

How can I compute the mean of the "Pulse_rate" column of the data frame
or matrix from the following character object called "str_got"? It has 14
entries and each entry has 8 values, separated by commas. Please go
through the following R commands to see how I tried to unstring and
unlist the values to form a data frame.
str_got
[1]
"bp,67,2011-12-09T19:59:44.044+05:30,9830576102,68.0,124.0,58.0,66.0"
"bp,67,2011-12-09T20:19:31.031+05:30,9830576102,72.0,133.0,93.0,40.0"
.....

matr<-matrix(unlist(strsplit(str_got, ",")), nrows, byrow=TRUE)

nrows?
I assume this was set somewhere in your script and not shown.
Is it length(str_got)?

matr
        [,1]   [,2]                                              [,3]
       [,4]               [,5]        [,6]       [,7]       [,8]
[1,] "bp" "67"    "2011-12-09T19:59:44.044+05:30" "9830576102" "68.0"
......

Note column names must be inserted before computing the desired mean
value.
matr1<-as.data.frame(matr)

Use matr1<- as.data.frame(matr, stringsAsFactors=FALSE)

If you don't set stringsAsFactors=FALSE the column will be a factor, and
that is not equivalent to numeric.

What's wrong with

matr1$Pulse_rate<- as.numeric(matr1$Pulse_rate)

Then you can calculate the desired mean with

mean(matr1$Pulse_rate)

or

mean(matr1[,"Pulse_rate"])
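
Put together, the steps above might look like the following minimal sketch. It uses only the two records shown in the post (the real object has 14), takes the row count from length(str_got), and ASSUMES the pulse rate is the last of the 8 values, since the post does not show the column names.

```r
# Two sample records copied from the post.
str_got <- c(
  "bp,67,2011-12-09T19:59:44.044+05:30,9830576102,68.0,124.0,58.0,66.0",
  "bp,67,2011-12-09T20:19:31.031+05:30,9830576102,72.0,133.0,93.0,40.0"
)

# Split each record on commas and fill a character matrix row by row.
matr <- matrix(unlist(strsplit(str_got, ",")),
               nrow = length(str_got), byrow = TRUE)

# Keep the strings as character columns so that no factor is created.
matr1 <- as.data.frame(matr, stringsAsFactors = FALSE)

# ASSUMPTION: column 8 holds the pulse rate; chosen only for illustration.
names(matr1)[8] <- "Pulse_rate"

# Character -> numeric conversion now yields the values themselves.
matr1$Pulse_rate <- as.numeric(matr1$Pulse_rate)
pulse_mean <- mean(matr1$Pulse_rate)
print(pulse_mean)
```

In R 4.0.0 and later, stringsAsFactors already defaults to FALSE, so passing it explicitly matters mainly for older versions.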

Berend




______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
