Hello Bert. I didn't reply to the list because I forgot: I hit "reply" instead of "reply all".
Thanks for your example. I now understand that I was trying to do something
that didn't make sense, and that was why it failed. I should have used a
histogram to graph the frequency of each number of 'posts' instead of going
the convoluted way around and trying to do a scatterplot.

I now understand that table() transforms each value of the variable into a
"factor" level and counts how many times it shows up. It makes sense that
these values are stored as a "factor" rather than as a number in the data
frame, because they are no longer quantities but labels for each distinct
number.

Thanks for the help. Problem solved.

António Brito Camacho

On 26/05/2013, at 15:00, Bert Gunter <gunter.ber...@gene.com> wrote:

> 1. Please always cc. the list; do not reply just to me.
>
> 2. OK, I see. I ERRED. Had you cc'ed the list, someone might have
> pointed this out. The correct example reproduces what you saw.
>
> z<- sample(1:10,30,rep=TRUE)
> table(z)
> w <- data.frame(table(z))
> w
>
>     z Freq
> 1   1    2
> 2   2    3
> 3   3    1
> 4   4    3
> 5   5    5
> 6   6    3
> 7   7    5
> 8   8    4
> 9   9    1
> 10 10    3
>
> > sapply(w,class)
>         z      Freq
>  "factor" "integer"
>
> This is exactly what is expected and documented. See ?table. So the
> question is: What do you expect? table() produces an array whose
> cross-classifying factors are the dimensions. data.frame converts this
> into a data frame.
> Perhaps the following will help clarify:
>
> > z <- data.frame(fac1= sample(LETTERS[1:3],10,rep=TRUE),
>     fac2 = sample(c("j","k"),10,rep=TRUE))
> > z
>    fac1 fac2
> 1     A    k
> 2     B    k
> 3     C    k
> 4     C    k
> 5     B    k
> 6     C    k
> 7     C    k
> 8     A    j
> 9     A    j
> 10    C    j
>
> > table(z)
>
>     fac2
> fac1 j k
>    A 2 1
>    B 0 2
>    C 1 4
>
> > data.frame(table(z))
>
>   fac1 fac2 Freq
> 1    A    j    2
> 2    B    j    0
> 3    C    j    1
> 4    A    k    1
> 5    B    k    2
> 6    C    k    4
>
> > table(z['fac1'])
>
> A B C
> 3 2 5
>
> > data.frame(table(z['fac1']))
>   Var1 Freq
> 1    A    3
> 2    B    2
> 3    C    5
>
> Cheers,
> Bert
>
> On Sat, May 25, 2013 at 6:54 PM, António Camacho <toin...@gmail.com> wrote:
>> Hello Bert
>> Thanks for your prompt reply.
>> I tried your example and it worked without a problem.
>>
>> But what I want is to create a data frame from the output of the function
>> table(), so in your example I tried "sapply(data.frame(tbl),class)" and the
>> output was z --> factor and Freq --> integer.
>> What is happening in the table() function that is transforming the integers
>> in z into values with labels?
>> Because when I do "names(tbl)" it returns each value of z as a name....
>>
>> I read the manual for "[" but I didn't understand it completely. I have to
>> read the introduction to R more carefully.
>>
>> I also tried using "[,", "[[" and "$" for the extraction of the values from
>> the 'posts' column, but the problem persisted.
>>
>> As I said, this code was taken from an example on a webpage. I contacted
>> the author and he confirmed that the code worked on his machine, which was
>> running R 2.15.1....
>> Maybe something changed between versions in data.frame()?
>>
>> I really don't understand what I am doing wrong.
>>
>> António
>>
>> On 2013/05/26, at 01:44, Bert Gunter wrote:
>>
>>> Huh?
>>>
>>> > z <- sample(1:10,30,rep=TRUE)
>>> > tbl <- table(z)
>>> > tbl
>>>
>>> z
>>>  1  2  3  4  5  6  7  8  9 10
>>>  4  3  2  6  3  3  2  2  2  3
>>> >
>>> > data.frame(z)
>>>
>>>     z
>>> 1   5
>>> 2   2
>>> 3   4
>>> 4   1
>>> 5   6
>>> 6   4
>>> 7  10
>>> 8   4
>>> 9   3
>>> 10  8
>>> 11 10
>>> 12  4
>>> 13  3
>>> 14  9
>>> 15  2
>>> 16  2
>>> 17  6
>>> 18  1
>>> 19  4
>>> 20  7
>>> 21  9
>>> 22 10
>>> 23  7
>>> 24  5
>>> 25  5
>>> 26  6
>>> 27  8
>>> 28  1
>>> 29  1
>>> 30  4
>>> >
>>> > sapply(data.frame(z),class)
>>>
>>>         z
>>> "integer"
>>>
>>> Your error: you used df['posts'] . You should have used df[,'posts'] .
>>>
>>> The former is a data frame. The latter is a vector. Read the
>>> "Introduction to R tutorial" or ?"[" if you don't understand why.
>>>
>>> -- Bert
>>>
>>> On Sat, May 25, 2013 at 12:36 PM, António Camacho <toin...@gmail.com>
>>> wrote:
>>>>
>>>> Hello
>>>>
>>>> I am a novice to R and I was learning how to do a scatter plot with R
>>>> using an example from a website.
>>>>
>>>> My setup is an iMac with Mac OS X 10.8.3, with R 3.0.1, default install,
>>>> without additional packages loaded.
>>>>
>>>> I created a .csv file in vim with the following content:
>>>> userID,user,posts
>>>> 1,user1,581
>>>> 2,user2,281
>>>> 3,user3,196
>>>> 4,user4,150
>>>> 5,user5,282
>>>> 6,user6,184
>>>> 7,user7,90
>>>> 8,user8,74
>>>> 9,user9,45
>>>> 10,user10,20
>>>> 11,user11,3
>>>> 12,user12,1
>>>> 13,user13,345
>>>> 14,user14,123
>>>>
>>>> I imported the file into R using: df <- read.csv('file.csv')
>>>> To confirm the data types I did: sapply(df, class)
>>>> which returns "userID" --> "integer"; "user" --> "factor"; "posts" -->
>>>> "integer".
>>>> Then I tried to create another data frame with the number of posts and
>>>> its frequencies,
>>>> so I did: postFreqCount<-data.frame(table(df['posts']))
>>>> This gives me the postFreqCount data frame with two columns, one called
>>>> 'Var1' that has the number of posts each user made, and another column
>>>> 'Freq' with the frequency of each number of posts.
>>>> The problem is that if I do: sapply(postFreqCount['Var1'],class) it
>>>> returns "factor".
>>>> So the data.frame() function transformed a variable that was "integer"
>>>> (posts) to a variable (Var1) that has the same values but is "factor".
>>>> I want to know how to prevent this from happening. How do I keep the
>>>> values from being transformed from "integer" to "factor"?
>>>>
>>>> Thank you for your help
>>>>
>>>> António
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> --
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>> Internal Contact Info:
>>> Phone: 467-7374
>>> Website:
>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
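
For readers of the archive, here is a minimal sketch of the point resolved in this thread. It assumes the 'posts' values from the CSV in the original question are typed in directly; the names freq, posts_int and codes are made up for illustration and do not appear in the thread. table() turns each distinct value into a factor level, and the original integers can be recovered by converting through as.character() first.

```r
# 'posts' values copied from the CSV in the original question
posts <- c(581L, 281L, 196L, 150L, 282L, 184L, 90L,
           74L, 45L, 20L, 3L, 1L, 345L, 123L)

# table() counts each distinct value; data.frame() keeps the values as a factor
freq <- data.frame(table(posts))
class(freq$posts)        # "factor"

# Recover the integers by going through the character labels
freq$posts_int <- as.integer(as.character(freq$posts))
class(freq$posts_int)    # "integer"

# Pitfall: as.integer() applied directly to a factor returns the level codes
# (1, 2, 3, ...), not the original values
codes <- as.integer(freq$posts)
```

The as.character() step matters because a factor stores its values as integer codes that point into a vector of labels; skipping it silently replaces 581, 281, ... with their rank positions.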