Re: [R] Selecting names with regard to visit frequency
Hi Michael,

It is not clear how you read the dataset. It looks like a data frame.

df1 <- read.table(text='
"","x"
"A1",2
"A2",5
"A3",4
"A4",6
"A5",24
"A6",7
"A7",12
"A8",3
"A9",5
', sep=",", header=TRUE, row.names=1)
vec1 <- unlist(df1)
names(vec1) <- row.names(df1)
names(vec1)[vec1 %in% 3:5]
#[1] "A2" "A3" "A8" "A9"
names(vec1)[!is.na(match(vec1, 3:5))]
#[1] "A2" "A3" "A8" "A9"
names(vec1)[vec1 >= 3 & vec1 <= 5]
#[1] "A2" "A3" "A8" "A9"

A.K.

From: michael steele real.steele...@gmail.com
To: smartpink...@yahoo.com
Sent: Tuesday, July 23, 2013 11:30 AM
Subject: Re: Selecting names with regard to visit frequency

Hi A.K.,

Sorry for the confusion. The first option worked. I can't give out the actual data, but I can give something with a similar structure. Output from write.csv(myvector, file="copy") is attached. A1, A2, ... represent unique identification codes for individuals. Basically, we are interested in finding individuals whose number of visits falls within a certain range. It was easy enough to find those who visited the most and the least, but not those somewhere in the middle. Your first option worked. I had tried something similar (I don't remember exactly what), but I must have missed something simple.

Thanks
steele

From: smartpink...@yahoo.com
Date: Mon, Jul 22, 2013 at 11:05 PM
Subject: Re: Selecting names with regard to visit frequency

Hi Steele,

Could you provide a reproducible example in which Options 2 and 3 return character(0)? It would be better to use dput() (see ?dput). I am not sure I understand you correctly. Did you mean that none of the options worked, or that all of them failed except option 1? Your comment about practicality is also not clear. Thanks.

quote author='m.steele'
Thanks A.K.,

I actually tried something similar to option 1, but it seems I missed something simple. Options 2 and 3 do not work; they return:

character(0)

It may make a difference that myvector already exists; it displays in the form I provided. Recreating that vector as in your solution may not be practical, as there are close to 1000 names with corresponding visits.
Thanks again; this has been very helpful and I look forward to learning more.

steele
/quote
Quoted from: http://r.789695.n4.nabble.com/Selecting-names-with-regard-to-visit-frequency-tp4672074p4672087.html

_
Sent from http://r.789695.n4.nabble.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
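For the case where the data already exist and cannot be retyped, here is a minimal sketch of reading the write.csv() output back in and selecting by range directly on the data frame. The values reuse the toy example from this thread; the temporary file and the object names are placeholders for illustration, not the poster's actual data or code.

```r
# Toy stand-in for the existing named vector (values from this thread).
myvector <- c(A1 = 2, A2 = 5, A3 = 4, A4 = 6, A5 = 24,
              A6 = 7, A7 = 12, A8 = 3, A9 = 5)

# Round trip through a CSV file, as with write.csv(myvector, file = ...).
tmp <- tempfile(fileext = ".csv")
write.csv(myvector, file = tmp)
df1 <- read.csv(tmp, row.names = 1)   # first column becomes the row names
unlink(tmp)

# Select directly on the data frame: no need to rebuild a vector.
visits <- df1[[1]]
selected <- rownames(df1)[visits >= 3 & visits <= 5]

# Equivalent selection on a named vector built from the data frame.
vec1 <- setNames(visits, rownames(df1))
selected_vec <- names(vec1)[vec1 >= 3 & vec1 <= 5]

print(selected)
```

The same indexing works unchanged for about 1000 rows, since nothing is typed in by hand.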
Re: [R] Selecting names with regard to visit frequency
Hi Michael,

It could be due to some extra space. If you use read.table(..., fill=TRUE), it should read the text, but then there would be missing values. Using dput() (see ?dput) would be better.

dput(df1)
structure(list(x = c(2L, 5L, 4L, 6L, 24L, 7L, 12L, 3L, 5L)), .Names = "x", class = "data.frame", row.names = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))

Now try the code after assigning that output:

df1 <- structure(list(x...

It wouldn't work with decimals, because here:

3:5
#[1] 3 4 5
# it matches only values that are exactly 3, 4, or 5

Trying this on another dataset:

df2 <- structure(list(x = c(2, 5, 4.4, 6, 24, 7, 12, 3.6, 5)), .Names = "x", class = "data.frame", row.names = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))
vec2 <- unlist(df2)
names(vec2) <- row.names(df2)
vec2
#  A1   A2   A3   A4   A5   A6   A7   A8   A9
# 2.0  5.0  4.4  6.0 24.0  7.0 12.0  3.6  5.0
names(vec2)[vec2 %in% 3:5]  # incorrect
#[1] "A2" "A9"
names(vec2)[vec2 %in% seq(3, 5, by=0.1)]
#[1] "A2" "A3" "A8" "A9"

# If I change
vec2[3] <- 4.46
names(vec2)[vec2 %in% seq(3, 5, by=0.1)]
#[1] "A2" "A8" "A9"
names(vec2)[round(vec2, 1) %in% seq(3, 5, by=0.1)]
#[1] "A2" "A3" "A8" "A9"
names(vec2)[vec2 >= 3 & vec2 <= 5]  # should be better in such cases
#[1] "A2" "A3" "A8" "A9"

It is also worth checking R FAQ 7.31.

A.K.

Hi Arun,

Perhaps these are data frames I am working with, and I have mistaken them for vectors (I am still very new at this and learning the data structures). I tried to read the text in as you have it here (copied and pasted), but it did not work:

Error in read.table(text = "\n\"\",\"x\" \n\"A1\",2 \n\"A2\",5 \n\"A3\",4 \n\"A4\",6 \n\"A5\",24 \n\"A6\",7 \n\"A7\",12 \n\"A8\",3 \n\"A9\",5 \n", :
  more columns than column names

I retried both:

names(vec1)[vec1 %in% 3:5]
names(vec1)[!is.na(match(vec1, 3:5))]

before and after converting my current data frame to a vector, but I get a NULL return. I also get a NULL return if I unlist the data frame and try to execute:

names(vec1)[vec1 >= 3 & vec1 <= 5]

All three do work if I keep the data frame in its original form, instead of using:

vec1 <- unlist(df1)
names(vec1) <- row.names(df1)

I discovered another issue, however.
I am working with a couple of datasets. One of them has whole numbers; the other has percentages in place of visits, such as:

A1,0.2
A2,0.5
...

The two options:

names(vec1)[vec1 %in% 3:5]
names(vec1)[!is.na(match(vec1, 3:5))]

do not seem to work with ranges given in decimals (and that is probably what I originally tested them on), but are fine with whole numbers.

Thanks,
steele
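To illustrate the decimal issue discussed above, here is a small sketch contrasting exact matching with a range comparison on fractional values. The vector pct and its values are made up for illustration; only %in%, the comparison operators, and all.equal() are standard R.

```r
# Hypothetical fractional data in the same shape as the percentages above.
pct <- c(A1 = 0.2, A2 = 0.5, A3 = 0.44, A4 = 0.6, A5 = 0.36)

# %in% (and match) test exact equality against a fixed set of values,
# so anything between the listed values is missed.
exact <- names(pct)[pct %in% c(0.3, 0.4, 0.5)]

# A range comparison keeps everything inside the closed interval.
by_range <- names(pct)[pct >= 0.3 & pct <= 0.5]

# R FAQ 7.31: exact equality on computed doubles can fail,
# while all.equal() compares within a tolerance.
fp_equal <- (0.1 + 0.2) == 0.3
fp_near  <- isTRUE(all.equal(0.1 + 0.2, 0.3))

print(exact)
print(by_range)
```

Because seq(3, 5, by=0.1) is itself computed in floating point, matching against it with %in% can silently miss values, which is one more reason to prefer the comparison form for non-integer data.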
Re: [R] Selecting names with regard to visit frequency
Hi,

myvector <- c(3, 2, 7, 4, 1)
names(myvector) <- paste0("name", 1:5)
names(myvector)[myvector >= 3 & myvector <= 5]
#[1] "name1" "name4"
# or
names(myvector)[myvector %in% 3:5]
#[1] "name1" "name4"
# or
names(myvector)[!is.na(match(myvector, 3:5))]
#[1] "name1" "name4"

A.K.

Hello all,

I am new to R but trying to learn. I have a vector of names with visit frequencies (myvector) in the form:

name1 name2 name3 name4 name5
    3     2     7     4     1

I can select the names of patients who have visited more often:

frequent.pats <- names(myvector)[myvector > 5]

or those who have visited less often, but how could I get the names of those who visited, say, between 3 and 5 times?

Thanks
steele
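If the range selection is needed repeatedly, the comparison form can be wrapped in a small function. This is only a sketch: names_in_range is a hypothetical helper name, not something from base R or from this thread, and it assumes a named numeric vector and a closed interval.

```r
# Hypothetical helper: names of the elements of x whose value lies
# in the closed interval [lo, hi]. NA values are dropped.
names_in_range <- function(x, lo, hi) {
  stopifnot(is.numeric(x), !is.null(names(x)))
  names(x)[!is.na(x) & x >= lo & x <= hi]
}

myvector <- c(name1 = 3, name2 = 2, name3 = 7, name4 = 4, name5 = 1)
result <- names_in_range(myvector, 3, 5)
print(result)
```

Unlike %in% 3:5, this also behaves sensibly for non-integer values, because it tests an interval rather than a list of exact values.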