Re: [R] How to count rows with a condition

2012-10-18 Thread arun

Hi,
I tried the code with an unsorted ac_name column and it worked, so I
couldn't identify the exact problem.  If you can provide a subset of
your dataset using dput() (see ?dput), that would be very helpful.

library(plyr)  # assumed: count() returning $x and a frequency column is plyr::count
set.seed(1)
dat1 <- data.frame(ac_name=sample(c("HouseA","HouseB","HouseC","HouseD","HouseE","HouseF","HouseG","HouseI","HouseJ"),50,replace=TRUE),val=rnorm(50,15))

dat2 <- within(dat1,{ac_name <- as.character(ac_name)})
dat2 <- dat2[order(dat2[,1]),]
dat3 <- dat2[dat2[,1]%in%count(dat2[,1])$x[count(dat2[,1])[2]>5],]  # data excluded
dat4 <- dat2[!dat2[,1]%in%count(dat2[,1])$x[count(dat2[,1])[2]>5],] # data included
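
For readers without plyr, the same split can be sketched in base R with table(). This is only an illustration on made-up data that mirrors the example in the question (House A 8 times, House B 5 times); the names limit, keep, dat_included, dat_excluded and excluded_names are hypothetical.

```r
# Deterministic toy data: HouseA appears 8 times, HouseB 5 times, HouseC 3 times.
dat <- data.frame(ac_name = rep(c("HouseA", "HouseB", "HouseC"), times = c(8, 5, 3)),
                  val = seq_len(16),
                  stringsAsFactors = FALSE)

limit <- 5
freq <- table(dat$ac_name)                  # occurrences of each name
too_common <- names(freq)[freq > limit]     # names that exceed the limit

keep <- !(dat$ac_name %in% too_common)
dat_included <- dat[keep, , drop = FALSE]   # names seen at most 'limit' times
dat_excluded <- dat[!keep, , drop = FALSE]  # names seen more than 'limit' times
excluded_names <- sort(unique(dat_excluded$ac_name))
```

Because table() counts every distinct value in one pass, the number of different names (about 1,000 in the question) should not matter for this approach.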
A.K. 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




[R] How to count rows with a condition

2012-10-17 Thread fxen3k
Hi,

I have a dataset called "data". There is one column called "ac_name". Some
names in this column appear very often, some less.
What I want is to filter this dataset with the following condition:

Exclude the names that appear more than five times (example: House A
appears 8 times => exclude it; House B appears 5 times => include it, etc.).

In the end, I want to have the old "data" dataset without the rows that meet
the above-mentioned condition, plus another list with all the names that have
been excluded.


I think for one of the professionals amongst you this is pretty easy to
solve. ;-)

Thanks dudes!

Cheerio,
Felix



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-count-rows-with-a-condition-tp4646454.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to count rows with a condition

2012-10-17 Thread fxen3k
Thanks for the first reply.

Unfortunately, my list of different ac_names is pretty long (about 1,000
different names). Is there a way to sort them, count the occurrences of each
name, and exclude the rows whose names exceed a particular limit?



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-count-rows-with-a-condition-tp4646454p4646465.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to count rows with a condition

2012-10-17 Thread William Dunlap
One way is:
  ac_name_count <- ave(integer(nrow(data)), data[["ac_name"]], FUN=length)
  data[ac_name_count <= 5, ,drop=FALSE] # rows whose ac_name entry is rare
  data[ac_name_count > 5, ,drop=FALSE]  # rows whose ac_name entry is common
Use
  ac_name_seqno <- ave(integer(nrow(data)), data[["ac_name"]], FUN=seq_along)
to assign a within-group sequence number so you can pick out the first or last
n items in a group for the big groups.
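
As a worked illustration of the two ave() calls above, here is a small sketch on made-up data. The data frame contents and the names rare, common, excluded_names and first5 are hypothetical and not from the original post.

```r
# Toy data: HouseA appears 8 times, HouseB 5 times, HouseC 3 times.
data <- data.frame(ac_name = rep(c("HouseA", "HouseB", "HouseC"), times = c(8, 5, 3)),
                   val = 1:16,
                   stringsAsFactors = FALSE)

# Per-row group size: every row gets the number of rows sharing its ac_name.
ac_name_count <- ave(integer(nrow(data)), data[["ac_name"]], FUN = length)

rare   <- data[ac_name_count <= 5, , drop = FALSE]  # rows to keep
common <- data[ac_name_count >  5, , drop = FALSE]  # rows to exclude
excluded_names <- unique(common$ac_name)            # the list of excluded names

# Within-group sequence number: 1, 2, 3, ... restarting for each ac_name.
ac_name_seqno <- ave(integer(nrow(data)), data[["ac_name"]], FUN = seq_along)

# Keep at most the first 5 rows of every group instead of dropping big groups.
first5 <- data[ac_name_seqno <= 5, , drop = FALSE]
```

Here rare holds the 8 rows of HouseB and HouseC, excluded_names is "HouseA", and first5 has 13 rows (5 + 5 + 3).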

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com





Re: [R] How to count rows with a condition

2012-10-17 Thread David Winsemius


On Oct 17, 2012, at 5:44 AM, fxen3k wrote:

> Hi,
>
> I have a dataset called data. There is one row called ac_name. Some
> names in this column appear very often, some less.
> What I want is to filter this dataset with the following condition:
>
> Exclude the names, which appear more than five times. (example: House A
> appears 8 times => exclude it; House B appears 5 times => include it etc.)
>
> In the end, I want to have the old data dataset excluding the rows with
> the above mentioned condition and another list with all the names which have
> been excluded.

data[ ave(data$ac_name, data$ac_name, length) <= 5, ]  # all with 5 or fewer entries


--

David Winsemius, MD
Alameda, CA, USA



Re: [R] How to count rows with a condition

2012-10-17 Thread arun
Hi David,

I tried your code:
set.seed(1)
dat1 <- data.frame(ac_name=rep(c("HouseA","HouseB","HouseC","HouseD","HouseE"),times=c(8,5,4,6,3)),val=rnorm(26,15))
dat2 <- within(dat1,{ac_name <- as.character(ac_name)})
dat2 <- dat2[order(dat2[,1]),]

dat2[ave(dat2$ac_name,dat2$ac_name,length)<=5,]
#Error in unique.default(x) : unique() applies only to vectors
#With FUN added
head(dat2[ave(dat2$ac_name,dat2$ac_name,FUN=length)<=5,])
#   ac_name  val
#9   HouseB 15.57578
#10  HouseB 14.69461
#11  HouseB 16.51178
#12  HouseB 15.38984
#13  HouseB 14.37876
#14  HouseC 12.78530
A.K.









Re: [R] How to count rows with a condition

2012-10-17 Thread William Dunlap
data[ ave(data$ac_name, data$ac_name, length) <= 5, ]
fails for two reasons:
  a) you need to label the FUN argument, FUN=length, since there
  is a ... in the middle of ave's argument list to catch all the grouping
  arguments
  b) the type of the first argument to ave needs to be compatible with
  the type of the return value of FUN().  If ac_name is a factor
  you get NA's and warnings; if it is character, the <= 5 starts using
  character order instead of numerical order, leading to incorrect results
  because "11" < "5":

> data <- data.frame(ac_name=rep(c("Amos","Boris","Charlotte"),c(3,8,11)), n=101:122, stringsAsFactors=FALSE)
> data[ ave(data$ac_name, data$ac_name, FUN=length) <= 5, ]
     ac_name   n
1       Amos 101
2       Amos 102
3       Amos 103
12 Charlotte 112
13 Charlotte 113
... [ rows elided ] ...
22 Charlotte 122
> data <- data.frame(ac_name=rep(c("Amos","Boris","Charlotte"),c(3,8,11)), n=101:122, stringsAsFactors=TRUE)
> data[ ave(data$ac_name, data$ac_name, FUN=length) <= 5, ]
      ac_name  n
NA       <NA> NA
NA.1     <NA> NA
NA.2     <NA> NA
... [rows elided] ...
NA.21    <NA> NA
Warning messages:
1: In `[<-.factor`(`*tmp*`, i, value = 3L) :
  invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, i, value = 8L) :
  invalid factor level, NAs generated
3: In `[<-.factor`(`*tmp*`, i, value = 11L) :
  invalid factor level, NAs generated
4: In Ops.factor(ave(data$ac_name, data$ac_name, FUN = length), 5) :
  '<=' not meaningful for factors

That is why I made the first argument integer:

> data[ ave(integer(nrow(data)), data$ac_name, FUN=length) <= 5, ]
  ac_name   n
1    Amos 101
2    Amos 102
3    Amos 103
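
The character-order pitfall in (b) can be isolated in a few lines. This is a sketch reusing the Amos/Boris/Charlotte data from above; the names chr_count, int_count, wrong and right are hypothetical.

```r
data <- data.frame(ac_name = rep(c("Amos", "Boris", "Charlotte"), c(3, 8, 11)),
                   n = 101:122, stringsAsFactors = FALSE)

# Character first argument: ave() stores the group sizes as strings.
chr_count <- ave(data$ac_name, data$ac_name, FUN = length)
# Integer first argument: the group sizes stay integer.
int_count <- ave(integer(nrow(data)), data$ac_name, FUN = length)

# Comparing a string with 5 coerces 5 to "5" and compares as text,
# so "11" <= "5" holds and Charlotte (11 rows) slips through.
wrong <- unique(data$ac_name[chr_count <= 5])
right <- unique(data$ac_name[int_count <= 5])
```

With the character counts, wrong contains both "Amos" and "Charlotte", matching the first output above; with the integer counts, right contains only "Amos".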
  

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


