Re: [R] remove duplicated row according to NA condition

2014-05-29 Thread jeff6868
Yes, this is the right one, Arun! Thank you very much.
I tried each solution, but yours was the best. It works well.
Thanks to everyone for your replies!






--
View this message in context: 
http://r.789695.n4.nabble.com/remove-duplicated-row-according-to-NA-condition-tp4691362p4691422.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] remove duplicated row according to NA condition

2014-05-28 Thread arun


Hi,
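Maybe this helps:
# Sort so that, within each (col1, col2) pair, rows with a non-NA col3 come first
data1 <- data[with(data, order(col1, col2, is.na(col3))), ]
# Keep only the first row of each (col1, col2) pair
data1[!duplicated(data1[, 1:2]), ]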
A.K.
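
For reference, here is a self-contained sketch of the approach above, run on the example data from the original question. The names `sorted` and `result` are illustrative and do not come from the thread, and `stringsAsFactors = FALSE` is an assumption added so that the sketch behaves the same way on older and newer R versions.

```r
# Example data from the original question
data <- data.frame(col1 = c("a", "a", "b", "b"),
                   col2 = c(1, 1, 2, 2),
                   col3 = c(NA, "ST001", "ST002", NA),
                   stringsAsFactors = FALSE)  # assumption, see note above

# Sort by the key columns; within each key, is.na() is FALSE for
# non-NA rows and FALSE sorts before TRUE, so non-NA rows come first
sorted <- data[order(data$col1, data$col2, is.na(data$col3)), ]

# duplicated() flags every row after the first one of each key, so
# negating it keeps the first (non-NA, when one exists) row per key
result <- sorted[!duplicated(sorted[, c("col1", "col2")]), ]
print(result)
```

On this input the sketch keeps original rows 2 and 3, that is, the rows with col3 equal to "ST001" and "ST002".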


On Wednesday, May 28, 2014 11:28 AM, jeff6868 wrote:
Hi everybody,

I have a small problem in my R code which seems to be easy to solve, but I
haven't been able to find the solution myself so far.

Here's an example of the form of my data:

data <-
data.frame(col1=c("a","a","b","b"),col2=c(1,1,2,2),col3=c(NA,"ST001","ST002",NA))

I would like to remove duplicated data based on the first two columns
(col1,col2), but in both cases here, I would like to remove the duplicated
row which is equal to NA in col3.

Here's the data.frame I would like to obtain:

data2 <- data.frame(col1=c("a","b"),col2=c(1,2),col3=c("ST001","ST002"))

I've been trying to mix duplicated() with is.na() but it doesn't work yet.

Can someone tell me the best and easiest way to do this?

Thanks a lot!









Re: [R] remove duplicated row according to NA condition

2014-05-28 Thread William Dunlap
It would help if you said what you want done when none, all, or some
of the col1-col2 duplicates have NAs in col3.  E.g., what do you
want the function to do for the following input?

> data2 <- data.frame(col1=c("a","a","a","b","b","c","c","d","d","e"),
+                     col2=c(1,1,1,2,2,3,3,4,4,5),
+                     col3=c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA))
> data2
   col1 col2 col3
1     a    1   A1
2     a    1 <NA>
3     a    1   A3
4     b    2 <NA>
5     b    2   B2
6     c    3   C1
7     c    3   C2
8     d    4 <NA>
9     d    4 <NA>
10    e    5 <NA>

(You may want it to return a data.frame or you may want the function
to stop because the data is not considered legal, but you should
decide what it should do.)
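
One way to make the question concrete is to run the sort-then-duplicated() approach posted elsewhere in this thread on this input and look at what it keeps. This is only a sketch of one possible policy, not a statement of what the original poster wants; the names `sorted` and `result` are illustrative, and `stringsAsFactors = FALSE` is an added assumption.

```r
# Edge-case input from the message above
data2 <- data.frame(col1 = c("a", "a", "a", "b", "b", "c", "c", "d", "d", "e"),
                    col2 = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 5),
                    col3 = c("A1", NA, "A3", NA, "B2", "C1", "C2", NA, NA, NA),
                    stringsAsFactors = FALSE)  # assumption, not in the thread

# Non-NA rows first within each (col1, col2) key; ties keep input order
sorted <- data2[order(data2$col1, data2$col2, is.na(data2$col3)), ]

# Keep the first row of each key
result <- sorted[!duplicated(sorted[, c("col1", "col2")]), ]
print(result)
```

Under this policy a key with several non-NA values keeps only the first of them (A1 for "a", C1 for "c"), and a key whose rows are all NA (d, e) keeps a single NA row. Whether that is the desired behaviour is exactly the decision being asked for here.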

Bill Dunlap
TIBCO Software
wdunlap tibco.com




Re: [R] remove duplicated row according to NA condition

2014-05-28 Thread K. Elo
Hi!

How about trying this:

data[ data$col1!=data$col2 & !is.na(data$col3), ]

  col1 col2  col3
2    a    1 ST001
3    b    2 ST002

HTH, Kimmo
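
A note on the filter above: for the example data the `col1 != col2` comparison appears to be TRUE for every row, so the expression amounts to keeping the rows where col3 is not NA. A hedged variant that ties the NA test to the duplicate test is sketched below; the names `key`, `has_twin` and `kept` are hypothetical, and `stringsAsFactors = FALSE` is an added assumption.

```r
# Example data from the original question
data <- data.frame(col1 = c("a", "a", "b", "b"),
                   col2 = c(1, 1, 2, 2),
                   col3 = c(NA, "ST001", "ST002", NA),
                   stringsAsFactors = FALSE)  # assumption, not in the thread

key <- data[, c("col1", "col2")]

# TRUE for every row whose (col1, col2) key occurs more than once:
# the first call misses the first occurrence, fromLast = TRUE catches it
has_twin <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Drop a row only if col3 is NA and its key is duplicated
kept <- data[!(is.na(data$col3) & has_twin), ]
print(kept)
```

Unlike the plain `!is.na()` filter, this keeps an NA row whose key is unique. It still removes every row of a duplicated key when all of its col3 values are NA, and it keeps several rows when a key has more than one non-NA value, so it is not a general deduplication.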




[R] remove duplicated row according to NA condition

2014-05-28 Thread jeff6868
Hi everybody,

I have a small problem in my R code which seems to be easy to solve, but I
haven't been able to find the solution myself so far.

Here's an example of the form of my data:

data <-
data.frame(col1=c("a","a","b","b"),col2=c(1,1,2,2),col3=c(NA,"ST001","ST002",NA))

I would like to remove duplicated data based on the first two columns
(col1,col2), but in both cases here, I would like to remove the duplicated
row which is equal to NA in col3.

Here's the data.frame I would like to obtain:

data2 <- data.frame(col1=c("a","b"),col2=c(1,2),col3=c("ST001","ST002"))

I've been trying to mix duplicated() with is.na() but it doesn't work yet.

Can someone tell me the best and easiest way to do this?

Thanks a lot!






