Re: [R] Fast Removing Duplicates from Every Column

Bert Jacobs Wed, 17 Jan 2007 14:51:51 -0800

Hi,


Working further on this data frame, my_data:

          Col1  Col2  Col3  ...  Col 159  Col 160
 Row 1    0     0     LD    ...  0        VD
 Row 2    HD    0     0     ...  0        MD
 Row 3    0     HD    HD    ...  0        LD
 Row 4    LD    HD    HD    ...  0        0
 ...      ...
 LastRow  HD    HD    LD    ...  0        MD

 

Running this line of code:

Test = apply(X=my_data, MARGIN=2, FUN=unique)

 

I get this list:

 

$Col1
[1] "0" "HD" "LD"

$Col2
[1] "0" "HD"

$Col3
[1] "LD" "0" "HD"

...

$Col159
[1] "0"

$Col160
[1] "VD" "MD" "LD" "0"

 

Now I was wondering how I can turn this list into a data.frame, because
simply calling data.frame() on it doesn't work (error: arguments imply
differing number of rows).

Can someone help me out with this? Thx

 

So that I get the following result:

          Col1  Col2  Col3  ...  Col 159  Col 160
 Row 1    0     0     LD    ...  0        VD
 Row 2    HD    HD    0     ...  0        MD
 Row 3    LD    0     HD    ...  0        LD
 Row 4    0     0     0     ...  0        0
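
One possible way to get from the list of unique values to a table like the one above is to pad each vector with "0" up to the length of the longest one before building the data frame. This is only a sketch: the toy data below is a hypothetical four-column stand-in for my_data (based on the rows shown in the post), and using "0" as the filler is an assumption read off the desired result.

```r
# Toy stand-in for my_data (assumption: all columns are character).
my_data <- data.frame(
  Col1   = c("0",  "HD", "0",  "LD", "HD"),
  Col2   = c("0",  "0",  "HD", "HD", "HD"),
  Col3   = c("LD", "0",  "HD", "HD", "LD"),
  Col160 = c("VD", "MD", "LD", "0",  "MD"),
  stringsAsFactors = FALSE
)

# lapply() always returns a list, whereas apply() can simplify to a
# matrix when all columns have the same number of unique values.
uniq <- lapply(my_data, unique)

# Length of the longest vector of unique values.
n <- max(sapply(uniq, length))

# Pad the shorter vectors with "0" so that all have length n.
padded <- lapply(uniq, function(x) c(x, rep("0", n - length(x))))

# Equal-length vectors can now be combined into a data frame.
result <- as.data.frame(padded, stringsAsFactors = FALSE)
print(result)
```

Padding at the end keeps the unique values in their order of first appearance, which is how they are laid out in the desired result.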

 

 

 

 

-----Original Message-----
From: Petr Pikal [mailto:[EMAIL PROTECTED] 
Sent: 05 January 2007 11:51
To: Bert Jacobs; 'R help list'
Subject: Re: [R] Fast Removing Duplicates from Every Column

 

Hi

 

I am not sure I understand how you want to select unique items.

 

With

 sapply(DF, function(x) !duplicated(x))

you get a logical matrix that is TRUE where an item is the first
occurrence of its value in that column and FALSE where it repeats an
earlier one. However, you then need to choose which rows to keep or
discard,

 

e.g.

 

DF[rowSums(sapply(DF, function(x) !duplicated(x)))>1,]

 

selects all rows in which at least 2 columns hold the first occurrence of a value.
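
To see what this row filter does, here is a small sketch on a hypothetical four-row, three-column DF (made-up data in the style of the original post, not the poster's real data frame):

```r
# Hypothetical example data.
DF <- data.frame(
  Col1 = c("0",  "HD", "0",  "LD"),
  Col2 = c("0",  "0",  "HD", "HD"),
  Col3 = c("LD", "0",  "HD", "HD"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE marks the first occurrence of a value in a column.
first_seen <- sapply(DF, function(x) !duplicated(x))

# Number of columns in which each row introduces a new value.
new_per_row <- rowSums(first_seen)

# Keep rows that introduce a new value in more than one column.
kept <- DF[new_per_row > 1, ]
print(first_seen)
print(kept)
```

Note that this keeps or drops whole rows, so a kept row can still contain values that are duplicates within their own column.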

 

HTH

Petr

 

 

On 5 Jan 2007 at 9:54, Bert Jacobs wrote:

 

From:             "Bert Jacobs" <[EMAIL PROTECTED]>
To:               "'R help list'" <r-help@stat.math.ethz.ch>
Date sent:        Fri, 5 Jan 2007 09:54:17 +0100
Subject:          Re: [R] Fast Removing Duplicates from Every Column

 

> Hi,
>
> I'm looking for some lines of code that do the following:
> I have a dataframe with 160 columns and a number of rows (max 30):
>
>          Col1  Col2  Col3  ...  Col 159  Col 160
> Row 1    0     0     LD    ...  0        VD
> Row 2    HD    0     0     ...  0        MD
> Row 3    0     HD    HD    ...  0        LD
> Row 4    LD    HD    HD    ...  0        0
> ...      ...
> LastRow  HD    HD    LD    ...  0        MD
>
> Now I want a dataframe that looks like this. As you can see, all
> duplicates are removed. Can this dataframe be constructed in a fast way?
>
>          Col1  Col2  Col3  ...  Col 159  Col 160
> Row 1    0     0     LD    ...  0        VD
> Row 2    HD    HD    0     ...  0        MD
> Row 3    LD    0     HD    ...  0        LD
>
> Thx for helping me out.
> Bert

>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.

 

Petr Pikal

[EMAIL PROTECTED]

 

