[R] data frames; matching/merging

2010-02-08 Thread Jonathan
Hi all,
I'm feeling a little guilty about asking this question, since I've
written a solution using a rather clunky for loop that gets the job
done.  But I'm convinced there must be a faster (and probably more
elegant) way to accomplish what I'm looking to do (perhaps using the
merge function?).  I figured somebody out there might've already
figured this out:

I have a dataframe with two columns (let's call them V1 and V2).  All
rows are unique, although column V1 has several redundant entries.

Ex:

  V1 V2
1  a  3
2  a  2
3  b  9
4  c  4
5  a  7
6  b 11


What I'd like is to return a dataframe cut down to have only unique
entries in V1.  V2 should contain, for each V1, the minimum of all
the V2 values found in the rows that share that V1.

Example output:

  V1 V2
1  a  2
2  b  9
3  c  4


If somebody could (relatively easily) figure out how to get closer to
a solution, I'd appreciate hearing how.  Also, I'd be interested to
hear how you came upon the answer (so I can get better at searching
the R resources myself).

Regards,
Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data frames; matching/merging

2010-02-08 Thread jim holtman
-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] data frames; matching/merging

2010-02-08 Thread Ivan Calandra

Hi!

I'm definitely not an expert in R (and it's my first reply!), but if I 
understand right, I think the aggregate function might do what you're 
looking for.

Try ?aggregate to get more info. You might find what you need!

HTH
Ivan





Re: [R] data frames; matching/merging

2010-02-08 Thread jim holtman
> x <- read.table(textConnection("V1 V2
+ 1 a 3
+ 2 a 2
+ 3 b 9
+ 4 c 4
+ 5 a 7
+ 6 b 11"), header=TRUE)
> closeAllConnections()
> # close; matrix with rownames - easy enough to change into a dataframe if you want
> cbind(tapply(x$V2, x$V1, min))
  [,1]
a    2
b    9
c    4
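
A minimal sketch of the conversion mentioned in the comment above, assuming the tapply() result is stored in a variable (called m here, a name that does not appear in the post). The example data is rebuilt with data.frame() rather than read from text, which is also an assumption made for brevity:

```r
# Rebuild the example data from the original question.
x <- data.frame(V1 = c("a", "a", "b", "c", "a", "b"),
                V2 = c(3, 2, 9, 4, 7, 11))

# Per-group minimum: a named one-dimensional array, one entry per V1 value.
m <- tapply(x$V2, x$V1, min)

# Move the names and the values into ordinary data frame columns.
res <- data.frame(V1 = names(m), V2 = as.vector(m))
print(res)
```

The group labels end up in names(m), so nothing is lost by dropping the array structure with as.vector().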



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] data frames; matching/merging

2010-02-08 Thread David Winsemius


> rd.txt
function(txt, header=TRUE,...) {
  rd <- read.table(textConnection(txt), header=header, ...)
  closeAllConnections()
  rd}
> DF <- rd.txt("V1 V2
+ 1 a 3
+ 2 a 2
+ 3 b 9
+ 4 c 4
+ 5 a 7
+ 6 b 11
+ ")
> tapply(DF$V2, DF$V1, min)
a b c
2 9 4

> as.data.frame.table(tapply(DF$V2, DF$V1, min))
  Var1 Freq
1    a    2
2    b    9
3    c    4
> DF2 <- as.data.frame.table(tapply(DF$V2, DF$V1, min))
> names(DF2) <- names(DF)
> DF2
  V1 V2
1  a  2
2  b  9
3  c  4




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] data frames; matching/merging

2010-02-08 Thread S Ellison
You could try aggregate:

If we call your data frame df:

aggregate(df[2], by=df[1], FUN=min)

will get you what you asked for (if not necessarily what you need ;-) )

Switching the columns around is easy enough if you need to; proceeding
stepwise:
df.new <- aggregate(df[2], by=df[1], FUN=min)
df.new[,c(2,1)]

As to how I found aggregate: watching R-help daily for years
occasionally pops up fundamental gems like aggregate...
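
For readers without years of list-reading behind them, base R also ships a few search tools. The following is only a sketch of one possible workflow, not the way the poster found aggregate, and the search term is purely illustrative:

```r
# Objects on the search path whose names contain "aggregate".
hits <- apropos("aggregate")
print(hits)

# The calls below are interactive and are left commented out:
# help.search("aggregate")   # full-text search of installed help pages
# ??aggregate                # shorthand for help.search("aggregate")
# ?aggregate                 # open the help page once the name is known
```

apropos() only matches object names, so it helps when part of the name can be guessed; help.search() also looks at titles and keywords of help pages.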

 Steve Ellison
LGC



Re: [R] data frames; matching/merging

2010-02-08 Thread Gabor Grothendieck
Here are 4 solutions assuming DF contains the data frame:

> # 1. aggregate
> aggregate(DF[2], DF[1], min)
  V1 V2
1  a  2
2  b  9
3  c  4

> # 2. aggregate.formula - requires R 2.11.x
> aggregate(V2 ~ V1, DF, min)
  V1 V2
1  a  2
2  b  9
3  c  4

> # 3. SQL using sqldf
> library(sqldf)
> sqldf("select V1, min(V2) V2 from DF group by V1")
  V1 V2
1  a  2
2  b  9
3  c  4

> # 4. summaryBy in the doBy package
> library(doBy)
> summaryBy(V2 ~ ., DF, FUN = min, keep.names = TRUE)
  V1 V2
1  a  2
2  b  9
3  c  4
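
The first two solutions need no extra packages, so they can be checked side by side. The sketch below rebuilds DF with data.frame() (an assumption; the thread reads the data from text) and compares the data-frame and formula interfaces of aggregate(). The names by_df and by_formula are introduced here for illustration only:

```r
# Example data from the original question.
DF <- data.frame(V1 = c("a", "a", "b", "c", "a", "b"),
                 V2 = c(3, 2, 9, 4, 7, 11))

by_df      <- aggregate(DF[2], DF[1], min)   # solution 1: data frame interface
by_formula <- aggregate(V2 ~ V1, DF, min)    # solution 2: formula interface

# Both give one row per V1 value holding the smallest V2 for that group.
stopifnot(isTRUE(all.equal(by_df$V2, by_formula$V2)))
print(by_formula)
```

The formula interface tends to read more naturally once there are several grouping columns, for example V2 ~ V1 + V3 for a hypothetical third column V3.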
