Re: [R] get latest dates for different people in a dataset

2015-01-26 Thread Tan, Richard
Thank you!

-Original Message-
From: Chel Hee Lee [mailto:chl...@mail.usask.ca] 
Sent: Friday, January 23, 2015 8:09 PM
To: Tan, Richard; 'r-help@R-project.org'
Subject: Re: [R] get latest dates for different people in a dataset

  do.call(rbind, lapply(split(data, data$Name), function(x)
x[order(x$CheckInDate),][nrow(x),]))
  Name CheckInDate Temp
John John  2014-04-01 99.0
Mary Mary  2014-03-01 98.1
Sam   Sam  2014-04-01 97.5
 

Is this what you are looking for?  I hope this helps.
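
For anyone who wants to run the split/lapply idea end to end, here is a minimal self-contained sketch. The data frame `checkins` and the helper `latest_per_name` are hypothetical names made up for illustration, and the day/month/year reading of the dates is an assumption taken from the output above.

```r
# Hypothetical reconstruction of the poster's data; reading the dates as
# day/month/year is an assumption, not something the question states.
checkins <- data.frame(
  Name = c("John", "Mary", "Sam", "John"),
  CheckInDate = as.Date(c("1/3/2014", "1/3/2014", "1/4/2014", "1/4/2014"),
                        format = "%d/%m/%Y"),
  Temp = c(97, 98.1, 97.5, 99),
  stringsAsFactors = FALSE
)

# Split by person, keep the row with the latest date in each piece,
# then stack the one-row pieces back into a single data frame.
latest_per_name <- function(df) {
  pieces <- split(df, df$Name)
  newest <- lapply(pieces, function(x) {
    x[which.max(as.numeric(x$CheckInDate)), , drop = FALSE]
  })
  out <- do.call(rbind, newest)
  rownames(out) <- NULL
  out
}

latest <- latest_per_name(checkins)
print(latest)
```

Using which.max() instead of sorting each piece avoids the order()/nrow() pair; if a person has two rows on the same latest date, this sketch keeps only the first of them.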

Chel Hee Lee

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] get latest dates for different people in a dataset

2015-01-26 Thread Tan, Richard
Thank you!

-Original Message-
From: David Barron [mailto:dnbar...@gmail.com] 
Sent: Saturday, January 24, 2015 7:56 AM
To: Tan, Richard; r-help@R-project.org
Subject: Re: [R] get latest dates for different people in a dataset

Hi Richard,

You could also do it using the package dplyr:

dta <- data.frame(Name=c('John','Mary','Sam','John'),
                  CheckInDate=as.Date(c('1/3/2014','1/3/2014','1/4/2014','1/4/2014'),
                                      format='%d/%m/%Y'),
                  Temp=c(97,98.1,97.5,99))

library(dplyr)
dta %>% group_by(Name) %>% filter(CheckInDate==max(CheckInDate))

Source: local data frame [3 x 3]
Groups: Name

  Name CheckInDate Temp
1 Mary  2014-03-01 98.1
2  Sam  2014-04-01 97.5
3 John  2014-04-01 99.0



Re: [R] get latest dates for different people in a dataset

2015-01-26 Thread Tan, Richard
Thank you!


From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Friday, January 23, 2015 7:14 PM
To: Tan, Richard
Cc: r-help@R-project.org
Subject: Re: [R] get latest dates for different people in a dataset

Here is one way.  Sort the data.frame, first by Name then break ties with 
CheckInDate.
Then choose the rows that are the last in a run of identical Name values.

> txt <- "Name  CheckInDate  Temp
+ John  1/3/2014  97
+ Mary 1/3/2014  98.1
+ Sam   1/4/2014  97.5
+ John  1/4/2014  99"
> d <- read.table(header=TRUE, colClasses=c("character","character","numeric"),
+     text=txt)
> d$CheckInDate <- as.Date(d$CheckInDate, format="%d/%m/%Y")
> isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
> dSorted <- d[order(d$Name, d$CheckInDate), ]
> dLatestVisit <- dSorted[isEndOfRun(dSorted$Name), ]
> dLatestVisit
  Name CheckInDate Temp
4 John  2014-04-01 99.0
2 Mary  2014-03-01 98.1
3  Sam  2014-04-01 97.5
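
The sort-then-pick-run-ends idea can be cross-checked against base R's duplicated(). The sketch below is only an illustration: the data frame is retyped by hand with ISO dates, and `byRunEnds` and `byDuplicated` are hypothetical names.

```r
# Hand-typed copy of the example data (dates already parsed as d/m/Y above).
d <- data.frame(
  Name = c("John", "Mary", "Sam", "John"),
  CheckInDate = as.Date(c("2014-03-01", "2014-03-01", "2014-04-01", "2014-04-01")),
  Temp = c(97, 98.1, 97.5, 99),
  stringsAsFactors = FALSE
)

# TRUE at the last element of each run of identical values.
isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)

dSorted <- d[order(d$Name, d$CheckInDate), ]
byRunEnds <- dSorted[isEndOfRun(dSorted$Name), ]

# Same selection without the helper: after sorting, the last occurrence of
# each Name is the one that duplicated(fromLast = TRUE) does not flag.
byDuplicated <- dSorted[!duplicated(dSorted$Name, fromLast = TRUE), ]

print(byRunEnds)
```

Both variants rely on the data being sorted by Name and then by date first; without the sort they would simply return the last row seen for each name.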


Bill Dunlap
TIBCO Software
wdunlap tibco.com



[R] get latest dates for different people in a dataset

2015-01-23 Thread Tan, Richard
Hi,

Can someone help with an R question?

I have a data set like:

Name  CheckInDate  Temp
John  1/3/2014     97
Mary  1/3/2014     98.1
Sam   1/4/2014     97.5
John  1/4/2014     99

I'd like to return a data set that, for each Name, contains the row with the
latest CheckInDate for that person.  For the example above it would be

Name  CheckInDate  Temp
John  1/4/2014     99
Mary  1/3/2014     98.1
Sam   1/4/2014     97.5


Thank you for your help!

Richard




[R] categorize a character column

2011-01-05 Thread Tan, Richard
Hi, I know I can do this with a for loop with strsplit and grep, but is
there a more efficient way?

Given a data frame (input) and a category lookup table (lst),

> input
                                     item      loc
1 item 1.1: earnings sep item 1.2: w2 sep  shelf 1
2                    item 1.3: deductions drawer 1
3                      item 1.1: earnings  shelf 2
> lst
      item cat
1 item 1.1   A
2 item 1.2   B
3 item 1.3   C

how do I get a result frame like

> result
                                     item      loc cat
1 item 1.1: earnings sep item 1.2: w2 sep  shelf 1  AB
2                    item 1.3: deductions drawer 1   C
3                      item 1.1: earnings  shelf 2   A

 

Thanks,

Richard

 




[R] categorize a character column

2011-01-05 Thread Tan, Richard
Sorry I should have included the r code for the dataframes for ease of
test:

 

input <- rbind(data.frame(item="item 1.1: earnings sep item 1.2: w2 sep", loc="shelf 1"),
               data.frame(item="item 1.3: deductions sep", loc="drawer 1"),
               data.frame(item="item 1.1: earnings sep", loc="shelf 2"))

lst <- rbind(data.frame(item="item 1.1", cat="A"),
             data.frame(item="item 1.2", cat="B"),
             data.frame(item="item 1.3", cat="C"))

 

 

want to get result like:

> result
                                     item      loc cat
1 item 1.1: earnings sep item 1.2: w2 sep  shelf 1  AB
2                    item 1.3: deductions drawer 1   C
3                      item 1.1: earnings  shelf 2   A
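
No reply to this question appears in this archive, so the following is only one possible sketch and not an answer from the list. It assumes the lookup keys should be matched literally anywhere in the item text; the helper name `categorize` is made up for illustration.

```r
input <- data.frame(
  item = c("item 1.1: earnings sep item 1.2: w2 sep",
           "item 1.3: deductions sep",
           "item 1.1: earnings sep"),
  loc = c("shelf 1", "drawer 1", "shelf 2"),
  stringsAsFactors = FALSE
)
lst <- data.frame(
  item = c("item 1.1", "item 1.2", "item 1.3"),
  cat = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# For one item string, find every lookup key it contains (literal match,
# so the "." in "item 1.1" is not treated as a regex wildcard) and paste
# the matching categories together in lookup order.
categorize <- function(text, keys, cats) {
  hit <- vapply(keys, function(k) grepl(k, text, fixed = TRUE), logical(1))
  paste(cats[hit], collapse = "")
}

result <- input
result$cat <- vapply(input$item, categorize, character(1),
                     keys = lst$item, cats = lst$cat, USE.NAMES = FALSE)
print(result)
```

A literal substring match means a key such as "item 1.1" would also hit a longer label like "item 1.10"; matching on "item 1.1:" with the colon would be one way around that if such labels exist.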

 

Thanks,

Richard

 



[R] aggregate a Date column does not work?

2010-11-22 Thread Tan, Richard
Hi, I am trying to aggregate max a Date type column but have weird
result, how do I fix this?

 

> a <- rbind(
+ data.frame(name='Tom', payday=as.Date('1999-01-01')),
+ data.frame(name='Tom', payday=as.Date('2000-01-01')),
+ data.frame(name='Pete', payday=as.Date('1998-01-01')),
+ data.frame(name='Pete', payday=as.Date('1999-01-01'))
+ )
> a
  name     payday
1  Tom 1999-01-01
2  Tom 2000-01-01
3 Pete 1998-01-01
4 Pete 1999-01-01
> aggregate(a$payday, list(a$name), max)
  Group.1     x
1     Tom 10957
2    Pete 10592

 

Thanks,

Richard

 




Re: [R] aggregate a Date column does not work?

2010-11-22 Thread Tan, Richard
Thanks, add as.Date('1970-01-01') to the result column works.

Richard

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, November 22, 2010 3:51 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] aggregate a Date column does not work?


On Nov 22, 2010, at 3:39 PM, Tan, Richard wrote:

 Hi, I am trying to aggregate max a Date type column but have weird
 result, how do I fix this?

In the process of getting max() you coerced the Dates to numeric and
now you need to re-coerce them back to Dates

?as.Date
as.Date(your result)  (possibly with an origin if the default
1970-01-01 doesn't get used).

-- 
David.




David Winsemius, MD
West Hartford, CT



Re: [R] aggregate a Date column does not work?

2010-11-22 Thread Tan, Richard
Yes, I meant something like one of these:

b <- aggregate(a$payday, list(a$name), max)
b$x <- as.Date('1970-01-01') + b$x
or
b$x <- as.Date(b$x, origin='1970-01-01')
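
To make the day-count explanation concrete, here is a small sketch that aggregates on the underlying numbers on purpose and then restores the class. The single data.frame() call and the explicit as.numeric() step are choices made for this illustration, not what the thread ran.

```r
a <- data.frame(
  name = c("Tom", "Tom", "Pete", "Pete"),
  payday = as.Date(c("1999-01-01", "2000-01-01", "1998-01-01", "1999-01-01")),
  stringsAsFactors = FALSE
)

# A Date is stored as a day count since 1970-01-01, so the "weird" numbers
# in the thread are the right answers with the class stripped off.
days_2000 <- as.numeric(as.Date("2000-01-01"))
print(days_2000)

# Aggregate on the numeric day counts, then restore the Date class explicitly.
b <- aggregate(as.numeric(a$payday), list(name = a$name), max)
b$x <- as.Date(b$x, origin = "1970-01-01")
print(b)
```

Passing origin explicitly keeps the conversion independent of whatever default a given R version applies to numeric input.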

Thanks.


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, November 22, 2010 3:58 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] aggregate a Date column does not work?


On Nov 22, 2010, at 3:54 PM, Tan, Richard wrote:

 Thanks, add as.Date('1970-01-01') to the result column works.

But that should make them all the same date in 1970. Since aggregate  
renames the date column to x, this should work:

as.Date( aggregate(a$payday, list(a$name), max)$x )
[1] "2000-01-01" "1999-01-01"





David Winsemius, MD
West Hartford, CT



[R] aggregate text column by a few rows

2010-10-07 Thread Tan, Richard
Hi, the R function aggregate seems to take only summary-statistic functions; can I
aggregate text columns?  For example, for the data frame below,

> a <- rbind(data.frame(id=1, name='Tom', hobby='fishing'),
+            data.frame(id=1, name='Tom', hobby='reading'),
+            data.frame(id=2, name='Mary', hobby='reading'),
+            data.frame(id=3, name='John', hobby='boating'),
+            data.frame(id=2, name='Mary', hobby='running'))
> a
  id name   hobby
1  1  Tom fishing
2  1  Tom reading
3  2 Mary reading
4  3 John boating
5  2 Mary running

I want output as

b
id name hobbies
1  Tom  fishing reading
2  Mary reading running
3  John boating

 

 

 

Thanks,

Richard

 




Re: [R] aggregate text column by a few rows

2010-10-07 Thread Tan, Richard
Thank you!
Richard


-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Thursday, October 07, 2010 12:08 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] aggregate text column by a few rows

try this using sqldf:

> a
  id name   hobby
1  1  Tom fishing
2  1  Tom reading
3  2 Mary reading
4  3 John boating
5  2 Mary running
> require(sqldf)
> sqldf('select name, group_concat(hobby) hobby from a group by id',
+     method='raw')
  name           hobby
1  Tom fishing,reading
2 Mary reading,running
3 John         boating
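
For comparison, a base-R sketch of the same grouping that needs no extra package. It assumes a space-separated result is wanted, as in the question; the renaming of the column to `hobbies` is only there to mirror the requested output.

```r
a <- data.frame(
  id = c(1, 1, 2, 3, 2),
  name = c("Tom", "Tom", "Mary", "John", "Mary"),
  hobby = c("fishing", "reading", "reading", "boating", "running"),
  stringsAsFactors = FALSE
)

# paste(..., collapse = " ") reduces each group's hobbies to one string,
# which is all aggregate() needs from its FUN argument.
b <- aggregate(hobby ~ id + name, data = a, FUN = paste, collapse = " ")
b <- b[order(b$id), ]
names(b)[names(b) == "hobby"] <- "hobbies"
print(b)
```

So aggregate() is not limited to numeric summaries: any function that returns one value per group will do, and extra arguments such as collapse are passed through to it.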






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] get top n rows group by a column from a dataframe

2010-09-16 Thread Tan, Richard
Hi, is there an R function like sql's TOP key word?

 

I have a dataframe that has 3 columns: company, person, salary

 

How do I get top 5 highest paid person for each company, and if I have
fewer than 5 people for a company, just return all of them?

 

Thanks,

Richard

 




Re: [R] get top n rows group by a column from a dataframe

2010-09-16 Thread Tan, Richard
Hi Richard

 

Thanks for the suggestion, but I want top 5 salary for each company, not
the whole list.  I don't see how your way can work?  

 

Thanks,

Richard

 

From: RICHARD M. HEIBERGER [mailto:r...@temple.edu] 
Sent: Thursday, September 16, 2010 11:53 AM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] get top n rows group by a column from a dataframe

 

> tmp <- data.frame(matrix(rnorm(30), 10, 3,
+                          dimnames=list(letters[1:10],
+                                        c("company", "person", "salary"))))
> tmp
      company     person      salary
a -1.04590176 -0.7841855  1.07150503
b -1.06643101  0.6545647  0.43920454
c  0.72894531 -1.3812867  0.41313659
d -0.39265263 -0.3871271  0.69404325
e  0.54028124  0.7124772  0.66630904
f -1.46931714 -0.3823353  0.03069797
g -0.33283666 -0.6351862  0.37920017
h -0.79977129  0.2605315  0.92373900
i  0.80614119  0.3727227 -1.16560563
j  0.03165012  0.4690400 -0.81966285
> order(tmp$person, decreasing=TRUE)[1:min(5, length(tmp$person))]
[1]  5  2 10  9  8
> tmp[order(tmp$person, decreasing=TRUE)[1:min(5, length(tmp$person))],]
      company    person     salary
e  0.54028124 0.7124772  0.6663090
b -1.06643101 0.6545647  0.4392045
j  0.03165012 0.4690400 -0.8196628
i  0.80614119 0.3727227 -1.1656056
h -0.79977129 0.2605315  0.9237390

You can easily write a function for that.
top <- function(DF, varname, howmany) {}




Re: [R] get top n rows group by a column from a dataframe

2010-09-16 Thread Tan, Richard
Thanks, it works. 

 

Richard

 

From: Henrique Dallazuanna [mailto:www...@gmail.com] 
Sent: Thursday, September 16, 2010 1:56 PM
To: Tan, Richard
Cc: RICHARD M. HEIBERGER; r-help@r-project.org
Subject: Re: [R] get top n rows group by a column from a dataframe

 

You can try this:

(Using Richard's example):
aggregate(sdata['salary'], sdata[c('company')], function(x)tail(sort(x), 5))
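
The aggregate() call above returns the top salaries only. If the full rows are wanted (company, person, salary), a split/order/head sketch like the one below may help. The data frame `sdata` here is made-up illustrative data and `top_n_by` is a hypothetical helper, not code from the thread.

```r
set.seed(1)  # illustrative data only
sdata <- data.frame(
  company = rep(c("A", "B", "C"), times = c(7, 3, 6)),
  person = paste0("p", 1:16),
  salary = round(runif(16, 30000, 90000)),
  stringsAsFactors = FALSE
)

# Within each group, sort by the value column (largest first) and keep
# at most n rows; head() returns the whole group when it has fewer than n.
top_n_by <- function(df, group, value, n = 5) {
  pieces <- split(df, df[[group]])
  tops <- lapply(pieces, function(g) {
    g <- g[order(g[[value]], decreasing = TRUE), , drop = FALSE]
    head(g, n)
  })
  out <- do.call(rbind, tops)
  rownames(out) <- NULL
  out
}

top5 <- top_n_by(sdata, "company", "salary", n = 5)
print(top5)
```

Because head() never pads, a company with only three people simply contributes three rows, which is the behaviour asked for in the original question.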





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O




[R] data frame select max group by like function

2010-03-09 Thread Tan, Richard
Hi, I have a data frame with 3 columns: ID, year and score.  How can I
select for each unique ID, the year that has the max score?  For
example, for data frame
 
ID, year, score
tom, 1995, 88
rick, 1994, 90
mary, 2000, 97
tom, 1998, 60
mary, 1998,100
 
I shall have
ID, year, score
tom, 1995, 88
rick, 1994, 90
mary, 1998,100
 
Thanks,
Richard



Re: [R] data frame select max group by like function

2010-03-09 Thread Tan, Richard
Thanks all for the help! 

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Tuesday, March 09, 2010 5:58 PM
To: Phil Spector; Tan, Richard
Cc: r-help@r-project.org
Subject: RE: [R] data frame select max group by like function

And yet another way is
   > isLastInRun <- function(x)c(x[-1]!=x[-length(x)], TRUE)
   > sortedDat <- dat[order(dat$ID,dat$score),]
   > sortedDat[isLastInRun(sortedDat$ID),]
       ID year score
   5 mary 1998   100
   2 rick 1994    90
   1  tom 1995    88
The row names (5,2,1) show where in the
original dataset the output rows
come from.
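
One more base-R variant, offered only as a sketch: ave() can broadcast each ID's maximum score back to every row, which avoids both sorting and splitting. The name `best` is made up for this example, and the data is retyped without the spaces after the commas.

```r
dat <- read.csv(text = "ID,year,score
tom,1995,88
rick,1994,90
mary,2000,97
tom,1998,60
mary,1998,100", stringsAsFactors = FALSE)

# ave() returns, for every row, the maximum score within that row's ID,
# so the comparison keeps exactly the rows that reach their group maximum.
best <- dat[dat$score == ave(dat$score, dat$ID, FUN = max), ]
print(best)
```

Unlike the which.max() and run-end approaches, this keeps every tied row when an ID reaches its maximum score in more than one year.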

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of Phil Spector
 Sent: Tuesday, March 09, 2010 11:55 AM
 To: Tan, Richard
 Cc: r-help@r-project.org
 Subject: Re: [R] data frame select max group by like function
 
 Yet another way to do this with base R:
 
 > dat = read.csv(textConnection('ID, year, score
 + tom, 1995, 88
 + rick, 1994, 90
 + mary, 2000, 97
 + tom, 1998, 60
 + mary, 1998,100'))
 > do.call(rbind,lapply(split(dat,dat$ID),function(x)x[which.max(x$score),]))
        ID year score
 mary mary 1998   100
 rick rick 1994    90
 tom   tom 1995    88
 
   - Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
 
 


[R] gsub does not support \b?

2009-11-10 Thread Tan, Richard
Hello, can someone help?  How come
 
> gsub("\bINDS\b","INDUSTRIES","ADVANCED ENERGY INDS")
[1] "ADVANCED ENERGY INDS"
 
not "ADVANCED ENERGY INDUSTRIES"
 
Thanks.
Richard



Re: [R] gsub does not support \b?

2009-11-10 Thread Tan, Richard
Ok, I figured it out.  My stupid mistake, should be "\\b" instead of "\b".
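
A short sketch of why the doubled backslash matters: inside an R string literal "\b" is a single backspace character, so the regex engine never sees a word-boundary escape. The second input string is a made-up example added to show what the boundary protects against.

```r
# "\b" is one character (backspace); "\\b" is a backslash followed by "b",
# which is what the regex engine needs to see for a word boundary.
len_single <- nchar("\b")
len_double <- nchar("\\b")

fixed <- gsub("\\bINDS\\b", "INDUSTRIES", "ADVANCED ENERGY INDS")
print(fixed)

# The boundaries keep the pattern from firing inside a longer word.
guarded <- gsub("\\bINDS\\b", "INDUSTRIES", "WINDSOR INDS HOLDINGS")
print(guarded)
```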





Re: [R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number

2009-06-09 Thread Tan, Richard
Sorry I did not give some examples in my previous posting to make my
question clear.  It's not exactly 1 digit, but at least one digit.  Here
are some examples:

> input = c(none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOL9NGG NONUMBER',
+           all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0')
> gsub(x=input, replacement='x', perl=TRUE, pattern=something)
                                                                none 
"0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo TOOL9NGG NONUMBER" 
                                                                 all 
                                                       "x x x x x x" 

-Original Message-
From: Wacek Kusnierczyk [mailto:waclaw.marcin.kusnierc...@idi.ntnu.no] 
Sent: Tuesday, June 09, 2009 1:06 PM
To: Greg Snow
Cc: Marc Schwartz; Barry Rowlingson; r-help@r-project.org; Tan, Richard
Subject: Re: [R] Regex question to find a string that contains 5-9
alpha-numeric characters, at least one of which is a number

Greg Snow wrote:
 Here is one way using a single pattern (so it can be used in a
 substitution); it uses Perl's positive look-ahead patterns:

 > test <- c("SHRT","5HRT","M1TCH","M1TCH5","LONG3RS","NONUMBER","TOOLNGG",
 +     "ooops.3")
 > sub( '(?=[a-zA-Z]{0,8}[0-9])[a-zA-Z0-9]{5,9}', 'xxx', test,
 +     perl=TRUE)
 


yes, but:

sub( '(?=[a-zA-Z]{0,8}[0-9])[a-zA-Z0-9]{5,9}', 'x', '12345',
perl=TRUE)
# x

which is not what was expected -- as far as i understand, the point was
to match 5-9 character strings with exactly 1 digit.

vQ
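
Under the "at least one digit" reading that the poster later confirmed, one possible pattern is a look-ahead followed by the length check, as sketched below. The test vector extends the one quoted above with two made-up entries, and the names `pattern`, `hits` and `masked` are illustrative only.

```r
words <- c("SHRT", "5HRT", "M1TCH", "M1TCH5", "LONG3RS",
           "NONUMBER", "TOOLNGG", "ooops.3", "12345", "ABCDEFGHI1")

# Look-ahead: some run of letters followed by a digit must exist;
# body: the whole string is 5 to 9 alphanumeric characters.
pattern <- "^(?=[A-Za-z]*[0-9])[A-Za-z0-9]{5,9}$"
hits <- grepl(pattern, words, perl = TRUE)
print(setNames(hits, words))

# For replacing tokens inside longer text, word boundaries take the place
# of the ^ and $ anchors.
text <- "0foo foob0 foofoofoo0 NONUMBER f0o0o"
masked <- gsub("\\b(?=[A-Za-z]*[0-9])[A-Za-z0-9]{5,9}\\b", "x", text,
               perl = TRUE)
print(masked)
```

The anchors (or boundaries) are what stop the pattern from matching a 5 to 9 character slice of a longer token, which the unanchored version would do.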



[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number

2009-06-08 Thread Tan, Richard
Hi, 

This is not exactly an R question but I am trying to use gsub to replace
a string that contains 5-9 alpha-numeric characters, at least one of
which is a number.  Is there a good way to write it in a one line regex?

Thanks,
Richard



[R] toupper does not work in sub + regex

2009-04-13 Thread Tan, Richard
Hi, I don't know what I am doing wrong; toupper does not seem to
work inside sub with a regex.  The following returns 's', not the upper-case
'S' that I expect:
 
sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")
 
Can someone tell me what I did wrong?
 
Thanks,
Richard



Re: [R] toupper does not work in sub + regex

2009-04-13 Thread Tan, Richard
Thanks, Martin.  I did not realize that.  I never used perl compatible
regex before but seems now I should!

Richard

-Original Message-
From: Martin Morgan [mailto:mtmor...@fhcrc.org] 
Sent: Monday, April 13, 2009 12:08 PM
To: Tan, Richard
Subject: Re: [R] toupper does not work in sub + regex

Tan, Richard r...@panagora.com writes:

> sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")

you're expecting toupper to be evaluated after substitution, but it is
evaluated before: toupper('\\1') == '\\1'. try

  sub("q_([a-z])[a-zA-Z]*",'\\U\\1',"q_sviRaw", perl=TRUE)

  

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center 1100
Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] toupper does not work in sub + regex

2009-04-13 Thread Tan, Richard
Thanks, Bill!  One more question: how do I get SviRaw, i.e., uppercase
just the first character and keep everything else the same?

sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 \\2", "q_sviRaw", perl=TRUE)

Did not work.

Thank you!
Richard
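
One possible way to get SviRaw, offered as a sketch rather than as a reply
given in this thread: R's regex documentation describes \\E, which ends a \\U
or \\L case conversion when perl = TRUE.  The variable names below are
illustrative only.

```r
x <- "q_sviRaw"
pattern <- "q_([a-z])([a-zA-Z]*)"

# \\U upper-cases every substituted group that follows it ...
all_up <- sub(pattern, "\\U\\1\\2", x, perl = TRUE)

# ... until \\E switches the case conversion off again, so the second
# group is inserted unchanged.
first_up <- sub(pattern, "\\U\\1\\E\\2", x, perl = TRUE)

# A regex-free alternative for the same result: strip the prefix, then
# upper-case the first character by hand.
rest <- sub("^q_", "", x)
first_up_alt <- paste0(toupper(substring(rest, 1, 1)), substring(rest, 2))

all_up
first_up
first_up_alt
```

The \\L variant is not suitable here because it would lower-case the "R" in
"viRaw"; \\E leaves the remainder as it was in the input.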

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Monday, April 13, 2009 1:17 PM
To: Tan, Richard; r-help@r-project.org
Subject: Re: [R] toupper does not work in sub + regex

You could also use \\U and \\L in the replacement with perl=TRUE.  \\U
converts the rest of the replacement to upper case and \\L converts to
lowercase. (By replacement it means the parts of the replacement that
arise from parenthesized subpatterns in the pattern argument, not the
replacement argument itself.)  E.g.,

> sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
[1] "S"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
[1] "S then viraw"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
[1] "S then VIRAW"

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

--
[R] toupper does not work in sub + regex

Gabor Grothendieck ggrothendieck at gmail.com Mon Apr 13 18:26:12 CEST
2009

sub only handles replacement strings, not replacement functions.
Your code is the same as:

sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")

since '\\1' contains no alphabetic characters, so toupper('\\1') is just
literally '\\1', and that is what sub uses.

The gsubfn function in the gsubfn package can deal with replacement
functions:

> library(gsubfn)
> gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
[1] "S"

See the home page (http://gsubfn.googlecode.com), the vignette and the help page.



[R] search for string insider a string

2009-03-13 Thread Tan, Richard
Hi, sorry if this is too stupid a question, but how do I do a string
search in R?
 
I have a dataframe A with A$test like:
 
test1
bcdtestblabla2.1bla
cdtestblablabla3.88blabla
 
and I want to search for a substring that starts with 'dtest' and ends with
a number, and return the location of that substring and the number, so the
end result would be:
 
NA    NA
3     2.1
2     3.88
 
I found that grep can probably do this, but I am new to the function and
would like a good example.
 
Thanks,
Richard
 
 



Re: [R] search for string insider a string

2009-03-13 Thread Tan, Richard
That works.  I want the position just for the purpose of my later manual check. 
 Thanks a lot Gabor.

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Friday, March 13, 2009 2:18 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] search for string insider a string

Try this.  We use regexpr to get the positions and strapply puts the values in 
list s.  The unlist statement converts NULL to NA and simplifies the list, s, 
to a numeric vector.  For more info on strapply see http://gsubfn.googlecode.com

library(gsubfn)  # strapply

x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

dtest.info <- cbind(posn = regexpr("dtest", x),
   value = { s <- strapply(x, "dtest[^0-9]*([0-9][0-9.]*)", as.numeric)
        unlist(ifelse(sapply(s, length), s, NA))
})

> # the above may be sufficient but
> # if it's important to NA out rows with no match add
> dtest.info[dtest.info[,1] < 0,] <- NA
> dtest.info
     posn value
[1,]   NA    NA
[2,]    3  2.10
[3,]    2  3.88

Why do you want the position?   Is there a further transformation needed?
What is it?  There may be even easier approaches to the entire problem.
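
For comparison, the same table can probably be built without gsubfn.  The
following is a base R sketch, not the method from the reply above; the
pattern pat and the variables hit, posn, value and result are illustrative
names.

```r
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

# Position of the literal "dtest"; regexpr() returns -1 when there is no match.
posn <- as.integer(regexpr("dtest", x, fixed = TRUE))
posn[posn < 0] <- NA

# Capture the first number that follows "dtest".
pat <- "^.*?dtest[^0-9]*([0-9][0-9.]*).*$"
hit <- grepl(pat, x, perl = TRUE)

# Convert only the strings that matched, so as.numeric() never sees
# unmatched text and emits no coercion warnings.
value <- rep(NA_real_, length(x))
value[hit] <- as.numeric(sub(pat, "\\1", x[hit], perl = TRUE))

result <- data.frame(posn = posn, value = value)
result
```

Rows without a match come out as NA in both columns, so no separate step to
blank them out is needed.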



[R] Get top cluster for each item in a correlation matrix

2009-02-23 Thread Tan, Richard
Hi,
 
I posted a question a few days ago and got an extremely good response
(https://stat.ethz.ch/pipermail/r-help/2009-February/188225.html).  Now I
have a somewhat related question:
 
I have a correlation matrix of about 3000 items, with 1 on the diagonal
(for example, cor.mat <- cor(matrix(rnorm(3000*1000), 1000, 3000)) ).
For each item in the matrix, I want to find the cluster to which the 1
belongs, i.e., the cluster with the highest correlation coefficients, and
generate a data frame with 3 columns like (ID, ID2, cor), where in
each row ID is one of those 3000 items, ID2 is the ID of an item within
that top cluster, and cor is the correlation of ID and ID2.
 
The clustering method is fanny, with the number of clusters set to 60.  It
is very time consuming to do a for loop like this:
 
library(cluster)  # fanny

for (i in 1:ncol(cor.mat)) {
  f <- fanny(cor.mat[,i],60)
  temp <- cbind(ID = i,ID2 = f$clustering, cor = cor.mat[,i])
  temp <- temp[which(temp[,2]==f$clustering[i]),]
  if (i == 1) {
    out <- temp
  } else {
    out <- rbind(out,temp)
  }
}
out
 
Is there a better way to do it?  
 
Thanks,
Richard 
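
One part of the loop above that can be tightened regardless of the
clustering method is the repeated rbind, which copies the growing result on
every iteration.  The sketch below collects the pieces in a list and binds
them once.  It is not an answer taken from this thread: top_cluster_rows and
bin_clusters are hypothetical names, and bin_clusters is a cheap stand-in
for fanny (equal-width bins) so that the example runs with base R only.
Unlike the loop above, ID2 here holds the item index rather than the cluster
label, which is what the description asks for.

```r
# Collect one data frame per item, then bind them all in a single call.
top_cluster_rows <- function(cor.mat, cluster_fun) {
  pieces <- lapply(seq_len(ncol(cor.mat)), function(i) {
    cl <- cluster_fun(cor.mat[, i])     # one cluster label per item
    keep <- which(cl == cl[i])          # items in the cluster holding the 1
    data.frame(ID = i, ID2 = keep, cor = cor.mat[keep, i])
  })
  do.call(rbind, pieces)
}

# Hypothetical stand-in for fanny(): four equal-width bins of the correlations.
bin_clusters <- function(v) as.integer(cut(v, breaks = 4))

set.seed(1)
cor.mat <- cor(matrix(rnorm(20 * 50), 50, 20))
out <- top_cluster_rows(cor.mat, bin_clusters)
head(out)
```

To use fanny instead, cluster_fun could be a small wrapper that returns the
clustering component of the fanny result.  The clustering calls themselves
are likely to remain the dominant cost, so this only removes the overhead of
growing out inside the loop.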



Re: [R] transform key value pair to column

2009-02-19 Thread Tan, Richard
Thank you, works! 

-Original Message-
From: Rowe, Brian Lee Yung (Portfolio Analytics) [mailto:b_r...@ml.com] 
Sent: Thursday, February 19, 2009 5:52 PM
To: Wacek Kusnierczyk; Tan, Richard
Cc: r-help@r-project.org
Subject: RE: [R] transform key value pair to column

Try this:

> dummy
  id code value
1  1   hi  10.3
2  1   lo   5.2
3  2   hi  19.4
4  3   hi  20.0
5  3   lo  12.0
6  4   lo   5.8

> reshape(dummy, idvar='id', timevar='code', direction='wide')
  id value.hi value.lo
1  1     10.3      5.2
3  2     19.4       NA
4  3     20.0     12.0
6  4       NA      5.8
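
A self-contained version of the session above, as a sketch: it builds dummy
from the values in the question and then renames the columns to the plain
hi/lo headers that were asked for.  The renaming step and the name wide are
additions for illustration, not part of the reply.

```r
# The key/value data from the question.
dummy <- data.frame(
  id    = c(1, 1, 2, 3, 3, 4),
  code  = c("hi", "lo", "hi", "hi", "lo", "lo"),
  value = c(10.3, 5.2, 19.4, 20, 12, 5.8)
)

# One row per id, one column per code; missing pairs become NA.
wide <- reshape(dummy, idvar = "id", timevar = "code", direction = "wide")

# Drop the "value." prefix and the leftover row names.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
wide
```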

Brian

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Wacek Kusnierczyk
Sent: Thursday, February 19, 2009 5:39 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] transform key value pair to column


see ?stack, for example.

vQ

Tan, Richard wrote:
 Hi, is there a good way (instead of a time-consuming for loop) to 
 transfer a key/value pair dataframe to a dataframe with key as column 
 and value as row?  For example, I have a dataframe with three columns:
 id, code, value:
  
 id,code,value
 1,hi,10.3
 1,lo,5.2
 2,hi,19.4
 3,hi,20
 3,lo,12
 4,lo,5.8
  
 I want to get a dataframe like this:
  
 id,hi,lo
 1,10.3,5.2
 2,19.4,NA
 3,20,12
 4,NA,5.8
  
 Thank you,
 Richard



[R] get top 50 correlated item from a correlation matrix for each item

2009-02-12 Thread Tan, Richard
Hi,
 
I have a correlation matrix of about 3000 items, i.e., a 3000*3000
matrix.  For each of the 3000 items, I want to get the top 50 items that
have the highest correlation with it (excluding itself) and generate a
data frame with 3 columns like (ID, ID2, cor), where ID is each of the
3000 items repeated 50 times, ID2 is the top 50 items correlated
with ID, and cor is the correlation of ID and ID2.  I know I can use two
for loops to do it, but that is very time consuming considering that the
correlation matrix is generated for each month of the past 20 years.  Is
there a better way to do it?
 
Regards,
 
Richard 



Re: [R] get top 50 correlated item from a correlation matrix for each item

2009-02-12 Thread Tan, Richard
Works like a charm, thank you! 

-Original Message-
From: Dimitris Rizopoulos [mailto:d.rizopou...@erasmusmc.nl] 
Sent: Thursday, February 12, 2009 12:11 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] get top 50 correlated item from a correlation matrix
for each item

A possible vectorized solution is the following:

cor.mat <- cor(matrix(rnorm(100*1000), 1000, 100))
p <- 30 # how many top items

n <- ncol(cor.mat)
cmat <- col(cor.mat)
ind <- order(-cmat, cor.mat, decreasing = TRUE) - (n * cmat - n)
dim(ind) <- dim(cor.mat)
ind <- ind[seq(2, p + 1), ]
out <- cbind(ID = c(col(ind)), ID2 = c(ind))
as.data.frame(cbind(out, cor = cor.mat[out]))
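
As a cross-check on the vectorized code, here is a more explicit sketch that
does the same job one column at a time.  It is not from the thread, it is
likely slower on a 3000*3000 matrix, and top_correlated is a hypothetical
helper name.

```r
top_correlated <- function(cor.mat, p) {
  pieces <- lapply(seq_len(ncol(cor.mat)), function(i) {
    r <- cor.mat[, i]
    r[i] <- -Inf                                  # exclude the item itself
    top <- order(r, decreasing = TRUE)[seq_len(p)]
    data.frame(ID = i, ID2 = top, cor = cor.mat[top, i])
  })
  do.call(rbind, pieces)                          # bind once, not per item
}

set.seed(42)
cor.mat <- cor(matrix(rnorm(100 * 1000), 1000, 100))
p <- 30  # how many top items
out <- top_correlated(cor.mat, p)
head(out)
```

Setting the diagonal entry to -Inf before ordering removes the item itself
without relying on it sorting first, which may matter if another item is
perfectly correlated with it.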


I hope it helps.

Best,
Dimitris



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
