Re: [R] merge function in R?

David Winsemius Fri, 13 Aug 2010 16:30:44 -0700

Neither you nor your responder has continued the email chain very well, so let me put things back together:

On Aug 13, 2010; 03:54pm fishkbob wrote (subj = merge function in R?):

So I have a bunch of c(start,end) points and want to consolidate them into as few c(start,end) as possible.
For example:
sample   start    end
A              5       10
B              7       18
C              1        4
D              16      20
I'd want the function to return the two distinct sets (1,4) and (5,20)
Is there an R function that already does this?
or should I write my own? (how would I go about that?)

In an effort to be helpful, but not copying the prior message, on Aug 13, 2010; 06:46pm JesperHybel wrote:

I think it would be helpful if you could clarify your question - do you want distinct sets - maybe use
unique()
but why (5,20) when it's (5,10) in the row in your example? What criteria do you want the function to select the "sets" by and what kind of output do you need?
Maybe it's just me who doesn't get the question..sr


On Aug 13, 2010, at 7:01 PM, fishkbob wrote:

I too think I worded it incorrectly...
so the second two columns of the matrix are the start and end of an interval;
however, because some of the intervals overlap, I want to limit the number
of intervals I have to deal with.

So therefore,
(5     10)    should merge with    (7     18)   making    (5     18)
and then (5    18)   should merge with (16    20)   giving   (5    20)
whereas (1 4) has no overlap with any other interval and is therefore
left on its own

Ideal output would just be a collapsing of the matrix
sample   start     end
#              5       20
#              1        4
I got this to work using unique(c(5:10,7:18,16:20,1:4)) which gives me a
c(1:4,5:20)
However, I have to do this on a very large dataset and the numbers are more
like
c(100542:100782,598322:598821,...)

any help would be appreciated
thanks
--
View this message in context: 
http://r.789695.n4.nabble.com/merge-function-in-R-tp2324684p2324855.html
Sent from the R help mailing list archive at Nabble.com.


Nabble is where I saw all of this, but Nabble is not r-help:

I suggest you sort your rows by the "start" variable and then examine where the breaks would remain by looking at the prior values of "end":


> dd <- read.table(text = "sample   start    end
+ A              5       10
+ B              7       18
+ C              1        4
+ D              16      20", header = TRUE)
> dd[order(dd$start), ]
  sample start end
3      C     1   4
1      A     5  10
2      B     7  18
4      D    16  20
> ndd <- dd[order(dd$start), ]
> ndd$inprior <- c(NA, ndd$end[-nrow(ndd)] >= ndd$start[-1])
> ndd
  sample start end inprior
3      C     1   4      NA
1      A     5  10   FALSE
2      B     7  18    TRUE
4      D    16  20    TRUE
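
Going from the inprior flags to the collapsed intervals could be sketched roughly as below. This is only a sketch under stated assumptions: merge_intervals() is a hypothetical helper name (not a base R function), the intervals are closed with start <= end, and an interval is folded into the current group whenever its start is at or below the largest end seen so far. It compares against cummax() of the sorted ends rather than only the immediately prior end, so that an interval sitting inside a much earlier, longer one is still caught.

```r
# Hypothetical helper, for illustration only: collapse overlapping
# closed intervals into as few (start, end) pairs as possible.
merge_intervals <- function(start, end) {
  stopifnot(length(start) == length(end))
  if (length(start) == 0L) {
    return(data.frame(start = start, end = end))
  }

  # Sort by start so overlaps can only reach "backwards"
  o <- order(start)
  start <- start[o]
  end <- end[o]

  # Largest end among all earlier intervals (-Inf for the first one)
  prev_max_end <- c(-Inf, cummax(end)[-length(end)])

  # A new group begins wherever start lies beyond every earlier end
  new_group <- start > prev_max_end
  group <- cumsum(new_group)

  data.frame(
    start = start[new_group],
    end   = as.vector(tapply(end, group, max))
  )
}

# The example data from the thread
dd <- data.frame(
  sample = c("A", "B", "C", "D"),
  start  = c(5, 7, 1, 16),
  end    = c(10, 18, 4, 20)
)

merged <- merge_intervals(dd$start, dd$end)
merged
#   start end
# 1     1   4
# 2     5  20
```

Because the work is one sort plus vectorised cumulative operations, this should also cope with a large number of intervals without expanding each one into its full integer sequence the way unique(c(5:10, ...)) does.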

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
