Your original method amounts to the following function:
f <- function (x, y) 
{
    xy <- cbind(x, y)
    outside <- function(z) {
        !any(x > z[1] & y > z[2])
    }
    j <- apply(xy, 1, outside)
    which(j)
}

and the following one quickly computes the same thing as the above
as long as no two points share the same x value (with tied x values
it can drop a point that the original keeps).

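f1 <- function (x, y) 
{
    # Visit the points from largest x to smallest.
    o <- order(x, decreasing = TRUE)
    yo <- y[o]
    j <- logical(length(y))
    # Keep a point when its y equals the running maximum of y so far,
    # i.e. no point already visited has a strictly larger y.
    j[o] <- yo == cummax(yo)
    # Indices of the kept points, in the original ordering of x and y.
    which(j)
}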
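
Here is a sketch for checking the two functions against each other on
your own data. f and f1 are copied from this message; f2 is a
hypothetical variant (the name and the tie-breaking rule are my
additions for illustration, not something tested on your real data).
It breaks ties in x by increasing y, which should make it match f even
when x values repeat.

```r
# f and f1 are copied from the message; f2 is a hypothetical variant.
f <- function(x, y) {
    xy <- cbind(x, y)
    outside <- function(z) !any(x > z[1] & y > z[2])
    which(apply(xy, 1, outside))
}

f1 <- function(x, y) {
    o <- order(x, decreasing = TRUE)
    yo <- y[o]
    j <- logical(length(y))
    j[o] <- yo == cummax(yo)
    which(j)
}

# Sort by decreasing x and, within tied x values, by increasing y, so
# that inside a group of tied x values a point is only ever compared
# against tied points with smaller or equal y, which cannot eliminate it.
f2 <- function(x, y) {
    o <- order(-x, y)
    yo <- y[o]
    j <- logical(length(y))
    j[o] <- yo == cummax(yo)
    which(j)
}

set.seed(1)
x <- runif(2000)
y <- runif(2000)
# Ties in x are very unlikely with runif() data.
agree_continuous <- identical(f(x, y), f1(x, y))

# Integer-valued data forces many tied x values.
xt <- sample(1:20, 500, replace = TRUE)
yt <- sample(1:20, 500, replace = TRUE)
agree_tied <- identical(f(xt, yt), f2(xt, yt))
```

A small case where the tie handling matters is x = c(1, 1), y = c(3, 2):
f and f2 keep both points, while f1 keeps only the first.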

Think of the problem as finding the "ladder points" (Feller's term)
of a sequence of points, the places where the sequence reaches
a new high point.
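
To make the ladder-point idea concrete, here is a minimal illustration
on a made-up sequence (the vector s is just an example, not your data).
In f1 the sequence is y taken in order of decreasing x.

```r
# Positions where a sequence equals its running maximum.
s <- c(3, 1, 4, 1, 5, 9, 2, 6)
cummax(s)                        # 3 3 4 4 5 9 9 9
ladder <- which(s == cummax(s))  # 1 3 5 6
```

Because the comparison uses ==, a value that merely ties the running
maximum also counts as a ladder point.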

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
> Sent: Wednesday, October 10, 2012 9:52 AM
> To: tonja.krue...@web.de; r-help@r-project.org
> Subject: Re: [R] own function: computing time
> 
> No, the desired points are not a subset of the convex hull.
> E.g., x=c(0,1:5), y=c(0,1/(1:5)).
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: William Dunlap
> > Sent: Wednesday, October 10, 2012 9:46 AM
> > To: 'tonja.krue...@web.de'; r-help@r-project.org
> > Subject: RE: [R] own function: computing time
> >
> > Are the points you are looking for (those data points with no other data
> > points above or to the right of them) a subset of the convex hull of the
> > data points?  If so, chull(x,y) can quickly give you the points on the convex
> > hull (typically a fairly small number) and you can look through them for
> > the ones you want.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >
> > > -----Original Message-----
> > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of tonja.krue...@web.de
> > > Sent: Wednesday, October 10, 2012 3:16 AM
> > > To: r-help@r-project.org
> > > Subject: [R] own function: computing time
> > >
> > > Hi all,
> > >
> > > I wrote a function that actually does what I want it to do, but it tends
> > > to be very slow for large amounts of data. On my computer it takes 5.37
> > > seconds for 16000 data points and 21.95 seconds for 32000 data points. As
> > > my real data consists of 18000000 data points, it would take ages to use
> > > the function as it is now.
> > > Could someone help me to speed up the calculation?
> > >
> > > Thank you, Tonja
> > >
> > > system.time({
> > > x <- runif(32000)
> > > y <- runif(32000)
> > >
> > > xy <- cbind(x,y)
> > >
> > > outer <- function(z){
> > > !any(x > z[1] & y > z[2])}
> > > j <- apply(xy,1, outer)
> > >
> > > plot(x,y)
> > > points(x[j],y[j],col="green")
> > >
> > > })
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> 
