Hello,

Try length.out = max(length(loga), length(logb))

Note also that all of the previous code and the line above assumes that we are interested in the max distance, whereas the KS statistic computes the supremum of the distance. If it's a two sample test then their values are almost surely the same but not if it's a one sample test.

Hope this helps,

Rui Barradas
Em 05-10-2012 12:15, user1234 escreveu:
Rui,

Your response nearly answered a similar question of mine except that I also
have ecdfs of different lengths.

Do you know how I can adjust  x <- seq(min(loga, logb), max(loga, logb),
length.out=length(loga))
to account for this?  It must be in length.out() but I'm unsure how to
proceed.

Any advice is much appreciated.

-L


Rui Barradas wrote
Hello,

Try the following.
(i've changed the color of the first ecdf.)


loga <- log10(a+1) # do this
logb <- log10(b+1) # only once

f.a <- ecdf(loga)
f.b <- ecdf(logb)
# (2) max distance D

x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))
x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )]
y0 <- f.a(x0)
y1 <- f.b(x0)

plot(f.a, verticals=TRUE, do.points=FALSE, col="blue")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
## alternatine, use standard R plot of ecdf
#plot(f.a, col="blue")
#lines(f.b, col="green")

points(c(x0, x0), c(y0, y1), pch=16, col="red")
segments(x0, y0, x0, y1, col="red", lty="dotted")
## alternative, down to x axis
#segments(x0, 0, x0, y1, col="red", lty="dotted")


Hope this helps,

Rui Barradas
maxbre wrote
Hi all,

given this example

#start

a<-c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940,
760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430)
length(a)

b<-c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90,
      3220,490,20790,290,740,5350,940,3910,0,640,850,260)
length(b)

out<-ks.test(log10(a+1),log10(b+1))

# max distance D
out$statistic

f.a<-ecdf(log10(a+1))
f.b<-ecdf(log10(b+1))

plot(f.a, verticals=TRUE, do.points=FALSE, col="red")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)

#inverse of ecdf a
x.a<-get("x", environment(f.a))
y.a<-get("y", environment(f.a))

# inverse of ecdf b
x.b<-get("x", environment(f.b))
y.b<-get("y", environment(f.b))


#end

I want to plot the max distance between the two ecdf curves as in the
above given chart

Is that possible and how?


Thanks for your help

PS: this is an amended version of a previous thread (but no reply
followed) that I’ve deleted from Nabble repository because I realised it
was not enough clear (now I hope it’s a little better, sorry for that)




--
View this message in context: 
http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4645140.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to