My understanding is that this represents a bivariate normal approximation of the data, which uses a kernel density function to test for inclusion within a level set (please correct me if I am wrong).
To exclude the outliers that fall outside these ellipses/contours, is it advisable to do something like this:

SNP$density <- get_density(SNP$mean, SNP$var)
> summary(SNP$density)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0     383     696     738    1170    1789

where get_density() is the function from here:
https://slowkow.com/notes/ggplot2-color-by-density/

and then do something like this:

a <- SNP[SNP$density > 400, ]

and plot it again:

p <- ggplot(a, mapping = aes(x = mean, y = var))
p <- p + geom_density_2d() + geom_point() + my.theme + ggtitle("SNPS_red")

On Thu, Oct 8, 2020 at 3:52 PM Ana Marija <sokovic.anamar...@gmail.com> wrote:
>
> Hello,
>
> I have a data frame like this:
>
> > head(SNP)
>                mean      var     sd
> FQC.10090295 0.0327 0.002678 0.0517
> FQC.10119363 0.0220 0.000978 0.0313
> FQC.10132112 0.0275 0.002088 0.0457
> FQC.10201128 0.0169 0.000289 0.0170
> FQC.10208432 0.0443 0.004081 0.0639
> FQC.10218466 0.0116 0.000131 0.0115
> ...
>
> and I am creating a plot like this:
>
> s <- ggplot(SNP, mapping = aes(x = mean, y = var))
> s <- s + geom_density_2d() + geom_point() + my.theme + ggtitle("SNPs")
> s
>
> I am getting the attached plot.
>
> My questions are:
> 1. How do I interpret inclusion versus exclusion within the
> ellipses/contours?
>
> 2. How do I extract from my data frame the points which are outside of
> the ellipses?
>
> Thanks
> Ana
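
For anyone who wants to try the density-filtering idea without my data, here is a minimal, self-contained sketch. It is only an illustration under a few assumptions: the data are simulated (not my real SNP table), get_density() is written out along the lines of the function on the linked page using MASS::kde2d(), and the cutoff is a hypothetical 10% quantile of the density rather than the fixed value 400 I used above. Points whose estimated density is below the cutoff are, up to the resolution of the grid, the ones lying outside the contour at that density level.

```r
# Sketch only: simulated data stands in for the real SNP data frame.
library(MASS)  # provides kde2d()

# Evaluate a 2D kernel density estimate at each (x, y) point by looking up
# the grid cell the point falls in (same idea as the linked get_density()).
get_density <- function(x, y, ...) {
  dens <- MASS::kde2d(x, y, ...)
  ix <- findInterval(x, dens$x)
  iy <- findInterval(y, dens$y)
  dens$z[cbind(ix, iy)]
}

set.seed(1)
SNP <- data.frame(mean = rexp(500, rate = 40))   # simulated, skewed means
SNP$var <- SNP$mean^2 * runif(500, 0.5, 2)       # simulated variances

# n = 100 sets the grid size passed through to kde2d()
SNP$density <- get_density(SNP$mean, SNP$var, n = 100)

# Hypothetical cutoff: treat the 10% of points in the sparsest regions
# as outliers; the 10% is an arbitrary choice for illustration.
cutoff   <- quantile(SNP$density, probs = 0.10)
inside   <- SNP[SNP$density >= cutoff, ]
outliers <- SNP[SNP$density <  cutoff, ]
```

A quantile-based cutoff has the advantage that it does not depend on the scale of the density values, which changes with the units of mean and var; whether 10% is a sensible fraction is a separate question that I cannot answer from the plot alone.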