You want a Wilcoxon test. Here is an example for R.
# Load necessary libraries
library(ggplot2) # For plotting
library(stats) # For statistical tests (base R, attached by default)
# Example data (replace with your actual data)
admitted_ids <- c(1, 3, 5, 7, 10, 12) # Example admitted student IDs
waitlisted_ids <- c(15, 18, 20, 22, 25, 30) # Example waitlisted student IDs
# Combine data into a data frame for plotting
data <- data.frame(
  ID = c(admitted_ids, waitlisted_ids),
  Status = factor(c(rep("Admitted", length(admitted_ids)),
                    rep("Waitlisted", length(waitlisted_ids))))
)
# 1. Plot histograms with density
ggplot(data, aes(x = ID, fill = Status)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_histogram(aes(y = after_stat(density)), alpha = 0.5, position = "identity") +
  labs(title = "Distribution of Student IDs", x = "Student ID", y = "Density") +
  theme_minimal()
# 2. Box plot to compare medians and spread
ggplot(data, aes(x = Status, y = ID, fill = Status)) +
  geom_boxplot() +
  labs(title = "Box Plot of Student IDs by Status", y = "Student ID") +
  theme_minimal()
# 3. Mann-Whitney U Test (non-parametric, good for skewed data)
mw_test <- wilcox.test(admitted_ids, waitlisted_ids, alternative = "two.sided")
print("Mann-Whitney U Test:")
print(mw_test)
# 4. Kolmogorov-Smirnov Test (compare distributions)
ks_test <- ks.test(admitted_ids, waitlisted_ids)
print("Kolmogorov-Smirnov Test:")
print(ks_test)
# 5. Compare means
mean_admitted <- mean(admitted_ids)
mean_waitlisted <- mean(waitlisted_ids)
cat("Mean ID (Admitted):", mean_admitted, "\n")
cat("Mean ID (Waitlisted):", mean_waitlisted, "\n")
# 6. Optional: T-Test (if data is roughly normal)
t_test <- t.test(admitted_ids, waitlisted_ids)
print("Two-Sample T-Test:")
print(t_test)
# 7. Quantify skewness (requires 'moments' package)
# Install if needed: install.packages("moments")
library(moments)
skew_admitted <- skewness(admitted_ids)
skew_waitlisted <- skewness(waitlisted_ids)
cat("Skewness (Admitted):", skew_admitted, "\n")
cat("Skewness (Waitlisted):", skew_waitlisted, "\n")
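# 8. Logistic regression (modeling probability of admission)
# With a factor response, glm() treats the first level ("Admitted") as failure,
# so code admission explicitly as 1 to model P(admitted).
model <- glm(as.integer(Status == "Admitted") ~ ID, data = data, family = "binomial")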
summary(model)
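
Since the real question is whether the draw behaved like a fair lottery, a permutation test may be the most direct check: re-run a fair lottery many times on the pooled IDs and see how often the admitted group's mean ID comes out as low as the observed one. Here is a minimal sketch, assuming the same placeholder IDs as the example data (not the real lists). The number of simulations and the seed are arbitrary choices of mine.

```r
# Permutation test sketch: how often does a fair lottery give the admitted
# group a mean ID at least as low as the one observed?
set.seed(42)  # arbitrary seed, only so the run is reproducible

# Placeholder IDs (assumption): replace with the published lists
admitted_ids   <- c(1, 3, 5, 7, 10, 12)
waitlisted_ids <- c(15, 18, 20, 22, 25, 30)

all_ids    <- c(admitted_ids, waitlisted_ids)
n_admitted <- length(admitted_ids)
observed   <- mean(admitted_ids)

# Simulate fair lotteries: draw n_admitted IDs without replacement each time
n_sims    <- 10000
simulated <- replicate(n_sims, mean(sample(all_ids, n_admitted)))

# One-sided p-value: share of simulated lotteries with a mean at least as low
# as the observed one. The +1 terms count the observed draw itself, so the
# estimate is never exactly zero.
p_value <- (sum(simulated <= observed) + 1) / (n_sims + 1)

cat("Observed mean admitted ID:", observed, "\n")
cat("Permutation p-value (one-sided):", p_value, "\n")
```

A small p-value would suggest that low IDs were admitted more often than a fair draw would typically produce. It says nothing about why.
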
From: Friam <[email protected]> on behalf of cody dooderson
<[email protected]>
Date: Thursday, March 13, 2025 at 1:28 PM
To: The Friday Morning Applied Complexity Coffee Group <[email protected]>
Subject: [FRIAM] statistics question
I have a question concerning preschool admissions. The kindergarten that my
daughter went to for preschool has a "random lottery" for admissions. They
published a list of all of the student ids of the students who got in and the
ones who did not and were put on a waitlist.
She did not get in, so I decided that it was unfair and plotted the data. What
statistical tricks should I use to figure out if the lottery was random or not?
I have attached a plot of the data in question. To me, the plot looks slightly
skewed towards the low numbers. The lower numbers are kids who signed up for
the lottery earlier and who, I hypothesise, have favorable connections in the school.
_ Cody Smith _
[email protected] <mailto:[email protected]>
