Re: [Aus-soaring] Comparing accident rates

Teal Thu, 10 Mar 2016 01:51:38 -0800


On 10/03/2016 6:50 PM, Texler, Michael wrote:

  I've not seen them described that way in the road safety literature that I'm 
familiar with. How would that work? If the number of accidents is on the Y 
axis, what variable would the X axis have? If we go with road accidents (my 
field of expertise) it can't be age/driving experience, because the accident 
stats in NO way form a Poisson distribution when age/experience is your X-axis 
variable. (Actually, road prangs by age/experience give you more of a U-shaped 
curve.) Also, rates of accidents (be they road prangs or glider prangs) aren't 
constant over time (as required for a Poisson distribution to be your 
distribution of choice) - they vary by time of day, for fairly obvious reasons, 
as well as other things (day of the week, long weekends, etc etc).

You appear to be approaching the issue from a rather different statistical 
approach to the ones I'm familiar with. Could you spell out your 
approach/methods in more detail? It's always interesting to hear how folk in 
other fields approach problems I'm familiar with. :-)

I am approaching it as counting events occurring over a duration of time 
(analogous to say counting disintegrations per second for radioactive decay).

Y axis would be the accident rate with any metric that you care to choose (i.e. 
accidents per 1,000 hours flown, accidents per 100km travelled, accidents per 
1,000 flights etc.).
X axis would be a duration of time, i.e. over one year, over 10 years, over 100 
years.

Then it is a case of using the appropriate test to compare the two groups (null 
hypothesis being that the accident rate between two groups is the same).

I'm afraid I'm still not with you. *Which* two groups, exactly? Displaying all
recorded traffic accidents over time in that way will (if you use Australian
data) give you a single line that (depending on the period covered, but let's
go with "the last 20 years") trends downward over time. Who are you comparing
against whom, in your example?

A fairly blunt measure granted.

Given your experience with road accidents analysis, how would you approach it?

Well, it would depend on exactly which question was being asked. If we were
interested in the numbers of accidents had by drivers of different ages, my
previous example (up in the first para quoted above) was a simple descriptive
graph showing difference in number of accidents by age, for a set amount of
time (a year, say). Or we could do it another way, and have a graph with dates
along the X axis, and separate lines (one for each age group, maybe 16-25,
26-35 and so on) showing how accident numbers have changed over time for each
age group, if we were interested in seeing if there were any obvious
differences in crash rates over time by age group.

Or, if the question is whether a particular time of day is more crash-prone
than other times, we could graph all the accidents occurring in the last year
with the X axis showing hours of the day (midnight-0200, 0201-0400, etc). Or
whatever. All this is pretty basic stuff. We could go on from there, and report
means and standard deviations for age groups/time periods/whatever of interest,
and see if anything leaps out in terms of obvious differences or trends. But
that still isn't going to get you anything you might want to discuss using null
hypotheses or p values ... for that you really do need actual *inferential*
statistical tests, with specific groups that you are comparing. And this
broad-brush descriptive approach isn't going to give you that. You need to
narrow it down a bit.

So: let's come back to the original topic that started all this - glider
accidents. How would I approach that?

Well, first would be deciding exactly what question I want an answer to. Do I
want to know if the glider prang rate is increasing or decreasing over time? Or
do I want to know whether more crashes are happening in comps than in
cross-country gliding? Or how the glider crash rate as a whole compares with
the number of motorcycle crashes for a given period?

Let's go with the last one, since we were also discussing that earlier.
Firstly, getting a good source of data for *both* of those elements in the
comparison is tricky. So I'm gonna handwave past that and assume that we have
good quality data on both of these, including exposure data (i.e. how much time
was spent per pilot/rider actually flying/riding during that time period),
because exposure is critical for topics like this: it means absolutely nothing
to say that there were 12 glider prangs and 355 bike prangs in a given period,
if we don't *also* know that there were a lot more bikers on the road, riding
for a lot more overall hours, than there were glider pilots in the air during
the same period.
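
To make the exposure point concrete, here is a rough sketch in Python. The
crash counts are the ones I quoted above; the exposure hours are numbers I
have made up purely for illustration, so don't read anything into the result.

```python
# Crash counts are the ones quoted above; the exposure hours are
# invented purely for illustration (an assumption, not real data).
glider_crashes, glider_hours = 12, 40_000
bike_crashes, bike_hours = 355, 2_500_000


def rate_per_1000_hours(crashes, hours):
    """Crashes per 1,000 hours of exposure."""
    return 1000 * crashes / hours


glider_rate = rate_per_1000_hours(glider_crashes, glider_hours)
bike_rate = rate_per_1000_hours(bike_crashes, bike_hours)

print(f"gliders: {glider_rate:.3f} crashes per 1,000 hours")
print(f"bikes:   {bike_rate:.3f} crashes per 1,000 hours")
```

With these made-up hours the gliders come out with the higher rate despite
having far fewer prangs, which is exactly why the raw counts on their own
tell you nothing.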

OK. So now I hypothetically have ten years' worth of crash rates per hour of
flying or riding for the respective groups, and I want to compare them. This is
where the inferential statistics come in. There will be differences between any
two groups that are simply random chance, but the real trick is identifying
*actual* differences through the "noise" of random variation. We want to
perform a simple comparison of the two groups, to see if they basically have
the same means and variances - i.e. is it reasonable to assume they're both
samples from one overall population? (Yes, I know they're probably not in real
life, but that's how the statistical tests work.) In this example I'd probably
go for a t-test for independent samples (since we're assuming that the bikers
and the pilots are, by and large, different people). And what that would give
me would be a probability value (the p value you mentioned earlier), which is
the probability of seeing a difference at least this large through random
chance alone, if the two groups really did have the same underlying crash rate.
So if we get a p value of .05 from my t-test, that tells us that a difference
this big would turn up only about 5% of the time if there were no real
difference between our bikers and glider pilots. (Strictly speaking that is not
the same thing as a 95% chance that the difference is real, although it often
gets loosely described that way.)
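
For anyone who wants to see the mechanics, here is a minimal sketch of an
independent-samples (pooled-variance) t-test using only the Python standard
library. The yearly rates are invented, and the function names are my own;
the p value is the two-sided probability of a t statistic at least this
extreme if the null hypothesis were true, obtained here by numerically
integrating the t density.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p value: 1 minus the area under the density on [-|t|, |t|].

    The area is approximated with Simpson's rule (steps must be even).
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = 2 * a / steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return max(0.0, 1.0 - total * h / 3)


def independent_t_test(xs, ys):
    """Pooled-variance t-test; returns (t, degrees of freedom, p value)."""
    n1, n2 = len(xs), len(ys)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (mean(xs) - mean(ys)) / se
    return t, df, two_sided_p(t, df)


# Hypothetical crash rates per 1,000 hours, one value per year for ten years.
glider_rates = [0.31, 0.28, 0.35, 0.30, 0.27, 0.33, 0.29, 0.32, 0.26, 0.30]
bike_rates = [0.15, 0.14, 0.17, 0.13, 0.16, 0.14, 0.15, 0.12, 0.16, 0.14]

t, df, p = independent_t_test(glider_rates, bike_rates)
print(f"t = {t:.2f}, df = {df}, p = {p:.4g}")
```

In practice you'd let a stats package do this for you; the point of spelling
it out is only to show that the test boils down to a difference in means
scaled by its standard error.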

Let's mix it up a bit. What if we want to add other factors into the model to
see if that makes any difference... age, say. Are the patterns of accidents for
pilots and bikers of different ages similar? Does it matter what the age of the
vehicle they're flying/riding is? For those I'd probably run a regression or
analysis of variance of some kind on the data, with the exact type dependent on
the exact nature of the additional factor(s) I'm plugging into the model. Or
let's say I come across a group of bikers who also fly gliders. That's
extra-useful, because, as the *same* individuals doing both activities, we can
get a *lot* more statistical power out of whatever model we choose.
Repeated-measures analysis of variance may well be my tool of choice for that
sort of analysis. Or maybe even a mixed-effects general linear model (now,
*those* can get complex enough to lead to tears and tearing of hair...)
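
A quick illustration of the same-individuals idea: with only two activities
per person, the simplest repeated-measures analysis is a paired t-test, which
I'm using here as a stand-in for the fuller repeated-measures ANOVA. The
per-person rates below are invented, and this sketch stops at the t statistic
and degrees of freedom rather than a p value.

```python
import math
from statistics import mean, stdev


def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for matched observations."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1


# Hypothetical crashes per 1,000 hours for six people who both ride and fly.
riding = [0.20, 0.35, 0.15, 0.40, 0.25, 0.30]
flying = [0.10, 0.30, 0.12, 0.28, 0.20, 0.22]

t, df = paired_t(riding, flying)
print(f"paired t = {t:.2f}, df = {df}")
```

Because each person acts as their own control, the person-to-person variation
drops out of the differences, and that is where the extra power comes from.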


And so on it goes.

Does that help clarify things?


Teal



_______________________________________________
Aus-soaring mailing list
Aus-soaring@lists.base64.com.au
http://lists.base64.com.au/listinfo/aus-soaring
