Re: Statistical Functions and Boxplots

2005-06-09 Thread Nick de Voil
 I have a client who wants some data represented by boxplots.
 I've been unable to find software for CF which will automatically calculate
 and graph the required information (upper quartile, lower quartile, median)
 from raw data passed, so I need to find a way of performing these
 calculations myself and pass them to the graph software.

We calculate quartiles for one of our customers (CF6.1, SS2K), and in that
application we draw line graphs and bar charts using JFreeChart, which I
believe can do boxplots, although we don't currently use that part of it.
Worth a look if you don't mind doing Java.

Quartiles are not what you said. In colloquial use they are what Jochem
said, but our customer says

a) a quartile is one value, not a set of values
b) the algorithm used by Excel is wrong

As you know, the customer is always right. We do it their way, with some CF
and SQL which is rather slow but not very complicated.

Nick




~|
Find out how CFTicket can increase your company's customer support 
efficiency by 100%
http://www.houseoffusion.com/banners/view.cfm?bannerid=49

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:209107
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations & Support: http://www.houseoffusion.com/tiny.cfm/54


Statistical Functions and Boxplots

2005-06-08 Thread Ryan Edgar
I have a client who wants some data represented by boxplots.
I've been unable to find software for CF which will automatically calculate 
and graph the required information (upper quartile, lower quartile, median) 
from raw data passed, so I need to find a way of performing these 
calculations myself and pass them to the graph software.
I've got a Median function from cflib.org (http://cflib.org) but where 
things fall apart slightly is in the calculation of the upper and lower 
quartiles. I had been taught that to calculate these, you find the average 
of the top and bottom half of the dataset respectively. However, when I put 
the dataset into Excel and use its statistical functions, the results are 
slightly different. Further investigation has shown that there are a few 
ways of calculating quartiles but this is where I start getting out of my 
depth.
 Does anyone know:
a) Is there any graphing software compatible with CF which can create 
boxplots from raw data? I've looked at ChartFX Statistical but it's only 
available for .NET.
b) Is there software or functions around which can quickly and accurately 
calculate quartiles?
 I'm using CF5 by the way.
 Thanks,
 Ryan
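The discrepancy described above is what you would expect when two different quartile definitions meet. Here is a minimal sketch, in Python 3.8+ rather than CF, comparing two of them. It assumes the "taught" method is the common "median of each half" rule, and that the spreadsheet result comes from linear interpolation at position p * (n - 1), which is how Excel's QUARTILE function is usually documented. The function names and the sample data are made up for illustration.

```python
from statistics import median, quantiles


def quartiles_median_of_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves.

    With an odd number of values the middle value is left out of both
    halves (one common convention; another includes it in both).
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(lower), median(upper)


def quartiles_interpolated(values):
    """Q1 and Q3 by linear interpolation at position p * (n - 1)."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


scores = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]  # made-up survey answers
print(quartiles_median_of_halves(scores))  # (2, 5)
print(quartiles_interpolated(scores))      # (2.25, 4.75)
```

Neither result is wrong; they follow different conventions, so whichever one the client expects should be agreed before any charting is done.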


Message: http://www.houseoffusion.com/lists.cfm/link=i:4:208973


Re: Statistical Functions and Boxplots

2005-06-08 Thread Jochem van Dieten
Ryan Edgar wrote:
 I have a client who wants some data represented by boxplots.
 I've been unable to find software for CF which will automatically calculate 
 and graph the required information (upper quartile, lower quartile, median) 
 from raw data passed, so I need to find a way of performing these 
 calculations myself and pass them to the graph software.
 I've got a Median function from cflib.org (http://cflib.org) but where 
 things fall apart slightly is in the calculation of the upper and lower 
 quartiles. I had been taught that to calculate these, you find the average 
 of the top and bottom half of the dataset respectively.

I have been taught that the top quartile is the 25% of data 
points with the highest values :)

Where do you get these data from? If you get them from a 
database, what native functions does that database have for doing 
statistics? For instance, PostgreSQL has the R language for 
statistical computing as a loadable module and you can use it as 
a procedural language.

Jochem

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:209015


Re: Statistical Functions and Boxplots

2005-06-08 Thread Ryan Edgar
Sorry, should have said I'm using MS SQL 2000. I'm pulling data from survey 
results which will have a value in the range 1 - 5 so if I have 50 surveys, 
I'll want to find the max, min, upper and lower quartiles and median of the 
dataset for each individual question, i.e. 1 question = 1 box in the 
graphical output.
 Failing that, is there any way I can get Excel to generate these charts on 
the fly if I dump the data into a spreadsheet (automatically on the server)? 
At the end of the day the client wants a boxplot to be generated and isn't 
too concerned about how we go about getting it, as long as he doesn't have 
to do any work!
 Thanks,
 Ryan
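The five values needed per box (min, lower quartile, median, upper quartile, max) can be computed per question once the answers are grouped. The sketch below is Python 3.8+ rather than CF, purely for illustration. The (question_id, answer) row layout and the function names are assumptions, and the quartile method is the interpolating one, which may not be the definition the client wants.

```python
from collections import defaultdict
from statistics import quantiles


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max for one question's answers."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two answers")
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": med, "q3": q3, "max": data[-1]}


def summarise_by_question(rows):
    """rows: iterable of (question_id, answer) pairs, e.g. a query result."""
    answers = defaultdict(list)
    for question_id, answer in rows:
        answers[question_id].append(answer)
    return {qid: five_number_summary(vals) for qid, vals in answers.items()}


# Made-up sample: two questions, five answers each, values in the range 1-5.
rows = [("Q1", 1), ("Q1", 3), ("Q1", 4), ("Q1", 5), ("Q1", 5),
        ("Q2", 2), ("Q2", 2), ("Q2", 3), ("Q2", 4), ("Q2", 5)]
for qid, summary in summarise_by_question(rows).items():
    print(qid, summary)
```

Each dictionary holds exactly the figures a boxplot routine needs for one box, so the charting step only has to draw them.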


Message: http://www.houseoffusion.com/lists.cfm/link=i:4:209040


Re: Statistical Functions and Boxplots

2005-06-08 Thread Jochem van Dieten
Ryan Edgar wrote:
 Sorry, should have said I'm using MS SQL 2000. I'm pulling data from survey 
 results which will have a value in the range 1 - 5 so if I have 50 surveys, 
 I'll want to find the max, min, upper and lower quartiles and median of the 
 dataset for each individual question, i.e. 1 question = 1 box in the 
 graphical output.

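Those aren't really hard to do just in SQL:
-- "dataset" and "table" are placeholders for your column and table names.
SELECT MAX(dataset) AS max, MIN(dataset) AS min, COUNT(dataset) AS n
FROM table;

-- Lower quartile (approximate): largest value in the lowest 25% of rows
SELECT MAX(dataset) AS lower_quartile
FROM (SELECT TOP 25 PERCENT dataset FROM table ORDER BY dataset) AS q;

-- Median (approximate): largest value in the lowest 50% of rows
SELECT MAX(dataset) AS median
FROM (SELECT TOP 50 PERCENT dataset FROM table ORDER BY dataset) AS q;

-- Upper quartile (approximate): largest value in the lowest 75% of rows
SELECT MAX(dataset) AS upper_quartile
FROM (SELECT TOP 75 PERCENT dataset FROM table ORDER BY dataset) AS q;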

Jochem
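The rank-based idea behind the queries above (sort the values, then read off the one a given fraction of the way through) can be sketched outside the database as well. This is a minimal nearest-rank version in Python, offered as an illustration only. The function name and sample data are made up, and nearest-rank is just one of several quartile definitions, so its results can differ from Excel's.

```python
import math


def nearest_rank(values, fraction):
    """Value at 1-based rank ceil(fraction * n) in the sorted data."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    data = sorted(values)
    rank = math.ceil(fraction * len(data))
    return data[rank - 1]


answers = [5, 1, 3, 4, 2, 5, 3, 4]  # made-up answers to one question
lower, mid, upper = (nearest_rank(answers, f) for f in (0.25, 0.5, 0.75))
print(min(answers), lower, mid, upper, max(answers))  # 1 2 3 4 5
```

Note that with an even number of values this returns the lower of the two middle values as the median instead of averaging them, which is the kind of small difference that shows up when comparing against a spreadsheet.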

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:209043