Re: Correlation problem
janne [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> I have a correlation formula I can't get to work, and we must use this formula on the test. Let me give you an example. Let's say X and Y are:
>
>      x     y
>      1    68
>      2    91
>      3   102
>      3   107
>      4   105
>      4   114
>      5   115
>      6   127
>     --   ---
>     28   829
>
> The mean of X is 3.5 and the mean of Y is 103.625.
>
> Now to my problem. Look at the formula at this URL: http://www.jannesgallery.com/corr.html. How do I do the first (X-X(with a line above))? I have tried to take X minus its mean:
>
>     1-3.5 = -2.5
>     2-3.5 = -1.5
>     3-3.5 = -0.5
>     3-3.5 = -0.5
>     4-3.5 =  0.5
>     4-3.5 =  0.5
>     5-3.5 =  1.5
>     6-3.5 =  2.5
>     sum   =  0
>
> As you see, the answer is zero. What am I doing wrong? The same happens with Y-Y(with a line above): it turns out to be zero. Please tell me how I should do this.
>
> Janne

The sum is:

    (1-3.5)*(68-103.625) + (2-3.5)*(91-103.625) + ... + (6-3.5)*(127-103.625)

which, in general, will not be zero.

= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =
Re: Which one fit better??
Glen [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> Chia C Chong [EMAIL PROTECTED] wrote in message news:a0n001$b7v$[EMAIL PROTECTED]...
>
> > I plotted a histogram density of my data and its smoothed version using the normal kernel function. I also tried to plot the estimated PDFs (Laplacian and Generalised Gaussian), estimated by maximum likelihood, on top. Graphically, it seems that the Laplacian will fit the histogram density better, while the Generalised Gaussian will fit the smoothed version (i.e. the kernel density version).
>
> Imagine that you began with a sample from a Laplacian (double exponential) distribution. What will happen to the central peak after you smooth it with a KDE?

The peak does not change significantly... Maybe it shifts to the left a bit... not too much!!

CCC

> Glen
Re: Correlation problem
Another way to phrase that: for each case, find x = X - XBAR and y = Y - YBAR, then multiply x*y (this is called a cross-product). Then find the sum of the cross-products.

Stephen Clark wrote:

> janne [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
>
> > I have a correlation formula I can't get to work, and we must use this formula on the test. Let me give you an example. Let's say X and Y are:
> >
> >      x     y
> >      1    68
> >      2    91
> >      3   102
> >      3   107
> >      4   105
> >      4   114
> >      5   115
> >      6   127
> >     --   ---
> >     28   829
> >
> > The mean of X is 3.5 and the mean of Y is 103.625.
> >
> > Now to my problem. Look at the formula at this URL: http://www.jannesgallery.com/corr.html. How do I do the first (X-X(with a line above))? I have tried to take X minus its mean:
> >
> >     1-3.5 = -2.5
> >     2-3.5 = -1.5
> >     3-3.5 = -0.5
> >     3-3.5 = -0.5
> >     4-3.5 =  0.5
> >     4-3.5 =  0.5
> >     5-3.5 =  1.5
> >     6-3.5 =  2.5
> >     sum   =  0
> >
> > As you see, the answer is zero. What am I doing wrong? The same happens with Y-Y(with a line above): it turns out to be zero. Please tell me how I should do this.
> >
> > Janne
>
> The sum is:
>
>     (1-3.5)*(68-103.625) + (2-3.5)*(91-103.625) + ... + (6-3.5)*(127-103.625)
>
> which, in general, will not be zero.

Art Kendall
Excel Limitations and Mac Excel 2001 Bug?
Hi,

[Please excuse multiple postings, but I need to get feedback from several email communities.]

The discussion of Excel's limitations on edstat-l (archives are available at http://jse.stat.ncsu.edu/) has been interesting and informative. I agree substantially with David Heiser that Excel can be used for statistical analysis, but the user must exercise judgment and be knowledgeable about the software. I can also see the point made by Cryer, McCullough, et al. that the level of knowledge required is way too high and that many unsophisticated users, relying on defaults, will get miserable results. No question the product can be better.

The discussion then turned to variable declaration, and David Firth posted a nice review of data types. His email signature says:

    David Firth
    Still Thinking Different: Apple Powerbook 3400
    Newton 2100

As a former Excel Mac user (now on the Dark Side), I recently had the opportunity to debug a Mac user's add-in (Physics). Here's what I found.

I believe there is a problem with the declaration of variables in Mac Excel 2001. The largest number that the machine should be able to represent in 64-bit double-precision floating point (that's the Double declaration in the code below) is supposed to be 1.79769313486232E308. In fact, in Mac Excel 2001, 1.797 * 10^38 works but 1.797 * 10^39 does not! (Of course, forget about 10^308, and this was the problem with the add-in.)

So Mr. Firth, and other Mac Excel users, please run the macro below in Excel 2001 on a Mac to see if you get the same behavior, and let me know what happens. You'll have to add a module, copy the code from below, and then run it. If you're like me, you'll get an overflow error on the line that reads myMaxBug = 1.797 * 10 ^ 39. Mac Excel 2001 cannot represent that number.

    Sub Excel2001BugTest()
        Dim myMaxOK As Double
        myMaxOK = 1.797 * 10 ^ 38
        Dim myMaxBug As Double
        myMaxBug = 1.797 * 10 ^ 39
    End Sub

A variable declared as a Single has a highest value of 3.402823E38. In the code above, Mac Excel 2001 does myMaxOK = 9 * 10 ^ 38 just fine. So it's not that it doesn't support a Double; it appears that, somehow, the Double has been coded for 38 instead of 308!!! How can that happen?

Note that Excel 98 and all Win versions that I have tested work just fine. Only Mac Excel 2001 gives the problem. I was running it on a G4 with OS 9.1.

I have an RNG that uses the Currency data type (to use essentially a Double Long for large integer computations), and it works just fine on Mac Excel 2001. Only Double (and Variant) don't work.

Please let me know what you find or if you have any explanations for this odd behavior. Thanks!

Humberto Barreto
x6315
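For readers without Mac Excel 2001 at hand, the limits being discussed can be checked outside Excel. The following is a minimal sketch in Python, not a reproduction of Excel's behavior: it only shows where the standard IEEE 754 single (32-bit) and double (64-bit) limits fall relative to the three values mentioned in the message. The helper name `fits_in_single` is made up for this illustration.

```python
import struct
import sys

# Largest finite IEEE 754 double (64-bit); the range a VBA Double is expected to cover.
DOUBLE_MAX = sys.float_info.max

# Largest finite IEEE 754 single (32-bit), decoded from its bit pattern 0x7F7FFFFF.
SINGLE_MAX = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]

def fits_in_single(value):
    """Return True if value can be stored as a finite 32-bit float (hypothetical helper)."""
    try:
        struct.pack(">f", value)
        return True
    except OverflowError:
        return False

# The three values from the message: the one that works, the 9 * 10^38 case,
# and the one that overflows in Mac Excel 2001.
for v in (1.797e38, 9e38, 1.797e39):
    print(v, "fits in single:", fits_in_single(v), "| fits in double:", v <= DOUBLE_MAX)
```

Under these standard limits, 9 * 10^38 is already too large for a Single yet far below the Double maximum, which is consistent with the observation that the cutoff seen in Mac Excel 2001 matches neither type exactly.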
Re: Correlation problem
In sci.stat.consult janne [EMAIL PROTECTED] wrote:

: I have a correlation formula I don't get to work. And we must use this
: formula on the test. Let me give you an example: Let's say X and Y

If you don't know what x(with a line above) MEANS, you need to STUDY your text. Also, your instructor should be available for consultation.
Re: Correlation problem
sum of deviations around a mean always = 0

> X minus its mean:
>
>     1-3.5 = -2.5
>     2-3.5 = -1.5
>     3-3.5 = -0.5
>     3-3.5 = -0.5
>     4-3.5 =  0.5
>     4-3.5 =  0.5
>     5-3.5 =  1.5
>     6-3.5 =  2.5
>     sum   =  0
>
> As you see, the answer is zero. What am I doing wrong? The same happens with Y-Y(with a line above): it turns out to be zero. Please tell me how I should do this.
>
> Janne

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
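The one-line claim above follows directly from the definition of the mean. As a short sketch (the symbols n for the number of cases and \bar{x} for the mean are notation assumed here, not taken from the posts):

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\quad\Longrightarrow\quad
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```

For the data in the question, this is 28 - 8(3.5) = 0, so a zero total for the deviations is expected and is not a sign of an arithmetic mistake.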
Re: Standardizing evaluation scores
sorry for late reply

ranking is the LEAST useful thing you can do ... so, i would never START with simple ranks

any sort of an absolute kind of scale ... imperfect as it is ... would generally be better ... one can always convert more detailed scale values INTO ranks at the end if necessary BUT, you cannot go the reverse route

say we have 10 people measured on variable X ... and we end up with no ties ... so, we get ranks of 1 to 10 ... but, these values give us NO idea whatsoever as to the differences amongst the 10

if i had a 3 person senior high school class with cumulative gpas of 4.00, 3.97, and 2.38 ... the ranks would be 1, 2, and 3 ... but clearly, there is a huge difference between either of the top 2 and the bottom ... but, ranks give no clue to this at all

so, my message is ... DON'T START WITH RANKS

At 02:11 AM 12/19/01, Doug Federman wrote:

> I have a dilemma which I haven't found a good solution for. I work with students who rotate with different preceptors on a monthly basis. A student will have at least 12 evaluations over a year's time. A preceptor usually will evaluate several students over the same year. Unfortunately, the preceptors rarely agree on the grades. One preceptor is biased towards the middle of the 1-9 Likert scale and another may be biased towards the upper end. Rarely does a given preceptor use the 1-9 range completely. I suspect that a 6 from an easy grader is equivalent to a 3 from a tough grader.
>
> I have considered using ranks to give a better evaluation for a given student, but I have a serious constraint. At the end of each year, I must submit to another body their evaluation on the original 1-9 scale, which is lost when using ranks.
>
> Any suggestions?
>
> --
> It has often been remarked that an educated man has probably forgotten most of the facts he acquired in school and university. Education is what survives when what has been learned has been forgotten.
> - B.F. Skinner, New Scientist, 31 May 1964, p. 484

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
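The three-person GPA example in the reply can be worked through in a few lines. This is only an illustrative sketch of the point about ranks discarding the size of the differences; the function names `ranks_desc` and `gaps` are invented for the example and do not come from any statistics package.

```python
# GPAs from the three-person example in the message.
gpas = [4.00, 3.97, 2.38]

def ranks_desc(values):
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def gaps(values):
    """Differences between consecutive values, ordered from highest to lowest."""
    order = sorted(values, reverse=True)
    return [round(a - b, 2) for a, b in zip(order, order[1:])]

ranks = ranks_desc(gpas)
print("ranks:", ranks)
print("gaps between GPAs:", gaps(gpas))
print("gaps between ranks:", gaps(ranks))
```

The gaps between the GPAs are very unequal (0.03 versus 1.59), while the gaps between the ranks are both 1, and nothing in the ranks allows the original spacing to be recovered.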
Re: Correlation problem
janne [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> How do I do the first (X-X(with a line above))? I have tried to take X-X
> *snip*
> 0
> As you see the answer is zero. What do I do wrong?

You calculate SUM[(x-x_bar)] * SUM[(y-y_bar)] instead of what you are asked for, which is SUM[(x-x_bar)*(y-y_bar)]. In other words, multiply each x term by the corresponding y term BEFORE you perform the sum.
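The difference between the two orders of operation can be seen on the data from the original question. The sketch below assumes that the formula at the URL is the usual Pearson correlation, with the sum of cross-products divided by the square root of the product of the two sums of squared deviations; the page itself is not quoted in the thread, so that denominator is an assumption.

```python
import math

# Data from the original question.
x = [1, 2, 3, 3, 4, 4, 5, 6]
y = [68, 91, 102, 107, 105, 114, 115, 127]

def pearson_r(xs, ys):
    """Return (r, sum of cross-products) using deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Multiply each x deviation by the matching y deviation BEFORE summing.
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    return s_xy / math.sqrt(s_xx * s_yy), s_xy

r, s_xy = pearson_r(x, y)

# The mistaken version: sum each set of deviations first, then multiply.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
wrong = sum(a - x_bar for a in x) * sum(b - y_bar for b in y)

print("sum of cross-products:", s_xy)
print("product of the two sums:", wrong)
print("r:", round(r, 4))
```

For these eight cases the sum of cross-products is 188.5, the product of the two separate sums is 0, and r comes out at about 0.94.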
Re: Looking for some datasets
I have a page of links to data: http://www.keypress.com/fathom/Data_Sets.html

Perhaps you can find something there.

At 11:01 PM -0700 1/4/02, Michael Joner wrote:

> The new semester has started and one of my first assignments has been to find some datasets that I'd be interested in evaluating during some of my classes. I spent some time searching the Internet for some interesting data. The data available on StatLib is not exactly what I'd prefer to study (although I guess if I can't find anything else I can always fall back on StatLib), mainly because the data is not very recent to begin with.
>
> I found some information, but most of what I've found has been in PDF format or some other document-based format that would be very difficult to read into SAS or S-PLUS. I can clean data if needed but would rather not have to go to the trouble of parsing a Word document (I also found some of them).
>
> The data would be for a Modern Regression Methods class. I need some datasets with categorical variables and some datasets with continuous variables (I guess some datasets with a mix of categorical and continuous wouldn't be bad, either).
>
> Could someone here steer me in the direction of some good datasets? Some of my interests include technology, sports, and vital statistics.
>
> Mike Joner

Jill Binker
Fathom Dynamic Statistics Software
KCP Technologies, an affiliate of Key Curriculum Press
1150 65th St
Emeryville, CA 94608
1-800-995-MATH (6284)
[EMAIL PROTECTED]
http://www.keypress.com
http://www.keycollege.com
Re: Looking for some datasets
some minitab files and other things are here

http://roberts.ed.psu.edu/users/droberts/datasets.htm

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Which one fit better??
Chia C Chong [EMAIL PROTECTED] wrote in message news:a1bpk5$62b$[EMAIL PROTECTED]...

> Glen [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
>
> > Chia C Chong [EMAIL PROTECTED] wrote in message news:a0n001$b7v$[EMAIL PROTECTED]...
> >
> > > I plotted a histogram density of my data and its smoothed version using the normal kernel function. I also tried to plot the estimated PDFs (Laplacian and Generalised Gaussian), estimated by maximum likelihood, on top. Graphically, it seems that the Laplacian will fit the histogram density better, while the Generalised Gaussian will fit the smoothed version (i.e. the kernel density version).
> >
> > Imagine that you began with a sample from a Laplacian (double exponential) distribution. What will happen to the central peak after you smooth it with a KDE?
>
> The peak does not change significantly... Maybe it shifts to the left a bit... not too much!!

No, I was not talking about your data, since you don't necessarily have a Laplacian; that's what you're trying to decide!

Imagine you have data actually from a Laplacian distribution. (It has a sharp peak in the middle, and exponential tails.) Now you smooth it (KDE via Gaussian kernel). What happens to the peak? (Assume a typical window width.)

[Answer? It gets smoothed, so it no longer looks like a sharp peak.]

That's where your impression of a Gaussian-looking KDE is probably coming from.

Note that the tails of a normal and a Laplace are different, so if those are the two choices, that may help.

Glen
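The thought experiment in the reply can be tried numerically. The following is a rough sketch using only the Python standard library: it draws a sample from a standard Laplace distribution, whose true density at the centre is 0.5, and evaluates a Gaussian kernel density estimate at that point. The sample size, the seed, and the bandwidth h = 0.2 are arbitrary choices made for this illustration, not values from the thread.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def laplace_sample(n):
    """Draw n values from a standard Laplace (double exponential) distribution."""
    # A Laplace variate is an exponential variate with a random sign.
    return [random.choice((-1, 1)) * random.expovariate(1.0) for _ in range(n)]

def gaussian_kde_at(point, data, h):
    """Gaussian kernel density estimate at a single point, with bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((point - d) / h) ** 2) for d in data)

data = laplace_sample(5000)
true_peak = 0.5                                # Laplace(0, 1) density at 0
kde_peak = gaussian_kde_at(0.0, data, h=0.2)   # assumed bandwidth

print("true density at the peak:", true_peak)
print("KDE estimate at the peak:", kde_peak)
```

The estimate at the centre falls below the true value of 0.5 because the Gaussian kernel averages the sharp peak with the lower density on either side, which is the flattening described in the reply. A wider bandwidth flattens the peak further.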
clusters within a sample
I am working with a large administrative data set (N=1,086) for a foster care agency. In short, I am comparing client outcomes across two branches (each is delivering a different service model). For analyses, I am using logistic regression (SPSS), where my dependent variables include a variety of outcomes measuring program success vs. failure. My test variable is the program (two groups), plus I have several other demographic and service-related variables.

My problem is that I have two types of clusters of children in my data set:

(1) siblings from the same biological family (may or may not be placed in the same foster home)
(2) foster children placed in one foster home (may or may not be siblings)

I am looking for ways to test the amount of error associated with the above clusters using SPSS. My strategy to date has been to SELECT the restricted sample, run the LR analysis, then eyeball the results. What are my other options?

Many thanks.

Yvonne A. Unrau, PhD
Associate Professor
School of Social Work
Illinois State University
Campus Box 4650
Normal, Illinois 61790-4650
Direct Office Phone: (309) 438-8579
School Office Phone: (309) 438-3631
School Fax: (309) 438-5880
e-mail: [EMAIL PROTECTED]
Re: Excel2000 - the same errors in stat. computations and graphics
Jon Cryer wrote:

> David:
>
> I have certainly never said nor implied that Excel cannot produce reasonably good graphics. My concern is that it makes it so easy to produce poor graphics. The defaults are absurd and should never be used. It seems to me that defaults should produce at least something useful.
>
> The default graphs are certainly not good business graphs if the intent is to produce good visual display of quantitative information! Isn't that what graphs are for?

The purpose of fancy graphics is to cover up the paucity of information contained therein. For that reason alone, Excel's cornucopia of choices fits the bill very nicely. (Slap your own face, Jay, for being such a cynic.)

Cheers,
Jay
--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
North Green Bay Road
Racine, WI 53404-1216 USA
Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com
The A2Q Method (tm) -- What do you want to improve today?
Excel vs Quattro Pro
Does anyone know if Quattro Pro suffers the same statistical problems as Excel?

Cheers. ECD
___
Edward C. Dreyer
Political Science
The University of Tulsa
Thank you!
Thank you for helping me with my problem!

Janne
Thank you all!!!
Thank you, everybody who helped me with my correlation problem: Stephen, Art, Timothy, Patrick. It was very sweet of you.

Janne
Re: Excel vs Quattro Pro
i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis

why? ... why do we not help our students and encourage our students to use tools designed for a task ... rather than substituting something that may just barely get us by?

we don't ask stat packages to do what spreadsheets were designed to do ... why the reverse?

just because packages like excel are popular and readily available ... does not therefore mean that we should be recommending it (or them) to people for statistical analysis

it's like telling people that notepad will be sufficient to do all your word processing needs ...

At 04:56 PM 1/7/02 -0600, Edward Dreyer wrote:

> Does anyone know if Quattro Pro suffers the same statistical problems as Excel?
>
> Cheers. ECD
> ___
> Edward C. Dreyer
> Political Science
> The University of Tulsa

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Excel vs Quattro Pro
> i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis

Many students are computer illiterate and it might be easier to teach them how to use the spreadsheet than a formal programming language.
Re: Excel vs Quattro Pro
most stat packages have nothing to do with programming anything ... you either use simple commands to do things you want done (like in minitab ... mtb correlation 'height' 'weight') or, select procedures from menus and dialog boxes

At 12:27 AM 1/8/02, Kenmlin wrote:

> > i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis
>
> Many students are computer illiterate and it might be easier to teach them how to use the spreadsheet than a formal programming language.
Re: Excel vs Quattro Pro
There are a lot of packages that are halfway between spreadsheets and formal programming languages: SAS, SPSS, Stata. Anything is better than spreadsheets.

On 8 Jan 2002, Kenmlin wrote:

> > i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis
>
> Many students are computer illiterate and it might be easier to teach them how to use the spreadsheet than a formal programming language.
Re: Excel vs Quattro Pro
Spreadsheets are fine for minor business/commercial data analysis. They are not designed to be statistical packages.

A package like SPSS is designed for a wide variety of statistical applications across many disciplines. It shares many features of a spreadsheet in the user interface. It is a package, not a programming language. A person who is going to use statistics does not have to become a programmer. (Although exposure to a programming language or two will be a help to statisticians.)

Kenmlin wrote:

> > i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis
>
> Many students are computer illiterate and it might be easier to teach them how to use the spreadsheet than a formal programming language.

Art Kendall