Re: probability definition
I'm glad to hear that somebody has his eye on the ball. Unfortunately, a designation of a region like "western Puerto Rico" means so many different things to so many different people that I disbelieve its utility. With the definition you quote, we should have a 100% chance of precipitation almost every day.

-- 
Richard A. Beldin
Professional Statistician (retired), BELDIN Consulting Services

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Fisher's z-transformation
On Sat, 3 Mar 2001, Arenson, Ethan wrote:

> Would someone please remind me the formula for Fisher's z-transformation of correlation coefficients?

Z = 0.5 log[(1 + r)/(1 - r)] (using the natural logarithm).
Its standard error is 1/sqrt(n - 3) ("sqrt" = "square root of").
To convert back: r = (exp(2Z) - 1)/(exp(2Z) + 1) ("exp(2Z)" is the natural antilogarithm of 2Z, aka e to the power 2Z).
Equivalently, Z = inverse tanh(r) and r = tanh(Z) ("tanh" = hyperbolic tangent).

-- Don.
-- 
Donald F. Burrill
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264 (603) 535-2597
Department of Mathematics, Boston University, 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288
184 Nashua Road, Bedford, NH 03110 (603) 471-7128
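
The formulas above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this note, and the confidence-interval helper is an assumption added here (the usual normal-approximation interval built on the Z scale and converted back), not something stated in the message.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's transformation: Z = 0.5 * ln((1 + r) / (1 - r)), i.e. atanh(r)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def inverse_fisher_z(z: float) -> float:
    """Back-transformation: r = (exp(2Z) - 1) / (exp(2Z) + 1), i.e. tanh(Z)."""
    e = math.exp(2.0 * z)
    return (e - 1.0) / (e + 1.0)


def z_standard_error(n: int) -> float:
    """Standard error of Z for a sample of n pairs: 1 / sqrt(n - 3)."""
    return 1.0 / math.sqrt(n - 3)


def confidence_interval(r: float, n: int, z_crit: float = 1.959964):
    """Assumed helper: approximate interval for r, computed on the Z scale.

    z_crit defaults to the two-sided 95% standard normal critical value.
    """
    z = fisher_z(r)
    half_width = z_crit * z_standard_error(n)
    return inverse_fisher_z(z - half_width), inverse_fisher_z(z + half_width)
```

For example, `confidence_interval(0.5, 28)` transforms r = 0.5 to the Z scale, adds and subtracts 1.96 standard errors (each 1/sqrt(25) = 0.2), and converts both limits back to the r scale.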
power,beta, etc.
when we discuss things like power, beta, type I error, etc. ... we often show a 2 by 2 table ... similar to

            null true        null false
  retain    correct          type II, beta
  reject    type I, alpha    power

i think that we need a bit of overhaul to this typical way of doing things ...

1. each cell needs to have a name ... label ... that reflects the consequence of the decision (retain, reject) that was made

i propose something along the lines of

            null true             null false
  retain    type I correct, 1C    type II error, 2E
  reject    type I error, 1E      type II correct, 2C

then, we have names or symbols for probabilities attached to each cell

            null true                      null false
  retain    WHAT NAME/SYMBOL FOR THIS??    beta
  reject    alpha                          power

DOES ANYONE HAVE SOME SUGGESTION AS TO HOW THE UPPER LEFT CELL MIGHT BE REFERRED TO via A SYMBOL??? OR, SOME NAME THAT IS DIFFERENT FROM POWER BUT ... STILL GIVES THE FLAVOR THAT A CORRECT DECISION HAS BEEN MADE (better than making an error)?

2. i think it would be helpful to first identify each cell with a distinctive label ... describing the decision (correct, error) and ... the type ... 1 or 2

3. i think it would be helpful to have a system where there are names for EACH cell (why should the poor upper left be "left" out in the cold??) ... FIRST ... then some OTHER name/symbol for the probability associated with that cell

confusions that might be avoided would be like:

a. saying type II error is the same as beta ...
b. saying that power is NOT a name for a decision but, rather, THE probability of making some particular decision

we have special names for errors of the first and second kind ... type I and type II ... and we have symbols of alpha and beta to represent their associated probabilities

we have power which is supposed to be the probability of making a certain kind of decision ... but, no special name for THAT cell like we have given to differentiate the two kinds of errors one can make ...

any support out there to try to right this somewhat ambiguous ship?
==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
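
To make the distinction between the four cells (decisions) and the probabilities attached to them concrete, here is a minimal Python sketch for one specific case. Everything about the setup is an assumption chosen for illustration: a one-sided z-test of H0: mu = mu0 against mu > mu0 with known sigma, a single specified alternative value mu1 standing in for "null false", and the function and parameter names themselves.

```python
import math
from statistics import NormalDist


def cell_probabilities(mu0, mu1, sigma, n, alpha=0.05):
    """Probabilities of retain/reject under each state of nature.

    Assumed setting: one-sided z-test of H0: mu = mu0 vs mu > mu0,
    known sigma, sample size n, and one specific alternative value mu1.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)       # reject when z > z_crit
    shift = (mu1 - mu0) * math.sqrt(n) / sigma     # mean of z when mu = mu1
    power = 1.0 - std_normal.cdf(z_crit - shift)   # P(reject | mu = mu1)
    beta = 1.0 - power                             # P(retain | mu = mu1)
    return {
        "null true": {"retain": 1.0 - alpha, "reject": alpha},
        "null false": {"retain": beta, "reject": power},
    }


# Illustrative numbers only (not taken from the message).
table = cell_probabilities(mu0=100, mu1=105, sigma=15, n=36)
for state, probs in table.items():
    print(state, {decision: round(p, 4) for decision, p in probs.items()})
```

Each column of the resulting table sums to 1, because the probabilities are conditional on the state of nature rather than on the decision.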
Final CFP: WI-2001 (Web Intelligence)
[Apologies if you receive this more than once]

---
FINAL CALL FOR PAPERS: WI-2001
The First Asia-Pacific Conference on Web Intelligence

SPONSORED BY
ACM SIGART
Maebashi Institute of Technology
---

Maebashi TERRSA, Maebashi City, Japan
October 23-26, 2001

Home Page: http://kis.maebashi-it.ac.jp/wi01
Mirror Page: http://cs.uregina.ca/~wi01/

Paper Submission Deadline: March 20, 2001

IN COOPERATION WITH
ACM SIGCHI, ACM SIGWEB
Japanese Society for Artificial Intelligence (JSAI)
JSAI SIGFAI, JSAI SIGKBS, IEICE SIGKBSE

CORPORATE SPONSORS
Maebashi Convention Bureau
Maebashi City Government
Gunma Prefecture Government
The Japan Research Institute, Limited
US AFOSR/AOARD and US Army Research Office in Far East

WI-2001 will be jointly held with The Second Asia-Pacific Conference on Intelligent Agent Technology (IAT-2001)
(One registration may attend both IAT-2001 and WI-2001)

===

WI-2001 and IAT-2001 Joint Keynote Speakers:
  Edward A. Feigenbaum (Turing Award Winner), Stanford University
  Benjamin Wah (2001 IEEE CS President), U. Illinois at Urbana-Champaign

WI-2001 Invited Speakers:
  James Hendler (DARPA/ISO, USA)
  W. Lewis Johnson (University of Southern California, USA)
  Riichiro Mizoguchi (Osaka University, Japan)
  Prabhakar Raghavan (Verity Inc., USA)
  Patrick S. P. Wang (Northeastern University, USA)

The 21st century is the age of Internet and World Wide Web. The Web revolutionizes the way we gather, process, and use information. At the same time, it also redefines the meanings and processes of business, commerce, marketing, finance, publishing, education, research, development, as well as other aspects of our daily life. Although individual Web-based information systems are constantly being deployed, advanced issues and techniques for developing and for benefiting from Web intelligence still remain to be systematically studied. Broadly speaking, Web Intelligence (WI) exploits AI and advanced information technology on the Web and Internet.
It is the key and the most urgent research field of IT for business intelligence.

The Asia-Pacific Conference on Web Intelligence (WI) is an international forum for researchers and practitioners
(1) to present the state-of-the-art in the development of Web intelligence;
(2) to examine performance characteristics of various approaches in Web-based intelligent information technology;
(3) to cross-fertilize ideas on the development of Web-based intelligent information systems among different domains.

By idea-sharing and discussions on the underlying foundations and the enabling technologies of Web intelligence, WI-2001 is expected to stimulate the future development of new models, new methodologies, and new tools for building a variety of embodiments of Web-based intelligent information systems.

The Asia-Pacific Conference on Web Intelligence (WI) is a high-quality, high-impact biennial conference series. It will be jointly held with the Asia-Pacific Conference on Intelligent Agent Technology (IAT).

TOPICS
==

WI-2001 welcomes submissions of original papers.
The technical issues to be addressed include, but are not limited to:

* Web-Based Applications:
  - Business Intelligence
  - Computational Societies and Markets
  - Conversational Systems
  - Customer Relationship Management (CRM)
  - Direct Marketing
  - Electronic Commerce and Electronic Business
  - Electronic Library
  - Information Markets
  - Price Dynamics and Pricing Algorithms
  - Measuring and Analyzing Web Merchandising
  - Web-Based Decision Support Systems
  - Web-Based Distributed Information Systems
  - Web-Based EDI
  - Web-Based Learning Systems
  - Web Marketing
  - Web Publishing

* Web Human-Media Engineering:
  - Art of Web Page Design
  - Multimedia Information Representation
  - Multimedia Information Processing
  - Visualization of Web Information
  - Web-Based Human Computer Interface

* Web Information Management:
  - Data Quality Management
  - Information Transformation
  - Internet and Web-Based Data Management
  - Multi-Dimensional Web Databases and OLAP
  - Multimedia Information Management
  - New Data Models for the Web
  - Object Oriented Web Information Management
  - Personalized Information Management
  - Semi-Structured Data Management
  - Use and Management of Metadata
  - Web Knowledge Management
  - Web Page Automatic Generation and Updating
  - Web Security, Integrity, Privacy and Trust

* Web Information Retrieval:
  - Approximate Retrieval
  - Conceptual Information Extraction
  -
Re: basic stats question
But what does this (in)dependence really mean? Can it change on conditioning? Suppose that we take into account a plausible confounder: defective equipment. Suppose blacks are more likely to have "defective equipment" (broken light, etc.). Suppose we find that the percentage who are black among those stopped for defective equipment is the same as the percentage who are black among those having defective equipment. Now we have independence at one level and non-independence at another. This seems related to Simpson's paradox. In any event, it seems that independence can be conditional. Is this so? If so, where is this discussed in more detail?

"Lise DeShea" wrote in message:

> Re probability/independence, I've found that the most effective way to communicate this concept to my students (College of Education, not heavily math-oriented) is the following:
>
> SNIP
>
> Then you can move to an example of racial profiling. Out of all the people in your city who drive, what proportion are African-American? [p(African-American).] Now, GIVEN that you look only at drivers who are pulled over, what proportion of these people are African American? [p(African-American|pulled over).] If being black and being pulled over are independent events, then the probabilities should be equal.
>
> You can illustrate this graphically by drawing a large box to represent all the drivers, then mark the proportion representing African-American drivers. Then draw a smaller box representing the people being pulled over, with a proportion of the box marked to represent the African-American drivers who are pulled over. If the proportions of each box are equal, then the events are independent.
>
> So now, I would welcome comments from the more mathematically/statistically rigorous list members among us!
>
> ~~~
> Lise DeShea, Ph.D.
> Assistant Professor
> Educational and Counseling Psychology Department
> University of Kentucky
> 245 Dickey Hall
> Lexington KY 40506
> Phone: (859) 257-9884
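
The scenario in the question (dependence overall, independence once a third factor is conditioned on) can be checked with a small numerical sketch. The counts below are invented purely for illustration and describe no real data; the labels "G" and "other" for the two groups, and the function name, are likewise assumptions of this sketch. Stop rates are set to depend only on equipment status (30% if defective, 2% otherwise).

```python
from fractions import Fraction

# Hypothetical counts: (group, equipment) -> (number of drivers, number stopped).
counts = {
    ("G", "defective"): (200, 60),
    ("G", "ok"): (800, 16),
    ("other", "defective"): (300, 90),
    ("other", "ok"): (8700, 174),
}


def share_of_group_g(equipment=None, stopped_only=False):
    """Proportion of group G among all drivers, or among stopped drivers.

    If `equipment` is given, the proportion is computed within that
    equipment stratum only (i.e. conditionally on equipment status).
    """
    idx = 1 if stopped_only else 0
    rows = {k: v for k, v in counts.items()
            if equipment is None or k[1] == equipment}
    in_g = sum(v[idx] for k, v in rows.items() if k[0] == "G")
    total = sum(v[idx] for v in rows.values())
    return Fraction(in_g, total)


# Marginally, p(G) differs from p(G | stopped): the events are dependent.
print(share_of_group_g(), share_of_group_g(stopped_only=True))

# Within each equipment stratum the two proportions agree:
# group and being stopped are conditionally independent given equipment.
for eq in ("defective", "ok"):
    print(eq, share_of_group_g(eq), share_of_group_g(eq, stopped_only=True))
```

In these made-up numbers the group's share rises from 1/10 of all drivers to 19/85 of stopped drivers, yet within each equipment stratum the share is identical before and after conditioning on being stopped. Conditional independence given a third variable is a standard notion in probability texts, and it neither implies nor is implied by marginal independence.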
Re: Trend analysis question
On Sun, 4 Mar 2001, Philip Cozzolino wrote in part:

> However, after the cubic non-significant finding, the 4th and 5th order trends are significant. Intuitively, it seems that if there is no cubic trend of significance, there will not be any higher order trend, but this is relatively new to me.

Your intuition is, in this case, incorrect. The five trends are mutually independent in the sense that any combination of them may be operating. (I am for the moment accepting the implied premise that a power function of the IV is a reasonable function to try to fit to your data. In most instances I know of, this is not "really" the case, and the power function is more usefully thought of as an approximation to whatever the "real" functionality is.)

This may be seen by considering the following relationships between Y and X (think of them as DV and IV if you wish):

  I.  [ASCII plot of Y against X]
  II. [ASCII plot of Y against X]

In I. above, the linear trend is approximately zero, and the quadratic component of X accounts for nearly all the variation in Y. A "rule" that claimed "If the linear trend is insignificant there can be no significant quadratic trend" is clearly false in this case.

In II. above, both the linear and quadratic components of trend are virtually zero -- certainly insignificant -- and the cubic component accounts for nearly all the variation in Y.

Similar situations can be imagined, where only the quartic, or only the quintic, or only the linear, quadratic, and quartic, or any other arbitrary combination of the basic trends are significant, and other components are not.
If you are carrying out your trend analysis by using orthogonal polynomials (as you probably should be), try constructing the model derived from your linear + quadratic fit only, and plot those as predicted values against X; then construct the model derived from linear + quadratic + quartic + quintic, and plot those predicted values against X. You may find it illuminating also to plot the residuals in each case against X, especially if you force the same vertical scale on the two sets of residuals.

I note in passing that you haven't stated how much of the variance of Y is accounted for by each of the significant components, nor how much residual variance there is after each component is entered. That also might be illuminating.

-- DFB.
-- 
Donald F. Burrill
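
The point that any combination of trend components may be present can be illustrated with a short Python sketch. It is a simplification under stated assumptions: five equally spaced levels of X with equal group sizes (the original design evidently has six levels, since a 5th-order trend was tested), the standard tabled orthogonal polynomial coefficients for five levels, and two invented sets of group means in the spirit of patterns I and II described above. The function and variable names are made up for this note.

```python
# Standard orthogonal polynomial contrast coefficients for 5 equally spaced levels.
CONTRASTS = {
    "linear": (-2, -1, 0, 1, 2),
    "quadratic": (2, -1, -2, -1, 2),
    "cubic": (-1, 2, 0, -2, 1),
    "quartic": (1, -4, 6, -4, 1),
}


def trend_sums_of_squares(means, n_per_level=1):
    """Split the between-level sum of squares into orthogonal trend components.

    For each contrast: SS = n * (sum of c_i * mean_i)^2 / (sum of c_i^2).
    """
    result = {}
    for name, coefs in CONTRASTS.items():
        contrast = sum(c * m for c, m in zip(coefs, means))
        result[name] = n_per_level * contrast ** 2 / sum(c * c for c in coefs)
    return result


# Invented group means: a purely quadratic pattern and a purely cubic pattern.
pattern_one = (4, 1, 0, 1, 4)
pattern_two = (-1, 2, 0, -2, 1)

print(trend_sums_of_squares(pattern_one))
print(trend_sums_of_squares(pattern_two))
```

With these means the first pattern puts all of its between-level sum of squares into the quadratic component and none into the linear one, while the second puts all of it into the cubic component with zero linear and quadratic components. Because the contrasts are mutually orthogonal, each component is assessed separately from the others, which is why a nonsignificant lower-order trend says nothing about the higher-order ones.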
Re: power,beta, etc.
On Sat, 3 Mar 2001, dennis roberts wrote:

> when we discuss things like power, beta, type I error, etc. ... we often show a 2 by 2 table ... similar to
>
>             null true        null false
>   retain    correct          type II, beta
>   reject    type I, alpha    power

Similar, but not the same. I usually present a table of "states of affairs", without probabilities; see the table below.

    correct          error: Type II
    error: Type I    correct

(And usually with the rows interchanged, so that "Type I error" LOOKS like the first kind of error one encounters.) It seems to me that to include the probabilities in the same 2x2 table as the "states of affairs" would be actively to invite rampant (or at least, and more alliteratively, couchant) confusion of the concepts.

I have another problem with writing "power" in the lower right cell, apart from the fact that it's a probability and not a state of affairs. I'm aware that many people think of power as a conditional probability (of rejecting the null when it's false); but I came to understand it as an UNconditional probability (of rejecting the null, period). This definition permits drawing power curves that include the parameter value specified by the null hypothesis: the power at that point (or, in that case) is alpha. For a symmetric two-sided alternative, this is also the minimum value of power. Since the value of power approaches alpha as the parameter value approaches the value specified in the null hypothesis, it seems a little silly to omit that one point from the continuous curve.

> i think that we need a bit of overhaul to this typical way of doing things ...
>
> 1. each cell needs to have a name ... label ... that reflects the consequence of the decision (retain, reject) that was made
>
> i propose something along the lines of
>
>             null true             null false
>   retain    type I correct, 1C    type II error, 2E
>   reject    type I error, 1E      type II correct, 2C

I've long been persuaded of the need to distinguish between the two different kinds of errors.
That there are two distinct kinds is not at all obvious, evidently; some folks seem never to master the distinction. But I am not convinced that we need to distinguish between two kinds of correct decision. After all, the decisions themselves are different: to reject, or to retain (though some folks prefer "accept" to "retain"). Knowing the decision, and that it is (at least hypothetically) correct, is surely all one needs to know. "Correct rejection" or "correct retention" (or "acceptance") of the hypothesis being tested seems to me easier to handle and apprehend than "a Type I correct decision" or "a Type II correct decision".

> then, we have names or symbols for probabilities attached to each cell
>
>             null true                      null false
>   retain    WHAT NAME/SYMBOL FOR THIS??    beta
>   reject    alpha                          power

If you want to construct such a table, I'd recommend including the marginal row, showing the column totals to be 1 (or, if one prefers, 100%). That helps to emphasize the conditional nature of the probabilities being displayed: conditional on the state of nature, not on the decision. And consistent with my understanding of power, I'd present such a table thus:

                    State of nature
                 null true    null false
    P{retain}    1 - alpha    beta
    Power        alpha        1 - beta
                 ---------    ----------
    Total        1            1

Sometime along about now one really ought to point out that a 2x2 table like this is grossly oversimplified. Beta (and therefore power) cannot be evaluated for "null false". It can be evaluated only for a specified particular value of the parameter that is different from the value specified in the null hypothesis. And, ceteris paribus, the farther that parameter value is from the null-hypothetical value, the smaller is beta (and the larger is power). This leads more or less directly to the idea of a power curve, and then to the variations in such a curve as a function of alpha and sample size.

> DOES ANYONE HAVE SOME SUGGESTION AS TO HOW THE UPPER LEFT CELL MIGHT BE REFERRED TO via A SYMBOL??? OR, SOME NAME THAT IS DIFFERENT FROM POWER BUT ...
> STILL GIVES THE FLAVOR THAT A CORRECT DECISION HAS BEEN MADE (better than making an error)?

Do you have a reasoned objection to "1 - alpha"? In other contexts we routinely use, e.g., "1 - Rsq" for the proportion of variance unexplained by the model being considered. The "1 minus" construction shows the logical and arithmetical connection between two quantities, which can easily get lost if one uses very different-looking terms for those quantities.

> 2. i think it would be helpful to first identify each cell with a