Re: Edstat: I. J. Good and Walker
dennis roberts wrote: At 06:08 PM 6/19/01 +, Jerry Dallal wrote:
>Alex Yu wrote:
>>
>> In 1940 Helen M. Walker wrote an article in the Journal of Educational
>> Psychology regarding the concept of degrees of freedom. In the 1970s, I. J. Good
>> wrote something criticizing Walker's idea. I forgot the citation. I tried
>> many databases and even searched the internet but got no result. Does
>> anyone know the citation? Thanks in advance.
>
>Good, I. J. (1973). What are degrees of freedom? The American Statistician, 27, 227-228.

answer??? the number of options you have in deciding what courses you can take during your freshperson year in college ... MINUS ONE

the minus one is for the option of 0 courses - dropping out. It is presumed that you have already decided to stick it out. :)

Cheers,
Jay
--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
North Green Bay Road
Racine, WI 53404-1216 USA
Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com
The A2Q Method (tm) -- What do you want to improve today?
=
Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/
=
Re: Consistency quotation
G. B. Shaw - Pygmalion; My Fair Lady, maybe too. "I can tell a woman's age in half a minute - and I do." Surely H. Higgins prided himself on consistency :)

Jay

[EMAIL PROTECTED] wrote: I remember reading something like the following: "Consistency alone is not necessarily a virtue. One can be consistently obnoxious." I believe it was in a discussion of an RSS read paper, maybe from about 30 years ago, but I have not been able to find it again. A web search for "consistently obnoxious" taught me more about asbestos corks than I care to know, but was otherwise unhelpful. Can anyone provide the source, or at least a lead? Many thanks, Ewart Shaw.
--
J.E.H.Shaw [Ewart Shaw] [EMAIL PROTECTED] TEL: +44 2476 523069
Department of Statistics, University of Warwick, Coventry CV4 7AL, U.K.
http://www.warwick.ac.uk/statsdept/Staff/JEHS/
The opposite of a profound truth is not also a profound truth.
Re: Marijuana
Brother! That topic sure drew a crowd! :)

Paul Jones wrote: There was some research recently linking heart attacks with marijuana smoking. [big snip]

Jay
Re: sample size and sampling error
When I sell something to a person, I am the supplier, and they are the customer. We know that. When this person tells me what they like and don't like about the thing I sold them, they are turning over to me something of value. So I am now the customer, no? Maybe I should pay for what I get. If you really want to get responses, try paying the person for the information. Maybe not $, but something.

Jay

Jill Binker wrote: Interesting. "Get as many as possible" sounds good, but I've heard otherwise (from those with more background than I have). The best way to get 100% compliance from the random sample you're trying to get responses from is to send as few forms as possible, then put your resources into pestering those people -- whoops! I mean encouraging them to respond (follow-up mailings, then phone calls). That way you're minimizing the effect of a skewed sample (since likelihood of responding may well be correlated with the things you're trying to learn about). Also, I believe that hypothesis testing is sounder if your sample is less than one tenth of the population size -- but don't go by me; I'm a non-statistical lurker.

At 7:04 PM + 5/25/01, W. D. Allen Sr. wrote:
>Get all the samples you can afford!
>
>Textbook recipes for determining sample size implicitly assume that all the
>elements of the population in question are selected randomly [equally
>likely to be selected].
>
>Voluntary response to a mail-in survey means you will get only those samples
>that "volunteer" to respond, which means non-random selection. The
>non-respondents are also hunters, but you won't hear from them.
>
>Generally, the more samples the better in a mathematically imperfect world.
>Look at it this way: if everyone responded you would have sampled the entire
>population of hunters. So the closer to all hunters the better. Maybe you
>could think of a way to induce more of those potential non-responders to
>respond.
>
>Good luck
>
>WDA
>
>end
>
>"Mike Tonkovich" <[EMAIL PROTECTED]> wrote in message news:3b0d107d_2@newsfeeds...
>> Before I get to the issue at hand, I was hoping someone might explain the
>> differences between the following 3 newsgroups: sci.stat.edu, sci.stat.consult,
>> and sci.stat.math? Now that I've found these newsgroups, chances are good I
>> will be taking advantage of the powerful resources that exist out there.
>> However, I could use some guidance on what tends to get posted where? Some
>> general guidelines would be helpful.
>>
>> Now for my question.
>>
>> We have an estimated 479,000 hunters in Ohio and we want to conduct a survey
>> to estimate such things as hunter success rates, participation rates, and
>> opinions on various issues related to deer management. The first question,
>> of course, is how large of a sample?
>>
>> [snipped]

Jill Binker
Fathom Dynamic Statistics Software
KCP Technologies, an affiliate of Key Curriculum Press
1150 65th St
Emeryville, CA 94608
1-800-995-MATH (6284)
[EMAIL PROTECTED]
http://www.keypress.com
http://www.keycollege.com
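Jill's point about skewed samples can be sketched numerically. A minimal simulation, where every number (the 40% success rate, the response probabilities, the sample sizes) is an illustrative assumption and not from the Ohio survey: when responding correlates with the thing being estimated, a large voluntary-response mailing stays biased, while a small fully-followed-up random sample does not.

```python
# Illustrative simulation only: the response probabilities and success rate
# below are made up, not taken from the Ohio deer survey.
import random

random.seed(1)

N = 479_000  # estimated Ohio hunter population, from the thread
population = [random.random() < 0.4 for _ in range(N)]  # True = successful hunt

def voluntary_estimate(sample_size):
    """Big mailing, no follow-up: success triples the odds of replying."""
    sample = random.sample(population, sample_size)
    responses = [x for x in sample if random.random() < (0.6 if x else 0.2)]
    return sum(responses) / len(responses)

def followed_up_estimate(sample_size):
    """Small mailing, but follow-up calls get an answer from everyone."""
    sample = random.sample(population, sample_size)
    return sum(sample) / len(sample)

true_rate = sum(population) / N
big_voluntary = voluntary_estimate(10_000)
small_complete = followed_up_estimate(500)
print(f"true={true_rate:.3f} voluntary={big_voluntary:.3f} complete={small_complete:.3f}")
```

The voluntary estimate lands near 0.67 no matter how many forms go out, while the small complete sample stays near the true 0.40: nonresponse bias does not shrink with sample size.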
Re: Standardized testing in schools
At the Three Mile Island plant, there was a strip-chart temperature recorder in the control room, with two pens, red & blue. And a tag note on it saying, "Remember, blue means hot." Common sense is not so common.

Jay

"W. D. Allen Sr." wrote:
> "And this proved to me, once again,
> why nuclear power plants are too hazardous to trust:..."
>
> Maybe you better rush to tell the Navy how risky nuclear power plants are!
> They have only been operating nuclear power plants for almost half a century
> with NO, I repeat NO, failures that have ever resulted in any radiation
> poisoning or the death of any ship's crew. In fact the most extensive use of
> Navy nuclear power plants has been under the most constrained possible
> conditions, and that is aboard submarines!
>
> Beware of our imaginary boogy bears!!
>
> You are right though. There is nothing really hazardous about the operation
> of nuclear power plants. The real problem has been civilian management's
> ignorance or laziness!
>
> WDA
>
> end
>
> "Rich Ulrich" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]...
> > Standardized tests and their problems? Here was a
> > problem with equating the scores between years.
> >
> > The NY Times had a long front-page article on Monday, May 21:
> > "When a test fails the schools, careers and reputations suffer."
> > It was about a minor screw-up in standardizing, in 1999. Or, since
> > the company stonewalled and refused to admit any problems,
> > and took a long time to find the problems, it sounds like it
> > became a moderately *bad* screw-up.
> >
> > The article about CTB/McGraw-Hill starts on page 1, and covers
> > most of two pages on the inside of the first section. It seems
> > highly relevant to the 'testing' that the Bush administration
> > advocates, to substitute for having an education policy.
> >
> > CTB/McGraw-Hill runs the tests for a number of states, so they
> > are one of the major players.
> > And this proved to me, once again,
> > why nuclear power plants are too hazardous to trust: we can't
> > yet trust managements to spot problems, or to react to credible problem
> > reports in a responsible way.
> >
> > In this example, there was one researcher from Tennessee who
> > had strong longitudinal data to back up his protest to the company;
> > the company arbitrarily (it sounds like) fiddled with *his* scores,
> > to satisfy that complaint, without ever facing up to the fact that
> > they did have a real problem. Other people, they just talked down.
> >
> > The company did not necessarily lose much business from the
> > episode because, as someone was quoted, all the companies
> > who sell these tests have histories of making mistakes.
> > (But do they have the same history of responding so badly?)
> >
> > --
> > Rich Ulrich, [EMAIL PROTECTED]
> > http://www.pitt.edu/~wpilib/index.html
Re: (none)
I've had occasion to talk with a number of educator types lately, at different application and responsibility levels of primary & secondary ed. Only one recalled the term "regression toward the mean." Some (granted, the less analytically minded) vehemently denied that it could be causing the results I was discussing. Lots of other causes were invoked. In an MBA course I teach, which frequently includes teachers wishing to escape the trenches, the textbook never once mentions the term. I don't recall any other intro stat book including the term, much less an explanation. The explanation I worked out required some refinement to become rational to those educator types (if it has yet :). So I'm not surprised that even the NYT would miss it entirely. Rich, I hope you penned a short note to the editor, pointing out its presence. Someone has to, soon.

BTW, Campbell's text, "A Primer on Regression Artifacts," mentions a correction factor/method, which I haven't understood yet. Does anyone in education and other social science circles use this correction, and may I have a worked-out example?

Jay

Rich Ulrich wrote: - selecting from CH's article, and re-formatting. I don't know if I am agreeing, disagreeing, or just rambling on.

On 4 May 2001 10:15:23 -0700, [EMAIL PROTECTED] (Carl Huberty) wrote:

CH: "Why do articles appear in print when study methods, analyses, results, and conclusions are somewhat faulty?"

- I suspect it might be a consequence of "Sturgeon's Law," named after the science fiction author: "Ninety percent of everything is crap." Why do they appear in print when they are GROSSLY faulty? Yesterday's NY Times carried a report on how the WORST schools have improved more than the schools that were only BAD. That was much-discussed, if not published. - One critique was the absence of peer review. There are comments from statisticians in the NY Times article; they criticize, but (I thought) they don't "get it" on the simplest point.
The article, while expressing skepticism by numerous people, never mentions "REGRESSION TOWARD the MEAN," which did seem (to me) to account for every single claim of the original authors whose writing caused the article.

CH: "[] My first, and perhaps overly critical, response is that the editorial practices are faulty [...] I can think of two reasons: 1) journal editors can not or do not send manuscripts to reviewers with statistical analysis expertise; and 2) manuscript originators do not regularly seek methodologists as co-authors. Which is more prevalent?"

APA journals have started trying for both, I think. But I think that "statistics" only scratches the surface. A lot of what arises are issues of design. And then there are issues of "data analysis." Becoming a statistician helped me understand those so that I could articulate them for other people; but a lot of what I know was never important in any courses. I remember taking just one course in epidemiology, where we students were responsible for reading and interpreting some published report, for the edification of the whole class -- I thought I did mine pretty well, but the rest of the class really did stagger through the exercise. Is this "critical reading" something that can be learned, and improved?
--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
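The "worst schools improved more than the merely bad ones" pattern can be reproduced with no real improvement at all. A minimal sketch of regression toward the mean, where every number is an illustrative assumption and nothing is taken from the NY Times report: schools get a stable "true quality" plus independent year-to-year test noise, and the bottom decile of year-1 scores then "improves" in year 2 with no intervention whatsoever.

```python
# Regression-toward-the-mean demo: selection on a noisy score guarantees
# apparent improvement on remeasurement. All parameters are invented.
import random

random.seed(2)

n_schools = 1_000
true_quality = [random.gauss(500, 30) for _ in range(n_schools)]

def observed(q):
    return q + random.gauss(0, 30)  # test noise comparable to the real spread

year1 = [observed(q) for q in true_quality]
year2 = [observed(q) for q in true_quality]  # nothing changed between years

cutoff = sorted(year1)[n_schools // 10]               # bottom-decile cutoff
worst = [i for i in range(n_schools) if year1[i] <= cutoff]

mean1 = sum(year1[i] for i in worst) / len(worst)
mean2 = sum(year2[i] for i in worst) / len(worst)
print(f"'worst' schools: year 1 mean {mean1:.1f}, year 2 mean {mean2:.1f}")
```

The schools singled out as worst were partly unlucky in year 1; their luck averages out in year 2, so the group mean climbs substantially even though true quality never moved.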
Re: Arithmetic, Harmonic, Geometric, etc., Means
"Simon, Steve, PhD" wrote:
> Stan Alekman writes:
>
> >What is the physical significance or meaning regarding a manufacturing process
> >whose output over an extended period of time has the same value for the
> >arithmetic, geometric and harmonic mean of a property - its purity, for example?
>
> Exactly the same value? I suspect that the only way this could happen would
> be if the data were constant.
>
> Almost the same value? Probably the data are very close to constant (i.e.,
> the coefficient of variation is very small).
>
> The geometric and harmonic means represent averages on the log and inverse
> scale, respectively, that are back-transformed to the original units of
> measurement. You might want to review a bit on transformations, especially
> the stuff on Exploratory Data Analysis by Tukey et al. One rule of thumb I
> seem to remember is that transformations do not have much of an impact on
> the data analysis until there is a good amount of relative spread in the
> data,

Yes.

> such as the maximum value being at least three times larger than the
> minimum value. This assumes of course that all your data are positive. Note
> that the ratio of the maximum to minimum values could be considered a
> measure of relative spread, just like the coefficient of variation.
>
> You might want to rethink your approach, however. Usually there are good
> physical reasons for preferring one measure of central tendency over
> another. Just blindly computing all possible measures of central tendency is
> an indication, perhaps, that you are not spending enough time thinking about
> the physical process that creates your data.
>
> You mention elsewhere, for example, that this data represents purity levels.
> Perhaps it might make more sense to look at impurity levels, since small
> relative changes in purity levels might be associated with large relative
> changes in impurity levels. Perhaps certain factors might influence impurity
> levels in a multiplicative fashion.
when impurities get down to low levels, all kinds of interesting things can happen. Steve's advice is good.

> Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
> STATS: STeve's Attempt to Teach Statistics. http://www.cmh.edu/stats
> Watch for a change in servers. On or around June 2001, this page will
> move to http://www.childrens-mercy.org/stats

Jay
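Steve's point about the three means can be checked directly. A small sketch with invented data: for positive data the means always order AM >= GM >= HM, and they coincide only when the data are constant, so three nearly equal means imply a tiny coefficient of variation.

```python
# Compare arithmetic, geometric, and harmonic means on near-constant data
# (like a purity measurement) versus widely spread data. Example values
# are invented. Requires Python 3.8+ for statistics.geometric_mean.
from statistics import mean, geometric_mean, harmonic_mean

near_constant = [99.1, 99.2, 99.0, 99.3]   # e.g. purity, in percent
spread_out = [10.0, 50.0, 250.0]           # max/min ratio of 25

results = {}
for name, data in (("near_constant", near_constant), ("spread_out", spread_out)):
    results[name] = (mean(data), geometric_mean(data), harmonic_mean(data))
    am, gm, hm = results[name]
    print(f"{name}: AM={am:.3f} GM={gm:.3f} HM={hm:.3f}")
```

For the purity-like data the three means agree to a small fraction of a unit; for the spread-out data they disagree wildly (roughly 103 vs 50 vs 24), which is Steve's "relative spread" rule of thumb in action.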
Re: probability and repeats
Dale Glaser wrote:
> Hi there... I have scoured my admittedly limited collection of probability
> texts, and am stumped to find the answer to the following, so any help is
> most appreciated... a colleague just approached me with the following
> problem at work: he wants to know the number of possible combinations of
> boxes, with repeats being viable... so, e.g., if there are 3 boxes then
> what he wants to get at is the following answer (i.e., c = 10):
>
> 111 222 333 112 221 332 113 223 331 123
>
> ...so there are 10 possible combinations (not permutations, since 331 =
> 133)... however, when I started playing around with various
> combination/factorial equations, I realized that there really isn't a pool
> of 3 boxes... there has to be a pool of 9 numbers, in order to arrive at
> combinations such as 111 or 333... so any assistance would be most
> appreciated as I can't seem to find an algorithm in any of my
> texts... thank you... dale glaser

maybe we need some more conditions on the problem? too many options here.

Suppose there are 3 unique boxes. How many ways can I stack the 3 of them? Answer: if I count permutations, it would be 3P3, or 3!/(3-3)! = 3! = 6. But you said not to count permutations. For combinations: 3C3 = 3!/[0!*3!] = 1. Silly, of course.

Suppose you can re-use a box, so 222 is OK. The first position has 3 possibilities, same with the 2nd and 3rd positions. Total: 3*3*3 = 27 ordered arrangements. If you set them out in a careful sequence, it will become more evident:

111
211
311
121
221
321
131
231
331
etc.
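For the record, the count Dale wants has a closed form: choosing k = 3 boxes from n = 3 kinds, with repeats allowed and order ignored, is a "combination with repetition," counted by C(n+k-1, k) = C(5, 3) = 10. A quick check with the standard library, which also confirms the 27 ordered arrangements:

```python
# Combinations with repetition vs. ordered arrangements for the n = 3,
# k = 3 box problem from the thread.
from itertools import combinations_with_replacement, product
from math import comb

n, k = 3, 3
multisets = list(combinations_with_replacement("123", k))
ordered = list(product("123", repeat=k))

print(len(multisets))       # matches Dale's enumeration of 10
print(comb(n + k - 1, k))   # the closed form C(5, 3)
print(len(ordered))         # the 3*3*3 count with order kept
```

The C(n+k-1, k) formula is the usual "stars and bars" argument: Dale's intuition about needing a pool of 9 numbers is unnecessary once repeats are built into the count.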
Re: errors in journal articles
Jerry Dallal wrote: A few years ago (many years ago?) someone wrote an article for the newsletter of the ASA Section on Teaching Statistics in the Health Sciences in which he described having each student select a published article "at random" and check it for internal consistency. Round-off errors were NOT counted as violations. His students found errors in one quarter of all articles checked. My experience with journal clubs suggests nothing has changed in the intervening years.

In a (probably unpublished) study a few years ago, a graduate student found that many or most of the articles in a language-research archival journal which contained enough data to analyze drew conclusions with acceptable alpha levels, but with severely weak power (that is, high beta). But hey, they presented data. Let's not knock their efforts too hard...

Jay
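The power complaint can be made concrete with a normal-approximation power calculation. The effect size and group sizes below are my own illustrative choices, not figures from the student's study:

```python
# Power of a two-sided, two-sample z-test at alpha = 0.05 (normal
# approximation); delta is the group difference in units of sigma.
from math import erf, sqrt

def normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power_two_sample(delta, sigma, n_per_group, z_alpha=1.96):
    se = sigma * sqrt(2.0 / n_per_group)   # std. error of the difference
    z = delta / se
    return normal_cdf(z - z_alpha) + normal_cdf(-z - z_alpha)

p20 = power_two_sample(0.5, 1.0, 20)
p100 = power_two_sample(0.5, 1.0, 100)
print(f"n=20 per group:  power = {p20:.2f}")
print(f"n=100 per group: power = {p100:.2f}")
```

With 20 subjects per group and a medium effect (half a standard deviation), power is only about 0.35: a study can be run at a respectable alpha of .05 while beta sits near two thirds, which is exactly the pattern the graduate student found.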
Re: Help me an idiot
Abdul Rahman wrote:
> Please help me with my statistics.
>
> Question:
>
> If you order a burger from McDonald's you have a choice of the following
> condiments: ketchup, mustard, lettuce, pickles, and mayonnaise. A
> customer can ask for all these condiments or any subset of them when he
> or she orders a burger. How many different combinations of condiments
> can be ordered? No condiment at all counts as one combination.
>
> Your help is badly needed
>
> Just an Idiot@leftover

Before you 'put yourself down' too hard, remember: ignorance can be cured, but stupid is forever. I recommend you pick the former, given a choice.

The recommended solution is a _combination_, not a _permutation_ - nobody cares whether the ketchup went on before the mustard. Each of the 5 condiments can be (a) absent or (b) present, independently of the others, so there are 2*2*2*2*2 = 2^5 = 32 possible combinations, including the bare burger. Equivalently, count the combinations of each size and add them up: 5C0 + 5C1 + 5C2 + 5C3 + 5C4 + 5C5 = 1 + 5 + 10 + 10 + 5 + 1 = 32. If order did matter, you would be counting permutations and the answer would be much larger.

Jay
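Brute-force enumeration settles the question: each of the 5 condiments is either on or off the burger, so there are 2^5 = 32 combinations, the empty set (a plain burger) included.

```python
# Enumerate every subset of the 5 condiments by size and count them.
from itertools import combinations

condiments = ["ketchup", "mustard", "lettuce", "pickles", "mayonnaise"]
total = sum(1 for k in range(len(condiments) + 1)
            for _ in combinations(condiments, k))
print(total)  # 2**5 = 32
```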
RE: Compartments, - [Fwd: [nsf01104] - Program Announcements & Information]
I think these people have the same problem with 'compartmentalizing' that Dennis and others have WRT academic departments. I haven't checked it out yet, but perhaps there is a solution here. If we can vertically integrate R & E (education), perhaps we can horizontally integrate some of the E in pursuit of some of the research results.

Jay

Original Message
Subject: [nsf01104] - Program Announcements & Information
Date: Fri, 27 Apr 2001 05:00:23 -0400 (EDT)
From: NSF Custom News Service <[EMAIL PROTECTED]>
To: CNS Subscribers <[EMAIL PROTECTED]>

The following document (nsf01104) is now available from the NSF Online Document System

Title: Vertical Integration of Research and Education in the Mathematical Sciences (VIGRE)
Type: Program Announcements & Information
Subtype: Math/Physical Sciences
Replaces nsf0040

It may be found at: http://www.nsf.gov/cgi-bin/getpub?nsf01104
--
NSF Custom News Service
http://www.nsf.gov/home/cns/start.htm
Please send questions and comments to [EMAIL PROTECTED]
Re: ways of grading group participation
Well, you sure started a large brush fire! Quick comments:

1) If the objective is to learn how to function in a group, then giving everyone the same grade for the work is a good way to emphasize the interdependent nature of a group project.

2) Yes, some complain that they did all the work. They usually know this will happen, going in. What about the group dynamics that the rest of the class learns (if they are in a cohort) - not to lean on one individual? What are the chances that this pattern will repeat when they get into the 'real world'?

3) In 11 years of small-group projects in classes, I had to intervene due to non-functional behavior only 1 time. I had to endure about 5X as many complaints.

4) Academia, in the traditional approach, tends to be solitary effort. Business in the USA today tends to be group oriented, with individuals doing their thing as contributions to the team. Many of my students were elated to find that their interviewers would ask them about group projects, and they could answer, "Yes, I can do that. I already have."

5) Dennis is right. 1 problem in 20 years is remarkable. Did you change your presentation in some manner? After all, 1/20 is 0.05. Maybe we could stretch this into a significance level of 0.05, and declare that something had changed!

EAKIN MARK E wrote: I have been assigning group projects for about 20 years and have been lucky enough (until this semester) to have few student complaints about their fellow group members. This semester I have many, many problems with groups complaining about members not carrying their fair share. Up to now, while I occasionally ask the students to grade themselves and others in their groups, I have never formally written a group-participation scoring protocol into my syllabus. Therefore I have started wondering about the best way of grading group-member participation. I asked several professors how they graded member participation and each had a slightly different way of doing it.
I was wondering how other faculty graded this participation. By the way, I have found a site called "Student Survival Guide to Managing Group Projects" that looks intriguing, but I haven't had time to investigate it in detail. If you are interested, see www.csc.calpoly.edu/~sludi/SEmanual/TableofContents.html

Mark Eakin
Associate Professor
Information Systems and Management Sciences Department
University of Texas at Arlington
[EMAIL PROTECTED] or [EMAIL PROTECTED]
Re: Correlation of vectors
good idea! but, I think it's the dot product - two vectors at right angles to one another have a dot product of 0. And I think WDA's suggestion of checking a book on the subject is a very good one :)

Jay

W. D. Allen Sr. wrote: If I remember correctly two vectors are independent if their cross product is zero. Check a vector analysis book for verification of this. WDA end

"Peter J. Wahle" <[EMAIL PROTECTED]> wrote: What can I tell about the relationship of two sets of experimentally derived vectors? Example:

Vector A    Vector B
(-1,1)      (-2,-1)
(-2,0)      (-2,0)
(-2,1)      (-1,0)
(0,0)       (0,0)
(-1,1)      (-1,0)
...

Each row is subject to the same conditions. I need to check for independence and need some sort of measure of correlation.
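Jay's correction is easy to check numerically: in the plane, orthogonality shows up as a zero dot product, and a normalized dot product (cosine similarity) gives a correlation-like measure between paired vectors. A small sketch, using a couple of rows from Peter's data:

```python
# Dot product and cosine similarity for 2-D vectors. Note that (0,0) rows
# must be skipped: the cosine is undefined for zero vectors.
from math import sqrt, isclose

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (sqrt(dot(u, u)) * sqrt(dot(v, v)))

assert dot((1, 0), (0, 1)) == 0        # right angle -> dot product zero

c_same = cosine((-2, 0), (-2, 0))      # identical pair from the data
c_mixed = cosine((-1, 1), (-2, -1))    # first row from the data
print(c_same, c_mixed)
```

Identical vectors give cosine 1.0; the first row gives 1/sqrt(10), about 0.316. Whether this is the right "correlation" for Peter's problem depends on what the rows represent, which is why WDA's advice to consult a vector analysis text still stands.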
Re: normal approx. to binomial
one tech issue, one thinking issue, I believe.

1) Tech: if np _and_ n(1-p) are > 5, the distribution of binomial observations is considered 'close enough' to Normal. So 'large n' is OK, but fails when p, the p(event), gets very small. Most examples you see in the books use p = .1 or .25 or so. Modern industrial situations usually have p(flaw) around 0.01 and less. Good production will run under 0.001. To reach the 'Normal approximation' level with p = 0.001, you have to have n = 5000. Not particularly reasonable, in most cases.

If you generate the distribution for the situation with np = 5 and n = 20 or more, you will see that it is still rather 'pushed' (tech term) up against the left side - your eye will balk at calling it normal. But that's the 'rule of thumb.' I have worked with cases pushing it down to np = 4, and even 3. However, I wouldn't want to put 3-decimal precision on the calculations at that point.

My personal suggestion: if you believe you have a binomial distribution, and you need the confidence intervals or other applications of the distribution, then why not simply compute them out with the binomial equations. Unless n is quite large, you will have to adjust the limits to suit the potential observations anyway. For example, if n = 10, there is no sense in computing a 3-sigma limit of np = 3.678 - you can only ever observe 3, and then 4. But that's the application level speaking here.

2) I think your books are saying that, when n is very large (or I would say, when np > 5), the binomial measurement will fit a Normal dist. It will be discrete, of course, so it will look like a histogram, not a continuous density curve. But you knew that. I think your book is calling the binomial rv a single measurement, and it is the collection of repeated measurements that forms the distribution, no? I explain a binomial measurement as: n pieces touched/inspected, x contain the 'flaw' in question, so p = x/n.
p is now a single measurement in subsequent calculations. To get a distribution of 100 proportion values, I would have to 'touch' 100*n pieces. I guess that's OK, if you are paying the inspector. Clearly, one of the drawbacks of a dichotomous measurement (either OK or not-OK) is that we have to measure a heck of a lot of them to start getting decent results. The better the product (fewer flaws), the worse it gets. See the situation for p = 0.001 above. Eventually we don't bother inspecting, or automate and do 100% inspection. So the next paragraph had better explain about the improved information with a continuous measure...

Sorry, I got up on my soap box by mistake. Is this enough explanation?

Jay

James Ankeny wrote:
> Hello,
> I have a question regarding the so-called normal approx. to the binomial
> distribution. According to most textbooks I have looked at (these are
> undergraduate stats books), there is some talk of how a binomial random
> variable is approximately normal for large n, and may be approximated by the
> normal distribution. My question is, are they saying that the sampling
> distribution of a binomial rv is approximately normal for large n?
> Typically, a binomial rv is not thought of as a statistic, at least in these
> books, but this is the only way that the approximation makes sense to me.
> Perhaps the sampling distribution of a binomial rv may be normal, kind of
> like the sampling distribution of x-bar may be normal? This way, one could
> calculate a statistic from a sample, like the number of successes, and form
> a confidence interval. Please tell me if this is way off, but when they say
> that a binomial rv may be normal for large n, it seems like this would only
> be true if they were talking about a sampling distribution where repeated
> samples are selected and the number of successes calculated.
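The np > 5 rule of thumb can be checked directly by comparing the exact binomial CDF with the continuity-corrected normal approximation. The two (n, p) pairs below are my own illustrative choices, one at np = 5 and one at np = 1:

```python
# Exact binomial lower-tail probability vs. the continuity-corrected
# normal approximation, at np = 5 and np = 1.
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf_approx(k, n, p):
    mu, sd = n * p, sqrt(n * p * (1 - p))
    z = (k + 0.5 - mu) / sd            # continuity correction
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

errors = {}
for n, p in ((50, 0.10), (1000, 0.001)):   # np = 5 vs np = 1
    exact = binom_cdf(2, n, p)
    approx = normal_cdf_approx(2, n, p)
    errors[(n, p)] = abs(exact - approx)
    print(f"n={n}, p={p}: exact P(X<=2)={exact:.4f}, normal={approx:.4f}")
```

Even at np = 5 the fit is only rough, and at np = 1 the skew makes it noticeably worse, which is one more argument for Jay's suggestion of computing with the binomial equations directly when p is small.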
Regression toward the Mean - search question
Dear Everyone, I feel singularly stupid. My filing system has collapsed, if it ever was structured. A few weeks ago, I believe on this list, a quick discussion of Galton's regression to the mean popped up. I downloaded some of Galton's data, generated my own, and found some ways to express the effect in ways my non-statistician education friends might understand. Still working on that part. In addition, there was a reference to a wonderful article, which I read, and which explained the whole thing with excellent clarity for me. The author is clearly an expert on the subject of detecting change in things. He (I think) even listed people who had fallen into the regression toward the mean fallacy, including himself. Problem: Now of course I really want that article again, and its reference. I cannot find it on my hard drive. Maybe I didn't download it - it was large. But I can't find the reference to it, either. Bummer! Can anyone figure out who and what article I'm referring to, and re-point me to it? Very much obliged to you all, Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
Re: probability textbook
Herman Rubin wrote: In article <[EMAIL PROTECTED]>, Vadim Marmer <[EMAIL PROTECTED]> wrote: I have tried a number of textbooks but I still cannot find one that combines intuition with mathematical rigour in a satisfactory way. The best I have seen so far is "Probability and Measure" by Billingsley, and the last one I have tried is "Probability for Statisticians" by Shorack, which is great and provides a lot of details but is too dry, and does not care much about developing intuition. What's your favorite textbook on Probability Measure Theory? It is verbose, but Loeve has considerable advantages. One criticism which has been made of it is that it uses the "cafeteria" style of theorems, namely, only the necessary conditions are used. I consider this to be an advantage, as special cases allow proofs which conceal the concepts. In proving a general version of a theorem, one usually is forced to come down to the essentials. As for developing intuition, this does not seem to be done in any book on any subject. I'd like to think that 'intuition' is what is commonly developed by graduate students, or employed graduates who use this stuff on a regular basis. That would fit with experience educating engineers, learning theory, etc., etc. Then I could concentrate on developing said intuition as part of the student-instructor interaction in class. Whether that interaction can be expressed or developed in a textbook, a fundamentally one-way communication, might be a subject for debate. Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX: (262) 681-1133 email: [EMAIL PROTECTED] web: http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
Re: can you use a t-test with non-interval data?
Ben Kenward wrote: > My girlfriend is researching teaching methods using a questionnaire, and she > has answers for questions in the form of numbers from 1 to 5 where 5 is > strongly agree with a statement and 1 is strongly disagree. She is proposing > to do a t-test to compare, for example, male and female responses to a > particular question. > > I was surprised by this because I always thought that you needed at least > interval data in order for a t-test to be valid. Her textbook actually says > it is OK to do this though. I don't have any of my old (life-sciences) stats > books with me, so I can't check what I used to do. > > So are the social scientists playing fast and loose with test validity, or > is my memory playing up? Classic issue, frequent discussion, careful distinctions needed in the response. Yes, interval data is needed to do a t test. Is data from a Likert scale (what your friend has) interval data? Depends on how you see it. When a respondent puts a mark halfway between two check boxes (i.e., 3.5 on the numerical scale), they are trying to tell you that _they_ see it as interval, as continuous in fact. What is the '3' position? Is it really between 2 and 4, or is it a 'none of the above' type of thing? If the latter, it's no dice - not interval. For a t test, you really want intervals that are equally spaced. Is this so, or at least reasonably close to so? Lots more debate on that. By marking the levels as points on the continuum from the 1 to the 5 positions, you are implying that they are equally spaced. Does the respondent see them that way? Could be. Maybe we should just try it, to see what comes out. For a t test, you also prefer a scale which is in principle potentially infinite. When I do this sort of thing, I sometimes get responses of 0 and 6, for potential conditions I didn't anticipate. Otherwise, the scale is restricted at the bottom and top. How to correct for this? 
One way is to do a logit transform (if I get the term right). Convert the 1 - 5 scale into a 0 to 1 scale by: y' = (y-1)/4, then a logit transform (omega transform via Taguchi): y'' = ln(y'/(1-y')). The y'' distribution will more closely approach the potentially infinite width requested, and will never give you a prediction of more than 5 or less than 1 on the y scale. BUT... this assumes that the earlier assumptions about scale and interval size are very tight. They probably aren't. Why waste your time doing very precise analyses on weak data? Suggestion: (a) run the t test on the raw responses, y's. See if anything pops up. (b) go back and check that the assumption requirements are met, or at least arguable. Check some respondents to see that they saw the scale as you did, and adjust your thinking to theirs. (c) IF you have time and the data is reasonably tight, AND if you want to impress someone with your transformational skills, then go do that transform and re-analyze. In most cases, the conclusions will not be greatly different, in my experience. The only place things get dicey is when a mean response is near the ends (1 or 5). Detecting differences there can be harder, and a small change there is more significant than a small change in the middle. References? Sorry - I've only done it a couple of times, and know it works - it gets me predictions that pan out in confirmation. Treating the data as nominal, instead of interval, may throw away information. That's expensive. Good luck, Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
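The rescale-then-logit recipe above can be sketched in Python. The `eps` guard is my addition, not part of the recipe (responses of exactly 1 or 5 map to 0 or 1, where the logit is undefined), and its value of 0.02 is arbitrary:

```python
import math

def likert_logit(y, eps=0.02):
    """Map a 1-5 Likert response to the logit scale as sketched above.
    eps keeps responses of exactly 1 or 5 off the 0/1 endpoints,
    where ln(y'/(1-y')) blows up -- eps is a hypothetical fix of mine."""
    y_prime = (y - 1) / 4.0                    # rescale 1..5 -> 0..1
    y_prime = min(max(y_prime, eps), 1 - eps)  # avoid the undefined endpoints
    return math.log(y_prime / (1 - y_prime))

responses = [1, 2, 3, 4, 5]
transformed = [round(likert_logit(y), 3) for y in responses]
print(transformed)
```

The transform is symmetric about the '3' position (which maps to 0) and stretches the scale near the 1 and 5 ends, matching the remark that small changes near the ends mean more than small changes in the middle.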
Re: Patenting a statistical innovation
Hi, Paige, I am also confident that your employer, Kodak, saw to it that you have an agreement with them, to the effect that what comes out of your head is theirs. I've signed a few of these, too. Received $1.00 for one :) I think they got their money's worth. Without conferring with a legal beagle (whose time is worth considerable...), I'd hazard a guess that mathematical principles, and perhaps the concepts of the algorithm, are universal and not patentable. The code to execute a principle/algorithm, however, can be and is patented or otherwise withheld. I recall some years ago, when an expert on linear programming methods explained how a new analytic procedure for solutions was held back as a trade secret. At a conference workshop, the inventor of it would walk around the room as people worked on a problem, giving them encouragement by saying, 'you're closer,' and 'that didn't work for me,' but he would not reveal how to do it, only a deep-background type of confirmation. He was specifically enjoined from telling, it seems. Was this a free and open intellectual discussion? No way. Did it hold back progress & insight? Yup. On the other hand, did this person's employer pay a sizable salary to have him sitting around working it out? Yes, and if they hadn't, it is unlikely that the technique would have been discovered/invented. Patents run out, too. In the aluminum industry there are many patents around. The holders waste no time in cross licensing, at an agreed-upon rate (usually pennies per pound produced). It is standard procedure. The key is whether the product of the invention is easily traceable. If I can buy a piece of aluminum and tell immediately that it was made using my patented process, then I can go after the makers for my piece. Can the results of patented software be traced in the same manner? Maybe not, but the code can be. Especially if it is provided in compiled form. But the software industry has a long history of flat-out rip-offs. 
"What's in code is mine, and only big evil companies would take money for what I want." A very adolescent attitude, perhaps, but also not a good attitude on which to found cross licensing of patents. So I don't think that patenting of software will make others' software worse, or cumbersomely unusable. I think unwillingness to work out cross licensing agreements, and stick to them, may cause greater problems. Jay Paige Miller wrote: > dennis roberts wrote: > >> just think about all the software packages ... and what they would HAVE (or >> HAD) to do if these routines were "patented" ... >> >> sure, i see "inventing" some algorithm as being a highly creative act and >> usually, if it is of value, the person(s) developing it will gain fame (ha >> ha) and some pr ... but, not quite in the same genre of developing a >> process for extracting some enzyme from a substance ... using a particular >> piece of equipment specially developed for that purpose >> >> i hope we don't see a trend IN this direction ... > > > If it so happens that while I am in the employ of a certain company, I > invent some new algorithm, then my company has a vested interest in > making sure that the algorithm remains its property and that no one > else uses it, especially a competitor. Thus, it is advantageous for my > employer to patent such inventions. In this view, mathematical > inventions are no different than mechanical, chemical or other > inventions. > -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: basic stats question
Richard A. Beldin wrote: > I have long thought that the usual textbook discussion of independence > is misleading. In the first place, the most common situation where we > encounter independent random variables is with a cartesian product of > two independent sample spaces. Example: I toss a die and a coin. I have > reasonable assumptions about the distributions of events in either case > and I wish to discuss joint events. I have tried in vain to find natural > examples of independent random variables in a sample space not > constructed as a cartesian product. > > I think that introducing the word "independent" as a descriptor of > sample spaces and then carrying it on to the events in the product space > is much less likely to generate the confusion due to the common informal > description "Independent events don't have anything to do with each > other" and "Mutually exclusive events can't happen together." > > Comments? 1) It is conceivable that a plant making blue and red 'thingies' on the same production line would discover that the probability that the next thingie is internally flawed (in the cast portion) is independent of the probability that it is blue. BTW - 'Thingies' are so commonly used by everyone that it is not necessary to describe them in detail. :) 2) There are many terms, concepts, and definitions in the 'textbook' that have no exact match in reality. Common expressions include 'There is no such thing as random,' 'There is no such thing as Normal (distribution),' and my own contribution, 'There is no such thing as a dichotomy this side of a theological discussion.' The abstract definitions are just that - theoretical ideals. Down here in the mud of reality, we recognize this, and try to decide if the theory is reasonably close to what is happening. A couple of confirmation trials help, too. 
If the internal casting flaws are generated at an early point, and the paint is added later, depending on the orders received, then I would assert that independence was likely. If the paint is added to castings made on different dies or production machines, as a color code, then I would suspect independence was unlikely. 3)Presenting 'independence' as axes in a cartesian coordinate system is extremely handy, especially for discussing orthogonal arrays and designed experiments, etc. The presentation, however, does not make them independent. One has to check the physical system behavior to assure that. 4)I may have shot far wider than your intended mark, in which case, sorry for the interruption. Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
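The "check the physical system" advice in point 3) above can be made concrete: independence holds exactly when the joint proportion equals the product of the marginals. A sketch with hypothetical thingie counts (all numbers invented):

```python
# Hypothetical counts for blue/red "thingies" vs. flawed/OK castings.
# These counts are constructed, not real plant data.
counts = {("blue", "flawed"): 10, ("blue", "ok"): 190,
          ("red",  "flawed"): 5,  ("red",  "ok"): 95}
total = sum(counts.values())

p_blue = (counts[("blue", "flawed")] + counts[("blue", "ok")]) / total
p_flawed = (counts[("blue", "flawed")] + counts[("red", "flawed")]) / total
p_both = counts[("blue", "flawed")] / total

# Under independence, P(blue and flawed) = P(blue) * P(flawed).
print(p_both, p_blue * p_flawed)
```

With real counts the two numbers never match exactly, and the question becomes whether the gap is more than sampling noise - which is a chi-square test, and still no substitute for knowing whether paint is applied after the flaws are cast in.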
Re: two sample t
dennis roberts wrote: > when we do a 2 sample t test ... where we are estimating the > population variances ... in the context of comparing means ... the > test statistic ... > > diff in means / standard error of differences ... is not exactly like > a t distribution with n1-1 + n2-1 degrees of freedom (without using > the term non central t) > > would it be fair to tell students, as a thumb rule ... that in the > case where: > > ns are quite different ... AND, smaller variance associated with > larger n, and reverse ... is the situation where the test statistic > above is when we are LEAST comfortable saying that it follows (close > to) a t distribution with n1-1 + n2-1 degrees of freedom? > > that is ... i want to set up the "red flag" condition for them ... > > what are guidelines (if any) any of you have used in this situation? G. E. P. Box says, (a) if n(1) = n(2), treat them as if s(1) = s(2). (b) if s(1)/s(2) (selecting 1 & 2 so the ratio is >1) is less than about 3, treat them as if s(1) = s(2). This is approximately equivalent to running an F test for a difference in variances, and I think that is where he gets it from. (c) if n(1) is within 'about' 10% of n(2), go for option (a) above. I have a paper I can't find for (a) and (b), but (c) was verbal. When you speak of getting 'LEAST comfortable', I think you are asking how much deviation you can stand. A lot depends on the consequences of deviation - decision 'theory' etc. If you take a non-dichotomous view of 't' testing, the question becomes immaterial anyway. Cheers, Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
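For the "red flag" case Dennis describes (quite different ns, with the smaller variance attached to the larger n), the pooled and Welch statistics pull apart visibly. A sketch with invented summary statistics; the formulas are the standard pooled-variance t and the Welch-Satterthwaite approximation, not anything specific to Box:

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Classic two-sample t with pooled variance, df = n1 + n2 - 2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch t with Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# The red-flag condition: larger n paired with the smaller variance.
# Means, SDs, and ns below are made up for illustration.
t_p, df_p = pooled_t(10.0, 1.0, 40, 9.0, 4.0, 8)
t_w, df_w = welch_t(10.0, 1.0, 40, 9.0, 4.0, 8)
print(round(t_p, 3), df_p)
print(round(t_w, 3), round(df_w, 1))
```

Here the pooled statistic is roughly double the Welch one, and the Welch df collapses from 46 to about 7 - exactly the situation where "n1-1 + n2-1 degrees of freedom" is least trustworthy.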
Re: Type III errors
And all this time I thought it was a tongue-in-cheek way of saying, "find an answer to the wrong question." Ah! How thick of me. :) Jay Karl L. Wuensch wrote: > Recently there was a discussion here involving the phrase "Type III > errors." I noted that others have used that phrase to mean inferring > the incorrect direction of effect after rejecting a nondirectional > hypothesis, but I was unable to give references. This week I stumbled > across references to such use of the term. Interested parties may > consult the article by Leventhal and Huynh (Psychological Methods, > 1996, 1, 278-292). They recommend that the probability of making a > Type III error be subtracted from the probability of correctly > rejecting the nondirectional null when computing power (since it is > common practice to infer a direction of effect following rejection of > a nondirectional null. > > I have posted a summary of the article at: > http://core.ecu.edu/psyc/wuenschk/StatHelp/Type_III.htm > > + > Karl L. Wuensch, Department of Psychology, > East Carolina University, Greenville NC 27858-4353 > Voice: 252-328-4102 Fax: 252-328-6283 > > <mailto:[EMAIL PROTECTED]> > <mailto:[EMAIL PROTECTED]>[EMAIL PROTECTED] > <mailto:[EMAIL PROTECTED]> > <http://core.ecu.edu/psyc/wuenschk/klw.htm> > <http://core.ecu.edu/psyc/wuenschk/klw.htm>http://core.ecu.edu/psyc/wuenschk/klw.htm > -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Determing Best Performer (fwd)
Eric Bohlman wrote: Bob Hayden <[EMAIL PROTECTED]> wrote: The most cost-effective method is to roll a die. (See Deming.) If the representative's performances all lie within the control limits for the process that's being measured, yes. If one or more representatives' performances lie outside the system on the "good" side, then Deming would have had no problem with rewarding them (though he'd have emphasized that the most important thing for management to do would be to find out why they were doing so well and whether they were using any techniques that could be adopted by the others). Very true. It's called 'continuous improvement,' if you can stand the cliche/buzzword. As I recall, however, the original measurement scheme for assessing 'performance' left something to be desired. When we measure one item of a system in the hope of changing another item of it, we are believing there is a relationship between them. Hopefully causative. If that relationship is not strong, or is accompanied by a great deal of scatter, then less causative relationships will do as well. Hence, rolling a die may be equally related to the desired outcome, and cost a lot less. Now, if you can show, with numbers and data, that your assessment rating of 'performance' and 'performance' measured by another method are closely related, then show us all. Please. Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX: (262) 681-1133 email: [EMAIL PROTECTED] web: http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
Re: Regression analysis.
Accounting problem or no, it looks like a multiple regression issue to me. As such, I would ask: 1)Have you looked for many possible factors (ind. vars)? I'm thinking something nearly off the wall might be a big one. If you miss a major factor in your list, then the analysis will say there is a lot of scatter, and you will conclude that none of what you have really do a lot for it. Takes thinking to get these down. 2)social science and business analysis in general cannot insist on orthogonal factor conditions. Which leads to a lot of soul searching near the conclusions. And confusion in those conclusions. Clearly, you are going to have that problem. Getting orthogonal factor conditions does not mean looking only at individuals who are narrowly defined, but that you get a mathematically proper spread of variations. Can you search your data for a few factors that seem to really count, then select data that fits an orthogonal array? use the 'discarded' data to test the resultant model for predictive capability. 3)It's going to be interesting to see if you can really pull this one off. Better put gender into it somewhere, while you're at it, too. That just popped up in the news lately. Joe Meyer wrote: > I am trying to estimate how faculty salaries at my university are allocated > by instructional level and academic discipline to estimate the actual cost > of teaching a semester credit hour by instructional level and discipline. I > developed a regression model with faculty teaching purely in specific > academic disciplines as my observations. Actual salary is my dependent > variable. Independent variables include lower-level credit hours taught, > upper-level credit hours taught, masters-level credit hours taught, doctoral > credit hours taught, total students taught, and dummy variables for faculty > rank, tenure status, and academic discipline. 
My original idea was to > either: > > 1) use dummy variables for instructional levels, or > > 2) plug in lower-, upper-, and masters-level credit hour data one at a time > to get separate estimates for each level and discipline. > > The dummy variable idea will not work, because I do not know what amount of > each salary is spent at each level for my dependent variable. I believe this is part of what you are trying to find out. So you shouldn't know it up front! :) > And, I will > double count the estimators for tenure status, faculty rank, and discipline > if I just plug credit hour data for each of the different instructional > levels into the model while using zeros for other levels of instruction to > get a prediction for each level. Is there a way to predict the salary cost > by instructional level and discipline without fitting the model only on > faculty who are teaching purely in one discipline and purely at one > instructional level? I can do that, but I am concerned that such faculty > would not be very representative of the population. I am not too worried > about fitting the regression model to faculty who teach in a single > discipline, since most do this anyway, but am afraid that limiting to > faculty who teach at a particular level will skew the results. > > Thanks! > Joe Meyer > > > f INAPPROPRIATE MESSAGES are available at > http://jse.stat.ncsu.edu/ > = > > > -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
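Joe's setup - salary regressed on credit hours taught at each level plus rank/tenure dummies - can be sketched with synthetic, noise-free data, where the per-level cost coefficients are recovered exactly. Everything here is invented for illustration (the dollar figures especially), and real data would of course add noise and collinearity:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60  # hypothetical number of faculty

lower = rng.integers(0, 10, n)    # lower-level credit hours taught
upper = rng.integers(0, 10, n)    # upper-level credit hours
masters = rng.integers(0, 6, n)   # masters-level credit hours
tenured = rng.integers(0, 2, n)   # tenure dummy (0/1)

# Hypothetical "true" cost per credit hour at each level, plus a tenure bump.
salary = 40000 + 800 * lower + 1200 * upper + 2000 * masters + 15000 * tenured

# OLS: the level coefficients are the per-credit-hour cost estimates Joe wants.
X = np.column_stack([np.ones(n), lower, upper, masters, tenured])
beta, *_ = np.linalg.lstsq(X, salary.astype(float), rcond=None)
print(np.round(beta, 1))
```

The point of the sketch is that no dummy variable for "instructional level" is needed: each level's credit hours enter as their own regressor, and the fitted coefficient on each is the estimated marginal salary cost per credit hour at that level, for faculty teaching any mix of levels.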
Re: Hypothesis Testing where General Limit Theorem doesn't hold?
Neo Sunrider wrote: > I am just taking an undergraduate introductory stats course but now I > am faced with a somewhat difficult problem (at least for me). > > If I want to test a hypothesis (t-test, z-score etc.) and the underlying > distribution will under no circumstances approach normal... (i.e. the results > of the experiment will always be something like 100*10.5, 40*-5 etc.) The > Central Limit Theorem doesn't help here, or does it? > > Can anyone explain, or point me in the right direction - how can I test in > these cases? > > Thanks alot > > A. > Test whether the distribution really is nowhere 'near normal': generate a whole pot load of data (at least 100 points), from nominally identical conditions. Make a histogram. Eyeball it. Then run some of the various normality tests on it. If your eye can't say it is not normal, it is probably close enough for a 't' test. Second, the Central Limit Theorem says the distribution of the _contrast_ (i.e., the average of a sample) tends toward a normal distribution. It says nothing about the individual observations. Only certain very special distributions that do not converge properly are excluded from the CLT's power. If your distribution is clearly not normal, but looks log normal or the like, consider a transformation on the raw data. -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
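The three suggestions above (eyeball the raw distribution, lean on the CLT for averages, and consider a transform for log-normal-looking data) can be seen in one simulation sketch; the log-normal choice and the sample sizes are my own invented example:

```python
import math
import random
import statistics

random.seed(42)
raw = [math.exp(random.gauss(0, 1)) for _ in range(1000)]  # strongly skewed data

def skewness(xs):
    """Population skewness: third central moment over sd cubed (0 if symmetric)."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

# CLT in action: the distribution of sample means (n=50) is far less skewed
# than the raw observations.
means = [statistics.mean(random.choices(raw, k=50)) for _ in range(1000)]

# The transform route: log of log-normal-ish data is close to symmetric.
logs = [math.log(x) for x in raw]

print(round(skewness(raw), 2), round(skewness(means), 2), round(skewness(logs), 2))
```

The raw data is heavily right-skewed, the means are much closer to symmetric (the CLT working on the contrast, not the observations), and the logged data is nearly symmetric - the histogram-and-eyeball step would show the same thing.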
Re: Determing Best Performer
Grant Robertson wrote: > Hi All, > I have a client who insist on determining which call centre representative > is the weeks best performer. We conduct weekly interviews and ask > respondents to rate each representative. For some reps we conduct 5 > interviews others 1 and so on. Now the question is how do you go about > determining which rep performed best, we combine the two top box ratings. We > have looked at margin's of error for each rep but this hasn't proved to be > useful as you will be unlikely to find significant differences with such > small bases sizes. Any suggestion as to the best method to do this would be > appreciated. > > Regards, > Grant Question 1) What does the boss want to accomplish with the information you build? If it's improved performance, what measurement will help the call centre reps move in that direction? And I'm begging the question of what 'performance' is. You can't. If the reps learn that the top-performer title is awarded for what amounts to random selection, it will lose any motivating characteristic. Selecting only the top individual each week tends to include a large randomness component. How about the top 15% on a rating scale, as one improvement. Also, make some kind of run chart of the ratings (ugh! but it's your client), and see that there is an award for consistent high ratings. And re-visit question 1. Does your survey 'instrument' get measures of what the boss _really_ cares about, or an honest indicator of same? Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
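The "large randomness component" is easy to demonstrate: give every rep *identical* true quality, rate each with 1 to 5 interviews a week (as in Grant's setup), and crown a weekly "best performer." Everyone wins sometimes. All parameters here are made up:

```python
import random

random.seed(7)
REPS = 8      # hypothetical number of reps, all equally good by construction
WEEKS = 200

wins = {r: 0 for r in range(REPS)}
for _ in range(WEEKS):
    scores = []
    for r in range(REPS):
        k = random.randint(1, 5)  # 1-5 interviews conducted this week
        ratings = [random.randint(1, 5) for _ in range(k)]  # pure noise ratings
        scores.append(sum(ratings) / k)
    winner = max(range(REPS), key=lambda r: scores[r])
    wins[winner] += 1

print(wins)
```

Every rep collects "best performer" titles despite having, by construction, exactly the same underlying quality - which is the case for rolling a die instead, and for reporting something like a top-15% band rather than a single weekly winner.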
Re: careers in statistics
a)there is always plenty of work for those who like what they do. b)If it's money you're wanting, then you have to be able to do something that someone else will pay to have done. c)happiness comes to those who find (a) and (b) in the same activity. Or who do not compromise (a) very much in pursuit of (b). d)What can a master in stats do? to translate to what CEO's and other money sources understand, I'd suggest you emphasize the technical issues you can address, and not mention the 'pure' math angle of statistics. Or search well for the CEO's who understand the details of the PDCA loop. JohnD231 wrote: > Id like anyones input on the following: > > 1. job prospects at the masters level better than MS in non-data driven topics. > > 2. starting pay You won't starve, and you may pay off your loans quickly. > > 3. job satisfaction that's your responsibility, not the company''s. > > > Thanks > > > = > Instructions for joining and leaving this list and remarks about > the problem of INAPPROPRIATE MESSAGES are available at > http://jse.stat.ncsu.edu/ > = > > > -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today? = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Statistics is Bunk
Eric Bohlman wrote: > Jeff Rasmussen <[EMAIL PROTECTED]> wrote: > >> I tried to subtlety nudge the Pro-Bunk side to look at Quantum Physics and >Heisenberg's Uncertainty Principle as well as Eastern Philosophies,[snip of excellent >stuff] > > > > I think this ultimately comes down to George Box's famous quip that all > models are wrong, but some models are useful. 'Tain't a quip. It's fact! :) > [more snip of more excellent stuff] > Might also push Pirsig's Zen & the Art of Motorcycle Maintenance. At the very edge of understanding, there is a zone of intuition, not objectivity. A miraculous place. Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
Re: Significance Testing in Experiments
Magill, Brett wrote: > The more general concern about significance testing notwithstanding, I have > a question about the use of testing, or other inferential statistical > techniques, in experiments using human subjects, or any other research > method that does not use probability sampling... > > Now, all of these tests that we run--whether from ANOVA, regression, > difference of means, correlations, etc.--are based on the assumption that we > have sampled from a population using some probability sampling procedure. > And the meaning of p is inextricably linked to the properties of the > sampling distribution. > > However, little experimental research with human subjects is done using a > sample. maybe a nit pick, but on the contrary, they are _all_ done with a sample. - a group of measured elements of the set of the population. The sample must be finite, which it is. > Most often, in my experience, these studies use volunteer subjects > or naturally existing groups. These subjects are then randomly assigned to > treatment and control groups. If we can assert, loudly, that the subjects are 'representative of the population' that we care about, then how we obtained them is of no concern, I'd say. I can, indeed must, define the 'population' to include the group of non-subjects I wish to make a prediction on. For example, I would not test females with Viagra, as I don't believe it will have any effect (I'm guessing in technical ignorance, here). a counter example: did the first tests with Thalidomide include any pregnant women? I guess no. Then nobody should predict how the drug will work on pregnant women (or that it will not have side effects). The tragedy was that the predictions made (whether explicitly or by uninformed decision/action) were proven absolutely false. the difficulty, as I see it, is that occasionally, we define a population, solicit some subjects, and then assert that the subjects match the population, when they don't. 
The solicitation of volunteers occasionally has a selection effect. For example, in the USA it tends to under-represent African Americans, who often distrust medical researchers. > Yet, in every instance that I know of, > results are presented with tests of significance. It seems to me that > outside of the context of probability sampling, these tests have no meaning. > Despite this, presentation of such results with tests of significance are > common. > > Is there a reasonable interpretation of these results that does not rely on > the assumption of probability sampling? It seems to me that simply > presenting and comparing descriptive results, perhaps mean differences, > betas from a regression, or some other measure of effect size without a test > would be more appropriate. This would however be admitting that results have > no applicability beyond the participants in the study. Moreover, these > results say nothing about the number of subjects one has, which p might help > with regard to establishing minimum believability. Yet, conceptually, p > doesn't work. > > Am I missing the boat here? Significance testing in these situations seem to > go over just fine in journals. Appreciate your clarification of the issue. > > Regards > > Brett How would you design a study, using a Bayesian approach? Jay -- Jay Warner Principal Scientist Warner Consulting, Inc. North Green Bay Road Racine, WI 53404-1216 USA Ph: (262) 634-9100 FAX:(262) 681-1133 email: [EMAIL PROTECTED] web:http://www.a2q.com The A2Q Method (tm) -- What do you want to improve today?
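One answer to Brett's question: when subjects are *randomly assigned* (even without random sampling), a randomization (permutation) test gives p a direct interpretation - how often does re-shuffling the group labels produce a mean difference as large as the one observed? This is my suggestion, not something stated in the thread, and the outcome numbers below are invented:

```python
import random
import statistics

random.seed(3)
# Made-up outcomes for randomly assigned treatment and control groups.
treatment = [12.1, 14.3, 13.8, 15.0, 13.2, 14.9]
control = [11.0, 12.2, 12.8, 11.9, 13.1, 12.0]

observed = statistics.mean(treatment) - statistics.mean(control)

# Under the null, labels are exchangeable: shuffle and recompute the difference.
pooled = treatment + control
n_t = len(treatment)
B = 5000
count = 0
for _ in range(B):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:n_t]) - statistics.mean(pooled[n_t:])
    if diff >= observed:
        count += 1
p_value = count / B  # one-sided Monte Carlo p

print(round(observed, 2), p_value)
```

The resulting p refers only to the randomization actually performed, so it needs no appeal to probability sampling from a larger population - generalizing beyond the volunteers remains a separate, non-statistical argument.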
Re: Excel Graphics
Robert J. MacG. Dawson wrote:
> David Duffy wrote:
> > [EMAIL PROTECTED] wrote:
> > > But Excel CAN produce simple scatter plots or bar charts. It is just that the defaults are so horrible. With a lot of tweaking you can make them
> >
> > My problem is cost. I want to get everyone in my department to have the facility to produce reasonable charts that have a common style.
> >
> > Perhaps it would be easier for people to quickly dump a CSV file into the Win32 version of R or Gnuplot. R really does do nice graphics, which can go back into Word/PowerPoint etc. as JPEGs (with minimal compression).
>
> What about GIF/PNG? For graphics with few colors and sharp edges these formats - run-length-encoded and then compressed using a repeated-strings algorithm - are usually more compact, and are loss-free. However, as I recall, they do not resize as well as JPEGs.

Anything in .gif files will not print well - plan on it, and cheer for exceptions. These are usually 72 dpi resolution, which looks good only on your semi-friendly computer screen. Good screens show the difference, too.

> With small numbers of data, in Windows, the WMF (Windows Metafile) is ideal - it is portable between all major applications, and as it stores the picture as curves, it is infinitely scalable.

By which I take it you mean 'vectorized' or EPS-like structures. That makes a good, rescalable image. I wonder about the portability of WMFs, though. MS PowerPoint has problems right and left; sorry 'bout that. Especially on a Mac. PowerPoint can handle EPS, just barely, on any platform.

If you are really concerned with publication-level output, why not check with your friendly publisher/printer? They won't necessarily sympathize with economic concerns, but if your archival journal publisher specifies a format, it darn well better be a valid one! Also, I once dealt with a book publisher who used an old version of Word for the editors, and liked the word-processing material in flat text form.
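The dpi point can be made concrete with a little arithmetic: a raster image prints cleanly only if it carries enough pixels for the target physical size at the printer's resolution, which is why a screen-resolution GIF looks ragged on paper while a vector format rescales for free. A trivial sketch (illustrative numbers, pure Python):

```python
def pixels_needed(width_in, height_in, dpi):
    """Pixel dimensions a raster image needs to print at a given size."""
    return round(width_in * dpi), round(height_in * dpi)

# A 3 x 2 inch figure at typical screen resolution vs. a common print target:
screen = pixels_needed(3, 2, 72)    # screen-only GIF territory
print_ = pixels_needed(3, 2, 300)   # what a printer would want
print(screen)   # (216, 144)
print(print_)   # (900, 600)
```

So a 216 x 144 pixel chart must either print tiny or be interpolated up, and interpolation is what produces the fuzz; a metafile/EPS carries curves instead of pixels and dodges the problem entirely.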
Jay
Re: Recommend multiple regression text please
I agree, a statistician would be helpful for your problem. Or at least someone who can weed through some of the alternatives and potholes in the roads to your solution. (Sorry, had to say that.)

Jim Kroger wrote:
> Hi,
>
> We are using a program called Fidap to do multiple regression on neuroimaging data. The problem is nobody here or where the program came from understands how to select coefficients, i.e., whether and how to make them orthogonal. I am not very sophisticated statistically, but can muddle through a book if it's not just equations. Would anyone recommend an understandable book on multiple regression? Thanks very much for your help.
>
> I'll explain the problem a bit in case it helps in understanding what I need to learn.
>
> We administer problems of type A or B. Each has two main periods: solve and rest. We want to find areas in the brain that respond to a particular epoch during each type of problem. So I guess the factors (regressors?) would look like (though they would be extended over all trials):

If a factor was the duration of a rest or solve period, this is interesting. But you have the situation of time sequence, which standard DoE (the direction you are aiming for) has a hard time handling.

> A solve | rest   B solve | rest ...
>
> |~~~|
> |~~~|
> |~~~|
> |~~~|
>
> So we select coefficients for these regressors' epochs as such:
>
> 1 0 0 0
> 0 1 0 0
> 0 0 1 0
> 0 0 0 1
>
> We do not know whether these four regressors (can we call them vectors?) need to be orthogonal. Especially as we get into more complicated analyses where the regressors or effects overlap or coincide. Nobody who uses the software we're using has used more than two very simple regressors, so nobody is able to advise us; we need to learn about multiple regression and figure out what to do ourselves.
> Thanks much.
> Jim

In US industry, there exists a 'what software' syndrome, in which someone selects a certain software package before they have explored the technical problem to be solved. Then they wonder why statisticians have less hair than others - it's been pulled out in frustration. This is often the level at which the decision maker can ask the questions, but it is not a suitable sequence for efficiently solving the technical problem at hand.

Yes, you will want to do studies that involve orthogonal arrays, for those questions that can be formulated to use them. I wonder about your issue of order/sequence, nonetheless. If order makes a difference, or is adjusted, then the order is a factor, and the number of factors you must consider just blossomed, greatly.

Traditional DoEs address a (small) portion of the system at hand, in such a way that they usually avoid feedback _within_ the experimental conditions. Only you can judge whether this is an issue, although an experienced DoE person (statistician or otherwise) can often help dig out the phenomena.

If you are willing to handle the real thing, I'd suggest Box & Draper, 1987, Empirical Model-Building and Response Surfaces. A bit heavy on the math, but watch the pictures. And orthogonal designs are all there.

Jay
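On the orthogonality question itself: two regressors are orthogonal when their dot product is zero, and the 4 x 4 identity pattern of coefficients quoted above gives indicator regressors that are orthogonal by construction, because their epochs never overlap. Overlapping or coinciding effects break this. A quick pure-Python check with short hypothetical regressors (the time courses here are invented for illustration):

```python
def dot(u, v):
    """Dot product of two equal-length regressors."""
    return sum(a * b for a, b in zip(u, v))

# Non-overlapping epoch indicators, extended over time points:
r1 = [1, 1, 0, 0, 0, 0, 0, 0]   # hypothetical "A solve" epoch
r2 = [0, 0, 1, 1, 0, 0, 0, 0]   # hypothetical "A rest" epoch
print(dot(r1, r2))               # 0: orthogonal by construction

# A regressor that overlaps r1 in time is not orthogonal to it:
r3 = [0, 1, 1, 1, 0, 0, 0, 0]   # hypothetical overlapping effect
print(dot(r1, r3))               # 1: shares a time point with r1
```

When regressors are orthogonal, each coefficient in the multiple regression can be estimated without reference to the others; once they overlap, the estimates become entangled and the software's answer depends on the full set of regressors entered, which is exactly why the "more complicated analyses" mentioned above need care.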
Re: Two sided test with the chi-square distribution?
Bob Wheeler wrote:
> Your point is a good one, but as a side issue, let me object to the word "fudged." It implies chicanery, which is not something that even Fisher cared to imply. No one will ever know why Mendel's results appear as they do, but it was not necessarily with an intent to mislead. An argument can be made that his intent was to call attention to the regularities involved, just as one does by showing a line on a plot instead of the scattered points from which it is calculated. Attitudes about data were different then.
[snip the rest]

Even Sir Isaac Newton, in correspondence with his publisher, discussed what we would call 'massaging' of data until it fit right. Nonetheless, I spend a fair amount of time in a UG metallurgy lab emphasizing that _they_ cannot adjust and discard embarrassing points to fit preconceptions.

Cheers,
Jay
Re: statistics question
Flash response:

1) Are the levels fixed by some characteristic of the process? They look continuous, and you could do much better if they were, and you could select different intermediate levels.

2) The number of levels can be what you want of it. Some good response surface designs use 5 levels; some use more.

3) Factor B levels are equally spaced, which is good. Factor A levels are not evenly spaced. A full factorial will not give you a fully 'clean' design - without doing the math, I suspect the uneven spacing will cost you orthogonality somewhere (between the linear and quadratic terms of A, for instance), even if you did do all the combinations.

4) What are you going to do with the results of this experiment? If you wish to build a model of the system behavior, then a full factorial type approach is a waste of your effort, time, and experimental runs.

5) I suggest you look at a response surface model, with maybe 3-5 levels in both factors, but using a proper RSM-type design. If you do it properly, you can avoid a single 'corner' point and recover it mathematically.

6) I'd also ask whether you have hard reason to believe that an RSM-type model, which will get you quadratic terms in the model, is in fact worth doing (financial/your-time costs) the first time out. If little prior information is available, it would probably be better to do a simpler, 2-level factorial first, if at all possible. Doing this will teach you a great deal [that you probably don't already know]. Your choice here, but remember - most people overestimate their knowledge level :)

7) You haven't discussed the response yet. Please spend some time thinking about that, too.

More later, if this helps at all. Let me know.

Jay

[EMAIL PROTECTED] wrote:
> Hi,
>
> I have two factors A and B and I want to run a DOE to study my response. My factor B is at 3 levels (900, 1450 and 2000); my factor A is at 4 levels: 35, 65, 80 and 105.
> First of all, is it right to have one factor at 4 levels? I have encountered situations where the factors are either at 2 levels or 3 levels.
> This will require me to have 12 runs for a full factorial, right?
> Also, I do not want to run only the level 35 of factor A with the level 900 of factor B. If I remove the combinations 35, 1450 and 35, 2000, I'll have only 10 runs and the resulting design space will not be orthogonal. How do I tackle this problem?
> Is there a different design that you would suggest?
> Thanks for your help.
> SH Lee
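The orthogonality worry in point 3 is worth checking numerically. In a balanced full factorial the centered linear columns of A and B come out orthogonal to each other regardless of spacing; it is the linear and quadratic terms of A that fail to be orthogonal when the A levels are unevenly spaced. A quick pure-Python check using the levels quoted above:

```python
from itertools import product

A = [35, 65, 80, 105]      # unevenly spaced levels
B = [900, 1450, 2000]      # evenly spaced levels
runs = list(product(A, B))  # the 12-run full factorial

def centered(col):
    """Subtract the column mean, so dot products test orthogonality."""
    m = sum(col) / len(col)
    return [x - m for x in col]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

a_lin = centered([a for a, b in runs])
b_lin = centered([b for a, b in runs])
a_quad = centered([x * x for x in a_lin])   # quadratic term of A

print(dot(a_lin, b_lin) == 0)       # True: linear A and B stay orthogonal
print(abs(dot(a_lin, a_quad)) > 1)  # True: uneven spacing shows up here
```

So the 12-run design is fine for estimating the two linear main effects; the trouble arrives when a quadratic term in A enters the model, which is one more argument for the RSM-style designs suggested above.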
Re: Group Level Data - Proportion of Group Movement
Careful, Tim, this is where the State of Massachusetts came in.

If you select a group based on their position in the bottom 10% or top 10% on a test, and then retest, _even the next day_, we can expect the bottom 10% to improve! Fantastic! That was one heck of a day's education! In addition, the top 10% will drop. Guess we'd better get on those teachers' cases, for doing such a lousy job. Especially all in one day!

This is Galton's 'regression toward the mean,' and it will bite you every time you select the study groups based on their ranking on the test, and then retest the groups.

Without going into the details (see references given before), I suggest you select groups based on demographics and other non-test-graded variables. Ask exactly what you intend to do with the results - assess students, teachers, or schools. It makes a big difference how you administer, and what you have on, the tests. If the test cannot answer the technical question you ask, then there is no sense running the numbers for analysis.

Jay

Tim Victor wrote:
> Here's the scenario. We have a year's worth of data on several school districts consisting of:
>
> ° scores obtained from the beginning of the school year
> ° scores obtained from the end of the school year
> ° ordinal index of beginning-of-the-school-year ability, e.g., at risk, mainstream, gifted
> ° grade
> ° demographic variables
>
> A student's score would determine group membership.
>
> We would like to use these data to predict the proportion of students in each group who will move from the current group (k) to the k+ith group.
>
> I'm thinking Poisson regression might be the way to go here. I'd like to hear/read others' thoughts.
>
> Thanks in advance.
> Tim
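Regression toward the mean is easy to demonstrate by simulation: give simulated students a stable 'true ability' plus independent test-day noise, select the bottom 10% on test 1, and watch their average 'improve' on the retest with no teaching at all. A hypothetical sketch (all numbers invented; the ability and noise scales are arbitrary):

```python
import random

rng = random.Random(42)
n = 10000

# True ability is stable; each test adds independent measurement noise.
ability = [rng.gauss(500, 80) for _ in range(n)]
test1 = [a + rng.gauss(0, 50) for a in ability]
test2 = [a + rng.gauss(0, 50) for a in ability]  # retest: same ability, new noise

# Select the bottom 10% on test 1, then look at the same students' retests.
order = sorted(range(n), key=lambda i: test1[i])
bottom = order[: n // 10]

mean1 = sum(test1[i] for i in bottom) / len(bottom)
mean2 = sum(test2[i] for i in bottom) / len(bottom)
print(mean2 > mean1)   # True: the group "improves" with no intervention at all
```

The bottom decile on test 1 is there partly because of bad luck (negative noise), and the luck does not repeat on test 2, so the group drifts back toward its true-ability mean. This is exactly why selecting study groups by test ranking and then retesting is a trap, and why demographic selection avoids it.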
Re: Levels of measurement.
Time to get down & dirty on this one. I think the logic/word concepts are in need of refinement.

Paul W. Jeffries wrote:
> I have been thinking about levels of measurement too much lately. I have a question that must have a simple response, but I don't see it right now. The textbooks say that a ratio scale has the properties of an interval scale plus a true zero point. This implies that any scale that has a true zero point should have the cardinal property of an interval scale; namely, equal intervals represent equal amounts of the property being measured.

I don't think so, on abstract grounds. If I set up a scale of visual surface roughness, 0 is 'absolutely smooth,' and then I can put others out there, out to 5 or 6. Nobody's to say whether I have them spaced evenly (by some contact surface-roughness scale, such as AA or RMS), but we have a true zero. Your argument to this point says that if an item can have properties A and B, then the presence of B implies the item contains A. My example above shows that no, it does not.

> But isn't it possible to have a scale that has a true zero point but on which equal intervals do not always represent the same magnitude of the property?

Yes, see above. However, I don't think dollars is such a scale.

> Income measured in dollars has a true zero point; zero dollars is the absence of income. Yet, an increase in income from say 18,000 to 19,000 is not the same as an increase from 1,000,000 to 1,001,000.

Depends a great deal on what you mean by 'magnitude.' [And what 'is' is. Sorry, I couldn't help myself.] I think you are considering incremental value, not dollars. Say I look at a pressure gauge: a clear 0 (absolute zero pressure - space), and measurable, evenly incremented units. An increase in pressure from 100 psi to 101 psi is the same increase as from 0.1 psi to 1.1 psi, and the same as from 10,000 to 10,001 psi. Same for dollars.
We put a different emotional value on dollars than on psi, however :)

> At the low end of the income scale an increase of a thousand dollars is a greater increase in income than a thousand-dollar increase at the high end of the scale.

This looks like you are looking at incremental amounts, in percentage units/values. By this thinking, Bill Gates could drop a few million on my favorite charity with the same elan as I drop $10. The charity receiving these will place a different value on the dollars, depending on who makes the contribution. But if we each send the charity $100, the charity will value the contributions equally. (At least until they see who signed one of the checks!)

> It seems the reason that an interval of $1000 is not the same on all parts of the scale is because the proportion of the increase in income is different. Going from 18,000 to 19,000 is a 6% increase in income and would be felt. But an increase from 1,000,000 to 1,001,000 is a mere .1% and would hardly be noticed.

Exactly so.

> So is income in dollars measured at an interval level, and the zero is not a true zero point? Is income measured at a ratio level and so equal intervals represent equal amounts of income?

I say ratio level, with equal increments. The value that you place on a given increment depends on the percentage increase, and your perception of that increment. BTW, a change in pressure from 14.7 psi to 13.7 psi would signal a major storm approaching. A change in pressure from 1000 to 1001 psi would not make a whit of difference if it was applied to your bod - you'd be pretty flattened either way.

> I'm anxious to read what list members make of this.
>
> Paul W. Jeffries
> Department of Psychology
> SUNY--Stony Brook
> Stony Brook NY 11794-2500

Consider a single example: if the amount you receive as a raise in salary is exactly equal to the amount your child-care bill increases, and they both get announced to you on the same day, it ain't a raise.
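The two readings of 'the same increase' can be separated with one line of arithmetic: the absolute change in the examples above is identical ($1000), while the relative change is not. A trivial check:

```python
def pct_change(old, new):
    """Relative change, as a percentage of the starting value."""
    return 100 * (new - old) / old

# Same $1000 absolute increase, very different relative increase:
print(round(pct_change(18_000, 19_000), 2))        # 5.56 - a raise you'd feel
print(round(pct_change(1_000_000, 1_001_000), 2))  # 0.1  - hardly noticed
```

The dollar scale itself is ratio-level with equal increments; it is the percentage (and the perception of it) that differs along the scale.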
Jay