July Sale on Toner Cartridges!!!

2001-07-25 Thread lacachatare897743



D & J Printing Corporation
2564 Cochise Drive
Acworth, GA 30102
770-974-8228
[EMAIL PROTECTED]



 --LASER, FAX AND COPIER PRINTER TONER CARTRIDGES--

*WE ACCEPT GOVERNMENT, SCHOOL AND UNIVERSITY PURCHASE ORDERS*


***FREE SHIPPING WITH ANY ORDER OF $200 OR MORE!!!***

APPLE

  LASER WRITER SELECT 300/310/360       $60
  LASER WRITER PRO 600/630 OR 16/600    $60
  LASER WRITER 300/320 OR 4/600         $45
  LASER WRITER LS/NT/NTR/SC             $50
  LASER WRITER 2NT/2NTX/2SC/2F/2G       $50
  LASER WRITER 12/640                   $60
   
HEWLETT PACKARD

  LASERJET SERIES 1100/1100A (92A)  $40
  LASERJET SERIES 2100/SE/XI/M/TN (96A) $70
  LASERJET SERIES 2/2D/3/3D (95A)   $43
  LASERJET SERIES 2P/2P+/3P (75A)   $55 
  LASERJET SERIES 3SI/4SI   (91A)   $75  
  LASERJET SERIES 4/4M/4+/4M+/5/5M/5N (98A) $55  
  LASERJET SERIES 4L/4ML/4P/4MP (74A)   $40  
  LASERJET SERIES 4000/T/N/TN  (27X-HIGH YIELD) $70
  LASERJET SERIES 4V/4MV   $80 
 
  LASERJET SERIES 5000 (29X)   $95 
 
  LASERJET SERIES 5L/6L $39
  LASERJET SERIES 5P/5MP/6P/6MP $50
  LASERJET SERIES 5SI/5SI MX/5SI MOPIER/8000    $85
  LASERJET SERIES 8100/N/DN (82X)   $115

HEWLETT PACKARD LASERFAX

  LASERFAX 500/700, FX1 $50  
  LASERFAX 5000/7000, FX2   $50
  LASERFAX FX3  $60 
  LASERFAX FX4  $65 

LEXMARK

  OPTRA 4019, 4029 HIGH YIELD     $130
  OPTRA R, 4039, 4049 HIGH YIELD  $135
  OPTRA S, 4059 HIGH YIELD        $135
  OPTRA N                         $110


EPSON LASER TONER

  EPL-7000/7500/8000    $95
  EPL-1000/1500         $95

EPSON INK JET

  STYLUS COLOR 440/640/740/760/860 (COLOR)   $20

  STYLUS COLOR 740/760/860  (BLACK)  $20


CANON
  LBP-430       $45
  LBP-460/465   $55
  LBP-8 II      $50
  LBP-LX        $54
  LBP-NX        $90
  LBP-AX        $49
  LBP-EX        $59
  LBP-SX        $49
  LBP-BX        $90
  LBP-PX        $49
  LBP-WX        $90
  LBP-VX        $59


  CANON FAX L700 THRU L790 (FX1)$55 
  CANON FAX L5000 THRU L7000 (FX2)  $55 

CANON COPIERS

  PC 1/2/3/6/6RE/7/8/11/12/65 (A30) $69 
  PC 210 THRU 780 (E40/E31)  $80   
 
  PC 300/400 (E20/E16)  $80

NEC

  SERIES 2 LASER MODEL 90/95    $100
  SUPERSCRIPT 860               $115


PLEASE NOTE:
***FREE SHIPPING WITH ANY ORDER OF $200 OR MORE!!!***
 * ALL OF OUR PRICES ARE IN US DOLLARS
 * WE SHIP UPS GROUND.  ADD $6.50 FOR SHIPPING AND HANDLING
 * WE ACCEPT ALL MAJOR CREDIT CARDS OR COD ORDERS.
 * COD CHECK ORDERS ADD $3.50 TO YOUR SHIPPING COST.   
 * OUR STANDARD MERCHANDISE REPLACEMENT POLICY IS NET 90 DAYS.
 * WE DO NOT SELL TO RESELLERS OR BUY FROM DISTRIBUTORS.
 * WE DO NOT CARRY: BROTHER, MINOLTA, KYOCERA, PANASONIC, XEROX, 
FUJITSU, OKIDATA OR SHARP PRODUCTS. 
 * WE ALSO DO NOT CARRY:  DESKJET OR BUBBLEJET SUPPLIES.
 * WE DO NOT BUY FROM OR SELL TO RECYCLERS OR REMANUFACTURERS.

   -PLACE YOUR ORDER AS FOLLOWS-

1) BY PHONE (770) 974-8228
2) BY MAIL:  D AND J PRINTING CORPORATION
 2564 COCHISE DR
 ACWORTH, GA 30102
3) BY INTERNET: [EMAIL PROTECTED]

 INCLUDE THE FOLLOWING INFORMATION WHEN YOU PLACE YOUR ORDER:

1) YOUR PHONE NUMBER
2) COMPANY NAME
3) SHIPPING ADDRESS
4) CONTACT NAME
5) ITEMS NEEDED WITH QUANTITIES
6) METHOD OF PAYMENT (COD OR CREDIT CARD)
7) CREDIT CARD NUMBER WITH EXPIRATION DATE
** IF YOU ARE ORDERING BY 

New Opportunity

2001-07-25 Thread Julie Cooper

An excellent opportunity to utilise your technical skills within the
broader drug development arena as a:

Biostatistician
South-East England

You will join a major international pharmaceutical company, committed
to the development of innovative new therapies for the treatment of
respiratory disease.  As a key member of the Clinical Development
team, you will bring statistical expertise to the design, analysis and
interpretation of Phase II to III trials, utilising your knowledge to
influence the overall direction of clinical development programmes.

To succeed in this challenging role you will have an MSc or PhD in
Biostatistics / Statistics, backed by at least 2 years'
experience within the pharmaceutical industry or a contract research
organisation.  An enthusiastic self-starter, you will have the
interpersonal skills necessary to succeed within a multi-disciplinary
team and to effectively communicate statistical concepts and
information to non-statisticians.

To apply, send your CV, ideally by e-mail as a Word document, to Dr
Kay Wardle at [EMAIL PROTECTED], quoting reference 01156Go. 
Alternatively, call first for a brief, confidential discussion on 00
44 1707 280815.





Re: likert scale items

2001-07-25 Thread Dennis Roberts

At 07:26 AM 7/25/01 -0400, Teen Assessment Project wrote:
> I am using a measure with likert scale items.  Original psychometrics
> for the measure
> included factor analysis to reduce the 100 variables to 20 composites.
> However, since the variables are not interval, shouldn't non-parametric
> tests be done to determine group differences (by gender, age, income) on
> the variables?

what were you assuming about the variables when you did a factor analysis 
on them???

>   Can I still use the composites...was it appropriate to
> do the original factor analysis on ordinal data?




_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






vote counting

2001-07-25 Thread Sanford Lefkowitz

In a certain process, there are millions of people voting for thousands
of candidates. The top N will be declared winners. But the counting
process is flawed and with probability 'p', a vote will be miscounted.
(it might be counted for the wrong candidate or it might be counted for
a non-existent candidate.)

What is the probability that the counted top N will correspond to the
real top N?
(there are actually two cases here: one where I want the top N to be
in the correct order, and one where I don't care whether the order is
correct)

Thanks for any ideas,
Sanford Lefkowitz
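
One way to get a feel for the answer is simulation. A minimal Monte Carlo
sketch (Python/NumPy; the toy vote distribution and all names here are
hypothetical), assuming each vote is independently miscounted with
probability p and lands on a uniformly random candidate (a miscount to a
non-existent candidate would simply remove a vote, an easy variant):

    import numpy as np

    rng = np.random.default_rng(0)

    def sim_top_n(true_votes, p, N, n_reps=1_000):
        """Estimate P(counted top N matches real top N) by simulation."""
        k = len(true_votes)
        real_top = np.argsort(true_votes)[::-1][:N]
        hits_set = hits_order = 0
        for _ in range(n_reps):
            lost = rng.binomial(true_votes, p)   # miscounted votes per candidate
            counted = true_votes - lost
            # redistribute the miscounted votes uniformly over all candidates
            counted = counted + rng.multinomial(lost.sum(), np.ones(k) / k)
            top = np.argsort(counted)[::-1][:N]
            hits_set += set(top) == set(real_top)
            hits_order += np.array_equal(top, real_top)
        return hits_set / n_reps, hits_order / n_reps

    # toy example: 100 strong candidates among 5000, top N = 10, 1% miscounts
    votes = np.concatenate([rng.integers(1, 3, 4_900),
                            rng.integers(100, 10_000, 100)])
    print(sim_top_n(votes, p=0.01, N=10))  # (same set, same set and order)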






Re: vote counting

2001-07-25 Thread Dennis Roberts

At 09:33 AM 7/25/01 -0400, Sanford Lefkowitz wrote:
> In a certain process, there are millions of people voting for thousands
> of candidates. The top N will be declared winners. But the counting
> process is flawed and with probability 'p', a vote will be miscounted.
> (it might be counted for the wrong candidate or it might be counted for
> a non-existent candidate.)


could you elaborate on a real context for something like this? sure, in 
elections, millions of people vote for thousands of candidates BUT ... 
winners are not determined by the top N # of votes across the millions ... 
for example ... in utah ... the winner might have a very SMALL SMALL 
fraction of millions ... but, in ny state ... a LOSER might have a very 
LARGE fraction of the millions

so, a little more detail about a real context  might be helpful






output

2001-07-25 Thread Dennis Roberts

for a class ... i used an example from moore and mccabe ... a 2 factor 
anova case ...

4 levels of factor A ... 4 levels of factor B ... completely randomized 
design ... n=10 in each of the 16 cells

now, after the data are stacked so that data are in a column and codes for 
the two independent variables are in TWO other columns ... it is easy to 
get a nice graph ... and do the anova  which yielded one main effect 
and a significant interaction

graph = 1 page ... anova output = part of 1 page ...

now, if you wanted to do some multiple comparisons ... say, the tukey test 
... there is an option in the minitab glm command to do this 

think of it ... 16 means ... all possible comparisons ... and minitab not 
only produces (which i like) confidence intervals but ... all possible t 
test statistics ...

THAT TOOK AND YIELDED ... 12 pages of output!

reading statistical output these days is really complicated due (partly) to 
THAT ... the volume of possible output becomes huge ... hence, the 
confusion factor of reading (heaven forbid ... understanding!) what is 
there drastically increases
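
for anyone who wants to reproduce the effect, here is a sketch of the same
shape of analysis in Python/statsmodels (the data are made up to stand in
for the Moore and McCabe example; minitab's glm option is analogous). the
ANOVA table is a few lines, while the all-pairs Tukey output runs to
16*15/2 = 120 comparisons:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rng = np.random.default_rng(2)

    # 4 x 4 completely randomized design, n = 10 per cell, with an
    # A main effect and an A-by-B interaction built into the fake data
    df = pd.DataFrame([(a, b, rng.normal(a + (a == 2) * b, 1))
                       for a in range(4) for b in range(4) for _ in range(10)],
                      columns=["A", "B", "y"])

    print(anova_lm(smf.ols("y ~ C(A) * C(B)", df).fit()))  # part of 1 page

    # all possible comparisons among the 16 cell means: the "12 pages"
    cell = df.A.astype(str) + ":" + df.B.astype(str)
    print(pairwise_tukeyhsd(df.y, cell))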



_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: vote counting

2001-07-25 Thread Donald Burrill

The answers to your questions depend heavily on structural information 
that you almost certainly don't have, else one would not bother to have 
arranged a voting process.  But consider two very different cases:
  A.  Voters are absolutely indifferent to candidates:  that is, all the 
candidates are equally attractive, or equally preferred by the voters. 
Then the identity of the candidate with the most votes is purely random, 
and the probability that the counted top N will correspond to the real 
top N will be very low indeed (in part because there IS no real top 
N;  but even in the sense that another vote taken tomorrow would be 
very unlikely to reproduce the same set of top N, let alone in the 
same order). 
  B.  Some candidates are strongly preferred to others (by the voters as 
a whole, that is, as a population), and exactly N such candidates are so 
preferred.  About the rest the voters are indifferent, on the whole.  In 
these circumstances, one would expect a large difference between the 
number of votes cast for the least of the N and the number of votes cast 
for the greatest of the remaining candidates, and the probability that
the counted top N will correspond to the real top N would be rather 
high (depending in part on how large 'p' is).
I do not see how to estimate such a probability in the absence 
of any information about the distribution of preferences.
I've assumed that by counting votes you mean that each voter 
casts exactly one ballot for (at most?) one candidate.  For other voting 
schemes (e.g., vote for K candidates, K .LE. N, and specify one's 
preferences among them by assigning each candidate a preference from 1 
(most favored) to K (least favored)) it is imaginable that answers to 
your questions might not differ, but showing that to be the case (or 
not) is another matter entirely.
It also occurs to me that a single probability 'p' of error in 
voting must be a global average and is an oversimplification almost 
certainly.  In case A above, the results of an election might be 
dominated by voters whose personal 'p' is large;  although, again, it is 
not clear to me how one might show such a thing formally.
-- DFB.

On Wed, 25 Jul 2001, Sanford Lefkowitz wrote:

> In a certain process, there are millions of people voting for thousands
> of candidates. The top N will be declared winners. But the counting
> process is flawed and with probability 'p', a vote will be miscounted.
> (it might be counted for the wrong candidate or it might be counted for
> a non-existent candidate.)

The latter would constitute a spoiled ballot, or not?

> What is the probability that the counted top N will correspond to the
> real top N?
> (there are actually two cases here: 1 where I want the order of the top
> N to be in the correct order and the other where I don't care if the
> order is correct)
>
> Thanks for any ideas,
> Sanford Lefkowitz

 Donald F. Burrill                                 [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110                      603-471-7128






RE: vote counting

2001-07-25 Thread Lefkowitz, Sanford

The case is very much like case B. A relatively small percentage of 
candidates (maybe about 15%) will have a significant number of votes. A 
large number of candidates will have only 1 or 2 votes. It is the case 
that each voter gets only one vote. It is possible (but nontrivial) to 
estimate the shape of the distribution of the number of votes received.
It probably is a major oversimplification to assume that the probability 
'p' of error is constant across all sources, but it would be highly 
impractical to assume otherwise.
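
Given that shape, a hypothetical generator for the vote counts (Python/NumPy;
the Zipf exponent and every number below are made up for illustration) can
feed the kind of simulation sketched after the original question:

    import numpy as np

    rng = np.random.default_rng(4)

    # ~15% of candidates draw substantial, heavy-tailed vote counts;
    # the remaining candidates get only 1 or 2 votes each
    n_cand = 5_000
    head = rng.zipf(1.8, int(0.15 * n_cand)) * 100
    tail = rng.integers(1, 3, n_cand - len(head))
    true_votes = np.sort(np.concatenate([head, tail]))[::-1]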
 

-Original Message-
From: Donald Burrill [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 25, 2001 11:33 AM
To: Sanford Lefkowitz
Cc: [EMAIL PROTECTED]
Subject: Re: vote counting


 snip





Re: output

2001-07-25 Thread Dennis Roberts


perhaps we need software to have 2 overall options ... show me all the 
output


or, in the case of some interaction plots ... find a graphing method ... 
using different symbols ... that represent ON the graph ... pairs that are 
different from others (ie, any pair of DARK dots means different ... 
whereas a DARK dot and a LIGHT dot ... mean no) ... if we have adopted some 
pre set alpha ...

or, a little table FIRST in the output ... that simply lists the 
combinations ... and says next to them ... YES ... NO ... without all the 
other peripherals included ...
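
the "little table first" idea is only a few lines on top of the hypothetical
statsmodels sketch in the "output" message (it assumes the df and cell names
from that sketch, and a preset alpha of .05):

    from itertools import combinations
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # one line per pair, YES/NO at the preset alpha; the CIs and test
    # statistics stay available in res.summary() for whoever wants them
    res = pairwise_tukeyhsd(df.y, cell, alpha=0.05)
    for (g1, g2), differ in zip(combinations(res.groupsunique, 2), res.reject):
        print(g1, "vs", g2, "YES" if differ else "NO")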


_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: variance estimation and cross-validation

2001-07-25 Thread Michael F.

From: Mark Everingham ([EMAIL PROTECTED])
Subject: variance estimation and cross-validation 
Newsgroups: sci.stat.math, sci.stat.edu, sci.stat.consult, sci.math
Date: 2001-07-24 03:14:05 PST 

I'm not familiar with your research area, but if I understand your data 
correctly you have N image measurements in each of k folds, so that N=8 and k=10
in your example below.

> I have a set of N images. I train a classifier to label pixels in an image
> as one of a set of classes. To estimate the accuracy of the classifier I use
> cross-validation with k folds, training on k-1 and testing on 1. Thus the
> estimated accuracy on an image is
>
>    mu = mean(mean[i], i=1..k)
>
> where mean[i] is the mean accuracy across the images in fold i

In the line above you define mu as the mean accuracy over images and folds, 
but above the formula for mu, you say that it is the accuracy on an image. 

> I also want to know how much the accuracy varies from one image to another.
> I can think of two ways of estimating this:

So, each observation is a measurement of accuracy?  As you focus on variances 
below, you seem to be interested in the variability of the measurements, 
perhaps primarily within folds?

Depending upon your design and sampling, a variance components analysis may 
be appropriate to answer your question.  For example, if the within and between 
fold errors were independent, you could compare their variances. 
 
> (a) sigma^2 = mean(var[i], i=1..k)
>
> where var[i] is the variance of the accuracy across the images in fold i

One disadvantage with this approach is that you assume that the covariance 
between measurements on the same fold is zero (it doesn't look to be enormous 
in your data below, but it could still be important).  Another is that, in your 
example, each var[i] is based only upon 8 measurements.

> or
>
> (b) sigma^2 = var(mean[i], i=1..k) * n
>
> where n is the number of images in each of the folds.

With n=1 in the above formula (how do you interpret it otherwise?), you are 
ignoring the within-fold variance, which you might do if you have reason to 
believe that it is small.  Though, your example data below seems to suggest 
otherwise.  The variance for fold 3 looks unusually small.

It might be a good idea to first estimate the between and within-fold variance 
components, and possibly others, depending upon the details of your study design, 
how much data you really have, and which assumptions you are willing to make.

Different approaches could be used to look at accuracy in other ways.

> An example:
>
> fold   mean      var
>    1   91.43   36.2404
>    2   89.05   58.3696
>    3   97.39    3.3856
>    4   89.38   78.1456
>    5   91.09  104.858
>    6   88.49   87.4225
>    7   86.59  148.596
>    8   90.36   97.8121
>    9   86.05   77.6161
>   10   88.98  125.44
>
> n = 8 (fold size)
>
> mu = 89.881
> sigma^2 by (a) = 81.7886 (sigma = 9.0437)
> sigma^2 by (b) = 71.7367 (sigma = 8.4698)
>
> Which estimate is better, or are both incorrect? I appreciate that the fold
> size (8) and number of folds (10) are small. Is there a better way? Is there
> any way to establish a confidence interval on the estimate?

Exact confidence intervals can easily be calculated for many variance 
components when the data are balanced, though for unbalanced data you'll
usually have to settle for approximate methods.  See the texts below for 
some details on OLS approaches.  I don't know a good reference on ML methods 
for computing CIs on variance components.

Snedecor GW, Cochran WG. Statistical Methods (8th ed.). Ames, IA: Iowa 
State University Press, 1989.

Burdick RK, Graybill FA. Confidence Intervals on Variance Components. 
New York: Marcel Dekker, 1992.
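
For concreteness, a minimal sketch (Python/NumPy, not from the original post)
that reproduces the two estimates from the example table; note that (b) only
gives 71.7367 if the variance of the fold means uses the 1/k divisor rather
than the usual 1/(k-1):

    import numpy as np

    # fold means and within-fold variances from the example table
    mean = np.array([91.43, 89.05, 97.39, 89.38, 91.09,
                     88.49, 86.59, 90.36, 86.05, 88.98])
    var = np.array([36.2404, 58.3696, 3.3856, 78.1456, 104.858,
                    87.4225, 148.596, 97.8121, 77.6161, 125.44])
    n = 8  # images per fold

    mu = mean.mean()           # 89.881
    sigma2_a = var.mean()      # (a) mean within-fold variance: 81.7886
    sigma2_b = mean.var() * n  # (b) np.var's default 1/k divisor: 71.7367
                               # with ddof=1 this becomes ~79.7 instead
    print(mu, sigma2_a, sigma2_b)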





Re: likert scale items

2001-07-25 Thread Alex Yu


The following is extracted from one of my webpages. Hope it can help:

--

The issue regarding the appropriateness of ordinal-scaled data in 
parametric tests was unsettled even in the eyes of Stevens (1951), the 
inventor of the four levels of measurement: "As a matter of fact, most of 
the scales used widely and effectively by psychologists are ordinal 
scales ... there can be involved a kind of pragmatic sanction: in numerous 
instances it leads to fruitful results" (p. 26). Based on the central 
limit theorem and Monte Carlo simulations, Baker, Hardyck, and 
Petrinovich (1966) and Borgatta and Bohrnstedt (1980) argued that for 
typical data, worrying about whether scales are ordinal or interval 
doesn't matter.

Another argument against banning interval-based statistical techniques 
for ordinal data was suggested by Tukey (1986). In Tukey's view, this was 
a historically unfounded overreaction. In physics before precise 
measurements were introduced, many physical measurements were only 
approximately interval scales. For example, temperature measurement was 
based on liquid-in-glass thermometers. But it is unreasonable not to use 
a t-test to compare two groups of such temperatures. Tukey argued that 
researchers painted themselves into a corner on such matters because we 
were too obsessed with sanctification by precision and certainty. If 
our p-values or confidence intervals are to be sacred, they must be 
exact. In the practical world, when data values are transformed (e.g. 
transforming y to sqrt(y), or log y), the p-values resulting from different 
expressions of the data would change. Thus, ordinal-scaled data should not be 
banned from entering the realm of parametric tests. For a review of the 
debate concerning ordinal- and interval-scaled data, please consult 
Velleman and Wilkinson (1993).

from:
http://seamonkey.ed.asu.edu/~alex/teaching/WBI/parametric_test.html



Chong-ho (Alex) Yu, Ph.D., MCSE, CNE
Academic Research Professional/Manager
Educational Data Communication, Assessment, Research and Evaluation
Farmer 418
Arizona State University
Tempe AZ 85287-0611
Email: [EMAIL PROTECTED]
URL:http://seamonkey.ed.asu.edu/~alex/
   
  






Re: SRSes

2001-07-25 Thread Art Kendall



Dennis Roberts wrote:

 snip

 but, we KNOW that most samples are drawn in a way that is WORSE than SRS ...

 thus, essentially every CI ... is too narrow ... or, every test statistic
 ... t or F or whatever ... has a p value that is too LOW ...

 what adjustment do we make for this basic problem?

The adjustment for design is done with weights to get the point estimates using
regular software such as SPSS etc.  To get the confidence estimates special
software such as WESVAR, SUDAAN, or CPLEX is commonly used.  Because the latter
packages are not as user friendly in their presentation of results, I usually
get the point estimates in SPSS, then I use WESVAR or SUDAAN and get both point
and interval estimates.  I use the point estimates from the latter packages as
navigational aids to find the interval estimates in the output and to assure
that I am getting the right computations.

Some sampling designs include cluster sampling (random effects), some
stratification (fixed effects), and some both.
For those with stratification only, if there is any difference in the means
(proportions) among the strata, usually the CIs will be too wide.  For those
with cluster sampling, usually the CIs will be too narrow.  For those designs
with both stratification and clustering, the CIs will be subject to both
narrowing and widening, and only specialized software will tell the net effect.

In addition, ratio, regression, or difference estimates may have narrower true
CIs.
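
A small illustration of the clustering half of that point (a hypothetical
Python simulation, not how WESVAR or SUDAAN actually compute anything): the
true sampling variance of a mean under cluster sampling can be several times
the SRS variance at the same n, so an SRS-formula CI is too narrow by the
square root of that design effect.

    import numpy as np

    rng = np.random.default_rng(1)

    # population of 50 clusters x 20 elements, with a between-cluster
    # component, so elements within a cluster are positively correlated
    pop = rng.normal(0, 1, (50, 20)) + rng.normal(0, 1, 50)[:, None]

    reps, n = 2_000, 100
    srs = np.array([rng.choice(pop.ravel(), n, replace=False).mean()
                    for _ in range(reps)])
    clus = np.array([pop[rng.choice(50, 5, replace=False)].mean()
                     for _ in range(reps)])  # 5 whole clusters x 20 = 100

    print(clus.var() / srs.var())  # design effect, typically well above 1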






Re: likert scale items

2001-07-25 Thread Rich Ulrich

On Wed, 25 Jul 2001 07:26:19 -0400, Teen Assessment Project
[EMAIL PROTECTED] wrote:

>  I am using a measure with likert scale items.  Original psychometrics
>  for the measure
>  included factor analysis to reduce the 100 variables to 20 composites.
>  However, since the variables are not interval,  

This question does recur.  I think there was some bad
teaching in psychology departments, many years ago,
but I don't think there is a textbook published in the last 
20 years that doesn't regard a decent Likert scale as 
one of the better examples of Interval scaling.  (Of course,
there has been some misunderstanding, too, of what
interval is all about.  Similarly, I think error is more likely
to exist in class notes than within any of the current texts.)


By design, a Likert total score is intended to be interval.
You should keep scaling and scoring in mind for any 
criterion that you have.  You might check up on your
individual Likert items, if you devised them yourself, to 
be sure that you didn't mess up your labels or your scoring.
But most people don't worry about their Likert-type scores
at all; treating them as interval is the standard.

Your concern is not *totally*  misplaced - especially in 
regard to separate items - but *almost*.

A Likert item scored as Interval, as it happens, is more 
robust than a Likert item that is re-expressed merely as
ranks -- given the deficiencies in dealing with *ties*
in the usual 'nonparametric'  tests.  (Logistic or normal
modeling of ties is much better; but those are both rare.)


>    shouldn't non-parametric
>  tests be done to determine group differences (by gender, age, income) on
>  the variables?  Can I still use the composites...was it appropriate to
>  do the original factor analysis on ordinal data?

You can find other (old) comments in my stats-FAQ, 
or use groups.google.com  to search the  sci.stat.*  groups.
I know some other people have articulated the same 
conclusions.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: vote counting

2001-07-25 Thread Rich Ulrich

On Wed, 25 Jul 2001 09:33:41 -0400, Sanford Lefkowitz
[EMAIL PROTECTED] wrote:

> In a certain process, there are millions of people voting for thousands
> of candidates. The top N will be declared winners. But the counting
> process is flawed and with probability 'p', a vote will be miscounted.
> (it might be counted for the wrong candidate or it might be counted for
> a non-existent candidate.)

For clarification:  I assume you are talking about votes and winners
in the thoroughly abstract, hypothetical, imaginary instance - where 
(for example) votes are miscounted TOTALLY AT RANDOM, and
not because of ballot-flaws  relating to position on a ballot; etc.

 
> What is the probability that the counted top N will correspond to the
> real top N?
> (there are actually two cases here: 1 where I want the order of the top
> N to be in the correct order and the other where I don't care if the
> order is correct)

We could say, you are referring to errors in counting  that are 
entirely uncorrelated with each other, or with anything.

I can offer:  You will need to parameterize the cases according 
to the spread in vote.  And use some model.  Is this, 20% of the 
candidates get 80% of the vote?  or 10% get 90%?  or 1%, 99%?
(There is some name for those curves - Pareto?)

AND - what is your question, to be quantified?   I can be 
sure without doing anything, that for large N and smooth
distributions, the top N counts will not fall in perfect order.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: SRSes

2001-07-25 Thread Art Kendall

my previous remarks were about other sampling designs.  I was comparing valid
complex designs to an SRS design, not to non-sampling case selection.

dennis roberts wrote:

 my hypothesis of course is that more often than not ... in data collection
 problems where sampling is involved AND inferences are desired ... we goof
 far more often ... than do a better than SRS job of sampling

 1. i wonder if anyone has really taken a SRS of the literature ... maybe
 stratified by journals or disciplines ... and tried to see to what extent
 sampling in the investigations was done via SRS ... better than that ... or
 worse than that??? of course, i would expect even if this is done ... we
 would have a + biased figure ... since, the notion is that only the
 better/best of the submitted stuff gets published so, the figures for all
 stuff that is done (ie, the day in day out batch), published or not ...
 would have to look worse off ...

 2. can worse than SRS ... be as MUCH worse ... as complex sampling plans
 can be better than SRS??? that is ... could a standard error for a bad
 sampling plan (if we could even estimate it) ... be proportionately as much
 LARGER than the standard error for SRS samples ... as complex sampling
 plans can produce standard errors that are as proportionately SMALLER than
 SRS samples? are there ANY data that exist on this matter?

 ==
 dennis roberts, penn state university
 educational psychology, 8148632401
 http://roberts.ed.psu.edu/users/droberts/drober~1.htm







Re: Nonrandomness of binary matrices

2001-07-25 Thread Rich Strauss

Thanks to Rich Ulrich for the suggestion below -- that was the direction I
was heading, but there seem to be difficulties.  The general problem is
that I have a standard [nxp] data matrix, but (skipping over the scientific
details) some of the values are special, typically 5-20% of them, and I
want to know whether their distribution within the matrix is structured in
some way.  In particular, they might be concentrated in particular rows or
columns, but beyond that I have no notion of nonrandom.  I'm hoping that
they're uniformly randomly distributed (or rather, not significantly
different from random) because then I can basically ignore the fact that
they're special, for the scientific problem at hand.

I'd like to have two things: a nicely behaved index of nonrandomness
(perhaps a test statistic, rescaled to an interval 0-1?) and a significance
test.  So I recoded the matrix as binary, with the special values coded as
1s.  I presumed that the null marginal distributions would be binomial
rather than Poisson because the frequency of occurrence is so high, but
either way I could test that.  And if I measured the deviations of marginal
totals from expected (as a chi-square statistic, perhaps, or a mean squared
deviation) that would provide both an index and a goodness-of-fit
significance test for the entire matrix.

But the problem is: what if the row totals and column totals are not
independent?  I've done a few 2-way chi-square contingency tests on these
matrices (using randomized null distributions, of course, since the
matrices are binary), and some of the results are statistically
significant.  Doesn't this mean that I can't simply accumulate the row and
column totals for a goodness-of-fit test, since they're not always
independent?  And even if I did the goodness-of-fit tests for rows and
columns independently, how do I combine the p-values to get a single level
of significance for the entire matrix, if the tests are not independent?

I have the feeling that I'm missing something obvious here but I can't
quite get a handle on it, and this little problem is holding up the
analysis of the results from a much larger study.  I've talked to
statisticians on campus, with little progress, so basically I'm begging for
help.
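
One way around the dependence problem is to skip combining marginal tests
and instead test rows and columns jointly against a single permutation null.
A sketch (hypothetical Python; the statistic and the 0-1 index are one
reasonable choice, not an established method):

    import numpy as np

    rng = np.random.default_rng(5)

    def margin_stat(x):
        """Chi-square-style deviation of row and column totals from uniform."""
        e_row = x.sum() / x.shape[0]
        e_col = x.sum() / x.shape[1]
        return (((x.sum(axis=1) - e_row) ** 2).sum() / e_row
                + ((x.sum(axis=0) - e_col) ** 2).sum() / e_col)

    def randomness_test(x, n_perm=9_999):
        """Permutation test of 'the 1s are placed uniformly at random'."""
        obs = margin_stat(x)
        flat = x.ravel().copy()
        null = np.empty(n_perm)
        for i in range(n_perm):
            rng.shuffle(flat)  # scatter the same number of 1s at random
            null[i] = margin_stat(flat.reshape(x.shape))
        p = (1 + (null >= obs).sum()) / (1 + n_perm)
        return obs, p, (null < obs).mean()  # index near 1 = strongly nonrandom

Because rows and columns are assessed against the same joint null, no
combination of p-values is needed; the permutation distribution absorbs
whatever dependence the margins have.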

Rich Strauss

At 10:47 AM 7/25/01 -0400, you wrote:
> On 23 Jul 2001 14:22:58 -0700, [EMAIL PROTECTED] (Rich Strauss)
> wrote:
>
> > Say I have a binary data matrix for which both the rows (observations) and
> > columns (variables) are completely permutable.  (In practice, about 5-20% of
> > the cells will contain 1's, and the remainder will contain 0's.)   Assume
> > that the expected probability of a cell containing a '1' is identical for
> > all cells in the matrix.  I'd like to be able to test this assumption by
> > measuring (and testing the significance of) the degree of 'nonrandomness'
> > of the 1's in the matrix.
> >
> > If the rows and columns were fixed in sequence, then this would be an easy
> > problem involving spatial statistics, but the permutability seems to really
> > complicate things.  I think that I can test the rows or columns separately
> > by comparing the row or column totals against a corresponding binomial
> > distribution using a goodness-of-fit test, but I can't get a handle on how
> > to do this for the entire matrix.  I'd really appreciate ideas about this.
> > Thanks in advance.
>
> I'm not sure that I grasp what you are after, but - an idea.
>
> If they are completely permutable, then permute:
> sort them by decreasing counts for row and for column.
> This puts me in mind of certain alternatives to random.
>
> The set of counts on a margin should be ... Poisson?
> The table can be drawn into quadrants or smaller sections,
> so that the number of 1s in each can be tabulated, to make
> ordinary contingency tables.
>
> --
> Rich Ulrich, [EMAIL PROTECTED]
> http://www.pitt.edu/~wpilib/index.html
 





Re: likert scale items

2001-07-25 Thread Dennis Roberts

for a good treatment of this issue ... levels of measurement and statistics 
to use ... though, it is not real simple ...

see

ftp://ftp.sas.com/pub/neural/measurement.html

warren sarle of SAS wrote this and, it is excellent

forget about scales and statistics for a moment ... what kinds of 
STATEMENTS do you want to be able to make ... about measurement variables 
... THAT is the real issue ... and whether you should pay attention to 
levels of measurement and statistics ...

At 09:18 AM 7/25/01 -0700, Alex Yu wrote:

> The following is extracted from one of my webpages. Hope it can help:






Re: likert scale items

2001-07-25 Thread Dennis Roberts

inherent problems related to LICKert items and level of measurement that 
create problems would be these too

1. how many response categories are there for AN item??? by the way ... 
likert used many types ... including YES ? NO

at THIS level ... i think it a bit presumptuous to think that we are 
working with interval level measurement

2. what the labelling is FOR points ON an item ... i think it is easier to 
pretend the item level measurement is interval IF the scale is in terms of 
% agreement terms ... rather than SA ... SD kinds of response points

3. how MANY items there are ...

now, if you have FEW items ... with FEW points ... that are like SA ... SD 
... then at the item or total score level ... i think it is hard to assume 
interval level data ... if you have MANY items that each have NUMEROUS 
scale points ... that are framed differently than SA to SD ... then 
assuming interval level data is much more tenable ...






_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-25 Thread John Uebersax

If your items are visually anchored so as to imply equal spacing,
like:

+---------+---------+---------+---------+
0         1         2         3         4
least                                most
possible                         possible

then one might accept the data as interval-level, on the assumption
that respondents interpret them as such.

Also keep in mind that after you add responses on several items, minor
deviations of the response categories from being equally-spaced may
matter less.

In my substance abuse and personality research with teens, I have done
a lot of factor analysis on ordered-category response items.  One way
to avoid the assumption of equally-spaced categories (though
introducing an assumption of normally distributed traits) is to
perform factor analysis of polychoric correlation coefficients.

For more information on polychoric correlations and their factor
analysis, see:

http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
http://ourworld.compuserve.com/homepages/jsuebersax/irt.htm

With my data, factor analysis produced mostly the same results
regardless of whether polychoric correlations or regular Pearson
correlations were used.
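
A tiny simulation of why the two tend to agree (hypothetical Python, not the
study data): cutting bivariate-normal latent traits into 5 ordered categories
attenuates the Pearson correlation only modestly, and that attenuation is
roughly what the polychoric correlation corrects for.

    import numpy as np

    rng = np.random.default_rng(3)

    rho = 0.6
    z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], 10_000)
    items = np.digitize(z, [-1.5, -0.5, 0.5, 1.5])  # 0..4 ordered categories

    print(np.corrcoef(z.T)[0, 1])      # ~0.60, the latent (polychoric) target
    print(np.corrcoef(items.T)[0, 1])  # somewhat smaller, around 0.55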

If you are concerned about creating scales by summing ordered-category
responses, there is the alternative of latent trait modeling.  See:

http://ourworld.compuserve.com/homepages/jsuebersax/lta.htm

and some of the links there.  Again, one often finds it makes little
or no practical difference.  Scale scores produced by simply adding
item responses and scores produced by more complex latent trait models
may correlate .99 or better with each other.

BTW, the original study you describe sounds so much like one I did
the analysis for that I wonder if they are the same.  You aren't by
any chance referring to a study done in Winston-Salem, North Carolina,
are you?

John Uebersax

Teen Assessment Project [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]...
>  I am using a measure with likert scale items.  Original psychometrics
>  for the measure
>  included factor analysis to reduce the 100 variables to 20 composites.
>  However, since the variables are not interval,  shouldn't non-parametric
>  tests be done to determine group differences (by gender, age, income) on
>  the variables?  Can I still use the composites...was it appropriate to
>  do the original factor analysis on ordinal data?





Re: likert scale items

2001-07-25 Thread Dennis Roberts


here are a few videos of likert ...


http://ollie.dcccd.edu/mgmt1374/book_contents/3organizing/org_process/Likert.htm


_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Reach and Frequency

2001-07-25 Thread Vincent Granville

An advertiser purchases online Ad impressions and wants to achieve a
certain reach over a specified period of time. How many impressions
does he have to purchase? The page-per-user distribution is known.

I've published a solution on my web site, at
datashaping.com/internet.shtml. However,  I am wondering if more
general results are available. Thanks.
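
Not the datashaping.com solution, but a minimal closed-form sketch under very
simple assumptions (impressions are served independently and uniformly at
random over the period's page views; the distribution and numbers below are
hypothetical), which may help as a baseline:

    # a user with k page views misses all I impressions out of P total
    # page views with probability (1 - I/P)^k, so
    def reach(I, P, k_dist):
        """k_dist maps page-views-per-user k -> fraction of users."""
        return sum(f * (1 - (1 - I / P) ** k) for k, f in k_dist.items())

    k_dist = {1: 0.5, 5: 0.3, 20: 0.2}  # hypothetical page-per-user distribution
    P = 1_000_000                       # total page views in the period
    for I in (50_000, 100_000, 200_000, 400_000):
        print(I, round(reach(I, P, k_dist), 3))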

Vincent Granville, Ph.D.
www.datashaping.com






=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=================================================================