[ECOLOG-L] Data sharing in ecology

2009-09-14 Thread Anon.

Hej!

Last week Nature published a special feature on data sharing
(http://www.nature.com/news/specials/datasharing/index.html).  It was
mostly about other areas of science, but I think the problem of how to
share data equitably is present in ecology too.  So, I blogged some
thoughts:

I'm coming at this from the perspective of someone who wants to use the
data, and I'd be interested in hearing other views - particularly from
people who generate data on the problems associated with free access.

All comments are welcome, preferably on my blog (just to keep the
discussions in one place).

Bob

--
Bob O'Hara
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://network.nature.com/blogs/user/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org

Help send my wife to Antarctica (please?)
http://www.blogyourwaytoantarctica.com/blogs/view/152


Re: [ECOLOG-L] Open Access and Intellectual Imperialism

2009-05-19 Thread Anon.

borretts wrote:

Colleagues,

Biology/ecology could adopt a model like the physicists and develop an 
electronic preprint archive like arXiv (http://arxiv.org/).  This 
provides a way to share research results and ideas -- even those that 
have been peer reviewed -- in a moderated fashion without violating 
copyrights (as far as I know).   For those of us working in 
quantitative biology/ecology there is already a "quantitative biology" 
subsection available.


As I pointed out earlier, there is Nature Precedings 
(http://precedings.nature.com/) which fulfils this function.  The FAQ 
discusses copyright (http://precedings.nature.com/site/help#copyright).  
I don't know how journals will view manuscripts being placed there: I 
guess I should ask.


We just need to start using it, folks!

Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://network.nature.com/blogs/user/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: [ECOLOG-L] stealing from websites; educational use

2009-05-14 Thread Anon.

MaryBeth Voltura wrote:


I would be interested in the list's opinion of this type of project, and
how best to allow students to create interesting and educational
websites without violating fair use of images.  Obviously, they are not
going to be able to obtain their own pictures of red kangaroos and
arctic springtails.

  
Flickr allows you to search for photos that are available under a 
Creative Commons licence, which means you can re-use them.  Check the 
advanced search options. 


Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://network.nature.com/blogs/user/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: [ECOLOG-L] Open Access and Intellectual Imperialism

2009-05-13 Thread Anon.

Jane Shevtsov wrote:

The physical sciences seem to be halfway there with arXiv.org .

Jane

On Tue, May 12, 2009 at 3:22 PM, joseph gathman  wrote:
  

Jane wrote:


The journal's contribution is coordinating peer review, formatting the
paper and, most importantly, disseminating the paper.
  

It seems we are approaching the time when journals become obsolete for these 
functions.  We could do all this through the internet right now.  Imagine just 
posting your paper here on ECOLOG-L, where anybody can review it and comment 
publicly.  It would make for more dynamic review and discussion of research.

So now it seems the main function of journals is to make the publication 
"official" so it will count toward retention and tenure and other professional 
tally counting.




There is Nature Precedings (http://precedings.nature.com/) which allows 
you to do just this.


Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://network.nature.com/blogs/user/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: [ECOLOG-L] analyzing ordinal phenology data

2009-04-02 Thread Anon.

John Skillman wrote:

Ecologgers...
We have regularly censused populations of several different plant species
throughout the growing season and categorized the observed individuals into
one of 7 different phenological stages (e.g., stage 1 = initial greening,
stage 4 = peak flowering, stage 6 = seed drop, etc.).  These numeral IDs for
the different stages are ordinal data that, by coincidence, tend to scale
linearly with day of the growing season.  Although using ordinal data is not
permitted (and makes no sense) in regression analyses, we've done it anyway!
 By running regressions we are able to get slopes (change in phenological
stage vs. day of year) which, in essence, quantifies the seasonal rates of
development for the different species.  Taking it one step further, Analyses
of Covariance confirm that some species progress through these phenological
stages at rates that are significantly different from that of other species.
So if this tells me what I want to know, what is the problem? The problem,
of course, is that this approach treats these phenological stage IDs (1-7)
as quantitative values when, in fact, they are nothing more than category
labels.
Can anyone suggest an alternative way to use these data to quantify seasonal
development rates and test for differences among species?

BTW, we censused different individuals within each population haphazardly
(~10 individuals per population per census date) and did NOT follow the same
individuals over the season.
 
I don't know if anyone has responded privately, but the analysis should 
just be an ordinal regression, e.g.


Guisan, A. and Harrell, F.E. (2000)  Ordinal Response Regression Models 
in Ecology.  Journal of Vegetation Science, 11: 617-626.


For R (and S-PLUS) users, there is the polr() function in the MASS 
package that will do this.
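
For what it's worth, a minimal sketch of what such a call could look like 
(simulated data in place of the real survey; the variable names are made up):

library(MASS)
set.seed(1)
doy     <- rep(seq(100, 220, by = 20), each = 20)        # day of year
species <- factor(rep(c("A", "B"), times = 70))
stage   <- factor(pmin(7, pmax(1, round(doy / 30 - 2 + rnorm(140)))),
                  ordered = TRUE)                         # ordinal response
fit <- polr(stage ~ doy * species, Hess = TRUE)
summary(fit)   # the doy:species term asks whether development rates differ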


Incidentally, regression might not be too bad: it sounds as if the data 
are approximately interval data.  A bit of model checking to see if the 
assumptions of normality and (probably more importantly) linearity are 
reasonable might be all that is needed.


Bob

--
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: [ECOLOG-L] dna from museum skins and pickled animals

2008-12-25 Thread Anon

malcolm McCallum wrote:

Hi,
how much of the genome can be obtained from museum skins and other
kinds of preserved specimens?
Can we get complete genomes or only fractions?

I do realize a lot of factors play into this!

I'm not an expert, but I would have thought we could get a full genome. 
 The sequencing methods that are being used now rely on quickly 
sequencing lots of short segments, and then fitting them together 
afterwards.  The problem with museum specimens, of course, is that the 
DNA is degraded, and in small segments already.


This is how they could sequence the Neanderthal genome (um, Neanderthal 
man, not the valley...).


Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://network.nature.com/blogs/user/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: Data set with many many zeros..... Help?

2008-01-14 Thread Anon.
Aargh!  I forgot that reply doesn't go to the whole list.  So this is 
something I intended to send yesterday.

To emphasise one of the points here, which people seem to be missing - 
the purpose of a ZIP/ZINB model is to allow for separate processes 
affecting presence/absence and abundance given presence.  Analysing the 
data separately has two problems:
1. The presence/absence analysis confounds the observations of zeroes 
where the species isn't there with the observations of zeroes where the 
species is there but not sampled.
2. The abundance analysis over-estimates mean abundance because it 
leaves out the zeroes where the species is present but was not sampled.

The ZIP/ZINB models work by allowing for species to be present but not 
observed.  The paper which first proposed the ZIP model suggested 
fitting it by estimating the number of zeroes where the species was 
present, then estimating the other parameters, then re-estimating the 
"false" zeroes from those, and iterating between the two steps.  I 
haven't checked the methods Alain was suggesting, but I suspect they use 
the same approach (it's now called the EM algorithm).
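
To make that iteration concrete, here is a rough hand-coded sketch for the 
simplest possible ZIP (constant zero-inflation probability and Poisson mean, 
simulated data); in practice you'd use a package rather than code this yourself:

set.seed(1)
n <- 500
extra <- rbinom(n, 1, 0.3)                # 1 = zero from the zero-inflation part
y <- ifelse(extra == 1, 0, rpois(n, 2))   # otherwise an ordinary Poisson count
p0 <- 0.5; lambda <- mean(y)              # crude starting values
for (i in 1:100) {
  # E-step: probability that each observed zero came from the zero-inflation part
  z <- ifelse(y == 0, p0 / (p0 + (1 - p0) * exp(-lambda)), 0)
  # M-step: re-estimate the parameters given those expected extra zeroes
  p0     <- mean(z)
  lambda <- sum(y) / sum(1 - z)
}
c(p0 = p0, lambda = lambda)               # should come out near 0.3 and 2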

Bob

 Original Message 
Subject: Re: Data set with many many zeros. Help?
Date: Sun, 13 Jan 2008 18:38:41 +0200
From: Anon. <[EMAIL PROTECTED]>
To: Highland Statistics Ltd. <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>

Highland Statistics Ltd. wrote:
> On Sat, 12 Jan 2008 15:38:33 -0400, Stephen Cole <[EMAIL PROTECTED]> wrote:
> 
>> Hello Ecolog - I was wondering if anyone had any advice on the following
>> problem.
>>
>> I have a data set that is infested by a plague of zeros that is causing me
>> to violate all assumptions of classic parametric testing.  These are true
>> zeros in that the organisms in question did not occur in my randomly 
> sampled
>> quadrats.  They are not "missing data"
>>
>> I have a fully nested Hierarchical design
>> My response variable is density obtained from quadrat counts.
>> my explanatory variables are as follows
>>
>> Region   (3 levels-fixed)
>> Location(Region) (4 levels - random
>> Site(Location(Region))  (4 levels - random)
>>
>> My plan was to analyze the data with a nested anova and then proceed to
>> calculate variance components to allow me to parse out the variance that
>> could be attributed to each spatial scale in my design.  Since it is known
>> that violations of assumptions severely distort variance components in
>> random factors, i would really like to clean up my data set to meet the
>> assumptions but as of yet i have found no acceptable remedial measure.
>>
> 
> Stephen,
> The good news for you is that this is a common problem; it is called zero 
> inflation. The solution is zero inflated Poisson, zero inflated negative 
> binomial, zero altered poisson, or zero altered negative binomial GLMs. 
> These are mixture models. Just Google ZIP, ZINB, ZAP, ZANB (or hurdle 
> models). There is a nice online pdf from Zeileis, Kleiber and Jackman, 
> showing you how to do these analyses in R. The book from Cameron and 
> Trivedi gives the maths. Our next book has a 40 page chapter on this stuff 
> (in R), but that won't help you now.
> 
> The difference between ZI and ZA is the nature of the zeros (false zeros 
> or true zeros), and the difference between Poisson and NB is wether you 
> have extra overdispersion due to the counts, or only due to the zeros.
> 
> Software in R for this stuff is reasonably new. Packages pscl and VGAM are 
> good starting points.
> 
> The bad news is that I am not sure what you have in terms of software for 
> ZIPs + random effects. Both Cameron and Trivedi and Hilbe (2007) discuss 
> these methods in the context of random effects. There was a paper in 
> Environmetrics (end of 2007) applying ZIP with spatial/temporal 
> correlation on seal data...in R. There are more, all very recent, papers 
> with ZIP/ZAP + random effects. You may have to write the software code for 
> doing this...I don't know.
> 
> Having said that...you say that your random effects have 4 levels. I doubt 
> if this is enough! Perhaps you should consider them as fixed? See Pinheiro 
> and Bates.
> 
> ZIP/ZAP is very interesting stuff!
> 
I would mostly just say "I agree" to what Alain has written, but add a
couple of comments:
1. It might be that a negative binomial is sufficient - that alone can
produce lots of zeroes.  It depends a bit on whether you think a
sufficient proportion of the zeroes are because the species genuinely
aren't present, as opposed to being present and not recorded in the
sample (which is what the Poisson or negative binomial alone can account for).
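
As a rough illustration of that first comment (simulated data, and assuming 
the MASS and pscl packages), you can just fit both and compare:

library(MASS)   # glm.nb()
library(pscl)   # zeroinfl()
set.seed(1)
x <- runif(400)
y <- rnbinom(400, mu = exp(-1 + 2 * x), size = 0.7)   # overdispersed, many zeroes
nb   <- glm.nb(y ~ x)
zinb <- zeroinfl(y ~ x | x, dist = "negbin")
AIC(nb, zinb)   # if the plain NB is competitive, the extra zeroes may not need
                # a separate presence/absence process at all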

Data wanted!

2008-01-09 Thread Anon.
I have a student who has been doing some work on fitting population 
dynamic models to data.  One thing he has looked at is fitting a model 
with an Allee effect.  However, he would like to have some real data to 
try it out on.

So, does anyone have any suitable data, or know where it would be 
available from?  Obviously it should be a population where an Allee 
effect is suspected, and where a range of densities is covered.  It 
would have to be a complete count of a closed population (or if it's an 
open population, we would need to know the numbers of immigrants) over a 
period of years (as many as possible!).  Some amount of missing data can 
be handled.

I am aware of the Global Population Dynamics Database 
(http://www3.imperial.ac.uk/cpb/research/patternsandprocesses/gpdd), but 
there are too many species in there to check all of them!

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: Input of non-normal variables into GLM models

2007-11-14 Thread Anon.
Highland Statistics Ltd. wrote:
> On Tue, 13 Nov 2007 20:07:33 +0200, Anon. <[EMAIL PROTECTED]> wrote:
> 
>> Sami Ullah wrote:
>>> Hey Ecologers:
>>>
>>> I have various variables for running a multiple linear regression model
>>> using GLM. Some of my predictor variables are non-normally distributed.
>>> For the multiple linear regression, I used proc-univariate to check whether
>>> the residuals of the regression model met the normality criteria, which
>>> they did.
>>>
>>> Now I am wondering whether it is advisable to keep the skewed predictor
>>> variables in the model, or whether I have to go for a non-parametric analysis?
>>>
>>>
>> The distribution of the predictor variables is irrelevant, so you can
>> happily keep them in.  Well, the distribution is almost irrelevant.  You
>> can get problems if they are co-linear (i.e. highly correlated), or if
>> you have outliers (which can have a large influence on the fit).
> 
> Agree. One extra thing: I would argue that normality of explanatory 
> variables (predictors) is actually bad. It means that most observations 
> have the same (or similar) value for that explanatory variable, which may 
> (!) make it more difficult to find a significant effect. Bad experimental 
> design. Perhaps a histogram shaped like the uniform distribution would be 
> the best. It means that you have a similar number of observations for each 
> part of your sampled gradient...for that explanatory variable.
> 
Theoretically, the best distribution (in terms of power) is a bimodal 
one, with values either at their maximum or at their minimum.  However, 
this design makes it impossible to check whether the relationship is 
linear or not.  I mention this because I forgot to say earlier that 
linearity is assumed, and that assumption is more important.  It's also 
easy to check - plot the residuals against the predictor.  If they look 
curved, that suggests the relationship is not linear, and that would be 
a reason to transform.
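
A quick sketch of that check (simulated data, with a deliberately curved truth):

set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x^2 + rnorm(100, sd = 2)   # the true relationship is curved
fit <- lm(y ~ x)
plot(x, resid(fit))                       # a clear curve in this plot suggests
abline(h = 0, lty = 2)                    # transforming x (or adding a quadratic)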

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: Input of non-normal variables into GLM models

2007-11-13 Thread Anon.
Sami Ullah wrote:
> Hey Ecologers:
>
> I have various variables for running a multiple linear regression model 
> using GLM. Some of my predictor variables are non-normally distributed. 
> For the multiple linear regression, I used proc-univariate to check whether 
> the residuals of the regression model met the normality criteria, which 
> they did.
>
> Now I am wondering whether it is advisable to keep the skewed predictor 
> variables in the model, or whether I have to go for a non-parametric analysis?
>
>   
The distribution of the predictor variables is irrelevant, so you can 
happily keep them in.  Well, the distribution is almost irrelevant.  You 
can get problems if they are co-linear (i.e. highly correlated), or if 
you have outliers (which can have a large influence on the fit).

I've come across the impression a few times that the predictors have to 
be normally distributed, but I don't know where it originates from - 
certainly not from statistical theory.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Blog: http://deepthoughtsandsilliness.blogspot.com/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: On the Bonferroni adjustment of correlograms

2007-04-25 Thread Anon.
Alexandre Souza wrote:
> Dear friends that work on spatial Ecology,
> 
> I am proceeding with the analysis of a dataset on the spatial
> structure of canopy openness in the southern Brazilian mixed
> conifer-hardwood forests, and would like to ask your opinion on a
> rather simple matter on which I have doubts.
> 
> I have six one-hectare plots subdivided in 100 10 x 10 m plots each.
> In the centre of each subplot we took a hemispherical photograph and
> estimated canopy openness. In Legendre and Fortin (1989) it is said
> that before examining each significant value in a correlogram, we
> must first perform a global test, since several tests are done at the
> same time, for a given overall significance level. The global test is
> made by checking whether the correlogram contains at least one value
> which is significant after a Bonferroni correction.
> 
I had a similar problem during my PhD, and it became an early 
introduction to the problems of p-values.

I think the Bonferroni correction is a bad idea in this context, because 
you expect that the auto-correlation will decrease with distance. 
Therefore, if a distance of 16 is not significant, then 17+ will not be 
either.

So, suppose you start by testing for all distances up to 10m, and find 
that distances 1m and 2m are significant.  Then, you decide to test for 
all distances up to 50m.  Now the Bonferroni correction will hammer the 
critical p-value, so you could very well find that nothing is 
significant: not because the data is different, but simply because you 
are doing more tests, including tests that a priori are not so interesting.

My solution was to do the test sequentially: test distance 1, then 2, 
then 3 etc.  You find you don't need a correction.  The reference is this:
O'Hara R.B., Brown J.K.M., 1997. Spatial aggregation of pathotypes of 
barley powdery mildew. Plant Pathology, 46: 969-977.
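
To caricature the idea with made-up numbers (hypothetical p-values, one per 
distance class; this is just the stopping rule, not the full method in the paper):

p_vals <- c(0.001, 0.004, 0.03, 0.21, 0.08, 0.45)  # hypothetical p-value per distance class
alpha  <- 0.05
signif_classes <- integer(0)
for (d in seq_along(p_vals)) {
  if (p_vals[d] >= alpha) break        # stop at the first non-significant distance
  signif_classes <- c(signif_classes, d)
}
signif_classes                         # distance classes judged to show autocorrelation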

Nowadays, I would probably try and find a more model-based approach to 
estimating aggregation, and wouldn't be so interested in p-values.  I 
don't think they should be taken too seriously: they are a guide to 
what's going on rather than being the truth.

For the damaged plot, I think I would still run the analysis, but in the 
knowledge that the results have a lower power, so the p-values will be 
higher than for a full plot.

Hope this helps!

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: Fwd: Global Warming Swindle

2007-04-21 Thread Anon.
David M Bryant wrote:
> Hello fellow propagandists,
>
> I just received this link to a video supposedly contradicting the  
> recent media hype on global warming.
>
> You owe it to yourself to take a look.  To paraphrase the "final  
> line"  it would be hilarious if it weren't such a sad rhetorical  
> example of poor debate.  No data is presented other than the  
> observations that climate has changed in the past and that the  
> recession of the 1970's should have resulted in cooling.  I'm curious  
> as to whether the scientists quoted really understand the feedbacks  
> and lags involved in the carbon cycle or even the physical connect  
> between CO2 and IR absorption.  Perhaps they've never heard of an IRGA.
>
>   

The programme was broadcast about a month ago in the UK.  The maker has 
a history of, um, unreliability, and lived up to it: 


And for a really good presentation on what was wrong with the 
documentary, go here:


Bob

> Cheers,
>
> David Bryant
>
> Begin forwarded message:
>
>   
>> From: "Insight" <[EMAIL PROTECTED]>
>> Date: April 19, 2007 11:25:35 AM EDT
>> To: "Insight" <[EMAIL PROTECTED]>
>> Subject: Global Warming Swindle
>>
>> If you believe the prejudice-based science of Al Gore and Sheryl Crow,
>> you need to look at this video:
>>
>> http://www.youtube.com/watch?v=aJSupf6rkgE&mode=related&search=
>>
>> 


-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: Dealing with non-normal, ordinal data for 2-way ANOVA with interactions

2007-03-14 Thread Anon.
Bahram Momen wrote:
> Highland Statistics Ltd. wrote:
>   
>>  >Date:Mon, 12 Mar 2007 15:35:18 -0700
>>  >From:John Gerlach <[EMAIL PROTECTED]>
>>  >Subject: Re: Dealing with non-normal, ordinal data for 2-way ANOVA 
>> with interactions
>>
>>  >My short answer is that for controlled blocked factorial experiments where 
>>  >interactions are important and where you have planned contrasts - since you 
>>  >designed it you should know what the important questions are - I'm not aware 
>>  >of any tool except ANOVA that will suffice.
>>
>>
>> Am I missing something here?? ANOVA is linear regression...linear 
>> regression is GLM (generalised linear modelling).
>> 
> GLM used to stand for 'General Linear Models'. 'General' in this context 
> differs from 'Generalized'
>
>   
That depends on how you're brought up: if you haven't come from the SAS 
world, GLMs have always been Generalised Linear Models.

> Also, a recent procedure in SAS called 'GLIMMIX' that enables modeling a 
> number of distributions (normal, poisson, binomial, etc.) seem to be 
> relevant.
>   
That description would also apply to Proc GENMOD. GLIMMIX, I believe, is 
for GLMMs: Generalised Linear Mixed Modelling.  It extends GLMs by 
allowing for random effects.  These are also available in other 
packages, such as R (actually, R has several implementations of them!).

John Nelder (of GLM fame) has now developed Hierarchical GLMs, and 
Doubly Hierarchical GLMs (where the GLM model structure can also be used 
to model the variance).  These are available in Genstat, but don't seem 
to have been widely implemented elsewhere yet.

Of course, Bayesians have been able to fit these models for years...

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: example R-code for ARIMA transfer function model

2007-01-18 Thread Anon.
Daniel C. McEwen wrote:
> Can anyone help me out on using R to run ARIMA transfer models by sending
> (or posting) a generalized R-code for such?   I have one response and
> multiple predictors and desire to model seasonality of the series as well.
>  I sort of pieced something together, but it seems to be "clunky".  I am
> just sure there has to be an easier way (perhaps a package I am not aware
> of) but I can't figure it out!
> 
No code, but have you looked in the Task Views on CRAN:
http://cran.r-project.org/src/contrib/Views/

There's a list of stuff in the econometrics task view.
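
Not the full transfer-function machinery, but as a bare-bones starting point, 
base R's arima() already handles a seasonal model with external regressors 
(simulated data, illustrative orders):

set.seed(1)
n  <- 120
x1 <- rnorm(n); x2 <- rnorm(n)                    # two external predictors
y  <- ts(2 + 0.8 * x1 - 0.5 * x2 + arima.sim(list(ar = 0.6), n),
         frequency = 12)
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 12),
             xreg = cbind(x1, x2))
fit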

Bob
(CC'ed to ECOLOG-L, as others might be interested in task views)

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: fuzzy arithmetic

2006-12-18 Thread Anon.
William Silvert wrote:
> Although Bob (Anon) O'Hara is right in pointing out that there is a 
> difference between uncertainty and probability, both can be characterised by 
> fuzzy methods. There is nothing problematical about this, after all 
> non-fuzzy mathematics has also been applied to both uncertainty and 
> probability. The advantage of the fuzzy approach is that it is more 
> transparent. Of course one needs to know what the fuzzy sets mean to use the 
> formalism -- is a 50% rainy day one with a 50% chance of rain, or one which 
> is simply misty?
> 
I think this is where the problem is: the former (a 50% chance of rain) 
is uncertain, whereas the latter is certain but the categories (rain/not 
rain) are vague.  I would argue (strongly!) that probability is 
appropriate for the former, because it is uncertain: there are 
mathematical proofs of this, essentially showing that probability is the 
right calculus for consistently setting odds such that you would take 
either side of the bet with zero expected return.  Fuzzy logic, on the 
other hand, is appropriate for the latter, because it is certain but the 
concept is vague.

There's an interesting philosophical article on vagueness here:
<http://plato.stanford.edu/entries/vagueness/>

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: fuzzy arithmetic

2006-12-18 Thread Anon.
Wayne Tyson wrote:
> It doesn't "worry" me, but a greater concern might be that this 
> discussion could descend into a competition of egos.
> 
Why?  Everyone will read what I write and agree with it, because I'm 
always right.

Um, should I add a smiley?

> Let me confess, if it is necessary at all in the face of the naivety 
> of my question, that I am undoubtedly the most ignorant 
> participant.  That should take me out of the "equation" entirely, if 
> not disqualify me from participation.
> 
> May I add (I still don't understand what the distinctions--I don't 
> mean definitions--are between fuzzy math, arithmetic, and logic) yet 
> another naive question?  Are interactions over time, biological 
> (ecological) and otherwise, "fuzzy" phenomena?  If not, what are they?
> 
I think knowing the distinction would help: I know what fuzzy logic is 
(things are not either "true" or "false", but they have "truth values", 
between 0 and 1.  Sounds like there should be a lot of applications in 
politics).  Fuzzy arithmetic appears to be the mathematical tools for 
manipulating fuzzy logic.

According to Wiki, fuzzy math[s] is a method for teaching standard 
mathematics.

> If ecological phenomena are "fuzzy," how can "non-fuzzy" arithmetic, 
> statistics, mathematics, and logic reveal their nature?  I hasten to 
> add that I have no argument concerning the use and usefulness of 
> those things with respect to pieces of the puzzle, so my intent is 
> not to bog down the discussion with generalities.  On the contrary, I 
> am concerned with adding specificity and discipline to the 
> discussion, not taking it away.
> 
I think the problem is that fuzzy logic is often used to represent 
uncertainty (e.g. http://www.ramas.com/interval.htm).  To me, this looks 
wrong: there's a big difference between there not being a true value, 
and us not knowing what the value is.  Whether this makes a difference 
in practice I don't know.

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: Species richness estimators

2006-11-26 Thread Anon.
Gareth Russell wrote:
> Alexandre,
> 
> The Species Richness module on www.eco-tools.net calculates the same 
> statistics as EstimateS. 
> There is also a Mathematica notebook you can download that allows you to do 
> the calculations on 
> your own computer, if you have access to Mathematica. I mention it because 
> you could easily add 
> a few lines to make it loop through a series of files.
> 
> If you DO have access to Mathematica, and would like to try this, I would be 
> happy to add those 
> lines in the 'examples' section and send the notebook to you.
> 
> Gareth Russell
> 
> On Thu, 16 Nov 2006 15:25:45 -0200, Alexandre Souza <[EMAIL PROTECTED]> wrote:
> 
>> Dear friends,
>>
>> In my present research, I am willing to estimate the species richness of
>> ca 200 forest communities, sampled by the Brazilian government. Samples are
>> relatively small (0.1 ha - 10 subsamples of 0.01 ha each), and I need
>> software that calculates nonparametric richness estimators for more than
>> one community at a time.
>> Does any of you know of any software that does that? As far as I understood,
>> EstimateS, the most popular package, does not perform multiple tests.
>> Thank you in advance for any suggestions,

Someone has already mentioned my functions that Jari O. put into vegan: 
these can easily be looped over (optimally with apply()!).  However, I 
feel I should point out that I was hesitant to have them publicly 
available, because I can't see how we can even tell if any species 
richness estimator is any good:

Link, W.A., 2003.  Nonidentifiability of Population Size from 
Capture-Recapture Data with Heterogeneous Detection Probabilities. 
Biometrics 59: 1123-1130.

O'Hara, R.B., 2005. Species richness estimates: How many species can 
dance on the head of a pin? J. Anim. Ecol., 74: 375-386.

The non-parametric estimators are fine to use as lower bounds (Chao2 was 
derived as such: I haven't checked it formally, but I suspect that ACE 
assumes the abundance distribution is symmetric).
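
For what it's worth, the looping is a one-liner (this assumes the estimators 
meant are vegan's estimateR(); the BCI data set that ships with vegan stands 
in for the 200 communities):

library(vegan)
data(BCI)                              # site-by-species abundance matrix
# abundance-based estimators (observed S, Chao1, ACE) for every site at once
rich_est <- t(apply(BCI, 1, estimateR))
head(rich_est)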

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: hare/lynx time series data

2006-09-21 Thread Anon.
John M. Drake wrote:
> For teaching, I am seeking data for the hare/lynx time series from Elton's
> Hudson's Bay Company. If anybody is able to provide these data or similar, I
> would be grateful to be contacted off list.
>
>   
The time series are (should be?) here:
http://cpbnts1.bio.ic.ac.uk/gpdd/
Along with a lot of other population time series.  One can spend far too 
much time browsing the series.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: Guidelines for Authorship

2006-09-19 Thread Anon.
Malcolm McCallum wrote:
> I would say that ultimately, regardless of the ethical implications, the
> lead author will determine who ends up on the bylines.  Some people are
> ultra-inclusionary for authorship, others are ultra-conservative, with
> virtually everyone else ending up in the acknowledgements.  The guidelines
> for authorship are just that, guidelines.  There is nothing sacred about
> them, but people tend to use them when they need some objectivity.  First
> authorship is important because you usually are the author.
>   
Personally, I think the most accurate guidelines for authorship are here:
http://www.phdcomics.com/comics/archive.php?comicid=562

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: 3D spatially autocorrelated data

2006-08-31 Thread Anon.
Daniel C. McEwen wrote:
> I am developing models of benthic invertebrate abundance but am uncertain
> of the best way to treat the spatial data.  I have lat, long, and depth. 
> I thought about detrending, but then wouldn't I have to double detrend
> (first with respect to lat and long and second with depth)? 
> Alternatively, would it make sense to just include the interaction of lat,
> long, and depth in the model as a covariable?
>
>   
I think the honest answer has to be that it's impossible to say from 
what you've described: it depends on what you're trying to do, and what 
the data looks like. I'm sure there are several things that could be 
tried, but the devil will be in the detail, so it'll be easier if you 
sit down with someone to discuss your data and what you want from it.  
The good news is that your university has a statistical consulting 
service, so I would recommend you try that. 

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: Comparing Non-linear regression lines

2006-08-23 Thread Anon.
Jay P Sah wrote:
> Recent discussion on how to compare regression lines was really
> enlightening. As far as I understood, the discussion was about comparing
> slopes in linear regressions.
>
> I have similar problems in comparing non-linear regression lines. Here is
> what I want to compare. I have shrub biomass and pine seedling density data
> collected in 160 50-m^2 plots. I used non-linear regression
> (y = b0*(1-exp(-b1^x))) to model the cumulative seedling density against
> shrub biomass. I am interested to test whether this regression curve
> (observed pattern) differs from another similar curve (a reference model)
> generated using the Poisson cumulative distribution function. My questions
> are: how do I compare these two non-linear regression lines, or the
> coefficients therein? Are the methods described in Zar's book and elsewhere,
> more commonly used for comparing slopes in linear regression, also
> applicable to non-linear regression lines?
>
>   
If the two models are nested, then you can compare them using an F test 
(I'm not sure if the results are exact, but they should be OK with a 
decent amount of data).  If they're not nested, but have the same number 
of parameters, then you can compare the residual sums of squares: the 
smallest wins!  If they have different numbers of parameters and are not 
nested, then there are several ways of comparing them, for example, 
using AIC, BIC etc.

As a starter, I would recommend plotting the residuals against the 
predicted values: it may be that it's obvious that one curve does not 
fit.  Or indeed that neither curve fits!
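
A rough sketch of the mechanics (simulated data and illustrative curves, not 
your actual reference model):

set.seed(1)
x <- runif(160, 0, 10)                               # "shrub biomass"
y <- 5 * (1 - exp(-0.4 * x)) + rnorm(160, sd = 0.5)  # "cumulative seedling density"
m1 <- nls(y ~ b0 * (1 - exp(-b1 * x)), start = list(b0 = 4, b1 = 0.5))
m2 <- nls(y ~ Vm * x / (K + x),        start = list(Vm = 5, K = 2))
AIC(m1, m2)                            # same data, same response: smaller wins
plot(fitted(m1), resid(m1)); abline(h = 0, lty = 2)  # does either curve misfit?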

Incidentally, I'm not sure what you mean by using the Poisson cdf: 
that's a stepped function, but it sounds like your data are continuous.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: standard deviation of a slope

2006-08-16 Thread Anon.
Geoffrey Poole wrote:
> Sarah:
>
> I think the reviewer comment has merit.
>
> I understand your problem as follows:  Your goal is to compare the 
> "usefulness" (not sure what you means by "usefulness", but we'll go with 
> it...) of a regressions across environmental conditions.  However, under 
> one set of environmental conditions the regression might be based on 10 
> points, but under another set of conditions, it might be based on 100 
> points.
>
> Unfortunately, even under the SAME environmental conditions, the SE of 
> the slope will decrease as the sample size increases.  Thus, if the 
> number of points varies across environmental conditions, you don't know 
> if changes in the SE of the slope are caused by differences in sample 
> size or differences in "usefulness" across conditions.
>
> In section 17.3 "Testing the significance of a regression" of Zar's 
> "Biostatistical Analysis" (pages 334-5 of forth edition) there is a clue 
> that might help you with your dilemma...
>
> Zar notes that the "standard error of estimate" (AKA "standard error of 
> the regression") is a measure of the remaining variance in Y *after* 
> taking into account the dependence of Y on X.  
Zar says that?  That's rubbish: the residual variance is the measure of 
the remaining variance in Y after taking into account the dependence of 
Y on X.

> However, since the 
> magnitude of this value is proportional to the magnitude of the 
> dependent variable, 
Again, rubbish: add 20 000 to all of your Y's, and the variances will 
all be the same.  The only difference is that the estimated intercept is 
20 000 higher.

I might now have understood the original problem (possibly...).

I think the idea is that in any single environment, one can regress two 
variables and get a fit etc.  But the question is: how well will this 
fit do in another environment?  The (actual) slope will probably differ 
between environments, and the more different they are, the less useful 
the slope from one environment is for predicting in another.  The 
problem is the variation between the slopes in the different 
environments: obviously we can measure this variation by the standard 
deviation (or the variance!).

In practice, I would suggest fitting a mixed model, where you allow the 
slope to vary randomly between environments.  Any decent stats package 
can do this: I think some people call them random regressions.  This 
will estimate the variation in slopes between environments, allowing for 
any differences in sample sizes in the different environments.  If the 
variance is small, then the predictions from one environment to another 
will be pretty good (obviously this depends a bit on the size of the 
regression coefficient: if it's zero, then there's no improvement anyway).
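
A minimal sketch of that model, assuming the lme4 package (its built-in 
sleepstudy data is just a stand-in, with Subject playing the role of 
"environment"):

library(lme4)
m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
VarCorr(m)   # the between-Subject variance of the Days slope is the analogue
             # of the between-environment variation in slopes discussed above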

I'll have to think a bit more about the best way of evaluating the 
importance of the variation in the slopes: the intuition is to ask how 
much better you do at predicting the value of a data point if you know 
which environment it was measured in, as compared to if it's a random 
environment.  Something similar to an intraclass correlation could be used.

Incidentally, this is perhaps a good opportunity to plug this book:

I read a draft in the spring and can heartily recommend it.  It covers 
the family of models that can be used for most statistical analyses I 
see in ecology (including the problem here!), in a practical way.

And now to bed.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: standard deviation of a slope

2006-08-16 Thread Anon.
Sarah Gilman wrote:
> Is it possible to calculate the standard deviation of the slope of a  
> regression line and does anyone know how?  My best guess after  
> reading several stats books is that the standard deviation and the  
> standard error of the slope are different names for the same thing.
> 
Technically, the standard error is the standard deviation of the 
sampling distribution of a statistic, so the standard error of the slope 
is already a standard deviation (of the slope estimate).  So, you're right.

> The context of this question is  a manuscript comparing the  
> usefulness of regression to estimate the slope of a relationship  
> under different environmental conditions.  A reviewer suggested  
> presenting the standard deviation of the slope rather than the  
> standard error to compare the precision of the regression under  
> different conditions.  For unrelated reasons, the sample sizes used  
> in the compared regressions vary  from 10 to 200.  The reviewer  
> argues that the sample size differences are influencing the standard  
> error values, and so the standard deviation (which according to the  
> reviewer doesn't incorporate the sample size) would be a more robust  
> comparison of the precision of the slope estimate among these  
> different regressions.
> 
Well of course the sample size differences are influencing the standard 
error values!  And so they should: if you have a larger sample size, 
then the estimates are more accurate.  Why would one want anything other 
than this to be the case?

In some cases, standard errors are calculated by dividing a standard 
deviation by sqrt(n), but these are only special cases.
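
A two-minute simulation makes the point (illustrative only, not your data): 
the standard error of the slope reported by lm() shrinks as the sample size 
grows, exactly as it should.

set.seed(1)
se_slope <- function(n) {
  x <- runif(n)
  y <- 2 * x + rnorm(n)
  summary(lm(y ~ x))$coefficients["x", "Std. Error"]
}
sapply(c(10, 50, 200), se_slope)   # the slope's standard error decreases with n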

It may be that the reviewer can provide further enlightenment, but from 
what you've written, I'm not convinced that they have the right idea.

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: probability question?

2006-05-07 Thread Anon.
Wirt Atmar wrote:
> Regarding my earlier answer to the question, two people have written me and 
> said that I misread the question. That's entirely possible:
>
>   
>> This assumes the smaller circle is defined within the bigger.  If the 
>> question concerns *any* smaller circle within the bigger one, then the 
>> 
> first 
>   
>> seed could fall anywhere.
>> 
>
> Probabilities are always calculated on the basis of what specifies success or 
> failure. I was presuming that the smaller circle was pre-specified, lying 
> somewhere within the larger one. If that were the case, my previous answer 
> would 
> be correct.
>
> However the way that others are reading the question is that the first seed 
> specifies the center of the smaller circle, essentially the same as if the question 
> were more along the lines of "what's the probability of getting a duplicate?" 
> In that case, the first draw specifies what a duplicate would be. In this 
> alternate interpretation, the drop of the first seed doesn't count. Rather, 
> its 
> position will be used to specify the center of the circle. If that is the 
> case, 
> then the probability of getting the next two seeds within that first 
> seed-specified circle is 1/4 * 1/4 = 0.0625.
>
>   
Not true: the later seeds still have to fall inside the larger circle, 
so if the first seed drops near its edge, the chance that the next seed 
lands within half a metre of it is less than 1/4.

Ah, but now I can see the approach: let the first seed drop; if the 
probability that one subsequent seed falls within the half-metre circle 
around it is p, then the probability that the next two both do is p^2.  
You just calculate p for every possible position of the first seed in 
the unit circle, and integrate over the area.

I'll let someone else sort out the details.  :-)
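
Or check it numerically: here's a quick Monte Carlo sketch of that reading 
(assuming seeds fall uniformly over a circle of radius 1 m, with the small 
circle, of radius 0.5 m, centred on the first seed):

set.seed(42)
drop_seed <- function() {                 # a uniform point in the unit circle
  repeat {
    p <- runif(2, -1, 1)
    if (sum(p^2) <= 1) return(p)
  }
}
n_sim <- 20000
hits <- replicate(n_sim, {
  s1 <- drop_seed(); s2 <- drop_seed(); s3 <- drop_seed()
  sqrt(sum((s2 - s1)^2)) <= 0.5 && sqrt(sum((s3 - s1)^2)) <= 0.5
})
mean(hits)   # estimated probability that seeds 2 and 3 land within 0.5 m of seed 1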

> However, I am still a little reluctant to endorse that interpretation simply 
> because the original question asked: "What is the probability that all three 
> will be clustered within a circle of one-half meter radius?" It's the "all 
> three" part of the question that gives me pause.
>
>   
It's ambiguous.  The question asks about the seeds falling in "a 
circle", but is this circle defined beforehand or not?  If it is then 
your solution is right.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: R Texts

2006-03-18 Thread Anon.
Ron E. VanNimwegen wrote:

>Two R libraries (packages) that deal with multivariate ecological 
>statistics that complement each other quite well are "VEGAN" and 
>"LABDSV" written by Jari Oksanen and Dave Roberts, respectively.  If you 
>Google their web sites, you'll find some very useful tutorials as well.  
>They both depend on the package "MASS" which is included in the base 
>installation.  Other than that, most univariate techniques are available 
>in the base package "STATS" and the R-project web site has tutorials for 
>those.  Library "NLME" does mixed models if you need to go that route.
>
>  
>
R has a new thing called "Task Views" which summarises the packages that 
are useful in different areas.  The Spatial task view:
http://cran.r-project.org/src/contrib/Views/Spatial.html
includes a component on ecological data.  It would be possible for 
someone to write a task view for ecological data (hint, hint).

Unfortunately, there's nothing there that will definitively identify 
different species of woodpecker from grainy pictures.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: AIC (effective parameters)

2006-03-07 Thread Anon.
E Holmes wrote:

>>Having said that, 16 parameters is a lot as well, so, without 
>>knowing anything about your model, you might want to look for redundancy,
>>and/or try to reduce 
>>your number of parameters with PCA or some other technique.
>>
>>
>
>Gareth Russells' response to Bonnie's questions brings up a recent issue
>I've been struggling with with AIC.  
>
>Burnham and Anderson seem to gloss over a bit what is a parameter in the
>latest edition (e.g. section 6.8.5).  Many models I come across are complex
>and hierarchical, and it's not obvious what the effective parameter size is.
> So before I could apply Russell's recommendations about reducing the number
>of parameters via PCA or something else, I would need a good definition of
>'parameter' or 'redundancy (in the context of AIC or other model complexity
>indices).
>
>The closest I've come to a definition of parameter size, K, so far is that
>
>K ≈ trace(Fisher information matrix × large-sample var-cov matrix for the MLE)
>when the true model is within the candidate models.
>ref. chap 7 (and esp. pages 385 and eqn. 7.13) in Burnham and Anderson (ed.
>2).  This seems to be basically the definition of effective parameters, p_D,
>in "Bayesian measures of model complexity and fit", Spiegelhalter et al. 
>Journal of the Royal Statistical Society: Series B (Statistical
>Methodology), Volume 64, Number 4, 2002, pp. 583-639(57).
>
>  
>
David Spiegelhalter recently admitted that that's what they were aiming 
for in their definition of p_D.  i.e. they were working backwards from 
the answer they wanted!

>I've had a hard time finding papers discussing the estimation (or
>computation) of 'effective parameter' size except the Spiegelhalter et al
>paper, and that is focused on Bayesian metrics.
>
>  
>
>Has anyone come across papers discussing this issue of effective parameter
>size for AIC?  Spiegelhalter et al use p_D for in the computation for DIC (a
>Bayesian model comparison metric), but it's not clear to me why you couldn't
>use the same p_D in a computation for AIC.
>
>  
>
David has some nice slides from a recent meeting we had here:

Slide 22 summarises the relationship between the different measures of 
model plausibility.  I'm hoping that David will write something more 
formal about this.

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: AIC

2006-03-06 Thread Anon.
Wirt Atmar wrote:

>Gareth Russell writes:
>
>  
>


>Gareth's comments do allow me however an opportunity to expand a little bit 
>on my previous posting. I personally hold David Anderson and Ken Burnham in 
>very high regard, but I worry that the AIC is being oversold to the ecological 
>community -- for two different reasons.
>
>The first is that there is a high degree of arbitrariness to the formulation of 
>the AIC. Cost-of-complexity penalties are common in engineering equations, but 
>the penalty rate isn't something assigned by God or Mother Nature. It depends 
>more on the whims of the equation's author at the time he first wrote the 
>equation than anything else.
>
>
>  
>
Things are not quite as bad as this.  One thing Burnham and Anderson 
don't make clear is that AIC is a measure of the predictive ability of a 
model: it finds the model that minimises the predictive loss.  Hence, 
there is a solid theoretical underpinning for it.

The problem, in ecology at least, is that few ecologists are actually 
interested in building models that will be used for prediction.  Instead 
they're more interested in explanation (i.e. finding out which factors 
explain the abundance of blue retractable ballpoint pens in their native 
habitats).  The complexity that is needed for the final model will 
depend on a host of factors (amount of data, how much is already known 
about the system etc.), and it's difficult to see how to develop a model 
selection criterion that will take these into account in an easy way (in 
principle it can be done, by defining a loss function).

>And that brings me to my second concern. All models are not equal in their 
>value to us. The equations of Shannon, Boltzmann, Clausius, Kepler and 
>Einstein 
>represent fundamental understandings of the governing rules of the universe. 
>And in that regard, they represent a deep human understanding, which is of 
>course the primary goal of science. Indeed, Einstein's E = mc^2 was such a 
>triumph 
>because it doesn't even require a scaling factor to relate such previously 
>disparate qualities as mass and energy. Due to earlier careful measurements, 
>we 
>had already gotten the units correct.
>
>This is a qualitatively different condition than sequentially running through 
>every conceivable polynomial model, attempting to choose the best solely by 
>means of some mechanical metric such as the AIC. If that's done, in the end 
>nothing has been learned, and the question becomes: was it even worth the 
>effort? 
>I would have a terrible time calling this scattershot procedure science.
>
>  
>
I would hope that most of us would agree.  At the very least, one should 
look at a range of plausible models, and find the ones that make 
substantive sense.  Don't ask a statistician to do your thinking for you!

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: The Earth is round (B<0.5)

2006-02-27 Thread Anon.
DeSolla,Shane [Burlington] wrote:

>Shouldn't it be, "The Earth is round (B<.05)"?
>
>After all, the null hypothesis would be that the Earth was round, and
>rejecting it would give you, "The Earth is not round (P<.05)"
>
>According to Cohen, you can only accept a null hypothesis to be true if
>the power was high enough to detect the smallest relevant effect size.
>Thus, if the Earth did not differ significantly from being round, and
>you had a high enough power to detect a relevant degree of "roundness",
>then you could declare the Earth was round (Power = 0.95, or B = 0.05;
>or whatever your acceptable cutoff for power). The p, of course, would
>be bigger than 0.05, or whatever value of alpha you are using.
>
>Although I am not worthy enough of statistics to comment on this, some
>statisticians say you should never use a P-value. But that is for
>Bayesians to comment upon...
>
>  
>
It's one of those amusing little ironies that make the world what it is 
that Bayesians are so associated with anti-P-valueism.  I think it's 
fair to say that most applied statisticians know the problems with 
p-values, but this knowledge hasn't percolated down far enough yet.  The 
irony here is that what most people think a p-value is, is actually the 
Bayesian version.

Before I start ranting, I'll pass on this link to a selection of short 
articles about stats, aimed at medics but still useful:

Have fun!

Bob

-- 
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org


Re: hypothesis testing vs descriptive statistics

2006-02-27 Thread Anon.
Malcolm McCallum wrote:
> I vaguely recall an exchange on here regarding the role of hypothesis
> testing and the statistical validity of this approach.  Any citations or
> comments would be greatly appreciated!

There's enough here to keep you going:


This is probably one of the more important papers: it sparked off a big 
discussion amongst pshycologists:
  Cohen, J.  1994.  The earth is round (p<.05). American Psychologist 
49:997-1003.

My only comment is to observe that nobody should be allowed to use a 
p-value unless they can demonstrate that they fully understand what it 
means.

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


JOB: Post-doc in ecological modelling

2006-01-26 Thread Anon.
A post-doc position is available in the Dept. of Mathematics and 
Statistics at the University of Helsinki from April 2006 to the end of 
August 2007. The applicant will work on the development of models for 
risk assessment of the effects of release of GMOs. In particular, the 
successful applicant will continue the development of life history 
models which can be used to model the effects of a GMO on the growth and 
fitness of a plant. The aim is to use these models to investigate the 
effects of the GMO on specific endpoints (such as invasion by the 
transgene).

The successful applicant should have experience in ecological modelling, 
and an interest in working on applied problems.  The work will be part 
of the ARGUE project, and will be done in collaboration with SYKE, the 
Finnish Environment Institute.  For more details contact Bob O'Hara 
([EMAIL PROTECTED]). Applications, including a CV, should also be 
sent to Bob O'Hara.  The closing date for applications is February 17th 
2006.

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org


Re: Online journals and publications: time to revolt? Copyright law

2005-12-22 Thread Anon.
Dekker, Jasja wrote:
> Dear all,
> 
> So why do we keep submitting papers to this group of journals? Are page
> charges so widespread that fitting, but free-of-charge, journals are so
> rare? I think the scientific community, being both primary producer and
> consumer of the journals, has more power than it thinks!
> 
I don't think the ESA is going to like you writing _that_ on one of its 
newsgroups!

Seriously, the problem is that journals have to be paid for somehow.  I 
did see a quote that a paper in an online journal costs $100 to publish. 
  That can only be reduced by people working for free: I'm part of the 
editorial board of an online journal (Journal of Negative Results - 
Ecology and Evolutionary Biology: www.jnr-eeb.org) and we don't charge 
for submissions or downloading.  But we can only do that because of 
goodwill from several sides (in particular the managing editor).

> I hope you all will forgive me for starting again on copyright law, BUT:
> does transfering copyright to the journal mean we can not offer our own
> papers on personal websites?
> 
Alas, in some cases yes.  We should give the ESA its due here for 
allowing authors to put their papers on the web.

Bob

-- 
Bob O'Hara

Dept. of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax:  +358-9-191 51400
WWW:  http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org