Andrea,

I agree that one must consider both statistical significance and biological 
meaningfulness in evaluating patterns.  Considering one of these without the 
other can often get one into trouble.

Your post concerned the inability to statistically detect differences due to 
sample size limitations, and the possibility of concluding homogeneity from 
this result when it may not be the case. But as Mike mentioned, the opposite is 
also a concern. In fact, one might recall a discussion some months ago on 
Morphmet on this very issue, where large samples afforded the ability to 
discern allometric differences between groups, but where those statistical 
differences may not be biologically important. In both cases, critical thinking 
and a merger of statistical result and biological knowledge of the system are 
required to arrive at a well-reasoned understanding of the patterns in the data.

Best,

Dean

Dr. Dean C. Adams
Professor
Department of Ecology, Evolution, and Organismal Biology
Department of Statistics
Iowa State University
www.public.iastate.edu/~dcadams/
phone: 515-294-3834

From: Mike Collyer [mailto:[email protected]]
Sent: Monday, December 12, 2016 8:34 AM
To: andrea cardini <[email protected]>
Cc: [email protected]
Subject: Re: brief comment on non-significance Re: [MORPHMET] procD.allometry 
with group inclusion

Andrea,

My opinion on this is that the researcher who has collected the data must 
retain at all times a biological wisdom that supersedes a suggested course of 
action based on results from a statistical test.  If the purpose of a study is 
to assess the allometric pattern of shape variation within populations, then 
maybe the results of a homogeneity of slopes test can be an unnecessary burden. 
 If a researcher wants to compare the mean shapes of different groups but is 
concerned that allometric variation might differ among groups, then a 
homogeneity of slopes test could be an important first step, but I agree that a 
non-significant result should not spur the researcher to immediately conclude a 
common allometry or no allometry is appropriate.  Sample size, variation in 
size among groups, and appropriate distributions of specimen size within groups 
might all be things to think about.

The point you make about a potential type II error is a real concern.  The 
opposite problem is also a real concern.  One might have very large sample 
sizes and sufficient statistical power to suggest that allometric slopes are 
heterogeneous.  However, the coefficient of determination and/or effect size 
for size:group interaction might be quite small.  Just because there is a low 
probability of finding as large an effect based on thousands of random 
permutations, is one ready to accept that different groups have evolved unique 
allometric trajectories?  It is easy to forget that the choice of “significance 
level” - the a priori acceptable rate of type I error - is arbitrary.  Making 
strong inferential decisions based on a binary decision for an arbitrary 
criterion is probably not wise.  I would argue that instead of focusing on a 
P-value, one could just as arbitrarily, but perhaps more justifiably, choose a 
coefficient of determination of R^2 = 0.10 or an effect size of 2 SD as a 
criterion for whether to retain or omit the interaction coefficients that allow 
for heterogeneous slopes.
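
To make the two alternative criteria concrete, here is a minimal Python sketch (not geomorph code). It assumes that R^2 for the interaction is its sum of squares divided by the total sum of squares, and that "effect size" means the number of standard deviations by which the observed sum of squares exceeds the mean of the sums of squares from random permutations. The function names and all numbers are made up for illustration only.

```python
from statistics import mean, stdev
from typing import Sequence


def r_squared(ss_effect: float, ss_total: float) -> float:
    """Proportion of total variation attributed to the effect."""
    return ss_effect / ss_total


def effect_size_z(ss_observed: float, ss_random: Sequence[float]) -> float:
    """Observed SS expressed in SDs of the permutation distribution (assumed definition)."""
    return (ss_observed - mean(ss_random)) / stdev(ss_random)


# Hypothetical values: a size:group effect that is tiny in R^2 terms
# but far out in the tail of its permutation distribution.
ss_interaction = 0.004
ss_total = 0.10
ss_from_permutations = [0.0010, 0.0012, 0.0009, 0.0011, 0.0013]

r2 = r_squared(ss_interaction, ss_total)
z = effect_size_z(ss_interaction, ss_from_permutations)

# The two criteria mentioned above, evaluated separately.
print("R^2 =", round(r2, 3), "-> retain by R^2 >= 0.10?", r2 >= 0.10)
print("Z   =", round(z, 2), "-> retain by Z >= 2 SD?", z >= 2.0)
```

With these invented numbers the two criteria disagree, which is the large-sample situation described above: which criterion to trust is a biological judgment, not something the arithmetic settles.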

*** Warning: pedantic discussion on model selection starts here.  Skip if 
unappealing.

One could also turn to model selection approaches.  However, I think 
multivariate generalization for indices like AIC is an area lacking needed 
theoretical research for high-dimensional shape data.  There are two reasons 
for this.  First, the oft-defined AIC is -2(model log-likelihood) + 2K, where K is 
the number of coefficients in a linear model (rank of the model design matrix) 
+ 1, where the 1 is the dimension of the value for the variance of the error.  
This is a simplification for univariate data.  The second half of the equation 
is actually 2[pk + 0.5p(p+1)], where p is the number of shape variables and k 
is the rank of the design matrix.  (One might define p as the rank of the shape 
variable matrix - the number of actual dimensions in the tangent space, also 
equal to the number of principal components with positive eigenvalues 
from a PCA - if using high-dimensional data or small samples.)  Notice 
that substituting 1 for p in this equation gets one back to the 2K, as defined 
first.  The pk part of the equation represents the dimensions of linear model 
coefficients; the 0.5p(p+1) part represents the dimensions of the error 
covariance matrix.  The reason this is important is that one might have picked 
up along the way that a delta AIC of 1-2 means two models are comparable (as if 
with equal likelihoods, they differ by around 1 parameter or less).  This rule 
of thumb would have to be augmented with highly multivariate data to 1*p to 
2*p, which makes it hard to have a good general sense of when models are 
comparable, unless one takes into consideration how many shape variables are in 
use.
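
The parameter count above is easy to check numerically. The following is a minimal Python sketch of the penalty term 2[pk + 0.5p(p+1)] only; the function name `aic_penalty` and the values of p and k are illustrative assumptions, not part of any package.

```python
def aic_penalty(p: int, k: int) -> int:
    """Penalty 2[pk + 0.5p(p+1)] for p response variables and design-matrix rank k."""
    # p*k counts the regression coefficients; p(p+1)/2 counts the unique
    # elements of the p x p error covariance matrix (p(p+1) is always even).
    return 2 * (p * k + p * (p + 1) // 2)


# Univariate check: with p = 1 the penalty is 2(k + 1), i.e. the familiar 2K.
k = 3
print("p = 1, k = 3:", aic_penalty(1, k))

# Extra penalty for adding one row of coefficients (k -> k + 1) grows with p.
for p in (1, 10, 50):
    print("p =", p, "extra penalty:", aic_penalty(p, k + 1) - aic_penalty(p, k))
```

Adding one row of coefficients costs 2p in the penalty, which is why a rule of thumb stated for univariate data has to be scaled by the number of shape variables.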

Second, the log-likelihood involves calculating the determinant of the error 
covariance matrix, which is problematic for singular matrices, as might be 
found with high-dimensional shape data.  Recently, colleagues and I have used 
plots of the log of the trace of error covariance matrices versus the log of 
parameter penalties - the 2[pk + 0.5p(p+1)] part - as a way of scanning 
candidate models for the one or two that have lower error relative to the 
number of parameters in the model.  Such an approach allows one to have no 
allometric slope, a common allometric slope, and unique allometric slopes, in 
combination with other important factors, and consider many models at once.  
But again, there is a certain level of arbitrariness to this.
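
As a rough idea of what goes into such a plot, here is a minimal Python/NumPy sketch that computes the two quantities for three candidate models. It is not the code used by the authors: the data are simulated, the error covariance matrix is assumed to be the residual SSCP matrix divided by n, and the function name is hypothetical.

```python
import numpy as np


def log_trace_and_penalty(X, Y):
    """Return (log trace of error covariance, log of 2[pk + 0.5p(p+1)])."""
    n, p = Y.shape
    k = np.linalg.matrix_rank(X)
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    E = Y - X @ B
    # Assumption: error covariance = E'E / n, so its trace is the
    # summed squared residuals over all variables divided by n.
    trace = np.sum(E ** 2) / n
    penalty = 2 * (p * k + 0.5 * p * (p + 1))
    return float(np.log(trace)), float(np.log(penalty))


# Simulated stand-in for shape data: one common slope on size, two groups.
rng = np.random.default_rng(0)
n, p = 60, 5
size = rng.normal(size=n)
grp = np.repeat([0.0, 1.0], n // 2)
Y = np.outer(size, np.linspace(0.5, 1.0, p)) + 0.2 * rng.normal(size=(n, p))

ones = np.ones(n)
models = {
    "group only (no slope)": np.column_stack([ones, grp]),
    "common slope": np.column_stack([ones, size, grp]),
    "unique slopes": np.column_stack([ones, size, grp, size * grp]),
}
results = {name: log_trace_and_penalty(X, Y) for name, X in models.items()}
for name, (log_tr, log_pen) in results.items():
    print(f"{name:22s} log(trace) = {log_tr:7.3f}   log(penalty) = {log_pen:6.3f}")
```

Plotting the first value against the second for each model gives the kind of scan described above; using the trace avoids the determinant, so singular covariance matrices are not a problem.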

*** End pedantic discussion

There are other issues that can be quite real with real data.  For example, if 
one wishes to consider if there are shape differences among groups but first 
wishes to address if there is meaningful allometric shape variation, and 
whether there might be different allometries among groups, a homogeneity of 
slopes test might be done.  But what if it is revealed that one group has all 
small specimens and one group has all large specimens?  The researcher knows 
better than anyone else whether this is sampling error or a biological 
phenomenon.  How to proceed should not rest solely on an outcome from a 
statistical test.  For example, if the specimens are adult organisms and 
represent large individuals within populations, one might want to discuss shape 
differences without adjusting for allometry, as well as discuss size 
differences.  A discussion of allometries in this case might obscure what is 
really most important, that maybe two populations evolved size and shape 
differences because of some ecologically meaningful reason, for example.

So I agree with you, and more.  “No significance” or “significance” is only 
part of the evaluation.  Effect sizes and assessment of sampling errors, 
biases, or limitations should also be considered.  And no matter what, careful 
communication that reveals the researcher’s logic needs to be made in published 
articles.

Just my opinion,
Mike

On Dec 12, 2016, at 2:40 AM, andrea cardini 
<[email protected]> wrote:

Dear All,
if I can, I'd add a brief comment on the interpretation of non-significant 
results. I'd appreciate this to be checked by those with a proper understanding 
and background on stats (which I haven't!).
I use Mike's sentence on non-significant slopes as an example but the issue is 
a general one, although I find it particularly tricky in the context of 
comparing trajectories (allometries or other) across groups. Mike wisely said 
"approximately" ("If not significant, then the slope vectors are APPROXIMATELY 
parallel"). With permutations, one might be able to perform tests even when 
sample sizes are small (and maybe, which is even more problematic, 
heterogeneous across groups): then, non-significance could simply mean that 
samples are not large enough to make strong statements (rejection of the null 
hypothesis) with confidence (i.e., statistical power is low). Especially with short 
trajectories (allometries or other), it might happen to find n.s. slopes with 
very large angles between the vectors, a case where it is probably hard to 
conclude that allometries really are parallel.
Small samples are a curse of many studies in taxonomy and evolution. 
We've done a couple of exploratory (not very rigorous!) empirical analyses of 
the effect of reducing sample sizes on means, variances, vector angles etc. in 
geometric morphometrics (Cardini & Elton, 2007, Zoomorphol.; Cardini et al., 
2015, Zoomorphol.) and some, probably most, of these literally blow up when N 
goes down. That happened even when differences were relatively large (species 
separated by several millions of years of independent evolution or samples 
including domestic breeds hugely different from their wild counterpart).
Unless one has done power analyses and/or has very large samples, I'd be 
careful with the interpretations. There's plenty on this in the difficult (for 
me) statistical literature. Surely one can do sophisticated power analyses in R 
and, although probably and unfortunately not used by many, one of the programs 
of the TPS series (TPSPower) was written by Jim exactly for this aim (possibly 
not for power analyses in the case of MANCOVAs/vector angles but certainly in 
the simpler case of comparisons of means).
Cheers

Andrea

On 11/12/16 19:17, Mike Collyer wrote:
Dear Tsung,

The geomorph function, advanced.procD.lm, allows one to extract group slopes 
and model coefficients.  In fact, procD.allometry is a specialized function 
that uses advanced.procD.lm to perform the HOS test and then uses procD.lm to 
produce an ANOVA table, depending on the results of the HOS test.  It also uses 
the coefficients and fitted values from procD.lm to generate the various types 
of regression scores.  In essence, procD.allometry is a function that carries 
out several analyses with geomorph base functions, procD.lm and 
advanced.procD.lm, in a specified way.  By comparison, the output is more 
limited, but one can use the base functions to get much more output.

In advanced.procD.lm, if one specifies groups and a slope, one of the outputs 
is a matrix of slope vectors.  Also, one can perform pairwise tests to compare 
either the correlation or angle between slope vectors.

Regarding the operation of the HOS test, it is a permutational test that does 
the following: calculate the sum of squared residuals for a “full” model, shape 
~ size + group + size:group and the same for a “reduced” model, shape ~ size + 
group.  (The sum of squared residuals is the trace of the error SSCP matrix, 
which is the same as the sum of squared residuals summed over every shape 
variable.)  The difference between these two values is the sum of squares for 
the size:group effect.  If significantly large (i.e., is found with low 
probability in many random permutations), one can conclude that the 
coefficients for this effect are collectively large enough to justify 
retaining this effect, as the slope vectors are (at least in part) not 
parallel.  If not significant, then the slope vectors are approximately 
parallel, and the effect can be removed from the model.  A randomized residual 
permutation procedure is used, which randomizes the residual vectors of the 
reduced model in each random permutation to obtain random pseudo-values, 
repeating the sum of squares calculations each time.
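
The procedure just described can be sketched in a few lines of Python/NumPy. This is only an illustration of the logic under stated assumptions, not the geomorph implementation: the function names `rss` and `hos_test` are hypothetical, groups are coded with 0/1 dummy variables, at least two groups are required, and the data are simulated.

```python
import numpy as np


def rss(X, Y):
    """Trace of the error SSCP matrix: squared residuals summed over all variables."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    E = Y - X @ B
    return float(np.sum(E ** 2))


def hos_test(Y, size, group, n_perm=999, seed=0):
    """Permutation test of the size:group term using reduced-model residuals."""
    rng = np.random.default_rng(seed)
    n = len(size)
    levels = np.unique(group)
    # Dummy columns for every group except the first (reference) group.
    D = np.column_stack([(group == g).astype(float) for g in levels[1:]])
    X_red = np.column_stack([np.ones(n), size, D])          # shape ~ size + group
    X_full = np.column_stack([X_red, D * size[:, None]])    # ... + size:group

    B_red, *_ = np.linalg.lstsq(X_red, Y, rcond=None)
    fitted = X_red @ B_red
    resid = Y - fitted

    ss_obs = rss(X_red, Y) - rss(X_full, Y)   # SS for size:group
    ss_rand = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle whole residual vectors (rows) and add them back to the
        # reduced-model fitted values to obtain random pseudo-values.
        Y_star = fitted + resid[rng.permutation(n)]
        ss_rand[i] = rss(X_red, Y_star) - rss(X_full, Y_star)

    # The observed value counts as one of the permutations.
    p_value = (1 + np.sum(ss_rand >= ss_obs)) / (n_perm + 1)
    return ss_obs, float(p_value)


# Simulated example: two groups with clearly different slopes on three variables.
rng = np.random.default_rng(1)
n = 40
size = rng.normal(size=n)
group = np.repeat(["A", "B"], n // 2)
slope = np.where(group == "A", 1.0, -1.0)[:, None] * np.ones((1, 3))
Y = slope * size[:, None] + 0.1 * rng.normal(size=(n, 3))

ss, p = hos_test(Y, size, group, n_perm=199)
print("SS(size:group) =", round(ss, 3), " P =", p)
```

Rows of the residual matrix are permuted intact, so the covariance among shape variables within a specimen is preserved in every random permutation.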

Regarding your final question, yes, you are correct.  In a case like this, one 
might conclude that logCS is not a significant source of shape variation, and 
proceed with other analyses that do not include it as a covariate.  In either 
case - whether it is retained as a covariate or excluded - advanced.procD.lm 
will allow one to perform pairwise comparison tests among groups.

Cheers!
Mike

On Dec 11, 2016, at 10:56 AM, Tsung Fei Khang 
<[email protected]> wrote:

Dear Mike,

Many thanks for the reply!

When the procD.allometry function performs the HOS test with multiple group labels 
given, does it compute the regression vectors for each group, and then test 
whether the coefficients of these vectors are equal, using some multivariate 
statistical test? If so, is there an option that outputs the regression 
vectors? Given the high frequency of the latter being discussed in the primary 
GM literature, it seems important to be able to extract this result from the 
function.

Finally, on the interpretation side - If group variation is significant, but 
not logCS, then under the model shape~size+group, does this imply that shape 
variation is mainly explained by variation in species, and allometry is absent?

Regards,

T.F.

On Thursday, December 8, 2016 at 6:08:17 PM UTC+8, Mike Collyer wrote:
Dear Tsung,

The procD.allometry function performs two basic processes when groups are 
provided.  First, it does a homogeneity of slopes (HOS) test.  This test 
ascertains whether two or more groups have parallel or unique slopes (the 
latter meaning at least one group’s slope is different from the others).  The 
HOS test constructs two linear models: shape ~ size + group and shape ~ size + 
group + size:group, and performs an analysis of variance to determine if the 
size:group interaction significantly reduces the residual error produced.  
(Note: log(size) is a possible and default choice in this analysis.)

After this test, procD.allometry then provides an analysis of variance on each 
term in the resulting model from the HOS test.

Regarding your question, if the HOS test reveals there is significant 
heterogeneity in slopes, the coefficients returned allow one to find the unique 
linear equations, by group, which would be found from separate runs on 
procD.allometry, one group at a time.  If the HOS test reveals that there is 
not significant heterogeneity in slopes, the coefficients constrain the slopes 
for different groups to be the same (parallel).

Finally, and I think more to your point, the projected regression scores are 
found by using, for a (in the Xa calculation you note), the coefficients that 
represent a common or individual slope from the linear model produced.  The 
matrix of coefficients, B, is arranged as first row = intercept, second row = 
common slope, next rows (if applicable) are coefficients for the group factor 
(essentially change the intercept, by group), and finally, the last rows are 
the coefficients for the size:group interaction (if applicable), which change 
the common slope to match each group’s unique slope.  Irrespective of the 
complexity of this B matrix, a is found as the second row.  If you run 
procD.allometry group by group, it is the same as (1) asserting that group 
slopes are unique and (2) changing a to match not the common slope, but the 
summation of the common slope and the group-specific slope adjustment.  One 
could do that, but would lose the ability to compare the groups in the same 
plot, as each group would be projected on a different axis.
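
The layout of B and the two choices for a can be illustrated with a small Python/NumPy sketch. It is not geomorph output: the data are simulated, there are only two groups, and groups are coded with a 0/1 dummy variable, so the second row of B in the interaction model is the slope of the reference group. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 4
size = rng.normal(size=n)
is_b = np.repeat([0.0, 1.0], n // 2)                # 0 = group A, 1 = group B
true_slopes = np.array([[1.0, 0.5, 0.0, -0.5],      # group A
                        [1.0, 0.5, 1.0, 0.5]])      # group B
Y = size[:, None] * true_slopes[is_b.astype(int)] + 0.05 * rng.normal(size=(n, p))

# Rows of B: intercept, slope, group effect, size:group interaction.
X = np.column_stack([np.ones(n), size, is_b, size * is_b])
B, *_ = np.linalg.lstsq(X, Y, rcond=None)

# a is taken from the second row of B and scaled to unit length.
a = B[1] / np.linalg.norm(B[1])
scores = Y @ a                                      # s = Xa in the question's notation

# Group-specific slopes recovered from the same B matrix.
slope_a = B[1]
slope_b = B[1] + B[3]                               # slope row + group-specific adjustment

# Fitting group B on its own gives the same slope vector.
mask = is_b == 1.0
X_b = np.column_stack([np.ones(mask.sum()), size[mask]])
B_b, *_ = np.linalg.lstsq(X_b, Y[mask], rcond=None)
print("max difference:", np.max(np.abs(B_b[1] - slope_b)))
```

Projecting every specimen on the single vector a keeps all groups on one axis; projecting each group on its own slope vector would put each group on a different axis, as noted above.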

Hope that helps.

Mike


On Dec 8, 2016, at 3:37 AM, Tsung Fei Khang 
<[email protected]> wrote:

Hi all,

I would like to use procD.allometry to study allometry in two species.

I understand that the function returns the regression score for each specimen 
as Reg.proj, and that the calculation is obtained as:
s = Xa, where X is the nxp matrix of Procrustes shape variables, and a is the 
px1 vector of regression coefficients normalized to 1. I am able to verify this 
computation from first principles when all samples are presumed to come from 
the same species.

However, what happens when we are interested in more than 1 species (say 2)? I 
could run procD.allometry by including the species labels via f2=~gps, where 
gps gives the species labels. Is there just 1 regression vector (which feels 
weird, since this should be species-specific), or 2? If so, how can I recover 
both vectors? What is the difference between including f2=~gps using all data, 
compared to if we make two separate runs of procD.allometry, one for samples 
from species 1, and another for samples from species 2?

Thanks for any help.

Rgds,

TF






" PENAFIAN: E-mel ini dan apa-apa fail yang dikepilkan bersamanya ("Mesej") 
adalah ditujukan hanya untuk kegunaan penerima(-penerima) yang termaklum di 
atas dan mungkin mengandungi maklumat sulit. Anda dengan ini dimaklumkan bahawa 
mengambil apa jua tindakan bersandarkan kepada, membuat penilaian, mengulang 
hantar, menghebah, mengedar, mencetak, atau menyalin Mesej ini atau sebahagian 
daripadanya oleh sesiapa selain daripada penerima(-penerima) yang termaklum di 
atas adalah dilarang. Jika anda telah menerima Mesej ini kerana kesilapan, anda 
mesti menghapuskan Mesej ini dengan segera dan memaklumkan kepada penghantar 
Mesej ini menerusi balasan e-mel. Pendapat-pendapat, rumusan-rumusan, dan 
sebarang maklumat lain di dalam Mesej ini yang tidak berkait dengan urusan 
rasmi Universiti Malaya adalah difahami sebagai bukan dikeluar atau diperakui 
oleh mana-mana pihak yang disebut.

DISCLAIMER: This e-mail and any files transmitted with it ("Message") is 
intended only for the use of the recipient(s) named above and may contain 
confidential information. You are hereby notified that the taking of any action 
in reliance upon, or any review, retransmission, dissemination, distribution, 
printing or copying of this Message or any part thereof by anyone other than 
the intended recipient(s) is strictly prohibited. If you have received this 
Message in error, you should delete this Message immediately and advise the 
sender by return e-mail. Opinions, conclusions and other information in this 
Message that do not relate to the official business of University of Malaya 
shall be understood as neither given nor endorsed by any of the forementioned. "

--
MORPHMET may be accessed via its webpage at 
http://www.morphometrics.org
---
You received this message because you are subscribed to the Google Groups 
"MORPHMET" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected]<http://morphometrics.org/>.



--



Dr. Andrea Cardini

Researcher, Dipartimento di Scienze Chimiche e Geologiche, Università di Modena 
e Reggio Emilia, Via Campi, 103 - 41125 Modena - Italy

tel. 0039 059 2058472



Adjunct Associate Professor, School of Anatomy, Physiology and Human Biology, 
The University of Western Australia, 35 Stirling Highway, Crawley WA 6009, 
Australia



E-mail address: [email protected]<mailto:[email protected]>, 
[email protected]<mailto:[email protected]>

WEBPAGE: https://sites.google.com/site/alcardini/home/main



FREE Yellow BOOK on Geometric Morphometrics: 
http://www.italian-journal-of-mammalogy.it/public/journals/3/issue_241_complete_100.pdf



ESTIMATE YOUR GLOBAL FOOTPRINT: 
http://www.footprintnetwork.org/en/index.php/GFN/page/calculators/
