Re: [MORPHMET] Re: number of landmarks and sample size

2017-06-11 Thread Justin Bagley
Hi Will,

I think you meant to say that you are writing a study design paper
presenting results of simulations and power analysis to determine
appropriate sample sizes for multivariate analyses in geometric
morphometrics. But I would think that would have already been settled by
now, and possibly would be more relevant for certain clustering methods.
The only parameterized PCA variant I am aware of is Kernel PCA, which is a
nonlinear PCA method used for pattern analysis (e.g. used in image
analysis), but that is not often employed in biological geometric
morphometrics papers (at least, those that I frequently come across). When
kernels are used, they are usually meant to estimate densities of
reduced-dimensionality data such as centroid size (CS) or PCs used as shape
variables.

Best,

Justin

Justin C. Bagley, Ph.D.
Postdoctoral Scholar
Plant Evolutionary Genomics Laboratory
Department of Biology
Virginia Commonwealth University
Richmond, VA 23284-2012
jcbag...@vcu.edu

Senior/Postdoctoral Research Associate
Departamento de Zoologia
Universidade de Brasília
Campus Universitário Darcy Ribeiro
70910-900 Brasília, DF, Brasil
Website: http://www.justinbagley.org
Lattes CV: http://lattes.cnpq.br/0028570120872581

On Wed, May 31, 2017 at 6:41 PM, William Gelnaw wrote:

> I'm currently working on a paper that deals with the problem of
> over-parameterizing PCA in morphometrics.  The recommendations that I'm
> making in the paper are that you should try to have at least 3 times as
> many samples as variables.  That means that if you have 10 2D landmarks
> (i.e., 20 coordinate variables), you should have at least 60 specimens that
> you measure.  Based on
> simulations, if you have fewer than 3 specimens per variable, you quickly
> start getting eigenvalues for a PCA that are very different from known true
> eigenvalues.  I did a literature survey and about a quarter of
> morphometrics studies in the last decade haven't met that standard.  A good
> way to test if you have enough samples is to do a jackknife analysis.  If
> you cut out about 10% of your observations and still get the same
> eigenvalues, then your results are probably stable.
>   I hope this helps.
>   - Will
>
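Will's rule of thumb and jackknife check can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code or simulation design from his paper: the function names, the use of random 10% subsets, and the "mean relative deviation of the leading eigenvalues" summary are all illustrative choices, and NumPy is assumed to be installed.

```python
import numpy as np


def required_specimens(n_landmarks, n_dims=2, ratio=3):
    """Specimens suggested by the 'ratio specimens per variable' rule of thumb."""
    return ratio * n_landmarks * n_dims


def pca_eigenvalues(data):
    """Eigenvalues of the sample covariance matrix, largest first."""
    cov = np.cov(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]


def jackknife_eigenvalue_spread(data, drop_fraction=0.10, n_reps=200,
                                n_keep=3, seed=0):
    """Mean relative change in the leading eigenvalues when about
    drop_fraction of the specimens (rows) is removed at random."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    n_drop = max(1, int(round(drop_fraction * n)))
    full = pca_eigenvalues(data)[:n_keep]
    reps = np.empty((n_reps, n_keep))
    for i in range(n_reps):
        keep = rng.choice(n, size=n - n_drop, replace=False)
        reps[i] = pca_eigenvalues(data[keep])[:n_keep]
    # Small values mean the eigenvalues barely move when specimens are dropped.
    return np.mean(np.abs(reps - full), axis=0) / full


if __name__ == "__main__":
    # Hypothetical data: 60 specimens, 20 variables with unequal variances.
    rng = np.random.default_rng(1)
    data = rng.normal(size=(60, 20)) * np.linspace(3.0, 0.5, 20)
    print(required_specimens(10))
    print(jackknife_eigenvalue_spread(data))
```

How small the spread has to be before the results count as "stable" is a judgment call that the thread does not settle.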
> On Wed, May 31, 2017 at 1:31 PM, mitte...@univie.ac.at <
> mitte...@univie.ac.at> wrote:
>
>> Adding more (semi)landmarks inevitably increases the spatial resolution
>> and thus allows one to capture finer anatomical details - whether relevant
>> to the biological question or not. This can be advantageous for the
>> reconstruction of shapes, especially when producing 3D morphs by warping
>> dense surface representations. Basic developmental or evolutionary trends,
>> group structures, etc., often are visible in an ordination analysis with a
>> smaller set of relevant landmarks; finer anatomical resolution does not
>> necessarily affect these patterns. However, adding more landmarks cannot
>> reduce or even remove any signals that were found with fewer landmarks, but
>> it can make ordination analyses and the interpretation of distances and
>> angles in shape space more challenging.
>>
>> An excess of variables (landmarks) over specimens does NOT pose problems
>> to statistical methods such as the computation of mean shapes and
>> Procrustes distances, PCA, PLS, and the multivariate regression of shape
>> coordinates on some independent variable (shape regression). These methods
>> are based on averages or regressions computed for each variable separately,
>> or on the decomposition of a covariance matrix.
>>
>> Other techniques, including Mahalanobis distance, DFA, CVA, CCA, and
>> relative eigenanalysis require the inversion of a full-rank covariance
>> matrix, which implies an excess of specimens over variables. The same
>> applies to many multivariate parametric test statistics, such as
>> Hotelling's T2, Wilks' Lambda, etc. But shape coordinates are NEVER of full
>> rank and thus can never be subjected to any of these methods without prior
>> variable reduction. In fact, reliable results can only be obtained if there
>> are many times more specimens than variables, which usually requires variable
>> reduction by PCA, PLS or other techniques, or the regularization of
>> covariance matrices (which is more common in the bioinformatics community).
>>
>> For these reasons, I do not see any disadvantage of measuring a large
>> number of landmarks, except for a waste of time perhaps. If time is an
>> issue, one can optimize landmark schemes as suggested by Jim or Aki.
>>
>> Best,
>>
>> Philipp
>>
>> --
>> MORPHMET may be accessed via its webpage at http://www.morphometrics.org
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "MORPHMET" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to morphmet+unsubscr...@morphometrics.org.
>>
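Philipp's distinction between methods that decompose a covariance matrix and methods that invert one can be illustrated numerically. The sketch below is an assumption-laden toy, not anything from his message: it uses random numbers rather than Procrustes shape coordinates, the function names are made up for illustration, and NumPy is assumed to be installed. It shows only the "more variables than specimens" part of the argument: the covariance matrix is rank deficient, PCA still works, and a Mahalanobis distance becomes computable after reduction to a few PCs.

```python
import numpy as np


def covariance_rank(data):
    """Numerical rank of the sample covariance matrix (rows = specimens)."""
    return np.linalg.matrix_rank(np.cov(data, rowvar=False))


def pc_scores(data, n_components):
    """Scores on the first n_components PCs, via SVD of the centered data.
    No matrix inversion is needed, so this works even when variables
    outnumber specimens."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def mahalanobis_to_mean(scores):
    """Squared Mahalanobis distance of each specimen to the sample mean.
    Requires an invertible covariance matrix, hence the prior reduction."""
    centered = scores - scores.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(scores, rowvar=False))
    return np.einsum("ij,jk,ik->i", centered, inv_cov, centered)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(15, 40))   # 15 specimens, 40 variables
    print(covariance_rank(data))       # at most 14, far below 40
    scores = pc_scores(data, 3)        # reduce to 3 PCs first
    print(mahalanobis_to_mean(scores))
```

With real shape coordinates the rank is reduced further by the superimposition itself, which is the point behind "NEVER of full rank" above; that step is not simulated here.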
>

Re: [MORPHMET] Phylogenetic tree in MorphoJ-regrading

2019-01-29 Thread Justin Bagley
Dear Arvind,

MorphoJ allows you to map shape variation across multiple taxa onto an
existing phylogenetic tree. So, you need to either use a published
phylogeny of your taxa (e.g. accessioned in TreeBASE or Dryad, or from the
authors) or reconstruct your own phylogenetic hypothesis of evolutionary
relationships among your focal taxa. After you have one or multiple
appropriate phylogenetic trees, you can then import them into MorphoJ (e.g.
from a single NEXUS file), so long as there are no negative branch lengths
or other formatting issues. Here is a link to the section of the MorphoJ
documentation on the topic, which includes examples of NEXUS format
accepted by the program:

http://www.flywings.org.uk/MorphoJ_guide/frameset.htm?file/ImpPhylogeny.htm
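As a rough illustration of the kind of file meant here, the following Python sketch writes a generic NEXUS TREES block from a Newick string and refuses trees with negative branch lengths. The tree topology, branch lengths, tree name, and output file name are hypothetical, the taxon labels simply reuse the ID values from the data quoted below, and the helper names are made up for this example. Whether MorphoJ accepts exactly this layout should be checked against the documentation linked above.

```python
import re


def has_negative_branch_length(newick):
    """True if any ':<number>' branch length in the Newick string is negative."""
    pattern = r":\s*(-?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
    return any(float(x) < 0 for x in re.findall(pattern, newick))


def nexus_tree_block(tree_name, newick):
    """Wrap a Newick tree in a minimal NEXUS TREES block."""
    if has_negative_branch_length(newick):
        raise ValueError("tree contains a negative branch length")
    newick = newick.strip()
    if not newick.endswith(";"):
        newick += ";"
    return "#NEXUS\nBEGIN TREES;\n  TREE {} = {}\nEND;\n".format(tree_name, newick)


if __name__ == "__main__":
    # Hypothetical tree for three taxa; not an estimate for any real species.
    tree = "((species1:1.0,species2:1.0):0.5,species3:1.5)"
    with open("example_tree.nex", "w") as handle:
        handle.write(nexus_tree_block("tree1", tree))
```

Note that the tree itself has to come from a phylogenetic analysis or a published source; the script only formats it.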

Good luck with your analysis.

Best,
Justin

Justin C. Bagley, Ph.D.
Postdoctoral Research Associate
Department of Biology
University of Missouri-St. Louis
One University Boulevard, 223 Research Building
St. Louis, MO 63121-4499
E-mail: bagl...@umsl.edu
Website: https://justinbagley.org
CV: https://justinbagley.org/pages/cv.html
Blog: https://justinbagley.rbind.io

Affiliate Researcher
Department of Biology
Virginia Commonwealth University
1000 W Cary St, Rm 126,
Richmond, VA 23284-2012



On Mon, Jan 28, 2019 at 11:07 AM Arvind Kumar Dwivedi 
wrote:

> Dear Sir/Ma'am
>
> Greetings...
>
> I am working on shape variations among fishes.
>
> I have a problem in preparing phylogenetic tree in MorphoJ software.
>
> MorphoJ is asking to upload NEXUS file to prepare phylogenetic tree.
>
> I have two major queries:
>
> 1. Do I have to prepare the NEXUS file from the 2D XY coordinate data?
>
> 2. If this is the case, how can I prepare the NEXUS file? I do not know how
> to format 2D XY data in a NEXUS file.
>
> Here is my data on 2D XY coordinate for 3 species having 3 samples in each
> species:
>
> LM=13
> 122.0 583.0
> 430.0 725.0
> 808.0 778.0
> 1083.0 748.0
> 1653.0 612.0
> 1706.0 523.0
> 1651.0 452.0
> 1489.0 462.0
> 1239.0 361.0
> 843.0 280.0
> 510.0 412.0
> 530.0 467.0
> 263.0 567.0
> IMAGE=TIL 845.JPG
> ID=species1
> SCALE=0.013768
> LM=13
> 144.0 810.0
> 455.0 970.0
> 828.0 1024.0
> 1108.0 985.0
> 1665.0 862.0
> 1712.0 770.0
> 1651.0 692.0
> 1482.0 704.0
> 1264.0 594.0
> 861.0 550.0
> 538.0 658.0
> 562.0 722.0
> 303.0 803.0
> IMAGE=TIL 846.JPG
> ID=species1
> SCALE=0.011780
> LM=13
> 75.0 935.0
> 436.0 1098.0
> 842.0 1140.0
> 1099.0 1092.0
> 1681.0 990.0
> 1735.0 902.0
> 1655.0 836.0
> 1521.0 819.0
> 1298.0 704.0
> 843.0 615.0
> 485.0 755.0
> 520.0 822.0
> 237.0 915.0
> IMAGE=TIL 847.JPG
> ID=species1
> SCALE=0.014137
> LM=13
> 43.0 3131.0
> 110.0 3183.0
> 248.0 3212.0
> 316.0 3203.0
> 550.0 3182.0
> 573.0 3154.0
> 553.0 3123.0
> 497.0 3122.0
> 433.0 3089.0
> 282.0 3046.0
> 147.0 3072.0
> 149.0 3115.0
> 78.0 3138.0
> IMAGE=TIL 402.JPG
> ID=species2
> SCALE=0.037037
> LM=13
> 83.0 3174.0
> 165.0 3222.0
> 278.0 3246.0
> 339.0 3238.0
> 549.0 3209.0
> 568.0 3175.0
> 546.0 3153.0
> 502.0 3162.0
> 434.0 3127.0
> 302.0 3100.0
> 193.0 3122.0
> 197.0 3158.0
> 132.0 3178.0
> IMAGE=TIL 403.JPG
> ID=species2
> SCALE=0.038462
> LM=13
> 84.0 3126.0
> 155.0 3174.0
> 291.0 3205.0
> 347.0 3198.0
> 581.0 3154.0
> 597.0 3119.0
> 587.0 3096.0
> 522.0 3099.0
> 450.0 3066.0
> 304.0 3048.0
> 191.0 3077.0
> 195.0 3107.0
> 126.0 3135.0
> IMAGE=TIL 404.JPG
> ID=species2
> SCALE=0.037037
> LM=13
> 45.0 1452.0
> 227.0 1601.0
> 478.0 1672.0
> 635.0 1656.0
> 1061.0 1568.0
> 1106.0 1508.0
> 1070.0 1452.0
> 840.0 1392.0
> 566.0 1317.0
> 338.0 1341.0
> 286.0 1320.0
> 177.0 1472.0
> 91.0 1469.0
> IMAGE=HKL 14.JPG
> ID=species3
> SCALE=0.014700
> LM=13
> 68.0 1477.0
> 260.0 1605.0
> 494.0 1652.0
> 640.0 1627.0
> 1036.0 1529.0
> 1097.0 1467.0
> 1037.0 1418.0
> 805.0 1358.0
> 541.0 1316.0
> 344.0 1355.0
> 296.0 1332.0
> 203.0 1494.0
> 120.0 1492.0
> IMAGE=HKL 15.JPG
> ID=species3
> SCALE=0.015385
> LM=13
> 56.0 1496.0
> 242.0 1620.0
> 506.0 1666.0
> 666.0 1645.0
> 1094.0 1532.0
> 1141.0 1471.0
> 1074.0 1423.0
> 835.0 1364.0
> 574.0 1315.0
> 341.0 1355.0
> 282.0 1341.0
> 175.0 1508.0
> 103.00
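
The records quoted above follow the TPS landmark layout (an LM= count, that many coordinate rows, then IMAGE=, ID= and SCALE= lines). Landmark data of this kind are read as a dataset in their own right, while the NEXUS file discussed in the reply holds only the tree. For readers who want to inspect such records outside MorphoJ, here is a minimal Python sketch of a reader; the function name and the shortened sample records are assumptions made for illustration, and only the keywords visible in the quoted data are handled.

```python
def parse_tps(text):
    """Parse TPS-style records into a list of specimens.

    Each specimen is a dict with 'landmarks' (list of (x, y) tuples)
    and 'meta' (keyword lines such as IMAGE, ID, SCALE)."""
    specimens = []
    current = None
    remaining = 0  # coordinate rows still expected for the current specimen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            key, value = key.strip().upper(), value.strip()
            if key == "LM":
                # LM= starts a new specimen and gives its landmark count.
                current = {"landmarks": [], "meta": {}}
                specimens.append(current)
                remaining = int(value)
            elif current is not None:
                current["meta"][key] = value
        elif current is not None and remaining > 0:
            x, y = (float(v) for v in line.split()[:2])
            current["landmarks"].append((x, y))
            remaining -= 1
    return specimens


if __name__ == "__main__":
    # Shortened, made-up records in the same layout as the quoted data.
    sample = """LM=3
1.0 2.0
3.0 4.0
5.0 6.0
IMAGE=a.JPG
ID=species1
SCALE=0.01
LM=3
2.0 2.0
4.0 4.0
6.0 6.0
IMAGE=b.JPG
ID=species2
SCALE=0.02
"""
    for specimen in parse_tps(sample):
        print(specimen["meta"]["ID"], len(specimen["landmarks"]))
```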

Re: [MORPHMET] Statistics software question

2019-01-31 Thread Justin Bagley
Dear Phil,

SAS has excellent support and documentation. Just go to their website at
https://support.sas.com/en/documentation.html, type in a search query for
the statistical test of interest, and you'll get links to the appropriate
section of the SAS/STAT 14.3 User's Guide. Detailed information is given on
statements to call different tests. You should be able to quickly find the
information you need using this procedure (with a statistics text in hand),
and I imagine that similar online documentation resources are available for
the other major programs that you mentioned.

Nevertheless, all in all, I don't recommend that you go with _any_ of the
software programs in your list for statistical analyses of biological data,
unless they are the only software programs that implement the test you
need. Instead, I suggest that you conduct statistical analyses in the R
environment for statistical computing (https://cran.r-project.org) or write
bash or Python wrapper scripts around existing programs to conduct your
analyses. Is there not an R package that will conduct the test you need to
do?

Since we received this through MORPHMET, perhaps you could state the
question you have about the "particular statistical test" in question in a
way that is specific and that relates to morphometrics, and I'm sure that
someone would be able to help you out in more detail.

Good luck.

Best,

Justin C. Bagley, Ph.D.
Postdoctoral Research Associate
Department of Biology
University of Missouri-St. Louis
One University Boulevard, 223 Research Building
St. Louis, MO 63121-4499
E-mail: bagl...@umsl.edu
Website: https://justinbagley.org
CV: https://justinbagley.org/pages/cv.html
Blog: https://justinbagley.rbind.io

Affiliate Researcher
Department of Biology
Virginia Commonwealth University
1000 W Cary St, Rm 126,
Richmond, VA 23284-2012



On Thu, Jan 31, 2019 at 11:04 AM Novack-Gottshall, Philip M. <
pnovack-gottsh...@ben.edu> wrote:

> Hi all,
>
> Apologies for cross-posting, but I'm not sure where this best lands.
>
> I'm trying to find people who have access to (preferably some experience
> with) any of the following statistical software programs:
> -MiniTab
> -SAS
> -SPSS
> -S-Plus
> -STATA
> -SYSTAT
>
> If you do, might you contact me off-list ?
> I'm trying to find out how each program handles a particular statistical
> test. My question can likely be answered with a quick check of the help
> documentation for the software or by running a sample data set I can
> provide, if interested.
>
> Thanks,
> Phil
>
> ~
>   Phil Novack-Gottshall, PhD
>   Professor
>   pnovack-gottsh...@ben.edu
>   Department of Biological Sciences
>   Benedictine University
>   5700 College Road
>   Lisle, IL 60532
>
>   Office: 332 Birck Hall
>   Lab: 316 Birck Hall
>   Phone: 630-829-6514
>   Fax: 630-829-6547
>   https://pnovack-gottshall.wixsite.com/home
>
>   Spring 2019 office hours:  Tues/Thurs 9:30-11:00 AM
>  Wed 10 AM - 12:15 PM
>
>   If you have urgent academic advising questions, please contact
>   Anne Baysinger (Birck 130)
>  ~
>
>
>



Re: [MORPHMET] Statistics software question

2019-01-31 Thread Justin Bagley
Hi Phil,

Congrats on the R package and for helping make others aware of the need for
said KS correction! Nice work.

I'm sure you're already aware of this, but just in case, and for
clarification: Lilliefors' Kolmogorov-Smirnov test is already
implemented in the 'lillie.test' function of the nortest R package. This
test is also already implemented in the R package EnvStats, in its
'gofTest' function (i.e. gofTest(y, ..., test='lillie')). Interested
parties might like to know: does the development of your package predate
these, or does it imply that the test is performed incorrectly in these
existing packages? Take care.
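
For readers unfamiliar with the correction under discussion: when the mean and standard deviation of the reference normal distribution are estimated from the same sample being tested, the standard Kolmogorov-Smirnov p-value is too conservative, and the null distribution of the statistic has to be obtained another way. The Python sketch below approximates it by Monte Carlo simulation using only the standard library. It is an illustration of the idea, not the implementation used in LcKS, nortest, or EnvStats, and the function names and default number of simulations are assumptions.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_fitted_normal(sample):
    """KS distance between a sample and the normal distribution whose
    mean and standard deviation are estimated from that same sample."""
    n = len(sample)
    mu = statistics.mean(sample)
    sigma = statistics.stdev(sample)
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        f = normal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def lilliefors_p_value(sample, n_sim=2000, seed=0):
    """Monte Carlo p-value for the KS statistic with estimated parameters.

    The statistic does not depend on location or scale, so simulating
    standard normal samples of the same size gives its null distribution."""
    rng = random.Random(seed)
    n = len(sample)
    observed = ks_statistic_fitted_normal(sample)
    exceed = 0
    for _ in range(n_sim):
        simulated = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if ks_statistic_fitted_normal(simulated) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    skewed = [rng.expovariate(1.0) for _ in range(100)]  # clearly non-normal
    print(lilliefors_p_value(skewed))
```

In practice one would use an established implementation such as those named above; the sketch is only meant to show what the correction does.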

Best,
Justin

Justin C. Bagley, Ph.D.
Postdoctoral Research Associate
Department of Biology
University of Missouri-St. Louis
One University Boulevard, 223 Research Building
St. Louis, MO 63121-4499
E-mail: bagl...@umsl.edu
Website: https://justinbagley.org
CV: https://justinbagley.org/pages/cv.html
Blog: https://justinbagley.rbind.io

Affiliate Researcher
Department of Biology
Virginia Commonwealth University
1000 W Cary St, Rm 126,
Richmond, VA 23284-2012



On Thu, Jan 31, 2019 at 11:53 AM Novack-Gottshall, Philip M. <
pnovack-gottsh...@ben.edu> wrote:

> Thanks, Justin! I've checked out whichever help docs I can find, but I've
> discovered that a particular correction is not always mentioned, even when
> it is used in the software.
>
> The reason I'm checking is that I'm a co-author on an R package (LcKS)
> that implements the Lilliefors correction for the one-sample
> goodness-of-fit Kolmogorov-Smirnov test and we're writing a manuscript to
> accompany it. Apparently the correction is not widely used or known about
> (outside of the statistical community), and it's a major oversight. (For
> example, ks.test in R [very subtly] cautions the user about the violation
> but does not actually offer a fix, and it's not available in base R or
> 'stats'.) We've discovered many published articles that appear to do the
> test in the incorrect manner. Our package and manuscript, we hope, will
> help improve the situation by calling attention to the bias and offering a
> simple solution.
>
> Best wishes,
> Phil
>