Many thanks for your useful advice, Bob! Unfortunately, I did try to use my University's statistical consulting department, but they were not able to provide advice at this level for either the multivariate or mixed effect models. :( I would be happy to consult with someone else if anyone is offering such a service?
Tania Bird

On 27 February 2017 at 18:14, Bob OHara <boh...@senckenberg.de> wrote:
> Hm, this is a big job. The optimal solution is to see if your university
> offers a statistical consulting service. I don't see any big conceptual
> problems, but getting a good analysis will take a bit of time and
> exploration. I think you can probably 'just' use a GLMM, but getting the
> right GLMM and deciding what a good model is will take time and some poking
> of the data.
>
> Anyway, some answers below, which may (or may not) help.
>
>
> On 02/27/2017 04:27 PM, Tania Bird wrote:
>>
>> Hi all,
>>
>> I am seeking advice on how to analyse my unbalanced, multi-nested
>> multivariate data set. I realise there are many questions in this
>> email and I would be willing to consult with someone privately on this
>> if it is an option.
>>
>> I am using abundance data for insect species (I have the same
>> experimental design for reptiles, and annual plants as well). I use
>> Simpson's diversity as a univariate response and species composition
>> as a multivariate response.
>>
>> Experimental Design:
>> Plots are divided into three habitat types A, B, C based on vegetation.
>> Each habitat has 3 or 4 replicate control plots that are repeat
>> sampled (one sample a year always in spring).
>> In addition B and C have 3 or 4 treatment (vegetation removal) plots.
>> 'S' plots are disturbed (trampling and off-road vehicles) but the
>> disturbance is unquantified and I don't know the pre-disturbance
>> habitat type.
>>
>> The total data set is across a 12-year period, but the sampling was
>> unbalanced for various reasons. I attach a png of the metadata of the
>> plots over time to show the unbalanced sampling.
>> https://www.dropbox.com/s/7vxvo3x9lnywdbm/insects_years.gif?dl=0
>>
>> Each year the sampling across plots was conducted at the same time,
>> and so plots are comparable within a year.
>> In general, A's were sampled every year and are considered the 'target'
>> habitat.
>> B's were sampled in the earlier years and C's later on, and
>> in the last couple of years all three types were sampled together.
>>
>> The treatments on B & C were conducted using different methods and in
>> different years, so in principle I should probably test each
>> separately just against their own control pairs. However the
>> hypothesis for both treatments is that treated plots will be more
>> similar in composition to A plots than the paired control plots (if
>> possible I want to check if they become more or less similar to A over
>> time).
>>
>> So in that regard I thought there might be a way to include all
>> habitat types in one analysis? Perhaps using time as "number of years
>> since treatment" rather than a date? (Although I have no environmental
>> data with which to standardise). S dunes have no "pre-treatment" but
>> the hypothesis is that S plots will be most similar to A compared to
>> all other (treated and control) plot types. I am not sure how to
>> include these plots in a testable model.
>>
>> Questions regarding the design:
>>
>>>> Can I use all the habitat types in one model (preferable!) or can I only
>>>> test B treated against B control etc?
>
> Yes you can. You obviously need a Treatment effect, and you should expect to
> have a Treatment by Habitat interaction.
>
> There may also be some sort of interaction with time (either as Time, or
> Time Since Treatment).
>
>>>> Must I remove data to create blocks of sampled or is 'all data useful'?
>>
>> e.g. A's were the only plots sampled in 2010 - should I remove that
>> year completely?
>> e.g. C1 & C5 were sampled in 2005 while the rest were not until 2011 -
>> should I only include data from 2011 onwards for all C's?
>> e.g. Should I remove A4 completely since it's only sampled in the last
>> few years, or is it still usable?
>
> No, you should be able to use all of the data, you just have to be a bit
> careful about how you model Time.
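
The "number of years since treatment" covariate discussed above could be derived along the following lines. This is only a minimal sketch, written in Python rather than the R used elsewhere in the thread. The plot names, survey years, treatment years and the helper `add_time_covariates` are all hypothetical, and it assumes that treated plots may also have been sampled before their treatment year, which the thread does not confirm.

```python
# Hypothetical plot-year records; plot names, years and treatment years are
# invented for illustration and do not come from the thread's data set.
samples = [
    {"plot": "A1", "habitat": "A", "year": 2005},
    {"plot": "B1", "habitat": "B", "year": 2005},
    {"plot": "B1t", "habitat": "B", "year": 2005},
    {"plot": "B1t", "habitat": "B", "year": 2008},
    {"plot": "C1t", "habitat": "C", "year": 2012},
]

# Year in which each treated plot was cleared (assumed values).
treatment_year = {"B1t": 2006, "C1t": 2011}


def add_time_covariates(records, treated_in, first_year):
    """Return copies of the records with model-ready time covariates.

    - years_elapsed: calendar time counted from the first survey year, usable
      as a continuous covariate even when some years are missing.
    - treated: 1 if the plot had been treated by the sampling year, else 0.
    - years_since_treatment: 0 for controls and pre-treatment samples.
    """
    out = []
    for rec in records:
        t0 = treated_in.get(rec["plot"])
        treated = int(t0 is not None and rec["year"] >= t0)
        out.append({
            **rec,
            "years_elapsed": rec["year"] - first_year,
            "treated": treated,
            "years_since_treatment": rec["year"] - t0 if treated else 0,
        })
    return out


rows = add_time_covariates(samples, treatment_year, first_year=2005)
for row in rows:
    print(row)
```

Keeping calendar time and time since treatment as separate columns lets unsampled years simply be absent rows, so no plot or year has to be dropped to balance the design.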
>>>>
>>>> Can I include S in the analyses in order to compare them with B and C
>>>> treated plots in relation to A plots?
>
> Yes, in principle. It just doesn't have a Habitat:Treatment interaction.
>
>> I have already analysed my first research question
>> Q1) To understand the differences in diversity and composition
>> across control habitats, irrespective of time.
>>
>> The analysis approach I used for this is:
>> i) Mixed effect model: GLMM PQL (Penalised Quasi-Likelihood) using
>> MASS R package.
>> Diversity ~ fixed effect = habitat type + random effect = year,
>> Family = poisson
>
> There are better tools than glmmPQL nowadays. Have a look at the lme4
> package, for example.
>>
>> ii) Pairwise permutational multivariate analysis of variance (PERMANOVA)
>> with R code based on the adonis2 function, to determine if the
>> composition among habitats (visualised in NMDS) were significantly
>> different from each other.
>>
>> iii) RDA with habitat as explanatory and year as covariate to test
>> explained variance.
>>
>> Now I am trying to expand this analysis to include a temporal element
>> to answer Q2 & Q3
>> Q2) to understand the trends in diversity and composition over time in
>> control habitats
>> Q3) to understand the impact of treatment on diversity and composition
>> (over time if possible?)
>>
>> The addition of time into the analyses is a bit difficult for me to
>> work out, due to the multi-nested and unbalanced design of the data; I
>> am not sure what methods to use to include time as a variable for
>> looking at a) diversity and b) composition
>
> Take a look at repeated measures models. There are a few ways this could be
> set up, depending a bit on the data.
>
>> Questions regarding analyses:
>>>
>>> 1> Is there an appropriate mixed effect model I can use to look at
>>> differences in diversity on different control plots and include time as a
>>> factor (rather than as a random effect)?
>
> There are probably several.
> :-) For example you could include Time as a
> continuous covariate, alongside the random effect. You could also just
> include it as a fixed effect, but that could get messy.
>>>
>>> 2> How can I appropriately test if different habitats exhibit different
>>> trends in composition over time (i.e. a multivariate approach). For example,
>>> I might expect that A's will remain relatively stable over time, while C's
>>> will exhibit high turnover (fluctuation) across years, or that B's will
>>> slowly shift composition to be more similar to C. How can I test these
>>> directional hypotheses?
>>
>> I thought to create a Principal Response Curve to see relative
>> differences over time, but as far as I understand, I cannot use a
>> permutation test here due to the unbalanced design. I also thought to
>> take the scores on the first RDA axis as a univariate measure, and
>> then plot this over time, but I'm not sure if it's an appropriate
>> approach or how to then test this statistically.
>>
>> I also thought to try and create some measure of "compositional
>> temporal stability" for each plot and test this using ANOVA (like some
>> sort of "multivariate Coefficient of Variation"). One such measure
>> could be distance of each plot-year from the habitat centroid in
>> ordination space but again, I'm not sure if this is an appropriate
>> approach. Any suggestions for other measures would be welcome.
>
> That's essentially a question about the variance in responses. There are
> doubly hierarchical models that you could try, but you might not want to go
> there.
>>>
>>> 3> Finally can I extend these temporal analyses (of diversity and of
>>> composition) to look at response trends to treatments, given the structure
>>> of my data?
>>
>> I would like to see if I can detect some form of resistance to, or
>> recovery from, the treatment over time ... But if not, can I test the
>> overall treatment effect and use time as a random effect like I did
>> for my first question?
>>
>> Thank you for any suggestions of analyses and/or ways to subset the
>> data that would allow me to answer these questions.
>
> Essentially you need some structure on the time covariate. You could start
> by using time since treatment as a factor, and plot those estimates. Again,
> there should be a bit of playing around with the model, to see what makes
> sense.
>
> Bob
>
> --
> Bob O'Hara
> NOTE NEW ADDRESS!!!
> Institutt for matematiske fag
> NTNU
> 7491 Trondheim
> Norway
>
> Mobile: +49 1515 888 5440
> Journal of Negative Results - EEB: www.jnr-eeb.org
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
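
The "distance of each plot-year from the habitat centroid in ordination space" measure raised in the thread could be computed roughly as follows. This is an illustrative Python sketch, not an analysis from the thread: the ordination scores, the plot labels and the helper names `centroid` and `mean_distance_to_centroid` are invented, and plain Euclidean distance on two axes is an assumption. In R, the `betadisper` function in the vegan package addresses a closely related question about group dispersions.

```python
from collections import defaultdict
from math import dist
from statistics import mean

# Hypothetical ordination scores (axis 1, axis 2) per plot-year; all values
# are invented for illustration.
scores = [
    ("A", "A1", 2005, (0.0, 0.0)),
    ("A", "A1", 2006, (0.0, 2.0)),
    ("A", "A2", 2005, (2.0, 0.0)),
    ("A", "A2", 2006, (2.0, 2.0)),
    ("C", "C1", 2011, (5.0, 5.0)),
    ("C", "C1", 2012, (9.0, 5.0)),
]


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def mean_distance_to_centroid(records):
    """Mean Euclidean distance of each plot's yearly points to its habitat centroid."""
    by_habitat = defaultdict(list)
    for habitat, _plot, _year, xy in records:
        by_habitat[habitat].append(xy)
    centroids = {h: centroid(pts) for h, pts in by_habitat.items()}

    by_plot = defaultdict(list)
    for habitat, plot, _year, xy in records:
        by_plot[plot].append(dist(xy, centroids[habitat]))
    return {plot: mean(d) for plot, d in by_plot.items()}


stability = mean_distance_to_centroid(scores)
print(stability)
```

A larger value means a plot's yearly samples sit further from the habitat centroid. Because the centroid pools all plots of a habitat, this number mixes between-plot and between-year variation; a per-plot centroid would isolate the temporal part, at the cost of needing several sampled years per plot.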