[R] The R Journal, Volume 7, Issue 2

2016-01-06 Thread Bettina Gruen

Dear all,

The latest issue of The R Journal is now available at
http://journal.r-project.org/archive/2015-2/

Many thanks to all contributors.

Regards,
Bettina

--
---
Bettina Grün
Department of Applied Statistics

JOHANNES KEPLER
UNIVERSITY LINZ
Altenbergerstraße 69
Science Park 3, 627
4040 Linz, Austria
P +43 732 2468 6829
F +43 732 2468 6800
bettina.gr...@jku.at
www.jku.at

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Syntax error in using Anova (car package)

2015-11-25 Thread Bettina Gruen

Dear John,

thanks for the hint. This issue should be corrected now.

Best,
Bettina

On 11/25/2015 07:42 PM, Fox, John wrote:

Dear David,

Thanks for the correction.

I copied the link from the R Journal website at 
, so I guess they need to 
fix their .bib file.

I'm cc'ing the R Journal editor.

Best,
  John




-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Wednesday, November 25, 2015 12:52 PM
To: Fox, John
Cc: angelo.arc...@virgilio.it; r-help@r-project.org
Subject: Re: [R] Syntax error in using Anova (car package)



On Nov 25, 2015, at 9:23 AM, Fox, John  wrote:

Dear Angelo,

I'm afraid that this is badly confused. To use Anova() for repeated
measures, the data must be in "wide" format, with one row per subject.
To see how this works, check out the OBrienKaiser example in ?Anova and
?OBrienKaiser, or for more detail, the R Journal paper at
<{http://journal.r-project.org/archive/2013-1/RJournal_2013-1_fox-friendly-weisberg.pdf>.

I got an error with that link, but this link succeeded:

https://journal.r-project.org/archive/2013-1/fox-friendly-weisberg.pdf
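
As a rough illustration of the "wide" layout described above, the following sketch simulates data loosely shaped like the poster's and builds one row per subject. Several things here are assumptions made only for illustration: the simulated values, the Mat level names P, R and W (only C, G and M appear in the posted head()), the simplification to two repetitions per Mat x Sh cell, and the restriction to a single response (Centroid). The Anova() call is only attempted if car is installed.

```r
set.seed(1)

# Simulated long-format data; level names P, R, W are hypothetical
long <- expand.grid(Sh      = c("DS", "SN"),
                    Mat     = c("C", "G", "M", "P", "R", "W"),
                    Rep     = 1:2,
                    Subject = paste0("Subject", 1:19))
long$Centroid <- rnorm(nrow(long), mean = 5000, sd = 1500)

# Wide format: one row per subject, one column per Mat x Sh cell,
# averaging the repetitions within each cell
cell <- paste(long$Mat, long$Sh, sep = "_")
Y <- tapply(long$Centroid, list(long$Subject, cell), mean)

# Intra-subject design: one row per column of Y, in the same order
parts <- do.call(rbind, strsplit(colnames(Y), "_"))
idata <- data.frame(Mat = factor(parts[, 1]), Sh = factor(parts[, 2]))

# No between-subject factors in this design, so the model is intercept-only
mod <- lm(Y ~ 1)

if (requireNamespace("car", quietly = TRUE)) {
  av <- car::Anova(mod, idata = idata, idesign = ~ Mat * Sh)
  print(av)
}
```

The point of the sketch is that idesign refers to the within-subject factors (here Mat and Sh) stored in idata, not to the subject identifier, and that idata has one row per response column rather than one row per observation.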




I hope this helps,
John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/




-----Original Message-----
From: angelo.arc...@virgilio.it [mailto:angelo.arc...@virgilio.it]
Sent: Wednesday, November 25, 2015 11:30 AM
To: r-help@r-project.org
Cc: Fox, John
Subject: Syntax error in using Anova (car package)

Dear list members,
I am getting an error while performing a repeated measures MANOVA using
the Anova function of the "car" package. I want to apply it to the
results of an experiment involving 19 participants, who were subjected
to 36 stimuli; each stimulus was repeated twice, for a total of 72
trials per subject. Participants had to adjust two parameters of sounds,
Centroid and Sound_Level_Peak, for each stimulus. This is the head of my
dataset (dependent variables: Centroid and Sound_Level_Peak; independent
variables: Mat (6 levels) and Sh (2 levels)).


head(scrd)

   Subject Mat Sh Centroid Sound_Level_Peak
1 Subject1   C DS   1960.2          -20.963
2 Subject1   C SN   5317.2          -42.741
3 Subject1   G DS  11256.0          -16.480
4 Subject1   G SN   9560.3          -19.682
5 Subject1   M DS   4414.1          -33.723
6 Subject1   M SN   4946.1          -23.648


Based on my understanding of the online material I found, this is the
procedure I used:

idata <- data.frame(scrd$Subject)
mod.ok <- lm(cbind(Centroid,Sound_Level_Peak) ~  Mat*Sh,data=scrd)
av.ok <- Anova(mod.ok, idata=idata, idesign=~scrd$Subject)


I get the following error

Error in check.imatrix(X.design) :
  Terms in the intra-subject model matrix are not orthogonal.


Can anyone please tell me what is wrong in my formulas?

Thanks in advance

Best regards

Angelo








David Winsemius
Alameda, CA, USA








[R] The R Journal, Volume 7, Issue 1

2015-07-09 Thread Bettina Gruen

Dear all,

The latest issue of The R Journal is now available at
http://journal.r-project.org/archive/2015-1/

Many thanks to all contributors.

Regards,
Bettina

--
---
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria

Tel: +43 732 2468-6829
Fax: +43 732 2468-6800
E-Mail: bettina.gr...@jku.at
www.ifas.jku.at


Re: [R] Flexmix new data classification

2012-02-07 Thread Bettina Gruen

Hi,


I built a flexmix GLM binomial model with 200 observations, and the model
gave me 2 clusters. If the model is named newModel, I get the cluster
index for each row using newModel@clusters. Is there any way to predict
which cluster a new observation (say, observation 201) belongs to using
the fitted model (newModel), i.e., whether observation 201 belongs to
cluster 1 or cluster 2?


You can obtain the predicted cluster memberships from the fitted model 
using the accessor function clusters(), i.e.,


clusters(newModel).

If you want to predict the cluster memberships of new observations, you can 
then use


clusters(newModel, newdata = data_frame_with_new_observations)
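
A minimal sketch of both calls on simulated data, in case it helps. The data, the variable names (x, size, y, new_obs) and the two-group structure are assumptions invented for illustration. As far as I understand, newdata has to contain the response columns as well, because the assignment is based on the component likelihoods of the new observations. The code only runs if flexmix is installed.

```r
if (requireNamespace("flexmix", quietly = TRUE)) {
  library(flexmix)
  set.seed(42)

  # Simulated binomial data from two hypothetical groups
  n   <- 200
  dat <- data.frame(x = runif(n), size = 20)
  grp <- rep(1:2, each = n / 2)
  p   <- plogis(ifelse(grp == 1, -2 + dat$x, 2 - dat$x))
  dat$y <- rbinom(n, size = dat$size, prob = p)

  newModel <- flexmix(cbind(y, size - y) ~ x, data = dat, k = 2,
                      model = FLXMRglm(family = "binomial"))

  # Cluster memberships of the observations used for fitting
  fitted_clusters <- clusters(newModel)
  print(table(fitted_clusters))

  # Cluster memberships of new observations (response included)
  new_obs <- data.frame(x = c(0.3, 0.7), size = 20, y = c(3, 16))
  new_clusters <- clusters(newModel, newdata = new_obs)
  print(new_clusters)
}
```

The component labels (1 or 2) are arbitrary and can differ between runs, so only the grouping itself is meaningful.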

HTH,
Bettina


Thanks

--
View this message in 
context:http://r.789695.n4.nabble.com/Flexmix-new-data-classification-tp4363996p4363996.html
Sent from the R help mailing list archive at Nabble.com.







Re: [R] findFreqTerms vs minDocFreq in Package 'tm'

2011-09-12 Thread Bettina Gruen

On 09/12/2011 04:28 PM, vioravis wrote:

I am using the 'tm' package for text mining and am facing an issue with
finding the frequently occurring terms. From the definitions it appears that
findFreqTerms and minDocFreq are equivalent and that both try to identify the
documents with terms appearing more often than a specified threshold. However,
I am getting drastically different results with the two. The results from both
commands are given below:

findFreqTerms identifies 3140 words that appear more than 5 times, but
minDocFreq identifies only 659 terms. Can someone please explain the reason
for the difference, or whether I have misunderstood their definitions?


From the help page of termFreq:

‘minDocFreq’ An integer value. Words that appear less often
  in ‘doc’ than this number are discarded. Defaults to ‘1’
  (i.e., every token will be used).

The description for findFreqTerms states:

Find frequent terms in a term-document matrix.

So minDocFreq assesses how often a word appears in a document in order to 
decide if it should be included in the frequency vector of words for this 
document.

By contrast, findFreqTerms works on the document-term matrix and determines 
how often the word occurs in the whole matrix. So in fact the whole corpus is 
used to determine the frequency and to decide whether the word should be 
included.

Because one function uses the frequency of words within a single document, 
while the other uses the frequency of words in the document-term matrix, they 
are not equivalent. Your results indicate that 3140 words occur at least 5 
times in the whole corpus, i.e., when summing over all documents. By contrast, 
659 words occur at least 5 times in a single document.
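
The distinction can be reproduced on a toy example without tm. The sketch below uses base R only and is an assumption-laden stand-in for the package's behaviour, not its implementation: the three documents and the threshold of 3 are made up, and the two filters merely mimic "count within one document" versus "count summed over the corpus".

```r
# Three tiny hypothetical documents
docs <- c(d1 = "apple apple apple banana",
          d2 = "apple banana banana cherry",
          d3 = "apple cherry")

tokens <- strsplit(docs, " ", fixed = TRUE)
terms  <- sort(unique(unlist(tokens)))

# Term-document matrix: rows are terms, columns are documents
tdm <- vapply(tokens,
              function(tok) as.integer(table(factor(tok, levels = terms))),
              integer(length(terms)))
rownames(tdm) <- terms

threshold <- 3

# Corpus-wide view (in the spirit of findFreqTerms): sum over documents
corpus_wide <- terms[rowSums(tdm) >= threshold]

# Per-document view (in the spirit of minDocFreq): the count must reach
# the threshold inside at least one single document
per_document <- terms[apply(tdm, 1, max) >= threshold]

print(tdm)
print(corpus_wide)
print(per_document)
```

Here "banana" reaches the threshold only when its counts are summed over all documents, so the corpus-wide list is longer than the per-document one, which is the same pattern as 3140 versus 659.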

HTH,
Bettina




Re: [R] betareg question - keeping the mean fixed?

2011-09-04 Thread Bettina Gruen

On 09/02/2011 07:20 PM, betty_d wrote:

Thanks for your response, that does work; however, it is still not quite what
I want. I would like to tell betareg what the mean is (in my case, 0.5) and
force it to use that value. Is this possible?


AFAIK package betareg currently does not allow you to fix the mean and only 
estimate the precision parameters.
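
One possible workaround outside betareg, sketched here under stated assumptions, is to maximise the beta log-likelihood directly with the mean held at 0.5 and only the precision modelled. This is not a betareg feature. The simulated data, the grouping factor grp, the log link for the precision and the optimiser bounds are all choices made for this illustration; it uses the usual mean/precision parameterisation with shape1 = mu * phi and shape2 = (1 - mu) * phi.

```r
set.seed(1)

# Simulated proportions with mean 0.5 and group-specific precision
n        <- 300
grp      <- factor(rep(c("a", "b"), each = n / 2))
phi_true <- ifelse(grp == "a", 5, 50)
y        <- rbeta(n, shape1 = 0.5 * phi_true, shape2 = 0.5 * phi_true)

# Design matrix for the precision model (intercept + group effect)
Z <- model.matrix(~ grp)

# Negative log-likelihood with the mean fixed at mu, log link for phi
negloglik <- function(gamma, mu = 0.5) {
  phi <- exp(drop(Z %*% gamma))
  -sum(dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi, log = TRUE))
}

fit <- optim(c(0, 0), negloglik, method = "L-BFGS-B",
             lower = c(-5, -5), upper = c(10, 10))

# Estimated precision per group on the original scale
phi_hat <- c(a = exp(fit$par[1]), b = exp(sum(fit$par)))
print(phi_hat)
```

Standard errors could be obtained by adding hessian = TRUE to the optim() call, but that is left out to keep the sketch short.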


Best,
Bettina

--
---
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria

Tel: +43 732 2468-5889
Fax: +43 732 2468-9846
E-Mail: bettina.gr...@jku.at
www.ifas.jku.at



Re: [R] betareg question - keeping the mean fixed?

2011-09-01 Thread Bettina Gruen

Hi,

I have a dataset with proportions that vary around a fixed mean, is it
possible to use betareg to look at variance in the dispersion parameter
while keeping the mean fixed?

I am very new to R but have tried the following:

svec <- c(qlogis(mean(data1$scaled)),0,0,0)
f <- betareg(scaled~-1 | expt_label + grouped_hpi, data=data1, link.phi=log,
control=betareg.control(start=svec))

I understood that y~-1 could be used to give a fixed mean of 0.5 however I
get the following error:
Error in linkinv(x %*% beta + offset) :
Argument eta must be a nonempty numeric vector


If you want to have a fixed mean, i.e., only fit an intercept, you need to 
specify it using


y ~ 1 | expt_label + grouped_hpi.

Including -1 on the right-hand side of the formula only makes sense if you 
have other covariates included and explicitly want to exclude the intercept.
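
The effect of -1 can be seen from the design matrix alone, using base R. The four values below are made up; the point is only that removing the intercept from a formula with no covariates leaves a matrix with zero columns, which is presumably why the linear predictor ended up empty in the error above.

```r
# Hypothetical response values, for illustration only
df <- data.frame(scaled = c(0.42, 0.55, 0.48, 0.61))

# Intercept-only mean model: one column of ones
X_intercept <- model.matrix(scaled ~ 1, data = df)

# Intercept removed and no covariates: no columns at all
X_none <- model.matrix(scaled ~ -1, data = df)

print(dim(X_intercept))
print(dim(X_none))
```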


HTH,
Bettina



Re: [R] Problem on flexmix when trying to apply signature developed in one model to a new sample

2011-03-03 Thread Bettina Gruen

Jon,

if I understood you correctly, the problem is that you did not 
specify the newdata argument of posterior() correctly. You need to 
specify it in such a way that evaluating the formula uses the correct 
object. If you have a matrix as the dependent variable, you have to use a 
list containing an object that has the name of the dependent variable 
and holds the data you want to use for determining the a-posteriori 
probabilities. The same holds for clusters().


Have a look at the following code:

library(flexmix)
library(mvtnorm)
set.seed(123)
## Two bivariate normal groups with 100 observations each
BM <- rbind(rmvnorm(100, rep(0, 2)),
            rmvnorm(100, rep(5, 2)))
ex2 <- flexmix(BM ~ 1, k = 2, model = FLXMCmvnorm(diagonal = FALSE))
print(ex2)
plotEll(ex2, BM)

## New sample: the list element must be named like the dependent variable
Data2 <- data.frame(var1 = BM[c(1:5, 101:105), 1],
                    var2 = BM[c(1:5, 101:105), 2])
BM2 <- list(BM = cbind(Data2$var1, Data2$var2))
ProbMCI <- posterior(ex2, BM2)

HTH,
Bettina

On 03/01/2011 05:34 PM, Jon Toledo wrote:


Dear R Users, R Core Team,

I have a problem when trying to know the classification of the tested
cases using two variables with the function of flexmix.

After importing the database and creating a matrix:

BM <- cbind(Data$var1,Data$var2)

I see that the best model has 2 groups and use:

ex2 <- flexmix(BM~1, k=2, model=FLXMCmvnorm(diagonal=FALSE))
print(ex2)
plotEll(ex2, BM)

Then I want to test to which group one subset of patients belongs, so I
import a smaller sample of the previous data:

BM2 <- data.frame(Data2$var1,Data2$var2)

However, when I test, the results I get are from the complete training
sample I used in ex2 and not from the new sample BM2.

ProbMCI <- posterior(ex2, BM2)

And if I do the following I get double the number of entered cases (I
think because I entered 2 variables):

BM2 <- cbind(Data2$var1,Data2$var2)
p <- posterior(ex2)[BMMCI,]
max.col(p)

(The same with clusters(ex2)[BM2])

In the future I would like to test the result of this mixture also in
new samples.

Thank you in advance


Re: [R] model based clustering with flexmix

2009-11-09 Thread Bettina Gruen

In your model driver truncatedmodel() the fit function looks like:

z...@fit <- function(x, y, w) {
  para <- list(mean = mean(x), sd = sd(x), lower = lower, upper = upper)
  para$df <- 4
  with(para, eval(z...@definecomponent))
}

w are the a-posteriori probabilities and denote the weights with which 
observations are currently assigned to each component. These weights 
are not used in your fit function and hence, the parameters of each 
component are estimated identically using the whole sample. Please 
modify the fit function to take the weights into account for example by


z...@fit <- function(x, y, w) {
  ## weighted mean and covariance, using the a-posteriori weights w
  para <- cov.wt(y, wt = w)[c("center", "cov")]
  para$df <- (3 * ncol(y) + ncol(y)^2)/2
  if (diagonal) {
    para$cov <- diag(diag(para$cov))
    para$df <- 2 * ncol(y)
  }
  with(para, eval(z...@definecomponent))
}
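
To see why the weights matter, the following base R sketch compares unweighted and weighted estimates on simulated data. The weights w1 and w2 are hypothetical stand-ins for the a-posteriori probabilities that the E-step would supply, and the example ignores truncation entirely, so it only illustrates the weighting, not a complete M-step for the truncated model.

```r
set.seed(7)

# Two well-separated univariate groups, stored as a one-column matrix
y <- matrix(c(rnorm(100, mean = -2), rnorm(100, mean = 2)), ncol = 1)

# Hypothetical a-posteriori weights for components 1 and 2
w1 <- plogis(-2 * y[, 1])
w2 <- 1 - w1

# Ignoring the weights gives the same estimates for every component
unweighted <- c(center = mean(y), sd = sd(y))

# Weighted ML estimates differ between components
comp1 <- cov.wt(y, wt = w1, method = "ML")
comp2 <- cov.wt(y, wt = w2, method = "ML")
est <- rbind(comp1 = c(center = comp1$center, sd = sqrt(comp1$cov[1, 1])),
             comp2 = c(center = comp2$center, sd = sqrt(comp2$cov[1, 1])))

print(unweighted)
print(est)
```

With identical parameters in every component the a-posteriori probabilities cannot separate the data, which is consistent with all observations ending up in one component.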

Please also note that your fit function is not appropriate because you
also have to take the truncation into account in the M-step. See for
example for grouped and truncated data:

G. McLachlan and P. Jones (1988) Fitting Mixture Models to Grouped and
Truncated Data via the EM Algorithm. Biometrics, 44(2): 571-578.

Best,
Bettina


Giovanni Luca Ciampaglia wrote:

Hello all,
I am trying to fit a truncated mixture model and I wrote a driver for 
flexmix following the example in the vignette, but it doesn't work for 
me: it assigns all data points to one component only, e.g.:

 source('bugged.R')

Call:
flexmix(formula = x ~ 1, k = 2, model = truncatedmodel(lower = -4,
upper = 4))

       prior size post>0 ratio
Comp.1 0.494    0   1000     0
Comp.2 0.506 1000   1000     1

'log Lik.' -707703.3 (df=9)
AIC: 1415425   BIC: 1415469


What am I doing wrong? Please find my code attached.

cheers






[R] [R-pkgs] New package: exams - Automatic Generation of Standardized Exams

2009-02-23 Thread Bettina Gruen

Dear useRs,

the new R package exams provides Sweave-based automatic generation of
exams with multiple-choice questions and arithmetic problems. The
package is available from CRAN:

http://CRAN.R-project.org/package=exams

It includes a vignette giving an overview of the main design aims and
principles as well as strategies for adaptation and extension.
Hands-on illustrations - based on example exercises and control files
provided in the package - are presented to get new users started easily.

Best,
Bettina

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Call for abstracts: Innovative Tools in Data Analysis (ERCIM08)

2008-02-27 Thread Bettina Gruen

Dear useRs,

we are organizing the following session

Topic: Innovative Tools in Data Analysis
Organizers: Achim Zeileis and Bettina Gruen

at the

First Workshop of the ERCIM Working Group on Computing & Statistics
June 19-21, 2008 Neuchatel, Switzerland
URL: http://www.dcs.bbk.ac.uk/ercim08

To improve the quality of statistical data analysis the provision of
innovative tools which make new techniques readily available is
extremely important. In the session 'Innovative Tools in Data
Analysis' we are looking for presentations of tools which support any
area of data analysis and address techniques ranging from classical
methods and their extensions to machine learning.

Please consider giving a presentation on a flexible tool you have
implemented in R in our session. Submit your abstract via the web page
indicating our session name in the text field and let us also know
informally.

Deadline for early registration: March 3, 2008
Deadline for submission of abstracts: April 30, 2008

Kind regards,
Bettina and Achim

-- 
---
Bettina Grün
Department für Statistik und Mathematik
Wirtschaftsuniversität Wien
Augasse 2-6
A-1090 Wien, Österreich
Tel: (+43 1) 31336 5032
Fax: (+43 1) 31336 734
---




