Re: [R] Sensitivity and Specificity
MKclass::perfMeasures(predict_testing, truth = testing$case, namePos = 1) should also work; it computes 80 performance measures.
Best, Matthias

On 25.10.22 at 06:42, Jin Li wrote:
Hi Greg,
This can be done by:

    spm::pred.acc(testing$case, predict_testing)

It will return both sensitivity and specificity, along with a few other commonly used measures.
Hope this helps,
Jin

On Tue, Oct 25, 2022 at 6:01 AM Rui Barradas wrote:
At 16:50 on 24/10/2022, greg holly wrote:
Hi Michael,
I appreciate your writing. Here is what I have:

    predict_testing <- ifelse(predict > 0.5, 1, 0)
    head(predict)
    #          1          2          3          5          7          8
    # 0.29006984 0.28370507 0.10761993 0.02204224 0.12873872 0.08127920

    # Sensitivity and Specificity
    sensitivity <- (predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100
    # Error in predict_testing[2, 2] : incorrect number of dimensions
    sensitivity
    # function (data, ...) { UseMethod("sensitivity") }
    specificity <- (predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100
    # Error in predict_testing[1, 1] : incorrect number of dimensions
    specificity
    # function (data, ...) { UseMethod("specificity") }

On Mon, Oct 24, 2022 at 10:45 AM Michael Dewey wrote:
Rather hard to know without seeing what output you expected and what error message you got, if any, but did you mean to summarise your variable predict before doing anything with it?
Michael

On 24/10/2022 16:17, greg holly wrote:
Hi all R-Help,
After partitioning my data into testing and training sets (please see below), I need to estimate the sensitivity and specificity. I failed. I would appreciate your help.
Best regards, Greg

    inTrain <- createDataPartition(y = data$case, p = 0.7, list = FALSE)
    training <- data[inTrain, ]
    testing <- data[-inTrain, ]
    attach(training)

    # model training and prediction
    data_training <- glm(case ~ age + BMI + Calcium + Albumin + meno_1,
                         data = training, family = binomial(link = "logit"))
    predict <- predict(data_training, newdata = testing, type = "response")
    predict_testing <- ifelse(predict > 0.5, 1, 0)

    # Sensitivity and Specificity
    sensitivity <- (predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100
    sensitivity
    specificity <- (predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100
    specificity

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
--
Michael
http://www.dewey.myzen.co.uk/home.html

Hello,
Instead of computing by hand, why not use package caret?

    tbl <- table(predict_testing, testing$case)
    caret::sensitivity(tbl)
    caret::specificity(tbl)

Hope this helps,
Rui Barradas
--
Prof. Dr.
Matthias Kohl
www.stamats.de
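A minimal self-contained sketch of the table-based fix suggested in this thread: predict_testing is a 0/1 vector, not a matrix, so it must be cross-tabulated against the truth before `[i, j]` indexing makes sense. The data below are made up purely for illustration.

```r
# Toy data standing in for testing$case and the thresholded predictions.
truth           <- c(1, 0, 1, 1, 0, 0, 1, 0)
predict_testing <- c(1, 0, 0, 1, 0, 1, 1, 0)

tbl <- table(predict_testing, truth)   # rows: predicted, cols: truth
sensitivity <- tbl["1", "1"] / sum(tbl[, "1"]) * 100  # TP / (TP + FN)
specificity <- tbl["0", "0"] / sum(tbl[, "0"]) * 100  # TN / (TN + FP)
sensitivity  # 75: 3 of the 4 true cases were predicted positive
specificity  # 75: 3 of the 4 true non-cases were predicted negative
```

Indexing the table by the "0"/"1" dimnames rather than by position avoids surprises when one of the classes happens to be absent from the predictions.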
Re: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?
Response to 1: You need the log version, e.g., in maximum likelihood; otherwise the product of the densities and probabilities can become very small.

Original message from r-help-requ...@r-project.org, 04.08.21 12:01 (GMT+01:00), subject: R-help Digest, Vol 222, Issue 4

Send R-help mailing list submissions to r-help@r-project.org. To subscribe or unsubscribe via the World Wide Web, visit https://stat.ethz.ch/mailman/listinfo/r-help or, via email, send a message with subject or body 'help' to r-help-request@r-project.org. You can reach the person managing the list at r-help-owner@r-project.org. When replying, please edit your Subject line so it is more specific than "Re: Contents of R-help digest...".

Today's Topics:
1. What are the pros and cons of the log.p parameter in (p|q)norm and similar? (Michael Dewey)
2. Help with package EasyPubmed (bharat rawlley)
3. Re: Help with package EasyPubmed (bharat rawlley)
4. Re: What are the pros and cons of the log.p parameter in (p|q)norm and similar? (Duncan Murdoch)
5. Re: What are the pros and cons of the log.p parameter in (p|q)norm and similar? (Bill Dunlap)
6. Creating a log-transformed histogram of multiclass data (Tom Woolman)
7. Re: Creating a log-transformed histogram of multiclass data (Tom Woolman)

--
Message: 1
Date: Tue, 3 Aug 2021 17:20:12 +0100
From: Michael Dewey
To: "r-help@r-project.org"
Subject: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?

Short version

Apart from the ability to work with values of p too small to be of much practical use, what are the advantages and disadvantages of setting this to TRUE?

Longer version

I am contemplating upgrading various functions in one of my packages to use this, and as far as I can see it would only have the advantage of allowing people to use very small p-values; but before I go ahead, have I missed anything?
I am most concerned with negatives, but if there is any other advantage I would mention that in the vignette. I am not concerned about speed or the extra effort in coding and expanding the documentation.
--
Michael
http://www.dewey.myzen.co.uk/home.html

--
Message: 2
Date: Tue, 3 Aug 2021 18:20:52 +0000 (UTC)
From: bharat rawlley
To: R-help Mailing List
Subject: [R] Help with package EasyPubmed
Message-ID: <1046636584.2205366.1628014852...@mail.yahoo.com>

Hello,
When I try to run the following code using the package EasyPubmed, I get a null result:

    > batch_pubmed_download(query_7)
    NULL
    # query_7 <- "Cardiology AND randomizedcontrolledtrial[Filter] AND 2011[PDAT]"

However, the exact same search string yields 668 results on PubMed. I am unable to figure out why this is happening. If I use the search string "Cardiology AND 2011[PDAT]" then it works just fine. Any help would be greatly appreciated.
Thank you!

--
Message: 3
Date: Tue, 3 Aug 2021 18:26:40 +0000 (UTC)
From: bharat rawlley
To: R-help Mailing List
Subject: Re: [R] Help with package EasyPubmed
Message-ID: <712126143.2207911.1628015200...@mail.yahoo.com>

Okay, the following search string resolved my issue:

    "Cardiology AND randomized controlled trial[Publication type] AND 2011[PDAT]"

Thank you!
--
Message: 4
Date: Tue, 3 Aug 2021 14:53:28 -0400
From: Duncan Murdoch
To: Michael Dewey, "r-help@r-project.org"
Subject: Re: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?

On 03/08/2021 12:20 p.m., Michael Dewey wrote:
> Short version
>
> Apart from the ability to work with values of p too small to be of much
> practical use what
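To make the "values of p too small to be of much practical use" point concrete, a small sketch of what log.p = TRUE buys numerically:

```r
# Far in the tail, the probability itself underflows double precision,
# but its logarithm remains perfectly representable:
pnorm(-40)                   # underflows to 0
pnorm(-40, log.p = TRUE)     # roughly -804.6, still usable downstream

# The q-functions can invert such log-scale values directly:
qnorm(pnorm(-40, log.p = TRUE), log.p = TRUE)  # recovers -40
```

This is exactly the maximum-likelihood situation mentioned at the top of the reply: sums of log probabilities stay finite where products of probabilities would vanish.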
Re: [R] density with weights missing values
Thanks, Martin and the others. I will do so accordingly. I guess the 0.1% of the population who use density() with weights will write code like this:

    x <- c(1, 2, 3, NA)
    weights <- c(1, 1, 1, 1)
    density(x[!is.na(x)], weights = weights[!is.na(x)])

These people won't be affected. For the 0.01% of people with code like this,

    density(x, weights = weights[!is.na(x)], na.rm = TRUE)

the corrected version would almost surely raise an error. Note that the error message can, in principle, check whether length(x[!is.na(x)]) == length(the provided weights) and tell the programmer that this was the old behavior.

Best wishes,
Matthias

PS. Sorry for the HTML email. I've given up trying to fix such behavior.

From: Martin Maechler
Sent: Tuesday, 13 July 2021 09:09
To: Matthias Gondan
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

>>>>> Matthias Gondan
>>>>> on Mon, 12 Jul 2021 15:09:38 +0200 writes:

> Weighted mean behaves differently:
> • weight is excluded for missing x
> • no warning for sum(weights) != 1
>
> > weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
> [1] 2.5
> > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
> [1] NA
> > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
> [1] 2

I'm sure the 'weights' argument in weighted.mean() has been used much more often than the one in density(). Hence, it's quite "probable statistically" :-) that the weighted.mean() behavior in the NA case is more rational and thought through. So I agree with you, Matthias, that ideally density() should behave differently here, probably entirely analogously to weighted.mean(). Still, Bert and others are right that there is no bug formally, but rather something that possibly should be changed; even though that breaks back compatibility, such cases may be very rare (I'm not sure I've ever used weights in density(), but I know I've used density() very much in all those 25 years ...).
https://www.r-project.org/bugs.html contains good information about determining whether something may be a bug in R *and* tells you how to apply for an account on R's bugzilla for reporting it formally. I'm hereby encouraging you, Matthias, to do that, and then in your report mention both density() and weighted.mean(), i.e., a cleaned-up version of the union of your first two e-mails. Thank you for thinking about this and concisely reporting it.

Martin

> From: Richard O'Keefe
> Sent: Monday, 12 July 2021 13:18
> To: Matthias Gondan
> Subject: Re: [R] density with weights missing values
>
> Does your copy of R say that the weights must add up to 1?
> ?density doesn't say that in mine. But it does check.

Another small part that could be improved, indeed; thank you, Richard.

--
Martin Maechler
ETH Zurich and R Core team

> On Mon, 12 Jul 2021 at 22:42, Matthias Gondan wrote:
>>
>> Dear R users,
>>
>> This works as expected:
>>
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE))
>>
>> This raises an error:
>>
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
[..]
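The workaround the thread converges on can be sketched in a few lines (the renormalization is added here because density() warns when the weights don't sum to 1):

```r
# Filter x and the weights together, then renormalize the kept weights.
x <- c(1, 2, 3, 4, 5, NA)
w <- rep(1/6, 6)
ok <- !is.na(x)
d <- density(x[ok], weights = w[ok] / sum(w[ok]))

# By contrast, density(x, weights = w, na.rm = TRUE) fails, because
# na.rm shortens x but leaves the weights vector at its full length.
```

Keeping the subsetting of x and w in one place (the `ok` index) is what weighted.mean() effectively does internally, and is the behavior being requested for density().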
Re: [R] density with weights missing values
You're right, of course. Extrapolating your argument a bit, the whole practice of na.rm is questionable, since there's always a reason for missingness (one that is not in x and rarely elsewhere in the data).
Best wishes,
Matthias

Original message from Jeff Newmiller, 12.07.21 18:44 (GMT+01:00):

Sure, you might think that. But most likely the reason this code has not been corrected is that when you give weights for missing data, the most correct result is for your entire density to be invalid. Fix your inputs so they make sense to you and there is no problem. But absent your intellectual input to restructure your problem, the weights no longer make sense once density() removes the NAs from the data.

On July 12, 2021 9:13:12 AM PDT, matthias-gondan wrote:
> The thing is that for na.rm=TRUE, I would expect the weights corresponding
> to the missing x to be removed as well, like in weighted.mean. So this one
> shouldn't raise an error:
>
>     density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1))
>
> Or am I missing something?
>
> Original message from Bert Gunter, 12.07.21 16:25 (GMT+01:00):
> The behavior is as documented AFAICS.
>
>     na.rm: logical; if TRUE, missing values are removed from x.
>            If FALSE any missing values cause an error. The default is FALSE.
>     weights: numeric vector of non-negative observation weights.
>
> NA is not a non-negative numeric.
>
> Bert Gunter
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it." -- Opus (aka Berkeley Breathed in his
> "Bloom County" comic strip)
>
> On Mon, Jul 12, 2021 at 6:10 AM Matthias Gondan wrote:
>> Weighted mean behaves differently:
>> • weight is excluded for missing x
>> • no warning for sum(weights) != 1
[..]
Re: [R] density with weights missing values
The thing is that for na.rm=TRUE, I would expect the weights corresponding to the missing x to be removed as well, like in weighted.mean. So this one shouldn't raise an error:

    density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1))

Or am I missing something?

Original message from Bert Gunter, 12.07.21 16:25 (GMT+01:00):

The behavior is as documented AFAICS.

    na.rm: logical; if TRUE, missing values are removed from x.
           If FALSE any missing values cause an error. The default is FALSE.
    weights: numeric vector of non-negative observation weights.

NA is not a non-negative numeric.

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Jul 12, 2021 at 6:10 AM Matthias Gondan wrote:
> Weighted mean behaves differently:
> • weight is excluded for missing x
> • no warning for sum(weights) != 1
>
> > weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
> [1] 2.5
> > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
> [1] NA
> > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
> [1] 2
>
> From: Richard O'Keefe
> Sent: Monday, 12 July 2021 13:18
> To: Matthias Gondan
> Subject: Re: [R] density with weights missing values
>
> Does your copy of R say that the weights must add up to 1?
> ?density doesn't say that in mine. But it does check.
>
> On Mon, 12 Jul 2021 at 22:42, Matthias Gondan wrote:
>>
>> Dear R users,
>>
>> This works as expected:
>>
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE))
>>
>> This raises an error:
>>
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
>>
>> This seems to work (it triggers a warning that the weights don't add up to 1, which makes sense*):
>>
>> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))
>>
>> Questions
>>
>> • But shouldn't the na.rm filter also filter the corresponding weights?
>> • Extra question: If the na.rm filter is changed to also filter the weights, the check for sum(weights) == 1 might trigger false-positive warnings, since the weights might not add up to 1 anymore.
>>
>> Best wishes,
>>
>> Matthias
Re: [R] density with weights missing values
Weighted mean behaves differently:
• weight is excluded for missing x
• no warning for sum(weights) != 1

    > weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
    [1] 2.5
    > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
    [1] NA
    > weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
    [1] 2

From: Richard O'Keefe
Sent: Monday, 12 July 2021 13:18
To: Matthias Gondan
Subject: Re: [R] density with weights missing values

Does your copy of R say that the weights must add up to 1? ?density doesn't say that in mine. But it does check.

On Mon, 12 Jul 2021 at 22:42, Matthias Gondan wrote:
>
> Dear R users,
>
> This works as expected:
>
> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE))
>
> This raises an error:
>
> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
>
> This seems to work (it triggers a warning that the weights don't add up to 1, which makes sense*):
>
> • plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))
>
> Questions
>
> • But shouldn't the na.rm filter also filter the corresponding weights?
> • Extra question: If the na.rm filter is changed to also filter the weights, the check for sum(weights) == 1 might trigger false-positive warnings, since the weights might not add up to 1 anymore.
>
> Best wishes,
>
> Matthias
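One detail worth flagging in the example above: weighted.mean()'s weight argument is named w, not weights, so weights= is absorbed by ... and silently ignored. The calls above therefore return plain (unweighted) means, which only coincide with the weighted ones because all the weights are equal:

```r
# 'weights=' is swallowed by '...' and ignored; the real argument is 'w'.
weighted.mean(c(1, 2, 3, 4), weights = c(1, 1, 1, 1))  # 2.5 (plain mean)
weighted.mean(c(1, 2, 3, 4), w = c(0, 0, 0, 1))        # 4   (actually weighted)
```

The NA-handling comparison in the thread still stands; only the argument name differs.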
[R] density with weights missing values
Dear R users,

This works as expected:

    plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE))

This raises an error:

    plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
    plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))

This seems to work (it triggers a warning that the weights don't add up to 1, which makes sense*):

    plot(density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))

Questions:

• But shouldn't the na.rm filter also filter the corresponding weights?
• Extra question: If the na.rm filter is changed to also filter the weights, the check for sum(weights) == 1 might trigger false-positive warnings, since the weights might not add up to 1 anymore.

Best wishes,

Matthias
[R] Embedded R: Test if initialized
Dear R friends,

I am currently trying to write a piece of C code that uses "embedded R", and for specific reasons*, I cannot keep track of whether R has already been initialized. So the code snippet looks like this:

    LibExtern char *R_TempDir;

    if (R_TempDir == NULL)
        ... throw exception "R not initialized" ...

I have seen that the source code of Rf_initialize_R itself checks whether it is invoked twice (num_initialized), but this flag does not seem to be accessible, or is it?

    int Rf_initialize_R(int ac, char **av)
    {
        int i, ioff = 1, j;
        Rboolean useX11 = TRUE, useTk = FALSE;
        char *p, msg[1024], cmdlines[1], **avv;
        structRstart rstart;
        Rstart Rp = &rstart;
        Rboolean force_interactive = FALSE;

        if (num_initialized++) {
            fprintf(stderr, "%s", "R is already initialized\n");
            exit(1);
        }
        ...

Is the test of the TempDir a good substitute, or should I choose another solution? Having said this, it may be a good idea to expose a function Rf_R_initialized that performs such a test. Thank you for your consideration.

Best regards,
Matthias

* The use case is an R library that connects to SWI-Prolog and allows the "embedded" SWI-Prolog to establish the reverse connection to R. In that case, i.e., R -> Prolog -> R, I do not want to initialize R a second time.
Re: [R] confidence intervals for the difference between group means
You could try:

    library(MKinfer)
    meanDiffCI(a, b, boot = TRUE)

Best, Matthias

On 04.08.20 at 16:08, varin sacha via R-help wrote:
Dear R experts,

Using the bootES package I can easily calculate the bootstrap confidence intervals of the means, as in the toy example below. Now I am looking for the confidence interval for the difference between group means. In my case, the point estimate of the mean difference is 64.4. I am looking for the 95% confidence interval around this point estimate (64.4). Many thanks for your response.

    library(bootES)
    a <- c(523, 435, 478, 567, 654)
    b <- c(423, 523, 421, 467, 501)
    bootES(a)
    bootES(b)

--
Prof. Dr. Matthias Kohl
www.stamats.de
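For comparison, the same interval can be hand-rolled with a percentile bootstrap in base R; this is only a sketch of the idea, with MKinfer::meanDiffCI() being the more convenient route:

```r
set.seed(1)  # for reproducibility
a <- c(523, 435, 478, 567, 654)
b <- c(423, 523, 421, 467, 501)

mean(a) - mean(b)  # point estimate: 64.4, as stated in the question

# Resample each group independently and collect the mean differences.
diffs <- replicate(10000,
    mean(sample(a, replace = TRUE)) - mean(sample(b, replace = TRUE)))
quantile(diffs, c(0.025, 0.975))  # 95% percentile bootstrap CI
```

With only five observations per group, the bootstrap distribution is coarse; the interval should be read as indicative rather than precise.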
Re: [R] ncol() vs. length() on data.frames
Should have added: dim(x)[2L] in turn ends up calling length(x).

On 31.03.20 at 16:21, Prof. Dr. Matthias Kohl wrote:
Dear Ivan,

if I enter ncol in the console, I get

    function (x) dim(x)[2L]

indicating that function dim is called. Function dim has a method for data.frame; see methods("dim"). The dim method for data.frame is

    dim.data.frame
    function (x) c(.row_names_info(x, 2L), length(x))

Hence, it calls length on the provided data.frame. In addition, some "magic" with .row_names_info is performed, where

    base:::.row_names_info
    function (x, type = 1L) .Internal(shortRowNames(x, type))

Best, Matthias

On 31.03.20 at 16:10, Ivan Calandra wrote:
Thanks, Ivan, for the answer. So it confirms my first thought that these two functions are equivalent when applied to a "simple" data.frame.

The reason I was asking is that I have gotten used to using length() in my scripts. It works perfectly and I understand it easily. But to be honest, ncol() is more intuitive to most users (especially novices), so I was thinking about switching to that function instead (all my data.frames are created from read.csv() or similar functions, so there should not be any issue). But before doing that, I want to be sure that it is not going to create unexpected results.

Thank you,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and Museum for Human Behavioural Evolution
Schloss Monrepos, 56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 31/03/2020 16:00, Ivan Krylov wrote:
On Tue, 31 Mar 2020 14:47:54 +0200 Ivan Calandra wrote:
> On a simple data.frame (i.e. each element is a vector), ncol() and
> length() will give the same result. Are they just equivalent on such
> objects, or are there differences in some cases?

I am not aware of any exceptions to ncol(dataframe) == length(dataframe) (in fact, ncol(x) is dim(x)[2L], and ?dim says that dim(dataframe) returns c(length(attr(dataframe, 'row.names')), length(dataframe))), but watch out for AsIs columns, which can have columns of their own:

    x <- data.frame(I(volcano))
    dim(x)      # [1] 87  1
    length(x)   # [1] 1
    dim(x[,1])  # [1] 87 61

--
Prof. Dr. Matthias Kohl
www.stamats.de
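The equivalence, and the AsIs caveat from the reply above, can both be checked directly:

```r
# For a "simple" data.frame, ncol() and length() agree:
df <- data.frame(a = 1:3, b = letters[1:3])
stopifnot(ncol(df) == length(df))   # both are 2

# An AsIs column can itself be a matrix: the frame still has one
# column, but that single column has 61 columns of its own.
x <- data.frame(I(volcano))
stopifnot(ncol(x) == 1, length(x) == 1, ncol(x[, 1]) == 61)
```

So the two functions stay interchangeable at the data.frame level; the surprise only appears once you descend into a matrix-valued column.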
Re: [R] ncol() vs. length() on data.frames
Dear Ivan,

if I enter ncol in the console, I get

    function (x) dim(x)[2L]

indicating that function dim is called. Function dim has a method for data.frame; see methods("dim"). The dim method for data.frame is

    dim.data.frame
    function (x) c(.row_names_info(x, 2L), length(x))

Hence, it calls length on the provided data.frame. In addition, some "magic" with .row_names_info is performed, where

    base:::.row_names_info
    function (x, type = 1L) .Internal(shortRowNames(x, type))

Best, Matthias

On 31.03.20 at 16:10, Ivan Calandra wrote:
Thanks, Ivan, for the answer. So it confirms my first thought that these two functions are equivalent when applied to a "simple" data.frame.

The reason I was asking is that I have gotten used to using length() in my scripts. It works perfectly and I understand it easily. But to be honest, ncol() is more intuitive to most users (especially novices), so I was thinking about switching to that function instead (all my data.frames are created from read.csv() or similar functions, so there should not be any issue). But before doing that, I want to be sure that it is not going to create unexpected results.

Thank you,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and Museum for Human Behavioural Evolution
Schloss Monrepos, 56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 31/03/2020 16:00, Ivan Krylov wrote:
On Tue, 31 Mar 2020 14:47:54 +0200 Ivan Calandra wrote:
> On a simple data.frame (i.e. each element is a vector), ncol() and
> length() will give the same result. Are they just equivalent on such
> objects, or are there differences in some cases?

I am not aware of any exceptions to ncol(dataframe) == length(dataframe) (in fact, ncol(x) is dim(x)[2L], and ?dim says that dim(dataframe) returns c(length(attr(dataframe, 'row.names')), length(dataframe))), but watch out for AsIs columns, which can have columns of their own:

    x <- data.frame(I(volcano))
    dim(x)      # [1] 87  1
    length(x)   # [1] 1
    dim(x[,1])  # [1] 87 61

--
Prof. Dr. Matthias Kohl
www.stamats.de
Re: [R] How to run Multiple ANOVA and Multiple T-test between the same groups?
Have a look at the Bioconductor package genefilter, especially the functions colttests and colFtests.

Best, Matthias

On 10.02.19 at 10:35, AbouEl-Makarim Aboueissa wrote:
Dear All: good morning.

Re: How to run multiple ANOVA and multiple t-tests between the same groups. Your help will be highly appreciated.

1. Is there a way to run multiple t-tests on different variables between the same two groups?

Data for t-tests: The data frame "dataTtest" has 5 variables (x1, x2, x3, x4, x5) and one factor (factor1) with 2 levels (group1, group2).

    x1 <- rnorm(20, 1, 1)
    x2 <- rnorm(20, 2, 1)
    x3 <- rnorm(20, 3, 1)
    x4 <- rnorm(20, 4, 1)
    x5 <- rnorm(20, 5, 1)
    factor1 <- rep(c("group1", "group2"), each = 10)
    dataTtest <- data.frame(x1, x2, x3, x4, x5, factor1)
    dataTtest

2. Is there a way to run multiple ANOVA and multiple-comparison Tukey tests on different variables between the same groups?

Data for ANOVA tests: The data frame "dataANOVA" has 6 variables (x1, x2, x3, x4, x5, x6) and one factor (factor2) with 5 levels (group1, group2, group3, group4, group5).

    x1 <- rnorm(40, 1, 1)
    x2 <- rnorm(40, 2, 1)
    x3 <- rnorm(40, 3, 1)
    x4 <- rnorm(40, 4, 1)
    x5 <- rnorm(40, 5, 1)
    x6 <- rnorm(40, 6, 1)
    factor2 <- rep(c("group1", "group2", "group3", "group4", "group5"), each = 8)
    dataANOVA <- data.frame(x1, x2, x3, x4, x5, x6, factor2)
    dataANOVA

with many thanks
abou
__
AbouEl-Makarim Aboueissa, PhD
Professor, Statistics and Data Science
Graduate Coordinator
Department of Mathematics and Statistics
University of Southern Maine

--
Prof. Dr.
Matthias Kohl www.stamats.de
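The genefilter functions require Bioconductor; the same per-column idea can also be sketched in base R by looping one test over each numeric column against the common grouping factor. A minimal sketch, using toy data shaped like the dataTtest example above (reduced to three variables for brevity):

```r
# Toy data mirroring the post: several numeric columns, one two-level factor.
set.seed(1)
x1 <- rnorm(20, 1, 1); x2 <- rnorm(20, 2, 1); x3 <- rnorm(20, 3, 1)
factor1 <- factor(rep(c("group1", "group2"), each = 10))
dataTtest <- data.frame(x1, x2, x3, factor1)

# One t-test per numeric column, same grouping factor each time.
num_cols <- dataTtest[sapply(dataTtest, is.numeric)]
tt <- lapply(num_cols, function(col) t.test(col ~ dataTtest$factor1))
sapply(tt, function(fit) fit$p.value)

# Multiple one-way ANOVAs (plus Tukey HSD) work the same way with aov().
aovs <- lapply(num_cols, function(col) aov(col ~ dataTtest$factor1))
tukeys <- lapply(aovs, TukeyHSD)
```

The same pattern scales to any number of columns; with many tests, consider adjusting p-values, e.g. p.adjust(pvals, method = "BH").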
Re: [R] Strange lazy evaluation of default arguments
Dear S Ellison,

Thanks for the flowers! Indeed, I was actually considering using it in my own teaching material as well. With rounded numbers, of course. Though I am still slightly disturbed about this feature. I thought now would be the time to switch to Python, but that's even worse, see here:

def add_elem(List=[]):
    List.append('elem')
    return List

>>> add_elem([2])
[2, 'elem']
>>> add_elem()
['elem']
>>> add_elem()
['elem', 'elem']   <<<<<<<<<<< why on earth does this happen? [it's documented behavior, but still...]

So, I'll stick with R. Still 25 years or so until retirement, but I'll survive, even without cross-referenced default arguments.

Best wishes, Matthias

From: S Ellison
Sent: Tuesday, 5 September 2017 16:17
To: Matthias Gondan; r-help@r-project.org
Subject: RE: [R] Strange lazy evaluation of default arguments

Mathias, if it's any comfort, I appreciated the example; 'expected' behaviour maybe, but a very nice example for staff/student training! S Ellison

> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthias Gondan
> Sent: 02 September 2017 18:22
> To: r-help@r-project.org
> Subject: [R] Strange lazy evaluation of default arguments
>
> Dear R developers,
>
> sessionInfo() below
>
> Please have a look at the following two versions of the same function:
>
> 1. Intended behavior:
>
> Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> {
>   print(c(u, l, mu))  # here, l is set to u's value
>   u = u/sqrt(sigma2)
>   l = l/sqrt(sigma2)
>   mu = mu/sqrt(sigma2)
>   print(c(u, l, mu))
> }
>
> Su1()
> [1] 100.00 100.00   0.53
> [1] 23.2558140 23.2558140  0.1232558
>
> In the first version, both u and l are correctly divided by 4.3.
>
> 2.
> Strange behavior:
>
> Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> {
>   # print(c(u, l, mu))
>   u = u/sqrt(sigma2)
>   l = l/sqrt(sigma2)  # here, l is set to u's value
>   mu = mu/sqrt(sigma2)
>   print(c(u, l, mu))
> }
>
> Su2()
> [1] 23.2558140  5.4083288  0.1232558
>
> In the second version, the print function is commented out, so the variable u
> is copied to l (lowercase L) at a later place, and l is divided twice by 4.3.
>
> Is this behavior intended? It seems strange that the result depends on a
> debugging message.
>
> Best wishes,
>
> Matthias
>
> sessionInfo()
> R version 3.4.1 (2017-06-30)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows >= 8 x64 (build 9200)
> Matrix products: default
> locale:
> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
> [4] LC_NUMERIC=C LC_TIME=German_Germany.1252
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> loaded via a namespace (and not attached):
> [1] compiler_3.4.1 tools_3.4.1
Re: [R] Strange lazy evaluation of default arguments
Dear Bill, all makes perfect sense (including the late evaluation). I actually discovered the problem by looking at old code which used your proposed solution. Still, I find it strange (and, honestly, I don't like R's behavior in this respect), and I am wondering why u is not being copied to l just before u is assigned a new value. Of course, this would require the R interpreter to track all these dependencies in both ways, including more complicated ones in which l might depend on more than just u. In the future, I'll avoid dependencies between parameters:

Su4 <- function(u=100, l=100, mu=0.53, sigma2=4.3^2)  # instead of l=u

And maybe also "in-place" changes of values... Best regards, Matthias

From: William Dunlap
Sent: Saturday, 2 September 2017 19:41
To: Rui Barradas
Cc: Matthias Gondan; r-help@r-project.org
Subject: Re: [R] Strange lazy evaluation of default arguments

Another way to avoid the problem is to not redefine variables that are arguments. E.g.,

Su3 <- function(u=100, l=u, mu=0.53, sigma2=4.3^2, verbose) {
  if (verbose) {
    print(c(u, l, mu))
  }
  uNormalized <- u/sqrt(sigma2)
  lNormalized <- l/sqrt(sigma2)
  muNormalized <- mu/sqrt(sigma2)
  c(uNormalized, lNormalized, muNormalized)
}

> Su3(verbose=TRUE)
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558
> Su3(verbose=FALSE)
[1] 23.2558140 23.2558140  0.1232558

Not redefining variables at all makes debugging easier, although it may waste space. Bill Dunlap, TIBCO Software, wdunlap tibco.com

On Sat, Sep 2, 2017 at 10:33 AM, <ruipbarra...@sapo.pt> wrote: Hello, one way of preventing that is to use ?force. Just put force(l) right after the commented-out print and before you change 'u'. Hope this helps, Rui Barradas

Citando Matthias Gondan <matthias-gon...@gmx.de>: Dear R developers, sessionInfo() below. Please have a look at the following two versions of the same function: 1.
Intended behavior:

Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
{
  print(c(u, l, mu))  # here, l is set to u's value
  u = u/sqrt(sigma2)
  l = l/sqrt(sigma2)
  mu = mu/sqrt(sigma2)
  print(c(u, l, mu))
}

Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:

Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
{
  # print(c(u, l, mu))
  u = u/sqrt(sigma2)
  l = l/sqrt(sigma2)  # here, l is set to u's value
  mu = mu/sqrt(sigma2)
  print(c(u, l, mu))
}

Su2()
[1] 23.2558140  5.4083288  0.1232558

In the second version, the print function is commented out, so the variable u is copied to l (lowercase L) at a later place, and l is divided twice by 4.3. Is this behavior intended? It seems strange that the result depends on a debugging message. Best wishes, Matthias
[R] Strange lazy evaluation of default arguments
Dear R developers, sessionInfo() below. Please have a look at the following two versions of the same function:

1. Intended behavior:

> Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   print(c(u, l, mu))  # here, l is set to u's value
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:

> Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   # print(c(u, l, mu))
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)  # here, l is set to u's value
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> Su2()
[1] 23.2558140  5.4083288  0.1232558

In the second version, the print function is commented out, so the variable u is copied to l (lowercase L) at a later place, and l is divided twice by 4.3. Is this behavior intended? It seems strange that the result depends on a debugging message.

Best wishes, Matthias

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1
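The difference between the two versions comes from lazy evaluation: the default l=u is stored as a promise and only evaluated the first time l is used, by which point u may already have been overwritten. The force() workaround suggested in this thread can be sketched as follows (Su_forced is a simplified stand-in for Su2):

```r
# Default l = u is a promise; force() evaluates it immediately,
# while u still holds its original value.
Su_forced <- function(u = 100, l = u, sigma2 = 4.3^2) {
  force(l)              # fix l's value NOW, before u is reassigned
  u <- u / sqrt(sigma2)
  l <- l / sqrt(sigma2)
  c(u, l)
}
Su_forced()  # both entries ~ 23.2558: l is no longer divided twice
```

Without the force(l) line, l would be computed from the already-rescaled u, reproducing the Su2 behaviour.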
[R] IF LOOP
Hey, this should be a rather simple question for some of you. I want to make some progress in looping... I have the vector r, which contains single values, see below:

r
 [1] 1.1717118 1.1605215 1.1522907 1.1422830 1.1065277 1.1165451 1.1163768 1.1048872 1.0848836 1.0627211
[11] 1.0300964 1.0296879 1.0308194 1.0518188 1.0657229 1.0685514 1.0914881 1.1042577 1.1039351 1.0880163

I would like to simply take out the value "0.990956" from the vector, printing out the rest of it. The code is from the internet but does not seem to work for my vector. Can't figure out why... Thanks for the help.

r <- as.vector(lw)
count <- 0
for (i in r) {
  if (i == 0.990956) {
    break
  }
  print(i)
}

Best Matthias
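For the record: break aborts the whole loop at the first match, so nothing after the target would be printed; to skip a single value use next, or simply subset the vector. A sketch with toy data (note that comparing floating-point numbers with == is fragile; a small tolerance is safer):

```r
r <- c(1.1717118, 1.1605215, 0.990956, 1.1522907)

# vectorised: keep everything except the target (within a tolerance)
kept <- r[abs(r - 0.990956) > 1e-9]
print(kept)

# loop version: 'next' skips only the current iteration,
# whereas 'break' would stop the loop entirely
for (i in r) {
  if (abs(i - 0.990956) < 1e-9) next
  print(i)
}
```

If the target value does not occur in the vector at all (as in the printout above), both versions simply print every element.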
Re: [R] Paired sample t-test with mi.t.test
Dear Joel, are you trying to apply function mi.t.test from my package MKmisc? Could you please try:

mi.t.test(implist, "pre_test", "post_test", alternative = "greater", paired = TRUE, var.equal = TRUE, conf.level = 0.95)

x and y are the names of the variables, not the variables themselves. Best Matthias

On 06.04.2017 at 18:32, Joel Gagnon wrote:

Dear all, it is my first time posting on this list, so forgive me for any rookie mistakes I could make. I want to conduct t-tests on a dataset that has been imputed using the mice package:

imput_pps <- mice(pps, m = 20, maxit = 20, meth = 'pmm')  # pps is my dataset

It contains items from an 11-item questionnaire gathered at pre and post test, so the data set has 22 columns. I then proceed to compute the total scores for the pre and post test on my imputed datasets:

long_pps <- complete(imput_pps, action = "long", include = TRUE)
long_pps$pre_test <- rowSums(long_pps[, c(3:13)])
long_pps$post_test <- rowSums(long_pps[, c(14:24)])

I then used as.mids to convert back to a mids object:

mids_pps <- as.mids(long_pps)

Next, I created an imputation list object using mitools:

implist <- lapply(seq(mids_pps$m), function(im) complete(mids_pps, im))
implist <- imputationList(implist)

Now, I want to conduct t-tests using the mi.t.test function. I tried the following code:

mi.t.test(implist, implist$pre_test, implist$post_test, alternative = "greater", paired = TRUE, var.equal = TRUE, conf.level = 0.95)

When I run this code, R tells me that y is missing. I know this may sound stupid, but I thought that I specified y with this line: implist$pre_test, implist$post_test (with implist$pre_test being x and implist$post_test being y), like I usually do for a normal t-test using the t.test function. It seems I don't quite understand what the y variable is supposed to represent. Could someone help me figure out what I am doing wrong? Your help would be very much appreciated.
Best regards, Joel Gagnon, Ph.D.(c), Department of Psychology, Université du Québec à Trois-Rivières, Québec, Canada

-- Prof. Dr. Matthias Kohl www.stamats.de
Re: [R] Quantiles with ordered categories
I found it: quantile(ordered(1:10), probs=0.5, type=1) works, because type=1 seems to round up or down, whatever. The default type is 7, which wants to interpolate, and then produces the error. Two options come to my mind:

- The error message could be improved.
- The default type could be 1 if the data is from ordered categories.
- Or both.

It is probably a little thing to fix, but I lack the skills to do this myself. Best wishes, Matthias

From: Bert Gunter
Sent: Tuesday, 14 March 2017 21:34
To: matthias-gon...@gmx.de
Cc: r-help@r-project.org
Subject: Re: [R] Quantiles with ordered categories

Inline. Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Mar 14, 2017 at 12:36 PM, <matthias-gon...@gmx.de> wrote:
> Dear R users,
>
> This works:
> quantile(1:10, probs=0.5)
>
> This fails (obviously):
> quantile(factor(1:10), probs=0.5)
>
> But why do quantiles for ordered factors not work either?
> quantile(ordered(1:10), probs=0.5)
>
> Is it because interpolation (see the optional type argument) is not defined?

Yes.

> Is there an elegant workaround?

No. How can there be? By definition, all that is assumed by an ordered factor is an ordering of the categories. How can you "interpolate" in ordered(letters[1:3])? AFAIK there is no "a.5". -- Bert
[R] Quantiles with ordered categories
Dear R users,

This works:

quantile(1:10, probs=0.5)

This fails (obviously):

quantile(factor(1:10), probs=0.5)

But why do quantiles for ordered factors not work either?

quantile(ordered(1:10), probs=0.5)

Is it because interpolation (see the optional type argument) is not defined? Is there an elegant workaround? Thank you.

Best wishes, Matthias
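As noted elsewhere in this thread, the discrete quantile types avoid the interpolation problem: type = 1 (inverse ECDF) picks an observed level rather than interpolating between levels. A sketch, assuming an R version where quantile() accepts ordered factors for the discrete types:

```r
# Ordered data with a known median level: 30 low, 50 mid, 20 high.
x <- ordered(rep(c("low", "mid", "high"), times = c(30, 50, 20)),
             levels = c("low", "mid", "high"))

# type = 1 returns an observed level, so it is defined for ordered factors;
# the default type 7 would try to interpolate and fail.
med <- quantile(x, probs = 0.5, type = 1)
med
```

type = 3 is the other discrete choice; both sidestep the "no a.5" objection by always returning one of the existing levels.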
[R] summing up a matrix
Hello together, is it possible to sum up a matrix? I have the following matrix at the moment:

         [,1] [,2] [,3] [,4] [,5] [,6]
2016-11    20  200  100   50  100   30
2016-12   100  200  100   50  100   30
2017-01    50  200  100   50  100   30

Now I want to sum up the matrix (cumulatively along each row) into a new matrix. The result should look like the following:

         [,1] [,2] [,3] [,4] [,5] [,6]
2016-11    20  220  320  370  470  500
2016-12   100  300  400  450  550  580
2017-01    50  250  350  400  500  530

Is it possible to create that? Thanks for your help. Best regards. Mat
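The desired result is a cumulative sum along each row, which can be sketched with apply() and cumsum(); the outer t() is needed because apply() returns its per-row results as columns:

```r
m <- rbind("2016-11" = c(20, 200, 100, 50, 100, 30),
           "2016-12" = c(100, 200, 100, 50, 100, 30),
           "2017-01" = c(50, 200, 100, 50, 100, 30))

# cumulative sum within each row; transpose back to the original shape
res <- t(apply(m, 1, cumsum))
res
#         [,1] [,2] [,3] [,4] [,5] [,6]
# 2016-11   20  220  320  370  470  500
# 2016-12  100  300  400  450  550  580
# 2017-01   50  250  350  400  500  530
```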
Re: [R] fPortfolio dont work in R 3.3.1
Andre, you need to make sure you have a C/C++ compiler as well as a Fortran compiler to compile from source a package that makes use of these languages. Many R packages use one of these languages under the hood to speed things up. gcc/gfortran are common and free choices for such compilers; depending on your OS, they ship with the OS installation. What OS do you have? Also, I'd try to install Rsymphony first and then install fPortfolio. HTH, matt

Hi, I'm trying to use the package fPortfolio in R version 3.3.1, but the package doesn't work. I receive the messages below:

Package which is only available in source form, and may need compilation of C/C++/Fortran: 'Rsymphony'
These will not be installed
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called 'Rsymphony'
Error: package or namespace load failed for 'fPortfolio'

How can I use fPortfolio in R 3.3.1? Thanks in advance, André Barbosa Oliveira.
[R] filter a data.frame in dependence of a column value
Hello together, I have a short question, maybe anyone can help me. I have a data.frame like this one:

NO  ORDER
1   1530 for Mr. Muller (10.0 -> 11.2)
2   1799 for Mr Giulani
3   1888 for Mr. Marius (11.2 -> 12)

I need a solution which keeps only the rows that contain values in brackets. The result should look like the following:

NO  ORDER
1   1530 for Mr. Muller (10.0 -> 11.2)
3   1888 for Mr. Marius (11.2 -> 12)

I tried it with the following code, but that doesn't work:

data4.1 <- data3[data3$ORDER %in% "[(]*->*[)]", ]

Maybe anyone can help me. Thank you. Best regards, Mat
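%in% tests exact set membership against literal strings, not patterns; pattern matching needs grepl(). A sketch with the data from the post (data3 rebuilt as a toy data frame):

```r
data3 <- data.frame(
  NO = 1:3,
  ORDER = c("1530 for Mr. Muller (10.0 -> 11.2)",
            "1799 for Mr Giulani",
            "1888 for Mr. Marius (11.2 -> 12)"),
  stringsAsFactors = FALSE
)

# keep only rows whose ORDER contains a "( ... -> ... )" pattern
data4 <- data3[grepl("\\(.*->.*\\)", data3$ORDER), ]
data4
```

The parentheses must be escaped ("\\(" and "\\)") because they are grouping metacharacters in regular expressions.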
[R] substring in dependence of values
Hello together, maybe anyone can help me with a little problem. I have the following column in a data.frame (called TEST):

VERSION
1  abc (9.2 -> 10.0)
2  def (10.2 -> 11.0)
3  ghi (7.4 -> 8.4)
4  jkl (10.2 -> 10.4)
5  mno (8.1 -> 9.0)

I now need the code for the following result (I want the numbers before/after the arrow in separate columns). The solution should look like this one:

VERSION             FROM  TO
1  abc (9.2 -> 10.0)    9.2  10.0
2  def (10.2 -> 11.0)  10.2  11.0
3  ghi (7.4 -> 8.4)     7.4   8.4
4  jkl (10.2 -> 10.4)  10.2  10.4
5  mno (8.1 -> 9.0)     8.1   9.0

Maybe anyone can help me; the substring code doesn't help me at the moment. Best regards, Mat
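Fixed-position substring() cannot handle version strings of varying length, but sub() with a capturing regex can pull the two numbers out of the bracketed part. A sketch (TEST rebuilt as a toy data frame, reduced to three rows):

```r
TEST <- data.frame(
  VERSION = c("abc (9.2 -> 10.0)", "def (10.2 -> 11.0)", "ghi (7.4 -> 8.4)"),
  stringsAsFactors = FALSE
)

# capture group 1 = number before the arrow, group 2 = number after it
pat <- ".*\\(([0-9.]+) *-> *([0-9.]+)\\).*"
TEST$FROM <- as.numeric(sub(pat, "\\1", TEST$VERSION))
TEST$TO   <- as.numeric(sub(pat, "\\2", TEST$VERSION))
TEST
```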
[R] (no subject)
Thank you for the help. It worked out fine. However, trying to add additional aesthetics, I came to another point where I couldn't figure out a way to solve it. Perhaps you have run into similar problems, since you're familiar with the ggplot package. The code is the following:

runmean_10yr_Nino_3_Index_17th_century <- SMA(SST_NINO3_detrended[0:100], n=10)
runmean_10yr_Nino_3_Index_18th_century <- SMA(SST_NINO3_detrended[101:200], n=10)
runmean_10yr_Nino_3_Index_19th_century <- SMA(SST_NINO3_detrended[201:300], n=10)
runmean_10yr_Nino_3_Index_20th_century <- SMA(SST_NINO3_detrended[301:400], n=10)
time_Nino3 <- c(1:100)
runmean_yearly_decadal_Nino_Index <- data.frame(runmean_10yr_Nino_3_Index_17th_century,
                                                runmean_10yr_Nino_3_Index_18th_century,
                                                runmean_10yr_Nino_3_Index_19th_century,
                                                runmean_10yr_Nino_3_Index_20th_century,
                                                time_Nino3)
melted3 <- melt(runmean_yearly_decadal_Nino_Index, id.vars="time_Nino3")
dat.vline <- data.frame(xp = c(15, 23))
ggplot(data=melted3, aes(x=time_Nino3, y=value, colour=variable)) +
  geom_line() +
  ylab("SST Anomaly") +
  xlab("year") +
  ggtitle("Decadal El Nino Index ") +
  facet_grid(variable ~ .) +
  geom_vline(aes(xintercept = xp), data=dat.vline)

As you may have noticed, I created several plots below each other, and now I would like to add vertical lines to these plots, hence adding geom_vline(aes(xintercept = xp), data=dat.vline). This worked so far; however, I would like to add vertical lines to these four different plots at different places. Do you know how I could do this? I attached the plot I created to this e-mail. You can see two lines on every one of those 4 plots (at the same place). What I want is the same thing, but the lines at different places in every one of those 4 plots (independently). Thank you for the help!! If you need to create the plots, you can simply change "runmean_10yr_Nino_3_Index_17th_century" to any other vector.
Best Matthias
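Per-facet vertical lines work by giving geom_vline() a data frame that contains the faceting variable, so each line is matched to its own panel. A sketch with toy data (the column names time, value, variable and the facet levels are stand-ins for the melted3 columns above):

```r
library(ggplot2)

# toy long-format data with a faceting variable
d <- data.frame(time = rep(1:10, 2),
                value = rnorm(20),
                variable = rep(c("c17", "c18"), each = 10))

# one row per line; the 'variable' column tells facet_grid
# which panel each vertical line belongs to
vlines <- data.frame(variable = c("c17", "c17", "c18"),
                     xp = c(3, 7, 5))

p <- ggplot(d, aes(time, value)) +
  geom_line() +
  facet_grid(variable ~ .) +
  geom_vline(data = vlines, aes(xintercept = xp), linetype = "dashed")
```

Here panel "c17" gets lines at 3 and 7, while panel "c18" gets a single line at 5.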
[R] hello everyone and happy new year
I got the following problem in RStudio. I'm trying to make a plot of different time series using different colours, which so far worked fine with the ggplot tool. However, I did not manage to actually attach a name to the correct layers. Below you can see the code that I wrote, and it works: I get a plot with all 9 time series from aa until ii. I just could not figure out how to label the different layers. What should I do when I would like to add a name for the red line, the green line, and so on beside the plot (in a list)? Thank you for your time. Best Matthias

aa <- OHC_d[40:70]
bb <- OHC_d[50:80]
cc <- OHC_d[173:203]
dd <- OHC_d[205:235]
ee <- OHC_d[273:303]
ff <- OHC_d[292:322]
gg <- OHC_d[302:332]
hh <- OHC_d[370:400]
ii <- OHC_d[381:411]
s3 <- data.frame(aa, bb, cc, dd, ee, ff, gg, hh, ii)
y2 <- ggplot(s3, aes(x=time2, y=aa), colour="black") +
  geom_line() +
  geom_line(data=s3, aes(y=bb), colour="red") +
  geom_line(data=s3, aes(y=cc), colour="green") +
  geom_line(data=s3, aes(y=dd), colour="blue") +
  geom_line(data=s3, aes(y=ee), colour="yellow") +
  geom_line(data=s3, aes(y=ff), colour="pink") +
  geom_line(data=s3, aes(y=gg), colour="purple") +
  geom_line(data=s3, aes(y=hh), colour="violet") +
  geom_line(data=s3, aes(y=ii), colour="grey")
y2
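Hard-coding colour per layer (colour = "red" outside aes()) bypasses the legend entirely; the idiomatic ggplot2 approach is to reshape to long format and map colour to a series variable, which generates a labelled legend automatically. A sketch with toy vectors standing in for the OHC_d slices (reduced to three series for brevity):

```r
library(ggplot2)

# toy stand-ins for aa, bb, cc plus a common time axis
set.seed(1)
s3 <- data.frame(time2 = 1:31,
                 aa = cumsum(rnorm(31)),
                 bb = cumsum(rnorm(31)),
                 cc = cumsum(rnorm(31)))

# long format via stack(): columns 'values' and 'ind' (the series name)
long <- cbind(time2 = rep(s3$time2, 3), stack(s3[, c("aa", "bb", "cc")]))

# colour mapped to the series variable => one legend entry per series
p <- ggplot(long, aes(x = time2, y = values, colour = ind)) +
  geom_line() +
  labs(colour = "series") +
  scale_colour_manual(values = c(aa = "black", bb = "red", cc = "green"))
```

Legend labels can be renamed via the labels argument of scale_colour_manual(), without touching the data.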
Re: [R] How to install pkg when "not found" ?
Use quotes!

install.packages("ggplot2")

On 30.12.2015 at 06:41, Judson wrote:

Using "Install Packages" from CRAN, in RStudio on Windows 7, I downloaded (and supposedly installed) the ggplot2 package to here: C:\Program Files\R\R-3.1.0\library\ggplot2_2.0.0\ggplot2. When I try this:

require(ggplot2)

I get the following:

Loading required package: ggplot2
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'ggplot2'

I also tried this:

install.packages(ggplot2)
Error in install.packages : object 'ggplot2' not found

The above library has the usual things like MASS, stats, base, utils, graphics and so on, and if I do this:

require(MASS)

I get this:

Loading required package: MASS

so that works. Any idea what I'm doing wrong with ggplot2? judson blake

-- Prof. Dr. Matthias Kohl www.stamats.de
[R] Compute Regressions with R-Function
Hello, I was trying to set up a function() that allows one to easily calculate regressions for several dependent variables with the same independent variables. My function looked like this, but was turned down by an error (I'm quite new to R). So here is my attempt:

b
CO2
logTrop_Aerosol
lm(b ~ CO2 + logTrop_Aerosol)

Trend <- function(x, CO2, logTrop_Aerosol) { lm x CO2 + logTrop_Aerosol }

b is a vector containing 400 values of heat content; CO2 also contains 400 values, as does logTrop_Aerosol. My idea would be that I can simply replace x (heat content) by other vectors containing heat content. The error I got was the following:

Error: unexpected symbol in "Trend <- function(x,y,z) { lm x"

Thanks a lot for the help! Best Matthias
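The parse error comes from lm x ...: lm() is a function and needs a formula argument, lm(x ~ ...). A sketch of the intended wrapper, with toy data standing in for the heat-content and forcing vectors from the post:

```r
# toy stand-ins for the 400-value vectors described in the post
set.seed(1)
CO2 <- rnorm(400)
logTrop_Aerosol <- rnorm(400)
b <- 2 * CO2 - logTrop_Aerosol + rnorm(400)

# wrapper: same predictors every time, swap in any response vector as y
Trend <- function(y, CO2, logTrop_Aerosol) {
  lm(y ~ CO2 + logTrop_Aerosol)
}

fit <- Trend(b, CO2, logTrop_Aerosol)
coef(fit)
```

Calling Trend(other_heat_content, CO2, logTrop_Aerosol) then refits the same model for a different response.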
[R] ..lsmeans and coxme..
Dear list-eners, I run into the following problem when I want to get contrasts from a coxme model using the lsmeans package. A call to lsmeans on the coxme model throws the following error:

Error in if (adjustSigma && object$method == "ML") V = V * object$dims$N/(object$dims$N - ... : missing value where TRUE/FALSE needed

I give an example:

library(coxme)
library(lsmeans)
fm <- coxme(Surv(y, uncens) ~ trt + (trt | center) + strata(center), data = eortc)
summary(fm)
lsmeans(fm, ~ trt)

traceback() points to the internal function lsmeans:::lsm.basis.lme(). Any ideas? Thanks in advance. -m
Re: [R] calculate value in dependence of target value
Hi David, thanks for the reply. My spelling of the numbers was not correct. What I mean with 100.000 is 10.00! I have corrected the values in my example below. Crucially, the MARGE rises in dependence of the ID. The ID 11 will be counted at 2% because we don't reach the 50% hurdle (5). The ID 12 will reach the 50% hurdle, so the ID 12 should be counted as 1200 (result of 4 * 2% + 1 * 4%). The 1 at 4% will be credited more, because it exceeds the 50% target value. Thanks for your help. Best regards, Mat

-Original Message-
From: David L Carlson [mailto:dcarl...@tamu.edu]
Sent: Monday, 9 March 2015 16:08
To: Matthias Weber; r-help@r-project.org
Subject: RE: calculate value in dependence of target value

It is very hard to figure out what you are trying to do. 1. All of the VALUEs are greater than the target of 100. 2. Your description of what you want does not match your example. Perhaps VALUE should be divided by 1000 (e.g. not 1, but 10)? Perhaps your targets do not apply to VALUE, but to cumulative VALUE?

- David L Carlson, Department of Anthropology, Texas A&M University, College Station, TX 77840-4352

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthias Weber
Sent: Monday, March 9, 2015 7:46 AM
To: r-help@r-project.org
Subject: [R] calculate value in dependence of target value

Hello together, I have a little problem. Maybe anyone can help me. I have to calculate a new column in dependence of a target value. As an example: my target value is 10. At the moment I have a data.frame with the following values.
ID  VALUE
1   111
2   125
3   133
4   142

The new column (MARGE) should be calculated with the following graduation:

Until the VALUE reaches 50% of the target value (5) = 2%
Until the VALUE reaches 75% of the target value (75000) = 4%
Until the VALUE reaches 100% of the target value (10) = 8%
If the VALUE goes above 100% of the value (10) = 10%

The result looks like this one:

ID  VALUE  MARGE
1   111    200   (result of 1 * 2%)
2   125    1200  (result of 4 * 2% + 1 * 4%)
3   133    1800  (result of 15000 * 4% + 15000 * 8%)
4   142    1800  (result of 1 * 8% + 1 * 10%)

Is there any way to calculate the column MARGE automatically in R? Thanks a lot for your help. Best regards, Mat

This e-mail may contain trade secrets, privileged, undisclosed or otherwise confidential information. If you have received this e-mail in error, you are hereby notified that any review, copying or distribution of it is strictly prohibited. Please inform us immediately and destroy the original transmittal. Thank you for your cooperation.
[R] calculate value in dependence of target value
Hello together, I have a little problem. Maybe anyone can help me. I have to calculate a new column in dependence of a target value. As an example: my target value is 100.000. At the moment I have a data.frame with the following values.

ID  VALUE
1   111
2   125
3   133
4   142

The new column (MARGE) should be calculated with the following graduation:

Until the VALUE reaches 50% of the target value (50.000) = 2%
Until the VALUE reaches 75% of the target value (75.000) = 4%
Until the VALUE reaches 100% of the target value (100.000) = 8%
If the VALUE goes above 100% of the value (100.000) = 10%

The result looks like this one:

ID  VALUE  MARGE
1   111    200   (result of 10.000 * 2%)
2   125    1200  (result of 40.000 * 2% + 10.000 * 4%)
3   133    1800  (result of 15.000 * 4% + 15.000 * 8%)
4   142    1800  (result of 10.000 * 8% + 10.000 * 10%)

Is there any way to calculate the column MARGE automatically in R? Thanks a lot for your help. Best regards, Mat
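One common reading of such a graduation is a tiered ("marginal") rate: each slice of the value that falls inside a band is multiplied by that band's rate. A sketch under that assumed reading, with target 100000 and bands at 50/75/100% (rates 2/4/8/10%); marge() is a hypothetical helper name:

```r
# Tiered-rate sketch (assumed reading of the post): clip the value
# against each band and apply the band's rate to that slice.
marge <- function(value, target = 100000,
                  lower = c(0, 0.5, 0.75, 1) * target,
                  rates = c(0.02, 0.04, 0.08, 0.10)) {
  upper <- c(lower[-1], Inf)
  sum(pmax(0, pmin(value, upper) - lower) * rates)
}

marge(10000)   # 10.000 lies entirely in the 2% band: 10000 * 0.02 = 200
```

Because marge() is vectorisable via sapply(marge, ...) over a column, the MARGE column can be filled automatically, e.g. df$MARGE <- sapply(df$VALUE, marge).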
[R] move date-values from one line to several lines
Hello together, I have a data.frame with date values. What I want is a data.frame with several lines for each date. My current data.frame looks like this one:

  ID FROM       TO         REASON
  1  2015-02-27 2015-02-28 Holiday
  1  2015-03-15 2015-03-20 Illness
  2  2015-05-20 2015-05-23 Holiday
  2  2015-06-01 2015-06-03 Holiday
  2  2015-07-01 2015-07-01 Illness

The result looks like this one:

  ID DATE       REASON
  1  2015-02-27 Holiday
  1  2015-02-28 Holiday
  1  2015-03-15 Illness
  1  2015-03-16 Illness
  1  2015-03-17 Illness
  1  2015-03-18 Illness
  1  2015-03-19 Illness
  1  2015-03-20 Illness
  2  2015-05-20 Holiday
  2  2015-05-21 Holiday
  2  2015-05-22 Holiday
  2  2015-05-23 Holiday
  2  2015-06-01 Holiday
  2  2015-06-02 Holiday
  2  2015-06-03 Holiday
  2  2015-07-01 Illness

Maybe anyone can help me with how I can do this. Thank you.

Best regards.
Mat
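A base-R sketch (my own; the data.frame mirrors the post): seq() over each FROM/TO pair generates the daily dates, and the per-row pieces are stacked with rbind:

```r
# Hedged sketch: expand each FROM/TO range into one row per day.
df <- data.frame(
  ID     = c(1, 1, 2, 2, 2),
  FROM   = as.Date(c("2015-02-27", "2015-03-15", "2015-05-20",
                     "2015-06-01", "2015-07-01")),
  TO     = as.Date(c("2015-02-28", "2015-03-20", "2015-05-23",
                     "2015-06-03", "2015-07-01")),
  REASON = c("Holiday", "Illness", "Holiday", "Holiday", "Illness")
)

expanded <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
  data.frame(ID     = df$ID[i],
             DATE   = seq(df$FROM[i], df$TO[i], by = "day"),
             REASON = df$REASON[i])
}))
expanded   # one row per ID/DATE/REASON combination
```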
[R] merge 2 data.frames in dependence of 2 values
Hello together, I have a little problem. Maybe anyone can help me. I have 2 data.frames, which look as follows:

First:
    NAME MONTH   BONUS
  1 Andy 2014-10 100
  2 Pete 2014-10 200
  3 Marc 2014-10 300
  4 Andy 2014-11 400

Second:
    NAME  MONTH   BONUS_2
  1 Andy  2014-10 150
  2 Pete  2014-11 180
  3 Jason 2014-10 190
  4 Paul  2014-10 210
  5 Andy  2014-11 30

How can I merge these 2 data.frames, if I want the following result:

    NAME  MONTH   BONUS BONUS_2
  1 Andy  2014-10 100   150
  2 Pete  2014-11       180
  3 Marc  2014-10 300
  4 Andy  2014-11 400   30
  5 Pete  2014-10 200
  6 Jason 2014-10       190
  7 Paul  2014-10       210

The important thing is that for every match in the columns NAME and MONTH I get a new line. Thanks for your help.

Best regards.
Mat
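With base R this is a full outer join: merge() on both key columns with all = TRUE keeps every NAME/MONTH combination from either table (my own sketch; the data are keyed from the post, though the row order may differ from the desired output):

```r
# Hedged sketch: full outer join on NAME and MONTH.
first <- data.frame(NAME  = c("Andy", "Pete", "Marc", "Andy"),
                    MONTH = c("2014-10", "2014-10", "2014-10", "2014-11"),
                    BONUS = c(100, 200, 300, 400))
second <- data.frame(NAME    = c("Andy", "Pete", "Jason", "Paul", "Andy"),
                     MONTH   = c("2014-10", "2014-11", "2014-10",
                                 "2014-10", "2014-11"),
                     BONUS_2 = c(150, 180, 190, 210, 30))

# all = TRUE keeps unmatched rows from both sides (NA where missing)
merged <- merge(first, second, by = c("NAME", "MONTH"), all = TRUE)
merged
```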
[R] split a field in dependence of the semicolon
Hello together, I have a little problem, maybe anyone can help me. I have a data.frame which looks like this one:

    ID Members
  1 1  ; abc; def; ghi
  2 2  ; abc;
  3 3  ;def;

How can I create another column for each value between 2 semicolons? The result looks like this one:

    ID Members         Member1 Member2 Member3
  1 1  ; abc; def; ghi abc     def     ghi
  2 2  ; abc;          abc
  3 3  ;def;           def

Maybe anyone can help me. Thank you.

Best regards.
Mat
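A base-R sketch (my own; the hypothetical data.frame mirrors the post): split on ";", drop the empty pieces that the leading/trailing semicolons create, and pad the rows to equal length:

```r
# Hedged sketch: one MemberN column per ";"-separated entry.
df <- data.frame(ID = 1:3,
                 Members = c("; abc; def; ghi", "; abc;", ";def;"),
                 stringsAsFactors = FALSE)

parts <- strsplit(df$Members, ";")
# trim whitespace and drop the empty strings left by leading/trailing ";"
parts <- lapply(parts, function(p) { p <- trimws(p); p[nzchar(p)] })

n   <- max(lengths(parts))
mat <- t(sapply(parts, function(p) c(p, rep(NA, n - length(p)))))
colnames(mat) <- paste0("Member", seq_len(n))

result <- cbind(df, mat)
result
```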
Re: [R] dotplot with library lattice
Hi Jim, the graph looks nearly perfect now. I have one last question: how can I change the scaling of the x-axis? At the moment I see the values at 0, 20, 40, 60, 80, 100. My wish is that the x-axis matches the abline, so the tick marks should be at 0, 25, 50, 75, 100. Thanks a lot.

Best regards.
Mat

-----Original Message-----
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Monday, 27 October 2014 09:47
To: Matthias Weber
Cc: r-help@r-project.org
Subject: Re: AW: [R] dotplot with library lattice

On Mon, 27 Oct 2014 08:53:51 AM Matthias Weber wrote:
  Hi Jim, looks perfect to me. Thank you very much. One last question. Is there any possibility to add 3 lines in the graph, one at 25%, another at 50% and the last one at 75%, so I can see the progress a bit better?

Hi Mat,
Change this:

  abline(h = 1:13, lty = 2, col = "lightgray")

to this:

  abline(h = 1:13, v = c(25, 50, 75), lty = 2, col = "lightgray")

Jim
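For the tick positions themselves (my addition, not part of Jim's reply): suppress the default x-axis with xaxt = "n" in the plot() call and draw it explicitly with axis(), using the same positions as the reference lines:

```r
# Hedged sketch: custom x-axis ticks at 0, 25, 50, 75, 100, matching
# the plot set-up used earlier in the thread.
plot(rep(100, 13), 1:13, main = "IST against Budget",
     xlab = "IST/Budget (percent)", ylab = "KOST",
     xlim = c(0, 100), type = "n", xaxt = "n", yaxt = "n")
axis(1, at = seq(0, 100, by = 25))                # ticks match the ablines
abline(h = 1:13, v = c(25, 50, 75), lty = 2, col = "lightgray")
```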
Re: [R] dotplot with library lattice
Hi Jim, looks perfect to me. Thank you very much. One last question. Is there any possibility to add 3 lines in the graph, one at 25%, another at 50% and the last one at 75%, so I can see the progress a bit better? Thank you.

Best regards.
Mat

-----Original Message-----
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Saturday, 25 October 2014 03:27
To: Matthias Weber
Cc: r-help@r-project.org
Subject: Re: AW: [R] dotplot with library lattice

On Fri, 24 Oct 2014 12:49:39 PM Matthias Weber wrote:
  I want always to see the progress of the green button in dependence of the blue button at 100%.

Okay.

  mwdat <- read.table(text = "
  KOST Budget IST
  1060 -2.18 0
  1080 91037.71 91647.15
  1100 955573.87 907938.98
  1120 23326.8 0
  1150 2521.57 0
  1180 51302.03 48760.45
  1200 2027.04 -1667.5
  1210 2385.03 2386.06
  1220 0 0
  1250 528.87 0
  1255 766.54 0
  1260 12154.97 4861.41
  Gesamtbudget 1141622.25 1054236.55", header = TRUE)

  par(las = 1, mar = c(5, 7, 4, 6))
  plot(rep(100, 13), 1:13, main = "IST against Budget",
       xlab = "IST/Budget (prozent)", ylab = "KOST",
       xlim = c(0, 100), type = "n", yaxt = "n")
  abline(h = 1:13, lty = 2, col = "lightgray")
  points(rep(100, 13), 1:13, pch = 19, col = "blue", cex = 3)
  mwdat$ISTpct <- 100 * mwdat$IST / mwdat$Budget
  # fix divide by zero
  mwdat$ISTpct[9] <- 0
  points(mwdat$ISTpct, 1:13, pch = 19, col = "green", cex = 3)
  legend(105, 10, c("Budget", "IST"), pch = 19,
         col = c("blue", "green"), bty = "n", xpd = TRUE)
  axis(2, at = 1:13, labels = mwdat$KOST)
  par(las = 0, mar = c(5, 4, 4, 2))

Jim
Re: [R] dotplot with library lattice
Hi Jim, thanks for your reply. It doesn't look bad, but what I mean with Budget 100 is the percentage of the progress on the x-axis. As an example, the budget for 1120 is 23326.8. At the moment we have 0 as value for IST. In this case, the blue button should stand at 100% (x-axis = progress in %) on the right side and the green button on the x-axis at 0 (left side), because the value is 0. If the value for 1120 rises to 11663, the blue button is still at 100% on the right side (the blue button is in all cases on the right side) and the green button is right in the middle, because the budget is 50% full. I want always to see the progress of the green button in dependence of the blue button at 100%. Maybe you can help me. Thank you.

Best regards.
Mat

-----Original Message-----
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Friday, 24 October 2014 11:29
To: r-help@r-project.org
Cc: Matthias Weber
Subject: Re: [R] dotplot with library lattice

On Thu, 23 Oct 2014 05:57:27 PM Matthias Weber wrote:
  Hello together, I have a short question. Maybe anyone can help me to create a barplot in R with the package lattice.
  I have the following data as output values and the following code:

  Data (d):

    KOST Budget IST
    1060 -2.18 0
    1080 91037.71 91647.15
    1100 955573.87 907938.98
    1120 23326.8 0
    1150 2521.57 0
    1180 51302.03 48760.45
    1200 2027.04 -1667.5
    1210 2385.03 2386.06
    1220 0 0
    1250 528.87 0
    1255 766.54 0
    1260 12154.97 4861.41
    Gesamtbudget 1141622.25 1054236.55

  Code:

    ### read the data
    d$KOST <- ordered(d$KOST, levels = d$KOST)
    ### load lattice and grid
    require(lattice)
    ### set up the key
    k <- simpleKey(c("Budget", "IST"))
    k$points$fill <- c("blue", "darkgreen")
    k$points$pch <- 21
    k$points$col <- "black"
    k$points$cex <- 1
    ### create the plot
    dotplot(KOST ~ Budget + IST, data = d, horiz = TRUE,
            par.settings = list(superpose.symbol = list(pch = 21,
              fill = c("blue", "darkgreen"), cex = 3, col = "black")),
            xlab = "Kostenstellenübersicht", key = k,
            panel = function(x, y, ...) {
              panel.dotplot(x, y, ...)
              # grid.text(unit(x, "native"), unit(y, "native"),
              #           label = x, gp = gpar(cex = .7))
            })

  The result looks like the attached graph. But this is not exactly what I want. I want the Budget on the right side (100), and the IST value in dependence of the Budget between 0 and 100. As an example: if there is a budget over 100.000 and the IST value is around 50.000, the blue button should be on the right side and the green button right in the middle. Maybe anyone can help me. Thank you.

  Best regards.
  Mat

Hi Mat,
I must admit that I have probably misunderstood your request. I had assumed that you wanted the Budget as the x axis and the KOST as the y axis. You seemed to be asking for Budget to be always 100, and that didn't make sense. For some reason the scipen option stopped working for me after the first plot, so I couldn't get rid of the scientific notation on the x axis. Also, this was done in base graphics rather than lattice. I also left the Gesamtbudget (total budget) out, as it would have squeezed most of the dots even farther to the left. However, it might help.
mwdat <- read.table(text = "
KOST Budget IST
1060 -2.18 0
1080 91037.71 91647.15
1100 955573.87 907938.98
1120 23326.8 0
1150 2521.57 0
1180 51302.03 48760.45
1200 2027.04 -1667.5
1210 2385.03 2386.06
1220 0 0
1250 528.87 0
1255 766.54 0
1260 12154.97 4861.41", header = TRUE)

options(scipen = 4)
par(las = 1)
plot(mwdat$Budget, mwdat$KOST, main = "IST against Budget",
     xlab = "Budget", ylab = "KOST",
     xlim = range(mwdat$Budget), ylim = range(mwdat$KOST),
     type = "n", yaxt = "n")
abline(h = mwdat$KOST, lty = 2, col = "lightgray")
points(mwdat$Budget, mwdat$KOST, pch = 19, col = "blue", cex = 3)
points(mwdat$IST, mwdat$KOST, pch = 19, col = "green", cex = 3)
legend(40, 1250, c("Budget", "IST"), pch = 19,
       col = c("blue", "green"), bty = "n")
options(scipen = 0)
axis(2, at = mwdat$KOST)
par(las = 0)

Jim
[R] dotplot with library lattice
Hello together, I have a short question. Maybe anyone can help me to create a barplot in R with the package lattice. I have the following data as output values and the following code:

Data (d):

  KOST Budget IST
  1060 -2.18 0
  1080 91037.71 91647.15
  1100 955573.87 907938.98
  1120 23326.8 0
  1150 2521.57 0
  1180 51302.03 48760.45
  1200 2027.04 -1667.5
  1210 2385.03 2386.06
  1220 0 0
  1250 528.87 0
  1255 766.54 0
  1260 12154.97 4861.41
  Gesamtbudget 1141622.25 1054236.55

Code:

  ### read the data
  d$KOST <- ordered(d$KOST, levels = d$KOST)
  ### load lattice and grid
  require(lattice)
  ### set up the key
  k <- simpleKey(c("Budget", "IST"))
  k$points$fill <- c("blue", "darkgreen")
  k$points$pch <- 21
  k$points$col <- "black"
  k$points$cex <- 1
  ### create the plot
  dotplot(KOST ~ Budget + IST, data = d, horiz = TRUE,
          par.settings = list(superpose.symbol = list(pch = 21,
            fill = c("blue", "darkgreen"), cex = 3, col = "black")),
          xlab = "Kostenstellenübersicht", key = k,
          panel = function(x, y, ...) {
            panel.dotplot(x, y, ...)
            # grid.text(unit(x, "native"), unit(y, "native"),
            #           label = x, gp = gpar(cex = .7))
          })

The result looks like the attached graph. But this is not exactly what I want. I want the Budget on the right side (100), and the IST value in dependence of the Budget between 0 and 100. As an example: if there is a budget over 100.000 and the IST value is around 50.000, the blue button should be on the right side and the green button right in the middle. Maybe anyone can help me. Thank you.

Best regards.
Mat
[R] sort a data.frame in a different way
Hello together, I have a little problem. Maybe anyone can help me. I have a data.frame which looks like this one:

    1000 1001 1002 1003
  1    5    6   12   11
  2    4    3    8    1

What I need is a data.frame which looks like this one: the column names should be in the first column, and the values to the right of the column name. The solution looks like this one:

       A  B
  1000 5  4
  1001 6  3
  1002 12 8
  1003 11 1

Maybe anyone can help me. Thank you.

Best regards.
Mat
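A minimal sketch (my own, assuming the wide data.frame shown above): t() transposes the table, and the old column names become the row names of the result:

```r
# Hedged sketch: transpose a wide data.frame so columns become rows.
wide <- data.frame("1000" = c(5, 4), "1001" = c(6, 3),
                   "1002" = c(12, 8), "1003" = c(11, 1),
                   check.names = FALSE)

long <- as.data.frame(t(wide))   # old column names become row names
colnames(long) <- c("A", "B")
long
```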
[R] strange behavior of the compare operator
I had a strange behavior of a function written a few days ago. I narrowed the problem down to the following minimal example. Can anyone explain why the following comparisons don't give the same (correct) answer? Thanks for your reply!

Matthias

> sessionInfo()
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252 LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.1.1

> rm(list = ls())
> myDataFrame <- data.frame(var1 = seq(from = -1, to = 0, by = 0.01))
> any(myDataFrame$var1 == (0.68 - 1))
[1] TRUE
> any(myDataFrame$var1 == -0.32)
[1] FALSE
> myDataFrame$var1[69]
[1] -0.32
> str((0.68 - 1))
 num -0.32
> str(-0.32)
 num -0.32
> str(myDataFrame$var1)
 num [1:101] -1 -0.99 -0.98 -0.97 -0.96 -0.95 -0.94 -0.93 -0.92 -0.91 ...
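The cause (my explanation, consistent with the output above) is binary floating point: 0.68 - 1 and the literal -0.32 are two different double-precision numbers that both print as -0.32, and seq() happens to produce the former. Comparing with a tolerance avoids the surprise:

```r
# Hedged sketch: exact == on doubles vs. tolerance-based comparison.
x <- seq(from = -1, to = 0, by = 0.01)

(0.68 - 1) == -0.32              # FALSE: the two doubles differ in the last bit
print(0.68 - 1, digits = 17)     # shows the difference hidden at 7 digits
print(-0.32,    digits = 17)

# compare with a small tolerance instead of exact equality
tol <- 1e-9
any(abs(x - (0.68 - 1)) < tol)   # TRUE
any(abs(x - (-0.32)) < tol)      # TRUE, unlike the exact == comparison
```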
[R] ..nlme: start values necessary even though using self-start function?..
Dear list. I am stuck with a "subscript out of bounds" error when I play with an nlme example from the yellow Pinheiro/Bates book. It is about using nlme() with the formula interface for non-linear mixed models. The following code snippet is inspired by the book (section 8.2), using the Orange example data from the R datasets package, containing repeated measures of trunk circumference over time for 5 trees. It is an nfnGroupedData with structure circumference ~ age | Tree.

  nlme(circumference ~ SSlogis(age, Asym, xmid, scal),
       data = Orange,
       fixed = Asym + xmid + scal ~ 1)

It results in the following error:

  Error in nlme.formula(circumference ~ SSlogis(age, Asym, xmid, scal), :
    subscript out of bounds

Specifying the grouping as in groups = ~Tree does not help, either. When I do provide start values, it works, though:

  nlme(circumference ~ SSlogis(age, Asym, xmid, scal),
       data = Orange,
       start = c(Asym = 200, xmid = 725, scal = 350),
       fixed = Asym + xmid + scal ~ 1)

Beginning of result output:

  Nonlinear mixed-effects model fit by maximum likelihood
    Model: circumference ~ SSlogis(age, Asym, xmid, scal)
    Data: Orange
    Log-likelihood: -129.9907
    Fixed: Asym + xmid + scal ~ 1
        Asym     xmid     scal
    192.0959 727.5887 356.6011
  ...

SSlogis() is a selfStart object and should be able to get its own starting values like so:

  getInitial(circumference ~ SSlogis(age, Asym, xmid, scal), data = Orange)

So the start= argument should not be necessary. What am I missing here? Any ideas? Thanks in advance.
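As a workaround (my own sketch, not a fix for the underlying issue): the self-start values can be obtained explicitly with getInitial() and passed to start=, which reproduces the working call above without hard-coding numbers:

```r
# Hedged sketch: feed getInitial()'s estimates into nlme() as start values.
library(nlme)   # Orange ships with base R as a groupedData object

start_vals <- getInitial(circumference ~ SSlogis(age, Asym, xmid, scal),
                         data = Orange)
fit <- nlme(circumference ~ SSlogis(age, Asym, xmid, scal),
            data = Orange,
            fixed = Asym + xmid + scal ~ 1,
            start = start_vals)
fixef(fit)   # should be close to the estimates reported above
```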
-m

PS: This is my setting:

  > sessionInfo()
  R version 3.1.1 (2014-07-10)
  Platform: x86_64-apple-darwin13.1.0 (64-bit)
  locale:
  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
  attached base packages:
  [1] stats graphics grDevices utils datasets methods base
  other attached packages:
  [1] nlme_3.1-117 devtools_1.5
  loaded via a namespace (and not attached):
  [1] digest_0.6.4 evaluate_0.5.5 grid_3.1.1 httr_0.5 lattice_0.20-29 memoise_0.2.1 parallel_3.1.1 RCurl_1.95-4.3 stringr_0.6.2 tools_3.1.1 whisker_0.3-2

And regarding my nlme package from CRAN:

  Information on package 'nlme'
  Package: nlme
  Version: 3.1-117
  Date: 2014-03-31
  Priority: recommended
  Title: Linear and Nonlinear Mixed Effects Models
  Description: Fit and compare Gaussian linear and nonlinear mixed-effects models.
  Depends: graphics, stats, R (>= 3.0.0)
  Imports: lattice
  Suggests: Hmisc, MASS
  LazyData: yes
  ByteCompile: yes
  Encoding: UTF-8
  License: GPL (>= 2)
  BugReports: http://bugs.r-project.org
  Author: José Pinheiro [aut] (S version), Douglas Bates [aut] (up to 2007), Saikat DebRoy [ctb] (up to 2002), Deepayan Sarkar [ctb] (up to 2005), EISPACK authors [ctb] (src/rs.f), R-core [aut, cre]
  Maintainer: R-core r-c...@r-project.org
  Repository: CRAN
  Date/Publication: 2014-03-31 10:16:43
  Built: R 3.1.1; x86_64-apple-darwin13.1.0; 2014-07-11 12:37:41 UTC; unix
Re: [R] Could someone recommend a package for time series?
Hi, the CRAN Task View about time series that Pascal just sent is certainly helpful. Personally I think the standard ts() can already do a lot for you. Apart from that, I like zoo, particularly if you have other than yearly frequencies, like quarterly or even irregular frequencies, to handle. This is a pretty nice applied course that uses R for illustration: https://stat.ethz.ch/education/semesters/ss2012/atsa Apart from this course, I think Prof. Rob Hyndman (who also has a blog) has a lot of useful things to say when it comes to time series analysis with R.

best
Matthias Bannert
KOF Swiss Economic Institute, Switzerland

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of jpm miao [miao...@gmail.com]
Sent: Monday, September 29, 2014 11:47 AM
To: Pascal Oettli
Cc: mailman, r-help
Subject: Re: [R] Could someone recommend a package for time series?

Thanks Pascal! It certainly helps!

2014-09-29 17:10 GMT+08:00 Pascal Oettli kri...@ymail.com:

  Hi Miao, you certainly will find useful answers here: http://cran.r-project.org/web/views/TimeSeries.html
  Regards,
  Pascal Oettli

On Mon, Sep 29, 2014 at 6:05 PM, jpm miao miao...@gmail.com wrote:

  Hi, I've not used R for about one year and don't know much about the updates to the time series related packages. My primary job is to do economic/financial time series data analysis - annual, monthly, daily, etc. I usually read data with the package XLConnect, which can read xls or xlsx files directly. It's excellent. However, I can't find a package to manipulate time series data. For example, I just want to do an easy manipulation, e.g., to label the dates of the data from, say, 1991M10 to 2014M07, then extract part of the data, say, 2005M01 to 2010M12, and do analysis. Is there any package that works well for my purpose? I sometimes need to aggregate monthly data to quarterly data, and I find the aggregate function helpful. In the past I used the packages xts and zoo and didn't find them really user friendly.
  Maybe I haven't mastered them; maybe there are some updates (which I don't know about) now. Could someone recommend a package or provide an example (or just the document, I can read it) for my purpose? Attached is an exemplary data set I talked about.

  Thanks,
  Miao

--
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan
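For the concrete task described (label months 1991M10 to 2014M07, window out 2005M01 to 2010M12, aggregate to quarters), base ts() already suffices. This is my own illustration, with random numbers standing in for the actual data:

```r
# Hedged sketch: a monthly ts, a sub-window, and quarterly aggregation.
set.seed(42)
x <- ts(rnorm(274), start = c(1991, 10), frequency = 12)  # 1991M10 .. 2014M07

sub <- window(x, start = c(2005, 1), end = c(2010, 12))   # 72 monthly values
quarterly <- aggregate(sub, nfrequency = 4, FUN = mean)   # 24 quarterly means
```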
[R] Propensity Score Matching with Restrictions (Matching package)
Dear listers, I am using Jas Sekhon's Matching package to perform propensity score matching and like it a lot so far. Still, I wonder whether it is possible to impose restrictions on the matching. I've seen the restrict parameter, but I doubt this is what I want (or I don't get how to use it neatly). Here's what I do: I pool several years of observations into one dataset and estimate a propensity score using a standard glm / probit approach. I then use the fitted values as propensity scores and perform the matching. Here's what I would like to do (in addition): restrict the matching in a way that only allows for matches within the same year, so that only propensity scores from the same year are considered for a respective match. Note that I don't want to estimate the years separately. Is there a way to do so?

best regards
matthias

---
Matthias Bannert
KOF Swiss Economic Institute
Leonhardstrasse 21
8045 Zürich
Switzerland
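One option (my own sketch, not from a reply in the archive): Matching::Match() takes an exact argument that can flag individual columns of X for exact matching, so appending the year to the score matrix and flagging only that column should keep every match within a year. All variable names below are invented for illustration:

```r
# Hedged sketch with simulated data; pscore stands in for glm fitted values.
library(Matching)

set.seed(1)
n      <- 200
year   <- sample(2008:2010, n, replace = TRUE)   # pooled years
treat  <- rbinom(n, 1, 0.4)
pscore <- plogis(rnorm(n) + 0.5 * treat)         # stand-in propensity score

# distance matching on pscore, exact matching on year
m <- Match(Tr = treat, X = cbind(pscore, year), exact = c(FALSE, TRUE))

# every matched pair should share a year
all(year[m$index.treated] == year[m$index.control])
```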
Re: [R] C.D.F
Dear Diba, you could try package distr; e.g.

  library(distr)
  G1 <- Gammad(scale = 0.7, shape = 0.5)
  G2 <- Gammad(scale = 2.1, shape = 1.7)
  G3 <- G1 + G2  # convolution
  G3

For the convolution, exact formulas are applied if available; otherwise we use FFT. See also http://www.jstatsoft.org/v59/i04/ (will appear soon) resp. a previous version at http://arxiv.org/abs/1006.0764

hth
Matthias

On 11.08.2014 10:17, pari hesabi wrote:

  Hello everybody, can anybody help me to write a program for the CDF of the sum of two independent gamma random variables (convolution of two gamma distributions) with different parameter values (the shape parameters are the same)? Thank you
  Diba
Re: [R] legendre quadrature
You could use package distrEx:

  library(distrEx)
  GLIntegrate(function(x) x^2, lower = -1, upper = 1, order = 50)

hth
Matthias

On 01.05.2014 09:43, pari hesabi wrote:

  Hello everybody, I need to approximate the value of an integral by using Legendre quadrature. I have written a program which doesn't give me a logical answer; can anybody help me and send the correct program? For example, the approximated value of the integral of x^2 on (-1, 1) based on the Legendre quadrature rule:

    integrand <- function(x) { x^2 }
    rules <- legendre.quadrature.rules(50)
    order.rule <- rules[[50]]
    chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1)

  Thank you
  Diba

--
Prof. Dr. Matthias Kohl
www.stamats.de
Re: [R] joint distribution
Have a look at our package distr:

  library(distr)
  x1 <- Norm(mean = 0, sd = 1)
  x2 <- Binom(size = 1, prob = 0.75)
  x3 <- x1 + x2
  plot(x3)
  # to get density, cdf, quantile, and random numbers use
  d(x3)(5)
  p(x3)(0)
  q(x3)(0.975)
  r(x3)(20)
  # you can also have additional coefficients; e.g.
  x4 <- 3*x1 + 2*x2
  plot(x4)

For more details on the computations see http://arxiv.org/abs/1006.0764

hth,
Matthias

On 22.04.2014 13:11, Ms khulood aljehani wrote:

  Hello, I have two independent variables: x1 from normal(0, 1) and x2 from Bernoulli(0.75). I need the density estimation of (b1*x1) + (b2*x2), where b1 and b2 are two fixed coefficients. Thank you

--
Prof. Dr. Matthias Kohl
www.stamats.de
Re: [R] Mistakes in date conversion for future date/time (POSIXct)
Thank you for your answers. They helped me a lot! I used R 3.1.0 RC and the mistake didn't show up. I also made an additional test, the same as David McPearson did: I tried it again with R 3.0.3 on a pc with 2 OS (Windows 7 and Linux). The error showed up at the windows system but not in Linux. So this seems to be a problem related to Windows. Kind regards Matthias -Ursprüngliche Nachricht- Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im Auftrag von David McPearson Gesendet: Sonntag, 6. April 2014 12:19 An: r-help@r-project.org Betreff: Re: [R] Mistakes in date conversion for future date/time (POSIXct) I _do_ see this error - on R 3.0.3 / Win XP however, not on R 2.11.1 / Linux. (Same hardware, 2 x OS, 2 x R versions) Maybe it's peculiar to to 'doze... datetimesequenz - seq.POSIXt(from=as.POSIXct(1960-01-01 00:00), to=as.POSIXct(2100-01-01 00:00), by=1 hour) levels(as.factor(strftime(datetimesequenz, format=%Y))) [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 [14] 1973 1974 ... ... [183] 2154 2155 2157 2158 2159 2160 2161 2162 2167 sessionInfo() R version 3.0.2 (2013-09-25) Platform: i386-w64-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C [5] LC_TIME=English_Australia.1252 attached base packages: [1] grDevices datasets splines graphics stats tcltk utils methods base other attached packages: [1] svSocket_0.9-55 TinnR_1.0-5 R2HTML_2.2.1Hmisc_3.12-2 Formula_1.1-1 [6] survival_2.37-4 loaded via a namespace (and not attached): [1] cluster_1.14.4 fortunes_1.5-0 grid_3.0.2 lattice_0.20-23 rpart_4.1-3 [6] svMisc_0.9-69 tools_3.0.2 Cheers, D. On Fri, 4 Apr 2014 14:19:09 -0700 David Winsemius dwinsem...@comcast.net wrote On Apr 4, 2014, at 9:54 AM, Duncan Murdoch wrote: On 04/04/2014 10:55 AM, Winkler, Matthias wrote: Dear R-users, I'm working on datasets which contain data from the years 1960 to 2100 .. 
I also produced a date/time sequence in R, which showed the same mistakes (see example below). The mistakes occur at the same dates as in my datasets. It's always at the end of March. datetimesequenz <- seq.POSIXt(from=as.POSIXct("1960-01-01 00:00"), to=as.POSIXct("2100-01-01 00:00"), by="1 hour") levels(as.factor(strftime(datetimesequenz, format="%Y"))) [1] 1960 1961 1962 ... [181] 2152 2153 2154 2156 2157 2158 2159 2160 2161 2166 Has anybody experienced the same problem and knows a workaround? I'm using R 3.0.1 under Windows 7 64bit. I also tried this with R 3.0.3, it showed the same problem. Thank you for your help! I don't see this in 3.1.0 beta. Do you? I'm not seeing it on a Mac in 3.0.2 either. max(datetimesequenz) [1] 2100-01-01 PST length(datetimesequenz) [1] 1227241 Duncan Murdoch David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mistakes in date conversion for future date/time (POSIXct)
Dear R-users, I'm working on datasets which contain data from the years 1960 to 2100 with a timestep of one hour. Every year has 365 days, leap years are ignored. After reading the dataset with R I convert the column which contains date/time to POSIXct: as.POSIXct(strptime(MyData[,1], format="%d.%m.%Y : %H")) After that, I divide the data with split() into parts of one year each. Then I recognized that the years for some rows are obviously converted wrongly: they show years larger than 2100 (see example below). I've checked my original dataset, but the dates are correct there. I also produced a date/time sequence in R, which showed the same mistakes (see example below). The mistakes occur at the same dates as in my datasets. It's always at the end of March. datetimesequenz <- seq.POSIXt(from=as.POSIXct("1960-01-01 00:00"), to=as.POSIXct("2100-01-01 00:00"), by="1 hour") levels(as.factor(strftime(datetimesequenz, format="%Y"))) [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 [19] 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 [37] 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 [55] 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 [73] 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 [91] 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 [109] 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 [127] 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 [145] 2105 2107 2109 2110 2111 2112 2113 2114 2115 2117 2118 2120 2121 2122 2124 2125 2126 2128 [163] 2129 2130 2131 2132 2133 2135 2137 2138 2139 2140 2141 2142 2143 2145 2146 2148 2149 2150 [181] 2152 2153 2154 2156 2157 2158 2159 2160 2161 2166 Has anybody experienced the same problem and knows a
workaround? I'm using R 3.0.1 under Windows 7 64bit. I also tried this with R 3.0.3, it showed the same problem. Thank you for your help! Kind regards, Matthias
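A minimal sketch of a workaround, under the assumption (consistent with the end-of-March dates and the Windows-only behaviour reported above) that the garbled years come from daylight-saving transitions: a 365-day no-leap calendar produces local timestamps that do not exist in the OS timezone database. Parsing in a DST-free timezone such as UTC sidesteps this; the sample values below are hypothetical rows of MyData[,1].

```r
## Hypothetical timestamps in the poster's "%d.%m.%Y : %H" layout,
## landing at the end of March where DST transitions occur:
x <- c("31.03.1985 : 02", "31.03.2097 : 02")

## DST-dependent parse (what the original code does):
local_t <- as.POSIXct(strptime(x, format = "%d.%m.%Y : %H"))

## DST-free parse: force UTC so every hour of every day exists
utc_t <- as.POSIXct(x, format = "%d.%m.%Y : %H", tz = "UTC")
format(utc_t, "%Y")   # years remain 1985 and 2097
```

This is an assumption about the cause, not a confirmed diagnosis from the thread, but forcing tz = "UTC" is a standard remedy for hourly series that ignore leap years and DST.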
[R] computationally singular
Hi Weiwei, I found your post on Mahalanobis distances five years ago. Error in solve.default(cov, ...) : system is computationally singular: reciprocal condition number = 1.09501e-25 I have the same problem now - and I don't know how to deal with it. I would be thankful if you could help me with that. All the best, Matthias
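The question above never got an answer in this chunk, so here is a hedged sketch of two standard remedies for a computationally singular covariance matrix (collinear columns or too few observations per variable): a Moore-Penrose pseudoinverse, or a small ridge added to the diagonal. Neither is from the original thread; the data below are simulated.

```r
library(MASS)   # for ginv()

set.seed(1)
X <- matrix(rnorm(50), 10, 5)
X <- cbind(X, X[, 1] + 1e-12 * rnorm(10))  # near-duplicate column -> singular cov
S <- cov(X)

## 1) pseudoinverse instead of solve(S)
Si <- ginv(S)

## 2) ridge regularisation: perturb the diagonal slightly before inverting
S_reg <- S + diag(1e-8, ncol(S))
d2 <- mahalanobis(X, colMeans(X), S_reg)   # squared distances now computable
```

The ridge constant (1e-8 here) is an illustrative choice; in practice it should be small relative to the diagonal of S.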
Re: [R] MPICH2 Rmpi and doSNOW
Hi I have managed to install MPICH2 and Rmpi on my Windows 7 machine. I can also run the following code library(Rmpi) mpi.spawn.Rslaves() 4 slaves are spawned successfully. 0 failed. master (rank 0, comm 1) of size 5 is running on: MyMaster slave1 (rank 1, comm 1) of size 5 is running on: MyMaster slave2 (rank 2, comm 1) of size 5 is running on: MyMaster slave3 (rank 3, comm 1) of size 5 is running on: MyMaster slave4 (rank 4, comm 1) of size 5 is running on: MyMaster mpichhosts() master slave1 slave2 slave3 slave4 localhost localhost localhost localhost localhost mpi.universe.size() [1] 4 mpi.close.Rslaves() [1] 1 library(doSNOW) But every time I try to set up a cluster via cluster <- makeCluster(4, type = "MPI") my computer hangs up and I have to close the R session. Any advice on how I get this running? Thanks in advance sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252 [3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C [5] LC_TIME=German_Switzerland.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Rmpi_0.6-3 loaded via a namespace (and not attached): [1] tools_3.0.1
[R] MPICH2 Rmpi and doSNOW
Hi I have managed to install MPICH2 and Rmpi on my Windows 7 machine. I can also run the following code library(Rmpi) mpi.spawn.Rslaves() 4 slaves are spawned successfully. 0 failed. master (rank 0, comm 1) of size 5 is running on: MyMaster slave1 (rank 1, comm 1) of size 5 is running on: MyMaster slave2 (rank 2, comm 1) of size 5 is running on: MyMaster slave3 (rank 3, comm 1) of size 5 is running on: MyMaster slave4 (rank 4, comm 1) of size 5 is running on: MyMaster mpichhosts() master slave1 slave2 slave3 slave4 localhost localhost localhost localhost localhost mpi.universe.size() [1] 4 mpi.close.Rslaves() [1] 1 library(doSNOW) But every time I try to set up a cluster via cluster <- makeCluster(4, type = "MPI") my computer hangs up and I have to close the R session. Any advice on how I get this running? Thanks in advance sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit)
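No answer to this question appears in the chunk. A hedged workaround, not from the thread: on a single Windows machine a socket cluster usually works where an MPI cluster hangs, and doSNOW registers it the same way, so no MPICH2/Rmpi layer is needed at all.

```r
library(doSNOW)   # loads foreach and snow as dependencies

cl <- makeCluster(4, type = "SOCK")   # socket workers instead of MPI
registerDoSNOW(cl)

## same %dopar% usage as with an MPI-backed cluster
res <- foreach(i = 1:4, .combine = c) %dopar% i^2

stopCluster(cl)
res   # 1 4 9 16
```

If MPI is genuinely required (multiple machines), one thing worth checking is that mpi.close.Rslaves() was not called before makeCluster(), since tearing down the spawned slaves can leave the MPI universe in a state where a new spawn blocks; this is a plausible cause, not a confirmed one.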
Re: [R] Creating new instances from original ones
see function SMOTE in package DMwR hth Matthias On 31.03.2013 10:46, Nicolás Sánchez wrote: I have a question about data mining. I have a dataset of 70 instances with 14 features that belong to 4 classes. As the number in each class is not enough to obtain a good accuracy using some classifiers (svm, rna, knn), I need to oversample the number of instances of each class. I have heard that there is a method to do this. It consists in generating the new instances as follows: new_instance = original_instance + U(epsilon) U(epsilon) is a uniform number in the range [-epsilon, epsilon] and this number is applied to each feature of the dataset to obtain a new instance without modifying the original class. Has anybody used this method to oversample his data? Does anybody have more information about it? Thanks in advance! -- Prof. Dr. Matthias Kohl www.stamats.de
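The jittering scheme described in the question can be sketched in a few lines. This is a minimal illustration of that idea, not the SMOTE algorithm recommended in the reply; the function name and eps value are illustrative, not from the thread.

```r
## Uniform-noise oversampling: draw existing rows at random and perturb
## every feature by U(-eps, eps), keeping the class label unchanged.
jitter_oversample <- function(X, n_new, eps) {
  idx <- sample(nrow(X), n_new, replace = TRUE)
  noise <- matrix(runif(n_new * ncol(X), -eps, eps), n_new, ncol(X))
  X[idx, , drop = FALSE] + noise
}

set.seed(42)
X <- matrix(rnorm(70 * 14), 70, 14)           # 70 instances, 14 features
X_new <- jitter_oversample(X, n_new = 30, eps = 0.05)
dim(X_new)   # 30 x 14
```

SMOTE differs in that it interpolates between a point and one of its nearest same-class neighbours rather than adding independent noise, which is why the reply points to it as the better-studied option.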
Re: [R] create bar chart with different totals in a bar
Hello together, here is another try as a png file. Hope you can now see what I want to do with my bar chart. Your example with ggplot2 works, but it won't help to convert my data like this one: abnr2 11425 11425 11555 11888 / TIME 2 1 1 2 / Cat 1 2 1 2 to: (columns 11425, 11555, 11888; row Cat 1: 2 1 0; row Cat 2: 1 0 2) Thanks for your help From: John Kane [mailto:jrkrid...@inbox.com] Sent: Friday, 8 March 2013 16:29 To: Matthias Weber Subject: RE: RE: [R] create bar chart with different totals in a bar The image did not come through. The list strips off most attachments to reduce the chance of virus or malware. I think a png file will get through. Anyway I still don't quite understand you but does this look like what you want? Note I made a slight change in the data.frame to use ggplot2. If you want to try out the ggplot2 code you will need to install ggplot2 -- use the command install.packages("ggplot2") to do so. Also note that dd is a data.frame rather than your matrix. Again done for ggplot2 ##==# dd <- structure(list(x = 1:4, abnr2 = c(11425, 11425, 11555, 11888), time = c(2, 1, 1, 2), cat = c(1, 2, 1, 2)), .Names = c("x", "abnr2", "time", "cat"), row.names = c(NA, -4L), class = "data.frame") barplot(dd$abnr2, col = c("red","blue","red","blue")) library(ggplot2) p <- ggplot(dd, aes(x = x, y = abnr2, fill = as.factor(cat))) + geom_bar(stat = "identity", position = "dodge") + xlab("Something") ##=# John Kane Kingston ON Canada -Original Message- From: matthias.we...@fnt.de Sent: Fri, 8 Mar 2013 16:01:51 +0100 To: jrkrid...@inbox.com Subject: RE: [R] create bar chart with different totals in a bar Hello John, thanks for your comment. Your code is the way my matrix looks like, yes. What I want to do is that each equal abnr2 is represented in the same bar. Like the picture: So in the end, I have a PDF which contains one bar for each abnr2. If there is one abnr2 with 2 different kinds of "cat" (like 11425) I want to distinguish this difference in the color.
Simplified, it should look like this one: Thanks for your help. Mat -Original Message- From: John Kane [mailto:jrkrid...@inbox.com] Sent: Friday, 8 March 2013 15:42 To: Matthias Weber; r-help@r-project.org Subject: RE: [R] create bar chart with different totals in a bar https://github.com/hadley/devtools/wiki/Reproducibility Is this what your matrix looks like? mat1 <- structure(c(11425, 11425, 11555, 11888, 2, 1, 1, 2, 1, 2, 1, 2), .Dim = c(4L, 3L), .Dimnames = list(NULL, c("abnr2", "time", "cat"))) It is good practice to use dput() to supply sample data. It is not particularly clear what you want to do. You apparently have four entries in the matrix and say that you want to have three bars. How do you want to handle the 11425 value since it has different cats? John Kane Kingston ON Canada -Original Message- From: matthias.we...@fnt.de Sent: Fri, 8 Mar 2013 03:00:39 -0800 (PST) To: r-help@r-project.org Subject: [R] create bar chart with different totals in a bar Hello together, perhaps one of you has an idea how I can do this: I have a matrix like this one: [,1] [,2] [,3] [,4] abnr2 11425 11425 11555 11888 TIME 2 1 1 2 Cat 1 2 1 2 and now I want a bar chart in which one abnr2 is one bar. So my bar chart has to have 3 bars, one for 11425, one for 11555 and one for 11888. In my 11425 bar, the distinction has to be shown. So the value of one column has to have its own color depending on the Cat. Perhaps anyone has an idea? Thanks. Mat -- View this message in context: http://r.789695.n4.nabble.com/create-bar-chart-with-different-totals-i n-a-bar-tp4660703.html Sent from the R help mailing list archive at Nabble.com.
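A hedged sketch of the stacked layout the poster describes (one bar per abnr2, segments coloured by Cat), which the dodged geom_bar in the reply does not produce. The data frame is reconstructed from the matrix in the post; the cross-tabulation step is my addition, not from the thread.

```r
## One stacked bar per abnr2, split by cat, heights taken from TIME.
dd <- data.frame(abnr2 = c(11425, 11425, 11555, 11888),
                 time  = c(2, 1, 1, 2),
                 cat   = factor(c(1, 2, 1, 2)))

tab <- xtabs(time ~ cat + abnr2, data = dd)   # rows: cat, cols: abnr2
tab
##      abnr2
## cat   11425 11555 11888
##   1       2     1     0
##   2       1     0     2

barplot(tab, col = c("red", "blue"), legend.text = TRUE,
        xlab = "abnr2", ylab = "TIME")   # 3 bars; 11425 shows both colours
```

The same plot in ggplot2 would use geom_bar(stat = "identity") without position = "dodge", since stacking is the default position.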
Re: [R] integrate function
use pmin instead of min. hth Matthias On 12.02.2013 16:41, dan wang wrote: Hi All, Can anyone help to explain why the min and max functions don't work directly inside integrate? For example, if I issue the following in R: integrand <- function(x) {min(1-x, x^2)} integrate(integrand, lower = 0, upper = 1) it will return this: Error in integrate(integrand, lower = 0, upper = 1) : evaluation of function gave a result of wrong length However, as min(U,V) = (U+V)/2 - abs(U-V)/2, the code below works: integrand <- function(x){(1-x+x^2)/2 - 0.5*abs(1-x-x^2)} integrate(integrand, lower = 0, upper = 1) 0.151639 with absolute error < 0.00011 I am confused... Dan -- Prof. Dr. Matthias Kohl www.stamats.de
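Spelling out the one-line advice above: integrate() evaluates the integrand on a whole vector of x values at once, so the integrand must return one value per input. min() collapses the vector to a single number (hence "result of wrong length"), while pmin() compares element-wise.

```r
## Vectorised integrand via pmin: pmin(1 - x, x^2) returns a vector the
## same length as x, which is what integrate() requires.
integrand <- function(x) pmin(1 - x, x^2)
integrate(integrand, lower = 0, upper = 1)
## value is approximately 0.1516, matching the abs()-based rewrite above
```

The same reasoning applies to max()/pmax(), and more generally to any integrand built from functions that reduce a vector to a scalar.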
Re: [R] problem with any function
The warning 1: In (ind.c == TRUE) & (ind.sgn == TRUE) : longer object length is not a multiple of shorter object length means that ind.c and ind.sgn have different lengths, for whatever reason. Although R continues the routine, the warning should, in general, not be ignored. Try: 1:3 + 1:2 at the command prompt to see what type of unintended things happen. Actually, 1:4 + 1:2 runs silently. Best wishes, Matthias On 17.11.2012 19:56, Rui Barradas wrote: Hello, First of all, that's not an error message, it's a warning. As for your condition, it's too complicated. You don't need to test a logical value for equality to TRUE, it's redundant. And any(x, y) is equivalent to any(x, y, x & y). Try any(ind.c, ind.r, ind.sgn) and you'll be fine. Hope this helps, Rui Barradas On 17-11-2012 18:36, Haris Rhrlp wrote: Dear R users, I have the any function of R any(ind.c, ind.r, ind.sgn) all are logical factors it works fine when any of the three is true but when they are combined it doesn't work. I tried this any(ind.c, ind.r, ind.sgn, ((ind.c==TRUE) & (ind.r==TRUE)), ((ind.c==TRUE) & (ind.sgn==TRUE)), ((ind.r==TRUE) & (ind.sgn==TRUE)), ((ind.c==TRUE) & (ind.r==TRUE) & (ind.sgn==TRUE))) but I have this warning message Warning messages: 1: In (ind.c == TRUE) & (ind.sgn == TRUE) : longer object length is not a multiple of shorter object length 2: In (ind.c == TRUE) & (ind.r == TRUE) & (ind.sgn == TRUE) : longer object length is not a multiple of shorter object length I want to check the three logical factors combined any help will be welcome
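A small illustration of the two points made in the replies, with made-up vectors: any() flattens all of its arguments, so the compound &-terms add nothing, and applying & to logical vectors of incompatible lengths is what triggers the recycling warning.

```r
ind.c   <- c(TRUE, FALSE, FALSE)   # length 3
ind.r   <- c(FALSE, FALSE)         # length 2
ind.sgn <- FALSE                   # length 1

## any() looks at every element of every argument, so this already covers
## all single and combined cases:
any(ind.c, ind.r, ind.sgn)   # TRUE

## whereas element-wise & on lengths 3 and 2 warns, because 3 is not a
## multiple of 2 and recycling is incomplete:
## ind.c & ind.r   # Warning: longer object length is not a multiple ...
```

The warning is therefore a symptom of the vectors having been built with different lengths upstream, which is worth fixing rather than silencing.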
Re: [R] survreg gompertz
On 15/11/12 21:22, David Winsemius wrote: On Nov 15, 2012, at 5:38 AM, Matthias Ziehm wrote: Hi all, Sorry if this has been answered already, but I couldn't find it in the archives or general internet. A Markmail/Rhelp search on: gompertz survreg ...brings this link to a reply by Terry Therneau. Seems to address everything you asked and a bit more http://markmail.org/search/?q=list%3Aorg.r-project.r-help+gompertz+survreg#query:list%3Aorg.r-project.r-help%20gompertz%20survreg+page:1+mid:6xdlsmo272oa7zkw+state:results (Depending on how your mailer breaks URLs you may need to paste it back together.) Thanks! I hadn't read this thread because of the misleading title. However, now I've got follow-up questions on that explanation: On Feb 5, 2010 at 8:48:08 am, Terry Therneau wrote: Subject: Re: [R] Using coxph with Gompertz-distributed survival data. ... the note below talks about how to do so approximately with survreg. It's a note to myself of something to add to the survival package documentation, not yet done, and to my embarrassment the file has a time stamp in 1996. Ah well. Terry Therneau My document A Package for Survival Analysis in S contains statements about how to fit Gompertz and Rayleigh distributions with the survreg routine. Nicholas Brouard, in a recent query to this group, quite correctly states that Therneau's documentation is a little elliptic for people not so familiar with extreme value theory. I've spent the last day trying to work out concrete examples of the fits. Let me start by saying that I now think my paper's remarks were overly optimistic. This note will try to indicate why. I will use some TeX notation below: \alpha, \beta, etc for Greek letters.
Hazard functions: Weibull: p*(\lambda)^p * t^(p-1) Extreme value: (1/ \sigma) * exp( (t- \eta)/ \sigma) Rayleigh: a + bt Gompertz: b * c^t Makeham: a + b* c^t The Makeham hazard seems to fit human mortality experience beyond infancy quite well, where a is a constant mortality which is independent of the health of the subject (accidents, homicide, etc) and the second term models the Gompertz assumption that the average exhaustion of a man's power to avoid death is such that at the end of equal infinitely small intervals of time he lost equal portions of his remaining power to oppose destruction which he had at the commencement of these intervals. For older ages a is a negligible portion of the death rate and the Gompertz model holds. The fitting routine depends on the decomposition Y = \eta + \sigma W, where \eta = \beta_0 + \beta_1 * X_1 + \beta_2 * X_2 + ... is the fitted linear predictor and W is a distribution in standard form. For instance, if the response time t is Weibull, then y = log(t) follows this with \eta = log(\lambda) \sigma = 1/p Clearly 1. The Weibull distribution with p=2 (sigma=.5) is the same as a Rayleigh distribution with a=0. It is not, however, the most general form of a Rayleigh. 2. The (least) extreme value and Gompertz distributions have the same hazard function, with \sigma = 1/ log(c), and exp(-\eta/ \sigma) = b. If I do the math correctly from the extreme value hazard given above to the Gompertz hazard, it needs to be b = 1/ \sigma * exp(-\eta/ \sigma). Otherwise the 1/ \sigma of the extreme value hazard is missing, isn't it? It would appear that the Gompertz can be fit with an identity link function combined with the extreme value distribution. However, this ignores a boundary restriction.
If f(x; \eta, \sigma) is the extreme value distribution with parameters \eta and \sigma, then the definition of the Gompertz density is g(x; \eta, \sigma) = 0 for x < 0, g(x; \eta, \sigma) = c f(x; \eta, \sigma) for x >= 0, where c = exp(exp(-\eta / \sigma)) is the necessary constant so that g integrates to 1. Here I got really lost where the additional double exp suddenly comes from and how it fits in. Given the above I would have thought that: g(x,b,c) = f(x, \eta=-1/log(c)*log(b*1/log(c)), \sigma= 1/log(c) ) for x >= 0 Can anyone clarify these things to me, please? Matthias If \eta / \sigma is far from 1, then the correction term will be minimal and survreg should give a reasonable answer. If not, the distribution can't be fit, nor can it be made to easily conform to the general fitting scheme of the program. The Makeham distribution falls into the gamma family (equation 2.3 of Kalbfleisch and Prentice, Survival Analysis), but with the same range restriction problem. In summary, the Gompertz is a truncated form of the extreme value distribution (Johnson, Kotz and Balakrishnan, Continuous Univariate Distributions, section 22.8). If one ignores the truncation, i.e., assumes that negative time values are possible, then it can be fit with survreg. My original note seems to have been compounded of 3 errors: the -1 arises from confusing the maximal extreme distribution (most common in theory books) with the minimal extreme distribution (used in survreg), the log() term was a typing mistake, and I never noticed
[R] survreg gompertz
Hi all, Sorry if this has been answered already, but I couldn't find it in the archives or general internet. Is it possible to implement the Gompertz distribution as a survreg.distribution to use with survreg of the survival library? I haven't found anything and recent attempts from my side weren't successful so far. I know that other packages like 'eha' and 'flexsurv' offer functions similar to survreg with Gompertz support. However, due to the run-time environment where this needs to be running in the end, I can't use these packages :( Same question for the Gompertz-Makeham distribution. Many thanks! Matthias
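Following Therneau's note quoted in the reply above, the untruncated Gompertz can be approximated with survreg's built-in minimum extreme value distribution on the time scale itself (identity, not log), recovering b and c from \eta and \sigma afterwards. This is a hedged sketch on simulated data, illustrative only; it ignores the boundary/truncation caveat discussed at length above, and the 1/\sigma factor in b is the relation the poster argues for, which Therneau's shorter formula omits.

```r
library(survival)

set.seed(1)
time <- rweibull(200, shape = 2, scale = 10)  # stand-in survival times, no censoring

## Extreme value distribution applied to time itself (identity link):
fit <- survreg(Surv(time) ~ 1, dist = "extreme")

eta   <- unname(coef(fit))
sigma <- fit$scale

## Back-transform using sigma = 1/log(c) and the hazard relation:
c_hat <- exp(1 / sigma)
b_hat <- exp(-eta / sigma) / sigma  # poster's version; Therneau writes b = exp(-eta/sigma)
```

Whether this approximation is acceptable depends on how much probability mass the fitted extreme value distribution places on negative times, per the correction-term discussion above.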
Re: [R] Two-way Random Effects with unbalanced data
Asaf, I think that the lme4 package on CRAN with its main function lmer() can help you here. It uses restricted ML to find the estimates and should cope with unbalanced data. The two factors of your random effects can be either nested or crossed. Have a look at the book (draft version) on lme4 by Doug Bates, in particular chapter 2. http://lme4.r-forge.r-project.org/lMMwR/ There is a PDF in that directory. By the way, there is another mailing list dedicated to mixed models in R, R-sig-mixed-models. You have better chances of getting a quicker reply there in case you have further questions on lme4. Matthias. On Mon, Oct 29, 2012 at 10:42 PM, asafwe as...@wharton.upenn.edu wrote: Hi there, I am looking to fit a two-way random effects model to an *unbalanced* layout, y_ijk = mu + a_i + b_j + eps_ijk, i=1,...,R, j=1,...,C, k=1,...,K_ij. I am interested first of all in estimates for the variance components, sigsq_a, sigsq_b and sigsq_error. In the balanced case, there are simple (MM, MLE) estimates for these; in the unbalanced setup, this is much more complicated because orthogonality relations no longer exist between row space and column space. Since the covariance structure between the cell means y_{ij.} is much more complicated than in the one-way (unbalanced) case, trying to manually obtain MLEs looks very difficult. I couldn't find anything addressing point estimates for the unbalanced case - neither in the literature nor in a computer program (R or alike). Any idea will be greatly appreciated... Thank you, Asaf -- View this message in context: http://r.789695.n4.nabble.com/Two-way-Random-Effects-with-unbalanced-data-tp4647795.html Sent from the R help mailing list archive at Nabble.com.
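A hedged sketch of the lme4 suggestion above: the model y_ijk = mu + a_i + b_j + eps_ijk with crossed random effects, fit by REML on deliberately unbalanced simulated data. Variable names are illustrative.

```r
library(lme4)

set.seed(1)
n <- 100
d <- data.frame(row = factor(sample(1:6, n, replace = TRUE)),
                col = factor(sample(1:5, n, replace = TRUE)))  # unequal K_ij
d$y <- rnorm(6, sd = 2)[d$row] +   # a_i, sigsq_a = 4
       rnorm(5, sd = 1)[d$col] +   # b_j, sigsq_b = 1
       rnorm(n)                    # eps_ijk

## crossed random intercepts for row and column:
fit <- lmer(y ~ 1 + (1 | row) + (1 | col), data = d)
VarCorr(fit)   # REML estimates of sigsq_a, sigsq_b and the residual variance
```

With only 6 and 5 levels the variance components are estimated imprecisely; the sketch shows the model specification, not a recipe for good estimates at these sizes.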
[R] ylim with only one value specified
Dear R developers, I would like to have R choose the limits of the y-axis semi-automatically, e.g., zero should be included, but the maximum should be chosen depending on the data. Examples: plot(1:10, 1:10) # selects min and max automatically plot(1:10, 1:10, ylim=c(1, 10)) # manual definition plot(1:10, 1:10, ylim=c(0, Inf)) # this would be a nice feature, i.e. lower y limit = 0 defined manually, upper limit = 10 selected automatically The relevant code from plot.default would have to be modified: old: ylim <- if (is.null(ylim)) range(xy$y[is.finite(xy$y)]) else ylim new: ylim <- if (is.null(ylim)) range(xy$y[is.finite(xy$y)]) else if (length(ylim) == 2 && ylim[2] == Inf) c(ylim[1], max(xy$y[is.finite(xy$y)])) else ylim and some more if-else statements if cases like ylim=c(0, -Inf), ylim=c(Inf, 0), ylim=c(-Inf, 0) and ylim=c(-Inf, Inf) [as a replacement for NULL/autoselection] and ylim=c(Inf, -Inf) [autoselection, reversed y axis] should be handled correctly. I would find such a feature useful. Do you think it would interfere with other functions? Thank you for your consideration. Best wishes, Matthias Gondan
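In the meantime the effect is easy to get in user code. A minimal wrapper (hypothetical helper, not part of base R) that pins the lower limit at zero and takes the upper limit from the data:

```r
## plot0: lower y limit fixed at 0, upper limit from the finite data values
plot0 <- function(x, y, ymax = NULL, ...) {
  if (is.null(ymax)) ymax <- max(y[is.finite(y)])
  plot(x, y, ylim = c(0, ymax), ...)
}

plot0(1:10, 1:10)          # same as plot(1:10, 1:10, ylim = c(0, 10))
plot0(1:10, 1:10, ymax = 20)   # manual upper limit still possible
```

This sidesteps the question of how plot.default itself should interpret Inf in ylim, which is the design issue the post raises.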
Re: [R] transform RTs
you might do something like library(SuppDists) t = runif(100, 100, 500) # original RT t_IG = qinvGauss(ecdf(t)(t)-0.5/length(t), 1, 16) plot(density(t_IG)) but what is the purpose of it? Usually reaction times are thought to follow a certain kind of distribution (e.g. an inverse Gaussian distribution). On 29.08.2012 17:54, Katja Böer wrote: Hello, I'm trying to transform reaction times which are not normally distributed to an ex-Gaussian or an inverse Gaussian distribution, but I don't really know how to use the exGAUS() function. Can someone show me a script in which data has been transformed? Thanks in advance k
[R] Font size in geom_dl (using ggplot2)
Hey everyone, I am an R newbie... so sorry for bothering you with simple-to-solve questions ;) I have the following issue: trying to add labels to my scatterplots (with geom_dl in ggplot2). Everything works fine, but after checking every resource I do not find a way to change the font size of my labels. I tried size, cex, fontsize at every position... but it always stays the same. ggplot() + opts(legend.position="none") + xlab("x") + ylab("y") + xlim(-15,15) + ylim(0,6) + theme_complete_bw() + scale_colour_manual(values=cols) + geom_point(data=m, aes(x=x, y=y, colour=s), shape=19, cex=6, alpha=0.3) + geom_dl(method="top.bumptwice", data=m_sig, aes(x=x, y=y, label=gene.name, colour=s, size=10)) + geom_line(data=m_s0, aes(x=x, y=y), linetype=5, colour="grey55", size=0.5) Your help is very much appreciated! Thx.
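No reply appears in this chunk, so a hedged suggestion: in directlabels the method argument may be given as a list, and a cex element in that list scales the labels; a size inside aes() of geom_dl has no effect. A self-contained sketch on a built-in dataset (mtcars stands in for the poster's data):

```r
library(ggplot2)
library(directlabels)

p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  ## cex inside the method list controls the label font size:
  geom_dl(aes(label = factor(cyl)),
          method = list("top.bumptwice", cex = 1.5))
```

If this doesn't take effect in a given directlabels version, swapping "top.bumptwice" for another positioning method such as "last.points" is worth trying; the list-with-cex mechanism is the same.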
Re: [R] What makes R different from other programming languages?
My vote: 1. Symbolic function arguments: fn = function(a, b) { a/b } fn(b=10, a=2) 2. Names for elements of a vector and matrices v = c(a=1, b=2) v['a'] = v['a'] * 2 same for matrices 3. about 10,000 user-contributed packages on CRAN 4. weird things like a = numeric(10) a[1:10] = 1:2 no error message a answer: five times 1:2, which guarantees happy debugging 5. and, of course, much built-in statistical stuff On 20.08.2012 20:02, johannes rara wrote: My intention is to give a presentation about the R programming language for software developers. I would like to ask, what are the things that make R different from other programming languages? What are the specific cases where a Java/C#/Python developer might say Wow, that was neat!? What are the things that are easy in R, but very difficult in other programming languages (like Java)? Thanks, -J
Re: [R] Convolve 2 uniform distributed variables
take a look at the distr package library(distr) U <- Unif() U2 <- U + U # or U2 <- convpow(U, 2) plot(U2) # or curve(d(U2)(x), from = 0, to = 2) Best Matthias On 24.05.2012 08:29, c...@mail.tu-berlin.de wrote: Hi, I want to convolve 2 uniformly distributed [0,1] variables, so that I get this graphic: http://de.wikipedia.org/wiki/Benutzer:Alfred_Heiligenbrunner/Liste_von_Verteilungsdichten_der_Summe_gleichverteilter_Zufallsvariabler (second graphic) if I do u1 <- seq(0,1,length=100) fu1 <- dunif(u1,min=0,max=1) plot(u1,fu1,type="l",xlim=c(-2,2)) u2 <- seq(0,1,length=100) fu2 <- dunif(u2,min=0,max=1) u1u2 <- convolve(u1,rev(u1),type="o") plot(u1u2) it does not work, could you help me please? The point is the convolve function I think, what do I have to type to get the correct convolution of two uniformly distributed [0,1] variables as seen in the second graphic in the given link? Thanks a lot -- Prof. Dr. Matthias Kohl www.stamats.de
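For completeness, the same convolution in base R (a hedged sketch, not from the thread). The bug in the question is that the grid points u1 were convolved rather than the density values fu1, and the result was never scaled by the grid step, which is needed for the discrete convolution to approximate a density.

```r
## Numerical convolution of two U(0,1) densities in base R:
dx <- 0.01
x  <- seq(0, 1, by = dx)
f  <- dunif(x)                               # density values on the grid

g  <- convolve(f, rev(f), type = "o") * dx   # density of U1 + U2 on [0, 2]
xs <- seq(0, 2, length.out = length(g))

plot(xs, g, type = "l")   # triangular density, peaking near x = 1
```

The distr solution above does the same thing analytically, which is why U + U and convpow(U, 2) need no grid or scaling choices.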
Re: [R] ggplot2 - geom_bar
On 23.04.2012 20:55, Brian Diggs wrote: On 4/23/2012 9:24 AM, Matthias Rieber wrote: Hello, I've some problem with the ggplot2. Here's a small example: [...] Is it wrong to use geom_bar with that kind of data? I could avoid this issue when I cast the data.frame, but I like to avoid that solution. There is nothing wrong with using bars with this sort of data. There is a bug in the faceting code of 0.9.0 that will be fixed in 0.9.1 (see https://github.com/hadley/ggplot2/issues/443 ) which caused duplicate rows of data to be dropped when there was faceting. That is what you are seeing in the second example; row 6 is identical to row 7 and is dropped before plotting. One easy workaround until 0.9.1 comes out is to add a unique column to the data that is otherwise ignored: molten <- data.frame(date=c('01','01','01','01', '02','02','02','02'), channel=c('red','red','blue','blue', 'red','red','blue','blue'), product=c('product1','product2', 'product1','product2', 'product1','product1', 'product1','product2'), value=c(1,1,1,1, 1,1,1,1), dummy=1:8) Thanks! That worked. Matthias
Re: [R] Compare String Similarity
you should also look at the Bioconductor package Biostrings hth Matthias Am 19.04.2012 18:01, schrieb R. Michael Weylandt: Though if you do decide to use Levenshtein, it's implemented here in R: http://finzi.psych.upenn.edu/R/library/RecordLinkage/html/strcmp.html I'm pretty sure this is in C code so it should be mighty fast. Michael On Thu, Apr 19, 2012 at 11:40 AM, Bert Gunter gunter.ber...@gene.com wrote: Wrong list. This is R, not statistics (or linguistics?). Please post elsewhere. -- Bert On Thu, Apr 19, 2012 at 8:05 AM, Alekseiy Beloshitskiy abeloshits...@velti.com wrote: Dear All, I need to estimate the level of similarity of two strings. For example: string1 <- c("depending", "audience", "research", "school") string2 <- c("audience", "push", "drama", "button", "depending") The words in a string may occur in different order, though. What function would you recommend to estimate similarity (e.g., Levenshtein distance)? I'd appreciate any advice. -Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Prof. Dr. 
Matthias Kohl www.stamats.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
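As a purely base-R alternative (no extra packages), `adist()` computes Levenshtein edit distances, and set operations compare the word lists irrespective of order. A sketch using the strings from the question; the similarity score at the end is one crude, made-up choice, not a standard measure:

```r
string1 <- c("depending", "audience", "research", "school")
string2 <- c("audience", "push", "drama", "button", "depending")

# edit distance between every pair of words (4 x 5 matrix)
d <- adist(string1, string2)

# order-insensitive overlap of the two word lists
common <- intersect(string1, string2)   # "depending" and "audience"

# crude similarity: shared words / total distinct words (Jaccard index)
sim <- length(common) / length(union(string1, string2))
```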
Re: [R] Sweave UTF-8 problem
try: Sweave("sim_pi.Rnw", encoding = "utf8") Best, Matthias On 15.04.2012 11:41, Philippe Grosjean wrote: Hello, Have you tried to put that command in a comment: %\usepackage[utf8]{inputenc} I haven't tested it in this particular case, but it works in some other situations. Best, Philippe ..¡})) ) ) ) ) ) ( ( ( ( ( Prof. Philippe Grosjean ) ) ) ) ) ( ( ( ( ( Numerical Ecology of Aquatic Systems ) ) ) ) ) Mons University, Belgium ( ( ( ( ( .. On 14/04/12 22:37, Mark Heckmann wrote: Hi, I work on MacOS, trying to Sweave a UTF-8 document. As far as I remember, R 2.14 used to produce a warning when the encoding was not declared when using Sweave. With R 2.15 it seems to produce an error. Sweave("sim_pi.Rnw") Error: 'sim_pi.Rnw' is not ASCII and does not declare an encoding Declaring an encoding by adding a line like \usepackage[utf8]{inputenc} in the preamble does the job. In my case, though, the .Rnw document does not have a preamble, as it is just one chapter. All chapters are Sweaved separately (due to computation time). Hence I cannot inject the above line, as LaTeX will cause an error afterwards (\usepackage{} is only allowed in the preamble, which appears only once in the main document, not in each chapter). How can I get around this without using the terminal for Sweaving, e.g. R CMD Sweave --encoding=utf-8 sim_pi.Rnw ? Thanks Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Prof. Dr. 
Matthias Kohl www.stamats.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] line width in legend of interaction.plot
Dear R developers, The following command produces an interaction plot with lwd=2. interaction.plot(c(1, 2, 1, 2), c(1, 1, 2, 2), 1:4, lwd=2) In the legend, however, lwd seems to be 1, which does not seem to be intended behavior. Probably the lwd is not correctly forwarded to legend; from the interaction.plot source: legend(xleg, yleg, legend = ylabs, col = col, pch = if (type %in% c("p", "b")) pch, lty = if (type %in% c("l", "b")) lty, bty = leg.bty, bg = leg.bg) - here I would add lwd set to the lwd passed via the ... argument, or perhaps a new leg.lwd argument. Best wishes, Matthias Gondan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
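Until the legend code forwards `lwd`, one workaround is to suppress the built-in legend and draw one by hand with a matching line width. A sketch only: the legend position, labels, and the label/lty pairing below are my own choices for this toy data and should be checked against the drawn lines, since `interaction.plot` assigns line types to trace levels internally.

```r
x <- c(1, 2, 1, 2)   # x factor
g <- c(1, 1, 2, 2)   # trace factor
y <- 1:4
# legend = FALSE suppresses the internal (thin-lined) legend ...
interaction.plot(x, g, y, lwd = 2, legend = FALSE)
# ... so we can draw a legend whose lwd matches the plotted lines
legend("topleft", legend = c("2", "1"), lty = 1:2, lwd = 2, title = "g")
```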
[R] Reference for dataset colon (package survival)
Dear R team, dear Prof. Therneau, library(survival) data(colon) ?colon gives me only a very rudimentary source (only a name). Is there a possibility to get a reference to the clinical trial these data are taken from? Many thanks in advance. With best wishes, Matthias Gondan -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculate quantiles of a custom function
Dear Gerhard, you could also use package distr; e.g. library(distr) ## use the generating function AbscontDistribution D <- AbscontDistribution(d = function(x) dbeta(x, 2, 6) + dbeta(x, 6, 2), low = 0, up = 1, withStand = TRUE) ## quantiles q(D)(seq(0, 1, 0.1)) Best Matthias On 03.01.2012 19:33, Albyn Jones wrote: What do quantiles mean here? If you have a mixture density, say myf <- function(x, p0) p0*dbeta(x, 2, 6) + (1-p0)*dbeta(x, 6, 2) then I know what quantiles mean. To find the Pth quantile, use uniroot to solve for the x such that F(x) - P = 0, where F is the CDF corresponding to myf. albyn Quoting VictorDelgado victor.m...@fjp.mg.gov.br: Gerhard wrote Suppose I create a custom function, consisting of two beta-distributions: myfunction <- function(x) { dbeta(x, 2, 6) + dbeta(x, 6, 2) } How can I calculate the quantiles of myfunction? Thank you in advance, Gerhard Gerhard, if you want to know the quantiles of the new distribution created by myfunction, maybe you can also do: x <- seq(0, 1, .01) # insert your 'x' q <- myfunction(x) # And: quantile(q) 0% 25% 50% 75% 100% 0.00 1.476177 2.045389 2.581226 2.817425 # This gives the sample quantiles. You can also look forward to simulations (as Bert Gunter had suggested) to know better the properties of the distribution quantiles obtained from 'myfunction'. - Victor Delgado cedeplar.ufmg.br Ph.D. student www.fjp.mg.gov.br researcher -- View this message in context: http://r.789695.n4.nabble.com/calculate-quantiles-of-a-custom-function-tp4256887p4257551.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Prof. Dr. 
Matthias Kohl www.stamats.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
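Albyn's uniroot suggestion can be made concrete in base R: normalize the density, integrate it to get a CDF, and invert numerically. A sketch; since the two beta components each integrate to 1, the normalizing constant here is 2.

```r
dens <- function(x) dbeta(x, 2, 6) + dbeta(x, 6, 2)
nc   <- integrate(dens, 0, 1)$value                 # normalizing constant (= 2)
cdf  <- function(q) integrate(dens, 0, q)$value / nc
qfun <- function(p) uniroot(function(q) cdf(q) - p,
                            interval = c(0, 1))$root
qfun(0.5)   # 0.5 by symmetry of the equal-weight mixture
```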
[R] Fwd: Re: Poisson GLM using non-integer response/predictors?
Hi, Use offset variables if you count occurrences of an event and want to adjust for the observation time. glm(count ~ predictors + offset(log(observation_time)), family=poisson) If you want to compare durations, look at library(survival), ?coxph If tnoise_sqrt is the square root of tourist noise, your example seems incorrect, because it is a predictor, not the dependent variable: tnoise_sqrt ~ lengthfeeding_log Best wishes, Matthias Am 30.12.2011 16:29, schrieb Lucy Dablin: Great lists, I always find them useful, thank you to everyone who contributes to them. My question is regarding non-integer values from some data I collected on parrots when using the Poisson GLM. I observed the parrots on a daily basis to see if they were affected by tourist presence. My key predictors are tourist noise (averaged over a day period so a decimal value, square root to adjust for skew), tourist number (the number of tourists at a site, square root), and the number of boats passing the site in a day (log). These are compared with predictors: total number of birds (count data, square root), average time devoted to foraging at site (log), species richness (sqrt), and the number of flushes per day. Apart from the last one they are all non-integer values. When I run a glm, for example: parrots <- glm(tnoise_sqrt ~ lengthfeeding_log, family = poisson) summary(parrots) There are warnings which are 27: In dpois(y, mu, log = TRUE) : non-integer x = 1.889822 I was advised to use the offset() function, however this does not seem to correct the problem and I find the code confusing. What GLM approach should I be using for multiple non-integer predictors and non-integer responses? Does my GLM approach seem appropriate? Thank you for taking the time to consider this. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: Re: Poisson GLM using non-integer response/predictors?
Hi Ben, Thanks for clarifying this. I used a misleading phrase: "model the observation time" sounds as if observation time were the dependent variable, which it is not, of course; in the scenario described, the parrot counts are modeled. Best wishes, Matthias Am 30.12.2011 20:50, schrieb Ben Bolker: Matthias Gondan <matthias-gondan at gmx.de> writes: Hi, Use offset variables if you count occurrences of an event and want to adjust for the observation time. glm(count ~ predictors + offset(log(observation_time)), family=poisson) If you want to compare durations, look at library(survival), ?coxph If tnoise_sqrt is the square root of tourist noise, your example seems incorrect, because it is a predictor, not the dependent variable: tnoise_sqrt ~ lengthfeeding_log Best wishes, Matthias Am 30.12.2011 16:29, schrieb Lucy Dablin: Great lists, I always find them useful, thank you to everyone who contributes to them. My question is regarding non-integer values from some data I collected on parrots when using the Poisson GLM. I observed the parrots on a daily basis to see if they were affected by tourist presence. My key predictors are tourist noise (averaged over a day period so a decimal value, square root to adjust for skew), tourist number (the number of tourists at a site, square root), and the number of boats passing the site in a day (log). These are compared with predictors: total number of birds (count data, square root), average time devoted to foraging at site (log), species richness (sqrt), and the number of flushes per day. Apart from the last one they are all non-integer values. When I run a glm, for example: Your description sounds like you might already have transformed your predictors: generally speaking, you don't want to do that before running a GLM (the variance function incorporated in the GLM takes care of heteroscedasticity, and the link function takes care of nonlinearity in the response). 
I suspect you want total number of birds, number of flushes per day, and species richness to be modeled as Poisson (or negative binomial -- see ?glm.nb in the MASS package). Species richness *might* be binomial, or more complicated, if you are drawing from a limited species pool (e.g. if there are only 5 possible species and you sometimes see 4 or 5 of them in a day). Is the total number of birds really non-integer *before* you square-root transform it? Time devoted to foraging at the site is most easily modeled as log-normal (unless the response includes zeros: i.e., log-transform as you have already done and use lm), or possibly Gamma-distributed (although you may want to use a log link instead of the default inverse link). As Matthias said, offsets are used for the specific case of non-uniform sampling effort (e.g. if you sampled different areas, or for different lengths of time, every day). You may be interested in r-sig-ecol...@r-project.org , which is an R mailing list specifically devoted to ecological questions. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
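The offset idea discussed in this thread can be sketched with simulated data. All variable names and the true coefficient below are made up for illustration; the point is that counts collected over unequal observation times enter through `offset(log(time))`, so the fitted coefficients stay on the per-unit-time scale.

```r
set.seed(1)
n        <- 500
time     <- runif(n, 1, 5)       # hours of observation per day (unequal effort)
tourists <- rpois(n, 10)         # a predictor
# true model: events per hour = exp(0.1 * tourists)
counts   <- rpois(n, lambda = time * exp(0.1 * tourists))
fit <- glm(counts ~ tourists + offset(log(time)), family = poisson)
coef(fit)["tourists"]            # recovers a slope close to the true 0.1
```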
[R] arrow egdes and lty
Dear R developers, I want to draw an arrow in a figure with lty=2. The lty argument also affects the edge of the arrow, which is unfortunate. Feature or bug? Is there some way to draw an arrow with intact edge, still with lty=2? Example code: plot(1:10) arrows(4, 5, 6, 7, lty=2) Best wishes, Matthias -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] image problem in 2.13.1
Dear R people, On my freshly installed R-2.13.1 (winxp), the following code yields unsatisfactory results (irregular grid lines instead of a smooth plane): image(matrix(1, nrow=100, ncol=100)) This is working fine, image(matrix(1, nrow=100, ncol=100), useRaster=TRUE) but then right-click and save as metafile causes a segmentation fault. Everything worked fine on 2.13.0 Best wishes, Matthias -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Defining functions inside loops
Dear Eduardo, try: f <- function(x){} s <- 0.2 body(f) <- substitute({x^2 + s}, list(s = s)) Best, Matthias On 15.02.2011 16:53, Eduardo de Oliveira Horta wrote: Thanks... but I guess I didn't make myself clear. What I was trying to do was precisely to store inside the function the number associated with s[i] rather than the call to s[i], such that I wouldn't need to keep that object in subsequent function calls. In other words, I wanted to use lapply to get functions equivalent to: s <- c(0.2, 0.45, 0.38, 0.9) f <- list() f[[1]] <- function(x) x^2+0.2 f[[2]] <- function(x) x^2+0.45 f[[3]] <- function(x) x^2+0.38 f[[4]] <- function(x) x^2+0.9 Best regards, Eduardo On Tue, Feb 15, 2011 at 7:20 AM, Dennis Murphy djmu...@gmail.com wrote: Hi: If you look at the error message, you'll see that you removed s before evaluating f, and since an element of s is called in the function... Try s <- c(0.2, 0.45, 0.38, 0.9) f <- lapply(1:10, function(i) local({ force(i); function(x) x^2 + s[i] })) f[[1]](s) [1] 0.2400 0.4025 0.3444 1.0100 f is a list with 10 components, the first of which is [[1]] function (x) x^2 + s[i] <environment: 0x02a26d48> Each component occupies a different environment. To see what you get: f[[1]](0.1) [1] 0.21 for(i in 1:10) print(f[[i]](i)) [1] 1.2 [1] 4.45 [1] 9.38 [1] 16.9 [1] NA [1] NA [1] NA [1] NA [1] NA [1] NA for(i in 1:10) print(f[[i]](1)) [1] 1.2 [1] 1.45 [1] 1.38 [1] 1.9 [1] NA [1] NA [1] NA [1] NA [1] NA [1] NA HTH, Dennis On Mon, Feb 14, 2011 at 9:50 PM, Eduardo de Oliveira Horta eduardo.oliveiraho...@gmail.com wrote: Hello again. Let me try something a little more intricate. Let's say instead of forcing evaluation of 'i' I'd want to force evaluation of a vector; for example: s <- c(0.2, 0.45, 0.38, 0.9) f <- lapply(1:10, function(i) local({ force(i); function(x) x^2 + s[i] })) rm(s) f[[1]](0.1) Error in f[[1]](0.1) : object 's' not found Any thoughts? 
Best regards, Eduardo sessionInfo() R version 2.11.1 (2010-05-31) x86_64-pc-mingw32 locale: [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C [5] LC_TIME=Portuguese_Brazil.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Revobase_4.2.0 RevoScaleR_1.1-1 lattice_0.19-13 loaded via a namespace (and not attached): [1] grid_2.11.1 pkgXMLBuilder_1.0 revoIpe_1.0 tools_2.11.1 [5] XML_3.1-0 On Mon, Nov 15, 2010 at 7:10 PM, William Dunlap wdun...@tibco.com wrote: You could make f[[i]] be function(t) t^2+i for i in 1:10 with f <- lapply(1:10, function(i) local({ force(i); function(x) x^2 + i })) After that we get the correct results f[[7]](100:103) [1] 10007 10208 10411 10616 but looking at the function doesn't immediately tell you what 'i' is in the function f[[7]] function (x) x^2 + i <environment: 0x19d7458> You can find it in f[[7]]'s environment get("i", envir=environment(f[[7]])) [1] 7 The call to force() in the call to local() is not necessary in this case, although it can help in other situations. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eduardo de Oliveira Horta Sent: Monday, November 15, 2010 12:50 PM To: r-help@r-project.org Subject: [R] Defining functions inside loops Hello, I was trying to define a set of functions inside a loop, with the loop index working as a parameter for each function. Below I post a simpler example, as to illustrate what I was intending: f <- list() for (i in 1:10){ f[[i]] <- function(t){ f[[i]] <- t^2 + i } } rm(i) With that, I was expecting that f[[1]] would be a function defined by t^2+1, f[[2]] by t^2+2 and so on. However, the index i somehow doesn't get into the function definition on each loop, that is, the functions f[[1]] through f[[10]] are all defined by t^2+i. 
Thus, if I remove the object i from the workspace, I get an error when evaluating these functions. Otherwise, if I don't remove the object i, it ends the loop with value equal to 10 and then f[[1]](t)=f[[2]](t)=...=f[[10]](t)=t^2+10. I am aware that I could simply put f <- function(t, i){ t^2 + i } but that's really not what I want. Any help would be appreciated. Thanks in advance, Eduardo Horta __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code
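Both Bill's `local()/force()` idiom and Matthias's `body()<-` trick bake the value into each function. A compact sketch of the closure version that captures each element of s itself (rather than the expression `s[i]`), so that `rm(s)` afterwards is harmless — which is exactly what the follow-up question asked for:

```r
s <- c(0.2, 0.45, 0.38, 0.9)
f <- lapply(s, function(si) {
  force(si)                 # evaluate si now, inside this environment
  function(x) x^2 + si      # each closure carries its own copy of si
})
rm(s)                       # the functions no longer need s
f[[1]](0.1)   # 0.21
f[[4]](1)     # 1.9
```

Looping over the values of s directly (instead of over indices 1:10) is what makes the captured object a plain number rather than a reference back to s.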
Re: [R] Package distr and define your own distribution
Dear Gabriel, it is possible. You can define a new class or just an object of an already existing class. Did you look at: library(distr) vignette("newDistributions") as well as ?AbscontDistribution ?DiscreteDistribution Please let me know if you have further questions! Best Matthias On 11.02.2011 16:05, gabriel.ca...@ubs.com wrote: Hi all I am using the package distr (and related). Do you know if it is possible to define your own distribution (object) GIVEN that you have an analytical form of the probability density function (pdf)? I would then like to use the standard features of the distr and related packages. Best regards Giuseppe Gabriel Cardi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. Matthias Kohl www.stamats.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] JGR and font antialiasing
Hi, I've noticed that font antialiasing is enabled in the console window, but not in the editor windows. I'm using R version 2.12.1. JGR 1.7.4 and the jdk from Ubuntu 10.04, which claims to be: OpenJDK Runtime Environment (IcedTea6 1.9.5) (6b20-1.9.5-0ubuntu1~10.04.1). Is that a problem with my installation? Has someone else noticed that issue? Regards, Matthias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] random sequences for rnorm and runif
Dear R experts, For a fixed seed, the first random number produced by rnorm and runif has the same rank within the distribution, which I find useful. The following ranks differ, however. set.seed(123) runif(4) [1] *0.2875775* 0.7883051 *0.4089769* 0.8830174 set.seed(123) pnorm(rnorm(4)) [1] 0.2875775 0.4089769 0.9404673 0.5281055 I noticed that rnorm seems to 'eat' two seeds of the random number generator, whereas runif eats only one seed. Is this intended behavior or do you think it should be fixed? The strange thing is that the 1st/3rd/5th etc. number of rnorm corresponds to the 1st/2nd/3rd in runif. If two seeds are necessary, I would have expected the following correspondence: 2-1, 4-2, 6-3, etc. Temporary fix: myrnorm <- function(n, mean = 0, sd = 1) qnorm(runif(n), mean, sd) set.seed(123) pnorm(myrnorm(4)) [1] 0.2875775 0.7883051 0.4089769 0.8830174 Best wishes, Matthias Gondan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [OT] R on Atlas library
Hi Allan, Thank you for your response. Finally I succeeded. These were the commands to compile ATLAS and R: ATLAS: ../configure -t 4 -Fa alg -fPIC make make install (-fPIC is needed to wrap some of the objects of lapack.a into a shared lib) R: ../configure --with-blas="/usr/local/lib/libptf77blas.a /usr/local/lib/libatlas.a -lpthread" make make install Using your benchmark example, all CPUs seem to be used. It is now working at such a high speed that I could not be entirely sure; top refreshes only every 5 s. Cheers Matthias Am 09.08.2010 11:51, schrieb Allan Engelhardt: I don't know about the specific function (a simple, stand-alone reproducible example is always helpful, see the posting guide for details), but ATLAS certainly uses all my cores on a test like this: s <- 7500 # Adjust for your hardware a <- matrix(rnorm(s*s), ncol = s, nrow = s) b <- crossprod(a) # Uses all cores here. My configuration of R with ATLAS on Linux (Fedora) is described at http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html Maybe your distribution has single-threaded ATLAS and you forgot to rebuild it with enable_native_atlas 1 or the equivalent for your platform? Allan On 06/08/10 15:12, Matthias Gondan wrote: Dear List, I am aware this is slightly off-topic, but I am sure there are people who already had the problem and who perhaps solved it. I am running long-lasting model fits using the constrOptim command. At work there is a Linux computer (quad core, Debian) on which I have already compiled R and ATLAS, in the hope that things will go faster on that machine. ATLAS offers the possibility to be compiled for multiple cores; I used that option, but without success. The performance meter (top) for Linux shows 25% CPU usage, and there is no loss of performance if I run 4 instances of R doing heavy matrix multiplications. Performance drops when a 5th instance of R is doing the same job, so I assume my attempt was not successful. I am sure I did something wrong. 
Is there anybody who successfully runs R with ATLAS and all processors? A short description of the necessary steps would be helpful. Searching around the internet was not very encouraging. Some people wrote that it is not so simple to have ATLAS fully exploit a multicore computer. I hope this is not too unspecific. Best wishes, Matthias -- Dr. rer. nat. Matthias Gondan Institut für Psychologie Universität Regensburg D-93050 Regensburg Tel. 0941-943-3856 Fax 0941-943-3233 Email: matthias.gon...@psychologie.uni-regensburg.de http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] optimization subject to constraints
try this (package Rsolnp) library(Rsolnp) g <- function(x) { return(x[1]^2+x[2]^2) } # constraint f <- function(x) { return(x[1]+x[2]) } # objective function x0 <- c(1, 1) solnp(x0, fun=f, eqfun=g, eqB=c(1)) Am 10.08.2010 14:59, schrieb Gildas Mazo: Thanks, but I still cannot solve my problem. Consider this simple example: f <- function(x){ return(x[1]+x[2]) } # objective function g <- function(x){ return(x[1]^2+x[2]^2) } # constraint # I want to maximize f(x) subject to g(x) = 1. By hand the solution is (1/sqrt(2), 1/sqrt(2)), with maximum sqrt(2). This amounts to maximizing a linear function subject to a nonlinear equality constraint. I didn't find any suitable function in the packages I went through. Thanks in advance, Gildas Spencer Graves a écrit : To find every help page containing the term "constrained optimization", you can try the following: library(sos) co <- findFn("constrained optimization") Printing this co object opens a table in a web browser with all matches sorted first by the package with the most matches and with hot links to the documentation pages. writeFindFn2xls(co) This writes an Excel file, with the browser table as the second tab and the first being a summary of the packages. This summary table can be made more complete and useful using the installPackages function, as noted in the sos vignette. A shameless plug from the lead author of the sos package. Spencer Graves On 8/9/2010 10:01 AM, Ravi Varadhan wrote: constrOptim can only handle linear inequality constraints. It cannot handle equality (linear or nonlinear) as well as nonlinear inequality constraints. Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dwayne Blind Sent: Monday, August 09, 2010 12:56 PM To: Gildas Mazo Cc: r-help@r-project.org Subject: Re: [R] optimization subject to constraints Hi! Why not constrOptim? 
Dwayne 2010/8/9 Gildas Mazo gildas.m...@curie.fr Dear R users, I'm looking for tools to perform optimization subject to constraints, both linear and non-linear. I don't mind which algorithm may be used; my primary aim is to get something general and easy-to-use to study simple examples. Thanks for helping, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
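One detail worth noting for this thread: solnp() *minimizes* its objective, so to maximize f subject to g(x) = 1 the objective must be negated. A sketch, assuming Rsolnp is installed; the second half is a base-R cross-check of the hand solution via the parameterization x = (cos t, sin t), which turns the constrained problem into an unconstrained one-dimensional one:

```r
library(Rsolnp)
f_neg <- function(x) -(x[1] + x[2])   # negate to maximize x1 + x2
g     <- function(x) x[1]^2 + x[2]^2  # equality constraint g(x) = 1
res   <- solnp(c(1, 1), fun = f_neg, eqfun = g, eqB = 1)
res$pars                              # both components near 1/sqrt(2)

# base-R cross-check on the unit circle x = (cos t, sin t)
t_star <- optimize(function(t) -(cos(t) + sin(t)), c(0, pi))$minimum
c(cos(t_star), sin(t_star))           # also near (0.7071, 0.7071)
```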
Re: [R] optimization subject to constraints
try the command solnp in package Rsolnp Am 09.08.2010 18:56, schrieb Dwayne Blind: Hi! Why not constrOptim? Dwayne 2010/8/9 Gildas Mazo gildas.m...@curie.fr Dear R users, I'm looking for tools to perform optimization subject to constraints, both linear and non-linear. I don't mind which algorithm may be used; my primary aim is to get something general and easy-to-use to study simple examples. Thanks for helping, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [OT] R on Atlas library
Dear List, I am aware this is slightly off-topic, but I am sure there are people who already had the problem and who perhaps solved it. I am running long-lasting model fits using the constrOptim command. At work there is a Linux computer (quad core, Debian) on which I have already compiled R and ATLAS, in the hope that things will go faster on that machine. ATLAS offers the possibility to be compiled for multiple cores; I used that option, but without success. The performance meter (top) for Linux shows 25% CPU usage, and there is no loss of performance if I run 4 instances of R doing heavy matrix multiplications. Performance drops when a 5th instance of R is doing the same job, so I assume my attempt was not successful. I am sure I did something wrong. Is there anybody who successfully runs R with ATLAS and all processors? A short description of the necessary steps would be helpful. Searching around the internet was not very encouraging. Some people wrote that it is not so simple to have ATLAS fully exploit a multicore computer. I hope this is not too unspecific. Best wishes, Matthias -- Dr. rer. nat. Matthias Gondan Institut für Psychologie Universität Regensburg D-93050 Regensburg Tel. 0941-943-3856 Fax 0941-943-3233 Email: matthias.gon...@psychologie.uni-regensburg.de http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to generate a random data from a empirical distribition
Hi Dennis, you should take a look at the CRAN task view on distributions http://cran.r-project.org/web/views/Distributions.html Besides that, our distr family of packages might be useful; see also http://www.jstatsoft.org/v35/i10/ http://cran.r-project.org/web/packages/distrDoc/vignettes/distr.pdf Best, Matthias On 27.07.2010 10:37, Dennis Murphy wrote: Hi: On Mon, Jul 26, 2010 at 11:36 AM, xin wei xin...@stat.psu.edu wrote: hi, this is more a statistical question than an R question, but I do want to know how to implement this in R. I have 10,000 data points. Is there any way to generate an empirical probability distribution from them (the problem is that I do not know what exactly this distribution follows: normal, beta?). My ultimate goal is to generate an additional 20,000 data points from this empirical distribution created from the existing 10,000 data points. thank you all in advance. The problem, it seems to me, is the leap of faith you're taking that the empirical distribution of your manifest sample will serve as a useful data-generating mechanism for the 20,000 future observations you want to take. I would think that, if you intend to take a sample of 20,000 from ANY distribution, you would want some confidence in the specification of said distribution. Even if you don't know exactly what type of population distribution you're dealing with, there are ways to narrow down the set of possibilities. What is the domain/support of the distribution? For example, the Normal is defined on all of R (as in the real numbers, not our favorite statistical programming language), whereas the lognormal, Gamma and Weibull distributions, among others, are defined on the nonnegative reals. The beta distribution is defined on [0, 1]. Therefore, knowledge of the domain is useful in and of itself. Is it plausible that the distribution is symmetric, or should it have a distinct left or right skew? (Similar comments apply to discrete distributions.) 
Is censoring or truncation a relevant concern? If there is a random process that well describes how the data you observe are generated, that will certainly narrow down the class of potential data-generating mechanisms/distributions. Once you've narrowed down the class of possible distributions as much as possible, you could look into the fitdistr() function in MASS or the fitdistrplus package on CRAN to test which candidates are plausible with respect to your existing sample and which are not. You are not likely to be able to narrow it down to one family of distributions, but you should have a much better idea about the characteristics of the distribution that gave rise to your sample of 10,000 (assuming, of course, that it is a *random* sample) after going through this exercise, which you can apply to the generation of the next 20,000 observations. OTOH, if your existing 10,000 observations were not produced by some random process, all bets are off. HTH, Dennis

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. Matthias Kohl www.stamats.de
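If one does go ahead with the empirical distribution, the resampling itself is short; a minimal sketch (the gamma sample here is only a stand-in for the real 10,000 points, and fitdistr() is the MASS route Dennis mentions):

```r
library(MASS)  # for fitdistr()

set.seed(1)
x <- rgamma(10000, shape = 2, rate = 0.5)  # stand-in for the observed sample

# 1) Plain bootstrap: draw 20,000 values from the empirical distribution,
#    i.e. resample the observed points with replacement.
new1 <- sample(x, size = 20000, replace = TRUE)

# 2) Smoothed bootstrap: add kernel noise with the bandwidth chosen by
#    density(), so new values can also fall between the observed ones.
bw <- density(x)$bw
new2 <- sample(x, size = 20000, replace = TRUE) + rnorm(20000, sd = bw)

# 3) Parametric route: fit candidate families, then simulate from the fit.
fit  <- fitdistr(x, "gamma")
new3 <- rgamma(20000, shape = fit$estimate[["shape"]],
               rate = fit$estimate[["rate"]])
```

The plain bootstrap can only ever reproduce the observed values; the smoothed bootstrap and the parametric fit interpolate between them, which is usually what "generate from the distribution" is meant to achieve.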
Re: [R] Robust Asymptotic Statistics (RobASt)
Dear Alex, after installation of ROptRegTS there should be a folder "scripts" in the package directory of ROptRegTS which includes some further examples. If you are assuming normal errors, you should switch to package RobRex, which is optimized for normal errors. There you will also find a folder "scripts" in the package directory. In general, multivariate regression should work, but I have to admit that we haven't tried many multivariate examples so far. Besides that, the user interface is still rudimentary at the moment, and some ideas which would clearly reduce computation time (as in the case of package RobLox) are unfortunately not yet implemented. So, if you are willing to accept these drawbacks, I will be glad to give you a hand getting our estimators to work for your data. Best, Matthias

On 06.06.2010 06:47, Alex Weslowski wrote: Hi all, other than http://www.stamats.de/F2006.r are there other good, simple examples out there of using the ROptRegTS package (part of the RobASt project)? I'm hoping to plug it in for multivariate regression. Or is this not a good idea? Just trying to find out how it compares to rlm, lts, glm, etc. Hopefully this makes sense; I'm new to the world of statistics and R. Thanks! St0x -- Dr. Matthias Kohl www.stamats.de
[R] Problem with format(, "%G-%V") - Segfault
Hi, I have some problems with the new R version when converting dates to year-week:

R version 2.11.0 (2010-04-22) Copyright (C) 2010 The R Foundation for Statistical Computing ISBN 3-900051-07-0

load("test.data.R")
str(test.data)
 Class 'Date'  num [1:3599546] 13888 14166 14188 14189 14189 ...
result <- format(test.data, format = "%G-%V")
Error: segfault from C stack overflow

This happens both with a self-compiled R version and with the Debian packages. The same operation works with R 2.9.x and R 2.10.x. I've attached the backtrace, condensed here to the informative frames (the full register dumps are omitted):

Core was generated by `/opt/R/lib64/R/bin/exec/R --vanilla --no-save'.
Program terminated with signal 11, Segmentation fault.
#0 __strftime_internal (s=..., maxsize=256, format=..., tp=..., loc=0x7f855f9b0580) at strftime_l.c:1046
   (bufp = "2010", f = "G-%V", p = "2010-06", subfmt = 0x7f85fffd Address 0x7f85fffd out of bounds)
#1 *__GI___strftime_l (s = "2010-06", format = 0x4 Address 0x4 out of bounds, ...) at strftime_l.c:490
#2 do_formatPOSIXlt (...) at datetime.c:809
   (q = "%G-%V", N = 3599546, buff = "2010-06",
    tm = {tm_mday = 13, tm_mon = 1, tm_year = 110, tm_wday = 6, tm_yday = 43, tm_isdst = 0})
#3 do_internal (...) at names.c:1185
#4 Rf_eval (...) at eval.c:464
#5 do_begin (...) at eval.c:1245
#6 Rf_eval (...) at eval.c:464
#7 Rf_applyClosure (...) at eval.c:699

Kind regards, Matthias
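Until the %G/%V path is fixed, the ISO 8601 year-week can be computed without going through strftime() at all; a workaround sketch (iso_year_week is a hypothetical helper, not part of base R), using the fact that a date's ISO week is the week containing that date's Thursday:

```r
# Compute the ISO 8601 year-week ("%G-%V" equivalent) in pure R.
# The ISO week of a date is the week containing that date's Thursday;
# its number is the Thursday's day-of-year divided by 7 (from 0), plus 1.
iso_year_week <- function(d) {
  # %u gives the ISO weekday: 1 = Monday .. 7 = Sunday
  thursday <- d + (4L - as.integer(format(d, "%u")))
  year <- as.integer(format(thursday, "%Y"))
  jan1 <- as.Date(paste0(year, "-01-01"))
  week <- as.integer(thursday - jan1) %/% 7L + 1L
  sprintf("%d-%02d", year, week)
}

iso_year_week(as.Date("2010-02-13"))  # "2010-06", as in the backtrace buffer
```

The function is vectorized over d, so it can be applied directly to the 3.6 million dates; it also handles the year-boundary cases where the ISO year differs from the calendar year.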
Re: [R] Nonparametric generalization of ANOVA
This is your first of three postings in the last hour, and they are all in a category that could well be described as requests for tutoring in basic statistical topics. I am of the impression you have been requested not to engage in such behavior on this list. For this question, for instance, there is an entire CRAN Task View available, and you in particular have been asked to use such resources before posting.

Please allow me to ask for details on this task view, because I am interested in the topic of nonparametric ANOVAs as well. To my knowledge, there are some R scripts from Brunner et al. available on his website http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html but they seem not to work with current R versions. Best regards, Matthias Gondan
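For the plain one-way case (independent groups), base R already ships a nonparametric ANOVA analogue, the Kruskal-Wallis rank sum test; a minimal sketch with simulated data:

```r
# Kruskal-Wallis test: nonparametric analogue of one-way ANOVA.
set.seed(42)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1), rnorm(10, mean = 2))
g <- factor(rep(c("a", "b", "c"), each = 10))
res <- kruskal.test(y ~ g)
res
```

This covers only single-factor designs; the rank methods of Brunner et al. referenced above are what extends this kind of inference to factorial and longitudinal layouts.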
[R] Accessing named elements of a vector
Dear R developers, A great R feature is that elements of vectors, lists and data frames can have names:

vx <- c(a = 1, b = 2)
lx <- list(a = 1, b = 2)

Accessing element a of vx: vx["a"]. Accessing element a of lx: lx[["a"]] or lx$a. It might be a matter of taste, but I like the $ very much. Unfortunately, vx$a is not functional. Would it break existing compatibility if $ were allowed to access elements of a vector as well? Best regards, Matthias
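For reference, a short sketch of what works today, under the current semantics where $ is defined only for recursive objects (lists, environments, data frames):

```r
vx <- c(a = 1, b = 2)

vx["a"]    # single-bracket: returns a length-1 vector, keeping the name
vx[["a"]]  # double-bracket: drops the name, returns the bare value 1
# vx$a     # error: "$ operator is invalid for atomic vectors"

# Converting to a list makes $ available:
lx <- as.list(vx)
lx$a       # 1
```

So today the choice is between vx[["a"]] on the vector itself, or converting to a list first to get $ access.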
[R] [R-pkgs] new package RFLPtools
The new package RFLPtools is available on CRAN. RFLPtools provides analysis functions for DNA fragment molecular weights (e.g. derived from RFLP analysis) and nucleotide sequence similarities. It aims mainly at the identification of similar or identical fragment patterns, to evaluate the number of different genotypes obtained from environmental samples during diversity studies, and at further analysis of similarities between nucleotide sequences derived from pairwise sequence alignments (e.g. from standalone BLAST). Additionally, functions to check the data quality of molecular fingerprints are available. To identify the organisms represented by the extracted fragment patterns (e.g. RFLP genotypes), RFLPtools includes a function to compare samples with fragment patterns stored in a reference dataset for which the taxonomic affiliations are already known. To identify unknown samples in scientific projects, DNA sequences are used, and the obtained DNA sequences are compared to existing sequence data via alignment tools such as standalone BLAST. RFLPtools offers tools to generate groups based on the tabular report files of pairwise nucleotide sequence alignments. To get a first impression, try:

vignette("RFLPtools")

as well as

library(RFLPtools)
example(RFLPplot)
example(RFLPrefplot)

Best regards, Fabienne, Alexandra, Matthias -- Dr. Matthias Kohl www.stamats.de
[R] [R-pkgs] new version of RobASt-family of packages
The new version 0.7 of our RobASt family of packages has been available on CRAN for several days. As there were many changes, we will only sketch the most important ones here. For more details see the corresponding NEWS files (e.g. news(package = "RobAStBase"), or use function NEWS from package startupmsg, i.e. NEWS("RobAStBase")). First of all, the new package RobLoxBioC was added to the family, which includes S4 classes and methods for preprocessing omics data, in particular gene expression data.

## All packages (RandVar, RobAStBase, RobLox, RobLoxBioC, ROptEst, ROptEstOld, ROptRegTS, RobRex) ##
- TOBEDONE file was added as a starting point for collaborations. The file can be displayed via function TOBEDONE from package startupmsg; e.g. TOBEDONE("distr")
- tests/Examples folder for some automatic testing was introduced

## Package RandVar ##
- mainly fixing of warnings and bugs

## Package RobAStBase ##
- enhanced plotting, in particular methods for qqplot
- unified treatment of NAs
- extended implementation for total variation neighbourhoods
- implementation of k-step estimator construction extended

## Package RobLox ##
- na.rm argument added
- introduction of finite-sample correction

## Package ROptEst ##
- optional use of an alternative algorithm to obtain Lagrange multipliers using duality-based optimization
- extended implementation for total variation neighbourhoods
- solutions for general parameter transformations with nuisance components
- several extensions to the examples in folder scripts
- implementation of k-step estimator construction extended

## Package ROptEstOld ##
- still needed for packages ROptRegTS and RobRex
- removed Symmetry and DistributionSymmetry implementation to make ROptEstOld compatible with distr 2.2

## Package ROptRegTS ##
- still depends on ROptEstOld

## Package RobRex ##
- moved some of the examples into \dontrun{} to reduce check time
- some minor corrections in ExamplesEstimation.R in folder scripts

Best, Peter, Matthias -- Dr. Matthias Kohl www.stamats.de
[R] [R-pkgs] new version of distr-family of packages
The new version 2.2 of our distr family of packages has now been available on CRAN for several days. As there were many changes, we will only sketch the most important ones here. For more details see the corresponding NEWS files (e.g. news(package = "distr"), or use function NEWS from package startupmsg, i.e. NEWS("distr")). First of all, a new package distrEllipse was added to the family, which includes S4 classes and methods for elliptically symmetric distributions.

## All packages (distr, distrEx, distrSim, distrTEst, distrTeach, distrMod, distrEllipse, distrDoc as well as startupmsg and SweaveListingUtils) ##
- TOBEDONE file was added as a starting point for collaborations. The file can be displayed via function TOBEDONE from package startupmsg; e.g. TOBEDONE("distr")
- tests/Examples folder for some automatic testing was introduced

## Package distr ##
- new diagnostic function (more specifically, S4 method) qqplot() for quantile-quantile plots that works with distribution objects as arguments; in addition to the functionality of qqplot() from package stats, it offers point-wise and simultaneous confidence bands
- under the hood: a new slot symmetry was introduced for distributions

## Package distrEx ##
- enhanced E (expectation), m1df, m2df (clipped 1st and 2nd moments) methods: you can now specify lower and upper integration bounds in the E(.) and var(.) functions
- new class GPareto for the generalized Pareto distribution (implemented by Nataliya Horbenko)

## Package distrMod ##
- methods for qqplot() with parametric model objects as arguments
- unified treatment of NAs

## Package SweaveListingUtils ##
- commands:
  * SweaveListingPreparations() is more flexible [see help / NEWS file]
  * integrated Andrew Ellis's nice ideas into SweaveListingUtils to use the \lstenvironment
  * individual markup style for base or recommended packages (checked for with new function isBaseOrRecommended()) is distinct now by default (extra color RRecomdcolor)
  * temporary changes (like background color) made easier by new functions:
    + lstdefRstyle to change Rstyle
    + lstsetRall to change all R-like styles (i.e. Rstyle, Rinstyle, Routstyle, Rcodestyle)
  * markup styles for Sinput, Soutput, Scode can now be specified separately as Rin, Rout, Rcode
  * colors now have the suffix "color", i.e. Rcomment becomes Rcommentcolor, Rout becomes Routcolor
- vignette:
  * included an example with escape sequences
  * included an example with framed code

Best, Peter, Matthias -- Dr. Matthias Kohl www.stamats.de
[R] plot filled.contour over continent map
Dear all, as a newbie in R I would like to do the following (simple?) thing: plot a filled.contour plot over a map showing country boundaries (e.g. for Europe). What I do is:

map('worldHires', xlim = c(-10, 40), ylim = c(35, 70), boundary = TRUE, border = 0.1)
map.axes()
filled.contour(mslp, zlim = c(1000, 1020),
               color.palette = colorRampPalette(c("blue", "white", "red")),
               main = "Averaged MSLP (hPa) ERA40 JJA [1996-2002]",
               xlab = "Longitude", ylab = "Latitude")

where mslp comes from a NetCDF file with mean sea level pressure for a range of lat/lon values. If I run the above, I just get the map of Europe, without the contour plot. When commenting out the map statements, I can see the contour plot. So I am doing something wrong, but I really have no idea what. Could anybody help me out here? Thanks in advance, Matthias - Department of Earth and Environmental Sciences, Physical and Regional Geography Research Group, Regional climate studies, Celestijnenlaan 200E, 3001 Heverlee (Leuven), BELGIUM, Tel: + 32 16 326424, Fax: + 32 16 322980, http://geo.kuleuven.be/aow/ www.kuleuven.be/geography
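One common fix is a sketch along the following lines (assuming the maps package and the usual list(x, y, z) structure for the field; a synthetic mslp stands in for the real NetCDF data): filled.contour() sets up its own coordinate system and layout, so anything drawn before it is wiped. The map therefore has to be drawn inside the plot region via the plot.axes argument, with add = TRUE:

```r
library(maps)

# Synthetic stand-in for the NetCDF field, in the form filled.contour()
# accepts: a list with x, y and a z matrix over the x/y grid.
mslp <- list(x = seq(-10, 40, length.out = 50),
             y = seq(35, 70, length.out = 50),
             z = outer(1:50, 1:50, function(i, j) 1000 + 20 * (i + j) / 100))

filled.contour(mslp, zlim = c(1000, 1020),
               color.palette = colorRampPalette(c("blue", "white", "red")),
               main = "Averaged MSLP (hPa)",
               xlab = "Longitude", ylab = "Latitude",
               plot.axes = {
                 axis(1); axis(2)
                 # 'world' ships with maps; 'worldHires' needs package mapdata
                 map("world", xlim = c(-10, 40), ylim = c(35, 70), add = TRUE)
               })
```

Everything inside plot.axes is evaluated after the contour panel has been drawn and with its coordinate system active, so the country boundaries land on top of the colors instead of being overplotted.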
[R] COURSE: Introduction to Metabolomics and biomarker research
Introduction to Metabolomics and biomarker research. The course will take place in Innsbruck, 16th to 17th November 2009, and will be held in German. For more details see: http://www.gdch.de/vas/fortbildung/kurse/fortbildung2009.htm#1175 -- Dr. Matthias Kohl www.stamats.de
Re: [R] programming to calculate variance
I think it should be var(y[i-3:i-1]) instead of var(x[i-3:i-1,]); otherwise the values of the vector are overwritten. Best wishes, Matthias

marlene marchena schrieb: Dear R users, suppose I have the following data:

y <- c(2, 1, 5, 8, 11, 3, 1, 7, 50, 21, 33, 7, 60)
x <- data.frame(y)
for (i in 4:nrow(x)) x[i, ] <- var(x[i-3:i-1, ])

I'm trying to get a new variable with the variance of the 3 previous values (just an example), with NA in the first three positions. I know that my for() is wrong, but I am not able to find my error. Any idea? Thanks, Marlene.

-- Dr. rer. nat. Matthias Gondan Institut für Psychologie Universität Regensburg D-93050 Regensburg Tel. 0941-943-3856 Fax 0941-943-3233 Email: matthias.gon...@psychologie.uni-regensburg.de http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html
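Putting both corrections together, a sketch (rollvar is a hypothetical column name): besides keeping the source values in y, the index range needs parentheses, because i-3:i-1 parses as i - (3:i) - 1 rather than the intended (i-3):(i-1).

```r
y <- c(2, 1, 5, 8, 11, 3, 1, 7, 50, 21, 33, 7, 60)
x <- data.frame(y, rollvar = NA_real_)

# variance of the 3 previous values; the first three entries stay NA
for (i in 4:nrow(x)) x$rollvar[i] <- var(y[(i - 3):(i - 1)])

x$rollvar[4:5]  # var(c(2, 1, 5)) and var(c(1, 5, 8))
```

Storing the result in a separate column also avoids the overwriting problem Matthias points out: each iteration reads the original y, never the values written in earlier iterations.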
[R] Example data for analysis of a highly pseudoreplicated mixed-effects experiment
Hello everybody, it may be better to have sample data, so I have provided data with fewer levels of gene and day and only ca. 400 data points per condition. Sample code:

small <- as.data.frame(read.csv("small.csv"))
small$species <- factor(small$species)
small$gene <- factor(small$gene)
small$day <- factor(small$day)
small$repl <- factor(small$repl)
library(lme4)
model1 <- lmer(APC ~ (FITC+species+gene)^3 + (1|day) + (1|repl), REML = FALSE, data = small)
model2 <- lmer(APC ~ (FITC+species+gene)^2 + (1|day) + (1|repl), REML = FALSE, data = small)
anova(model1, model2)

gives me a significant difference, but in fact there are far too many residual df, and this is much worse in the original data. I suppose I cannot trust this difference.

model3 <- lmer(APC ~ species*gene + (1|day/repl/FITC), data = small)
Error: length(f1) == length(f2) is not TRUE
In addition: Warning messages:
1: In FITC:(repl:day) : numerical expression has 886 elements: only the first used
2: In FITC:(repl:day) : numerical expression has 886 elements: only the first used

Can I do nesting without incurring this kind of error? And finally

model4 <- lmer(APC ~ gene*species + (1|day) + (1|repl) + (1+(gene:species)|FITC), data = small)
model5 <- lmer(APC ~ gene+species + (1|day) + (1|repl) + (1|FITC), data = small)
anova(model4, model5)

works with this reduced data set, but fails to converge for the original, much larger data set. I am unsure whether this is the right kind of analysis for the data and there is only a problem of convergence, or whether it is the wrong formula. Can anybody give me some advice on this problem? Or should I just stick to reducing the data from each condition to a single parameter, as explained in my first email below? Thank you again!

Matthias Gralle wrote: [original message quoted in full; trimmed here, as it appears as a separate posting below]

-- Matthias Gralle, PhD Dept. Evolutionary Genetics Max Planck Institute for Evolutionary Anthropology Deutscher Platz 6 04103 Leipzig, Germany Tel +49 341 3550 519 Fax +49 341 3550 555
Re: [R] Example data for analysis of a highly pseudoreplicated mixed-effects experiment
There were some wrong NA values in the provided data set; this is now corrected. The data can be read in as:

small <- read.csv("small.csv",
                  colClasses = c("character", rep("integer", 2), rep("factor", 5)))

The high number of residual df can be seen using the nlme package (can it be seen in the lme4 package, too?):

library(nlme)
model1 <- lme(APC ~ (FITC+species+gene)^3, random = ~1|day_repl, method = "ML", data = small)
anova(model1)
                  numDF denDF   F-value p-value
(Intercept)           1   789 1365.7400  <.0001
FITC                  1   789 3040.8168  <.0001
species               1   789   24.0521  <.0001
gene                  1   789   10.4035  0.0013
FITC:species          1   789    5.0690  0.0246
FITC:gene             1   789   10.7925  0.0011
species:gene          1   789   72.5823  <.0001
FITC:species:gene     1   789    4.6742  0.0309

In the original data set, denDF is 200,000. Once again, how do I correctly account for pseudoreplication, avoiding the artificially high number of df? Thank you, Matthias

Matthias Gralle wrote: [previous message quoted in full; trimmed here, see above]

-- Matthias Gralle, PhD Dept. Evolutionary Genetics Max Planck Institute for Evolutionary Anthropology Deutscher Platz 6 04103 Leipzig, Germany
[R] Analysis of a highly pseudoreplicated mixed-effects experiment
Hello everybody, I have been trying for some weeks to state the correct design of my experiment as a GLM formula, and have not been able to find anything appropriate in Pinheiro & Bates, so I am posting it here and hope somebody can help me.

Each experimental condition is described by:
1) gene (10 levels, fixed, because of high interest to me)
2) species (2 levels, fixed, because of high interest)
3) day (2 levels, random)
4) replicate (2 levels per day, random)
In each condition I have several thousand data points consisting of two variables:
5) FITC (level of transfection of a cell)
6) APC (antibody binding to the cell)

Because of intrinsic and uncontrollable cell-to-cell variation, FITC varies quite uniformly over a wide range, and APC correlates rather well with FITC. In some cases, I pasted day and replicate together as day_repl.

My question is the following: Is there any gene (in my set of 10 genes) where the species makes a difference in the relation between FITC and APC? If yes, in which gene does species have an effect, and what is the effect of the species difference?

My attempts are the following:

1. Fit the data points of each experimental condition to a linear equation APC = Intercept + Slope*FITC and analyse the slopes:
lm(Slope ~ species*gene*day_repl)
This analysis shows clear differences between the genes, but no effect of species and no gene:species interaction. The linear fit to the cells is reasonably good, but of course does not represent the data set completely, so I wanted to incorporate the complete data set.

2a. lmer(APC ~ FITC*species*gene + (1|day) + (1|repl))
This gives extremely significant values for every interaction and variable because there are 200,000 df. Of course, this cannot be true, because the cells are not really independent. I have tried many variations of the above, e.g.
2b. lmer(APC ~ FITC*species*gene + (1|day) + (1+FITC|day_repl)),
but they all suffer from the excess of df.

3. lmer(APC ~ species*gene + (1|day/repl/FITC)) gives several warning messages like this one:
In repl:day : numerical expression has 275591 elements: only the first used

4. lmer(APC ~ gene*species + (1|day_repl) + (1+gene:species|FITC)) ran for several days, but failed to converge.

Can somebody give me any hint, or do you think the only possible analysis is a simplification as in my model 1? By the way, I am using R version 2.8.0 (2008-10-20) on Ubuntu 8.04 (Linux 2.6.24-24-generic kernel) on different Intel systems, with the lme4 that came with R 2.8.0. Thank you very much for your time!

-- Matthias Gralle, PhD
Dept. Evolutionary Genetics
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6, 04103 Leipzig, Germany
Tel +49 341 3550 519, Fax +49 341 3550 555

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
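[As an aside for readers of the archive: the two-stage approach described as model 1 can be sketched in base R. The data below are simulated and every variable name, the number of levels, and the toy APC-FITC relation are assumptions for illustration, not the poster's actual data.]

```r
set.seed(1)
# one row per experimental condition: species x gene x day/replicate
conds <- expand.grid(species = c("A", "B"), gene = paste0("g", 1:3),
                     day_repl = paste0("d", 1:4))
# stage 1: within each condition, fit APC ~ FITC and keep the slope
conds$Slope <- sapply(seq_len(nrow(conds)), function(i) {
  FITC <- runif(200)                      # toy cell-level data
  APC  <- 2 * FITC + rnorm(200, sd = 0.3) # toy linear relation
  coef(lm(APC ~ FITC))["FITC"]
})
# stage 2: analyse the condition-level slopes
anova(lm(Slope ~ species * gene, data = conds))
```

The condition-level slopes are (approximately) independent, which is what sidesteps the inflated degrees of freedom from treating cells as independent.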
[R] Test for stochastic dominance, non-inferiority test for distributions
Dear R-Users, Is anyone aware of a significance test which allows demonstrating that one distribution dominates another? Let F(t) and G(t) be two distribution functions; the hypotheses would be something like:
alternative hypothesis: F(t) >= G(t), for all t
null hypothesis: F(t) < G(t), for some t.
Best wishes, Matthias
PS. This one would be ok as well:
alternative hypothesis: F(t) > G(t), for all t
null hypothesis: F(t) <= G(t), for some t.
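[Archive note: base R's one-sided two-sample Kolmogorov-Smirnov test is related but not identical to the request above; its null hypothesis is equality of the distributions, with one-sided dominance as the alternative, rather than the non-inferiority-style null asked for. A sketch with simulated data:]

```r
set.seed(42)
x <- rnorm(200)              # sample from F
y <- rnorm(200, mean = 0.5)  # sample from G, shifted to the right
# alternative = "greater" (see ?ks.test): the CDF of x lies above
# that of y, i.e. x is stochastically smaller than y
ks.test(x, y, alternative = "greater")
```

Rejecting here supports one-sided dominance, but the burden of proof is the reverse of the non-inferiority formulation in the post.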
Re: [R] help with median for each row
if your matrix has many rows you might want to consider rowMedians from Bioconductor package Biobase. hth, Matthias

Greg Hirson wrote: Edward, See ?apply
x = matrix(rnorm(100), ncol = 10)
apply(x, 1, median)
Hope this helps, Greg

Edward Chen wrote: Hi, I tried searching on Google for a way to compute the median of each row of an n x n matrix and return the row medians for further computation. Also, if the number of columns in the matrix is even, how can I specify which median to use? Thank you very much!

-- Dr. Matthias Kohl www.stamats.de
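[A base-R sketch of the same computation without Bioconductor; vapply() is equivalent to the apply() call but declares the result type explicitly:]

```r
set.seed(7)
x <- matrix(rnorm(100), ncol = 10)
# row medians via apply()
m1 <- apply(x, 1, median)
# the same via vapply(), with the result type declared up front
m2 <- vapply(seq_len(nrow(x)), function(i) median(x[i, ]), numeric(1))
all.equal(m1, m2)  # TRUE
```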
Re: [R] how to change the quantile method in bwplot
for teaching purposes I wrote a corresponding function; cf. qbxp.stats (as well as qboxplot ...) in package MKmisc. hth, Matthias

Deepayan Sarkar wrote: On Tue, Jul 21, 2009 at 7:47 AM, Jun Shen jun.shen...@gmail.com wrote: Uwe, Thank you for your reply. I am still not very clear about the meanings of the arguments in the stats function. To make it clearer: quantile() uses type=7 as the default method, and I believe this is the method bwplot() uses to calculate the quantiles. I want to use the type=6 method for bwplot(). How do I achieve that? Thanks again.

Maybe this will be clearer: bwplot() uses the boxplot.stats() function to compute the quantiles used, which in turn uses fivenum(), which has its own quantile calculation (and does not explicitly use quantile()). There is no easy way to allow for type=6 etc. here. bwplot() allows you to replace boxplot.stats() and provide your own alternative. So what you need to do is: (1) write a function, say, 'my.boxplot.stats', that takes the same arguments as boxplot.stats() and returns a similar result, but using your preferred calculation for the quantiles. There are many ways to do this. (2) plug this function into the bwplot() call; e.g. bwplot(..., stats = my.boxplot.stats) -Deepayan

Jun 2009/7/21 Uwe Ligges lig...@statistik.tu-dortmund.de Jun Shen wrote: Hi, everyone, Since quantile calculation has nine different methods in R, I wonder how I specify a method when calling bwplot() in lattice. I couldn't find any information in the documentation. Thanks.

bwplot() uses the panel function panel.bwplot(), which allows you to specify a function that calculates the statistics in its argument stats, which defaults to boxplot.stats(). Hence you can change that function. Example with some fixed values:

bwplot( ~ 1:10, stats = function(x, ...) return(list(stats = 1:5, n = 10, conf = c(1, 10), out = integer(0))) )

Uwe Ligges

-- Dr. Matthias Kohl www.stamats.de
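[A minimal sketch of steps (1)-(2) from Deepayan's reply, computing the five statistics with quantile(type = 6). The function name, its whisker rule, and the argument defaults are our choices for illustration, mimicking but not reproducing boxplot.stats():]

```r
library(lattice)

my.boxplot.stats <- function(x, coef = 1.5, do.conf = TRUE, do.out = TRUE) {
  x <- x[is.finite(x)]
  # five-number summary using the type = 6 quantile definition
  stats <- as.numeric(quantile(x, c(0, 0.25, 0.5, 0.75, 1), type = 6))
  iqr <- stats[4] - stats[2]
  out <- x[x < stats[2] - coef * iqr | x > stats[4] + coef * iqr]
  # pull the whiskers in to the most extreme non-outlying values,
  # as boxplot.stats() does
  if (length(out)) stats[c(1, 5)] <- range(x[!(x %in% out)])
  list(stats = stats, n = length(x), conf = NULL, out = out)
}

p <- bwplot(~ rnorm(100), stats = my.boxplot.stats)
# print(p) draws the plot; the stats function is called at plotting time
```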
Re: [R] trouble using optim for maximalisation of 2-parameter function
try:

# first argument of llik.nor has to be the parameter
llik.nor <- function(theta, x){
  mu <- theta[1]; sig <- theta[2]
  -length(x)*log(sqrt(2*pi)) - length(x)*log(sig) - (1/(2*sig**2))*sum((x-mu)**2)
}
# optim by default does minimization;
# setting fnscale = -1 one obtains a maximization problem
optim(c(1,5), llik.nor, x=x, control = list(fnscale = -1))

hth, Matthias

Anjali Sing wrote: Hello, I am having trouble using optim. I want to maximise a function over its parameters [kind of like univariate maximum likelihood estimation, but I wrote the likelihood function myself because of data issues]. When I try to optimize a function for only one parameter there is no problem:

llik.expo <- function(x, lam){(length(x)*log(lam)) - (length(x)*log(1-exp(-1*lam*cons))) - lam*sum(x)}
cons <- .
data <- c(.)
expomx <- optimize(llik.expo, c(0,100), maximum=TRUE, tol=0.0001, x=data)
expomx

To optimize over two parameters you can't use optimize, so I tried the following to test my input:

llik.nor <- function(x, theta){mu <- theta[1]; sig <- theta[2]; -length(x)*log(sqrt(2*pi)) - length(x)*log(sig) - (1/(2*sig**2))*sum((x-mu)**2)}
x <- c(-1,4,6,4,2)
normx <- optim(c(1,20), llik.nor)

The output should be close to the parameters mu=3 and sigma=2.366 [this I calculated by hand to compare with the output], but instead of output I get an error:
Error in fn(par, ...) : argument "theta" is missing, with no default
I don't understand why this happened. I hope someone can help me with this, for I am still a [R]ookie. Kind regards, Sonko Lady [Anjali]

-- Dr. Matthias Kohl www.stamats.de
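[Putting Matthias's fix together as a runnable check against the hand-calculated values from the question (mean 3, ML sigma sqrt(28/5) = 2.366):]

```r
# parameter vector comes first, as optim() passes par as the first argument
llik.nor <- function(theta, x) {
  mu <- theta[1]; sig <- theta[2]
  -length(x) * log(sqrt(2 * pi)) - length(x) * log(sig) -
    (1 / (2 * sig^2)) * sum((x - mu)^2)
}
x <- c(-1, 4, 6, 4, 2)
# fnscale = -1 turns the default minimization into maximization
fit <- optim(c(1, 5), llik.nor, x = x, control = list(fnscale = -1))
fit$par  # approximately mu = 3, sigma = 2.366
```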
Re: [R] implementing Maximum Likelihood with distrMod when only the PDF is known
Dear Guillaume, in retrospect, I'm not sure if the decision to have special initialize methods was the optimal way to go. In distrMod and our other packages on robust statistics we don't introduce special initialize methods, but use so-called generating functions. This approach has the advantage that the default initialize method can be used for programming, where I find it very useful. It is the canonical (and recommended) way to first define a function as generic (like mu) and then implement methods for it. Choosing names for which there is already an existing definition of a generic function may also not be what you want in general. When defining new methods you have to respect the definition of the generic function; that is, your method definition has to use the arguments of the given generic, and dispatching will be on these arguments (cf. scale in the distrMod example). In defining our classes we decided that at least the slot r has to be filled (of class "function"), whereas d, p and q are of class "OptionalFunction". Hence, there are functions to fill d, p and q for given r, but so far not for filling r, p and q given d. A way to avoid implementing r is given in http://www.stamats.de/distrModExample1.R In this example I also do not fill the slots p and q, which avoids the simulation of a large random sample and speeds up the computation of the MLE. However, this is rather a quick and dirty solution, and it would of course be better to have a valid definition for r, d, p and q. Best, Matthias

guillaume.martin wrote: Dear Matthias, That's pretty amazing, thanks a lot! I'll have to look through all of this, because I don't easily understand why each part has to be set up; in particular, the initialize method part seems crucial and is not easy to intuit.
From what I get, the actual name we give to a parameter (my original mu, for example) is important in itself, and if we introduce new variable names we have to define a new generic, right? The simplest option then is to re-use an existing variable name that has the same properties/range, right? Another general question: my actual pdf is of the same type as, but not exactly, the skew normal. In particular, I don't have a rule for building the slot r (e.g. the one borrowed from the sn package in your example); is that a problem? Isn't it sufficient to give slot d, since there are automatic methods implemented to get from the d() slot to the r() slot etc.; is that right? Thanks a lot for your help and time! Best, Guillaume

Matthias Kohl wrote: Dear Guillaume, thanks for your interest in the distrMod package. Regarding your question I took up your example and put a file under: http://www.stamats.de/distrModExample.R Hope that helps ... Don't hesitate to contact me if you have further questions! Best, Matthias

guillaume.martin wrote: Dear R users and dear authors of the distr package and sequels, I am trying to use the (very nice) package distrMod, as I want to implement a maximum likelihood (ML) fit of some univariate data for which I have derived a theoretical continuous density (pdf). As it is a parametric density, I guess that I should implement a new distribution of class AbscontDistribution myself (as stated in the pdf on creating new distributions in distr), and then use MLEstimator() from the distrMod package. Is that correct, or is there a simpler way to go? Note that I want to use the distr package because it allows me to easily implement the convolution of my theoretical pdf with some noise distribution that I want to model in the data; this is more difficult with fitdistr or mle.
It proved rather difficult for me to implement the new class following all the steps provided in the example, so I am asking if someone has an example of code they wrote to implement a parametric distribution from its pdf alone and then used with MLEstimator(). I am sorry the post is a bit long, but it is a complicated question to me and I am not at all skillful in the handling of such notions as S4 classes etc., so I am a bit lost here. As a simple example, suppose my theoretical pdf is the skew normal distribution (available in package sn):

# skew normal pdf (default values = the standard normal N(0,1))
fsn <- function(x, mu = 0, sd = 1, d = 0) {u = (x-mu)/sd; f = dnorm(u)*pnorm(d*u); return(f/sd)}
# d = shape parameter (any real), mu = location (any real), sd = scale (positive real)
# to see what it looks like try
x <- seq(-1, 4, length = 200); plot(fsn(x, d = 3), type = "l")

# Now I tried to create the classes SkewNorm and SkewNormParameter copying the example for the binomial
## Class: parameters
setClass("SkewNormParameter",
  representation = representation(mu = "numeric", sd = "numeric", d = "numeric"),
  prototype = prototype(mu = 0, sd = 1, d = 0, name = gettext("Parameter of the Skew Normal distribution")),
  contains = "Parameter")
## Class: distribution (created using the pdf of the skew normal defined
Re: [R] implementing Maximum Likelihood with distrMod when only the PDF is known
Dear Guillaume, thanks for your interest in the distrMod package. Regarding your question I took up your example and put a file under: http://www.stamats.de/distrModExample.R Hope that helps ... Don't hesitate to contact me if you have further questions! Best, Matthias

guillaume.martin wrote: Dear R users and dear authors of the distr package and sequels, I am trying to use the (very nice) package distrMod, as I want to implement a maximum likelihood (ML) fit of some univariate data for which I have derived a theoretical continuous density (pdf). As it is a parametric density, I guess that I should implement a new distribution of class AbscontDistribution myself (as stated in the pdf on creating new distributions in distr), and then use MLEstimator() from the distrMod package. Is that correct, or is there a simpler way to go? Note that I want to use the distr package because it allows me to easily implement the convolution of my theoretical pdf with some noise distribution that I want to model in the data; this is more difficult with fitdistr or mle. It proved rather difficult for me to implement the new class following all the steps provided in the example, so I am asking if someone has an example of code they wrote to implement a parametric distribution from its pdf alone and then used with MLEstimator(). I am sorry the post is a bit long, but it is a complicated question to me and I am not at all skillful in the handling of such notions as S4 classes etc., so I am a bit lost here.
As a simple example, suppose my theoretical pdf is the skew normal distribution (available in package sn):

# skew normal pdf (default values = the standard normal N(0,1))
fsn <- function(x, mu = 0, sd = 1, d = 0) {u = (x-mu)/sd; f = dnorm(u)*pnorm(d*u); return(f/sd)}
# d = shape parameter (any real), mu = location (any real), sd = scale (positive real)
# to see what it looks like try
x <- seq(-1, 4, length = 200); plot(fsn(x, d = 3), type = "l")

# Now I tried to create the classes SkewNorm and SkewNormParameter copying the example for the binomial
## Class: parameters
setClass("SkewNormParameter",
  representation = representation(mu = "numeric", sd = "numeric", d = "numeric"),
  prototype = prototype(mu = 0, sd = 1, d = 0, name = gettext("Parameter of the Skew Normal distribution")),
  contains = "Parameter")
## Class: distribution (created using the pdf of the skew normal defined above)
setClass("SkewNorm",
  prototype = prototype(
    d = function(x, log = FALSE){ fsn(x, mu = 0, sd = 1, d = 0) },
    param = new("SkewNormParameter"),
    .logExact = TRUE, .lowerExact = TRUE),
  contains = "AbscontDistribution")

# so far so good, but then with
setMethod("mu", "SkewNormParameter", function(object) object@mu)
# I get the following error message:
Error in setMethod("mu", "SkewNormParameter", function(object) object@mu) :
  no existing definition for function "mu"

I don't understand, because to me mu is a parameter, not a function... Maybe this is too complex programming for me and I should switch to implementing my likelihood by hand with numerical convolutions and optim() etc., but I would like to know how to use distr, so if there is anyone who had the same problem and solved it, I would be very grateful for the hint! All the best, Guillaume

-- Dr. Matthias Kohl www.stamats.de
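[Archive note: the error arises because setMethod() requires an existing generic; setGeneric() must be called first. A minimal, self-contained sketch of the fix, dropping the contains = "Parameter" link to distrMod for brevity:]

```r
library(methods)

# parameter class with slots mu, sd, d (names follow the post)
setClass("SkewNormParameter",
         slots = c(mu = "numeric", sd = "numeric", d = "numeric"),
         prototype = prototype(mu = 0, sd = 1, d = 0))

# define the generic first, then the accessor method for the slot
setGeneric("mu", function(object) standardGeneric("mu"))
setMethod("mu", "SkewNormParameter", function(object) object@mu)

p <- new("SkewNormParameter", mu = 2)
mu(p)  # 2
```

With the generic in place, dispatch on the class works and the "no existing definition for function" error disappears.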