Re: [R] Sensitivity and Specificity

2022-10-24 Thread Prof. Dr. Matthias Kohl

MKclass::perfMeasures(predict_testing, truth = testing$case, namePos = 1)

should also work and computes 80 performance measures.

Best Matthias

On 25.10.22 at 06:42, Jin Li wrote:

Hi Greg,

This can be done by:
spm::pred.acc(testing$case,  predict_testing)

It will return both sensitivity and specificity, along with a few other
commonly used measures.

Hope this helps,
Jin

On Tue, Oct 25, 2022 at 6:01 AM Rui Barradas  wrote:


At 16:50 on 24/10/2022, greg holly wrote:

Hi Michael,

I appreciate your reply. Here is what I have now:


predict_testing <- ifelse(predict > 0.5,1,0)

head(predict)

         1          2          3          5          7          8
0.29006984 0.28370507 0.10761993 0.02204224 0.12873872 0.08127920


# Sensitivity and Specificity

sensitivity <- (predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100

Error in predict_testing[2, 2] : incorrect number of dimensions

sensitivity

function (data, ...)
{
  UseMethod("sensitivity")
}

specificity <- (predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100

Error in predict_testing[1, 1] : incorrect number of dimensions

specificity

function (data, ...)
{
  UseMethod("specificity")
}



On Mon, Oct 24, 2022 at 10:45 AM Michael Dewey wrote:


Rather hard to know without seeing what output you expected and what
error message you got if any but did you mean to summarise your variable
predict before doing anything with it?

Michael

On 24/10/2022 16:17, greg holly wrote:

Hi all R-Help ,

After partitioning my data into testing and training sets (please see below),
I need to estimate the Sensitivity and Specificity. I failed; I would
appreciate your help.

Best regards,
Greg


inTrain <- createDataPartition(y=data$case,
  p=0.7,
  list=FALSE)
training <- data[ inTrain,]
testing  <- data[-inTrain,]

attach(training)
#model training and prediction
data_training <- glm(case ~ age+BMI+Calcium+Albumin+meno_1, data =
training, family = binomial(link="logit"))

predict <- predict(data_training, newdata = testing, type = "response")

predict_testing <- ifelse(predict > 0.5, 1, 0)

# Sensitivity and Specificity

sensitivity <- (predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100
sensitivity

specificity <- (predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100
specificity


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Michael
http://www.dewey.myzen.co.uk/home.html






Hello,

Instead of computing by hand, why not use package caret?


tbl <- table(predict_testing, testing$case)
caret::sensitivity(tbl)
caret::specificity(tbl)

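For the by-hand route the thread started with, the key step is to turn the two 0/1 vectors into a 2 x 2 table first; indexing the prediction vector directly is what produced the "incorrect number of dimensions" error. A minimal sketch, with made-up vectors standing in for predict_testing and testing$case:

```r
# Hypothetical 0/1 vectors standing in for predict_testing and testing$case
pred  <- c(1, 0, 0, 1, 0, 1, 1, 0)
truth <- c(1, 0, 1, 1, 0, 0, 1, 0)

tbl <- table(pred, truth)   # 2 x 2 confusion matrix: rows = predicted, cols = true

TP <- tbl["1", "1"]; FN <- tbl["0", "1"]
TN <- tbl["0", "0"]; FP <- tbl["1", "0"]

sens <- TP / (TP + FN)      # sensitivity: true positives / all actual positives
spec <- TN / (TN + FP)      # specificity: true negatives / all actual negatives
c(sensitivity = sens, specificity = spec)   # 0.75 0.75 for these toy vectors
```

Note that caret treats the first factor level as the positive class by default, so check which level counts as "positive" before comparing results.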

Hope this helps,

Rui Barradas







--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?

2021-08-04 Thread matthias-gondan
Response to 1: You need the log version, e.g., in maximum likelihood estimation; otherwise the product of the densities and probabilities can become very small and underflow.

-------- Original message --------
From: r-help-requ...@r-project.org
Date: 04.08.21 12:01 (GMT+01:00)
To: r-help@r-project.org
Subject: R-help Digest, Vol 222, Issue 4

Message 1
Date: Tue, 3 Aug 2021 17:20:12 +0100
From: Michael Dewey
To: "r-help@r-project.org"
Subject: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?

Short version

Apart from the ability to work with values of p too small to be of much
practical use, what are the advantages and disadvantages of setting this
to TRUE?

Longer version

I am contemplating upgrading various functions in one of my packages to
use this, and as far as I can see it would only have the advantage of
allowing people to use very small p-values; but before I go ahead, have I
missed anything? I am most concerned with negatives, but if there is any
other advantage I would mention that in the vignette. I am not concerned
about speed or the extra effort in coding and expanding the documentation.

--
Michael
http://www.dewey.myzen.co.uk/home.html
[..]

Re: [R] density with weights missing values

2021-07-13 Thread Matthias Gondan
Thanks Martin and the others. I will do so accordingly. 

I guess the 0.1% of the population who uses density with weights will write 
code like this

x = c(1, 2, 3, NA)
weights = c(1, 1, 1, 1)
density(x[!is.na(x)], weights=weights[!is.na(x)])

These people won’t be affected. For the 0.01% of people with code like this,

density(x, weights=weights[!is.na(x)], na.rm=TRUE)

the corrected version would almost surely raise an error. Note that the error 
message can, in principle, check if length(x[!is.na(x)]) == length(the provided 
weights) and tell the programmer that this was the old behavior.
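Spelled out, the safe pattern is to drop the NAs from x and the weights together, then renormalize, since density() warns when the weights do not sum to 1 (a minimal sketch with non-uniform weights):

```r
x <- c(1, 2, 3, NA)
w <- c(0.1, 0.2, 0.3, 0.4)

ok <- !is.na(x)
d  <- density(x[ok], weights = w[ok] / sum(w[ok]))  # drop NAs from both, then renormalize
```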

Best wishes,

Matthias

PS. Sorry for the HTML email. I’ve given up trying to fix such behavior.


From: Martin Maechler
Sent: Tuesday, 13 July 2021 09:09
To: Matthias Gondan
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

>>>>> Matthias Gondan 
>>>>> on Mon, 12 Jul 2021 15:09:38 +0200 writes:

> Weighted mean behaves differently:
> • weight is excluded for missing x
> • no warning for sum(weights) != 1

>> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
> [1] 2.5
>> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
> [1] NA
>> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
> [1] 2


I'm sure the 'weights' argument in weighted.mean() has been used
much more often than the one in density().
Hence, it's quite "probable statistically" :-)  that the
weighted.mean() behavior in the NA case has been more rational
and thought through 

So I agree with you, Matthias, that ideally density() should
behave differently here,  probably entirely analogously to weighted.mean().

Still, Bert and others are right that there is formally no bug,
but something that possibly should be changed; even though it
breaks back compatibility for those cases, such cases may be
very rare (I'm not sure I've ever used weights in density(), but
I know I've used density() very much all those 25 years ..).

https://www.r-project.org/bugs.html

contains good information about determining if something may be
a bug in R *and* tells you how to apply for an account on R's
Bugzilla for reporting it formally.
I'm hereby encouraging you, Matthias, to do that and then in
your report mention both density() and weighted.mean(), i.e., a
cleaned up version of the union of your first 2 e-mails..

Thank you for thinking about this and concisely reporting it.
Martin


> From: Richard O'Keefe
> Sent: Monday, 12 July 2021 13:18
> To: Matthias Gondan
> Subject: Re: [R] density with weights missing values

> Does your copy of R say that the weights must add up to 1?
> ?density doesn't say that in mine.   But it does check.

another small part that could be improved, indeed;
thank you, Richard.

--
Martin Maechler
ETH Zurich  and  R Core team
 
> On Mon, 12 Jul 2021 at 22:42, Matthias Gondan  
wrote:
>> 
>> Dear R users,
>> 
>> This works as expected:
>> 
>> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
>> 
>> This raises an error
>> 
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
[..]




Re: [R] density with weights missing values

2021-07-12 Thread matthias-gondan
You're right, of course. Extrapolating your argument a bit, the whole
practice of na.rm is questionable, since there is always a reason for
missingness (one that is not in x and rarely elsewhere in the data).

Best wishes,

Matthias

-------- Original message --------
From: Jeff Newmiller
Date: 12.07.21 18:44 (GMT+01:00)
To: r-help@r-project.org, matthias-gondan, Bert Gunter
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

Sure, you might think that. But most likely the reason this code has not
been corrected is that when you give weights for missing data, the most
correct result is for your entire density to be invalid. Fix your inputs
so they make sense to you and there is no problem. But absent your
intellectual input to restructure your problem, the weights no longer
make sense once density() removes the NAs from the data.

[..]

Re: [R] density with weights missing values

2021-07-12 Thread matthias-gondan
The thing is that for na.rm=TRUE, I would expect the weights corresponding
to the missing x to be removed as well, like in weighted.mean. So this one
shouldn't raise an error:

density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1))

Or am I missing something?

-------- Original message --------
From: Bert Gunter
Date: 12.07.21 16:25 (GMT+01:00)
To: Matthias Gondan
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

The behavior is as documented AFAICS.

na.rm: logical; if TRUE, missing values are removed from x. If FALSE any
missing values cause an error. The default is FALSE.

weights: numeric vector of non-negative observation weights.

NA is not a non-negative numeric.

Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it." -- Opus (aka Berkeley Breathed in his
"Bloom County" comic strip)

[..]


Re: [R] density with weights missing values

2021-07-12 Thread Matthias Gondan
Weighted mean behaves differently:
• weight is excluded for missing x
• no warning for sum(weights) != 1

> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
[1] 2.5
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
[1] NA
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
[1] 2




From: Richard O'Keefe
Sent: Monday, 12 July 2021 13:18
To: Matthias Gondan
Subject: Re: [R] density with weights missing values

Does your copy of R say that the weights must add up to 1?
?density doesn't say that in mine.   But it does check.

On Mon, 12 Jul 2021 at 22:42, Matthias Gondan  wrote:
>
> Dear R users,
>
> This works as expected:
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
>
> This raises an error
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
>
> This seems to work (it triggers a warning that the weights don’t add up to 1, 
> which makes sense*):
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))
>
> Questions
>
> • But shouldn’t the na.rm filter also filter the corresponding weights?
> • Extra question: In case the na.rm filter is changed to filter the weights, 
> the check for sum(weights) == 1 might trigger false positive warnings since 
> the weights might not add up to 1 anymore
>
> Best wishes,
>
> Matthias
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




[R] density with weights missing values

2021-07-12 Thread Matthias Gondan
Dear R users,

This works as expected:

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))

This raises an error

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))

This seems to work (it triggers a warning that the weights don’t add up to 1, 
which makes sense*):

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))

Questions

• But shouldn’t the na.rm filter also filter the corresponding weights?
• Extra question: In case the na.rm filter is changed to filter the weights, 
the check for sum(weights) == 1 might trigger false positive warnings since the 
weights might not add up to 1 anymore

Best wishes,

Matthias




[R] Embedded R: Test if initialized

2021-06-16 Thread Matthias Gondan
Dear R friends,

I am currently trying to write a piece of C code that uses "embedded R",
and for specific reasons*, I cannot keep track of whether R has already
been initialized. So the code snippet looks like this:

LibExtern char *R_TempDir;

if(R_TempDir == NULL)
…throw exception R not initialized…

I have seen that the source code of Rf_initialize_R itself checks whether
it is invoked twice (num_initialized), but this latter flag does not seem
to be accessible, or is it?

int Rf_initialize_R(int ac, char **av)
{
int i, ioff = 1, j;
Rboolean useX11 = TRUE, useTk = FALSE;
char *p, msg[1024], cmdlines[1], **avv;
struct Rstart rstart;
Rstart Rp = &rstart;
Rboolean force_interactive = FALSE;

if (num_initialized++) {
fprintf(stderr, "%s", "R is already initialized\n");
exit(1);
}


Is the test of the TempDir a good substitute, or should I choose another 
solution? Having said this, it may be a good idea to expose a function 
Rf_R_initialized that performs such a test.

Thank you for your consideration.

Best regards,

Matthias

*The use case is an R library that connects to swi-prolog and allows the 
„embedded“ swi-prolog to establish the reverse connection to R. In that case, 
i.e., R -> Prolog -> R, I do not want to initialize R a second time.





Re: [R] confidence intervals for the difference between group means

2020-08-04 Thread Prof. Dr. Matthias Kohl

you could try:

library(MKinfer)
meanDiffCI(a, b, boot = TRUE)

Best
Matthias

On 04.08.20 at 16:08, varin sacha via R-help wrote:

Dear R-experts,

Using the bootES package I can easily calculate bootstrap confidence
intervals for the means, as in the toy example below. Now I am looking
for a confidence interval for the difference between the group means. In
my case, the point estimate of the mean difference is 64.4, and I am
looking for the 95% confidence interval around this point estimate (64.4).

Many thanks for your response.


library(bootES)
a<-c(523,435,478,567,654)
b<-c(423,523,421,467,501)
bootES(a)
bootES(b)
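As a cross-check on whatever bootstrap interface you use (a base-R sketch, not the bootES or MKinfer API): t.test() already reports a Welch confidence interval for the difference, and a percentile bootstrap takes a few lines.

```r
# Toy data from the question
a <- c(523, 435, 478, 567, 654)
b <- c(423, 523, 421, 467, 501)

mean(a) - mean(b)    # 64.4, the point estimate of the mean difference

# Welch two-sample CI for the difference of means
t.test(a, b)$conf.int

# Percentile bootstrap of the mean difference
set.seed(1)
diffs <- replicate(10000,
  mean(sample(a, replace = TRUE)) - mean(sample(b, replace = TRUE)))
quantile(diffs, c(0.025, 0.975))
```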





--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] ncol() vs. length() on data.frames

2020-03-31 Thread Prof. Dr. Matthias Kohl

should have added: dim(x)[2L] -> length(x)

On 31.03.20 at 16:21, Prof. Dr. Matthias Kohl wrote:

Dear Ivan,

if I enter ncol in the console, I get

function (x)
dim(x)[2L]



indicating that function dim is called. Function dim has a method for 
data.frame; see methods("dim").


The dim-method for data.frame is

dim.data.frame
function (x)
c(.row_names_info(x, 2L), length(x))



Hence, it calls length on the provided data.frame. In addition, some 
"magic" with .row_names_info is performed, where


base:::.row_names_info
function (x, type = 1L)
.Internal(shortRowNames(x, type))

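In short, the two agree on any plain data.frame because a data.frame is a list of its columns:

```r
df <- data.frame(a = 1:3, b = letters[1:3])
ncol(df)    # 2: dim(df)[2L], which ends up calling length(df)
length(df)  # 2: the number of columns (list components)
```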


Best
Matthias

On 31.03.20 at 16:10, Ivan Calandra wrote:

Thanks Ivan for the answer.

So it confirms my first thought that these two functions are equivalent
when applied to a "simple" data.frame.

The reason I was asking is that I have gotten used to using length() in
my scripts. It works perfectly and I understand it easily. But to be
honest, ncol() is more intuitive to most users (especially the novice)
so I was thinking about switching to using this function instead (all my
data.frames are created from read.csv() or similar functions so there
should not be any issue). But before doing that, I want to be sure that
it is not going to create unexpected results.

Thank you,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 31/03/2020 16:00, Ivan Krylov wrote:

On Tue, 31 Mar 2020 14:47:54 +0200
Ivan Calandra  wrote:


On a simple data.frame (i.e. each element is a vector), ncol() and
length() will give the same result.
Are they just equivalent on such objects, or are there differences in
some cases?

I am not aware of any exceptions to ncol(dataframe)==length(dataframe)
(in fact, ncol(x) is dim(x)[2L] and ?dim says that dim(dataframe)
returns c(length(attr(dataframe, 'row.names')), length(dataframe))), but
watch out for AsIs columns which can have columns of their own:

x <- data.frame(I(volcano))
dim(x)
# [1] 87  1
length(x)
# [1] 1
dim(x[,1])
# [1] 87 61









--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] ncol() vs. length() on data.frames

2020-03-31 Thread Prof. Dr. Matthias Kohl

Dear Ivan,

if I enter ncol in the console, I get

function (x)
dim(x)[2L]



indicating that function dim is called. Function dim has a method for 
data.frame; see methods("dim").


The dim-method for data.frame is

dim.data.frame
function (x)
c(.row_names_info(x, 2L), length(x))



Hence, it calls length on the provided data.frame. In addition, some 
"magic" with .row_names_info is performed, where


base:::.row_names_info
function (x, type = 1L)
.Internal(shortRowNames(x, type))



Best
Matthias

On 31.03.20 at 16:10, Ivan Calandra wrote:

Thanks Ivan for the answer.

So it confirms my first thought that these two functions are equivalent
when applied to a "simple" data.frame.

The reason I was asking is that I have gotten used to using length() in
my scripts. It works perfectly and I understand it easily. But to be
honest, ncol() is more intuitive to most users (especially the novice)
so I was thinking about switching to using this function instead (all my
data.frames are created from read.csv() or similar functions so there
should not be any issue). But before doing that, I want to be sure that
it is not going to create unexpected results.

Thank you,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 31/03/2020 16:00, Ivan Krylov wrote:

On Tue, 31 Mar 2020 14:47:54 +0200
Ivan Calandra  wrote:


On a simple data.frame (i.e. each element is a vector), ncol() and
length() will give the same result.
Are they just equivalent on such objects, or are there differences in
some cases?

I am not aware of any exceptions to ncol(dataframe)==length(dataframe)
(in fact, ncol(x) is dim(x)[2L] and ?dim says that dim(dataframe)
returns c(length(attr(dataframe, 'row.names')), length(dataframe))), but
watch out for AsIs columns which can have columns of their own:

x <- data.frame(I(volcano))
dim(x)
# [1] 87  1
length(x)
# [1] 1
dim(x[,1])
# [1] 87 61







--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] How to run Multiple ANOVA and Multiple T-test between the same groups?

2019-02-10 Thread Prof. Dr. Matthias Kohl
Have a look at Bioconductor package genefilter, especially functions 
colttests and colFtests.

Best Matthias

Am 10.02.19 um 10:35 schrieb AbouEl-Makarim Aboueissa:

Dear All: good morning

*Re:* How to run Multiple ANOVA and Multiple T-test between the same groups.



Your help will be highly appreciated.





*1.* Is there a way to run multiple t-tests on different variables between
the same two groups?

*Data for t-tests:*



The data frame “dataTtest” has 5 variables (x1,x2,x3,x4,x5) and one factor
(factor1) with 2 levels (group1, group2).

x1<-rnorm(20,1,1)

x2<-rnorm(20,2,1)

x3<-rnorm(20,3,1)

x4<-rnorm(20,4,1)

x5<-rnorm(20,5,1)

factor1<-rep(c("group1", "group2"), each = 10)

dataTtest<-data.frame(x1,x2,x3,x4,x5,factor1)

dataTtest
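For reference, the loop over variables can also be done directly in base R, without extra packages; a sketch on the simulated example data (a seed is added so the sketch is reproducible):

```r
# Recreate the example data with a fixed seed
set.seed(1)
x1 <- rnorm(20, 1, 1); x2 <- rnorm(20, 2, 1); x3 <- rnorm(20, 3, 1)
x4 <- rnorm(20, 4, 1); x5 <- rnorm(20, 5, 1)
factor1 <- rep(c("group1", "group2"), each = 10)
dataTtest <- data.frame(x1, x2, x3, x4, x5, factor1)

# One t.test() per variable, comparing the two groups; sapply() collects
# the p-values into a named vector
pvals <- sapply(dataTtest[1:5], function(v)
  t.test(v[dataTtest$factor1 == "group1"],
         v[dataTtest$factor1 == "group2"])$p.value)
pvals  # named vector of 5 p-values, one per variable
```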









*2.* Is there a way to run *multiple ANOVA* and multiple-comparison *Tukey
tests* on different variables between the same groups?

*Data for ANOVA tests:*



The data frame “dataANOVA” has 6 variables (x1,x2,x3,x4,x5,x6) and one
factor (factor2) with 5 levels (group1, group2, group3, group4, group5).

x1<-rnorm(40,1,1)

x2<-rnorm(40,2,1)

x3<-rnorm(40,3,1)

x4<-rnorm(40,4,1)

x5<-rnorm(40,5,1)

x6<-rnorm(40,6,1)

factor2<-rep(c("group1", "group2", "group3", "group4", "group5"), each = 8)

dataANOVA<-data.frame(x1,x2,x3,x4,x5,x6,factor2)

dataANOVA
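Similarly, for the ANOVA question, one base-R way is to fit a one-way aov() and a Tukey HSD per response column; a sketch on the simulated data (seed added for reproducibility):

```r
# Recreate the example data with a fixed seed
set.seed(1)
dataANOVA <- data.frame(x1 = rnorm(40, 1), x2 = rnorm(40, 2), x3 = rnorm(40, 3),
                        x4 = rnorm(40, 4), x5 = rnorm(40, 5), x6 = rnorm(40, 6),
                        factor2 = rep(paste0("group", 1:5), each = 8))

# One aov() fit per response column, then Tukey multiple comparisons on each
fits   <- lapply(dataANOVA[1:6], function(v) aov(v ~ factor2, data = dataANOVA))
tukeys <- lapply(fits, TukeyHSD)
length(tukeys)  # 6, one Tukey table per response variable
```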





with many thanks

abou
__


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*




--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] Strange lazy evaluation of default arguments

2017-09-05 Thread Matthias Gondan
Dear S Ellison,

Thanks for the flowers! Indeed, I was actually considering using it in my own
teaching material as well. With rounded numbers, of course.

Though I am still slightly disturbed by this feature. I thought now was the
time to switch to Python, but that’s even worse, see here:

def add_elem(List=[]):
List.append('elem')
return List

>>> add_elem([2])
[2, 'elem']
>>> add_elem()
['elem']
>>> add_elem()
['elem', 'elem'] <<<<<<<<<<< why on earth does this happen? [it’s documented
behavior: the default list is created once, when the function is defined, and reused across calls]

So, I’ll stick with R. Still 25 years or so until retirement, but I’ll survive,
even without cross-referenced default arguments.

Best wishes,

Matthias


Von: S Ellison
Gesendet: Dienstag, 5. September 2017 16:17
An: Matthias Gondan; r-help@r-project.org
Betreff: RE: [R] Strange lazy evaluation of default arguments

Mathias,
If it's any comfort, I appreciated the example; 'expected' behaviour maybe, but 
a very nice example for staff/student training!

S Ellison


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthias
> Gondan
> Sent: 02 September 2017 18:22
> To: r-help@r-project.org
> Subject: [R] Strange lazy evaluation of default arguments
> 
> Dear R developers,
> 
> sessionInfo() below
> 
> Please have a look at the following two versions of the same function:
> 
> 1. Intended behavior:
> 
> > Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> + {
> +   print(c(u, l, mu)) # here, l is set to u’s value
> +   u = u/sqrt(sigma2)
> +   l = l/sqrt(sigma2)
> +   mu = mu/sqrt(sigma2)
> +   print(c(u, l, mu))
> + }
> >
> > Su1()
> [1] 100.00 100.00   0.53
> [1] 23.2558140 23.2558140  0.1232558
> 
> In the first version, both u and l are correctly divided by 4.3.
> 
> 2. Strange behavior:
> 
> > Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> + {
> +   # print(c(u, l, mu))
> +   u = u/sqrt(sigma2)
> +   l = l/sqrt(sigma2) # here, l is set to u’s value
> +   mu = mu/sqrt(sigma2)
> +   print(c(u, l, mu))
> + }
> >
> > Su2()
> [1] 23.2558140  5.4083288  0.1232558
> In the second version, the print function is commented out, so the variable u
> is copied to l (lowercase L) at a later place, and L is divided twice by 4.3.
> 
> Is this behavior intended? It seems strange that the result depends on a
> debugging message.
> 
> Best wishes,
> 
> Matthias
> 
> 
> > sessionInfo()
> R version 3.4.1 (2017-06-30)
> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8
> x64 (build 9200)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
> LC_MONETARY=German_Germany.1252
> [4] LC_NUMERIC=C                LC_TIME=German_Germany.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.4.1 tools_3.4.1
> 
> 



Re: [R] Strange lazy evaluation of default arguments

2017-09-02 Thread Matthias Gondan
Dear Bill,

All makes perfect sense (including the late evaluation). I actually discovered
the problem by looking at old code that used your proposed solution. Still, I
find it strange (and, honestly, I don’t like R’s behavior in this respect), and
I wonder why u is not copied to l just before u is assigned a new
value. Of course, this would require the R interpreter to track all these
dependencies in both directions, including more complicated ones in which l might
depend on more than just u.

In the future, I’ll avoid dependencies between parameters.

Su4 <- function(u=100, l=100, mu=0.53, sigma2=4.3^2) # instead of l=u

And maybe I’ll also avoid „in-place“ changes of values.

Best regards,

Matthias

Von: William Dunlap
Gesendet: Samstag, 2. September 2017 19:41
An: Rui Barradas
Cc: Matthias Gondan; r-help@r-project.org
Betreff: Re: [R] Strange lazy evaluation of default arguments

Another way to avoid the problem is to not redefine variables that are 
arguments.  E.g.,

> Su3 <- function(u=100, l=u, mu=0.53, sigma2=4.3^2, verbose)
  {
    if (verbose) {
      print(c(u, l, mu))
    }
    uNormalized <- u/sqrt(sigma2)
    lNormalized <- l/sqrt(sigma2)
    muNormalized <- mu/sqrt(sigma2)
    c(uNormalized, lNormalized, muNormalized)
  }
> Su3(verbose=TRUE)
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558
> Su3(verbose=FALSE)
[1] 23.2558140 23.2558140  0.1232558

Not redefining variables at all makes debugging easier, although it may waste 
space.


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Sep 2, 2017 at 10:33 AM, <ruipbarra...@sapo.pt> wrote:
Hello,

One way of preventing that is to use ?force.
Just put

   force(l)

right after the commented out print and before you change 'u'.
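A sketch of what that looks like (Su2f is a hypothetical name for the repaired function):

```r
# Sketch of the force() suggestion: force(l) evaluates the promise for the
# default l = u *before* u is overwritten, so l keeps u's original value
Su2f <- function(u = 100, l = u, mu = 0.53, sigma2 = 4.3^2) {
  force(l)  # evaluate l now, while u is still 100
  u  <- u / sqrt(sigma2)
  l  <- l / sqrt(sigma2)
  mu <- mu / sqrt(sigma2)
  c(u, l, mu)
}
Su2f()  # u and l are now both 100/4.3 = 23.25581...
```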

Hope this helps,

Rui Barradas



Citando Matthias Gondan <matthias-gon...@gmx.de>:

Dear R developers,

sessionInfo() below

Please have a look at the following two versions of the same function:

1. Intended behavior:
Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   print(c(u, l, mu)) # here, l is set to u’s value
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }

Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:
Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   # print(c(u, l, mu))
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2) # here, l is set to u’s value
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }

Su2()
[1] 23.2558140  5.4083288  0.1232558
In the second version, the print
function is commented out, so the variable u is
copied to l (lowercase L) at a later place, and L is divided twice by 4.3.

Is this behavior intended? It seems strange that the result depends on a 
debugging message.

Best wishes,

Matthias

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1






[R] Strange lazy evaluation of default arguments

2017-09-02 Thread Matthias Gondan
Dear R developers,

sessionInfo() below

Please have a look at the following two versions of the same function:

1. Intended behavior:

> Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   print(c(u, l, mu)) # here, l is set to u’s value
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> 
> Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:

> Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   # print(c(u, l, mu))
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2) # here, l is set to u’s value
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> 
> Su2()
[1] 23.2558140  5.4083288  0.1232558
In the second version, the print function is commented out, so the variable u 
is 
copied to l (lowercase L) at a later place, and L is divided twice by 4.3.

Is this behavior intended? It seems strange that the result depends on a 
debugging message.

Best wishes,

Matthias


> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1   



[R] IF LOOOP

2017-06-13 Thread matthias worni
Hey

This should be a rather simple question for some of you. I want to make
some progress with looping...
I have the vector r, which contains single values; see below:

r
 [1] 1.1717118 1.1605215 1.1522907 1.1422830 1.1065277 1.1165451 1.1163768 1.1048872 1.0848836 1.0627211
[11] 1.0300964 1.0296879 1.0308194 1.0518188 1.0657229 1.0685514 1.0914881 1.1042577 1.1039351 1.0880163


I would simply like to take the value "0.990956" out of the vector and
print the rest of it. The code is from the internet but does not seem to
work for my vector; I can't figure out why. Thanks for the help.

r <- as.vector(lw)
count=0
for (i in r)  {
  if(i == 0.990956) {
break
  }
print(i)
  }
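For what it's worth, no loop is needed here: logical indexing removes the element directly, and comparing doubles with a tolerance avoids the `==` pitfall. The short vector below is a stand-in for the real r:

```r
# Keep every element that is not (numerically close to) the target value.
# Exact == on doubles is fragile, so compare with a small tolerance.
r <- c(1.1717118, 0.990956, 1.1605215)  # stand-in for the real vector
target <- 0.990956
kept <- r[abs(r - target) > 1e-8]
kept  # 1.1717118 1.1605215
```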


Best
Matthias



Re: [R] Paired sample t-test with mi.t.test

2017-04-07 Thread Prof. Dr. Matthias Kohl

Dear Joel,

are you trying to apply function mi.t.test from my package MKmisc?

Could you please try:
mi.t.test(implist, "pre_test", "post_test", alternative =
"greater", paired = TRUE, var.equal = TRUE, conf.level = 0.95)

x and y are the names of the variables, not the variables themselves.

Best
Matthias

Am 06.04.2017 um 18:32 schrieb Joel Gagnon:

Dear all,

It is my first time posting on this list, so forgive me for any rookie
mistakes I might make.

I want to conduct t-tests on a dataset that has been imputed using the mice
package:
imput_pps <- mice(pps, m=20, maxit=20, meth='pmm') # pps is my dataset. It
contains items from an 11-item questionnaire gathered at pre- and post-test,
so the data set has 22 columns.

I then proceed to compute the total scores for the pre and post test on my
imputed datasets:

long_pps <- complete(imput_pps, action ="long", include = TRUE)
long_pps$pre_test <- rowSums(long_pps[ ,c(3:13)])
long_pps$post_test <- rowSums(long_pps[ , c(14:24)])

I then used as.mids to convert back to mids object:
mids_pps <- as.mids(long_pps)

Next, I created an imputation list object using mitools:
implist <- lapply(seq(mids_pps$m), function(im) complete(mids_pps, im))
implist <- imputationList(implist)

Now, I want to conduct t-tests using the mi.t.test function (from the MKmisc
package). I tried the following code:
mi.t.test(implist, implist$pre_test, implist$post_test, alternative =
"greater", paired = TRUE, var.equal = TRUE, conf.level = 0.95)

When I run this code, R tells me that y is missing. I know this may sound
stupid, but I thought that I specified y with this line: implist$pre_test,
implist$post_test - with implist$pre_test being x and implist$post_test
being y - like I usually do for a normal t-test using the t.test function.

It seems I don't quite understand what the y argument is supposed to
represent. Could someone help me figure out what I am doing wrong? Your
help would be very much appreciated.

Best regards,

Joel Gagnon, Ph.D(c),
Department of Psychology,
Université du Québec à Trois-Rivières
Québec, Canada




--
Prof. Dr. Matthias Kohl
www.stamats.de


Re: [R] Quantiles with ordered categories

2017-03-14 Thread matthias-gondan
I found it:

quantile(ordered(1:10), probs=0.5, type=1) 

works, because type=1 picks an observed value instead of interpolating. The
default type is 7, which interpolates, and that produces the error.

Two options come to my mind:

- The error message could be improved.
- The default type could be 1 if the data is from ordered categories.
- Or both.

It is probably a little thing to fix, but I lack the skills to do this myself.

Best wishes,

Matthias

Von: Bert Gunter
Gesendet: Dienstag, 14. März 2017 21:34
An: matthias-gon...@gmx.de
Cc: r-help@r-project.org
Betreff: Re: [R] Quantiles with ordered categories

Inline.
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Mar 14, 2017 at 12:36 PM,  <matthias-gon...@gmx.de> wrote:
> Dear R users,
>
> This works:
>
> quantile(1:10, probs=0.5)
>
> This fails (obviously):
>
> quantile(factor(1:10), probs=0.5)
>
> But why do quantiles for ordered factors not work either?
>
> quantile(ordered(1:10), probs=0.5)
>
> Is it because interpolation (see the optional type argument) is not defined?
Yes.


Is there an elegant workaround?
No. How can there be? By definition, all that an ordered factor assumes is
an ordering of the categories. How can you "interpolate" in
ordered(letters[1:3])? AFAIK there is no "a.5".

-- Bert



>
> Thank you.
>
> Best wishes,
>
> Matthias
>



[R] Quantiles with ordered categories

2017-03-14 Thread matthias-gondan
Dear R users,

This works: 

quantile(1:10, probs=0.5)

This fails (obviously):

quantile(factor(1:10), probs=0.5)

But why do quantiles for ordered factors not work either?

quantile(ordered(1:10), probs=0.5)

Is it because interpolation (see the optional type argument) is not defined? Is 
there an elegant workaround?

Thank you.

Best wishes,

Matthias



[R] summing up a matrix

2016-11-18 Thread Matthias Weber
Hello everyone,

Is it possible to sum up a matrix cumulatively? I currently have the
following matrix:

        [,1] [,2] [,3] [,4] [,5] [,6]
2016-11   20  200  100   50  100   30
2016-12  100  200  100   50  100   30
2017-01   50  200  100   50  100   30

Now I want to sum it up, row by row, into a new matrix.

The result should look like the following:

        [,1] [,2] [,3] [,4] [,5] [,6]
2016-11   20  220  320  370  470  500
2016-12  100  300  400  450  550  580
2017-01   50  250  350  400  500  530

Is it possible to create that?
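One way to get exactly this result is to apply cumsum() over the rows and transpose back; a sketch with the data above:

```r
# Build the example matrix with month labels as row names
m <- rbind("2016-11" = c(20, 200, 100, 50, 100, 30),
           "2016-12" = c(100, 200, 100, 50, 100, 30),
           "2017-01" = c(50, 200, 100, 50, 100, 30))

# apply(m, 1, cumsum) computes the cumulative sum of each row, but returns
# the results as columns, so transpose to restore the original shape
csum <- t(apply(m, 1, cumsum))
csum["2016-11", ]  # 20 220 320 370 470 500
```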

Thanks for your help.

Best regards.

Mat




Re: [R] fPortfolio dont work in R 3.3.1

2016-09-07 Thread Bannert Matthias
Andre, 

you need to make sure you have a C/C++ compiler as well as a Fortran compiler to
build a package from source that uses these languages. Many R packages
use one of them under the hood to speed things up.

gcc / gfortran are common and free choices for such compilers. Depending on
your OS, these may ship with the OS installation. What OS do you have?

Also, I'd try to install Rsymphony first and then install fPortfolio.

HTH, 

matt

Hi,

I'm trying to use the package fPortfolio in R version 3.3.1, but the package
doesn't work. I receive the messages below:


Package which is only available in source form, and may need
  compilation of C/C++/Fortran: 'Rsymphony'
  These will not be installed

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
  there is no package called 'Rsymphony'
Erro: package or namespace load failed for 'fPortfolio'

How can I use fPortfolio in R 3.3.1?


Thanks in advance,


André Barbosa Oliveira.




[R] filter a data.frame in dependence of a column value

2016-06-17 Thread Matthias Weber
Hello everyone,

I have a short question; maybe someone can help me.

I have a data.frame like this one:

   NO   ORDER
1 1530 for Mr. Muller (10.0 -> 11.2)
2 1799 for Mr Giulani
3 1888 for Mr. Marius (11.2 -> 12)

I need a solution that keeps only the rows with values in brackets. The
result should look like the following:

   NO   ORDER
1 1530 for Mr. Muller (10.0 -> 11.2)
2 1888 for Mr. Marius (11.2 -> 12)

I tried it with the following code, but it doesn't work:

data4.1<-data3[data3$ORDER%in% "[(]*->*[)]",]

Maybe someone can help me.
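A sketch of a working version: %in% tests exact set membership, so a pattern match with grepl() is what is needed instead (the data.frame below recreates the example):

```r
# Recreate the example data
data3 <- data.frame(NO = c(1530, 1799, 1888),
                    ORDER = c("for Mr. Muller (10.0 -> 11.2)",
                              "for Mr Giulani",
                              "for Mr. Marius (11.2 -> 12)"))

# grepl() returns TRUE for rows whose ORDER matches "(... -> ...)";
# use it as a logical row index
data4.1 <- data3[grepl("\\(.*->.*\\)", data3$ORDER), ]
data4.1$NO  # 1530 1888
```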

Thank you.

Best regards

Mat

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] substring in dependence of values

2016-06-03 Thread Matthias Weber
Hello everyone,

Maybe someone can help me with a little problem.

I have the following column in a data.frame (called TEST):

  VERSION
1abc (9.2 -> 10.0)
2def (10.2 -> 11.0)
3ghi (7.4 -> 8.4)
4jkl (10.2 -> 10.4)
5mno (8.1 -> 9.0)

I now need code for the following result (I want the "before/after" numbers
in separate columns). The solution should look like this one:

  VERSION  FROM TO
1abc (9.2 -> 10.0) 9.2 10.0
2def (10.2 -> 11.0)10.2   11.0
3ghi (7.4 -> 8.4)7.4 8.4
4jkl (10.2 -> 10.4)  10.2   10.4
5mno (8.1 -> 9.0)  8.1 9.0

Maybe someone can help me; substr() alone doesn't get me there at the moment.
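One possible approach uses sub() with a regular expression and backreferences; a sketch on a two-row version of the example:

```r
# Recreate a small version of the example data
TEST <- data.frame(VERSION = c("abc (9.2 -> 10.0)", "def (10.2 -> 11.0)"))

# Capture the two numbers inside "(FROM -> TO)" as groups \1 and \2
pat <- ".*\\(([0-9.]+) *-> *([0-9.]+)\\).*"
TEST$FROM <- as.numeric(sub(pat, "\\1", TEST$VERSION))
TEST$TO   <- as.numeric(sub(pat, "\\2", TEST$VERSION))
TEST$FROM  # 9.2 10.2
TEST$TO    # 10.0 11.0
```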

Best regards.

Mat




[R] (no subject)

2016-01-08 Thread Matthias Worni
Thank you for the help; it worked out fine. However, while trying to add
additional aesthetics I hit another point where I couldn't figure out a way
forward. Perhaps you have run into similar problems, since you are familiar
with the ggplot package.

The code is the following:

runmean_10yr_Nino_3_Index_17th_century <-
SMA(SST_NINO3_detrended[0:100],n=10)
runmean_10yr_Nino_3_Index_18th_century <-
SMA(SST_NINO3_detrended[101:200],n=10)
runmean_10yr_Nino_3_Index_19th_century <-
SMA(SST_NINO3_detrended[201:300],n=10)
runmean_10yr_Nino_3_Index_20th_century <-
SMA(SST_NINO3_detrended[301:400],n=10)


time_Nino3 <- c(1:100)
runmean_yearly_decadal_Nino_Index <-
data.frame(runmean_10yr_Nino_3_Index_17th_century,runmean_10yr_Nino_3_Index_18th_century,runmean_10yr_Nino_3_Index_19th_century,runmean_10yr_Nino_3_Index_20th_century,time_Nino3)
melted3 <- melt(runmean_yearly_decadal_Nino_Index , id.vars="time_Nino3")

dat.vline <- data.frame(xp = c(15, 23))

ggplot(data=melted3, aes(x=time_Nino3, y=value, colour=variable)) +
  geom_line() + ylab("SST Anomaly") + xlab("year") + ggtitle("Decadal El
Nino Index") + facet_grid(variable ~ .) +
  geom_vline(aes(xintercept = xp), data=dat.vline)


As you may have noticed, I created several plots below each other, and now I
would like to add vertical lines to these plots, hence adding
geom_vline(aes(xintercept = xp), data=dat.vline). This worked so far;
however, I would like the vertical lines at different places in each of the
four plots. Do you know how I could do this? I attached the plot I created
to this e-mail. You can see two lines in every one of the 4 plots (at the
same places). What I want is the same thing, but with the lines at different
places in every one of the 4 plots (independently). Thank you for the help!!
If you need to create the plots, you can simply change
runmean_10yr_Nino_3_Index_17th_century to any other vector.
Best Matthias


[R] hello everyone and happy new year

2016-01-06 Thread Matthias Worni
I ran into the following problem in RStudio. I'm trying to make a plot of
different time series using different colours, which so far worked fine
with the ggplot tool. However, I did not manage to attach a name to each
layer. Below you can see the code that I wrote; it works, meaning I get a
plot with all 9 time series from aa to ii. I just could not figure out how
to label the different layers. What should I do to add a name for the red
line, the green line, and so on beside the plot (in a list)? Thank you for
your time.

Best Matthias

aa <- OHC_d[40:70]
bb <- OHC_d[50:80]
cc <- OHC_d[173:203]
dd <- OHC_d[205:235]
ee <- OHC_d[273:303]
ff <- OHC_d[292:322]
gg <- OHC_d[302:332]
hh <- OHC_d[370:400]
ii <- OHC_d[381:411]

s3 <- data.frame(aa,bb,cc,dd,ee,ff,gg,hh,ii)
y2 <- ggplot(s3,aes(x=time2,y=aa),colour="black") +
  geom_line() +
  geom_line(data=s3,aes(y=bb),colour="red") +
  geom_line() +
  geom_line(data=s3,aes(y=cc),colour="green") +
  geom_line() +
  geom_line(data=s3,aes(y=dd),colour="blue") +
  geom_line() +
  geom_line(data=s3,aes(y=ee),colour="yellow") +
  geom_line() +
  geom_line(data=s3,aes(y=ff),colour="pink") +
  geom_line() +
  geom_line(data=s3,aes(y=gg),colour="purple")+
  geom_line()+
  geom_line(data=s3,aes(y=hh),colour="violet") +
  geom_line() +
  geom_line(data=s3,aes(y=ii),colour="grey")
y2
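The usual way to get a labelled legend is to reshape the series into long form and map colour inside aes(); hard-coded `colour="red"` outside aes() never produces a legend entry. A sketch with three stand-in series:

```r
library(ggplot2)

# Three stand-in series; in the real data these would be aa, bb, ..., ii
set.seed(1)
s3 <- data.frame(time2 = 1:31, aa = rnorm(31), bb = rnorm(31), cc = rnorm(31))

# Long format: one row per (time, series) pair
long <- data.frame(time2  = rep(s3$time2, 3),
                   series = rep(c("aa", "bb", "cc"), each = nrow(s3)),
                   value  = c(s3$aa, s3$bb, s3$cc))

# Mapping colour inside aes() creates the legend; scale_colour_manual()
# fixes the colours per series name
p <- ggplot(long, aes(x = time2, y = value, colour = series)) +
  geom_line() +
  scale_colour_manual(values = c(aa = "black", bb = "red", cc = "green"))
# print(p) to display
```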

[[alternative HTML version deleted]]



Re: [R] How to install pkg when "not found" ?

2015-12-30 Thread Prof. Dr. Matthias Kohl

use quotes!
install.packages("ggplot2")

Am 30.12.2015 um 06:41 schrieb Judson:


Using "Install Packages" from CRAN, in RStudio on Windows 7,
I downloaded (and supposedly installed)  ggplot2 package to here:

C:\Program Files\R\R-3.1.0\library\ggplot2_2.0.0\ggplot2


 when I try this:

require(ggplot2)

 I get the following:
Loading required package: ggplot2
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = 
TRUE,  :

   there is no package called 'ggplot2'


 I also tried this:

install.packages(ggplot2)

Error in install.packages : object 'ggplot2' not found

..


The above library has the usual things like MASS, stats, base, utils, graphics 
and so on
. and if I do this:

require(MASS)

 I get this:
Loading required package: MASS

 so that works


. Any idea what I'm doing wrong with ggplot2?
... judson blake




--
Prof. Dr. Matthias Kohl
www.stamats.de



[R] Compute Regressions with R-Function

2015-11-12 Thread Matthias Worni
Hello

I was trying to set up a function() that makes it easy to calculate
regressions for several dependent variables with the same independent
variables.

My function looked like this but was rejected with an error (I'm quite new
to RStudio). So here is my attempt:

b
CO2
logTrop_Aerosol
lm(b ~ CO2  + logTrop_Aerosol  )

Trend <- function(x,CO2,logTrop_Aerosol) { lm x   CO2  + logTrop_Aerosol}

b is a vector containing 400 values of heat content;
CO2 also contains 400 values, as does logTrop_Aerosol.

My idea is that I can simply replace x (heat content) by other
vectors containing heat content.

The error I got was the following:

Error: unexpected symbol in "Trend <- function(x,y,z) { lm x"
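The error is a syntax problem: lm() needs a formula such as `x ~ CO2 + logTrop_Aerosol`, not bare symbols after `lm`. A sketch of a working version (the vectors below are simulated stand-ins for the real data):

```r
# Simulated stand-ins for the real predictor and response vectors
set.seed(1)
CO2 <- rnorm(400)
logTrop_Aerosol <- rnorm(400)
b <- 1 + 2 * CO2 + rnorm(400)

# The dependent variable is passed as x; the predictors stay fixed
Trend <- function(x, CO2, logTrop_Aerosol) {
  lm(x ~ CO2 + logTrop_Aerosol)
}

fit <- Trend(b, CO2, logTrop_Aerosol)
coef(fit)  # intercept plus one slope per predictor
```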

Thanks a lot for the help!

Best Matthias



[R] ..lsmeans and coxme..

2015-05-28 Thread Matthias Kuhn
Dear list-eners,

I run into the following problem when I want to get contrasts from a coxme
model using the lsmeans package: A call to lsmeans on the coxme model
throws the following error:

Error in if (adjustSigma && object$method == "ML") V = V *
object$dims$N/(object$dims$N -  :
  missing value where TRUE/FALSE needed


I give an example:


library(coxme)
library(lsmeans)

fm <- coxme(Surv(y, uncens) ~ trt + (trt | center) + strata(center),
data=eortc)
summary(fm)

lsmeans(fm, ~ trt)



traceback points to the internal function lsmeans:::lsm.basis.lme()

Any ideas?

Thanks in advance.

-m



Re: [R] calculate value in dependence of target value

2015-03-09 Thread Matthias Weber
Hi David,

thanks for the reply. The numbers in my example were garbled; the target
value is 100000 (written 100.000 in German notation).
I have corrected the values in my example below.

Maybe you can understand it better now.

Crucially, the MARGE rises in dependence of the ID (the hurdles apply to the
running total of VALUE). ID 11 is counted at 2% because we don't reach the
50% hurdle (50000). ID 12 reaches the 50% hurdle, so ID 12 should be counted
as 1200 (result of 40000 * 2% + 10000 * 4%). The 10000 at 4% is credited at
the higher rate because it exceeds the 50% target value.

Thanks for your help.

Best regards.

Mat
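Assuming the hurdles apply to the running total of VALUE (an assumption, but one that reproduces all four MARGE figures in the example), the calculation can be sketched as:

```r
# Total marge earned by a cumulative value v under tiered rates:
# 2% up to 50000, 4% to 75000, 8% to 100000, 10% beyond
marge_total <- function(v, cuts = c(0, 50000, 75000, 100000, Inf),
                        rates = c(0.02, 0.04, 0.08, 0.10)) {
  # amount of v falling into each bracket, times that bracket's rate
  sum(pmax(pmin(v, cuts[-1]) - cuts[-length(cuts)], 0) * rates)
}

value <- c(10000, 50000, 30000, 20000)  # VALUEs for IDs 11..14
cumv  <- cumsum(value)

# marge per ID = total marge at its cumulative value minus the marge
# already credited to earlier IDs
marge <- diff(c(0, sapply(cumv, marge_total)))
marge  # 200 1200 1800 1800
```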

-Ursprüngliche Nachricht-
Von: David L Carlson [mailto:dcarl...@tamu.edu]
Gesendet: Montag, 9. März 2015 16:08
An: Matthias Weber; r-help@r-project.org
Betreff: RE: calculate value in dependence of target value

It is very hard to figure out what you are trying to do.

1. All of the VALUEs are greater than the target of 100.
2. Your description of what you want does not match your example.

Perhaps VALUE should be divided by 1000 (e.g. not 1, but 10)?
Perhaps your targets do not apply to VALUE, but to cumulative VALUE?

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthias Weber
Sent: Monday, March 9, 2015 7:46 AM
To: r-help@r-project.org
Subject: [R] calculate value in dependence of target value

Hello together,

I have a little problem. Maybe anyone can help me.

I have to calculate a new column in dependence of a target value.

As an example: My target value is 100.000. At the moment I have a data.frame with
the following values.

  ID   VALUE
1 11  10.000
2 12  50.000
3 13  30.000
4 14  20.000

The new column (MARGE) should be calculated with the following graduation:
Until the VALUE reaches 50% of the target value (50.000) = 2%

Until the VALUE reaches 75% of the target value (75.000) = 4%

Until the VALUE reaches 100% of the target value (100.000) = 8%

If the VALUE goes above 100% of the target value (100.000) = 10%

The result looks like this one:

  ID   VALUE  MARGE
1 11  10.000    200  (result of 10.000 * 2%)
2 12  50.000   1200  (result of 40.000 * 2% + 10.000 * 4%)
3 13  30.000   1800  (result of 15.000 * 4% + 15.000 * 8%)
4 14  20.000   1800  (result of 10.000 * 8% + 10.000 * 10%)

Is there any way to calculate the column MARGE automatically in R?
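Reading the expected results as graduated rates applied to the running total of VALUE against the target, a base-R sketch — the bracket boundaries and example values are read off the worked MARGE results, so treat them as assumptions:

```r
target <- 100000
rates  <- c(0.02, 0.04, 0.08, 0.10)             # 0-50%, 50-75%, 75-100%, >100%
breaks <- c(0, 0.50, 0.75, 1.00, Inf) * target  # bracket boundaries

df <- data.frame(ID = 11:14, VALUE = c(10000, 50000, 30000, 20000))

upper <- cumsum(df$VALUE)       # running total after each row
lower <- c(0, head(upper, -1))  # running total before each row

marge_row <- function(lo, hi) {
  # amount of (lo, hi] that falls into each bracket, times its rate
  part <- pmin(hi, breaks[-1]) - pmax(lo, breaks[-length(breaks)])
  sum(pmax(part, 0) * rates)
}
df$MARGE <- mapply(marge_row, lower, upper)
df
#   ID VALUE MARGE
# 1 11 10000   200
# 2 12 50000  1200
# 3 13 30000  1800
# 4 14 20000  1800
```

The per-row MARGE values reproduce the four worked examples above.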

Thanks a lot for your help.

Best regards.

Mat

This e-mail may contain trade secrets, privileged, undisclosed or otherwise 
confidential information. If you have received this e-mail in error, you are 
hereby notified that any review, copying or distribution of it is strictly 
prohibited. Please inform us immediately and destroy the original transmittal. 
Thank you for your cooperation.




[R] calculate value in dependence of target value

2015-03-09 Thread Matthias Weber
Hello together,

I have a little problem. Maybe anyone can help me.

I have to calculate a new column in dependence of a target value.

As an example: My target value is 100.000.
At the moment I have a data.frame with the following values.

  ID   VALUE
1 11  10.000
2 12  50.000
3 13  30.000
4 14  20.000

The new column (MARGE) should be calculated with the following graduation:
Until the VALUE reach 50% of the target value (50.000) = 2%
Until the VALUE reach 75% of the target value (75.000) = 4%
Until the VALUE reach 100% of the target value (100.000) = 8%
If the VALUE goes above 100% of the value (100.000) = 10%

The result looks like this one:

  ID   VALUE  MARGE
1 11  10.000    200  (result of 10.000 * 2%)
2 12  50.000   1200  (result of 40.000 * 2% + 10.000 * 4%)
3 13  30.000   1800  (result of 15.000 * 4% + 15.000 * 8%)
4 14  20.000   1800  (result of 10.000 * 8% + 10.000 * 10%)

Is there any way to calculate the column MARGE automatically in R?

Thanks a lot for your help.

Best regards.

Mat






[R] move date-values from one line to several lines

2014-12-02 Thread Matthias Weber
Hello together,

I have a data.frame with date values. What I want is a data.frame with
several lines for each date.

My current data.frame looks like this one:

ID  FROM         TO           REASON
1   2015-02-27   2015-02-28   Holiday
1   2015-03-15   2015-03-20   Illness
2   2015-05-20   2015-05-23   Holiday
2   2015-06-01   2015-06-03   Holiday
2   2015-07-01   2015-07-01   Illness

The result looks like this one:

ID   DATE         REASON
1    2015-02-27   Holiday
1    2015-02-28   Holiday
1    2015-03-15   Illness
1    2015-03-16   Illness
1    2015-03-17   Illness
1    2015-03-18   Illness
1    2015-03-19   Illness
1    2015-03-20   Illness
2    2015-05-20   Holiday
2    2015-05-21   Holiday
2    2015-05-22   Holiday
2    2015-05-23   Holiday
2    2015-06-01   Holiday
2    2015-06-02   Holiday
2    2015-06-03   Holiday
2    2015-07-01   Illness

Maybe anyone can help me, how I can do this.
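One base-R way to get there, assuming FROM and TO are read in as Date columns (a sketch over the example data):

```r
df <- data.frame(
  ID     = c(1, 1, 2, 2, 2),
  FROM   = as.Date(c("2015-02-27", "2015-03-15", "2015-05-20",
                     "2015-06-01", "2015-07-01")),
  TO     = as.Date(c("2015-02-28", "2015-03-20", "2015-05-23",
                     "2015-06-03", "2015-07-01")),
  REASON = c("Holiday", "Illness", "Holiday", "Holiday", "Illness")
)

# expand each FROM/TO interval into one row per day
out <- do.call(rbind, lapply(seq_len(nrow(df)), function(i)
  data.frame(ID     = df$ID[i],
             DATE   = seq(df$FROM[i], df$TO[i], by = "day"),
             REASON = df$REASON[i])))
out
```

seq() on Date objects with by = "day" produces the daily sequence, so the single-day interval on 2015-07-01 yields exactly one row.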

Thank you.

Best regards.

Mat






[R] merge 2 data.frames in dependence of 2 values

2014-11-13 Thread Matthias Weber
Hello together,

I have a little problem. Maybe anyone can help me.

I have 2 data.frames, which look like as follows:
First:

  NAME   MONTH    BONUS
1 Andy   2014-10    100
2 Pete   2014-10    200
3 Marc   2014-10    300
4 Andy   2014-11    400

Second:

  NAME   MONTH    BONUS_2
1 Andy   2014-10      150
2 Pete   2014-11      180
3 Jason  2014-10      190
4 Paul   2014-10      210
5 Andy   2014-11       30

How can I merge this 2 data.frames, if I want the following result:

  NAME   MONTH    BONUS  BONUS_2
1 Andy   2014-10    100      150
2 Pete   2014-11             180
3 Marc   2014-10    300
4 Andy   2014-11    400       30
5 Pete   2014-10    200
6 Jason  2014-10             190
7 Paul   2014-10             210

The important thing is that for every match in the columns NAME and
MONTH I get a single combined line.
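That is a full outer join, which base R's merge() does directly — one row per NAME/MONTH combination, with NA where a bonus is missing (a sketch over the example data):

```r
first <- data.frame(NAME  = c("Andy", "Pete", "Marc", "Andy"),
                    MONTH = c("2014-10", "2014-10", "2014-10", "2014-11"),
                    BONUS = c(100, 200, 300, 400))
second <- data.frame(NAME    = c("Andy", "Pete", "Jason", "Paul", "Andy"),
                     MONTH   = c("2014-10", "2014-11", "2014-10", "2014-10", "2014-11"),
                     BONUS_2 = c(150, 180, 190, 210, 30))

# all = TRUE keeps rows that appear in only one of the two data.frames
merge(first, second, by = c("NAME", "MONTH"), all = TRUE)
```

With all = TRUE the key union has seven NAME/MONTH combinations, matching the desired result.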

Thanks for your help.

Best regards.

Mat






[R] split a field in dependence of the semicolon

2014-11-03 Thread Matthias Weber

Hello together,

I have a little problem, maybe anyone can help me.

I have a data.frame which looks like this one:

  ID  Members
1  1  ; abc; def; ghi
2  2  ; abc;
3  3  ;def;

How can I create another column for each value between 2 semicolons?

The result looks like this one:

  ID  Members          Member1  Member2  Member3
1  1  ; abc; def; ghi  abc      def      ghi
2  2  ; abc;           abc
3  3  ;def;            def

Maybe anyone can help me. Thank you.
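A base-R sketch using strsplit(), dropping the empty pieces that the leading and trailing semicolons produce:

```r
df <- data.frame(ID = 1:3,
                 Members = c("; abc; def; ghi", "; abc;", ";def;"),
                 stringsAsFactors = FALSE)

# split on ";", trim whitespace, drop empty fragments
parts <- lapply(strsplit(df$Members, ";"), function(x) {
  x <- trimws(x)
  x[nzchar(x)]
})

# one MemberN column per position, padded with "" where absent
n <- max(lengths(parts))
for (i in seq_len(n))
  df[[paste0("Member", i)]] <-
    sapply(parts, function(x) if (length(x) >= i) x[i] else "")
df
```

lengths() gives the number of members per row, so the loop creates exactly as many MemberN columns as the longest row needs.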

Best regards.

Mat







Re: [R] dotplot with library lattice

2014-10-30 Thread Matthias Weber
Hi Jim,

the graph looks nearly perfect at the moment. I have one last question:
how can I change the scaling of the x-axis? At the moment I see the values at
0, 20, 40, 60, 80, 100.

My wish is that the x-axis matches the abline, so the scaling of the
x-axis should be at 0, 25, 50, 75, 100.

Thanks a lot.

Best regards.

Mat
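For the record, the usual base-graphics route is to suppress the default axis and draw one at the desired positions — a minimal sketch, not Jim's actual script:

```r
# draw the plot without an x-axis, then place ticks to match the gridlines
plot(c(0, 100), c(1, 13), type = "n", xaxt = "n",
     xlab = "IST/Budget (prozent)", ylab = "KOST")
abline(v = c(25, 50, 75), lty = 2, col = "lightgray")
axis(1, at = seq(0, 100, by = 25))  # ticks at 0, 25, 50, 75, 100
```

xaxt = "n" only suppresses the x-axis; the subsequent axis(1, at = ...) call puts ticks exactly where the vertical ablines sit.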

-Original Message-
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Monday, 27 October 2014 09:47
To: Matthias Weber
Cc: r-help@r-project.org
Subject: Re: AW: [R] dotplot with library lattice

On Mon, 27 Oct 2014 08:53:51 AM Matthias Weber wrote:
 Hi Jim,

 looks perfect to me. Thank you very much. One last question. Is there
 any possibility to add 3 horizontal lines in the graph. One at 25%, the
 other one at 50% and the last one at 75%? So I can see the progress a
 bit better?

Hi Mat,
Change this:

abline(h=1:13, lty=2, col="lightgray")

to this:

abline(h=1:13, v=c(25,50,75), lty=2, col="lightgray")

Jim





Re: [R] dotplot with library lattice

2014-10-27 Thread Matthias Weber
Hi Jim,

looks perfect to me. Thank you very much. One last question: is there any
possibility to add 3 horizontal lines to the graph, one at 25%, another at
50% and the last at 75%, so I can see the progress a bit better?

Thank you.

Best regards.

Mat

-Original Message-
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Saturday, 25 October 2014 03:27
To: Matthias Weber
Cc: r-help@r-project.org
Subject: Re: AW: [R] dotplot with library lattice

On Fri, 24 Oct 2014 12:49:39 PM Matthias Weber wrote:

 I want always to see the progress of the green button in dependence
 of the blue button at 100%.

Okay.

mwdat <- read.table(text="
KOST  Budget IST
1060 -2.18 0
1080 91037.71 91647.15
1100 955573.87 907938.98
1120 23326.8 0
1150 2521.57 0
1180 51302.03 48760.45
1200 2027.04 -1667.5
1210 2385.03 2386.06
1220 0 0
1250 528.87 0
1255 766.54 0
1260 12154.97 4861.41
Gesamtbudget 1141622.25 1054236.55",
 header=TRUE)
par(las=1, mar=c(5,7,4,6))
plot(rep(100,13), 1:13, main="IST against Budget", xlab="IST/Budget (prozent)",
 ylab="KOST", xlim=c(0,100), type="n", yaxt="n")
abline(h=1:13, lty=2, col="lightgray")
points(rep(100,13), 1:13, pch=19, col="blue", cex=3)
mwdat$ISTpct <- 100*mwdat$IST/mwdat$Budget
# fix divide by zero
mwdat$ISTpct[9] <- 0
points(mwdat$ISTpct, 1:13, pch=19, col="green", cex=3)
legend(105, 10, c("Budget","IST"), pch=19,
 col=c("blue","green"), bty="n", xpd=TRUE)
axis(2, at=1:13, labels=mwdat$KOST)
par(las=0, mar=c(5,4,4,2))

Jim





Re: [R] dotplot with library lattice

2014-10-24 Thread Matthias Weber
Hi Jim,

thanks for your reply.

Doesn't look very bad, but what I mean with "Budget 100" is the percentage of
the progress on the x-axis. As an example, the Budget for 1120 is 23326.8. At
the moment we have 0 as value for IST. In this case, the blue button should
stand at 100% (x-axis = progress in %) on the right side and the green button
is on the x-axis at 0 (left side), because the value is 0. If the value for
1120 rises to 11663, the blue button is still at 100% on the right side
(the blue button is in all cases on the right side) and the green button is
right in the middle, because the Budget is 50% full.

I always want to see the progress of the green button in dependence of the
blue button at 100%.

Maybe you can help me.

Thank you.

Best regards.

Mat

-Original Message-
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: Friday, 24 October 2014 11:29
To: r-help@r-project.org
Cc: Matthias Weber
Subject: Re: [R] dotplot with library lattice

On Thu, 23 Oct 2014 05:57:27 PM Matthias Weber wrote:
 Hello together,

 I have a short question. Maybe anyone can help me to create a
 barplot in R with the package lattice.

 I have the following data as output values and the following code:

 Data (d):

 KOST  Budget IST
 1060 -2.18 0
 1080 91037.71 91647.15
 1100 955573.87 907938.98
 1120 23326.8 0
 1150 2521.57 0
 1180 51302.03 48760.45
 1200 2027.04 -1667.5
 1210 2385.03 2386.06
 1220 0 0
 1250 528.87 0
 1255 766.54 0
 1260 12154.97 4861.41
 Gesamtbudget 1141622.25 1054236.55

 Code:

 ### read the data
 d$KOST <- ordered( d$KOST, levels = d$KOST)

 ### load lattice and grid
 require( lattice )

 ### setup the key
 k <- simpleKey( c( "Budget", "IST" ) )
 k$points$fill <- c("blue", "darkgreen")
 k$points$pch <- 21
 k$points$col <- "black"
 k$points$cex <- 1

 ### create the plot
 dotplot( KOST ~ Budget + IST , data = d, horiz = TRUE,
  par.settings = list(
  superpose.symbol = list(
  pch = 21,
  fill = c( "blue", "darkgreen"),
  cex = 3,
  col = "black"
  )
   ) , xlab = "Kostenstellenübersicht", key = k,
   panel = function(x, y,  ...){
 panel.dotplot( x, y, ... )
 #   grid.text(
 #   unit( x, "native") , unit( y, "native") ,
 #  label = x, gp = gpar( cex = .7 ) )
   } )

 The result looks like the attached graph. But this is not exactly what
 I want. I want the Budget on the right side (100), and the IST-Value in
 dependence of the Budget between 0 and 100. As an example: if there is a
 budget over 100.000 and the IST-Value is around 50.000, the blue button
 should be on the right side and the green button right in the middle.

 Maybe anyone can help me.

 Thank you.

 Best regards.

 Mat


Hi Mat,
I must admit that I have probably misunderstood your request. I have assumed 
that you want the Budget as the x axis and the KOST as the y axis. You seemed 
to be asking for Budget to be always 100 and that didn't make sense. For some 
reason the scipen option stopped working for me after the first plot and so I 
couldn't get rid of the scientific notation on the x axis. Also this was done 
in base graphics rather than lattice. I also left the Gesamtbudget (Total 
budget) out as it would have squeezed most of the dots even farther to the left.
However, it might help.

mwdat <- read.table(text="
KOST  Budget IST
1060 -2.18 0
1080 91037.71 91647.15
1100 955573.87 907938.98
1120 23326.8 0
1150 2521.57 0
1180 51302.03 48760.45
1200 2027.04 -1667.5
1210 2385.03 2386.06
1220 0 0
1250 528.87 0
1255 766.54 0
1260 12154.97 4861.41",
 header=TRUE)
options(scipen=4)
par(las=1)
plot(mwdat$Budget, mwdat$KOST, main="IST against Budget",
 xlab="Budget", ylab="KOST", xlim=range(mwdat$Budget), ylim=range(mwdat$KOST),
 type="n", yaxt="n")
abline(h=mwdat$KOST, lty=2, col="lightgray")
points(mwdat$Budget, mwdat$KOST, pch=19, col="blue", cex=3)
points(mwdat$IST, mwdat$KOST, pch=19, col="green", cex=3)
legend(40, 1250, c("Budget","IST"), pch=19,
 col=c("blue","green"), bty="n")
options(scipen=0)
axis(2, at=mwdat$KOST)
par(las=0)

Jim



[R] dotplot with library lattice

2014-10-23 Thread Matthias Weber
Hello together,

I have a short question. Maybe anyone can help me to create a barplot in R with
the package lattice.

I have the following data as output values and the following code:

Data (d):

KOST  BudgetIST
1060 -2.18  0
1080  91037.71   91647.15
1100  955573.87  907938.98
1120  23326.8  0
1150 2521.57  0
1180 51302.03   48760.45
1200  2027.04-1667.5
1210 2385.032386.06
1220   0  0
1250  528.87  0
1255 766.54  0
1260 12154.974861.41
Gesamtbudget 1141622.25 1054236.55

Code:

### read the data
d$KOST <- ordered( d$KOST, levels = d$KOST)

### load lattice and grid
require( lattice )

### setup the key
k <- simpleKey( c( "Budget", "IST" ) )
k$points$fill <- c("blue", "darkgreen")
k$points$pch <- 21
k$points$col <- "black"
k$points$cex <- 1

### create the plot
dotplot( KOST ~ Budget + IST , data = d, horiz = TRUE,
 par.settings = list(
 superpose.symbol = list(
 pch = 21,
 fill = c( "blue", "darkgreen"),
 cex = 3,
 col = "black"
 )
  ) , xlab = "Kostenstellenübersicht", key = k,
  panel = function(x, y,  ...){
panel.dotplot( x, y, ... )
#   grid.text(
#   unit( x, "native") , unit( y, "native") ,
#  label = x, gp = gpar( cex = .7 ) )
  } )

The result looks like the attached graph. But this is not exactly what I want. I
want the Budget on the right side (100), and the IST-Value in dependence of
the Budget between 0 and 100. As an example: if there is a budget over 100.000
and the IST-Value is around 50.000, the blue button should be on the right
side and the green button right in the middle.

Maybe anyone can help me.

Thank you.

Best regards.

Mat





[R] sort a data.frame in a different way

2014-10-22 Thread Matthias Weber
Hello together,

I have a little problem. Maybe anyone can help me.

I have a data.frame which looks like this one:

  1000 1001 1002 1003
1    5    6   12   11
2    4    3    8    1

What I need is a data.frame which looks like this one: the column names should
be in the first column, and the values to the right of the column name. The
solution looks like this one:

      A   B
1000  5   4
1001  6   3
1002 12   8
1003 11   1

Maybe anyone can help me.

Thank you.

Best regards. Mat
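Base R's t() does this directly; as.data.frame() turns the transposed matrix back into a data.frame (a sketch over the example values):

```r
df <- data.frame(`1000` = c(5, 4), `1001` = c(6, 3),
                 `1002` = c(12, 8), `1003` = c(11, 1),
                 check.names = FALSE)

out <- as.data.frame(t(df))   # columns become rows; rownames are the old names
colnames(out) <- c("A", "B")
out
```

check.names = FALSE keeps the numeric-looking column names as-is instead of mangling them to X1000 etc.; after transposing they end up as rownames.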







[R] strange behavior of the compare operator

2014-10-03 Thread Matthias Salvisberg
I had a strange behavior of a function written a few days ago. I
narrowed the problem down to the following minimal example.

Can anyone explain why the following comparisons don't give the
same correct answer?


Thanks for your reply!

Matthias




R version 3.1.1 (2014-07-10) -- Sock it to Me
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.1.1

> rm(list = ls())
> myDataFrame <- data.frame(var1 = seq(from = -1, to = 0, by = 0.01))
> any(myDataFrame$var1 == (0.68-1))
[1] TRUE
> any(myDataFrame$var1 == -0.32)
[1] FALSE
> myDataFrame$var1[69]
[1] -0.32
> str((0.68-1))
 num -0.32
> str(-0.32)
 num -0.32
> str(myDataFrame$var1)
 num [1:101] -1 -0.99 -0.98 -0.97 -0.96 -0.95 -0.94 -0.93 -0.92 -0.91 ...
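The transcript above is the classic floating-point pitfall: seq() builds its values as -1 + k*0.01 in binary arithmetic, so element 69 is not bit-identical to the literal -0.32, while 0.68-1 happens to round to the same bits as the seq() element. Comparing with a tolerance avoids the problem; a sketch:

```r
x <- seq(from = -1, to = 0, by = 0.01)

abs(x[69] - (-0.32)) < 1e-12     # TRUE: compare with an explicit tolerance
isTRUE(all.equal(x[69], -0.32))  # TRUE: all.equal() compares with a tolerance
```

This is why `==` on computed doubles is discouraged; all.equal() (wrapped in isTRUE()) or an explicit abs() tolerance is the reliable replacement.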




[R] ..nlme: start values necessary even though using self-start function?..

2014-09-30 Thread Matthias Kuhn
Dear list.

I am stuck with a subscript out of bounds error when I play with an
nlme example from the yellow Pinheiro/Bates book. It is about using
nlme() with the formula interface for non-linear mixed models.

The following code snippet is inspired by the book (in section 8.2),
using the Orange-example data from R datasets containing repeated
measures of trunk circumferences over time for 5 trees. It is a
nfnGroupedData with structure circumference ~ age | Tree.


nlme( circumference ~ SSlogis(age, Asym, xmid, scal), data=Orange,
  fixed= Asym + xmid + scal ~ 1)

It results in the following error:
Error in nlme.formula(circumference ~ SSlogis(age, Asym, xmid, scal),  :
  subscript out of bounds

Specifying the grouping as in groups=~Tree does not help, either.

When I do provide start values, it works, though.

nlme( circumference ~ SSlogis(age, Asym, xmid, scal), data=Orange,
  start=c(Asym = 200, xmid=725, scal=350),
  fixed= Asym + xmid + scal ~ 1)

Beginning of result output:

Nonlinear mixed-effects model fit by maximum likelihood
Model: circumference ~ SSlogis(age, Asym, xmid, scal)
Data: Orange
Log-likelihood: -129.9907
Fixed: Asym + xmid + scal ~ 1
Asym xmid scal
192.0959 727.5887 356.6011
...



SSlogis() is a selfStart object and should be able to get its own
starting values like so:
getInitial(circumference ~ SSlogis(age, Asym, xmid, scal), data=Orange)


So the start= argument should not be necessary.
What am I missing here? Any ideas?

Thanks in advance..

-m



ps This is my setting:

 sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] nlme_3.1-117 devtools_1.5

loaded via a namespace (and not attached):
 [1] digest_0.6.4evaluate_0.5.5  grid_3.1.1  httr_0.5
lattice_0.20-29 memoise_0.2.1   parallel_3.1.1  RCurl_1.95-4.3
stringr_0.6.2   tools_3.1.1 whisker_0.3-2



And regarding my nlme-package from CRAN:

Information on package ‘nlme’


Description:


Package:nlme

Version:3.1-117

Date:   2014-03-31

Priority:   recommended

Title:  Linear and Nonlinear Mixed Effects Models

Authors@R:  c(person("José", "Pinheiro", role = "aut", comment
= "S version"), person("Douglas", "Bates", role = "aut", comment = "up
to 2007"), person("Saikat", "DebRoy", role = "ctb", comment = "up
  to 2002"), person("Deepayan", "Sarkar", role = "ctb",
comment = "up to 2005"), person("EISPACK authors", role = "ctb",
comment = "src/rs.f"), person("R-core", email =

"r-c...@r-project.org", role = c("aut", "cre")))

Description:Fit and compare Gaussian linear and nonlinear
mixed-effects models.

Depends:graphics, stats, R (>= 3.0.0)

Imports:lattice

Suggests:   Hmisc, MASS

LazyData:   yes

ByteCompile:yes

Encoding:   UTF-8

License:GPL (>= 2)

BugReports: http://bugs.r-project.org

Packaged:   2014-03-31 07:56:40 UTC; ripley

Author: José Pinheiro [aut] (S version), Douglas Bates
[aut] (up to 2007), Saikat DebRoy [ctb] (up to 2002), Deepayan Sarkar
[ctb] (up to 2005), EISPACK authors [ctb] (src/rs.f), R-core [aut,

cre]

Maintainer: R-core r-c...@r-project.org

NeedsCompilation:   yes

Repository: CRAN

Date/Publication:   2014-03-31 10:16:43

Built:  R 3.1.1; x86_64-apple-darwin13.1.0; 2014-07-11
12:37:41 UTC; unix



Re: [R] Could someone recommend a package for time series?

2014-09-29 Thread Bannert Matthias
Hi,

the CRAN Task View about time series that Pascal just sent is certainly
helpful.
Personally I think the standard ts() can already do a lot for you. Apart from
that, I like zoo, particularly if you have other than yearly frequencies, like
quarterly or even irregular frequencies, to handle.

This is a pretty nice applied course that uses R for illustration:
https://stat.ethz.ch/education/semesters/ss2012/atsa

Apart from this course, I think Prof. Rob Hyndman (who also has a blog) has a
lot of useful things to say when it comes to time series analysis with R.

best

Matthias Bannert

KOF Swiss Economic Institute, Switzerland 
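For the monthly labelling, window extraction, and quarterly aggregation Miao describes, base ts() already suffices; a sketch with dummy data (274 monthly points cover 1991M10 through 2014M07):

```r
x <- ts(rnorm(274), start = c(1991, 10), frequency = 12)  # 1991M10 .. 2014M07

y <- window(x, start = c(2005, 1), end = c(2010, 12))     # extract 2005M01 .. 2010M12

q <- aggregate(y, nfrequency = 4, FUN = sum)              # monthly -> quarterly sums
```

window() handles the date-range extraction and aggregate.ts() the monthly-to-quarterly step; zoo/xts offer the same operations for irregular series.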


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
jpm miao [miao...@gmail.com]
Sent: Monday, September 29, 2014 11:47 AM
To: Pascal Oettli
Cc: mailman, r-help
Subject: Re: [R] Could someone recommend a package for time series?

Thanks Pascal! It certainly helps!

2014-09-29 17:10 GMT+08:00 Pascal Oettli kri...@ymail.com:

 Hi Miao,

 You certainly will find useful answers here :
 http://cran.r-project.org/web/views/TimeSeries.html

 Regards,
 Pascal Oettli

 On Mon, Sep 29, 2014 at 6:05 PM, jpm miao miao...@gmail.com wrote:
  Hi,
 
 I've not used R for about one year and don't know well about the
 updates
  on the time series-related package.
 
My primary job is to do economic/financial time series data analysis -
  annual, monthly, daily, etc. I usually read data by the package
  XLConnect, which can read xls or xlsx files directly. It's excellent.
  However I can't find a package to manipulate time series data. For
 example,
  I just want to do an easy manipulation , e.g, to label the dates of the
  data from , say, 1991M10 to 2014M07, and then extract part of the data,
  say, 2005M01 to 2010M12 and do analysis. Is there any package work well
 for
  my purpose?
 
I sometimes need to aggregate monthly data to quarterly data and I find
  aggregate function helpful.
 
In the past I used packages xts, zoo and don't find it really user
  friendly. Maybe I haven't mastered it; maybe there're some updates
 (which I
  don't know) now. Could someone recommend a package or provide an example
  (or just the document, I can read it) for my purpose?
 
 Attached is an exemplary data set I talked about.
 
 Thanks,
 
  Miao
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Pascal Oettli
 Project Scientist
 JAMSTEC
 Yokohama, Japan





[R] Propensity Score Matching with Restrictions (Matching package)

2014-09-19 Thread Bannert Matthias
Dear listers,

I am using Jas Sekhon's Matching package to perform propensity score matching
and like it a lot so far.
Still, I wonder whether it is possible to impose restrictions on the
matching. I've seen the restrict parameter, but I doubt this is what I want
(or I don't get how to use it neatly).

Here's what I do:
I pool several years of observations into one dataset and estimate a
propensity score
using a standard glm / probit approach. I then use the fitted values as
propensity scores and perform the
matching.

Here's what I would like to do (in addition):
restrict the matching in a way that only allows for matches within the same
year, so that only propensity scores within the same year are considered for a
respective match. Note that I don't want to estimate the years separately.

Is there a way to do so?
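As I understand the Matching package, Match() accepts an exact argument (a logical vector over the columns of X), so adding the year as an extra column and flagging it exact should confine matches to the same year; I have not run that against this data, so take it as a pointer rather than a confirmed recipe. The restriction itself can be sketched package-free with made-up data (year, treat, pscore are illustrative names):

```r
set.seed(1)
d <- data.frame(year   = rep(2010:2012, each = 40),
                treat  = rep(c(1, 0), 60),
                pscore = runif(120))

# for each treated unit, pick the control with the nearest pscore in the SAME year
matches <- do.call(rbind, lapply(split(d, d$year), function(g) {
  tr <- g[g$treat == 1, ]
  co <- g[g$treat == 0, ]
  tr$match_ps <- sapply(tr$pscore,
                        function(p) co$pscore[which.min(abs(co$pscore - p))])
  tr
}))
nrow(matches)  # one matched control pscore per treated unit
```

split() by year guarantees that candidate controls never come from a different year, which is exactly the restriction asked for.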

best regards

matthias

---
Matthias Bannert
KOF Swiss Economic Institute
Leonhardstrasse 21
8045 Zürich
Switzerland



Re: [R] C.D.F

2014-08-11 Thread Prof. Dr. Matthias Kohl

Dear Diba,

you could try package distr; eg.

library(distr)
G1 <- Gammad(scale = 0.7, shape = 0.5)
G2 <- Gammad(scale = 2.1, shape = 1.7)
G3 <- G1 + G2  # convolution
G3

For the convolution, exact formulas are applied if available; otherwise 
FFT is used. See also http://www.jstatsoft.org/v59/i04/ (to appear 
soon) or a previous version at http://arxiv.org/abs/1006.0764
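A base-R cross-check can be sketched by numerically integrating the convolution integral P(X1 + X2 <= q) = integral from 0 to q of f1(x) * F2(q - x) dx; the function name `conv_cdf` and the use of dgamma's shape/scale parametrisation are assumptions here, not from the thread.

```r
# CDF of the sum of two independent gamma variables via the
# convolution integral, evaluated with integrate().
conv_cdf <- function(q, shape1, scale1, shape2, scale2) {
  integrand <- function(x)
    dgamma(x, shape = shape1, scale = scale1) *
    pgamma(q - x, shape = shape2, scale = scale2)
  integrate(integrand, lower = 0, upper = q)$value
}

# Same parameters as the distr example above:
conv_cdf(5, shape1 = 0.5, scale1 = 0.7, shape2 = 1.7, scale2 = 2.1)
```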


hth
Matthias

On 11.08.2014 at 10:17, pari hesabi wrote:

Hello everybody,

Can anybody help me to write a program for the CDF of the sum of two independent 
gamma random variables (convolution of two gamma distributions) with 
different parameter values (the shape parameters are the same)?

Thank you

Diba




--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] legendre quadrature

2014-05-01 Thread Prof. Dr. Matthias Kohl

you could use package distrEx:

library(distrEx)
GLIntegrate(function(x) x^2, lower = -1, upper = 1, order = 50)

hth
Matthias

On 01.05.2014 09:43, pari hesabi wrote:

Hello everybody
I need to approximate an integral using Legendre quadrature. I have
written a program which doesn't give a sensible answer; can anybody
help me and send the correct program? For example, the approximate
value of the integral of x^2 on (-1, 1) based on the Legendre quadrature rule.




integrand <- function(x) {x^2}
rules <- legendre.quadrature.rules( 50 )
order.rule <- rules[[50]]
chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1)
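For comparison, a self-contained base-R sketch of a Gauss-Legendre rule via the Golub-Welsch eigenvalue method; the helper name `gauss_legendre` is made up, and the exact value of the example integral of x^2 on (-1, 1) is 2/3.

```r
# Nodes and weights from the symmetric tridiagonal Jacobi matrix of the
# Legendre recurrence: eigenvalues are the nodes, and the weights come
# from the first component of each (normalised) eigenvector.
gauss_legendre <- function(n) {
  k <- seq_len(n - 1)
  b <- k / sqrt(4 * k^2 - 1)  # off-diagonal of the Jacobi matrix
  J <- diag(0, n)
  J[cbind(k, k + 1)] <- b
  J[cbind(k + 1, k)] <- b
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = 2 * e$vectors[1, ]^2)
}

gl <- gauss_legendre(50)
sum(gl$weights * gl$nodes^2)  # close to 2/3
```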

Thank you
Diba



--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] joint distribution

2014-04-22 Thread Prof. Dr. Matthias Kohl

have a look at our package distr:

library(distr)
x1 <- Norm(mean = 0, sd = 1)
x2 <- Binom(size = 1, prob = 0.75)
x3 <- x1 + x2
plot(x3)
# to get density, cdf, quantile, and random numbers use
d(x3)(5)
p(x3)(0)
q(x3)(0.975)
r(x3)(20)

# you can also have additional coefficients; e.g.
x4 <- 3*x1 + 2*x2
plot(x4)

For more details on the computations see http://arxiv.org/abs/1006.0764
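If distr is not available, a plain Monte Carlo sketch gives much the same picture (b1 and b2 stand for the fixed coefficients from the question; the values 3 and 2 are illustrative only):

```r
set.seed(1)
b1 <- 3; b2 <- 2
n <- 1e5
# Simulate b1*x1 + b2*x2 with x1 ~ N(0, 1) and x2 ~ Bernoulli(0.75),
# then estimate the density with a kernel density estimator.
y <- b1 * rnorm(n) + b2 * rbinom(n, size = 1, prob = 0.75)
plot(density(y))
```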

hth,
Matthias

On 22.04.2014 13:11, Ms khulood aljehani wrote:


Hello
i have two independent variables: x1 from normal(0, 1), x2 from bernoulli(0.75).
i need the density estimation of (b1*x1) + (b2*x2),
where b1 and b2 are two fixed coefficients.
thank you



--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] Mistakes in date conversion for future date/time (POSIXct)

2014-04-07 Thread Winkler, Matthias
Thank you for your answers. They helped me a lot!

I used R 3.1.0 RC and the mistake didn't show up. 

I also made an additional test, the same as David McPearson did: I tried it 
again with R 3.0.3 on a PC with 2 OS (Windows 7 and Linux). The error showed up 
on the Windows system but not on Linux. So this seems to be a problem related 
to Windows.

Kind regards
Matthias

 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David McPearson
Sent: Sunday, 6 April 2014 12:19
To: r-help@r-project.org
Subject: Re: [R] Mistakes in date conversion for future date/time (POSIXct)

I _do_ see this error - on R 3.0.3 / Win XP however, not on R 2.11.1 / Linux.
(Same hardware, 2 x OS, 2 x R versions)

Maybe it's peculiar to 'doze...

datetimesequenz <- seq.POSIXt(from=as.POSIXct("1960-01-01 00:00"),
to=as.POSIXct("2100-01-01 00:00"), by="1 hour") 
levels(as.factor(strftime(datetimesequenz, format="%Y")))
  [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972
 [14] 1973 1974 ...
 ...
[183] 2154 2155 2157 2158 2159 2160 2161 2162 2167

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C  
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] grDevices datasets  splines   graphics  stats tcltk utils
methods   base 

other attached packages:
[1] svSocket_0.9-55 TinnR_1.0-5 R2HTML_2.2.1Hmisc_3.12-2   
Formula_1.1-1
[6] survival_2.37-4

loaded via a namespace (and not attached):
[1] cluster_1.14.4  fortunes_1.5-0  grid_3.0.2  lattice_0.20-23
rpart_4.1-3
[6] svMisc_0.9-69   tools_3.0.2  


Cheers,
D.

On Fri, 4 Apr 2014 14:19:09 -0700 David Winsemius dwinsem...@comcast.net wrote

 On Apr 4, 2014, at 9:54 AM, Duncan Murdoch wrote:
 
  On 04/04/2014 10:55 AM, Winkler, Matthias wrote:
  Dear R-users,
  
  I'm working on datasets which contain data from the years 1960 to 
  2100
..
  I also produced a date/time-sequence in R, which showed the same 
  mistakes (see example below). The mistakes occur at the same dates 
  as in my datasets. It's always at the end of March.
   datetimesequenz <- seq.POSIXt(from=as.POSIXct("1960-01-01 00:00"),
   to=as.POSIXct("2100-01-01 00:00"), by="1 hour")
   levels(as.factor(strftime(datetimesequenz, format="%Y")))
   [1] 1960 1961 1962 ...
  [181] 2152 2153 2154 2156 2157 2158 2159 2160 2161
  2166 
  Has anybody experienced the same problem and knows a workaround?
  
  I'm using R 3.0.1 under Windows 7 64bit. I also tried this with R 
  3.0.3, it showed the same problem. Thank you for your help!
  
  I don't see this in 3.1.0 beta.  Do you?
 
 I'm not seeing it on a Mac in 3.0.2 either.
 
  max(datetimesequenz)
 [1] 2100-01-01 PST
  length(datetimesequenz)
 [1] 1227241
 
  
  Duncan Murdoch
 
 David Winsemius
 Alameda, CA, USA
 







[R] Mistakes in date conversion for future date/time (POSIXct)

2014-04-04 Thread Winkler, Matthias
Dear R-users,

I'm working on datasets which contain data from the years 1960 to 2100 with a 
timestep of one hour. Every year has 365 days, leap years are ignored.
After reading the dataset with R I convert the column which contains date/time 
to POSIXct:

as.POSIXct(strptime(MyData[,1], format="%d.%m.%Y : %H"))

After that, I divide the data with split() into parts of one year each. Then I 
noticed that the years in some rows are obviously converted wrongly: they 
show years larger than 2100 (see example below).
I've controlled my original dataset, but the dates are correct there.

I also produced a date/time-sequence in R, which showed the same mistakes (see 
example below). The mistakes occur at the same dates as in my datasets. It's 
always at the end of March.

 datetimesequenz <- seq.POSIXt(from=as.POSIXct("1960-01-01 00:00"), 
 to=as.POSIXct("2100-01-01 00:00"), by="1 hour")
 levels(as.factor(strftime(datetimesequenz, format="%Y")))
  [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 
1970 1971 1972 1973 1974 1975 1976 1977
[19] 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 
1988 1989 1990 1991 1992 1993 1994 1995
[37] 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 
2006 2007 2008 2009 2010 2011 2012 2013
[55] 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 
2024 2025 2026 2027 2028 2029 2030 2031
[73] 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 
2042 2043 2044 2045 2046 2047 2048 2049
[91] 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 
2060 2061 2062 2063 2064 2065 2066 2067
[109] 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 
2078 2079 2080 2081 2082 2083 2084 2085
[127] 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 
2096 2097 2098 2099 2100 2101 2102 2103
[145] 2105 2107 2109 2110 2111 2112 2113 2114 2115 2117 
2118 2120 2121 2122 2124 2125 2126 2128
[163] 2129 2130 2131 2132 2133 2135 2137 2138 2139 2140 
2141 2142 2143 2145 2146 2148 2149 2150
[181] 2152 2153 2154 2156 2157 2158 2159 2160 2161 2166

Has anybody experienced the same problem and knows a workaround?

I'm using R 3.0.1 under Windows 7 64bit. I also tried this with R 3.0.3, it 
showed the same problem.
Thank you for your help!

Kind regards,
Matthias
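One hedged workaround sketch, on the assumption (suggested by the thread) that the fault lies in the OS strftime() call on Windows: extract the year from the POSIXlt components instead of formatting. On an affected system this should still be verified against the expected 1960-2100 range.

```r
seq_hours <- seq.POSIXt(from = as.POSIXct("1960-01-01 00:00", tz = "UTC"),
                        to   = as.POSIXct("2100-01-01 00:00", tz = "UTC"),
                        by   = "1 hour")
# $year holds years since 1900; no strftime() format string is involved.
years <- as.POSIXlt(seq_hours)$year + 1900
range(years)
```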






[R] computationally singular

2014-03-19 Thread Matthias Endres
Hi Weiwei,

I found your post on Mahalanobis distances five years ago.

  Error in solve.default(cov, ...) : system is computationally singular:
    reciprocal condition number = 1.09501e-25

I have the same problem now - and I don't know how to deal with it. I would be 
thankful if you could help me with that.

All the best,
Matthias
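One common workaround (my suggestion, not from the thread) is to replace solve() with a generalised inverse when the covariance matrix is numerically singular; `maha_ginv` below is a made-up name mirroring base R's mahalanobis() but using MASS::ginv.

```r
library(MASS)

# Squared Mahalanobis distances computed with the Moore-Penrose
# pseudo-inverse, usable when cov cannot be inverted by solve().
maha_ginv <- function(x, center, cov) {
  d <- sweep(as.matrix(x), 2, center)  # subtract the center row-wise
  rowSums((d %*% ginv(cov)) * d)
}
```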


Re: [R] MPICH2 Rmpi and doSNOW

2013-11-06 Thread Matthias Salvisberg
Hi

 

I have managed to install MPICH2 and Rmpi on my Windows 7 machine. I can
also run the following code:

> library(Rmpi)
> mpi.spawn.Rslaves()
4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: MyMaster
slave1 (rank 1, comm 1) of size 5 is running on: MyMaster
slave2 (rank 2, comm 1) of size 5 is running on: MyMaster
slave3 (rank 3, comm 1) of size 5 is running on: MyMaster
slave4 (rank 4, comm 1) of size 5 is running on: MyMaster
> mpichhosts()
   master    slave1    slave2    slave3    slave4
localhost localhost localhost localhost localhost
> mpi.universe.size()
[1] 4
> mpi.close.Rslaves()
[1] 1

library(doSNOW)

But every time I try to set up a cluster via

cluster <- makeCluster(4, type = "MPI")

my computer hangs and I have to close the R session.

Any advice on how to get this running?

Thanks in advance

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Rmpi_0.6-3

loaded via a namespace (and not attached):
[1] tools_3.0.1

 






Re: [R] Creating new instances from original ones

2013-03-31 Thread Prof. Dr. Matthias Kohl

see function SMOTE in package DMwR
hth
Matthias

On 31.03.2013 10:46, Nicolás Sánchez wrote:

I have a question about data mining. I have a dataset of 70 instances with
14 features that belong to 4 classes. As the number in each class is not
enough to obtain good accuracy using some classifiers (svm, rna, knn), I
need to oversample the instances of each class.

I have heard that there is a method to do this. It consists in generating
these new instances as follows:

new_instance = original_instance + U(epsilon)

U(epsilon) is a uniform number in the range [-epsilon, epsilon], and this
number is applied to each feature of the dataset to obtain a new instance
without modifying the original class.

Has anybody used this method to oversample their data? Does anybody have more
information about it?

Thanks in advance!
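The jitter method described in the question can be sketched in a few lines (`oversample` and `eps` are made-up names; note that SMOTE, suggested above, instead interpolates between nearest neighbours rather than adding uniform noise):

```r
# New instances = randomly chosen original rows + U(-eps, eps) noise on
# each feature; the class label of the sampled row carries over unchanged.
oversample <- function(X, n_new, eps = 0.05) {
  idx <- sample(nrow(X), n_new, replace = TRUE)
  noise <- matrix(runif(n_new * ncol(X), -eps, eps), nrow = n_new)
  list(features = X[idx, , drop = FALSE] + noise,
       index = idx)  # index lets you copy the original class labels
}
```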




--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] create bar chart with different totals in a bar

2013-03-08 Thread Matthias Weber
Hello everyone,

Here is another try as a png file. I hope you can now see what I want to do 
with my bar chart.

Your example with ggplot2 works, but it won't help to convert my data like 
this:

          1     2     3     4
abnr2 11425 11425 11555 11888
TIME      2     1     1     2
Cat       1     2     1     2

to:

  11425 11555 11888
1     2     1     0
2     1     0     2

Thanks for your help
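The reshaping asked for here can be done with xtabs(), which builds exactly the cat-by-abnr2 table of time values shown above and feeds straight into barplot():

```r
dd <- data.frame(abnr2 = c(11425, 11425, 11555, 11888),
                 time  = c(2, 1, 1, 2),
                 cat   = c(1, 2, 1, 2))
# Cross-tabulate: rows = cat, columns = abnr2, cells = summed time values.
tab <- xtabs(time ~ cat + abnr2, data = dd)
tab
barplot(tab, legend.text = TRUE)  # one bar per abnr2, stacked by cat
```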

From: John Kane [mailto:jrkrid...@inbox.com]
Sent: Friday, 8 March 2013 16:29
To: Matthias Weber
Subject: Re: [R] create bar chart with different totals in a bar

The image did not come through. The list strips off most attachments to reduce 
the chance of virus or malware.

I think a png file will get through.

Anyway I still don't quite understand you, but does this look like what you 
want?  Note I made a slight change in the data.frame to use ggplot2.  If you 
want to try out the ggplot2 code you will need to install ggplot2 -- use the 
command install.packages("ggplot2") to do so.


Also note that dd is a data.frame rather than your matrix. Again done for 
ggplot2

##==#
dd <- structure(list(x = 1:4, abnr2 = c(11425, 11425, 11555, 11888),
                     time = c(2, 1, 1, 2), cat = c(1, 2, 1, 2)),
                .Names = c("x", "abnr2", "time", "cat"),
                row.names = c(NA, -4L), class = "data.frame")

barplot(dd$abnr2, col = c("red", "blue", "red", "blue"))

library(ggplot2)
p <- ggplot(dd, aes(x = x, y = abnr2, fill = as.factor(cat))) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Something")

##=#
John Kane
Kingston ON Canada

-Original Message-
From: matthias.we...@fnt.de
Sent: Fri, 8 Mar 2013 16:01:51 +0100
To: jrkrid...@inbox.com
Subject: Re: [R] create bar chart with different totals in a bar

Hello John,



thanks for your comment.

Your code is the way my matrix look like, yes.

What I want to do is that each equal abnr2 is represented by the same bar.

Like the picture:



So in the end, I have a PDF which contains one bar for each abnr2.

If one abnr2 has 2 different kinds of "cat" (like 11425), I want to 
distinguish this difference by color.



Simplified, it should look like this:





Thanks for your help.

Mat





-Original Message-
From: John Kane [mailto:jrkrid...@inbox.com]
Sent: Friday, 8 March 2013 15:42
To: Matthias Weber; r-help@r-project.org
Betreff: RE: [R] create bar chart with different totals in a bar



https://github.com/hadley/devtools/wiki/Reproducibility



Is this what your matrix looks like?

mat1 <- structure(c(11425, 11425, 11555, 11888, 2, 1, 1, 2, 1, 2, 1, 2),
                  .Dim = c(4L, 3L),
                  .Dimnames = list(NULL, c("abnr2", "time", "cat")))

It is good practice to use dput() to supply sample data.



It is not particularly clear what you want to do. You apparently have four 
entries in the matrix and say that you want to have three bars.

How do you want to handle the 11425 value since it has different cats?



John Kane

Kingston ON Canada





 -Original Message-
 From: matthias.we...@fnt.de
 Sent: Fri, 8 Mar 2013 03:00:39 -0800 (PST)
 To: r-help@r-project.org
 Subject: [R] create bar chart with different totals in a bar

 Hello everyone,

 perhaps one of you has an idea how I can do this:
 I have a matrix, like this one:

         [,1]  [,2]  [,3]  [,4]
 abnr2 11425 11425 11555 11888
 TIME      2     1     1     2
 Cat       1     2     1     2

 and now I want a bar chart in which one abnr2 is one bar.
 So my bar chart has to have 3 bars: one for 11425, one for 11555 and
 one for 11888.
 In my 11425 bar, the distinction has to be shown. So the value of one
 column has to have its own color depending on the Cat.

 Perhaps anyone has an idea?

 Thanks.

 Mat

 --
 View this message in context:
 http://r.789695.n4.nabble.com/create-bar-chart-with-different-totals-in-a-bar-tp4660703.html
 Sent from the R help mailing list archive at Nabble.com.




Re: [R] integrate function

2013-02-12 Thread Prof. Dr. Matthias Kohl

use pmin instead of min.
hth
Matthias
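The suggestion can be checked directly: integrate() passes a whole vector of abscissae to the integrand, so the integrand must be vectorised, which pmin() is and min() is not.

```r
# pmin() takes elementwise minima, so the integrand returns a vector of
# the same length as its input, as integrate() requires.
integrand <- function(x) pmin(1 - x, x^2)
integrate(integrand, lower = 0, upper = 1)
```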

On 12.02.2013 16:41, dan wang wrote:

Hi All,

Can anyone explain why the min and max functions don't work inside the
integrate function directly?

For example, if issue following into R:

integrand <- function(x) {min(1-x, x^2)}
integrate(integrand, lower = 0, upper = 1)

it will return this:
Error in integrate(integrand, lower = 0, upper = 1) :
   evaluation of function gave a result of wrong length


However, as min(U,V) = (U+V)/2-abs(U-V)/2

Below code working;

integrand <- function(x){(1-x+x^2)/2-0.5*abs(1-x-x^2)}

integrate(integrand, lower = 0, upper = 1)

0.151639 with absolute error < 0.00011

I am confused...


Dan




--
Prof. Dr. Matthias Kohl
www.stamats.de



Re: [R] problem with any function

2012-11-17 Thread Matthias Gondan

The warning

1: In (ind.c == TRUE) & (ind.sgn == TRUE) :
   longer object length is not a multiple of shorter object length

means that ind.c and ind.sgn have different lengths, for whatever reason.
Although R continues the routine, the warning should, in general, not be 
ignored.


Try:
1:3 + 1:2
at the command prompt to see what type of unintended things happen.

Actually, 1:4 + 1:2 runs silently, because 4 is a multiple of 2.

Best wishes,

Matthias

On 17.11.2012 19:56, Rui Barradas wrote:

Hello,

First of all, that's not an error message, it's a warning.
As for your condition, it's too complicated. You don't need to test a
logical value for equality to TRUE, it's redundant. And any(x, y) is
equivalent to any(x, y, x & y). Try

any(ind.c, ind.r, ind.sgn)

and you'll be fine.

Hope this helps,

Rui Barradas
On 17-11-2012 18:36, Haris Rhrlp wrote:

Dear R users,

I have the any function of R

any(ind.c, ind.r, ind.sgn)

all are logical factors

it works fine when any of the three is true
but when they are combined it doesn't work.

i tried this:
any(ind.c, ind.r, ind.sgn, ((ind.c==TRUE) & (ind.r==TRUE)),
((ind.c==TRUE) & (ind.sgn==TRUE)), ((ind.r==TRUE) & (ind.sgn==TRUE)),
   ((ind.c==TRUE) & (ind.r==TRUE) & (ind.sgn==TRUE)))

but i have this error message:
Warning messages:
1: In (ind.c == TRUE) & (ind.sgn == TRUE) :
longer object length is not a multiple of shorter object length
2: In (ind.c == TRUE) & (ind.r == TRUE) & (ind.sgn == TRUE) :
longer object length is not a multiple of shorter object length

I want to check the three logical factors combined

any help will be welcome





Re: [R] survreg gompertz

2012-11-16 Thread Matthias Ziehm

On 15/11/12 21:22, David Winsemius wrote:

On Nov 15, 2012, at 5:38 AM, Matthias Ziehm wrote:


Hi all,

Sorry if this has been answered already, but I couldn't find it in the archives 
or general internet.


A Markmail/Rhelp search on:  gompertz survreg  ...brings this link to a reply 
by Terry Therneau. Seems to address everything you asked and a bit more

http://markmail.org/search/?q=list%3Aorg.r-project.r-help+gompertz+survreg#query:list%3Aorg.r-project.r-help%20gompertz%20survreg+page:1+mid:6xdlsmo272oa7zkw+state:results

(Depending on how your mailer breaks URLs you may need to paste it back 
together.)


Thanks! I hadn't read this thread because of the misleading title. 
However, now I have follow-up questions on that explanation:


On Feb 5, 2010 at 8:48:08 am, Terry Therneau wrote:

Subject: Re: [R] Using coxph with Gompertz-distributed survival data.
...
the note below talks about how to do so approximately with survreg.
It's a note to myself of something to add to the survival package
documentation, not yet done, and to my embarassment the file has a
time stamp in 1996. Ah well.

Terry Therneau

My document A Package for Survival Analysis in S contains statements
about how to fit Gompertz and Rayleigh distributions with the survreg
routine. Nicholas Brouard, in a recent query to this group, quite
correctly states that Therneau's documentation is a little elliptic for
people not so familiar with extreme value theory.

I've spent the last day trying to work out concrete examples of the
fits. Let me start by saying that I now think my paper's remarks were
overly optimistic. This note will try to indicate why. I will use some
TeX notation below: \alpha, \beta, etc for Greek letters.

Hazard functions:

Weibull: p*(\lambda)^p * t^(p-1)
Extreme value: (1/ \sigma) * exp( (t- \eta)/ \sigma)
Rayleigh: a + bt
Gompertz: b * c^t
Makeham: a + b* c^t

The Makeham hazard seems to fit human mortality experience beyond
infancy quite well, where a is a constant mortality which is
independent of the health of the subject (accidents, homicide, etc) and
the second term models the Gompertz assumption that the average
exhaustion of a man's power to avoid death is such that at the end of
equal infinitely small intervals of time he lost equal portions of his
remaining power to oppose destruction which he had at the commencement
of these intervals. For older ages a is a negligible portion of the
death rate and the Gompertz model holds.

The fitting routine depends on the decomposition Y = \eta + \sigma W,
where \eta = \beta_0 + \beta_1 * X_1 + \beta_2 * X_2 + ... is the fitted
linear predictor and W is a distribution in standard form. For
instance, if the response time t is Weibull, then y = log(t) follows
this with \eta = log(\lambda) \sigma = 1/p

Clearly

1. The Weibull distribution with p=2 (sigma=.5) is the same as a
Rayleigh distribution with a=0. It is not, however, the most general
form of a Rayleigh.

2. The (least) extreme value and Gompertz distributions have the same
hazard function, with \sigma = 1/ log(c), and exp(-\eta/ \sigma) = b.
If I do the math correctly from the extreme value hazard given above to the 
Gompertz hazard, it needs to be b = (1/ \sigma) * exp(-\eta/ \sigma).

Otherwise the 1/ \sigma of the extreme value hazard is missing, isn't it?



It would appear that the Gompertz can be fit with an identity link
function combined with the extreme value distribution. However, this
ignores a boundary restriction. If f(x; \eta, \sigma) is the extreme
value distribution with parameters \eta and \sigma, then the definition
of the Gompertz density is
    g(x; \eta, \sigma) = 0                       for x < 0
    g(x; \eta, \sigma) = c f(x; \eta, \sigma)    for x >= 0

where c= exp(exp(-\eta / \sigma)) is the necessary constant so that g
integrates to 1.
Here I got really lost as to where the additional double exp suddenly comes 
from and how it fits in.

Given the above I would have thought that:
g(x, b, c) = f(x, \eta = -1/log(c) * log(b * 1/log(c)), \sigma = 1/log(c)) for x >= 0

Can anyone clarify these things to me, please?

Matthias


If \eta / \sigma is far from 1, then the correction
term will be minimal and survreg should give a reasonable answer. If
not, the distribution can't be fit, nor can it be made to easily conform
to the general fitting scheme of the program.

The Makeham distribution falls into the gamma family (equation 2.3 of
Kalbfleisch and Prentice, Survival Analysis), but with the same range
restriction problem.

In summary, the Gompertz is a truncated form of the extreme value
distribution (Johnson, Kotz and Balakrishnan, Continuous Univariate
Distributions, section 22.8). If one ignores the truncation, i.e.,
assume that negative time values are possible, then it can be fit with
survreg. My original note seems to have been compounded of 3 errors: the
-1 arises from confusing the maximal extreme distribution (most common
in theory books) with the minimal extreme distribution (used in
survreg), the log() term was a typing mistake, and I never noticed

[R] survreg gompertz

2012-11-15 Thread Matthias Ziehm

Hi all,

Sorry if this has been answered already, but I couldn't find it in the 
archives or general internet.


Is it possible to implement the gompertz distribution as 
survreg.distribution to use with survreg of the survival library?
I haven't found anything, and recent attempts from my side weren't 
successful so far.


I know that other packages like 'eha' and 'flexsurv' offer functions 
similar to survreg with Gompertz support. However, due to the run-time 
environment where this needs to be running in the end, I can't use these 
packages :(


Same questions for the gompertz-makeham distribution.

Many thanks!

Matthias



Re: [R] Two-way Random Effects with unbalanced data

2012-11-06 Thread Matthias Kuhn
Asaf,

I think that the lme4 package on CRAN with its main function lmer() can
help you here. It uses restricted ML to find the estimates and should cope
with unbalanced data. The two factors of your random effects can be either
nested or crossed.
Have a look at the book (draft version) on lme4 by Doug Bates, in
particular chapter 2:
http://lme4.r-forge.r-project.org/lMMwR/ There is a PDF in that directory.

By the way, there is another mailing list dedicated to mixed models in R,
R-sig-mixed-models. You have a better chance of a quick reply there in
case you have further questions on lme4.
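A minimal sketch of the crossed-random-effects fit; `dat`, `rowfac`, and `colfac` are hypothetical names for the data frame and the two grouping factors of the model y_ijk = mu + a_i + b_j + eps_ijk.

```r
library(lme4)

# REML fit with crossed random intercepts; the unbalanced cell counts
# K_ij are handled automatically by the likelihood machinery.
fit <- lmer(y ~ 1 + (1 | rowfac) + (1 | colfac), data = dat)
VarCorr(fit)  # variance components: sigsq_a, sigsq_b, and the residual
```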

Matthias.


On Mon, Oct 29, 2012 at 10:42 PM, asafwe as...@wharton.upenn.edu wrote:

 Hi there,

 I am looking to fit a two-way random effects model to an *unbalanced*
 layout,

   y_ijk = mu + a_i + b_j + eps_ijk,
 i=1,...,R, j=1,...,C, k=1,...,K_ij.

 I am interested first of all in estimates for the variance components,
 sigsq_a,  sigsq_b and sigsq_error.
 In the balanced case, there are simple (MM, MLE) estimates for these; In
 the
 unbalanced setup, this is much more complicated because orthogonality
 relations no longer exist between row space and column space. Since the
 covariance structure between the cell means y_{ij.} is much more
 complicated
 than in the one-way (unbalanced) case, trying to manually obtain MLEs looks
 very difficult.
 I couldn't find anything addressing point estimates for the unbalanced case
 - neither in the literature nor in a computer program (R or alike).

 Any idea will be greatly appreciated...

 Thank you,

 Asaf



 --
 View this message in context:
 http://r.789695.n4.nabble.com/Two-way-Random-Effects-with-unbalanced-data-tp4647795.html
 Sent from the R help mailing list archive at Nabble.com.



[R] ylim with only one value specified

2012-10-09 Thread Matthias Gondan
Dear R developers,

I would like to have R choose the limits of the y-axis semi-automatically,
e.g., zero should be included, but the maximum should be chosen depending 
on the data.

Examples:

plot(1:10, 1:10) # selects min and max automatically
plot(1:10, 1:10, ylim=c(1, 10)) # manual definition
plot(1:10, 1:10, ylim=c(0, Inf)) # this would be a nice feature, i.e. lower y 
limit = 0 defined manually, upper limit = 10 selected automatically

The relevant code from plot.default would have to be modified:

old:

ylim <- if (is.null(ylim)) 
range(xy$y[is.finite(xy$y)])
else ylim

new:

ylim <- if (is.null(ylim)) 
range(xy$y[is.finite(xy$y)])
else if (length(ylim) == 2 && ylim[2] == Inf)
c(ylim[1], max(xy$y[is.finite(xy$y)]))
else ylim

and some more if-else statements, so that cases like ylim=c(0, -Inf), 
ylim=c(Inf, 0), ylim=c(-Inf, 0), ylim=c(-Inf, Inf) [as a replacement for NULL/
autoselection] and ylim=c(Inf, -Inf) [autoselection, reversed y axis] are 
handled correctly.
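In the meantime, the proposed semantics can be approximated at user level (a minimal sketch under the same convention; plot_ylim is a hypothetical wrapper name, not an existing function):

```r
# Hypothetical user-level wrapper: Inf in ylim means "pick this limit from the data"
plot_ylim <- function(x, y, ylim = NULL, ...) {
  if (!is.null(ylim)) {
    r <- range(y[is.finite(y)])
    if (is.infinite(ylim[1])) ylim[1] <- r[1]  # -Inf (or Inf): take data minimum
    if (is.infinite(ylim[2])) ylim[2] <- r[2]  # Inf: take data maximum
  }
  plot(x, y, ylim = ylim, ...)
}

plot_ylim(1:10, 1:10, ylim = c(0, Inf))  # lower limit fixed at 0, upper from data
```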

I would find such a feature useful. Do you think it would interfere with other 
functions? 

Thank you for your consideration.

Best wishes,

Matthias Gondan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] transform RTs

2012-08-30 Thread Matthias Gondan

you might want to do something like

library(SuppDists)
t = runif(100, 100, 500) # original RT
t_IG = qinvGauss(ecdf(t)(t)-0.5/length(t), 1, 16)
plot(density(t_IG))

but what is the purpose of it? Usually reaction times are thought to
follow a certain kind of distribution (e.g. an inverse Gaussian 
distribution).


On 29.08.2012 17:54, Katja Böer wrote:

Hello,

I'm trying to transform reaction times which are not normally distributed
to an ex-Gaussian or an
inverse Gaussian distribution, but I don't really know how to use the
exGAUS() function.

Can someone show me a script in which data has been transformed?

Thanks in advance

k

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Font size in geom_dl (using ggplot2)

2012-08-27 Thread Matthias Habjan
Hey everyone,

I am an R newbie... so sorry for bothering you with simple-to-solve
questions ;) I have the following issue: I am trying to add labels to my
scatterplots (with geom_dl in ggplot2). Everything works fine, but after
checking every resource I cannot find a way to change the font size of my
labels. I tried size, cex, fontsize at every position... but it always stays
the same.

ggplot()+
opts(legend.position="none")+
xlab("x")+ ylab("y")+
xlim(-15,15) + ylim(0,6)+
theme_complete_bw()+
scale_colour_manual(values=cols)+
geom_point(data=m, aes(x=x, y=y, colour=s), shape=19, cex=6,
alpha=0.3)+
geom_dl(method="top.bumptwice", data=m_sig, aes(x=x, y=y,
label=gene.name, colour=s, size=10))+
geom_line(data=m_s0, aes(x= x, y=y), linetype=5, colour="grey55",
size=0.5)
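For what it's worth (a hedged sketch, not tested against the data above): with directlabels, text size is usually set inside the method specification rather than via aes(); here "smart.grid" and the toy data stand in for the real method and data:

```r
library(ggplot2)
library(directlabels)

# toy data standing in for m / m_sig
d <- data.frame(x = rnorm(40), y = rnorm(40), g = rep(c("a", "b"), 20))

ggplot(d, aes(x, y, colour = g)) +
  geom_point() +
  # cex inside the method list controls the label size
  # (size= inside aes() appears to be ignored by geom_dl)
  geom_dl(aes(label = g), method = list("smart.grid", cex = 1.5))
```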

Your help is very much appreciated! Thx.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What makes R different from other programming languages?

2012-08-20 Thread Matthias Gondan

My vote:

1. Symbolic function arguments:

fn = function(a, b)
{
a/b
}

fn(b=10, a=2)

2. Names for elements of a vector and matrices

v = c(a=1, b=2)
v['a'] = v['a'] * 2

same for matrices

3. about 10,000 user-contributed packages on CRAN

4. weird things like

a = numeric(10)
a[1:10] = 1:2
no error message
a
answer: five times 1:2

which guarantee happy debugging

5. and, of course, much built-in statistical stuff


Am 20.08.2012 20:02, schrieb johannes rara:

My intention is to give a presentation about R programming language
for software developers. I would like to ask, what are the things that
make R different from other programming languages? What are the
specific cases where Java/C#/Python developer might say Wow, that was
neat!? What are the things that are easy in R, but very difficult in
other programming languages (like Java)?

Thanks,
-J

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convolve 2 uniform distributed variables

2012-05-24 Thread Prof. Dr. Matthias Kohl

take a look at the distr package

library(distr)
U <- Unif()
U2 <- U + U
# or
U2 <- convpow(U, 2)
plot(U2)
# or
curve(d(U2)(x), from = 0, to = 2)
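For comparison, the same result can be approximated numerically in base R. The original attempt below convolved the grid u1 instead of the density values, and the discrete convolution also needs scaling by the grid step (a sketch):

```r
# Numeric approximation of the density of U1 + U2 with U_i ~ Unif(0,1)
dx <- 0.001
x  <- seq(0, 1, by = dx)
f  <- dunif(x)                                  # density values, not the grid itself
g  <- convolve(f, rev(f), type = "open") * dx   # discrete convolution, scaled by dx
xs <- seq(0, 2, length.out = length(g))
plot(xs, g, type = "l")  # triangular density on [0, 2], peak near 1 at x = 1
```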


Best
Matthias

On 24.05.2012 08:29, c...@mail.tu-berlin.de wrote:

Hi,
I want to convolve 2 uniformly distributed [0,1] variables, so that I get
this graphic:
http://de.wikipedia.org/wiki/Benutzer:Alfred_Heiligenbrunner/Liste_von_Verteilungsdichten_der_Summe_gleichverteilter_Zufallsvariabler
(second graphic)

if I do

u1 <- seq(0,1,length=100)
fu1=dunif(u1,min=0,max=1)

plot(u1,fu1,type="l",xlim=c(-2,2))


u2 <- seq(0,1,length=100)
fu2=dunif(u2,min=0,max=1)

u1u2 <- convolve(u1,rev(u1),type="o")

plot(u1u2)

it does not work; could you help me please? The problem is the convolve
function, I think. What do I have to type to get the correct convolution
of two uniformly distributed [0,1] variables, as shown in the second
graphic at the given link?

Thanks a lot

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Prof. Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2 - geom_bar

2012-04-23 Thread Matthias Rieber
On 23.04.2012 20:55, Brian Diggs wrote:
 On 4/23/2012 9:24 AM, Matthias Rieber wrote:
 Hello,

 I've some problem with the ggplot2. Here's a small example:

 [...]
 Is it wrong to use geom_bar with that kind of data? I could avoid this
 issue when I cast the data.frame, but I like to avoid that solution.

 There is nothing wrong with using bars with this sort of data. There
 is a bug in the faceting code of 0.9.0 that will be fixed in 0.9.1
 (see https://github.com/hadley/ggplot2/issues/443 ) which caused
 duplicate rows of data to be dropped when there was faceting. That is
 what you are seeing in the second example; row 6 is identical to row 7
 and is dropped before plotting.  One easy workaround until 0.9.1 comes
 out is to add a unique column to the data that is otherwise ignored:

 molten <- data.frame(date=c('01','01','01','01',
 '02','02','02','02'),
  channel=c('red','red','blue','blue',
'red','red','blue','blue'),
  product=c('product1','product2',
'product1','product2',
'product1','product1',
'product1','product2'),
  value=c(1,1,1,1,
  1,1,1,1),
  dummy=1:8)


Thanks! That worked.

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compare String Similarity

2012-04-19 Thread Prof. Dr. Matthias Kohl

you should also look at Bioconductor Package Biostrings
hth
Matthias
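As a small base-R aside (a hedged sketch; adist() ships with recent R versions): character-level edit distance is built in, and for word sets whose order may differ, a set-based measure such as the Jaccard index is one option, using the strings from the question below:

```r
# Character-level (generalized Levenshtein) edit distance, base R
adist("depending", "audience")

# Order-insensitive similarity between two word sets: Jaccard index
string1 <- c("depending", "audience", "research", "school")
string2 <- c("audience", "push", "drama", "button", "depending")
length(intersect(string1, string2)) / length(union(string1, string2))  # 2/7
```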

On 19.04.2012 18:01, R. Michael Weylandt wrote:

Though if you do decide to use Levenstein, it's implemented here in R:
http://finzi.psych.upenn.edu/R/library/RecordLinkage/html/strcmp.html

I'm pretty sure this is in C code so it should be mighty fast.

Michael

On Thu, Apr 19, 2012 at 11:40 AM, Bert Gunter <gunter.ber...@gene.com>  wrote:

Wrong list. This is R, not statistics (or linguistics?). Please post elsewhere.

-- Bert

On Thu, Apr 19, 2012 at 8:05 AM, Alekseiy Beloshitskiy
abeloshits...@velti.com  wrote:

Dear All,

I need to estimate the level of similarity of two strings. For example:
string1 <- c("depending","audience","research","school");
string2 <- c("audience","push","drama","button","depending");

The words in a string may occur in a different order, though. What function would 
you recommend to estimate similarity (e.g., Levenshtein distance)?

I'd appreciate any advice.

-Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Prof. Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave UFT8 problem

2012-04-15 Thread Prof. Dr. Matthias Kohl

try:
Sweave("sim_pi.Rnw", encoding = "utf8")

Best,
Matthias

On 15.04.2012 11:41, Philippe Grosjean wrote:

Hello,

Have you tried to put that command in a comment:

%\usepackage[utf8]{inputenc}

I haven't tested it in this particular case, but it works in some other
situations.
Best,

Philippe

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons University, Belgium

On 14/04/12 22:37, Mark Heckmann wrote:

Hi,

I work on MacOS, trying to Sweave an UFT8 document.
AFAI remember R 2.14 used to render a warning when the encoding was not
declared when using Sweave.
With R 2.15 it seems to render an error.

Sweave("sim_pi.Rnw")
Error: 'sim_pi.Rnw' is not ASCII and does not declare an encoding

Declaring an encoding by adding a line like

\usepackage[utf8]{inputenc}

in the preamble does the job. In my case, though, the .Rnw document does not
have a preamble as it is just one chapter.
All chapters are Sweaved separately (due to computation time). Hence I
cannot inject the above line as LaTeX will cause an error afterwards.
(usepackage{} is only allowed in the preamble which only appears once in
the main document, not in each chapter).

How can I get around this not using the terminal for Sweaving, like e.g.
R CMD Sweave --encoding=utf-8 sim_pi.Rnw ?

Thanks
Mark

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Prof. Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] line width in legend of interaction.plot

2012-02-22 Thread Matthias Gondan

Dear R developers,

The following command produces an interaction plot with lwd=2.

interaction.plot(c(1, 2, 1, 2), c(1, 1, 2, 2), 1:4, lwd=2)

In the legend, however, lwd seems to be 1, which does not seem
to be intended behavior. Probably the lwd is not correctly forwarded
to legend:

from the interaction.plot source:

legend(xleg, yleg, legend = ylabs, col = col, pch = if (type %in%
c("p", "b"))
pch, lty = if (type %in% c("l", "b"))
lty, bty = leg.bty, bg = leg.bg)  <- here I would add 
lwd = ... (i.e., the lwd from the "..." argument, or perhaps something like leg.lwd)


Best wishes,

Matthias Gondan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reference for dataset colon (package survival)

2012-01-17 Thread Matthias Gondan
Dear R team, dear Prof. Therneau,

library(survival)
data(colon)
?colon

gives me only a very rudimentary source (only a name). Is there a 
possibility to get a reference to the clinical trial these data
are taken from?

Many thanks in advance. With best wishes,

Matthias Gondan
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calculate quantiles of a custom function

2012-01-03 Thread Prof. Dr. Matthias Kohl

Dear Gerhard,

you could also use package distr; e.g.

library(distr)

## use generating function AbscontDistribution
D <- AbscontDistribution(d = function(x) dbeta(x, 2, 6) + dbeta(x,6,2), 
low = 0, up = 1, withStand = TRUE)


## quantiles
q(D)(seq(0,1,0.1))

Best
Matthias

On 03.01.2012 19:33, Albyn Jones wrote:

What do quantiles mean here? If you have a mixture density, say

myf <- function(x,p0) p0*dbeta(x,2,6) + (1-p0)*dbeta(x,6,2)

then I know what quantiles mean. To find the Pth quantile, use uniroot to
solve F(x, p0) - P = 0 for x, where F is the corresponding mixture CDF.

albyn
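Albyn's recipe, made concrete as a sketch (note the root-finding has to use the mixture CDF, built here from pbeta; the weight p0 = 0.5 is an assumed value):

```r
# CDF of the mixture p0*Beta(2,6) + (1-p0)*Beta(6,2)
pmix <- function(x, p0 = 0.5) p0 * pbeta(x, 2, 6) + (1 - p0) * pbeta(x, 6, 2)

# Pth quantile by inverting the CDF with uniroot on [0, 1]
qmix <- function(P, p0 = 0.5)
  uniroot(function(x) pmix(x, p0) - P, interval = c(0, 1))$root

qmix(0.5)  # for the equal-weight mixture this is 0.5 by symmetry
```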

Quoting VictorDelgado victor.m...@fjp.mg.gov.br:



Gerhard wrote



Suppose I create a custom function, consisting of two
beta-distributions:

myfunction <- function(x) {
dbeta(x,2,6) + dbeta(x,6,2)
}

How can I calculate the quantiles of myfunction?

Thank you in advance,

Gerhard




Gehard, if do you want to know the quantiles of the new distribution
created
by myfunction. Maybe you can also do:

x <- seq(0,1,.01) # insert your 'x'
q <- myfunction(x)
# And:
quantile(q)

0% 25% 50% 75% 100%
0.00 1.476177 2.045389 2.581226 2.817425

# This gives the sample quantiles. You can also turn to simulation
(as Bert Gunter had suggested) to better understand the properties of
the distribution quantiles obtained from 'myfunction'.



-
Victor Delgado
cedeplar.ufmg.br Ph.D. student
www.fjp.mg.gov.br researcher
--
View this message in context:
http://r.789695.n4.nabble.com/calculate-quantiles-of-a-custom-function-tp4256887p4257551.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Prof. Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: Re: Poisson GLM using non-integer response/predictors?

2011-12-30 Thread Matthias Gondan
Hi,

Use offset variables if you count occurrences of an event and you want to 
model the
observation time.

glm(count ~ predictors + offset(log(observation_time)), family=poisson)
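A toy illustration of the offset idea (simulated data with an assumed true rate of 2 events per unit time):

```r
set.seed(1)
time  <- runif(200, 1, 10)              # varying observation time per unit
count <- rpois(200, lambda = 2 * time)  # true rate 2, exposure = time

fit <- glm(count ~ offset(log(time)), family = poisson)
exp(coef(fit))  # estimated rate per unit time, close to the true value 2
```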

If you want to compare durations, look at library(survival), ?coxph

If tnoise_sqrt is the square root of tourist noise, your example seems
incorrect, because it is a predictor, not the dependent variable

tnoise_sqrt ~ lengthfeeding_log

Best wishes,

Matthias


On 30.12.2011 16:29, Lucy Dablin wrote:
 Great lists, I always find them useful, thank you to
 everyone who contributes to them.


 My question is regarding non-integer values from some data I
 collected on parrots when using the poisson GLM. I observed the parrots on a
 daily basis to see if they were affected by tourist presence. My key 
 predictors
 are tourist noise (averaged over a day period so decimal value, square root to
 adjust for skew),  tourist number (the
 number of tourists at a site, square root), and the number of boats passing 
 the
 site in a day (log). These are compared with predictors: total number of birds
 (count data, square root), average time devoted to foraging at site (log), 
 species
 richness (sqrt), and the number of flushes per day. Apart from the last one
 they are all non-integer values. When I run a glm for example:


 parrots- glm(tnoise_sqrt ~ lengthfeeding_log, family =
 poisson)

 summary(parrots)


 There are warnings like "27: In dpois(y, mu, log =
 TRUE) : non-integer x = 1.889822". I was advised to use the offset() function;
 however, this does not seem to correct the problem and I find the code 
 confusing.
 What GLM approach should I be using for multiple non-integer predictors and
 non-integer responses? Does my GLM approach seem appropriate?
 Thank you for taking the time to consider this.



   
   [[alternative HTML version deleted]]



 __
 R-help@r-project.org  mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Re: Poisson GLM using non-integer response/predictors?

2011-12-30 Thread Matthias Gondan

Hi Ben,

Thanks for clarifying this. I used a misleading phrase: "model the 
observation time"
sounds as if observation time were the dependent variable - which it is 
not, of course;

instead, in the scenario described, the parrot counts are modeled.

Best wishes,

Matthias

On 30.12.2011 20:50, Ben Bolker wrote:

Matthias Gondan <matthias-gondan at gmx.de> writes:


Hi,

Use offset variables if count occurrences of an event and you want to
model the
observation time.

glm(count ~ predictors + offset(log(observation_time)), family=poisson)

If you want to compare durations, look at library(survival), ?coxph

If tnoise_sqrt is the square root of tourist noise, your example seems
incorrect, because it is a predictor, not the dependent variable

tnoise_sqrt ~ lengthfeeding_log

Best wishes,

Matthias

On 30.12.2011 16:29, Lucy Dablin wrote:

Great lists, I always find them useful, thank you to
everyone who contributes to them.

My question is regarding non-integer values from some data I
collected on parrots when using the poisson GLM. I observed the
parrots on a daily basis to see if they were affected by tourist
presence. My key predictors are tourist noise (averaged over a day
period so decimal value, square root to adjust for skew), tourist
number (the number of tourists at a site, square root), and the
number of boats passing the site in a day (log). These are
compared with predictors: total number of birds (count data,
square root), average time devoted to foraging at site (log),
species richness (sqrt), and the number of flushes per day. Apart
from the last one they are all non-integer values. When I run a
glm for example:

  Your description sounds like you might already have transformed
your predictors: generally speaking, you don't want to do that
before running a GLM (the variance function incorporated in the
GLM takes care of heteroscedasticity, and the link function takes
care of nonlinearity in the response).

   I suspect you want total number of birds, number of flushes per day,
and species richness to be modeled as Poisson (or negative binomial --
see ?glm.nb in the MASS package).  Species richness *might* be
binomial, or more complicated, if you are drawing from a limited
species pool (e.g. if there are only 5 possible species and you
sometimes see 4 or 5 of them in a day).  Is the total number
of birds really non-integer *before* you square-root transform it?

Time devoted to foraging at the site is most easily
modeled as log-normal (unless the response includes zeros:
i.e., log-transform as you have already done and use lm),
or possibly Gamma-distributed (although you may want to
use a log link instead of the default inverse link).

  As Matthias said, offsets are used for the specific case of
non-uniform sampling effort (e.g. if you sampled different areas,
or for different lengths of time, every day).

   You may be interested in r-sig-ecol...@r-project.org , which
is an R mailing list specifically devoted to ecological questions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] arrow egdes and lty

2011-11-14 Thread Matthias Gondan
Dear R developers,

I want to draw an arrow in a figure with lty=2. The
lty argument also affects the edge of the arrow, which is 
unfortunate. Feature or bug?

Is there some way to draw an arrow with intact edge, still
with lty=2?

Example code:

plot(1:10)
arrows(4, 5, 6, 7, lty=2)
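One possible workaround until this is configurable: draw the dashed shaft with segments() and let a short solid stub at the tip carry the head (a sketch; the 0.95 stub fraction is an arbitrary choice):

```r
plot(1:10)
segments(4, 5, 6, 7, lty = 2)  # dashed shaft
# short solid segment at the tip, so the arrowhead is drawn with lty = 1
arrows(4 + 0.95 * (6 - 4), 5 + 0.95 * (7 - 5), 6, 7, lty = 1, length = 0.1)
```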

Best wishes,

Matthias
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] image problem in 2.13.1

2011-07-13 Thread Matthias Gondan
Dear R people,

On my freshly installed R-2.13.1 (winxp), the following code yields 
unsatisfactory results (irregular grid lines instead of a smooth plane):

image(matrix(1, nrow=100, ncol=100))

This is working fine,

image(matrix(1, nrow=100, ncol=100), useRaster=TRUE)

but then right-click and save as metafile causes a segmentation fault.

Everything worked fine on 2.13.0

Best wishes,

Matthias
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Defining functions inside loops

2011-02-15 Thread Dr. Matthias Kohl

Dear Eduardo,

try:
f <- function(x){}
s <- 0.2
body(f) <- substitute({x^2 + s}, list(s = s))
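The same idea extends to Eduardo's whole list of functions (a sketch):

```r
s <- c(0.2, 0.45, 0.38, 0.9)
f <- lapply(s, function(si) {
  g <- function(x) {}
  body(g) <- substitute(x^2 + si, list(si = si))  # bake the numeric value in
  g
})

rm(s)      # safe: the values now live in the function bodies, not in s
f[[2]](1)  # 1.45
```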

Best,
Matthias

On 15.02.2011 16:53, Eduardo de Oliveira Horta wrote:

Thanks... but I guess I didn't make myself clear. What I was trying to
do was precisely to store inside the function the number associated
to s[i] rather than the call to s[i], such that I wouldn't need to
keep that object in subsequent function calls.

In other words, I wanted to use lapply to get functions equivalent to:
s <- c( 0.2, 0.45, 0.38, 0.9)
f <- list()
f[[1]] <- function(x) x^2+0.2
f[[2]] <- function(x) x^2+0.45
f[[3]] <- function(x) x^2+0.38
f[[4]] <- function(x) x^2+0.9

Best regards,

Eduardo


On Tue, Feb 15, 2011 at 7:20 AM, Dennis Murphy <djmu...@gmail.com>  wrote:

Hi:

If you look at the error message, you'll see that you removed s before
evaluating f, and since an element of s is referenced inside the function,
the call fails.

Try

s <- c( 0.2, 0.45, 0.38, 0.9)
f <- lapply(1:10, function(i) local({ force(i) ; function(x) x^2+s[i] }))
f[[1]](s)

[1] 0.2400 0.4025 0.3444 1.0100

f is a list with 10 components, the first of which is
[[1]]
function (x)
x^2 + s[i]
<environment: 0x02a26d48>

Each component occupies a different environment. To see what you get,


f[[1]](0.1)

[1] 0.21


for(i in 1:10) print(f[[i]](i))

[1] 1.2
[1] 4.45
[1] 9.38
[1] 16.9
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA


for(i in 1:10) print(f[[i]](1))

[1] 1.2
[1] 1.45
[1] 1.38
[1] 1.9
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA

HTH,
Dennis

On Mon, Feb 14, 2011 at 9:50 PM, Eduardo de Oliveira Horta
eduardo.oliveiraho...@gmail.com  wrote:


Hello again.

Let me try something a little more intricate. Let's say instead of
forcing evaluation of 'i' I'd want to force evaluation of a vector;
for example:
s <- c( 0.2, 0.45, 0.38, 0.9)
f <- lapply(1:10, function(i) local({ force(i) ; function(x) x^2+s[i] }))
rm(s)
f[[1]](0.1)
Error in f[[1]](0.1) : object 's' not found

Any thoughts?

Best regards,

Eduardo


sessionInfo()

R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Revobase_4.2.0   RevoScaleR_1.1-1 lattice_0.19-13

loaded via a namespace (and not attached):
[1] grid_2.11.1   pkgXMLBuilder_1.0 revoIpe_1.0   tools_2.11.1
[5] XML_3.1-0


On Mon, Nov 15, 2010 at 7:10 PM, William Dunlap <wdun...@tibco.com>
wrote:

You could make f[[i]] be function(t)t^2+i for i in 1:10
with
 f <- lapply(1:10, function(i) local({ force(i) ; function(x) x^2+i }))
After that we get the correct results
  f[[7]](100:103)
[1] 10007 10208 10411 10616
but looking at the function doesn't immediately tell you
what 'i' is in the function
  f[[7]]
function (x)
x^2 + i
<environment: 0x19d7458>
You can find it in f[[7]]'s environment
  get("i", envir=environment(f[[7]]))
[1] 7

The call to force() in the call to local() is not
necessary in this case, although it can help in
other situations.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Eduardo de
Oliveira Horta
Sent: Monday, November 15, 2010 12:50 PM
To: r-help@r-project.org
Subject: [R] Defining functions inside loops

Hello,

I was trying to define a set of functions inside a loop, with
the loop index
working as a parameter for each function. Below I post a
simpler example, as
to illustrate what I was intending:

f <- list()
for (i in 1:10){
   f[[i]] <- function(t){
 f[[i]] <- t^2+i
   }
}
rm(i)

With that, I was expecting that f[[1]] would be a function
defined by t^2+1,
f[[2]] by t^2+2 and so on. However, the index i somehow
doesn't get in the
function definition on each loop, that is, the functions
f[[1]] through
f[[10]] are all defined by t^2+i. Thus, if I remove the
object i from the
workspace, I get an error when evaluating these functions.
Otherwise, if
don't remove the object i, it ends the loop with value equal
to 10 and then
f[[1]](t)=f[[2]](t)=...=f[[10]](t)=t^2+10.

I am aware that I could simply put

f <- function(u,i){
   f <- t^2+i
}

but that's really not what I want.

Any help would be appreciated. Thanks in advance,

Eduardo Horta

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Package distr and define your own distribution

2011-02-12 Thread Dr. Matthias Kohl

Dear Gabriel,

it is possible. You can define a new class or just an object of an 
already existing class.


Did you look at:
library(distr)
vignette("newDistributions")

as well as
?AbscontDistribution
?DiscreteDistribution

Please let me know if you have further questions!

Best
Matthias

On 11.02.2011 16:05, gabriel.ca...@ubs.com wrote:


Hi all
I  am using the Package distr (and related)

Do you know if it is possible to define your own distribution (object)
GIVEN that you have an analytical form of the probability density
function (pdf) ?

I would then like to use the standard feature of the distr and related
packages.


Best regards

Giuseppe Gabriel Cardi




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] JGR and font antialiasing

2011-02-10 Thread Matthias Rieber
Hi,

I've noticed that font antialiasing is enabled in the console window,
but not in the editor windows. I'm using R version 2.12.1. JGR 1.7.4 and
the jdk from Ubuntu 10.04, which claims to be: OpenJDK Runtime
Environment (IcedTea6 1.9.5) (6b20-1.9.5-0ubuntu1~10.04.1). Is that a
problem with my installation? Has someone else noticed that issue?

Regards,
Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] random sequences for rnorm and runif

2011-02-03 Thread Matthias Gondan
Dear R experts,

For a fixed seed, the first random number produced by rnorm and runif 
has the same rank within the distribution, which I find useful. The 
following ranks differ, however.

 set.seed(123)
 runif(4)
[1] *0.2875775* 0.7883051 *0.4089769* 0.8830174

 set.seed(123)
 pnorm(rnorm(4))
[1] 0.2875775 0.4089769 0.9404673 0.5281055

I noticed that rnorm seems to 'eat' two seeds of the random
number generator, whereas runif eats only one seed. Is this 
intended behavior or do you think it should be fixed? 

The strange thing is that the 1st/3rd/5th etc number of rnorm 
corresponds to the 1st/2nd/3rd in runif. If two seeds are necessary, 
I would have expected the following correspondence: 2→1, 4→2, 6→3, 
etc.

Temporary fix:

 myrnorm = function(n, mean=0, sd=1)
+ {
+ qnorm(runif(n), mean, sd)
+ } # myrnorm
 
 set.seed(123)
 pnorm(myrnorm(4))
[1] 0.2875775 0.7883051 0.4089769 0.8830174

Best wishes,

Matthias Gondan
-- 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [OT] R on Atlas library

2010-08-11 Thread Matthias Gondan

Hi Allan,

Thank you for your response. Finally I succeeded. These were the 
commands to compile Atlas and R:


Atlas:

../configure -t 4 -Fa alg -fPIC
make
make install

(-fPIC is needed to wrap some of the objects of lapack.a into a shared lib)


R:
../configure --with-blas="/usr/local/lib/libptf77blas.a 
/usr/local/lib/libatlas.a -lpthread"

make
make install

Using your benchmark example, all CPUs seem to be used. It is now
working at such a high speed that I could not be entirely sure;
top refreshes only every 5 s.

Cheers

Matthias

On 09.08.2010 11:51, Allan Engelhardt wrote:

I don't know about the specific function (a simple, stand-alone
reproducible example is always helpful, see the posting guide for
details), but ATLAS certainly uses all my cores on a test like this:

s <- 7500 # Adjust for your hardware
a <- matrix(rnorm(s*s), ncol = s, nrow = s)
b <- crossprod(a) # Uses all cores here.

My configuration of R with ATLAS on Linux (Fedora) is described at
http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html

Maybe your distribution has single-threaded ATLAS and you forgot to
rebuild it with enable_native_atlas 1 or the equivalent for your
platform?

Allan


On 06/08/10 15:12, Matthias Gondan wrote:

Dear List,

I am aware this is slightly off-topic, but I am sure there are people
who already had the problem and who perhaps solved it.

I am running long-lasting model fits using constrOptim command. At work
there is a linux computer (Quad Core, debian) on which I already have
compiled R and Atlas, in the hope that things will go faster on that
machine.

Atlas offers the possibility to be compiled for multiple cores, I used
that option, but without success. Performance meter (top) for Linux
shows 25% CPU usage, and there is no loss of performance if I run
4 instances of R doing heavy matrix multiplications. Performance drops
when a 5th instance of R is doing the same job, so I assume my attempt
was not successful.

I am sure I did something wrong. Is there anybody who successfully
runs R with Atlas and all processors? A short description of the
necessary steps would be helpful.

Searching around the internet was not very encouraging. Some people
wrote that it is not so simple to have Atlas fully exploit a multicore
computer.

I hope this is not too unspecific.

Best wishes,

Matthias





--
Dr. rer. nat. Matthias Gondan
Institut für Psychologie
Universität Regensburg
D-93050 Regensburg
Tel. 0941-943-3856
Fax 0941-943-3233
Email: matthias.gon...@psychologie.uni-regensburg.de
http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html



Re: [R] optimization subject to constraints

2010-08-10 Thread Matthias Gondan

 try this (package Rsolnp)

library(Rsolnp)

g <- function(x)
{
  return(x[1]^2+x[2]^2)
} # constraint

f <- function(x)
{
  return(x[1]+x[2])
} # objective function

x0 = c(1, 1)

solnp(x0, fun=f, eqfun=g, eqB=c(1))
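One caveat worth noting: solnp() minimizes its objective, so for Gildas's maximization problem the objective should be negated. A hedged sketch, assuming package Rsolnp is installed:

```r
library(Rsolnp)

f_neg <- function(x) -(x[1] + x[2])    # negate: solnp minimizes by default
g     <- function(x) x[1]^2 + x[2]^2   # equality constraint g(x) = 1

res <- solnp(c(1, 1), fun = f_neg, eqfun = g, eqB = 1)
res$pars   # approximately c(0.7071, 0.7071), i.e. 1/sqrt(2) each
```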



On 10.08.2010 14:59, Gildas Mazo wrote:

Thanks, but I still cannot get to solve my problem: consider this simple
example:



f <- function(x){
   return(x[1]+x[2])
} # objective function

g <- function(x){
   return(x[1]^2+x[2]^2)
} # constraint

#

I want to maximize f(x) subject to g(x) = 1. By hand the solution is
(1/sqrt(2), 1/sqrt(2)), with maximum value sqrt(2). This is maximizing a
linear function subject to a nonlinear equality constraint. I didn't find
any suitable function in the packages I went through.

Thanks in advance,

Gildas





Spencer Graves wrote:

To find every help page containing the term constrained
optimization, you can try the following:


library(sos)
co <- findFn('constrained optimization')


   Printing this co object opens a table in a web browser with
all matches sorted first by the package with the most matches and with
hot links to the documentation page.


writeFindFn2xls(co)


This writes an Excel file, with the browser table as the second
tab and the first being a summary of the packages.  This summary table
can be made more complete and useful using the installPackages
function, as noted in the sos vignette.


   A shameless plug from the lead author of the  sos package.
   Spencer Graves


On 8/9/2010 10:01 AM, Ravi Varadhan wrote:

constrOptim can only handle linear inequality constraints.  It cannot
handle
equality (linear or nonlinear) as well as nonlinear inequality
constraints.

Ravi.

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On
Behalf Of Dwayne Blind
Sent: Monday, August 09, 2010 12:56 PM
To: Gildas Mazo
Cc: r-help@r-project.org
Subject: Re: [R] optimization subject to constraints

Hi !

Why not constrOptim ?

Dwayne

2010/8/9 Gildas Mazogildas.m...@curie.fr


Dear R users,

I'm looking for tools to perform optimization subject to constraints,
both linear and non-linear. I don't mind which algorithm may be
used, my
primary aim is to get something general and easy-to-use to study
simple
examples.

Thanks for helping,

Gildas











Re: [R] optimization subject to constraints

2010-08-09 Thread Matthias Gondan

 try command solnp in package Rsolnp

On 09.08.2010 18:56, Dwayne Blind wrote:

Hi !

Why not constrOptim ?

Dwayne

2010/8/9 Gildas Mazogildas.m...@curie.fr


Dear R users,

I'm looking for tools to perform optimization subject to constraints,
both linear and non-linear. I don't mind which algorithm may be used, my
primary aim is to get something general and easy-to-use to study simple
examples.

Thanks for helping,

Gildas





[R] [OT] R on Atlas library

2010-08-06 Thread Matthias Gondan
Dear List,

I am aware this is slightly off-topic, but I am sure there are people who 
already had the problem and who perhaps solved it.

I am running long-lasting model fits using constrOptim command. At work
there is a linux computer (Quad Core, debian) on which I already have
compiled R and Atlas, in the hope that things will go faster on that
machine.

Atlas offers the possibility to be compiled for multiple cores, I used
that option, but without success. Performance meter (top) for Linux
shows 25% CPU usage, and there is no loss of performance if I run
4 instances of R doing heavy matrix multiplications. Performance drops
when a 5th instance of R is doing the same job, so I assume my attempt
was not successful.

I am sure I did something wrong. Is there anybody who successfully
runs R with Atlas and all processors? A short description of the
necessary steps would be helpful.

Searching around the internet was not very encouraging. Some people
wrote that it is not so simple to have Atlas fully exploit a multicore
computer.

I hope this is not too unspecific.

Best wishes,

Matthias




Re: [R] how to generate random data from an empirical distribution

2010-07-27 Thread Dr. Matthias Kohl

Hi Dennis,

you should take a look at the CRAN task view for distributions
http://cran.r-project.org/web/views/Distributions.html

Besides that, our distr family of packages might be useful; see also
http://www.jstatsoft.org/v35/i10/
http://cran.r-project.org/web/packages/distrDoc/vignettes/distr.pdf

Best,
Matthias

On 27.07.2010 10:37, Dennis Murphy wrote:

Hi:

On Mon, Jul 26, 2010 at 11:36 AM, xin weixin...@stat.psu.edu  wrote:



Hi, this is more a statistical question than an R question, but I do want to
know how to implement this in R.
I have 10,000 data points. Is there any way to generate an empirical
probability distribution from them (the problem is that I do not know what
exactly this distribution follows: normal, beta?). My ultimate goal is to
generate an additional 20,000 data points from this empirical distribution
created from the existing 10,000 data points.
Thank you all in advance.



The problem, it seems to me, is the leap of faith you're taking that the
empirical distribution of your manifest sample will serve as a useful
data-generating mechanism for the 20,000 future observations you want to
take. I would think that, if you intend to take a sample of 20,000 from ANY
distribution, you would want some confidence in the specification of said
distribution.

Even if you don't know exactly what type of population distribution you're
dealing with, there are ways to narrow down the set of possibilities. What
is the domain/support of the distribution? For example, the Normal is
defined on all of R (as in the real numbers, not our favorite statistical
programming language), whereas the lognormal, Gamma and Weibull
distributions, among others, are defined on the nonnegative reals. The beta
distribution is defined on [0, 1]. Therefore, knowledge of the domain is
useful in and of itself. Is it plausible that the distribution is symmetric,
or should it have a distinct left or right skew? (Similar comments apply to
discrete distributions.) Is censoring or truncation a relevant concern? If
there is a random process that well describes how the data you observe are
generated, that will certainly narrow down the class of potential
data-generating mechanisms/distributions.

Once you've narrowed down the class of possible distributions as much as
possible, you could look into the fitdistr() function in MASS or the
fitdistrplus package on CRAN to test out which candidates seem plausible wrt
your existing sample and which are not. You are not likely to be able to
narrow it down to one family of distributions, but you should have a much
better idea about the characteristics of the distribution that gave rise to
your sample of 10,000 (assuming, of course, that it is a *random* sample)
after going through this exercise, which you can apply to the generation of
the next 20,000 observations.

OTOH, if your existing 10,000 observations were not produced by some random
process, all bets are off.

HTH,
Dennis
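To make the advice above concrete, here is a hedged sketch; the 10,000-point sample is simulated as a stand-in, since the real data are not available. It shows three common ways to generate new data from an observed sample: a plain bootstrap, a smoothed bootstrap, and a parametric fit via MASS::fitdistr():

```r
library(MASS)

set.seed(1)
x <- rgamma(10000, shape = 2, rate = 0.5)   # stand-in for the observed sample

# (1) Bootstrap: resample observed values (reproduces only observed points)
new1 <- sample(x, 20000, replace = TRUE)

# (2) Smoothed bootstrap: resample and add kernel noise with the default bandwidth
new2 <- sample(x, 20000, replace = TRUE) + rnorm(20000, 0, bw.nrd0(x))

# (3) Parametric: fit a candidate family, then simulate from the fitted model
fit  <- fitdistr(x, "gamma")
new3 <- rgamma(20000, shape = fit$estimate["shape"], rate = fit$estimate["rate"])
```

Which of the three is appropriate depends on exactly the considerations Dennis lists: support, shape, and whether a parametric family can be justified.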











--
Dr. Matthias Kohl
www.stamats.de



Re: [R] Robust Asymptotic Statistics (RobASt)

2010-06-07 Thread Dr. Matthias Kohl

Dear Alex,

after installation of ROptRegTS there should be a folder "scripts" in 
the package directory of ROptRegTS which includes some further examples.
If you are assuming normal errors you should switch to package RobRex, 
which is optimized for normal errors. There you will also find a 
"scripts" folder in the package directory.


In general, multivariate regression should work, but I have to admit 
that we haven't tried many multivariate examples so far. Besides that, the 
user interface is still rudimentary at the moment, and some ideas which 
would clearly reduce computation time (as in the case of package RobLox) 
are unfortunately not yet implemented.


So, if you are willing to accept these drawbacks I will be glad to give 
you a hand to get our estimators to work for your data.


Best,
Matthias


On 06.06.2010 06:47, Alex Weslowski wrote:

Hi all,

Other than:
http://www.stamats.de/F2006.r

Are there other good simple examples out there of using the ROptRegTS
package (part of the RobASt project)? I'm hoping to plug it in for
multivariate regression. Or is this not a good idea? Just trying to
find out how it compares to rlm, lts, glm, etc. Hopefully this makes
sense, I'm new to the world of statistics and R.

Thanks!

St0x






[R] Problem with format(, "%G-%V") - Segfault

2010-04-30 Thread Matthias Rieber

Hi,

I've some problems with the new R version converting date to year-week:

R version 2.11.0 (2010-04-22)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

> load("test.data.R")
> str(test.data)
Class 'Date'  num [1:3599546] 13888 14166 14188 14189 14189 ...
> result <- format(test.data, format = "%G-%V")
Error: segfault from C stack overflow


this happens with a self compiled R version and the debian packages.
The same operation works with R 2.9.x and R 2.10.x

I've attached the backtrace.

kind regards,

Matthias



Core was generated by `/opt/R/lib64/R/bin/exec/R --vanilla --no-save'.
Program terminated with signal 11, Segmentation fault.
#0  0x7f855f6e34d1 in __strftime_internal (s=<value optimized out>, 
maxsize=256, format=<value optimized out>, tp=<value optimized out>, 
tzset_called=<value optimized out>, loc=0x7f855f9b0580) at strftime_l.c:1046
in strftime_l.c
#0  0x7f855f6e34d1 in __strftime_internal (s=<value optimized out>, 
maxsize=256, format=<value optimized out>, tp=<value optimized out>, 
tzset_called=<value optimized out>, loc=0x7f855f9b0580) at strftime_l.c:1046
_n = 4
_delta = <value optimized out>
modifier = <value optimized out>
number_value = <value optimized out>
subfmt = 0x7f85fffd <Address 0x7f85fffd out of bounds>
buf = "P\312\313e\377\177\000\000\364\226e\000\000\000\000\000%t2010"
width = <value optimized out>
to_lowcase = <value optimized out>
pad = 0
digits = <value optimized out>
negative_number = <value optimized out>
change_case = <value optimized out>
bufp = 0x7fff652c20e2 "2010"
to_uppcase = 0
current = 0x7f855f9ae020
hour12 = 12
zone = 0x0
i = 0
p = 0x7fff65cbcac0 "2010-06"
f = 0x7fff652c2151 "G-%V"
#1  0x7f855f6e3db6 in *__GI___strftime_l (s=0x7fff65cbcac0 "2010-06", 
maxsize=140734890778850, format=0x4 <Address 0x4 out of bounds>, 
tp=0x7fff652c20e2, loc=<value optimized out>) at strftime_l.c:490
tzset_called = false
#2  0x0052677f in do_formatPOSIXlt (call=<value optimized out>, 
op=<value optimized out>, args=<value optimized out>, env=<value optimized 
out>) at datetime.c:809
q = 0x33a7420 "%G-%V"
secs = 0
fsecs = <value optimized out>
x = 0x2cd3bd0
sformat = 0x3394328
ans = 0x148bd230
tz = 0x33a7788
i = <value optimized out>
n = <value optimized out>
N = 3599546
nlen = {3599546, 3599546, 3599546, 3599546, 3599546, 3599546, 3599546, 
3599546, 3599546}
UseTZ = 0
buff = 2010-06, '\000' repeats 17 times\270, 
\245:\003\000\000\000\000\377\377\377\377\000\000\000\000\350\316\313e\377\177\000\000\202\231a,
 '\000' repeats 13 times, 
\002\000\000\000\000\000\000\000`ؚ_\205\177\000\000\002\000\000\000\000\000\000\000t\313\313e\377\177\000\000p\313\313e\377\177\000\000\000\000\000\000\000\000\000\000\024\000\000\000\002\000\000\000\000\016\271^\301\277u\372p\313\313e\377\177\000\000\270\245:\003\000\000\000\000\060\315\313e\377\177,
 '\000' repeats 18 times\270, 
\245:\003\000\000\000\000\270\245:\003\000\000\000\000\230~H\000\000\000\000\000Ц:\003\000\000\000\000P\235:\003,
 '\000' repeats 12 times, 
Ц:\003\000\000\000\000\003\000\000\000\000\000\000\000\230U:\003\000\000\000\000P\235:\003\000\000\000\000\200T:\003\000\000\000\000Ѝ:\003\000\000\000\000\230...
p = <value optimized out>
tm = {tm_sec = 0, tm_min = 0, tm_hour = 0, tm_mday = 13, tm_mon = 1, 
tm_year = 110, tm_wday = 6, tm_yday = 43, tm_isdst = 0, tm_gmtoff = 0, tm_zone 
= 0x0}
#3  0x00429f93 in do_internal (call=<value optimized out>, op=<value 
optimized out>, args=0x33a8d98, env=0x33a9d50) at names.c:1185
s = 0x33a54f0
fun = 0x2178e50
ans = <value optimized out>
save = 45
flag = <value optimized out>
vmax = 0x0
#4  0x0055c7b9 in Rf_eval (e=0x33a5480, rho=0x33a9d50) at eval.c:464
save = 44
flag = 2
vmax = 0x0
op = 0x2157fc0
tmp = <value optimized out>
evalcount = 508
srcrefsave = 0x213a608
depthsave = 8
#5  0x0055ed3c in do_begin (call=0x33aa778, op=0x2157870, 
args=0x33a5448, rho=0x33a9d50) at eval.c:1245
srcrefs = 0x213a608
i = 1
s = 0x213a608
#6  0x0055c7b9 in Rf_eval (e=0x33aa778, rho=0x33a9d50) at eval.c:464
save = 42
flag = 2
vmax = 0x0
op = 0x2157870
tmp = <value optimized out>
evalcount = 508
srcrefsave = 0x213a608
depthsave = 7
#7  0x00560560 in Rf_applyClosure (call=<value optimized out>, 
op=<value optimized out>, arglist=0x33a4e10, rho=<value optimized out>, 
suppliedenv=0x33a9d18) at eval.c:699
body = 0x33aa778
formals = <value optimized out>
actuals = 0x33a9d18
savedrho = 0x213a608
newrho = 0x33a9d50
f = <value optimized out>
a = 0x213a608

Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthias Gondan
 This is your first of three postings in the last hour and they are all in 
 a category that could well be described as requests for tutoring in basic 
 statistical topics. I am of the impression you have been requested not to 
 engage in such behavior on this list. For this question, for instance, 
 there is an entire CRAN Task View available, and you have in particular 
 been asked to use such resources before posting.

Please allow me to ask for details on this task view, because I am interested 
in the topic of nonparametric ANOVAs, as well. To my knowledge,
there are some R scripts from Brunner et al. available on his website

http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html

But they seem not to be working with current R versions.

Best regards,

Matthias Gondan






[R] Accessing named elements of a vector

2010-02-25 Thread Matthias Gondan

Dear R developers,

A great R feature is that elements of vectors, lists and dataframes can 
have names:


vx = c(a=1, b=2)
lx = list(a=1, b=2)

Accessing element a of vx: vx['a']
Accessing element a of lx: lx[['a']] or lx$a

It might be a matter of taste, but I like the $ very much. Unfortunately,
vx$a does not work. Would it break existing compatibility if the $ were
allowed to access elements of a vector as well?

Best regards,

Matthias
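For completeness, a small sketch of the existing ways to get at a named vector element, plus the list conversion that makes $ available:

```r
vx <- c(a = 1, b = 2)

vx["a"]          # named element, keeps the name
vx[["a"]]        # element value without the name
as.list(vx)$a    # $ works after conversion to a list
```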



[R] [R-pkgs] new package RFLPtools

2010-02-20 Thread Dr. Matthias Kohl

The new package RFLPtools is available on CRAN.

RFLPtools provides analysis functions for DNA fragment molecular weights 
(e.g. derived from RFLP analysis) and nucleotide sequence similarities. 
It aims mainly at the identification of similar or identical fragment 
patterns to evaluate the number of different genotypes gained from
environmental samples during diversity studies, and at further analysis 
of similarities of nucleotide sequences derived from pairwise sequence 
alignments (e.g. from standalone BLAST). Additionally, 
functions to check the data quality of molecular fingerprints are 
available. To identify the organisms represented by the extracted 
fragment patterns (e.g. RFLP genotypes), RFLPtools includes a function 
to compare samples with fragment patterns stored in a reference dataset 
for which the taxonomic affiliations are already known. To identify 
unknown samples in scientific projects, DNA sequences are used, and the 
resulting DNA sequences are compared to existing sequence data via 
alignment tools such as standalone BLAST.
RFLPtools also offers tools to generate groups based on tabular report 
files of pairwise nucleotide sequence alignments.


To get a first impression try:

vignette("RFLPtools")

as well as

library(RFLPtools)
example(RFLPplot)
example(RFLPrefplot)

Best regards,
Fabienne
Alexandra
Matthias


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] [R-pkgs] new version of RobASt-family of packages

2009-12-09 Thread Matthias Kohl
The new version 0.7 of our RobASt family of packages has been available on 
CRAN for several days. As there were many changes, we will only sketch 
the most important ones here. For more details see the corresponding 
NEWS files, e.g. news(package = "RobAStBase"), or use function NEWS 
from package startupmsg, i.e. NEWS("RobAStBase").



First of all the new package RobLoxBioC was added to the family which 
includes S4 classes and methods for preprocessing omics data, in 
particular gene expression data.



##
## All packages (RandVar, RobAStBase, RobLox, RobLoxBioC,
## ROptEst, ROptEstOld, ROptRegTS, RobRex)
##
- TOBEDONE file was added as a starting point for collaborations. The
file can be displayed via function TOBEDONE from package startupmsg;
e.g. TOBEDONE("distr")
- tests/Examples folder for some automatic testing was introduced


##
## Package RandVar
##
- mainly fixing of warnings and bugs


##
## Package RobAStBase
##
- enhanced plotting, in particular methods for qqplot
- unified treatment of NAs
- extended implementation for total variation neighbourhoods
- implementation of k-step estimator construction extended


##
## Package RobLox
##
- na.rm argument added
- introduction of finite-sample correction


##
## Package ROptEst
##
- optional use of alternative algorithm to obtain Lagrange multipliers
 using duality based optimization
- extended implementation for total variation neighbourhoods
- solutions for general parameter transformations with nuisance
 components
- several extensions to the examples in folder scripts
- implementation of k-step estimator construction extended


##
## Package ROptEstOld
##
- still needed for packages ROptRegTS and RobRex
- removed Symmetry and DistributionSymmetry implementation to
 make ROptEstOld compatible with distr 2.2


##
## Package ROptRegTS
##
- still depends on ROptEstOld


##
## Packages RobRex
##
- moved some of the examples in \dontrun{} to reduce check time ...
- some minor corrections in ExamplesEstimation.R in folder scripts


Best
Peter
Matthias





[R] [R-pkgs] new version of distr-family of packages

2009-11-24 Thread Matthias Kohl
The new version 2.2 of our distr family of packages has now been 
available on CRAN for several days. As there were many changes, we will 
only sketch the most important ones here. For more details see the 
corresponding NEWS files, e.g. news(package = "distr"), or use function 
NEWS from package startupmsg, i.e. NEWS("distr").


First of all a new package distrEllipse was added to the family which 
includes S4 classes and methods for elliptical symmetric distributions.


##
## All packages (distr, distrEx, distrSim, distrTEst, distrTeach,
## distrMod, distrEllipse, distrDoc as well as
## startupmsg and SweaveListingUtils)
##
- TOBEDONE file was added as a starting point for collaborations. The
 file can be displayed via function TOBEDONE from package startupmsg;
 e.g. TOBEDONE("distr")
- tests/Examples folder for some automatic testing was introduced


##
## Package distr
##
- new diagnostic function (more specifically S4 method)
 qqplot()
 for quantile-quantile plots to work with distribution objects
 as argument;  in addition to function qqplot() from package stats
 has functionality for point-wise and simultaneous confidence bands
- under the hood: a new slot symmetry was introduced
 for distributions

##
## Package distrEx
##
- enhanced E (expectation), m1df, m2df (clipped 1. and 2.
 moments) methods  --- now you can specify lower and
 upper integration bounds in the E(.), var(.) functions
- new class GPareto for generalized Pareto distribution
 (implemented by Nataliya Horbenko)

##
## Package distrMod
##
- methods for qqplot() for parametric model objects as argument
- unified treatment of NAs

##
## Package SweaveListingUtils
##
- commands:
* SweaveListingPreparations() is more flexible
   [see help & NEWS file]
* integrated Andrew Ellis's nice ideas into SweaveListingUtils
   to use \lstenvironment
* individual markup style for base or recommended packages
  (checked for with new function isBaseOrRecommended()) is
   distinct now by default (extra color RRecomdcolor)
* temporarily changes (like background color) made easier:
   by new functions
+ lstdefRstyle to change Rstyle
+ lstsetRall to change all R-like styles (i.e. Rstyle, Rinstyle,
   Routstyle, Rcodestyle)
* now can specify markup styles for Sinput, Soutput, Scode,
   separately as Rin, Rout, Rcode
* colors now have suffix color, i.e.
Rcomment -> Rcommentcolor, Rout -> Routcolor
- vignette:
* included an example with escape sequences in vignette
* included an example with framed code in vignette

Best
Peter
Matthias





[R] plot filled.contour over continent map

2009-11-19 Thread Matthias Demuzere
Dear all,

As a newbie in R I would like to do the following (simple?) thing:

to plot a filled.contour plot over a map showing country boundaries (e.g. for 
Europe)
What i do is:
map('worldHires', xlim = c(-10, 40), ylim = c(35, 70), boundary = TRUE, border = 0.1)
map.axes()
filled.contour(mslp, zlim = c(1000, 1020), color.palette = 
colorRampPalette(c("blue", "white", "red")), main = "Averaged MSLP (hPa) ERA40 JJA 
[1996-2002]", xlab = "Longitude", ylab = "Latitude")

in which the mslp file is a netcdf file, with mean sea level pressure for a 
range of lat/lon values.

If I run the above, I just get the map of Europe, without the 
contour plot.
When I comment out the map statements I can see the contour plot.

So I am doing something wrong, but I really have no idea what?

Anybody could help me out here?
Thanks in advance,

Matthias

-
Department of Earth & Environmental Sciences
Physical and Regional Geography Research Group
Regional climate studies

Celestijnenlaan 200E
3001 Heverlee (Leuven)
BELGIUM

Tel: + 32 16 326424
Fax: + 32 16 322980

http://geo.kuleuven.be/aow/
www.kuleuven.be/geography
-





[R] COURSE: Introduction to Metabolomics and biomarker research

2009-10-05 Thread Matthias Kohl

Introduction to Metabolomics and biomarker research.

The course will take place in Innsbruck, 16th to 17th November 2009, and
will be held in German.

For more details see:
http://www.gdch.de/vas/fortbildung/kurse/fortbildung2009.htm#1175




Re: [R] programming to calculate variance

2009-09-30 Thread Matthias Gondan

I think it should be

var(y[(i-3):(i-1)])

instead of 

var(x[i-3:i-1,])

otherwise the values of the vector are overwritten. (Note also that y is
a vector, so it takes no second index, and that the parentheses in
(i-3):(i-1) are needed, because : binds more tightly than -.)

Best wishes,

Matthias
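Putting the pieces together, a minimal corrected sketch of the rolling variance on the quoted example data:

```r
y <- c(2, 1, 5, 8, 11, 3, 1, 7, 50, 21, 33, 7, 60)

v <- rep(NA_real_, length(y))        # NA for the first three positions
for (i in 4:length(y)) {
  # parentheses matter: i-3:i-1 would parse as i - (3:i) - 1
  v[i] <- var(y[(i - 3):(i - 1)])
}
v[4]   # variance of y[1:3], i.e. var(c(2, 1, 5))
```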



Marlene Marchena wrote:

Dear R-user

Suppose I have the following data

 y=c(2,1,5,8,11,3,1,7,50,21,33,7,60)

x=data.frame(y)

for(i in 4:nrow(x)) x[i,] =var(x[i-3:i-1,])

I'm trying to get a new variable with the variance of the 3 previous values
(just an example), with NA in the first three positions. I know that my
for() is wrong, but I'm not able to find my error.

Any idea?

Thanks,

Marlene.



  






[R] Example data for analysis of a highly pseudoreplicate mixed-effects experiment

2009-09-16 Thread Matthias Gralle

Hello everybody,

it may be better to have sample data. I have provided data with fewer 
levels of gene and day and only ca. 400 data points per condition.


Sample code:
small = as.data.frame(read.csv("small.csv"))
small$species=factor(small$species)
small$gene=factor(small$gene)
small$day=factor(small$day)
small$repl=factor(small$repl)
library(lme4)
model1=lmer(APC~(FITC+species+gene)^3+(1|day)+(1|repl),REML=F,data=small)
model2=lmer(APC~(FITC+species+gene)^2+(1|day)+(1|repl),REML=F,data=small)
anova(model1,model2)

gives me a significant difference, but in fact there are far too many 
residual df, and this is much worse in the original data. I suppose I 
cannot trust this difference.


model3=lmer(APC~species*gene+(1|day/repl/FITC),data=small)
Error: length(f1) == length(f2) is not TRUE
In addition: Warning messages:
1: In FITC:(repl:day) :
 numerical expression has 886 elements: only the first used
2: In FITC:(repl:day) :
 numerical expression has 886 elements: only the first used

Can I do nesting without incurring this kind of error ?

And finally
model4=lmer(APC~gene*species+(1|day) + (1|repl) + 
(1+(gene:species)|FITC),data=small)

model5=lmer(APC~gene+species+(1|day) + (1|repl) + (1|FITC),data=small)
anova(model4,model5)

works with this reduced data set, but fails to converge for the 
original, much larger data set. I am unsure if this is the right kind of 
analysis for the data and there is only a problem of convergence, or if 
it is the wrong formula.


Can anybody give me some advice on this problem ? Or should I just stick 
to reducing the data from each condition to a single parameter, as 
explained in my first email below?


Thank you again!

Matthias Gralle wrote:

Hello everybody,

I have been trying for some weeks to state the correct design of my 
experiment as a GLM formula, and have not been able to find something 
appropriate in Pinheiro & Bates, so I am posting it here and hope 
somebody can help me.


In each experimental condition, described by
1) gene (10 levels, fixed, because of high interest to me)
2) species (2 levels, fixed, because of high interest)
3) day (2 levels, random)
4) replicate (2 levels per day, random),

I have several thousand data points consisting of two variables:

5) FITC (level of transfection of a cell)
6) APC (antibody binding to the cell)
Because of intrinsic and uncontrollable cell-to-cell variation, FITC 
varies quite uniformly over a wide range, and APC correlates rather 
well with FITC. In some cases, I pasted day and replicate together as 
day_repl.


My question is the following:

Is there any gene (in my set of 10 genes) where the species makes a 
difference in the relation between FITC and APC ? If yes, in what gene 
does species have an effect ? And what is the effect of the species 
difference ?


My attempts are the following:
1. Fit the data points of each experimental condition to a linear 
equation APC=Intercept+Slope*FITC and analyse the slopes :

lm(Slope~species*gene*day_repl)
This analysis shows clear differences between the genes, but no effect 
of species and no interaction gene:species.


The linear fit to the cells is reasonably good, but of course does not 
represent the data set completely, so I wanted to incorporate the 
complete data set.


2a. lmer(APC~FITC*species*gene+(1|day)+(1|repl))
This gives extremely significant values for any interaction and 
variable because there are 200 000 df. Of course, it cannot be true, 
because the cells are not really independent. I have done many 
variations of the above, e.g.

2b. lmer(APC~FITC*species*gene+(1|day)+(1+FITC|day_repl)),
but they all suffer from the excess of df.

3. lmer(APC~species*gene+(1|day/repl/FITC) gives several warning 
messages like this one:

In repl:day :
numerical expression has 275591 elements: only the first used

4. lmer(APC~gene*species+(1|day_repl) + (1+gene:species|FITC)) ran 
several days, but failed to converge...


Can somebody give me any hint, or do you think the only possible 
analysis is a simplification as in my model 1 ?


By the way, I am using R version 2.8.0 (2008-10-20) on Ubuntu 8.04 on 
a linux 2.6.24-24-generic kernel on different Intel systems. I am 
using the lme4 that came with R 2.8.0.


Thank you very much for your time!

-- Matthias Gralle, PhD
Dept. Evolutionary Genetics
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig, Germany
Tel +49 341 3550 519
Fax +49 341 3550 555







--
Matthias Gralle, PhD
Dept. Evolutionary Genetics
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig, Germany
Tel +49 341 3550 519
Fax +49 341 3550 555


Re: [R] Example data for analysis of a highly pseudoreplicate mixed-effects experiment

2009-09-16 Thread Matthias Gralle
There were some wrong NA values in the provided data set; this is now 
corrected.

The data can be read in as

small=read.csv("small.csv",colClasses=c("character",rep("integer",2),rep("factor",5)))

The high number of residual df can be seen using the nlme package (can 
it be seen in the lme4 package, too ?):

library(nlme)
model1=lme(APC~(FITC+species+gene)^3,random=~1|day_repl,method="ML",data=small) 


anova(model1)
                  numDF denDF   F-value p-value
(Intercept)           1   789 1365.7400  <.0001
FITC                  1   789 3040.8168  <.0001
species               1   789   24.0521  <.0001
gene                  1   789   10.4035   0.0013
FITC:species          1   789    5.0690   0.0246
FITC:gene             1   789   10.7925   0.0011
species:gene          1   789   72.5823  <.0001
FITC:species:gene     1   789    4.6742   0.0309

In the original data set, denDF is 200 000.

Once again, how do I correctly account for pseudoreplication, avoiding 
the artificially high number of df ?
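One standard remedy for pseudoreplication, and essentially the poster's "model 1", is a two-stage summary analysis: reduce each condition to its fitted slope, then analyse the slopes, so each condition contributes one observation instead of thousands. A base-R sketch on simulated data (all names and numbers here are illustrative, chosen by me, not taken from the thread):

```r
set.seed(1)
conds <- expand.grid(gene = factor(1:3), species = factor(1:2),
                     day_repl = factor(1:4))
conds$Slope <- NA
# Stage 1: within each condition, regress APC on FITC over the
# cell-level points and keep only the slope (one value per condition).
for (i in seq_len(nrow(conds))) {
  FITC <- runif(400, 0, 10)
  APC  <- (1 + 0.5 * (conds$gene[i] == "2")) * FITC + rnorm(400)  # toy truth
  conds$Slope[i] <- coef(lm(APC ~ FITC))[["FITC"]]
}
# Stage 2: analyse the slopes with ordinary lm(), as in "model 1";
# the residual df now reflect conditions, not individual cells.
anova(lm(Slope ~ species * gene, data = conds))
```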


Thank you,

Matthias

Matthias Gralle wrote:

Hello everybody,

it may be better to have sample data. I have provided data with fewer 
levels of gene and day and only ca. 400 data points per condition.


Sample code:
small=as.data.frame(read.csv("small.csv"))
small$species=factor(small$species)
small$gene=factor(small$gene)
small$day=factor(small$day)
small$repl=factor(small$repl)
library(lme4)
model1=lmer(APC~(FITC+species+gene)^3+(1|day)+(1|repl),REML=F,data=small)
model2=lmer(APC~(FITC+species+gene)^2+(1|day)+(1|repl),REML=F,data=small)
anova(model1,model2)

gives me a significant difference, but in fact there are far too many 
residual df, and this is much worse in the original data. I suppose I 
cannot trust this difference.


model3=lmer(APC~species*gene+(1|day/repl/FITC),data=small)
Error: length(f1) == length(f2) is not TRUE
In addition: Warning messages:
1: In FITC:(repl:day) :
 numerical expression has 886 elements: only the first used
2: In FITC:(repl:day) :
 numerical expression has 886 elements: only the first used

Can I do nesting without incurring this kind of error ?

And finally
model4=lmer(APC~gene*species+(1|day) + (1|repl) + 
(1+(gene:species)|FITC),data=small)

model5=lmer(APC~gene+species+(1|day) + (1|repl) + (1|FITC),data=small)
anova(model4,model5)

works with this reduced data set, but fails to converge for the 
original, much larger data set. I am unsure if this is the right kind 
of analysis for the data and there is only a problem of convergence, 
or if it is the wrong formula.


Can anybody give me some advice on this problem ? Or should I just 
stick to reducing the data from each condition to a single parameter, 
as explained in my first email below?


Thank you again!

[R] Analysis of a highly pseudoreplicate mixed-effects experiment

2009-09-14 Thread Matthias Gralle

Hello everybody,

I have been trying for some weeks to state the correct design of my 
experiment as a GLM formula, and have not been able to find something 
appropriate in Pinheiro & Bates, so I am posting it here and hope 
somebody can help me.


In each experimental condition, described by
1) gene (10 levels, fixed, because of high interest to me)
2) species (2 levels, fixed, because of high interest)
3) day (2 levels, random)
4) replicate (2 levels per day, random),

I have several thousand data points consisting of two variables:

5) FITC (level of transfection of a cell)
6) APC (antibody binding to the cell)
Because of intrinsic and uncontrollable cell-to-cell variation, FITC 
varies quite uniformly over a wide range, and APC correlates rather well 
with FITC. In some cases, I pasted day and replicate together as day_repl.


My question is the following:

Is there any gene (in my set of 10 genes) where the species makes a 
difference in the relation between FITC and APC ? If yes, in what gene 
does species have an effect ? And what is the effect of the species 
difference ?


My attempts are the following:
1. Fit the data points of each experimental condition to a linear 
equation APC=Intercept+Slope*FITC and analyse the slopes :

lm(Slope~species*gene*day_repl)
This analysis shows clear differences between the genes, but no effect 
of species and no interaction gene:species.


The linear fit to the cells is reasonably good, but of course does not 
represent the data set completely, so I wanted to incorporate the 
complete data set.


2a. lmer(APC~FITC*species*gene+(1|day)+(1|repl))
This gives extremely significant values for any interaction and variable 
because there are 200 000 df. Of course, it cannot be true, because the 
cells are not really independent. I have done many variations of the 
above, e.g.

2b. lmer(APC~FITC*species*gene+(1|day)+(1+FITC|day_repl)),
but they all suffer from the excess of df.

3. lmer(APC~species*gene+(1|day/repl/FITC) gives several warning 
messages like this one:

In repl:day :
numerical expression has 275591 elements: only the first used

4. lmer(APC~gene*species+(1|day_repl) + (1+gene:species|FITC)) ran 
several days, but failed to converge...


Can somebody give me any hint, or do you think the only possible 
analysis is a simplification as in my model 1 ?


By the way, I am using R version 2.8.0 (2008-10-20) on Ubuntu 8.04 on a 
linux 2.6.24-24-generic kernel on different Intel systems. I am using 
the lme4 that came with R 2.8.0.


Thank you very much for your time!

-- Matthias Gralle, PhD
Dept. Evolutionary Genetics
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig, Germany
Tel +49 341 3550 519
Fax +49 341 3550 555



[R] Test for stochastic dominance, non-inferiority test for distributions

2009-08-31 Thread Matthias Gondan
Dear R-Users,

Is anyone aware of a significance test which allows 
demonstrating that one distribution dominates another?

Let F(t) and G(t) be two distribution functions, the
alternative hypothesis would be something like:

F(t) >= G(t), for all t

null hypothesis: F(t) < G(t), for some t.


Best wishes,

Matthias


PS. This one would be ok, as well:

F(t) > G(t), for all t

null hypothesis: F(t) <= G(t), for some t.
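One off-the-shelf possibility (my suggestion, not an answer from the thread) is the one-sided two-sample Kolmogorov-Smirnov test: with alternative = "greater", the alternative hypothesis is that the CDF of x lies above that of y. Note its null is equality of distributions, so it does not give the non-inferiority formulation asked for, only a directional test:

```r
# One-sided two-sample KS test; with y shifted to the right, the CDF of
# x lies above the CDF of y (x is stochastically smaller), so the test
# should detect the dominance.
set.seed(1)
x <- rnorm(200)             # sample from F
y <- rnorm(200, mean = 1)   # sample from G, shifted right: F(t) >= G(t)
ks.test(x, y, alternative = "greater")
```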




Re: [R] help with median for each row

2009-08-21 Thread Matthias Kohl
if your matrix has many rows you might want to consider rowMedians from 
Bioconductor package Biobase.

hth,
Matthias

Greg Hirson schrieb:

Edward,

See ?apply

x = matrix(rnorm(100), ncol = 10)
apply(x, 1, median)

Hope this helps,

Greg

Edward Chen wrote:

Hi,

I tried looking through Google search on whether there's a way to 
compute the median for each row of an nxn matrix and return the medians 
for each row for further computation.
And also, if the number of columns in the matrix is even, how could I
specify which median to use?

Thank you very much!
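On the even-column question: median() returns the mean of the two middle values; if one of the two middle order statistics is wanted instead, index the sorted row directly. An illustration of mine, not from the thread:

```r
# Each row has 4 (an even number of) values.
x <- matrix(1:8, nrow = 2)            # rows: 1 3 5 7  and  2 4 6 8
apply(x, 1, median)                   # 4 5  (mean of the two middle values)

# To pick the lower of the two middle values instead:
low_med <- function(v) sort(v)[ceiling(length(v) / 2)]
apply(x, 1, low_med)                  # 3 4
```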

  




--
Dr. Matthias Kohl
www.stamats.de



Re: [R] how to change the quantile method in bwplot

2009-07-21 Thread Matthias Kohl

for teaching purposes I wrote a corresponding function; cf.
qbxp.stats (as well as qboxplot ...) in package MKmisc.

hth,
Matthias

Deepayan Sarkar schrieb:

On Tue, Jul 21, 2009 at 7:47 AM, Jun Shen jun.shen...@gmail.com wrote:
  

Uwe,

Thank you for your reply.  I am still not very clear about the meanings of
the arguments in the stats function.   To make it clearer, quantile() uses
type=7 as default method. I believe this is the method bwplot() uses to
calculate the quantiles. I want to use type=6 method for bwplot(). How do I
achieve that? Thanks again.



Maybe this will be clearer: bwplot() uses the boxplot.stats() function
to compute the quantiles used, which in turn uses fivenum(), which
has its own quantile calculation (and does not explicitly use
quantile()). There is no easy way to allow for type=6 etc. here.

bwplot() allows you to replace boxplot.stats() and provide your own
alternative. So what you need to do is:

(1) write a function, say, 'my.boxplot.stats', that takes the same
arguments as boxplot.stats() and returns a similar result, but using
your preferred calculation for the quantiles. There are many ways to
do this.

(2) plug in this function into the bwplot() call; e.g. bwplot(...,
stats = my.boxplot.stats)

-Deepayan
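A sketch of steps (1) and (2) above (my own implementation, with assumptions: only the hinges and extremes are switched to quantile(type = 6); whiskers follow the usual 1.5 * IQR rule of boxplot.stats(); the function name my.boxplot.stats comes from the reply):

```r
library(lattice)

# boxplot.stats() replacement whose quantiles use type = 6.
my.boxplot.stats <- function(x, coef = 1.5, do.conf = TRUE, do.out = TRUE) {
  x <- x[!is.na(x)]
  stats <- as.numeric(quantile(x, c(0, 0.25, 0.5, 0.75, 1), type = 6))
  iqr <- stats[4] - stats[2]
  out <- x[x < stats[2] - coef * iqr | x > stats[4] + coef * iqr]
  if (length(out))                          # whiskers: extremes inside fence
    stats[c(1, 5)] <- range(x[!(x %in% out)])
  list(stats = stats, n = length(x), conf = numeric(0),
       out = if (do.out) out else numeric(0))
}

# Step (2): plug it into the bwplot() call.
bwplot(~ rnorm(100), stats = my.boxplot.stats)
```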


  

Jun

2009/7/21 Uwe Ligges lig...@statistik.tu-dortmund.de



Jun Shen wrote:

  

Hi, everyone,

Since quantile calculation has nine different methods in R, I wonder how I
specify a method when calling bwplot() in lattice. I couldn't find any
information in the documentation. Thanks.




bwplot() uses the panel function panel.bwplot(), which allows you to specify
a function that calculates the statistics via its argument stats, which
defaults to boxplot.stats(). Hence you can change that function.

Example with some fixed values:

bwplot( ~ 1:10,
   stats = function(x, ...)
   return(list(stats = 1:5, n = 10, conf = c(1, 10), out = integer(0)))
)


Uwe Ligges

  






  





Re: [R] trouble using optim for maximalisation of 2-parameter function

2009-07-19 Thread Matthias Kohl

try:

# first argument of llik.nor has to be the parameter
llik.nor <- function(theta, x){mu <- theta[1]; sig <- theta[2]; -length(x)*log(sqrt(2*pi)) - length(x)*log(sig) - (1/(2*sig**2))*sum((x-mu)**2)}

# optim by default does minimization
# setting fnscale = -1 one obtains a maximization problem
optim(c(1,5), llik.nor, x=x, control = list(fnscale = -1))

hth,
Matthias
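A self-contained, runnable version of the snippet above, using the data and the hand-calculated targets quoted later in this thread (mu = 3, sigma = 2.366):

```r
# Normal log-likelihood with the parameter vector as the FIRST argument,
# maximized via optim() with control = list(fnscale = -1).
llik.nor <- function(theta, x) {
  mu <- theta[1]; sig <- theta[2]
  -length(x) * log(sqrt(2 * pi)) - length(x) * log(sig) -
    (1 / (2 * sig^2)) * sum((x - mu)^2)
}
x <- c(-1, 4, 6, 4, 2)
fit <- optim(c(1, 5), llik.nor, x = x, control = list(fnscale = -1))
fit$par  # close to mu = 3, sigma = 2.366
```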

Anjali Sing schrieb:

Hello, I am having trouble using optim.

I want to maximise a function over its parameters [kind of like univariate
maximum likelihood estimation, but I wrote the likelihood function myself
because of data issues]

When I try to optimize a function for only one parameter there is no
problem:

llik.expo <- function(x, lam){(length(x)*log(lam)) - (length(x)*log(1 - exp(-1*lam*cons))) - lam*sum(x)}

cons <- 

data <- c(.)

expomx <- optimize(llik.expo, c(0, 100), maximum = TRUE, tol = 0.0001, x = data)
expomx


 To optimise over two parameters you can't use optimize(), so I tried the
following to test my input:

  
llik.nor <- function(x, theta){mu <- theta[1]; sig <- theta[2]; -length(x)*log(sqrt(2*pi)) - length(x)*log(sig) - (1/(2*sig**2))*sum((x-mu)**2)}
  

x <- c(-1, 4, 6, 4, 2)
normx <- optim(c(1, 20), llik.nor)



the output should be close to
parameters: mu=3 and sigma=2.366
[This I calculated by hand to compare with the output]

but instead of output I get an error:
Error in fn(par, ...) : argument theta is missing, with no default

I don't understand why this happens. I hope someone can help me with this,
for I am still an [R]ookie.

Kind regards,
Sonko Lady  [Anjali]


  





Re: [R] implementing Maximum Likelihood with distrMod when only the PDF is known

2009-06-24 Thread Matthias Kohl

Dear Guillaume,

retrospectively, I'm not sure if the decision to have special initialize 
methods was the optimal way to go. In distrMod and our other packages on 
robust statistics we don't introduce special initialize methods, but use 
so-called generating functions. This approach has the advantage that the 
default initialize method can be used for programming, where I find it 
very useful.


I would say it is the canonical (and recommended) way to first define a 
function as generic (like mu) and then implement methods for it.


Choosing names for which there is already an existing definition for a 
generic function may also not be what you want in general. By defining 
new methods you have to respect the definition of the generic function; 
that is, your method definition has to be with respect to the arguments 
in the given generic function and also dispatching will be on these 
arguments (cf. scale in the distrMod example).


In defining our classes we decided that at least the slot r has to be 
filled (of class function) whereas d, p and q are of class 
OptionalFunction. Hence, there are functions to fill d, p and q for 
given r but, so far not for filling r, p and q given d.


A way to avoid implementing r is given in
http://www.stamats.de/distrModExample1.R

I also do not fill the slots p and q in this example. This avoids the 
simulation of a large random sample and speeds up the computation of the 
MLE.


However, this is rather a quick and dirty solution and it would of 
course be better to have a valid definition for r, d, p and q.


Best,
Matthias


guillaume.martin schrieb:

Dear Mathias,
That's pretty amazing, thanks a lot ! I'll have to look all this 
through because I don't easily understand why each part has to be set 
up, in particular the initialize method part seems crucial and is 
not easy to intuit. From what I get, the actual name we give to a 
parameter (my original mu for example) is important in itself, and 
if we introduce new variable names we have to define a new generic, 
right? The simplest option then is to re-use an existing variable name 
that has the same properties/range, right?


Another general question: my actual pdf is of the same type but not 
the exact same as the skew normal. In particular, I don't have a rule 
for building the slot r (eg the one borrowed from the sn package in 
your example); is it a problem? isn't it sufficient to give slot d, 
and then you have automatic methods implemented to get from d() to r() 
slots etc. is that right?


Thanks a lot for your help and time !

Best,

Guillaume


Matthias Kohl a écrit :

Dear Guillaume,

thanks for your interest in the distrMod package.

Regarding your question I took up your example and put a file under:

http://www.stamats.de/distrModExample.R

Hope that helps ...

Don't hesitate to contact me if you have further questions!

Best,
Matthias

guillaume.martin schrieb:

Dear R users and Dear authors of the distr package and sequels

I am trying to use the (very nice) package distrMod as I want to 
implement maximum likelihood (ML) fit of some univariate data for 
which I have derived a theoretical continuous density (pdf). As it 
is a parametric density, I guess that I should implement myself a 
new distribution of class AbscontDistributions (as stated in the pdf 
on creating new distributions in distr), and then use 
MLEstimator() from the distrMod package. Is that correct or is there 
a simpler way to go? Note that I want to use the distr package 
because it allows me to implement simply the convolution of my 
theoretical pdf with some noise distribution that I want to model in 
the data, this is more difficult with fitdistr or mle.
It proved rather difficult for me to implement the new class 
following all the steps provided in the example, so I am asking if 
someone has an example of code he wrote to implement a parametric 
distribution from its pdf alone and then used it with MLEstimator().


I am sorry the post is a bit long, but it is a complicated 
question to me and I am not at all skillful in the handling of such 
notions as S4 classes, etc., so I am a bit lost here.


As a simple example, suppose my theoretical pdf is the skew normal 
distribution (available in package sn):


#skew normal pdf (default values = the standard normal N(0,1))

fsn <- function(x, mu=0, sd=1, d=0) {u = (x-mu)/sd; f = dnorm(u)*pnorm(d*u); return(f/sd)}


# d = shape parameter (any real), mu = location (any real), sd = 
scale (positive real)


#to see what it looks like try
x <- seq(-1, 4, length=200); plot(fsn(x, d=3), type="l")

#Now I tried to create the classes SkewNorm and 
SkewNormParameter copying the example for the binomial

##Class: parameters
setClass("SkewNormParameter",
  representation=representation(mu="numeric", sd="numeric", d="numeric"),
  prototype=prototype(mu=0, sd=1, d=0,
    name=gettext("Parameter of the Skew Normal distribution")),
  contains="Parameter"
)

##Class: distribution (created using the pdf of the skew normal 
defined

Re: [R] implementing Maximum Likelihood with distrMod when only the PDF is known

2009-06-23 Thread Matthias Kohl

Dear Guillaume,

thanks for your interest in the distrMod package.

Regarding your question I took up your example and put a file under:

http://www.stamats.de/distrModExample.R

Hope that helps ...

Don't hesitate to contact me if you have further questions!

Best,
Matthias

guillaume.martin schrieb:

Dear R users and Dear authors of the distr package and sequels

I am trying to use the (very nice) package distrMod as I want to 
implement maximum likelihood (ML) fit of some univariate data for 
which I have derived a theoretical continuous density (pdf). As it is 
a parametric density, I guess that I should implement myself a new 
distribution of class AbscontDistributions (as stated in the pdf on 
creating new distributions in distr), and then use MLEstimator() 
from the distrMod package. Is that correct or is there a simpler way 
to go? Note that I want to use the distr package because it allows me 
to implement simply the convolution of my theoretical pdf with some 
noise distribution that I want to model in the data, this is more 
difficult with fitdistr or mle.
It proved rather difficult for me to implement the new class following 
all the steps provided in the example, so I am asking if someone has 
an example of code he wrote to implement a parametric distribution 
from its pdf alone and then used it with MLEstimator().


I am sorry the post is a bit long, but it is a complicated question 
to me and I am not at all skillful in the handling of such notions as 
S4 classes, etc., so I am a bit lost here.


As a simple example, suppose my theoretical pdf is the skew normal 
distribution (available in package sn):


#skew normal pdf (default values = the standard normal N(0,1))

fsn <- function(x, mu=0, sd=1, d=0) {u = (x-mu)/sd; f = dnorm(u)*pnorm(d*u); return(f/sd)}


# d = shape parameter (any real), mu = location (any real), sd = scale 
(positive real)


#to see what it looks like try
x <- seq(-1, 4, length=200); plot(fsn(x, d=3), type="l")

#Now I tried to create the classes SkewNorm and SkewNormParameter 
copying the example for the binomial

##Class: parameters
setClass("SkewNormParameter",
  representation=representation(mu="numeric", sd="numeric", d="numeric"),
  prototype=prototype(mu=0, sd=1, d=0,
    name=gettext("Parameter of the Skew Normal distribution")),
  contains="Parameter"
)

##Class: distribution (created using the pdf of the skew normal 
defined above)

setClass("SkewNorm", prototype = prototype(
  d = function(x, log = FALSE){fsn(x, mu=0, sd=1, d=0)},
  param = new("SkewNormParameter"),
  .logExact = TRUE, .lowerExact = TRUE),
  contains = "AbscontDistribution"
)

#so far so good but then with
setMethod("mu", "SkewNormParameter", function(object) object@mu)

#I get the following error message:

 Error in setMethod("mu", "SkewNormParameter", function(object) 
object@mu) : no existing definition for function "mu"


I don't understand because to me mu is a parameter not a function... 
maybe that is too complex programming for me and I should switch to 
implementing my likelihood by hand with numerical convolutions and 
optim() etc., but I would like to know how to use distr, so if there 
is anyone who had the same problem and solved it, I would be very 
grateful for the hint !


All the best,
Guillaume
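The error message is literal: setMethod() can only attach a method to an existing generic function, and no generic named mu exists yet, so one has to be defined with setGeneric() first. A stripped-down illustration of mine (plain S4, class name and slots simplified; no dependence on distr's Parameter class):

```r
library(methods)

# Hypothetical simplified class, standing in for the SkewNormParameter
# of the thread (without contains = "Parameter").
setClass("SkewNormParameter2",
         representation(mu = "numeric", sd = "numeric", d = "numeric"),
         prototype = prototype(mu = 0, sd = 1, d = 0))

# Define the generic first; only then can a method be attached to it.
setGeneric("mu", function(object) standardGeneric("mu"))
setMethod("mu", "SkewNormParameter2", function(object) object@mu)

mu(new("SkewNormParameter2", mu = 2))  # 2
```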







