Re: [R] Predictions with missing inputs

2011-02-12 Thread Bernardo Rangel Tura
On Fri, 2011-02-11 at 20:51 -0500, Axel Urbiz wrote:
> Dear users,
> 
> I'll appreciate your help with this (hopefully) simple problem.
> 
> I have a model object which was fitted to inputs X1, X2, X3. Now, I'd like
> to use this object to make predictions on a new data set where only X1 and
> X2 are available (just use the estimated coefficients for these variables in
> making predictions and ignoring the coefficient on X3). Here's my attempt
> but, of course, didn't work.
> 
> #fit some linear model to random data
> 
> x=matrix(rnorm(100*3),100,3)
> y=sample(1:2,100,replace=TRUE)
> mydata <- data.frame(y,x)
> mymodel <- lm(y ~ ns(X1, df=3) + X2 + X3, data=mydata)
> summary(mymodel)
> 
> #create new data with 1 missing input
> 
> mynewdata <- data.frame(matrix(rnorm(100*2),100,2))
> mypred <- predict(mymodel, mynewdata)
> Thanks in advance for your help!
> 
> Axel.

Axel,

I think mice package solve your problem 

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can't find and install reshape2??

2010-10-12 Thread Bernardo Rangel Tura
On Mon, 2010-10-04 at 10:27 +0930, Chris Howden wrote:
> Hi everyone,
> 
> 
> 
> Im trying to install reshape2.
> 
> 
> 
> But when I click on install package its not coming up!?!?! Im getting
> reshape, but no reshape2?
> 
> 
> 
> Ive also tried download.packages(reshape2, destdir="c:\\") &
> download.packages(Reshape2, destdir="c:\\")but no luck!!!
> 
> 
> 
> Does anyone have any ideas what could be going on?
> 
> 
> 
> Chris Howden
> 

Hi Chris,

I have two guess:

1- You don't have installed 'stringr' pakage
2- Your R is outdated

Try this two things and after this mail me

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read columns of quoted numbers as factors

2010-10-05 Thread Bernardo Rangel Tura
On Mon, 2010-10-04 at 09:39 -0700, james hirschorn wrote:
> Suppose I have a data file (possibly with a huge number of columns), where 
> the 
> columns with factors are coded as "1", "2", "3", etc ... The default behavior 
> of 
> read.table is to convert these columns to integer vectors. 
> 
> Is there a way to get read.table to recognize that columns of quoted numbers 
> represent factors (while unquoted numbers are interpreted as integers), 
> without 
> explicitly setting them with colClasses ?
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Hi James,

I think you solve ypur problem using the options colClasses in the
read.table command, something like this

rea.table('name.of.table',colClasses=c(rep(30,'integer'),rep(5,'numeric'),etc))
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating with tolerances

2010-09-09 Thread Bernardo Rangel Tura
On Thu, 2010-09-09 at 09:16 +0430, Jan private wrote:
> Dear list,
> 
> I am from an engineering background, accustomed to work with tolerances.
> 
> For example, I have measured
> 
> Q = 0.15 +- 0.01 m^3/s
> H = 10 +- 0.1 m
> 
> and now I want to calculate
> 
> P = 5 * Q * H 
> 
> and get a value with a tolerance +-
> 
> What is the elegant way of doing this in R?
> 
> Thank you,
>   Jan

Hi Jan,

If I understood  your problem this script solve your problem:

q<-0.15 + c(-.1,0,.1)
h<-10 + c(-.1,0,.1)
5*q*h
[1]  2.475  7.500 12.625

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] levene.test

2010-07-14 Thread Bernardo Rangel Tura
On Wed, 2010-07-14 at 01:37 -0700, martanair wrote:
> I am trying to use Levene's test (of package car), but I do 
> not understand quite well how to use it. '?levene.test' does 
> not unfortunately provide any example. My data are in a data 
> frame and correspond to 1 factor plus response. Could 
> someone please give me an example about how to use the command 
>  
> levene.test(y, group) 
> 
> Thanks in advance, 
>  
> marta

Hi Marta,

levene.test is deprecated functions in car package as say the help of
command (?levene.test). Instead this use leveneTest (?levene.test
again)If you type ?leveneTest you get examples


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] F# vs. R

2010-07-07 Thread Bernardo Rangel Tura
On Wed, 2010-07-07 at 17:31 +0200, Sergey Goriatchev wrote:
> Hello, everyone
> 
> F# is now public. Compiled code should run faster than R.
> 
> Anyone has opinion on F# vs. R? Just curious
> 
> Best,
> S


Sergey,

F# is public, but is not open source. 

F# run in windows but run in AIX, linux, MAC, UNIX etc?

Compiled code should run faster than R, but is precise?

Compiled code should run faster than R, but is reliable?

Compiled code should run faster than R, but have 2.440 packages for
extend your capacities?

Compiled code should run faster than R, but critical code is in C o
FORTRAN

So I think the F# is not a good alternative, if your concern is velocity
dou you look Littler

http://code.google.com/p/littler/

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Bernardo Rangel Tura
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote:
> Under MacOS I had R64 executive and it was clear. Under Ubuntu, which I 
> do not have administrative rights to, there is only R executive. It 
> seems that I can allocate more than 3GB of memory, however not 
> everything seems to work the same/right as with R64 under MacOS.
> 
> Pms.
> 

Type .Machine$sizeof.pointer
If respond is 8 your R is 64 bits


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best way to compute a sum

2010-07-04 Thread Bernardo Rangel Tura
On Sat, 2010-07-03 at 17:27 -0700, Roger Deangelis wrote:
> Hi Bernado,
> 
> In many financial applications if you convert the dollars and cents to
> pennies( ie $1.10 to 110) and divide by 100 at the vary end you can get
> maintain higher precision. This applies primarily to sums. This is similar
> to keeping track of the decimal fractions which have exact representations
> in floating point. It is good to know where the errors come from.

OK 
> 
> I suppose you could also improve the sum by understanding the decimal to
> binary rounding algorithms. I have noticed differences in computations
> between Sun hardware(FPUs) and Intel(FPUs). When presented the same set of
> operations, it appears that the floating point hardware do not do the
> operations in the same order. 

It's true
>  
> 
> Consider
> 
>sprintf("%a", sum(rep(1.1,100))) # overestimates the sum
> "0x1.b8001p+6"
>   "0x1.b8001p+6"
> 
>   sprintf("%a", sum(rep(11,100))/10)   # this gives the correct answer
> "0x1.b8p+6"
>   110 = 10 1110 = 32 + 8 + 4 + 2 - "0x1.b8p+6" - tricky due to 53 bit and
> little endian (I think this is right)
>   "0x1.b8p+6"
> 
>   note
>cmp <- ifelse (sum(rep(1.1,100))==sum(rep(11,100))/10, "equal",
> "unequal")
>  
> [1] "unequal"
> 
>cmp <- ifelse (sum(rep(1.1,100))>sum(rep(11,100))/10, "greater than",
> "less than orequal")
>   
> [1] "greater than"


Hi Roger,

First of all, is true 

> sum(rep(1.1,100))==sum(rep(11,100))/10
[1] FALSE

But is true too

> all.equal(sum(rep(1.1,100)),(sum(rep(11,100))/10))
[1] TRUE


I think you need read about "Guard Digits approach" 


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Best way to compute a sum

2010-07-03 Thread Bernardo Rangel Tura
On Fri, 2010-07-02 at 21:23 -0700, Roger Deangelis wrote:
> 
>   Although it does not apply to your series and is impractical,  it seems to
> me that the most accurate algorithm might be to add all the rational numbers
> whose sum and components can be represented  without error in binary first,
> ie 2.5 + .5 or 1/16 + 1/16 + 1/8.
> 
>   You could also get very clever and investigate a sum that should have an
> exact binary representation when the individual components do not, ie .1 +
> .2 + .2 = .5 and correct the sum.
> 
> Roger

Roger I think you must read: What Every Computer Scientist Should Know
About Floating-Point Arithmetic
( http://docs.sun.com/source/806-3568/ncg_goldberg.html )

I think your question and others like this question is answer in this
paper 
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identifying odd or even number

2010-07-01 Thread Bernardo Rangel Tura
On Thu, 2010-07-01 at 08:40 -0700, Yemi Oyeyemi wrote:
> Hi, I run into problem when writing a syntax, I don't know syntax that will 
> return true or false if an integer is odd or even.
> Thanks
> 
> OYEYEMI, Gafar Matanmi
> 
> Department of Statistics
> 
> University of Ilorin

Hi Yemi!

Your problem is solve wiht "!" and "%%", Look this example

> x <- 1:11
> x
 [1]  1  2  3  4  5  6  7  8  9 10 11
> x%%2
 [1] 1 0 1 0 1 0 1 0 1 0 1
> !(x%%2)
 [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculation of r squared from a linear regression

2010-06-11 Thread Bernardo Rangel Tura
On Fri, 2010-06-11 at 01:16 -0700, Sandra Hawthorne wrote:
> Hi,
> 
> I'm trying to verify the calculation of coefficient of determination (r 
> squared) for linear regression. I've done the calculation manually with a 
> simple test case and using the definition of r squared outlined in 
> summary(lm) help. There seems to be a discrepancy between the what R produced 
> and the manual calculation. Does anyone know why this is so? What does the 
> multiple r squared reported in summary(lm) represent? 
> 
> # The test case:
> x <- c(1,2,3,4)
> y <- c(1.6,4.4,5.5,8.3)
> dummy <- data.frame(x, y)
> fm1 <- lm(y ~ x-1, data = dummy)
> summary(fm1)
> betax <- fm1$coeff[x] * sd(x) / sd(y) 
> # cd is coefficient of determination
> cd <- betax * cor(y, x)
> 
> Thanks.


Sorry Sandra,

But the problem in yours script. Look this
x <- c(1,2,3,4)
y <- c(1.6,4.4,5.5,8.3)
dummy <- data.frame(x, y)
fm1 <- lm(y ~ x, data = dummy)
summary(fm1)

Call:
lm(formula = y ~ x, data = dummy)

Residuals:
1 2 3 4 
-0.17  0.51 -0.51  0.17 

Coefficients:
Estimate Std. Error t value Pr(>|t|)  
(Intercept)  -0.3500 0.6584  -0.532   0.6481  
x 2.1200 0.2404   8.818   0.0126 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.5376 on 2 degrees of freedom
Multiple R-squared: 0.9749, Adjusted R-squared: 0.9624 
F-statistic: 77.76 on 1 and 2 DF,  p-value: 0.01262 

betax <- fm1$coeff[2] * sd(x) / sd(y) 
# cd is coefficient of determination
cd <- betax * cor(y, x)
cd
   x 
0.974924 

The formula "fm1$coeff[2] * sd(x) / sd(y)" is valid only the model have
a intercept...

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How sample without replacement on more than one variables?

2010-05-23 Thread Bernardo Rangel Tura
On Sun, 2010-05-23 at 00:56 -0700, dusadrian wrote:
> This might help, depending on your exact needs:
> > v1 <- sample(letters[1:2], 10, replace=TRUE)
> > v2 <- sample(letters[3:4], 10, replace=TRUE)
> > v3 <- sample(letters[5:6], 10, replace=TRUE)
> > aa <- data.frame(v1=v1, v2=v2, v3=v3)

And now is simple, sample the row of data frame
aa[sample(1:nrows(aa),3),]


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Minimization problem

2010-05-20 Thread Bernardo Rangel Tura
On Thu, 2010-05-20 at 01:35 -0700, Fred wrote:
> Dear R users,
> 
> I am trying to minimize two function simultaneously in R,
> 
> function(x)
> 
> minimize x[1],x[2],x[3]
> 
> mean(distribution(x1,x2,x3) ) - observed mean
> 
> std(distribution(x1,x2,x3)) - observed std
> 
> What I want to achieve is that simulated mean and standard deviation
> of distribution related to x1 x2 x3  would be close to observed mean
> and observed standard deviation.
> 
> 
> is there any function in R can reach this?
> 
> Thank you for the help first .
> 
> F.

Hi!
Do you need use optim, something like this

test <- function(parameters){
m.error <- mean(distribution(x1,x2,x3) ) - observed mean
m.sd <- std(distribution(x1,x2,x3)) - observed std
res <- cbind(m.error,sd.error)
return(res)
}
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple question on binning data

2010-05-15 Thread Bernardo Rangel Tura
On Thu, 2010-05-13 at 17:31 -0400, Carl Witthoft wrote:
> It's very simple to write a "binit()" function.  If all you want to do 
> is e.g., bin 107 values into sums of 10 at a time, then write a loop 
> that sums x[10*i:11*i-1]  (not tested and not syntactically correct).
> 
> The one I wrote for myself discards any partial bin (101-107 in my 
> example) and leaves a warning note that this took place.
> 
> Carl

Hi Carl,

I think the syntactically correct is x[10*i:(11*i-1)]

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generate Numbers

2010-04-04 Thread Bernardo Rangel Tura
On Sun, 2010-04-04 at 04:03 -0700, kende jan wrote:
> Hello,
> How can I generate randomly in R a sample of skewed data with first quartile 
> is 540 and third quartile is 715.
> I need a sample of 100 cases.
> 
> Thank you for your help
> 
> Jan
> 



Hi Jan, 

Sorry my direct approach but why you don't use

data <- c(runif(25,number,540),runif(50,540,715),runif(25,715,number))

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] compare two fingerprint images

2010-04-03 Thread Bernardo Rangel Tura
On Sat, 2010-04-03 at 21:18 +0200, Juan Antonio Gil Pascual wrote:
> Hi Bernado
> 
> I need to compare two fingerprint images and let me know if you can do 
> with R. I have used the technique of minutiae but it seems to work 
> better with the cross-correlation and wanted to know if you can do with R.
> 
> Thank you very much
> 
> Juan
> 
> 
> Bernardo Rangel Tura escribió:
> > On Fri, 2010-04-02 at 20:52 +0200, Juan Antonio Gil Pascual wrote:
> >   
> >> Hello
> >> I wanted to compare two fingerprint images. How do you do with R?.
> >> Is there a role for cross-correlation of images?
> >>
> >> Thanks
> >>
> >> 
> >
> > Hi juan,
> >
> > You was a quite vage in your question, so I don't know exactly what you
> > need but try this;
> >
> > require(ReadImages)
> > x <- read.jpeg(image1)
> > x1 <- read.jpeg(image2)
> > table(x1==x)
> >
> >
> >   
> 

Juan,

I don't know cross-correlation to images but I know this for time series
in this case:


ccf(ts(as.numeric(x)),ts(as.numeric(x1)))


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] compare two fingerprint images

2010-04-03 Thread Bernardo Rangel Tura
On Fri, 2010-04-02 at 20:52 +0200, Juan Antonio Gil Pascual wrote:
> Hello
> I wanted to compare two fingerprint images. How do you do with R?.
> Is there a role for cross-correlation of images?
> 
> Thanks
> 

Hi juan,

You was a quite vage in your question, so I don't know exactly what you
need but try this;

require(ReadImages)
x <- read.jpeg(image1)
x1 <- read.jpeg(image2)
table(x1==x)


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Changing content of column in data.frame + efficient join extraction between 2 data.frames

2010-03-23 Thread Bernardo Rangel Tura
On Tue, 2010-03-23 at 09:37 +, Corrado wrote:
> Dear R users,
> 
> I have 2 SpatialPointsDataFrame's, pcs and East.
> 
> The column str_1 in the first (pcs) is:
> 
>  > pcs[0:4,]
>coordinates cat   str_1  int_1  int_2dbl_1 dbl_2
> 1 (101000, 263000)   1 "SM06B" 101000 263000 4.978915 -4.293668
> 2 (101000, 265000)   2 "SM06C" 101000 265000 4.960478 -4.266742
> 3 (101000, 267000)   3 "SM06D" 101000 267000 4.912984 -4.246849
> 4 (101000, 269000)   4 "SM06E" 101000 269000 4.613309 -4.185405
>  >
> 
> The column str_1 in the second (East) is:
> 
>  > East[0:4,]
>coordinates str_1
> 1 (489000, 215000) sp81x
> 2 (489000, 217000) sp81y
> 3 (493000, 209000) sp90j
> 4 (495000, 209000) sp90p
>  >
> 
> I would like to do 2 things:
> 
> 1) I would like to change the format of the column str_1
> in the first to be the same that it is in the second,
> that is I need to remove the inverted commas " and I need to
> make it lower case.
> 
> 2) I would like to extract the rows from the first one (pcs) where 
> pcs$str_1
> is the same as East$str_1.
> 
> I have even tried regexp, but cannot modify
> the content of pcs$str_1 to remove
> the inveretd commas " and change the case to lowercase.
> 
> How do I do that?
> 
> Regards


Hi Corrado!

First: tolower(pcs$str_1) change to lower case
Second: try merge (East,pcs,by.x=str_1,by.y=str_1) to fusion data frames
Third: I don't recreate your database East in my computer do you give a
small part to I try solve your problem?

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] queue simulation

2010-03-22 Thread Bernardo Rangel Tura
On Mon, 2010-03-22 at 02:49 -0600, Carlos Ernesto Lopez Nataren wrote:
> Hello everybody :) I am trying to simulate a queue with times of arrival to 
> the queue and time taken to dispatch every member of the queue coming from 
> two exponential distributions, I am interested in knowing the number of 
> people at any time and the time that takes every member of this queue to be 
> dispatched.
> 
> I thought this was gonna be an easy task but I've failed to try to simulate 
> this, is there any package that does this already?
> 
> Any help will be greatly appreciated.
> 
> Thank you very much
> Carlos

Carlos, if I understand your problem do you need know the time for each
person in a queue dispatched.

You say this time  fit a exponential distribution, so do you have a rate
of dispatch (x dispatch per time)  

If I want generate 10 times of dispatch rate 0.4, use the command 

times <- rexp(10,0.4)

If I need the total delay for each person,  use the command

cumsum(times)

If I need the average time in the queue, use the command

means(cumsum(times)) 


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different forms of nls recommendations

2010-03-21 Thread Bernardo Rangel Tura
On Sat, 2010-03-20 at 14:55 -0800, emorway wrote:
> Hello, 
> 
> Using this data:
> http://n4.nabble.com/file/n1676330/US_Final_Values.txt US_Final_Values.txt 
> 
> and the following code i got the image at the end of this message:
> 
> US.final.values<-read.table("c:/tmp/US_Final_Values.txt",header=T,sep=" ")
> US.nls.1<-nls(US.final.values$ECe~a*US.final.values$WTD^b+c,data=US.final.values,start=list(a=2.75,b=-0.95,c=0.731),trace=TRUE)
> f.US1<-function(x){coef(US.nls.1)["a"]*x^coef(US.nls.1)["b"]+coef(US.nls.1)["c"]}
> xvals.US1<-seq(min(US.final.values$WTD),max(US.final.values$WTD),length.out=75)
> yvals.US1<-f.US1(xvals.US1)
> Rsq.nls.1<-sum((predict(US.nls.1)-mean(US.final.values$ECe))^2/sum((US.final.values$ECe-mean(US.final.values$ECe))^2))
> plot(US.final.values$WTD,US.final.values$ECe,col="red",pch=19,cex=.75)
> lines(xvals.US1,yvals.US1,col="blue")
> 
> but the r^2 wasn't so hot.  
> Rsq.nls.1
> [1] 0.2377306
> 
> So I wanted to try a different equation of the general form a/(b+c*x^d)
> 
> US.nls.2<-nls(US.final.values$ECe~(a/(b+c*US.final.values$WTD^d)),data=US.final.values,start=list(a=100.81,b=73.7299,c=0.0565,d=-6.043),trace=TRUE,algorithm="port")
> 
> but that ended with "Convergence failure: false convergence (8)".  I tried


Hi emorway,

Do you have 657 obs and 4 parameters to fit.
In my opinion you have few obs...
I think do you  fit in steps:

US.nls.2<-nls(ECe~(a/(b + c *
WTD^d)),data=US.final.values,start=list(a=100.81,b=73.7299,c=0.0565,d=-6.043),trace=TRUE,algorithm="port")
temp_nls1 <- nls(ECe~(100/(73 + .05 *
WTD^d)),data=US.final.values,start=list(d=-6.043),trace=TRUE,algorithm="port")
temp_nls2 <- nls(ECe~(100/(73 + .05 *
WTD^d)),data=US.final.values,start=list(d=-1.01613),trace=TRUE,algorithm="port")
temp_nls3 <- nls(ECe~(100/(73 + c *
WTD^(-1.01613))),data=US.final.values,start=list(c=0.05),trace=TRUE,algorithm="port")
temp_nls4 <- nls(ECe~(100/(73 + c *
WTD^(-1.01613))),data=US.final.values,start=list(c=-14.7127),trace=TRUE,algorithm="port")
temp_nls5 <- nls(ECe~(100/(b-14.7127 *
WTD^(-1.01613))),data=US.final.values,start=list(b=73),trace=TRUE,algorithm="port")
temp_nls6 <- nls(ECe~(100/(b-14.7127 *
WTD^(-1.01613))),data=US.final.values,start=list(b=70.4936),trace=TRUE,algorithm="port")
temp_nls7 <- nls(ECe~(a/(70.4936-14.7127 *
WTD^(-1.01613))),data=US.final.values,start=list(a=100),trace=TRUE,algorithm="port")
  0: 2243.9898:  100.000
  1: 2122.8218:  106.219
  2: 1359.8819:  187.530
  3: 1359.8819:  187.530



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fisher.test gives p>1

2010-03-04 Thread Bernardo Rangel Tura
On Thu, 2010-03-04 at 11:15 -0500, Jacob Wegelin wrote:
> The purpose of this email is to
> 
> (1) report an example where fisher.test returns p > 1
> 
> (2) ask if there is a reliable way to avoid p>1 with fisher.test.
> 
> If one has designed one's code to return an error when it finds a 
> "nonsensical" probability, of course a value of p>1 can cause havoc.
> 
> Example:
> 
> > junk<-data.frame(score=c(rep(0,14), rep(1,29), rep(2, 16)))
> > junk<-rbind(junk, junk)
> > junk$group<-c(rep("DOG", nrow(junk)/2), rep("kitty", nrow(junk)/2))
> > table(junk$score, junk$group)
> 
>  DOG kitty
>0  1414
>1  2929
>2  1616
> > dput(fisher.test(junk$score, junk$group)$p.value)
> 1.12

Hi jacob,

I think this is cover in R FAQ 7.31, but look this command
all.equal(dput(fisher.test(matrix(c(14,14,29,29,16,16),byrow=T,ncol=2))$p.value),1)
1.000012
[1] TRUE

P.S
R FAQ 7.31 -
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem in performing goodness of fit test in R.

2010-02-14 Thread Bernardo Rangel Tura
On Sun, 2010-02-14 at 12:42 +0500, Faiz Rasool wrote:
> I am trying to perform goodness of fit test using R. I am using this website 
> http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html for help. 
> However, I am unable to carry out the test successfully. My code follows. It 
> is taken from the website just mentioned. 
> freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 
> times.
> prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
> chisq.test(freq,p=prob) # I do not know what this line means. I just followed 
> instructions on the website.
> The erorr I receive is "erorr in chisq.test(freq,p=prob)/6 probabilities must 
> sum to 1" 
> 
> I am very new to R, so any help would be appreciated. 
> Faiz.

Faiz,


Well ... 
In my computer( Phenom X4 9650, runing Ubuntu 9.10 and R 2.10.1) the
script work


> freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the
dice 150 times.
> prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
> chisq.test(freq,p=prob) # I do not know what this line means

Chi-squared test for given probabilities

data:  freq 
X-squared = 6.72, df = 5, p-value = 0.2423


About the third line You must read ?chisq.test for better know the
command, but you execute one chi-square test with uniform probability
distribution 



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using tabble

2010-01-18 Thread Bernardo Rangel Tura
On Mon, 2010-01-18 at 17:17 +0800, Fan Dan wrote:
> Dear Helper: 
> I am using R to finish a project which is near the dealine. 
> Please help me ! 
> I need to construct a table. The name of the table is the sequence from 1 to 
> 7 with some missing values arbitrarily. 
> For example 1 to 7 
> 12   36   7
> 12  3   45  5   7
>  I want to add the missing numbers of the title which should be seq(1,7) 
> In this case, 4  and 5 
> and the corresponding data is 0 
> So after revised the table should be like 
>  12   3   4   5   6   7 
>  12  3  45  0   0   5   7 
> Please tell me how 
> I really appreaciate your time and helpn 
> Yours 
> Wolfgang AMadeurs 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

Hi!

If i understood your problem is easy create a fake data for example:

set.seed(123)
x<-rep(sample(1:7,5),3)
table(x)
x
3 4 5 6 7 
3 3 3 3 3

In this example the values "1" and "2" are missing

Solution change your variable to factor and determine your leveal
x.factor<-factor(x,levels=1:7)
table(x.factor)
x
1 2 3 4 5 6 7 
0 0 3 3 3 3 3 






-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with function "fitdistr" in "MASS"

2010-01-03 Thread Bernardo Rangel Tura
On Sat, 2010-01-02 at 23:20 -0800, Saji Ren wrote:
> Hi, R users:
> 
> I want to fit my data into a normal distribution by using the command
> "fitdistr" in "MASS".
> I changed my data class from "ts" to "numeric" by
> 
> >class(mydata)="numeric"
> 
> but after using "fitdistr", I got the result below
> 
> >fitdistr(mydata,"normal")
>   meansd 
>   NA NA  
>  (NA)   (NA) 
> 
> the help doc of "fitdistr" does not mention anything about that, thus I need
> your help.
> 
> Thank you in advanced,
> Saji from Shanghai

Hi Sajj,

You hava NA in your data

try: fitdistr(na.exclude(mydata),"normal")

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with "Cannot compute correct p-values with ties"

2009-12-22 Thread Bernardo Rangel Tura
On Wed, 2009-12-02 at 16:52 +0800, Zhijiang Wang wrote:
> Dear All,
>1. why did the problem happen?
>2. How to solve it?
> 
>--
> 
> Best wishes,
> Zhijiang Wang


Well... The algorithm for Mann-whitney test have problem with ties

To solve you can use jitter

a<-1:10
b<-1:10
wilcox.test(a,b)

Wilcoxon rank sum test with continuity correction

data:  a and b 
W = 50, p-value = 1
alternative hypothesis: true location shift is not equal to 0 

Warning message:
In wilcox.test.default(a, b) : cannot compute exact p-value with ties

wilcox.test(a,jitter(b))

Wilcoxon rank sum test

data:  a and jitter(b) 
W = 49, p-value = 0.9705
alternative hypothesis: true location shift is not equal to 0 

look ?jitter for more information

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] areg (stata) equivalent in R?

2009-12-01 Thread Bernardo Rangel Tura
On Mon, 2009-11-30 at 22:20 -0500, John K. Williams wrote:
> hi, i am looking to reproduce a study done in stata in R, where a
> regression was done while absorbing a categorical variable.  i am new
> to R, i've i installed the design package but haven't been able to
> find an applicable function.  thanks for any help.
> 

John,

Do you show a example for this command?

I don't know stata so I don't help you

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "subset" or "condition" as argument to a function

2009-11-21 Thread Bernardo Rangel Tura
On Fri, 2009-11-20 at 17:40 -0800, Santosh wrote:
> Dear Rxperts!
> 
> I was wondering if it is possible to write a function which can take in
> argument of a subset or condition.. Of course, I am aware of the alternate
> methods like coplot, par.plot, xyplot etc... I am specifically interested in
> using conditions/subsets with "plot"..
> 
> A simple fragmented example is shown here..
> 
> pltit <- function(y,x,dat,dat1,dat2,sbst) {
> plot(y~x, data=dat, subset=sbst)
> lines(y~x,data=dat1, subset=subst)
> points(y~x,data=dat2,subset=subst)
> }
> 
> pltit(profit,weeks,dat=zone1, sbst='group==1&sales>200')
> 
> Could you also suggest pointers/references/examples on efficient ways to
> plot simulated data overlaid with observed data?
> 
> Have a good weekend!

Hi Santosh,

IMHO your script is

pltit <- function(y,x,dat,dat1,dat2,sbst) {
subs <- subset(dat,sbst)
with(subs,plot(y~x))

subs1 <- subset(dat1,sbst)
with(subs1,lines(y~x))

subs2 <- subset(dat2,sbst)
with(subs2,points(y~x))
}


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] negative log likelihood

2009-11-09 Thread Bernardo Rangel Tura
On Sun, 2009-11-08 at 11:12 -0800, mat7770 wrote:
> I have two related variables, each with 16 points (x and Y). I am given
> variance and the y-intercept. I know how to create a regression line and
> find the residuals, but here is my problem. I have to make a loop that uses
> the seq() function, so that it changes the slope value of the y=mx + B
> equation ranging from 0-5 in increments of 0.01. The loop also needs to
> calculate the negative log likelihood at each slope value and determine the
> lowest one. I know that R can compute the best regression line by using
> lm(y~x), but I need to see if that value matches the loop functions. 

Hi

If I understand your question you need extract log-likelihood for a
linear model, so you need using logLik command, see example:

set.seed(1)
x<-rpois(16,6)
y<-2*x+3+rnorm(16,sd=3)
model<-lm(y~x)
logLik(model)
'log Lik.' -40.1177 (df=3)



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] operation with if/else on a dataframe

2009-10-29 Thread Bernardo Rangel Tura
On Thu, 2009-10-29 at 01:47 -0700, Fran100681 wrote:
> Hi to all,
> I have this dataframe (I show the first six rows)
> 
> >head(table)
> 
>  A   RFold.Change P.Value 
> Count1   Count2
> 1 ENSRNOE002_at 0 1.13 0.601  
> 1
> 2 ENSRNOE009_at 0-1.04 0.733  
> 3
> 3 ENSRNOE020_at 0-1.08 0.680  
> 0
> 4 ENSRNOE021_at 0-1.31 0.201  
> 2
> 5 ENSRNOE023_at 0-1.06 0.643  
> 3
> 6 ENSRNOE024_at 0-1.14 0.403  
> 3
> 
> I would like to generate a function that determine for each row a new value
> (resulting in a new vector of values to add to the dataframe). 
> The function should give for every row the same value showed in column "R".
> However,I need of this because not all the R-values reported in this table
> are correctly determined following the criteria mentioned below.
> 
> 
> These new values calculated by the function must be:
> 
> 1)"UP" 
> if all the following conditions are verified in a certain row:
> Fold.Change is >= +1.5, 
> P.Value is < 0.05 
> Count1 >= 2
> 
> 2)"DOWN"
> if all the following conditions are verified in a certain row:
> Fold.Change is <= -1.5, 
> P.Value is < 0.05 
> Count2 >= 2
> 
> 3) 0 if both of previous conditions are not verified
> 
> So I have set these fllowing parameters (new objects) because I'll have to
> repeat this procedure to different dataframes in which the order of columns
> of interest might change (So I can change these parameters depending on the
> order of the columns in any different table)
> 
> Fold.change <-  3 #(because in this table, Fold.Change value is the third
> column and so on...)
> P.Value <-  4
> Count1 <-  5
> Count2 <-  6

[ Quote text]

> This is my problem, I cannot use the function to recalculate values in
> R-column for all rows in my dataframe. I don't understand  where is the
> problem, can someone help me?
> Thanks a lot!!
> 
> Francesco

Francesco,


I think you solve this problem with a simple way.
Remember in R the most function and operations are vectorized so look
this example: 

set.seed(123)
x<-rpois(20,5)
y<-rpois(20,15)
z<-rpois(20,10)
dta<-data.frame(x,y,z)
dta
dta$NEW<-ifelse(x>5 & y>15 & z>10,"UP",
ifelse(x<5 & y<15 & z<10,"DOWN",
"0"))
dta

First, I use ifelse command to simplify your nested conditional
situation.

Second, I know that R test this nested condition in order so the first
position will result test x[1],y[1] and z[1], the second postion will
result test x[2],y[2] and z[2] ...

The new vector result is the same order the original data.frame so I use
dta$NEW to create a new column in data.frame 



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attach

2009-10-18 Thread Bernardo Rangel Tura
On Wed, 2009-10-14 at 07:21 +0200, Christophe Dutang wrote:
> Hi all,
> 
> I have a question regarding the memory usage for the attach function.  
> Say I have a data.frame inputdat that I create with read.csv.
> 
> I would like to know what happens on the memory side when I use
> attach(inputdata)
> 
> Is there a second allocation of memory for inputdata?
> 
> Then I'm using eval on a expression which depends on the columns of  
> inputdata. Is it better not to use attach function?
> 
> Thanks in advance
> 
> Christophe

Well, if you attach a data.frame twice times, it use your memory twice
times.

I don't use attach I prefer with(data.frame, command)
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stochastic process

2009-10-13 Thread Bernardo Rangel Tura
On Tue, 2009-10-13 at 14:43 +0800, 刘哲 wrote:
> Hi,
>  
> I'm a student in China, and I never used R before.
>  
> I'm now wondering how to simulate a sample of Markov chain of ,say 500 
> entries with a certain transition matrix. 
>  
> Thanks a lot.

Hi

Well the naive script for you problem is something like this

#
# initial condition
#
initial<-matrix(c(300,200,0),nrow=1)
initial
 [,1] [,2] [,3]
[1,]  300  2000

#
# transition matrix (third state is absortive)
#
transition<-matrix(c(
.5,.3,.2,
.3,.4,.3,
0,0,1
),nrow=3,byrow=T)
transition
 [,1] [,2] [,3]
[1,]  0.5  0.3  0.2
[2,]  0.3  0.4  0.3
[3,]  0.0  0.0  1.0

#
# first step
#
S1<-initial%*%transition
S1
 [,1] [,2] [,3]
[1,]  210  170  120

#
# Second step
#
S2<-S1%*%transition
S2
 [,1] [,2] [,3]
[1,]  156  131  213


If you use a large number of step is more pratical use command mtx.exp
of Biodem package, something like this


require(Biodem)
#
# 10th step direct
#
initial%*%mtx.exp(transition,10)


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] row selection

2009-10-11 Thread Bernardo Rangel Tura
On Thu, 2009-10-08 at 16:14 -0400, Ashta wrote:
> Hi all,
> I have a matrix  named x with N by  C
> I want to select every 5 th rrow from matrix x
> I used the following code
>  n<- nrow(x)
> > for(i in 1: n){
> + b <- a[i+5,]
> >b
> }
> Error: subscript out of bounds
> 
> Can any body point out the problem?

Hi Ashta,

If I understand your request you need select row 5,10,15, ...
In this case you can use this script:

x[1:nrow(n)%%5==0]


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting a mirror "permanently" on R on ubuntu

2009-10-10 Thread Bernardo Rangel Tura
On Sat, 2009-10-10 at 04:54 +0100, Lazarus Mramba wrote:
> Dear all,
> 
> I seem to have many problems as I run R on my ubuntu system.
> want to set a mirror so that anytime I use the command "install.packages",
> it does not ask me for which mirror to use but go direct.
> This is because of the error I keep on getting below and I dont know how to
> solve it.
> 
> Please help.
> 
> Kind regards,
> Lazarus


Lazarus,

use options(repos="mirror URL")

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Satellite ocean color palette?

2009-10-09 Thread Bernardo Rangel Tura
On Fri, 2009-10-09 at 11:51 -0700, Tim Clark wrote:
> Dear List,
> 
> Is there a color palette avaliable similar to what is used in satellite ocean 
> color imagery?  I.e. a gradient with blue on one end and red on the other, 
> with yellow in the middle?  I have tried topo.colors(n) but that comes out 
> more yellow on the end.  I am looking for something similar to what is found 
> on the CoastWatch web page:
> 
> http://oceanwatch.pifsc.noaa.gov/imagery/GA2009281_2009282_sst_2D_eddy.jpg
> 
> Thanks!
> 
> Tim
> 
> 
> Tim Clark
> Department of Zoology 
> University of Hawaii

Tim,

You can make a palette in R, using colorRampPalette, look this example

Satelite.Pallete <-
colorRampPalette(c("blue3","cyan","aquamarine","yellow","orange","red"))
require(fields)
image.plot(volcano, col = Satelite.Pallete(500), legend.lab="Scale") 
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aproximate a titration kurve to the measure data.

2009-10-02 Thread Bernardo Rangel Tura
On Thu, 2009-10-01 at 10:18 -0700, awayguy wrote:
> yes, halo thank you.
> my measure data:
> v2 <- c(0, 2, 4, 6, 6.2, 6.4, 6.6, 6.8, 7, 7.2, 7.4, 7.6, 7.8, 8, 10, 12,
> 14)
> ph2 <- c(12.10, 11.94, 11.68, 11.11, 10.91, 10.74, 10.47, 9.71, 7.1, 4.24,
> 3.3, 3.08, 2.98, 2.86, 2.33, 2.11, 1.98)
> 
> with regards

Ok

temp<-data.frame(ph2,v2)
> drm(temp,, fct = LL.4())

A 'drc' model.

Call:
drm(formula = temp, fct = LL.4())

Coefficients:
b:(Intercept)  c:(Intercept)  d:(Intercept)  e:(Intercept)  
   39.420  2.425     11.487  7.002 

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to use Subpopulation data?

2009-10-02 Thread Bernardo Rangel Tura
On Fri, 2009-10-02 at 00:42 +, KABELI MEFANE wrote:
> Thanks
> 
> But it seems like you don't get my problem.Do you mean that there is 
> something wrong with the code as it seems like what you are doing is 
> suggesting different ways to write a code. 
> 
> Will i get to use the variable that have been name previously like if i want 
> to calculate the standard deviation of stratum hypermarket in a sample not 
> population, the first start is to check if it would help me is to check the 
> length() of different levels if it is not same as that of the original data
> 
> Best Regards
> 
> I rest my case, i might dream it

Sorry

I don't understand your problem early.

In your case you have 2 data.frames: sampleframe and sample.strat so you
need show to R where calculate sd or length or summary


set.seed(1)# only to make this example don't use in your code
stra<-strata(sampleframe,c("type"),size=c(20,80,200,300,400),method="srswor")
sample.strat<-getdata(sampleframe,stra)

length of value in sampleframe where type is Hypermarket:
> length(sampleframet$value[sampleframe$type=="Hypermarket"]) 
[1] 100

SD of value in sampleframe where type is Hypermarket:
> sd(sampleframe$value[sampleframet$type=="Hypermarket"])
[1] 4586.854

length of value in sample.strat where type is Hypermarket:
> length(sample.strat$value[sample.strat$type=="Hypermarket"] )
[1] 20

SD of value in sample.strat where type is Hypermarket:
> sd(sample.strat$value[sample.strat$type=="Hypermarket"] )
[1] 4679.336


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to use Subpopulation data?

2009-10-01 Thread Bernardo Rangel Tura
On Thu, 2009-10-01 at 13:34 +, KABELI MEFANE wrote:
> ## package sampling
> 
> stra=strata(sampleframe,c("type","value","rating"),size=c(20,80,200,300,400),
> method="srswor")
> sample.strat<-getdata(sampleframe,stra)
> 
> sample.strat
> 

Try:

stra<-strata(sampleframe,size=c(20,80,200,300,400),method="srswor")
sample.strat<-getdata(sampleframe,stra)
sample.strat

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to use Subpopulation data?

2009-10-01 Thread Bernardo Rangel Tura
On Thu, 2009-10-01 at 10:06 +, KABELI MEFANE wrote:
> Dear Helpers
> 
> I have a sample frame and i have sampled from it using three methods and now 
> i want to calculate the statistics but i only get the population parameters.
> 
> H <- matrix(rnorm(100, mean=5, sd=5000))
> sampleframe=data.frame(type=c(rep("H",100)),value=c(H))
> sampleframe
> 
> str=strata(sampleframe,c("type"),size=c(20,), method="srswor")

Try using

str=strata(sampleframe,c("type"),size=20, method="srswor")

or better

str <- strata(sampleframe,c("type"),size=20, method="srswor")

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aproximate a titration kurve to the measure data.

2009-10-01 Thread Bernardo Rangel Tura
On Wed, 2009-09-30 at 10:55 -0700, awayguy wrote:
> Halo
> 
> i'm studying chemistry, today we made an experiment and i have to draw a
> titration kurve for my mess data. we can do it on a mm paper, or we can also
> use a programe. people from chemistry recomend "R"
> last year i studied civil eng. and we used Matlab, as I see, R ist very
> similar to it, but its got other comands.
> But i think R would be a good help for some exercises.
> 
> so my main question is: i have some measurement data from my titration, and
> I want aproximate a kurve to this data. is it possible to do it with R?
> 
> a titration kurve looks like this:
> 
> http://www.nabble.com/file/p25685986/acetic-acid-titration-curve.png 
> 
> hope you can help me, and yes when its possible, if you know something like
> a tutorial then i would be glad if you could post it.
> 
> with regards

Hi,

I think do you need use drc package:

drc: Analysis of dose-response curves
Analysis of one or multiple curves with focus on concentration-response,
dose-response and time-response curves used, for example in biology,
environmental sciences, medicine, pharmacology, toxicology.

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PCA or CA

2009-09-30 Thread Bernardo Rangel Tura
On Tue, 2009-09-29 at 17:02 +, Paul Dennis wrote:
> Dear all
> 
> I have a data set for which PCA based between group analysis (BGA) gives 
> significant results but CA-BGA does not.
> 
> I am having difficulty finding a reliable method for deciding which 
> ordination technique is most appropriate. 
> 
> I have been told to do a 1 table CA and if the 1st axis is>2 units go for CA 
> if not then PCA.
> 
> Another approach is that described in the Canoco manual - perform DCA and 
> then look at the length of the axes.  I used decorana in vegan and it gives 
> axis lengths.  I assume that these are measured in SD units. Anyway the 
> manual say if the axis length is <3 go for PCA,>4 use CA and if intermediate 
> use either. 
> 
> Are either of these approaches good/valid/recommended or is there a better 
> method?
> 
> Thanks
> 
> Paul  

Hi Paul 

I think that Ca is Correspondence Analysis and PCA is Principal
Component Analysis, right?

In this case, if all variables is numeric do you must use PCA, if all
variables is factor do you must use CA.

If you have a mixed  variables do you have a problem, in most case I
convert numeric variables to factor (with loss of information) and make
CA
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Color index in image function

2009-09-11 Thread Bernardo Rangel Tura
On Thu, 2009-09-10 at 05:27 -0700, FMH wrote:
> Thank you for the scripts, but the label and the values in the x and y-axis 
> suddently dissapear even the 'axis' function is used as stated in the command 
> below. Could you help on this?
> 
> axis(1, at = seq(100, 800, by = 100))
> axis(2, at = seq(100, 600, by = 100))
> 
> How could i add a tittle on top of color index?
> 
> Thank you
> Fir
Try this


Brazilan.Pallete <- colorRampPalette(c("green","yellow", "blue"))
require(fields)
image.plot(volcano, col = Brazilan.Pallete(50), legend.lab="Scale") 
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Color index in image function

2009-09-10 Thread Bernardo Rangel Tura
On Wed, 2009-09-09 at 02:33 -0700, FMH wrote:
> Thank you for the hints, but how could i add the grid lines which have 
> numbers, representing the height of the volcano on the image.
> 
> Thank you
> 

So I think this script  is what you need


Brazilan.Pallete <- colorRampPalette(c("green","yellow", "blue"))
require(fileds)
image.plot(volcano, col = Brazilan.Pallete(50), axes = FALSE) 
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil


> 
> 
> - Original Message 
> From: Bernardo Rangel Tura 
> To: FMH 
> Sent: Tuesday, September 8, 2009 10:14:07 AM
> Subject: Re: [R] Color index in image function
> 
> On Mon, 2009-09-07 at 07:59 -0700, FMH wrote:
> > Thank you for the tips. I have manage to run your script, but  was still 
> > never get the way to include the color index beside the image which could 
> > explain the intensity of the color from the lower index(green) to the 
> > higher index(blue). This color index might be represented by  an increasing 
> > of color index in another table beside the image, started from green 
> > followed by green-yellow, yellow, yellow-blue and blue?
> > 
> > Could someone please advice on this matter?
> > 
> > Cheers
> > Fir
> > 
> 
> Hi FHM,
> 
> Well If you desire one color index in a imageplot  I don't know solve
> your problem.
> 
> But in your scirptyou use image and 
> 
> image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
> contour(x, y, volcano, levels = seq(90, 200, by = 5),
> add = TRUE, col = "peru")
> 
> In this case I suggest you use 
> 
> Brazilan.Pallete <- colorRampPalette(c("green","yellow", "blue"))
> filled.contour(volcano, color = Brazilan.Pallete)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Color index in image function

2009-09-07 Thread Bernardo Rangel Tura
On Sat, 2009-09-05 at 04:14 -0700, FMH wrote:
> Dear All,
> 
> I was looking for the color index in image function, such as from 
> topo.colors(n) and etc. but still never found it. For instance, from the help 
> menu.
> 
> 
> ###
> # Volcano data visualized as matrix. Need to transpose and flip
> # matrix horizontally.
> image(t(volcano)[ncol(volcano):1,])
> 
> # A prettier display of the volcano
> x <- 10*(1:nrow(volcano))
> y <- 10*(1:ncol(volcano))
> image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
> contour(x, y, volcano, levels = seq(90, 200, by = 5),
> add = TRUE, col = "peru")
> axis(1, at = seq(100, 800, by = 100))
> axis(2, at = seq(100, 600, by = 100))
> box()
> title(main = "Maunga Whau Volcano", font.main = 4)
> #
> 
> >From the script above, it yields a beautiful  image of volcano with variety 
> >of colors but i have to list down the color index that could show the 
> >meaning of each color in my thesis. 
> 
> Could someone please help me to extract this color index?
> 
> Thank you
> Fir

If I understand your question you need change the Palette of image plot.

So you need use "colorRampPalette" look my example

Brazilan.Pallete <- colorRampPalette(c("green","yellow", "blue"))
image(x, y, volcano, col = Brazilan.Pallete(50), axes = FALSE)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] correlation between two 2D point patterns?

2009-08-30 Thread Bernardo Rangel Tura
On Sun, 2009-08-30 at 07:51 +0100, William Simpson wrote:
> Suppose I have two sets of (x,y) points like this:
> 
> x1<-runif(n=10)
> y1<-runif(n=10)
> A<-cbind(x1,y1)
> 
> x2<-runif(n=10)
> y2<-runif(n=10)
> B<-cbind(x2,y2)
> 
> I would like to measure how similar the two sets of points are.
> Something like a correlation coefficient, where 0 means the two
> patterns are unrelated, and 1 means they are identical. And in
> addition I'd like to be able to assign a p-value to the calculated
> statistic.
> 
> cor(x1,x2)
> cor(y1,y2)
> gives two numbers instead of one.
> 
> cor(A,B)
> gives a correlation matrix
> 
> I have looked a little at spatial statistics. I have seen methods
> that, for each point, search in some neighbourhood around it and then
> compute the correlation as a function of search radius. That is not
> what I am looking for. I would like a single number that summarises
> the strength of the relationship between the two patterns.
> 
> I will do procrustes on the two point sets first, so that if A is just
> a rotated, translated, scaled, reflected version of B the two patterns
> will superimpose and the statistic I'm looking for will say there is
> perfect correspondence.
> 
> Thanks very much for any help in finding such a statistic and
> calculating it using R.
> 
> Bill

Hi Bill,

If your 2 points set is similar I expect your Euclidean distance is 0,
so I suggest this script:

dist<-sqrt((x1-x2)^2+(y1-y2)^2) # Euclidean distance

t.test(dist) # test for mean equal 0

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL on R

2009-07-05 Thread Bernardo Rangel Tura
On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote:
> I'm dealing with lots of columns and conditions, wats t best way to deal with
> that?
> How do I work with SQL on R? the manual is quite confuse talking about that.
> Do I need a package?

I don't understand your question, but 

if you think use SQL in your data frames do you use SQLDF
(http://cran.r-project.org/web/packages/sqldf/index.html) 

if you thinks use a database server and access it in R i sugest you use
RMySQL (http://cran.r-project.org/web/packages/RMySQL/index.html)
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert a lis to matrix

2009-06-07 Thread Bernardo Rangel Tura
On Sun, 2009-06-07 at 00:14 -0400, Manisha Brahmachary wrote:
> Hello,
> 
>  
> 
> This is an urgent request. I want to convert a list of 3 elements into a
> matrix and I am not sure how to do it.
> 
>  
> 
> The list looks like this:
> 
> List of 3
> 
>  $ : num [1:15364, 1] 0.133 0.622 0.588 1.024 0.583 ...
> 
>   ..- attr(*, "dimnames")=List of 2
> 
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
> 
>   .. ..$ : NULL
> 
>  $ : num [1:15364, 1] 3.159 0.265 0.522 1.912 3.380 ...
> 
>   ..- attr(*, "dimnames")=List of 2
> 
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
> 
>   .. ..$ : NULL
> 
>  $ : num [1:15364, 1] 3.214 0.277 1.447 1.827 2.054 ...
> 
>   ..- attr(*, "dimnames")=List of 2
> 
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
> 
>   .. ..$ : NULL

I'm not sure if understood your question, but look this code

a<-list(B=1:3,C=2:4,D=3:5)
matrix(unlist(a),ncol=3)




-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with a cumullative Hazrd Ratio plot

2009-05-13 Thread Bernardo Rangel Tura
Hi R-masters

I need help to make modified cumulative hazard ratio plot.

I need create a common plot but with the number of subjects in risk each
ticks times for two different groups in bottom of plot (I put one
example in attach).

Do you know a routine for this?
Is possible create a routine for this? 
In this case with how commands?

Thanks in advance!
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and McAfee 8.5

2009-05-11 Thread Bernardo Rangel Tura
On Mon, 2009-05-11 at 09:54 -0300, Joyce, Warren wrote:
> Hi,
> 
> I have been working with R for the last year and using the UKFSST package to 
> look at satellite tag track data and SST information. Fpr those not familiar, 
> the package uses the positions estimated by the satellite tags themselves and 
> the associated SST data from servers (in this case, from the University of 
> Hawaii and the NOAA Coastwatch website) for the time preiod to estimate a 
> reasonable track the animal took while it was at liberty.  I'm currently 
> having trouble downloading data from two different servers which I never had 
> a problem before I recevied an upgraded computer with McAfee 8.5. When the 
> package goes to the various servers for SST information to download, I 
> continue to get a message in R:
> 
(...)
> Into my web browser, I get prompted with a file download window which quickly 
> downloads an empty file. It looks like the data is available, I just can't 
> seem to download it.
> 
> I've deduced that it must be a problem with the new firewall features of 
> McAfee 8.5 and was wondering if anyone else has run into a problem like this. 
> I've tried using the exclusions options in 8.5 to exclude any R files but 
> have come up empty handed. Still waiting on our informatics division to try 
> and solve the problem as well but I thought I might try asking here as well.
> 
> Thanks and I appreciate any assistance!
> 
> Warren N. Joyce 

Hi Joice,

In my opinion you need think about 2 things.

1- This is a problem of McAfee 8.5 not a R problem, so do you contact a
McAfee support for fix this problem ?

2- I presume  you use a windows but i do know your version (XP or Vista
or win 95 etc) neither your R configuration (do you use "R --internet2"
or not?)
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Division ?

2009-05-03 Thread Bernardo Rangel Tura
On Sat, 2009-05-02 at 10:34 -0700, bogdanno wrote:
> It seems division with numbers bigger than 10 000 000 doesn't work
>  2000/21
> [1] 952381
> > /23
> [1] 2415459
> 
> Thank you


Hi bogdanno

First of all look this

> all.equal(21*2000/21,2000)
[1] TRUE

So de division work correctly

but if I type

> all.equal(952381*21,2000)
[1] "Mean relative difference: 5e-08"

It's not means R division don't work correctly if you use the command
> format(2000/21,digits=22)
[1] "952380.952380952

So the result: 952381 is a round number not a real result of division.
This occur because R print only 7 significants digits in numbers, if you
test
> all.equal(21*952380.952380952,2000)
[1] TRUE

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] omit empty cells in crosstab?

2009-04-26 Thread Bernardo Rangel Tura
On Fri, 2009-04-24 at 13:12 -0700, sjaffe wrote:
> small example:
> 
> a<-c(1.1, 2.1, 9.1)
> b<-cut(a,0:10)
> c<-data.frame(b,b)
> d<-table(c)
> dim(d) 
> ##result: c(10, 10)
> 
> But only 9 of the 100 cells are non-zero.
> If there were 10 columns, the table have 10 dimensions each of length 10, so
> have 10^10 elements, too much even to fit in memory

Hi Steve

In your only 3 cells > 0
> d
b.1
b(0,1] (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] (7,8] (8,9] (9,10]
  (0,1]  0 0 0 0 0 0 0 0 0  0
  (1,2]  0 1 0 0 0 0 0 0 0  0
  (2,3]  0 0 1 0 0 0 0 0 0  0
  (3,4]  0 0 0 0 0 0 0 0 0  0
  (4,5]  0 0 0 0 0 0 0 0 0  0
  (5,6]  0 0 0 0 0 0 0 0 0  0
  (6,7]  0 0 0 0 0 0 0 0 0  0
  (7,8]  0 0 0 0 0 0 0 0 0  0
  (8,9]  0 0 0 0 0 0 0 0 0  0
  (9,10] 0 0 0 0 0 0 0 0 0  1

If you desire use simple code to find only cell>0 use this

 table(interaction(c,drop=T))

  (1,2].(1,2]   (2,3].(2,3] (9,10].(9,10] 
    1 1 1 


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dummy (factor) based on a pair of variables

2009-04-18 Thread Bernardo Rangel Tura
On Sat, 2009-04-18 at 08:55 +0200, Serguei Kaniovski wrote:
> 
> Dear All!
> 
> my data is on pairs of countries, i and j, e.g.:
> 
> y,i,j
> 1,AUT,BEL
> 2,AUT,GER
> 3,BEL,GER
> 
> I would like to create a dummy (indicator) variable for use in regression
> (using factor?), such that it takes the value of 1 if the country is in the
> pair (i.e. EITHER an i-country OR an j-country).
> 
> Thank you for your help,
> Serguei

Hi Serguei,

If I understand your doubt, the solution is something like this for pair
i-country is AUT or j-country is BEL


output ~ I(i-country=="AUT"|j-country=="BEL")
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie query: simple crosstabs

2009-04-08 Thread Bernardo Rangel Tura
On Tue, 2009-04-07 at 16:33 -0400, Donald Braman wrote:
> I've been playing around with various table tools, trying to construct a
> fairly simple cross-tab.  It shouldn't be hard, but for some reason it
> turning out to be (for me).
> 
> If I want to see how many men and how many women agree with a agree/disagree
> question (coded 1,0), I can do this:
> 
> >attach(mydata)
> >mytable <- table(male, q1.bin) # gender and a binary response variable
> >prop.table(mytable, 1) # row percentages
>  q1.bin
> male  0  1
>0 0.3988 0.6012
>1 0.2879 0.7121
> 
> I can repeat that for each of the items I want gender breakdowns for (q2,
> q3, q4 ).   But what I really want is a table that shows the percentage
> answering yes (coded as 1) across many, many binary response items.  E.g.,
> 
> 
> male q1.bin q2.bin q3.bin ...
>0 0.6012 0.3421 0.9871 ...
>1 0.7121 0.6223 0.0198 ...
> 
> I've tried various combinations of apply & cbind, but to no avail. It would
> be easy in SPSS crosstabs, but darnit, I want to use R!

Donald,

The other solutions is using command CrossTable of package gmodels.
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sequences

2009-04-07 Thread Bernardo Rangel Tura
On Tue, 2009-04-07 at 05:16 -0700, Melissa2k9 wrote:
> Hi,
> 
> I am trying to make a sequence and am  using a for loop for this. I want to
> start off with an initial value ie S[0]=0 then use the loop to create other
> values. This is what I have so far but I just keep getting error messages.
> 
> #To calculate the culmulative sums:
> 
> s<-rep(0,207)#as this is the length of the
> vector I know I will have
> s<-as.vector(s)
> s[0]<-0
> for (i in 1:length(lambs))# where lambs is a vector of
> length 207 consisting of temperature 
> values
> 
> 
> {
>   s[i]<-s[i-1]-mean(lambs)
> }
> 
> I continually get the error message: 
> 
> Error in s[i] <- s[i - 1] - mean(lambs) : replacement has length zero
> 
> 
> When I merely use s[i]<-i-mean(lambs) it works so there is obviously
> something wrong with the s[i-1] but i cant see what. All I want is for each
> S[i] to be the previous value for S - the mean.

Hi Melissa,

I think this is a index problem ...

your code is:

s<-rep(0,207)   
s<-as.vector(s)
s[0]<-0
for (i in 1:length(lambs)){
s[i]<-s[i-1]-mean(lambs)
}

But try this code:

s<-rep(0,207)   
s<-as.vector(s)
s[1]<-0 # not s[0]
for (i in 2:length(lambs)){ #not 1:length(lambs)
s[i]<-s[i-1]-mean(lambs)
}

The vector in R begin in 1 not in 0...

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to save the variables in R to a file

2009-04-06 Thread Bernardo Rangel Tura
On Mon, 2009-04-06 at 11:05 +0200, Cetinyürek Aysun wrote:
> Dear list members,
> 
> I have f.mean matrix of size 500 X 83 in R, and I want to save this into a
> file.
> I used the command
> save("f.mean.Rdata",file="D:/Users/Ays/Documents/Results")
> it saves but when I open the file in notepad it is just some characters
> meaningless.
> 
> Thank you in advance,
> 
> Kind Regards,
> 
> Aysun

Hi Aysun,

Try write.csv something like this:

write.csv(f.mean.Rdata,"D:/Users/Ays/Documents/Results/f_mean.csv")




-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] threshold distribution

2009-04-05 Thread Bernardo Rangel Tura
On Sun, 2009-04-05 at 01:13 -0400, jim holtman wrote:
> Here is what I get from using 'fitdistr' in R to fit to a lognormal.
> The resulting density plot from the distribution seems to be a reason
> match to the data.
> 
> > x <- scan()
> 1: 0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
> 9: 0.85009 0.85804 0.73324 1.04826 0.84002
> 14: 1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
> 22: 0.93641 0.80418 0.95285 0.76876 0.82588
> 27: 1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
> 35: 0.82364 0.84390 0.71402 0.80293 1.02873
> 40:
> Read 39 items
> > plot(density(x))
> > library(MASS)
> > fitdistr(x, 'lognormal')
>  meanlogsdlog
>   -0.134806360.19118861
>  ( 0.03061468) ( 0.02164785)
> > lines(dlnorm(x, -.1348, .1911), col='red')

Hi Jim

I agree with your solution but my plot result not fine.
I obtain same result.
> fitdistr(x, 'lognormal')
 meanlogsdlog   
  -0.134806360.19118861 
 ( 0.03061468) ( 0.02164785)

In plot when I use points (blue) and curve (green) the fit o lognormal
and density(data) is fine but when I use lines (red) the fit is bad (in
attach I send a PDF of my output)

Do you know why this  happen?

Thanks in advance

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

P.S. my script is

x <- scan()
0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
0.85009 0.85804 0.73324 1.04826 0.84002
1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
0.93641 0.80418 0.95285 0.76876 0.82588
1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
0.82364 0.84390 0.71402 0.80293 1.02873

require(MASS)
fitdistr(x, 'lognormal')
pdf("adj.pdf")
plot(density(x))
lines(dlnorm(x, -.1348, .1911), col='red')
points(x,dlnorm(x, -.1348, .1911), col='blue')
curve(dlnorm(x, -.1348, .1911), col='green',add=T)
dev.off()




adj.pdf
Description: Adobe PDF document
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installation of all packages under Unix

2009-04-02 Thread Bernardo Rangel Tura
On Thu, 2009-04-02 at 14:37 +0200, nboe...@uni-potsdam.de wrote:
> Dear List-Members,
> 
> does anyone know a simple way to automatically install all R packages  
> on a unix system to the default library path? Not from inside R, it  
> should rather work as a shell script - job at startup.
> 
> Something like R cmd install -l pkgs ### where pkgs should mean all  
> packages (is there an option to control overwriting?)
> 
> Thanks a lot,
> 
> N. Boehme

All packages in CRAN

install.packages(new.packages(),dep=T,clean=T)


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A query about na.omit

2009-04-01 Thread Bernardo Rangel Tura
On Wed, 2009-04-01 at 16:49 +0100, Jose Iparraguirre D'Elia wrote:
> Dear all,
>  
> Say I have the following dataset:
>  
> > DF
> x y z
> [1]   1 1 1
> [2]   2 2 2
> [3]   3 3NA
> [4]   4   NA   4
> [5]  NA  5 5
>  
> And I want to omit all the rows which have NA, but only in columns X and Y, 
> so that I get:
>  
>  x  y  z
> 1  1  1
> 2  2  2
> 3  3  NA
>  
> If I use na.omit(DF), I would delete the row for which z=NA, obtaining thus
>  
> x y z
> 1 1 1
> 2 2 2
>  
> But this is not what I want, of course. 
> If I use na.omit(DF[,1:2]), then I obtain
>  
> x y 
> 1 1
> 2 2
> 3 3
>  
> which is OK for x and y columns, but I wouldn't get the corresponding values 
> for z (ie 1 2 NA)
>  
> Any suggestions about how to obtain the desired results efficiently (the 
> actual dataset has millions of records and almost 50 columns, and I would 
> apply the procedure on 12 of these columns)?
>  
> Sincerely,
>  
> Jose Luis 
>  
> Jose Luis Iparraguirre
> Senior Research Economist 
> Economic Research Institute of Northern Ireland
>  

Hi Jose Luis,

I think this script is sufficient for your problem:

tab<-matrix(c(1,1,1,2,2,2,3,3,NA,4,NA,4,NA,5,5),ncol=3,byrow=T)
tab[!is.na(tab[,1])&!is.na(tab[,2]),]

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to increase the limit for max.print in R

2009-03-31 Thread Bernardo Rangel Tura
On Tue, 2009-03-31 at 15:51 +0530, pooja arora wrote:
> Thanks, it Worked.
> Do you have any idea how much is the max limit for max.print
> 
> 
> Thanks and Regards,
> Pooja Arora 

Hi Pooja,

In this case max is;

options(max.print=.Machine$double.xmax)

In my case I have a compiled R 2.8.1 in AMD phenom using Ubuntu 8.10 AMD
64 

> .Machine$double.xmax
[1] 1.797693e+308


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to increase the limit for max.print in R

2009-03-31 Thread Bernardo Rangel Tura
On Tue, 2009-03-31 at 14:47 +0530, pooja arora wrote:
> Hi All,
> 
>  
> 
> I am using DNAcopy package in R for copy number analysis of 500K chip.
> 
> The final output which I get from DNA copy is too big to be printed in a
> file.
> 
> So I am getting an error as "reached getOption("max.print") -- omitted
> 475569 rows " 
> 
> Can somebody please provide me the pointers with how to increase the limit
> for max.print .
> 
>  
> 
> Thanks,
> 
>  
> 
> Pooja

Hi Pooja,

You must use options command, something like this

options(max.print=5.5E5)

For more information type? ?options

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] very fast OLS regression?

2009-03-26 Thread Bernardo Rangel Tura
On Wed, 2009-03-25 at 22:11 +0100, Dimitris Rizopoulos wrote:
> check the following options:
> 
> ols1 <- function (y, x) {
>  coef(lm(y ~ x - 1))
> }
> 
> ols2 <- function (y, x) {
>  xy <- t(x)%*%y
>  xxi <- solve(t(x)%*%x)
>  b <- as.vector(xxi%*%xy)
>  b
> }
> 
> ols3 <- function (y, x) {
>  XtX <- crossprod(x)
>  Xty <- crossprod(x, y)
>  solve(XtX, Xty)
> }
> 
> ols4 <- function (y, x) {
>  lm.fit(x, y)$coefficients
> }
> 
> # check timings
> MC <- 500
> N <- 1
> 
> set.seed(0)
> x <- matrix(rnorm(N*MC), N, MC)
> y <- matrix(rnorm(N*MC), N, MC)
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))

Hi Dimitris and Ivo

I think this is not a fair comparison, look this

x[8,100]<-NA

system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
   user  system elapsed 
  8.765   0.000   8.762 
 

 system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
Error in solve.default(t(x) %*% x) : 
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.002 
 

 system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
Error in solve.default(XtX, Xty) : 
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.001 
 

 system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
Error in lm.fit(x, y) : NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0 0 0.001 

So routines ols2, ols3 and ols4 only functional in fully matrix if have
one NA this functions don't run.

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Search "Errors"

2009-03-25 Thread Bernardo Rangel Tura
On Tue, 2009-03-24 at 11:45 -0700, CE.KA wrote:
> Hi R users,
> I have a big program
> So in Rgui I can t see all the execution of it
> Is there a way to ask R if there is "Errors" in my program
> Sincerely yours

Hi

Normally i use 3 functions in debug R routines: trace, browser and
debug.
In special cases i use a debug package
( http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf )
Article: Debugging without (too many) tears 

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] any package for connecting berkeley db in R?

2009-03-23 Thread Bernardo Rangel Tura
On Fri, 2009-03-20 at 17:23 -0400, Zheng, Xin (NIH) [C] wrote:
> Hi there,
> 
> Is there any such package? I searched but found none. Thanks in advance.
> 
> Xin Zheng
> 

Hi Xin

Do you try package DBI?

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fisher.test - FEXACT error 7

2009-03-23 Thread Bernardo Rangel Tura
On Fri, 2009-03-20 at 18:29 +, Helena Mouriño wrote:
> Dear all,
> 
> Im having an awkward problem in R.  When I write the command
> Fisher.test(school.data,workspace=2e+07), where school.data is the matrix
> corresponding to the data set,
> I get the error  message:

> FEXACT error 7.
> LDSTP is too small for this problem.
> Try increasing the size of the workspace.
> 
> Increasing the workspace: 
> Fisher.test(school.data,workspace=1e+10),
> I get a different message, but it still doesnt work:
> 
> NAs in foreign function call (arg 10)
> In addition: Warning message:
> In fisher.test(dados, workspace = 1e+10, alternative = "two.sided") :
>   NAs introduced by coercion

Hi Helena,

In this case you can try 3 solutions:

1- chisq.test(school.data), but pay attention if expected value of any
cell is < 5 

2- Fisher.test(school.data,workspace=2e+07,hybrid=TRUE) from Help

For larger than  2 by 2 tables and 'hybrid = TRUE', asymptotic
 chi-squared probabilities are only used if the 'Cochran
 conditions' are satisfied, that is if no cell has count zero, and
 more than 80% of the cells have counts at least 5.

3- Use "large tables" approach from Sir David Cox:

Law, G. R. and Cox, D. R. and Machonochie, N. E. S. and Simpson, J. and
Roman, E. and Carpenter, L. M. (2001) Large tables. Biostatistics
2(2):pp. 163-171.


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] numeric equality

2009-03-20 Thread Bernardo Rangel Tura
On Wed, 2009-03-18 at 11:58 -0400, Yu, Changhong wrote:
> Dear all,
> 
>  
> 
> I am totally confused by the following R output, but don't have a clue
> for it.
> 
>  
> 
> > a <- 1 - 0.2
> 
> > a == 0.8
> 
> [1] TRUE
> 
> > a <- 1 - 0.8
> 
> > a == 0.2
> 
> [1] FALSE

Hi Yu,

First of all read FAQ 7.31 (Why doesn't R think these numbers are
equal?)

Second, in this case, use all.equal

a <- 1 - 0.8

a == 0.2
[1] FALSE

all.equal(a,0.2)
[1] TRUE

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regressão linear

2009-03-06 Thread Bernardo Rangel Tura
On Thu, 2009-03-05 at 02:20 +, Ben Bolker wrote:
> Sueli Rodrigues  esalq.usp.br> writes:
> > 
> > Olá. Tenho um arquivo que a cada 6 linhas corresponde uma amostra da qual
> > preciso dos coeficientes da regressão linear. Como faço para que o
> > programa distinga a cada 6 linhas como uma amostra e não calcule como um
> > todo?
> > Estou usando a função: model=lm(y ~ x)
> > 
> 
>   You're more likely to get a response if you post to the list
> in English (even fractured English).
> 
>  Based on what Google translator thinks you said (you want
> to perform linear regressions on 6-line subsets of a data set?),
> here's a starting point (assuming your data are in a data frame
> mydata, and have column names x and y):
> 
> splitdat <- split(mydata,rep(1:6,length.out=nrow(mydata))
> linfits <- lapply(splitdata,lm,formula=y~x)
> coefs <- sapply(linfits,coef)
> 
> or something like that.
> 
>   Ben Bolker

Hi Ben Bolker 

First of all I would like to thank the kindness with my countrywoman.

Second in her problem each 6 rows is a subset for a linear regression so
the command is 

splitdat <- split(mydata,rep(1:(nrow(mydata)/6),each=6))

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density > 1?

2009-03-02 Thread Bernardo Rangel Tura
On Mon, 2009-03-02 at 13:27 +0100, Johannes Elias wrote:
> Dear R-Gurus,
> 
> I wonder why 'density' values as shown in hist or plot(density(x)) are
> sometimes over 1. How can that be?
> 
> Example
> 
> >hist(rnorm(1000,sd=.5),freq=FALSE)
> 
> The resulting plot shows density values below 1 on the y-axis. However,
> 
> >hist(rnorm(1000,sd=.1),freq=FALSE)
> 
> shows density values over 1.
> 
> How to interpret density values over 1?
> 
> Greetings,
> 
> Johannes

Johannes,

Well density is not like probability

In histogram with density the area is equal de probability 

in you example

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)

The interval of -0.05 and 0 have density=4 but a probability of number
in this interval is 4*.05=.2

the fact 

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)$density
[1] 0.0998 0.2800 0.9400 1.9800 2.6000 4.
[7] 4.0400 2.9200 1.6600 0.9200 0.4400 0.1000
[13] 0.0200

set.seed(123)
sum(hist(rnorm(1000,sd=.1),freq=FALSE)$density)
[1] 1


So the sum of probability is 1 but the sum of density 20 

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rounding problem

2009-03-02 Thread Bernardo Rangel Tura
On Sun, 2009-03-01 at 17:51 -0800, tedzzx wrote:
> Yes, round(1.5)=2. but round(2.5)=2.  I want round(2.5)=3 just like the what
> the excel do.  Can we change the setting or do some trick so that the
> computer will work like what we usually do with respect to rounding.
> My system is R 2.8.1, winXP, Intel core 2 dual . Thanks. 
On Sun, 2009-03-01 at 17:51 -0800, tedzzx wrote:
> Yes, round(1.5)=2. but round(2.5)=2.  I want round(2.5)=3 just like
> the what
> the excel do.  Can we change the setting or do some trick so that the
> computer will work like what we usually do with respect to rounding.

If I understand your problem you need using a bias round like excell.
Do you understand this is a wrong way and this is a bug of excell, but
you need this.

Well to help you i create a function for this

round.excell<-function(x){
aux<-round(x,1)
aux<-ifelse(
all.equal(aux-trunc(aux),.5)==TRUE,
ceiling(aux),
round(aux,0))
return(aux)
}
round.excell(1.5)
round.excell(2.5)
round.excell(2.1)
round.excell(1.1)
round.excell(2.6)
round.excell(1.6)

But I need talk to you:

1- The use of routine put a bias in your analysis
2- Is not a standard (check IEC 60559 about this)
3- Do you migrating a excell bug to R
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does The Randvar package contain a virus(Malware) ?

2009-03-02 Thread Bernardo Rangel Tura
On Mon, 2009-03-02 at 08:41 +0100, Matthias Kohl wrote:
> there were only minor changes in the latest version of RandVar (from 
> 0.6.6 to 0.6.7).
> Might this be a mirror problem?
> 
> Best
> Matthias
> 
> Timthy Chang wrote:
> > Today,I update the packages in R.
> > but AntiVir Guard dectects the  Randvar package as affected file.
> > What happen ?? 
> > Thank you for your answer.
> >
> > http://www.nabble.com/file/p22281513/qq.gif 
> >   


Or this find a False Positive

Second
http://www.avira.com/en/threats/section/fulldetails/id_vir/4142/heur_html.malware.html

HEUR/HTML.Malware Not a virus but a code this anti virus suspected may
be a virus


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 1.095e+09 for integers

2009-02-23 Thread Bernardo Rangel Tura
On Sun, 2009-02-22 at 23:42 -0500, Alexy Khrabrov wrote:
> I've had a very long file written out by R with write.table, with  
> fields of time values, converted from POSIXlt as.numeric.  Among 2.5  
> million values, very few had 6 trailing zeroes, and those were output  
> in scientific notation as in the subject.  Is this the default  
> behavior for long integers, and how can it be turned off (with all  
> digits for any integer field in write.table)?  This is important to  
> interoperate with other languages through such text dumps, as some do  
> not expect scientific notation for integers, only for floats.
> 
> Cheers,
> Alexy

Alexy 

If I understood your problem you have something like this


as.numeric(12345678912345678)
[1] 1.234568e+16

In this case your solution is using format, like this

format(as.numeric(12345678912345678),scientific=FALSE)
[1] "12345678912345678


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] System of logistics Equations-correction

2009-02-20 Thread Bernardo Rangel Tura
Hi R-masters

Sorry but i make a error in script the correct code to fake database of 
research 

Subj<-rep(1:20,each=60)
Sti<-rep(c("G","S","P"),40)
SP<-rep(c("S","P"),each=60)
AG<-rep(c("A","P"),60)
Mer<-rep(c("M","NM","M","NM"),each=30)
Car<-round(runif(120,1,7),0)
Val<-round(runif(120,-7,7),0)
base<-data.frame(Subj,Sti,SP,AG,Mer,Car,Val)


In my hypothesis:

logit(Sti=="G") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorG

logit(Sti=="S") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorS

logit(Sti=="P") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorP

I test and the 3 terms of error (errorG,errorS,errorP) is correlated.

So I think useful adjust a system of logistic equations to tread the 3
equations and in same time to obtain estimatives of effects and
uncorrelated error terms.


The systemfit package fit linear system and non-linear system but is
possible adjust a logistic system in R?

Thanks in advance



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] System of logistics Equations

2009-02-20 Thread Bernardo Rangel Tura
Hi R-masters

I need yours help about a problema in one of may ongoing researchers.

In my research the subjects (20 in total) answer 60 questions (20 type
G, 20 type S and 20 type P).

Which questions is classified about 3 factor (2 level each) and the
subject score with 2 scale (not integer value is possible but rare): Val
range -7 to 7 and other Car range 1 to 7.

This a code to fake database of research 

Subj<-rep(1:20,each=60)
Sti<-rep(c("G","S","P"),40)
SP<-rep(c("S","P"),60)
AG<-rep(c("A","P"),60)
Mer<-rep(c("M","NM"),60)
Car<-round(runif(120,1,7),0)
Val<-round(runif(120,-7,7),0)
base<-data.frame(Subj,Sti,SP,AG,Mer,Car,Val)

In my hypothesis:

logit(Sti=="G") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorG

logit(Sti=="S") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorS

logit(Sti=="P") ~ SP+AG+Mer+Car+Val+SP*Car+Mer*Val+AG*Val + errorP

I test and the 3 terms of error (errorG,errorS,errorP) is correlated.

So I think useful adjust a system of logistic equations to tread the 3
equations and in same time to obtain estimatives of effects and
uncorrelated error terms.


The systemfit package fit linear system and non-linear system but is
possible adjust a logistic system in R?

Thanks in advance



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generating Numbers With Certain Distribution in R

2009-02-11 Thread Bernardo Rangel Tura
On Wed, 2009-02-11 at 04:07 +, Ben Bolker wrote:
> Gundala Viswanath  gmail.com> writes:
> 
> > 
> > Dear all,
> > 
> > Is there a way to generate K numbers of integer (K = 10^6).
> > The maximum value of the integer is 200,000 and minimum is 1.
> > 
> > And the occurrences of this integer follows
> > a lognormal distribution.
> > 
> > - Gundala Viswanath
> > Jakarta - Indonesia
> 
>   Technically speaking, I don't think this is possible
> since lognormal variates range from 0 to infinity
> and are continuous.  However, perhaps something like
> 
> x <- rlnorm(1e6,mulog=1,sdlog=1) ## pick any parameters you like
> y <- round((x-min(x))/diff(range(x)))*19+1
> 
>   That probably doesn't do exactly what you want,
> but you will probably need to specify your problem more
> precisely (and perhaps give some context) if you want
> a better answer ...
> 
>   Ben Bolker


Ben,

I think your routine need a little fix

x <- rlnorm(1e6,meanlog=1,sdlog=1) ## pick any parameters you like
y <- round((x-min(x)/diff(range(x)))*19+1)

What you think?

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] OT: A test with dependent samples.

2009-02-11 Thread Bernardo Rangel Tura
On Tue, 2009-02-10 at 21:03 -0500, Murray Cooper wrote:

Hi R-masters

Well,

I think this a complex problem because haven't a control group OR a not
randomized study.

But i think the solution is a Bayesian approach.

I don't know the probability of vomiting in a cat but isn't 0, so i
think de priori is a beta (1[0+1],73[72+1]).

The likelihood is oblivious beta(13[12+1],61[60+1])

So the posteriori is beta(1,73)*beta(13,61)=beta(14,134)

The expected valeu of posteriori is 0.1 in 72 cats is same 7.2 or 7 CATS
is almost a half of numbers of study.

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
 
> David,
> 
> If you really want to do a test on this data, I would suggest
> a Fisher's Exact test, but you want to use hypergeometric
> probabilities. You would probably want to try the CMH
> test, if the function allows a single table and actually uses
> hypergeometric probabilities.
> 
> My suggestion, would be to calculate the frequency of
> vomiting, for animals that didn't vomit before, calculate
> the CIs and then use some historical data on the vomiting
> rate, for non-treated cats and see whether it falls inside the
> CIs for your treated animals. If it does, then you might
> conclude that the vomiting rate, for treated cats, is
> similar to non-treated cats.
> 
> Murray M Cooper, Ph.D.
> Richland Statistics
> 9800 N 24th St
> Richland, MI, USA 49083
> Mail: richs...@earthlink.net
> 
> - Original Message - 
> From: "David Winsemius" 
> To: "Rolf Turner" 
> Cc: "R-help Forum" 
> Sent: Tuesday, February 10, 2009 4:50 PM
> Subject: Re: [R] OT: A test with dependent samples.
> 
> 
> > In the biomedical arena, at least as I learned from Rosner's  introductory 
> > text, the usual approach to analyzing paired 2 x 2 tables  is McNemar's 
> > test.
> >
> > ?mcnemar.test
> >
> > > mcnemar.test(matrix(c(73,0,61,12),2,2))
> >
> > McNemar's Chi-squared test with continuity correction
> >
> > data:  matrix(c(73, 0, 61, 12), 2, 2)
> > McNemar's chi-squared = 59.0164, df = 1, p-value = 1.564e-14
> >
> > The help page has citation to Agresti.
> >
> > -- 
> > David winsemius
> > On Feb 10, 2009, at 4:33 PM, Rolf Turner wrote:
> >
> >>
> >> I am appealing to the general collective wisdom of this
> >> list in respect of a statistics (rather than R) question.  This  question
> >> comes to me from a friend who is a veterinary oncologist.  In a  study 
> >> that
> >> she is writing up there were 73 cats who were treated with a drug  called
> >> piroxicam.  None of the cats were observed to be subject to vomiting 
> >> prior
> >> to treatment; 12 of the cats were subject to vomiting after treatment
> >> commenced.  She wants to be able to say that the treatment had a 
> >> ``significant''
> >> impact with respect to this unwanted side-effect.
> >>
> >> Initially she did a chi-squared test.  (Presumably on the matrix
> >> matrix(c(73,0,61,12),2,2) --- she didn't give details and I didn't 
> >> pursue
> >> this.) I pointed out to her that because of the dependence --- same 73
> >> cats pre- and post- treatment --- the chi-squared test is  inappropriate.
> >>
> >> So what *is* appropriate?  There is a dependence structure of some  sort,
> >> but it seems to me to be impossible to estimate.
> >>
> >> After mulling it over for a long while (I'm slow!) I decided that a
> >> non-parametric approach, along the following lines, makes sense:
> >>
> >> We have 73 independent pairs of outcomes (a,b) where a or b is 0
> >> if the cat didn't barf, and is 1 if it did barf.
> >>
> >> We actually observe 61 (0,0) pairs and 12 (0,1) pairs.
> >>
> >> If there is no effect from the piroxicam, then (0,1) and (1,0) are
> >> equally likely.  So given that the outcome is in {(0,1),(1,0)} the
> >> probability of each is 1/2.
> >>
> >> Thus we have a sequence of 12 (0,1)-s where (under the null  hypothesis)
> >> the probability of each entry is 1/2.  Hence the probability of this
> >> sequence is (1/2)^12 = 0.00024.  So the p-value of the (one-sided)  test
> >> is 0.00024.  Hence the result is ``significant'' at the usual levels,
> >> and my vet friend is happy.
> >>
> >> I would very much appreciate comments on my reasoning.  Have I made  any
> >> goof-ups, missed any obvious pit-falls?  Gone down a wrong garden  path?
&g

Re: [R] Choosing a random number between x and y

2009-02-10 Thread Bernardo Rangel Tura
On Mon, 2009-02-09 at 06:40 -0800, Vie wrote:
> Hi,
> 
> Ive been trying to find a function that will allow me to pull out a number
> between a minimum and maximum threshold.
> 
> I want a random decimal number between, for example, 0 and 0.5 or 0 and 0.7.
> I've been searching everywhere for a function that will allow me to do this
> in R, but I have yet to be successful. Any help would be much appreciated.
> 
> Thanks in advance

Hi Vie 

I don't understand if you need a only random generation or mixture
random generation, so i will make the 3 examples Using runif

1- Random 10 number Retween 0 and 0.5 runif(10,0,0.5)

2 -Random 20 number Retween 0 and 0.7 runif(20,0,0.7)

3- Random 40 number of mixture two random uniforme random 1 and 2 with
p(random1)= 0.3 and p(random=2) = 0.7

ifelse(runif(40)>.3,runif(40,0,0.7),runif(40,0,0.5))

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to analyse and model 2 time series, when one series needs to be differenced?

2009-02-01 Thread Bernardo Rangel Tura
On Mon, 2009-01-26 at 08:52 +, Andreas Klein wrote:
> Hello.
> 
> How can I analyse the cross-correlation between two time series with ccf, if 
> one of the time series need to be differenced, so it is stationary?
> The two time series differ when in length and maybe ccf produces not the 
> correct cross-correlation?!
> 
> Another problem:
> How can I model the two time series as an VARI-process with the dse package? 
> - 
> So how can I handle it, that one series has to be differenced and the other 
> series not?
> 
> I hope you can give me some hints.
> 
> 
> Regards,
> Andreas.

Hi Andreas,

If I understand your problem this script solve tour question

t<-1:15
x<-rnorm(10)
y<-.2-.3*t+rnorm(15)
y.dif<-diff(y,1)
ccf(x,y.dif)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R for Computational Neuroscience?

2009-01-25 Thread Bernardo Rangel Tura
On Fri, 2009-01-23 at 08:53 -0400, Mike Lawrence wrote:
> Hi all,
> 
> I've noticed that many computational neuroscience research groups use
> MATLAB. While it's possible that MATLAB may have some features
> unavailable in R, I suspect that this may instead simply be a case of
> costly tradition, where researchers were taught MATLAB as students and
> pay for it as researchers because it's all they know.
> 
> I'd like to attempt to break the cycle by offering colleagues
> resources on using R for computational neuroscience, but I haven't
> been able to find anything (searched the task view, r-seek, & google).
> 
> Can anyone direct me to resources on using R for computational
> neuroscience? Input on my possibly naive assumption that R is a
> sufficient tool for this field would also be appreciated.
> 
> Cheers,
> 
> Mike

Mike,

I think neuroscience is a term using for a wide group of researchers.
The common analysis (hypothesis test, ANOVA, regression models, etc) is
perfectly made in R.

But the interpretation of mri is need a packages:

1- AnalyzeFMRI -Functions for I/O, visualisation and analysis of
functional Magnetic Resonance Imaging (fMRI) datasets stored in the
ANALYZE or NIFTI format.

2- fmri - contains R-functions to perform an fmri analysis as described
in Tabelow, K., Polzehl, J., Voss, H.U., and Spokoiny, V. Analysing fMRI
experiments with structure adaptive smoothing procedures, NeuroImage,
33:55-62 (2006)

3- dti - Diffusion Weighted Imaging is a Magnetic Resonance Imaging
modality, that measures diffusion of water in tissues like the human
brain. The package contains R-functions to process diffusion-weighted
data in the context of the diffusion tensor model (DTI). This includes
the calculation of anisotropy measures and, most important, the
implementation of our structural adaptive smoothing algorithm as
described in K. Tabelow, J. Polzehl, V. Spokoiny, and H.U. Voss,
Diffusion Tensor Imaging: Structural Adaptive Smoothing, Neuroimage
39(4), 1763-1773 (2008).


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible for R to import a SigmaPlot file?

2009-01-23 Thread Bernardo Rangel Tura
On Thu, 2009-01-22 at 14:58 -0800, Jason Rupert wrote:
> I recently received a Sigmaplot file (*.jnb)from a customer and would like to 
> know if I can input it to a data frame and then manipulate the data in R. 
> 
> I did a search on Google and on RSeek (www.rseek.org), but did not get any 
> good hits. Thank for any feedback and insight you can provide. 
> 
> P.S. Love the flexibility of R and would love to keep using it. Just wanting 
> to know if this is possible. Thanks again.

Hi Jason,

I don't know sigmaplot, but I thing is possible export sigmaplot
database for other type of files.

If you export jnb file to csv file is possible read a database in R.

I will talk with a person work with me and  tonight I send other mail
with more details.
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] easiest way to integrate own functions on startup

2009-01-19 Thread Bernardo Rangel Tura
On Tue, 2009-01-20 at 01:13 +0100, Jörg Groß wrote:
> Hi,
> 
> I am currently writing some own functions that I frequently need.
> 
> So, it would be perfect if I could load these functions at the  
> beginning of each R-session with a small command.
> 
> 
> I tried to generate a R-package and install it that way.
> 
> But it seems that it is not so easy to add new functions to an  
> existing R-package.
> So I am not so flexible by that.
> 
> 
> Is there a way to just load an .R-file which contains the function- 
> definitions with a small command?
> So that the functions are useable in the current session?
> 
> 
> I tried load() but I get an error message...


Hi Jörg,

I do you not use .Fisrt.

Something like this:

.First <- function(){
source(MyFunction1.txt)
source(MyFunction2.txt)
source(MyFunction3.txt)
...
source(MyFunctionn.txt)
}

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pchisq error

2009-01-19 Thread Bernardo Rangel Tura
On Mon, 2009-01-19 at 09:54 +0100, Jeremy Silver wrote:
> Dear R experts,
(...)
> > pchisq(5.464342,1,lower.tail = FALSE)
> 
> [1] 0.01940836
> 
> > reproduceError(5.464342)
> 
> stat = 5.464342, p = 0.019408
> 
> > pchisq(5.464342,1,lower.tail = FALSE)
> 
> [1] NaN
> 
> Warning messages:
> 
> 1: In pchisq(5.464342, 1, lower.tail = FALSE) :
> 
>** NON-convergence in pgamma()'s pd_lower_cf() f= nan.
> 
> 2: In pchisq(q, df, lower.tail, log.p) : NaNs produced
> > 
(...)
> 
> > sessionInfo()
> 
> R version 2.8.0 (2008-10-20) 
> 
> i686-pc-linux-gnu 
> 
> locale:
> 
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> 
> attached base packages:
> 
> [1] stats graphics  grDevices utils datasets  methods   base 

Hi Jeremy,

In my computer your error is not occur. Look This:

> pchisq(5.464342,1,lower.tail = FALSE)
[1] 0.01940836

> pchisq(5.464342,1,lower.tail = FALSE)
[1] 0.01940836

> sessionInfo()
R version 2.8.1 Patched (2009-01-16 r47630) 
x86_64-unknown-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base 


Well, do you already try update your R?


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How-To capture and handle errors with R

2008-12-14 Thread Bernardo Rangel Tura
On Fri, 2008-12-12 at 12:51 +0100, mau...@alice.it wrote:
> The following error occurs every now and then by calling a function of wmTSA 
> package:
> 
> Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "0")) : 
>   invalid 'row.names' length
> 
> I would greatly appreciate some guidelines about how to catch such an error 
> upon its occurrence and have it handled by my own routine 
> rather than letting R stop the currently run script.
> I had a look a the R on-line documentation about errors handler and also ran 
> some provided examples that confused my mind.
> The error is in textual form. I do not know whether I have to pass the whole 
> message to the handler.
> When such an exception occurs the execution control is not transferred to my 
> routine.
> I would like to see a working example.
> 
> Thank you so much,
> Maura 

Maura you can use try to catch error for more information digit: ?try
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logical inconsistency

2008-12-11 Thread Bernardo Rangel Tura
On Thu, 2008-12-11 at 17:23 +0200, Kenn Konstabel wrote:
> Hi,
> 
> I agree -- and my examples using round were meant as bad and dangerous
> examples. Using round at the last step is better and may solve the problem,
> but in your example ...
> 
> > round(8.8-7.8,1)==1
> [1] TRUE
> 
> ... you have to know in advance how many decimal places can possibly make a
> difference (is it just one? maybe it is 2? 3? 15?).
> 
> > round(8.8-7.8,14)==1
> [1] TRUE
> > round(8.8-7.8,15)==1
> [1] FALSE
> 
> ... or, equivalently,
> 
> > 8.8-7.8-1 < 1e-15
> [1] TRUE
> > 8.8-7.8-1 < 1e-16
> [1] FALSE
> 
> Best regards,
> Kenn

Hi Kenn,

If you know "guard digits" approach you know the answer for your
question.

You need the minimum numbers of significant digits and put all numbers
in same number of significant digits for comparison.

In all example put in this thread we need only 1 decimal place so you
round for 1 decimal place, if for your job you need 3 decimal places
precision you round for 3 decimal place.


I think don't make sense you using 10 decimal place precision if your
problem need 2 decimal places precision ...

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logical inconsistency

2008-12-11 Thread Bernardo Rangel Tura

Hi Kenn,

Well I think your use or round isn't optimal solution. If you using 

round(x,1)-round(x,1) you create 2 problems

First: error propagation because you make 2 round.

Second: you don't using "guard digits" approach.

The optimal use of round is using in last calculation:

Look this

> round(8.8,1)-round(7.8,1)>1
[1] TRUE
> round(8.8-7.8,1)>1
[1] FALSE
> round(8.8-7.8,1)==1
[1] TRUE

Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

-- Original Message ---
From: "Kenn Konstabel" <[EMAIL PROTECTED]>
To: "emma jane" <[EMAIL PROTECTED]>
Cc: R help <[EMAIL PROTECTED]>
Sent: Thu, 11 Dec 2008 11:53:01 +0200
Subject: Re: [R] Logical inconsistency

> Rounding can do no good because
> 
> round(8.8,1)-round(7.8,1)>1
> # still TRUE
> round(8.8)-round(7.7)>1
> # FALSE
> 
> What you might do is compute a-b-1 and compare it to a very small number:
> 
> (8.8-7.8-1) < 1e-10
> # TRUE
> 
> K
> 
> On Wed, Dec 10, 2008 at 11:47 AM, emma jane <[EMAIL PROTECTED]> 
> wrote:
> 
> > Thanks Greg, that does make sense.  And I've solved the problem by
> > rounding the variables before taking the difference between them.
> >
> > Thanks to all who replied.
> >
> > Emma JaneÂ
> >
> >
> >
> >
> > 
> > From: Greg Snow <[EMAIL PROTECTED]>
> >
> > .com.br>; Wacek Kusnierczyk <[EMAIL PROTECTED]>; Chuck
> > Cleland <[EMAIL PROTECTED]>
> > Cc: R help <[EMAIL PROTECTED]>
> > Sent: Tuesday, 9 December, 2008 16:30:08
> > Subject: RE: [R] Logical inconsistency
> >
> > Some (possibly all) of those numbers cannot be represented exactly, so
> > there is a chance of round off error whenever you do some arithmetic,
> > sometimes the errors cancel out, sometimes they don't.  Consider:
> >
> > > print(8.3-7.3, digits=20)
> > [1] 1.001
> > > print(11.3-10.3, digits=20)
> > [1] 1
> >
> > So in the first case the rounding error gives a value that is slightly
> > greater than 1, so the greater than test returns true (if you round the
> > result before comparing to 1, then it will return false).  In the second
> > case the uncertainties cancelled out so that you get exactly 1 which is not
> > greater than 1 an so the comparison returns false.
> >
> > Hope this helps,
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > [EMAIL PROTECTED]
> > 801.408.8111
> >
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > > project.org] On Behalf Of emma jane
> > > Sent: Tuesday, December 09, 2008 7:02 AM
> > > To: Bernardo Rangel Tura; Wacek Kusnierczyk; Chuck Cleland
> > > Cc: R help
> > > Subject: Re: [R] Logical inconsistency
> > >
> > > Many thanks for your help, perhaps I should have set my query in
> > > context  !
> > >
> > > I'm simply calculating an indicator variable [0,1] based on the whether
> > > the difference between two measured variables is > 1 or <=1.
> > >
> > > I understand the FAQ about floating point arithmetic, but am still
> > > puzzled that it only apparently applies to certain elements, as
> > > follows:
> > >
> > > 8.8 - 7.8 > 1
> > > > TRUE
> > >
> > > 8.3 - 7.3 > 1
> > > > TRUE
> > >
> > > However,
> > >
> > > 10.2 - 9.2 > 1
> > > >FALSE
> > >
> > > 11.3 - 10.3>1
> > > >Â FALSE
> > >
> > > Emma Jane
> > >
> > >
> > >
> > >
> > > 
> > > From: Bernardo Rangel Tura <[EMAIL PROTECTED]>
> > > To: Wacek Kusnierczyk <[EMAIL PROTECTED]>
> > > Cc: R help <[EMAIL PROTECTED]>
> > > Sent: Saturday, 6 December, 2008 10:00:48
> > > Subject: Re: [R] Logical inconsistency
> > >
> > > On Fri, 2008-12-05 at 14:18 +0100, Wacek Kusnierczyk wrote:
> > > > Berwin A Turlach wrote:
> > > > > Dear Emma,
> > > > >
> > > > > On Fri, 5 Dec 2008 04:23:53 -0800 (PST)
> > >
> > > > >
> > > > >
> > > > >> Please could someone kindly explain the following inconsistencies
> > > > >> I've discovered__when performing logica

Re: [R] Logical inconsistency

2008-12-10 Thread Bernardo Rangel Tura
On Tue, 2008-12-09 at 06:02 -0800, emma jane wrote:
> Many thanks for your help, perhaps I should have set my query in context  
> !
> 
> I'm simply calculating an indicator variable [0,1] based on the whether the 
> difference between two measured variables is > 1 or <=1.
> 
> I understand the FAQ about floating point arithmetic, but am still puzzled 
> that it only apparently applies to certain elements, as follows:
> 
> 8.8 - 7.8 > 1
> > TRUE
> 
> 8.3 - 7.3 > 1
> > TRUE
>  
> However,
>  
> 10.2 - 9.2 > 1
> >FALSE
>  
> 11.3 - 10.3>1
> > FALSE
> 
> Emma Jane

Emma,

This solve two forms:

1-  use all.equal
> all.equal((10.2 - 9.2),1)
[1] TRUE

2- use round
> round(10.2 - 9.2,0)>1
[1] FALSE
> round(10.2 - 9.2,0)>=1
[1] TRUE




-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logical inconsistency

2008-12-06 Thread Bernardo Rangel Tura
On Fri, 2008-12-05 at 14:18 +0100, Wacek Kusnierczyk wrote:
> Berwin A Turlach wrote:
> > Dear Emma,
> >
> > On Fri, 5 Dec 2008 04:23:53 -0800 (PST)
> > emma jane <[EMAIL PROTECTED]> wrote:
> >  
> >   
> >> Please could someone kindly explain the following inconsistencies
> >> I've discovered__when performing logical calculations in R:
> >>
> >> 8.8 - 7.8 > 1
> >> 
> >>> TRUE
> >>>   
> >> 8.3 - 7.3 > 1
> >> 
> >>> TRUE
> >>>   
> >
> > Gladly:  FAQ 7.31
> > http://cran.at.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> >
> >   
> 
> well, this answer the question only partially.  this explains why a
> system with finite precision arithmetic, such as r, will fail to be
> logically correct in certain cases.  it does not explain why r, a
> language said to isolate a user from the underlying implementational
> choices, would have to fail this way. 
> 
> there is, in principle, no problem in having a high-level language
> perform the computation in a logically consistent way.  for example, bc
> is an "arbitrary precision calculator language", and has no problem with
> examples as the above:
> 
> bc <<< "8.8 - 7.8 > 1"
> # 0, meaning 'no'
> 
> bc <<< "8.3 - 7.3 > 1"
> # 0, meaning 'no'
> 
> bc <<< "8.8 - 7.8 == 1"
> # 1, meaning 'yes'
> 
> 
> the fact that r (and many others, including matlab and sage, perhaps not
> mathematica) does not perform logically here is a consequence of its
> implementation of floating point arithmetic. 
> 
> the faq you were pointed to, and its referring to the goldberg's
> article, show that r does not successfully isolate a user from details
> of the lower-level implementation.
> 
> vQ

Well, first of all for 8.-7.3 is not equal to 1 [for computers]

> 8.3-7.3-1
[1] 8.881784e-16

But if you use only one digit precision 

> round(8.3-7.3,1)-1
[1] 0
> round(8.3-7.3,1)-1>0
[1] FALSE
> round(8.3-7.3,1)==1
[1] TRUE


So the problem is the code write and no the software

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to ignore errors

2008-12-01 Thread Bernardo Rangel Tura
On Mon, 2008-12-01 at 16:17 +0800, Jiang Peng wrote:
> Dear listers,
> 
>   Hi .
> 
>   I googled before I ask this question in case to avoid violating the  
> "list law". The question was proposed by somebody else but nobody  
> gives a proper solution.
> 
>   How to ignore errors in a R loop? I have a R source file ,namely,  
> test.r.
>   I ran a loop in a command line
> 
>   > for(i in 1 : 100 ) {source("test.r",echo =T ) }
> 
>   and I found that when test.r have some error the loop will stop, but  
> test.r is just stochastically wrong,i.e., sometimes it works well if  
> no bad values return.
>   I want to know how to ignore such errors and let the loop continue.
> 
> thanks in advance.
> ___
> 

Jiang Peng,

The default solution is using try command, something like this

for(i in 1 : 100 ) {
try(source("test.r",echo =T ))
}

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help for code in jump diffusion

2008-11-22 Thread Bernardo Rangel Tura
On Fri, 2008-11-21 at 22:37 -0800, subbudas wrote:
> hello everyone ,
> i have written some code in R for jump diffusion model.
> the code generates answer as 
> " NaN 
> There were 50 or more warnings (use warnings() to see the first 50)"
> my code is 
> 
> mu<-0.2
> sig<-0.2
> S0<-100
> j<-0.2
> dt<-1/252
> int<-0.1
> i<-0
> while(i<=1)
> {
>   is.nan
>   k<-rnorm(1,0,1)
>   theta<-ifelse((k<(int*dt)),1,0)
>   m<-rnorm(1)
>   gam<-qnorm(m,0,1)
>   S0<-abs(S0*((1+mu*dt+sig*sqrt(dt)+ gam)- j*theta))
>   if(!is.nan (S0 <= 0))
>  warning("S0 must be positive")
>  cat("NaN","\n")
>   cat(S0,"\n")
>   i<-i+(1/252)
> }
> 
> the problem i am facing is i am not able to find out the reason for this NaN
> output.
> please help 
> 
> thanks in advance.


Hi 

I see two problems in your script:

1- m<-rnorm(1) will produce a random number with distribution normal
mean =0 and sd=1 so m can >1 or <0
In this cases gam<-qnorm(m,0,1) is NAN because m is not a probability

2- I think  if(!is.nan (S0 <= 0)) ...  is wrong 


Try this script :


mu<-0.2
sig<-0.2
S0<-100
j<-0.2
dt<-1/252
int<-0.1
i<-0
while(i<=1){
k<-rnorm(1,0,1)
theta<-ifelse((k<(int*dt)),1,0)
#   m<-rnorm(1)
m<-runif(1,0,1)
    gam<-qnorm(m,0,1)
S0<-abs(S0*((1+mu*dt+sig*sqrt(dt)+ gam)- j*theta))
if(!is.nan(S0)&&(S0 <= 0)){
warning("S0 must be positive")
cat("NaN","\n")
}
cat(S0,"\n")
i<-i+(1/252)
}

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mismatch in logical result?

2008-11-07 Thread Bernardo Rangel Tura
On Fri, 2008-11-07 at 15:53 +0530, Shubha Vishwanath Karanth wrote:
> 
> Hi R,
> 
>  
> 
> I have certain checkings, which gives FALSE, but actually it is true. Why 
> does this happen? Note that the equations that I am checking below are not 
> even the case of recurring decimals...
> 
>  
> 
> > 1.4^2 == 1.96
> 
> [1] FALSE
> 
>  
> 
> > 1.2^3==1.728
> 
> [1] FALSE


Shubha

the correct answer for us is TRUE for the computer is FALSE

1.4^2-1.96
[1] -2.220446e-16

1.2^3-1.728
[1] -2.220446e-16

but if you use "all.equal" 

all.equal(1.4^2,1.96)
[1] TRUE

all.equal(1.2^3,1.728)
[1] TRUE


More details in R FAQ 7.31





-- 
[]s
Tura

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to plot with different colours

2008-11-02 Thread Bernardo Rangel Tura
On Sun, 2008-11-02 at 07:58 +, Carlos Morales wrote:
> Hello everyone,
> 
> I'm trying to plot 3600 points and my idea is if this value is higher than 
> 0.35 then this point must appear in green colour, if it's smaller than -0.35 
> then values must appear in red and if values are between -0.35 and 0.35 they 
> must be in yellow. I'm thinking and I'm trying many things but I don't 
> achieve it. Any idea?.
> 
> Thanks so much
> Carlos Morales Diego

Hi Carlos

I think you need use a nested ifelse, something similar this

x<-runif(6000,-1,1)
color<-ifelse(x>.35,"green",ifelse(x< -.35,"red","yellow"))
table(col)
plot(1:6000,x,col=color)

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group sequential analysis - stopping for futility

2008-10-30 Thread Bernardo Rangel Tura
Em Ter, 2008-10-28 às 08:30 -0700, [EMAIL PROTECTED] escreveu:
> Hello - 
> 
> I am wondering if anyone has written some R code to calculate futility 
> bounds for a group sequential analysis of clinical trials data.  The 
> library ldbounds has a function 'bounds'  which calculates the 
> effectiveness stopping bounds for various spending functions,  but it does 
> not appear to do the same for futility (sometimes called the 'inner 
> wedge').
> 
> Thanks for any assistance!
> 
> -Alice


Hi alice, 

I have interest in develop this code with you .

Do you have a paper about this?
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installing R in Linux: problems with JAVA packages (rJava, RWeka, ...) ?

2008-10-21 Thread Bernardo Rangel Tura
Em Ter, 2008-10-21 às 15:36 +0100, Paulo Cortez escreveu:
> Hi,
> 
> While in MacOS it is quite simple to install R and Java packages, the 
> same is not true for Linux. I surfed the web and it seems that other 
> users also have similar problems. Perhaps a nice FAQ answer or HOWTO 
> would help...
> 
> But here is my situation: I have 2 linux servers, one with Fedora 9 and 
> the other with CentOS5.
> 
> I have installed R (2.7.2 version, with yum) and I tryed to install the 
> RWeka and rJava packages.
> 
> After receiving an error, I installed jdk1.6 from java (file:
> jdk-6u10-linux-i586-rpm.bin?AuthParam=1224512972_e5a9932e886a02f44dfdfe48aad02db8&TicketId=B%2Fw2nBuFSltLQRRFM1JblgDk&GroupName=CDS&FilePath=%2FESD5%2FJSCDL%2Fjdk%2F6u10%2Fjdk-6u10-linux-i586-rpm.bin&File=jdk-6u10-linux-i586-rpm.bin)
> in both machines.
  (...)

Hi Paulo

I think you need install JDK and JRE for using rJava


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] about "granova" library

2008-10-20 Thread Bernardo Rangel Tura
Em Qui, 2008-10-16 às 21:11 +1030, Fernando Marmolejo Ramos escreveu:
> Dear all
> 
> Recently the "granova" package was launched. I installed but after when I
> invoked it in R it requested for other libraries. They were downloaded and
> install automatically.
> 
> I tried to run the example syntax of “granova.1w” and “granova.2w” but two
> things happened: i) either a file called “granova.rdb” wasn’t existent or ii)
> the GUI clashed and R shut down.
> 
> Has anyone else experience this? Do the developers have an answer for this
> troubleshot?
> 
> I’m using a Windows Vista system and I have the R version 2.7.2.
> 
> Cheers,
> 
> Fer

Fernando


I using R version 2.7.2 and Ubuntu 8.04 in my computer:

granova.1w - runs fine 

granova.2w - don't run fine, actual only 1 of 2 graphical windows apear
a plot (rgl surface)



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving results of Kruskal Walis test

2008-10-16 Thread Bernardo Rangel Tura
Em Qui, 2008-10-16 às 22:31 +0200, Himanshu Ardawatia escreveu:
> Hello,
> 
> I am running Kruskal-Walis test in R. When I try to save results using
> write.table it gives me the following error :
> 
> Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
> stringsAsFactors) :
>   cannot coerce class "htest" into a data.frame
> 
> The overall code is as follows :
> 
> >data_file = read.table("~/DATA.dir/data_file.txt", header=T)
> 
> >attach(data_file)
> 
> >data_file.out <- krukal.test(data_file)
> 
> >write.table(data_file.out, "~/DATA/results/data_file_out.txt")
> 
> Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
> stringsAsFactors) :
>   cannot coerce class "htest" into a data.frame
> Results do come in data_file.out after analysis as seen below:
> 
> >data_file.out
> 
> 
> Kruskal-Wallis rank sum test
> 
> data:  value by pathway
> Kruskal-Wallis chi-squared = 5.6031, df = 3, p-value = 0.1326
> 
> 
> I am wondering if I am making a mistake with using write.table (It works
> very well saving results from anova analysis) or is there any other way to
> save results in a file for future use..
> 
> Thanks
> Himanshu


Hi Himanshu 

Well the output of htests is a list so data_file.out is a lista to.

You don't put a list ins a data.frame so you need make this


data.frame(unlist(data_file.out))

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help about how can R compute AIC?

2008-10-15 Thread Bernardo Rangel Tura
Em Ter, 2008-10-14 às 17:13 +0200, Arnau Mir Torres escreveu:
> Hello.
> 
> I need to know how can R compute AIC when I study a regression model?
> For example, if I use these data:
>growth tannin
> 1 12  0
> 2 10  1
> 3  8  2
> 4 11  3
> 5  6  4
> 6  7  5
> 7  2  6
> 8  3  7
> 9  3  8
> and I do
> model <- lm (growth ~ tannin)
> AIC(model)
> 
> R responses:
> 38.75990
> 
> I know the following formula to compute AIC:
> AIC= -2*log-likelihood + 2*(p+1)
> 
> In my example, it would be:
> AIC=-2*log-likelihood + 2*2
> but I don't know how R computes log-likelihood:
> 
> logLik(model)
> 'log Lik.' -16.37995 (df=3)

Arnau,

LogLik= -16.37995

AIC= -2*log-likelihood + 2*(p+1)

AIC=-2*-16.37995 + 2*(p+1)

AIC= 32.7599+2*(p+1)

#
# this is very important the model have two
# parameter, because sigma is a parameter to.
# so 
#

AIC= 32.7599+2*(2+1) 

AIC= 32.7599+6

AIC= 38.7599
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] legend

2008-10-15 Thread Bernardo Rangel Tura
Em Ter, 2008-10-14 às 18:44 -0700, Lavan escreveu:
> Hi R users,
> 
> I'm trying to have the symbols for sigma[1] in my legend. the code is given
> below, please have a look.
> 
> Thanks,
> 
> lavan

try legend("bottom",legend=expression(sigma[1]),...)

> 
> 
> legend("bottom",legend=paste("Sigma1=", c(0.01,0.1,0.2,0.5,1,1.5,2,4,6,9.5),
> sep=""),
> fill=c("red","green","blue","black","pink","brown","purple","yellow","lightblue","orange"))
> 
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] interpreting Shapiro-Wilks test result

2008-10-11 Thread Bernardo Rangel Tura
Em Qua, 2008-10-08 às 17:41 -0700, Halizah Basiron escreveu:
> Hi all,
>I am newbie in using R software and also doing statistical test. I want to 
> know if my data in in normal distribution. I have 2 groups of data and I did 
> calculate Shapiro Wilks using R software. Here is the results:
> 
> Group 1: W = 0.9206, p-value = 0.01683
> Group 2: W = 0.9626, p-value = 0.4694
> 
> I am not quite sure what default confidence level (CF) is used in calculating 
> Shapiro Wilks. Else, may I choose my own CF. Let say, if I choose CF = 0.01, 
> therefore I may approve that Group 1 and Group 2 data are normal. If not, 
> what should I do. My next plan is to apply T Test for both groups (assuming 
> both are normal). I really need advice from experts here.
> 
> Cheers,
> Halizah

Hi Halizah,

In my opinion don't exist confidence level of test. What exist is
confidence level of your experiment.

If you plan your experiment with alpha probability you have
100*(1-alpha) confidence level and you p-value need minor than alpha to
named "significant"

So if you need 99% of significance the two groups is normal and you
using t.test. 


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Load a program at the front end

2008-10-06 Thread Bernardo Rangel Tura
Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu:
> I want to run a R program, prog.R,  interactively. My question is, is
> there a way I can start prog.R on the shell terminal when invoking R,
> instead of using source() inside R?
> 
> TIA,
> Gang

Hi Gang

I my system just only type:

 R --no-save https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cannot open connection: Authorization Required

2008-10-05 Thread Bernardo Rangel Tura
Em Sex, 2008-10-03 às 11:22 -0700, Spencer Graves escreveu:
> Hi, All: 
> 
>   Is there a way in R to access a file / web site that requires 
> permission? 
> 
>   Consider for example the following: 
> 
> 
>  > readLines('http://www.r-project.org/', 4)
> [1] ""
> [2] ""  
> [3] ""  
> [4] "The R Project for Statistical Computing"  
> 
>  > readLines(URL) # URL = web address, which I can see via manual access. 
> Error in file(con, "r") : cannot open the connection
> In addition: Warning message:
> In file(con, "r") :
>   cannot open: HTTP status was '401 Authorization Required'
> 
> 
>   Thanks,
>   Spencer Graves

Spencer,  this is a local problem in my system

> 
> >  readLines('http://www.r-project.org/')
>  [1] ""  
>  
>  [2] "" 
>  
>  [3] "" 
>  
>  [4] "The R Project for Statistical Computing" 
>  
>  [5] "" 
>  
>  [6] " type=\"image/x-icon\">" 
>  [7] "" 
>  
>  [8] ""
>  
>  [9] ""   
>  
> [10] ""
>  
> [11] ""
>  
> [12] ""  
>  
> [13] ""
>  
> [14] ""
>  
> [15] ""   
>  
> [16] "" 
>  
> [17] "The R Project for Statistical Computing"   
>  
> [18] ""   
>  
> [19] "Your browser seems not to support frames,"  
>  
> [20] "here is the contents page of the R 
> Project's"
> [21] "website."       
>  
> [22] ""
>  
> [23] ""
>  
> [24] ""   
>  
> [25] ""   
>  
> [26] ""   
>  

I run R.2.7.2 in Ubuntu AMD 64 machine


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logistic regression problem

2008-10-01 Thread Bernardo Rangel Tura
Em Ter, 2008-09-30 às 18:56 -0500, Frank E Harrell Jr escreveu:
> Bernardo Rangel Tura wrote:
> > Em Sáb, 2008-09-27 às 10:51 -0700, milicic.marko escreveu:
> >> I have a huge data set with thousands of variable and one binary
> >> variable. I know that most of the variables are correlated and are not
> >> good predictors... but...
> >>
> >> It is very hard to start modeling with such a huge dataset. What would
> >> be your suggestion. How to make a first cut... how to eliminate most
> >> of the variables but not to ignore potential interactions... for
> >> example, maybe variable A is not good predictor and variable B is not
> >> good predictor either, but maybe A and B together are good
> >> predictor...
> >>
> >> Any suggestion is welcomed
> > 
> > 
> > milicic.marko
> > 
> > I think do you start with a rpart("binary variable"~.)
> > This show you a set of variables to start a model and the start set to
> > curoff  for continous variables
> 
> I cannot imagine a worse way to formulate a regression model.  Reasons 
> include
> 
> 1. Results of recursive partitioning are not trustworthy unless the 
> sample size exceeds 50,000 or the signal to noise ratio is extremely high.
> 
> 2. The type I error of tests from the final regression model will be 
> extraordinarily inflated.
> 
> 3. False interactions will appear in the model.
> 
> 4. The cutoffs so chosen will not replicate and in effect assume that 
> covariate effects are discontinuous and piecewise flat.  The use of 
> cutoffs results in a huge loss of information and power and makes the 
> analysis arbitrary and impossible to interpret (e.g., a high covariate 
> value:low covariate value odds ratio or mean difference is a complex 
> function of all the covariate values in the sample).
> 
> 5. The model will not validate in new data.

Professor Frank,

Thank you for your explain.

Well, if my first idea is wrong what is your opinion on the following
approach?

1- Make PCA with data excluding the binary variable
2- Put de principal components in logistic model
3- After revert principal componentes in variable (only if is
interesting for milicic.marko)

If this approach is wrong too what is your approach?
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Logistic regression problem

2008-09-30 Thread Bernardo Rangel Tura
Em Sáb, 2008-09-27 às 10:51 -0700, milicic.marko escreveu:
> I have a huge data set with thousands of variable and one binary
> variable. I know that most of the variables are correlated and are not
> good predictors... but...
> 
> It is very hard to start modeling with such a huge dataset. What would
> be your suggestion. How to make a first cut... how to eliminate most
> of the variables but not to ignore potential interactions... for
> example, maybe variable A is not good predictor and variable B is not
> good predictor either, but maybe A and B together are good
> predictor...
> 
> Any suggestion is welcomed


milicic.marko

I think do you start with a rpart("binary variable"~.)
This show you a set of variables to start a model and the start set to
curoff  for continous variables
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] birthday problem (factorial limit)

2008-09-30 Thread Bernardo Rangel Tura
Em Dom, 2008-09-28 às 19:43 +0200, Jörg Groß escreveu:
> Hi,
> 
> I tried to calculate the formula for the birthday problem
> (the probability that at least two people out of a group of n people  
> share the same birthday)
> 
> But the factorial-function allows me only to calculate factorials up  
> to 170.
> 
> 
> So is there a way to push that limit?
> 
> to solve this formula:
> 
> (factorial(365) / factorial((365-23))) / (365^23)
> 
> (n=23)

Log experession

n<-23
exp(sum(log(1:365))-sum(log(1:(365-n)))-n*log(365))

[1] 0.4927028


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >