[R] Questions concerning function 'svm' in e1071 package

2006-07-02 Thread Van Kerckhoven, Johan
Greetings everyone,

I have the following problem (illustrating R-code at bottom of mail):
Given a training sample with binary outcomes (-1/+1), I train a linear
Support Vector Machine to separate them. Afterwards, I compute the
weight vector w in the usual way, and obtain the fitted values as
w'x + b > 0  ==>  yfitted = 1, otherwise -1.

However, upon verifying with the 'predict' method, the outcomes do not
match up as they should. I've already tried to find information
concerning this issue on the R-help board, but to no avail. Can any of
you point me in the right direction?

Signed,

Johan Van Kerckhoven
ORSTAT and University Center of Statistics
Katholieke Universiteit Leuven

--

#initialization of the problem

rm(list=ls())

library(e1071)

set.seed(2)

n = 50
d = 4
p = 0.5

x = matrix(rnorm(n*d), ncol=d)

mushift = c(1, -1, rep(0, d-2))

y = runif(n) > p
y = factor(2*y - 1)

x = x - outer(rep(1, n), mushift)
x[y == 1, ] = x[y == 1, ] + 2*outer(rep(1, sum(y == 1)), mushift)

svclass = svm(x, y, scale=FALSE, kernel="linear")

#Computation of the weight vector

w = t(svclass$coefs) %*% svclass$SV
if (y[1] == -1) {
   w = -w
}

#Derivation of predicted class labels

#Using method in documentation
yfit = (x %*% t(w) + svclass$rho) > 0
yfit = factor(2*yfit - 1)

#Extracting them directly from the model
yfit2 = svclass$fitted

#Display where predictions differ from each other
yfit != yfit2

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Problem with try()

2006-07-02 Thread Landsman Leonid
Dear R-experts, 
I am running a large simulation exercise that requires a fairly complicated 
integration. 

The integral is computed within a C-function called Denom by use of function 
qags from the gsl library. 

Here is a piece of R-code: 

denom <- try(.C("Denom", as.double(x), as.integer(n), as.integer(p),
                as.double(param), as.double(delta), res = as.double(results)))
# inherits() also works when the object carries more than one class
denomres <- if (inherits(denom, "try-error")) NA else denom$res


Sometimes the integration fails with the following 
error message:

gsl: qags.c:553: ERROR: bad integrand behavior found in the integration interval
Default GSL error handler invoked.


and the whole simulation job is destroyed. My question is: why does try() not 
work here, and how can I fix this problem? 
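
A note on the R side of this: try() only intercepts errors raised through R's own error mechanism. The "Default GSL error handler invoked" line suggests the process is being aborted inside the C library, in which case nothing is left for try() to catch; the usual remedy is to switch the handler off in the C code (GSL provides gsl_set_error_handler_off()) and pass a status flag back to R. The sketch below shows only the R-side pattern. It uses a hypothetical pure-R stand-in, risky_integral(), in place of the real Denom routine, so it is an illustration under that assumption rather than a fix for the C code.

```r
# Hypothetical stand-in for the .C("Denom", ...) call: fails for negative input.
risky_integral <- function(x) {
  if (x < 0) stop("bad integrand behavior")
  sqrt(x)
}

safe_integral <- function(x) {
  res <- try(risky_integral(x), silent = TRUE)
  # inherits() is safer than class(res) == "try-error"
  if (inherits(res, "try-error")) NA_real_ else res
}

results <- sapply(c(4, -1, 9), safe_integral)
print(results)
```

An R-level error in the second call is turned into NA and the loop carries on, which is the behaviour the original code was hoping for.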

Much thanks, 

Leonid Landsman. 



[R] Query : Chi Square goodness of fit test

2006-07-02 Thread priti desai
Suppose we have the fraud data given below, where "variable" holds the
number of frauds:

variable <- c(4,1,6,9,9,10,2,4,8,2,3,0,1,2,3,1,3,4,5,4,4,4,9,5,4,3,11,8,
              12,3,10,0,7)

pmf  <- dpois(i, lambda, log = FALSE)  # prob. mass function of variable

How do I apply a chi-square goodness-of-fit test to check whether the
sample comes from a Poisson distribution?
How do I calculate the observed and expected frequencies, and after that
how do I compute the chi-square test and interpret the result?



The formula which I have used & answer which I am getting is as follows,


chisq.test(variable, p=pmf, simulate.p.value =FALSE, correct = FALSE)



Chi-squared test for given probabilities

data:  No_of_Frouds 
X-squared = 1.043111e+15, df = 32, p-value < 2.2e-16

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(No_of_Frouds,
p = pmf, simulate.p.value = FALSE, correct = FALSE) 
 


But the answer is not correct. 

Please suggest me the correct variable, calculations & formula in R.
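
One possible route, sketched under assumptions: chisq.test(variable, p = pmf) treats the 33 raw values as if they were counts, which is why the statistic is so large. The usual approach is to tabulate how often each value occurs, compare those counts with Poisson probabilities for the same bins, and reduce the degrees of freedom by one because lambda is estimated from the data. The bin boundaries below (0-2, 3-4, 5-6, 7-8, 9 or more) are my own choice for illustration, not part of the question, and the names lambda_hat, upper and p_bins are likewise made up here.

```r
frauds <- c(4,1,6,9,9,10,2,4,8,2,3,0,1,2,3,1,3,4,5,4,4,4,9,5,4,3,11,8,
            12,3,10,0,7)

lambda_hat <- mean(frauds)   # Poisson rate estimated from the sample

# Assumed bins: 0-2, 3-4, 5-6, 7-8, 9 or more
upper <- c(2, 4, 6, 8)
observed <- as.vector(table(cut(frauds, breaks = c(-Inf, upper, Inf))))

# Bin probabilities under Poisson(lambda_hat); the last bin takes the tail
p_bins <- diff(c(0, ppois(upper, lambda_hat), 1))
expected <- length(frauds) * p_bins

gof <- chisq.test(observed, p = p_bins)

# chisq.test() uses (bins - 1) degrees of freedom; one more is lost
# because lambda was estimated from the same data.
df_adj <- length(observed) - 2
p_value <- pchisq(unname(gof$statistic), df = df_adj, lower.tail = FALSE)

print(rbind(observed, expected = round(expected, 2)))
print(c(statistic = unname(gof$statistic), df = df_adj, p.value = p_value))
```

With only 33 observations some expected counts are small, so R may still warn that the chi-squared approximation is doubtful; wider bins or simulate.p.value = TRUE are the usual ways to deal with that.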

 Awaiting your positive reply.
 
  Regards,
  Priti.



Re: [R] Data Manipulations - Group By equivalent

2006-07-02 Thread ronggui

Using the doBy package would be easier.

# GENERATE A TREATMENT GROUP #
group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));
# CREATE A SERIES OF RANDOM VALUES #
x<-rnorm(length(group));
# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
data<-data.frame(group, x);
library(doBy)
summ2<-summaryBy(x~group,data=data,FUN=c(mean,sum),na.rm=T,prefix=c("mean","sum"))
combine2<-merge(data,summ2)

Ronggui


2006/7/2, Wensui Liu <[EMAIL PROTECTED]>:

Zubin,

I bet you are working for intercontinental hotels and think you probably are
not the real Zubin there. right? ^_^. If you have chance, could you please
say hi to him for me?

Here is a piece of R code I copy from my blog side by side with SAS. You
might need to tweak it a little to get what you need.

 CALCULATE GROUP SUMMARY IN R
##
# HOW TO CALCULATE GROUP SUMMARY IN R #
# DATE : DEC-13, 2005 #
##
# EQUIVALENT SAS CODE: #
# #
# DATA DATA; #
# DO I = 1 TO 2; #
# DO J = 1 TO 4; #
# GROUP = 'TREATMENT_'||PUT(I, 1.); #
# X = RANNOR(1); #
# OUTPUT; #
# END; #
# END; #
# KEEP GROUP X; #
# RUN; #
# #
# PROC SQL; #
# CREATE TABLE COMBINE AS #
# SELECT *, MEAN(X) AS MEAN_X, SUM(X) AS SUM_X #
# FROM DATA #
# GROUP BY GROUP; #
# QUIT; #
##


# GENERATE A TREATMENT GROUP #
group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));

# CREATE A SERIES OF RANDOM VALUES #
x<-rnorm(length(group));

# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
data<-data.frame(group, x);

# CALCULATE SUMMARY FOR X #
x.mean<-tapply(data$x, data$group, mean, na.rm = T);
x.sum<-tapply(data$x, data$group, sum, na.rm = T);

# CREATE A DATA FRAME TO COMBINE SUMMARIES #
summ<-data.frame(x.mean, x.sum, group = names(x.mean));

# COMBINE DATA AND SUMMARIES TOGETHER #
combine<-merge(data, summ, by = "group");
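
For the hotel question itself (a 2005 average per hotel), a base-R sketch with aggregate() may be closer to the SQL "group by". The data frame hotels and its columns hotel, year, month and occupancy are invented here for illustration; the real data set will have its own names.

```r
# Hypothetical panel: 3 hotels, 2 years, 12 months each
set.seed(1)
hotels <- expand.grid(hotel = paste0("H", 1:3),
                      year  = c(2004, 2005),
                      month = 1:12)
hotels$occupancy <- runif(nrow(hotels), min = 0.4, max = 0.95)

# SQL analogue:
#   SELECT hotel, AVG(occupancy) FROM hotels WHERE year = 2005 GROUP BY hotel
avg_2005 <- aggregate(occupancy ~ hotel,
                      data = subset(hotels, year == 2005),
                      FUN  = mean)
print(avg_2005)
```

The result has one row per hotel, so with 1000 hotels it would hold the 1000 occupancy averages asked for.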


On 7/1/06, zubin <[EMAIL PROTECTED]> wrote:
>
> Hello, a beginner R user - boy i wish there was a book on just data
> manipulations for SAS users learning R (equivalent to the SAS DATA
> STEP)..  Okay, my question:
>
> I have a panel data set, hotel data occupancy by month for 12 months,
> 1000 hotels.  I have a field labeled 'year' and want to consolidate the
> monthly records using an average into 1000 occupancy numbers - just a
> simple average of the 12 months by hotel.  In SQL this operation is
> pretty easy, a group by query (group by hotel where year = 2005, avg
> occupancy) - how is this done in R? (in R language not SQL).  Thx!
>
> -zubin
>
>



--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center





--
黄荣贵
Department of Sociology
Fudan University


Re: [R] how to recode in my dataset?

2006-07-02 Thread ronggui

I always use the "recode" function (in the car package) to recode
variables. It works well and I like that function.

2006/7/2, zhijie zhang <[EMAIL PROTECTED]>:

Dear Rusers,
 My question is about "recode variables". First, i'd like to say
something about the idea of recoding:
 My dataset have three variables:type,soiltem and airtem,which means
grass type, soil temperature and air temperature. As we all known, the
change of air temperature is greater than soil temperature,so the
values in those two different temperaturemay represent different
range.
 My recoding is to recode soiltem with 0.2 intervals, and airtem with
0.5 intervals, that is:
In soiltem:0~0.2<-0.1,  0.2~0.4<-0.3, 0.4`0.6<-0.5,...etc;
In airtem:0~0.5<-0.25,  0.5~1<-0.75, 1`1.5<-1.25,...etc;
My example like this:
type<-c(1, 1, 2, 3,4,1,1,4,3,2)
soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
mydata<-data.frame(type,soiltem,airtem) #copy the above four arguments to generate the dataset

mydata
   type soiltem airtem
1     1    19.2   19.9
2     1    18.6   20.5
3     2    19.5   21.6
4     3    19.8   25.6
5     4    19.6   22.6
6     1    20.6   21.3
7     1    19.1   23.7
8     4    18.7   21.5
9     3    22.4   24.7
10    2    21.6   24.4

Thanks very much!
--
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149





--
黄荣贵
Department of Sociology
Fudan University


Re: [R] how to get the studentized residuals in lm()

2006-07-02 Thread ronggui

help.search("studentized")


You will see:
studres(MASS)   Extract Studentized Residuals from a Linear Model
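
Besides studres() in MASS, base R's stats package has rstandard() and rstudent(), so no extra package is strictly needed. A minimal sketch on the built-in cars data (the model itself is only an example, not the poster's):

```r
fit <- lm(dist ~ speed, data = cars)   # 'cars' ships with R

r_int <- rstandard(fit)   # internally studentized (standardized) residuals
r_ext <- rstudent(fit)    # externally studentized (leave-one-out variance)

print(head(cbind(raw = residuals(fit), rstandard = r_int, rstudent = r_ext)))
```

As far as I know, rstudent() corresponds to what MASS::studres() returns, but it is worth checking ?studres before relying on that.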



2006/7/3, zhijie zhang <[EMAIL PROTECTED]>:

Dear friends,
 In S-PLUS, lm() generates the studentized residuals automatically
for us, but in R it does not seem to: after fitting lm(), I used
attributes() to inspect the object and did not find the studentized
residuals.
 How can I get the studentized residuals from lm()? Have I missed something?
Thanks very much!

--
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149





--
黄荣贵
Department of Sociology
Fudan University


Re: [R] panel ordering in nlme and augPred plots

2006-07-02 Thread Nathaniel Derby
Hi,

I'm new at this, I'm very confused, and I think I'm missing something
important here.  In our pet example we have this:

> fm <- lme(Orthodont)
> plot(Orthodont)
> plot(augPred(fm, level = 0:1))

which gives us a trellis plot with the females above the males,
starting with "F03", "F04", "F11", "F06", etc.  I thought the point of
this was to create an ordering where the females are ordered ("F01",
"F02", "F03", etc -- followed by the males being ordered).  However,
the solution given ...

> fm <- lme(Orthodont)
> plot(Orthodont)
> plot(augPred(fm, level = 0:1), skip = rep(c(F,T), c(16, 2)))

... doesn't solve it -- although it does do all the females before
starting on the males.  That is, it starts with "F02", "F08", "F03",
... which isn't in order either.

Running Petr's code also gave output which wasn't ordered by the subjects.

Could someone please explain to me how to order the panels of the
trellis plot by the subjects?


thanks,

Nandor



Re: [R] large dataset!

2006-07-02 Thread miguel manese
Hello Jennifer,

I'm writing a package SQLiteDF for Google SOC2006, under the
supervision of Prof. Bates & Prof. Riley. Basically, it stores data
frame into sqlite databases (i.e. in a file) and aims to be
transparently accessible to R using the same operators for ordinary
data frames.

Right now, it's quite usable (the "indexers" are working, and some
other generic methods), and only for linux (I should have the windows
package any time soon though). I would love to hear about your
requirements so as to test my package.

Cheers,
M. Manese

On 7/3/06, Andrew Robinson <[EMAIL PROTECTED]> wrote:
> Jennifer,
>




[R] how to get the studentized residuals in lm()

2006-07-02 Thread zhijie zhang
Dear friends,
 In S-PLUS, lm() generates the studentized residuals automatically
for us, but in R it does not seem to: after fitting lm(), I used
attributes() to inspect the object and did not find the studentized
residuals.
 How can I get the studentized residuals from lm()? Have I missed something?
Thanks very much!

-- 
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149



Re: [R] large dataset!

2006-07-02 Thread Andrew Robinson
Jennifer,

it sounds like that's too much data for R to hold in your computer's
RAM. You should give serious consideration as to whether you need all
those data for the models that you're fitting, and if so, whether you
need to do them all at once.  If not, think about pre-processing
steps, using e.g. SQL commands, to pull out the data that you need. For
example, if the data are spatial, then think about analyzing them by
patches.  
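
If a database is not at hand, one way to "not do them all at once" is to read the file in chunks and keep only a running summary or the rows of interest. This is a rough sketch under assumptions: the file is a plain comma-separated text file with a header line, and the tiny temporary file, the chunk size of 4 and the column names id and value are all invented for the demonstration.

```r
# Build a small stand-in file; the real data file would be used instead
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = (1:10) * 2), path,
          row.names = FALSE, quote = FALSE)

con <- file(path, open = "r")
col_names <- strsplit(readLines(con, n = 1), ",")[[1]]   # header line

total <- 0
rows_seen <- 0
repeat {
  lines <- readLines(con, n = 4)        # next chunk of raw lines
  if (length(lines) == 0) break         # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = col_names)
  total <- total + sum(chunk$value)     # keep only the summary, drop the chunk
  rows_seen <- rows_seen + nrow(chunk)
}
close(con)

print(c(rows = rows_seen, total = total))
```

Only one chunk is in memory at a time, so the chunk size can be tuned to the available RAM.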

Good luck,

Andrew

On Sun, Jul 02, 2006 at 10:12:25AM -0400, JENNIFER HILL wrote:
> 
> Hi, I need to analyze data that has 3.5 million observations and
> about 60 variables and I was planning on using R to do this but
> I can't even seem to read in the data.  It just freezes and ties
> up the whole system -- and this is on a Linux box purchased about
> 6 months ago on a dual-processor PC that was pretty much the top
> of the line.  I've tried expanding the R memory limits but it 
> doesn't help.  I'll be hugely disappointed if I can't use R b/c
> I need to build tailor-made models (multilevel and other 
> complexities).   My fall-back is the SPlus big data package but
> I'd rather avoid it if anyone can provide a solution.
> 
> Thanks
> 
> Jennifer Hill
> 

-- 
Andrew Robinson
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED]                            http://www.ms.unimelb.edu.au



Re: [R] how to recode in my dataset?

2006-07-02 Thread Dimitrios Rizopoulos
probably ?cut() is what you're looking for, e.g., something like:

ind <- cut(mydata$soiltem, seq(0, 60, 0.2), labels = FALSE)
seq(0.1, 60, 0.2)[ind]
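
The same idea carries over to airtem with 0.5-wide intervals. The sketch below recomputes the midpoints from the breaks instead of building a second seq(); the names width, breaks and airtem_mid are mine. Note that cut() uses right-closed intervals by default, so 20.5 lands in (20, 20.5] and is recoded to 20.25.

```r
airtem <- c(19.9, 20.5, 21.6, 25.6, 22.6, 21.3, 23.7, 21.5, 24.7, 24.4)

width  <- 0.5
breaks <- seq(0, 60, by = width)

# Interval index for each value, then the midpoint of that interval
ind <- cut(airtem, breaks, labels = FALSE)
airtem_mid <- (breaks[-length(breaks)] + width / 2)[ind]

print(data.frame(airtem, airtem_mid))
```

With a width of 0.2 the breaks are not exactly representable in floating point, so a value sitting right on a boundary (19.2, say) may fall on either side; that is worth checking if boundary cases matter.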


I hope it helps.

Best,
Dimitris

 
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


Quoting zhijie zhang <[EMAIL PROTECTED]>:

> Dear Rusers,
>  My question is about "recode variables". First, i'd like to say
> something about the idea of recoding:
>  My dataset have three variables:type,soiltem and airtem,which means
> grass type, soil temperature and air temperature. As we all known,
> the
> change of air temperature is greater than soil temperature,so the
> values in those two different temperaturemay represent different
> range.
>  My recoding is to recode soiltem with 0.2 intervals, and airtem
> with
> 0.5 intervals, that is:
> In soiltem:0~0.2<-0.1,  0.2~0.4<-0.3, 0.4`0.6<-0.5,...etc;
> In airtem:0~0.5<-0.25,  0.5~1<-0.75, 1`1.5<-1.25,...etc;
> My example like this:
> type<-c(1, 1, 2, 3,4,1,1,4,3,2)
> soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
> airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
> mydata<-data.frame(type,soiltem,airtem) #copy the above four arguments to generate the dataset
> 
> mydata
>    type soiltem airtem
> 1     1    19.2   19.9
> 2     1    18.6   20.5
> 3     2    19.5   21.6
> 4     3    19.8   25.6
> 5     4    19.6   22.6
> 6     1    20.6   21.3
> 7     1    19.1   23.7
> 8     4    18.7   21.5
> 9     3    22.4   24.7
> 10    2    21.6   24.4
> 
> Thanks very much!
> -- 
> Kind Regards,
> Zhi Jie,Zhang ,PHD
> Department of Epidemiology
> School of Public Health
> Fudan University
> Tel:86-21-54237149
> 
> 
> 




Re: [R] curiosity question: new graphics vs. old graphics subsystem

2006-07-02 Thread Mihai Nica
Well, as a newbee, I believe your idea is great. However, the R Core team is, 
in my humble opinion, way too stretched (for a free software development team) 
to do this. A complementary development team (similar to, say, the Tinn-R team) 
might be able to address this issue. I wish I would have the skills to 
contribute :-) Just my 2c.

The least of learning is done in the classrooms.
  - Thomas Merton


> Date: Sun, 2 Jul 2006 09:34:39 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] curiosity question: new graphics vs. old graphics subsystem
> 
> hi mihai:  it is more likely that the developers will take this more
> seriously if you echo my concern on r-help itself.  regards, /iaw
> 
> On 7/1/06, Mihai Nica <[EMAIL PROTECTED]> wrote:
> >
> > Wow, this is what I would say if I knew how to say it :-) For newbees
> > (such as myself) or those who lack programming expertise (and, why not, for
> > those not interested in programming) this approach would be great.
> > mihai
> >
> > 


[R] large dataset!

2006-07-02 Thread JENNIFER HILL

Hi, I need to analyze data that has 3.5 million observations and
about 60 variables and I was planning on using R to do this but
I can't even seem to read in the data.  It just freezes and ties
up the whole system -- and this is on a Linux box purchased about
6 months ago on a dual-processor PC that was pretty much the top
of the line.  I've tried expanding the R memory limits but it 
doesn't help.  I'll be hugely disappointed if I can't use R b/c
I need to build tailor-made models (multilevel and other 
complexities).   My fall-back is the SPlus big data package but
I'd rather avoid it if anyone can provide a solution.

Thanks

Jennifer Hill



Re: [R] nlme: correlation structure in gls and zero distance

2006-07-02 Thread Patrick Giraudoux


Joris De Wolf wrote:
> Have you tried to define 'an' as a group? Like in
>
> gls(IKAfox~an,correlation=corExp(2071,form=~x+y|an,nugget=1.22),data=renliev) 
>
>
> A small data set might help to explain the problem.
>
> Joris
Thanks. Seems to work with a small artificial data set:


an<-as.factor(rep(2001:2004,each=10))
x<-rep(rnorm(10),times=4)
y<-rep(rnorm(10),times=4)
IKA<-rpois(40,2)
site<-as.factor(rep(letters[1:10],times=4))


library(nlme)

mod1<-gls(IKA~an-1,correlation=corExp(form=~x+y))

 >Error in getCovariate.corSpatial(object, data = data) :
Cannot have zero distances in "corSpatial"


mod2<-gls(IKA~an-1,correlation=corExp(form=~x+y|an))

 > mod2
Generalized least squares fit by REML
  Model: IKA ~ an - 1
  Data: NULL
  Log-restricted-likelihood: -73.63998

Coefficients:
  an2001   an2002   an2003   an2004
1.987611 2.454520 2.429907 2.761011

Correlation Structure: Exponential spatial correlation
 Formula: ~x + y | an
 Parameter estimate(s):
range
0.4304012
Degrees of freedom: 40 total; 36 residual
Residual standard error: 1.746205





>
> Joris
>
> Patrick Giraudoux wrote:
>> Dear listers,
>>
>> I am trying to model the distribution of  fox density over years  in 
>> the Doubs department. Measurements have been taken on 470 plots in 
>> March each year and georeferenced. Average density is supposed to be 
>> different each year.
>>
>> In a first approach, I would like to use a general model of this 
>> type, taking spatial correlation into account:
>>
>> gls(IKAfox~an,correlation=corExp(2071,form=~x+y,nugget=1.22),data=renliev) 
>>
>>
>> but I get
>>
>>  > 
>> gls(IKAfox~an,correlation=corExp(2071,form=~x+y,nugget=1.22),data=renliev) 
>>
>> Error in getCovariate.corSpatial(object, data = data) :
>> Cannot have zero distances in "corSpatial"
>>
>> I understand that the 470 geographical coordinates are repeated three 
>> times (measurement are taken each of the three years at the same 
>> place) which obviously cannot be handled there.
>>
>> Does anybody know a way to work around that except jittering slightly 
>> the geographical coordinates?
>>
>> Thanks in advance,
>>
>> Patrick
>>
>
>


Re: [R] send output to printer

2006-07-02 Thread Uwe Ligges
Matthias Braeunig wrote:
> It has to be a simple thing, but I could not figure it out:
> 
> How do I send the text output from object x to the printer?
> As a shell user I would expect a pipe to the printer... "|kprinter" or
> "|lpr -Pmyprinter" somehow. And yes, I'm on Linux.

I think capture.output() helps to send stuff to a connection.
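
A small sketch of that idea: capture.output() turns what print() would show into lines of text, and writeLines() sends them to any connection, including a pipe. The helper name send_to() is made up, and the demonstration writes to a temporary file so that it runs without a printer; the commented lines show how the pipe from the question would be plugged in.

```r
# Write the printed form of an object to an open connection
send_to <- function(obj, con) {
  writeLines(capture.output(print(obj)), con)
}

x <- summary(cars)

# Demonstration with a temporary file instead of a printer
out <- tempfile(fileext = ".txt")
con <- file(out, open = "w")
send_to(x, con)
close(con)
print(readLines(out))

# On Linux, with the printer command from the question (adjust as needed):
# con <- pipe("lpr -Pmyprinter", open = "w")
# send_to(x, con)
# close(con)
```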

Uwe Ligges


> Thanks!
> 



Re: [R] problems with simple statistical procedures

2006-07-02 Thread Uwe Ligges
Thomas Preuth wrote:
> Hello,
> 
> I use an imported dataframe and want to extract the mean value for one 
> column.
> after typing "mean (rae.df$VOL_DEP)" I receive
> "[1] NA
> Warning message:
> argument is not numeric or logical: returning NA in: 
> mean.default("rae.df$POINT_Y_CH") "

Well,
   rae.df$VOL_DEP != "rae.df$POINT_Y_CH"

I think this is really strange. Are you sure this is the exact call and 
its output? If so, please tell us the output of
   str(rae.df)
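
For comparison, here is how the same warning arises when a column that looks numeric has actually been imported as a factor, which str() would reveal. The data frame below is invented for illustration (the "n/a" entry is an assumed cause, not something known about rae.df).

```r
# Hypothetical import result: one stray text entry turns the column into a factor
df <- data.frame(VOL_DEP = c("1.5", "2.5", "n/a"), stringsAsFactors = TRUE)
str(df)   # VOL_DEP is listed as a Factor, not as num

# mean() of a factor returns NA (with the warning quoted above)
bad <- suppressWarnings(mean(df$VOL_DEP))

# Convert via character first; as.numeric() on a factor gives level codes
vol  <- suppressWarnings(as.numeric(as.character(df$VOL_DEP)))
good <- mean(vol, na.rm = TRUE)

print(c(bad = bad, good = good))
```

The quoted string inside mean.default("rae.df$POINT_Y_CH") in the original warning hints at a second possibility: the column name may have been passed in quotes, in which case mean() sees a character string and not the column.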

Uwe Ligges




> But when i look into the dataframe the column is characterized as numeric.
> 
> Sorry for bothering but as a complete newbie I just cannot help myself.
> 
> Greetings,
> thomas
> 



[R] sparse matrix tools

2006-07-02 Thread Thaden, John J
Dear R-Help list:
I'm using the Matrix library to operate on 600 X ~5000 element
unsymmetrical sparse arrays. So far, so good, but if I find I need more
speed or functionality, how hard would it be to utilize other sparse
matrix toolsets from within R, say MUMPS, PARDISO or UMFPACK, that do
not have explicit R interfaces?  More information on these is available
here
   www.cise.ufl.edu/research/sparse/umfpack/ 
   www.computational.unibas.ch/cs/scicomp/software/pardiso
   www.enseeiht.fr/lima/apo/MUMPS/ 
and in these reviews
   ftp://ftp.numerical.rl.ac.uk/pub/reports/ghsNAGIR20051r1.pdf 
   http://www.cise.ufl.edu/research/sparse/codes/ 
neither of which reviewed the R Matrix package, unfortunately.
Thanks,   
- John Thaden, Ph.D., U. Arkansas for Med. Sci., Little Rock.



[R] how to recode in my dataset?

2006-07-02 Thread zhijie zhang
Dear Rusers,
 My question is about recoding variables. First, let me explain the
idea of the recoding:
 My dataset has three variables: type, soiltem and airtem, which are
grass type, soil temperature and air temperature. As we all know, air
temperature varies more than soil temperature, so the values of the two
temperatures may cover different ranges.
 I want to recode soiltem in 0.2 intervals and airtem in 0.5 intervals,
replacing each value by the midpoint of its interval, that is:
In soiltem: 0~0.2 <- 0.1,  0.2~0.4 <- 0.3,  0.4~0.6 <- 0.5, ...etc;
In airtem:  0~0.5 <- 0.25, 0.5~1 <- 0.75,   1~1.5 <- 1.25, ...etc;
My example looks like this:
type<-c(1, 1, 2, 3,4,1,1,4,3,2)
soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
mydata<-data.frame(type,soiltem,airtem) #copy the above four lines to generate the dataset

mydata
   type soiltem airtem
1     1    19.2   19.9
2     1    18.6   20.5
3     2    19.5   21.6
4     3    19.8   25.6
5     4    19.6   22.6
6     1    20.6   21.3
7     1    19.1   23.7
8     4    18.7   21.5
9     3    22.4   24.7
10    2    21.6   24.4

Thanks very much!
-- 
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149



Re: [R] Optional variables in function?

2006-07-02 Thread jim holtman
?missing
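
A minimal sketch of how missing() could be used for the helpme() function from the question. The function body (adding the arguments) is invented purely so that there is something to return.

```r
helpme <- function(a, b, c) {
  # a and b are required: stop with a clear message if either is absent
  if (missing(a) || missing(b)) {
    stop("arguments 'a' and 'b' are required")
  }
  # c is optional: branch on whether the caller supplied it
  if (missing(c)) {
    a + b
  } else {
    a + b + c
  }
}

print(helpme(1, 2))       # c omitted
print(helpme(1, 2, 3))    # c supplied
```

A common alternative is to give the optional argument a default such as c = NULL and test is.null(c), which makes the optional status visible in the function signature.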

On 7/2/06, Jonathan Greenberg <[EMAIL PROTECTED]> wrote:
>
> I'm a bit new to writing R functions and I was wondering what the "best
> practice" for having optional variables in a function is, and how to test
> for optional and non-optional variables?  e.g. if I have the following
> function:
>
> helpme <- function(a,b,c) {
>
>
> }
>
> In this example, I want c to be an optional variable, but a and b to be
> required.  How do I:
> 1) test to see if the user has inputted c
> 2) break out of the function of the user has NOT inputted a or b.
>
> Thanks!
>
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> NRC Research Associate
> NASA Ames Research Center
> MS 242-4
> Moffett Field, CA 94035-1000
> Phone: 415-794-5043
> AIM: jgrn3007
> MSN: [EMAIL PROTECTED]
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?



Re: [R] workaround for numeric problems

2006-07-02 Thread Dimitrios Rizopoulos
I'd compute this in the log-scale (taking also advantage of the 'log' 
and 'log.p' arguments of dnorm() and pnorm(), respectively), and then 
transform back, e.g.,

fn1 <- function(B){
-(pnorm(B) * dnorm(B) * B + dnorm(B)^2)/pnorm(B)^2
}

fn2 <- function(B){
p1 <- dnorm(B, log = TRUE) + log(-B) - pnorm(B, log.p = TRUE)
p2 <- 2 * (dnorm(B, log = TRUE) - pnorm(B, log.p = TRUE))
exp(p1) - exp(p2)
}

fn1(c(-15, -25, -35, -55, -105))
fn2(c(-15, -25, -35, -55, -105))
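
A quick way to see the difference between the two versions is to compare them where the direct formula still works and where it breaks down. The test points -5, -15 and -40 are my own choice; the functions are copied from above so that the snippet runs on its own. Note that fn2() takes log(-B), so it applies only to negative B.

```r
fn1 <- function(B) {
  -(pnorm(B) * dnorm(B) * B + dnorm(B)^2) / pnorm(B)^2
}

fn2 <- function(B) {
  p1 <- dnorm(B, log = TRUE) + log(-B) - pnorm(B, log.p = TRUE)
  p2 <- 2 * (dnorm(B, log = TRUE) - pnorm(B, log.p = TRUE))
  exp(p1) - exp(p2)
}

B <- c(-5, -15, -40)
direct   <- fn1(B)   # underflows to 0/0 for very negative B
logscale <- fn2(B)   # stays finite

print(cbind(B, direct, logscale))
```

At B = -40 both pnorm(B) and dnorm(B) underflow to zero, so the direct version returns NaN, while the log-scale version still gives a value close to -1.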


I hope it helps.

Best,
Dimitris

 
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


Quoting Ott Toomet <[EMAIL PROTECTED]>:

> Dear R-people,
> 
> I have to compute 
> 
> C <- -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2
> 
> This expression seems to be converging to -1 if B approaches to -Inf
> (although I am unable to prove it).  R has no problems until B
> equals
> around -28 or less, where both numerator and denominator go to 0 and
> you get NaN. A simple workaround I did was
> 
> C <- ifelse(B > -25,
>-(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2,
> -1)
> 
> It works well for me (32bit intel/linux platform).  But what about
> other processors/platforms/compilator options?  Are there any better
> ways for finding out at which values the numerical problems start?
> Can one derive something from .Machine$double.eps (but what about
> the
> precison of dnorm and other analytic functions)?
> 
> Thanks in advance,
> Ott
> 
> 
> 




Re: [R] replace values?

2006-07-02 Thread Matthias Braeunig
# reproducing your example
xx<-"x y z
1 2 3
2 3 1
3 2 1
1 1 3
2 1 2
3 2 3
2 1 1"

# you did not tell us the class of your data, assuming data.frame
df<-read.table(textConnection(xx),header=T,colClasses="factor")

# a clean way to do what you want is using factors with ?levels
# (note that data has already been read as factor)
levels(df$x)<-c("a","b","c","d")
levels(df$y)<-c("b","a","c","d")
levels(df$z)<-c("d","c","b","a")

subset(df,x=="a")
  x y z
1 a a b
4 a b b
subset(df,x=="a"&y=="a")
  x y z
1 a a b


HTH, m


zhijie zhang wrote:
> Dear friends,
>   i have a dataset like this:
> x y z
> 1 2 3
> 2 3 1
> 3 2 1
> 1 1 3
> 2 1 2
> 3 2 3
> 2 1 1
> I want to replace x with the following values:1<-a,2<-b,3<-c,4<-d;
>  replace y with the following values:1<-b,2<-a,3<-c,4<-d;
>  replace z with the following values:1<-d,2<-c,3<-b,4<-a;
> Finally,select two subsets:
> 1. if x='a';
> 2.x='a' and y='a';
>  thanks very much!



Re: [R] replace values?

2006-07-02 Thread Jonathan Baron
On 07/02/06 12:39, zhijie zhang wrote:
> Dear friends,
>   i have a dataset like this:
> x y z
> 1 2 3
> 2 3 1
> 3 2 1
> 1 1 3
> 2 1 2
> 3 2 3
> 2 1 1
> I want to replace x with the following values:1<-a,2<-b,3<-c,4<-d;
>  replace y with the following values:1<-b,2<-a,3<-c,4<-d;
>  replace z with the following values:1<-d,2<-c,3<-b,4<-a;

Here's one way.  Call your dataset M, and assume it is a
data.frame.  This method of replacement works best when you are
replacing consecutive integers, as you are.  Note that X[1] is
"a", X[2] is "b" and so on.

X <- c("a","b","c","d")
Y <- c("b","a","c","d")
Z <- c("d","c","b","a")
M$x <- X[M$x]
M$y <- Y[M$y]
M$z <- Z[M$z]

> Finally,select two subsets:
> 1. if x='a';
> 2.x='a' and y='a';

M[M$x=="a",]
M[M$x=="a" & M$y=="a",]

The subsets will be rows.  I'm not sure that's what you mean.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



[R] workaround for numeric problems

2006-07-02 Thread Ott Toomet
Dear R-people,

I have to compute 

C <- -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2

This expression seems to converge to -1 as B approaches -Inf
(although I am unable to prove it).  R has no problems until B is
around -28 or less, where both numerator and denominator go to 0 and
you get NaN. A simple workaround I used was

C <- ifelse(B > -25,
   -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2,
-1)

It works well for me (32bit intel/linux platform).  But what about
other processors/platforms/compiler options?  Are there any better
ways of finding out at which values the numerical problems start?
Can one derive something from .Machine$double.eps (but what about the
precision of dnorm and other analytic functions)?

Thanks in advance,
Ott



[R] problems with simple statistical procedures

2006-07-02 Thread Thomas Preuth
Hello,

I use an imported dataframe and want to extract the mean value for one 
column.
after typing "mean (rae.df$VOL_DEP)" I receive
"[1] NA
Warning message:
argument is not numeric or logical: returning NA in: 
mean.default("rae.df$POINT_Y_CH") "

But when i look into the dataframe the column is characterized as numeric.

Sorry for bothering but as a complete newbie I just cannot help myself.

Greetings,
thomas
