Re: [R] Partial Derivatives in R

2009-05-12 Thread spencerg
Hi, Paul: 



 Your example is so complicated that I don't want to take the time 
to check it.  You apply deriv() to an exponential divided by a sum of 
exponentials, and I'm not convinced that your manual "Correct way" is 
actually correct.  It looks like you've followed the examples in the 
deriv help page, so I would be more inclined to trust that result, especially 
since it matched the answer I got from genD, as follows. 



 In your genD example, x01 and x02 should be x[1] and x[2]: 


p1 <- function(x) {exp(b00.1+b01.1*x[1]+b02.1*x[2]) /
    (exp(b00.1+b01.1*x[1]+b02.1*x[2])+
     exp(b00.2+b01.2*x[1]+b02.2*x[2])+
     exp(b00.3+b01.3*x[1]+b02.3*x[2])) - phat1}
test <- genD(p1, c(x01, x02))
test$D
           [,1]      [,2]        [,3]       [,4]       [,5]
[1,] -0.2012997 0.1296301 -0.03572875 0.07082898 -0.1261376


 The first two components of test$D here match your 
attr(eval(dp1.dx), "gradient").  The next three are the lower triangular 
portion of the matrix of second partials of the function p1, per the 
genD documentation. 



 The function numericGradient in the maxLik package could also be 
used for this, I believe.  However, I won't take the time here to test 
that. 
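 For what it's worth, an untested sketch of that alternative, assuming 
p1 and the point c(x01, x02) are defined as in your example (I believe 
numericGradient takes the function and the point at which to differentiate): 

library(maxLik)
numericGradient(p1, c(x01, x02))  # should agree with the first two columns of test$D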



 Hope this helps. 
 Spencer Graves   



Paul Heinrich Dietrich wrote:

Hi Spencer,
Thanks for suggesting the genD function.  In attempting it, I have
rearranged my function from phat1 ~ ... to ... - 1; it apparently doesn't
like the first form :)  But when I run it, it tells me the partials are all
zero.  I'm trying out a simple MNL equation before I expand it to what I'm
looking for.  Here is what I tried (and I get different answers from a
textbook solution, deriv(), and genD()):

  

### Variables for an observation
x01 <- rnorm(1,0,1)
x02 <- rnorm(1,0,1)
### Parameters for an observation
b00.1 <- rnorm(1,0,1)
b00.2 <- rnorm(1,0,1)
b00.3 <- 0
b01.1 <- rnorm(1,0,1)
b01.2 <- rnorm(1,0,1)
b01.3 <- 0
b02.1 <- rnorm(1,0,1)
b02.2 <- rnorm(1,0,1)
b02.3 <- 0
### Predicted Probabilities for an observation
phat1 <- 0.6
phat2 <- 0.3
phat3 <- 0.1
### Correct way to calculate a partial derivative
partial.b01.1 <- phat1 * (b01.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.2 <- phat2 * (b01.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.3 <- phat3 * (b01.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.1; partial.b01.2; partial.b01.3


[1] 0.04288663
[1] -0.1804876
[1] 0.1376010
  

partial.b02.1 <- phat1 * (b02.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.2 <- phat2 * (b02.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.3 <- phat3 * (b02.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.1; partial.b02.2; partial.b02.3


[1] 0.8633057
[1] 0.3171978
[1] 0.1376010
  

### Derivatives for MNL
dp1.dx <- deriv(phat1 ~ exp(b00.1+b01.1*x01+b02.1*x02) /
+ (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
+ exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
  

dp2.dx <- deriv(phat2 ~ exp(b00.2+b01.2*x01+b02.2*x02) /
+ (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
+ exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
  

dp3.dx <- deriv(phat3 ~ exp(b00.3+b01.3*x01+b02.3*x02) /
+ (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
+ exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
  

attr(eval(dp1.dx), "gradient")


 x01  x02
[1,] -0.01891354 0.058918
  

attr(eval(dp2.dx), "gradient")


x01 x02
[1,] -0.1509395 -0.06258685
  

attr(eval(dp3.dx), "gradient")


  x01 x02
[1,] 0.169853 0.003668849
  

library(numDeriv)
dp1.dx <- function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
+ (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
+ exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}
  

test <- genD(dp1.dx, c(phat1,b00.1,b01.1,b02.1,x01,x02)); test


$D
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]    0    0    0    0    0    0    0    0    0     0     0     0     0     0
     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27]
[1,]     0     0     0     0     0     0     0     0     0     0     0     0     0

$p
[1] 6

$f0
[1] 0.05185856

$func
function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
(exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}

$x
[1]  0.6  1.401890082 -1.304531849  0.062833294 -0.143141379
[6] -0.005995477

$d
[1] 1e-04

$method
[1] "Richardson"

$method.args
$method.args$eps
[1] 1e-04

$method.args$d
[1] 1e-04

$method.args$zero.tol
[1] 1.781029e-05

$method.args$r
[1] 4

$method.args$v
[1] 2


attr(,"class")
[1] "Darray"
  






spencerg wrote:
  
  Have you considered genD{numDeriv}? 

  If this does not answer your question, I suggest you try the 
RSiteSearch package.  The following will open a list of options in a 
web browser, sorted by package most often found with your 

Re: [R] Add data count to plot?

2009-05-12 Thread Tal Galili
Hi Mike.
In the plotting functions you use, there are usually parameters that help you
add text to the plot.
For example:
par(mfrow = c(2, 2))
plot(x, sub = "text in the bottom", main = "text in top", ylab = "text to
the side of the axis", xlab = "text to the bottom of the axis")
plot(y) # this one is without the text, but
text(...) # you can add text inside the plot with the text command

Make sure to read:
?plot
?text
? (the name of the plotting function you use)
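For the specific case of an observation count, a minimal sketch (y stands in
for one of your data vectors; the placement values are arbitrary):

plot(y)
mtext(paste("n =", length(y)), side = 3, adj = 1)   # count in the top margin
# or, inside the plot region:
text(1, max(y), labels = paste("n =", length(y)))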

Cheers,
Tal




On Tue, May 12, 2009 at 12:17 AM, MikSmith m...@hsm.org.uk wrote:


 Hi

 I'm a relative newbie to R and had a query concerning plotting. I have
 generated a par(mfrow = c(2, 2)) graphic with 10 rose diagrams using
 circular. What I wanted to add to each individual plot was n = x for the
 number of data observations in each dataset. How might I go about doing
 this??

 Thanks

 Mike
 --
 View this message in context:
 http://www.nabble.com/Add-data-count-to-plot--tp23491681p23491681.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2009-05-12 Thread Kon Knafelman

Hi,

I have a question about derivatives of functions.

If I have the following function,

f <- function(x) x^3-2*x^2+3*x-5

I need a simple function for the derivative of this with respect to 'x', so 
that I can then sub values into the derivative function, and use Newton's 
method of finding a root for this.

Please help.

Regards,

Kon Knafelman

_



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] decimal troubles ?

2009-05-12 Thread Patrick Giraudoux

Dear all,

I have some trouble with the number of decimals in R (currently R 
2.9.0). For instance:


 options()$digits
[1] 3

leads me to hope that I will get three digits where useful when a number is 
printed.  BUT:


 44.25+31.1+50
[1] 125

No way to get the right result 125.35

Can anybody tell me what's happening?

Patrick

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lattice histogram for multiple variables : adjusting x axis

2009-05-12 Thread GOUACHE David
Hello all,

I have a large data frame and I want to look at the distribution of each 
variable very quickly by plotting an individual histogram for each variable.
I'd like to do so using lattice.

Here is a small example using the iris data set:

histogram(as.formula(paste("~",paste(colnames(iris[,!sapply(iris,is.factor)]),collapse="+"))),data=iris[,!sapply(iris,is.factor)])

However in this graphic, the x-axis and the breaks are the same for all 4 
variables. I would like them to be adjusted to the data in each panel 
separately. I have tried using the scales argument to no avail:

histogram(
  as.formula(paste("~",paste(colnames(iris[,!sapply(iris,is.factor)]),collapse="+"))),
  data=iris[,!sapply(iris,is.factor)], scales=list(x=list(relation="free")) )


How can I individualize the x-axis and breaks for each individual panel?

Thanks for any help.

David Gouache
ARVALIS - Institut du végétal
Station de La Minière
78280 Guyancourt
Tel: 01.30.12.96.22 / Port: 06.86.08.94.32

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Dieter Menne



Patrick Giraudoux wrote:
 
 I have some trouble with the number of decimals in R (currently R 
 2.9.0). For instance:
 
   options()$digits
 [1] 3
 
 let me hope that I will get three digits where useful when a number is 
 printed. BUT:
 
   44.25+31.1+50
 [1] 125
 
 No way to get the right result 125.35
 


It says digits, not decimals:

 (44.25+31.1+50)/100
[1] 1.25

Dieter 

-- 
View this message in context: 
http://www.nabble.com/decimal-troubles---tp23498062p23498345.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Peter Alspach
Tena koe Patrick

If you want more than three digits, change the options:

options(digits=7)
44.25+31.1+50
[1] 125.35

HTH 

Peter Alspach 

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Giraudoux
 Sent: Tuesday, 12 May 2009 8:08 p.m.
 To: r-help@r-project.org
 Subject: [R] decimal troubles ?
 
 Dear all,
 
 I have some trouble with the number of decimals in R 
 (currently R 2.9.0). For instance:
 
   options()$digits
 [1] 3
 
 let me hope that I will get three digits where useful when a 
 number is printed. BUT:
 
   44.25+31.1+50
 [1] 125
 
 No way to get the right result 125.35
 
 Can anybody tell me what's happens ?
 
 Patrick
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

The contents of this e-mail are confidential and may be subject to legal 
privilege.
 If you are not the intended recipient you must not use, disseminate, 
distribute or
 reproduce all or any part of this e-mail or attachments.  If you have received 
this
 e-mail in error, please notify the sender and delete all material pertaining 
to this
 e-mail.  Any opinion or views expressed in this e-mail are those of the 
individual
 sender and may not represent those of The New Zealand Institute for Plant and
 Food Research Limited.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to display data content on a only row?

2009-05-12 Thread Dieter Menne



Thom_249 wrote:
 
 I have data like this:
  [1] 16.800  6.533  5.067  3.933  2.200  1.667
  [7]  1.200  1.067  0.733  0.667
 
 And I want that all these data, printed on a 4 rows instead of 8, and it's
 be great without the [x]
 
 

First look would be cat(). Second, tell us where you got the data from.
Matrix? 
Many vectors? 
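For illustration, an untested sketch with the first of those vectors (x is
just a placeholder name):

x <- c(16.800, 6.533, 5.067, 3.933, 2.200, 1.667, 1.200, 1.067, 0.733, 0.667)
cat(format(x), fill = 50)   # wraps the values over a few lines, without the [i] prefixes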

Dieter


-- 
View this message in context: 
http://www.nabble.com/How-to-display-data-content-on-a-only-row--tp23497437p23498397.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-05-12 Thread Dieter Menne



Kon Knafelman wrote:
 
 
 if i have the following function,
 
 f - function(x) x^3-2*x^2+3*x-5
 
 i need a simple function for the derivative of this with respect to 'x',
 so that i can then sub in values to the the derivative function, and use
 Newtons method of finding a root for this.
 

?deriv
?uniroot (if this is not homework)

Dieter

-- 
View this message in context: 
http://www.nabble.com/%28no-subject%29-tp23498039p23498366.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how the break is calculated by R?

2009-05-12 Thread XinMeng
Hi all: 
As to hist, the help file says: "R's default with equi-spaced breaks (also the 
default) is to plot the counts in the cells defined by breaks."  I want to 
know how the breaks are calculated by R. 

In other words: break width = (max - min)/(number of groups), but how is the 
number of groups calculated by R?  Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bayes newby

2009-05-12 Thread Daniele Amberti


I have to set up a simple classification method (mix of factor and
numeric variables) that can iteratively:

   -  use an expert-opinion prior on item X1;

   -  classify X1 based on the estimated model (training set) and the
      expert opinion (previous requirement);

   -  update the model with the true classification information
      (and eventually give some feedback on the expert opinion's contribution).

Where can I start from?  Is there a library suited to this kind of
work?

Can someone write down a simple example to kick off this
project?

Thanks in advance



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] times family unavailable in postscript device (Ubuntu Linux)

2009-05-12 Thread Paul Murrell
Hi

This happens because of the way PostScript files are generated, with all
metadata in the head of the file, including font information.  So you
need to predeclare any fonts that you are going to use in a PostScript
file.  In this case, something like ...

postscript(fonts="Times")

... or if Times is the ONLY font that will be used in the file, then
you can just use ...

postscript(family="Times")

... (and then you don't need to specify the family in the call to plot())
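Putting that together, a minimal end-to-end sketch (the file name here is
arbitrary):

postscript("times-plot.ps", family = "Times")
plot(rnorm(10), rnorm(10))   # text now uses the device's Times family by default
dev.off()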

Paul


Paul Johnson wrote:
 I'm running Ubuntu 9.04.  I could use some advice about fonts in
 postscript devices.
 
 sessionInfo()
 R version 2.9.0 (2009-04-17)
 i486-pc-linux-gnu
 
 locale:
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 I can use family="Times" with pdf output, but postscript refuses. It says:
 
 
 plot(rnorm(10),rnorm(10), family="Times")
 Error in axis(side = side, at = at, labels = labels, ...) :
   family 'Times' not included in PostScript device
 
 This happens even though Times *appears* to be listed as a valid family :
 
 names(postscriptFonts())
  [1] serifsans mono
  [4] AvantGarde   Bookman  Courier
  [7] HelveticaHelvetica-Narrow NewCenturySchoolbook
 [10] Palatino TimesURWGothic
 [13] URWBookman   NimbusMonNimbusSan
 [16] URWHelvetica NimbusSanCondCenturySch
 [19] URWPalladio  NimbusRomURWTimes
 [22] ComputerModern   ComputerModernItalic Japan1
 [25] Japan1HeiMin Japan1GothicBBB  Japan1Ryumin
 [28] Korea1   Korea1debCNS1
 [31] GB1
 
 example(postscriptFonts)
 
 pstscF> postscriptFonts()
 $serif
 $family
 [1] Times
 
 $metrics
 [1] Times-Roman.afm  Times-Bold.afm   Times-Italic.afm
 [4] Times-BoldItalic.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $sans
 $family
 [1] Helvetica
 
 $metrics
 [1] Helvetica.afm Helvetica-Bold.afm
 [3] Helvetica-Oblique.afm Helvetica-BoldOblique.afm
 [5] Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $mono
 $family
 [1] Courier
 
 $metrics
 [1] Courier.afm Courier-Bold.afm
 [3] Courier-Oblique.afm Courier-BoldOblique.afm
 [5] Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $AvantGarde
 $family
 [1] AvantGarde
 
 $metrics
 [1] agw_.afm agd_.afm agwo.afm agdo.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $Bookman
 $family
 [1] Bookman
 
 $metrics
 [1] bkl_.afm bkd_.afm bkli.afm bkdi.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $Courier
 $family
 [1] Courier
 
 $metrics
 [1] Courier.afm Courier-Bold.afm
 [3] Courier-Oblique.afm Courier-BoldOblique.afm
 [5] Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $Helvetica
 $family
 [1] Helvetica
 
 $metrics
 [1] Helvetica.afm Helvetica-Bold.afm
 [3] Helvetica-Oblique.afm Helvetica-BoldOblique.afm
 [5] Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $`Helvetica-Narrow`
 $family
 [1] Helvetica-Narrow
 
 $metrics
 [1] hvn_.afm hvnb.afm hvno.afm hvnbo___.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $NewCenturySchoolbook
 $family
 [1] NewCenturySchoolbook
 
 $metrics
 [1] ncr_.afm ncb_.afm nci_.afm ncbi.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $Palatino
 $family
 [1] Palatino
 
 $metrics
 [1] por_.afm pob_.afm poi_.afm pobi.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $Times
 $family
 [1] Times
 
 $metrics
 [1] Times-Roman.afm  Times-Bold.afm   Times-Italic.afm
 [4] Times-BoldItalic.afm Symbol.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $URWGothic
 $family
 [1] URWGothic
 
 $metrics
 [1] a010013l.afm a010015l.afm a010033l.afm a010035l.afm s05l.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $URWBookman
 $family
 [1] URWBookman
 
 $metrics
 [1] b018012l.afm b018015l.afm b018032l.afm b018035l.afm s05l.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $NimbusMon
 $family
 [1] NimbusMon
 
 $metrics
 [1] n022003l.afm n022004l.afm n022023l.afm n022024l.afm s05l.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $NimbusSan
 $family
 [1] NimbusSan
 
 $metrics
 [1] n019003l.afm n019004l.afm n019023l.afm n019024l.afm s05l.afm
 
 $encoding
 [1] default
 
 attr(,class)
 [1] Type1Font
 
 $URWHelvetica
 $family
 [1] URWHelvetica
 
 $metrics
 [1] n019003l.afm n019004l.afm n019023l.afm n019024l.afm 

Re: [R] Plotting colors on a world map

2009-05-12 Thread dxc13

So, this would allow me to assign a color to each of the 2592 rectangular
grid cells on a map of the world?  I guess I am having trouble finding a
suitable method of doing this.  Also, to be able to distinguish country
boundaries from the grid lines (which are both black color), could the grid
lines be transparent so that all we see is the map of the world in various
colors?
I was thinking that below 10 degrees would be light blue, 10-30 degrees would be
blue, 31-40 be green, 41-50 be yellow, 51-60 be dark yellow, 61-70 be
orange, 71-80 be red, 81-90 be dark red.

Any help is appreciated in getting this task accomplished.  Thanks


Duncan Murdoch-2 wrote:
 
 On 5/11/2009 10:32 AM, dxc13 wrote:
 Hi useR's
 
 I have created a simple map of the world using the following code:
 m <- map(xlim=c(-180,180), ylim=c(-90,90))
 map.axes()
 
 I then create a grid of dimension 36x72 using the code:
 map.grid(m, nx=72, ny=36, labels=FALSE, col="black")
 
 This gives 2592 grid cells.  In a separate data set of dimension 36x72, I
 have 2592 temperature values (some missing values are present) ranging
 from
 -20 degrees F to 90 degrees F.  
 
 My question is, how can I create a reasonable color scheme (low
 temperatures
 in light blue to higher temperatures in dark red) and plot the
 temperature
 colors in their respective grid cells on the map?
 
 Take this data matrix as some example data:
 T <- sample(-10:90, 2592, replace=TRUE)
 mat <- matrix(T, nrow=36, ncol=72)
 
 
 The colorspace package can make nice color sequences for this sort of 
 thing, and image() can add them to a plot.  In colorspace, look at the 
 examples for diverge_hcl.
 
 Duncan Murdoch
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Plotting-colors-on-a-world-map-tp23484524p23495370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to display data content on a only row?

2009-05-12 Thread Thom_249

Hi

I'm sorry for such a basic question, but it's so basic that I can't find an
answer anywhere...

I have data like this:
 [1] 16.800  6.533  5.067  3.933  2.200  1.667
 [7]  1.200  1.067  0.733  0.667

 [1] 35.6113946  6.9576953  4.5271667  2.3744674  1.4735768  1.1751393
 [7]  1.2649111  0.7988086  0.7037316  0.7237469

 [1] 32.933 15.667  8.133  6.533  2.800  2.267
 [7]  1.467  0.733  0.733  0.667

 [1] 27.4992641 10.2794293  3.4198301  2.3864698  1.4735768  1.2227993
 [7]  1.1872337  0.7037316  1.0327956  0.7237469

 [1] 234.533  11.333   5.800   2.933   2.000   1.533
 [7]   0.933   0.667   0.267   0.533

 [1] 425.7797665   9.5966264   4.2627959   2.4043611   2.0701967   1.5522641
 [7]   1.0997835   1.0465362   0.5936168   0.7432234

 [1] 430.2667  18.2000   4.5333   2.1333   1.0667
 [6]   0.4000   0.0667   0.2667   0.0667   0.0667

 [1] 450.8085540  17.1514264   4.7938453   2.5597619   2.2824381   0.9102590
 [7]   0.2581989   0.5936168   0.2581989   0.2581989

And I want all these data printed on 4 rows instead of 8, and it would be
great without the [x].

How can I do that?

Regards

Thomas
-- 
View this message in context: 
http://www.nabble.com/How-to-display-data-content-on-a-only-row--tp23497437p23497437.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] extracting residuals from glm object

2009-05-12 Thread utkarshsinghal

Hi All,

Can anybody explain why the following three ways of extracting residuals 
from a glm object are giving me different outputs:


 idv = runif(1000,0,1)
 dv = rbinom(1000,1,0.5)
 d = data.frame(idv,dv)
 fit = glm(dv~idv, data=d, family=binomial)

 head(residuals(fit))
   1 2 3 4 5 6
1.216862 -1.161059 -1.156795  1.204759 -1.141068  1.201437

 head(fit$residuals)
   1 2 3 4 5 6
2.096724 -1.962126 -1.952454  2.066224 -1.917492  2.057981

 head(d$dv-fit$fitted.values)
1  2  3  4  5  6
0.5230655 -0.4903489 -0.4878241  0.5160253 -0.4784855  0.5140869


Regards
Utkarsh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] newtons method

2009-05-12 Thread Kon Knafelman

Hi,

Does anyone know how to code Newton's method for finding the roots of 
polynomial functions? I'm not sure whether I need to do this manually, or just 
code something with a loop that stops when it gets to the desired result.

thanks guys!

_
[[elided Hotmail spam]]
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Dieter Menne



Dieter Menne wrote:
 
 
 It says digits, not decimals:
 
 (44.25+31.1+50)/100
 [1] 1.25
 
 Dieter 
 
 

 (44.25+31.1+50)*10
[1] 1254

Strictly speaking, this should print as 1250 (no flames, please, I can live
with it)

Dieter


-- 
View this message in context: 
http://www.nabble.com/decimal-troubles---tp23498062p23498439.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Richard . Cotton
 I have some trouble with the number of decimals in R (currently R 
 2.9.0). For instance:
 
   options()$digits
 [1] 3
 
 let me hope that I will get three digits where useful when a number is 
 printed. BUT:
 
   44.25+31.1+50
 [1] 125
 
 No way to get the right result 125.35
 
 Can anybody tell me what's happens ?

The digits option specifies the number of significant figures, not the 
number of decimal places.  (The help documentation on the options page 
doesn't make this clear at the moment, though it does point you to 
print.default, which describes it as setting significant digits.)

Also note that the true value is being stored, so you can retrieve it with 
explicit formatting, e.g.

x <- 44.25+31.1+50
x                   # 125
print(x, digits=5)  # 125.35
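As an aside, if it is really a fixed number of decimal places rather than 
significant digits that you want, something like the following also works:

sprintf("%.2f", x)      # "125.35"
format(x, nsmall = 2)   # at least 2 digits after the decimal point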

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems to run SVM regression with e1071

2009-05-12 Thread marlene marchena







Hi, 

 Is the variable st character or a factor? What does str(train$st) show?

 str(train$st)
 Factor w/ 208 levels "0","000","0,0058643",..: 132 134 41 29 42 151 195 195 
196 207 ...

Thank you very much Max; with your help I found my error, and now it works.

Marlene.

_
[[elided Hotmail spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2009-05-12 Thread Richard . Cotton
 if i have the following function,
 
 f - function(x) x^3-2*x^2+3*x-5
 
 i need a simple function for the derivative of this with respect to 
 'x', so that i can then sub in values to the the derivative 
 function, and use Newtons method of finding a root for this.

You could take a look at ?deriv, but since the example you presented is 
just a polynomial, you can probably learn about calculus (
http://en.wikipedia.org/wiki/Derivative#Derivatives_of_elementary_functions
) and differentiate it in your head in less time than it took you to write 
this help question.
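For completeness, a small illustrative sketch of both routes (the names fd 
and fprime are arbitrary):

fd <- deriv(~ x^3 - 2*x^2 + 3*x - 5, "x", function.arg = TRUE)
attr(fd(2), "gradient")                  # derivative evaluated at x = 2
fprime <- function(x) 3*x^2 - 4*x + 3    # or simply written out by hand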

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice histogram for multiple variables : adjusting x axis

2009-05-12 Thread Richard . Cotton
 I have a large data frame and I want to look at the distribution of 
 each variable very quickly by plotting an individual histogram for 
 each variable.
 I'd like to do so using lattice.
 
 Here is a small example using the iris data set:
 
 histogram(as.formula(paste(~,paste(colnames(iris[,!sapply(iris,is.
 factor)]),collapse=+))),data=iris[,!sapply(iris,is.factor)])
 
 However in this graphic, the x-axis and the breaks are the same for 
 all 4 variables. I would like them to be adjusted to the data in 
 each panel separately. I have tried using the scales argument to no 
avail:
 
 histogram( as.formula(paste(~,paste(colnames(iris[,!sapply(iris,
 is.factor)]),collapse=+))) ,data=iris[,!sapply(iris,is.factor)], 
 scales=list(x=list(relation=free)) )
 
 
 How can I individualize the x-axis and breaks for each individual panel?

You're on the right track.  If you don't specify a breaks argument, then 
histogram calculates a common axis and breaks across all the panels. 
Declaring breaks=NULL will force it to calculate them for each panel.

histogram(
  as.formula(paste("~",paste(colnames(iris[,!sapply(iris,is.factor)]),collapse="+"))),
  data=iris[,!sapply(iris,is.factor)],
  scales=list(x=list(relation="free")),
  breaks=NULL
)

You can also specify a hist-style method, e.g. breaks="Sturges", to get 
what you want.

Regards,
Richie.

Mathematical Sciences Unit
HSL




ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] decimal troubles ?

2009-05-12 Thread Patrick Giraudoux
Shame on me... I confused digits and decimals. 

Thanks anyway for bringing me back to the English basics...

Patrick


Peter Alspach a écrit :
 Tena koe Patrick

 If you want more than three digits, change the options:

 options(digits=7)
 44.25+31.1+50
 [1] 125.35

 HTH 

 Peter Alspach 

   
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Giraudoux
 Sent: Tuesday, 12 May 2009 8:08 p.m.
 To: r-help@r-project.org
 Subject: [R] decimal troubles ?

 Dear all,

 I have some trouble with the number of decimals in R 
 (currently R 2.9.0). For instance:

   options()$digits
 [1] 3

 let me hope that I will get three digits where useful when a 
 number is printed. BUT:

   44.25+31.1+50
 [1] 125

 No way to get the right result 125.35

 Can anybody tell me what's happens ?

 Patrick

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 

 The contents of this e-mail are confidential and may be subject to legal 
 privilege.
  If you are not the intended recipient you must not use, disseminate, 
 distribute or
  reproduce all or any part of this e-mail or attachments.  If you have 
 received this
  e-mail in error, please notify the sender and delete all material pertaining 
 to this
  e-mail.  Any opinion or views expressed in this e-mail are those of the 
 individual
  sender and may not represent those of The New Zealand Institute for Plant and
  Food Research Limited.

   


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how the break is calculated by R?

2009-05-12 Thread Richard . Cotton
 As to hist,the help file says: R's default with equi-spaced breaks 
 (also the default) is to plot the counts in the cells defined by 
breaks. 
 I wanna know how the break is calculated by R? 
 
 In other words: break = (max - min)/(number of group) but how the 
 number of group is calculated by R? Thanks!

If you look closely at the help page for hist, you'll see that in the 
Usage section it says breaks="Sturges".  Then in the Arguments section it 
says that when breaks is a character string, it means it is the name of a 
method to compute the number of cells.  Finally, in the Details section, 
it tells you to look at the nclass.Sturges help page for details of 
Sturges' algorithm.

Wikipedia has a good description of common algorithms for calculating the 
number of breaks.
http://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width
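For example (purely illustrative):

x <- rnorm(100)
nclass.Sturges(x)   # ceiling(log2(length(x)) + 1), i.e. 8 suggested cells here
hist(x)$breaks      # the break points actually used, tidied up by pretty()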

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newtons method

2009-05-12 Thread Uwe Ligges



Kon Knafelman wrote:

Hi,

Does anyone know how to code newton's method for finding the roots of 
polynomial functions? im not sure whether i need to do this manually, or just 
code something with a loop to stop when it gets to the desired result


See ?optim for optimization methods.
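If you do want to code the Newton iteration by hand, a minimal sketch 
(starting value and tolerance are arbitrary choices):

f      <- function(x) x^3 - 2*x^2 + 3*x - 5
fprime <- function(x) 3*x^2 - 4*x + 3
x <- 2                           # starting guess
for (i in 1:50) {
  step <- f(x) / fprime(x)
  x <- x - step
  if (abs(step) < 1e-10) break   # stop once the update is negligible
}
x                                # compare with uniroot(f, c(0, 3))$root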

Uwe Ligges



thanks guys!

_
Looking to move somewhere new this winter? Let ninemsn property help

[[elided Hotmail spam]]
=Domain_tagline_m=EXT
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ROCR: auc and logarithm plot

2009-05-12 Thread Tim
Hi,
I am quite new to R and I have two questions regarding ROCR.

1. I have tried to understand how to extract the area-under-curve value by looking 
at the ROCR documentation and googling. Still I am not sure if I am doing the right 
thing. Here is my code; is auc1 the AUC value?

pred1 <- prediction(resp1,label1)

perf1 <- performance(pred1,"tpr","fpr")
plot( perf1, type="l",col=1 )

auc1 <- performance(pred1,"auc")
auc1 <- auc1@y.values[[2]]


2. I have to compare two models that have very close ROCs. I'd like to have a 
more distinguishable plot of the ROCs. So is it possible to have a logarithmic FP 
axis, which might separate them well? Or to zoom in on the part close to the 
upper-left corner of the ROC plot? Or any other way to make the ROCs more separate?

Thanks and regards!

--Tim



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Kumaraswamy distribution

2009-05-12 Thread Debbie Zhang

Dear R users,

Does anyone know how to write functions for the Kumaraswamy distribution in R, since 
I cannot call dkumar, pkumar, etc. in R?

 

Please help.

 

Thanks a lot,

Debbie

_
[[elided Hotmail spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] inconsistent results for axis.POSIXct

2009-05-12 Thread Uwe Ligges
I can reproduce it. Can you please send a bug report to R-bugs so that 
this won't get lost.


Thank you,
Uwe Ligges



Dan Kelley wrote:

Some time ago, I posted a note about what I considered to be a bug in
axis.POSIXt() for R 2.8.x, relating to whether timezones in the data are
obeyed on the axes.  A link to that note, and to a quick and helpful
response, is at the following URL 


http://www.nabble.com/patch-for-axis.POSIXct-%28related-to-timezones%29-td22338700.html#a22338700

Note that R 2.9.0 has been adjusted to help with this.  However, there are
still problems.  Test code is given below:

# test code (top panel, OK; bottom panel, incorrect times on x axis) 
par(mar=c(2,2,2,1)) 
par(mfrow=c(2,1)) 
start <- as.POSIXct("2008-05-01 00:00:00", tz="UTC") 
end   <- as.POSIXct("2008-05-01 00:10:00", tz="UTC") 
plot(c(start, end), c(0, 1), ylab="") 
mtext(paste(start, "to", end), side=3) 

start <- as.POSIXct("2008-05-01 00:00:00", tz="UTC") 
end   <- as.POSIXct("2008-05-01 01:10:00", tz="UTC") 
plot(c(start, end), c(0, 1), ylab="") 
mtext(paste(start, "to", end), side=3)


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how often we use the libraries/packages in R

2009-05-12 Thread Marta Rufino

Dear list members,

I am trying to do a clean-up on my computer (Windows problems, of course), 
and I wanted to remove some of the packages I have installed in R which have 
hardly been used. 
Is there any way to see how often each package is used by us?
I have many that I am not sure about... because they can be used by other 
packages, etc.

Thank you very much in advance,
Best wishes,
Marta

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read many .csv files into an R session automatically, without specifying filenames

2009-05-12 Thread Wacek Kusnierczyk
Romain Francois wrote:

 Hi,

 Something like this perhaps:

 files <- dir( pattern = "\\.csv$" )
 for( x in files){
    assign( sub( "\\.csv$", "", x ), read.csv(x), envir = .GlobalEnv )
 }

or maybe

csvs = Map(read.csv, dir(pattern='\\.csv$'))

possibly with a correction of list item names

names(csvs) = sub('\\.csv$', '', names(csvs))

which is optional, since the content of foo.csv can be accessed as
csvs$foo (or as csvs[['foo', exact=FALSE]]) instead of csvs$foo.csv even
without such a correction.

vQ


 Romain

 Mark Na wrote:
 Hi R-helpers,

 I would like to read into R all the .csv files that are in my working
 directory, without having to use a read.csv statement for each file.

 Each .csv would be read into a separate dataframe which would acquire
 the filename of the .csv.

 As an example:

 Mark <- read.csv("Mark.csv")

 ...but the code (or command) would do this automatically for every
 .csv in the working directory without my specifying each file.

 I'd appreciate any help or ideas you might have.

 Thanks!

 Mark Na
   



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tobias Sing
 1. I have tried to understand how to extract area-under-curve value by 
 looking at the ROCR document and googling. Still I am not sure if I am doing 
 the right thing. Here is my code, is auc1 the auc value?
 
 pred1 - prediction(resp1,label1)

 perf1 - performance(pred1,tpr,fpr)
 plot( perf1, type=l,col=1 )

 auc1 - performance(pred1,auc)
 auc1 - a...@y.values[[2]]
 


If you have only one set of predictions and matching class labels, it
would be in auc1@y.values[[1]].
If you have multiple sets (as from cross-validation or bootstrapping),
the AUCs would be in auc1@y.values[[1]], auc1@y.values[[2]], etc.
You can collect all of them for example by unlist(p...@y.values).

Btw, you can use str(auc1) to see the structure of objects.


 2. I have to compare two models that have very close ROCs. I'd like to have a 
 more distinguishable plot of the ROCs. So is it possible to have a logarithm 
 FP axis which might probably separate them well? Or zoom in the part close to 
 the leftup corner of ROC plot? Or any other ways to make the ROCs more 
 separate?


To zoom in to a specific part:
plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for
a ROC plot), you can set up an empty canvas and add ROC curves to it:
plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
plot(perf, add=TRUE)

You can adjust all components of the performance plots. See
?plot.performance and the examples in this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
  Tobias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] working with groups of labels?

2009-05-12 Thread Jim Lemon

Phillip Porter wrote:

At 9:25 PM +1000 5/11/09, Jim Lemon wrote:

Hi Phillip,
I'm not exactly sure how you are positioning the labels. Is it 
possible to give us an example with some data that will produce a 
plot to show what you want? It shouldn't be too hard to do.

Jim



Data is something along the lines of:

            variable  score
      Real Bad Stuff      0
                Real     23
                 Bad     14
               Stuff     17
  Other Crazy Things      0
               Other     18
               Crazy     43
              Things     13

The basic plot looks like this:

barplot(score, horiz=TRUE, axes=FALSE, las=2, space=1, 
names.arg=c("Real Bad Stuff", "Real","Bad","Stuff","Other Crazy 
Things","Other","Crazy","Things"))



I've been trying make it look the way I want it by doing this:

barplot(score, horiz=TRUE, axes=FALSE, las=1, space=1, names.arg=c("", 
"Real","Bad","Stuff","","Other","Crazy","Things"), cex.names=.5)
mtext(c("Real Bad Stuff","Other Crazy Things"),at=c(8,15.6),side=2, 
line=1.5,adj=1, las=1, cex=.5, font=2)


The mtext is annoying to line up with the actual plot.  If there were a 
way to get it to line up with the variables, that would solve my 
problems, but I've tried at=variable and at=score (putting in "" for 
all of the variable names I don't want bolded and left-justified) and 
neither way gets them to line up.


Thanks,
Phillip

Hi Phillip,
Try this:

pporter <- read.csv("pporter.csv")
pporter
            variable score
1     Real Bad Stuff     0
2               Real    23
3                Bad    14
4              Stuff    17
5 Other Crazy Things     0
6              Other    18
7              Crazy    43
8             Things    13
par(mar=c(5,8,4,2))
barpos <- barplot(pporter$score, horiz=TRUE, axes=FALSE, las=1, space=1,
names.arg=c("", "Real","Bad","Stuff","","Other","Crazy","Things"),
cex.names=.5)
mtext(c("Real Bad Stuff","Other Crazy Things"),at=barpos[c(1,5)],
side=2, line=5,adj=0, las=1, cex=.5, font=2)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how often we use the libraries/packages in R

2009-05-12 Thread Uwe Ligges



Marta Rufino wrote:

Dear list members,

I am trying to make a clean up on my computer (windows problems, off course), and I wanted to remove some libraries that I have installed in R, which have been hardly used. 
Is there any way to see how often each library is used by us?


No, unless you have an internal protocol of file accesses, which is 
probably not the case.


Uwe Ligges



I have many that I am not sure... because they can be used with other packages, 
etc.

Thank you very much in advance,
Best wishes,
Marta

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting colors on a world map

2009-05-12 Thread Jim Lemon

dxc13 wrote:

Hi useR's

I have created a simple map of the world using the following code:
m <- map(xlim=c(-180,180), ylim=c(-90,90))
map.axes()

I then create a grid of dimension 36x72 using the code:
map.grid(m, nx=72, ny=36, labels=FALSE, col="black")

This gives 2592 grid cells.  In a separate data set of dimension 36x72, I
have 2592 temperature values (some missing values are present) ranging from
-20 degrees F to 90 degrees F.  


My question is, how can I create a reasonable color scheme (low temperatures
in light blue to higher temperatures in dark red) and plot the temperature
colors in their respective grid cells on the map?

Take this data matrix as some example data:
T <- sample(-10:90, 2592, replace=TRUE)
mat <- matrix(T, nrow=36, ncol=72)


Thanks in advance,
dxc13
  

Howdy dxc13,
Try this:

library(plotrix)
tempcolor <- color.scale(mat,c(0.5,1),c(0.5,0),c(1,0))
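A rough, untested follow-up for painting those colours into the 5-degree 
cells on the existing map; it assumes row i of mat runs south to north and 
column j runs west to east:

for (i in 1:36) for (j in 1:72)
  rect(-180 + (j-1)*5, -90 + (i-1)*5, -180 + j*5, -90 + i*5,
       col = tempcolor[i, j], border = NA)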

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] What's the best way to tell a function about relevant fields in data frames

2009-05-12 Thread Titus von der Malsburg

Hi list,

I have a function that detects saccadic eye movements in a time series
of eye positions sampled at a rate of 250Hz.  This function needs
three vectors: x-coordinate, y-coordinate, trial-id.  This information
is usually contained in a data frame that also has some other fields.
The names of the fields are not standardized.

 head(eyemovements)
         time     x      y trial
51  880446504 53.18 375.73     1
52  880450686 53.20 375.79     1
53  880454885 53.35 376.14     1
54  880459060 53.92 376.39     1
55  880463239 54.14 376.52     1
56  880467426 54.46 376.74     1

There are now several possibilities for the signature of the function:

1. Passing the columns separately:

detect(eyemovements$x, eyemovements$y, eyemovements$trial)

  or:

with(eyemovements,
 detect(x, y, trial))

2. Passing the data frame plus the names of the fields:

detect(eyemovements, x, y, trial)

3. Passing the data frame plus a formula specifying the relevant
fields:

detect(eyemovements, ~x+y|trial)

4. Passing a formula and getting the data from the environment:

with(eyemovements,
 detect(~x+y|trial))

I saw instances of all those variants (and others) in the wild.

Is there a canonical way to tell a function which fields in a data
frame are relevant?  What other alternatives are possible?  What are
the pros and cons of the alternatives?

Thanks, Titus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What's the best way to tell a function about relevant fields in data frames

2009-05-12 Thread Zeljko Vrba
On Tue, May 12, 2009 at 12:18:59PM +0200, Titus von der Malsburg wrote:
 
 Is there a canonical way to tell a function which fields in a data
 frame are relevant?  What other alternatives are possible?  What are
 the pros and cons of the alternatives?
 
Why not simply rearrange your data frames to have standardized column names
(see names() function), and write functions that operate on the standardized
format?  The change need not be destructive, you can first make a copy of the
data.

If all data frames have the same sequence of variables (time, x, y), you can
just use indices to refer to the columns, e.g. 1 corresponds to the time
variable.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What's the best way to tell a function about relevant fields in data frames

2009-05-12 Thread Gabor Grothendieck
You could define a generic detect(obj, ...) that dispatches (using S3):

detect.formula(fo, data)
detect.data.frame(data)
detect.default(x, y, trial)

where the first two call the third, thereby modeling it on lm (a
common approach) and giving the user a choice of interface.
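A bare-bones sketch of that layout (the variable handling is deliberately 
naive and only illustrative; the data.frame method just assumes columns 
named x, y and trial):

detect <- function(obj, ...) UseMethod("detect")

detect.default <- function(obj, y, trial, ...) {
  x <- obj                        # the real detection work would go here
  list(x = x, y = y, trial = trial)
}

detect.data.frame <- function(obj, ...)
  detect.default(obj$x, obj$y, obj$trial, ...)

detect.formula <- function(obj, data, ...) {
  v <- all.vars(obj)              # e.g. ~ x + y | trial  gives  "x" "y" "trial"
  detect.default(data[[v[1]]], data[[v[2]]], data[[v[3]]], ...)
}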

On Tue, May 12, 2009 at 6:18 AM, Titus von der Malsburg
malsb...@gmail.com wrote:

 Hi list,

 I have a function that detects saccadic eye movements in a time series
 of eye positions sampled at a rate of 250Hz.  This function needs
 three vectors: x-coordinate, y-coordinate, trial-id.  This information
 is usually contained in a data frame that also has some other fields.
 The names of the fields are not standardized.

 head(eyemovements)
        time     x      y trial
 51 880446504 53.18 375.73     1
 52 880450686 53.20 375.79     1
 53 880454885 53.35 376.14     1
 54 880459060 53.92 376.39     1
 55 880463239 54.14 376.52     1
 56 880467426 54.46 376.74     1

 There are now several possibilities for the signature of the function:

 1. Passing the columns separately:

    detect(eyemovements$x, eyemovements$y, eyemovements$trial)

  or:

    with(eyemovements,
         detect(x, y, trial))

 2. Passing the data frame plus the names of the fields:

    detect(eyemovements, x, y, trial)

 3. Passing the data frame plus a formula specifying the relevant
 fields:

    detect(eyemovements, ~x+y|trial)

 4. Passing a formula and getting the data from the environment:

    with(eyemovements,
         detect(~x+y|trial))

 I saw instances of all those variants (and others) in the wild.

 Is there a canonical way to tell a function which fields in a data
 frame are relevant?  What other alternatives are possible?  What are
 the pros and cons of the alternatives?

 Thanks, Titus

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What's the best way to tell a function about relevant fields in data frames

2009-05-12 Thread Titus von der Malsburg

Hi Zeljko,

thanks for your suggestion!

On Tue, May 12, 2009 at 12:26:48PM +0200, Zeljko Vrba wrote:
 Why not simply rearrange your data frames to have standardized column names
 (see names() function), and write functions that operate on the standardized
 format?

Actually that's what I'm currently doing.  And if the code was only
for my personal use I would stick with this solution.  However, I want
to publish my stuff as a package and want to make its use as
convenient as possible for the users.  The drawbacks of the current
solution are: The users have to perform the additional processing step
and the users have to know the correct format of the data frame.

  Titus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What's the best way to tell a function about relevant fields in data frames

2009-05-12 Thread Duncan Murdoch

On 12/05/2009 6:18 AM, Titus von der Malsburg wrote:

Hi list,

I have a function that detects saccadic eye movements in a time series
of eye positions sampled at a rate of 250Hz.  This function needs
three vectors: x-coordinate, y-coordinate, trial-id.  This information
is usually contained in a data frame that also has some other fields.
The names of the fields are not standardized.


head(eyemovements)

time x  y trial
51 880446504 53.18 375.73 1
52 880450686 53.20 375.79 1
53 880454885 53.35 376.14 1
54 880459060 53.92 376.39 1
55 880463239 54.14 376.52 1
56 880467426 54.46 376.74 1

There are now several possibilities for the signature of the function:

1. Passing the columns separately:

detect(eyemovements$x, eyemovements$y, eyemovements$trial)

  or:

with(eyemovements,
 detect(x, y, trial))


I'd choose this one, with one modification described below.



2. Passing the data frame plus the names of the fields:

detect(eyemovements, x, y, trial)


I think this is too inflexible.  What if you want to temporarily change 
one variable?  You don't want to have to create a whole new data frame; 
it's better to just substitute in another variable.




3. Passing the data frame plus a formula specifying the relevant
fields:

detect(eyemovements, ~x+y|trial)

4. Passing a formula and getting the data from the environment:

with(eyemovements,
 detect(~x+y|trial))


Rather than 3 or 4, I would use the more common idiom

detect(~x+y|trial, data=eyemovements)

(and the formula might be x+y~trial).  But I think the formula interface 
is too general for your needs.  What would ~x+y+z|trial mean?


I'd suggest something like 1 but using the convention plot.default() 
uses, where you have x and y arguments, but y can be skipped if x is a 
matrix/dataframe/formula/list. It uses the xy.coords() function to do 
the extraction.
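Roughly, an untested sketch of that convention:

detect <- function(x, y = NULL, trial, ...) {
  xy <- xy.coords(x, y)   # accepts two vectors, a two-column matrix, or a list
  # ... run the saccade detection on xy$x, xy$y and trial ...
}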


Duncan Murdoch



I saw instances of all those variants (and others) in the wild.

Is there a canonical way to tell a function which fields in a data
frame are relevant?  What other alternatives are possible?  What are
the pros and cons of the alternatives?

Thanks, Titus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xyplot: no ticks for a factor scale?

2009-05-12 Thread Marcin Kozak
Hi,

I would like to have no ticks on a scale that represents a factor. The
tick.number argument from scales does not work in such a situation, as
the help page as well as this simple (fairly stupid) code show:

require(lattice)
fact <- gl(4,1,labels=LETTERS[1:4])
y <- c(1,4,3,2)
xyplot(y~fact,scales=list(x=list(tick.number=0)))

I want to leave out the ticks at the x-axis (removing labels is easy,
of course). What can I do with that?

Thanks,
Marcin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kumaraswamy distribution

2009-05-12 Thread G. Jay Kerns
Dear Debbie,

On Tue, May 12, 2009 at 5:37 AM, Debbie Zhang debbie0...@hotmail.com wrote:

 Dear R users,

 Does anyone know how to write function for Kumaraswamy distribution in R? 
 Since I cannot write dkumar, pkumar, etc. in R.



 Please help.



 Thanks a lot,

 Debbie


Check the CRAN Task View on Probability distributions:

http://cran.r-project.org/web/views/Distributions.html

and in particular, check out package VGAM.

HTH,
Jay

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kumaraswamy distribution

2009-05-12 Thread Mike Lawrence
dkumar = function(x,a,b) a*b*(x^(a-1))*((1-x^a)^(b-1))
pkumar = function(x,a,b) 1-(1-x^a)^b

(I've based this entirely on the wikipedia entry on the Kumaraswamy
distribution [http://en.wikipedia.org/wiki/Kumaraswamy_distribution],
so best to check both my replication of the formula there and the
accuracy of the wikipedia entry itself)
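Following the same source, inverting that CDF would give quantile and 
random-number functions (again unchecked, just algebra on the formula above):

qkumar = function(p,a,b) (1 - (1-p)^(1/b))^(1/a)
rkumar = function(n,a,b) qkumar(runif(n), a, b)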

On Tue, May 12, 2009 at 6:37 AM, Debbie Zhang debbie0...@hotmail.com wrote:

 Dear R users,

 Does anyone know how to write function for Kumaraswamy distribution in R? 
 Since I cannot write dkumar, pkumar, etc. in R.



 Please help.



 Thanks a lot,

 Debbie


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] From two colors to 01 sequences

2009-05-12 Thread Paul Smith
Dear All,

Perhaps, what I am asking is impossible, but I am asking it anyway.

I have got several pdf files with rows of colored rectangles: red
rectangles should be read as 0; green rectangles as 1. No other color
exists. Is there some way to have R reading the colored rectangles to
a matrix or data frame converting the color of the rectangles to
sequences of 01?

Thanks in advance,

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Partial Derivatives in R

2009-05-12 Thread Gabor Grothendieck
You could also use rSymPy to symbolically differentiate it. For example,
based on the semi-automatic differentiation example on the rSymPy home
page:

http://code.google.com/p/rsympy/#Semi-Automatic_Differentiation

just replace the indented lines there with the indented lines here
(I've also added # after each such line in case the email messes
up the indentation.)  The remaining lines are identical to that
example.

Note that x and bx hold symbolic variables so one must use [[ ... ]]
rather than [ ... ] with them:

library(rSymPy)
# next file is sourced from devel version of rSymPy
source("http://rsympy.googlecode.com/svn/trunk/R/Sym.R")
n <- 2 #
res <- x <- lapply(paste("x", 1:n, sep = ""), Sym)
sympy(paste("var(", shQuote(paste(x, collapse = " ")), ")"))
[1] (x1, x2)

b <- matrix(1:9, 3) #
bx <- list(NA, NA, NA) #
for(i in 1:3) bx[[i]] <- b[i, 1] + b[i, 2] * x[[1]] + b[i, 3] * x[[2]] #
result <- bx[[1]] / (bx[[1]] + bx[[2]] + bx[[3]]) #

g <- sapply(x, deriv, expr = result, USE.NAMES = FALSE)
g2 <- sapply(g, sympy, USE.NAMES = FALSE)
g3 <- gsub("x([0-9]+)", "x[\\1]", g2)
gfun0 <- paste("grad <- function(x, n =", n, ") { c(", paste(g3, collapse = ","), ") }")
gfun <- eval(parse(text = gfun0))
gfun
function(x, n = 2 ) { c( -15*(1 + 4*x[1] + 7*x[2])/(6 + 15*x[1] +
24*x[2])**2 + 4/(6 + 15*x[1] + 24*x[2]),-24*(1 + 4*x[1] + 7*x[2])/(6 +
15*x[1] + 24*x[2])**2 + 7/(6 + 15*x[1] + 24*x[2]) ) }

etc.
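
To use the result, evaluate gfun() at a point; a quick numerical cross-check (the 
test point x0 is arbitrary, and grad() comes from the numDeriv package) might look 
like:

fx <- function(x) (1 + 4*x[1] + 7*x[2]) / (6 + 15*x[1] + 24*x[2])  # same ratio as 'result'
x0 <- c(0.3, -0.2)
gfun(x0)
grad(fx, x0)   # numDeriv::grad; should agree with gfun(x0) to numerical precision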


On Tue, May 12, 2009 at 2:25 AM, spencerg spencer.gra...@prodsyse.com wrote:
 Hi, Paul:

     Your example is so complicated that I don't want to take the time to
 check it.  You apply deriv to an exponential divided by a sum of
 exponentials, and I'm not convinced that your manual Correct way is
 actually correct.  It looks like you've followed the examples in the deriv
 help page, so I would be more inclined to trust that, especially since it
 matched the answer I got from genD, as follows.

     In your genD example, x01 and x02 should be x[1] and x[2]:
 p1 - function(x) {exp(b00.1+b01.1*x[1]+b02.1*x[2]) /
                        (exp(b00.1+b01.1*x[1]+b02.1*x[2])+
                         exp(b00.2+b01.2*x[1]+b02.2*x[2])+
                         exp(b00.3+b01.3*x[1]+b02.3*x[2])) - phat1}
 test - genD(p1, c(x01, x02))
 test$D
          [,1]      [,2]        [,3]       [,4]       [,5]
 [1,] -0.2012997 0.1296301 -0.03572875 0.07082898 -0.1261376


     The first two components of test$D here match your attr(eval(dp1.dx),
 gradient).  The next three are the lower triangular portion of the matrix
 of second partials of the function p1, per the genD documentation.

     The function numericGradient in the maxLik package could also be used
 for this, I believe.  However, I won't take the time here to test that.

     Hope this helps.     Spencer Graves

 Paul Heinrich Dietrich wrote:

 Hi Spencer,
 Thanks for suggesting the genD function.  In attempting it, I have
 rearranged my function from phat1 ~ ... to ... - 1, it apparently doesn't
 like the first one :)  But when I run it, it tells me the partials are all
 zero.  I'm trying out a simple MNL equation before I expand it to what I'm
 looking for.  Here is what I tried (and I get different answers from a
 textbook solution, deriv(), and genD()):



 ### Variables for an observation
 x01 - rnorm(1,0,1)
 x02 - rnorm(1,0,1)
 ### Parameters for an observation
 b00.1 - rnorm(1,0,1)
 b00.2 - rnorm(1,0,1)
 b00.3 - 0
 b01.1 - rnorm(1,0,1)
 b01.2 - rnorm(1,0,1)
 b01.3 - 0
 b02.1 - rnorm(1,0,1)
 b02.2 - rnorm(1,0,1)
 b02.3 - 0
 ### Predicted Probabilities for an observation
 phat1 - 0.6
 phat2 - 0.3
 phat3 - 0.1
 ### Correct way to calculate a partial derivative
 partial.b01.1 - phat1 * (b01.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.2 - phat2 * (b01.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.3 - phat3 * (b01.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.1; partial.b01.2; partial.b01.3


 [1] 0.04288663
 [1] -0.1804876
 [1] 0.1376010


 partial.b02.1 - phat1 * (b02.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.2 - phat2 * (b02.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.3 - phat3 * (b02.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.1; partial.b02.2; partial.b02.3


 [1] 0.8633057
 [1] 0.3171978
 [1] 0.1376010


 ### Derivatives for MNL
 dp1.dx - deriv(phat1 ~ exp(b00.1+b01.1*x01+b02.1*x02) /


 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))


 dp2.dx - deriv(phat2 ~ exp(b00.2+b01.2*x01+b02.2*x02) /


 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))


 dp3.dx - deriv(phat3 ~ exp(b00.3+b01.3*x01+b02.3*x02) /


 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))


 attr(eval(dp1.dx), gradient)


             x01      x02
 [1,] -0.01891354 0.058918


 attr(eval(dp2.dx), gradient)


            x01         x02
 [1,] 

Re: [R] xyplot: no ticks for a factor scale?

2009-05-12 Thread Richard . Cotton
 I would like to have no ticks on a scale that represents a factor. The
 tick.number argument from scales does not work in such a situation, as
 the help page as well as this simple (fairly stupid) code show:
 
 require(lattice)
 fact-gl(4,1,labels=LETTERS[1:4])
 y-c(1,4,3,2)
 xyplot(y~fact,scales=list(x=list(tick.number=0)))
 
 I want to leave out the ticks at the x-axis (removing labels is easy,
 of course). What can I do with that?

Replacing tick.number with tck will do the trick.
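
In other words, a minimal version of the example above with the ticks suppressed:

require(lattice)
fact <- gl(4, 1, labels = LETTERS[1:4])
y <- c(1, 4, 3, 2)
xyplot(y ~ fact, scales = list(x = list(tck = 0)))  # tck = 0 suppresses the x-axis ticks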

Regards,
Richie.

Mathematical Sciences Unit
HSL





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] From two colors to 01 sequences

2009-05-12 Thread Zeljko Vrba
On Tue, May 12, 2009 at 12:20:56PM +0100, Paul Smith wrote:
 
 I have got several pdf files with rows of colored rectangles: red
 rectangles should be read as 0; green rectangles as 1. No other color
 exists. Is there some way to have R reading the colored rectangles to
 a matrix or data frame converting the color of the rectangles to
 sequences of 01?
 
I would not do it with R, but.. here's the general approach:

  1. Convert the PDF to some raster format at high enough resolution (DPI)
 without any kind of compression or anti-aliasing
  2. Use some image manipulation program to replace red/green with black/white
  3. Save the resulting picture in ASCII PBM format
  4. Parse the resulting PBM and find 0-1 and 1-0 transitions which will give
 you rectangle boundaries.
  5. You did not specify the kind of rectangles, nor whether rows are of
 uniform height, so I assume uniform grid.  Otherwise, position and size
 might also be relevant[1], the interpretation is completely up to you.

[1] As in, for example, http://educ.queensu.ca/~fmc/october2001/GoldenArt3.gif
(Imagine that there are only two colors instead of 4 + black lines)
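
A rough sketch of step 4 in R (the file name strip.pbm and the single scan line 
through the strip are assumptions for illustration):

toks <- scan("strip.pbm", what = character(), comment.char = "#", quiet = TRUE)
dims <- as.integer(toks[2:3])                 # width, height (toks[1] is "P1")
bits <- as.integer(toks[-(1:3)])
img  <- matrix(bits, nrow = dims[2], ncol = dims[1], byrow = TRUE)
boundaries <- which(diff(img[1, ]) != 0)      # 0-1 and 1-0 transitions along one scan line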

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tim
Thanks Tobias!
A new question: if I want to draw an average ROC from cross-validation, how do I 
make the bar color the same as the line color? Here is my code:

plot(perf2, avg = "threshold", lty = 2, col = 2, spread.estimate = "stddev", barcol = 2)

Even though I specify barcol = 2, the bars are still drawn in black, the default, 
instead of red (2).

--Tim


--- On Tue, 5/12/09, Tobias Sing tobias.s...@gmail.com wrote:
From: Tobias Sing tobias.s...@gmail.com
Subject: Re: [R] ROCR: auc and logarithm plot
To: timlee...@yahoo.com, r-help@r-project.org
Date: Tuesday, May 12, 2009, 5:54 AM

 1. I have tried to understand how to extract area-under-curve value by
looking at the ROCR document and googling. Still I am not sure if I am doing the
right thing. Here is my code, is auc1 the auc value?
 
pred1 <- prediction(resp1, label1)

perf1 <- performance(pred1, "tpr", "fpr")
plot(perf1, type = "l", col = 1)

auc1 <- performance(pred1, "auc")
auc1 <- auc1@y.values[[2]]
 


If you have only one set of predictions and matching class labels, it
would be in auc1@y.values[[1]].
If you have multiple sets (as from cross-validation or bootstrapping),
the AUCs would be in auc1@y.values[[1]], auc1@y.values[[2]], etc.
You can collect all of them, for example, with unlist(auc1@y.values).

Btw, you can use str(auc1) to see the structure of objects.


 2. I have to compare two models that have very close ROCs. I'd like to
have a more distinguishable plot of the ROCs. So is it possible to have a
logarithmic FP axis, which might separate them better? Or to zoom in on the part
close to the upper-left corner of the ROC plot? Or any other way to make the ROCs
more distinguishable?


To zoom in to a specific part:
plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for
a ROC plot), you can set up an empty canvas and add ROC curves to it:
plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
plot(perf, add=TRUE)

You can adjust all components of the performance plots. See
?plot.performance and the examples in this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
  Tobias



  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Power function for ratio of lognormal means: two equally valid results? [SEC=Unclassified]

2009-05-12 Thread Steve Candy
Hi All

This is a general stats problem that I am dealing with using R, so any help is 
greatly appreciated.

I have two lognormal distributions with means M1 and M2.

If we have:

H0: log(M1/M2)=0

H1: log(M1/M2) != 0, equivalent to log(M1/M2) = log(1+P) where P > 0 or P < 0.

If we calculate the power for a value of P=0.1 or P=-0.1 (i.e. a 10% 
difference) and say assume SE{log(M1/M2)}=0.05, and confidence level= 
100(1-alpha), alpha=0.05, then how is the power function calculated?

As far as I can see we can calculate the power in the two ways given below and 
if there is no assumed direction to difference between M1 and M2 are not both 
calculations valid?

 # P=-0.1
 qn <- qnorm(p=0.05, mean=0, sd=0.05, lower.tail=T)
 Power.1 <- pnorm(q=qn, mean=log(0.9), sd=0.05, lower.tail=T)
 # P=0.1
 qn <- qnorm(p=0.95, mean=0, sd=0.05, lower.tail=T)
 Power.2 <- pnorm(q=qn, mean=log(1.1), sd=0.05, lower.tail=F)

 print(c(Power.1,Power.2))
[1] 0.6780872 0.6030887

 So which value should I use? Or would the average of the two values be 
appropriate to use? Or is there a fault in my logic? After a quick lit search I 
contacted a colleague who has written two stats text books and taught at 
University level for many years and he has not come across this problem and 
suggested averaging the values. This post is to ask if anyone has come across 
this pretty basic problem and has a suggested approach.

thanks

Steve




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extracting residuals from glm object

2009-05-12 Thread Marc Schwartz

On May 12, 2009, at 3:50 AM, utkarshsinghal wrote:


Hi All,

Can anybody explain why the following three ways of extracting  
residuals from a glm object are giving me different outputs:


 idv = runif(1000,0,1)
 dv = rbinom(1000,1,0.5)
 d = data.frame(idv,dv)
 fit = glm(dv~idv, data=d, family=binomial)

 head(residuals(fit))
  1 2 3 4 5 6
1.216862 -1.161059 -1.156795  1.204759 -1.141068  1.201437

 head(fit$residuals)
  1 2 3 4 5 6
2.096724 -1.962126 -1.952454  2.066224 -1.917492  2.057981

 head(d$dv-fit$fitted.values)
   1  2  3  4  5  6
0.5230655 -0.4903489 -0.4878241  0.5160253 -0.4784855  0.5140869



set.seed(1)
idv <- runif(1000, 0, 1)
d <- data.frame(idv, dv)
fit <- glm(dv ~ idv, data = d, family = binomial)


 head(fit$residuals)
1 2 3 4 5 6
-1.957016 -1.960477 -1.967029 -1.978074 -1.954949 -1.977749

 head(residuals(fit, type = "working"))
1 2 3 4 5 6
-1.957016 -1.960477 -1.967029 -1.978074 -1.954949 -1.977749



 head(d$dv - fit$fitted.values)
 1  2  3  4  5  6
-0.4890179 -0.4899201 -0.4916190 -0.4944577 -0.4884778 -0.4943746

 head(residuals(fit, type = "response"))
 1  2  3  4  5  6
-0.4890179 -0.4899201 -0.4916190 -0.4944577 -0.4884778 -0.4943746


See ?glm and ?residuals.glm and read the information there regarding  
the type of residuals stored in the glm model object as opposed to the  
multiple types of residuals that can be returned by residuals.glm().  
See the references in ?residuals.glm for more information as per the  
Details section therein.
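
For completeness: with no type argument, residuals.glm() returns deviance residuals, 
so the first call in the original post is equivalent to:

head(residuals(fit, type = "deviance"))   # same as head(residuals(fit))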


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tobias Sing
To color the error bars in ROCR the same way as the performance curve,
you need to add one more argument (plotCI.col='red') to your plot
call:

plot(perf2, avg = "threshold", lty = 2, col = 2, spread.estimate = "stddev", plotCI.col = 2)

The use of 'plotCI.col' is an example for the general mechanism of
ROCR to propagate arguments to the components of a plot (also
explained in ?plot.performance):

Optional graphical parameters to adjust different components of the performance
plot. Parameters are directed to their target component by prefixing them with
the name of the component (<component>.<parameter>, e.g. text.cex).
The following components are available: xaxis, yaxis, coloraxis, box
(around the plotting region), points, text, plotCI (error bars), boxplot.
The names of these components are influenced by the R functions that are used
to create them. Thus, par(<component>) can be used to see which parameters
are available for a given component (with the exception of the three axes;
use par(axis) here). To adjust the canvas or the performance curve(s), the
standard plot parameters can be used without any prefix.

Good luck,
  Tobias


On Tue, May 12, 2009 at 1:48 PM, Tim timlee...@yahoo.com wrote:
 Thanks Tobias!
 A new question: if I want to draw an average ROC from cross-validation, how
 to make the bar color same as the line color? Here is my code:

 plot( perf2,avg=threshold,lty=2,col=2,
 spread.estimate=stddev,barcol=2)

 Even I specify barcol=2, the color of bars are still black, the default
 one, instead of red 2.

 --Tim


 --- On Tue, 5/12/09, Tobias Sing tobias.s...@gmail.com wrote:

 From: Tobias Sing tobias.s...@gmail.com
 Subject: Re: [R] ROCR: auc and logarithm plot
 To: timlee...@yahoo.com, r-help@r-project.org
 Date: Tuesday, May 12, 2009, 5:54 AM

 1. I have tried to understand how to extract area-under-curve value by
 looking at the ROCR document and
  googling. Still I am not sure if I am doing the
 right thing. Here is my code, is auc1 the auc value?
 
 pred1 - prediction(resp1,label1)

 perf1 - performance(pred1,tpr,fpr)
 plot( perf1, type=l,col=1 )

 auc1 - performance(pred1,auc)
 auc1 - a...@y.values[[2]]
 


 If you have only one set of predictions and matching class labels, it
 would be in a...@y.values[[1]].
 If you have multiple sets (as from cross-validation or bootstrapping),
 the AUCs would be in a...@y.values[[1]], a...@y.values[[2]], etc.
 You can collect all of them for example by unlist(p...@y.values).

 Btw, you can use str(auc1) to see the structure of objects.


 2. I have to compare two models that have very close ROCs. I'd like to
 have a more distinguishable plot of the ROCs. So is it possible to have a
 logarithm FP axis which might probably separate
  them well? Or zoom in the part
 close to the leftup corner of ROC plot? Or any other ways to make the ROCs
 more
 separate?


 To zoom in to a specific part:
 plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
 plot(perf2, add=TRUE, lty=2, col='red')

 If you want logarithmic axes (though I wouldn't personally do this for
 a ROC plot), you can set up an empty canvas and add ROC curves to it:
 plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
 plot(perf, add=TRUE)

 You can adjust all components of the performance plots. See
 ?plot.performance and the examples in this slide deck:
 http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

 Hope that helps,
   Tobias



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] From two colors to 01 sequences

2009-05-12 Thread baptiste auguie

Depending on the nature of your pdf file, it may be possible to use  
the grImport package. I've never used it before, but a quick test  
seems promising,

 # create a test picture
 library(grid)   # for grid.rect(), unit(), gpar()
 colorStrip <- function(colors, draw = TRUE)
 {
     x <- seq(0, 1 - 1/ncol(colors), length = ncol(colors))
     y <- rep(0.5, length(colors))
     my.grob <- grid.rect(x = unit(x, "npc"), y = unit(y, "npc"),
         width = unit(1/ncol(colors), "npc"), height = unit(1, "npc"),
         just = "left", hjust = NULL, vjust = NULL,
         default.units = "npc", name = NULL,
         gp = gpar(fill = rgb(colors[1, ], colors[2, ], colors[3, ]),
                   col  = rgb(colors[1, ], colors[2, ], colors[3, ])),
         draw = draw, vp = NULL)
     my.grob
 }

 colors <- rbind(c(1, 0, 1), c(0, 1, 0), c(0, 0, 0))

 pdf("testRGB.pdf")
 colorStrip(colors)
 dev.off()

 # import the pdf file into R
 library(grImport)
 PostScriptTrace("testRGB.pdf")

 test <- readLines("testRGB.pdf.xml")

 testRead <- readPicture("testRGB.pdf.xml")

 str(testRead)
 grid.picture(testRead) # somehow I've lost the fill color in the process?!

 grep("rgb.+", test, value = TRUE) # this should allow you to find the
 # sequence of red and green rectangles

HTH,

baptiste




On 12 May 2009, at 13:38, Zeljko Vrba wrote:

 I have got several pdf files with rows of colored rectangles: red
 rectangles should be read as 0; green rectangles as 1. No other color
 exists. Is there some way to have R reading the colored rectangles to
 a matrix or data frame converting the color of the rectangles to
 sequences of 01?


_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
__



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SAS PROC SORT nodupkey

2009-05-12 Thread amor Gandhi

Hi,
 
I have the following data and I would like to delete duplicate names; it is almost 
similar to SAS PROC SORT nodupkey! Is there a function in R that does this?

x1 <- rnorm(11, 5, 1)
x2 <- runif(11, 0, 1)
nam <- paste("A", c(1:4, 4, 5:9, 9), sep = ".")
mydata <- data.frame(x1, x2)
rownames(mydata) <- nam
 
Many thanks in advance,
Amor


  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to display data content on a only row?

2009-05-12 Thread Dieter Menne



Thom_249 wrote:
 
 I got them from a Matrix on with I use the applyfunction tu compute the
 mean columns by columns
 print(apply(matSD,2,mean))
 

matSD = matrix(round(rnorm(20),2),nrow=4)
cat(matSD)
print(matSD)
dput(matSD) # How to send this matrix to r-help

newMat = apply(matSD,2,mean)
print(newMat)
cat(newMat, "\n")
dput(newMat) # Good for r-help questions




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Response surface plot

2009-05-12 Thread Tim Carnus
Dear List,

I am trying to plot a similar graph to attached from minitab manual in R.
I have a response Y and three components which systematically vary in their 
proportions. I have found in R methods/packages to plot ternary plots (eg. 
plotrix) but nothing which can extend it to response surface in 3-D.

Any help appreciated,

Tim Carnus



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] From two colors to 01 sequences

2009-05-12 Thread Paul Smith
Thanks, Baptiste and Zeljko. I am attaching here an example of the
picture of the rectangles.

Paul


On Tue, May 12, 2009 at 1:23 PM, baptiste auguie ba...@exeter.ac.uk wrote:

 Depending on the nature of your pdf file, it may be possible to use
 the grImport package. I've never used it before, but a quick test
 seems promising,

  # create a test picture
 colorStrip -
 function (colors, draw = T)
 {
     x - seq(0, 1 - 1/ncol(colors), length = ncol(colors))
     y - rep(0.5, length(colors))
     my.grob - grid.rect(x = unit(x, npc), y = unit(y, npc),
         width = unit(1/ncol(colors), npc), height = unit(1,
             npc), just = left, hjust = NULL, vjust = NULL,
         default.units = npc, name = NULL, gp = gpar(fill =
 rgb(colors[1,
             ], colors[2, ], colors[3, ]), col = rgb(colors[1,
             ], colors[2, ], colors[3, ])), draw = draw, vp = NULL)
     my.grob
 }


 colors - rbind(c(1, 0, 1), c(0, 1, 0), c(0, 0, 0))

 pdf(testRGB.pdf)
 colorStrip(colors)
 dev.off()


 # import the pdf file into R
 library(grImport)
 PostScriptTrace(testRGB.pdf)

 test - readLines(testRGB.pdf.xml)

 testRead - readPicture(testRGB.pdf.xml)

 str(testRead)
 grid.picture(testRead) # somehow I've lost the fill color in the
 process?!

 grep(rgb.+, test, value=T) # this should allow you to find the
 sequence of red and green rectangles

 HTH,

 baptiste




 On 12 May 2009, at 13:38, Zeljko Vrba wrote:

 I have got several pdf files with rows of colored rectangles: red
 rectangles should be read as 0; green rectangles as 1. No other color
 exists. Is there some way to have R reading the colored rectangles to
 a matrix or data frame converting the color of the rectangles to
 sequences of 01?


 _

 Baptiste Auguié

 School of Physics
 University of Exeter
 Stocker Road,
 Exeter, Devon,
 EX4 4QL, UK

 Phone: +44 1392 264187

 http://newton.ex.ac.uk/research/emag
 __




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Response surface plot

2009-05-12 Thread Tim Carnus

   Dear List,
   I am trying to plot a similar graph to attached from minitab manual in R.
   I have a response Y and three components which systematically vary in their
   proportions. I have found in R methods/packages to plot ternary plots (eg.
   plotrix) but nothing which can extend it to response surface in 3-D.
   Any help appreciated,
   Tim Carnus
attachment: reponse_surface.png
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival curves for time dependent covariates (was consultation)

2009-05-12 Thread Terry Therneau
I'm writing to ask you how I can do survival curves using time-dependent
covariates. Which packages do I need to install?

  This is a very difficult problem statistically.  That is, there are not many 
good ideas for what SHOULD be done.  Hence, there are no packages. Almost 
everything you find in an applied paper (e.g. a medical journal) is wrong.
  
 Terry Therneau
  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotCI line types for line and for bar

2009-05-12 Thread lehe

Hi,
I was wondering how to specify the line type for line instead of for bar.
Here is my code:
plotCI(x=mcra1avg, uiw=stdev1, type="l", col=2, lty=2)
This way, I will have the bar line as dashed lty=2 and red col=2, and
the line connecting the centers of the bars is also red col=2 but solid 
lty=1. How to make the line connecting the bar centers have the same solid
lty as the bar?
Thanks and regards!


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SAS PROC SORT nodupkey

2009-05-12 Thread Chuck Cleland
On 5/12/2009 8:29 AM, amor Gandhi wrote:
 Hi,
  
 I have the following data and I would like to delete douple names, it is 
 almost similar to SAS PROC SORT nodupkey! Is there any function in R does 
 this?
  
 x1 - rnorm(11,5,1)
 x2 - runif(11,0,1)
 nam -paste(A, c(1:4,4,5:9,9), sep=.)
 mydata - data.frame(x1,x2)
 crownames(mydata) - nam
  
 Many thanks in advance,
 Amor 

x1 <- rnorm(11, 5, 1)
x2 <- runif(11, 0, 1)
nam <- paste("A", c(1:4, 4, 5:9, 9), sep = ".")
mydata <- data.frame(nam, x1, x2)

mydata[!duplicated(mydata$nam), ]
   nam   x1 x2
1  A.1 5.418824 0.04867219
2  A.2 5.658403 0.18398337
3  A.3 4.458279 0.50975887
4  A.4 5.465920 0.16425144
6  A.5 3.769153 0.80380230
7  A.6 6.827979 0.13745441
8  A.7 5.353053 0.91418900
9  A.8 5.409866 0.41879708
10 A.9 5.041249 0.38226152

?duplicated
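
If the result should also end up sorted by the key, as PROC SORT NODUPKEY would 
leave it, a closer mimic (using the same objects as above) is:

mydata <- mydata[order(mydata$nam), ]   # sort by the key first
mydata[!duplicated(mydata$nam), ]       # then keep the first row per key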

 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newtons method

2009-05-12 Thread John C Nash

Finding polynomial roots is not a problem where one wants a quick and dirty 
code. There are a lot of pitfalls, especially if there are roots that are 
multiples, and there has been a lot of work on this problem. See 
http://en.wikipedia.org/wiki/Category:Root-finding_algorithms .

And Uwe may not be aware that optim() is contra-recommended for functions of 1 variable, which seems to be the problem here. But there is 


?polyroot
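
For what it is worth, a bare-bones illustration of both routes for a single 
polynomial; x^2 - 2 is used purely as an example:

newton <- function(f, df, x0, tol = 1e-8, maxit = 100) {
    x <- x0
    for (i in seq_len(maxit)) {
        step <- f(x) / df(x)
        x <- x - step
        if (abs(step) < tol) break
    }
    x
}
newton(function(x) x^2 - 2, function(x) 2*x, 1)   # ~ 1.414214
polyroot(c(-2, 0, 1))                             # all (complex) roots of x^2 - 2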

JN


Message: 130
Date: Tue, 12 May 2009 11:12:51 +0200
From: Uwe Ligges lig...@statistik.tu-dortmund.de
Subject: Re: [R] newtons method
To: Kon Knafelman konk2...@hotmail.com
Cc: r-h...@stat.math.ethz.ch
Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de
Content-Type: text/plain; charset=ISO-8859-1; format=flowed



Kon Knafelman wrote:


 Hi,
 
 Does anyone know how to code newton's method for finding the roots of polynomial functions? im not sure whether i need to do this manually, or just code something with a loop to stop when it gets to the desired result
  


See ?optim for optimization methods.

Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Warning trying to plot -log(log(survival)) (John Sorkin)

2009-05-12 Thread Terry Therneau
John,
  Nothing about the data set you posted should give issues in any version of 
the 
package.  The "points <= 0" warning usually arises when the survival curve drops to 
zero, 
and so log(-log(0)) is being trimmed off the graph by the plot command.  This 
is 
what should happen.  As Thomas pointed out to you earlier, log-log survival 
curves are not that useful and there are now much better methods.
  
   I suspect that the data set you posted and the data set giving the warning 
are not the same.
   
Terry T.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RES: SAS PROC SORT nodupkey

2009-05-12 Thread Leandro Marino
Amor,

You can use the duplicated() function:

A[!duplicated(A[,var]),]

Atenciosamente,
Leandro Lins Marino
Centro de Avaliação
Fundação CESGRANRIO
Rua Santa Alexandrina, 1011 - 2º andar
Rio de Janeiro, RJ - CEP: 20261-903
R (21) 2103-9600 R.:236 
( (21) 8777-7907
( lean...@cesgranrio.org.br


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf 
of amor Gandhi
Sent: Tuesday, 12 May 2009 09:30
To: r-h...@stat.math.ethz.ch
Subject: [R] SAS PROC SORT nodupkey


Hi,
 
I have the following data and I would like to delete douple names, it is almost 
similar to SAS PROC SORT nodupkey! Is there any function in R does this?
 
x1 - rnorm(11,5,1)
x2 - runif(11,0,1)
nam -paste(A, c(1:4,4,5:9,9), sep=.) mydata - data.frame(x1,x2)
crownames(mydata) - nam
 
Many thanks in advance,
Amor


  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Power function for ratio of lognormal means: two equally

2009-05-12 Thread Ted Harding
On 12-May-09 12:00:50, Steve Candy wrote:
 Hi All
 This is a general stats problem that I am dealing with using R,
 so any help is greater appreciated.
 
 I have two lognormal distributions with means M1 and M2.
 If we have:
 
 H0: log(M1/M2)=0
 
 H1: log(M1/M2) !=0 equivalent to log(M1/M2)=log(1+P) where P0 or P0.
 
 If we calculate the power for a value of P=0.1 or P=-0.1 (i.e. a 10%
 difference) and say assume SE{log(M1/M2)}=0.05, and confidence level=
 100(1-alpha), alpha=0.05, then how is the power function calculated?
 
 As far as I can see we can calculate the power in the two ways given
 below and if there is no assumed direction to difference between M1 and
 M2 are not both calculations valid?
 
 # P=-0.1
 qn - qnorm(p=0.05, mean=0, sd=0.05, lower.tail=T)
 Power.1 - pnorm(q=qn, mean=log(0.9), sd=0.05, lower.tail=T)
 # P=0.1
 qn - qnorm(p=0.95, mean=0, sd=0.05, lower.tail=T)
 Power.2 - pnorm(q=qn, mean=log(1.1), sd=0.05, lower.tail=F)

 print(c(Power.1,Power.2))
 [1] 0.6780872 0.6030887

  So which value should I use? Or would the average of the two values be
 appropriate to use? Or is there a fault in my logic? After a quick lit
 search I contacted a colleague who has written two stats text books and
 taught at University level for many years and he has not come across
 this problem and suggested averaging the values. This post is to ask if
 anyone has come across this pretty basic problem and has a suggested
 approach.
 
 thanks
 Steve

The Power Function is the probability of rejecting H0 for a particular
instance of H1 (defined by a specific non-null value of a parameter),
considered as a function of that parameter.

So the power for H1: +10% is not the same as the power for H1: -10%.

So then you face the problem of what to do about that when you want
to consider What is the power when H1 is +/-10%?. The answer, in
general terms, is whatever best suits the context in which the problem
arises.

One general solution which makes a certain kind of sense is: Adopt
the smaller value of the two results (0.6780872 or 0.6030887). Then
you know that the power is at least 0.6030887 if H1 (either +10% or -10%)
is true. If your objective is to know at least what power can
be achieved in the circumstances, then that answers the question. If
0.6030887 is adequate for your needs, then your investigation is in
a satisfactory state.

On the other hand, if in the context of the problem, you are concerned
that there may not be enough power, then you may want to know at
most what power can be achieved. In that case, you can achieve at most
0.6780872. If that is inadequate for your needs, then you need to
re-consider the investigation as a whole, with a view to increasing
the power.

You might also be considering a situation in which you are prepared
to think as though +/-10% may arise at random with probabilities
(P+) and (P-), and are interested in the average power as being
representative of what you may expect to achieve overall (even though
this will never be true for any particular case).

In that case, you may reasonably adopt 0.6780872*(P-) + 0.6030887*(P+).
Your colleague's suggestion is equivalent to (P-) = (P+) = 0.5.
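
In R terms, reusing Steve's two numbers, the options above amount to something like 
the following (the weights w are whatever probabilities fit the problem):

pwr <- c(minus10 = 0.6780872, plus10 = 0.6030887)
min(pwr)          # power guaranteed whichever direction of H1 holds
max(pwr)          # power achievable at best
w <- c(0.5, 0.5)  # hypothetical probabilities for the -10% and +10% alternatives
sum(w * pwr)      # weighted (here: average) power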

Additionally, in such a context, you may have different utilities
for successfully rejecting H0 when H1: -10% is true, versus successfully
rejecting H0 when H1: +10% is true. In that case, you would be
thinking of computing the expected utility (the preceding case is
tantamount to having equal utilities).

Summary: Apply the results in the way that best fits into what you
know about the situation you are investigating, and best meets the
objectives you have in that situation. There is not a universal
answer to your question!

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-May-09   Time: 14:19:41
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newtons method

2009-05-12 Thread Uwe Ligges



John C Nash wrote:
Finding polynomial roots is not a problem where one wants a quick and 
dirty code. There are a lot of pitfalls, especially if there are roots 
that are multiples, and there has been a lot of work on this problem. 
See http://en.wikipedia.org/wiki/Category:Root-finding_algorithms .


And Uwe may not be aware that optim() is contra-recommended for 
functions of 1 variable,


Has anybody told us something about just 1 variable?

uwe




which seems to be the problem here. But there is
?polyroot

JN


Message: 130
Date: Tue, 12 May 2009 11:12:51 +0200
From: Uwe Ligges lig...@statistik.tu-dortmund.de
Subject: Re: [R] newtons method
To: Kon Knafelman konk2...@hotmail.com
Cc: r-h...@stat.math.ethz.ch
Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de
Content-Type: text/plain; charset=ISO-8859-1; format=flowed



Kon Knafelman wrote:


 Hi,
  Does anyone know how to code newton's method for finding the roots 
of polynomial functions? im not sure whether i need to do this 
manually, or just code something with a loop to stop when it gets to 
the desired result
  


See ?optim for optimization methods.

Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R^2 extraction and autocorrelation/heterokedasticity on TSLS regression

2009-05-12 Thread Axel Leroix










Hi,
 
I'm performing a TSLS (two-stage least squares) multiple linear regression on annual 
data running from 1971 to 1997. After fitting the TSLS regression, I tried 
to extract the R-squared value using "output$r.squared" and to 
perform autocorrelation tests (Durbin-Watson and Breusch-Godfrey) and 
heteroskedasticity tests (Breusch-Pagan and Goldfeld-Quandt), but I get error 
messages. More specifically, this is what I typed into R and, below it, the 
response:
for R^2 :
 output$r.squared
NULL
for heteroskedasticity tests:
bptest(reg1)
Error in terms.default(formula) : no terms component
and for the autocorrelation test, when I try:
durbin.watson(reg1$residuals, max.lag=10)
 [1] 1.509 2.520 2.247 2.001 1.743 1.092 1.392 1.439 1.468 1.035
this gives me only the Durbin-Watson values and not the probabilities (p-values).
When performing these tests on an lm object I have no problem. So my question is 
how to extract R^2 from a tsls regression (object) and how to perform 
autocorrelation and heteroskedasticity tests on a tsls regression. I looked at the 
sem package but found no answer to my questions. Please, can anyone help me?
 
Thank you in advance



  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] import HTML tables

2009-05-12 Thread Dimitri Szerman
Hello,
I was wondering if there is a function in R that imports tables directly
from an HTML document. I know there are functions (say, getURL() from {RCurl})
that download the entire page source, but here I am referring to something like
the Google Docs function importHTML() (if you don't know this function, go
check it out, it's very useful). Anyway, if someone knows of something that does
this job, I'd be very grateful if you could let me know. Otherwise, here's a
suggestion for R developers: please do write something inspired
by Google's importHTML() function.

Many thanks,
Dimitri


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] neural network not using all observations

2009-05-12 Thread jude.ryan
I am exploring neural networks (adding non-linearities) to see if I can
get more predictive power than a linear regression model I built. I am
using the function nnet and following the example of Venables and
Ripley, in Modern Applied Statistics with S, on pages 246 to 249. I have
standardized variables (z-scores) such as assets, age and tenure. I have
other variables that are binary (0 or 1). In max_acc_ownr_nwrth_n_med
for example, the variable has a value of 1 if the client's net worth is
above the median net worth and a value of 0 otherwise. These are derived
variables I created and variables that the regression algorithm has found
to be predictive. A regression on the same variables shown below gives
me an R-Square of about 0.12. I am trying to increase the predictive
power of this regression model with a neural network being careful to
avoid overfitting.

Similar to Venables and Ripley, I used the following code:

 

 library(nnet)

 dim(coreaff.trn.nn)

[1] 5088    8

 head(coreaff.trn.nn)

  hh.iast.y WC_Total_Assets all_assets_per_hh age  tenure
max_acc_ownr_liq_asts_n_med max_acc_ownr_nwrth_n_med
max_acc_ownr_ann_incm_n_med

1   3059448  -0.4692186-0.4173532 -0.06599001 -1.04747935
01   0

2   4899746   3.4854334 4.064 -0.06599001 -0.72540200
11   1

3727333  -0.2677357-0.4177944 -0.30136473 -0.40332465
11   1

4443138  -0.5295170-0.6999646 -0.1825 -1.04747935
00   0

5484253  -0.6112205-0.7306664  0.64013414  0.07979137
10   0

6799054   0.6580506 1.1763114  0.24784295  0.07979137
01   1

 coreaff.nn1 <- nnet(hh.iast.y ~ WC_Total_Assets + all_assets_per_hh +
     age + tenure + max_acc_ownr_liq_asts_n_med +
     max_acc_ownr_nwrth_n_med + max_acc_ownr_ann_incm_n_med,
     coreaff.trn.nn, size = 2, decay = 1e-3,
     linout = T, skip = T, maxit = 1000, Hess = T)

# weights:  26

initial  value 12893652845419998.00 

iter  10 value 6352515847944854.00

final  value 6287104424549762.00 

converged

 summary(coreaff.nn1)

a 7-2-1 network with 26 weights

options were - skip-layer connections  linear output units  decay=0.001

 b-h1 i1-h1 i2-h1 i3-h1 i4-h1 i5-h1
i6-h1 i7-h1 

 -21604.84   -2675.80   -5001.90   -1240.16-335.44  -12462.51
-13293.80   -9032.34 

 b-h2 i1-h2 i2-h2 i3-h2 i4-h2 i5-h2
i6-h2 i7-h2 

 210841.52   47296.92   58100.43  -13819.10   -9195.80  117088.99
131939.57  106994.47 

  b-o  h1-o  h2-o  i1-o  i2-o  i3-o
i4-o  i5-o  i6-o  i7-o 

1115190.67  894123.33 -417269.57   89621.84  170268.12   44833.63
59585.05  112405.30  437581.05  244201.69

 sum((hh.iast.y - predict(coreaff.nn1))^2)  

Error: object hh.iast.y not found

 

So I try:

 

 sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

Error: dims [product 5053] do not match the length of object [5088]

In addition: Warning message:

In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) :

  longer object length is not a multiple of shorter object length

 

Doing a little debugging:

 

 pred - predict(coreaff.nn1)

 dim(pred)

[1] 5053    1

 dim(coreaff.trn.nn)

[1] 5088    8

 

So it looks like the dimensions (number of records/cases) of the vector
pred is 5,053 and the number of records of the input dataset is 5,088.

 

It looks like the neural network is dropping 35 records. Does anyone
have any idea of why it would do this? It is most probably because those
35 records are bad data, a pretty common occurrence in the real world.
Does anyone know how I can identify the dropped records? If I can do
this I can get the dimensions of the input dataset to be 5,053 and then:

 

 sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

 

would work.
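
A hedged guess about the 35 missing rows: nnet()'s formula interface, like most 
modelling functions, drops incomplete cases via the default na.action, so something 
along these lines (reusing the poster's object names) should identify them:

sum(!complete.cases(coreaff.trn.nn))             # how many rows contain NAs
dropped <- which(!complete.cases(coreaff.trn.nn))
coreaff.trn.nn[dropped, ]                        # inspect the dropped records

## or refit on complete cases so the fitted values line up with the data:
## coreaff.nn2 <- nnet(..., data = na.omit(coreaff.trn.nn), ...)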

 

A summary of my dataset is:

 

 summary(coreaff.trn.nn)

   hh.iast.yWC_Total_Assets  all_assets_per_hh age
tenure   max_acc_ownr_liq_asts_n_med

 Min.   :   0   Min.   :-6.970e-01   Min.   :-8.918e-01   Min.
:-4.617e+00   Min.   :-1.209e+00   Min.   :0. 

 1st Qu.:  565520   1st Qu.:-5.387e-01   1st Qu.:-6.147e-01   1st
Qu.:-4.583e-01   1st Qu.:-7.254e-01   1st Qu.:0. 

 Median :  834164   Median :-3.160e-01   Median :-3.718e-01   Median :
9.093e-02   Median :-2.423e-01   Median :0. 

 Mean   : 1060244   Mean   : 2.948e-13   Mean   : 3.204e-12   Mean
:-1.884e-11   Mean   :-3.302e-12   Mean   :0.4951 

 3rd Qu.: 1207181   3rd Qu.: 1.127e-01   3rd Qu.: 1.891e-01   3rd Qu.:
5.617e-01   3rd Qu.: 5.629e-01   3rd Qu.:1. 

 Max.   :45003160   Max.   : 

[R] R Group on Professional Networking Site

2009-05-12 Thread Ajit de Silva
Greetings:

There is a LinkedIn R Group.  There are job postings, news,  as well as
other discussions.

http://www.linkedin.com/groups?about=gid=77616trk=anet_ug_grppro

Regards,

Ajit
-- 
_
Ajit Gemunu de Silva
Oakland CA 94619

skype: ajit_de_silva


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Response surface plot

2009-05-12 Thread Duncan Murdoch

On 5/12/2009 8:43 AM, Tim Carnus wrote:

   Dear List,
   I am trying to plot a similar graph to attached from minitab manual in R.
   I have a response Y and three components which systematically vary in their
   proportions. I have found in R methods/packages to plot ternary plots (eg.
   plotrix) but nothing which can extend it to response surface in 3-D.
   Any help appreciated,


I'm not aware of anyone who has done this.  The way to do the surface in 
rgl would be to construct a mesh of triangles using tmesh3d, and set the 
color of each vertex as part of the material argument. It's a little 
tricky to get the colors right when they vary by vertex, but the code 
below gives an example.


I would construct the mesh by starting with one triangle and calling 
subdivision3d, but you may want more control over them.


For example:

library(rgl)

# First create a flat triangle and subdivide it
triangle <- c(0,0,0,1, 1,0,0,1, 0.5, sqrt(3)/2, 0, 1)
mesh <- tmesh3d( triangle, 1:3, homogeneous=TRUE)
mesh <- subdivision3d(mesh, 4, deform=FALSE, normalize=TRUE)

# Now get the x and y coordinates and compute the surface height
x <- with(mesh, vb[1,])
y <- with(mesh, vb[2,])
z <- x^2 + y^2
mesh$vb[3,] <- z

# Now assign colors according to the height; remember that the
# colors need to be in the order of mesh$it, not vertex order.

vcolors <- rainbow(100)[99*z+1]
tricolors <- vcolors[mesh$it]
mesh$material = list(color=tricolors)

# Now draw the surface, and a rudimentary frame behind it.

shade3d(mesh)
triangles3d(matrix(triangle, byrow=TRUE, ncol=4), col="white")
quads3d(matrix(c(1,0.5,0.5,1, 0,sqrt(3)/2, sqrt(3)/2,0, 0,0,1,1), 
        ncol=3), col="white")

bg3d("gray")

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R Group on Professional Networking Site

2009-05-12 Thread agdesilva
Greetings:

There is a LinkedIn R Group.  There are job postings, news,  as well
as other discussions.

http://www.linkedin.com/groups?about=gid=77616trk=anet_ug_grppro

Regards,

Ajit



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trouble with parametric bootstrap

2009-05-12 Thread Anders Bjorn
Hi,

I'm having trouble understanding how to construct a random number generator
for a parametric bootstrap. My aim is to bootstrap a Likelihood Ratio
statistic (under the null) for a linear model.  The function at this point
is given by

boot.test.n01 <- function(data, indeces, maxit=20) {
    y1 <- fit1 + se(e2)*rnorm(314)
    mod1 <- glm(y1 ~ X1 - 1, maxit=maxit)
    y2 <- fit2 + se(e2)*rnorm(314)
    mod2 <- glm(y2 ~ 1, maxit=maxit)
    t <- 2*(logLik(mod1) - logLik(mod2))
    t
}
boot.lrtest.n01 <- boot(data=M1, statistic=boot.test.n01, R=3999, maxit=100,
                        sim="parametric", ran.gen=???, mle=???)

fit1 and fit2 are vectors containing fitted values, and se() is the standard
error of a residual vector, e2, which I'm using as an estimate of the
variance. I'm not sure if I have constructed the function boot.test.n01
correctly with respect to the bootstrap dependent variables y1 and y2.
Furthermore, I'm rather lost when it comes to how to construct the random
number generator (as indicated by ???) and what to use as MLEs (as
indicated by ???). I would really appreciate any help that I could get.

Sincerely,

/Anders
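
For reference, the general pattern boot() expects for sim = "parametric" is that 
ran.gen(data, mle) returns a simulated data set and statistic(data) is applied to 
each one. A self-contained toy version (not a fix for the model above; all names 
are made up) looks like this:

library(boot)

d   <- data.frame(x = 1:20, y = 1:20 + rnorm(20))
fit <- lm(y ~ x, data = d)

ran.gen.ex <- function(data, mle) {               # simulate y from the fitted model
    data$y <- mle$fitted + rnorm(nrow(data), sd = mle$sigma)
    data
}
stat.ex <- function(data) coef(lm(y ~ x, data = data))

b <- boot(d, stat.ex, R = 199, sim = "parametric",
          ran.gen = ran.gen.ex,
          mle = list(fitted = fitted(fit), sigma = summary(fit)$sigma))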


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] AFT-model with time-dependent covariates

2009-05-12 Thread Philipp Rappold
Dear R-community, 
Dear Prof. Therneau,

I would like to fit an AFT-model with time-dependent covariates and 
right-censored data. 

Searching the mailing list for information on the subject, I found some old 
posts which said it didn't work back then.

My questions:
(1) Has this kind of fitting already been implemented in the survival library 
in R?
(2) If not: Are there any alternatives/ workarounds in order to get this job 
done in R anyways?
(3) Can you recommend any other (commercial) software to fit this model?

Thanks for your great help, I really appreciate it!

All the best
Philipp

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newtons method

2009-05-12 Thread Ravi Varadhan
Uwe,

John's comment about the difficulties with finding polynomial roots is even
more forceful for a system of polynomials.  There are likely numerous roots,
some possibly real, and some possibly multiple.  Homotopy methods are
currently the state of the art for finding all the roots, but beware that
they are very time-consuming.   For locating the real roots, I have found
that a relatively simple approach like multiple random starts works
fairly well with a root-finder such as dfsane() in the BB package.
However, I don't know of any tests to check whether I have found all the
roots.
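
For illustration, a minimal sketch of that multiple-random-starts idea; the 2x2 
polynomial system below is invented purely as an example:

library(BB)

f <- function(p) c(p[1]^2 + p[2]^2 - 2,   # circle of radius sqrt(2)
                   p[1]^2 - p[2])         # parabola y = x^2

starts <- matrix(runif(20, -2, 2), ncol = 2)            # 10 random starting points
roots  <- t(apply(starts, 1, function(p0) dfsane(p0, f, quiet = TRUE)$par))
unique(round(roots, 4))   # distinct real roots found; in practice also check $convergence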

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Uwe Ligges
Sent: Tuesday, May 12, 2009 9:23 AM
To: John C Nash
Cc: r-help@r-project.org
Subject: Re: [R] newtons method



John C Nash wrote:
 Finding polynomial roots is not a problem where one wants a quick and 
 dirty code. There are a lot of pitfalls, especially if there are roots 
 that are multiples, and there has been a lot of work on this problem.
 See http://en.wikipedia.org/wiki/Category:Root-finding_algorithms .
 
 And Uwe may not be aware that optim() is contra-recommended for 
 functions of 1 variable,

Has anybody told us something about just 1 variable?

uwe



 which seems to be the problem here. But there is ?polyroot
 
 JN
 
 
 Message: 130
 Date: Tue, 12 May 2009 11:12:51 +0200
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 Subject: Re: [R] newtons method
 To: Kon Knafelman konk2...@hotmail.com
 Cc: r-h...@stat.math.ethz.ch
 Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 
 
 
 Kon Knafelman wrote:
 
  Hi,
   Does anyone know how to code newton's method for finding the 
   roots
 of polynomial functions? im not sure whether i need to do this 
 manually, or just code something with a loop to stop when it gets to 
 the desired result
   
 
 See ?optim for optimization methods.
 
 Uwe Ligges
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] variability

2009-05-12 Thread elisia

I am analyzing the voice telephone traffic of some operators. For this type of
phenomenon it is useful to look for anomalies in the duration (in minutes) of
conversation, but it is also important to take account of another variable:
the average time of conversation, i.e. the ratio between the duration in
minutes of talk time and the number of such calls.
Is there a way to identify abnormal traffic based on these
variables?
Can one calculate an index of variability that represents the variability of
the duration of the conversation and of the average time of conversation?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: Hold out procedures in R software

2009-05-12 Thread Oludare Ariyo

 
 
 I am a PhD student at the University of Agriculture, Abeokuta, Nigeria,
 and also a statistician at the National Horticultural Research Institute,
 Ibadan, Nigeria.

 I am currently working on discriminant analysis and I need help on how to
 carry out Lachenbruch's hold-out procedure using R.

 Kindly help, because I consider you my last option.
 Thanks

 ARIYO Oludare Samuel
  
  
        
  
 
 
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Two-way Anova

2009-05-12 Thread Alan O'Loughlin
Hello, 
 
I'm trying to do a comparison between a large-scale (say 10 L) bottle of liquid and a 
small-scale (0.5 L) bottle of liquid. I have 5 different samples from each, and they are 
measured over the space of 8 days as % viability, which decreases over time. However, 
not all 10 samples got measured every day. How would I do a two-way ANOVA on this in R?
 
Thanks for any help.
 
Regards,
Al


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rbind help

2009-05-12 Thread MikSmith

Hi

I have a dataset of orientation and elevation in a data frame which I am
plotting with the circular package. Orientation is on the range 0-360;
however, I need to reduce this to 0-180, as the values over 180 are not
necessarily correct. I can do

striae$orientation[striae$orientation > 180] - 180

to extract these values, but I then want to combine them back into the
original data frame (with the elevation values). I assume rbind is the thing
to use, and have tried something like:

test <- rbind((striae$orientation[striae$orientation > 180] - 180),
(striae$orientation[striae$orientation <= 180]))

Not being a regular R user, I suspect the syntax is easy when you know it!
Any help much appreciated.

Best wishes

mike
-- 
View this message in context: 
http://www.nabble.com/rbind-help-tp23499705p23499705.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to display data content on a only row?

2009-05-12 Thread Thom_249

Totally awesome!

Thank you very much

Thom 

-- 
View this message in context: 
http://www.nabble.com/How-to-display-data-content-on-a-only-row--tp23497437p23502952.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Partial Derivatives in R

2009-05-12 Thread Paul Heinrich Dietrich

Thanks for showing me how to use genD correctly and that it matches deriv
here.  Like you say, there must be a problem with the manual way of doing
it, and I will look at it closely.  Thanks again.


spencerg wrote:
 
 Hi, Paul: 
 
 
   Your example is so complicated that I don't want to take the time 
 to check it.  You apply deriv to an exponential divided by a sum of 
 exponentials, and I'm not convinced that your manual Correct way is 
 actually correct.  It looks like you've followed the examples in the 
 deriv help page, so I would be more inclined to trust that, especially 
 since it matched the answer I got from genD, as follows. 
 
 
   In your genD example, x01 and x02 should be x[1] and x[2]: 
 
 p1 - function(x) {exp(b00.1+b01.1*x[1]+b02.1*x[2]) /
  (exp(b00.1+b01.1*x[1]+b02.1*x[2])+
   exp(b00.2+b01.2*x[1]+b02.2*x[2])+
   exp(b00.3+b01.3*x[1]+b02.3*x[2])) - phat1}
 test - genD(p1, c(x01, x02))
 test$D
[,1]  [,2][,3]   [,4]   [,5]
 [1,] -0.2012997 0.1296301 -0.03572875 0.07082898 -0.1261376
 
 
   The first two components of test$D here match your 
 attr(eval(dp1.dx), gradient).  The next three are the lower triangular 
 portion of the matrix of second partials of the function p1, per the 
 genD documentation. 
 
 
   The function numericGradient in the maxLik package could also be 
 used for this, I believe.  However, I won't take the time here to test 
 that. 
 
 
   Hope this helps. 
   Spencer Graves   
 
 
 Paul Heinrich Dietrich wrote:
 Hi Spencer,
 Thanks for suggesting the genD function.  In attempting it, I have
 rearranged my function from phat1 ~ ... to ... - 1, it apparently doesn't
 like the first one :)  But when I run it, it tells me the partials are
 all
 zero.  I'm trying out a simple MNL equation before I expand it to what
 I'm
 looking for.  Here is what I tried (and I get different answers from a
 textbook solution, deriv(), and genD()):

   
 ### Variables for an observation
 x01 - rnorm(1,0,1)
 x02 - rnorm(1,0,1)
 ### Parameters for an observation
 b00.1 - rnorm(1,0,1)
 b00.2 - rnorm(1,0,1)
 b00.3 - 0
 b01.1 - rnorm(1,0,1)
 b01.2 - rnorm(1,0,1)
 b01.3 - 0
 b02.1 - rnorm(1,0,1)
 b02.2 - rnorm(1,0,1)
 b02.3 - 0
 ### Predicted Probabilities for an observation
 phat1 - 0.6
 phat2 - 0.3
 phat3 - 0.1
 ### Correct way to calculate a partial derivative
 partial.b01.1 - phat1 * (b01.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.2 - phat2 * (b01.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.3 - phat3 * (b01.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b01.1; partial.b01.2; partial.b01.3
 
 [1] 0.04288663
 [1] -0.1804876
 [1] 0.1376010
   
 partial.b02.1 - phat1 * (b02.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.2 - phat2 * (b02.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.3 - phat3 * (b02.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
 partial.b02.1; partial.b02.2; partial.b02.3
 
 [1] 0.8633057
 [1] 0.3171978
 [1] 0.1376010
   
 ### Derivatives for MNL
 dp1.dx - deriv(phat1 ~ exp(b00.1+b01.1*x01+b02.1*x02) /
 
 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))
   
 dp2.dx - deriv(phat2 ~ exp(b00.2+b01.2*x01+b02.2*x02) /
 
 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))
   
 dp3.dx - deriv(phat3 ~ exp(b00.3+b01.3*x01+b02.3*x02) /
 
 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)), c(x01,x02))
   
 attr(eval(dp1.dx), gradient)
 
  x01  x02
 [1,] -0.01891354 0.058918
   
 attr(eval(dp2.dx), gradient)
 
 x01 x02
 [1,] -0.1509395 -0.06258685
   
 attr(eval(dp3.dx), gradient)
 
   x01 x02
 [1,] 0.169853 0.003668849
   
 library(numDeriv)
 dp1.dx - function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
 
 + (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 + exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}
   
 test - genD(dp1.dx, c(phat1,b00.1,b01.1,b02.1,x01,x02)); test
 
 $D
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
 [,14]
 [1,]000000000 0 0 0 0
 0
  [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
 [,26]
 [1,] 0 0 0 0 0 0 0 0 0 0 0
 0
  [,27]
 [1,] 0

 $p
 [1] 6

 $f0
 [1] 0.05185856

 $func
 function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
 (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
 exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}

 $x
 [1]  0.6  1.401890082 -1.304531849  0.062833294 -0.143141379
 [6] -0.005995477

 $d
 [1] 1e-04

 $method
 [1] Richardson

 $method.args
 $method.args$eps
 [1] 1e-04

 $method.args$d
 [1] 1e-04

 $method.args$zero.tol
 

[R] Graphic aspect ratio

2009-05-12 Thread MikSmith

Hi

I've been playing with a 3x2 graphics device using the default size as it
appears on screen. This has given me tall thin plots which I can resize by
dragging the window and increasing the window width. However, I was wondering
whether I can force R to produce square plots or set the actual aspect ratio. asp
seems to affect data units, not on-screen pixel units.

thanks

mike
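
One option worth trying: pty = "s" forces a square plotting region regardless
of the device shape, e.g.

par(mfrow = c(3, 2), pty = "s")   # square plot regions in a 3x2 layout
for (i in 1:6) plot(rnorm(20), rnorm(20), main = paste("panel", i))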
-- 
View this message in context: 
http://www.nabble.com/Graphic-aspect-ratio-tp23503495p23503495.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] questions on rpart (tree changes when rearrange the order of covariates?!)

2009-05-12 Thread Yuanyuan
Greetings,

I am using rpart for classification with method = "class". The test data is
the Pima Indians diabetes data from the package mlbench.

I first fitted a classification tree using the original data, and then
exchanged the order of body mass and plasma glucose, which are the
strongest/most important variables in the growing phase. The second tree is a
little different from the first one. The misclassification tables are
different too. I did not change the data, so why are the results so
different?

Does anyone know how rpart deals with ties?

Here is the code for running the two trees.


library(mlbench)
data(PimaIndiansDiabetes2)
mydata <- PimaIndiansDiabetes2
library(rpart)
fit2 <- rpart(diabetes ~ ., data = mydata, method = "class")
plot(fit2, uniform = TRUE, main = "CART for original data")
text(fit2, use.n = TRUE, cex = 0.6)
printcp(fit2)
table(predict(fit2, type = "class"), mydata$diabetes)
## misclassification table: rows are fitted class
  neg pos
  neg 437  68
  pos  63 200
# Klimt(fit2, mydata)

pmydata <- data.frame(mydata[, c(1, 6, 3, 4, 5, 2, 7, 8, 9)])
fit3 <- rpart(diabetes ~ ., data = pmydata, method = "class")
plot(fit3, uniform = TRUE, main = "CART after exchanging mass & glucose")
text(fit3, use.n = TRUE, cex = 0.6)
printcp(fit3)
table(predict(fit3, type = "class"), pmydata$diabetes)
## after exchanging the order of body mass and plasma glucose
  neg pos
  neg 436  64
  pos  64 204
# Klimt(fit3, pmydata)


Thanks,


--
Yuanyuan Huang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to take away the same variable when I use merge

2009-05-12 Thread Xin Shi
Dear:

 

I am trying to merge two tables by a common variable. However, there are a
few other variables that appear in both tables. How can I drop the duplicates
when I merge the two tables?

 

Thanks!

 

Xin
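
A sketch with two hypothetical tables sharing the key "id" plus a duplicated
column "x": drop the duplicate from one table before merging.

a <- data.frame(id = 1:3, x = c(10, 20, 30), y = letters[1:3])
b <- data.frame(id = 1:3, x = c(11, 21, 31), z = LETTERS[1:3])
dups <- setdiff(intersect(names(a), names(b)), "id")   # shared columns other than the key
merge(a, b[, setdiff(names(b), dups), drop = FALSE], by = "id")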

 

 

 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to display data content on a only row?

2009-05-12 Thread Thom_249

Hello Dieter

I got them from a matrix on which I use the apply function to compute the mean
column by column

print(apply(matSD,2,mean))

So it's a kind of vector.

I'll check with the cat function

Thank you, have a nice day

Thomas
-- 
View this message in context: 
http://www.nabble.com/How-to-display-data-content-on-a-only-row--tp23497437p23500703.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] triangle.class pch per factor

2009-05-12 Thread HighSchool2005

Hello,

I am using triangle.class in the ade4 package with a factor. I would like to
use the graphics parameter pch to set a different kind of point for each group
in my factor (lakes). I have tried pch = 1:4; I guess it does what it is
supposed to do, but not what I want. I can't use colors.

Thank you in advance, 
-- 
View this message in context: 
http://www.nabble.com/triangle.class-pch-per-factor-tp23502805p23502805.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using loops to run functions over a list of variables

2009-05-12 Thread rapton

Hello,

I have a data set with many variables, and often I want to run a given
function, like summary() or cor() or lmer() etc., on many combinations of one
or more of these variables.  For every combination of variables I
want to analyze, I have been writing out the code by hand, but given that I
want to run many different functions over dozens and dozens of variable
combinations, it is taking a lot of time and making for very inelegant code.
There *has* to be a better way!  I have tried looking through numerous
message boards, but everything I've tried has failed.

It seems like loops would solve this problem nicely.
(1) Create list of variables of interest
(2) Iterate through the list, running a given function on each variable

I have a data matrix which I have creatively called "data".  It has
variables named "focus" and "productive".

If I run the function summary(), for instance, it works fine:
summary(data$focus)
summary(data$productive)  

Both of these work.

If I try to use a loop like:

factors <- c("data$focus", "data$productive")
for(i in 1:2){
  summary(get(factors[i]))
}

It gives the following errors:
Error in get(factors[i]) : variable "data$focus" was not found
Error in summary(get(factors[i])) : 
  error in evaluating the argument 'object' in selecting a method for
function 'summary'

But data$focus *does* exist!  I could run summary(data$focus) and it works
perfectly.  

What am I doing wrong?

Even if I get this working, is there a better way to do this, especially if
I have dozens of variables to analyze?

Any ideas would be greatly appreciated!
-- 
View this message in context: 
http://www.nabble.com/Using-loops-to-run-functions-over-a-list-of-variables-tp23505399p23505399.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rbind help

2009-05-12 Thread James W. MacDonald

Hi Mike,

MikSmith wrote:

Hi

I have a dataset of orientation and elevation in a dataframe which I am
plotting in circular. Orientation is on the range 0-360, however I need to
reduce this to 0-180 as the values over 180 are not necessarily correct. I
can do

striae$orientation[striae$orientation > 180] - 180

to extract these values, but I then want to combine them back in to the
original dataframe (with the elevation values). I assume rbind is the thing
to use and have tried something like:

test <- rbind((striae$orientation[striae$orientation > 180] - 180),
(striae$orientation[striae$orientation <= 180]))


I would do the conversion in place:

striae$orientation <- ifelse(striae$orientation > 180,
striae$orientation - 180, striae$orientation)


Best,

Jim



Not being a regular R user I suspect the syntax is easy when you know it!!
Any help much appreciated.

Best wishes

mike


--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pair matching

2009-05-12 Thread Thomas S. Dye
Given two numeric vectors of possibly unequal length, I'd like to pair  
each element of the shorter vector with an element of the longer  
vector such that the sum of squared differences between the pairs is  
minimized.  Can someone point me to an R function or an algorithm for  
accomplishing this?

All the best,
Tom

Thomas S. Dye, Ph.D.
T. S. Dye  Colleagues, Archaeologists, Inc.
Phone: (808) 529-0866 Fax: (808) 529-0884
http://www.tsdye.com



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multiple plot margins

2009-05-12 Thread Andre Nathan
Hello

I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only
graphs in the first column have titles for the y axis, and only the ones
in the last row have titles for the x axis.

I'd like all plots to be of the same size, and I'm trying to keep them
as near each other as possible, but I'm having the following problem.

If I make a single call to par(mar = ...), to leave room on the left and
bottom for the axes titles, a lot of space will be wasted because not
all graphs need titles; however, if I make one call of par(mar = ...)
per plot, to have finer control of the margins, the first column and
last row plots will be smaller than the rest, because the titles use up
some of their space.

I thought that setting large enough values for oma would do what I
want, but it doesn't appear to work if mar is too small.

To illustrate better what I'm trying to do:

  l +-+ +-+ +-+
  a | | | | | |
  b | | | | | |
  e | | | | | |
  l +-+ +-+ +-+
  
  l +-+ +-+ +-+
  a | | | | | |
  b | | | | | |
  e | | | | | |
  l +-+ +-+ +-+
 label   label   label

where the margins between each plot should be narrow.

Should I just plot the graphs without axis titles and then use text() to
manually position them?

Thanks in advance,
Andre
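
One sketch of the outer-margin approach (shared titles go in the outer margin
via mtext, so every panel keeps the same plotting size):

par(mfrow = c(2, 3), mar = c(2, 2, 1, 1), oma = c(4, 4, 0, 0))
for (i in 1:6) plot(rnorm(20), rnorm(20), xlab = "", ylab = "")
mtext("label", side = 1, outer = TRUE, line = 2)   # shared x-axis title
mtext("label", side = 2, outer = TRUE, line = 2)   # shared y-axis title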

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using loops to run functions over a list of variables

2009-05-12 Thread Jun Shen
I don't think get(factors[i]) will work. The problem is that get only sees a
character string "data$focus" instead of actually extracting focus from data.
In your case, isn't lapply (or sapply) good enough?

sapply (data, summary)

try ?lapply for details

Jun
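
A sketch of the name-based loop, with a small made-up data frame standing in
for the poster's data:

dat  <- data.frame(focus = rnorm(10), productive = rnorm(10))
vars <- c("focus", "productive")             # column names, not "data$focus" strings
for (nm in vars) print(summary(dat[[nm]]))   # dat[[nm]] extracts the column by name
sapply(dat[vars], summary)                   # the *apply equivalent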
On Tue, May 12, 2009 at 10:55 AM, rapton mattc...@gmail.com wrote:


 Hello,

 I have a data set with many variables, and often I want to run a given
 function, like summary() or cor() or lmer() etc. on many combinations of
 one
 or more than one of these variables.  For every combination of variables I
 want to analyze I have been writing out the code by hand, but given that I
 want to run many different functions over dozens and dozens of variables
 combinations it is taking a lot of time and making for very inelegent code.
 There *has* to be a better way!  I have tried looking through numerous
 message boards but everything I've tried has failed.

 It seems like loops would solve this problem nicely.
 (1) Create list of variables of interest
 (2) Iterate through the list, running a given function on each variable

 I have a data matrix which I have creatively called data.  It has
 variables named focus and productive.

 If I run the function summary(), for instance, it works fine:
 summary(data$focus)
 summary(data$productive)

 Both of these work.

 If I try to use a loop like:

 factors - c(data$focus, data$productive)
 for(i in 1:2){
 summary(get(factors[i]))
 }

 It given the following errors:
 Error in get(factors[i]) : variable data$focus was not found
 Error in summary(get(factors[i])) :
  error in evaluating the argument 'object' in selecting a method for
 function 'summary'

 But data$focus *does* exist!  I could run summary(data$focus) and it works
 perfectly.

 What am I doing wrong?

 Even if I get this working, is there a better way to do this, especially if
 I have dozens of variables to analyze?

 Any ideas would be greatly appreciated!
 --
 View this message in context:
 http://www.nabble.com/Using-loops-to-run-functions-over-a-list-of-variables-tp23505399p23505399.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jun Shen PhD
PK/PD Scientist
BioPharma Services
Millipore Corporation
15 Research Park Dr.
St Charles, MO 63304
Direct: 636-720-1589

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using loops to run functions over a list of variables

2009-05-12 Thread Jun Shen
Forgot one thing, make sure your data is a list or data frame.

On Tue, May 12, 2009 at 12:19 PM, Jun Shen jun.shen...@gmail.com wrote:

 I don't think get(factor[i]) will work. The problem is get only sees a
 character string data$focus instead of doing extracting focus from data.
 In your case isn't lapply (or sapply) good enough?

 sapply (data, summary)

 try ?lapply for details

 Jun

 On Tue, May 12, 2009 at 10:55 AM, rapton mattc...@gmail.com wrote:


 Hello,

 I have a data set with many variables, and often I want to run a given
 function, like summary() or cor() or lmer() etc. on many combinations of
 one
 or more than one of these variables.  For every combination of variables I
 want to analyze I have been writing out the code by hand, but given that I
 want to run many different functions over dozens and dozens of variables
 combinations it is taking a lot of time and making for very inelegent
 code.
 There *has* to be a better way!  I have tried looking through numerous
 message boards but everything I've tried has failed.

 It seems like loops would solve this problem nicely.
 (1) Create list of variables of interest
 (2) Iterate through the list, running a given function on each variable

 I have a data matrix which I have creatively called data.  It has
 variables named focus and productive.

 If I run the function summary(), for instance, it works fine:
 summary(data$focus)
 summary(data$productive)

 Both of these work.

 If I try to use a loop like:

 factors - c(data$focus, data$productive)
 for(i in 1:2){
 summary(get(factors[i]))
 }

 It given the following errors:
 Error in get(factors[i]) : variable data$focus was not found
 Error in summary(get(factors[i])) :
  error in evaluating the argument 'object' in selecting a method for
 function 'summary'

 But data$focus *does* exist!  I could run summary(data$focus) and it works
 perfectly.

 What am I doing wrong?

 Even if I get this working, is there a better way to do this, especially
 if
 I have dozens of variables to analyze?

 Any ideas would be greatly appreciated!
 --
 View this message in context:
 http://www.nabble.com/Using-loops-to-run-functions-over-a-list-of-variables-tp23505399p23505399.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jun Shen PhD
 PK/PD Scientist
 BioPharma Services
 Millipore Corporation
 15 Research Park Dr.
 St Charles, MO 63304
 Direct: 636-720-1589




-- 
Jun Shen PhD
PK/PD Scientist
BioPharma Services
Millipore Corporation
15 Research Park Dr.
St Charles, MO 63304
Direct: 636-720-1589

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] FW: neural network not using all observations

2009-05-12 Thread jude.ryan
As a follow-up to my email below:

 

The input data frame to nnet() has dimensions:

 

 dim(coreaff.trn.nn)

[1] 5088    8

 

And the predictions from the neural network (35 records are dropped -
see email below for more details) have dimensions:

 

 pred <- predict(coreaff.nn1)

 dim(pred)

[1] 5053    1

 

So, the following line of R code does not work as the dimensions are
different.

 

 sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

Error: dims [product 5053] do not match the length of object [5088]

In addition: Warning message:

In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) :

  longer object length is not a multiple of shorter object length

 

While:

 

 dim(pred)

[1] 5053    1

 

 tail(pred)

  [,1]

5083  664551.9

5084  552170.6

5085  684834.3

5086 1215282.5

5087 1116302.2

5088  658112.1

 

shows that the last row of pred is 5,088, which corresponds to the
dimension of coreaff.trn.nn, the input data frame to the neural network.

 

I tried using row() to identify the 35 records that were dropped (or not
scored). The code I tried was:

 

 coreaff.trn.nn.subset <- coreaff.trn.nn[row(coreaff.trn.nn) ==
row(pred), ]

Error in row(coreaff.trn.nn) == row(pred) : non-conformable arrays

 

But I am not doing something right. pred has dimension = 1 and row()
requires an object of dimension = 2. So using cbind() I bound a column
of sequence numbers to pred to make the dimension = 2 but that did not
help.

 

Basically, if I can identify the 5,053 records that the neural network
made predictions for, in the data frame of 5,088 records
(coreaff.trn.nn) used by the neural network, then I can compare the
predictions to the actual values, and compare the predictive power of
the neural network to the predictive power of the linear regression
model.

 

Any idea how I can extract the 5,053 records that the neural network
made predictions for from the data frame (5,088 records) used to train
the neural network? 

 

Thanks in advance,

 

Jude
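
A hedged sketch: if the 35 rows were dropped because of missing values,
complete.cases() recovers the rows the model actually used (the object names
are the poster's; the approach itself is an assumption, not a confirmed fix):

keep <- complete.cases(coreaff.trn.nn)
sum(!keep)                                             # should be 35 if NAs are the cause
used <- coreaff.trn.nn[keep, ]
sum((used$hh.iast.y - predict(coreaff.nn1, used))^2)   # SSE on the scored rows
## predict() also keeps row names, so matching on them is an alternative:
used2 <- coreaff.trn.nn[rownames(predict(coreaff.nn1)), ]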

 



From: Ryan, Jude 
Sent: Tuesday, May 12, 2009 11:11 AM
To: 'r-help@r-project.org'
Cc: juderya...@yahoo.com
Subject: neural network not using all observations



I am exploring neural networks (adding non-linearities) to see if I can
get more predictive power than a linear regression model I built. I am
using the function nnet and following the example of Venables and
Ripley, in Modern Applied Statistics with S, on pages 246 to 249. I have
standardized variables (z-scores) such as assets, age and tenure. I have
other variables that are binary (0 or 1). In max_acc_ownr_nwrth_n_med
for example, the variable has a value of 1 if the client's net worth is
above the median net worth and a value of 0 otherwise. These are derived
variables I created and variables that the regression algorithm has found
to be predictive. A regression on the same variables shown below gives
me an R-Square of about 0.12. I am trying to increase the predictive
power of this regression model with a neural network being careful to
avoid overfitting.

Similar to Venables and Ripley, I used the following code:



 library(nnet)

 dim(coreaff.trn.nn)

[1] 5088    8

 head(coreaff.trn.nn)

  hh.iast.y WC_Total_Assets all_assets_per_hh age  tenure
max_acc_ownr_liq_asts_n_med max_acc_ownr_nwrth_n_med
max_acc_ownr_ann_incm_n_med

1   3059448  -0.4692186-0.4173532 -0.06599001 -1.04747935
01   0

2   4899746   3.4854334 4.064 -0.06599001 -0.72540200
11   1

3727333  -0.2677357-0.4177944 -0.30136473 -0.40332465
11   1

4443138  -0.5295170-0.6999646 -0.1825 -1.04747935
00   0

5484253  -0.6112205-0.7306664  0.64013414  0.07979137
10   0

6799054   0.6580506 1.1763114  0.24784295  0.07979137
01   1

 coreaff.nn1 <- nnet(hh.iast.y ~ WC_Total_Assets + all_assets_per_hh +
+    age + tenure + max_acc_ownr_liq_asts_n_med + max_acc_ownr_nwrth_n_med +
+    max_acc_ownr_ann_incm_n_med, coreaff.trn.nn, size = 2, decay = 1e-3,
+    linout = T, skip = T, maxit = 1000, Hess = T)

# weights:  26

initial  value 12893652845419998.00 

iter  10 value 6352515847944854.00

final  value 6287104424549762.00 

converged

 summary(coreaff.nn1)

a 7-2-1 network with 26 weights

options were - skip-layer connections  linear output units  decay=0.001

 b-h1 i1-h1 i2-h1 i3-h1 i4-h1 i5-h1
i6-h1 i7-h1 

 -21604.84   -2675.80   -5001.90   -1240.16-335.44  -12462.51
-13293.80   -9032.34 

 b-h2 i1-h2 i2-h2 i3-h2 i4-h2 i5-h2
i6-h2 i7-h2 

 210841.52   47296.92   58100.43  -13819.10   

Re: [R] Looking for a quick way to combine rows in a matrix

2009-05-12 Thread Johannes Hüsing

jim holtman schrieb:

Try this:

  

key <- rownames(a)
key[key == "AT"] <- "TA"
do.call(rbind, by(a, key, colSums))


something like

paste(sort(strsplit(key, split = "")[[1]]), collapse = "")

might be more general.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two-way Anova

2009-05-12 Thread Mike Lawrence
Using traditional ANOVA, you'd have to drop either cases or time
points with missing data. Using linear mixed effects analysis, you'd
be able to use all the data. LME also has the benefit of *not*
assuming sphericity, which is good for data like yours (many measures
across few cases) where the traditional ANOVA sphericity assumption is
unlikely to hold.

Your dependent variable, % viability, suggests that there's some more raw
representation of the data that may be better to look at. For example,
if % viability is derived from, say, the success/failure rate of 10
observations per sample/timepoint, you might want to take a look at the
lme4 package (as suggested in a previous post:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q3/001160.html )
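
A sketch of the mixed-model route, with hypothetical column names (viability,
day, scale, sample) and simulated data standing in for the real measurements:

library(lme4)
set.seed(2)
dat <- expand.grid(sample = factor(1:10), day = 1:8)
dat$scale <- factor(ifelse(as.integer(dat$sample) <= 5, "10L", "0.5L"))
dat$viability <- 100 - 3 * dat$day + rnorm(nrow(dat), sd = 4)
dat <- dat[runif(nrow(dat)) > 0.2, ]   # some sample/day combinations unmeasured
fit <- lmer(viability ~ scale * day + (1 | sample), data = dat)
summary(fit)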

On Tue, May 12, 2009 at 6:03 AM, Alan O'Loughlin olo...@wyeth.com wrote:
 Hello,

 I'm trying to do a comparsion on a large scale say 10L bottle of liquid and a 
 small scale bottle of liquid 0.5L, I have 5 different samples from each and 
 they are measured over the space of 8 days as % viability and the % viability 
 decreases over time. However not all 10 samples got measured every day. How 
 would I do a two-way anova on this in R?

 Thanks for any help.

 Regards,
 Al

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pair matching

2009-05-12 Thread Liaw, Andy
If the matching need not be one-to-one, then you can just compute the
Euclidean distances between the two vectors, and then in each row (or
column, whichever corresponds to the shorter vector) find the
smallest.  This should be fairly easy to do.

Andy 
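
A sketch of that nearest-neighbour matching (not one-to-one) with made-up
vectors; for a one-to-one assignment minimising the total squared difference,
solve_LSAP() in the clue package is one option:

short <- c(1.2, 3.5, 7.1)
long  <- c(0.9, 2.0, 3.4, 6.8, 9.0)
d <- outer(short, long, FUN = function(a, b) (a - b)^2)  # squared differences
idx <- apply(d, 1, which.min)     # closest element of 'long' for each 'short'
cbind(short, match = long[idx])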

From: Thomas S. Dye
 
 Given two numeric vectors of possibly unequal length, I'd 
 like to pair  
 each element of the shorter vector with an element of the longer  
 vector such that the sum of squared differences between the pairs is  
 minimized.  Can someone point me to an R function or an 
 algorithm for  
 accomplishing this?
 
 All the best,
 Tom
 
 Thomas S. Dye, Ph.D.
 T. S. Dye  Colleagues, Archaeologists, Inc.
 Phone: (808) 529-0866 Fax: (808) 529-0884
 http://www.tsdye.com
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Notice:  This e-mail message, together with any attachme...{{dropped:12}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: From two colors to 01 sequences

2009-05-12 Thread Paul Smith
On Tue, May 12, 2009 at 2:35 PM, Zeljko Vrba zv...@ifi.uio.no wrote:
 Aha, so you DON'T have only red and green rectangles in the picture, there is
 also white in between, and some pale delimiter lines.  Nevertheless, both
 things ease the job slightly, and the approach I described should work.  You
 can even script it by using ImageMagick and ghostscript.

 If all pictures are of the same size and layout, you could even manually
 (or programmatically) construct a mask of rectangle midpoints, so you
 don't have to detect transitions between red/green and white, which will
 enormously simplify the job. [Note that the distance between rectangle
 centers is uniform in both directions.]

I thank again both Zeljko and Baptiste.

Meanwhile, I found out a more or less simple way of solving the problem:

1. edited the pdf file with Inkscape, inserting 0 or 1 over the rectangles;

2. saved the pdf file;

3. opened it with Acrobat Reader, and selected, as text, the 0 and 1 inserted;

4. pasted the 0 and 1 into a text editor.

And that was done!

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using loops to run functions over a list of variables

2009-05-12 Thread Matt Killingsworth
Jun,

sapply() does the trick!  Thank you Jun!  I really appreciate your help.

-Matt

On Tue, May 12, 2009 at 1:20 PM, Jun Shen jun.shen...@gmail.com wrote:

 Forgot one thing, make sure your data is a list or data frame.


 On Tue, May 12, 2009 at 12:19 PM, Jun Shen jun.shen...@gmail.com wrote:

 I don't think get(factor[i]) will work. The problem is get only sees a
 character string data$focus instead of doing extracting focus from data.
 In your case isn't lapply (or sapply) good enough?

 sapply (data, summary)

 try ?lapply for details

 Jun

 On Tue, May 12, 2009 at 10:55 AM, rapton mattc...@gmail.com wrote:


 Hello,

 I have a data set with many variables, and often I want to run a given
 function, like summary() or cor() or lmer() etc. on many combinations of
 one
 or more than one of these variables.  For every combination of variables
 I
 want to analyze I have been writing out the code by hand, but given that
 I
 want to run many different functions over dozens and dozens of variables
 combinations it is taking a lot of time and making for very inelegent
 code.
 There *has* to be a better way!  I have tried looking through numerous
 message boards but everything I've tried has failed.

 It seems like loops would solve this problem nicely.
 (1) Create list of variables of interest
 (2) Iterate through the list, running a given function on each variable

 I have a data matrix which I have creatively called data.  It has
 variables named focus and productive.

 If I run the function summary(), for instance, it works fine:
 summary(data$focus)
 summary(data$productive)

 Both of these work.

 If I try to use a loop like:

 factors - c(data$focus, data$productive)
 for(i in 1:2){
 summary(get(factors[i]))
 }

 It given the following errors:
 Error in get(factors[i]) : variable data$focus was not found
 Error in summary(get(factors[i])) :
  error in evaluating the argument 'object' in selecting a method for
 function 'summary'

 But data$focus *does* exist!  I could run summary(data$focus) and it
 works
 perfectly.

 What am I doing wrong?

 Even if I get this working, is there a better way to do this, especially
 if
 I have dozens of variables to analyze?

 Any ideas would be greatly appreciated!
 --
 View this message in context:
 http://www.nabble.com/Using-loops-to-run-functions-over-a-list-of-variables-tp23505399p23505399.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jun Shen PhD
 PK/PD Scientist
 BioPharma Services
 Millipore Corporation
 15 Research Park Dr.
 St Charles, MO 63304
 Direct: 636-720-1589




 --
 Jun Shen PhD
 PK/PD Scientist
 BioPharma Services
 Millipore Corporation
 15 Research Park Dr.
 St Charles, MO 63304
 Direct: 636-720-1589



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] selecting points on 3D scatterplots

2009-05-12 Thread Abby Drake
Hello Everyone,

I am new to R and need some help.

I have a matrix of x,y,z coordinates that I would like to
interactively plot in 3D and then using the cursor select points on
the plot and have the coordinates sent to a matrix. I am using the rgl
package to plot the data at the moment because it allows me to rotate
and zoom. I also tried cloud and scatterplot3D.

I am looking for a function like 'locator' which is used to select
points on 2D scatterplots.

Thanks in advance!
Abby

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave: Howto write real TeX formula in plot

2009-05-12 Thread Jonas Stein
Thank you, Baptiste and Charlie.
I found some examples which look great on:
http://www.texample.net/tikz/examples/

 
 Perhaps try the pgfSweave package on r-forge?
 http://r-forge.r-project.org/projects/pgfsweave/ 
 [..]
 accomplished using the java utility eps2pgf and requires the PGF/TiKZ
 package for LaTeX:
 http://sourceforge.net/projects/pgf/

I have installed pgf with apt-get install pgf. That worked fine, but
installing pgfSweave fails.

I did this:

,[ in R interpreter (running as root) ]
| 
| install.packages('pgfSweave',,'http://www.rforge.net/') 
| 
| Warning in install.packages("pgfSweave", , "http://www.rforge.net/") :
|   argument 'lib' is missing: using '/usr/local/lib/R/site-library'
| trying URL 'http://www.rforge.net/src/contrib/pgfSweave_0.7.1.tar.gz'
| Content type 'application/x-tar' length 1017992 bytes (994 Kb)
| opened URL
| ==
| downloaded 994 Kb
| 
| * Installing *source* package 'pgfSweave' ...
| ** R
| ** exec
| ** inst
| ** preparing package for lazy loading
| Loading required package: stashR
| Warning in library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc 
= lib.loc) :
|   there is no package called 'stashR'
| Error: package 'stashR' could not be loaded
| Execution halted
| ERROR: lazy loading failed for package 'pgfSweave'
| ** Removing '/usr/local/lib/R/site-library/pgfSweave'
| 
| The downloaded packages are in
| /tmp/Rtmp6huu9t/downloaded_packages
| Warning messages:
| 1: In install.packages("pgfSweave", , "http://www.rforge.net/") :
|   dependencies ‘stashR’, ‘filehash’, ‘digest’, ‘cacheSweave’ are not available
| 2: In install.packages("pgfSweave", , "http://www.rforge.net/") :
|   installation of package 'pgfSweave' had non-zero exit status
|
`

What will I have to do now?

And could someone give me an example of how to write a formula in a plot?
Something like plot(..., title = $\sigma^2 + \int x$)?

Thank you very much,

-- 
Jonas Stein n...@jonasstein.de
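
For formulas in base graphics, plotmath needs no LaTeX round-trip at all; a
small sketch (the commented install.packages() lines are only a guess at
pulling in the missing dependencies named in the error, not a tested fix):

plot(1:10, main = expression(sigma^2 + integral(x * dx, 0, 1)))
# install.packages(c("stashR", "filehash", "digest", "cacheSweave"))
# install.packages("pgfSweave", repos = "http://www.rforge.net/")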

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting points on 3D scatterplots

2009-05-12 Thread Duncan Murdoch

On 5/12/2009 3:08 PM, Abby Drake wrote:

Hello Everyone,

I am new to R and need some help.

I have a matrix of x,y,z coordinates that I would like to
interactively plot in 3D and then using the cursor select points on
the plot and have the coordinates sent to a matrix. I am using the rgl
package to plot the data at the moment because it allows me to rotate
and zoom. I also tried cloud and scatterplot3D.

I am looking for a function like 'locator' which is used to select
points on 2D scatterplots.


There's no locator3d, but select3d is somewhat similar.

You could write a locator3d function using the rgl.setMouseCallbacks, 
rgl.user2window and rgl.window2user functions, if you can figure out how 
to indicate depth with a mouse click.  identify3d might be a little easier.


Duncan Murdoch
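
A sketch of the select3d() route Duncan mentions (drag a rectangle over the
points once the plot appears; the data here are made up):

library(rgl)
xyz <- matrix(rnorm(300), ncol = 3)
plot3d(xyz)
f <- select3d()                                  # returns a test function
picked <- xyz[f(xyz[, 1], xyz[, 2], xyz[, 3]), , drop = FALSE]
picked                                           # coordinates of the selected points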

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I extract the scoring equations for neural networks and support vector machines?

2009-05-12 Thread jude.ryan
Sorry for these multiple postings.

I solved the problem using na.omit() to drop records with missing values
for the time being. I will worry about imputation, etc. later.

 

I calculated the sum of squared errors for 3 models, linear regression,
neural networks, and support vector machines. This is the first run.
Without doing any parameter tuning on the SVM or playing around with the
number of nodes in the hidden layer of the neural network, I found that
the SVM had the lowest sum of squared errors, followed by neural
networks, with regression being last. This probably indicates that the
data has non-linear patterns.

 

I have a couple of questions.

1) Besides sum of squared errors, are there any other metrics that can
be used to compare these 3 models? AIC, BIC, etc, can be used for
regressions, but I am not sure whether they can be used for SVM's and
neural networks.

2) Is there any easy way to extract the scoring equations for SVM's and
neural networks? Using the R objects I can always score new data
manually but the model will need to be implemented in a production
environment. When the model gets implemented in production (could be the
mainframe) I will need equations that can be coded in any language
(COBOL or SAS on the mainframe). Also, getting the scoring equations for
all 3 models will let me create an ensemble model where the predicted
value could be the average of the predictions from the SVM, neural
network and linear regression. If the ensemble model has the smallest
sum of squared errors this would be the model I would use.

 

I have SAS Enterprise Miner as well and can get a scoring equation for
the neural network (I don't have SVM), but the scoring code that SAS EM
generates sucks and I would much rather extract a scoring equation from
R. I am using nnet() for the neural network.

 

Thanks in advance,

 

Jude Ryan
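
On question 1, a few model-agnostic fit metrics can be computed the same way
for all three models; a sketch (y and yhat are placeholders for the observed
response and a model's predictions, not names from the poster's code):

fit_metrics <- function(y, yhat) {
  c(SSE  = sum((y - yhat)^2),
    RMSE = sqrt(mean((y - yhat)^2)),
    MAE  = mean(abs(y - yhat)),
    R2   = 1 - sum((y - yhat)^2) / sum((y - mean(y))^2))
}
# e.g. fit_metrics(observed, as.vector(predict(model, newdata)))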

 



From: Ryan, Jude 
Sent: Tuesday, May 12, 2009 1:23 PM
To: 'r-help@r-project.org'
Cc: juderya...@yahoo.com
Subject: FW: neural network not using all observations



As a follow-up to my email below:



The input data frame to nnet() has dimensions:



 dim(coreaff.trn.nn)

[1] 50888



And the predictions from the neural network (35 records are dropped -
see email below for more details) has dimensions:



 pred - predict(coreaff.nn1)

 dim(pred)

[1] 50531



So, the following line of R code does not work as the dimensions are
different.



 sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

Error: dims [product 5053] do not match the length of object [5088]

In addition: Warning message:

In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) :

  longer object length is not a multiple of shorter object length



While:



 dim(pred)

[1] 50531



 tail(pred)

  [,1]

5083  664551.9

5084  552170.6

5085  684834.3

5086 1215282.5

5087 1116302.2

5088  658112.1



shows that the last row of pred is 5,088, which corresponds to the
dimension of coreaff.trn.nn, the input data frame to the neural network.



I tried using row() to identify the 35 records that were dropped (or not
scored). The code I tried was:



 coreaff.trn.nn.subset - coreaff.trn.nn[row(coreaff.trn.nn) ==
row(pred), ]

Error in row(coreaff.trn.nn) == row(pred) : non-conformable arrays



But I am not doing something right. pred has dimension = 1 and row()
requires an object of dimension = 2. So using cbind() I bound a column
of sequence numbers to pred to make the dimension = 2 but that did not
help.



Basically, if I can identify the 5,053 records that the neural network
made predictions for, in the data frame of 5,088 records
(coreaff.trn.nn) used by the neural network, then I can compare the
predictions to the actual values, and compare the predictive power of
the neural network to the predictive power of the linear regression
model.



Any idea how I can extract the 5,053 records that the neural network
made predictions for from the data frame (5,088 records) used to train
the neural network? 



Thanks in advance,



Jude





From: Ryan, Jude 
Sent: Tuesday, May 12, 2009 11:11 AM
To: 'r-help@r-project.org'
Cc: juderya...@yahoo.com
Subject: neural network not using all observations



I am exploring neural networks (adding non-linearities) to see if I can
get more predictive power than a linear regression model I built. I am
using the function nnet and following the example of Venables and
Ripley, in Modern Applied Statistics with S, on pages 246 to 249. I have
standardized variables (z-scores) such as assets, age and tenure. I have
other variables that are binary (0 or 1). In max_acc_ownr_nwrth_n_med
for example, the variable has a value of 1 if the client's net worth is
above the median net worth and a value of 0 otherwise. These are derived
variable I created and variables that the regression algorithm has found
to be predictive. A regression on the same variables shown below gives
me an R-Square of 

[R] Help needed to compile R with shared library.

2009-05-12 Thread Tena Sakai
Hi,

I cannot compile R with a shared library.
I am using Red Hat Linux on Dell hardware.

Here's what I am doing:

I set the PICFLAGS in the config.site file:
 CPICFLAGS=-fPIC
 FPICFLAGS=-fPIC
and issue:
 ./configure --enable-R-shlib --prefix=$HOME/newR
 make

configure finishes with no complaints, the last
informative line being: 
 Options enabled:   shared R library, shared BLAS, R profiling, Java

But make is upset:
  /usr/bin/ld: CConverters.o: relocation R_X86_64_32S against `a local symbol' 
can not be used when making a shared object; recompile with -fPIC
  CConverters.o: could not read symbols: Bad value
  collect2: ld returned 1 exit status

It says to recompile with -fPIC.  But I am already
doing so, am I not?  What am I doing wrong here?
Any help would be appreciated.

Regards,

Tena Sakai
tsa...@gallo.ucsf.edu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

