Re: [R] MLE maximum number of parameters

2006-06-19 Thread Albyn Jones
I regularly optimize functions of over 1000 parameters for posterior
mode computations using a variant of Newton-Raphson.  I have some
favorable conditions: the prior is pretty good, the posterior is
smooth, and I can compute the gradient and Hessian.

albyn
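
A minimal sketch of the kind of computation described above: plain
Newton-Raphson (not the specific variant mentioned) with an analytic
gradient and Hessian. The model, the data, and the names neg_log_post,
gradient and hessian are all assumptions made up for illustration.

```r
set.seed(42)
p <- 500                      # number of parameters (made up for this sketch)
y <- rpois(p, lambda = 3)     # hypothetical counts, one per parameter

# Negative log-posterior (up to a constant): Poisson likelihood with
# log-mean theta[i], and an independent N(0, 1) prior on each theta[i].
neg_log_post <- function(theta) sum(exp(theta) - y * theta) + 0.5 * sum(theta^2)
gradient     <- function(theta) exp(theta) - y + theta
hessian      <- function(theta) diag(exp(theta) + 1)

theta <- rep(0, p)
for (iter in 1:50) {
  g <- gradient(theta)
  if (max(abs(g)) < 1e-8) break               # stop once the gradient vanishes
  theta <- theta - solve(hessian(theta), g)   # Newton step: solve H d = g
}
theta_hat <- theta            # posterior mode under the assumed model
```

The Hessian here happens to be diagonal, so solve() is overkill; it is
kept so that the loop has the same shape as it would for a dense Hessian.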


On Mon, Jun 19, 2006 at 06:53:00PM +0100, Patrick Burns wrote:
> Seagulls have a very different perspective to ballparks
> than ants.  Nonetheless, there is something that can be
> said.
> 
> There are several variables in addition to the number of
> parameters that are important.  These include:
> 
> * The complexity of the likelihood
> 
> * The number of observations in the dataset
> 
> * How close to the optimum is close enough
> 
> * Your patience
> 
> The last is undoubtedly the most important of all.  It
> matters a lot whether you think a minute is a long time
> or only periods measured in weeks.
> 
> The optimization strategy can also have a big effect.  If
> you are using a derivative-based optimizer, then the number
> of parameters can have a big impact.  Typically one iteration
> in such algorithms requires p+1 function calls, where p is the
> number of parameters.  Since more iterations are generally
> required with more parameters, the speed can decrease
> rapidly as the number of parameters increases.
> 
> One strategy to deal with a large number of parameters is to
> start with something like a genetic algorithm.  Once the genetic
> algorithm has a pretty good solution, then switch to a derivative-
> based algorithm to finish.  The amount to run the initial
> algorithm before switching depends on the problem, the quality
> of the two optimizers, and probably other things.
> 
> With this switching strategy and at least a modicum of patience,
> problems with thousands of parameters may be feasible to solve.
> 
> Patrick Burns
> [EMAIL PROTECTED]
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
> 
> Federico Calboli wrote:
> 
> >Hi All,
> >
> >I would like to know, is there a *ballpark* figure for how many  
> >parameters the minimisation routines can cope with?
> >
> >I'm asking because I was asked if I knew.
> >
> >Cheers,
> >
> >Federico
> >
> >--
> >Federico C. F. Calboli
> >Department of Epidemiology and Public Health
> >Imperial College, St. Mary's Campus
> >Norfolk Place, London W2 1PG
> >
> >Tel +44 (0)20 75941602   Fax +44 (0)20 75943193
> >
> >f.calboli [.a.t] imperial.ac.uk
> >f.calboli [.a.t] gmail.com
> >
> >__
> >R-help@stat.math.ethz.ch mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
> >
> >  
> >
> 
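
The switching strategy described above can be sketched with optim() from
base R. Base R has no genetic algorithm, so this sketch swaps in simulated
annealing (method = "SANN") as the stochastic first stage, then hands its
result to BFGS. The objective function, the number of parameters, and the
iteration budget are assumptions chosen only for illustration.

```r
set.seed(1)
p <- 20                                   # number of parameters (made up)
target <- seq_len(p) / p                  # hypothetical location of the optimum

# Hypothetical smooth objective standing in for a negative log-likelihood
objective <- function(theta) {
  sum((theta - target)^2) + 0.05 * sum(1 - cos(3 * (theta - target)))
}

start <- rep(0, p)

# Stage 1: stochastic search to get into the right neighbourhood
coarse <- optim(start, objective, method = "SANN",
                control = list(maxit = 5000))

# Stage 2: derivative-based polish, started from the stage-1 solution
# (no gradient supplied, so optim() uses finite differences)
fine <- optim(coarse$par, objective, method = "BFGS")

c(start = objective(start), coarse = coarse$value, fine = fine$value)
fine$counts   # function and gradient evaluations used by stage 2
```

How long to run the first stage before switching is a tuning decision;
the maxit value here is arbitrary.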


Re: [R] Foometrics in R

2006-06-19 Thread Albyn Jones
On Mon, Jun 19, 2006 at 02:34:17AM -0700, Jan de Leeuw wrote:
> 
> One of the outcomes of useR! 2006 is that JSS is planning to publish
> a series of special volumes. They will be guest edited (I have guest
> editors already, although somewhat tentative in some cases). Each volume
> will have 5-10 issues (articles) of the usual JSS format.
> 
> Psychometrics in R
> Political Methodology in R
> Econometrics in R
> Social Science Methodology in R
> Spectroscopy/Chemometrics in R
> Ecology in R
> 
> If you have suggestions, comments, possible contributions, ideas for
> additional volumes and so on, send them to me and I'll forward them to the
> appropriate authorities.
> 


Great idea!

albyn



[R] off-topic; iraq statistics again

2006-05-22 Thread Albyn Jones
>>Gabor Grothendieck wrote:
>>
>>
>>>I came across this one:
>>>
>>>http://www.nysun.com/article/32787
>>>
>>>which says that the violent death rate in Iraq (which presumably
>>>includes violent deaths from the war) is lower than the violent
>>>death rate in major American cities.
>>>
>>>Does anyone have any insights from statistics on how to
>>>interpret this? 

I finally had time to follow up on this.
The NY Sun article compares apples and oranges, i.e., US cities to all of Iraq.

For data on Baghdad, see
 http://www.iraqbodycount.org/press/pr13.php
Iraq Body Count Press Release 13, 9th March 2006. 

 "Figures released by IBC today, updated by statistics for the year
 2005 from the main Baghdad morgue, show that the total number of
 civilians reported killed has risen year-on-year since May 1st 2003
 (the date that President Bush announced "major combat operations
 have ended"):

* 6,331 from 1st May 2003 to the first anniversary of the invasion, 
 19th March 2004 (324 days: Year 1)
* 11,312 from 20th March 2004 to 19th March 2005 (365 days: Year 2)
* 12,617 from 20th March 2005 to 1st March 2006 (346 days: Year 3)."

According to several websites, the population of Baghdad is about 5 million.

According to R, the violent death rate per 100,000 based on the 
year 3 data is

> (12617/5000000)*100000
[1] 252.34
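
The same arithmetic can be applied to all three periods. This is a sketch
that takes the figures quoted above and the population of 5 million at
face value; the annualised column is an addition, since the three periods
cover different numbers of days. Variable names are illustrative only.

```r
deaths     <- c(year1 = 6331, year2 = 11312, year3 = 12617)  # from the press release
days       <- c(year1 = 324,  year2 = 365,   year3 = 346)    # length of each period
population <- 5e6              # assumed constant, as in the calculation above

rate_per_100k <- deaths / population * 1e5   # rate over each period as reported
annualized    <- rate_per_100k * 365 / days  # scaled to a 365-day year

round(cbind(rate_per_100k, annualized), 2)
```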

For comparison, see http://www.infoplease.com/ipa/A0004902.html,
Crime Rates for Selected Large Cities, 2002: homicides per 100,000 per year.

New York, N.Y.           7.3
Los Angeles, Calif.     17.1
Chicago, Ill.           22.1
Houston, Tex.           12.5
Philadelphia, Pa.       18.9
Phoenix, Ariz.          12.6
San Diego, Calif.        3.7
Dallas, Tex.            15.8
San Antonio, Tex.        8.4
Las Vegas, Nev.         11.9
Detroit, Mich.          41.8
San Jose, Calif.         2.8
Honolulu, Hawaii         2.0
Indianapolis, Ind.      13.9
San Francisco, Calif.    8.4
etc...

regards

albyn


Re: [R] Curve fitting

2006-01-12 Thread Albyn Jones
You haven't told us how you are fitting the model; are you using
nls(), and if so with what initial values?  The models don't make
sense at x=0, due to the inclusion of the log(x) term.  Ignoring that,
you have 5 observations and 5 parameters in your second model. What is
the reason you are including both "b*log(x)" and "c*x" terms in the
model?  

regards

albyn
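
A small sketch of the two points raised above, using the data from the
original post. The function names model1 and model2 and the trial
parameter values are assumptions for illustration; no fit is attempted.

```r
x <- c(0, 0.4, 0.8, 1.2, 1.6)
y <- c(0.81954, 0.64592, 0.51247, 0.42831, 0.35371)

# The two models from the question (cc plays the role of "c")
model1 <- function(x, yo, K, a, b)     yo + K / (1 + exp(-(a + b * log(x))))
model2 <- function(x, yo, K, a, b, cc) yo + K / (1 + exp(-(a + b * log(x) + cc * x)))

log(x[1])                      # -Inf: the log(x) term is degenerate at x = 0
model1(0, 0.2, 0.6, 0, 0)      # NaN when b = 0, because 0 * -Inf is NaN

# Residual degrees of freedom: observations minus parameters
resid_df <- c(model1 = length(y) - 4, model2 = length(y) - 5)
resid_df                       # model2 leaves nothing to estimate error with

# With cc = 0 the two models agree (arbitrary trial values for the rest)
m1 <- model1(x, yo = 0.2, K = 0.6, a = 0, b = -1)
m2 <- model2(x, yo = 0.2, K = 0.6, a = 0, b = -1, cc = 0)
all.equal(m1, m2)
```

With zero residual degrees of freedom, an optimizer can chase the data
exactly, so a worse-looking curve from the larger model points to the
fitting procedure or starting values rather than to the mathematics.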
---
On Thu, Jan 12, 2006 at 07:11:12PM +0100, [EMAIL PROTECTED] wrote:
> Hi!
> 
> I have a problem of curve fitting.
> 
> I use the following data :
> 
>  - vector of predictor data : 
> 0
> 0.4
> 0.8
> 1.2
> 1.6
> 
> - vector of response data : 
> 0.81954
> 0.64592
> 0.51247
> 0.42831
> 0.35371
> 
>  I perform parametric fits using custom equations
> 
> when I use this equation:
>     y = yo + K*(1/(1+exp(-(a+b*log(x)))))
> the fitting result is OK,
> but when I use this more general equation:
>     y = yo + K*(1/(1+exp(-(a+b*log(x)+c*x))))
> then I get an aberrant curve!
> 
> I don't understand that... The second fitting should be at least as good 
> as the first one because when taking c=0, both equations are identical!
> 
> There is a mathematical phenomenon here that I don't understand! Could 
> someone help me?
> 
> Thanks a lot in advance!
> 
> Nadège 
> 
>   [[alternative HTML version deleted]]
> 



Re: [R] Console

2005-08-18 Thread Albyn Jones
Quoting Daniela Salvini <[EMAIL PROTECTED]>:

> I am at my first steps with R... and I already notice that the 
> console has a quite limited number of lines. Can anyone tell me how 
> to visualise all the information, which is actually present? I only 
> see the last part of the output, which obviously exceeds the maximum 
> number of rows in the console.
> Thank you very much for your help!
> Daniela
>

"visualize" suggests plotting.

Do you mean "how do I look at the whole dataset?"  You could print a few lines
at a time, say X[1:25,].  With bigger datasets I usually look at the file in a
text editor like emacs...

albyn
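
A short sketch of both suggestions, assuming a made-up data frame X. The
temporary file name is illustrative; in practice any path will do.

```r
X <- data.frame(id = 1:200, value = sqrt(1:200))   # hypothetical dataset

# Look at the data a block of rows at a time
first25 <- X[1:25, ]
print(first25)
print(tail(X, 5))              # the last few rows

# Or write the full printed output to a text file and open it in an editor
out_file <- tempfile(fileext = ".txt")
writeLines(capture.output(print(X)), out_file)
out_file                       # path of the file to open
```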



Re: [R] How about a mascot for R?

2004-12-02 Thread Albyn Jones
On Thu, Dec 02, 2004 at 06:18:39PM +0100, Henrik Bengtsson wrote:
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Damian 
> > Betebenner
> > Sent: Thursday, December 02, 2004 6:07 PM
> > To: [EMAIL PROTECTED]
> > Subject: [R] How about a mascot for R?
> > 
> > 
> > Excellent replies,
> > 
> > So a couple of questions about preferences for the mascot:
> > 
> > 1. Does the mascot need to have a name that starts with R? Is 
> > that usually the way it works?
> > 
> > So far the possibilities put forward are: Ray, Ram, Inch 
> > Worm, Rhinoceros
> 
> R.oo (http://www.maths.lth.se/help/R/R.oo/), ooops Roo, which is Australian
> slang for Kangaroo. http://images.google.com/images?q=roo
> 
> Cheers
> 
> Henrik Bengtsson
>  

how about a Shark?  that would be a bit more subtle:-)

albyn



Re: [R] Calculate Closest 5 Cases?

2004-02-13 Thread Albyn Jones
14.12   12.14  
> >  14.12
> > 4.02
> > 11780034  4.25  4.28  5.30  5.33  6.38  14.88  15.96  18.08  14.85   7.48   3.20
> > 11790016  4.40  4.40  5.54  4.40  4.40  10.93  17.67  19.72  13.20  12.13   4.33
> > 12660338  6.60  7.54  5.66  8.49 10.38  11.31  16.06  12.26   8.49   8.49   4.73
> > 12660644  5.51  3.14  3.95  7.09  7.11  14.98  15.72  18.90   9.44   5.50   8.65
> > 12661667  5.44  4.50  5.44  4.50  5.44  12.69  13.63  11.81   9.07  13.68  13.79
> > END DATA.
> >
> > *Output should be:.
> > *.
> > *   ID1 CLOSEID1CLOSEID2CLOSEID3CLOSEID4
> > CLOSEID5.
> > *   ID2 CLOSEID1CLOSEID2CLOSEID3CLOSEID4
> > CLOSEID5.
> > *   ID3 CLOSEID1CLOSEID2CLOSEID3CLOSEID4
> > CLOSEID5.
> > *   ID4 CLOSEID1CLOSEID2CLOSEID3CLOSEID4
> > CLOSEID5.
> > *   ID5 CLOSEID1CLOSEID2CLOSEID3CLOSEID4
> > CLOSEID5.
> >
> >
> 
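
One possible way to produce output of the shape requested above is to
compute all pairwise distances and keep the five smallest per case. This
is only a sketch: it assumes Euclidean distance is the intended measure,
and it uses simulated scores with made-up IDs rather than the quoted data.

```r
set.seed(1)
n_cases <- 10
scores <- matrix(rnorm(n_cases * 3), nrow = n_cases,
                 dimnames = list(paste0("ID", seq_len(n_cases)), NULL))

d <- as.matrix(dist(scores))   # Euclidean distances between all pairs of cases
diag(d) <- Inf                 # a case should not be its own neighbour

# For each case, the IDs of the five nearest other cases, nearest first
closest5 <- t(apply(d, 1, function(row) rownames(d)[order(row)[1:5]]))
colnames(closest5) <- paste0("CLOSEID", 1:5)
closest5
```

The full distance matrix grows with the square of the number of cases, so
for very large files a nearest-neighbour search that avoids it may be
needed.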

-- 

 ...armaments were not created chiefly for the protection of the 
nations but for their enslavement.  

The new political gospel: public office is private graft.

[Mark Twain]

http://www.reed.edu/~jones    Albyn Jones [EMAIL PROTECTED]
Reed College, Portland OR 97202 (503)-771-1112 x7418



Re: [R] correlation and causality examples

2003-11-15 Thread Albyn Jones
On Sat, Nov 15, 2003 at 03:49:29PM +0100, Jean lobry wrote:
> Dear All,
> 
> I'm looking for examples showing that correlation does not imply
> causality, the targeted audience consists of undergraduate students
> (their first year at the university but in the BioMathStat track).
> All practicals are under R.
> 

The dataset below contains data by state, including population in 
thousands, area in square miles, percent urban population, percent 
below poverty line, whether there are gun registration laws or not, 
and the number of homicides. The socioeconomic data are from 1990/91,
from the census bureau as I recall.
 
The gun registration indicator is taken from a USA Today article 
(Tuesday, January 7, 1992, PAGE 5A).  The article reported that
gun registration laws lead to increased numbers of murders 
(homicides), a conclusion reached by comparing the mean number of 
homicides in states with gun registration laws to states without 
registration laws.  
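
The comparison attributed to the article, and a per-capita alternative,
can be sketched from three columns of the dataset in this message (copied
here so the snippet runs on its own). The per-capita step is a suggested
classroom exercise, not something taken from the article.

```r
pop <- c(4089, 2372, 30380, 3291, 598, 13277, 1135, 2795, 11543, 5996,
         4860, 9368, 4432, 5158, 6737, 635, 7760, 18058, 10939, 11961, 1004,
         3560, 4953, 17349, 1770, 5018, 570, 3750, 3377, 680, 6623,
         1039, 5610, 2495, 3713, 4252, 1235, 2592, 808, 1593, 1105,
         1548, 1284, 3175, 2922, 703, 6286, 567, 4955, 1801, 460)   # thousands
gunreg <- rep(c(1, 0), times = c(26, 25))    # 26 states with laws, then 25 without
homicides <- c(410, 240, 3710, 170, 489, 1300, 44, 62, 1270, 200, 540,
               1020, 100, 550, 730, 11, 350, 2550, 760, 740, 38, 350, 470, 2660,
               43, 220, 56, 290, 155, 32, 720, 21, 380, 150, 260, 760,
               23, 370, 29, 43, 32, 160, 135, 220, 120, 9, 550, 24, 240,
               135, 20)

# The article's comparison: mean number of homicides by registration status
mean_counts <- tapply(homicides, gunreg, mean)

# A lurking variable: states with registration laws are much larger on average
mean_pop <- tapply(pop, gunreg, mean)

# Homicides per 100,000 residents (pop is in thousands, hence the factor 100)
rate <- homicides / pop * 100
mean_rate <- tapply(rate, gunreg, mean)

rbind(mean_counts, mean_pop, mean_rate)
```

Students can then be asked which of the three rows the headline
conclusion actually rests on.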

"Guns" <- 
structure(.Data = list(
"pop" = c(4089, 2372, 30380, 3291, 598, 13277, 1135, 2795, 11543, 5996, 
   4860, 9368, 4432, 5158, 6737, 635, 7760, 18058, 10939, 11961, 1004,
   3560, 4953, 17349, 1770, 5018, 570, 3750, 3377, 680, 6623,
   1039, 5610, 2495, 3713, 4252, 1235, 2592, 808, 1593, 1105,
   1548, 1284, 3175, 2922, 703, 6286, 567, 4955, 1801, 460.), 
"area" = c(52.4, 53.2, 163.7, 5.5, 0.1, 65.8, 10.9, 56.3, 57.9, 10.6, 
   12.4, 96.8, 86.9, 69.7, 53.8, 70.7, 8.7, 54.5, 44.8, 46.1, 1.5, 32, 
   42.1, 268.6, 84.9, 71.3, 656.4, 114, 104.1, 2.5, 59.4, 83.6, 36.4, 
   82.3, 40.4, 51.8, 35.4, 48.4, 147, 77.4, 9.4, 121.6, 110.6, 69.9, 
   98.4, 77.1, 42.8, 9.6, 65.5, 24.2, 97.8), 
"urban" = c(60, 54, 93, 79, 100, 85, 89, 61, 85, 84, 81, 70, 71, 53, 
   50, 53, 89, 84, 74, 69, 86, 55, 61, 80, 87, 76, 68, 88, 82,
   73, 63, 57, 65, 69, 52, 68, 45, 47, 53, 66, 51, 73, 88,
   68, 71, 50, 69, 32, 66, 36, 65.), 
"poverty" = c(19, 18.4, 14.2, 5.8, 19.2, 14.1, 10, 10.1, 13.3, 10.2, 
   9.3, 13.9, 12, 13.6, 13.2, 13.5, 9, 14.1, 11.8, 10.8, 8.2, 16.5, 
   16.9, 16.8, 9.8, 26.2, 11.2, 14.2, 12.1, 8.1, 16, 13.7, 14.1, 11.1, 
   17.4, 22, 12.5, 23.8, 15.8, 10.9, 7.1, 20.9, 10.7, 15.8, 11.3, 13.5, 
   10.6, 7.1, 9.2, 17.2, 10.6), 
"gunreg" = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.), 
"homicides" = c(410, 240, 3710, 170, 489, 1300, 44, 62, 1270, 200, 540, 
   1020, 100, 550, 730, 11, 350, 2550, 760, 740, 38, 350, 470, 2660,
   43, 220, 56, 290, 155, 32, 720, 21, 380, 150, 260, 760,
   23, 370, 29, 43, 32, 160, 135, 220, 120, 9, 550, 24, 240,
   135, 20.)), 
names = c("pop", "area", "urban", "poverty", "gunreg", "homicides"), 
row.names = c("AL", "AR", "CA", "CT", "DC", "FL", "HI", "IA", "IL", "MA", 
   "MD", "MI", "MN", "MO", "NC", "ND", "NJ", "NY", "OH", "PA", "RI", "SC", 
   "TN", "TX", "UT", "WA", "AK", "AZ", "CO", "DE", "GA", "ID", "IN", "KS", 
   "KY", "LA", "ME", "MS", "MT", "NE", "NH", "NM", "NV", "OK", "OR", "SD", 
   "VA", "VT", "WI", "WV", "WY"), class = "data.frame")



 "I would rather be exposed to the inconveniences attending too 
  much liberty than to those attending too small a degree of it."
  -Thomas Jefferson

http://www.reed.edu/~jones    Albyn Jones [EMAIL PROTECTED]
Reed College, Portland OR 97202 (503)-771-1112 x7418



Re: [R] Effects of rounding on regression

2003-11-08 Thread Albyn Jones
On Sat, Nov 08, 2003 at 07:37:46AM -0800, Spencer Graves wrote:
> Have you considered "?qr" and the references cited therein? 
> 
> hope this helps.  spencer graves
> 
> Peter Flom wrote:
> 
> >Does anyone know of research on the effects of rounding on regression?
> >
> >e.g., when you ask people "How often have you ___?" you are more
> >likely to get answers like 100, 200, etc. than 98, 203, etc.
> >
> >I'm interested in investigating this, but don't want to reinvent the
> >wheel.
> >
> >thanks
> >
> >Peter

Are you asking about the propagation of rounding error in the 
computation, or about the statistical effects of measurement error?  

If the latter, and if the errors are in the explanatory variables, 
the keywords are "errors in variables".  One source is the text by 
Wayne Fuller, "Measurement Error Models" (1987, Wiley), and there is 
considerable more recent literature.

albyn
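
A small simulation can make the statistical side concrete. This is a
sketch under made-up assumptions (uniform true values, a linear model,
answers heaped to the nearest 100, and, for contrast, classical additive
error in the explanatory variable); none of the numbers come from real
survey data.

```r
set.seed(123)
n <- 5000
x_true <- runif(n, 0, 1000)                  # hypothetical true counts
y <- 2 + 0.5 * x_true + rnorm(n, sd = 20)    # assumed linear relationship

x_heaped <- round(x_true, -2)                # answers rounded to the nearest 100
x_noisy  <- x_true + rnorm(n, sd = 100)      # classical measurement error

slopes <- c(true   = coef(lm(y ~ x_true))[[2]],
            heaped = coef(lm(y ~ x_heaped))[[2]],
            noisy  = coef(lm(y ~ x_noisy))[[2]])
round(slopes, 3)
```

Under classical error the slope is attenuated toward zero; how much
heaping matters depends on how coarse the rounding is relative to the
spread of the true values.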

 "I would rather be exposed to the inconveniences attending too 
  much liberty than to those attending too small a degree of it."
  -Thomas Jefferson

http://www.reed.edu/~jones    Albyn Jones [EMAIL PROTECTED]
Reed College, Portland OR 97202 (503)-771-1112 x7418
