Re: [R] A comment about R:

2006-01-05 Thread J Dougherty
On Thursday 05 January 2006 12:13, Achim Zeileis wrote:
> . . . snip
> Whether you find this simple or not depends on what you might want to
> have. Personally, I always find it very limiting if I've only got a switch
> to choose one or another vcov matrix when there is a multitude of vcov
> matrices in use in the literature. What if you would want to do HC3
> instead of the HC(0) that is offered by Eviews...or HC4...or HAC...or
> something bootstrapped...or...
> In my view, this is the strength of many implementations in R: you can make
> programs very modular so that the user can easily extend the software or
> re-use it for other purposes. The price you pay for that is that it is not
> as easy to use as point-and-click software that offers some standard tools.
> Of course, both sides have advantages or disadvantages.
> . . .snip

Stata's ADO scripting language has the ability to access intermediate steps 
and local variables used by various commands.  These are typically held in 
memory until they are purged.  The difference between Stata and R is more 
that Stata has been streamlined into an application, the nuts and bolts 
hidden away, the rivet heads countersunk and polished, so that unless you 
really need to use them, they aren't visible.  It only LOOKS like you are 
constrained to the readily available results of specific commands.  Stata 
output will tend to look very much like the standard output one becomes 
accustomed to in undergraduate stat courses.  

R assumes you _will_ want access to the nuts and bolts, and don't much care 
about visible rivets if the system is both accurate and functional.  R is 
much more a programming environment in that sense.  It is an important 
difference.  There is going to be a continuing growth in users of R as 
companies see cost savings in open-source software.  They will often be people 
who happily dragged .xls files into SPSS for analysis and then printed the 
resulting reports.  (Personally, I became a strong believer in statistical 
analysis packages after receiving a _negative_ variance in Excel once upon a 
time.  I don't see how that could even be possible, but apparently it was a 
known issue.  Some ad hoc experimentation then demonstrated that no 
spreadsheet was all that precise).

One place where R and Stata have a great deal in common is in the manner in 
which graphs and charts are formatted.  Stata is perhaps slightly less 
byzantine, but only slightly.  Both systems emphasize flexibility and quality 
graphics at the price of having to learn what you are doing.  That said, you 
can still do a lot more with R in some areas than Stata, especially in 
spatial graphics and analysis.

JD

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread hadley wickham
> Selecting a sample is easy.  Yet, I'm not aware of any SQL device for
> easily selecting a _random_ sample of the records of a given table.  On
> the other hand, I'm no SQL specialist, others might know better.

There are a number of such devices, which tend to be rather
SQL-variant specific.  Try googling for select random rows mysql, select
random rows pgsql, etc.

Another possibility is to generate a large table of randomly
distributed ids and then use that (with randomly generated limits) to
select the appropriate number of records.
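To make the idea concrete, here is a minimal sketch using Python's built-in sqlite3 module, purely for illustration (the table and column names are made up): SQLite shuffles with ORDER BY RANDOM(), MySQL spells it ORDER BY RAND(), and PostgreSQL uses ORDER BY random().

```python
import sqlite3

# Toy in-memory table to sample from; name and columns are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE obs (id INTEGER PRIMARY KEY, x REAL)")
con.executemany("INSERT INTO obs (x) VALUES (?)",
                [(float(i),) for i in range(1000)])

# Shuffle the rows and keep 10 of them: a uniform random sample
# drawn without replacement.
sample = con.execute(
    "SELECT id, x FROM obs ORDER BY RANDOM() LIMIT 10").fetchall()
print(len(sample))  # 10
```

Note that ORDER BY RANDOM() sorts the whole table, so for very large tables the id-table trick above (or a WHERE clause on a precomputed random column) scales better.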

Hadley



Re: [R] convert matrix to data frame

2006-01-05 Thread Petr Pikal
Hi

On 5 Jan 2006 at 15:11, Chia, Yen Lin wrote:

Date sent:  Thu, 5 Jan 2006 15:11:18 -0800
From:   "Chia, Yen Lin" <[EMAIL PROTECTED]>
To: 
Subject:[R] convert matrix to data frame

> Hi all,
> 
> 
> 
> Suppose I have a 4 x 2 matrix A and I want to select the values in the
> second column such that the value in the first column equals k.
> 
> 
> 
> I gave the colnames as alpha beta, so I was trying to access the info
> using
> 
> 
> 
> A$beta[A[,1]==k]; however, I was told it's not a data frame, so I can't get
> the object by using the dollar sign.  I tried data.frame(A), but it didn't
> work.  

I believe it's because a matrix has a somewhat different structure from a 
data.frame, so its columns cannot simply be accessed by name. But I 
wonder why data.frame(A) does not work for you? See below.

 > str(A)
 `data.frame':   4 obs. of  2 variables:
  $ alpha: num  -1.181 -0.415  2.087  1.422
  $ beta : num  -0.0889  0.6828 -0.7035 -0.3351 

 > str(mat)
   num [1:4, 1:2] -1.1813 -0.4152  2.0865  1.4216 -0.0889 ...
  - attr(*, "dimnames")=List of 2
   ..$ : chr [1:4] "1" "2" "3" "4"
   ..$ : chr [1:2] "alpha" "beta" 

 > B<-data.frame(mat) 

 > str(B)
 `data.frame':   4 obs. of  2 variables:
  $ alpha: num  -1.181 -0.415  2.087  1.422
  $ beta : num  -0.0889  0.6828 -0.7035 -0.3351

 > mat$alpha[mat[,1]<0]
 NULL
 > B$alpha[mat[,1]<0]
 [1] -1.181258 -0.415175




> 
> 
> 
> Any input on this will be very appreciated.  Thanks.
> 
> 
> 
> I tried looking in the manual, but I think I might be wrong about
> the keywords.
> 
> 
> 
> Yen Lin
> 
> 
> 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] A comment about R:

2006-01-05 Thread Frank E Harrell Jr
Leif Kirschenbaum wrote:
> A few thoughts about R vs SAS:
> I started learning SAS 8 years ago at IBM, I believe it was version 6.10.
> I started with R 7 months ago.
> 
> Learning curve:
>   I think I can do everything in R after 7 months that I could do in SAS 
> after about 4 years.
> 
> Bugs:
>   I suffered through several SAS version changes, 7.0, 7.1, 7.2, 8.0, 9.0 (I 
> may have misquoted some version numbers). Every version change gave me 
> headaches, as every version release (of an expensive commercially produced 
> software set) had bugs which upset or crashed previously working code. I had 
> code which ran fine under Windows 2000 and terribly under Windows XP. Most 
> bugs I found were noted by SAS, but never fixed.
>   With R I have encountered very few bugs, except for an occasional crash of R, 
> which I usually ascribe to some bug in Windows XP.
> 
> Help:
>   SAS help was OK. As others have mentioned, there is too much. I even had 
> the set of printed manuals on my desk (stretching 4 feet or so), which were 
> quite impenetrable. I had almost no support from colleagues: even within IBM 
> the number of advanced SAS users was small.
>   With R this mailing list has been of great help: for almost every issue I copy 
> some program and save it as an "R hint" file.
> --> A REQUEST
> I would say that I would appreciate a few more program examples with the help 
> pages for some functions. For instance, "?Control" tells me about "if(cond) 
> cons.expr  else  alt.expr", however an example of
>if(i==1) { print("one") 
>} else if(i==2) { print("two")
>} else if(i>2) { print("bigger than two") }
>  at the end of that help section would have been very helpful for me a few 
> months ago.
> 
> Functions:
>   Writing my own functions in SAS was by use of macros, and usually depended 
> heavily on macro substitution. Learning SAS's macro language, especially 
> macro substitution, was very difficult and it took me years to be able to 
> write complicated functions. Quite different situation in R. Some functions I 
> have written by dint of copying code from other people's packages, which has 
> been very helpful.
>   I wanted to generate arbitrary k-values (the k-multiplier of sigma for a 
> given alpha, beta, and N to establish confidence limits around a mean for 
> small populations). I had a table from a years old microfiche book giving 
> values but wanted to generate my own. I had to find the correct integrals to 
> approximate the k-values and then write two SAS macros which iterated to the 
> desired level of tolerance to generate values. I would guess that there is 
> either an R base function or a package which will do this for me (when I need 
> to start generating AQL tables). Given the utility of these numbers, I was 
> disappointed with SAS.
> 
> Data manipulation:
>   All SAS data is in 2-dimensional datasets, which was very frustrating after 
> having used variables, arrays, and matrices in BASIC, APL, FORTRAN, C, 
> Pascal, and LabVIEW. SAS allows you to access only 1 row of a dataset at a 
> time which was terribly horribly incomprehensibly frustrating. There were so 
> many many problems I had to solve where I had to work around this SAS 
> paradigm.
>   In R, I can access all the elements of a matrix/dataframe at once, and I 
> can use >2 dimensional matrices. In fact, the limitations of SAS I had 
> ingrained from 7.5 years have sometimes made me forget how I can do something 
> so easily in R, like knowing when a value in a column of a dataframe 
> changes:
>   DF$marker <- c(NA, DF[2:nrow(DF),icol] != DF[1:(nrow(DF)-1),icol])
> This was hard to do in SAS...and even after years it was sometimes buggy, 
> keeping variable values from previous iterations of a SAS program.
>   One very nice advantage with SAS is that after data is saved in libraries, 
> there is a GUI showing all the libraries and the datasets inside the 
> libraries with sizes and dates. While we can save Rdata objects in an 
> external file, the base package doesn't seem to have the same capabilities as 
> SAS.
> 
> Graphics:
>   SAS graphics were quite mediocre, and generating customized labels was 
> cumbersome. Porting code from one Windows platform to another produced 
> unpredictable and sometimes unworkable results.
>   It has been easier in R: I anticipate that I will be able to port R Windows 
> code to *NIX and generate the same graphics.
> 
> Batch commands:
>   I am working on porting some of my R code to our *NIX server to generate 
> reports and graphs on a scheduled basis. Although a few at IBM did this with 
> SAS, I would have found doing this fairly daunting.
> 
> 
> -Leif

Leif,

Those are excellent points.  I'm especially glad you mentioned data 
manipulation.  I find that R is far ahead of SAS in this respect 
although most people are shocked to hear me say that.  We are doing all 
our data manipulation (merging, recoding, etc.) in R for pharmaceutical 
research.  The ability to deal wi

Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread François Pinard
[Brian Ripley]

>I rather thought that using a DBMS was standard practice in the 
>R community for those using large datasets: it gets discussed rather 
>often.

Indeed.  (I tried RMySQL even before speaking of R to my co-workers.)

>Another possibility is to make use of the several DBMS interfaces already 
>available for R.  It is very easy to pull in a sample from one of those, 
>and surely keeping such large data files as ASCII is not good practice.

Selecting a sample is easy.  Yet, I'm not aware of any SQL device for 
easily selecting a _random_ sample of the records of a given table.  On 
the other hand, I'm no SQL specialist, others might know better.

We do not have a need yet for samples where I work, but if we ever need 
such, they will have to be random, or else, I will always fear biases.

>One problem with Francois Pinard's suggestion (the credit has got lost) 
>is that R's I/O is not line-oriented but stream-oriented.  So selecting 
>lines is not particularly easy in R.

I understand that you mean random access to lines, instead of random 
selection of lines.  Once again, this chat comes out of reading someone 
else's problem, this is not a problem I actually have.  SPSS was not 
randomly accessing lines, as data files could well be held on magnetic 
tapes, where random access is not possible in ordinary practice.  SPSS 
reads (or was reading) lines sequentially from beginning to end, and the 
_random_ sample is built as the reading goes.

Suppose the file (or tape) holds N records (N is not known in advance), 
from which we want a sample of M records at most.  If N <= M, then we 
use the whole file, no sampling is possible nor necessary.  Otherwise, 
we first initialise M records with the first M records of the file.  
Then, for each record in the file after the M'th, the algorithm has to 
decide if the record just read will be discarded or if it will replace 
one of the M records already saved, and in the latter case, which of 
those records will be replaced.  If the algorithm is carefully designed, 
then when the last (N'th) record of the file has been processed this 
way, we have M records randomly selected from N records, in 
such a way that each of the N records had an equal probability to end 
up in the selection of M records.  I can dig up the details if needed.
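The single-pass scheme described above is what is now usually called reservoir sampling (Algorithm R).  A minimal sketch, in Python for illustration:

```python
import random

def reservoir_sample(stream, m, rng=random):
    """Return m records chosen uniformly at random from an iterable of
    unknown length N, reading it exactly once (Algorithm R)."""
    reservoir = []
    for n, record in enumerate(stream, start=1):
        if n <= m:
            reservoir.append(record)       # keep the first m records
        else:
            j = rng.randint(1, n)          # uniform integer in 1..n
            if j <= m:
                reservoir[j - 1] = record  # replace a stored record
    return reservoir

sample = reservoir_sample(range(10_000), 5)
print(sample)  # 5 values, each record kept with probability 5/10000
```

If N <= m the whole stream is returned, matching the "no sampling is possible nor necessary" case above.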

This is my suggestion, or in fact, more a thought than a suggestion.  It 
might represent something useful either for flat ASCII files or even for 
a stream of records coming out of a database, if those effectively do 
not offer ready random sampling devices.


P.S. - In the (rather unlikely, I admit) case the gang I'm part of ever 
has the need described above, and if I then dared to implement it 
myself, would such a contribution be welcome?

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [R] A comment about R:

2006-01-05 Thread Leif Kirschenbaum
A few thoughts about R vs SAS:
I started learning SAS 8 years ago at IBM, I believe it was version 6.10.
I started with R 7 months ago.

Learning curve:
  I think I can do everything in R after 7 months that I could do in SAS after 
about 4 years.

Bugs:
  I suffered through several SAS version changes, 7.0, 7.1, 7.2, 8.0, 9.0 (I 
may have misquoted some version numbers). Every version change gave me 
headaches, as every version release (of an expensive commercially produced 
software set) had bugs which upset or crashed previously working code. I had 
code which ran fine under Windows 2000 and terribly under Windows XP. Most bugs 
I found were noted by SAS, but never fixed.
  With R I have encountered very few bugs, except for an occasional crash of R, 
which I usually ascribe to some bug in Windows XP.

Help:
  SAS help was OK. As others have mentioned, there is too much. I even had the 
set of printed manuals on my desk (stretching 4 feet or so), which were quite 
impenetrable. I had almost no support from colleagues: even within IBM the 
number of advanced SAS users was small.
  With R this mailing list has been of great help: for almost every issue I copy 
some program and save it as an "R hint" file.
--> A REQUEST
I would say that I would appreciate a few more program examples with the help 
pages for some functions. For instance, "?Control" tells me about "if(cond) 
cons.expr  else  alt.expr", however an example of
   if(i==1) { print("one") 
   } else if(i==2) { print("two")
   } else if(i>2) { print("bigger than two") }
 at the end of that help section would have been very helpful for me a few 
months ago.

Functions:
  Writing my own functions in SAS was by use of macros, and usually depended 
heavily on macro substitution. Learning SAS's macro language, especially macro 
substitution, was very difficult and it took me years to be able to write 
complicated functions. Quite different situation in R. Some functions I have 
written by dint of copying code from other people's packages, which has been 
very helpful.
  I wanted to generate arbitrary k-values (the k-multiplier of sigma for a 
given alpha, beta, and N to establish confidence limits around a mean for small 
populations). I had a table from a years old microfiche book giving values but 
wanted to generate my own. I had to find the correct integrals to approximate 
the k-values and then write two SAS macros which iterated to the desired level 
of tolerance to generate values. I would guess that there is either an R base 
function or a package which will do this for me (when I need to start 
generating AQL tables). Given the utility of these numbers, I was disappointed 
with SAS.

Data manipulation:
  All SAS data is in 2-dimensional datasets, which was very frustrating after 
having used variables, arrays, and matrices in BASIC, APL, FORTRAN, C, Pascal, 
and LabVIEW. SAS allows you to access only 1 row of a dataset at a time which 
was terribly horribly incomprehensibly frustrating. There were so many many 
problems I had to solve where I had to work around this SAS paradigm.
  In R, I can access all the elements of a matrix/dataframe at once, and I can 
use >2 dimensional matrices. In fact, the limitations of SAS I had ingrained 
from 7.5 years have sometimes made me forget how I can do something so easily in 
R, like knowing when a value in a column of a dataframe changes:
  DF$marker <- c(NA, DF[2:nrow(DF),icol] != DF[1:(nrow(DF)-1),icol])
This was hard to do in SAS...and even after years it was sometimes buggy, 
keeping variable values from previous iterations of a SAS program.
  One very nice advantage with SAS is that after data is saved in libraries, 
there is a GUI showing all the libraries and the datasets inside the libraries 
with sizes and dates. While we can save Rdata objects in an external file, the 
base package doesn't seem to have the same capabilities as SAS.

Graphics:
  SAS graphics were quite mediocre, and generating customized labels was 
cumbersome. Porting code from one Windows platform to another produced 
unpredictable and sometimes unworkable results.
  It has been easier in R: I anticipate that I will be able to port R Windows 
code to *NIX and generate the same graphics.

Batch commands:
  I am working on porting some of my R code to our *NIX server to generate 
reports and graphs on a scheduled basis. Although a few at IBM did this with 
SAS, I would have found doing this fairly daunting.


-Leif

-
 Leif Kirschenbaum, Ph.D.
 Senior Yield Engineer
 Reflectivity
 [EMAIL PROTECTED]



Re: [R] Ordering boxplot factors

2006-01-05 Thread Marc Schwartz
On Thu, 2006-01-05 at 20:27 -0600, Joseph LeBouton wrote:
> Hi all,
> 
> what a great help list!  I hope someone can help me with this puzzle...
> 
> I'm trying to find a simple way to do:
> 
> boxplot(obs~factor)
> 
> so that the factors are ordered left-to-right along the x-axis by
> median, not alphabetically by factor name.
> 
> Complicated ways abound, but I'm hoping for a magical one-liner that'll
> do the trick.
> 
> Any suggestions would be treasured.
> 
> Thanks,
> 
> -jlb


Using the first example in ?boxplot, which is:

boxplot(count ~ spray, data = InsectSprays, col = "lightgray")



Get the medians for 'count by spray' using tapply() and then sort the
results in increasing order, by median:

  med <- sort(with(InsectSprays, tapply(count, spray, median)))

> med
    C    E    D    A    F    B 
 1.5  3.0  5.0 14.0 15.0 16.5 


Now do the boxplot, setting the factor levels in order by median:

  boxplot(count ~ factor(spray, levels = names(med)), 
  data = InsectSprays, col = "lightgray")


So...technically two lines of code.

HTH,

Marc Schwartz



[R] Ordering boxplot factors

2006-01-05 Thread Joseph LeBouton
Hi all,

what a great help list!  I hope someone can help me with this puzzle...

I'm trying to find a simple way to do:

boxplot(obs~factor)

so that the factors are ordered left-to-right along the x-axis by
median, not alphabetically by factor name.

Complicated ways abound, but I'm hoping for a magical one-liner that'll
do the trick.

Any suggestions would be treasured.

Thanks,

-jlb
-- 

Joseph P. LeBouton
Forest Ecology PhD Candidate
Department of Forestry
Michigan State University
East Lansing, Michigan 48824

Office phone: 517-355-7744
email: [EMAIL PROTECTED]



Re: [R] Wikis etc.

2006-01-05 Thread Kevin E. Thorpe
Frank makes an interesting point.  For those interested, a site I spend
quite a bit of time on for Linux-related stuff is, IMHO, really well done.
There are fora for many different Linux distributions.  There is a wiki,
a collection of tutorials, etc.  If you want to take a look, the URL is
http://www.linuxquestions.org/

Kevin

Frank E Harrell Jr wrote:
> I feel that as long as people continue to provide help on r-help wikis 
> will not be successful.  I think we need to move to a central wiki or 
> discussion board and to move away from e-mail.  People are extremely 
> helpful but e-mail seems to always be memory-less and messages get 
> too long without factorization of old text.  R-help is now too active 
> and too many new users are asking questions asked dozens of times for 
> e-mail to be effective.
> 
> The wiki also needs to collect and organize example code, especially for 
> data manipulation.  I think that new users would profit immensely from a 
> compendium of examples.
> 
> Just my .02 Euros
> 
> Frank


-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: [EMAIL PROTECTED]  Tel: 416.946.8081  Fax: 416.946.3297



Re: [R] Looking for packages to do Feature Selection and Classification

2006-01-05 Thread Diaz.Ramon

Thanks for the reference, it looks very interesting.

Best,

R.

-Original Message-
From:   Weiwei Shi [mailto:[EMAIL PROTECTED]
Sent:   Thu 1/5/2006 9:01 PM
To: Diaz.Ramon
Cc: Frank Duan; r-help
Subject:Re: [R] Looking for packages to do Feature Selection and 
Classification

FYI:

check the following paper on svm (using libsvm) as well as random
forest in the context of feature selection.

http://www.csie.ntu.edu.tw/~cjlin/papers/features.pdf

HTH

On 1/4/06, Diaz.Ramon <[EMAIL PROTECTED]> wrote:
> Dear Frank,
> I expect you'll get many different answers since a wide variety of approaches 
> have been suggested. So I'll stick to self-advertisment: I've written an R 
> package, varSelRF (available from CRAN), that uses random forests together with a 
> simple variable selection approach, and also provides bootstrap estimates of 
> the error rate of the procedure. Andy Liaw and collaborators previously 
> developed and published a somewhat similar procedure. You probably also want 
> to take a look at several packages available from BioConductor.
>
> Best,
>
> R.
>
>
> -Original Message-
> From:   [EMAIL PROTECTED] on behalf of Frank Duan
> Sent:   Wed 1/4/2006 4:23 AM
> To: r-help
> Cc:
> Subject:[R] Looking for packages to do Feature Selection and 
> Classification
>
> Hi All,
>
> Sorry if this is a repost (a quick browse didn't give me the answer).
>
> I wonder if there are packages that can do the feature selection and
> classification at the same time. For instance, I am using SVM to classify my
> samples, but it's easy to overfit when using all of the features. Thus,
> it is necessary to select "good" features to build an optimum hyperplane
> (?). Here is a simple example: Suppose I have 100 "useful" features and 100
> "useless" features (or noise features), I want the SVM to give me the
> same results when 1) using only 100 useful features or 2) using all 200
> features.
>
> Any suggestions, or could someone point me to a reference?
>
> Thanks in advance!
>
> Frank
>
>
>
> --
> Ramón Díaz-Uriarte
> Bioinformatics Unit
> Centro Nacional de Investigaciones Oncológicas (CNIO)
> (Spanish National Cancer Center)
> Melchor Fernández Almagro, 3
> 28029 Madrid (Spain)
> Fax: +-34-91-224-6972
> Phone: +-34-91-224-6900
>
> http://ligarto.org/rdiaz
> PGP KeyID: 0xE89B3462
> (http://ligarto.org/rdiaz/0xE89B3462.asc)
>
>
>
>
>


--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III







[R] Wikis etc.

2006-01-05 Thread Frank E Harrell Jr
I feel that as long as people continue to provide help on r-help wikis 
will not be successful.  I think we need to move to a central wiki or 
discussion board and to move away from e-mail.  People are extremely 
helpful but e-mail seems to always be memory-less and messages get 
too long without factorization of old text.  R-help is now too active 
and too many new users are asking questions asked dozens of times for 
e-mail to be effective.

The wiki also needs to collect and organize example code, especially for 
data manipulation.  I think that new users would profit immensely from a 
compendium of examples.

Just my .02 Euros

Frank
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread Neuro LeSuperHéros
Ronggui,

I'm not familiar with SQLite, but using MySQL would solve your problem.

MySQL has a "LOAD DATA INFILE" statement that loads text/csv files rapidly.

In R, assuming a test table exists in MySQL (a blank table is fine), something 
like this would load the data directly into MySQL.

library(DBI)
library(RMySQL)
# hypothetical connection details -- adjust user/password/dbname for your server
mycon <- dbConnect(MySQL(), user = "user", password = "password", dbname = "test")
dbSendQuery(mycon, "LOAD DATA INFILE 'C:/textfile.csv'
INTO TABLE test3 FIELDS TERMINATED BY ','") # for csv files

Then a normal SQL query would allow you to work with a manageable size of 
data.
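Once the data are in MySQL, pulling a manageable subset back into R is one query. A sketch only: the connection parameters and the table/column names (test3, v1, v2) are placeholders, not anything from Neuro's setup -- substitute your own.

```r
library(DBI)
library(RMySQL)

# Open a connection (credentials/dbname are illustrative)
mycon <- dbConnect(MySQL(), dbname = "test")

# Fetch only the columns and rows you need, not the whole file
subset_df <- dbGetQuery(mycon,
  "SELECT v1, v2 FROM test3 WHERE v1 IS NOT NULL LIMIT 10000")

dbDisconnect(mycon)
```

The point is that the filtering happens in the database, so R only ever holds the subset.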



>From: bogdan romocea <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]
>CC: r-help 
>Subject: Re: [R] Suggestion for big files [was: Re: A comment about R:]
>Date: Thu, 5 Jan 2006 15:26:51 -0500
>
>ronggui wrote:
> > If i am familiar with
> > database software, using database (and R) is the best choice,but
> > convert the file into database format is not an easy job for me.
>
>Good working knowledge of a DBMS is almost invaluable when it comes to
>working with very large data sets. In addition, learning SQL is piece
>of cake compared to learning R. On top of that, knowledge of another
>(SQL) scripting language is not needed (except perhaps for special
>tasks): you can easily use R to generate the SQL syntax to import and
>work with arbitrarily wide tables. (I'm not familiar with SQLite, but
>MySQL comes with a command line tool that can run syntax files.)
>Better start learning SQL today.
>
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of ronggui
> > Sent: Thursday, January 05, 2006 12:48 PM
> > To: jim holtman
> > Cc: r-help@stat.math.ethz.ch
> > Subject: Re: [R] Suggestion for big files [was: Re: A comment
> > about R:]
> >
> >
> > 2006/1/6, jim holtman <[EMAIL PROTECTED]>:
> > > If what you are reading in is numeric data, then it would
> > require (807 *
> > > 118519 * 8) 760MB just to store a single copy of the object
> > -- more memory
> > > than you have on your computer.  If you were reading it in,
> > then the problem
> > > is the paging that was occurring.
> > In fact, if I read it in 3 pieces, each is about 170M.
> >
> > >
> > > You have to look at storing this in a database and working
> > on a subset of
> > > the data.  Do you really need to have all 807 variables in
> > memory at the
> > > same time?
> >
> > Yip, I don't need all the variables. But I don't know how to get the
> > necessary variables into R.
> >
> > In the end I read the data in pieces and used the RSQLite package to
> > write it to a database, and then did the analysis. If I were familiar
> > with database software, using a database (and R) would be the best
> > choice, but converting the file into database format is not an easy
> > job for me. I asked for help on the SQLite list, but the solution was
> > not satisfying, as it required knowledge of a third scripting
> > language. After searching the internet, I found this solution:
> >
> > #begin
> > rm(list=ls())
> > f <- file("D:\\wvsevs_sb_v4.csv", "r")  # backslashes must be doubled in R strings
> > i <- 0
> > done <- FALSE
> > library(RSQLite)
> > con <- dbConnect("SQLite", "c:\\sqlite\\database.db3")
> > tim1<-Sys.time()
> >
> > while(!done){
> > i<-i+1
> > tt<-readLines(f,2500)
> > if (length(tt)<2500) done <- TRUE
> > tt<-textConnection(tt)
> > if (i==1) {
> >assign("dat",read.table(tt,head=T,sep=",",quote=""));
> >  }
> > else assign("dat",read.table(tt,head=F,sep=",",quote=""))
> > close(tt)
> > ifelse(dbExistsTable(con, "wvs"),dbWriteTable(con,"wvs",dat,append=T),
> >   dbWriteTable(con,"wvs",dat) )
> > }
> > close(f)
> > #end
> > It's not the best solution,but it works.
> >
> >
> >
> > > If you use 'scan', you could specify that you do not want
> > some of the
> > > variables read in so it might make a more reasonably sized objects.
> > >
> > >
> > > On 1/5/06, François Pinard <[EMAIL PROTECTED]> wrote:
> > > > [ronggui]
> > > >
> > > > >R is weak when handling large data files.  I have a data file: 807 vars,
> > > > >118519 obs, and it's in CSV format.  Stata can read it in 2 minutes, but
> > > > >on my PC R almost cannot handle it.  My PC's CPU is 1.7G; RAM 512M.
> > > >
> > > > Just (another) thought.  I used to use SPSS, many, many
> > years ago, on
> > > > CDC machines, where the CPU had limited memory and no
> > kind of paging
> > > > architecture.  Files did not need to be very large for
> > being too large.
> > > >
> > > > SPSS had a feature that was then useful, about the capability of
> > > > sampling a big dataset directly at file read time, quite before
> > > > processing starts.  Maybe something similar could help in
> > R (that is,
> > > > instead of reading the whole data in memory, _then_ sampling it.)
> > > >
> > > > One can read records from a file, up to a preset amount
> > of them.  If the
> > > > file happens to contain more records than that preset
> > number (the number
> > > > of records in the whole file is not known beforehand),
> > already read
> > > > records may be dropped at random and replaced by other
> > records coming
> > > > from the file bei

[R] convert matrix to data frame

2006-01-05 Thread Chia, Yen Lin
Hi all,

 

Suppose I have a 4 x 2 matrix  A and I want to select the values in
second column such that the value in first column equals to k.

 

I gave the colnames as alpha beta, so I was trying to access the info
using

 

A$beta[A[,1]==k]; however, I was told it's not a data frame, so I cannot
access the columns using the dollar sign.  I tried data.frame(A), but it
didn't work.
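For reference, the subsetting described above can be done either by matrix indexing directly, or by converting to a data frame first (the example values and k are invented for illustration):

```r
# A 4 x 2 matrix with named columns, as in the question
A <- matrix(c(1, 2, 1, 3,          # alpha
              10, 20, 30, 40),     # beta
            ncol = 2, dimnames = list(NULL, c("alpha", "beta")))
k <- 1

# Matrix indexing: rows where column 'alpha' equals k, column 'beta'
A[A[, "alpha"] == k, "beta"]   # 10 30

# Or convert to a data frame, after which $ works
dfA <- as.data.frame(A)
dfA$beta[dfA$alpha == k]       # 10 30
```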

 

Any input on this will be very appreciated.  Thanks.

 

I tried looking in the manual, but I think I might be wrong about the
keywords.

 

Yen Lin





Re: [R] Memory limitation in GeoR - Windows or R?

2006-01-05 Thread Patrick Giraudoux
Thanks a lot, Jim. Forwarded to Aaron (who raised the question) and the 
R-help list...

Patrick



jim holtman a écrit :

> The size of the matrix you are allocating (18227 x 18227) would require 
> 2.6GB of memory, and the error message is saying that you only 
> have 0.5GB (512MB) available.  You also need about 3-4x the largest 
> object so that you can do any calculations on it, due to extra copies 
> that might be made of the data.
>  
> Your 2000 x 2000 matrix will require only about 32MB of memory, and that 
> is why it runs comfortably on your system with 512MB.
>  
> If you are running on a Windows machine, you may want to use the 
> --max-mem-size command line parameter to set the memory size.  I have 
> 1GB and set the max to 800MB.  You may also want to remove any extra 
> objects and do 'gc()' before working with a large object to try to 
> free up memory.
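The arithmetic behind those figures is easy to check directly in R (a double takes 8 bytes):

```r
n <- 18227
bytes <- n^2 * 8     # storage for one n x n matrix of doubles
bytes / 1e9          # ~2.66 -- the "2.6GB" figure
bytes / 1024         # ~2595496.3 -- matches "cannot allocate vector of size 2595496 Kb"

2000^2 * 8 / 1e6     # 32 -- why a 2000 x 2000 matrix is comfortable in 512MB
```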
>
>  
> On 1/5/06, *Patrick Giraudoux* <[EMAIL PROTECTED] 
> > wrote:
>
> Dear Aaron,
>
> I am really  a tool user and not a tool maker (actually an ecologist
> doing some biostatistics)... so, I take the liberty of sending a
> copy of
> this e-mail to the r-help list where capable computer persons and true
> statisticians may provide more relevant information and also to Paulo
> Ribeiro and Peter Diggle, the authors of geoR..
>
> I really feel that your huge matrix cannot be handled in R that easily,
> and I get the same kind of error as you:
>
> > m=matrix(1,ncol=18227, nrow=18227)
> Error: cannot allocate vector of size 2595496 Kb
> In addition: Warning messages:
> 1: Reached total allocation of 511Mb: see help( memory.size)
> 2: Reached total allocation of 511Mb: see help(memory.size)
>
> However, if you want to compute a distance matrix, have a look to the
> function:
>
> ?dist
>
> and try it... You will not have to create the distance matrix yourself
> from the coordinates file (but you may run into memory problems anyway).
> More to the point, if you intend to use interpolation methods such as
> kriging, you don't need to manage the distance matrix building by
> yourself. See:
>
> library(geoR)
> library(help=geoR)
> ?variog
> ?variofit
> ?likfit
>
> etc...
>
> If you want to use geostatistics further (eg via geoR or gstat), you will
> anyway have to manage with memory limits (not due to R). On my
> computer
> (portable HP compaq nx7000) I can hardly manage with more than 2000
> observations using geoR (far from your 18227 observations). I had to
> krige on a dataset with 9000 observations recently and was led to
> randomly subsample 2000 values. You can, however, try to increase the
> memory allocated to R on your computer. The size limit is hardware
> dependent, eg:
>
> ?memory.size
> ?memory.limit
> memory.limit(size=5)
>
> Another way may be to perform local kriging (eg kriging within the
> dataset on a fixed radius), but to my knowledge this cannot be
> done with
> geoR (unfortunately). The library gstat offers this option, but
> has much
> more limited possibilities than geoR considering other issues
> (variogram
> analysis, etc...).
>
> library(gstat)
> ?krige
>
> see argument 'maxdist'
>
> Hope this can help,
>
> Kind regards
>
>
> Patrick Giraudoux
> --
>
> Department of Environmental Biology
> EA3184 usc INRA
> University of Franche-Comte
> 25030 Besancon Cedex
> (France)
>
> tel. +33 381 665 745
> fax +33 381 665 797
> http://lbe.univ-fcomte.fr
>
>
> Aaron Swoboda a écrit :
>
> > Dear Sir:
> >
> >
> > I ran across your post to the R-help archive from February 9 (it is
> > attached below since it was nearly two years ago!). I am beginning to
> > learn R and am interested in analyzing some of my data in a spatial
> > context. I am having a problem that seems similar to the one you
> > encountered (trying to work with a large matrix in Windows XP). I am
> > wondering if you could help steer me in the direction that helped you
> > solve your problem. I am trying to construct a distance matrix
> > containing all of the distances between my 18,000 observations.
> > Trying to make a matrix that large gets an error message...
> >
> > m=matrix(1,ncol=18227, nrow=18227)
> >
> > Error: cannot allocate vector of size 2595496 Kb
> > In addition: Warning messages:
> > 1: Reached total allocation of 1023Mb: see help(memory.size)
> >
> > Thank you for any help you may be able to send my way.
> >
> > ~Aaron Swoboda
> >
> > Below is your posting to the R-help list...
> >
> >
> > Dear all,
> >
> > I a read with great interest the e-mails related to Arnav Sheth
> about
> > memory limitatio

Re: [R] Wald tests and Huberized variances (was: A comment about R:)

2006-01-05 Thread bogdan romocea
Peter Muhlberger wrote:
> But, there is a second point here, which is how difficult it
> was for me [...] to find what seem to me like standard & key
> features I've taken for granted in other packages.

There is another side to this. Don't consider only how difficult it
was to find what you were looking for; also remember to be _glad_ that
there are so many packages and features to choose from. IMHO, the
benefit of having a lot of packages dwarfs all the efforts needed to
locate the right ones.


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter
> Muhlberger
> Sent: Thursday, January 05, 2006 12:44 PM
> To: Achim Zeileis
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Wald tests and Huberized variances (was: A
> comment about R:)
>
>
> Hi Achim:  Your reply is tremendously helpful in addressing
> some of the
> outstanding questions I had about R.  The 'econometrics view'
> materials look
> exactly like what I needed.  Many thanks!
>
> But, there is a second point here, which is how difficult it
> was for me, as
> someone just becoming more familiar w/ R's more basic
> capabilities (in the
> past I've focused on features like optim, sem), to find what
> seem to me like
> standard & key features I've taken for granted in other
> packages.  I looked
> high & low in my existing installed packages for the standard
> version of R,
> I googled, I looked in the r-help archives, I looked through
> several manuals
> / introductions to R I had downloaded.  I've asked questions
> about all of
> the points I raised in my email on this email list before.  I
> believe I
> passed through the parent directory for the econometric view
> material at the
> website w/o realizing what it contained because I thought of
> "computational
> econometrics" as having to do w/ running Monte Carlo models
> of economic
> processes.
>
> If R wants to bring in a wider audience, one thing that might
> help is a
> denser set of cross-references.  For example, perhaps lm's help should
> mention the econometrics view materials as well as other
> places to look for
> tests and procedures people may want to do w/ lm.  Another
> thought is that
> perhaps the standard R package help should allow people to find
> non-installed but commonly used contributed packages and
> perhaps their help
> page contents.  A feature that would be very helpful for me
> is the capacity
> to search all the contents of help files, not just keywords
> that at times
> seem to miss what I'm trying to find.
>
> Cheers,
>
> Peter
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



Re: [R] Generic Functions

2006-01-05 Thread Berton Gunter
Briefly, S4 classes and methods are entirely different from -- and often do
not comfortably coexist with -- the older S3 class/method system (which
really isn't a class system, since the classes of the objects aren't
guaranteed as one would expect).

Probably the best place to learn about S4 is The Green Book (Chambers:
PROGRAMMING WITH DATA). Also you can load the "methods" package and type
?Methods for a shorter overview. See also ?setClass.

S4 provides a lot better control and encapsulation than S3, but it also
makes considerably greater demands of the programmer. You can decide whether
you think the tradeoff is justified for your work.
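The S4 idiom is probably clearest with a toy example (the class and generic names here are invented for illustration, not anything from the Green Book):

```r
library(methods)

# An S4 class with typed slots
setClass("Interval", representation(lo = "numeric", hi = "numeric"))

# Register a method for an existing generic: S4's analogue of print.foo
setMethod("show", "Interval", function(object) {
  cat("[", object@lo, ",", object@hi, "]\n")
})

# Define a new generic and a method for it
setGeneric("midpoint", function(x) standardGeneric("midpoint"))
setMethod("midpoint", "Interval", function(x) (x@lo + x@hi) / 2)

iv <- new("Interval", lo = 1, hi = 4)
midpoint(iv)                      # 2.5
showMethods("midpoint")           # S4 analogue of methods()
getMethod("midpoint", "Interval") # view the method's code
```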

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Elizabeth Purdom
> Sent: Thursday, January 05, 2006 12:44 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Generic Functions
> 
> Hi,
> I've been using the package "graph" in the BioConductor 
> assortment and 
> writing some functions based on its classes. My question is 
> not specific to 
> this package or BioConductor (I think), but it serves as a 
> useful example. 
> I recently wanted to look at the code for the function 
> "edgeMatrix" for the 
> class "graphNEL".
> 
> Usually I would type
>   > func.foo
> and the code for the function func for class foo would appear 
> (where func 
> and foo are edgeMatrix and graphNEL respectively). Similarly 
> I would type
>   > methods(func)
> to see for what classes the function func is defined.
> 
> However, these do not work for these functions (they are not 
> S3 functions I 
> am told, though I don't know what that means). After a great deal of 
> guessing and help.search requests, I finally found functions 
> that seem to work:
>   > getMethod(func,"foo")
>   > showMethods(func)
> I get the corresponding code and possible methods available.
> 
> What is this about and is there a section in the R language 
> definition that 
> explains the difference?
> 
> Similarly, I'm used to the object-oriented programming described in the R 
> language manual online, based on the command UseMethod:
>   > func <- function(x, ...) UseMethod("func", x)
> Under this system, I could just create a function "func.foo" 
> and it would 
> work for my class "foo" -- most notably I would create a 
> print command 
> "print.foo" and it would just seamlessly work. However my function 
> "print.graphNEL", for example, never worked and I'm just now 
> guessing from 
> the pieces of documentation from R, that I have to set it up 
> differently 
> but it is not clear to me how. How can I add a method to an existing 
> function under this setup?
> 
> On another note: even if these different functions are 
> internally quite 
> different, can't the functions everyone is already accustomed 
> to be made to 
> access the properties, rather than creating new, similar 
> functions? (what 
> is the need for a different function "showMethods" when a function 
> "methods" already exists? I have the same issue for slots 
> where I have to 
> use a function "slotNames" rather than the more commonly 
> known function 
> "names"). It becomes such a learning curve, that I shy away 
> from packages 
> that use new techniques in coding and stick with packages and 
> functions I 
> can comfortably dissect and personalize.
> 
> Thank you for any assistance,
> Elizabeth Purdom
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



Re: [R] How do I get sub to insert a single backslash?

2006-01-05 Thread Peter Dalgaard
Michael Dewey <[EMAIL PROTECTED]> writes:

> Something about the way R processes backslashes is defeating me.
> Perhaps this is because I have only just started using R for text processing.
> 
> I would like to change occurrences of the ampersand & into ampersand 
> preceded by a backslash.
> 
>  > temp <- "R & D"
>  > sub("&", "\&", temp)
> [1] "R & D"
>  > sub("&", "\\&", temp)
> [1] "R & D"
>  > sub("&", "\\\&", temp)
> [1] "R & D"
>  > sub("&", "\\\\&", temp)
> [1] "R \\& D"
>  >
> 
> So I can get zero, or two backslashes, but not one. I am sure this is 
> really simple but I did not find the answer by doing, for example ?regexp 
> or ?Quotes


None of those result strings  have two backslashes!


Hint:

> nchar("R \\& D")
[1] 6

and ?Quotes tells the entire story.
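Spelled out, the key point is that print() displays backslashes escaped while cat() shows the raw characters; the string produced by the four-backslash replacement holds exactly one:

```r
temp <- "R & D"
out <- sub("&", "\\\\&", temp)  # four backslashes in source = one in the string

print(out)      # [1] "R \\& D"  -- print() escapes the backslash in its display
cat(out, "\n")  # R \& D         -- the actual characters
nchar(out)      # 6: R, space, backslash, &, space, D
```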

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] How do I get sub to insert a single backslash?

2006-01-05 Thread Michael Dewey
Something about the way R processes backslashes is defeating me.
Perhaps this is because I have only just started using R for text processing.

I would like to change occurrences of the ampersand & into ampersand 
preceded by a backslash.

 > temp <- "R & D"
 > sub("&", "\&", temp)
[1] "R & D"
 > sub("&", "\\&", temp)
[1] "R & D"
 > sub("&", "\\\&", temp)
[1] "R & D"
 > sub("&", "\\\\&", temp)
[1] "R \\& D"
 >

So I can get zero, or two backslashes, but not one. I am sure this is 
really simple but I did not find the answer by doing, for example ?regexp 
or ?Quotes


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to create a correlation table for categorical data???

2006-01-05 Thread Michael Dewey
At 19:07 04/01/06, fabio crepaldi wrote:
>Hi,
>   I need to create the correlation table of a set of categorical data 
> (sex, marital_status, car_type, etc.) for a given population.
>   Basically what I'm looking for is a function like cor( ) working on 
> factors (and, if possible, considering NAs as a level).

If they are ordered have you considered downloading the polycor package?


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Wald tests and Huberized variances (was: A comment about R:)

2006-01-05 Thread Peter Muhlberger
Thanks Z, it's coming more into focus.  I don't know what would work, though
maybe it's not impossible to have a richer set of cross-references by
interest area -- e.g., people interested in econometrics may wish to
examine ...  The views help in this regard, though something in help itself
would be handy.

Peter

On 1/5/06 2:52 PM, "Achim Zeileis" <[EMAIL PROTECTED]> wrote:

> Peter:
> 
>> If R wants to bring in a wider audience, one thing that might help is a
>> denser set of cross-references.  For example, perhaps lm's help should
>> mention the econometrics view materials as well as other places to look for
>> tests and procedures people may want to do w/ lm.  Another thought is that
> 
> This is difficult because the core development team has to ensure a
> certain stability of the system and you wouldn't start cross-linking to
> potentially unstable contributed packages. Furthermore, what is obvious to
> you (or me) as further desired functionality for linear models might be
> completely counter-intuitive for someone in genomics or biostatistics or
> environmetrics or ... and you can't link to all of these without confusing
> everybody.
> 
>> perhaps the standard R package help should allow people to find
>> non-installed but commonly used contributed packages and perhaps their help
>> page contents.  A feature that would be very helpful for me is the capacity
>> to search all the contents of help files, not just keywords that at times
>> seem to miss what I'm trying to find.
> 
> This is surely desirable but unfortunately not that simple to implement,
> you'll find some discussion in the list archives about this. However,
> there are various very helpful search facilites like RSiteSearch().
> 
> Best,



[R] Generic Functions

2006-01-05 Thread Elizabeth Purdom
Hi,
I've been using the package "graph" in the BioConductor assortment and 
writing some functions based on its classes. My question is not specific to 
this package or BioConductor (I think), but it serves as a useful example. 
I recently wanted to look at the code for the function "edgeMatrix" for the 
class "graphNEL".

Usually I would type
> func.foo
and the code for the function func for class foo would appear (where func 
and foo are edgeMatrix and graphNEL respectively). Similarly I would type
> methods(func)
to see for what classes the function func is defined.

However, these do not work for these functions (they are not S3 functions I 
am told, though I don't know what that means). After a great deal of 
guessing and help.search requests, I finally found functions that seem to work:
> getMethod(func,"foo")
> showMethods(func)
I get the corresponding code and possible methods available.

What is this about and is there a section in the R language definition that 
explains the difference?

Similarly, I'm used to the object-oriented programming described in the R 
language manual online, based on the command UseMethod:
> func <- function(x, ...) UseMethod("func", x)
Under this system, I could just create a function "func.foo" and it would 
work for my class "foo" -- most notably I would create a print command 
"print.foo" and it would just seamlessly work. However my function 
"print.graphNEL", for example, never worked and I'm just now guessing from 
the pieces of documentation from R, that I have to set it up differently 
but it is not clear to me how. How can I add a method to an existing 
function under this setup?

On another note: even if these different functions are internally quite 
different, can't the functions everyone is already accustomed to be made to 
access the properties, rather than creating new, similar functions? (what 
is the need for a different function "showMethods" when a function 
"methods" already exists? I have the same issue for slots where I have to 
use a function "slotNames" rather than the more commonly known function 
"names"). It becomes such a learning curve, that I shy away from packages 
that use new techniques in coding and stick with packages and functions I 
can comfortably dissect and personalize.

Thank you for any assistance,
Elizabeth Purdom



[R] build boot object for boot.ci

2006-01-05 Thread Daniel Metzler
Hello all,

I have a dataset with several variables v1...vn and five groups f1...fn.
For each group I took a subset of the data

f1<-subset(data,f==1)
f2<-subset(data,f==2)

and bootstrapped the weighted mean for v1...vn, which worked
nicely.

data.boot<-boot(f1$v1,mymean,R=2000)
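(For readers following along: `mymean` is Daniel's own, unshown statistic function. boot() calls its statistic with the data and a vector of resampled indices; a weighted-mean version might look like this sketch, where the weights vector `w` is an assumption:)

```r
library(boot)  # recommended package shipped with R

# boot() calls statistic(data, indices) on each resample
mymean <- function(x, i) {
  w <- rep(1, length(x))      # placeholder weights; supply your own
  weighted.mean(x[i], w[i])
}

set.seed(1)
x <- rnorm(100)
data.boot <- boot(x, mymean, R = 2000)
boot.ci(data.boot, type = "bca")  # BCa interval for the (weighted) mean
```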

Afterwards I combined all boot.out$t for each group:

boot$f1v1<-data.boot$t

data.boot<-boot(f1$v2,mymean,R=2000)
boot$f1v2<-data.boot$t
...
data.boot<-boot(f2$v1,mymean,R=2000)
boot$f2v1<-data.boot$t
...

Each row of boot is used as input in a model resulting in boot$W.

My idea is to find (bootstrap) CIs for W.
Sorry, I'm new to bootstrap (and R...): Does it make sense to treat  
the Ws like a bootstrapped variable (and calculate e.g. BCa CIs),  
because only v1..vn
are bootstrapped? If yes, how is a boot object build on the basis of  
boot$W?

I'm aware that boot() handles strata, weights and complex functions,  
but I didn't manage to get my model bootstrapped right. Hopefully, I  
can circumvent turning my
script upside down.


Daniel



Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread bogdan romocea
ronggui wrote:
> If i am familiar with
> database software, using database (and R) is the best choice,but
> convert the file into database format is not an easy job for me.

Good working knowledge of a DBMS is almost invaluable when it comes to
working with very large data sets. In addition, learning SQL is piece
of cake compared to learning R. On top of that, knowledge of another
(SQL) scripting language is not needed (except perhaps for special
tasks): you can easily use R to generate the SQL syntax to import and
work with arbitrarily wide tables. (I'm not familiar with SQLite, but
MySQL comes with a command line tool that can run syntax files.)
Better start learning SQL today.
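The point about using R to generate SQL syntax for arbitrarily wide tables can be sketched like this (the table name, column names, and file path are illustrative, and the LOAD DATA statement uses MySQL syntax):

```r
# Build a CREATE TABLE statement for a very wide table (807 numeric columns)
cols <- paste0("v", 1:807)
create_sql <- paste0(
  "CREATE TABLE wvs (",
  paste(cols, "DOUBLE", collapse = ", "),
  ");")

# And the bulk-load statement (skip the header line)
load_sql <- "LOAD DATA INFILE 'C:/wvsevs_sb_v4.csv'
INTO TABLE wvs FIELDS TERMINATED BY ',' IGNORE 1 LINES;"

# Write both to a syntax file for MySQL's command line tool to run
writeLines(c(create_sql, load_sql), "import.sql")
```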


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of ronggui
> Sent: Thursday, January 05, 2006 12:48 PM
> To: jim holtman
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Suggestion for big files [was: Re: A comment
> about R:]
>
>
> 2006/1/6, jim holtman <[EMAIL PROTECTED]>:
> > If what you are reading in is numeric data, then it would
> require (807 *
> > 118519 * 8) 760MB just to store a single copy of the object
> -- more memory
> > than you have on your computer.  If you were reading it in,
> then the problem
> > is the paging that was occurring.
> In fact, if I read it in 3 pieces, each is about 170M.
>
> >
> > You have to look at storing this in a database and working
> on a subset of
> > the data.  Do you really need to have all 807 variables in
> memory at the
> > same time?
>
> Yip, I don't need all the variables. But I don't know how to get the
> necessary variables into R.
>
> In the end I read the data in pieces and used the RSQLite package to
> write it to a database, and then did the analysis. If I were familiar
> with database software, using a database (and R) would be the best
> choice, but converting the file into database format is not an easy job
> for me. I asked for help on the SQLite list, but the solution was not
> satisfying, as it required knowledge of a third scripting language.
> After searching the internet, I found this solution:
>
> #begin
> rm(list=ls())
> f <- file("D:\\wvsevs_sb_v4.csv", "r")  # backslashes must be doubled in R strings
> i <- 0
> done <- FALSE
> library(RSQLite)
> con <- dbConnect("SQLite", "c:\\sqlite\\database.db3")
> tim1<-Sys.time()
>
> while(!done){
> i<-i+1
> tt<-readLines(f,2500)
> if (length(tt)<2500) done <- TRUE
> tt<-textConnection(tt)
> if (i==1) {
>assign("dat",read.table(tt,head=T,sep=",",quote=""));
>  }
> else assign("dat",read.table(tt,head=F,sep=",",quote=""))
> close(tt)
> ifelse(dbExistsTable(con, "wvs"),dbWriteTable(con,"wvs",dat,append=T),
>   dbWriteTable(con,"wvs",dat) )
> }
> close(f)
> #end
> It's not the best solution,but it works.
>
>
>
> > If you use 'scan', you could specify that you do not want
> some of the
> > variables read in so it might make a more reasonably sized objects.
> >
> >
> > On 1/5/06, François Pinard <[EMAIL PROTECTED]> wrote:
> > > [ronggui]
> > >
> > > >R is weak when handling large data files.  I have a data file: 807 vars,
> > > >118519 obs, and it's in CSV format.  Stata can read it in 2 minutes, but
> > > >on my PC R almost cannot handle it.  My PC's CPU is 1.7G; RAM 512M.
> > >
> > > Just (another) thought.  I used to use SPSS, many, many
> years ago, on
> > > CDC machines, where the CPU had limited memory and no
> kind of paging
> > > architecture.  Files did not need to be very large for
> being too large.
> > >
> > > SPSS had a feature that was then useful, about the capability of
> > > sampling a big dataset directly at file read time, quite before
> > > processing starts.  Maybe something similar could help in
> R (that is,
> > > instead of reading the whole data in memory, _then_ sampling it.)
> > >
> > > One can read records from a file, up to a preset amount
> of them.  If the
> > > file happens to contain more records than that preset
> number (the number
> > > of records in the whole file is not known beforehand),
> already read
> > > records may be dropped at random and replaced by other
> records coming
> > > from the file being read.  If the random selection
> algorithm is properly
> > > chosen, it can be made so that all records in the
> original file have
> > > equal probability of being kept in the final subset.
> > >
> > > If such a sampling facility was built right within usual R reading
> > > routines (triggered by an extra argument, say), it could offer
> > > a compromise for processing large files, and also
> sometimes accelerate
> > > computations for big problems, even when memory is not at stake.
> > >
> > > --
> > > François Pinard   http://pinard.progiciels-bpi.ca
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> > >
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 247 0281
> >
> > What the problem you are trying to s
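The sampling scheme François describes above -- dropping already-read records at random as new ones arrive, so that every record in the file keeps equal probability of ending up in the final subset -- is classic reservoir sampling. A rough line-by-line sketch in R, under the assumption that one line equals one record:

```r
# Reservoir sampling (Algorithm R): keep a uniform random sample of k
# lines from a file of unknown length, reading it one line at a time.
reservoir_sample_lines <- function(filename, k) {
  con <- file(filename, "r")
  on.exit(close(con))
  reservoir <- character(0)
  n <- 0
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break   # end of file
    n <- n + 1
    if (n <= k) {
      reservoir[n] <- line         # fill the reservoir first
    } else {
      j <- sample.int(n, 1)        # uniform over 1..n
      if (j <= k) reservoir[j] <- line
    }
  }
  reservoir
}
```

After the whole file has been read, each of its n lines is in the reservoir with probability k/n, without ever holding more than k lines in memory.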

Re: [R] A comment about R:

2006-01-05 Thread Achim Zeileis
Peter:

> My two most immediate problems were a) to test whether a set of coefficients
> were jointly zero (as Achim suggests, though the complication here is that
> the varcov matrix is bootstrapped), but also b) to test whether the average

This can be tested with both waldtest() and linear.hypothesis() when
you've got the bootstrapped vcov estimator of your choice available. This
can be conveniently plugged into both functions (either as a vcov matrix
or as a function extracting the vcov matrix from the fitted model object).
There is some discussion about this in the vignette accompanying the
sandwich package.
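A hedged sketch of that plug-in interface (lmtest and sandwich are contributed packages; the model and data are invented for illustration):

```r
library(lmtest)
library(sandwich)

set.seed(42)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- 1 + 0.5 * d$x1 + rnorm(200)
m <- lm(y ~ x1 + x2, data = d)

# Wald test that the x2 coefficient is zero, using an HC3 covariance
waldtest(m, . ~ . - x2, vcov = vcovHC(m, type = "HC3"))

# Any other vcov estimator plugs in the same way:
coeftest(m, vcov = vcovHC(m, type = "HC4"))
coeftest(m, vcov = vcovHAC(m))
```

The vcov argument is exactly where a bootstrapped covariance matrix would go as well.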

> of a set of coefficients was equal to zero.  At other points in time, I
> remember having had to test more complex linear hypotheses involving joint
> combinations of equality, non-zero, and 'averages.'  The Stata interface for
> linear hypothesis tests is amazingly straightforward.  For example, after a
> regression, I could use the following to test the joint hypothesis that
> v1=v2 and the average (or sum) of v3 through v5 is zero and .75v6+.25v7 is
> zero:
>
> test v1=v2
> test v3+v4+v5=0, accum
> test .75*v6+.25*v7=0, accum

Mmmh, should be possible to derive the restriction matrix from this
together with the terms structure...I'll think about this.
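For reference, those three Stata `test ... , accum` lines can already be expressed as one joint linear hypothesis with the car package (a contributed package; in 2006 the function was spelled linear.hypothesis, today linearHypothesis; the model and data below are invented):

```r
library(car)

set.seed(1)
d <- as.data.frame(matrix(rnorm(700), ncol = 7,
                          dimnames = list(NULL, paste0("v", 1:7))))
d$y <- with(d, v1 + v2 + rnorm(100))
m <- lm(y ~ v1 + v2 + v3 + v4 + v5 + v6 + v7, data = d)

# v1 = v2, v3 + v4 + v5 = 0, and .75 v6 + .25 v7 = 0, jointly:
linearHypothesis(m, c("v1 - v2 = 0",
                      "v3 + v4 + v5 = 0",
                      "0.75*v6 + 0.25*v7 = 0"))
```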

> I don't even have to set up a matrix for my test ];-) !  The output would
> show not merely the joint test of all the hypotheses but the tests along the
> way, one for each line of commands.  I vaguely remember the hypothesis
> testing command after an ml run is much the same and cross-equation
> hypothesis tests simply involve adding an equation indicator to the terms.
> I can get huberized var-cov matrices simply by adding "robust" to the
> regression command.

Whether you find this simple or not depends on what you might want to
have. Personally, I always find it very limiting if I've only got a switch
to choose one or another vcov matrix when there is a multitude of vcov
matrices in use in the literature. What if you wanted to do HC3
instead of the HC0 that is offered by EViews...or HC4...or HAC...or
something bootstrapped...or...
In my view, this is the strength of many implementations in R: you can make
programs very modular so that the user can easily extend the software or
re-use it for other purposes. The price you pay for that is that it is not
as easy to use as point-and-click software that offers some standard tools.
Of course, both sides have advantages and disadvantages.
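
To illustrate that modularity: the HC0 estimator is only a few lines of
base R once you have the model matrix and residuals, and the result can be
handed to anything that accepts a covariance matrix. A rough sketch on
simulated data (this hand-rolled version is for illustration only; the
sandwich package's vcovHC() is the maintained implementation):

```r
## White's HC0 heteroskedasticity-consistent covariance:
## (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, computed from an lm fit.
set.seed(42)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100, sd = abs(x))  # heteroskedastic errors
fm <- lm(y ~ x)

X <- model.matrix(fm)
e <- residuals(fm)
bread <- solve(crossprod(X))              # (X'X)^{-1}
meat  <- crossprod(X * e)                 # X' diag(e^2) X
vc0   <- bread %*% meat %*% bread         # HC0

## robust t-statistics, analogous to Stata's ", robust" up to the
## HC1 small-sample factor n/(n-k) that Stata applies:
tstat <- coef(fm) / sqrt(diag(vc0))
```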

> I won't claim to know what's good for R or the R community, but it would be
> nice for me and perhaps others if there were a comparable straightforward
> command as in Stata that could meet a variety of needs.  I need to play w/
> the commands that have been suggested to me by you guys recently, but I'm
> looking at a multitude of commands none of which I suspect have the
> flexibility and ease of use of the above Stata commands, at least for the
> kind of applications I'd like.  Perhaps the point of R isn't to serve as a
> package for a wider set of non-statisticians, but if it wishes to develop in
> that direction, facilities like this may be helpful.

The point of R is hard to determine; R itself does not wish this or that.
It is an open-source project driven by many contributors. If there are
people out there who want to use R for the social sciences, they are free
to contribute to the project. And in this particular case, I think there
has been some activity in the last one or two years aimed at providing
tools for econometrics and quantitative methods in the social and
political sciences.
However, you won't be very happy with R when you want R to be Stata. If
you want Stata, use it.

> It's interesting that
> Achim points out that a function John suggests is already available in R--an
> indication that even R experts don't have a complete handle on everything in
> R even on a relatively straightforward topic like hypothesis tests.

In fairness to John, this functionality became available rather recently.
And it's not surprising that John knows his car package better and that
I'm more familiar with my lmtest package. Therefore, it's very natural to
think first of how you would do a certain task using your own package,
particularly given that you specifically asked about car.

> John is no doubt right that editorializing about statistics would be out of
> place on an R help page.  But when I have gone to statistical papers, many
> have been difficult to access & not very helpful for practical concerns.
> I'm glad to hear that Long and Ervin's paper is helpful, but there's a
> goodly list of papers mentioned in help.

I would think this to be an advantage, not a drawback. It's the user's
responsibility to know what he/she is doing.

Best wishes,
Z

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Wald tests and Huberized variances (was: A comment about R:)

2006-01-05 Thread Achim Zeileis
Peter:

> If R wants to bring in a wider audience, one thing that might help is a
> denser set of cross-references.  For example, perhaps lm's help should
> mention the econometrics view materials as well as other places to look for
> tests and procedures people may want to do w/ lm.  Another thought is that

This is difficult because the core development team has to ensure a
certain stability of the system and you wouldn't start cross-linking to
potentially unstable contributed packages. Furthermore, what is obvious to
you (or me) as further desired functionality for linear models might be
completely counter-intuitive for someone in genomics or biostatistics or
environmetrics or ... and you can't link to all of these without confusing
everybody.

> perhaps the standard R package help should allow people to find
> non-installed but commonly used contributed packages and perhaps their help
> page contents.  A feature that would be very helpful for me is the capacity
> to search all the contents of help files, not just keywords that at times
> seem to miss what I'm trying to find.

This is surely desirable but unfortunately not that simple to implement;
you'll find some discussion about this in the list archives. However,
there are various very helpful search facilities like RSiteSearch().

Best,
Z

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Looking for packages to do Feature Selection and Classification

2006-01-05 Thread Weiwei Shi
FYI:

check the following paper on svm (using libsvm) as well as random
forest in the context of feature selection.

http://www.csie.ntu.edu.tw/~cjlin/papers/features.pdf

HTH

On 1/4/06, Diaz.Ramon <[EMAIL PROTECTED]> wrote:
> Dear Frank,
> I expect you'll get many different answers since a wide variety of approaches
> have been suggested. So I'll stick to self-advertisement: I've written an R
> package, varSelRF (available from CRAN), that uses random forest together with
> a simple variable selection approach, and also provides bootstrap estimates of
> the error rate of the procedure. Andy Liaw and collaborators previously
> developed and published a somewhat similar procedure. You probably also want
> to take a look at several packages available from BioConductor.
>
> Best,
>
> R.
>
>
> -Original Message-
> From:   [EMAIL PROTECTED] on behalf of Frank Duan
> Sent:   Wed 1/4/2006 4:23 AM
> To: r-help
> Cc:
> Subject:[R] Looking for packages to do Feature Selection and 
> Classification
>
> Hi All,
>
> Sorry if this is a repost (a quick browse didn't give me the answer).
>
> I wonder if there are packages that can do the feature selection and
> classification at the same time. For instance, I am using SVM to classify my
> samples, but it's easy to get overfitted if using all of the features. Thus,
> it is necessary to select "good" features to build an optimum hyperplane
> (?). Here is a simple example: Suppose I have 100 "useful" features and 100
> "useless" features (or noise features), I want the SVM to give me the
> same results when 1) using only 100 useful features or 2) using all 200
> features.
>
> Any suggestions or point me to a reference?
>
> Thanks in advance!
>
> Frank
>
> [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
> --
> Ramón Díaz-Uriarte
> Bioinformatics Unit
> Centro Nacional de Investigaciones Oncológicas (CNIO)
> (Spanish National Cancer Center)
> Melchor Fernández Almagro, 3
> 28029 Madrid (Spain)
> Fax: +-34-91-224-6972
> Phone: +-34-91-224-6900
>
> http://ligarto.org/rdiaz
> PGP KeyID: 0xE89B3462
> (http://ligarto.org/rdiaz/0xE89B3462.asc)
>
>
>
> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en s...{{dropped}}
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>


--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Memory limitation in GeoR - Windows or R?

2006-01-05 Thread Patrick Giraudoux
Dear Aaron,

I am really a tool user and not a tool maker (actually an ecologist 
doing some biostatistics)... so I take the liberty of sending a copy of 
this e-mail to the r-help list, where capable computer persons and true 
statisticians may provide more relevant information, and also to Paulo 
Ribeiro and Peter Diggle, the authors of geoR.

I really feel that your huge matrix cannot be handled in R that easily, 
and I get the same kind of error as you:

 > m=matrix(1,ncol=18227, nrow=18227)
Error: cannot allocate vector of size 2595496 Kb
In addition: Warning messages:
1: Reached total allocation of 511Mb: see help(memory.size)
2: Reached total allocation of 511Mb: see help(memory.size)

However, if you want to compute a distance matrix, have a look at the 
function:

?dist

and try it... You will not have to create the distance matrix yourself 
from the coordinates file (but you may run into memory problems anyway). 
More to the point, if you intend to use interpolation methods such as 
kriging, you do not need to manage the distance matrix building by 
yourself. See:

library(geoR)
library(help=geoR)
?variog
?variofit
?likfit

etc...
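
A note on why dist() helps: it stores only the n(n-1)/2 lower-triangle
entries instead of a full n x n matrix (and indeed 18227^2 * 8 bytes is
exactly the 2595496 Kb in the error above). A small sketch with made-up
coordinates:

```r
## dist() keeps only the lower triangle, so for n points it needs
## n*(n-1)/2 doubles instead of n^2.  Even so, for n = 18227 that is
## still 18227*18226/2 * 8 bytes, roughly 1.3 GB, so subsampling is
## likely needed anyway.
xy <- matrix(rnorm(200), ncol = 2)   # 100 random 2-D coordinates
n  <- nrow(xy)
d  <- dist(xy)                       # Euclidean distances by default
length(d) == n * (n - 1) / 2         # TRUE
```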

If you want to use geostatistics further (e.g. via geoR or gstat), you 
will in any case have to cope with memory limits (not due to R). On my 
computer (a portable HP Compaq nx7000) I can hardly manage more than 
2000 observations using geoR (far from your 18227 observations). I 
recently had to krige a dataset with 9000 observations and was led to 
subsample 2000 values randomly. You can, however, try to increase the 
memory allocated to R on your computer. The size limit is hardware 
dependent, e.g.:

?memory.size
?memory.limit
memory.limit(size=5)

Another way may be to perform local kriging (e.g. kriging within the 
dataset on a fixed radius), but to my knowledge this cannot be done with 
geoR (unfortunately). The gstat library offers this option, but has much 
more limited possibilities than geoR on other issues (variogram 
analysis, etc.).

library(gstat)
?krige

see argument 'maxdist'

Hope this can help,

Kind regards


Patrick Giraudoux
-- 

Department of Environmental Biology
EA3184 usc INRA
University of Franche-Comte
25030 Besancon Cedex
(France)

tel. +33 381 665 745
fax +33 381 665 797
http://lbe.univ-fcomte.fr


Aaron Swoboda wrote:

> Dear Sir:
>
>
> I ran across your post to the R-help archive from February 9 (it is 
> attached below since it was nearly two years ago!). I am beginning to 
> learn R and am interested in analyzing some of my data in a spatial 
> context. I am having a problem that seems similar to the problem you 
> encountered (trying to work with a large matrix in Windows XP). I am 
> wondering if you could help steer me in the direction that helped you 
> solve your problem. I am trying to construct a distance matrix 
> containing all of the distances between my 18,000 observations. 
> Trying to make a matrix that large gets an error message...
>
> m=matrix(1,ncol=18227, nrow=18227)  
>
> Error: cannot allocate vector of size 2595496 Kb
> In addition: Warning messages:
> 1: Reached total allocation of 1023Mb: see help(memory.size)
>
> Thank you for any help you may be able to send my way.
>
> ~Aaron Swoboda
>
> Below is your posting to the R-help list...
>
>
> Dear all,
>
> I read with great interest the e-mails related to Arnav Sheth about 
> memory limitation when computing a distance matrix. I suspect
> that I will also meet some memory limitation using GeoR. I am 
> currently running GeoR on a geodata object including 2686 geographical
> coordinates.
>
> krige.conv() can handle it (it takes 10-15 min of computing) but 
> requires increased memory.
>
> memory.limit(size=5)
>
> When the computing is completed, the computer speed is considerably 
> slowed down for any application. It is thus most necessary to
> shut it down and restart. I will probably have to handle a set of 
> 5000-6000 coordinates in once in the next few months. I wonder if
> it will go through it on my plateform (Windows XP and compaq nx7000). 
> If not, will the limitation due to R or to Windows? Does an
> alternate solution exist?
>
> Thanks for any hint,
>
> Patrick Giraudoux
>
>-- 
>Dr. Aaron M. Swoboda
>3208 Posvar Hall
>Graduate School of Public and International Affairs
>University of Pittsburgh
>Pittsburgh, PA 15260
>
>e-mail: [EMAIL PROTECTED]
>Office: 412-648-7604
>Fax:412-648-2605
>  
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Setting the working directory on Windows

2006-01-05 Thread Duncan Murdoch
On 1/5/2006 2:11 PM, Berton Gunter wrote:
> In a similar vein, a GUI version is:
> 
> setwd(dirname(choose.files()))
> 
> This gives you a standard Windows file browser -- you just click on any file
> in the directory you want to set. Obviously, dirname(choose.files()) is an
> easy interactive way to get directories as strings if you need them. See
> also ?basename . 
> 

Brian Ripley added a function choose.dir() to R-devel, which will give 
R code access to the functionality behind the File | Change dir... menu 
item. It won't be out for a few more months, though.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ylim problem in barplot

2006-01-05 Thread Ben Bolker
Robert Baer  atsu.edu> writes:


> Well, consider this example:
> barplot(c(-200,300,-250,350),ylim=c(-99,400))
> 
> It seems that barplot uses ylim and pretty to decide things about the axis
> but does some slightly unexpected things with the bars themselves that are
> not just at the 'zero' end of the bar.
> 
> Rob
> 

  in previous cases I think there was room for debate about
the appropriate behavior.  What do you think should happen
in this case?  Cutting off the bars seems like the right thing
to do; is your point that the axis being confined to positive
values (a side effect of setting ylim) is weird?

  Ben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Setting the working directory on Windows (was: Replacing backslashes with slashes)

2006-01-05 Thread Berton Gunter
In a similar vein, a GUI version is:

setwd(dirname(choose.files()))

This gives you a standard Windows file browser -- you just click on any file
in the directory you want to set. Obviously, dirname(choose.files()) is an
easy interactive way to get directories as strings if you need them. See
also ?basename . 


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Golicher
> Sent: Thursday, January 05, 2006 10:11 AM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Replacing backslashes with slashes
> 
> It  is often convenient to quickly set the working directory 
> to a path 
> copied onto the windows clipboard. A simple trick I have been 
> using for 
> a while is along the lines given in the previous posts.
> 
> setwd.clip<-function()
> {
>   options(warn=-1)
>   setwd(gsub("\\\\", "/", readLines("clipboard")))
>   options(warn=0)
>   getwd()
> }
> 
> 
> I load this at the start of every session and then write setwd.clip() 
> whenever I have a path I want to change to on the clipboard. 
> You can of 
> course write
> 
> setwd(gsub("\\\\", "/", readLines("clipboard")))
> 
> every time you need it. Obviously it takes longer and there is 
> the minor 
> detail that the path read from the clipboard is incomplete (no EOL 
> marker) which leads to an unnecessary warning.
> 
> 
> Dr Duncan Golicher
> Ecologia y Sistematica Terrestre
> Conservación de la Biodiversidad
> El Colegio de la Frontera Sur
> San Cristobal de Las Casas, 
> Chiapas, Mexico
> 
> Email: [EMAIL PROTECTED] 
> 
> Tel: 967 674 9000 ext 1310
> Fax: 967 678 2322
> Celular: 044 9671041021
> 
> United Kingdom Skypein; 020 7870 6251
> Skype name: duncangolicher 
> Download Skype from http://www.skype.com
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] A comment about R:

2006-01-05 Thread Peter Muhlberger
On 1/5/06 11:27 AM, "Achim Zeileis" <[EMAIL PROTECTED]> wrote:

> As John and myself seem to have written our replies in parallel, hence
> I added some more clarifying remarks in this mail:
>> Note that the Anova() function, also in car, can more conveniently compute
>> Wald tests for certain kinds of hypotheses. More generally, however, I'd be
>> interested in your suggestions for an alternative method of specifying
>> linear hypotheses.
> My understanding was that Peter just wants to eliminate various elements
> from the terms(obj) which is what waldtest() in lmtest supports. If some
> other way of specifying nested models is required, I'd also be interested
> in that.


My two most immediate problems were a) to test whether a set of coefficients
were jointly zero (as Achim suggests, though the complication here is that
the varcov matrix is bootstrapped), but also b) to test whether the average
of a set of coefficients was equal to zero.  At other points in time, I
remember having had to test more complex linear hypotheses involving joint
combinations of equality, non-zero, and 'averages.'  The Stata interface for
linear hypothesis tests is amazingly straightforward.  For example, after a
regression, I could use the following to test the joint hypothesis that
v1=v2 and the average (or sum) of v3 through v5 is zero and .75v6+.25v7 is
zero:

test v1=v2
test v3+v4+v5=0, accum
test .75*v6+.25*v7=0, accum

I don't even have to set up a matrix for my test ];-) !  The output would
show not merely the joint test of all the hypotheses but the tests along the
way, one for each line of commands.  I vaguely remember the hypothesis
testing command after an ml run is much the same and cross-equation
hypothesis tests simply involve adding an equation indicator to the terms.
I can get huberized var-cov matrices simply by adding "robust" to the
regression command.  I believe there's also a command that will huberize a
var-cov matrix after the fact.  Subsequent hypothesis tests would be on the
huberized matrix.

I won't claim to know what's good for R or the R community, but it would be
nice for me and perhaps others if there were a comparable straightforward
command as in Stata that could meet a variety of needs.  I need to play w/
the commands that have been suggested to me by you guys recently, but I'm
looking at a multitude of commands none of which I suspect have the
flexibility and ease of use of the above Stata commands, at least for the
kind of applications I'd like.  Perhaps the point of R isn't to serve as a
package for a wider set of non-statisticians, but if it wishes to develop in
that direction, facilities like this may be helpful.  It's interesting that
Achim points out that a function John suggests is already available in R--an
indication that even R experts don't have a complete handle on everything in
R even on a relatively straightforward topic like hypothesis tests.

John is no doubt right that editorializing about statistics would be out of
place on an R help page.  But when I have gone to statistical papers, many
have been difficult to access & not very helpful for practical concerns.
I'm glad to hear that Long and Ervin's paper is helpful, but there's a
goodly list of papers mentioned in help.  Perhaps something that would be
useful is some way of highlighting on a help page which reference is most
helpful for practical concerns?

Again, thanks for all the great input from everyone!

Peter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ylim problem in barplot

2006-01-05 Thread Robert Baer
> PaulB> When I use barplot but select a ylim value greater
> PaulB> than zero, the graph is distorted.  The bars extend
> PaulB> below the bottom of the graph.
> PaulB> For instance the command produces a problematic graph.
> PaulB> barplot(c(200,300,250,350),ylim=c(150,400))

> Well, my question would be if that is not a feature :-)
> Many people would consider barplots that do not start at 0 as
>  "Cheating with Graphics"  (in the vein of "Lying with Statistics").

Well, consider this example:
barplot(c(-200,300,-250,350),ylim=c(-99,400))

It seems that barplot uses ylim and pretty to decide things about the axis
but does some slightly unexpected things with the bars themselves that are
not just at the 'zero' end of the bar.
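
If I remember barplot.default() correctly, the behaviour above is governed
by its xpd argument, which defaults to TRUE (bars may be drawn outside the
plot region); with xpd = FALSE the bars are clipped at the axis limits. A
hedged sketch:

```r
## Redraw the original problematic example with clipping enabled; the
## bars no longer spill below the ylim-restricted axis.
pdf(tempfile())                      # throwaway device, headless-safe
mids <- barplot(c(200, 300, 250, 350), ylim = c(150, 400), xpd = FALSE)
dev.off()
length(mids)                         # one midpoint per bar
```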

Rob


Robert W. Baer, Ph.D.
Associate Professor
Department of Physiology
A. T. Still University of Health Science
800 W. Jefferson St.
Kirksville, MO 63501-1497 USA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Replacing backslashes with slashes

2006-01-05 Thread Duncan Golicher
It  is often convenient to quickly set the working directory to a path 
copied onto the windows clipboard. A simple trick I have been using for 
a while is along the lines given in the previous posts.

setwd.clip<-function()
{
  options(warn=-1)
  setwd(gsub("\\\\", "/", readLines("clipboard")))
  options(warn=0)
  getwd()
}


I load this at the start of every session and then write setwd.clip() 
whenever I have a path I want to change to on the clipboard. You can of 
course write

setwd(gsub("\\\\", "/", readLines("clipboard")))

every time you need it. Obviously it takes longer and there is the minor 
detail that the path read from the clipboard is incomplete (no EOL 
marker) which leads to an unnecessary warning.


Dr Duncan Golicher
Ecologia y Sistematica Terrestre
Conservación de la Biodiversidad
El Colegio de la Frontera Sur
San Cristobal de Las Casas, 
Chiapas, Mexico

Email: [EMAIL PROTECTED] 

Tel: 967 674 9000 ext 1310
Fax: 967 678 2322
Celular: 044 9671041021

United Kingdom Skypein; 020 7870 6251
Skype name: duncangolicher 
Download Skype from http://www.skype.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread ronggui
2006/1/6, jim holtman <[EMAIL PROTECTED]>:
> If what you are reading in is numeric data, then it would require (807 *
> 118519 * 8) 760MB just to store a single copy of the object -- more memory
> than you have on your computer.  If you were reading it in, then the problem
> is the paging that was occurring.
In fact, if I read it in 3 pieces, each piece is about 170 MB.

>
> You have to look at storing this in a database and working on a subset of
> the data.  Do you really need to have all 807 variables in memory at the
> same time?

Yep, I don't need all the variables, but I don't know how to get only the
necessary variables into R.

In the end I read the data in pieces and used the RSQLite package to
write them to a database, then did the analysis. If I were familiar with
database software, using a database (together with R) would be the best
choice, but converting the file into a database format is not an easy job
for me. I asked for help on the SQLite list, but the solution offered was
not satisfying, as it required knowledge of a third scripting language.
After searching the internet, I arrived at this solution:

#begin
rm(list=ls())
f <- file("D:\\wvsevs_sb_v4.csv", "r")   # backslashes must be escaped in R
i <- 0
done <- FALSE
library(RSQLite)
con <- dbConnect("SQLite", "c:\\sqlite\\database.db3")
tim1 <- Sys.time()

while (!done) {
  i <- i + 1
  tt <- readLines(f, 2500)
  if (length(tt) < 2500) done <- TRUE
  if (length(tt) == 0) break             # file length was a multiple of 2500
  tt <- textConnection(tt)
  if (i == 1) {
    dat <- read.table(tt, header = TRUE, sep = ",", quote = "")
    nms <- names(dat)                    # remember the header row
  } else {
    dat <- read.table(tt, header = FALSE, sep = ",", quote = "")
    names(dat) <- nms                    # keep columns aligned across chunks
  }
  close(tt)
  if (dbExistsTable(con, "wvs")) dbWriteTable(con, "wvs", dat, append = TRUE)
  else dbWriteTable(con, "wvs", dat)
}
close(f)
#end
It's not the best solution, but it works.



> If you use 'scan', you could specify that you do not want some of the
> variables read in so it might make a more reasonably sized objects.
>
>
> On 1/5/06, François Pinard <[EMAIL PROTECTED]> wrote:
> > [ronggui]
> >
> > >R is weak when handling large data files.  I have a data file: 807 vars,
> > >118519 obs, and it's in CSV format.  Stata can read it in in 2 minutes,
> > >but on my PC R almost cannot handle it.  My PC: CPU 1.7G; RAM 512M.
> >
> > Just (another) thought.  I used to use SPSS, many, many years ago, on
> > CDC machines, where the CPU had limited memory and no kind of paging
> > architecture.  Files did not need to be very large for being too large.
> >
> > SPSS had a feature that was then useful, about the capability of
> > sampling a big dataset directly at file read time, quite before
> > processing starts.  Maybe something similar could help in R (that is,
> > instead of reading the whole data in memory, _then_ sampling it.)
> >
> > One can read records from a file, up to a preset amount of them.  If the
> > file happens to contain more records than that preset number (the number
> > of records in the whole file is not known beforehand), already read
> > records may be dropped at random and replaced by other records coming
> > from the file being read.  If the random selection algorithm is properly
> > chosen, it can be made so that all records in the original file have
> > equal probability of being kept in the final subset.
> >
> > If such a sampling facility was built right within usual R reading
> > routines (triggered by an extra argument, say), it could offer
> > a compromise for processing large files, and also sometimes accelerate
> > computations for big problems, even when memory is not at stake.
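
For the record, the scheme François describes is classic reservoir
sampling; a minimal base-R sketch (the function name, k, and the text
connection are illustrative):

```r
## Read a connection line by line, keeping a uniform random sample of k
## lines without knowing the total number of lines ahead of time.
reservoir <- function(con, k) {
  res <- character(0)
  n <- 0
  while (length(line <- readLines(con, n = 1)) == 1) {
    n <- n + 1
    if (n <= k) {
      res[n] <- line                 # fill the reservoir first
    } else if (runif(1) < k / n) {
      res[sample(k, 1)] <- line      # replace a random slot
    }
  }
  res
}

set.seed(1)
samp <- reservoir(textConnection(as.character(1:1000)), k = 10)
length(samp)   # 10; every input line had inclusion probability 10/1000
```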
> >
> > --
> > François Pinard   http://pinard.progiciels-bpi.ca
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 247 0281
>
> What is the problem that you are trying to solve?


--
黄荣贵
Deparment of Sociology
Fudan University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Wald tests and Huberized variances (was: A comment about R:)

2006-01-05 Thread Peter Muhlberger
Hi Achim:  Your reply is tremendously helpful in addressing some of the
outstanding questions I had about R.  The 'econometrics view' materials look
exactly like what I needed.  Many thanks!

But, there is a second point here, which is how difficult it was for me, as
someone just becoming more familiar w/ R's more basic capabilities (in the
past I've focused on features like optim, sem), to find what seem to me like
standard & key features I've taken for granted in other packages.  I looked
high & low in my existing installed packages for the standard version of R,
I googled, I looked in the r-help archives, I looked through several manuals
/ introductions to R I had downloaded.  I've asked questions about all of
the points I raised in my email on this email list before.  I believe I
passed through the parent directory for the econometric view material at the
website w/o realizing what it contained because I thought of "computational
econometrics" as having to do w/ running Monte Carlo models of economic
processes.

If R wants to bring in a wider audience, one thing that might help is a
denser set of cross-references.  For example, perhaps lm's help should
mention the econometrics view materials as well as other places to look for
tests and procedures people may want to do w/ lm.  Another thought is that
perhaps the standard R package help should allow people to find
non-installed but commonly used contributed packages and perhaps their help
page contents.  A feature that would be very helpful for me is the capacity
to search all the contents of help files, not just keywords that at times
seem to miss what I'm trying to find.

Cheers,

Peter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] A comment about R:

2006-01-05 Thread Frank E Harrell Jr
John Fox wrote:
> Dear Peter,
> 
> 
>>-Original Message-
>>From: [EMAIL PROTECTED] 
>>[mailto:[EMAIL PROTECTED] On Behalf Of Peter 
>>Muhlberger
>>Sent: Wednesday, January 04, 2006 2:43 PM
>>To: rhelp
>>Subject: [R] A comment about R:
>>
> 
> 
> . . .
>  
> 
>>Ex. 1)  Wald tests of linear hypotheses after max. likelihood 
>>or even after a regression.  "Wald" does not even appear in 
>>my standard R package on a search.  There's no comment in the 
>>lm help or optim help about what function to use for 
>>hypothesis tests.  I know that statisticians prefer 
>>likelihood ratio tests, but Wald tests are still useful and 
>>indeed crucial for first-pass analysis.  After searching with 
>>Google for some time, I found several Wald functions in 
>>various contributed R packages I did not have installed.  One 
>>confusion was which one would be relevant to my needs.  This 
>>took some time to resolve.  I concluded, perhaps on 
>>insufficient evidence, that package car's Wald test would be 
>>most helpful.  To use it, however, one has to put together a 
>>matrix for the hypotheses, which can be arduous for a 
>>many-term regression or a complex hypothesis.  
>>In comparison, 
>>in Stata one simply states the hypothesis in symbolic terms.  
>>I also don't know for certain that this function in car will 
>>work or work properly w/ various kinds of output, say from lm 
>>or from optim.  To be sure, I'd need to run time-consuming 
>>tests comparing it with Stata output or examine the 
>>function's code.  In Stata the test is easy to find, and 
>>there's no uncertainty about where it can be run or its 
>>accuracy.  Simply having a comment or "see also" in lm help 
>>or mle or optim help pointing the user to the right Wald 
>>function would be of enormous help.

The Design package's anova.Design and contrast.Design make many Wald 
tests very easy.  contrast( ) will allow you to test all kinds of 
hypotheses by stating which differences in predicted values you are 
interested in.

Frank Harrell

>>
> 
> 
> 
> The reference, I believe, is to the linear.hypothesis() function, which has
> methods for lm and glm objects. [To see what kinds of objects
> linear.hypothesis is suitable for, use the command
> methods(linear.hypothesis).] For lm objects, you get an F-test by default.
> Note that the Anova() function, also in car, can more conveniently compute
> Wald tests for certain kinds of hypotheses. More generally, however, I'd be
> interested in your suggestions for an alternative method of specifying
> linear hypotheses. There is currently no method for mle objects, but adding
> one is a good idea, and I'll do that when I have a chance. (In the meantime,
> it's very easy to compute Wald tests from the coefficients and the
> hypothesis and coefficient-covariance matrices. Writing a small function to
> do so, without the bells and whistles of something like linear.hypothesis(),
> should not be hard. Indeed, the ability to do this kind of thing easily is
> what I see as the primary advantage of working in a statistical computing
> environment like R -- or Stata.
> 
> 
>>Ex. 2) Getting neat output of a regression with Huberized 
>>variance matrix.
>>I frequently have to run regressions w/ robust variances.  In 
>>Stata, one simply adds the word "robust" to the end of the 
>>command or "cluster(cluster.variable)" for a cluster-robust 
>>error.  In R, there are two functions, robcov and hccm.  I 
>>had to run tests to figure out what the relationship is 
>>between them and between them and Stata (robcov w/o cluster 
>>gives hccm's hc0; hccm's hc1 is equivalent to Stata's 
>>'robust' w/o cluster; etc.).  A single sentence in hccm's 
>>help saying something to the effect that statisticians prefer 
>>hc3 for most types of data might save me from having to 
>>scramble through the statistical literature to try to figure 
>>out which of these I should be using.  A few sentences on 
>>what the differences are between these methods would be even 
>>better.  Then, there's the problem of output.  Given that hc1 
>>or hc3 are preferred for non-clustered data, I'd need to be 
>>able to get regression output of the form summary(lm) out of 
>>hccm, for any practical use.  Getting this, however, would 
>>require programming my own function.  Huberized t-stats for 
>>regressions are a commonplace need; an R oriented a little 
>>more toward everyday needs would not require programming for 
>>them.  Also, I'm not sure yet how well any of the 
>>existing functions handle missing data.
>>
> 
> 
> I think that we have a philosophical difference here: I don't like giving
> advice in documentation. An egregious extended example of this, in my
> opinion, is the SPSS documentation. The hccm() function uses hc3 as the
> default, which is an implicit recommendation, but more usefully, in my view,
> points to Long and Ervin's American Statistician paper on the subject, which
> does give advice and which is quite accessible. As well, and more gen

Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread jim holtman
If what you are reading in is numeric data, then it would require (807 *
118519 * 8) 760MB just to store a single copy of the object -- more memory
than you have on your computer.  If you were reading it in, then the problem
is the paging that was occurring.
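The back-of-the-envelope arithmetic can be reproduced in R:

```r
vars <- 807; obs <- 118519
bytes <- vars * obs * 8   # a double takes 8 bytes
bytes / 1e6               # roughly 765 million bytes for one copy, before any duplication
```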

You have to look at storing this in a database and working on a subset of
the data.  Do you really need to have all 807 variables in memory at the
same time?

If you use 'scan', you could specify that some of the variables not be
read in, which might give you a more reasonably sized object.
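For instance (a sketch; the file name and kept columns are invented), read.table skips any column whose colClasses entry is "NULL", and scan() has an analogous mechanism via a 'what' list with NULL entries:

```r
## Keep only a handful of the 807 columns at read time.
cc <- rep("NULL", 807)        # drop every column by default...
cc[c(1, 5, 12)] <- "numeric"  # ...except the hypothetical columns of interest
dat <- read.table("bigfile.csv", sep = ",", header = TRUE, colClasses = cc)
```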


On 1/5/06, François Pinard <[EMAIL PROTECTED]> wrote:
>
> [ronggui]
>
> >R is weak when handling large data files.  I have a data file: 807 vars,
> >118519 obs, in CSV format.  Stata can read it in 2 minutes, but on
> >my PC R almost cannot handle it.  My PC's CPU is 1.7 GHz; RAM 512 MB.
>
> Just (another) thought.  I used to use SPSS, many, many years ago, on
> CDC machines, where the CPU had limited memory and no kind of paging
> architecture.  Files did not need to be very large for being too large.
>
> SPSS had a feature that was then useful, about the capability of
> sampling a big dataset directly at file read time, quite before
> processing starts.  Maybe something similar could help in R (that is,
> instead of reading the whole data in memory, _then_ sampling it.)
>
> One can read records from a file, up to a preset amount of them.  If the
> file happens to contain more records than that preset number (the number
> of records in the whole file is not known beforehand), already read
> records may be dropped at random and replaced by other records coming
> from the file being read.  If the random selection algorithm is properly
> chosen, it can be made so that all records in the original file have
> equal probability of being kept in the final subset.
>
> If such a sampling facility was built right within usual R reading
> routines (triggered by an extra argument, say), it could offer
> a compromise for processing large files, and also sometimes accelerate
> computations for big problems, even when memory is not at stake.
>
> --
> François Pinard   http://pinard.progiciels-bpi.ca
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



--
Jim Holtman
Cincinnati, OH
+1 513 247 0281

What is the problem you are trying to solve?



Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread Prof Brian Ripley
Another possibility is to make use of the several DBMS interfaces already 
available for R.  It is very easy to pull in a sample from one of those, 
and surely keeping such large data files as ASCII is not good practice.
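For instance (a sketch; the database file, table, and sample size are invented), a random subset can be pulled server-side through the DBI interface instead of reading the whole CSV:

```r
library(DBI)
library(RSQLite)
con <- dbConnect(dbDriver("SQLite"), dbname = "survey.db")
## ORDER BY RANDOM() is SQLite-specific; other backends have equivalents.
smp <- dbGetQuery(con, "SELECT * FROM survey ORDER BY RANDOM() LIMIT 10000")
dbDisconnect(con)
```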


One problem with François Pinard's suggestion (the credit got lost) is 
that R's I/O is not line-oriented but stream-oriented.  So selecting lines 
is not particularly easy in R.  That's a deliberate design decision, given 
the DBMS interfaces.


I rather thought that using a DBMS was standard practice in the R 
community for those using large datasets: it gets discussed rather often.


On Thu, 5 Jan 2006, Kort, Eric wrote:


-Original Message-

[ronggui]


R is weak when handling large data files.  I have a data file: 807 vars,
118519 obs, in CSV format.  Stata can read it in 2 minutes, but on
my PC R almost cannot handle it.  My PC's CPU is 1.7 GHz; RAM 512 MB.


Just (another) thought.  I used to use SPSS, many, many years ago, on
CDC machines, where the CPU had limited memory and no kind of paging
architecture.  Files did not need to be very large for being too large.

SPSS had a feature that was then useful, about the capability of
sampling a big dataset directly at file read time, quite before
processing starts.  Maybe something similar could help in R (that is,
instead of reading the whole data in memory, _then_ sampling it.)

One can read records from a file, up to a preset amount of them.  If the
file happens to contain more records than that preset number (the number
of records in the whole file is not known beforehand), already read
records may be dropped at random and replaced by other records coming
from the file being read.  If the random selection algorithm is properly
chosen, it can be made so that all records in the original file have
equal probability of being kept in the final subset.

If such a sampling facility was built right within usual R reading
routines (triggered by an extra argument, say), it could offer
a compromise for processing large files, and also sometimes accelerate
computations for big problems, even when memory is not at stake.



Since I often work with images and other large data sets, I have been thinking about a 
"BLOb" (binary large object--though it wouldn't necessarily have to be binary) 
package for R--one that would handle I/O for such creatures and only bring as much data 
into the R space as was actually needed.

So I see 3 possibilities:

1. The sort of functionality you describe is implemented in the R internals (by 
people other than me).
2. Some individuals (perhaps myself included) write such a package.
3. This thread fizzles out and we do nothing.

I guess I will see what, if any, discussion ensues from this point to see which 
of these three options seems worth pursuing.


--
François Pinard   http://pinard.progiciels-bpi.ca


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] A comment about R

2006-01-05 Thread Gregory Snow

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Thursday, January 05, 2006 6:26 AM
> To: 'Patrick Burns'; John Maindonald
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] A comment about R

[snip]

> Any suggestion on how to go about getting kids that young on 
> (R) programming?

For those of us in the US look at:
http://www.amstat.org/education/index.cfm?fuseaction=adoptas

I expect that some of the other stats organizations have similar
Adopt-A-School programs.

Last year I was in my daughter's 3rd grade class helping with a party
when I noticed a large posterboard that had the heights in inches of all
the students.  Since I had run out of apple juice to pour and was getting
a little bored, I went over to the chalkboard next to it and made a
quick stem-and-leaf plot of the data.  The teacher was interested in
what I had done and came over and had me explain the stem-and-leaf plot
to her (she had used the data to talk about averages (mean and median,
but not by that name) and spread (general concept, not computing
anything)).
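In R that plot takes a single call (the heights below are invented):

```r
## Hypothetical heights in inches for a 3rd-grade class.
heights <- c(48, 49, 50, 50, 51, 52, 52, 52, 53, 54, 54, 55, 56, 58)
stem(heights)   # prints a text stem-and-leaf plot, like the chalkboard one
```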

My other daughter (6) also brought home homework to show me where they
had been given candy hearts (it was in February) and had colored in
boxes corresponding to the colors of their hearts to make a basic bar
graph.  I showed her how I could do the same thing on my laptop using R
(I even colored the bars to match her graph and used the symbol font
with text to put colored hearts under the bars like hers had).  She was
impressed enough to make me print the graph so she could show her
teacher.
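A sketch of that bar graph (the counts are invented):

```r
hearts <- c(pink = 4, red = 6, white = 3, purple = 5)
barplot(hearts, col = names(hearts),
        main = "Candy hearts by color", ylab = "Count")
```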

These are two opportunities that I should have followed up on more; now I
just need to get things in gear and do a more formal adopting of their
school.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



[R] what can I do to make the lines straight

2006-01-05 Thread Lisa Wang
Hello there,

I made a few graphics, but only one looks like the attached.  Why is the
line not straight, and how can I fix it?

thank you very much

Lisa Wang
Princess Margaret Hospital
Toronto, Canada

Re: [R] A comment about R:

2006-01-05 Thread Achim Zeileis
As John and I seem to have written our replies in parallel, I've added
some more clarifying remarks in this mail:

> Note that the Anova() function, also in car, can more conveniently compute
> Wald tests for certain kinds of hypotheses. More generally, however, I'd be
> interested in your suggestions for an alternative method of specifying
> linear hypotheses.

My understanding was that Peter just wants to eliminate various elements
from terms(obj), which is what waldtest() in lmtest supports. If some
other way of specifying nested models is required, I'd also be interested
in that.

> The Anova() function with argument white=TRUE will give you F-tests
> corresponding to the t-tests to which you refer (though it will combine df
> for multiple-df terms in the model). To get the kind of summary you
> describe, you could use something like
>
> mysummary <- function(model){
>   coef <- coef(model)
>   se <- sqrt(diag(hccm(model)))
>   t <- coef/se
>   p <- 2*pt(abs(t), df=model$df.residual, lower=FALSE)
>   table <- cbind(coef, se, t, p)
>   rownames(table) <- names(coef)
>   colnames(table) <- c("Estimate", "Std. Error", "t value",
> "Pr(>|t|)")
>   table
>   }

This is supported out of the box in coeftest() in lmtest.
Z



Re: [R] A comment about R:

2006-01-05 Thread John Fox
Dear Peter,

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter 
> Muhlberger
> Sent: Wednesday, January 04, 2006 2:43 PM
> To: rhelp
> Subject: [R] A comment about R:
> 

. . .
 
> Ex. 1)  Wald tests of linear hypotheses after max. likelihood 
> or even after a regression.  "Wald" does not even appear in 
> my standard R package on a search.  There's no comment in the 
> lm help or optim help about what function to use for 
> hypothesis tests.  I know that statisticians prefer 
> likelihood ratio tests, but Wald tests are still useful and 
> indeed crucial for first-pass analysis.  After searching with 
> Google for some time, I found several Wald functions in 
> various contributed R packages I did not have installed.  One 
> confusion was which one would be relevant to my needs.  This 
> took some time to resolve.  I concluded, perhaps on 
> insufficient evidence, that package car's Wald test would be 
> most helpful.  To use it, however, one has to put together a 
> matrix for the hypotheses, which can be arduous for a 
> many-term regression or a complex hypothesis.  
> In comparison, 
> in Stata one simply states the hypothesis in symbolic terms.  
> I also don't know for certain that this function in car will 
> work or work properly w/ various kinds of output, say from lm 
> or from optim.  To be sure, I'd need to run time-consuming 
> tests comparing it with Stata output or examine the 
> function's code.  In Stata the test is easy to find, and 
> there's no uncertainty about where it can be run or its 
> accuracy.  Simply having a comment or "see also" in lm help 
> or mle or optim help pointing the user to the right Wald 
> function would be of enormous help.
> 


The reference, I believe, is to the linear.hypothesis() function, which has
methods for lm and glm objects. [To see what kinds of objects
linear.hypothesis is suitable for, use the command
methods(linear.hypothesis).] For lm objects, you get an F-test by default.
Note that the Anova() function, also in car, can more conveniently compute
Wald tests for certain kinds of hypotheses. More generally, however, I'd be
interested in your suggestions for an alternative method of specifying
linear hypotheses. There is currently no method for mle objects, but adding
one is a good idea, and I'll do that when I have a chance. (In the meantime,
it's very easy to compute Wald tests from the coefficients and the
hypothesis and coefficient-covariance matrices. Writing a small function to
do so, without the bells and whistles of something like linear.hypothesis(),
should not be hard. Indeed, the ability to do this kind of thing easily is
what I see as the primary advantage of working in a statistical computing
environment like R -- or Stata.
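A hedged example of the matrix form described above (the model is invented, and this follows the car API of that era, with a hypothesis matrix plus a right-hand side):

```r
library(car)
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
## H0: beta_wt = 0 and beta_hp = beta_disp
## columns of L: (Intercept), wt, hp, disp
L <- rbind(c(0, 1, 0,  0),
           c(0, 0, 1, -1))
linear.hypothesis(fit, L, rhs = c(0, 0))   # F-test by default for "lm" objects
```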

> Ex. 2) Getting neat output of a regression with Huberized 
> variance matrix.
> I frequently have to run regressions w/ robust variances.  In 
> Stata, one simply adds the word "robust" to the end of the 
> command or "cluster(cluster.variable)" for a cluster-robust 
> error.  In R, there are two functions, robcov and hccm.  I 
> had to run tests to figure out what the relationship is 
> between them and between them and Stata (robcov w/o cluster 
> gives hccm's hc0; hccm's hc1 is equivalent to Stata's 
> 'robust' w/o cluster; etc.).  A single sentence in hccm's 
> help saying something to the effect that statisticians prefer 
> hc3 for most types of data might save me from having to 
> scramble through the statistical literature to try to figure 
> out which of these I should be using.  A few sentences on 
> what the differences are between these methods would be even 
> better.  Then, there's the problem of output.  Given that hc1 
> or hc3 are preferred for non-clustered data, I'd need to be 
> able to get regression output of the form summary(lm) out of 
> hccm, for any practical use.  Getting this, however, would 
> require programming my own function.  Huberized t-stats for 
> regressions are a commonplace need; an R oriented a little 
> more toward everyday needs would not require programming for 
> them.  Also, I'm not sure yet how well any of the 
> existing functions handle missing data.
> 

I think that we have a philosophical difference here: I don't like giving
advice in documentation. An egregious extended example of this, in my
opinion, is the SPSS documentation. The hccm() function uses hc3 as the
default, which is an implicit recommendation, but more usefully, in my view,
points to Long and Ervin's American Statistician paper on the subject, which
does give advice and which is quite accessible. As well, and more generally,
the car package is associated with a book (my R and S-PLUS Companion to
Applied Regression), which gives advice, though, admittedly, tersely in this
case.

The Anova() function with argument white=TRUE will give you F-tests
corresponding to the t-tests to which you refer (though it will combine df
for multiple-df terms in 

Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread Kort, Eric
> -Original Message-
> 
> [ronggui]
> 
> >R is weak when handling large data files.  I have a data file: 807 vars,
> >118519 obs, in CSV format.  Stata can read it in 2 minutes, but on
> >my PC R almost cannot handle it.  My PC's CPU is 1.7 GHz; RAM 512 MB.
> 
> Just (another) thought.  I used to use SPSS, many, many years ago, on
> CDC machines, where the CPU had limited memory and no kind of paging
> architecture.  Files did not need to be very large for being too large.
> 
> SPSS had a feature that was then useful, about the capability of
> sampling a big dataset directly at file read time, quite before
> processing starts.  Maybe something similar could help in R (that is,
> instead of reading the whole data in memory, _then_ sampling it.)
> 
> One can read records from a file, up to a preset amount of them.  If the
> file happens to contain more records than that preset number (the number
> of records in the whole file is not known beforehand), already read
> records may be dropped at random and replaced by other records coming
> from the file being read.  If the random selection algorithm is properly
> chosen, it can be made so that all records in the original file have
> equal probability of being kept in the final subset.
> 
> If such a sampling facility was built right within usual R reading
> routines (triggered by an extra argument, say), it could offer
> a compromise for processing large files, and also sometimes accelerate
> computations for big problems, even when memory is not at stake.
> 

Since I often work with images and other large data sets, I have been thinking 
about a "BLOb" (binary large object--though it wouldn't necessarily have to be 
binary) package for R--one that would handle I/O for such creatures and only 
bring as much data into the R space as was actually needed.

So I see 3 possibilities:

1. The sort of functionality you describe is implemented in the R internals (by 
people other than me).
2. Some individuals (perhaps myself included) write such a package.
3. This thread fizzles out and we do nothing.

I guess I will see what, if any, discussion ensues from this point to see which 
of these three options seems worth pursuing.

> --
> François Pinard   http://pinard.progiciels-bpi.ca
> 



Re: [R] A comment about R:

2006-01-05 Thread Detlef Steuer
Only for the record:

There is a wiki for R in general, used by only a few people, announced here 
some year(s) ago:
http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl

The question is: one or more wikis? 

Detlef

On Wed, 04 Jan 2006 20:35:17 +0100
Philippe Grosjean <[EMAIL PROTECTED]> wrote:

> David Forrest wrote:
>  > [...]
>  > Any volunteers?
> 
> Yes, me (well, partly...)! Here is what I propose: this is a very 
> lengthy thread in R-Help, with many interesting ideas and suggestions. I 
> fear that, as it happens too often, those nice ideas will be lost 
> because of the medium used: email! By nature, emails are read and then 
> deleted (well, there is the R-Help archive, but anyway, threads in a 
> mailing list are not at all the best tool for making collaborative 
> documents like those tutorials and the like).
> 
> I just cooked a little Wiki *dedicated to R beginners* (meaning they can 
> contribute too, and are very welcome to discuss their problems -possibly 
> trivial for others-). It is available at 
> http://www.sciviews.org/_rgui/wiki. For the moment, everyone can edit 
> and add pages, but I will restrict rights in the future to logged users 
> only (with everybody allowed to log in at any time). So that we will be 
> able to track who made changes (authorship).
> 
> For those who do not know the Wiki concept, it is a very simple way of 
> working together in the same documents. The concept has proven very 
> powerful, a good example being Wikipedia, which is becoming one of 
> the largest encyclopedias in the world... and is also about as accurate 
> as Encyclopedia Britannica (but read this: 
> http://www.nature.com/news/2005/051212/full/438900a.html).
> 
> Here is the introduction of the R (GUI) Wiki:
> 
> This Wiki is mainly dedicated to deal with R beginners problems. 
> Although we would like to emphasize using R GUIs (Graphical User 
> Interfaces), this Wiki is not restricted to those GUIs: one can also 
> deal with command-line approaches. The main idea is thus to have 
> material contributed by both beginners, and by more advanced R users, 
> that will help novices or casual users of R (http://www.r-project.org).
> 
> Overview
> 
> * The various documents in the [[wiki section]] explain how to use 
> DokuWiki to edit documents in this site.
> 
> * The [[beginners section]] is dedicated to... beginners (share 
> experience, expose problems and difficulties useful to share with other 
> beginners, or to get help from more advanced people).
> 
> * The [[tutorials section]] is the place where you can put various R 
> session examples, or short tutorials on either general or specific use of R.
> 
> * The [[easier section]] aims to collect together various pieces of R 
> code that simplify various tasks (especially for beginners) and that 
> will ultimately be compiled into an “easieR” R package on CRAN.
> 
> * The [[varia section]] is for any material that does not fit in the 
> previous sections.
> 
> 
> Final note: working with Wikis requires some learning... So, I am not 
> sure at all that many R beginners will contribute to this wiki, but, of 
> course, I hope so. Just let's pretend that it is a small experiment to 
> try answering requests for another Internet space than R-Help, 
> specifically dedicated to beginners...
> 
> A good starting point would be the following: all people that expressed 
> interesting points in this thread could "copy and paste their ideas" to 
> new pages in the Wiki.
> 
> Best,
> 
> Philippe Grosjean
> 
> ..<°}))><
>   ) ) ) ) )
> ( ( ( ( (Prof. Philippe Grosjean
>   ) ) ) ) )
> ( ( ( ( (Numerical Ecology of Aquatic Systems
>   ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
> ( ( ( ( (Academie Universitaire Wallonie-Bruxelles
>   ) ) ) ) )   8, av du Champ de Mars, 7000 Mons, Belgium
> ( ( ( ( (
>   ) ) ) ) )   phone: + 32.65.37.34.97, fax: + 32.65.37.30.54
> ( ( ( ( (email: [EMAIL PROTECTED]
>   ) ) ) ) )
> ( ( ( ( (web:   http://www.umh.ac.be/~econum
>   ) ) ) ) )  http://www.sciviews.org
> ( ( ( ( (
> ..
> 
> David Forrest wrote:
> > On Tue, 3 Jan 2006, Gabor Grothendieck wrote:
> > ...
> > 
> >>In fact there are some things that are very easy
> >>to do in Stata and can be done in R but only with more difficulty.
> >>For example, consider this introductory session in Stata:
> >>
> >>http://www.stata.com/capabilities/session.html
> >>
> >>Looking at the first few queries,
> >>see how easy it is to take the top few in Stata whereas in R one would
> >>have a complex use of order.  Its not hard in R to write a function
> >>that would make it just as easy but its not available off the top
> >>of one's head though RSiteSearch("sort.data.frame") will find one
> >>if one knew what to search for.
> > 
> > 
> > This sort of thing points to an opportunity for documentation.  Building a
> > tutorial session in R on how one would do a

[R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread François Pinard
[ronggui]

>R is weak when handling large data files.  I have a data file: 807 vars,
>118519 obs, in CSV format.  Stata can read it in 2 minutes, but on
>my PC R almost cannot handle it.  My PC's CPU is 1.7 GHz; RAM 512 MB.

Just (another) thought.  I used to use SPSS, many, many years ago, on 
CDC machines, where the CPU had limited memory and no kind of paging 
architecture.  Files did not need to be very large for being too large.

SPSS had a feature that was then useful, about the capability of 
sampling a big dataset directly at file read time, quite before 
processing starts.  Maybe something similar could help in R (that is, 
instead of reading the whole data in memory, _then_ sampling it.)

One can read records from a file, up to a preset amount of them.  If the 
file happens to contain more records than that preset number (the number 
of records in the whole file is not known beforehand), already read 
records may be dropped at random and replaced by other records coming 
from the file being read.  If the random selection algorithm is properly 
chosen, it can be made so that all records in the original file have 
equal probability of being kept in the final subset.

If such a sampling facility was built right within usual R reading 
routines (triggered by an extra argument, say), it could offer 
a compromise for processing large files, and also sometimes accelerate 
computations for big problems, even when memory is not at stake.
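A minimal sketch of such a reservoir scheme in base R (the function and argument names are invented; this is the classic "keep each record with probability k/n" algorithm described above):

```r
## Read a file line by line, keeping at most k lines such that every
## line in the file has equal probability of ending up in the sample.
reservoir_lines <- function(file, k) {
  con <- file(file, "r")
  on.exit(close(con))
  keep <- character(0)
  n <- 0
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break          # end of file
    n <- n + 1
    if (n <= k) {
      keep[n] <- line                     # fill the reservoir first
    } else if (runif(1) < k / n) {
      keep[sample(k, 1)] <- line          # replace a random slot
    }
  }
  keep
}
```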

-- 
François Pinard   http://pinard.progiciels-bpi.ca



[R] about terminate an identification

2006-01-05 Thread Zhesi He
Dear all,

I'm now working on some programming for interactive displays from R.
So I'm creating an R display and want some interactive selection to be 
available.  I know GGobi and things like that, but it doesn't cope with 
dendrograms.

So, if I go back to R graphics, as far as I know,  --
"For the usual X11 device the identification process is terminated by 
pressing any mouse button other than the first. "

but I really want to keep it active all the time until I want it to 
end.  Is there any package that can do such things?
Also, is there any way I can make my right click activate some 
drop-down menus, or something like that?

Thanks a lot.

___

Zhesi He
Computational Biology Laboratory, University of York
York YO10 5YW, U.K.
Phone:  +44-(0)1904-328279
Email:  [EMAIL PROTECTED]
___




Re: [R] Wald tests and Huberized variances (was: A comment about R:)

2006-01-05 Thread Achim Zeileis
On Wed, 4 Jan 2006, Peter Muhlberger wrote:

One comment in advance: please use a more meaningful subject. I would have
missed this mail if a colleague hadn't pointed me to it.

> I'm someone who from time to time comes to R to do applied stats for social
> science research.
[snip]
> I would also prefer not to have to work through a
> couple books on R or S+ to learn how to meet common needs in R.  If R were

There are some overviews and pointers available for certain topics,
so-called CRAN task views:
  http://CRAN.R-project.org/src/contrib/Views/
Currently, there is not yet a "SocialSciences" view (but John Fox is
working on one). However, it might also be interesting for you to look
at the "Econometrics" view, which has some remarks about Wald tests.

> Ex. 1)  Wald tests of linear hypotheses after max. likelihood or even after
> a regression.  "Wald" does not even appear in my standard R package on a
> search.

You might want to look at waldtest() and coeftest() in package lmtest. And
you seem to have discovered linear.hypothesis() in package car. All three
perform Wald tests, providing different means of specifying the
hypothesis/alternative of the tests.

> There's no comment in the lm help or optim help about what function
> to use for hypothesis tests.

Well, the lm() man page does say:
  The functions 'summary' and 'anova' are used to obtain and print a
  summary and analysis of variance table of the results.

As for optim() it is not that straightforward, because optim() does not
know whether it maximizes a proper likelihood or not.

> I know that statisticians prefer likelihood
> ratio tests, but Wald tests are still useful and indeed crucial for
> first-pass analysis.  After searching with Google for some time, I found
> several Wald functions in various contributed R packages I did not have
> installed.  One confusion was which one would be relevant to my needs.  This
> took some time to resolve.

Yes, this is a problem that is at least partly addressed by the CRAN task
views.

> I concluded, perhaps on insufficient evidence,
> that package car's Wald test would be most helpful.  To use it, however, one
> has to put together a matrix for the hypotheses, which can be arduous for a
> many-term regression or a complex hypothesis.  In comparison, in Stata one
> simply states the hypothesis in symbolic terms.

waldtest() does the latter and is linked in the "See Also" section of
linear.hypothesis()
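For instance (the model is invented), waldtest() lets you state the restricted model symbolically rather than building a matrix:

```r
library(lmtest)
full <- lm(mpg ~ wt + hp + disp, data = mtcars)
waldtest(full, . ~ . - hp - disp)   # H0: the hp and disp coefficients are both 0
```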

> I also don't know for
> certain that this function in car will work or work properly w/ various
> kinds of output, say from lm or from optim.

The man page of linear.hypothesis() does say that there are methods for
"lm" and "glm" objects (but not for results from optim).

> Ex. 2) Getting neat output of a regression with Huberized variance matrix.
> I frequently have to run regressions w/ robust variances.  In Stata, one
> simply adds the word "robust" to the end of the command or
> "cluster(cluster.variable)" for a cluster-robust error.  In R, there are two
> functions, robcov and hccm.  I had to run tests to figure out what the
> relationship is between them and between them and Stata (robcov w/o cluster
> gives hccm's hc0; hccm's hc1 is equivalent to Stata's 'robust' w/o cluster;
> etc.).

This is rather clearly documented on the respective man pages. hccm()
provides HC covariance matrices without clustering, as does vcovHC() in
package sandwich. I plan to extend vcovHC() to also deal with clustered
data, but I haven't got round to doing so yet.

> A single sentence in hccm's help saying something to the effect that
> statisticians prefer hc3 for most types of data might save me from having to
> scramble through the statistical literature to try to figure out which of
> these I should be using.  A few sentences on what the differences are
> between these methods would be even better.

Yes and no. I'll add some more comments about the different HC-type
covariance matrices, but on the other hand this is just software, which
cannot replace understanding the underlying theory.

> Then, there's the problem of
> output.  Given that hc1 or hc3 are preferred for non-clustered data, I'd
> need to be able to get regression output of the form summary(lm) out of
> hccm, for any practical use.  Getting this, however, would require
> programming my own function.

Or use coeftest() from package lmtest, which is intended for exactly this.
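A minimal sketch of that combination, with invented data (assumes the lmtest and sandwich packages are installed):

```r
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

set.seed(123)
d  <- data.frame(x = rnorm(100))
d$y <- 1 + 0.5 * d$x + rnorm(100) * (1 + abs(d$x))  # heteroskedastic errors
fm <- lm(y ~ x, data = d)

summary(fm)                                    # usual, non-robust table
coeftest(fm, vcov = vcovHC(fm, type = "HC3"))  # same layout, HC3 std. errors
```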

> Huberized t-stats for regressions are
> commonplace needs, an R oriented a little toward more everyday needs would
> not require programming of such needs.  Also, I'm not sure yet how well any
> of the existing functions handle missing data.

When fitting a linear model via lm() you can specify a suitable na.action.
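For example (base R only; the data are made up):

```r
## na.action controls how lm() treats rows with missing values
set.seed(1)
d <- data.frame(x = 1:10,
                y = c(2 * (1:9) + rnorm(9, sd = 0.1), NA))

fit1 <- lm(y ~ x, data = d, na.action = na.omit)     # drop incomplete rows
fit2 <- lm(y ~ x, data = d, na.action = na.exclude)  # drop, but pad results

length(residuals(fit1))  # 9
length(residuals(fit2))  # 10 (an NA is re-inserted for the missing case)
```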

The released versions of lmtest and sandwich can deal with Wald tests and
sandwich covariance matrix estimators for linear models. I've got
development versions ready which make the functions fully object-oriented
and thus applicable to "glm" or "survreg" objects (for censored data).

Re: [R] A comment about R

2006-01-05 Thread Gabor Grothendieck
On 1/5/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> From: Patrick Burns
> >
> > John Maindonald wrote:
> >
> > > ...
> > >
> > >(4) When should students start learning R?
> > >
> > >[Students should get their first exposure to a high-level
> > programming
> > >language, in the style of R then Python or Octave, at age 11-14.
> > >There are now good alternatives to the former use of Fortran or
> > >Pascal, languages which have for good reason dropped out of favour
> > >for learning experience. They should start on R while their
> > minds are
> > >still malleable, and long before they need it for serious
> > research use.]
> > >
> > >
> >
> > I think 11-14 years old might better be halved.  Kids are
> > playing very complicated video games barely after they
> > learn to walk.
>
> My kids (7- and 5-year old) barely get an hour on video games a week, and I
> can see that they lag behind their peers at the games (though I don't feel
> sorry for that).  I hope I won't be accused of `endangering welfare of
> children'...
>
> > R is a quite reasonable programming language for children.
> > You don't need to worry about low-level issues, and it is
> > easy to produce graphics with it.
>
> Any suggestion on how to go about getting kids that young on (R)
> programming?

I have introduced a number of computer software tools to my nephew,
who is a teenager.  I think the key factor is motivation and attention
span -- which is short.  They want to get results fast and want the
results to be of interest to them.

I have taught him elements of HTML, javascript and R.  In retrospect,
the most successful was HTML and to a lesser extent javascript.
When I asked him which of the three he wanted to learn more of
after not having done it for a while it was javascript.

The advantage of starting with HTML is that it's relatively simple: within
one or two sessions they will be able to put together web pages for
themselves, so it's obviously useful and they can be creative almost
immediately. HTML also leads naturally to javascript, and one can download
lots of fancy mouse tails and other motivating javascript snippets.

Previously we did it in person, but now we are in different cities and
do it via instant messaging.  We started again with javascript (which of
the three was the one he favored getting back into) but found that it
was difficult to communicate javascript over instant messaging, so we
tried R instead.

Because R is interactive, one can easily discuss a line at a time and
include it right in the instant-messaging dialogue, so in that mode I
found R workable to communicate, whereas javascript was difficult.  There
are some nice graphics demos in R which are motivating, although I think
the javascript mouse tails are still more appealing to someone that age.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] e1071::SVM calculate distance to separating hyperplane

2006-01-05 Thread David Meyer

predict.svm() can give you the decision values which are the distances
you are looking for (up to a scaling constant).
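For example (a hedged sketch using the built-in iris data; assumes the e1071 package is installed):

```r
library(e1071)

## Two-class subset of iris, so the SVM has a single separating hyperplane
d <- subset(iris, Species != "virginica")
d$Species <- factor(d$Species)  # drop the unused level

fit  <- svm(Species ~ ., data = d)
pred <- predict(fit, d, decision.values = TRUE)

## Signed distances to the hyperplane, up to a scaling constant
dv <- attr(pred, "decision.values")
head(dv)
```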

Regards,
David

>Hi,
>I know this question has been posed before, but I didn't find the answer in
>the R-help archive, so please accept my sincere apologies for being
>repetitive:
>How can one (elegantly) calculate the distance between data points (in the
>transformed space, I suppose) and the hyperplane that separates the 2
>categories when using svm() from the e1071 library?

>thanks a lot,
>Hans

-- 
Dr. David Meyer
Department of Information Systems and Operations

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



Re: [R] problem with using lines command on windows XP machine

2006-01-05 Thread Liaw, Andy
lines() connects the `dots' given.  If you want straight lines spanning the
entire graph, you are better off with abline().
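A quick sketch of the difference (base R graphics):

```r
plot(c(-1, 0, 1), c(-1, 0, 1), type = "l")  # the original example
lines(c(-1, 0), c(0, 1))   # joins only the two points given
abline(v = 0)              # vertical line spanning the whole plot region
abline(h = 0, lty = 2)     # dashed horizontal line at y = 0
```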

Andy

From: Edwin Commandeur
> 
> Hi Petr and Eric,
> 
> Thanks for your comments.
> 
> To plot a vertical line, using "lines(0)" does not work, but
> 
> lines(c(-1,0),c(0,1))
> 
> does the work in my simple test example. I just interpreted the ?lines
> documentation wrong.
> 
> So the "lines" command does work on my pc. Of course 
> "abline(v=0)" will
> also do the job in this specific example...
> 
> Sorry for the trouble,
> 
> Edwin
> 
> -Original Message-
> From: Petr Pikal [mailto:[EMAIL PROTECTED]
> Sent: donderdag 5 januari 2006 13:32
> To: Edwin Commandeur; r-help@stat.math.ethz.ch
> Subject: Re: [R] problem with using lines command on windows 
> XP machine
> 
> 
> Hi
> 
> On 5 Jan 2006 at 11:27, Edwin Commandeur wrote:
> 
> From: "Edwin Commandeur" <[EMAIL PROTECTED]>
> To:   
> Date sent:Thu, 5 Jan 2006 11:27:01 +0100
> Subject:  [R] problem with using lines command on 
> windows XP machine
> 
> > Hello,
> >
> > I'm using R version 2.2.0 installed on a windows XP machine with SP2
> > (maybe it's also interesting to note it's a laptop, so it outputs to a
> > laptop screen) and I wanted to draw a line in a graph, but it does
> > not seem to work.
> >
> > To test it I use the following code:
> >
> > x = c(-1,0,1)
> > y = c(-1,0,1)
> > plot(x,y, type="l", xlim=c(-1,1), ylim=c(-1,1))
> > lines(0)
> 
> quite close
> 
> abline(v=0)
> 
> draws a vertical line at x=0
> 
> from help page
> Arguments:
> 
> x, y: coordinate vectors of points to join.
> 
> Maybe mentioning abline in the See Also of the lines help page would be good
> 
> HTH
> Petr
> 
> 
> 
> >
> > If I understand the documentation right this should draw a 
> line (with
> > default settings, I'm not setting any parameters) at x=0.
> >
> > I tried goofing around a bit setting linewidth and color 
> differently,
> > I tried using xy.coords etc, but no line appeared in the graph.
> >
> > The commands abline and segments work perfectly fine (so I am now
> > using segments to plot the line I want), but I still think the lines
> > command should work.
> >
> > Does anybody have similar problems drawing lines on XP machines (or
> > laptops in general)? Or am I doing something abominably wrong?
> >
> > Greetings and thanks in advance for any replies,
> > Edwin Commandeur
> >
> 
> Petr Pikal
> [EMAIL PROTECTED]
> 
> 
>



Re: [R] .Rprofile files (was R newbie configuration)

2006-01-05 Thread Mark Leeds
Thanks a lot. setHook is currently not in my knowledge set, but it's
great to save these things so I can look them up when I feel more
comfortable.

Just to add to that Stata versus R discussion :

I believe anyone who uses any package other than R is probably missing
out in the long run. It's truly unbelievable what has been done here. I
feel like I fell asleep for 5 years (by not using it) and just woke up
to all of these new packages, facilities, etc.

Mark




-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 05, 2006 4:04 AM
To: Petr Pikal
Cc: Mark Leeds; R-Stat Help
Subject: Re: [R] .Rprofile files (was R newbie configuration)

And here is one with a working setHook call.

options(show.signif.stars=FALSE)
setHook(packageEvent("grDevices", "onLoad"),
 function(...) grDevices::ps.options(horizontal=FALSE))
set.seed(1234)
options(repos=c(CRAN="http://cran.uk.r-project.org"))


On Thu, 5 Jan 2006, Petr Pikal wrote:

> Hi
>
> here is my example of .Rprofile file
>
>
> require(graphics)
> require(utils)
>
> # setHook(packageEvent("graphics", "onLoad"), function(...)
> # graphics::par(bg="white"))  ## did not manage to persuade setHook
> # to work properly
>
> par(bg="white")
> RNGkind("Mersenne-Twister", "Inversion")
>
> # some set of my functions and data
>
> .libPaths("D:/programy/R/R-2.2.0/library/fun")
> library(fun)
> data(stand)
>
>
> HTH
> Petr
>
>
> On 4 Jan 2006 at 15:46, Mark Leeds wrote:
>
> Date sent:Wed, 4 Jan 2006 15:46:37 -0500
> From: "Mark Leeds" <[EMAIL PROTECTED]>
> To:   "R-Stat Help" 
> Subject:  [R] R newbie configuration
>
>> I think I did enough reading on my own about startup (part of the
>> morning and most of this afternoon) to not feel uncomfortable asking
>> for confirmation of my understanding of this startup stuff.
>>
>> Obviously, the startup process is more complicated than below, but
>> for my R newbie purposes it seems like I can think of it as follows:
>>
>> Suppose my  home directory = "c:documents and settings/mleeds" =
>> $HOME.
>>
>> Put things in $HOME/.Rprofile that are more generic on startup and
>> not specific to any particular R project.
>>
>> Put various .First() functions in the working directories of the
>> particular projects that they are associated with, so that they are
>> loaded when their .RData directory gets loaded.
>>
>> If the above is correct (emphasis on correct for a newbie; I know
>> there is a lot more going on and things can be done more elegantly,
>> etc.), could someone send me an example of a .Rprofile file? I didn't
>> use these in S+ and I am wondering what you put in them.
>>
>>Thanks
>>
>>
>>
>>
>>
>>
>
> Petr Pikal
> [EMAIL PROTECTED]
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595





Re: [R] ylim problem in barplot

2006-01-05 Thread Martin Maechler

> "PaulB" == Bliese, Paul D LTC USAMH <[EMAIL PROTECTED]>
> on Thu, 5 Jan 2006 14:01:17 +0100 writes:

PaulB> R Version 2.2.0
PaulB> Platform:  Windows

 
PaulB> When I use barplot but select a ylim value greater
PaulB> than zero, the graph is distorted.  The bars extend
PaulB> below the bottom of the graph.

Well, my question would be whether that is not a feature :-)
Many people would consider barplots that do not start at 0 to be
 "Cheating with Graphics"  (in the vein of "Lying with Statistics").
 
PaulB> For instance the command produces a problematic graph.

PaulB> barplot(c(200,300,250,350),ylim=c(150,400))

The advantage of the current graphic drawn is that everyone *sees*
that the bars were cut off {and that one should really think
twice before producing such cheating graphics.. :-)}

 plot(c(200,300,250,350), ylim=c(150,400), type = "h", 
  lwd=20, xaxt="n", col="gray")

produces something closer to what you like.
[yes, you can get rid of the roundedness of the thick-line ends;
 --> ?par and look for 'lend';
 --> op <- par(lend = 1) ; plot(.) ; par(op)
 In R-devel (i.e. from R 2.3.0 on) you can even say
  plot(c(200,300,250,350), ylim=c(150,400), type = "h", 
   lwd=20, xaxt="n", col="gray", lend = 1)
]

But after all, I tend to agree that R should behave a bit differently
here, e.g., by first giving a warning about the inappropriate ylim but
then still obeying the ylim specification more gracefully.

Regards,
Martin Maechler



Re: [R] ylim problem in barplot

2006-01-05 Thread Ben Bolker
Ben Bolker  ufl.edu> writes:

> 
> Bliese, Paul D LTC USAMH  us.army.mil> writes:
> 
> > 
> > R Version 2.2.0
> > 
> > Platform:  Windows
> > 
> > When I use barplot but select a ylim value greater than zero, the graph
> > is distorted.  The bars extend below the bottom of the graph.
> >
> 
>   The problem is that barplot() is really designed to work
> with zero-based data.  I don't know if the Powers That Be
> will say that "fixing" this would violate the spirit of
> barplot (although I see there is some code in barplot 
> that deals with figuring out the base of the rectangle
> in the logarithmic case, where 0 obviously doesn't work)
> 

  hmm, replying to myself ...
  Now that I think about it, I don't know if the default behavior should
necessarily be to set the baseline at ylim[1] or xlim[1] ...  (i.e., you
can imagine setting ylim negative to allow more space
below the bars ... you could allow a "baseline" argument,
but this would then be ripe for abuse ...  perhaps this
discussion should move to r-devel, if anyone cares  ... )



Re: [R] problem with using lines command on windows XP machine

2006-01-05 Thread Edwin Commandeur
Hi Petr and Eric,

Thanks for your comments.

To plot a vertical line, using "lines(0)" does not work, but

lines(c(-1,0),c(0,1))

does the job in my simple test example. I had just interpreted the ?lines
documentation wrongly.

So the "lines" command does work on my pc. Of course "abline(v=0)" will
also do the job in this specific example...

Sorry for the trouble,

Edwin

-Original Message-
From: Petr Pikal [mailto:[EMAIL PROTECTED]
Sent: donderdag 5 januari 2006 13:32
To: Edwin Commandeur; r-help@stat.math.ethz.ch
Subject: Re: [R] problem with using lines command on windows XP machine


Hi

On 5 Jan 2006 at 11:27, Edwin Commandeur wrote:

From:   "Edwin Commandeur" <[EMAIL PROTECTED]>
To: 
Date sent:  Thu, 5 Jan 2006 11:27:01 +0100
Subject:[R] problem with using lines command on windows XP 
machine

> Hello,
>
> I'm using R version 2.2.0 installed on a windows XP machine with SP2
> (maybe it's also interesting to note it's a laptop, so it outputs to a
> laptop screen) and I wanted to draw a line in a graph, but it does
> not seem to work.
>
> To test it I use the following code:
>
> x = c(-1,0,1)
> y = c(-1,0,1)
> plot(x,y, type="l", xlim=c(-1,1), ylim=c(-1,1))
> lines(0)

quite close

abline(v=0)

draws a vertical line at x=0

from help page
Arguments:

x, y: coordinate vectors of points to join.

Maybe mentioning abline in the See Also of the lines help page would be good

HTH
Petr



>
> If I understand the documentation right this should draw a line (with
> default settings, I'm not setting any parameters) at x=0.
>
> I tried goofing around a bit setting linewidth and color differently,
> I tried using xy.coords etc, but no line appeared in the graph.
>
> The commands abline and segments work perfectly fine (so I am now
> using segments to plot the line I want), but I still think the lines
> command should work.
>
> Does anybody have similar problems drawing lines on XP machines (or
> laptops in general)? Or am I doing something abominably wrong?
>
> Greetings and thanks in advance for any replies,
> Edwin Commandeur
>

Petr Pikal
[EMAIL PROTECTED]



Re: [R] ylim problem in barplot

2006-01-05 Thread Ben Bolker
Bliese, Paul D LTC USAMH  us.army.mil> writes:

> 
> R Version 2.2.0
> 
> Platform:  Windows
> 
> When I use barplot but select a ylim value greater than zero, the graph
> is distorted.  The bars extend below the bottom of the graph.
>

  The problem is that barplot() is really designed to work
with zero-based data.  I don't know if the Powers That Be
will say that "fixing" this would violate the spirit of
barplot (although I see there is some code in barplot 
that deals with figuring out the base of the rectangle
in the logarithmic case, where 0 obviously doesn't work)

 Here's a workaround:

barplot(c(200,300,250,350)-150,axes=FALSE)
axis(side=2,at=seq(0,200,by=50),labels=seq(150,350,by=50))

 And here's a diff if you want to hack barplot yourself:

sink("newbarplot.R")
barplot.default
sink()
## go edit newbarplot.R; add barplot.default <- to
## the first line, remove the namespace information
## from the last line, and substitute the lines
## in the first chunk below with exclamation points for 
## the lines in the second chunk below with exclamation
## points
source("newbarplot.R")

  cheers
Ben

*** newbarplot2.R   2006-01-05 08:52:11.0 -0500
--- /usr/local/src/R/R-2.2.1/src/library/graphics/R/barplot.R   2005-10-06
06:22:59.0 -0400
***
*** 85,97 
if  (logy && !horiz && !is.null(ylim))  ylim[1]
else if (logx && horiz  && !is.null(xlim))  xlim[1]
else 0.9 * min(height)
! } else {
!   rectbase <- if (!horiz && !is.null(ylim))
! ylim[1]
!   else if (horiz && !is.null(xlim))
! xlim[1]
!   else 0
! }
  ## if stacked bar, set up base/cumsum levels, adjusting for log scale
  if (!beside)
height <- rbind(rectbase, apply(height, 2, cumsum))
--- 85,92 
if  (logy && !horiz && !is.null(ylim))  ylim[1]
else if (logx && horiz  && !is.null(xlim))  xlim[1]
else 0.9 * min(height)
! } else rectbase <- 0
!
  ## if stacked bar, set up base/cumsum levels, adjusting for log scale
  if (!beside)
height <- rbind(rectbase, apply(height, 2, cumsum))



Re: [R] ylim problem in barplot

2006-01-05 Thread Marc Schwartz
On Thu, 2006-01-05 at 14:01 +0100, Bliese, Paul D LTC USAMH wrote:
> R Version 2.2.0
> 
> Platform:  Windows

> When I use barplot but select a ylim value greater than zero, the graph
> is distorted.  The bars extend below the bottom of the graph.

> For instance the command produces a problematic graph.

> barplot(c(200,300,250,350),ylim=c(150,400))

> Any help would be appreciated.

> Paul


Use:

  barplot(c(200, 300, 250, 350), ylim = c(150, 400), xpd = FALSE)

The 'xpd = FALSE' will enable clipping of the graphic at the boundary of
the plot region.

See ?par for more information on 'xpd'.

HTH,

Marc Schwartz



Re: [R] Splitting the list

2006-01-05 Thread Fernando Henrique Ferraz P. da Rosa
Kjetil Halvorsen writes:
> 
> Another possibi8lity, of course, is language-based lists. Any interest for
> r-spanish@ ...?
> 
> Kjetil

Since you've mentioned the topic: does anyone reading this thread
know of currently active language-based R lists? I am a member of R_STAT,
an R list for Portuguese speakers [1]. It would be nice to collect
links for such lists and have them on the R-project website. I tried
e-mailing r-devel about this last July, but got no reply [2].


References:
[1] http://br.groups.yahoo.com/group/R_STAT/
[2] http://tolstoy.newcastle.edu.au/~rking/R/devel/05/07/1623.html



--
"Though this be randomness, yet there is structure in't."
   Rosa, F.H.F.P

Instituto de Matemática e Estatística
Universidade de São Paulo
Fernando Henrique Ferraz P. da Rosa
http://www.feferraz.net



Re: [R] A comment about R

2006-01-05 Thread Liaw, Andy
From: Patrick Burns
> 
> John Maindonald wrote:
> 
> > ...
> >
> >(4) When should students start learning R?
> >
> >[Students should get their first exposure to a high-level 
> programming  
> >language, in the style of R then Python or Octave, at age 11-14.   
> >There are now good alternatives to the former use of Fortran or  
> >Pascal, languages which have for good reason dropped out of favour  
> >for learning experience. They should start on R while their 
> minds are  
> >still malleable, and long before they need it for serious 
> research use.]
> >  
> >
> 
> I think 11-14 years old might better be halved.  Kids are
> playing very complicated video games barely after they
> learn to walk.

My kids (7- and 5-year old) barely get an hour on video games a week, and I
can see that they lag behind their peers at the games (though I don't feel
sorry for that).  I hope I won't be accused of `endangering welfare of
children'...
 
> R is a quite reasonable programming language for children.
> You don't need to worry about low-level issues, and it is
> easy to produce graphics with it.

Any suggestion on how to go about getting kids that young on (R)
programming?

Cheers,
Andy

 
> Patrick Burns
> [EMAIL PROTECTED]
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
> 
> 
>



Re: [R] Wikis for R

2006-01-05 Thread Fernando Henrique Ferraz P. da Rosa
Martin Maechler writes:
>  If you go to the bottom of that wikipedia page,
>  you see that there is an "R Wiki" -- and has been for several
>  years now (!) at a Hamburg (De) university.
>  http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl?RwikiHome
> 
> (...) 
> So, are you sure that another R Wiki is desirable, rather than
> have people who "believe in Wiki's for R" use the existing
> one(s)?   I believe the main challenge will (similar as for
> an "R-beginners" mailing list) to have well-qualified "editors"
> to be willing to review and amend what others have written.

I've tried to collaborate on the R Wiki hosted by the Hamburg
university, but the Wiki would get regularly vandalized by some spam
bot, and then I'd have to keep manually reverting it. Also, the wiki
engine it uses is very rudimentary. I think the DokuWiki engine, which
is used by Philippe Grosjean, is more promising as a workhorse for an
'official' R wiki.

I think that the title could perhaps be changed to Rwiki and the
contents currently hosted on the Hamburg wiki transferred to the new
location, if the current maintainers of the Hamburg Wiki and Philippe
Grosjean agree (I'm cc-ing this message to them).

This could then emerge as an official or semi-official R wiki, to be
linked to from the R-project home page.


--
"Though this be randomness, yet there is structure in't."
   Rosa, F.H.F.P

Instituto de Matemática e Estatística
Universidade de São Paulo
Fernando Henrique Ferraz P. da Rosa
http://www.feferraz.net



Re: [R] Understanding and translating lme() into lmer() model

2006-01-05 Thread Doran, Harold
Peter:

Almost correct. You need to add the variance component for the highest
level of nesting, so your model would be

lmer.m1.1 = lmer(Y~A+B+C+(1|D:E) + (1|D), data=data, method="ML")

But, yes, the : is used to note implicit nesting in lmer, similar to the
syntax used for / in lme. The syntax differs a bit because lme was
designed for models with nested random effects, whereas lmer can also
handle models with more complex structures, such as crossed random
effects. It doesn't make sense to use strict nesting structures when
units are migrating, so that is part of the reason for the evolution of
lmer().

If you use RSiteSearch('lmer syntax') you will find a few threads on the
topic that might be helpful.

Harold



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Petar Milin
Sent: Thursday, January 05, 2006 7:48 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Understanding and translating lme() into lmer() model

I am a newbie in R, trying to understand and compare the syntax in nlme
and lme4. The lme() model from the nlme package I am interested in is:
lme.m1.1 = lme(Y~A+B+C,random=~1|D/E,data=data,method="ML")
(For simplicity, I am giving the factors generic names.) If I understand
correctly, there are three fixed factors: A, B and C, and two random
factors: D and E. In addition, E is nested in D, isn't it? Of course,
the method is Maximum Likelihood.
If I would like to translate the above model to one suitable for lmer(),
it should look like this:
lmer.m1.1 = lmer(Y~A+B+C+(1|D:E),data=data,method="ML")
Am I right? Is '/' in nlme same as ':' in lme4?

Sincerely,
Peter M.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



Re: [R] A comment about R:

2006-01-05 Thread Fernando Henrique Ferraz P. da Rosa
Peter Dalgaard writes:
> Patrick Burns <[EMAIL PROTECTED]> writes:
> 
> whereas you could quite conceivably do it in R. (What *is* the
> equivalent of rnorm(25) in those languages, actually?)
> 
In SAS, it would go along the lines of:

data randvec(drop=seed);
 seed = 459437845;
 do obs = 1 to 25;
   x = rannor(seed);
   output;
   end;
 run;
 
--
"Though this be randomness, yet there is structure in't."
   Rosa, F.H.F.P

Instituto de Matemática e Estatística
Universidade de São Paulo
Fernando Henrique Ferraz P. da Rosa
http://www.feferraz.net



[R] ylim problem in barplot

2006-01-05 Thread Bliese, Paul D LTC USAMH
R Version 2.2.0

Platform:  Windows

 

When I use barplot but select a ylim value greater than zero, the graph
is distorted.  The bars extend below the bottom of the graph.

 

For instance the command produces a problematic graph.

 

barplot(c(200,300,250,350),ylim=c(150,400))

 

Any help would be appreciated.

 

Paul

 

 





[R] Understanding and translating lme() into lmer() model

2006-01-05 Thread Petar Milin
I am a newbie in R, trying to understand and compare the syntax in nlme
and lme4. The lme() model from the nlme package I am interested in is:
lme.m1.1 = lme(Y~A+B+C,random=~1|D/E,data=data,method="ML")
(For simplicity, I am giving the factors generic names.) If I understand
correctly, there are three fixed factors: A, B and C, and two random
factors: D and E. In addition, E is nested in D, isn't it? Of course,
the method is Maximum Likelihood.
If I would like to translate the above model to one suitable for lmer(),
it should look like this:
lmer.m1.1 = lmer(Y~A+B+C+(1|D:E),data=data,method="ML")
Am I right? Is '/' in nlme same as ':' in lme4?

Sincerely,
Peter M.



Re: [R] problem with using lines command on windows XP machine

2006-01-05 Thread Petr Pikal
Hi

On 5 Jan 2006 at 11:27, Edwin Commandeur wrote:

From:   "Edwin Commandeur" <[EMAIL PROTECTED]>
To: 
Date sent:  Thu, 5 Jan 2006 11:27:01 +0100
Subject:[R] problem with using lines command on windows XP 
machine

> Hello,
> 
> I'm using R version 2.2.0 installed on a windows XP machine with SP2
> (maybe it's also interesting to note it's a laptop, so it outputs to a
> laptop screen) and I wanted to draw a line in a graph, but it does
> not seem to work.
> 
> To test it I use the following code:
> 
> x = c(-1,0,1)
> y = c(-1,0,1)
> plot(x,y, type="l", xlim=c(-1,1), ylim=c(-1,1))
> lines(0)

quite close

abline(v=0)

draws a vertical line at x=0

from help page
Arguments:

x, y: coordinate vectors of points to join.

Maybe mentioning abline in the See Also of the lines help page would be good

HTH
Petr



> 
> If I understand the documentation right this should draw a line (with
> default settings, I'm not setting any parameters) at x=0.
> 
> I tried goofing around a bit setting linewidth and color differently,
> I tried using xy.coords etc, but no line appeared in the graph.
> 
> The commands abline and segments work perfectly fine (so I am now
> using segments to plot the line I want), but I still think the lines
> command should work.
> 
> Does anybody have similar problems drawing lines on XP machines (or
> laptops in general)? Or am I doing something abominably wrong?
> 
> Greetings and thanks in advance for any replies,
> Edwin Commandeur
> 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] A comment about R:

2006-01-05 Thread Ronnie Babigumira
As someone who has been using Stata for a while now (and I started
without a programming background), I recently had to move to R because
of its rich spatial packages. Here is my 0.001 cent on this thread.

-WHAT I LOVE ABOUT STATA--
a) Total control
In Stata I feel like I have TOTAL CONTROL. I put my data in a directory;
I can look at it, generate new variables (columns), reshape, collapse,
and expand my data; and all the while I use the list command [list (my
variables) in 1/10] over and over again to make sure I am doing what I
want. list is probably my favorite Stata command.

b) Structure
As far as I am concerned Stata, has three main types of files

1. The data file (*.dta), which is my "spreadsheet" in which I have my
variables (columns, vectors, or whatever you want to call them)

2. The do file (*.do) which is my set of commands for a particular analysis

3. The log file (*.log), which is the text or SMCL output from my do file

Just by looking at the extensions in any given directory, I know what is what
and I am able to organise my project (in fact I put the three types in
different sub-directories but work in one main project directory). Some have
said R allows you to think through your analysis; well, I can swear that Stata
has brought the same discipline to me. Key questions I always ask myself:

- What peculiarities are there about my data? (Do I have unique observations,
i.e. one record per household, or multiple records, and what does this mean
for my analysis? Do I need to collapse it, or reshape it?)
- What do I want to do? (Write down a few lines of what I want to do and the
expected output.)

c) Ease of use
I find most of Stata's commands intuitively named and easy to use (with a
choice of the GUI, the command prompt, or the do-file editor).


-MY FIRST 30 DAYS WITH R--
Moving to R was a totally different experience, in part because of the whole
concept of objects (and I still don't get them :-) ). My first assignment was
to find the R equivalents of the three files as well as of my main Stata
commands (and frankly, the only one that is clear so far is the script, which
is R's equivalent of the Stata do file).
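
[Editor's aside: a hedged guess at how the three Stata file types map onto base R; the file names here are hypothetical, and tempfile() is used so the sketch writes nowhere permanent.]

```r
df <- data.frame(x = 1:5)                             # stand-in data

# ~ .dta: save()/load() store R objects in a binary data file
save(df, file = f_data <- tempfile(fileext = ".RData"))

# ~ .do: commands live in a plain .R script, run with source("analysis.R")

# ~ .log: sink() diverts console output to a text file
sink(f_log <- tempfile(fileext = ".log"))
print(summary(df))
sink()
```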

A few have asked about the relevance of reproducing Stata (or SAS, for that
matter) commands in R. Well, someone correctly pointed out that the challenge
is in the mindset. Stata users have a Stata mindset, so by being able to
reproduce some basic work done in Stata in R, you are many steps closer to
understanding the workings of R.

So yes, I did whine in the first few weeks about how hard R is. Some have
attributed the whining about R to laziness; I disagree, the learning curve is
simply steep. I therefore salute Roger Bivand's effort to reproduce the
example on the Stata website, and I second efforts by others to do this for
other programs.

Now don't get me wrong, I am not ungrateful for the tons of material made
freely available by the R community (top of this list being R itself);
however, most of this material is terse, and most of the time I have had to
go over it a few times (and may still not get it).

But even more, I have yet to find material dedicated to basic data management
(bits of data management are indeed dropped here and there in the manuals and
online material); a dedicated book (which I would gladly buy) is lacking.


-R-Help List--
In this same thread there has been discussion of splitting the R-Help list. I
have reservations about this (we had the same discussion on the Stata list
and the consensus was to maintain the status quo). Geographically splitting
the list simply reinforces the inequalities born out of the original
development of R: some countries or regions are bound to have more exciting
lists thanks to the initial distribution of resource persons. Sending the
beginners to their own list is nothing short of crippling them (let the
one-eyed lead the blind... hmmm... bad idea). Not only will it cripple your
thinking, but it can instill bad programming practices that may be hard to
drop. I look back at the Stata code I wrote 6 years ago and I am ashamed by
how much real estate I wasted writing line after line that could have been
cut down to less than a tenth. How did I learn? Well, I passively and
faithfully read each email that was posted and saved elegant bits of code in
my scrapbook.

Finally, I have been on the Statalist for close to six years and we do get
our fair share of "homework type" questions, and people get told off (though
not with the frequency and "harshness" of this list). In fact someone once
whined about a rude reply he got to his posting, and someone wrote to inform
him that there were much harsher lists, adding that the R-Help list is not
for the faint-hearted (two reasons, one being that the typical posting may
sound like rocket science to most, the other being that there i

[R] jointprior in deal package

2006-01-05 Thread Tim Smits
Dear all,

I recently started using the deal package for learning Bayesian 
networks. When using the jointprior function on a particular dataset, I 
get the following message:
 >tor.prior<-jointprior(tor.nw)
Error in array(1, Dim) : 'dim' specifies too large an array

What is the problem? How can I resolve it?
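
[Editor's hedged note: jointprior() appears to allocate a joint table with one dimension per discrete node, so its size is the product of the level counts; with many nodes or levels that product quickly exceeds what array() can allocate. A sketch of the arithmetic, with a made-up network size:]

```r
# Hypothetical network: 25 discrete nodes with 3 levels each.
# The joint table seems to need one cell per combination of levels,
# i.e. prod(levels) cells in total.
levels.per.node <- rep(3, 25)
cells <- prod(levels.per.node)   # 3^25, about 8.5e11 cells
cells > .Machine$integer.max     # TRUE: far too large for array()
```

If that is the cause, the usual workaround is to reduce the number of nodes or levels passed to jointprior().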

Thanks,
Tim

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] A comment about R:

2006-01-05 Thread Naji
Hi all,

Roger thanks for the reproduction.
As a user of Stata & R, I use Stata for common analyses; often I have to
adapt some computations or do some complex hierarchical modeling, and then I
switch to R.
For me, switching from Stata (or another statistical software package, SO) to
R (or another statistical language) requires a double effort:
- Programming (laziness?): writing and testing the code; treating the data as
an N-dimensional array or a data frame in order to optimize performance
- Statistical testing: I test the model on a simulated data set and validate
that the statistical procedure gives me back the adequate parameter
estimates. This is an additional step one doesn't need when using an
established SO.

For me, using Stata (or any other SO) has the advantage of relying on
high-quality code written & tested by an organization & its clients.
Getting back to Roger's replication: I find such replications very useful.
They test whether the R code gives back adequate results, so they are a very
good starting point before adapting the R code to one's needs.

Stata advantage: one can download additional ado files ("package"-like) and,
with the permission of the author, adapt them or translate them into R code.
Not only are R & Stata good products, they also share a valuable asset: the
user community.


Happy new year
Le 5/01/06 10:46, « Robert Chung » <[EMAIL PROTECTED]> a écrit :

> Roger Bivand wrote:
>> Gabor Grothendieck wrote:
>> 
>>> For example, consider this introductory session in Stata:
>>> http://www.stata.com/capabilities/session.html
>>> 
>> Could I ask for comments on:
>> source(url("http://spatial.nhh.no/R/etc/capabilities.R"), echo=TRUE)
>> as a reproduction of the Stata capabilities session?
> 
> Roger, I think your reproduction of the Stata session is excellent.
> 
> However, in a deeper sense, perhaps it's *too* faithful a replication. I
> don't normally do analyses exactly the same way in R and in Stata, so
> although it's possible to contort R into producing Stata-like output, why
> would anyone want to? For example, in the sample Stata session, they run a
> t-test before plotting any data. In R, I'd tend to plot early and test
> hypotheses after. Rather that print out the top and bottom 5 mileage cars,
> I might plot(weight,mpg,col=as.integer(foreign)) and identify() the
> bivariate oddities. Rather than start into linear models, I might do some
> lowess() lines. I'd probably do a splom() pretty early. Depending on what
> I was doing, maybe I'd do something like
> stars(auto[,-c(1,12)],labels=make).
> 
> Stata and R are both fine products, but I sometimes wonder how the tools
> one chooses affect the analyses one does.
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>




Re: [R] A comment about R

2006-01-05 Thread Patrick Burns
John Maindonald wrote:

> ...
>
>(4) When should students start learning R?
>
>[Students should get their first exposure to a high-level programming  
>language, in the style of R then Python or Octave, at age 11-14.   
>There are now good alternatives to the former use of Fortran or  
>Pascal, languages which have for good reason dropped out of favour  
>for learning experience. They should start on R while their minds are  
>still malleable, and long before they need it for serious research use.]
>  
>

I think 11-14 years old might better be halved.  Kids are
playing very complicated video games barely after they
learn to walk.

R is a quite reasonable programming language for children.
You don't need to worry about low-level issues, and it is
easy to produce graphics with it.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")



Re: [R] Wikis for R

2006-01-05 Thread Martin Maechler
> "PhGr" == Philippe Grosjean <[EMAIL PROTECTED]>
> on Wed, 04 Jan 2006 20:35:17 +0100 writes:

PhGr> David Forrest wrote:
>> [...]
>> Any volunteers?

PhGr> Yes, me (well, partly...)! Here is what I propose: this is a very
PhGr> lengthy thread in R-Help, with many interesting ideas and suggestions. I
PhGr> fear that, as happens too often, those nice ideas will be lost
PhGr> because of the medium used: email! By nature, emails are read and then
PhGr> deleted (well, there is the R-Help archive, but threads in a
PhGr> mailing list are not at all the best tool for making collaborative
PhGr> documents like those tutorials and co).

PhGr> I just cooked up a little wiki *dedicated to R beginners* (meaning they
PhGr> can contribute too, and are very welcome to discuss their problems,
PhGr> possibly trivial for others). It is available at
PhGr> http://www.sciviews.org/_rgui/wiki.

If you google for "R Wiki" you get (on the first page of hits)
- the Japanese R wiki "RjpWiki" [which has been in existence for
  quite a while, but that's all I know about it].

- the  Wikipedia entry for R
http://en.wikipedia.org/wiki/R_programming_language

(which is quite good, but probably could benefit from more volunteer input)

 If you go to the bottom of that wikipedia page,
 you see that there is an "R Wiki" -- and has been for several
 years now (!) at a Hamburg (De) university.
 http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl?RwikiHome

- Simon Urbanek's "R Wiki" mainly (but not exclusively) aimed at
  R for Mac OSX.

So, are you sure that another R wiki is desirable, rather than
having people who "believe in wikis for R" use the existing
one(s)?   I believe the main challenge will be (as for
an "R-beginners" mailing list) to find well-qualified "editors"
willing to review and amend what others have written.

I think it's an experiment that should be tried; but it has already been
started a while ago, and instead of restarting it, one should try to agree
on some cooperation with the existing wiki approaches.

Hopefully some agreement on this is reached quickly, and we
could also add a link to the R wiki {or maybe several ones?}
from www.r-project.org.

Martin Maechler, ETH Zurich



[R] more on the daisy function

2006-01-05 Thread Adrian DUSA

Dear R-helpers,

First of all, a happy new year to everyone!

I successfully used the daisy function (from package cluster) to find which two 
rows from a data frame differ by only one value, and I now want to come up with 
a simpler way to find _which_ value makes the difference between any such 
pair of rows.
Consider a very small example (the actual data contains thousands of rows):

   input <- as.data.frame(matrix(letters[c(1,2,1,2,2,3,2,1,1,2,2,2)], ncol=3))

   > input
     V1 V2 V3
   1  a  b  a
   2  b  c  b
   3  a  b  b
   4  b  a  b

I am interested by the rows which differ by one value only; I easily do that 
with:

   library(cluster)
   distance <- daisy(as.data.frame(input))*ncol(input)

   > distance
   Dissimilarities :
 1 2 3
   2 3
   3 1 2
   4 3 1 2

   Metric :  mixed ;  Types = N, N, N
   Number of objects : 4


The first and the third rows differ only with respect to variable V3, and the 
second and the fourth rows differ only with respect to variable V2.


Now I want to replace the different values by an "x"; currently my code is:

   distance <- as.matrix(distance)
   distance[!upper.tri(distance)] <- NA
   to.be.compared <- as.matrix(which(distance == 1, arr.ind=T))
   logical.result <- t(apply(to.be.compared, 1,
                   function(idx) {input[idx[1], ] == input[idx[2], ]}))
   result <- t(sapply(1:nrow(to.be.compared), 
                       function(idx) {input[to.be.compared[idx, 1], ]})) 
   result[!logical.result] <- "x"

   > as.data.frame(result)
 V1 V2 V3
   1  a  b  x
   2  b  x  b
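
[Editor's aside: the differing column can also be found directly in base R, without daisy(); a sketch, where one_diff_pairs is a made-up helper name.]

```r
# For each pair of rows that differs in exactly one column, report
# the pair and the index of the differing column.
one_diff_pairs <- function(df) {
  idx <- t(combn(nrow(df), 2))   # all row pairs
  out <- NULL
  for (k in seq_len(nrow(idx))) {
    i <- idx[k, 1]; j <- idx[k, 2]
    d <- which(as.character(unlist(df[i, ])) != as.character(unlist(df[j, ])))
    if (length(d) == 1)
      out <- rbind(out, c(row.a = i, row.b = j, col = d))
  }
  out
}

input <- as.data.frame(matrix(letters[c(1,2,1,2,2,3,2,1,1,2,2,2)], ncol = 3))
one_diff_pairs(input)   # rows 1 & 3 differ in column 3; rows 2 & 4 in column 2
```

This loops over all pairs, so for thousands of rows the daisy() pre-filter is still the faster first step.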

I wonder if the daisy function could be persuaded to output an object similar 
to the dissimilarities one; it would be fantastic to also get something like:

   First.difference.found:
 1 2 3
   2 1
   3 3 1
   4 1 2 1

Here, 3 means the third variable (V3), on which the first and third rows differ. 
I could try to do that myself, but I don't know where to find the Fortran 
code daisy uses.

Thanks for any hint,
Adrian

-- 
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
  +40 21 3120210 / int.101


[R] Fwd: Re: Splitting the list

2006-01-05 Thread ahimsa campos arceiz


>Another possibility, of course, is language-based lists. Any interest in
>r-spanish@ ...?
>
>Kjetil

I am ready to contribute by translating original English texts into Spanish, but 
not to produce original ones (I'm too new to these matters).

Ahimsa


Ahimsa Campos Arceiz
The University Museum,
The University of Tokyo
Hongo 7-3-1, Bunkyo-ku,
Tokyo 113-0033
phone +81-(0)3-5841-2824
cell +81-(0)80-5402-7702



Re: [R] A comment about R:

2006-01-05 Thread ronggui
R is weak when handling large data files.
I have a data file with 807 variables and 118,519 observations, in CSV format.
Stata can read it in 2 minutes, but on my PC R almost cannot handle it.
My PC: 1.7 GHz CPU, 512 MB RAM.
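
[Editor's note: much of the slowness can often be avoided by giving read.csv() hints, so it need not guess column types while scanning. A hedged sketch, on a small stand-in file rather than the real 807 x 118519 data set:]

```r
# Write a small stand-in CSV (the real file is far larger)
f <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:1000, b = rnorm(1000)), f, row.names = FALSE)

# Pre-declaring column classes and the row count speeds up large reads
dat <- read.csv(f,
                nrows = 1000,                          # known row count
                comment.char = "",                     # no comment scanning
                colClasses = c("integer", "numeric"))  # one class per column
```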


--

Department of Sociology
Fudan University



Re: [R] problem in install "kidpack" package

2006-01-05 Thread Uwe Ligges
Yufen wrote:

> Dear Sir,
>  I use the followoing command to install the library("kidpack"). BTW I 
> install Biobase already.
> 
>> install.packages("kidpack",type="source")
> 
> However, there is an error message occurred as follows.
> > library("kidpack")
>  Error in library("kidpack") : 'kidpack' is not a valid package -- 
> installed <
> 
> I have problem in install "kidpack" package, could you please give me 
> some help.
> Thank you!


Please check out 
http://tolstoy.newcastle.edu.au/~rking/R/help/05/12/16693.html

If you think that does not apply to you:
a) Where did you find a recent version of kidpack?
b) Which version of kidpack?
c) Which OS?
d) Which version of R?


Uwe Ligges




> Best,
> Yufen Huang
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Re: [R] Splitting the list

2006-01-05 Thread Kjetil Halvorsen
On 1/5/06, John Maindonald <[EMAIL PROTECTED]> wrote:
>
> I've changed the heading because this really is another thread.  I
> think it inevitable that there will, in the course of time, be other
> lists that are devoted, in some shape or form, to the concerns of
> practitioners (at all levels) who are using R.  One development I'd
> not like to see is fracture along application area lines, allowing
> those who are comfortable in coteries whose focus was somewhat
> relevant to standards of use of statistics in that area 15 or 20
> years ago to continue that way.  One of the great things about R, in
> its development to date, has been its role in exposing people from a
> variety of application area communities to statistical traditions
> different from that in which they have been nurtured. I expect it to
> have a continuing role in raising statistical analysis standards, in
> "raising the bar".
>
> Another possibility is fracture along geographic boundaries.  This
> has both benefits (one being that its is easier within a smaller
> circle of people who are more likely to know each other for
> contributors to establish a rapport that will make the list really
> effective; also there will be notices and discussion that are of
> local interest) and drawbacks (it risks separating subscribers off
> from important discussions on the official R lists.)  On balance,
> this may be the better way to go. Indeed subscribers to ANZSTAT
> (Australian and NZ statistical list) will know that an R-downunder
> list, hosted at Auckland, is currently in test-drive mode. There
> should be enough subscribers in common between this and the official
> R lists that the south-eastern portion of Gondwana does not, at any
> time in the very near future, float off totally on its own.
>
> There are of course other possibilities, and it may be useful to
> canvass them.


Another possibility, of course, is language-based lists. Any interest in
r-spanish@ ...?

Kjetil

John Maindonald email: [EMAIL PROTECTED]
> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
> Mathematical Sciences Institute, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
>
>
>
> On 4 Jan 2006, at 10:00 PM, [EMAIL PROTECTED] wrote:
>
> > From: Ben Fairbank <[EMAIL PROTECTED]>
> > Date: 4 January 2006 4:42:31 AM
> > To: R-help@stat.math.ethz.ch
> > Subject: Re: [R] A comment about R:
> >
> >
> > One implicit point in Kjetil's message is the difficulty of learning
> > enough of R to make its use a natural and desired "first choice
> > alternative," which I see as the point at which real progress and
> > learning commence with any new language.  I agree that the long
> > learning
> > curve is a serious problem, and in the past I have discussed, off
> > list,
> > with one of the very senior contributors to this list the
> > possibility of
> > splitting the list into sections for newcomers and for advanced users.
> > He gave some very cogent reasons for not splitting, such as the
> > possibility of newcomers' getting bad advice from others only slightly
> > more advanced than themselves.  And yet I suspect that a newcomers'
> > section would encourage the kind of mutually helpful collegiality
> > among
> > newcomers that now characterizes the exchanges of the more experienced
> > users on this list.  I know that I have occasionally been reluctant to
> > post issues that seem too elementary or trivial to vex the others
> > on the
> > list with and so have stumbled around for an hour or so seeking the
> > solution to a simple problem.  Had I the counsel of others similarly
> > situated progress might have been far faster.  Have other newcomers or
> > occasional users had the same experience?
> >
> > Is it time to reconsider splitting this list into two sections?
> > Certainly the volume of traffic could justify it.
> >
> > Ben Fairbank
>
>
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>




Re: [R] problem with using lines command on windows XP machine

2006-01-05 Thread Dimitris Rizopoulos
I think  you need "abline(v = 0)".
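
[Editor's aside on why: lines() joins a sequence of points, so lines(0) supplies only a single point and draws no visible segment, while abline(v = 0) draws the full vertical line. A self-contained sketch, plotting to a temporary PDF:]

```r
pdf(tmp <- tempfile(fileext = ".pdf"))   # off-screen device for the sketch
plot(c(-1, 0, 1), c(-1, 0, 1), type = "l")
lines(0)                            # a single point: no segment is drawn
abline(v = 0)                       # full-height vertical line at x = 0
lines(c(0, 0), c(-1, 1))            # the same line via lines(): two points
dev.off()
```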

I hope it helps.

Best,
Dimitris



Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: "Edwin Commandeur" <[EMAIL PROTECTED]>
To: 
Sent: Thursday, January 05, 2006 11:27 AM
Subject: [R] problem with using lines command on windows XP machine


> Hello,
>
> I'm using R version 2.2.0 installed on windows XP machine, with SP2 
> (maybe
> it's also interesting to note it's laptop, so it outputs to a laptop 
> screen)
> a l and I wanted to draw a line in a graph, but it does not seem to 
> work.
>
> To test it I use the following code:
>
> x = c(-1,0,1)
> y = c(-1,0,1)
> plot(x,y, type="l", xlim=c(-1,1), ylim=c(-1,1))
> lines(0)
>
> If I understand the documentation right this should draw a line 
> (with
> default settings, I'm not setting any parameters) at x=0.
>
> I tried goofing around a bit setting linewidth and color 
> differently, I
> tried using xy.coords etc, but no line appeared in the graph.
>
> The commands abline and segments work perfectly fine (so I am now 
> using
> segments to plot the line I want), but I still think the lines 
> command
> should work.
>
> Does anybody has similar problems drawing lines on XP machines (or 
> laptops
> in general?)? Or I am doing something abominably wrong?
>
> Greetings and thanks in advance for any replies,
> Edwin Commandeur
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



[R] problem in install "kidpack" package

2006-01-05 Thread Yufen
Dear Sir,
 I used the following command to install the "kidpack" package. BTW, I have 
already installed Biobase.
>  install.packages("kidpack",type="source")
However, an error message occurred, as follows.
> library("kidpack")
 Error in library("kidpack") : 'kidpack' is not a valid package -- 
installed <

I have a problem installing the "kidpack" package; could you please give me 
some help.
Thank you!

Best,
Yufen Huang



[R] problem with using lines command on windows XP machine

2006-01-05 Thread Edwin Commandeur
Hello,

I'm using R version 2.2.0 installed on a Windows XP machine with SP2 (maybe
it's also interesting to note it's a laptop, so it outputs to a laptop screen),
and I wanted to draw a line in a graph, but it does not seem to work.

To test it I use the following code:

x = c(-1,0,1)
y = c(-1,0,1)
plot(x,y, type="l", xlim=c(-1,1), ylim=c(-1,1))
lines(0)

If I understand the documentation right this should draw a line (with
default settings, I'm not setting any parameters) at x=0.

I tried goofing around a bit setting linewidth and color differently, I
tried using xy.coords etc, but no line appeared in the graph.

The commands abline and segments work perfectly fine (so I am now using
segments to plot the line I want), but I still think the lines command
should work.

Does anybody have similar problems drawing lines on XP machines (or laptops
in general)? Or am I doing something abominably wrong?

Greetings and thanks in advance for any replies,
Edwin Commandeur



Re: [R] Splitting the list

2006-01-05 Thread Florence Combes
I don't think splitting the list is a good idea, neither according to the
level of questions (which would "kill" the beginners' list), nor according
to geographic boundaries.

I totally agree with Heinz Tuechler's position: a (short) code in the
subject line of the e-mail seems a good idea if people feel it necessary to
organize this list more.

Florence.


On 1/5/06, Heinz Tuechler <[EMAIL PROTECTED]> wrote:
>
> At 11:56 05.01.2006 +1100, John Maindonald wrote:
> >I've changed the heading because this really is another thread.  I
> >think it inevitable that there will, in the course of time, be other
> >lists that are devoted, in some shape or form, to the concerns of
> >practitioners (at all levels) who are using R.  One development I'd
> >not like to see is fracture along application area lines, allowing
> >those who are comfortable in coteries whose focus was somewhat
> >relevant to standards of use of statistics in that area 15 or 20
> >years ago to continue that way.  One of the great things about R, in
> >its development to date, has been its role in exposing people from a
> >variety of application area communities to statistical traditions
> >different from that in which they have been nurtured. I expect it to
> >have a continuing role in raising statistical analysis standards, in
> >"raising the bar".
> >
> >Another possibility is fracture along geographic boundaries.  This
> >has both benefits (one being that its is easier within a smaller
> >circle of people who are more likely to know each other for
> >contributors to establish a rapport that will make the list really
> >effective; also there will be notices and discussion that are of
> >local interest) and drawbacks (it risks separating subscribers off
> >from important discussions on the official R lists.)  On balance,
> >this may be the better way to go. Indeed subscribers to ANZSTAT
> >(Australian and NZ statistical list) will know that an R-downunder
> >list, hosted at Auckland, is currently in test-drive mode. There
> >should be enough subscribers in common between this and the official
> >R lists that the south-eastern portion of Gondwana does not, at any
> >time in the very near future, float off totally on its own.
> >
> >There are of course other possibilities, and it may be useful to
> >canvass them.
> >
>
> Repeating a comment under the subject "Splitting the list":
> I would consider using flags at the beginning of the subject line, like
> e.g. "BQ" for basic question. Of course, geographic boundaries could
> also be considered.
> These flags should be defined in the posting guide.
> This way, every reader/expert can decide on a personal level to split the
> list by filtering the messages accordingly.
>
> Heinz Tuechler
>
> >John Maindonald email: [EMAIL PROTECTED]
> >phone : +61 2 (6125)3473fax  : +61 2(6125)5549
> >Mathematical Sciences Institute, Room 1194,
> >John Dedman Mathematical Sciences Building (Building 27)
> >Australian National University, Canberra ACT 0200.
> >
> >
> >
> >On 4 Jan 2006, at 10:00 PM, [EMAIL PROTECTED] wrote:
> >
> >> From: Ben Fairbank <[EMAIL PROTECTED]>
> >> Date: 4 January 2006 4:42:31 AM
> >> To: R-help@stat.math.ethz.ch
> >> Subject: Re: [R] A comment about R:
> >>
> >>
> >> One implicit point in Kjetil's message is the difficulty of learning
> >> enough of R to make its use a natural and desired "first choice
> >> alternative," which I see as the point at which real progress and
> >> learning commence with any new language.  I agree that the long
> >> learning
> >> curve is a serious problem, and in the past I have discussed, off
> >> list,
> >> with one of the very senior contributors to this list the
> >> possibility of
> >> splitting the list into sections for newcomers and for advanced users.
> >> He gave some very cogent reasons for not splitting, such as the
> >> possibility of newcomers' getting bad advice from others only slightly
> >> more advanced than themselves.  And yet I suspect that a newcomers'
> >> section would encourage the kind of mutually helpful collegiality
> >> among
> >> newcomers that now characterizes the exchanges of the more experienced
> >> users on this list.  I know that I have occasionally been reluctant to
> >> post issues that seem too elementary or trivial to vex the others
> >> on the
> >> list with and so have stumbled around for an hour or so seeking the
> >> solution to a simple problem.  Had I the counsel of others similarly
> >> situated progress might have been far faster.  Have other newcomers or
> >> occasional users had the same experience?
> >>
> >> Is it time to reconsider splitting this list into two sections?
> >> Certainly the volume of traffic could justify it.
> >>
> >> Ben Fairbank
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@stat.math.ethz.ch mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide!
> http://www.R-

Re: [R] A comment about R:

2006-01-05 Thread Robert Chung
Roger Bivand wrote:
> Gabor Grothendieck wrote:
>
>> For example, consider this introductory session in Stata:
>> http://www.stata.com/capabilities/session.html
>>
> Could I ask for comments on:
> source(url("http://spatial.nhh.no/R/etc/capabilities.R"), echo=TRUE)
> as a reproduction of the Stata capabilities session?

Roger, I think your reproduction of the Stata session is excellent.

However, in a deeper sense, perhaps it's *too* faithful a replication. I
don't normally do analyses exactly the same way in R and in Stata, so
although it's possible to contort R into producing Stata-like output, why
would anyone want to? For example, in the sample Stata session, they run a
t-test before plotting any data. In R, I'd tend to plot early and test
hypotheses after. Rather than print out the top and bottom 5 mileage cars,
I might plot(weight,mpg,col=as.integer(foreign)) and identify() the
bivariate oddities. Rather than start into linear models, I might do some
lowess() lines. I'd probably do a splom() pretty early. Depending on what
I was doing, maybe I'd do something like
stars(auto[,-c(1,12)],labels=make).
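
[Editor's aside: a sketch of that exploratory style, using the built-in mtcars data as a stand-in for Stata's auto data set and base pairs() in place of lattice's splom(), written to a temporary PDF so the example is self-contained.]

```r
pdf(tmp <- tempfile(fileext = ".pdf"))              # off-screen device
plot(mtcars$wt, mtcars$mpg, col = mtcars$am + 1)    # colour by transmission
lines(lowess(mtcars$wt, mtcars$mpg))                # smooth before any model
pairs(mtcars[, c("mpg", "wt", "hp")])               # quick scatterplot matrix
dev.off()
```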

Stata and R are both fine products, but I sometimes wonder how the tools
one chooses affect the analyses one does.



Re: [R] A comment about R:

2006-01-05 Thread Uwe Ligges
François Pinard wrote:
> [David Forrest]
> 
> 
>>[...] A few end-to-end tutorials on some interesting analyses would be
>>helpful.
> 
> 
> I'm in the process of learning R.  While tutorials are undoubtedly very 
> useful, and understanding that working and studying methods vary between 
> individuals, what I (for one) would like to have is a fairly complete 
> reference manual to the library.
> 
> Of course, we already have one, and that's marvellous already.  Yet, it 
> is organised by library and, within each library, by function name: this
> organisation means that the manual is mainly used as a reference, or 
> else, that it ought to be studied from cover to cover, dauntingly.
> 
> The very same material could be organised by topics.  Chapters could be 
> named like "General Help", "Language features", "Data types", "Data 
> Handling", "Input/Output", "Graphics", "Statistics", and such.  The 
> chapter "Language features", to take one example, could hold sections 
> like "Expressions", "Statements", "Functions", "Environments", 
> "Packages", "Execution" and "Debugging".  Sections could then hold 
> current reference pages.  References by library and/or by function name 
> could be stated either in appendices or as a general index at the end.


Have a look at  help.start() --> Search Engine & Keywords --> Section 
"Keywords by Topic".

Uwe Ligges



> For those who happen to know it, I find the "Emacs Lisp Reference 
> Manual" to be a good example for organising, in a very usable way,
> a comprehensive reference to a flurry of library functions.  When one 
> needs string handling functions, they are likely grouped together in the 
> manual, and are likely all present.  A tutorial, by comparison, usually 
> presents a subset, or even a tiny subset, of what is available.
> 
> 
>>Any volunteers?
> 
> 
> Not me, or at least, not before quite a long while.  The overall 
> organisation of a reference should not be handled by beginners.  On the 
> contrary, it rather requires someone who has comprehensive knowledge of 
> all the material to be considered.
> 
> Just an idea.  A good work plan would be to establish a new structure 
> for a reference manual, and once competent people (or this community as 
> a whole) agrees on a structure, to develop mechanical means for 
> generating a reference manual out of the current material.  The 
> mechanism should likely allow for added glue text, about everywhere 
> reasonable, and for diagnosing any lone, unreachable page in the current 
> reference.
>



Re: [R] A comment about R:

2006-01-05 Thread François Pinard
[Jonathan Baron]

>> [the current reference manual] is organised by library and, within 
>> each library, by function name: this organisation means that the 
>> manual is mainly used as a reference, or else, that it ought to be 
>> studied from cover to cover, dauntingly.

>I think that many search facilities are helpful here: [...]
>help.search() [...] >2. RSiteSearch() [...]

Sure they are!  Yet, we do not all learn or work the same way.  Given
full choice, I prefer reading a reference manual to fishing around for
information, as this tends to build stronger information nets within my
brain :-).

>I doubt that the sort of manual you describe is possible given the very
>rapid growth of CRAN, and it would be really inadequate if it did not
>include those packages.

The current reference manual does not cover CRAN, but even so I would 
not be tempted to call it inadequate (at least for the novice I am).  
There seems to be a lot to know about R, initially "as a language", and 
then for learning to shuffle and organise data in preparation for later 
processing.  I would guess every new R user has to find his way in 
there.  The current reference says a lot, but is big to grasp as it 
stands, and its organisation is not as helpful as it could be for 
learning and retention.

The kind of manual I described seems possible to me, because it could be
mechanically derived out of a plan, and the derivation mechanics could
diagnose what is being forgotten (this could even yield some "Unsorted
functions" chapter or appendix).  The mechanism could be made general
enough to accept glue text at appropriate places.  [Not completely
dissimilar to, for those who happen to remember it, the way C code was
mechanically derived out of Pascal, initially, for Knuth's TeX.]

>Many of [CRAN packages] are designed for people in particular fields
>and turn out to be extremely useful.

Undoubtedly!  I envy you all, who know already! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [R] A comment about R:

2006-01-05 Thread Philippe Grosjean
David Forrest wrote:
 > [...]
 > Any volunteers?

Yes, me (well, partly...)! Here is what I propose: this is a very 
lengthy thread in R-Help, with many interesting ideas and suggestions. I 
fear that, as happens too often, those nice ideas will be lost because 
of the medium used: email! By nature, emails are read and then deleted 
(well, there is the R-Help archive, but threads in a mailing list are 
not at all the best tool for making collaborative documents like these 
tutorials and such).

I just cooked up a little Wiki *dedicated to R beginners* (meaning they 
can contribute too, and are very welcome to discuss their problems - 
possibly trivial for others). It is available at 
http://www.sciviews.org/_rgui/wiki. For the moment, everyone can edit 
and add pages, but in the future I will restrict rights to logged-in 
users only (with everybody allowed to log in at any time), so that we 
will be able to track who made changes (authorship).

For those who do not know the Wiki concept, it is a very simple way of 
working together on the same documents. The concept has proven very 
powerful, a good example being Wikipedia, which is becoming one of the 
largest encyclopedias in the world... and nearly as accurate as 
Encyclopaedia Britannica (but read this: 
http://www.nature.com/news/2005/051212/full/438900a.html).

Here is the introduction of the R (GUI) Wiki:

This Wiki is mainly dedicated to R beginners' problems. 
Although we would like to emphasize using R GUIs (Graphical User 
Interfaces), this Wiki is not restricted to those GUIs: one can also 
deal with command-line approaches. The main idea is thus to have 
material contributed both by beginners and by more advanced R users 
that will help novices or casual users of R (http://www.r-project.org).

Overview

* The various documents in the [[wiki section]] explain how to use 
DokuWiki to edit documents in this site.

* The [[beginners section]] is dedicated to... beginners (share 
experience, expose problems and difficulties useful to share with other 
beginners, or to get help from more advanced people).

* The [[tutorials section]] is the place where you can put various R 
session examples, or short tutorials on either general or specific use of R.

* The [[easier section]] aims to collect various pieces of R code that 
simplify common tasks (especially for beginners) and that will 
ultimately be compiled into an “easieR” R package on CRAN.

* The [[varia section]] is for any material that does not fit in the 
previous sections.


Final note: working with Wikis requires some learning... So, I am not 
at all sure that many R beginners will contribute to this wiki, but, of 
course, I hope so. Let's just consider it a small experiment in 
answering requests for an Internet space other than R-Help, 
specifically dedicated to beginners...

A good starting point would be the following: all people that expressed 
interesting points in this thread could "copy and paste their ideas" to 
new pages in the Wiki.

Best,

Philippe Grosjean

..<°}))><
  ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
  ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
( ( ( ( (Academie Universitaire Wallonie-Bruxelles
  ) ) ) ) )   8, av du Champ de Mars, 7000 Mons, Belgium
( ( ( ( (
  ) ) ) ) )   phone: + 32.65.37.34.97, fax: + 32.65.37.30.54
( ( ( ( (email: [EMAIL PROTECTED]
  ) ) ) ) )
( ( ( ( (web:   http://www.umh.ac.be/~econum
  ) ) ) ) )  http://www.sciviews.org
( ( ( ( (
..

David Forrest wrote:
> On Tue, 3 Jan 2006, Gabor Grothendieck wrote:
> ...
> 
>>In fact there are some things that are very easy
>>to do in Stata and can be done in R but only with more difficulty.
>>For example, consider this introductory session in Stata:
>>
>>http://www.stata.com/capabilities/session.html
>>
>>Looking at the first few queries,
>>see how easy it is to take the top few in Stata whereas in R one would
>>have a complex use of order.  It's not hard in R to write a function
>>that would make it just as easy, but it's not available off the top
>>of one's head, though RSiteSearch("sort.data.frame") will find one
>>if one knows what to search for.
> 
> 
> This sort of thing points to an opportunity for documentation.  Building a
> tutorial session in R on how one would do a similar analysis would provide
> another method of learning R.  "An Introduction to R" is a good bottom-up
> introduction, which if you work through it does teach you how to do
> several things.  Adapting other tutorials or extended problems, like the
> Stata session, to R would give additional entry points.  A few end-to-end
> tutorials on some interesting analyses would be helpful.
> 
> Any volunteers?
> 
> Dave


Re: [R] A comment about R:

2006-01-05 Thread Jonathan Baron
On 01/04/06 11:04, François Pinard wrote:
> I'm in the process of learning R.  While tutorials are undoubtedly very
> useful, and understanding that working and studying methods vary between
> individuals, what I (for one) would like to have is a fairly complete
> reference manual to the library.
> 
> Of course, we already have one, and that's marvellous already.  Yet, it
> is organised by library and, within each library, by function name: this
> organisation means that the manual is mainly used as a reference, or
> else, that it ought to be studied from cover to cover, dauntingly.

I think that many search facilities are helpful here:

1. help.search() searches all libraries on your computer by
default.

2. RSiteSearch() has an option for searching all functions in all
existing packages, in case you don't have a package and turn out
to need it: RSiteSearch("topic", restrict="functions")
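Concretely, the two search routes might be invoked as below (a sketch; the topic string is an arbitrary example, and the site search needs network access):

```r
## Search locally installed packages by topic:
help.search("mixed model")

## Search functions across all CRAN packages via the R site search engine.
## This needs internet access and opens the results in a browser, so it is
## shown unevaluated here:
## RSiteSearch("mixed model", restrict = "functions")
```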

I doubt that the sort of manual you describe is possible given
the very rapid growth of CRAN, and it would be really inadequate
if it did not include those packages.  Many of them are designed
for people in particular fields and turn out to be extremely
useful.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



Re: [R] Splitting the list

2006-01-05 Thread Heinz Tuechler
At 11:56 05.01.2006 +1100, John Maindonald wrote:
>I've changed the heading because this really is another thread.  I  
>think it inevitable that there will, in the course of time, be other  
>lists that are devoted, in some shape or form, to the concerns of  
>practitioners (at all levels) who are using R.  One development I'd  
>not like to see is fracture along application area lines, allowing  
>those who are comfortable in coteries whose focus was somewhat  
>relevant to standards of use of statistics in that area 15 or 20  
>years ago to continue that way.  One of the great things about R, in  
>its development to date, has been its role in exposing people from a  
>variety of application area communities to statistical traditions  
>different from that in which they have been nurtured. I expect it to  
>have a continuing role in raising statistical analysis standards, in  
>"raising the bar".
>
>Another possibility is fracture along geographic boundaries.  This  
>has both benefits (one being that it is easier within a smaller  
>circle of people who are more likely to know each other for  
>contributors to establish a rapport that will make the list really  
>effective; also there will be notices and discussion that are of  
>local interest) and drawbacks (it risks separating subscribers off  
>from important discussions on the official R lists.)  On balance,  
>this may be the better way to go. Indeed subscribers to ANZSTAT  
>(Australian and NZ statistical list) will know that an R-downunder  
>list, hosted at Auckland, is currently in test-drive mode. There  
>should be enough subscribers in common between this and the official  
>R lists that the south-eastern portion of Gondwana does not, at any  
>time in the very near future, float off totally on its own.
>
>There are of course other possibilities, and it may be useful to  
>canvass them.
>

Repeating a comment under the subject "Splitting the list":
I would consider using flags at the beginning of the subject line, e.g.
"BQ" for a basic question. Of course, geographic boundaries could also
be considered.
These flags should be defined in the posting guide.
This way, every reader/expert can decide at a personal level to split the
list by filtering the messages accordingly.

Heinz Tuechler

>John Maindonald email: [EMAIL PROTECTED]
>phone : +61 2 (6125)3473fax  : +61 2(6125)5549
>Mathematical Sciences Institute, Room 1194,
>John Dedman Mathematical Sciences Building (Building 27)
>Australian National University, Canberra ACT 0200.
>
>
>
>On 4 Jan 2006, at 10:00 PM, [EMAIL PROTECTED] wrote:
>
>> From: Ben Fairbank <[EMAIL PROTECTED]>
>> Date: 4 January 2006 4:42:31 AM
>> To: R-help@stat.math.ethz.ch
>> Subject: Re: [R] A comment about R:
>>
>>
>> One implicit point in Kjetil's message is the difficulty of learning
>> enough of R to make its use a natural and desired "first choice
>> alternative," which I see as the point at which real progress and
>> learning commence with any new language.  I agree that the long  
>> learning
>> curve is a serious problem, and in the past I have discussed, off  
>> list,
>> with one of the very senior contributors to this list the  
>> possibility of
>> splitting the list into sections for newcomers and for advanced users.
>> He gave some very cogent reasons for not splitting, such as the
>> possibility of newcomers' getting bad advice from others only slightly
>> more advanced than themselves.  And yet I suspect that a newcomers'
>> section would encourage the kind of mutually helpful collegiality  
>> among
>> newcomers that now characterizes the exchanges of the more experienced
>> users on this list.  I know that I have occasionally been reluctant to
>> post issues that seem too elementary or trivial to vex the others  
>> on the
>> list with and so have stumbled around for an hour or so seeking the
>> solution to a simple problem.  Had I the counsel of others similarly
>> situated progress might have been far faster.  Have other newcomers or
>> occasional users had the same experience?
>>
>> Is it time to reconsider splitting this list into two sections?
>> Certainly the volume of traffic could justify it.
>>
>> Ben Fairbank
>
>
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>



Re: [R] comparision and removal

2006-01-05 Thread Uwe Ligges
gynmeerut wrote:

> 
> Dear All,
> 
> 
> I am using R and I am putting my problem in form of an example:
> 
> X<-c(128,34,153,987,345,45,3454,23,123)
> I want to remove the entries which are less than 100 (how do I compare every 
> element with 100, and how do I create subsets?)
> and I need two vectors y and z s.t
> y<-c(entries < 100)
> z<- c(remaining entries)
> 
> Moreover, Please tell me which command to use if I want to use different 
> programs for y and z.
> X is the whole dataset and y,z are its disjoint subsets.


Any basic documentation on R programming (e.g. the manual "An 
Introduction to R") explains how to compare values and how to use index 
operations. Please don't ask your homework questions on R-help, but 
please do read the posting guide.
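For the record, the logical-indexing idiom that the manual covers reduces the question to a single comparison; a minimal sketch with the vector from the question:

```r
X <- c(128, 34, 153, 987, 345, 45, 3454, 23, 123)
small <- X < 100    # logical vector: TRUE where the entry is below 100
y <- X[small]       # entries less than 100:  34 45 23
z <- X[!small]      # the remaining entries:  128 153 987 345 3454 123
```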

Uwe Ligges



> Thanks 
> GS
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Re: [R] problem with command line arguments

2006-01-05 Thread Uwe Ligges
madhurima bhattacharjee wrote:

> Hello Everybody,
> 
> I am running a R script through a perl code from command line.
> The perl script is like:
> 
> my $cmd= 'R CMD BATCH D:/try5.R';
> system($cmd);
> 
> I run the perl code from command line.
> Now I want to pass some command line arguments to the R script.
> Its like the argv concept of perl.
> 
> Do I pass the arguments through my $cmd in the perl script?
> If yes, then how to access that in the R script?
> Any help will really be appreciated.


See ?commandArgs
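One way to wire this up (the file name and argument values are hypothetical): pass the extra arguments after --args in the command string that Perl builds, and read them back with commandArgs() inside the script. The trailingOnly argument is available in recent R; older versions had to strip the leading elements from commandArgs() manually.

```r
## Perl side (hypothetical arguments "10" and "in.txt"):
##   my $cmd = 'R CMD BATCH "--args 10 in.txt" D:/try5.R';
##   system($cmd);

## Inside try5.R:
args <- commandArgs(trailingOnly = TRUE)  # keeps only what follows --args
n      <- as.numeric(args[1])             # first argument, as a number
infile <- args[2]                         # second argument, a file name
```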

Uwe Ligges


> Thanks and Regards,
> Madhurima.
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Re: [R] Problem with nlme version 3.1-68

2006-01-05 Thread Prof Brian Ripley
It's a bug.  So nothing in the test suites uses this (nor any example in 
any package on CRAN, which were tested prior to release).

Note that 3.1-68 is not the version of nlme which ships with R 2.2.1 
(deliberately not introducing a new feature until after release).
Look for 3.1-68.1 in due course.


On Thu, 5 Jan 2006, Bing T. Guan wrote:

> Dear All:
> I updated my R program as well as associated packages yesterday. Currently
> my R version is 2.2.1 running under WINXP SP-2.
> When I tried to list (summary) an nlme object that I developed before, I got
> the following error message:
>
> [ Error in .C("ARMA_constCoef", as.integer(attr(object, "p")),
> as.integer(attr(object,  :
>C entry point "ARMA_constCoef" not in DLL for package "nlme" ]
>
> The nlme object was fitted with corr = corARMA(q=2) option. I refitted the
> model, and the same error message appeared. I then refitted the model with
> option corr = corARMA(p=1), then no problem; but for p = 2, or q = 1 or 2,
> then the error occurred. When I listed the same fitted nlme objects under R
> 2.1.1 with nlme 3.1-65, then no problem.
>
> I fitted the Ovary data (Pinheiro and Bates 2000, p.397) using the script
> provided in nlme package
> fm3Ovar.nlme <- update(fm1Ovar.nlme, correlation = corARMA(p=0, q=2)), and
> tried to list the result. The same error occurred. I tried it out on several
> of PCs (WINXP SP-2, R 2.2.1, nlme 3.1-68) and the same situation happened on
> every machine.
>
> Is there a bug in the latest version of nlme (3.1-68), or the problem only
> happened to me and my machines?
> ***
> Biing T. Guan
> School of Forestry and Resource Conservation
> National Taiwan University
> [EMAIL PROTECTED]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] .Rprofile files (was R newbie configuration)

2006-01-05 Thread Prof Brian Ripley
And here is one with a working setHook call.

options(show.signif.stars=FALSE)
setHook(packageEvent("grDevices", "onLoad"),
 function(...) grDevices::ps.options(horizontal=FALSE))
set.seed(1234)
options(repos=c(CRAN="http://cran.uk.r-project.org"))


On Thu, 5 Jan 2006, Petr Pikal wrote:

> Hi
>
> here is my example of .Rprofile file
>
>
> require(graphics)
> require(utils)
>
> # setHook(packageEvent("graphics", "onLoad"), function(...)
> # graphics::par(bg="white"))  ## did not manage to persuade setHook
> # to work properly
>
> par(bg="white")
> RNGkind("Mersenne-Twister", "Inversion")
>
> # some set of my functions and data
>
> .libPaths("D:/programy/R/R-2.2.0/library/fun")
> library(fun)
> data(stand)
>
>
> HTH
> Petr
>
>
> On 4 Jan 2006 at 15:46, Mark Leeds wrote:
>
> Date sent:Wed, 4 Jan 2006 15:46:37 -0500
> From: "Mark Leeds" <[EMAIL PROTECTED]>
> To:   "R-Stat Help" 
> Subject:  [R] R newbie configuration
>
>> I think I did enough reading on my
>> own about startup (part of the morning
>> and most of this afternoon)
>> to not feel uncomfortable asking
>> for confirmation of my understanding of this startup stuff.
>>
>> Obviously, the startup process is more complicated
>> than below but, for my R newbie purposes,
>> it seems like I can think of the startup process as follows:
>>
>> Suppose my  home directory = "c:documents and settings/mleeds" =
>> $HOME.
>>
>> Put things in $HOME/.Rprofile that are more generic on startup and not
>> specific to any particular R project.
>>
>> Put various .First() functions in the working directories of the
>> particular projects that
>> they are associated with, so that they are loaded when their .RData
>> file gets loaded.
>>
>> If the above is correct (emphasis on correct for a newbie; I know there
>> is a lot more going on and things can be done more elegantly, etc.),
>> could someone send me an example of a .Rprofile file? I didn't use
>> these in S+ and I am wondering what you put in them.
>>
>>Thanks
>>
>>
>>
>>
>> **
>> This email and any files transmitted with it are
>> confidentia...{{dropped}}
>>
>> __
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
>
> Petr Pikal
> [EMAIL PROTECTED]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Difficulty with 'merge'

2006-01-05 Thread Christoph Buser
Dear Michael

Please note that merge produces all possible combinations when there
are repeated elements, as you can see in the example below. 

?merge

"... If there is more than one match, all possible matches
contribute one row each. ..."

Maybe you can apply "aggregate" in a reasonable way to your 
data.frame first, to summarize your repeated values into unique
ones, and then proceed with merge, but that depends on your
problem. 

Regards,

Christoph

--
Christoph Buser <[EMAIL PROTECTED]>
Seminar fuer Statistik, LEO C13
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--

example with repeated values


v1 <- c("a", "b", "a", "b", "a")
n1 <- 1:5
v2 <- c("b", "b", "a", "a", "a")
n2 <- 6:10
(f1  <- data.frame(v1, n1))
(f2 <- data.frame(v2, n2))
(m12 <- merge(f1, f2, by.x = "v1", by.y = "v2", sort = F))
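To make the expansion concrete: in the example above, "a" occurs 3 times in f1 and 3 times in f2, and "b" occurs 2 and 2 times, so the merge yields 3*3 + 2*2 = 13 rows. A sketch of collapsing the duplicate keys first (mean() is an arbitrary choice of summary) to get a one-row-per-key merge:

```r
v1 <- c("a", "b", "a", "b", "a"); n1 <- 1:5
v2 <- c("b", "b", "a", "a", "a"); n2 <- 6:10
f1 <- data.frame(v1, n1)
f2 <- data.frame(v2, n2)

## All matching pairs contribute a row each:
m12 <- merge(f1, f2, by.x = "v1", by.y = "v2")
nrow(m12)                                  # 13 = 3*3 + 2*2

## Collapse repeated keys before merging (mean is an arbitrary summary):
a1 <- aggregate(n1 ~ v1, data = f1, FUN = mean)
a2 <- aggregate(n2 ~ v2, data = f2, FUN = mean)
merge(a1, a2, by.x = "v1", by.y = "v2")    # one row per key
```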





Michael Kubovy writes:
 > Dear R-helpers,
 > 
 > Happy New Year to all the helpful members of the list.
 > 
 > Here is the behavior I'm looking for:
 >  > v1 <- c("a","b","c")
 >  > n1 <- c(0, 1, 2)
 >  > v2 <- c("c", "a", "b")
 >  > n2 <- c(0, 1 , 2)
 >  > (f1  <- data.frame(v1, n1))
 >v1 n1
 > 1  a  0
 > 2  b  1
 > 3  c  2
 >  > (f2 <- data.frame(v2, n2))
 >v2 n2
 > 1  c  0
 > 2  a  1
 > 3  b  2
 >  > (m12 <- merge(f1, f2, by.x = "v1", by.y = "v2", sort = F))
 >v1 n1 n2
 > 1  c  2  0
 > 2  a  0  1
 > 3  b  1  2
 > 
 > Now to my data:
 >  > summary(pL)
 >  pairL
 > a fondo   :  41
 > alto  :  41
 > ampio :  41
 > angoloso  :  41
 > aperto:  41
 > appoggiato:  41
 > (Other)   :1271
 > 
 >  > pL$pairL[c(1,42)]
 > [1] appoggiato dentro
 > 37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico  
 > complicato convesso davanti dentro destra ... verticale
 > 
 >  > summary(oppN)
 >  pairL  pairR subject
 > LLLRR   M
 > a fondo   :  41   a galla:  41   S1 :  37   Min.   :0.3646
 > Min.   :0.02083   Min.   :0.0010   Min.   :0.
 > alto  :  41   acuto  :  41   S10:  37   1st Qu.:0.5521
 > 1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
 > ampio :  41   arrotondato:  41   S11:  37   Median :0.6354
 > Median :0.47917   Median :0.2708   Median :0.2292
 > angoloso  :  41   basso  :  41   S12:  37   Mean   :0.6403
 > Mean   :0.46452   Mean   :0.2760   Mean   :0.2598
 > aperto:  41   chiuso :  41   S13:  37   3rd Qu.:0.7188
 > 3rd Qu.:0.55208   3rd Qu.:0.3750   3rd Qu.:0.3854
 > appoggiato:  41   compl  :  41   S14:  37   Max.   :0.9375
 > Max.   :0.92708   Max.   :0.6042   Max.   :0.7812
 > (Other)   :1271   (Other):1271   (Other): 
 > 1295  NA's   :3.   NA's   : 
 > 3.
 >asym polarpolar_a1  clust
 > Min.   :-0.   Min.   :-1.2410   Min.   :-2.949e+00   c1:492
 > 1st Qu.: 0.2091   1st Qu.: 0.4571   1st Qu.:-1.902e-01   c2:287
 > Median : 0.   Median : 1.1832   Median :-1.110e-16   c3: 82
 > Mean   : 0.6265   Mean   : 1.3428   Mean   :-5.745e-02   c4:246
 > 3rd Qu.: 0.9383   3rd Qu.: 2.0712   3rd Qu.: 1.168e-01   c5: 82
 > Max.   : 2.7081   Max.   : 4.6151   Max.   : 4.218e+00   c6:328
 > NA's   : 3.   NA's   : 3.000e+00
 > 
 >  > oppN$pairL[c(1,42)]
 > [1] spesso fine
 > 37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico  
 > complicato convesso davanti dentro destra ... verticale
 > 
 >  > unique(sort(oppM$pairL)) == unique(sort(pL$pairL))
 > [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE  
 > TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 > [26] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 > 
 > In other words I think that pL$pairL and oppN$pairL consists of 37  
 > blocks of 41 repetitions of names, and that these blocks are  
 > permutations of each other,
 > 
 > However:
 > 
 >  > summary(m1 <- merge(oppM, pairL, by.x = "pairL", by.y = "pairL",  
 > sort = F))
 >  pairL   pairR  subject 
 > LLLRR   M
 > a fondo   : 1681   a galla: 1681   S1 : 1517   Min.   : 
 > 0.3646   Min.   :0.02083   Min.   :0.0010   Min.   :0.
 > alto  : 1681   acuto  : 1681   S10: 1517   1st Qu.: 
 > 0.5521   1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
 > ampio : 1681   arrotondato: 1681   S11: 1517   Median : 
 > 0.6354   Median :0.47917   Median :0.2708   Median :0.2292
 > angoloso  : 1681   basso  : 1681   S12: 1517   Mean   : 
 > 0.6398   Mean   :0.46402   Mean   :0.2760   Mean   :0.2598
 > aperto: 1681   chiuso : 1681   S13: 1517   3rd Qu.: 
 > 0.7188   3rd Qu.:0.55208  

[R] Problem with nlme version 3.1-68

2006-01-05 Thread Bing T. Guan
Dear All:
I updated my R program as well as associated packages yesterday. Currently
my R version is 2.2.1 running under WINXP SP-2. 
When I tried to list (summary) an nlme object that I developed before, I got
the following error message:

[ Error in .C("ARMA_constCoef", as.integer(attr(object, "p")),
as.integer(attr(object,  : 
C entry point "ARMA_constCoef" not in DLL for package "nlme" ]

The nlme object was fitted with corr = corARMA(q=2) option. I refitted the
model, and the same error message appeared. I then refitted the model with
option corr = corARMA(p=1), then no problem; but for p = 2, or q = 1 or 2,
then the error occurred. When I listed the same fitted nlme objects under R
2.1.1 with nlme 3.1-65, then no problem.

I fitted the Ovary data (Pinheiro and Bates 2000, p.397) using the script
provided in nlme package 
fm3Ovar.nlme <- update(fm1Ovar.nlme, correlation = corARMA(p=0, q=2)), and
tried to list the result. The same error occurred. I tried it out on several
of PCs (WINXP SP-2, R 2.2.1, nlme 3.1-68) and the same situation happened on
every machine.

Is there a bug in the latest version of nlme (3.1-68), or the problem only
happened to me and my machines?
***
Biing T. Guan
School of Forestry and Resource Conservation
National Taiwan University
[EMAIL PROTECTED]
