date:20030923

Re: Re: [R] what does the sum of square of Gaussian RVs with different variance obey?

2003-09-23 Thread Jean Sun

Thanks a lot.
It works. The fitted distribution matches the simulated data well. There is
not even a need for a shifted or scaled version of the chi-squared pdf. I have
also tested the case of non-independent RVs, generated as linear combinations
of independent Gaussian RVs, and the result is satisfactory there too.

Regards,
J.Sun


2003-09-23 07:07:00 Thomas Lumley wrote:

>On Tue, 23 Sep 2003, Jean Sun wrote:
>
>> From basic statistical principles we know that, given several i.i.d.
>> Gaussian RVs with zero or nonzero mean, the sum of their squares is a
>> central or noncentral chi-squared RV. However, if these Gaussian RVs
>> have different variances, what distribution does the sum of their
>> squares follow?
>>
>
>Nothing very useful.  It's a weighted sum of chisquare(1) variables. One
>standard approach is to approximate it by a multiple of a chisquared
>distribution that has the correct mean and variance.
>
>   -thomas
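
The moment-matching approximation described in the reply can be sketched in R. This is a minimal sketch, assuming independent zero-mean Gaussian RVs; the function name match_chisq and the example variances are made up for illustration and are not from the thread.

```r
# Approximate Q = sum(X_i^2), with X_i ~ N(0, sigma2[i]) independent,
# by scale * chisq(df), chosen so that mean and variance agree with Q.
match_chisq <- function(sigma2) {
  m <- sum(sigma2)          # E[Q]
  v <- 2 * sum(sigma2^2)    # Var[Q]
  list(scale = v / (2 * m), df = 2 * m^2 / v)
}

sigma2 <- c(1, 4, 9)        # illustrative variances (an assumption)
ap <- match_chisq(sigma2)

# Compare simulated quantiles of Q with those of the approximation
set.seed(1)
n <- 1e5
q <- rowSums(sapply(sigma2, function(s2) s2 * rnorm(n)^2))
probs <- c(0.5, 0.9, 0.99)
rbind(simulated = quantile(q, probs),
      approx    = ap$scale * qchisq(probs, df = ap$df))
```

The degrees of freedom are generally not an integer, which qchisq() accepts. The match is exact in the first two moments only, so the tails should be checked by simulation as above.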

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

RE: [R] data.frame with duplicated id's

2003-09-23 Thread Andrew Hayen

Try ?reshape

A
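
For the example data in the question, a call to reshape() might look like the following. This is a sketch only: the time column name MONTH is a hypothetical label that the original poster did not ask for, and it is dropped again at the end.

```r
# The wide data from the question
wide <- data.frame(ID = c(11, 12), AGE = c(20, 30),
                   V.MAI = c(100, 200), V.JUNE = c(120, 90))

# Stack V.MAI and V.JUNE into a single column V
long <- reshape(wide, direction = "long",
                varying = c("V.MAI", "V.JUNE"), v.names = "V",
                timevar = "MONTH", times = c("MAI", "JUNE"),
                idvar = "ID")

# Sort by ID and keep only the columns asked for
long <- long[order(long$ID), c("ID", "AGE", "V")]
long
```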


-Original Message-
From: Christian Schulz [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 24 September 2003 3:42 PM
To: [EMAIL PROTECTED]
Subject: [R] data.frame with duplicated id's


Hi,

is there an existing function (I have found nothing so far) that
makes it possible to transform a dataset:

ID AGE V.MAI V.JUNE
11 20  100   120
12 30  200   90

into

ID AGE V
11 20  100
11 20  120
12 30  200
12 30  90

or do I have to program this myself?

Thanks for any comment, help and/or starting point.

regards, christian

[R] data.frame with duplicated id's

2003-09-23 Thread Christian Schulz

Hi,

is there an existing function (I have found nothing so far) that
makes it possible to transform a dataset:

ID AGE V.MAI V.JUNE
11 20  100   120
12 30  200   90

into

ID AGE V
11 20  100
11 20  120
12 30  200
12 30  90

or do I have to program this myself?

Thanks for any comment, help and/or starting point.

regards, christian

Re: [R] R Production Performance

2003-09-23 Thread Joe Conway

Paul Meagher wrote:
> Below is the test I ran awhile back on invoking R as a system call.  It
> might be faster if you had a c-extension to R but before I went that route I
> would want to know 1) roughly how fast Python and Perl are in returning
> results with their c-bindings/embedded stuff/dcom stuff, 2) whether R can be
> run as a daemon process so you don't incur start up costs, and 3) whether R
> can act as a math server in the sense that it will fork children or threads
> as multiple users establish sessions with it.  I agree it would be nice to
> have a better interface to R than via a system call.

I'm doing something similar using PL/R (an R procedural language handler 
extension to Postgres that I wrote) with Postgres, R, and PHP. In 
Postgres 7.4 (currently at beta3) or with a back-patched copy of 7.3, 
you can preload the R interpreter when the Postgres postmaster first 
starts. This means that essentially R is running as part of the Postgres 
daemon. Whenever a connection is made to the database, the forked 
process already has an initialized copy of R running inside it. The 
startup savings I see are similar to what you did (2.2 seconds versus 
0.009 seconds):

--
Function -- intentionally very simple:
--
create or replace function echo(text) returns text as 'print(arg1)'
language 'plr';

Without preloading (first function call):
-
regression=# explain analyze select echo('hello');
 Total runtime: 2195.35 msec

Without preloading (second function call):
-
regression=# explain analyze select echo('hello');
 Total runtime: 0.55 msec

With preloading (first function call):
-
regression=# explain analyze select echo('hello');
 Total runtime: 9.74 msec

With preloading (second function call):
-
regression=# explain analyze select echo('hello');
 Total runtime: 0.59 msec
--
In both cases the second (and subsequent) function calls are even faster 
because the PL/R function itself has been precompiled and cached.

I call the PL/R function from PHP to read my data directly from the 
database, process it, and generate whatever charts I need. Here's a very 
simple example:

The PL/R function:
--
create type histtup as
(
  break float8,
  count int
);

create or replace function hist(text, text)
returns setof histtup as '
  # arg1: attribute id; arg2: path of the JPEG to write (no plot if NA)
  sql <- paste("select id_val from sample_numeric_data ",
               "where ia_id=''", arg1, "''", sep="")
  rs <- pg.spi.exec(sql)
  if (!is.na(arg2)) {
    x11(display=":5")
    jpeg(file=arg2, width = 480, height = 480,
         pointsize = 12, quality = 75)
    par(ask = FALSE, bg = "#F8F8F8")
    sql <- paste("select ia_attname as val from atts ",
                 "where ia_id=''", arg1, "''", sep="")
    attname <- pg.spi.exec(sql)
    h <- hist(rs[,1], col = "blue",
              main = paste("Histogram of", attname$val),
              xlab = attname$val)
    dev.off()
    system(paste("chmod 666 ", arg2, sep=""),
           intern = FALSE, ignore.stderr = TRUE)
  } else {
    h <- hist(rs[,1], plot = FALSE)
  }
  # drop the last break so that breaks and counts have equal length
  result <- data.frame(breaks = h$breaks[-length(h$breaks)],
                       count = h$counts)
  return(result)
' language 'plr';
--
The PHP page:
--
[[HTML form markup stripped by the archive; only the PHP below survives]]
<?php
if ($_POST['submit'] == "Submit")
{
  $tmpfilename = 'charts/hist1.jpg';
  $conn = pg_connect("dbname=oscon user=postgres");
  $sql = "select * from hist('" . $_POST['userdata'] . "','" .
 "/tmp/" . $tmpfilename . "')";
  $rs = pg_query($conn,$sql);
  echo "";
}
?>

--
Hopefully this gives you some ideas about what is possible. If you're 
interested in PL/R, you can grab a copy (along with a patched 7.3.4 
source RPM for Postgres) here: http://www.joeconway.com/

HTH,

Joe

Re: [R] Omitting blank lines with read.table

2003-09-23 Thread kjetil brinchmann halvorsen

On 24 Sep 2003 at 11:32, Patrick Connolly wrote:

read.table has an argument blank.lines.skip, which is TRUE by
default, at least in read.table. You could try passing it to
read.delim.

Kjetil Halvorsen
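
If blank.lines.skip does not catch lines that contain only tabs, one alternative is to read empty fields as NA and then drop only the rows in which every field is NA. The following is a sketch under that assumption; the inline text mimics the layout of bug.txt as described in the quoted message, since the real file is not available.

```r
# Same layout as bug.txt, supplied inline so the example is self-contained
txt <- paste(
  "Part\tRep\tCage\tHb pupae",
  "1\t1\tS\t32",
  "\t1\tM\t34",
  "\t\tL\t42",
  "\t\t\t",
  "\t\t\t",
  "\t2\tS\t36",
  "\t\tM\t28",
  "\t\tL\t36",
  sep = "\n")

# Treat empty fields as NA, then keep rows with at least one non-NA field
bug <- read.delim(text = txt, na.strings = c("NA", ""))
bug <- bug[rowSums(!is.na(bug)) > 0, ]
bug
```

Unlike na.omit(), this keeps rows that are only partly missing, which is the behaviour asked for in the original question.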

> Say we have a tab delimited file called bug.txt
> 
> Part  Rep  Cage  Hb pupae
> 1     1    S     32
>       1    M     34
>            L     42
> 
> 
>       2    S     36
>            M     28
>            L     36
> 
> read.delim("bug.txt")
> 
>   Part Rep Cage Hb.pupae
> 1    1   1    S       32
> 2   NA   1    M       34
> 3   NA  NA    L       42
> 4   NA  NA            NA
> 5   NA  NA            NA
> 6   NA   2    S       36
> 7   NA  NA    M       28
> 8   NA  NA    L       36
> >
> 
> Variations on read.table give the same result.
> 
> When I first used read.table in Splus, I liked the way it ignored rows
> that were empty (at least when using sep = "\t").  A line was
> considered empty if it contained only tab characters, so the rows of
> NAs or ""s are omitted, so that rows 4 and 5 above would be deleted.
> 
> R's read.table differs in this respect (and a number of really neat
> ones).  I probably know enough Perl to be able to write a short script
> that could delete such lines, and it's not difficult to remove the
> rows from the dataframe afterwards; but maybe there's something simple
> I've misunderstood in the use of R's read.table.
> 
> I can't use na.omit since the other NAs in the data can be dealt with
> so I don't want them removed.  Other suggestions welcome.
> 
> Thanks
> 
> -- 
> Patrick Connolly
> HortResearch
> Mt Albert
> Auckland
> New Zealand 
> Ph: +64-9 815 4200 x 7188
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
> I have the world`s largest collection of seashells. I keep it on all
> the beaches of the world ... Perhaps you`ve seen it.  ---Steven Wright 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~

[R] weird behaviour when calling c++

2003-09-23 Thread Francisco J Molina


I am using R 1.7.1 on a PC (Red Hat 9, Linux).

I created a subroutine in C++, mySubrutine, to be used in R. To debug this
subroutine I have a main routine in C++ that calls mySubrutine. The only
thing main() does is provide mySubrutine with its arguments (this is the
easiest way for me to debug subroutines written in C++ and intended to be
used in R).

I also have a version of the same program in R: an R script that provides
mySubrutine with its arguments.

When I run the C++ version in gdb I do not have any problem; every time I
run it I get the same result.

The R script calls mySubrutine through .C(). Sometimes it gives me the
same result I get in the C++ version; sometimes it freezes. This happens
even if I execute the script several times in a row (I use C-c when it
freezes).
Any idea?

I am using new and delete in mySubrutine, but I guess this should not be
a problem.

P.S.: The first thing the R script executes is rm(list = ls()).
 Using dyn.unload does not make any difference.
 
Thank you.

Re: [R] confusion about what to expect?

2003-09-23 Thread A.J. Rossini

Marc Schwartz <[EMAIL PROTECTED]> writes:

> On Tue, 2003-09-23 at 18:08, A.J. Rossini wrote:

  <-- about confusion -->

Anyway, the mnemonic to use seems to be to remember that complex data
structures get simplified whenever possible.  Thanks to Doug G. and
Patrick C. for that!

I'll plead an excess of Lisp, Python, and C++ recently, but that isn't
a real excuse.

best,
-tony

-- 
[EMAIL PROTECTED]            http://www.analytics.washington.edu/
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN  Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email

CONFIDENTIALITY NOTICE: This e-mail message and any attachme...{{dropped}}

Re: [R] R Production Performance

2003-09-23 Thread Paul Meagher

Hi Zitan,

Below is the test I ran awhile back on invoking R as a system call.  It
might be faster if you had a c-extension to R but before I went that route I
would want to know 1) roughly how fast Python and Perl are in returning
results with their c-bindings/embedded stuff/dcom stuff, 2) whether R can be
run as a daemon process so you don't incur start up costs, and 3) whether R
can act as a math server in the sense that it will fork children or threads
as multiple users establish sessions with it.  I agree it would be nice to
have a better interface to R than via a system call.

Regards,
Paul Meagher

=

I just timed how long it took to pipe a file containing the 2 lines below
into R (via a PHP script executed from my browser):

input.R
--
x = cbind(4,5,3,2,3,4)
x

[[beginning of the PHP timing script stripped by the archive]]
 $Output");
$fp = fopen("$Output", "r");
while (!feof($fp)) {
  $line = fgets($fp,4096);
  echo $line ."";
}
fclose($fp);

$time_end = getmicrotime();
$time = $time_end - $time_start;
echo "Time To Execute: $time seconds";
?>

The time to execute this script was 3.1081320047379 seconds (if I execute
the script a few times, this is around what I get).

I then removed only the line that calls R.  There was already something in
the output.R file so, in essence, the only difference between the original
script and the modified script is the removal of the system call to R.

The time to execute was 0.0010089874267578 seconds.

By subtraction, this means the call to R incurs an overhead of around
3 seconds on an average web server box using the php-apache module.
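
The same comparison can be made from inside R, separating the cost of the computation from the cost of starting a fresh R process. This is only a sketch: it assumes an Rscript executable is available under R.home("bin"), which was not the case for R 1.7.1, and the repetition count is arbitrary.

```r
# Time the computation itself, repeated to get a measurable figure
reps <- 10000
compute <- system.time(for (i in seq_len(reps)) x <- cbind(4, 5, 3, 2, 3, 4))
per_call <- compute[["elapsed"]] / reps

# Time a fresh R process that evaluates a trivial expression (start-up cost)
rscript <- file.path(R.home("bin"), "Rscript")
startup <- system.time(
  system2(rscript, c("-e", "1"), stdout = FALSE, stderr = FALSE)
)[["elapsed"]]

cat("per-call compute time:", per_call, "s\n")
cat("process start-up time:", startup, "s\n")
```

If the start-up figure dominates, as in the test above, keeping an R process alive between requests is likely to help more than speeding up the computation.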









Re: [R] install problem with R Windows

2003-09-23 Thread Duncan Murdoch

On Tue, 23 Sep 2003 16:16:38 -0500, you wrote:

>Dear R People:
>
>I'm trying to install R 1.7.1 for Windows from Source.
>
>The error that I get is:
>previous declaration of 'ssize_t'
>MAKE[2]: ***[internet.o]Error 1
>MAKE[1]: ***[all]Error 1
>MAKE: *** [rmodules] Error 2
>
>Any ideas on how to proceed, please?

Can you describe your setup?  Have you installed the tools as
described in src/gnuwin32/INSTALL, or are you using others?  

Duncan Murdoch

Re: [R] least squares regression line

2003-09-23 Thread Jason Turner

On Wed, 2003-09-24 at 11:04, Carmen Fridell wrote:
> I can't seem to find the command to find the least squares regression line 
> for my bivariate data set. Can you please help? ~Carmen

Any time you're lost in R, you can type "help.start()".  This will start
a web browser, which loads the starting help page.  Click on "Keyword
Search".  Once the new page loads, type in some of the words you're
looking for.  "regression" is a good place to start.  That will lead to
a list of potential matches by subject.  Hint - you're looking for a
function to estimate a *linear* model.
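
For reference, the search above leads to lm(). A minimal sketch with made-up bivariate data (the vectors x and y are simulated here, not taken from the question):

```r
# Simulated bivariate data: y depends linearly on x plus noise
set.seed(42)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 0.5)

# Least squares regression line of y on x
fit <- lm(y ~ x)
coef(fit)        # intercept and slope

# To draw the data and the fitted line:
# plot(x, y); abline(fit)
```

summary(fit) gives standard errors and R-squared for the same fit.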

Cheers

Jason
-- 
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
+64-(0)21-343-545

Re: [R] confusion about what to expect?

2003-09-23 Thread Marc Schwartz

On Tue, 2003-09-23 at 18:08, A.J. Rossini wrote:
> In playing around with data.frames (and wanting a simple, cheap way to
> use the variable and case names in plots; but I've solved that with
> some hacks, yech), I noticed the following behavior with subsetting. 
> 
> 
> testdata <- data.frame(matrix(1:20,nrow=4,ncol=5))
> names(testdata) ## expect labels, get them
> names(testdata[2,]) ## expect labels, get them
> names(testdata[,2]) ## expect labels, but NOT --  STRIPPED OFF??
> testdata[,2]  ## would have expected a name (X2) in the front? NOT EXPECTED
> testdata[2,]  ## get what I expect
> testdata[2,2]  ## just a number, not a sub-data.frame? unexpected
> testdata[2,2:3] ## this is a data.frame
> testdata[2:3,2:3] ## and this is, too.
> 
> > version
>          _
> platform i386-pc-linux-gnu
> arch     i386
> os       linux-gnu
> system   i386, linux-gnu
> status   alpha
> major    1
> minor    8.0
> year     2003
> month    09
> day      20
> language R
> > 
> 
> I don't have 1.7.1 handy at this location to test, but I would've
> expected a data.frame-like object upon subsetting; should I have
> expected otherwise?  (granted, a data.frame with just a single
> variable could be thought of as silly, but it does have some extra
> information that might be worthwhile, on occasion?)
> 
> I'm not sure that it is a bug, but I was caught by surprise.  If it
> isn't a bug, and someone has a concise way to think through this, for
> my future reference, I'd appreciate hearing about it.
> 
> best,
> -tony


Tony,

A quick review of what is returned when you subset the data.frame
testdata:

> str(testdata[,2])
 int [1:4] 5 6 7 8

> str(testdata[2,])
`data.frame':   1 obs. of  5 variables:
 $ X1: int 2
 $ X2: int 6
 $ X3: int 10
 $ X4: int 14
 $ X5: int 18

> dim(testdata[,2])
NULL

> dim(testdata[2,])
[1] 1 5


Quoting from ?Extract:

"When [.data.frame is used for subsetting rows of a data.frame, it
returns a data frame with unique (and non-missing) row names, if
necessary transforming the names using make.names( * , unique = TRUE)"

What is unstated, but covered by R FAQ 7.7 ("Why do my matrices lose
dimensions?"), is that a single column in a data.frame resulting from the
subset operation is by default turned into a vector. Hence, no names.
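
The difference can be seen directly on the testdata object from the question. A short sketch; the variable names col_vec, col_df and col_lst are only for illustration:

```r
testdata <- data.frame(matrix(1:20, nrow = 4, ncol = 5))

col_vec <- testdata[, 2]                 # default drop: plain integer vector
col_df  <- testdata[, 2, drop = FALSE]   # stays a 4 x 1 data frame
col_lst <- testdata[2]                   # list-style indexing, also a data frame

names(col_vec)   # NULL
names(col_df)    # "X2"
```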

HTH,

Marc Schwartz

Re: [R] confusion about what to expect?

2003-09-23 Thread Tony Plate

Have you investigated the drop= argument to "["? (as in the expression 
testdata[,2,drop=F], which will return a dataframe).

"[.data.frame" has somewhat different behavior from "[" on matrices with 
respect to the drop argument: If the result would be a dataframe with a 
single column, the default behavior of "[.data.frame" is to return a vector 
(return a dataframe always if drop=F), but if the result would be a 
dataframe with a single row, the default behavior is to return a dataframe 
(return a list if drop=T).

E.g.:
> class(data.frame(a=1:3,b=4:6)[,1])
[1] "integer"
> class(data.frame(a=1:3,b=4:6)[,1,drop=F])
[1] "data.frame"
> class(data.frame(a=1:3,b=4:6)[1,])
[1] "data.frame"
> class(data.frame(a=1:3,b=4:6)[1,,drop=T])
[1] "list"
>
The default behavior is often what you want, but when it isn't it can be
confusing, especially since it's not that easy to find documentation for
this (at least not in a quick look through the FAQ, ?"[", and "An
Introduction to R" -- please excuse me if I overlooked something.)

The thing you have going on with names(testdata[...]) is merely a 
consequence of whether or not the result of the subsetting operation is a 
dataframe or a vector.

hope this helps,

Tony Plate


[R] Updating least squares

2003-09-23 Thread David Duffy

Paul Meagher <[EMAIL PROTECTED]> wrote:
>
> I am looking at developing a user modelling type app (new data points coming
> in and wanting to dynamically update regression co-efficients for each user)
> which could be viewed as a type of control problem.
>

Alan Miller's AS274 (in C, f77 or f90) does this -- see statlib or his
home page (there are several similar routines eg AS164).  It would be
straightforward to write an R interface.
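
To illustrate the idea of updating a regression as points arrive, here is a small pure-R sketch. It is not AS274: it simply accumulates the cross-products X'X and X'y one observation at a time and solves the normal equations on demand, which is easy to read but numerically less robust than routines that update a triangular factorisation. The helper names new_state, update_ls and coef_ls are hypothetical.

```r
# State for p coefficients: running X'X and X'y
new_state <- function(p) list(xtx = matrix(0, p, p), xty = numeric(p))

# Fold one observation (predictor vector x, response y) into the state
update_ls <- function(state, x, y) {
  state$xtx <- state$xtx + tcrossprod(x)   # adds x %*% t(x)
  state$xty <- state$xty + x * y
  state
}

# Current least squares coefficients (needs at least p independent points)
coef_ls <- function(state) solve(state$xtx, state$xty)

# Usage: feed simulated points one at a time, then compare with lm()
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.1)
state <- new_state(2)
for (i in seq_along(x)) state <- update_ls(state, c(1, x[i]), y[i])
beta <- coef_ls(state)
rbind(updated = beta, lm = unname(coef(lm(y ~ x))))
```

Each update costs a fixed amount of work regardless of how many points have been seen, so one small state object per user would be enough for the application described.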

David Duffy

[R] Omitting blank lines with read.table

2003-09-23 Thread Patrick Connolly

Say we have a tab delimited file called bug.txt

Part  Rep  Cage  Hb pupae
1     1    S     32
      1    M     34
           L     42


      2    S     36
           M     28
           L     36

read.delim("bug.txt")

  Part Rep Cage Hb.pupae
1    1   1    S       32
2   NA   1    M       34
3   NA  NA    L       42
4   NA  NA            NA
5   NA  NA            NA
6   NA   2    S       36
7   NA  NA    M       28
8   NA  NA    L       36
>

Variations on read.table give the same result.

When I first used read.table in Splus, I liked the way it ignored rows
that were empty (at least when using sep = "\t").  A line was
considered empty if it contained only tab characters, so the rows of
NAs or ""s are omitted, so that rows 4 and 5 above would be deleted.

R's read.table differs in this respect (and a number of really neat
ones).  I probably know enough Perl to be able to write a short script
that could delete such lines, and it's not difficult to remove the
rows from the dataframe afterwards; but maybe there's something simple
I've misunderstood in the use of R's read.table.

I can't use na.omit since the other NAs in the data can be dealt with
so I don't want them removed.  Other suggestions welcome.

Thanks

-- 
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand 
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world`s largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you`ve seen it.  ---Steven Wright 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~

[R] confusion about what to expect?

2003-09-23 Thread A.J. Rossini


In playing around with data.frames (and wanting a simple, cheap way to
use the variable and case names in plots; but I've solved that with
some hacks, yech), I noticed the following behavior with subsetting. 


testdata <- data.frame(matrix(1:20,nrow=4,ncol=5))
names(testdata) ## expect labels, get them
names(testdata[2,]) ## expect labels, get them
names(testdata[,2]) ## expect labels, but NOT --  STRIPPED OFF??
testdata[,2]  ## would have expected a name (X2) in the front? NOT EXPECTED
testdata[2,]  ## get what I expect
testdata[2,2]  ## just a number, not a sub-data.frame? unexpected
testdata[2,2:3] ## this is a data.frame
testdata[2:3,2:3] ## and this is, too.

> version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status   alpha
major    1
minor    8.0
year     2003
month    09
day      20
language R
> 

I don't have 1.7.1 handy at this location to test, but I would've
expected a data.frame-like object upon subsetting; should I have
expected otherwise?  (granted, a data.frame with just a single
variable could be thought of as silly, but it does have some extra
information that might be worthwhile, on occasion?)

I'm not sure that it is a bug, but I was caught by surprise.  If it
isn't a bug, and someone has a concise way to think through this, for
my future reference, I'd appreciate hearing about it.

best,
-tony

-- 
[EMAIL PROTECTED]            http://www.analytics.washington.edu/
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN  Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email

CONFIDENTIALITY NOTICE: This e-mail message and any attachme...{{dropped}}

[R] least squares regression line

2003-09-23 Thread Carmen Fridell

I can't seem to find the command to find the least squares regression line 
for my bivariate data set. Can you please help? ~Carmen

Re: [R] Typical R installation problem

2003-09-23 Thread Marc Schwartz

On Tue, 2003-09-23 at 15:28, [EMAIL PROTECTED] wrote:
> Dear Peter, 
> 
>   I don't know if this is the proper way to ask for help installing R, but if
> not, I presume you can pass this on to the appropriate place.  
> 
>   I'm trying to install the latest binary on Redhat 9, and keep getting the same
> error message no matter what I try.  
> 
> [EMAIL PROTECTED] mhu]# rpm -i R-1.7.1-1.i386.rpm
> warning: R-1.7.1-1.i386.rpm: V3 DSA signature: NOKEY, key ID 97d3544e
> error: Failed dependencies:
> libtcl8.3.so is needed by R-1.7.1-1
> libtk8.3.so is needed by R-1.7.1-1
> [EMAIL PROTECTED] mhu]# exit
> 
> I actually found these two files in an obscure location on my computer, and
> copied them to /usr/lib/ I also set my path to include the obscure location.  SO
> they are actually available.  At least when I enter libtcl8.3.so it is
> apparently read and a segmentation fault occurs.  
> 
> So, do you have any suggestions about how to get around this problem.  I don't
> know what else to do to make these routines available to the R install.  
> 
> Any help would be greatly appreciated.  I had no problem installing R on my MAC
> OS 10.1.5.
> 
>   Thank you very much, 
> 
>   Michael Huston
>   Oak Ridge, Tennessee

First, you have posted to r-help, which is an international e-mail list
and the primary source of assistance with R.

Since I now typically compile from source, I decided to remove my
present installation of R and freshly install the RPM that Martyn has
created on CRAN.  I run RH 9 and have a clean and fully updated
installation. The aforementioned tcl/tk files are in /usr/lib on my
system.

I installed the RPM without problem and then installed John Fox's Rcmdr
to test the tcltk functionality. It works without problem. I also tested
a tcl/tk function that I wrote and it works fine as well.

If your above listed files were not in /usr/lib to start with, that may
be (probably is) an indication that something is amiss in your tcl/tk
installation. You may need to remove and reinstall tcl/tk. 

Also, I am unsure as to what you mean by "At least when I enter
libtcl8.3.so it is apparently read and a segmentation fault occurs." Are
you trying to execute it directly from the command line? .so files are
shared libraries (akin to .DLLs in Windows).

HTH,

Marc Schwartz

Re: [R] R Production Performance

2003-09-23 Thread Zitan Broth

Hi James,

Thanks for your response :-)

- Original Message -
> It is like anything else that you want to run as part of web services: what
> do you want it to do?  Yes, it is fast in doing computations, but what will
> you have it do?  It is probably as fast as anything else that you will find
> out there that is fairly general purpose.

I just want to use R for mathematical computations, and will call it via PHP
from the commandline with infile. We'll need to obviously test this
ourselves, but I just thought I'd raise the question :-))

> Are you going to be creating a lot of graphics that have to be displayed
> back on the screen?  How is the user going to input data (flat files, XML,
> Excel worksheets, Oracle database, ...)?  Will you be invoking a unique
> process each time a user calls, or will you be using a 'daemon' that will
> communicate with DCOM and such?  How many people will be trying to access
> it once and what is the mix of transactions that they will use?

Well for sure the rest of the app needs to scale as well and be fast,
failsafe etc..., but I am just asking about R.

I was imagining using a unique process call each time I access R, which is
how the apache/php/*nix environment works best (although keeping processes
in memory is achievable as well).  My experience to date on integration with
C packages deploying to *nix is that this works quite effectively although
certain packages require process management that are not multiprocess (to
ensure that R for example only executes one computation at a time), but this
is no problem. There are ways to call c packages directly with PHP (swig)
and I am investigating this at present.

> You can probably get a real good feel by enclosing the operations that you
> want to do in a "system.time" function to see how long it will take.  This
> really depends on what you are trying to do.  I can definitely say that it
> is faster than trying to code the algorithm in PERL or another scripting
> language.

Makes sense because R is written in C, where PERL and PHP are also written
in C, so R is a "layer deep" so to speak :-)

Thanks again,
Z.

> Greetings All,
>
> Been playing with R and it is very easy to get going with the UI or infile
> batch commands :-)
>
> What I am wondering is how scalable and fast R is for running as part of a
> web service.  I believe R is written in C which is a great start, but what
> are peoples general thoughts on this?
>
> Thanks greatly,
> Z.

[R] install R for Windows problem

2003-09-23 Thread Erin Hodgess

Here are all of the messages:

In file included from internet.c:858:
sock.h.27:conflicting types for 'ssize_t'
c:/mingw/include/sys/types.h:119: previous declaration of 'ssize_t'
MAKE[2]: ***[internet.o] Error 1
MAKE[1]: ***[all] Error 1
MAKE:  ***[rmodules] Error 2

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] loops in Sweave

2003-09-23 Thread Jason Turner

On Wed, 2003-09-24 at 03:06, Millo Giovanni wrote:
> Dear all,
> 
> I was wondering whether there is a way to make loops in Sweave, i.e. for example to:
> 1) calculate a parameter, say, a=length(b)
> 2) according to that, add #a# chapters to the document, each including some 
> repetitive analysis, each time done on a particular subset of the data indexed by 
> the elements of 1:a.
> This would be of great help for repeating exploratory data analyses on, say, 
> questionnaires when the number of questions changes without having to change the 
> Sweave .snw file.

Two possible ways, off the top of my head:

1) Within LaTeX, use \Sexpr{a} to get the length, then loop within
LaTeX.  I believe Lamport includes an example of looping within LaTeX,
but I haven't got the book handy.

2) Within the R chunk, generate the table using xtable() (package
"xtable") or latex() (package "Hmisc") and print directly within R.  I
haven't tried it, but something like
<<>>=
## build your table in R
tt <- xtable(foo)
print(tt)

@

might do the trick.  I'd been meaning to look at this anyway; your
question prompted me ;)

Check the Sweave manual at Herr Dr Leisch's site.
http://www.ci.tuwien.ac.at/~leisch/Sweave
This has the R chunk options required to produce the above, if my
untested example is not correct.
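
A minimal sketch of the second approach, assuming the code sits in a chunk
declared with the Sweave option results=tex, so that whatever cat() writes is
passed through as LaTeX. The data frame `answers` and the helper
`section_tex` are hypothetical names introduced here for illustration.

```r
# Hypothetical data: one column per question.
answers <- data.frame(q1 = c(1, 2, 2, 3),
                      q2 = c(4, 4, 5, 3),
                      q3 = c(2, 1, 1, 2))

# Build the LaTeX for one section: a heading plus a verbatim summary.
section_tex <- function(name, values) {
  c(paste0("\\section{Question ", name, "}"),
    "\\begin{verbatim}",
    capture.output(print(summary(values))),
    "\\end{verbatim}")
}

# One section per column, so the document grows with the data.
tex <- unlist(lapply(names(answers),
                     function(nm) section_tex(nm, answers[[nm]])))
cat(tex, sep = "\n")
```

Because the loop runs over names(answers), adding or dropping a question
changes the number of sections without touching the source file.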

Cheers

Jason

-- 
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
+64-(0)21-343-545

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] install problem with R Windows

2003-09-23 Thread Erin Hodgess

Dear R People:

I'm trying to install R 1.7.1 for Windows from Source.

The error that I get is:
previous declaration of 'ssize_t'
MAKE[2]: ***[internet.o]Error 1
MAKE[1]: ***[all]Error 1
MAKE: *** [rmodules] Error 2

Any ideas on how to proceed, please?

thanks in advance!

Sincerely,
Erin Hodgess
mailto: [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] Typical R installation problem

2003-09-23 Thread mahustonor

Dear Peter, 

I don't know if this is the proper way to ask for help installing R, but if
not, I presume you can pass this on to the appropriate place.  

I'm trying to install the latest binary on Redhat 9, and keep getting the same
error message no matter what I try.  

[EMAIL PROTECTED] mhu]# rpm -i R-1.7.1-1.i386.rpm
warning: R-1.7.1-1.i386.rpm: V3 DSA signature: NOKEY, key ID 97d3544e
error: Failed dependencies:
libtcl8.3.so is needed by R-1.7.1-1
libtk8.3.so is needed by R-1.7.1-1
[EMAIL PROTECTED] mhu]# exit

I actually found these two files in an obscure location on my computer, and
copied them to /usr/lib/. I also set my path to include the obscure location,
so they are actually available.  At least, when I enter libtcl8.3.so it is
apparently read and a segmentation fault occurs.

So, do you have any suggestions about how to get around this problem?  I don't
know what else to do to make these routines available to the R install.

Any help would be greatly appreciated.  I had no problem installing R on my MAC
OS 10.1.5.

Thank you very much, 

Michael Huston
Oak Ridge, Tennessee

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] bug in stack?

2003-09-23 Thread apjaworski

I am posting it here because I am not sure if the behavior described below
is actually a bug.

If I do something like this:

> x1 <- 1:3
> x2 <- 5:9
> x3 <- 21:27
> ll <- list(x1, x2, x3)
> stack(ll)

I get the following error:

Error in rep.int(names(x), lapply(x, length)) :
invalid number of copies in "rep"

The problem seems to be that the generic list ll is lacking the names
attribute.

The stack.default function (in frametools.R) looks like this:

> stack.default
function (x, ...)
{
x <- as.list(x)
x <- x[unlist(lapply(x, is.vector))]
data.frame(values = unlist(unname(x)), ind = factor(rep.int(names(x),
lapply(x, length))))
}

and the last statement generates an error if names(x) evaluates to NULL.
If we add the following line of code before the last statement

if(is.null(names(x))) names(x) <- seq(along=x)

the stack function will work fine even for "nameless" lists.
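
The same idea can be applied from the user's side, without editing
stack.default: name the list before stacking. This is a sketch of a
workaround under that assumption, not a statement about how stack() behaves
in any particular R version.

```r
x1 <- 1:3
x2 <- 5:9
x3 <- 21:27
ll <- list(x1, x2, x3)

# Give the components names before stacking; the integers 1, 2, 3
# are coerced to the character names "1", "2", "3".
if (is.null(names(ll))) names(ll) <- seq_along(ll)

stacked <- stack(ll)
str(stacked)   # columns: values (the data) and ind (a factor of list names)
```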


Andy

__
Andy Jaworski
Engineering Systems Technology Center
3M Center, 518-1-01
St. Paul, MN 55144-1000
-
E-mail: [EMAIL PROTECTED]
Tel:  (651) 733-6092
Fax:  (651) 736-3122

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] Plotting multiple lines

2003-09-23 Thread Jonathan Baron

On 09/23/03 15:49, K Skanes wrote:
>Hi,
>
>I have a data set with 7 years worth of data, and 2 different values (a real 
>value and a model value) of interest in each year for different lengths.  I 
>would like a plot with the year on the y axis and an animal length along the 
>x axis.  For each year I would like to see two lines, one representing real 
>data and one representing model data.  I would like either the model or real 
>line to be red.

One way to do it is with barplot.  You would put your data in a
matrix m1 in which the columns were animals and the rows were
real/model.  (I might have it backward about rows/columns).  Then
something like barplot(m1,beside=T) and lots of other options for
labels, colors, line thickness, line spacing, etc., although red
just might turn out to be the default.  Look at the help for
barplot.
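
A minimal sketch of the barplot idea for a single year. The numbers, the
length classes and the colours are made up for illustration; only
barplot(m1, beside=TRUE) comes from the suggestion above.

```r
# Hypothetical values: rows are real/model, columns are length classes.
m1 <- rbind(real  = c(12, 18, 25, 30),
            model = c(11, 20, 24, 33))
colnames(m1) <- c("10cm", "20cm", "30cm", "40cm")

pdf(NULL)  # null device so the sketch runs without a display
mids <- barplot(m1, beside = TRUE, col = c("red", "grey"),
                legend.text = rownames(m1),
                xlab = "Animal length", ylab = "Value")
invisible(dev.off())
```

Dropping the pdf(NULL) and dev.off() lines draws on the default device
instead. With beside = TRUE each length class gets a pair of bars, the real
value in red next to the model value in grey.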

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:http://www.sas.upenn.edu/~baron
R page:   http://finzi.psych.upenn.edu/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] Plotting multiple lines

2003-09-23 Thread K Skanes

Hi,

I have a data set with 7 years worth of data, and 2 different values (a real 
value and a model value) of interest in each year for different lengths.  I 
would like a plot with the year on the y axis and an animal length along the 
x axis.  For each year I would like to see two lines, one representing real 
data and one representing model data.  I would like either the model or real 
line to be red.

I have no idea how to start this.  Can anybody help me??  If I didn't 
explain it well enough, I can try to explain it better...

Thank you,
Kay
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] How to extract data from Excel

2003-09-23 Thread Ben Bolker


  Your best bet is saving as a comma-separated value file (.csv) and using 
read.csv to get the data into R.
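
A minimal sketch of that route. The temporary file below stands in for a
sheet saved from Excel as CSV; its path and column names are hypothetical.

```r
# Stand-in for a file produced by Excel's "Save As ... CSV".
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,age,score",
             "11,20,100",
             "12,30,200"), csv_path)

# read.csv() uses the first line as column names by default.
dat <- read.csv(csv_path)
str(dat)
```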

  Ben

On Tue, 23 Sep 2003 [EMAIL PROTECTED] wrote:

> Hi,
> 
> I would like to know how to extract the data from Excel Spreadsheet.
> 
> Thank you very much.
> 
> Melissa

-- 
620B Bartram Hall                            [EMAIL PROTECTED]
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: AW: [R] Rank and extract data from a series

2003-09-23 Thread Tony Plate

Using Thomas Unternährer's handy example, one could also do:

> X <- c(1, 4.5, 2.3, 1, 7.3)
> mean(order(X, decreasing=TRUE)[1:2])
[1] 3.5
>
I think this will give the same results as Thomas Unternährer's suggested 
code in almost all cases, but it is perhaps more concise and direct 
(provided that you don't actually need the values of the top items).

(of course you have to change the 1:2 to 1:10 for your needs).

Note that this question gets tricky if there are ties such that there is no 
unique set of row numbers that identify N "top" items.

For example, consider the following data:

> X <- c(1,3,2,3,4)

Taking "top two", should the answer be 3.5 (avg of row numbers 2 and 5), 
4.5 (avg of row numbers 4 and 5), or 3.67 (avg of row numbers 2,4 and 5)?

> mean(order(X, decreasing=TRUE)[1:2])
[1] 3.5
> order(X, decreasing=TRUE)[1:2]
[1] 5 2
> # Andy Liaw's suggestion:
> mean(which(X %in% sort(X, decreasing=TRUE)[1:2]))
[1] 3.67
> which(X %in% sort(X, decreasing=TRUE)[1:2])
[1] 2 4 5
> # Thomas Unternährer's suggestion:
> mean(match(sort(X, decreasing=TRUE)[1:2], X))
[1] 3.5
> match(sort(X, decreasing=TRUE)[1:2], X)
[1] 5 2
>
hope this helps,

Tony Plate

At Tuesday 02:23 PM 9/23/2003 +0200, Unternährer Thomas, uth wrote:

Hi,

>I would like to rank a time-series of data, extract the top ten data
>items from this series, determine the corresponding row numbers for
>each value in the sample, and take a mean of these *row numbers* (not
>the data).

>I would like to do this in R, rather than pre-process the data on the
>UNIX command line if possible, as I need to calculate other statistics
>for the series.

>I understand that I can use 'sort' to order the data, but I am not aware
>of a function in R that would allow me to extract a given number of
>these data and then determine their positions within the original time
>series.

>e.g.

>Time series:

>1.0 (row 1)
>4.5 (row 2)
>2.3 (row 3)
>1.0 (row 4)
>7.3 (row 5)
>Sort would give me:

>1.0
>1.0
>2.3
>4.5
>7.3
>I would then like to extract the top two data items:

>4.5
>7.3
>and determine their positions within the original (unsorted) time series:

>4.5 = row 2
>7.3 = row 5
>then take a mean:

>2 and 5 = 3.5

>Thanks in advance.

>James Brown

X <- c(1, 4.5, 2.3, 1, 7.3)
X1 <- sort(X, decreasing=TRUE)[1:2]
X2 <- match(X1, X)
mean(X2)


Hope this helps

Thomas

___

James Brown

Cambridge Coastal Research Unit (CCRU)
Department of Geography
University of Cambridge
Downing Place
Cambridge
CB2 3EN, UK
Telephone: +44 (0)1223 339776
Mobile: 07929 817546
Fax: +44 (0)1223 355674
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]
http://www.geog.cam.ac.uk/ccru/CCRU.html
___




On Wed, 10 Sep 2003, Jerome Asselin wrote:

> On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> >
> > Your method looks like a naive reimplementation of integration, and
> > won't work so well for distributions that have the great majority of
> > the probability mass concentrated in a small fraction of the sample
> > space.  I was hoping for something that would retain the
> > adaptability of integrate().
>
> Yesterday, I've suggested to use approxfun(). Did you consider my
> suggestion? Below is an example.
>
> N <- 500
> x <- rexp(N)
> y <- rank(x)/(N+1)
> empCDF <- approxfun(x,y)
> xvals <- seq(0,4,.01)
> plot(xvals,empCDF(xvals),type="l",
> xlab="Quantile",ylab="Cumulative Distribution Function")
> lines(xvals,pexp(xvals),lty=2)
> legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
>
>
> It's possible to tune in some parameters in approxfun() to better
> match your personal preferences. Have a look at help(approxfun) for
> details.
>
> HTH,
> Jerome Asselin
Tony Plate   [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] discretization method

2003-09-23 Thread Edgar Acuna

There are plenty of discretization methods; which one are you looking for?
Regards,
Edgar Acuna

On Tue, 23 Sep 2003, Jaime Lopez Carvajal wrote:

> Hi R users
>
> I need to apply discretization  to my continuous data.
> Is there a method in R to do this?
>
> Thanks in advance,
>
> Jaime
>

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] How to extract data from Excel

2003-09-23 Thread Uwe Ligges

[EMAIL PROTECTED] wrote:

> Hi,
>
> I would like to know how to extract the data from Excel Spreadsheet.

I would like to know whether you have read the manual "R Data
Import/Export" before posting the question. It tells you about
more than one way.

Uwe Ligges

> Thank you very much.
>
> Melissa
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] How to extract data from Excel

2003-09-23 Thread Melissa_Kuang

Hi,

I would like to know how to extract the data from Excel Spreadsheet.

Thank you very much.

Melissa



JLT Risk Solutions Ltd
6 Crutched Friars, London EC3N 2PH. Co Reg No 1536540
Tel: (44) (0)20 7528 4000   Fax: (44) (0)20 7528 4500
http://www.jltgroup.com
Lloyd's Broker.  Regulated by the General Insurance
Standards Council

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] filled.contour without box

2003-09-23 Thread Martin Maechler

> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
> on Tue, 23 Sep 2003 11:30:02 +0200 writes:

UweL> Jan Kleinn wrote:
>> Dear all,
>> 
>> I would like to make a filled contour plot without the
>> box R is generating by default around the plotting area,
>> i.e. I'm looking for an option in filled.contour similar
>> to 'axes=F' in 'contour' or in 'plot'.  I couldn't find
>> any option to get rid of the box, any help is welcome.
>> 
>> Thanks, Jan:-)

[ filled.contour()  does have an `axes = FALSE' option but that
  eliminates axes both on the image _and_ on the key/legend.

  As Roger Peng has just noted you can get rid of the axes of
  the image alone, using  `plot.axes = {}',
  however as you say, it's the box, not the axes you want to get
  rid of
]

UweL> Easy to add a corresponding feature:

UweL>   fix(filled.contour)

UweL> Now, remove the two lines including "box()".  Or even
UweL> better, add an argument to turn plotting of the box on
UweL> or off.

R-1.8.0  will also have an argument  'frame.plot'
which can be set to FALSE to eliminate the box around the plot.
Note that it does not eliminate the box around the legend.
Since I think this is hardly desired this is still not an option
of the future filled.contour().
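
A minimal sketch of the 'frame.plot' argument described above, assuming an R
version in which filled.contour() has it (the check on formals() below is
there for that reason). The volcano matrix is a data set shipped with R and
is used only as sample input.

```r
# Confirm that this R version's filled.contour() has the argument.
has_frame_arg <- "frame.plot" %in% names(formals(filled.contour))

pdf(NULL)  # null device so the sketch runs without a display
if (has_frame_arg) {
  # No box around the plotting area; the legend keeps its own box.
  filled.contour(volcano, frame.plot = FALSE)
}
invisible(dev.off())
```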

Martin Maechler <[EMAIL PROTECTED]> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228   <><

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] (Fwd) Re: goodfit macro

2003-09-23 Thread Achim Zeileis

On Tuesday 23 September 2003 16:38, Michael Shott wrote:

> Dear R-Help:
>
> As you can see, Prof. Friendly refers me to your site for an
> executable version of vcd.  I don't mean to be obtuse, but 15
> minutes spent exploring your site failed to locate a downloadable
> version of the vcd package to which he referred.
>
> I know plainly what this application can do.  What I need to know is
> how to obtain the application itself.

Michael,

you need to install the R system first and then the add-on package vcd.
If you look at 
  http://www.R-project.org/
there are some manuals and the FAQ which will tell you how to obtain 
and install R on your operating system. Having done that you can 
install the package "vcd" which is also covered in the documentation I 
mentioned above.
Both R itself and the vcd package can be downloaded from
  http://CRAN.R-project.org/

HTH,
Z

> My thanks in advance for any help you can provide.
>
> MS
>
>
>
> --- Forwarded message follows ---
> Date sent:Tue, 23 Sep 2003 09:14:28 -0400
> From: Michael Friendly <[EMAIL PROTECTED]>
> Subject:  Re: goodfit macro
> To:   [EMAIL PROTECTED]
> Send reply to:[EMAIL PROTECTED]
> Organization: York University
>
> You can also carry out goodness of fit tests using the function
> goodfit in the vcd package
> for R (a free version of S/Splus),
> http://www.r-project.org
>
> -Michael
>
> Michael Shott wrote:
> >Dear Michael:
> >
> >Recently I searched the web for guidance in gauging the
> > goodness-of- fit of frequency data to statistical models.  Your
> > website came up in virtually every combination of keywords
> > searched in Google.  The site describes the goodfit.sas macro,
> > which seems to do exactly what I'd like.  But it seems to be
> > available only as a .sas file.  Do you have some application file
> > that can be downloaded and used?
> >
> >I ask because I'd like to analyze some archaeological data.  They
> > are frequency distributions of the number of a particular artifact
> > type by site.  That is, x sites may have 1 occurrence, y sites may
> > have 2 occurrences, z sites 3 occurrences and so on.  Your goodfit
> > macro seems to do the job, but I can't execute it from the .sas
> > version that appears on your website's ftp link.
> >
> >Thanks for any help that you can provide.
> >
> >Best,
> >
> >Mike Shott
> >
> >Michael J. Shott
> >Professor
> >Dept. of Sociology, Anthropology & Criminology
> >University of Northern Iowa
> >Cedar Falls, IA 50614-0513
> >319/273-7337

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

RE: [R] what does the sum of square of Gaussian RVs with different variance obey?

2003-09-23 Thread RBaskin

This is a relatively recent article that is somewhat accessible.
Jensen, D. R., and Solomon, Herbert (1994), "Approximations to joint
distributions of definite quadratic forms", Journal of the American
Statistical Association, 89, 480-486.
It has references to previous work.

I also have an old paper that is so old I can't tell what journal it came
out of:(
Grad, Arthur and Solomon, Herbert "Distribution of Quadratic Forms and Some
Applications" probably published in 55 or 56 but I can't tell.  The paper by
Grad and Solomon uses the moment generating function to give the exact
distribution and various approximations to produce a table for a sum of 2 or
3 variates.

Usual disclaimers ...
Bob

-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, September 23, 2003 10:07 AM
To: Jean Sun
Cc: [EMAIL PROTECTED]
Subject: Re: [R] what does the sum of square of Gaussian RVs with different
variance obey?

On Tue, 23 Sep 2003, Jean Sun wrote:

> From basic statistical principles, we know that, given several i.i.d. Gaussian
> RVs with zero or nonzero mean, the sum of their squares is a central or
> noncentral chi-squared RV. However, if these Gaussian RVs have
> different variances, what distribution does the sum of their squares follow?
>

Nothing very useful.  It's a mixture of chisquare(1) variables. One
standard approach is to approximate it by a multiple of a chisquared
distribution that has the correct mean and variance.

-thomas

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

AW: [R] weighted standard deviation

2003-09-23 Thread Schnitzler, Johannes

Thank you all for the reply,

the Hmisc library is exactly what i was looking for.

Johannes

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] (Fwd) Re: goodfit macro

2003-09-23 Thread Uwe Ligges

Michael Shott wrote:

> Dear R-Help:
>
> As you can see, Prof. Friendly refers me to your site for an
> executable version of vcd.  I don't mean to be obtuse, but 15 minutes
> spent exploring your site failed to locate a downloadable version of the
> vcd package to which he referred.
>
> I know plainly what this application can do.  What I need to know is
> how to obtain the application itself.
>
> My thanks in advance for any help you can provide.
>
> MS


Well, vcd is a contributed package, not a stand-alone executable.
First, you need to install R; after that, just type
  install.packages("vcd")
if you are connected to the internet, and the package will be installed.
Then, you can use the function:
  library(vcd)
  goodfit(.)
But you will need to learn some basics of R at first, e.g. from "An 
Introduction to R" (a manual that comes with R), in order to import your 
data etc.
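
A minimal sketch of what the goodfit() call might look like for data of the
kind described (so many sites with 0 occurrences, so many with 1, and so on).
The frequencies are invented for illustration, the Poisson model is only one
possible choice, and the vcd package must already be installed; the guard
skips the fit if it is not.

```r
# Hypothetical frequency data: number of sites with 0, 1, 2, ... occurrences.
occurrences <- 0:4
n_sites     <- c(20, 15, 8, 4, 1)

# Expand to one count per site, the vector form goodfit() accepts.
counts <- rep(occurrences, times = n_sites)

if (requireNamespace("vcd", quietly = TRUE)) {
  gf <- vcd::goodfit(counts, type = "poisson")
  print(summary(gf))   # goodness-of-fit test for the fitted Poisson
}
```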

Uwe Ligges





--- Forwarded message follows ---
Date sent:  Tue, 23 Sep 2003 09:14:28 -0400
From:   Michael Friendly <[EMAIL PROTECTED]>
Subject:Re: goodfit macro
To: [EMAIL PROTECTED]
Send reply to:  [EMAIL PROTECTED]
Organization:   York University
You can also carry out goodness of fit tests using the function goodfit 
in the vcd package
for R (a free version of S/Splus),
http://www.r-project.org

-Michael

Michael Shott wrote:


Dear Michael:

Recently I searched the web for guidance in gauging the goodness-of-
fit of frequency data to statistical models.  Your website came up in 
virtually every combination of keywords searched in Google.  The site 
describes the goodfit.sas macro, which seems to do exactly what I'd 
like.  But it seems to be available only as a .sas file.  Do you have 
some application file that can be downloaded and used?

I ask because I'd like to analyze some archaeological data.  They are 
frequency distributions of the number of a particular artifact type by 
site.  That is, x sites may have 1 occurrence, y sites may have 2 
occurrences, z sites 3 occurrences and so on.  Your goodfit macro 
seems to do the job, but I can't execute it from the .sas version that 
appears on your website's ftp link.

Thanks for any help that you can provide.

Best,

Mike Shott

Michael J. Shott
Professor
Dept. of Sociology, Anthropology & Criminology
University of Northern Iowa
Cedar Falls, IA 50614-0513
319/273-7337




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] loops in Sweave

2003-09-23 Thread Millo Giovanni

Dear all,

I was wondering whether there is a way to make loops in Sweave, i.e. for example to:
1) calculate a parameter, say, a=length(b)
2) according to that, add #a# chapters to the document, each including some repetitive 
analysis, each time done on a particular subset of the data indexed by the elements of 
1:a.
This would be of great help for repeating exploratory data analyses on, say, 
questionnaires when the number of questions changes without having to change the Sweave 
.snw file.

Many thanx for your answers

Giovanni Millo
R&D Dept.
Assicurazioni Generali SpA
Trieste, Italy




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] discretization method

2003-09-23 Thread Ben Bolker


Consider:

 ?cut
 ?round

  With more detail we might be able to help more ...
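
A minimal sketch of cut() on made-up values, assuming that "discretization"
here means binning a numeric vector into intervals. The break points and
labels are illustrative.

```r
x <- c(0.3, 1.7, 2.2, 4.9, 3.1, 0.8)

# Three equal-width bins over the range of x.
bins <- cut(x, breaks = 3)

# Explicit break points with labels; the intervals are
# (0,2], (2,4] and (4,6], closed on the right by default.
groups <- cut(x, breaks = c(0, 2, 4, 6), labels = c("low", "mid", "high"))
table(groups)
```

Both calls return a factor, so the result can go straight into table() or a
model formula.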

On Tue, 23 Sep 2003, Jaime Lopez Carvajal wrote:

> Hi R users
> 
> I need to apply discretization  to my continuous data.
> Is there a method in R to do this?
> 
> Thanks in advance,
> 
> Jaime
> 

-- 
620B Bartram Hall                            [EMAIL PROTECTED]
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] discretization method

2003-09-23 Thread Thomas W Blackwell

On Tue, 23 Sep 2003, Jaime Lopez Carvajal wrote:

> I need to apply discretization  to my continuous data.
> Is there a method in R to do this?

See  help("cut").

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] (Fwd) Re: goodfit macro

2003-09-23 Thread Michael Shott

Dear R-Help:

As you can see, Prof. Friendly refers me to your site for an 
executable version of vcd.  I don't mean to be obtuse, but 15 minutes 
spent exploring your site failed to locate a downloadable version of the 
vcd package to which he referred.

I know plainly what this application can do.  What I need to know is 
how to obtain the application itself.

My thanks in advance for any help you can provide.

MS



--- Forwarded message follows ---
Date sent:  Tue, 23 Sep 2003 09:14:28 -0400
From:   Michael Friendly <[EMAIL PROTECTED]>
Subject:Re: goodfit macro
To: [EMAIL PROTECTED]
Send reply to:  [EMAIL PROTECTED]
Organization:   York University

You can also carry out goodness of fit tests using the function goodfit 
in the vcd package
for R (a free version of S/Splus),
http://www.r-project.org

-Michael

Michael Shott wrote:

>Dear Michael:
>
>Recently I searched the web for guidance in gauging the goodness-of-
>fit of frequency data to statistical models.  Your website came up in 
>virtually every combination of keywords searched in Google.  The site 
>describes the goodfit.sas macro, which seems to do exactly what I'd 
>like.  But it seems to be available only as a .sas file.  Do you have 
>some application file that can be downloaded and used?
>
>I ask because I'd like to analyze some archaeological data.  They are 
>frequency distributions of the number of a particular artifact type by 
>site.  That is, x sites may have 1 occurrence, y sites may have 2 
>occurrences, z sites 3 occurrences and so on.  Your goodfit macro 
>seems to do the job, but I can't execute it from the .sas version that 
>appears on your website's ftp link.
>
>Thanks for any help that you can provide.
>
>Best,
>
>Mike Shott
>
>Michael J. Shott
>Professor
>Dept. of Sociology, Anthropology & Criminology
>University of Northern Iowa
>Cedar Falls, IA 50614-0513
>319/273-7337
>  
>


-- 
Michael Friendly     Email: [EMAIL PROTECTED]
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA


--- End of forwarded message ---
Michael J. Shott
Professor
Dept. of Sociology, Anthropology & Criminology
University of Northern Iowa
Cedar Falls, IA 50614-0513
319/273-7337

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] discretization method

2003-09-23 Thread Jaime Lopez Carvajal

Hi R users

I need to apply discretization  to my continuous data.
Is there a method in R to do this?

Thanks in advance,

Jaime
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] problems installing Design and Hmisc libs

2003-09-23 Thread Thomas Lumley

On Tue, 23 Sep 2003, Federico Calboli wrote:

> Dear All,
>
> when I try to:
>
> install.packages("Design"); install.packages("Hmisc")
>
> I get the following error messages:
>
> * Installing *source* package 'Design' ...
> ** libs
> g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
> -mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
> -mcpu=pentiumpro -c lrmfit.f -o lrmfit.o
> make: g77: Command not found
> make: *** [lrmfit.o] Error 127
> ERROR: compilation failed for package 'Design'
>
> * Installing *source* package 'Hmisc' ...
> ** libs
> g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
> -mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
> -mcpu=pentiumpro -c cidxcn.f -o cidxcn.o
> make: g77: Command not found
> make: *** [cidxcn.o] Error 127
> ERROR: compilation failed for package 'Hmisc'
>
> The connection to CRAN and the download are fine though.
>
> I am using R 1.7.1 under Mandrake Linux 9.1. My C compiler is gcc 3.2.2.
>
> Any idea how to install the packages?


You need a Fortran compiler, such as g77.

If you have g77 then it looks like it has moved since you compiled R.  If
you don't, then you presumably installed R binaries, and you need g77.
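
One way to check from inside R whether a Fortran compiler is on the search
path is sketched below. Looking for gfortran as well as g77 is an assumption
added here (it is the later GNU Fortran compiler); the thread itself only
mentions g77.

```r
# Sys.which() returns the full path of each program, or "" if not found.
compilers <- Sys.which(c("g77", "gfortran"))
print(compilers)

have_fortran <- any(nzchar(compilers))
if (!have_fortran) {
  message("No Fortran compiler found on the PATH; ",
          "source packages with Fortran code will not build.")
}
```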


-thomas

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] what does the sum of square of Gaussian RVs with different variance obey?

2003-09-23 Thread Thomas Lumley

On Tue, 23 Sep 2003, Jean Sun wrote:

> From basic statistical principles, we know that, given several i.i.d. Gaussian
> RVs with zero or nonzero mean, the sum of their squares is a central or
> noncentral chi-squared RV. However, if these Gaussian RVs have
> different variances, what distribution does the sum of their squares follow?
>

Nothing very useful.  It's a mixture of chisquare(1) variables. One
standard approach is to approximate it by a multiple of a chisquared
distribution that has the correct mean and variance.
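
The moment-matching approximation just described can be sketched as follows,
assuming independent zero-mean Gaussians with variances sigma2[i]. The
function name chisq_approx and the example variances are hypothetical.

```r
# Q = sum(sigma2[i] * Z[i]^2) with Z[i] iid N(0,1) has
#   E[Q] = sum(sigma2),  Var[Q] = 2 * sum(sigma2^2).
# scale * chisq(df) has mean scale * df and variance 2 * scale^2 * df.
# Equating the two pairs gives the scale and df below.
chisq_approx <- function(sigma2) {
  scale <- sum(sigma2^2) / sum(sigma2)
  df    <- sum(sigma2)^2 / sum(sigma2^2)
  c(scale = scale, df = df)
}

sigma2 <- c(1, 4, 9)   # hypothetical variances
p <- chisq_approx(sigma2)

# Compare simulated quantiles of Q with those of the approximation.
set.seed(42)
z <- matrix(rnorm(3 * 20000), nrow = 3)   # row i gets variance sigma2[i]
q <- colSums(sigma2 * z^2)
probs <- c(0.5, 0.9, 0.99)
print(rbind(simulated = quantile(q, probs),
            approx    = p[["scale"]] * qchisq(probs, df = p[["df"]])))
```

With equal variances the approximation is exact: scale reduces to the common
variance and df to the number of terms.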

-thomas

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] filled.contour without box

2003-09-23 Thread Roger D. Peng

If you just want to get rid of the axes, you can do

filled.contour(x, plot.axes = { })

-roger

Uwe Ligges wrote:

> Jan Kleinn wrote:
>
>> Dear all,
>>
>> I would like to make a filled contour plot without the box R is
>> generating by default around the plotting area, i.e. I'm looking for
>> an option in filled.contour similar to 'axes=F' in 'contour' or in
>> 'plot'. I couldn't find any option to get rid of the box, any help is
>> welcome.
>>
>> Thanks, Jan:-)
>
> Easy to add a corresponding feature:
>
>  fix(filled.contour)
>
> Now, remove the two lines including "box()".
> Or even better, add an argument to turn plotting of the box on or off.
>
> Uwe Ligges

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] Marginal Means with the lme()

2003-09-23 Thread [EMAIL PROTECTED]


I need help computing model-predicted estimated marginal means of the dependent 
variable in the cells defined by the fixed factors, including interactions. By 
estimated marginal means, I just mean cell means collapsed over all other factors and 
adjusted for the mean values of other covariates in the model.  Basically, I need an 
lme() analog of the model.tables() function for an aov() object.  model.tables() does 
not work for lme() fits.  Does anyone know if there is an equivalent of model.tables() for 
lme?

 

Thank you,

Scott



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

AW: [R] Rank and extract data from a series

2003-09-23 Thread "Unternährer Thomas, uth"


Hi,



>I would like to rank a time-series of data, extract the top ten data items from this
>series, determine the corresponding row numbers for each value in the sample, and take
>a mean of these *row numbers* (not the data).

>I would like to do this in R, rather than pre-process the data on the UNIX command
>line if possible, as I need to calculate other statistics for the series.

>I understand that I can use 'sort' to order the data, but I am not aware of a
>function in R that would allow me to extract a given number of these data and then
>determine their positions within the original time series.

>e.g.

>Time series:

>1.0 (row 1)
>4.5 (row 2)
>2.3 (row 3)
>1.0 (row 4)
>7.3 (row 5)

>Sort would give me:

>1.0
>1.0
>2.3
>4.5
>7.3

>I would then like to extract the top two data items:

>4.5
>7.3

>and determine their positions within the original (unsorted) time series:

>4.5 = row 2
>7.3 = row 5

>then take a mean:

>2 and 5 = 3.5

>Thanks in advance.

>James Brown

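X <- c(1, 4.5, 2.3, 1, 7.3)
# order() gives row numbers directly; match(sort(X), X) would return only the
# first position when the top values are tied
X2 <- order(X, decreasing=TRUE)[1:2]   # row numbers of the two largest values
X1 <- X[X2]                            # the values themselves: 7.3 4.5
mean(X2)                               # 3.5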



Hope this helps

Thomas



RE: [R] Rank and extract data from a series

2003-09-23 Thread Liaw, Andy

Here's one way.  Suppose your "time series" is in a vector called "x".

top10 <- sort(x, decreasing=TRUE)[1:10]
mean.index <- mean(which(x %in% top10))
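
A variant worth considering when the series contains tied values: %in% returns every row whose value appears in the top set, so ties can yield more than the requested number of rows. A minimal sketch using order(), with k and the example vectors chosen here purely for illustration:

```r
x <- c(1.0, 4.5, 2.3, 1.0, 7.3)   # the example series from the question
k <- 2                            # number of top items wanted

# order() returns row numbers sorted by value, so the first k are the answer.
top_idx <- order(x, decreasing = TRUE)[seq_len(k)]
mean(top_idx)                     # 3.5, i.e. the mean of rows 5 and 2

# With ties, the two approaches can return different numbers of rows.
x_tied <- c(3, 7, 7, 1)
n_in    <- length(which(x_tied %in% sort(x_tied, decreasing = TRUE)[1]))  # 2 rows
n_order <- length(order(x_tied, decreasing = TRUE)[1])                    # 1 row
```

Which behaviour is preferable depends on whether tied values should all count towards the mean row number.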

HTH,
Andy

> -Original Message-
> From: James Brown [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, September 23, 2003 7:51 AM
> To: [EMAIL PROTECTED]
> Subject: [R] Rank and extract data from a series
> 
> 
> 
> I would like to rank a time-series of data, extract the top 
> ten data items from this series, determine the corresponding 
> row numbers for each value in the sample, and take a mean of 
> these *row numbers* (not the data).
> 
> I would like to do this in R, rather than pre-process the 
> data on the UNIX command line if possible, as I need to 
> calculate other statistics for the series.
> 
> I understand that I can use 'sort' to order the data, but I 
> am not aware of a function in R that would allow me to 
> extract a given number of these data and then determine their 
> positions within the original time series.
> 
> e.g.
> 
> Time series:
> 
> 1.0 (row 1)
> 4.5 (row 2)
> 2.3 (row 3)
> 1.0 (row 4)
> 7.3 (row 5)
> 
> Sort would give me:
> 
> 1.0
> 1.0
> 2.3
> 4.5
> 7.3
> 
> I would then like to extract the top two data items:
> 
> 4.5
> 7.3
> 
> and determine their positions within the original (unsorted) 
> time series:
> 
> 4.5 = row 2
> 7.3 = row 5
> 
> then take a mean:
> 
> 2 and 5 = 3.5
> 
> Thanks in advance.
> 
> James Brown
> 

RE: [R] R-project [.com?] [.net?]

2003-09-23 Thread Liaw, Andy

> From: Murray Jorgensen [mailto:[EMAIL PROTECTED] 
> 
> I got a shock a few days ago when I accidentally visited 
> www.r-project.com . I thought that the r-project site had been hacked 

This one seems to be about some sort of city revival project in Japan.
(The introduction starts with "Recycle, Redesign, Rethink, Refine, Restore,
Recreation...".  I can't read Japanese; I am just guessing from the Chinese
characters sprinkled through the text.)

> until I realised my mistake. There is also a site www.r-project.net. 

No indication what this is about.  Seems to say the site is still under
construction.

Andy

> Both of these sites appear to be Japanese. Does anyone know anything 
> about them? I suppose that it is not unusual for names close to those of 
> popular sites to be used. It is good that they use a different language 
> or there might well be confusion.
> 
> Murray
> 
> -- 
> Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
> Department of Statistics, University of Waikato, Hamilton, New Zealand
> Email: [EMAIL PROTECTED]    Fax 7 838 4155
> Phone  +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
> 

[R] Rank and extract data from a series

2003-09-23 Thread James Brown

I would like to rank a time-series of data, extract the top ten data items
from this series, determine the corresponding row numbers for each value
in the sample, and take a mean of these *row numbers* (not the data).

I would like to do this in R, rather than pre-process the data on the
UNIX command line if possible, as I need to calculate other statistics
for the series.

I understand that I can use 'sort' to order the data, but I am not aware
of a function in R that would allow me to extract a given number of these
data and then determine their positions within the original time series.

e.g.

Time series:

1.0 (row 1)
4.5 (row 2)
2.3 (row 3)
1.0 (row 4)
7.3 (row 5)

Sort would give me:

1.0
1.0
2.3
4.5
7.3

I would then like to extract the top two data items:

4.5
7.3

and determine their positions within the original (unsorted) time series:

4.5 = row 2
7.3 = row 5

then take a mean:

2 and 5 = 3.5

Thanks in advance.

James Brown

___

James Brown

Cambridge Coastal Research Unit (CCRU)
Department of Geography
University of Cambridge
Downing Place
Cambridge
CB2 3EN, UK

Telephone: +44 (0)1223 339776
Mobile: 07929 817546
Fax: +44 (0)1223 355674

E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

http://www.geog.cam.ac.uk/ccru/CCRU.html
___

On Wed, 10 Sep 2003, Jerome Asselin wrote:

> On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> >
> > Your method looks like a naive reimplementation of integration, and
> > won't work so well for distributions that have the great majority of the
> > probability mass concentrated in a small fraction of the sample space.
> >  I was hoping for something that would retain the adaptability of
> > integrate().
>
> Yesterday, I've suggested to use approxfun(). Did you consider my
> suggestion? Below is an example.
>
> N <- 500
> x <- rexp(N)
> y <- rank(x)/(N+1)
> empCDF <- approxfun(x,y)
> xvals <- seq(0,4,.01)
> plot(xvals,empCDF(xvals),type="l",
> xlab="Quantile",ylab="Cumulative Distribution Function")
> lines(xvals,pexp(xvals),lty=2)
> legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
>
>
> It's possible to tune in some parameters in approxfun() to better match
> your personal preferences. Have a look at help(approxfun) for details.
>
> HTH,
> Jerome Asselin
>

Re: [R] problems installing Design and Hmisc libs

2003-09-23 Thread Federico Calboli


> You need the Fortran compiler (g77) as well (as indicated by the error 
> message).  Is it installed? Is it in your path?
> 
> Uwe Ligges
> 
It was not installed. Now everything works.

Cheers,
Federico

-- 



=

Federico C. F. Calboli

Department of Biology
University College London
Darwin Building 
Gower Street
London
WC1E 6BT

tel: 020 7679 4395
fax: 020 7679 7096

f.calboli at ucl.ac.uk


RE: [R] Date on x-axis of xyplot

2003-09-23 Thread Heywood, Giles

One thing you could do is to use the 'its' (irregular time-series) 
package on CRAN.

e.g. using a trivial dataset

require(its)
its.format("%Y-%m-%d") #defines text format of dates in dimnames
df <- data.frame(1:3,(1:3)^2)
dimnames(df) <- list(c("2003-01-03","2003-01-06","2003-01-07"),letters[1:2])
plot(its(as.matrix(df)),type="p")

or more simply

mat <- structure(1:6, dim=c(3,2),
                 dimnames=list(c("2003-01-03","2003-01-06","2003-01-07"),
                               letters[1:2]))
plot(its(mat),type="p")

Clearly plot does not provide the same functionality as the 
beautifully crafted xyplot of lattice, but dates do get handled
as an abscissa. Incidentally the implementation of 'its' uses the 
POSIXct class, but this is largely encapsulated.
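
For comparison, a minimal base-graphics sketch of dates on the x-axis. It assumes a version of R that provides the Date class and axis.Date(), which may not match the installation discussed in this thread. The data and the helper name plot_by_date are made up for the example.

```r
# Hypothetical data: three survey dates and one value per date.
dates <- as.Date(c("2003-01-03", "2003-01-06", "2003-01-07"))
pct   <- c(1, 4, 9)

# Draw the points without the default x-axis, then add an axis labelled with dates.
plot_by_date <- function(dates, y, fmt = "%d%b%Y") {
  plot(dates, y, xaxt = "n", xlab = "Date of Survey", ylab = "Percent Support")
  axis.Date(1, at = dates, format = fmt)
  invisible(format(dates, fmt))   # the tick labels, returned for inspection
}

if (interactive()) plot_by_date(dates, pct)
```

The fmt argument takes the usual strftime-style codes, so the day-month-year style of the 'date' package can be imitated or replaced by an ISO format.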

Giles

> -Original Message-
> From: Charles H. Franklin [mailto:[EMAIL PROTECTED]
> Sent: 17 September 2003 04:00
> To: [EMAIL PROTECTED]
> Subject: [R] Date on x-axis of xyplot
> 
> 
> xyplot doesn't seem to want to label my x-axis with dates but instead puts
> the day-number for each date.
> 
> begdate is the number of days since January 1, 1960 and was initially
> created by
> 
> library(date)
> 
> ...
> 
> polls$begdate<-mdy.date(begmm,begdd,begyy)
> 
> I create a new dataframe (pollstack) which includes begdate. In the process
> begdate seems to lose its date attribute so I redo it as:
> 
> > pollstack$begdate<-as.date(pollstack$begdate)
> 
> after which
> 
> > attach(pollstack)
> > summary(pollstack)
>      begdate              pct             names
>  First :15Nov2002   Min.   : 0.000   Clark   : 54
>  Last  :10Sep2003   1st Qu.: 2.000   Dean    : 54
>                     Median : 5.000   Edwards : 54
>                     Mean   : 6.991   Gephardt: 54
>                     3rd Qu.:12.000   Graham  : 54
>                     Max.   :29.000   Kerry   : 54
>                                      (Other) :216
> >
> 
> And all seems well.
> 
> But xyplot continues to use day number on the x-axis. My plots are created
> by
> 
>  print(xyplot(pct ~ begdate | names, pch=2, cex=.2,
>        prepanel = function(x, y) prepanel.loess(x, y, span = 1),
>        main="2004 Democratic Primary Race",
>        xlab = "Date of Survey",
>        ylab = "Percent Support",
>        panel = function(x, y) {
>            panel.grid(h=-1, v= -1)
>            panel.xyplot(x, y, pch=1,col=2,cex=.7)
>            panel.loess(x,y, span=.65, lwd=2,col=4)
>        }))
> 
> What am I missing?
> 
> Thanks!
> 
> Charles
> 
> 
> 
> /**
> ** Charles H. Franklin
> ** Professor, Political Science
> ** University of Wisconsin, Madison
> ** 1050 Bascom Mall
> ** Madison, WI 53706
> ** 608-263-2022 Office
> ** 608-265-2663 Fax
> ** mailto:[EMAIL PROTECTED] (best)
> ** mailto:[EMAIL PROTECTED] (alt)
> ** http://www.polisci.wisc.edu/~franklin
> **/
> 

Re: [R] problems installing Design and Hmisc libs

2003-09-23 Thread Uwe Ligges

Federico Calboli wrote:

Dear All,

when I try to:

install.packages("Design"); install.packages("Hmisc") 

I get the following error messages:

* Installing *source* package 'Design' ...
** libs
g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro -c lrmfit.f -o lrmfit.o
make: g77: Command not found
make: *** [lrmfit.o] Error 127
ERROR: compilation failed for package 'Design'
* Installing *source* package 'Hmisc' ...
** libs
g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro -c cidxcn.f -o cidxcn.o
make: g77: Command not found
make: *** [cidxcn.o] Error 127
ERROR: compilation failed for package 'Hmisc'

The connection to CRAN and the download are fine though.

I am using R 1.7.1 under Mandrake Linux 9.1. My C compiler is gcc 3.2.2.

You need the Fortran compiler (g77) as well (as indicated by the error 
message).  Is it installed? Is it in your path?
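
Whether a compiler is visible on the search path can also be checked from within R. A small sketch, assuming a version of R that provides Sys.which(); the helper name on_path is introduced here for illustration only.

```r
# TRUE if an executable with this name is found on the search path.
on_path <- function(cmd) unname(nzchar(Sys.which(cmd)))

on_path("g77")        # the compiler named in the error message
on_path("gfortran")   # assumption: some systems provide this Fortran compiler instead
```

If the result is FALSE, the compiler is either not installed or its directory is missing from PATH as seen by the R process.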

Uwe Ligges


Any idea how to install the packages?

Regards,

Federico


[R] problems installing Design and Hmisc libs

2003-09-23 Thread Federico Calboli

Dear All,

when I try to:

install.packages("Design"); install.packages("Hmisc") 

I get the following error messages:

* Installing *source* package 'Design' ...
** libs
g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro -c lrmfit.f -o lrmfit.o
make: g77: Command not found
make: *** [lrmfit.o] Error 127
ERROR: compilation failed for package 'Design'

* Installing *source* package 'Hmisc' ...
** libs
g77 -mieee-fp  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro  -O2 -fomit-frame-pointer -pipe -march=i586
-mcpu=pentiumpro -c cidxcn.f -o cidxcn.o
make: g77: Command not found
make: *** [cidxcn.o] Error 127
ERROR: compilation failed for package 'Hmisc'

The connection to CRAN and the download are fine though.

I am using R 1.7.1 under Mandrake Linux 9.1. My C compiler is gcc 3.2.2.

Any idea how to install the packages?

Regards,

Federico

-- 



=

Federico C. F. Calboli

Department of Biology
University College London
Darwin Building 
Gower Street
London
WC1E 6BT

tel: 020 7679 4395
fax: 020 7679 7096

f.calboli at ucl.ac.uk


Re: [R] Plotting of the lm

2003-09-23 Thread Gavin Simpson

Try this:

plot(x, y)
abline(lm(y ~ x))     # plots the fitted line

or this if you need the model results elsewhere:

mod.lm <- lm(y ~ x)   # store the model
plot(x, y)            # plot the data
abline(mod.lm)        # plot the fitted line

see ?plot.lm and ?abline

HTH

Gav

[EMAIL PROTECTED] wrote:

Hi,

I would like to enquire if by typing plot (lm(y~x)) would this show me the
plot of the fitted line? I tried this function previously but I was only
able to get the last 4 plots starting with Residuals vs fitted. 

Thank You.

Melissa



--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC [E] [EMAIL PROTECTED]
UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Re: [R] filled.contour without box

2003-09-23 Thread Uwe Ligges

Jan Kleinn wrote:

Dear all,

I would like to make a filled contour plot without the box R is 
generating by default around the plotting area, i.e. I'm looking for an 
option in filled.contour similar to 'axes=F' in 'contour' or in 'plot'. 
I couldn't find any option to get rid of the box, any help is welcome.

Thanks, Jan:-)

Easy to add a corresponding feature:

 fix(filled.contour)

Now, remove the two lines including "box()".
Or even better, add an argument to turn plotting of the box on or off.
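
Later versions of R appear to offer such an argument already. The sketch below inspects the arguments of the installed filled.contour() before relying on frame.plot; that the argument exists on a given installation is an assumption to be confirmed with args(filled.contour). The helper name draw_without_box is made up for the example.

```r
# Check which arguments this installation's filled.contour() accepts.
has_frame_arg <- "frame.plot" %in% names(formals(filled.contour))

draw_without_box <- function(z) {
  if (has_frame_arg) {
    # No axes and no frame around the main plotting region.
    filled.contour(z = z, axes = FALSE, frame.plot = FALSE)
  } else {
    # Fallback suggested in this thread: suppress the axes only; the box remains.
    filled.contour(z = z, plot.axes = { })
  }
}

if (interactive()) draw_without_box(volcano)   # volcano ships with R
```

The colour key may still be drawn with its own outline, so editing a copy of the function remains the fallback when every box must go.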
Uwe Ligges


Re: [R] Plotting of the lm

2003-09-23 Thread Uwe Ligges

[EMAIL PROTECTED] wrote:

Hi,

I would like to enquire if by typing plot (lm(y~x)) would this show me the
plot of the fitted line? I tried this function previously but I was only
able to get the last 4 plots starting with Residuals vs fitted. 

No, it shows plots for analyses of the residuals.

Try

 plot(x, y)
 abline(lm(y~x))

in order to see the fitted line.
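
A self-contained version of the same idea, with simulated data standing in for x and y (the data and the two helper functions are made up for illustration). It also shows how a single diagnostic plot can be requested through the which argument of plot() for lm objects.

```r
set.seed(42)                         # make the simulated data reproducible
x <- 1:20
y <- 2 + 3 * x + rnorm(length(x))    # straight line plus noise

mod.lm <- lm(y ~ x)                  # fit once, reuse below

show_fit <- function(model, x, y) {
  plot(x, y)                         # scatterplot of the data
  abline(model)                      # fitted regression line on top
}

show_diagnostic <- function(model) {
  plot(model, which = 1)             # only the residuals-versus-fitted plot
}

if (interactive()) {
  show_fit(mod.lm, x, y)
  show_diagnostic(mod.lm)
}
```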

Uwe Ligges



Thank You.

Melissa



[R] filled.contour without box

2003-09-23 Thread Jan Kleinn

Dear all,

I would like to make a filled contour plot without the box R is 
generating by default around the plotting area, i.e. I'm looking for an 
option in filled.contour similar to 'axes=F' in 'contour' or in 'plot'. 
I couldn't find any option to get rid of the box, any help is welcome.

Thanks, Jan:-)


[R] Plotting of the lm

2003-09-23 Thread Melissa_Kuang

Hi,

I would like to enquire if by typing plot (lm(y~x)) would this show me the
plot of the fitted line? I tried this function previously but I was only
able to get the last 4 plots starting with Residuals vs fitted. 

Thank You.

Melissa



JLT Risk Solutions Ltd
6 Crutched Friars, London EC3N 2PH. Co Reg No 1536540
Tel: (44) (0)20 7528 4000   Fax: (44) (0)20 7528 4500
http://www.jltgroup.com
Lloyd's Broker.  Regulated by the General Insurance
Standards Council

The content of this e-mail (including any attachments) as 
received may not be the same as sent. If you consider that 
the content is material to the formation or performance of 
a contract or you are otherwise relying upon its accuracy, 
you should consider requesting a copy be sent by facsimile 
or normal mail.  The information in this e-mail is 
confidential and may be legally privileged. If you are not 
the intended recipient, please notify the sender immediately 
and then delete this e-mail entirely - you must not retain, 
copy, distribute or use this e-mail for any purpose or 
disclose any of its content to others.

Opinions, conclusions and other information in this e-mail 
that do not relate to the official business of JLT Risk 
Solutions Ltd shall be understood as neither given nor 
endorsed by it.  Please note we intercept and monitor 
incoming / outgoing e-mail and therefore you should neither 
expect nor intend any e-mail to be private in nature.

We have checked this e-mail for viruses and other harmful 
components and believe but not guarantee it virus-free prior 
to leaving our computer system.  However, you should satisfy 
yourself that it is free from harmful components, as we do 
not accept responsibility for any loss or damage it may 
cause to your computer systems.


Re: [R] R Production Performance

2003-09-23 Thread Laurent Faisnel

Zitan Broth wrote:

Greetings All,

Been playing with R and it is very easy to get going with the UI or infile batch commands :-)

What I am wondering is how scalable and fast R is for running as part of a web service.  I believe R is written in C which is a great start, but what are people's general thoughts on this?

Thanks greatly,
Z.

I use R in such a way. R is called by dynamic pages written in PHP, 
performs a calculation and writes the results to an XML file. The results are 
then read by other pages. Performance could still be better, but I 
haven't tried every solution yet (RSOAP seems very interesting). MySQL 
database access is handled through RMySQL.
In any case, R is not incompatible with such projects.
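
The "calculate, then write an XML file" step might look roughly like the sketch below. It is not the poster's actual code: the function name write_result_xml, the element names and the summary statistics are all hypothetical, and the XML is written by hand with writeLines() rather than through an XML package.

```r
# Compute a few summary statistics and write them to a small XML file.
write_result_xml <- function(values, path) {
  stopifnot(is.numeric(values), length(values) > 1)
  lines <- c(
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<result>",
    sprintf("  <n>%d</n>", length(values)),
    sprintf("  <mean>%s</mean>", format(mean(values), digits = 6)),
    sprintf("  <sd>%s</sd>", format(sd(values), digits = 6)),
    "</result>"
  )
  writeLines(lines, path)
  invisible(path)
}

# Usage: a temporary file stands in for the location the web pages would read.
out <- write_result_xml(c(2.5, 3.5, 4.0), tempfile(fileext = ".xml"))
readLines(out)
```

In a set-up like the one described, a script of this kind would be started by the web page in batch mode, with the start-up cost of each R process being one obvious place to look for performance gains.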


Re: [R] Very small estimated random effect variance (lme)

2003-09-23 Thread Peter Dalgaard BSA

"Remko Duursma" <[EMAIL PROTECTED]> writes:

> Dear R-helpers,
> 
> i get some strange results using a linear mixed-effects model (lme), of the type:
> 
> lme1 <- lme(y ~ x, random=~x|group, ...)
> 
> For some datasets, i obtain very small standard deviations of the random effects. I 
> compared these to standard deviations of the slope and intercept using a lmList 
> approach. Of course, the SD from the lme is always smaller (shrinkage estimator), 
> but in some cases (the problem cases) the SD from the lme seems way too small. E.g.: 
> SD of intercept = 0.14, SD of slope = 0.0004, SD residual=0.11. An lmList gives a 
> slope SD of 0.07.
> 
> I have about n=6 observations per group, and about 20-100 groups depending on the 
> dataset.
> 
> thank you for any suggestions,

It's not a shrinkage estimator; it is a "subtraction estimator",
measuring the excess variance of the empirical slopes over what would
be expected from their s.e. if all (true) slopes were identical. This
can even be negative, although the parametrizations in lme() will
enforce a zero or very small variance in that case.

(There are occasional cases where a negative variance can be
interpreted, e.g. plants competing for the same growth medium, but
you're generally in trouble if the design is unbalanced.)
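
The subtraction idea can be illustrated with a rough moment calculation in base R. This is a caricature under stated assumptions, not what lme() computes (lme() fits by maximum likelihood or REML): the data are simulated with one common true slope, and the group sizes and residual SD merely echo the figures quoted in the question.

```r
set.seed(123)
n_groups <- 50
n_per    <- 6
d <- data.frame(
  group = rep(seq_len(n_groups), each = n_per),
  x     = rep(seq_len(n_per), times = n_groups)
)
# All groups share the same true intercept and slope; only residual noise differs.
d$y <- 1 + 0.5 * d$x + rnorm(nrow(d), sd = 0.11)

# Separate least-squares fit per group: keep the slope and its standard error.
per_group <- sapply(split(d, d$group), function(g) {
  cf <- summary(lm(y ~ x, data = g))$coefficients
  c(slope = cf["x", "Estimate"], se = cf["x", "Std. Error"])
})

emp_var  <- var(per_group["slope", ])    # variance of the empirical slopes
expected <- mean(per_group["se", ]^2)    # what sampling error alone would produce
excess   <- emp_var - expected           # "subtraction" estimate; may be negative

c(empirical = emp_var, expected = expected, excess = excess)
```

With identical true slopes the excess hovers around zero and can fall on either side of it, which is the situation where a constrained fit reports a zero or tiny slope variance.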

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
