[R] LDA: normalization of eigenvectors (see SPSS)

2003-06-08 Thread Christoph Lehmann
Hi dear R-users

I am trying to reproduce the steps of an LDA. The eigenvectors I get differ
from those of SPSS. My textbook (Bortz) says that the matrix of eigenvectors

V

is usually not normalized to length 1, but in such a way that the
following holds (SPSS does the same thing):

t(Vstar) %*% Derror %*% Vstar = I


where Vstar are the normalized eigenvectors. Derror is the error (within-groups)
sums-of-squares and cross-products matrix (the sums of squares of the p
variables on the diagonal, the sums of cross-products off the diagonal).
For Derror the following holds: Dtotal = Dtreat + Derror.

Since I assume that many of you are familiar with this transformation:
can anybody tell me how to carry it out in R?
That would be very nice. Thanks a lot

Cheers

Christoph

-- 
Christoph Lehmann [EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Ordering long vectors

2003-06-08 Thread Göran Broström
On Sat, 7 Jun 2003, Göran Broström wrote:

> 
> I need to order a long vector of integers with rather few unique values.
> This is very slow:
> 
> > x <- sample(rep(c(1:10), 5))
> > system.time(ord <- order(x))
> [1] 189.18   0.09 190.48   0.00   0.00
> 
> But with no ties
> 
> > y <- sample(50)
> > system.time(ord1 <- order(y))
> [1] 1.18 0.00 1.18 0.00 0.00
> 
> it is very fast!
> This gave me the following idea: Since I don't care about keeping the 
> order within tied values, why not add some small disturbance to x,
> and indeed,
> 
> > unix.time(ord2 <- order(x + runif(length(x), -0.1, 0.1)))
> [1] 1.66 0.00 1.66 0.00 0.00

An even better way is 

> system.time(ord3 <- order(x + seq(0, 0.9, length = length(x))))
[1] 1.32 0.05 1.37 0.00 0.00

Faster and, more importantly, it keeps the original ordering for tied 
values. Thanks to James Holtman.
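
A minimal sketch of the offset idea, on a small made-up vector (the size and
the seed are assumptions for illustration, not the data timed above). It only
checks that adding strictly increasing offsets below 1 to integer values gives
the same permutation as a stable sort; it says nothing about timings on any
particular R version.

```r
set.seed(7)
x <- sample(rep(1:10, 50))   # 500 integers, only 10 distinct values

# Strictly increasing offsets in [0, 0.9]: too small to reorder distinct
# integers, but enough to rank tied values by their original position.
offset <- seq(0, 0.9, length.out = length(x))
ord3 <- order(x + offset)

print(head(x[ord3], 12))
```

The trick relies on the smallest gap between distinct values (1 here) being
larger than the largest offset.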

Göran
[...]



[R] daylight saving time problems

2003-06-08 Thread wouter buytaert

Hello,

sorry for my mail yesterday about the POSIXct problems. I was a bit
tired and now I have found the real problem. When importing time data
across a daylight saving time shift, R shifts twice. I don't know
whether it is a bug or a (wrongly used) feature.

If you execute the following code:

--
test <- c("31/03/2002 0:00", "31/03/2002 0:15", "31/03/2002 0:30",
          "31/03/2002 0:45", "31/03/2002 1:00", "31/03/2002 1:15",
          "31/03/2002 1:30", "31/03/2002 1:45", "31/03/2002 2:00",
          "31/03/2002 2:15", "31/03/2002 2:30", "31/03/2002 2:45",
          "31/03/2002 3:00", "31/03/2002 3:15", "31/03/2002 3:30",
          "31/03/2002 3:45", "31/03/2002 4:00", "31/03/2002 4:15",
          "31/03/2002 4:30", "31/03/2002 4:45", "31/03/2002 5:00",
          "31/03/2002 5:15", "31/03/2002 5:30", "31/03/2002 5:45",
          "31/03/2002 6:00")
timetest <- strptime(as.character(test), format = "%d/%m/%Y %H:%M")
timetest2 <- as.POSIXct(timetest)
--

then R 1.7.0 gives on my Mandrake 9.1:

> test
 [1] "31/03/2002 0:00" "31/03/2002 0:15" "31/03/2002 0:30" "31/03/2002 0:45"
 [5] "31/03/2002 1:00" "31/03/2002 1:15" "31/03/2002 1:30" "31/03/2002 1:45"
 [9] "31/03/2002 2:00" "31/03/2002 2:15" "31/03/2002 2:30" "31/03/2002 2:45"
[13] "31/03/2002 3:00" "31/03/2002 3:15" "31/03/2002 3:30" "31/03/2002 3:45"
[17] "31/03/2002 4:00" "31/03/2002 4:15" "31/03/2002 4:30" "31/03/2002 4:45"
[21] "31/03/2002 5:00" "31/03/2002 5:15" "31/03/2002 5:30" "31/03/2002 5:45"
[25] "31/03/2002 6:00"

> timetest
 [1] 2002-03-31 00:00:00 2002-03-31 00:15:00 2002-03-31 00:30:00
 [4] 2002-03-31 00:45:00 2002-03-31 01:00:00 2002-03-31 01:15:00
 [7] 2002-03-31 01:30:00 2002-03-31 01:45:00 2002-03-31 03:00:00
[10] 2002-03-31 03:15:00 2002-03-31 03:30:00 2002-03-31 03:45:00
[13] 2002-03-31 03:00:00 2002-03-31 03:15:00 2002-03-31 03:30:00
[16] 2002-03-31 03:45:00 2002-03-31 04:00:00 2002-03-31 04:15:00
[19] 2002-03-31 04:30:00 2002-03-31 04:45:00 2002-03-31 05:00:00
[22] 2002-03-31 05:15:00 2002-03-31 05:30:00 2002-03-31 05:45:00
[25] 2002-03-31 06:00:00

> timetest2
 [1] 2002-03-31 00:00:00 CET  2002-03-31 00:15:00 CET
 [3] 2002-03-31 00:30:00 CET  2002-03-31 00:45:00 CET
 [5] 2002-03-31 01:00:00 CET  2002-03-31 01:15:00 CET
 [7] 2002-03-31 01:30:00 CET  2002-03-31 01:45:00 CET
 [9] 2002-03-31 03:00:00 CEST 2002-03-31 03:15:00 CEST
[11] 2002-03-31 03:30:00 CEST 2002-03-31 03:45:00 CEST
[13] 2002-03-31 03:00:00 CEST 2002-03-31 03:15:00 CEST
[15] 2002-03-31 03:30:00 CEST 2002-03-31 03:45:00 CEST
[17] 2002-03-31 04:00:00 CEST 2002-03-31 04:15:00 CEST
[19] 2002-03-31 04:30:00 CEST 2002-03-31 04:45:00 CEST
[21] 2002-03-31 05:00:00 CEST 2002-03-31 05:15:00 CEST
[23] 2002-03-31 05:30:00 CEST 2002-03-31 05:45:00 CEST
[25] 2002-03-31 06:00:00 CEST

There is a clear time shift between timetest[8] and timetest[9] and another
one between timetest[12] and timetest[13], i.e. timetest[9:12] are wrongly
converted.

In October (the reverse shift out of daylight saving time) there is no shift
at all.

It seems that it was a feature before that has been badly patched.

I'm using R 1.7.0 on Mandrake Linux in Belgium (CEST?)
It does not occur on my MacOSX box (both Darwin and Carbon version); I
don't know about the Windows version.
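
One possible workaround, sketched under the assumption that the timestamps
were recorded on a clock that never switched to daylight saving time: parse
them in a time zone without DST (UTC here), so that 02:00 to 02:45 on
2002-03-31 are ordinary times. In the Belgian local zone those clock times do
not exist, which is presumably why they cannot be converted consistently. The
short vector below is a made-up subset of the data above.

```r
# Timestamps around the 2002-03-31 spring-forward change.
stamps <- c("31/03/2002 1:45", "31/03/2002 2:00", "31/03/2002 2:15",
            "31/03/2002 3:00", "31/03/2002 3:15")

# Assumption: the strings are clock readings with no DST shift applied,
# so they are interpreted as UTC.
parsed <- as.POSIXct(strptime(stamps, format = "%d/%m/%Y %H:%M", tz = "UTC"))

# Gaps between consecutive stamps, in minutes.
gaps <- as.numeric(diff(parsed), units = "mins")

print(format(parsed, "%Y-%m-%d %H:%M", tz = "UTC"))
print(gaps)
```

If the data really are local wall-clock times, this is not the right
interpretation and the missing hour has to be handled explicitly.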

Thanks,

Wouter Buytaert



Re: [R] LDA: normalization of eigenvectors (see SPSS)

2003-06-08 Thread Spencer Graves
	  The following satisfies some of your constraints but I don't know if 
it satisfies all of them.

	  Let V = eigenvectors normalized so t(V) %*% V = I.  Also, let D.5 = 
some square root matrix, so t(D.5) %*% D.5 = Derror, and Dm.5 = 
solve(D.5) = inverse of D.5.  The Choleski decomposition (chol) 
provides one such solution, but you can also construct a symmetric square 
root using eigen.  Then Vstar = Dm.5 %*% V will have the property you 
mentioned below.

	  Consider the following:

> (Derror <- array(c(1,1,1,4), dim=c(2,2)))
     [,1] [,2]
[1,]    1    1
[2,]    1    4
> D.5 <- chol(Derror)
> t(D.5) %*% D.5
     [,1] [,2]
[1,]    1    1
[2,]    1    4
> (Dm.5 <- solve(D.5))
     [,1]       [,2]
[1,]    1 -0.5773503
[2,]    0  0.5773503
> (t(Dm.5) %*% Derror %*% Dm.5)
     [,1] [,2]
[1,]    1    0
[2,]    0    1
	  Thus, t(Vstar)%*%Derror%*%Vstar = t(V)%*%t(Dm.5)%*%Derror%*%Dm.5%*%V 
= t(V)%*%V = I.

hope this helps.  spencer graves
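
A sketch of how the recipe above could be completed, assuming V is taken to be
the (orthonormal) eigenvectors of the symmetric matrix
t(Dm.5) %*% Dtreat %*% Dm.5. Derror is the 2 x 2 matrix from the example;
Dtreat is a made-up matrix used only for illustration, and the names R, Rinv
and M are not from the thread.

```r
# Within-groups (error) SSCP matrix from the example above.
Derror <- matrix(c(1, 1, 1, 4), nrow = 2)
# Hypothetical between-groups SSCP matrix, invented for illustration.
Dtreat <- matrix(c(2, 1, 1, 3), nrow = 2)

R    <- chol(Derror)   # upper triangular, t(R) %*% R equals Derror
Rinv <- solve(R)

# Symmetric matrix with the same eigenvalues as solve(Derror) %*% Dtreat.
M <- t(Rinv) %*% Dtreat %*% Rinv
e <- eigen(M, symmetric = TRUE)   # eigenvectors are orthonormal

# Back-transform: these columns satisfy Dtreat v = lambda Derror v.
Vstar <- Rinv %*% e$vectors

print(round(t(Vstar) %*% Derror %*% Vstar, 10))
print(max(abs(Dtreat %*% Vstar - Derror %*% Vstar %*% diag(e$values))))
```

Because the eigenvectors of a symmetric matrix are orthonormal, the identity
t(Vstar) %*% Derror %*% Vstar = I holds by construction rather than by luck.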



[R] RMySQL errors

2003-06-08 Thread Chris Fonnesbeck
I have RMySQL installed on my OSX implementation of R, but get the 
following errors when trying to use it:

Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load shared library 
/usr/local/lib/R/library/RMySQL/libs/RMySQL.so:
  dlcompat: dyld: /usr/local/lib/R/bin/R.bin Undefined symbols:
_getopt_long
_load_defaults
_mysql_affected_rows
_mysql_close
_mysql_errno
_mysql_error
_mysql_fetch_fields
_mysql_fetch_lengths
_mysql_fetch_row
_mysql_field_count
_mysql_free_result
_mysql_g
Error in library(RMySQL) : .First.lib failed

I'm hoping there is a RMySQL guru out there somewhere that can help me 
out.

TIA,
cjf


Re: [R] LDA: normalization of eigenvectors (see SPSS)

2003-06-08 Thread Spencer Graves
Hi, Christoph:

	  1.  I didn't see in your original email that you wanted V to be 
orthogonal, only that its columns have length 1.  You have a solution 
satisfying the latter constraint, but not the former.

	  2.  I don't have time now to sort out the details, and I don't have 
them off the top of my head.  I just entered lda into R 1.6.2 [after 
library(MASS)] and got the following:

> lda
function (x, ...)
{
    if (is.null(class(x)))
        class(x) <- data.class(x)
    UseMethod("lda", x, ...)
}
	  To decode 'UseMethod("lda", ...)', I requested 'methods(lda)' with 
the following result:

> methods(lda)
[1] "lda.data.frame" "lda.default"    "lda.formula"    "lda.matrix"
	  Have you tried listing each of these 4 functions and working through 
them step by step?  I think this should answer your question.  Also see 
Venables and Ripley (2002) Modern Applied Statistics with S, index entry 
for lda.

hth.  spencer graves

Christoph Lehmann wrote:
> thanks a lot, Spencer
>
> The problem is the following: my textbook has an example with the data:
>
> > x
>    x1 x2 x3
> 1   3  3  4
> 2   4  4  3
> 3   4  4  6
> 4   2  5  5
> 5   2  4  5
> 6   3  4  6
> 7   3  4  4
> 8   2  5  5
> 9   4  3  6
> 10  5  5  6
> 11  4  5  7
> 12  4  6  4
> 13  3  6  6
> 14  4  7  6
> 15  6  5  6
> --
>
> > y
>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>  1  1  1  1  1  1  2  2  2  2  3  3  3  3  3
> --
>
> > Dtot <- (t(x)%*%x-t(xbar)%*%xbar)
> > Dtot
>       x1    x2    x3
> x1 17.73  2.67  4.87
> x2  2.67 17.33  4.33
> x3  4.87  4.33 16.93
> --
>
> > A <- cbind(tapply(x[,1],y,sum), tapply(x[,2],y,sum),
> +            tapply(x[,3],y,sum))
> > A
>   [,1] [,2] [,3]
> 1   18   24   29
> 2   14   17   21
> 3   21   29   29
>
> > G <- apply(x,2,sum)
> > G
> x1 x2 x3
> 53 70 79
>
> > p <- ncol(x)
> > k <- length(freq)
> > N <- sum(freq)
> > Dtreat <- array(0,c(p,p))
> > k <- length(freq)
> > for (i in 1:p)
> + {
> +   for (j in 1:k)
> +   {
> +     for (h in 1:k)
> +     {
> +       Dtreat[i,j] <- Dtreat[i,j] + A[h,i]*A[h,j]/freq[h]
> +     }
> +     Dtreat[i,j] <- Dtreat[i,j] - G[i]*G[j]/N
> +   }
> + }
> > Dtreat
>      [,1] [,2] [,3]
> [1,] 3.93 5.97 3.17
> [2,] 5.97 9.78 4.78
> [3,] 3.17 4.78 2.55
> --
>
> > Derror <- Dtot-Dtreat
> > Derror
>      x1    x2       x3
> x1 13.8 -3.30  1.70000
> x2 -3.3  7.55 -0.45000
> x3  1.7 -0.45 14.38333
> --
>
> > eigen(Dtreat%*%solve(Derror))
> $values
> [1]  2.300398e+00  2.039672e-02 -1.907034e-15
>
> $vectors
>            [,1]       [,2]       [,3]
> [1,] -0.4870772  0.6813155 -0.6076020
> [2,] -0.7809602 -0.4342229  0.1539928
> [3,] -0.3909693  0.5892874  0.7791701
>
> > V <- eigen(Dtreat%*%solve(Derror))$vectors
> > V
>            [,1]       [,2]       [,3]
> [1,] -0.4870772  0.6813155 -0.6076020
> [2,] -0.7809602 -0.4342229  0.1539928
> [3,] -0.3909693  0.5892874  0.7791701
>
> the textbook (SPSS) has similar eigenvalues, but only two!:
>
> lambda1 = 2.30048, lambda2 = 0.02091
> , but as I wrote in the last mail: different eigenvectors
>
> Let's start here with your recommendation:
> first, it seems, since the last eigenvalue is almost 0, that the
> eigenvectors V are not orthogonal:
>
> > t(V)%*%V
>            [,1]        [,2]        [,3]
> [1,]  1.0000000 -0.22313575 -0.12894473
> [2,] -0.2231357  1.00000000 -0.02168078
> [3,] -0.1289447 -0.02168078  1.00000000
>
> let's continue anyway?
>
> > D.5 <- chol(Derror)
> > t(D.5) %*% D.5
>      x1    x2       x3
> x1 13.8 -3.30  1.70000
> x2 -3.3  7.55 -0.45000
> x3  1.7 -0.45 14.38333
>
> > Dm.5 <- solve(D.5)
> > t(Dm.5) %*% Derror %*% Dm.5
>               x1            x2            x3
> x1  1.000000e+00 -2.523481e-17 -1.097755e-18
> x2 -6.625163e-18  1.000000e+00 -2.120970e-18
> x3  4.501901e-18  4.460942e-19  1.000000e+00
> perfectly orthogonal
>
> > t(V)%*%t(Dm.5)%*%Dfehler%*%Dm.5%*%V
>            [,1]        [,2]        [,3]
> [1,]  1.0000000 -0.22313575 -0.12894473
> [2,] -0.2231357  1.00000000 -0.02168078
> [3,] -0.1289447 -0.02168078  1.00000000
> again, equals t(V)%*%V not orthogonal.
>
> -- I think it has to do with the fact that the textbook considers the
> third eigenvalue as = 0 and then gets the Vstar eigenvectors (which I
> try to reproduce):
>
> Vstar =
>        [,1]    [,2]    [,3]
> [1,] 0.1689  0.1419 -0.1825
> [2,] 0.3498 -0.1597  0.0060
> [3,] 0.0625  0.1422  0.2154
>
> -
>
> Spencer if you find some minutes time to help me reproduce this example,
> it would be very nice (the data are from Jones 1961. He investigated
> whether essays written by children from lower, middle, upper class
> differ in sentence length, chosen words, complexity of sentence)
>
> Cheers
>
> Christoph
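
A possible way to get vectors with the requested normalization from the data
quoted above, offered as a sketch rather than as what SPSS or the textbook
does. Two points it relies on: Dtreat %*% solve(Derror) is not symmetric, so
its eigenvectors need not be orthogonal; and the vectors solving
Dtreat v = lambda Derror v are eigenvectors of solve(Derror) %*% Dtreat, with
the factors in the other order. The helper name centre is made up here, and
whether the result agrees with the textbook's Vstar (up to sign) has not been
checked.

```r
# Data from the textbook example quoted above (15 essays, 3 variables).
x <- matrix(c(3,3,4, 4,4,3, 4,4,6, 2,5,5, 2,4,5, 3,4,6,
              3,4,4, 2,5,5, 4,3,6, 5,5,6,
              4,5,7, 4,6,4, 3,6,6, 4,7,6, 6,5,6),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("x1", "x2", "x3")))
y <- rep(1:3, times = c(6, 4, 5))   # group labels

centre <- function(m) sweep(m, 2, colMeans(m))   # subtract column means

Dtot   <- crossprod(centre(x))
# Within-groups SSCP: centre each group on its own means, then add up.
Derror <- Reduce(`+`, lapply(split(seq_len(nrow(x)), y),
                 function(i) crossprod(centre(x[i, , drop = FALSE]))))
Dtreat <- Dtot - Derror

# Eigenvectors of solve(Derror) %*% Dtreat (note the order of the factors).
e <- eigen(solve(Derror) %*% Dtreat)
V <- Re(e$vectors)

# Rescale each column v so that t(v) %*% Derror %*% v = 1.
Vstar <- sweep(V, 2, sqrt(diag(t(V) %*% Derror %*% V)), "/")

print(Re(e$values))
print(round(t(Vstar) %*% Derror %*% Vstar, 8))
```

The third eigenvalue is zero up to rounding because three groups give Dtreat
rank 2, which matches the textbook reporting only two eigenvalues.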


Re: [R] Ordering long vectors

2003-06-08 Thread Thomas Lumley

On Sun, 8 Jun 2003, Göran Broström wrote:

> On Sat, 7 Jun 2003, Göran Broström wrote:
>
> >
> > I need to order a long vector of integers with rather few unique values.
> > This is very slow:
> >
> > > x <- sample(rep(c(1:10), 5))
> > > system.time(ord <- order(x))
> > [1] 189.18   0.09 190.48   0.00   0.00
> >
> > But with no ties
> >
> > > y <- sample(50)
> > > system.time(ord1 <- order(y))
> > [1] 1.18 0.00 1.18 0.00 0.00
> >
> > it is very fast!
> > This gave me the following idea: Since I don't care about keeping the
> > order within tied values, why not add some small disturbance to x,

Another option:

> system.time(a<-sapply(sort(unique(x)),function(i) which(x==i)))

This turns out to be slightly slower than your method, but doesn't require
that you know what the smallest difference between values is (and works
for characters as well as numbers)
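
A small sketch of this option on made-up data. It uses lapply() rather than
sapply(), which is a deliberate change: when every value occurs equally often,
sapply() simplifies the result to a matrix, whereas a list followed by
unlist() always yields a plain permutation vector.

```r
set.seed(1)
x <- sample(rep(1:10, 50))   # 500 integers with only 10 distinct values

# Positions of each distinct value, smallest value first.
groups <- lapply(sort(unique(x)), function(i) which(x == i))
ord <- unlist(groups)

print(head(x[ord], 12))
```

Since which() returns positions in increasing order, ties keep their original
order, and the same code works for character vectors.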

-thomas



Re: [R] Ordering long vectors

2003-06-08 Thread Thomas Lumley
On Sat, 7 Jun 2003, Göran Broström wrote:

>
> I need to order a long vector of integers with rather few unique values.
> This is very slow:


I think the culprit is

src/main/sort.c: orderVector1

/* Shell sort isn't stable, but it proves to be somewhat faster
   to run a final insertion sort to re-order runs of ties when
   comparison is cheap.
*/

This also explains:

> aa<-sample(rep(1:10,5))
> system.time( order(aa, 1:length(aa)))
[1] 3.67 0.01 3.68 0.00 0.00
> system.time( order(aa))
^C
Timing stopped at: 49.33 0.01 49.34 0 0

which is perhaps the simplest work-around :).
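
A sketch of the work-around on a small made-up vector. It only checks that
breaking ties by position gives the same permutation as a plain order() call;
it makes no claim about timings on any particular R version.

```r
set.seed(42)
aa <- sample(rep(1:10, 50))   # many ties: 10 distinct values, 500 elements

# Second key: the original positions, so no two elements compare equal.
ord_tiebreak <- order(aa, seq_along(aa))
ord_plain    <- order(aa)

print(identical(ord_tiebreak, ord_plain))
```

seq_along(aa) is used instead of 1:length(aa) only because it also behaves
sensibly for a zero-length vector.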


-thomas



Re: [R] converting by to a data.frame?

2003-06-08 Thread Spencer Graves
Thanks to Thomas Lumley, Sundar Dorai-Raj, and Don McQueen for their 
suggestions.  I need the INDICES as part of the output data.frame, which 
McQueen's solution provided.  I generalized his method as follows:

by.to.data.frame <-
function(x, INDICES, FUN){
# Split data.frame x on x[,INDICES]
# and lapply FUN to each data.frame subset,
# returning a data.frame
#
#  Internal functions
    get.Index <- function(x, INDICES){
        Ind <- as.character(x[,INDICES[1]])
        k <- length(INDICES)
        if(k > 1)
            Ind <- paste(Ind, get.Index(x, INDICES[-1]), sep=":")
        Ind
    }
    FUN2 <- function(data., INDICES, FUN){
        vec <- FUN(data.)
        Vec <- matrix(vec, nrow=1)
        dimnames(Vec) <- list(NULL, names(vec))
        cbind(data.[1,INDICES], Vec)
    }
#   Combine INDICES
    Ind <- get.Index(x, INDICES)
#   Apply ...:  Do the work.
    Split <- split(x, Ind)
    byFits <- lapply(Split, FUN2, INDICES, FUN)
#   Convert to a data.frame
    do.call('rbind',byFits)
}
Applying this to my toy problem produces the following:

> by.df <- data.frame(A=rep(c("A1", "A2"), each=3),
+  B=rep(c("B1", "B2"), each=3), x=1:6, y=rep(0:1, length=6))

> by.to.data.frame(by.df, c("A", "B"), function(data.)coef(lm(y~x, data.)))
       A  B (Intercept)             x
A1:B1 A1 B1   0.3333333 -1.517960e-16
A2:B2 A2 B2   0.6666667  3.282015e-16
Thanks for the assistance.  I can now tackle the real problem that 
generated this question.

Best Wishes,
Spencer Graves

Don MacQueen wrote:
Since I don't have your by.df to test with I may not have it exactly 
right, but something along these lines should work:

byFits <- lapply(split(by.df,paste(by.df$A,by.df$B)),
                 FUN=function(data.) {
                    tmp <- coef(lm(y~x,data.))
                    data.frame(A=unique(data.$A),
                               B=unique(data.$B),
                               intercept=tmp[1],
                               slope=tmp[2])
                 })
byFitsDF <- do.call('rbind',byFits)

That's assuming I've got all the closing parentheses in the right 
places, since my email software (Eudora) doesn't do R syntax checking!

This approach can get rather slow if by.df is big, or when the 
computations in FUN are extensive (or both).

If by.df$A has mode character (as opposed to being a factor), then 
replacing A=unique(data.$A) with A=I(unique(data.$A)) might improve 
performance. You want to avoid character to factor conversions when 
using an approach like this.

-Don

At 2:54 PM -0700 6/5/03, Spencer Graves wrote:

Dear R-Help:

  I want to (a) subset a data.frame by several columns, (b) fit a 
model to each subset, and (c) store a vector of results from the fit 
in the columns of a data.frame.  In the past, I've used for loops to 
do this.  Is there a way to use by?

  Consider the following example:

> byFits <- by(by.df, list(A=by.df$A, B=by.df$B),
+  function(data.)coef(lm(y~x, data.)))
> byFits
A: A1
B: B1
  (Intercept)             x
 3.333333e-01 -1.517960e-16

A: A2
B: B1
NULL

A: A1
B: B2
NULL

A: A2
B: B2
 (Intercept)            x
6.666667e-01 3.282015e-16


#
Desired output:
data.frame(A=c("A1","A2"), B=c("B1", "B2"),
.Intercept.=c(1/3, 2/3), x=c(-1.5e-16, 3.3e-16))
What's the simplest way to do this?
Thanks,
Spencer Graves


[R] Need help on data frame

2003-06-08 Thread Pratibha Murthy
Dear Sir/Madam,

I am new to R. I have data for every day.
The problem is that there are some gaps, i.e. on some
days no observation could be made.

I want to place these data in a complete data frame
with one row per day (the number of days changes from
year to year: Feb 28 or Feb 29).

Maybe I need to make one column of dates (for each
year, say this data frame is 'frame1'), then I need to
place the data set on frame1 with missing entries as NA.

Then I want to replace each NA by the mean of the preceding
and following entries (for EACH NA).

I hope this is possible using R.  I will greatly
appreciate any help.
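
One way the steps described above could look, sketched on a made-up
three-row data set (the column names date and value and all numbers are
assumptions, not the poster's data). approx() does linear interpolation, which
for a single missing day equals the mean of the preceding and following
values; for longer gaps it draws a straight line between the neighbours.

```r
# Hypothetical daily observations with one missing day (2003-02-28).
obs <- data.frame(
  date  = as.Date(c("2003-02-26", "2003-02-27", "2003-03-01")),
  value = c(1, 2, 4)
)

# 'frame1': one row per calendar day; seq() copes with Feb 28 vs Feb 29.
frame1 <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))
full <- merge(frame1, obs, by = "date", all.x = TRUE)   # gaps become NA

# Fill each NA by interpolating between the neighbouring observations.
ok <- !is.na(full$value)
full$filled <- approx(x = as.numeric(full$date[ok]), y = full$value[ok],
                      xout = as.numeric(full$date))$y

print(full)
```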

Thanks,
Pratibha 






Re: [R] Ordering long vectors

2003-06-08 Thread Göran Broström
On Sun, 8 Jun 2003, Thomas Lumley wrote:

> On Sat, 7 Jun 2003, Göran Broström wrote:
>
> > I need to order a long vector of integers with rather few unique values.
> > This is very slow:
>
> I think the culprit is
>
> src/main/sort.c: orderVector1
>
> /* Shell sort isn't stable, but it proves to be somewhat faster
>    to run a final insertion sort to re-order runs of ties when
>    comparison is cheap.
> */
>
> This also explains:
>
> > aa<-sample(rep(1:10,5))
> > system.time( order(aa, 1:length(aa)))
> [1] 3.67 0.01 3.68 0.00 0.00
> > system.time( order(aa))
> ^C
> Timing stopped at: 49.33 0.01 49.34 0 0
>
> which is perhaps the simplest work-around :).

Thanks. This is really surprising: it is *much* faster to break ties by a 
second condition than not to break them. I think it should be mentioned 
in the help. And could 'order/sort' be modified to check for 'tieness'? 
But I guess the overhead would be too heavy.

(if (length(unique(x)) < alpha * length(x)) then  else )

Göran



Re: [R] Need help on data frame

2003-06-08 Thread Spencer Graves
I'm sorry, but I don't understand enough of your problem to be able to 
comment.  If you can give us a toy example, small and easy to understand in 
a few seconds, that illustrates the difficulty, it should be easier for 
others to help.

spencer graves



[R] Basic question on applying a function to each row of a dataframe

2003-06-08 Thread peter leonard
Hi,

I have a function foo(x,y) and a dataframe, DF, comprised of two vectors, x 
& w, as follows:

  x w
1  1 1
2  2 1
3  3 1
4  4 1
etc

I would like to apply the function foo to each 'pair' within DF e.g  
foo(1,1), foo(2,1), foo(3,1) etc

I have tried

apply(DF,foo)
apply(DF[,],foo)
apply(DF[DF$x,DF$w],foo)


However, none of the above worked. Can anyone help ?

Thanks in advance,
Peter


Re: [R] Basic question on applying a function to each row of a dataframe

2003-06-08 Thread Spencer Graves
How about the following:

> DF <- data.frame(x=1:4, y=rep(1,4))
> foo <- function(x, y)x+y
> foo(DF$x, DF$y)
[1] 2 3 4 5
hth.  spencer graves



Re: [R] Basic question on applying a function to each row of a dataframe

2003-06-08 Thread Ko-Kang Kevin Wang
Hi,

You need to tell the apply() whether you want to apply the function to 
rows (1) or columns (2).

So in your case you may want to try something like:
  apply(DF, 1, foo)

On Sun, 8 Jun 2003, peter leonard wrote:

> I have a function foo(x,y) and a dataframe, DF, comprised of two vectors, x 
> & w, as follows:
> 
>    x w
> 1  1 1
> 2  2 1
> 3  3 1
> 4  4 1
> 
> etc
> 
> 
> I would like to apply the function foo to each 'pair' within DF e.g  
> foo(1,1), foo(2,1), foo(3,1) etc
> 
> I have tried
> 
> apply(DF,foo)
> apply(DF[,],foo)
> apply(DF[DF$x,DF$w],foo)
> 

-- 
Cheers,

Kevin

--
On two occasions, I have been asked [by members of Parliament],
'Pray, Mr. Babbage, if you put into the machine wrong figures, will
the right answers come out?' I am not able to rightly apprehend the
kind of confusion of ideas that could provoke such a question.

-- Charles Babbage (1791-1871) 
 From Computer Stupidities: http://rinkworks.com/stupid/

--
Ko-Kang Kevin Wang
Master of Science (MSc) Student
SLC Tutor and Lab Demonstrator
Department of Statistics
University of Auckland
New Zealand
Homepage: http://www.stat.auckland.ac.nz/~kwan022
Ph: 373-7599
x88475 (City)
x88480 (Tamaki)



Re: [R] Basic question on applying a function to each row of a dataframe

2003-06-08 Thread peter leonard
Hi Kevin,
This returns:
Error in FUN(newX[, i], ...) : Argument "y" is missing, with no default

E.g.

> x<-c(1,2,3,4)
> w<-c(1,1,1,1)
> DF<-data.frame(x,w)
> foo <- function(x, y)x+y
> apply(DF, 1, foo)
Error in FUN(newX[, i], ...) : Argument "y" is missing, with no default
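
The error arises because apply() passes each row to the function as one
vector, so foo() receives a single argument and y is missing. Two possible
ways around it are sketched below for the same DF and foo: unpack the row
inside a small wrapper, or let mapply() walk the two columns in parallel. The
names by_row and by_pair are made up for the example.

```r
x <- c(1, 2, 3, 4)
w <- c(1, 1, 1, 1)
DF <- data.frame(x, w)
foo <- function(x, y) x + y

# apply() hands each row to FUN as a single vector: unpack it explicitly.
by_row <- apply(DF, 1, function(r) foo(r[["x"]], r[["w"]]))

# mapply() calls foo(x[i], w[i]) for each i.
by_pair <- mapply(foo, DF$x, DF$w)

print(by_row)
print(by_pair)
```

When foo is already vectorized, as x + y is, calling foo(DF$x, DF$w) directly
is simpler still; the two forms above matter for functions that only accept
scalars.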

Regards
Peter






[R] executable R scripts

2003-06-08 Thread John Zedlewski
Hi, I'm a newbie trying to make an R program executable on UNIX, just like one 
would write an executable perl script by putting #!/usr/bin/perl in the 
first line, and so on.

It seems, though, that this would only work if I use the BATCH command to 
tell R to execute the program in its first argument. This would have the 
unfortunate side effect of dumping all output to a file rather than stdout.

Additionally, I'd want to see only the results of print statements on 
stdout, not all of R's output, just as when you source a script with 
echo=FALSE.

This seems like it would be a pretty common problem, but I haven't found any 
explanations in the docs. Does somebody have a sample script that I could 
look at for advice? Or should I just bite the bullet and write a wrapper 
shell script?

Thanks!
--JRZ



Re: [R] executable R scripts

2003-06-08 Thread Jonathan Baron
On 06/08/03 17:35, John Zedlewski wrote:
> Hi, I'm a newbie trying to make an R program executable on UNIX, just like one 
> would write an executable perl script by putting #!/usr/bin/perl in the 
> first line, and so on.
>
> It seems, though, that this would only work if I use the BATCH command to 
> tell R to execute the program in its first argument. This would have the 
> unfortunate side effect of dumping all output to a file rather than stdout.
>
> Additionally, I'd want to see only the results of print statements on 
> stdout, not all of R's output, just as when you source a script with 
> echo=FALSE.

See
man R
for how to do it, although I'm not sure where it says the
following:

To get just the print output and nothing else, it helps to have
print()'s in the script itself.  Then you can use

R --slave < myfile.R > printoutput.txt

I also use R --vanilla < myfile.R
for an R file that has write.table()'s in it.  For this you do not
need to pipe the output anywhere.



Re: [R] executable R scripts

2003-06-08 Thread Dirk Eddelbuettel
On Sun, Jun 08, 2003 at 05:35:14PM -0700, John Zedlewski wrote:
> Hi, I'm a newbie trying to make an R program executable on UNIX, just like one 
> would write an executable perl script by putting #!/usr/bin/perl in the 
> first line, and so on.

This is not currently supported, but with some luck may be supported in a
later version of R.  

> It seems, though, that this would only work if I use the BATCH command to 
> tell R to execute the program in its first argument. This would have the 
> unfortunate side effect of dumping all output to a file rather than stdout.

My personal favourite currently is to arrange everything (loading of
package, code, ...) in a file which I can read with source() from within R.
Then
$ echo "source(\"foo.R\")" | R --slave
works quite well, you can redirect etc. Works on windows/cygwin too using
Rterm.exe.

> Additionally, I'd want to see only the results of print statements on 
> stdout, not all of R's output, just as when you source a script with 
> echo=FALSE.

I think the above fits that bill.

> This seems like it would be a pretty common problem, but I haven't found any 
> explanations in the docs. Does somebody have a sample script that I could 
> look at for advice? Or should I just bite the bullet and write a wrapper 
> shell script?

That's where the above leads to as well.
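
For readers on a later R release that ships the Rscript front end (not
available in the 1.7.0 discussed here), a script along the following lines can
be made executable with chmod +x and run directly. This is a sketch under that
assumption; the helper format_args is hypothetical and only echoes the
command-line arguments.

```r
#!/usr/bin/env Rscript
# Assumes an R installation that provides Rscript on the PATH.

# Hypothetical helper: describe the command-line arguments, one per line.
format_args <- function(args) {
  if (length(args) == 0) return("no arguments given")
  paste0("arg ", seq_along(args), ": ", args)
}

# Only explicit cat()/print() output reaches stdout; commands are not echoed.
cat(format_args(commandArgs(trailingOnly = TRUE)), sep = "\n")
```

With R versions that lack Rscript, the echo-into-R wrapper described above
remains the way to go.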

Dirk

-- 
Don't drink and derive. Alcohol and analysis don't mix.



Re: [R] executable R scripts

2003-06-08 Thread John Zedlewski
Dirk and Jonathan--
  Thanks a lot for the fast and helpful comments, guys. I ended up writing a 
wrapper script that uses the trick of echoing source(\"filename\") into R 
--slave, and it works well.
  Thanks again!
--JRZ



[R] early R messages to stdout

2003-06-08 Thread John Zedlewski
Hi,
 I have an R script that takes its input in the form of command-line 
parameters. It works fine, but R complains about every unknown arg with the 
"ARGUMENT %s ignored" message, and this goes to stdout instead of stderr 
because R_ConsoleFile isn't set yet. Is it really necessary to process all 
command line args before setting R_ConsoleFile? It seems that only Aqua 
systems care about their arguments when choosing the console file.

  I've attached a diff (against 1.7.0) that fixes this issue, so that non-Aqua 
unix folks can redirect stderr to /dev/null and not have to worry about those 
annoying "argument ignored" errors anymore.

--JRZ


Re: [R] Basic question on applying a function to each row of a dataframe

2003-06-08 Thread peter leonard
This works fine.
Thanks
Peter

From: Spencer Graves [EMAIL PROTECTED]
To: peter leonard [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [R] Basic question on applying a function to each row of 
a	dataframe
Date: Sun, 08 Jun 2003 13:48:04 -0700

How about the following:

> DF <- data.frame(x=1:4, y=rep(1,4))
> foo <- function(x, y)x+y
> foo(DF$x, DF$y)
[1] 2 3 4 5
hth.  spencer graves



[R] Questions for package ts prediction

2003-06-08 Thread zhu wang
Dear helpers,

I am trying to write a function to return prediction values using
package ts. I have written three different versions since I am not sure
what's wrong with my func2. func and func1 return the same results, but
func1 and func2 don't. In particular, the only difference between
func1 and func2 is the name of the function argument, y and data
respectively.  But running the last line of the following script
gives the message:

Error in ts(x): object is not a matrix.

I am confused. Also, could somebody kindly let me know the answer, if
any, to my question about the following sunspot example from the package help:

data(sunspot)
(sunspot.ar <- ar(sunspot.year)) 
# why not just sunspot.ar <- ar(sunspot.year) ?
predict(sunspot.ar, n.ahead=25)

Thanks in advance.

Zhu Wang
Statistical Science Department
Southern Methodist University

(214)768-2453
-- 
zhu wang [EMAIL PROTECTED]

# time series prediction

func<-function(data)
  {(esti<- ar(data))
   return(predict(object=esti,newdata=data,n.head=5))
 }
func1<-function(y)
  {(esti<- ar(y))
   return(predict(esti,n.head=5))
 }
func2<-function(data)
  {(esti<- ar(data))
   return(predict(esti,n.head=5))
 }

y<-arima.sim(model=list(ar=c(1.7,-0.8)),n=100)
func(y)
func1(y)
func2(y)
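
Two observations that may help, offered as a sketch and not as a diagnosis of
the exact R version used above. First, the sunspot example spells the argument
n.ahead; n.head is not an argument of predict() for ar fits, so it is likely
swallowed silently and the default horizon is used. Second, a plausible cause
of the func2 error is that predict() looks the series up by the name stored in
the fit when newdata is not supplied, and data is also the name of an R
function. Passing the series explicitly avoids depending on that lookup. The
function name forecast_ar and its arguments are made up for the example.

```r
set.seed(123)
y <- arima.sim(model = list(ar = c(1.7, -0.8)), n = 100)

# Supplying 'newdata' means predict() need not find the series by name.
forecast_ar <- function(series, h = 5) {
  fit <- ar(series)
  predict(fit, newdata = series, n.ahead = h)
}

fc <- forecast_ar(y)
print(fc$pred)   # point forecasts
print(fc$se)     # their standard errors
```

As for the parentheses in (sunspot.ar <- ar(sunspot.year)): wrapping an
assignment in parentheses makes R print the assigned value as well as store
it, so the two forms differ only in whether the fit is displayed.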



[R] looking for Prof Bates' file

2003-06-08 Thread Viet Nguyen
Hello

I'm reading up on fitting a truncated Weibull distribution to data.

There are posts from 2002 that point to this presentation by Prof Bates:

http://www.stat.wisc.edu/~bates/JSM2001.pdf

but the file is no longer there. I can't find it anywhere else, and Google 
doesn't have a cached copy of it.

Could someone please give me a copy of this file, if they have it?

Thanks and regards,
viet.