Re: [R] How to merge string to DF

2007-08-24 Thread Stephen Tucker
This seems to work:

tmp <- aggregate(DF$y, list(DF$x, DF$f), mean)

tmp2 <- aggregate(DF$conc, list(DF$x, DF$f), paste, collapse=", ")
names(tmp2)[3] <- "var1"

final <- merge(tmp,tmp2)
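
The same two-step approach can be checked on a tiny made-up data frame. This is only a sketch: the data and the names `toy`, `means` and `labels` are illustrative assumptions, not part of the original post.

```r
# Toy data: two grouping columns, one numeric column, one label column
toy <- data.frame(x = c(1, 1, 2, 2),
                  f = factor(c("lev1", "lev1", "lev1", "lev1")),
                  y = c(1, 3, 10, 20),
                  conc = c("a", "b", "c", "d"),
                  stringsAsFactors = FALSE)

# Group means of y
means <- aggregate(toy$y, list(toy$x, toy$f), mean)

# Concatenate the labels within each group
labels <- aggregate(toy$conc, list(toy$x, toy$f), paste, collapse = ", ")
names(labels)[3] <- "var1"

# Join the two summaries on the shared Group.1 / Group.2 columns
final <- merge(means, labels)
print(final)
```

Renaming the third column before merging matters: otherwise both summaries carry a column called x and merge() would try to match on it as well.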


--- Lauri Nikkinen [EMAIL PROTECTED] wrote:

 #Hi R-users,
 #I have an example DF like this:
 
 y1 <- rnorm(10) + 6.8
 y2 <- rnorm(10) + (1:10*1.7 + 1)
 y3 <- rnorm(10) + (1:10*6.7 + 3.7)
 y <- c(y1,y2,y3)
 x <- rep(1:3,10)
 f <- gl(2,15, labels=paste("lev", 1:2, sep=""))
 g <- seq(as.Date("2000/1/1"), by="day", length=30)
 DF <- data.frame(x=x,y=y, f=f, g=g)
 DF$wdays <- weekdays(DF$g)
 DF$conc <- paste(DF$g, DF$wdays)
 DF
 
 #Now I calculate group means
 
 tmp <- aggregate(DF$y, list(DF$x, DF$f), mean)
 tmp
 
 #After doing this, I want to merge string from DF$conc to tmp using DF$x and
 #DF$y as an identifier
 #The following DF should look like this:
 
   Group.1 Group.2         x var1
 1       1    lev1  6.607869 2000-01-01 Saturday, 2000-01-04 Tuesday, 2000-01-07 Friday, 2000-01-10 Monday etc.
 2       2    lev1  6.598861 etc.
 3       3    lev1  7.469262
 4       1    lev2 27.488734
 5       2    lev2 33.164037
 6       3    lev2 34.466359
 
 #How do I do this?
 
 #Cheers,
 #Lauri
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[R] Vectors in R (WAS Re: Does anyone.... worth a warning?!? No warning at all)

2007-08-21 Thread Stephen Tucker

Dear Ted (and community), 

You raise a very interesting point - namely, what should and should
not be called a "vector" in R (it's neither a class nor a mode,
formally). I don't know which version of the R Language Definition you
were quoting from, but mine (Version 2.5.1 DRAFT), says:

"Vectors can be thought of as contiguous cells containing data."

(doesn't say "homogeneous" in the version that I have). In that sense
it's more analogous to 'lists' in Python, Scheme, etc. (with the
additional benefit that the names attribute for R vectors allows you
to use them also as 'dictionaries' or 'hash tables'), and less like
the 1-D array used in mathematics. (Incidentally, the array class in
Python is like the matrix and array classes in R, which do require
specification of row or column).

In any case, the quote above is more consistent with my understanding
of the basic data objects in R, as "atomic vectors" and "lists" are
both "contiguous cells containing data", differing only in the
value of their mode attributes. I think it can be a bit confusing
when they are introduced separately (e.g., in the R Language
Definition document with headings "Vectors" and "Lists" in section
2.1) - though I think its origin lies in the pedagogy of the
language. For instance, introductory documents often show off R as a
calculator and draw the analogy between the vector notation used in
mathematics and the application of "+"() [as an operator rather than a
function] on a pair of numeric vectors in R. This is probably due to
the background of the audience these documents are intended to address
(Python/Scheme, perhaps more computer science; R/S, more statistics or
mathematics perhaps). I think this is a bit unfortunate as students
can get stuck with the idea that there are (atomic) vectors, and then
another thing called a list - and then later they are told that a
list is a vector as well, and have to reconcile this new bit of
information - while conceptually the two are similar except that a
certain set of functions (e.g., the arithmetic operators and string
functions) cannot be applied to vectors of mode "list", but many other
functions (e.g., extraction, subsetting, replacement) can be applied
in the same way.

This article was very elucidating:

Statistical programming with R, Part 3: Reusable and object-oriented
programming
http://www.ibm.com/developerworks/linux/library/l-r3.html

In it, David Mertz says:

'The main thing to keep in mind about R data is that everything is a
vector. Even objects that look superficially distinct from vectors --
matrices, arrays, data.frames, etc. -- are really just vectors with
extra (mutable) attributes that tell [generic functions in] R to treat
them in special ways.'

So matrices, arrays, lists, data frames, (and even factors) are all
vectors (used henceforth in the sense of "contiguous cells", as are
lists in Python/Scheme), with additional attributes attached. When
these attributes are removed, print() will show them to us
as 1-D objects (a sequence of values; not necessarily a 1-D row or
column matrix).
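
A minimal sketch of that point, using made-up objects (`m`, `fac` and `m_plain` are illustrative names, not from the original message): once the attributes are stripped, what remains is the plain underlying vector.

```r
m <- matrix(1:6, nrow = 2)
fac <- factor(c("lo", "hi", "lo"))

# Removing all attributes leaves the underlying vector
m_plain <- m
attributes(m_plain) <- NULL
print(m_plain)            # the six cells, in column-major order

# unclass() drops only the class; the integer codes and levels remain
print(unclass(fac))

# Plain linear indexing works on the matrix even with dim in place
m[5]

# A data frame is a list of columns underneath
is.list(data.frame(a = 1:2))
```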

One defining attribute besides "mode" and "length" is the "class"
attribute, which determines the dispatch method for a generic
function. For instance, the "["() and "[<-"() functions allow N-D
subscripting notation for "matrix", "array", and "data.frame" classes,
but as these objects are also still vectors (contiguous cells), their
cells can also be accessed through plain indexing operations such as
x[5].

This is important in that it allows one to use many functions not
immediately thought of as applicable to data frames (a data frame is a
list, which is a vector, etc.
http://tolstoy.newcastle.edu.au/R/help/00b/2390.html); for me that
would be functions like append(), replace(), etc. For example:

> df <- data.frame(a=1:5,c=11:15,d=16:20)
> append(df,list(b=6:10),1)
$a
[1] 1 2 3 4 5

$b
[1]  6  7  8  9 10

$c
[1] 11 12 13 14 15

$d
[1] 16 17 18 19 20

> replace(df,c(FALSE,TRUE,FALSE),list(b=21:25))
  a  c  d
1 1 21 16
2 2 22 17
3 3 23 18
4 4 24 19
5 5 25 20

append() returns a list because c() is invoked internally, and this
removes all extra attributes except names (including "class",
"row.names", etc.). So, retaining the intrinsic mode "list", the
append function returns a class "list" object by default ['If the
object does not have a class attribute, it has an implicit class,
"matrix", "array" or the result of mode(x)', says ?class] when applied
to a data frame.

On the other hand, replace() still returns a data frame because only
"[<-.data.frame"() is invoked so the returned object retains the class
of "data.frame".
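
The class difference described above can be confirmed directly. This is a small sketch reusing the data frame from the example; the names `appended`, `replaced` and `restored` are illustrative.

```r
df <- data.frame(a = 1:5, c = 11:15, d = 16:20)

appended <- append(df, list(b = 6:10), 1)
replaced <- replace(df, c(FALSE, TRUE, FALSE), list(b = 21:25))

class(appended)   # c() dropped the data.frame class
class(replaced)   # still a data frame; column names are unchanged

# If a data frame is wanted back after append(), it can be rebuilt
restored <- as.data.frame(appended)
names(restored)
```

Note that replace() keeps the existing column name (c), since only the cell contents are assigned.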

Even factors, which fail the is.vector() test, are actually vectors
(IMHO). The R Language Definition says,

"Factors are currently implemented using an integer array [which is a
vector] to specify the actual levels and a second array of names [in
the levels attribute] that are mapped to the integers."

As an example, the following behavior is also predictable in that if
we know how each function 

Re: [R] Stacked Bar

2007-08-21 Thread Stephen Tucker
I think you want to use the 'density' argument. For example:

barplot(1:5,col=1)
legend("topleft",fill=1,legend="text",cex=1.2)
par(new=TRUE)
barplot(1:5,density=5,col=2)
legend("topleft",fill=2,density=20,legend="text",bty="n",cex=1.2)

(if you wanted to overlay solid colors with hatching)

Here's the lattice alternative of the bar graph, though the help page says
'density' is currently unimplemented (Package lattice version 0.16-2). To get
the legend into columns, I followed the suggestion described here:
http://tolstoy.newcastle.edu.au/R/help/05/04/2529.html

Essentially I use mapply() and the line following to create a list with
alternating 'text' and 'rect' arguments (3 times to get 3 columns).
===
library(lattice)

x <- matrix(1:75, ncol= 5)
dimnames(x)[[2]] <- paste("Method", 1:5, sep="")
dimnames(x)[[1]] <- paste("Row", 1:15, sep="")

## one list(text=, rect=) pair per legend column (3 columns)
u <- mapply(function(x,y) list(text=list(lab=x),rect=list(col=y)),
            x = as.data.frame(matrix(levels(as.data.frame.table(x)$Var1),
              ncol=3)),
            y = as.data.frame(matrix(rainbow(nrow(x)),
              ncol=3)),
            SIMPLIFY=FALSE)
key <- c(rep=FALSE,space="bottom",unlist("names<-"(u,NULL),rec=FALSE))

barchart(Freq ~ Var2,
 data = as.data.frame.table(x),
 groups = Var1, stack = TRUE,
 col=rainbow(nrow(x)),density=5,
 key = key )
===
(I often use tim.colors() in the 'fields' package, if you wanted other ideas
for color schemes).



--- Deb Midya [EMAIL PROTECTED] wrote:

 Jim,

   Thanks for such a quick response. It works well. Is it possible to fill
 the bars with patterns and colours?

   Regards,

   Deb
 
 Jim Lemon [EMAIL PROTECTED] wrote:
   Deb Midya wrote:
  Hi R Users!
  
  Thanks in advance.
  
  I am using R-2.5.1 on Windows XP.
  
  I am trying to do a stacked bar plot, but could not get through the
 following problem. The code is given below.
  
  1. How can I provide 15 different colors for each method with 15 Rows?
  
  2. How can I put the legend in a particular position (eg., in the top or
 bottom or right or left)? How can I put legend using a number of rows (eg.,
 using two or three rows)? 
  
 Hi Deb,
 As you have probably noticed, the integer coded colors repeat too 
 quickly for the number of colors you want. You can use the rainbow()
 function to generate colors like this:
 
 barplot(x,beside=FALSE,col=rainbow(nrow(x)))
 
 or there are lots of other color generating functions in the grDevices 
 or plotrix packages. Here's how to get your legend in an empty space for 
 your plot. There is also an emptyspace() function in the plotrix package 
 that tries to find the biggest empty space in a plot, although it 
 probably wouldn't work in this case.
 
 legend(0,1000,rownames(x),fill=rainbow(nrow(x)))
 
 Jim
 
 
 



Re: [R] polar.plot orientation and scale in plotrix

2007-08-20 Thread Stephen Tucker
I think that's the standard presentation for polar plots (theta measured from
positive x-axis) - that I've seen, anyway. But for customization you can
shift your origin for theta and define your own labels. For example, here is
a modification to the example in the help page for polar.plot():

library(plotrix)
testlen<-c(rnorm(36)*2+5)
testpos<-seq(0,350,by=10)
polar.plot(testlen,360-(testpos+90),
   main="Test Polar Plot",lwd=3,line.col=4,
   labels=seq(0,359,by=45)[c(3:1,8:4)],
   label.pos=seq(0,359,by=45))





--- Tim Sippel [EMAIL PROTECTED] wrote:

 Hello all-

 I would like to orient my polar.plot (from package plotrix) so that the
 circular scale runs clockwise and the origin (ie. 0 degrees) starts at the
 top of the plot.  The defaults of running the scale counter-clockwise and
 beginning with 90 degrees at the top of the graph seems counter-intuitive
 to me.

 I'm using R 2.5.0, and plotrix version 2.2-4.

 Many thanks,

 Tim
 


Re: [R] converting character string to an expression

2007-08-08 Thread Stephen Tucker

I think you're looking for

parse(text=paste(letters[1:3], collapse="+"))
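
As a rough sketch of what parse() gives back and how it can be used (the values supplied for a, b and c are made up for illustration):

```r
txt <- paste(letters[1:3], collapse = "+")   # the string "a+b+c"
ex <- parse(text = txt)                      # an expression vector of length 1

# Evaluate the parsed call, supplying illustrative values for a, b, c
value <- eval(ex[[1]], list(a = 1, b = 2, c = 3))
value

# ex[[1]] is the underlying call; deparse() turns it back into text
deparse(ex[[1]])
```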

--- Jarrod Hadfield [EMAIL PROTECTED] wrote:

 Hi Everyone,
 
 I would simply like to coerce a character string into an expression:  
 something like:
 
 as.expression(paste(letters[1:3], collapse="+"))
 
 but I can't seem to get rid of the quotes.  The only way I can get it  
 to work is using as.formula:
 
 as.expression(as.formula(paste("~", paste(letters[1:3], collapse="+"))))
 
 but this requires the expression to have a tilde, which it will not  
 always have.
 
 Thanks,
 
 Jarrod
 


Re: [R] Catch errors

2007-08-06 Thread Stephen Tucker
?try

or

?tryCatch
http://www.maths.lth.se/help/R/ExceptionHandlingInR/

for example...

tryCatch(lme(Y ~ X1*X2, random = ~1|subj, Model[i]),
 error=function(err) return(0))

(you can do something with 'err' or just return 0 as above)
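
Put back into the loop from the question, the pattern might look like the sketch below. It is self-contained only because `risky_fit` is a hypothetical stand-in that mimics an lme() call failing on all-zero data; the inputs are made up.

```r
# Hypothetical stand-in for a model fit that fails on degenerate input
risky_fit <- function(v) {
  if (all(v == 0)) stop("the leading minor of order 1 is not positive definite")
  mean(v)
}

inputs <- list(c(1, 2, 3), c(0, 0, 0), c(4, 6))
results <- numeric(length(inputs))

for (i in seq_along(inputs)) {
  # The value of tryCatch() is either the fit or the handler's return value
  results[i] <- tryCatch(risky_fit(inputs[[i]]),
                         error = function(err) 0)
}
print(results)
```

Because the handler's return value becomes the value of tryCatch(), nothing is printed for the failing iteration and the placeholder 0 lands in the right slot.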

--- Gang Chen [EMAIL PROTECTED] wrote:

 I run a linear mixed-effects model in a loop
 
 for (i in 1:N) {
 fit.lme <- lme(Y ~ X1*X2, random = ~1|subj, Model[i]);
 }
 
 As the data in some iterations are all (or most) 0's, leading to the  
 following error message from lme:
 
 Error in chol((value + t(value))/2) : the leading minor of order 1 is  
 not positive definite
 
 What is a good way to catch the error without spilling on the screen  
 so that I can properly stuff the corresponding output with some  
 artificial value such as 0?
 
 Thanks,
 Gang
 


Re: [R] Catch errors

2007-08-06 Thread Stephen Tucker
That's because tag <- 1 is evaluated in a local environment (of the function)
- once function(err) exits, the information is lost. Try R's <<- operator:

tag <- 0
tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj,
 Model), error=function(err) tag <<- 1)
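
A self-contained sketch of the difference, with stop() standing in for the failing lme() call (the names `tag_local`, `tag_outer` and `out` are illustrative):

```r
tag_local <- 0
tag_outer <- 0

# Plain <- creates a new variable inside the handler; the outer one is untouched
tryCatch(stop("simulated failure"),
         error = function(err) tag_local <- 1)

# <<- assigns to the variable found in the enclosing environment
out <- tryCatch(stop("simulated failure"),
                error = function(err) {
                  tag_outer <<- 1
                  NA                      # value returned by tryCatch()
                })

c(tag_local = tag_local, tag_outer = tag_outer)
```

An alternative that avoids <<- altogether is to use the return value of tryCatch() itself as the flag, as with `out` here.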



--- Gang Chen [EMAIL PROTECTED] wrote:

 I wanted something like this:
 
  tag <- 0;
  tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj,  
 Model), error=function(err) tag <- 1);
 
 but it seems not working because 'tag' does not get value of 1 when  
 error occurs. How can I make it work?
 
 Thanks,
 Gang
 
 
 On Aug 6, 2007, at 1:44 PM, Stephen Tucker wrote:
 
  ?try
 
  or
 
  ?tryCatch
  http://www.maths.lth.se/help/R/ExceptionHandlingInR/
 
  for example...
 
  tryCatch(lme(Y ~ X1*X2, random = ~1|subj, Model[i]),
   error=function(err) return(0))
 
  (you can do something with 'err' or just return 0 as above)
 
  --- Gang Chen [EMAIL PROTECTED] wrote:
 
  I run a linear mixed-effects model in a loop
 
  for (i in 1:N) {
  fit.lme <- lme(Y ~ X1*X2, random = ~1|subj, Model[i]);
  }
 
  As the data in some iterations are all (or most) 0's, leading to the
  following error message from lme:
 
  Error in chol((value + t(value))/2) : the leading minor of order 1 is
  not positive definite
 
  What is a good way to catch the error without spilling on the screen
  so that I can properly stuff the corresponding output with some
  artificial value such as 0?
 
  Thanks,
  Gang
 


Re: [R] About grep

2007-08-06 Thread Stephen Tucker
try

grep(paste("^",b[2],"$",sep=""),a)


your version will match "b2":

> grep("^b[2]$",c("b","b2","b3"))
[1] 2
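
Using the vectors from the question, a short sketch of the fix (the variable name `pattern` is illustrative). If only exact whole-string equality is needed, a plain comparison avoids regular expressions entirely.

```r
a <- c("aa", "aba", "abac")
b <- c("ab", "aba")

# Build the anchored pattern from the value of b[2], not the text "b[2]"
pattern <- paste("^", b[2], "$", sep = "")
pattern
grep(pattern, a)

# Exact comparison gives the same index without any regex
which(a == b[2])
```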


--- Shao [EMAIL PROTECTED] wrote:

 Hi,everyone.
 
 I have a problem when using the grep.
 for example:
 a <- c("aa","aba","abac")
 b <- c("ab","aba")
 
 I want to match the whole word,so
 grep("^aba$",a)
 it returns 2
 
 but when I used it a more useful way:
 grep("^b[2]$",a),
 it doesn't work at all, it can't find it, returning integer(0).
 
 How can I change the format in the second way?
 
 Thanks.
 
 -- 
 Shao
 


Re: [R] using loops to create multiple images

2007-08-05 Thread Stephen Tucker
Not sure exactly what 'results' is doing there or what
'barplot(table(i),...)' does  [see ?table]

but I think this is sort of what you want to do?

## Variable assignment
G01_01 <- 1:10
G01_02 <- 2:6

## Combine to list*
varnames <- paste("G01_",substring(100+1:2,2),sep="")
vars <- lapply(`names<-`(as.list(varnames),varnames),
   function(x) eval(parse(text=x)))
print(vars)

## Plotting
for( i in 1:length(vars) ) {
  filenm <- paste("/my/dir/barplot",i,".png",sep="")
  barplot(...)
  dev.copy(png,filename=filenm,...)
  dev.off()
}

## *Combining to list, step-by-step
## does the same thing as above
digits <- substring(100+1:2,2)
varnames <- paste("G01_",digits,sep="")
varlist <- as.list(varnames)
names(varlist) <- varnames
# convert character string of variable names to
# expressions via parse() and evaluate by eval()
vars <- lapply(varlist,function(x) eval(parse(text=x)))
print(vars)

I think in many cases paste() is your answer...
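
When the variables are already columns of one data frame, as in the question, the loop can run over the column names directly. The sketch below is an assumption-laden illustration: the data frame `dat` is made up, output goes to a temporary directory rather than /my/dir, and png() is opened directly instead of using dev.copy().

```r
# Made-up data frame standing in for the ~100-column one
dat <- data.frame(G01_01 = c(1, 2, 2, 3), G01_02 = c(1, 1, 4, 4))
outdir <- tempdir()                     # assumption: write to a temp directory

files <- character(0)
for (nm in names(dat)) {
  # paste() puts the column name inside the file name
  filenm <- file.path(outdir, paste("barplot_", nm, ".png", sep = ""))
  png(filename = filenm, height = 600, width = 600)
  barplot(table(dat[[nm]]), xlab = nm)
  dev.off()
  files <- c(files, filenm)
}
basename(files)
```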


--- Donatas G. [EMAIL PROTECTED] wrote:

 I have a data.frame with ~100 columns and I need a barplot for each column 
 produced and saved in some directory.
 
 I am not sure it is possible - so please help me.
 
 this is my loop that does not work...
 
 vars <- list (substitute (G01_01), substitute (G01_02), substitute (G01_03),
 substitute (G01_04))
 results <- data.frame ('Variable Name'=rep (NA, length (vars)),
 check.names=FALSE)
 for (i in 1:length (vars))  {
 barplot(table(i),xlab=i,ylab="Nuomonės")
 dev.copy(png, filename="/my/dir/barplot.i.png", height=600, width=600)
 dev.off()
 }
 
 questions: 
 
 Is it possible to use the i somewhere _within_ a file name? (like it is 
 possible in other programming or scripting languages?)
 
 Since I hate to type in all the variables (they go from G01_01 to G01_10
 and 
 then from G02_01 to G02_10 and so on), is it possible to shorten this list
 by 
 putting there another loop, applying some programming thing or so? 
 
 -- 
 Donatas Glodenis
 http://dg.lapas.info
 


Re: [R] Problems using lm in combination with predict

2007-08-04 Thread Stephen Tucker
I think you need 

predict(mod,newdate)

instead of 

predict(y,newdate)
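
Beyond passing the fitted model, predict() also needs the new data frame to have one column per predictor, named as in the model formula; this is most likely what the "1 rows but variable(s) found have 28 rows" message points at. A minimal sketch with simulated data (all numbers below are made up; only the variable names come from the question):

```r
set.seed(1)
# Simulated data using the variable names from the question
dat <- data.frame(Worktime = runif(28, 300, 350),
                  Vacation = runif(28, 100, 150),
                  Illnes = runif(28, 0, 2),
                  Bankholidays = runif(28, 0, 1))
dat$y <- with(dat, 2 * Worktime + Vacation + rnorm(28))

mod <- lm(y ~ Worktime + Vacation + Illnes + Bankholidays, data = dat)

# One row, with one named column per predictor
newdate <- data.frame(Worktime = 324, Vacation = 123,
                      Illnes = 0.9, Bankholidays = 0.1)
pred <- predict(mod, newdate, interval = "confidence")
pred
```

The result has columns fit, lwr and upr, giving the point prediction and the confidence bounds.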


--- Maja Schröter [EMAIL PROTECTED] wrote:

 Hello everybody,
 
 I'm trying to predict a linear regression model but it does not work.
 
 My Model: y = Worktime + Vacation + Illnes + Bankholidays
 
 My modelmatrix is of dimension  28x4
 
 Then I want to make use of the function predict because there
 confidence.intervals are include.
 
 My idea was:
 
 mod <- lm(y~Worktime+Vacation+Illnes+Bankholidays)
 
 newdate=data.frame(x=c(324,123,0.9,0.1))
 predict(y,newdate)
 
 But I always get the message:
 
 
 'newdata' had 1 rows but variable(s) found have 28 rows
 
 
 What can I do?
 
 Yours, 
 
 Maja
 


Re: [R] methods and classes and things

2007-08-04 Thread Stephen Tucker
methods(plot)
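
A short sketch of the two directions methods() can be queried in; the exact list printed depends on which packages are loaded in the session.

```r
plot_methods <- methods(plot)          # all plot.* methods currently visible
plot_methods

lm_methods <- methods(class = "lm")    # all methods available for class "lm"
lm_methods
```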

--- Edna Bell [EMAIL PROTECTED] wrote:

 Hi R Gurus:
 
 I know that plot  has extra things like plot.ts, plot.lm
 
 How would i find out all of them, please?
 
 Thanks,
 Edna
 


Re: [R] request

2007-08-03 Thread Stephen Tucker
?cumsum
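
For instance, on a small made-up vector (`v` and `running` are illustrative names):

```r
v <- c(3, 1, 4, 1, 5)
running <- cumsum(v)    # running total: 3 4 8 9 14
running
```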

--- zahid khan [EMAIL PROTECTED] wrote:

  I want to calculate the cumulative sum of any numeric vector with the
 following command, but this command does not work:  comsum
   My question is, how can we calculate the cumulative sum of any numeric
 vector with the above command?
   Thanks
 
 
 Zahid Khan
 Lecturer in Statistics
 Department of Mathematics
 Hazara University Mansehra.



Re: [R] t-distribution

2007-08-02 Thread Stephen Tucker
p <- seq(0.001,0.999,,1000)
x <- qt(p,df=9)
y <- dt(x,df=9)
plot(x,y,type="l")
polygon(x=c(x,rev(x)),y=c(y,rep(0,length(y))),col="gray90")

Hope this helps.

ST


--- Nair, Murlidharan T [EMAIL PROTECTED] wrote:

 Indeed, this is what I wanted, I figured it from the function you and
 Mark pointed me. Thank you both. 
 
 I am trying to plot it to illustrate the point and I tried this
 
 plot(function(x) dt(x, df = 9), -5, 5, ylim = c(0, 0.5), main="t -
 Density", yaxs="i")
 
 Is there an easy way to shade the area under the curve? 
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of
 [EMAIL PROTECTED]
 Sent: Wednesday, August 01, 2007 3:18 PM
 To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
 
 Well, is t = 1.11 all that accurate in the first place?  :-)
 
 In fact, reading between the lines of the original enquiry, what the
 person probably wanted was something like
 
 ta <- pt(-1.11, 9) + pt(1.11, 9, lower.tail = FALSE)
 
 which is the two-sided t-test tail area.
 
 The teller of the parable will usually leave some things unexplained...
 
 Bill. 
 
 
 Bill Venables
 CSIRO Laboratories
 PO Box 120, Cleveland, 4163
 AUSTRALIA
 Office Phone (email preferred): +61 7 3826 7251
 Fax (if absolutely necessary):  +61 7 3826 7304
 Mobile: +61 4 8819 4402
 Home Phone: +61 7 3286 7700
 mailto:[EMAIL PROTECTED]
 http://www.cmis.csiro.au/bill.venables/ 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Ben Bolker
 Sent: Thursday, 2 August 2007 4:57 AM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
 
  Bill.Venables at csiro.au writes:
 
  
  for the upper tail:
  
   1-pt(1.11, 9)
  [1] 0.1478873
  
wouldn't 
  pt(1.11, 9, lower.tail=FALSE)
   be more accurate?
 


Re: [R] line widths of plotting symbols in the lattice

2007-08-02 Thread Stephen Tucker
Thanks to all for the response - the grid.points() solution works well.

Stephen

(oddly I missed when this thread and its response actually got posted... was
starting to get worried)

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/31/07, Uwe Ligges [EMAIL PROTECTED] wrote:
 
 
  Stephen Tucker wrote:
   Dear List,
  
   Sorry, this is very simple but I can't seem to find any information
 regarding
   line widths of plotting symbols in the lattice package.
  
   For instance, in traditional graphics:
  
   plot(1:10,lwd=3)
   points(10:1,lwd=2,col=3)
  
   'lwd' allows control of plotting symbol line widths.
 
 
  'lwd' is documented in ?gpar (the help page does not show up for me,
  I'll take a closer look why) and works for me:
 
  xyplot(1:10 ~ 1:10, type = "l", lwd = 5)
 
 I think the point is that lwd doesn't work for _points_, and that is a
 bug (lplot.xy doesn't pass on lwd to grid.points). I'll fix it,
 meanwhile a workaround is to use grid.points directly, e.g.
 
 library(grid)
 xyplot(1:10 ~ 1:10, cex = 2, lwd = 3,
panel = function(x, y, ...) grid.points(x, y, gp = gpar(...)))
 
 -Deepayan




Re: [R] t-distribution

2007-08-02 Thread Stephen Tucker
yes, or

p <- seq(0.001,0.999,,1000)
x <- qt(p,df=9)
y <- dt(x,df=9)
plot(x,y,type="l")

f <- function(x,y,...) {
  polygon(x=c(x,rev(x)),y=c(y,rep(0,length(y))),...)
}
with(data.frame(x,y)[x >= 2.3,],f(x,y,col="gray90"))
with(data.frame(x,y)[x <= -2.3,],f(x,y,col="gray90"))
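
The two shaded regions together make up a two-sided tail area, which can be computed alongside the plot. This is a sketch using the same cutoff and degrees of freedom as above; the variable names are illustrative.

```r
cutoff <- 2.3
df <- 9

# Each tail has the same area by symmetry of the t density
lower_tail <- pt(-cutoff, df)
upper_tail <- pt(cutoff, df, lower.tail = FALSE)
two_sided <- lower_tail + upper_tail

# Cross-check one tail by numerically integrating the density
upper_num <- integrate(dt, lower = cutoff, upper = Inf, df = df)$value
c(two_sided = two_sided, upper_numeric = upper_num)
```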


--- Nair, Murlidharan T [EMAIL PROTECTED] wrote:

 
 I tried doing it this way. 
 
 left <- -2.3
 right <- 2.3
 p <- seq(0.001,0.999,,1000)
 x <- qt(p,df=9)
 y <- dt(x,df=9)
 plot(x,y,type="l")
 x.tmp<-x
 y.tmp<-y
 a<-which(x<=left)
 polygon(x=c(x.tmp[a],rev(x.tmp[a])),y=c(y.tmp[a],rep(0,length(y.tmp[a]))),col="gray90")
 b<-which(x>=right)
 polygon(x=c(x.tmp[b],rev(x.tmp[b])),y=c(y.tmp[b],rep(0,length(y.tmp[b]))),col="gray90")
 
 Please let me know if I have made any mistakes. 
 Thanks ../Murli
 
 
 
 -Original Message-
 From: Richard M. Heiberger [mailto:[EMAIL PROTECTED]
 Sent: Thu 8/2/2007 10:25 AM
 To: Nair, Murlidharan T; Stephen Tucker; r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
  
 I believe you are looking for the functionality I have
 in the norm.curve function in the HH package.
 
 Download and install HH from CRAN and then look at
 
 example(norm.curve)
 
 



   


Re: [R] y axix number into horizontal direction

2007-08-02 Thread Stephen Tucker
try

par(las=1)
plot(0,0,xaxt="n",type="n", ylim=c(0,100))
mtext(35,side=2,at=35)

you can use 'las=1' in par(), plot(), axis(), etc.

more generally, you can use 'srt' in text() to rotate tick labels:

plot(1:10,1:10,xaxt="n",type="n", yaxt="n",ylim=c(0,100))
axis(1); axis(2,lab=FALSE)
text(x=par("usr")[1]-2*par("cxy")[1],y=axTicks(2),
 lab=axTicks(2),xpd=TRUE,srt=45)



--- Rebecca Ding [EMAIL PROTECTED] wrote:

 Dear R users,
 
 I used plot() and mtext() functions to draw a plot. The numbers: 0,20,35,
 40,60,80,100 were in the vertical direction. I'd like to transfer them into
 the horizontal direction.
 
 plot(0,0,xaxt="n",type="n", ylim=c(0,100))
 mtext(35,side=2,at=35)
 
 Any suggestion?
 
 Thanks.
 
 Rebecca
 


Re: [R] shadow between two lines in plot()

2007-08-01 Thread Stephen Tucker
see ?rect, or, for more general shapes, ?polygon

## EXAMPLES
plot(c(0,500),c(0,500),type="n",las=1)
rect(par("usr")[1],200,par("usr")[2],300,col="grey90")
points(seq(0,500,length=3),seq(0,500,length=3))

plot(c(0,500),c(0,500),type="n",las=1)
polygon((par("usr")[1:2])[c(1,1,2,2)],
(c(200,300))[c(1,2,2,1)],col="grey90")
points(seq(0,500,length=3),seq(0,500,length=3))



--- Ding, Rebecca [EMAIL PROTECTED] wrote:

  Dear R users,
 
 I used the following code to draw a scatter plot. 
 
 plot(x,y,type="n")
 points(x,y,pch=1)
 
 And then I used the abline functions to draw two lines. I want to add
 the shadow between those two lines. 
 
 abline(h=200)
 abline(h=300)
 
 Any suggestions?
 
 Thanks
 
 Rebecca
 


Re: [R] new user question on dataframe comparisons and plots

2007-08-01 Thread Stephen Tucker
Hi Conor,

I hope I interpreted your question correctly. I think for the first one you
are looking for a conditioning plot? I am going to create and use some
nonsensical data - 'iris' comes with R so this should be reproducible on your
machine:

library(lattice)
data(iris)
x <- iris
# make some factors using cut()
x[,1:2] <- lapply(x[,1:2],cut,3)
# add column of TRUE FALSE
x <- cbind(x,TF=sample(c(TRUE,FALSE),nrow(x),replace=TRUE))
xyplot(Petal.Width~Petal.Length | ## these are numeric
   Sepal.Width*Sepal.Length,  ## these are factors
   groups=TF,## TRUE or FALSE
   panel=function(x,y,...) {
     panel.xyplot(x,y,...)
     panel.loess(x,y,...)
   },
   data=x,auto.key=TRUE)


For the second question, merge() should work even when the two tables do not
share the same factor levels, provided you specify all=TRUE.

## get counts for TRUE and FALSE
> y <- tapply(x$Species,INDEX=x$TF,
+             function(x) as.data.frame(table(x)))
## merge results
> (z <- `names<-`(merge(y$`TRUE`,y$`FALSE`,by="x",all=TRUE),
+                 c("factor","true","false")))
      factor true false
1 versicolor   29    21
2  virginica   23    27

## reshape the data frame
> library(reshape)
> melt(z,id=1)
      factor variable value
1 versicolor     true    29
2  virginica     true    23
3 versicolor    false    21
4  virginica    false    27
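
To get closer to the ratio asked about in Question 2, something like the following might work. It is only a sketch: the column names n_true and n_false are my own, and the counts are loosely based on the fragments shown in the question.

```r
# Illustrative counts of a category column in the TRUE and FALSE subsets
pos <- data.frame(CAT = c("BASD", "ZAQM"), n_true = c(0, 4))
neg <- data.frame(CAT = c("BASD", "ZAQM", "PPWS"), n_false = c(3, 9, 10))

# all = TRUE keeps categories that occur in only one of the two tables
counts <- merge(pos, neg, by = "CAT", all = TRUE)

# categories missing from one side come back as NA; treat them as zero counts
counts[is.na(counts)] <- 0

# ratio of TRUE counts to FALSE counts for each category
counts$ratio <- counts$n_true / counts$n_false
counts
```

Note that a category with no FALSE cases would give a division by zero (Inf or NaN), so that case may need separate handling.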

Hope this helps. If it doesn't you can post a small (reproducible) piece of
data and we can maybe help you out a little better...

Best regards,

ST


--- Conor Robinson [EMAIL PROTECTED] wrote:

 I'm coming from the scipy community and have been using R on and off for
 the past week or so.  I'm still feeling out the language structure,
 but so far so good.  I apologize in advance if I pose any obvious
 questions, due to my current lack of diction when searching for my
 issue, or recognizing it if I did see it.
 
 Question 1, plots:
 
 I have a data frame with 4 type factor columns, also in the data frame
 I have one single, type logical column with the response data (T or
 F).  I would like to plot a 4*4 grid showing all the two way attribute
 interactions like with plot(data.frame) or pairs(data.frame,
 panel=panel.smooth), however show the response's True and False as
 different colors, or any other built in graphical analysis that might
 be relevant in this case.  I'm sure this is simple since this is a
 common procedure, thanks in advance for humoring me.  Also, what is
 the correct term for this type of plot?
 
 
 Question 2, data frame analysis:
 
 I have two sub data frames split by whether my logical column is T or
 F.  I want to compare the same factor column between both of the two
 sub data frames (there are a few hundred different unique possibles
 for this factor column eg  -  enumerated).  I've used table()
 on the attribute columns from each sub frame to get counts.
 
 pos <- data.frame(table(df.true$CAT))
 
   10
 BASD  0
 ZAQM 4
 ...
 
 neg <- data.frame(table(df.false$CAT))
 
  1000
 BASD  3
 ZAQM  9
 PPWS 10
 ...
 
 The TRUE sub frame has fewer unique factors than the sub frame FALSE, I
 would like an output data frame that is one column all the factors
 from the TRUE sub frame and the second column the counts from the TRUE
 attributes / counts from the corresponding FALSE attributes ie
 %response for each represented factor.  It's fine (better even) if all
 factors are included and there is just a zero for the attributes with
 no TRUEs.
 
 I've been going off making my own function and running into trouble
 with the data frame not being a vector etc etc, but I have a feeling
 there is a *much* better way ie built in function, but I've hit my
 current level of R understanding.
 
 Thank you,
 Conor
 


Re: [R] Fitting exponential curve to data points

2007-07-30 Thread Stephen Tucker
Sorry, just got back into town.

I wonder if AIC, BIC, or cross-validation scoring couldn't also be used as
criteria for model selection - I've seen it mostly in the context of variable
selection rather than 'form' selection but in principle might apply here?
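
A rough sketch of what I mean, using the counts from the original question with the year shifted so that 1999 is 0. The starting values taken from a log-linear fit and the choice of a quadratic as the competing form are my own assumptions, not something from the thread.

```r
dat <- data.frame(Year = 0:7,   # years since 1999
                  Count = c(3, 5, 9, 30, 62, 154, 245, 321))

# starting values for the exponential fit, taken from a log-linear fit
lin <- lm(log(Count) ~ Year, data = dat)
start <- list(b0 = exp(coef(lin)[[1]]), b1 = coef(lin)[[2]])

fit_exp  <- nls(Count ~ b0 * exp(b1 * Year), data = dat, start = start)
fit_quad <- lm(Count ~ Year + I(Year^2), data = dat)

# both models have the same response, so the criteria are comparable;
# smaller values are preferred
cmp <- AIC(fit_exp, fit_quad)
cmp
BIC(fit_exp, fit_quad)
```

With only eight points these criteria should be read with caution.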


--- Dieter Menne [EMAIL PROTECTED] wrote:

 Andrew Clegg andrew.clegg at gmail.com writes:
 
  
  ... If I want to demonstrate that a non-linear curve fits
  better than an exponential, what's the best measure for that? Given
  that neither of nls() or optim() provide R-squared. 
 
 To supplement Karl's comment, try Douglas Bates' (author of nls) comments
 on the
 matter
 
 http://www.ens.gu.edu.au/ROBERTK/R/HELP/00B/0399.HTML
 
 Short summary:
 * ... the lack of automatic ANOVA, R^2 and adj. R^2 from nls is a
 feature,
 not a bug :-)
 * My best advice regarding R^2 statistics with nonlinear models is, as
 Nancy
 Reagan suggested, Just say no.
 
 Dieter
 


Re: [R] manipulating arrays

2007-07-30 Thread Stephen Tucker
I think you are looking for append(), though it won't modify the object
in-place like Python [I believe that is a product of R's 'functional
programming' philosophy].

You might want to check this entertaining thread:
http://tolstoy.newcastle.edu.au/R/help/04/11/7727.html

In this example it would be:

> c(X[1], 0, X[2:5])
[1] 1 0 2 3 4 5
> append(X,0,1)
[1] 1 0 2 3 4 5
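
To make the "not in-place" point concrete, here is a small sketch (the name Y is just for illustration):

```r
X <- c(1, 2, 3, 4, 5)

# append() returns a new vector; after = 1 inserts behind the first element
Y <- append(X, 0, after = 1)

X  # still the original five elements
Y  # the new vector with 0 inserted

# to "modify" X you would assign the result back: X <- append(X, 0, after = 1)
```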


--- Henrique Dallazuanna [EMAIL PROTECTED] wrote:

 Hi, I don't know if it is the most elegant way, but:
 
 X <- c(1,2,3,4,5)
 X <- c(X[1], 0, X[2:5])
 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O
 
 On 27/07/07, Nair, Murlidharan T [EMAIL PROTECTED] wrote:
 
  Can I insert an element in an array at a particular position without
  destroying the already existing element?
 
 
 
  X <- c(1,2,3,4,5)
 
 
 
  I want to insert an element between 1 and 2.
 
 
 
  Thanks ../Murli
 
 
 
 


[R] line widths of plotting symbols in the lattice

2007-07-30 Thread Stephen Tucker
Dear List,

Sorry, this is very simple but I can't seem to find any information regarding
line widths of plotting symbols in the lattice package.

For instance, in traditional graphics:

 plot(1:10,lwd=3)
 points(10:1,lwd=2,col=3)

'lwd' allows control of plotting symbol line widths.

I've tried looking through the documentation for xyplot, panel.points,
trellis.par.set, and the R-help archives. Maybe it goes by another name?

Thanks in advance,

Stephen



Re: [R] Redirecting print output

2007-07-24 Thread Stephen Tucker
Here are two simple ways:

=== method1 ===
cat("line1","\n",file="output.txt")
cat("line2","\n",file="output.txt",append=TRUE)

=== method2 ===
sink("output.txt")
cat("line1","\n")
cat("line2","\n")
out <- lm(y~x,data=data.frame(x=1:10,y=(1:10+rnorm(10,0,0.1))))
print(out)
sink()
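
A third possibility, if you would rather not switch sink() on and off, is capture.output(). This is only a sketch; the temporary file and the made-up regression are assumptions for illustration.

```r
out_file <- tempfile(fileext = ".txt")  # stand-in for your own output file

fit <- lm(y ~ x, data = data.frame(x = 1:10, y = 1:10 + rnorm(10, 0, 0.1)))

# write what print() would show to the file, then append a second piece
capture.output(print(summary(fit)), file = out_file)
capture.output(print(anova(fit)), file = out_file, append = TRUE)

captured <- readLines(out_file)  # read back to check what was written
head(captured)
```

With many regressions run in a loop, the append = TRUE form can be called once per model.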

And then there is 'Sweave'. Check out, for instance
http://www.stat.umn.edu/~charlie/Sweave/

You can embed R code, figures, and output from print methods into your LaTeX
document.

ST
--- Stan Hopkins [EMAIL PROTECTED] wrote:

 I see a rich set of graphic device functions to redirect that output.  Are
 there commands to redirect text as well?  I have a set of functions that
 execute many linear regression tests serially and I want to capture this in
 a file for printing.
 
 Thanks,
 
 Stan Hopkins


Re: [R] persp and greek symbols in the axes labels

2007-07-24 Thread Stephen Tucker
I don't know why it doesn't work but I think people generally recommend that
you use wireframe() in lattice rather than persp(), because wireframe is more
customizable (the pdf document referred to in this post is pretty good):
http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html

Here's an example:

library(lattice)
library(reshape)
x <- 1:5
y <- 1:3
z <- matrix(1:15,ncol=3,dimnames=list(NULL,y))
M <- melt(data.frame(x,z,check.names=FALSE),id=1,variable="y")
wireframe(value~x*y,data=M,
          screen=list(z=45,x=-75),
          xlab=expression(kappa[lambda]),
          ylab=as.expression(substitute(paste(phi,"=",true,sigma),
              list(true=5))),
          zlab = "Z")

[you can play around with the 'screen' argument to rotate the view, analogous
to phi and theta in persp()]
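
If I read ?persp correctly, its axis titles must be character strings and expressions are not accepted, which would explain the behaviour. For functions that do accept plotmath, bquote() is an alternative to substitute() for building the label. A small sketch (val and ylab_expr are names I made up):

```r
val <- 5

# .(val) is replaced by the value of val when the label is built
ylab_expr <- as.expression(bquote(paste(phi, "=", .(val), sigma)))

plot(0, 0,
     xlab = expression(kappa[lambda]),
     ylab = ylab_expr)
```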


--- Nathalie Peyrard [EMAIL PROTECTED] wrote:

 Hello,
 
 I am plotting a 3D function using persp and I would like to use greek 
 symbols in the axes labels.
 I  have found examples like  this one on the web:
 

 plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,"=",true,sigma),list(true=5)))
 
 this works well with plot but not with persp:
 with the command
 
 persp(M,theta = -20,phi = 0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,"=",true,sigma),list(true=5)),zlab = "Z")
 
 I get the labels as in toto.eps
 
 Any suggestion? Thanks!
 
 Nathalie
 
 -- 
 ~~   
 INRA  Toulouse - Unité de Biométrie et  Intelligence Artificielle 
 Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE 
 Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35
 Web :http://mia.toulouse.inra.fr/index.php?id=217
 ~~
 
 


Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
I think your way is probably the easiest (shockingly). For instance, here are
some alternatives - I think in both cases you have to calculate the
coefficient of determination (R^2) manually. My understanding is that
multiple R^2 in your case is the usual R^2 because you only have one
predictor variable, and the adjusted R^2 considers the degrees of freedom and
penalizes for additional predictors. Which is better... depends? (Perhaps
more stats-savvy people can help you on that one. I'm a chemical engineer so
I unjustifiably claim ignorance).
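
For reference, here is a sketch of the manual calculation, checked against what summary() reports for your log-linear lm() fit. The variable names are mine; the formulas are the usual 1 - RSS/TSS and its degrees-of-freedom adjustment.

```r
dat <- data.frame(year = 1999:2006,
                  count = c(3, 5, 9, 30, 62, 154, 245, 321))
model <- lm(log(count) ~ year, data = dat)

y   <- log(dat$count)
rss <- sum(residuals(model)^2)   # residual sum of squares
tss <- sum((y - mean(y))^2)      # total sum of squares about the mean
n   <- length(y)
p   <- 1                         # number of predictors

r2     <- 1 - rss / tss
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

c(manual = r2, from_summary = summary(model)$r.squared)
c(manual = r2_adj, from_summary = summary(model)$adj.r.squared)
```

The same rss/tss arithmetic can be applied to the residuals of an nls() or optim() fit, with the caveats about R^2 for nonlinear models in mind.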

## Data input
input <-
"Year   Count
1999    3
2000    5
2001    9
2002    30
2003    62
2004    154
2005    245
2006    321
"

dat <- read.table(textConnection(input),header=TRUE)
dat[,] <- lapply(dat,function(x) x-x[1])
  # shifting in origin; will need to add back in later

## Nonlinear least squares
plot(dat)
out <- nls(Count~b0*exp(b1*Year),data=dat,
           start=list(b0=1,b1=1))
lines(dat[,1],fitted(out),col=2)
out <- nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
           start=list(b0=0,b1=1,b2=1))
lines(dat[,1],fitted(out),col=3)

## Optim
f <- function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
fitFun <- function(b,x) cbind(1,x,x^2)%*%b
expFun <- function(b,x) b[1]*exp(b[2]*x)

plot(dat)
out <- optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
out <- optim(c(1,1),f,.dat=dat,.fun=expFun)
lines(dat[,1],expFun(out$par,dat[,1]),col=3)


--- Andrew Clegg [EMAIL PROTECTED] wrote:

 Hi folks,
 
 I've looked through the list archives and online resources, but I
 haven't really found an answer to this -- it's pretty basic, but I'm
 (very much) not a statistician, and I just want to check that my
 solution is statistically sound.
 
 Basically, I have a data file containing two columns of data, call it
 data.tsv:
 
 year  count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 
 These look exponential to me, so what I want to do is plot these
 points on a graph with linear axes, and add an exponential curve over
 the top. I also want to give an R-squared for the fit.
 
 The way I did it was like so:
 
 
 # Read in the data, make a copy of it, and take logs
 data = read.table("data.tsv", header=TRUE)
 log.data = data
 log.data$count = log(log.data$count)
 
 # Fit a model to the logs of the data
 model = lm(log.data$count ~ year, data = log.data)
 
 # Plot the original data points on a graph
 plot(data)
 
 # Draw in the exponents of the model's output
 lines(data$year, exp(fitted(model)))
 
 
 Is this the right way to do it? log-ing the data and then exp-ing the
 results seems like a bit of a long-winded way to achieve the desired
 effect. Is the R-squared given by summary(model) a valid measurement
 of the fit of the points to an exponential curve, and should I use
 multiple R-squared or adjusted R-squared?
 
 The R-squared I get from this method (0.98 multiple) seems a little
 high going by the deviation of the last data point from the curve --
 you'll see what I mean if you try it.
 
 Thanks in advance for any help!
 
 Yours gratefully,
 
 Andrew.
 


Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
Well spoken. And since the log transformation is nonlinear and 'compresses'
the data, it's not surprising that the fit doesn't look so nice on the
original scale even though the fit metrics tell you the model does a good job.
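
A quick sketch of that effect with Andrew's numbers: the same fitted model is scored once on the log scale and once after back-transforming to the original scale. The helper r2() is my own shorthand for 1 - RSS/TSS.

```r
dat <- data.frame(year = 1999:2006,
                  count = c(3, 5, 9, 30, 62, 154, 245, 321))
model <- lm(log(count) ~ year, data = dat)

# 1 - RSS/TSS for a set of observations and predictions
r2 <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

r2_log <- r2(log(dat$count), fitted(model))   # what summary(model) reports
r2_raw <- r2(dat$count, exp(fitted(model)))   # same fit, original scale

c(log_scale = r2_log, original_scale = r2_raw)
```

The value on the original scale comes out lower, because the large counts at the end dominate the sums of squares there.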

--- [EMAIL PROTECTED] wrote:

 On 24-Jul-07 01:09:06, Andrew Clegg wrote:
  Hi folks,
  
  I've looked through the list archives and online resources, but I
  haven't really found an answer to this -- it's pretty basic, but I'm
  (very much) not a statistician, and I just want to check that my
  solution is statistically sound.
  
  Basically, I have a data file containing two columns of data, call it
  data.tsv:
  
  year  count
  1999  3
  2000  5
  2001  9
  2002  30
  2003  62
  2004  154
  2005  245
  2006  321
  
  These look exponential to me, so what I want to do is plot these
  points on a graph with linear axes, and add an exponential curve over
  the top. I also want to give an R-squared for the fit.
  
  The way I did it was like so:
  
  
 # Read in the data, make a copy of it, and take logs
  data = read.table("data.tsv", header=TRUE)
  log.data = data
  log.data$count = log(log.data$count)
  
 # Fit a model to the logs of the data
  model = lm(log.data$count ~ year, data = log.data)
  
 # Plot the original data points on a graph
  plot(data)
  
 # Draw in the exponents of the model's output
  lines(data$year, exp(fitted(model)))
  
  
  Is this the right way to do it? log-ing the data and then exp-ing the
  results seems like a bit of a long-winded way to achieve the desired
  effect. Is the R-squared given by summary(model) a valid measurement
  of the fit of the points to an exponential curve, and should I use
  multiple R-squared or adjusted R-squared?
  
  The R-squared I get from this method (0.98 multiple) seems a little
  high going by the deviation of the last data point from the curve --
  you'll see what I mean if you try it.
 
 I just did. From the plot of log(count) against year, with the plot
 of the linear fit of log(count)~year superimposed, I see indications
 of a non-linear relationship.
 
 The departures of the data from the fit follow a rather systematic
 pattern. Initially the data increase more slowly than the fit,
 and lie below it. Then they increase faster and cross over above it.
 Then the data increase less fast than the fit, and the final data
 point is below the fit.
 
 There are not enough data to properly identify the non-linearity,
 but the overall appearance of the data plot suggests to me that
 you should be considering one of the growth curve models.
 
 Many such models start off with an increasing rate of growth,
 which then slows down, and typically levels off to an asymptote.
 The apparent large discrepancy of your final data point could
 be compatible with this kind of behaviour.
 
 At this point, knowledge of what kind of thing is represented
 by your count variable might be helpful. If, for instance,
 it is the count of the numbers of individuals of a species in
 an area, then independent knowledge of growth mechanisms may
 help to narrow down the kind of model you should be trying to fit.
 
 As to your question about "Is this the right way to do it"
 (i.e. fitting an exponential curve by doing a linear fit of the
 logarithm), generally speaking the answer is "Yes". But of course
 you need to be confident that exponential is the right curve
 to be fitting in the first place. If it's the wrong type of
 curve to be considering, then it's not the right way to do it!
 
 Hoping this helps,
 Ted.
 
 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 24-Jul-07   Time: 10:08:33
 -- XFMail --
 


[R] Fwd: Re: Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
Hope these help for alternatives to lm()? I show the use of a 2nd order
polynomial as an example to generalize a bit.

Sometimes from the subject line two separate responses can appear as reposts
when in fact they are not... (though there are identical reposts too). I
should probably figure a way around that.

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 ## Data input
 input <-
 "Year  Count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 "
 
 dat <- read.table(textConnection(input),header=TRUE)
 dat[,] <- lapply(dat,function(x) x-x[1])
   # shifting in origin; will need to add back in later
 
 ## Nonlinear least squares
 plot(dat)
 out <- nls(Count~b0*exp(b1*Year),data=dat,
            start=list(b0=1,b1=1))
 lines(dat[,1],fitted(out),col=2)
 out <- nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
            start=list(b0=0,b1=1,b2=1))
 lines(dat[,1],fitted(out),col=3)
 
 ## Optim
 f <- function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
 fitFun <- function(b,x) cbind(1,x,x^2)%*%b
 expFun <- function(b,x) b[1]*exp(b[2]*x)
 
 plot(dat)
 out <- optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
 lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
 out <- optim(c(1,1),f,.dat=dat,.fun=expFun)
 lines(dat[,1],expFun(out$par,dat[,1]),col=3)



   



Re: [R] Data Set

2007-07-23 Thread Stephen Tucker
My bad... corrections (semantic and otherwise) always appreciated. I'm still
learning too.

I also forgot the alternative of using make.names() instead of manually
assigning 'more convenient' names.

input <-
"Mydata,S-sharif,A site
1,45,34
2,66,45
3,79,56
"

> dat <- read.csv(textConnection(input),check.names=FALSE)
> dat
  Mydata S-sharif A site
1      1       45     34
2      2       66     45
3      3       79     56
> names(dat)
[1] "Mydata"   "S-sharif" "A site"
> names(dat) <- make.names(names(dat))
> names(dat)
[1] "Mydata"   "S.sharif" "A.site"

I don't know how the data set Monsoon was created originally, but it may be
convenient to reassign its names with

  names(Monsoon) <- make.names(names(Monsoon))
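
To illustrate Gavin's point that such names are valid but need quoting, here is a small sketch using the same toy data (read with the text= argument of read.csv; the object name clean is mine):

```r
dat <- read.csv(text = "Mydata,S-sharif,A site
1,45,34
2,66,45
3,79,56", check.names = FALSE)

# non-syntactic names work, but have to be quoted with backticks or [[ ]]
dat$`S-sharif`
dat[["A site"]]

# make.names() turns them into names that need no quoting
clean <- make.names(names(dat))
names(dat) <- clean
dat$S.sharif
```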



--- Gavin Simpson [EMAIL PROTECTED] wrote:

 On Sun, 2007-07-22 at 21:51 -0700, Stephen Tucker wrote:
  It turns out that "-" and " " (space) are not valid variable names. 
 
 They are valid names, the problem is that they aren't very convenient to
 use, as the OP discovered, because they need to be quoted.
 
 Note that if using something like read.csv or read.table, R will correct
 these problem variable names for you when you import the data. If you
 read this file in for example:
 
 Mydata,S-sharif,A site
 1,45,34
 2,66,45
 3,79,56
 
 using read.csv, you get easy to use names
 
  dat <- read.csv("temp.csv")
  dat
   Mydata S.sharif A.site
 1      1       45     34
 2      2       66     45
 3      3       79     56
 
 You can turn off this safety checking using the argument check.names =
 FALSE
 
 G
 
 -- 
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 
 




Re: [R] ?R: Removing white space betwen multiple plots, traditional graphics

2007-07-22 Thread Stephen Tucker
You could try
par(mar=c(0,5,0,2), mfrow = c(6,1), oma=c(5,0,2,0))
##...then, your plots...##
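
Spelled out a little more, as a sketch only: I use a plain 1:20 in place of your dates, and a loop in place of the six separate plot() calls.

```r
# top and bottom figure margins are 0 so the panels touch; the outer
# margin (oma) leaves room for the bottom axis and some space on top
op <- par(mar = c(0, 5, 0, 2), mfrow = c(6, 1), oma = c(5, 0, 2, 0))

x <- 1:20                                  # stand-in for the dates
means <- c(0.1, 0.2, 0.3, 0.5, 0.7, 0.8)   # means used in the question

for (i in seq_along(means)) {
  plot(x, rnorm(20, means[i], 0.1), type = "b", ylim = c(0, 1),
       xlab = "", ylab = "",
       xaxt = if (i < length(means)) "n" else "s")  # x axis on last panel only
}

par(op)  # restore the previous graphical parameters
```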


--- Mr Natural [EMAIL PROTECTED] wrote:

 
 I would appreciate suggestions for removing the white space between the
 graphs in a stack:
 
 par(mar=c(2,2,1,1), mfrow = c(6,1))
 mydates <- dates(1:20,origin=c(month = 1, day = 1, year = 1986))
 plot(rnorm(20,0.1,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.2,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.3,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.5,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.7,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.8,0.1)~mydates, type="b",xlab="",ylim=c(0,1) )
 
 Thanx, Don


Re: [R] tagging results of apply

2007-07-22 Thread Stephen Tucker
Dear Bruce,
In your functions, you need to use your bound variable, 'x' [not mat1] in
your anonymous function [function(x)] as the argument to cor().

For instance, you wrote:
apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
apply(mat1, 1, function(x) cor(mat1, mat2))

They should be
apply(mat1, 1, function(x) cor(x, mat2[1,]))
apply(mat1, 1, function(x) cor(x, mat2))

or
f <- function(x,y) cor(x, y)
apply(mat1, 1, f, y=mat2[1,])
apply(mat1, 1, f, y=mat2)

Then from the ?apply documentation - under section, 'Value' - the following
statement will help you predict its behavior in this case:
"If each call to FUN returns a vector of length n, then apply returns an
array of dimension c(n, dim(X)[MARGIN]) if n > 1."

[each column of your output is the output from cor(mat1[i,],mat2) in Scenario
2]. As for tagging, you can try adding dimension labels [to the object which
is passed as the 'X' argument to apply()]:

mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("row",1:5,sep=""),
                             paste("col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

> apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
            row1       row2       row3        row4        row5
[1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
[2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
[3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
[4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
[5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
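
The shape rule quoted from ?apply is easier to see when the matrices are not square. A sketch with made-up sizes (4 rows in mat1, 3 columns in mat2; the seed is arbitrary):

```r
set.seed(42)  # arbitrary seed so the sketch is repeatable
mat1 <- matrix(sample(1:500, 20), nrow = 4,
               dimnames = list(paste("row", 1:4, sep = ""), NULL))
mat2 <- matrix(sample(501:1000, 15), ncol = 3)

# each call returns 3 correlations (one per column of mat2), and there
# are 4 rows in mat1, so the result has dimension c(3, 4)
res <- apply(mat1, 1, function(x, y) cor(x, y), y = mat2)

dim(res)
colnames(res)  # labels carried over from the rows of mat1
```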

Hope this helps,

Stephen

--- Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:

 In trying to get a better understanding of vectorization I wrote the
 following code:
 
 My objective is to take two sets of time series and calculate the
 correlations for each combination of time series.
 
 mat1 <- matrix(sample(1:500, 25), ncol = 5)
 mat2 <- matrix(sample(501:1000, 25), ncol = 5)
 
 Scenario 1:
 apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
 
 Scenario 2:
 apply(mat1, 1, function(x) cor(mat1, mat2))
 
 Using scenario 1, (output below) I can see that correlations are
 calculated for just the first row of mat2 against each individual row of
 mat1.
 
 Using scenario 2, (output below) I can see that correlations are
 calculated for each row of mat2 against each individual row of mat1.  
 
 Q1: The output of scenario2 consists of 25 rows of data.  Are the first
 five rows mat1 against mat2[1,], the next five rows mat1 against
 mat2[2,], ... last five rows mat1 against mat2[5,]?
 
 Q2: I assign the output of scenario 2 to a new matrix
 
   matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
 
 However, I need a way to identify each row in matC as a pairing of
 rows from mat1 and mat2.  Is there a parameter I can add to apply to do
 this?
 
 Scenario 1:
  apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
[,1]   [,2]   [,3]   [,4]   [,5]
 [1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
 [2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
 [3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
 [4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
 [5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
 
 Scenario 2:
  apply(mat1, 1, function(x) cor(mat1, mat2))
  [,1][,2][,3][,4][,5]
  [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
  [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
  [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
  [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
  [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
  [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
  [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
  [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
  [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
 [10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
 [11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
 [12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
 [13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
 [14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
 [15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
 [16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
 [17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
 [18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
 [19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
 [20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
 [21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
 [22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
 [23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
 [24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
 [25,]  0.28340166  0.28340166  0.28340166  

Re: [R] tagging results of apply

2007-07-22 Thread Stephen Tucker
Actually if you want to tag both column and row, this might also help:

## Give dimension labels to both matrices
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("mat1row",1:5,sep=""),
                             paste("mat1col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames=list(paste("mat2row",1:5,sep=""),
                             paste("mat2col",1:5,sep="")))

> cor(mat1[1,],mat2)
        mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
[1,] -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

The column labels are there but are lost when returned from apply(), as it
says in ?apply:

"In all cases the result is coerced by as.vector to one of the basic vector
types before the dimensions are set"

> as.vector(cor(mat1[1,],mat2))
[1] -0.063135353 -0.467992672 -0.514708392 -0.797748010 -0.001457972

You lose the dimension labels in this case, so one option is to guard against
this in the following way:

> as.vector(as.data.frame(cor(mat1[1,],mat2)))
     mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
1 -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

Unfortunately, if you use 'as.data.frame()' in 'function(x)', apply will
return a list - but you can bind the rows of the output:

> f <- function(x,y) as.data.frame(cor(x,y))
> do.call(rbind, apply(mat1,1,f,y=mat2))
            mat2col1   mat2col2    mat2col3   mat2col4     mat2col5
mat1row1 -0.06313535 -0.4679927 -0.51470839 -0.7977480 -0.001457972
mat1row2 -0.28750363  0.1681777  0.14671484  0.8139768  0.039982028
mat1row3 -0.62017387 -0.6932731 -0.72263865 -0.7929604  0.427366680
mat1row4  0.06441894  0.1707946 -0.11444747 -0.8213577  0.526239013
mat1row5 -0.09849051  0.7024540 -0.01997228  0.3712480  0.439037838

The result is a data frame, not a matrix, and note that the columns/rows are
transposed in relation to the output of
  apply(mat1,1,f,y=mat2)

An alternative is to convert each row of mat1 into a list element [by
transposing it with t() and then feeding it to as.data.frame()] and then use
sapply():

> sapply(as.data.frame(t(mat1)),f,y=mat2)
         mat1row1     mat1row2   mat1row3   mat1row4   mat1row5
mat2col1 -0.06313535  -0.2875036 -0.6201739 0.06441894 -0.0984905
mat2col2 -0.4679927   0.1681777  -0.6932731 0.1707946  0.702454
mat2col3 -0.5147084   0.1467148  -0.7226387 -0.1144475 -0.01997228
mat2col4 -0.797748    0.8139768  -0.7929604 -0.8213577 0.371248
mat2col5 -0.001457972 0.03998203 0.4273667  0.526239   0.4390378
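
Since cor() accepts two matrices and correlates their columns, it may be simpler to skip apply() altogether. A sketch (the seed and the names direct and via_apply are mine):

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("mat1row", 1:5, sep = ""),
                               paste("mat1col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("mat2row", 1:5, sep = ""),
                               paste("mat2col", 1:5, sep = "")))

# rows of mat1 (the columns of t(mat1)) against columns of mat2, in one call;
# both sets of labels are kept
direct <- cor(t(mat1), mat2)
direct

# the same numbers via apply(), which keeps only the mat1 row labels
via_apply <- apply(mat1, 1, function(x, y) cor(x, y), y = mat2)
```

If the aim is rows of mat1 against rows of mat2, as in the original question, cor(t(mat1), t(mat2)) should give that table directly.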



--- Stephen Tucker [EMAIL PROTECTED] wrote:

 Dear Bruce,
 In your functions, you need to use your bound variable, 'x' [not mat1] in
 your anonymous function [function(x)] as the argument to cor().
 
 For instance, you wrote:
 apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
 apply(mat1, 1, function(x) cor(mat1, mat2))
 
 They should be
 apply(mat1, 1, function(x) cor(x, mat2[1,]))
 apply(mat1, 1, function(x) cor(x, mat2))
 
 or
 f <- function(x,y) cor(x, y)
 apply(mat1, 1, f, y=mat2[1,])
 apply(mat1, 1, f, y=mat2)
 
 Then from the ?apply documentation - under the section 'Value' - the following
 statement will help you predict its behavior in this case:
 "If each call to FUN returns a vector of length n, then apply returns an
 array of dimension c(n, dim(X)[MARGIN]) if n > 1."
 
 [each column of your output is the output from cor(mat1[i,],mat2) in
 Scenario
 2]. As for tagging, you can try adding dimension labels [to the object
 which
 is passed as the 'X' argument to apply()]:
 
 mat1 <- matrix(sample(1:500, 25), ncol = 5,
                dimnames=list(paste("row",1:5,sep=""),
                              paste("col",1:5,sep="")))
 mat2 <- matrix(sample(501:1000, 25), ncol = 5)
 
 > apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
             row1       row2       row3        row4        row5
 [1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
 [2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
 [3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
 [4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
 [5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
 
 Hope this helps,
 
 Stephen
 
 --- Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 
  In trying to get a better understanding of vectorization I wrote the
  following code:
  
  My objective is to take two sets of time series and calculate the
  correlations for each combination of time series.
  
  mat1 <- matrix(sample(1:500, 25), ncol = 5)
  mat2 <- matrix(sample(501:1000, 25), ncol = 5)
  
  Scenario 1:
  apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
  
  Scenario 2:
  apply(mat1, 1, function(x) cor(mat1, mat2))
  
  Using scenario 1, (output below) I can see that correlations are
  calculated for just the first row of mat2 against each individual row of
  mat1.
  
  Using scenario 2, (output below) I can see that correlations are
  calculated for each row of mat2 against each individual row of mat1.  
  
  Q1: The output of scenario2 consists of 25 rows of data.  Are the first

Re: [R] Data Set

2007-07-22 Thread Stephen Tucker
Could you post the output from 

str(data)

?

Perhaps that will give us a clue.

--- amna khan [EMAIL PROTECTED] wrote:

 Sir the station name S.Sharif exists in the data but still the error is
 occurring of being not found.
 Please help in this regard.
 
 
 On 7/22/07, Gavin Simpson [EMAIL PROTECTED] wrote:
 
  On Sun, 2007-07-22 at 03:25 -0700, amna khan wrote:
   Hi Sir
   I have made a data set having 23 stations of rainfall.
   when I use the attach function to approach individual stations then
   following error occurs.
  
   attach(data)
   S.Sharif   # S.Sharif is the station name which has 50 data values
   Error: object S.Sharif not found
   Now how to solve this problem.
 
  Then you don't have a column named exactly S.Sharif in your object
  data.
 
  What does str(data) and names(data) tell you about the columns in your
  data set? If looking at these doesn't help you, post the output from
  str(data) and names(data) and someone might be able to help.
 
  You should always check that R has imported the data in the way you
  expect; just because you think there is something in there called
  S.Sharif doesn't mean R sees it that way.
 
  You also seem to have included the R-Help email address twice in the To:
  header of your email - once is sufficient.
 
  G
 
   Thank You
   Regards
  
  --
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 
 
 
 
 
 -- 
 AMINA SHAHZADI
 Department of Statistics
 GC University Lahore, Pakistan.
 Email:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 


Re: [R] Write columns from within a list to a matrix?

2007-07-22 Thread Stephen Tucker
Very close... Actually it's more like

savecol2=sapply(test, function(x) x[,1])

to get the same matrix as you showed in your for-loop (did you actually want
the first or second column?).

Regarding "when I have multiple complex lists I am trying to manage":
for this, you can try mapply(), which goes something like
mapply(function(x,y) #...function body...#,
   x=list1,y=list2)
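
To make the two calls above concrete, here is a minimal runnable sketch. The seed and the second list ('shift') are made up for illustration; only 'test' follows the structure in the original question:

```r
set.seed(42)  # arbitrary seed for a reproducible sketch

# build the example list without an explicit for-loop
test <- lapply(1:3, function(i) cbind(runif(15), rnorm(15, 2)))

# one matrix column per list element: both results are 15 x 3
col1 <- sapply(test, function(x) x[, 1])
col2 <- sapply(test, function(x) x[, 2])

# mapply() walks two lists in parallel; 'shift' is a hypothetical second list
shift <- list(0, 10, 20)
shifted <- mapply(function(x, y) x[, 2] + y, x = test, y = shift)
dim(shifted)
```

sapply() simplifies to a matrix here because every list element returns a vector of the same length; if the lengths differed you would get a list back instead.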


--- [EMAIL PROTECTED] wrote:

 Hello,
 
 I think I have a mental block when it comes to working with lists.  lapply
 and sapply appear to do some magical things, but I can't seem to master
 their usage.
 
 As an example, I would like to convert a column within a list to a matrix,
 with the list element corresponding to the new matrix column.
 
 #Here is a simplified example:
 test=vector("list", 3)
 for (i in 1:3){ test[[i]]=cbind(runif(15), rnorm(15,2)) }  #create example
 list (I'm sure there is a better way to do this too).
 
 #Now, I want to get the second column back out, converting it from a list to
 a matrix.  This works, but gets confusing/inefficient when I have multiple
 complex lists I am trying to manage.
 
 savecol2=matrix(0,15,0)
 for (i in 1:3){
 savecol2=cbind(savecol2, test[[i]][,1])
 } 
 
 #Something like??:  (of course this doesn't work)
 savecol2=sapply(test, "[[", function(x) x[2,]) 
 
 Thank you!
 
 Jeff
 


Re: [R] Data Set

2007-07-22 Thread Stephen Tucker
It turns out that "-" and " " (space) are not valid in variable names. You can
get around that in two ways:

==
names(Monsoon)[2] <- "S.Sharif"
names(Monsoon)[8] <- "Islamabad.AP"
attach(Monsoon)
S.Sharif
Islamabad.AP
detach(Monsoon)

and do the same for other variable names that contain "-" or " " characters.

=
The other way is to enclose the names in backticks (``). For instance:
attach(Monsoon)
`S-Sharif`
`Islamabad AP`
detach(Monsoon)

Here is my example in which it works:
> x <- list(1:5,6:8)
> names(x) <- c("S-Sharif","Peshawar")
> str(x)
List of 2
 $ S-Sharif: int [1:5] 1 2 3 4 5
 $ Peshawar: int [1:3] 6 7 8
> attach(x)
> `S-Sharif`
[1] 1 2 3 4 5
> detach(x)
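
If there are many such names, renaming them one at a time gets tedious. A minimal sketch of two further options, using a small made-up list rather than the real Monsoon data: index by name with [[ ]] (which accepts any string and needs no attach()), or let make.names() rewrite all the names in one step:

```r
# toy stand-in for the station list; the values are made up
x <- list(1:5, 6:8, 9:10)
names(x) <- c("S-Sharif", "Islamabad AP", "D-I Khan")

# list-style access works with any name
x[["S-Sharif"]]

# make.names() replaces each invalid character with "."
names(x) <- make.names(names(x))
names(x)
x$S.Sharif
```

After the make.names() step the elements can be reached with $ or after attach() without backticks.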



--- amna khan [EMAIL PROTECTED] wrote:

 Yes Sir
  I am sending you the clue for data.
 
  str(Monsoon)
 List of 23
  $ Dir   : num [1:40] 72.4 60.7 52.1.
  $ S-Sharif  : num [1:55] 23.6 93.5 36.3  ..
  $ Peshawar  : num [1:57] 54.4 27.7 ...
  $ Kakul : num [1:54]  50.3 116.1 ...
  $ Balakot   : num [1:47] 218.2  76.5 ...
  $ Parachinar: num [1:40] 41.4 37.6 62.2...
  $ Kohat : num [1:53] 50.8 93.2 94.5 ...
  $ Islamabad AP  : num [1:48] 140.2  69.3...
  $ Murree: num [1:47] 130.0 131.3  74.4 ...
  $ Islamabad SRRC: num [1:24] 172.2  82.3 150.1   ...
  $ Mian Wali : num [1:48] 80.5 48.5 56.6 43.2  ...
  $ Jhelum: num [1:57] 111.8  82.3  53.8  94.7  ...
  $ Sialkot   : num [1:55]  62.7 126.0  90.7  ...
  $ D-I Khan  : num [1:57] 24.9 40.6 34.3  ...
  $ Faisalabad: num [1:56] 79.2 43.9 55.4 ...
  $ Lahore: num [1:60] 32.5 81.5 28.7  ...
 
 when I attach the data file and access the site S-Sharif or D-I Khan or
 Mian Wali then error messages occur.
 
 Please help in this regard.
 
 Thank You
 
 
 On 7/23/07, Stephen Tucker [EMAIL PROTECTED] wrote:
 
  Could you post the output from
 
  str(data)
 
  ?
 
  Perhaps that will give us a clue.
 
  --- amna khan [EMAIL PROTECTED] wrote:
 
   Sir the station name S.Sharif exists in the data but still the error
   is occurring of being not found.
   Please help in this regard.
  
  
   On 7/22/07, Gavin Simpson [EMAIL PROTECTED] wrote:
   
On Sun, 2007-07-22 at 03:25 -0700, amna khan wrote:
 Hi Sir
 I have made a data set having 23 stations of rainfall.
 when I use the attach function to approach individual stations then
 following error occurs.

 attach(data)
 S.Sharif   # S.Sharif is the station name which has 50 data values
 Error: object S.Sharif not found
 Now how to solve this problem.
   
Then you don't have a column named exactly S.Sharif in your object
data.
   
What does str(data) and names(data) tell you about the columns in
 your
data set? If looking at these doesn't help you, post the output from
str(data) and names(data) and someone might be able to help.
   
You should always check that R has imported the data in the way you
expect; just because you think there is something in there called
S.Sharif doesn't mean R sees it that way.
   
You also seem to have included the R-Help email address twice in the
  To:
header of your email - once is sufficient.
   
G
   
 Thank You
 Regards

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
   
   
   
  
  
   --
   AMINA SHAHZADI
   Department of Statistics
   GC University Lahore, Pakistan.
   Email:
   [EMAIL PROTECTED]
   [EMAIL PROTECTED]
   [EMAIL PROTECTED]
  
  
 
 
 
 
 
 



 
 
 
 
 -- 
 AMINA SHAHZADI
 Department of Statistics
 GC University Lahore, Pakistan.
 Email:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]

 



   



Re: [R] how to combine presence only data sets to one presence/absence table

2007-07-18 Thread Stephen Tucker
I think you can still read as a table, just use argument fill=TRUE.

Reading from Excel in general: you can save data as 'csv' or tab-delimited
file and then use read.csv or read.delim, respectively, or use one of the
packages listed in the following post (for some reason the line breaks are
messed up, but I hope you can extract the content):
http://tolstoy.newcastle.edu.au/R/e2/help/07/06/19925.html

## read in data
x <- 
read.table(textConnection(
"spl_A  spl_B   spl_C
spcs1   spcs1   spcs2
spcs2   spcs3   spcs3
spcs4   spcs5
spcs5"
),fill=TRUE,header=TRUE,na.strings="")

Then,

## 1. find unique
spcs <- sort(na.omit(unique(unlist(x))))
## 2. create matrix of zeros
mat <- matrix(0,ncol=ncol(x),nrow=length(spcs),
              dimnames=list(spcs,names(x)))
## 3. assign ones to matches
for( i in 1:ncol(mat) ) mat[match(x[,i],rownames(mat)),i] <- 1

Alternatively,
## find unique
spcs <- sort(na.omit(unique(unlist(x))))
## return the matrix you want (combine steps 2 and 3 from above)
sapply(x,function(.x,spcs)
       "names<-"(ifelse(!is.na(match(spcs,.x)),1,0),spcs),spcs)
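
The same idea can be written a little more compactly with %in%. This is a minimal sketch that starts from a list of character vectors instead of a file; which species belongs to which sample below is an assumption for illustration, since the column alignment of the short rows is not recoverable from the post:

```r
# hypothetical presence-only lists, one character vector per sample
samples <- list(spl_A = c("spcs1", "spcs2", "spcs4", "spcs5"),
                spl_B = c("spcs1", "spcs3", "spcs5"),
                spl_C = c("spcs2", "spcs3"))

# all species seen in any sample, sorted
spcs <- sort(unique(unlist(samples)))

# one column per sample: 1 if the species was observed there, else 0
pa <- sapply(samples, function(s) as.integer(spcs %in% s))
rownames(pa) <- spcs
pa
```

Because the samples are kept as a list, columns of different lengths need no padding with NA, so the na.omit() step is not required in this variant.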

Hope this helps.

ST

--- Patrick Zimmermann [EMAIL PROTECTED] wrote:

 Problem: I have a Set of samples each with a list of observed species
 (presence only).
 Data is stored in a excel spreadsheet and the columns (spl) have
 different numbers of observations (spcs).
 Now I want to organize the data in a species by sample matrix with
 presence/absence style in R.
 
 data style (in excel):
 
 spl_A spl_B   spl_C
 spcs1 spcs1   spcs2
 spcs2 spcs3   spcs3
 spcs4 spcs5
 spcs5
 
 desired style:
 
   spl_A   spl_B   spl_C
 spcs1 1   1   0
 spcs2 1   0   1
 spcs3 0   1   1
 .
 .
 .
 
 How and in which form do I import the data to R?
 (read.table() seems not to be appropriate, as data is not organized as a
 table)
 
 How can I create the species by sample matrix?
 
 Thanks for any help,
 Patrick Zimmermann
 


Re: [R] Drawing rectangles in multiple panels

2007-07-17 Thread Stephen Tucker
Thanks very much, Gabor - I hadn't considered this possibility. I always
enjoy your posts!

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Suppose ri were already defined as in the example below.
 Then panel.qrect is a bit harder to define although with
 work it's possible as shown below:
 
 rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))
 
 ri <- function(x, y, ..., rect.info) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
 }
 
 panel.qrect <- function(rect.info) {
    function(x, y, ...) {
       environment(ri) <- environment() ###
       ri(x, y, ..., rect.info = rect.info)
    }
 }
 
 xyplot(runif(30) ~ runif(30) | gl(3, 10),
        panel = panel.qrect(rectInfo))
 
 
 
 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  This is very interesting - but I'm not entirely clear on your last
  statement about how existing functions can cause problems with the scoping
  that createWrapper() avoids... (but thanks for the tip).
 
 
  --- Gabor Grothendieck [EMAIL PROTECTED] wrote:
 
   Your approach of using closures is cleaner than that
   given below but just for comparison in:
  
   http://tolstoy.newcastle.edu.au/R/devel/06/03/4476.html
  
   there is a createWrapper function which creates a new function based
   on the function passed as its first argument by using the components
   of the list passed as its second argument to overwrite its formal
   arguments.  For example,
  
   createWrapper <- function(FUN, Params) {
      as.function(c(replace(formals(FUN), names(Params), Params),
                    body(FUN)))
   }
  
   library(lattice)
  
   rectInfo <-
      list(matrix(runif(4), 2, 2),
           matrix(runif(4), 2, 2),
           matrix(runif(4), 2, 2))
  
  
   panel.qrect <- function(x, y, ..., rect.info) {
      ri <- rect.info[[packet.number()]]
      panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                 col = "grey86", border = NA)
      panel.xyplot(x, y, ...)
   }
  
   xyplot(runif(30) ~ runif(30) | gl(3, 10),
          panel = createWrapper(panel.qrect, list(rect.info = rectInfo)))
  
   The createWrapper approach does have an advantage in the situation
   where the function analogous to panel.qrect is existing since using
   scoping then involves manipulation of environments in the closure
   approach.
  
   On 7/11/07, Stephen Tucker [EMAIL PROTECTED] wrote:
    In the Trellis approach, another way (I like) to deal with multiple
    pieces of external data sources is to 'attach' them to panel functions
    through lexical closures. For instance...
   
    rectInfo <-
       list(matrix(runif(4), 2, 2),
            matrix(runif(4), 2, 2),
            matrix(runif(4), 2, 2))
   
    panel.qrect <- function(rect.info) {
      function(x, y, ...) {
        ri <- rect.info[[packet.number()]]
        panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                   col = "grey86", border = NA)
        panel.xyplot(x, y, ...)
      }
    }
   
    xyplot(runif(30) ~ runif(30) | gl(3, 10),
           panel = panel.qrect(rectInfo))
   
    ...which may or may not be more convenient than passing rectInfo (and
    perhaps other objects if desired) explicitly as an argument to xyplot().
   
   
--- Deepayan Sarkar [EMAIL PROTECTED] wrote:
   
 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
   is what I need when passing *external* information into the panel
   function, for example, when I wish to add results from a fit done
   external to the trellis call. Fits[subscripts] gives me the fits (or
   whatever) I want to plot for each panel. It is not clear to me how the
   panel layout information from panel.number(), etc. would be helpful
   here instead. Am I correct? -- or is there a smarter way to do this
   that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.

 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.

 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can

Re: [R] Drawing rectangles in multiple panels

2007-07-17 Thread Stephen Tucker
Hi Deepayan, that's very hard-core... for the atmospheric science
applications (which is what I do) that I've encountered, (time-series) data
sets are often pre-aggregated before distribution (to 'average out'
instrument noise) so I haven't had the need for such requirements thus far...
but very good to know (and cool demonstrations btw). Thanks!

Stephen

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
 
  I wonder what kind of objects? Are there large advantages for allowing
  lattice functions to operate on objects other than data frames - I
  couldn't find any screenshots of flowViz but I imagine those objects
  would probably be list of arrays and such? I tend to think of mapply()
  [and more recently melt()], etc. could always be applied beforehand,
  but I suppose that would undermine the case for having generic
  functions to support the rich collection of object classes in R...
 
 There's a copy of a presentation at
 

http://www.ficcs.org/meetings/ficcs3/presentations/DeepayanSarkar-flowviz.pdf
 
 and a (largish - 37M) vignette linked from
 
 http://bioconductor.org/packages/2.1/bioc/html/flowViz.html
 
 Neither of these really talk about the challenge posed by the size of
 the data. The data structure, as with most microarray-type
 experiments, is like a data frame, except that the response for every
 experimental unit is itself a large matrix. If we represented the GvHD
 data set (the one used in the examples) as a long format data frame
 that lattice would understand, it would have 585644 rows and 12
 columns (8 measurements that are different for each row, and 4
 phenotypic variables that are the same for all rows coming from a
 single sample). And this is for a smallish subset of the actual
 experiment.
 
 In practice, the data are stored in an environment to prevent
 unnecessary copying, and panel functions only access one data matrix
 at a time.
 
 -Deepayan
 
 
  --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
 
   On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
 A question/comment: I have usually found that the subscripts
 argument
   is
 what I need when passing *external* information into the panel
   function, for
 example, when I wish to add results from a fit done external to the
   trellis
 call. Fits[subscripts] gives me the fits (or whatever) I want to
 plot
   for
 each panel. It is not clear to me how the panel layout information
 from
 panel.number(), etc. would be helpful here instead. Am I correct?
 -- or
   is
 there a smarter way to do this that I've missed?
   
This is one of things that I think ggplot does better - it's much
easier to plot multiple data sources.  I don't have many examples of
this yet, but the final example on
http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
  
   That's probably true. The Trellis approach is to define a plot by
   data source + type of plot, whereas the ggplot approach (if I
   understand correctly) is to create a specification for the display
   (incrementally?) and then render it. Since the specification can be
   very general, the approach is very flexible. The downside is that you
   need to learn the language.
  
   On a philosophical note, I think the apparent limitations of Trellis
   in some (not all) cases is just due to the artificial importance given
   to data frames as the one true container for data. Now that we have
   proper multiple dispatch in S4, we can write methods that behave like
   traditional Trellis calls but work with more complex data structures.
   We have tried this in one bioconductor package (flowViz) with
   encouraging results.
  
   -Deepayan
 



   


Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
My apologies, didn't see the boundary constraints. Try this one...

f <- function(x)
  (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2

optim(par=rep(0,3),f,lower=rep(0,3),upper=rep(1,3),method="L-BFGS-B")

and check ?optim

--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 I'm sorry the function is 
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 Have you any suggests.
 
 Thanks,
 Massimiliano
 
 
 
 What is radq?
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  Dear all,
  I need a suggest to obtain the max of this function:
  
  Max x1*0.021986+x2*0.000964+x3*0.02913
  
  with these conditions:
  
  x1+x2+x3=1;
 
 radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
  x1>=0;
  x1<=1;
  x2>=0;
  x2<=1;
  x3>=0;
  x3<=1;
  
  Any suggests ?
  
  Thanks in advanced,
  Massimiliano
  


Re: [R] scaling of different data sets in ggplot

2007-07-17 Thread Stephen Tucker
Hi Hadley,

That was my initial thought as well, that having different scales on the
same figure might obfuscate the structure and meaning of the data. But in
some instances (e.g., publications where page limits are imposed) I think
it's desirable to condense a lot of information onto a single plot (for
instance, if two series show the same trend even though they are not in the
same units), which means having more than one scale in the same plotting
window. I haven't checked what Tufte, Cleveland, and Wilkinson have to say
about this, but in practice I don't think it's all that uncommon.

I agree that log(z) is an operation on the data set, but representing it
graphically can be accomplished either through plotting log(z), or plotting z
on a log scale... in either case having an extra axis showing y and z [and
not log(z)] would be nice I would think.

I haven't tried it in lattice but in the traditional graphics system it is
quite straightforward. You claim that ggplot 'tries to take the good parts
of base and lattice graphics and none of the bad parts' - just trying to
hold you to your word :).

Seriously though, I think the idea of ggplot (and implementation) is really
great. Currently R has many graphics systems, of which I know traditional and
lattice - and both are really fantastic (I plan to learn grid sometime in the
future) and I am fanatical about them. But for students and colleagues who
have less programming experience, I think the learning curve for lattice (to
gain proficiency, that is) may be a tad steep... I've been playing around
with ggplot to see if it would be a gentler introduction to conditioning
plots and analysis of multivariate datasets - which, in a way, I think it
could be - so I'm currently trying to test the limits of its flexibility.
It's true that there are some plotting concepts that are generally
discouraged, but it seems to me that the ultimate discretion should lie with
the user, and the plotting system should give him/her the freedom to choose
[to make a bad plot]. Even Lee Wilkinson says in his book that his grammar
will allow someone to make meaningless plots. One example that comes to mind
is the pie chart - I know they are heavily discouraged, but in some
communities, it's commonly used and therefore expected; to communicate to
that particular audience it's sometimes necessary to speak their language...

So, hope you don't mind, but I may ask some more 'can ggplot do this'
questions in the future. But keep up the good work,

Stephen


--- hadley wickham [EMAIL PROTECTED] wrote:

 Hi Stephen,
 
 You can't do that in ggplot (have two different scales) because I
 think it's generally a really bad idea.  The whole point of plotting
 the data is so that you can use your visual abilities to gain insight
 into the data.  When you have two different scales the positions of
 the two groups are essentially arbitrary - the data only have x values
 in common, not y values.  You essentially have two almost unrelated
 graphs plotted on top of each other.
 
 On the other hand, for this data, I think it would be reasonable to
 plot log(z) and y on the same scale - the data is transformed not the
 scales.
 
 Hadley
 
 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  Dear list (but probably mostly Hadley):
 
  In ggplot, operations to modify 'guides' are accessed through grid
  objects, but I did not find mention of creating new guides or possibly
  removing them altogether using ggplot functions. I wonder if this is
  something I need to learn grid to learn more about (which I hope to do
  eventually).
 
  Also, ggplot()+geom_object() [where 'object' can be point, line, etc.]
  or layer() contains specification for the data, mappings and
  geoms/stats - but the geoms/stats can be scale-dependent [for
  instance, log]. so I wonder how different scalings can be applied to
  different data sets.
 
  Below is an example that requires both:
 
  x <- runif(100)
  y <- exp(x^2)
  z <- x^2+rnorm(100,0,0.02)
 
  par(mar=c(5,4,2,4)+0.1)
  plot(x,y,log="y")
  lines(lowess(x,y,f=1/3))
  par(new=TRUE)
  plot(x,z,col=2,pch=3,yaxt="n",ylab="")
  lines(lowess(x,z,f=1/3),col=2)
  axis(4,col=2,col.axis=2)
  mtext("z",4,line=3,col=2)
 
  In ggplot:
 
  ## data specification
  ggplot(data=data.frame(x,y,z)) +
 
    ## first set of points
    geom_point(mapping=aes(x=x,y=y)) +
    ## scale_y_log() +
 
    ## second set of points
    geom_point(mapping=aes(x=x,y=z),pch=3) +
    ## layer(mapping=aes(x=x,y=z),stat="smooth",method="loess") +
    ## scale_y_continuous()
 
  scale_y_log() and scale_y_continuous() appear to apply to both mappings
  at once, and I can't figure out how to associate them with the intended
  ones (I expect this will be a desire for size and color scales as well).
 
  Of course, I can always try to fool the system by (1) applying the
  scaling a priori to create a new variable, (2) plotting points from the
  new variable, and (3) creating a new axis with custom labels. Which

Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
f <- function(x)
  (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2

optim(c(0,0,0),f)

see ?optim for details on arguments, options, etc.

--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 I'm sorry the function is 
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 Have you any suggests.
 
 Thanks,
 Massimiliano
 
 
 
 What is radq?
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  Dear all,
  I need a suggest to obtain the max of this function:
  
  Max x1*0.021986+x2*0.000964+x3*0.02913
  
  with these conditions:
  
  x1+x2+x3=1;
 
 radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
  x1>=0;
  x1<=1;
  x2>=0;
  x2<=1;
  x3>=0;
  x3<=1;
  
  Any suggests ?
  
  Thanks in advanced,
  Massimiliano
  


Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
My apologies, I read the post over too quickly (even the second time). 

It's been a while since I've played around with anything other than box
constraints, but this one is conducive to a brute-force approach (employing
Berwin's suggestions). The pseudo-code would look something like this:

delta <- 1e-3   # grid space of x3, the smaller the better
oldvalue <- -Inf # some initial value for objective function
for( x3 in seq(0,1,by=delta) ) {
  ## calculate x1,x2 as per Berwin's response
  ## if all constraints are met, feasible <- TRUE
  ## else feasible <- FALSE
  if( !feasible ) next # if not feasible, go to next x3 value
  ## newvalue <- value of objective function with x1,x2,x3
  if( newvalue > oldvalue ) {
    oldvalue <- newvalue
    max.x1 <- x1; max.x2 <- x2; max.x3 <- x3
  }
}

You should end up with the desired values of max.x1, max.x2, max.x3. Hope
this helps,

ST
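
A runnable variant of the brute-force idea, for what it's worth. It is a sketch under stated assumptions, not the method from Berwin's response: it grids over x1 and x2 and sets x3 = 1 - x1 - x2, and it reads the sqrt() condition as an upper bound (<= 0.04), which is a guess because the archived post has lost the comparison operator. The names ret, vol and cap are made up for this sketch.

```r
ret <- c(0.021986, 0.000964, 0.02913)   # objective coefficients from the post
vol <- c(0.114434, 0.043966, 0.100031)  # coefficients inside the sqrt()
cap <- 0.04                             # ASSUMED to be an upper bound

delta <- 0.01                           # grid spacing, the smaller the better
grid <- expand.grid(x1 = seq(0, 1, by = delta),
                    x2 = seq(0, 1, by = delta))
grid$x3 <- 1 - grid$x1 - grid$x2        # enforces x1 + x2 + x3 = 1

w     <- as.matrix(grid)                # one candidate (x1, x2, x3) per row
risk  <- sqrt(drop(w^2 %*% vol^2))      # the sqrt() term for every candidate
value <- drop(w %*% ret)                # objective for every candidate

# the small tolerance guards against floating-point noise in x3
feasible <- grid$x3 >= -1e-12 & risk <= cap
best <- grid[feasible, ][which.max(value[feasible]), ]
best
```

The answer is only as fine as the grid, so it may be worth re-running with a smaller delta around the reported point. If the sqrt() condition is meant as an equality instead, the feasibility test would need a tolerance band around 0.04.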



--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 Thanks for your suggests, but I need to obtain the MAX of
 this function:
 
 Max x1*0.021986+x2*0.000964+x3*0.02913
 
 with these conditions:
 
 x1+x2+x3=1;
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 x1>=0;
 x2>=0;
 x3>=0;
 
 
 Thanks and again Thanks,
 Massimiliano
 
 
 
 My apologies, didn't see the boundary constraints. Try this
 one...
 
 f <- function(x)
   (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2
 
 optim(par=rep(0,3),f,lower=rep(0,3),upper=rep(1,3),method="L-BFGS-B")
 
 and check ?optim
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  I'm sorry the function is
 
  sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
  Have you any suggests.
 
  Thanks,
  Massimiliano
 
 
 
  What is radq?
 
  --- massimiliano.talarico
  [EMAIL PROTECTED] wrote:
 
   Dear all,
   I need a suggest to obtain the max of this function:
  
   Max x1*0.021986+x2*0.000964+x3*0.02913
  
   with these conditions:
  
   x1+x2+x3=1;
  
  radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)
 =0.04;
   x1>=0;
   x1<=1;
   x2>=0;
   x2<=1;
   x3>=0;
   x3<=1;
  
   Any suggests ?
  
   Thanks in advanced,
   Massimiliano
  
 
 
 
 
 
 
 
 
 
 
 



   




Re: [R] Alternative to xyplot()?

2007-07-17 Thread Stephen Tucker
What's wrong with lattice? Here's an alternative:

library(ggplot2)
ggplot(data=data.frame(x,y,grps=factor(grps)),
   mapping=aes(x=x,y=y,colour=grps)) + # define data
  geom_point() + # points
  geom_smooth(method=lm) # regression line



--- Ben Bolker [EMAIL PROTECTED] wrote:

 Manuel Morales Manuel.A.Morales at williams.edu writes:
 
  
  Sorry. I was thinking of the groups functionality, as illustrated
  below:
  
  grps<-rep(c(1:3),10)
  x<-rep(c(1:10),3)
  y<-x+grps+rnorm(30)
  library(lattice)
  xyplot(y~x,group=grps, type=c("r","p"))
 
   The points (type p) are easy, the regression lines (type r) are a
 little
 harder. How about:
 
 
 plot(y~x,col=grps)
 invisible(mapply(function(z,col) {abline(lm(y~x,data=z),col=col)},
   split(data.frame(x,y),grps),1:3))
 
   cheers
 Ben Bolker
 


[R] polymorphic functions in ggplot? (WAS Re: Drawing rectangles in multiple panels)

2007-07-14 Thread Stephen Tucker
Regarding your earlier statement,

"I tend to think in very data centric approach, where you first generate the
data (in a data frame) and then you plot it. There is very little data
creation/modification during the plotting itself..."

Is the data generation and plotting truly separate and sequential? I'm
not entirely clear on this point - as statistical
transformations/operations return objects that require new variables
to be created - and this may be rooted in semantics (the verbal one,
not the computational) of the grammar of graphics - in the online book
draft of 'ggplot' it says (p. 37)

"The explicit transformation stage was dropped because variable
transformations are already so easy in R: they do not need to be part
of the grammar."

In my understanding of how transformations are defined, they include
statistical ones - which perhaps I'm not truly getting, because
transformations are defined (by L. Wilkinson) as a mapping of elements
of one set to elements of the same set, and yet a function like
median() will accept a list (of values) and return a single
value... In any case, maybe there is a distinction between a
statistical 'transformation' and a statistical 'operation' that I've
missed, but statistical 'transformations' are included in ggplot's
stat functions. L. Wilkinson also seems to include an explicit TRANS
specification at times (for example, in the case of the boxplot on
p. 60) and at other times to nest it into the ELEMENT specification (for
example, the histogram on p. 47).

In any case, I interpret that the following progression is achieved
through 'data operations' and 'application of algebra' in the language
of L. Wilkinson and through I/O, merge, reshape, and other functions
in R:

source object -> variables -> varset

A statistic might then be computed on the varset, which will return
another source object (true in R as well: e.g., class 'histogram' or
'lm') from which variables can again be extracted, varsets
constructed, etc. to yield a list of tuples to be associated with
geometrical and aesthetic attributes. Indeed, in the bootstrap
example, L. Wilkinson begins by extracting variables from a bootstrap
function on another variable that has not explicitly been created from
source (dataset).

So it's not clear to me that the data creation step is necessarily
distinct from the plotting, as it is more (but not completely) so in
the traditional graphics system:

## DATA specification
variable <- rnorm(100)
## TRANS specification
statsObj <- hist(variable,nclass=20,plot=FALSE)
## Transformed data is plotted (variables extracted implicitly and
## associated with default geometry/aesthetic mappings)
plot(statsObj)

Below is an analogous plot in ggplot, where the creation of the
summary object occurs as part of the grammar:

ggplot(data=data.frame(variable),mapping=aes(x=variable)) +
stat_bin(breaks=statsObj$breaks)

Since not all statistical transformations/operations are handled by
ggplot, it seems that working with non-data-frame objects (for
example, of class 'nls' or 'rlm') requires data operations (p. 7) (to
extract fitted values, etc.). Of course, R provides these facilities,
but the plotting functions in the traditional graphics system
accommodate a number of object classes through polymorphic
functions. I wonder if in a similar way for ggplot, stat_bin could
accept objects of 'histogram' class [hist() allows the user to specify
'nclass', which will then compute the breaks], or stat_smooth could
accept 'rlm' objects. Of course, in the case of an 'lm' object, plot()
additionally gives diagnostic (residual and Q-Q) plots but that type of
response does not seem to fit in with the expected behavior of ggplot
functions...
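
As a small illustration of the 'data operations' step I mean, the sketch
below pulls variables out of a 'histogram' object into a data frame, which
is the form a ggplot layer expects. The names binned, mid and count are my
own, chosen only for this example:

```r
set.seed(1)
variable <- rnorm(100)

## the statistic returns a new source object, not a data frame
statsObj <- hist(variable, nclass = 20, plot = FALSE)
class(statsObj)

## data operation: extract variables from that object into a data frame
binned <- data.frame(mid = statsObj$mids, count = statsObj$counts)
head(binned)
```

A polymorphic plotting function such as plot() does this extraction
internally by dispatching on the class of statsObj.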


--- hadley wickham [EMAIL PROTECTED] wrote:

 On 7/12/07, Deepayan Sarkar [EMAIL PROTECTED] wrote:
  On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
A question/comment: I have usually found that the subscripts argument
 is
what I need when passing *external* information into the panel
 function, for
example, when I wish to add results from a fit done external to the
 trellis
call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
each panel. It is not clear to me how the panel layout information
 from
panel.number(), etc. would be helpful here instead. Am I correct? --
 or is
there a smarter way to do this that I've missed?
  
   This is one of things that I think ggplot does better - it's much
   easier to plot multiple data sources.  I don't have many examples of
   this yet, but the final example on
   http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
  That's probably true. The Trellis approach is to define a plot by
  data source + type of plot, whereas the ggplot approach (if I
  understand correctly) is to create a specification for the display
  (incrementally?) and then render it. Since the specification can be
  very general, the approach is very flexible. The downside is that you
  need to 

[R] scaling of different data sets in ggplot

2007-07-14 Thread Stephen Tucker
Dear list (but probably mostly Hadley):

In ggplot, operations to modify 'guides' are accessed through grid
objects, but I did not find mention of creating new guides or possibly
removing them altogether using ggplot functions. I wonder if this is
something I need to learn grid to learn more about (which I hope to do
eventually).

Also, ggplot()+geom_object() [where 'object' can be point, line, etc.]
or layer() contains specification for the data, mappings and
geoms/stats - but the geoms/stats can be scale-dependent [for
instance, log]. so I wonder how different scalings can be applied to
different data sets.

Below is an example that requires both:

x <- runif(100)
y <- exp(x^2)
z <- x^2+rnorm(100,0,0.02)

par(mar=c(5,4,2,4)+0.1)
plot(x,y,log="y")
lines(lowess(x,y,f=1/3))
par(new=TRUE)
plot(x,z,col=2,pch=3,yaxt="n",ylab="")
lines(lowess(x,z,f=1/3),col=2)
axis(4,col=2,col.axis=2)
mtext("z",4,line=3,col=2)

In ggplot:

## data specification
ggplot(data=data.frame(x,y,z)) +

  ## first set of points
  geom_point(mapping=aes(x=x,y=y)) +
  scale_y_log() +

  ## second set of points
  geom_point(mapping=aes(x=x,y=z),pch=3) +
  layer(mapping=aes(x=x,y=z),stat="smooth",method="loess") +
  scale_y_continuous()

scale_y_log() and scale_y_continuous() appear to apply to both mappings at
once, and I can't figure out how to associate them with the intended ones (I
expect this will be a desire for size and color scales as well).

Of course, I can always try to fool the system by (1) applying the scaling a
priori to create a new variable, (2) plotting points from the new variable,
and (3) creating a new axis with custom labels. Which then brings me back to
...how to add new guides? :)
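
A rough sketch of what I mean by steps (1)-(3), written in traditional
graphics rather than ggplot. The helper to.logy and the other names are
made up for illustration:

```r
set.seed(1)
x <- runif(100); y <- exp(x^2); z <- x^2 + rnorm(100, 0, 0.02)

## (1) apply the scaling a priori
logy <- log10(y)

## (2) map z linearly onto the range of logy so both fit in one panel
to.logy <- function(v)
  (v - min(z)) / diff(range(z)) * diff(range(logy)) + min(logy)
z.mapped <- to.logy(z)

## (3) ticks for a second axis, labelled in the original units of z
z.ticks <- pretty(z)
z.ticks <- z.ticks[z.ticks >= min(z) & z.ticks <= max(z)]

par(mar = c(5, 4, 2, 4) + 0.1)
plot(x, logy, ylab = "log10(y)")
points(x, z.mapped, col = 2, pch = 3)
axis(4, at = to.logy(z.ticks), labels = z.ticks, col = 2, col.axis = 2)
mtext("z", 4, line = 3, col = 2)
```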

Thanks,

Stephen



  




Re: [R] How to read many files at one time?

2007-07-14 Thread Stephen Tucker
This should do it:

allData <- sapply(paste("Sim",1:20,sep=""),
  function(.x) read.table(paste(.x,"txt",sep=".")),
  simplify=FALSE)

see ?read.table for specification of delimiters, etc.

allData will be a list, and you can access the contents of each file by
any of the following commands:
allData[[2]]
allData[["Sim2"]]
allData$Sim2
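
If you want to try this without your own files, here is a self-contained
sketch. It writes three small made-up files to a temporary directory first;
simDir, the column names and header=TRUE are assumptions for the example,
not something from your data:

```r
## write three small made-up files so the example is self-contained
simDir <- tempfile("sims")
dir.create(simDir)
for (i in 1:3)
  write.table(data.frame(id = i, value = i * 10),
              file = file.path(simDir, paste("Sim", i, ".txt", sep = "")),
              row.names = FALSE)

## read them all into a named list, one element per file
simNames <- paste("Sim", 1:3, sep = "")
allData <- sapply(simNames,
                  function(.x)
                    read.table(file.path(simDir, paste(.x, "txt", sep = ".")),
                               header = TRUE),
                  simplify = FALSE)
names(allData)
allData[["Sim2"]]
```

Because the input to sapply() is a character vector, the list elements are
named after it, which is what makes allData$Sim2 work.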


--- Zhang Jian [EMAIL PROTECTED] wrote:

 I want to load many files in the R. The names of the files are Sim1.txt,
 
 Sim2.txt, Sim3.txt, Sim4.txt, Sim5.txt and so on.
 Can I read them at one time? What should I do? I can give the same names in
 R.
 Thanks.
 
 For example:
 > tst=paste("Sim",1:20,".txt",sep="") # the file names
 > tst
  [1] "Sim1.txt"  "Sim2.txt"  "Sim3.txt"  "Sim4.txt"  "Sim5.txt"  "Sim6.txt"
  [7] "Sim7.txt"  "Sim8.txt"  "Sim9.txt"  "Sim10.txt" "Sim11.txt" "Sim12.txt"
 [13] "Sim13.txt" "Sim14.txt" "Sim15.txt" "Sim16.txt" "Sim17.txt" "Sim18.txt"
 [19] "Sim19.txt" "Sim20.txt"
 
 > data.name=paste("Sim",1:20,sep="") # the file names in R
 > data.name
  [1] "Sim1"  "Sim2"  "Sim3"  "Sim4"  "Sim5"  "Sim6"  "Sim7"  "Sim8"  "Sim9"
 [10] "Sim10" "Sim11" "Sim12" "Sim13" "Sim14" "Sim15" "Sim16" "Sim17" "Sim18"
 [19] "Sim19" "Sim20"
 
   [[alternative HTML version deleted]]
 


Re: [R] Drawing rectangles in multiple panels

2007-07-14 Thread Stephen Tucker

I wonder what kind of objects? Are there large advantages to allowing
lattice functions to operate on objects other than data frames? I
couldn't find any screenshots of flowViz, but I imagine those objects
would probably be lists of arrays and such? I tend to think that mapply()
[and more recently melt()], etc. could always be applied beforehand,
but I suppose that would undermine the case for having generic
functions to support the rich collection of object classes in R...


--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 


Re: [R] Drawing rectangles in multiple panels

2007-07-14 Thread Stephen Tucker
This is very interesting - but I'm not entirely clear on your last
statement, about how existing functions can cause problems with scoping
that createWrapper() avoids... (but thanks for the tip).
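
My tentative reading, which may well be wrong, sketched without lattice.
The functions existing and existing2 and the variable k are made up for
illustration; only createWrapper() comes from your post:

```r
createWrapper <- function(FUN, Params) {
  as.function(c(replace(formals(FUN), names(Params), Params), body(FUN)))
}

## an existing function that takes the external data as a formal argument:
## createWrapper() just fills in that argument
existing <- function(x, k) x * k
wrapped <- createWrapper(existing, list(k = 10))
wrapped(1:3)

## an existing function that finds 'k' through scoping instead:
## attaching data after the fact means giving it a new environment
existing2 <- function(x) x * k
environment(existing2) <- list2env(list(k = 10), parent = globalenv())
existing2(1:3)
```

In the second case the function was not written inside a generator, so
there is no closure to exploit and the environment has to be set by hand.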


--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Your approach of using closures is cleaner than that
 given below but just for comparison in:
 
 http://tolstoy.newcastle.edu.au/R/devel/06/03/4476.html
 
 there is a createWrapper function which creates a new function based
 on the function passed as its first argument by using the components
 of the list passed as its second argument to overwrite its formal
 arguments.  For example,
 
 createWrapper <- function(FUN, Params) {
    as.function(c(replace(formals(FUN), names(Params), Params), body(FUN)))
 }
 
 library(lattice)
 
 rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))
 
 
 panel.qrect <- function(x, y, ..., rect.info) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
 }
 
 xyplot(runif(30) ~ runif(30) | gl(3, 10),
        panel = createWrapper(panel.qrect, list(rect.info = rectInfo)))
 
 The createWrapper approach does have an advantage in the situation
 where the function analogous to panel.qrect is existing since using
 scoping then involves manipulation of environments in the closure
 approach.
 
 On 7/11/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  In the Trellis approach, another way (I like) to deal with multiple
 pieces of
  external data sources is to 'attach' them to panel functions through
 lexical
  closures. For instance...
 
  rectInfo <-
     list(matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2))
 
  panel.qrect <- function(rect.info) {
    function(x, y, ...) {
      ri <- rect.info[[packet.number()]]
      panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                 col = "grey86", border = NA)
      panel.xyplot(x, y, ...)
    }
  }
 
  xyplot(runif(30) ~ runif(30) | gl(3, 10),
         panel = panel.qrect(rectInfo))
 
  ...which may or may not be more convenient than passing rectInfo (and
 perhaps
  other objects if desired) explicitly as an argument to xyplot().
 
 
  --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
 
   On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
 A question/comment: I have usually found that the subscripts
 argument
   is
 what I need when passing *external* information into the panel
   function, for
 example, when I wish to add results from a fit done external to the
   trellis
 call. Fits[subscripts] gives me the fits (or whatever) I want to
 plot
   for
 each panel. It is not clear to me how the panel layout information
 from
 panel.number(), etc. would be helpful here instead. Am I correct?
 -- or
   is
 there a smarter way to do this that I've missed?
   
This is one of things that I think ggplot does better - it's much
easier to plot multiple data sources.  I don't have many examples of
this yet, but the final example on
http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
  
   That's probably true. The Trellis approach is to define a plot by
   data source + type of plot, whereas the ggplot approach (if I
   understand correctly) is to create a specification for the display
   (incrementally?) and then render it. Since the specification can be
   very general, the approach is very flexible. The downside is that you
   need to learn the language.
  
   On a philosophical note, I think the apparent limitations of Trellis
   in some (not all) cases is just due to the artificial importance given
   to data frames as the one true container for data. Now that we have
   proper multiple dispatch in S4, we can write methods that behave like
   traditional Trellis calls but work with more complex data structures.
   We have tried this in one bioconductor package (flowViz) with
   encouraging results.
  
   -Deepayan
  
 
 



   



Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
Not that Trellis/lattice was entirely easy to learn at first. :)

I've been playing around with ggplot2 and there is a plot()-like wrapper for
building a quick plot [incidentally, called qplot()], but otherwise it's my
understanding that you superpose elements (incrementally) to build up to the
graph you want. Here is the same plot in ggplot2:

rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))

library(ggplot2)
ggopt(grid.fill = "white") # just my preference
## original plot of points
p <-
qplot(x,y,data=data.frame(x=runif(30),y=runif(30),f=gl(3,30)),facets=f~.)
# print(p)

## external data (rectangles) - in coordinates for geom_polygon
x <- do.call(rbind,
             mapply(function(.r,.f)
                    data.frame(x=.r[c(1,1,2,2),1],y=.r[c(1,2,2,1),2],f=.f),
                    .r=rectInfo,.f=seq(along=rectInfo),SIMPLIFY=FALSE))
## add rectangle to original plot of points
p+layer(geom="polygon",data=x,mapping=aes(x=x,y=y),facets=f~.)
# will print the graphics on my windows() device

Though lattice does seem to emphasize the 'chart type' approach to graphing,
in a way I see that it provides a similar flexibility - just that the
specifications for each element are contained in functions and objects that
are ultimately invoked by a high-level/higher-order function, instead of
being combined in the linear fashion of ggplot2.
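
The only non-obvious step above is turning the list of matrices into one
long data frame for the polygon layer. Here is that step on its own in
base R, as a sketch; the intermediate names corners and long are mine:

```r
set.seed(1)
rectInfo <- list(matrix(runif(4), 2, 2),
                 matrix(runif(4), 2, 2),
                 matrix(runif(4), 2, 2))

## one small data frame of four corners per matrix, tagged with its facet
corners <- mapply(function(.r, .f)
                    data.frame(x = .r[c(1, 1, 2, 2), 1],
                               y = .r[c(1, 2, 2, 1), 2],
                               f = .f),
                  .r = rectInfo, .f = seq_along(rectInfo),
                  SIMPLIFY = FALSE)

## stack them into a single long data frame
long <- do.call(rbind, corners)
str(long)
```

SIMPLIFY=FALSE matters here: without it mapply() would try to collapse the
data frames into a matrix of list columns instead of returning a plain list.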

ST

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 
 



  




Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
In the Trellis approach, another way (I like) to deal with multiple pieces of
external data sources is to 'attach' them to panel functions through lexical
closures. For instance...

rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))

panel.qrect <- function(rect.info) {
  function(x, y, ...) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
  }
}

xyplot(runif(30) ~ runif(30) | gl(3, 10),
       panel = panel.qrect(rectInfo))

...which may or may not be more convenient than passing rectInfo (and perhaps
other objects if desired) explicitly as an argument to xyplot().
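
The same closure mechanism can be seen without lattice. In this sketch
make.summary, limits and the 'panel' argument are hypothetical stand-ins
for panel.qrect, rectInfo and packet.number():

```r
make.summary <- function(info) {
  force(info)   # evaluate now; arguments are otherwise evaluated lazily
  function(x, panel) {
    ri <- info[[panel]]   # 'info' is found in the enclosing environment
    c(mean = mean(x), lower = ri[1], upper = ri[2])
  }
}

limits <- list(c(0, 1), c(10, 20), c(-5, 5))
summary.fun <- make.summary(limits)
rm(limits)    # the closure keeps its own copy of the data

res <- summary.fun(c(1, 2, 3), panel = 2)
res
```

The force() call is the one thing to watch: without it, removing or
changing limits before the first call would affect what the closure sees.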


--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 


Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
Regarding this, I meant to imply that lattice was similarly flexible in the
sense of handling multiple data sets [IMHO]; in regard to other aspects of
the 'grammar of graphics' I have no qualifications to justify comment. But
the idea and intuitiveness of graph construction in ggplot2 is very appealing
- in an hour I picked up enough to do quite a bit, just by going through
examples in the author's book http://had.co.nz/ggplot2/. It will be
interesting to see how this package is received by the community.

Stephen

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 Not that Trellis/lattice was entirely easy to learn at first. :)
 
 I've been playing around with ggplot2 and there is a plot()-like wrapper
 for
 building a quick plot [incidentally, called qplot()], but otherwise it's my
 understanding that you superpose elements (incrementally) to build up to
 the
 graph you want. Here is the same plot in ggplot2:
 
 rectInfo <-
     list(matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2))
 
 library(ggplot2)
 ggopt(grid.fill = "white") # just my preference
 ## original plot of points
 p <-
 qplot(x,y,data=data.frame(x=runif(30),y=runif(30),f=gl(3,30)),facets=f~.)
 # print(p)
 
 ## external data (rectangles) - in coordinates for geom_polygon
 x <- do.call(rbind,
              mapply(function(.r,.f)
                     data.frame(x=.r[c(1,1,2,2),1],y=.r[c(1,2,2,1),2],f=.f),
                     .r=rectInfo,.f=seq(along=rectInfo),SIMPLIFY=FALSE))
 ## add rectangle to original plot of points
 p+layer(geom="polygon",data=x,mapping=aes(x=x,y=y),facets=f~.)
 # will print the graphics on my windows() device
 
 Though lattice does seem to emphasize the 'chart type' approach to
 graphing,
 in a way I see that it provides a similar flexibility - just that the
 specifications for each element are contained in functions and objects that
 are ultimately invoked by a high-level/higher-order function, instead of
 being combined in the linear fashion of ggplot2.
 
 ST
 
 --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
 
  On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
A question/comment: I have usually found that the subscripts argument
  is
what I need when passing *external* information into the panel
  function, for
example, when I wish to add results from a fit done external to the
  trellis
call. Fits[subscripts] gives me the fits (or whatever) I want to plot
  for
each panel. It is not clear to me how the panel layout information
 from
panel.number(), etc. would be helpful here instead. Am I correct? --
 or
  is
there a smarter way to do this that I've missed?
  
   This is one of things that I think ggplot does better - it's much
   easier to plot multiple data sources.  I don't have many examples of
   this yet, but the final example on
   http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
  
  That's probably true. The Trellis approach is to define a plot by
  data source + type of plot, whereas the ggplot approach (if I
  understand correctly) is to create a specification for the display
  (incrementally?) and then render it. Since the specification can be
  very general, the approach is very flexible. The downside is that you
  need to learn the language.
  
  On a philosophical note, I think the apparent limitations of Trellis
  in some (not all) cases is just due to the artificial importance given
  to data frames as the one true container for data. Now that we have
  proper multiple dispatch in S4, we can write methods that behave like
  traditional Trellis calls but work with more complex data structures.
  We have tried this in one bioconductor package (flowViz) with
  encouraging results.
  
  -Deepayan
  
  
 
 
 
  


 
 



   



Re: [R] Multiple Stripcharts

2007-07-07 Thread Stephen Tucker
I'm not able to make out your data but something like this?

df <- data.frame(A=rnorm(10),B=rnorm(10),C=runif(10))
stripchart(df,method="jitter")
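
If you really need one stripchart per row, something along these lines
might do. This is only a sketch: I am assuming each row holds measurements
in columns labelled A, B, C, A, B, C, ... and I use a made-up 3-row matrix
m in place of your 205 rows:

```r
set.seed(1)
## assumed layout: one row per measurement, columns A, B, C repeated
groups <- factor(rep(c("A", "B", "C"), times = 4))
m <- matrix(rnorm(3 * 12, mean = 10, sd = 2), nrow = 3)

## for each row, split its 12 values into the three categories
perRow <- lapply(seq_len(nrow(m)), function(i) split(m[i, ], groups))

## one stripchart per row, stacked on one page
par(mfrow = c(nrow(m), 1), mar = c(2, 4, 1, 1))
for (i in seq_along(perRow))
  stripchart(perRow[[i]], method = "jitter")
```

With 205 rows a single page will not be enough, so you would probably open
a pdf() device first and draw a few rows per page.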


--- Tavpritesh Sethi [EMAIL PROTECTED] wrote:

 Hi all,
 I have 205 rows with measurements for three categories of people. I want to
 generate stripplots for each of these rows. How can I do it without having
 to do them one by one. I am giving a sample dataset:-
 
         A        B        C        A        B        C        A        B        C        A        B        C
  10.34822 10.18426 9.837874  9.65047 8.020482  9.17312 6.349595 13.55664 5.286697 11.85409 2.827027 7.002696
  11.54984 12.14591 14.88955 12.26134 11.74262 11.13481 15.11849 14.97857 14.12973 14.23219 15.36582  15.4698
  10.59222 11.22417 13.34279  12.2538 11.02348 11.59403 9.933778 10.45499 8.884345 8.465186  9.72647 10.44469
 
 
 Thanks,
 Tavpritesh
 
   [[alternative HTML version deleted]]
 
 



 



Re: [R] color scale in rgl plots

2007-07-07 Thread Stephen Tucker
Hi David,

I'm not an expert in 'rgl', but to determine data-dependent color for points
I often use cut().

# using a very simple example,
x <- 1:2; y <- 1:2; z <- matrix(1:4,ncol=2)

# the following image will be a projection of my intended 3-D 'rgl' plot
# into 2-D space (if we don't consider color to be a dimension):
library(fields)
image.plot(x,y,z)

# the 3-D rgl plot will be as follows:
library(rgl)
df <- data.frame(x=rep(x,times=length(y)),
                 y=rep(y,each=length(x)),
                 z=as.vector(z))
plot3d(x=df,col=1:4,type="s")

## looks okay so moving onto bigger example:
x <- 1:10; y <- 1:10; z <- matrix(1:100,ncol=10)

# 2-D projection:
image.plot(x,y,z)

# 3-D plot in rgl
df <- data.frame(x=rep(x,times=length(y)),
                 y=rep(y,each=length(x)),
                 z=as.vector(z))
# This is how I determine color:
nColors <- 64
colindex <- as.integer(cut(df$z,breaks=nColors))
plot3d(x=df,type="s",col=tim.colors(nColors)[colindex])

===
tim.colors(nColors)[colindex] will return a vector of colors with one entry
per row of 'df'.

I don't think as.integer() on cut() is entirely necessary because cut()
returns a factor... in any case, I use these integers as indices for
tim.colors() [you will need the 'fields' package for this set of colors].
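
The same cut() idea can be applied to the palette named in the question, using base R only. This is a sketch, not a tested rgl recipe: `X` is a made-up attribute vector already scaled to [0, 1], and the explicit `breaks` are my assumption so that 0 and 1 always land on the first and last color.

```r
X <- c(0, 0.25, 0.5, 0.75, 1)        # hypothetical attribute values in [0, 1]
nColors <- 64
pal <- rainbow(nColors, start = 0.7, end = 0.1)

# Fixed breaks over [0, 1]; labels = FALSE returns integer bin numbers
idx <- cut(X, breaks = seq(0, 1, length.out = nColors + 1),
           include.lowest = TRUE, labels = FALSE)
cols <- pal[idx]                     # one color per data point
```

The resulting `cols` vector can then be passed as the `col` argument of plot3d().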

Hope this helps.

ST


--- David Farrelly [EMAIL PROTECTED] wrote:

 Hello,
 
 I'm trying to make a 3d plot using rgl in which the size and color of
 each point corresponds to certain attributes of each data point. The color
 attribute, let's call it X, is scaled to go from 0 to 1. The
 rainbow(64,start=0.7,end=0.1) palette is perfect for what I want but I
 don't know how to take that palette and pick a color from it based on
 the value of X for a given data point. I'm fairly new to R and any
 suggestions would be greatly appreciated.
 
 Here's what I do - it's how to do the color mapping that has me stumped.
 

plot3d(th1,ph,th2,type='s',size=wd1,col=(rp1,0,0,1),cex=2,ylab=NULL,xlab=NULL,zlab=NULL,xlim=c(0,1),ylim=c(0,2),zlim=c(0,1),box=TRUE,axes=FALSE)
 
 I have also tried the more obvious col = rgb(a,b,c,d) where a,b,c,d are
 functions of X but I can't manage to come up with a nice looking color
 scale.
 
 Thanks in advance,
 
 David
 


Re: [R] Loop and function

2007-07-05 Thread Stephen Tucker
You do not have matching parentheses/brackets in this line:
   returnlow <- gpdlow(var[,i][var[,i](p[,i][[2]])
Most likely this is a syntax error that halts the execution of the
assignment statement.
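
Separately from the syntax problem, a loop that assigns to the same name on every pass keeps only the last result. A minimal sketch of collecting one result per column in a list follows; the matrix `v`, the thresholds `thresh` and the "values below a threshold" rule are all invented for illustration and are not the poster's gpdlow().

```r
set.seed(1)
v <- matrix(rnorm(120), ncol = 12)               # hypothetical data matrix
thresh <- apply(v, 2, quantile, probs = 0.1)     # hypothetical per-column threshold

returnlow <- vector("list", 12)                  # one slot per column
for (i in 1:12) {
  # keep the values of column i that fall below that column's threshold
  returnlow[[i]] <- v[, i][v[, i] < thresh[i]]
}
```

returnlow[[i]] then holds the result for column i.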



--- livia [EMAIL PROTECTED] wrote:

 
 Hi All, I am trying to make a loop for a function and I am using the
 following codes. p and var are some matrix obtained before. I would
 like
 to apply the function  gpdlow for i in 1:12 and get the returnlow for i
 in 1:12. But when I ask for returnlow there are warnings and it turns out
 some strange result. 
 
 for (i in 1:12){
 gpdlow <- function(u){
 p[,i]$beta -u*p[,i][[2]]
 }
 returnlow <- gpdlow(var[,i][var[,i](p[,i][[2]])
 }
 
 
 -- 
 View this message in context:
 http://www.nabble.com/Loop-and-function-tf4028854.html#a11443955
 Sent from the R help mailing list archive at Nabble.com.
 


[R] nls() lower/upper bound specification

2007-07-04 Thread Stephen Tucker
Dear all,

In optim(), all parameters of the function to be optimized are stored in a
single vector, and lower/upper bounds can be specified by vectors of the same
length.
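
For comparison, the optim() convention can be sketched as follows. The residual-sum-of-squares function `rss` and the choice of method "L-BFGS-B" (a box-constrained method that optim() provides) are my own illustration, fitted to the same kind of simulated data as below.

```r
set.seed(42)
x <- 1:10
y <- 3*x + 4*x^2 + rnorm(10, 250)

# All three coefficients live in one vector 'beta'
rss <- function(beta) sum((y - (beta[1] + beta[2]*x + beta[3]*x^2))^2)

fit <- optim(par = c(1, 1, 1), fn = rss, method = "L-BFGS-B",
             lower = rep(0, 3), upper = rep(Inf, 3))
fit$par   # estimates, each constrained to be >= 0
```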

In nls(), is it true that if I want to specify lower/upper bounds, functions
must be re-written so that each parameter is contained in a single-valued
vector?

## data input
x <- 1:10
y <- 3*x+4*x^2+rnorm(10,250)

## this one does not work
f <- function(x)
  function(beta)
  beta[1]+ beta[2]*x+beta[3]*x^2

out <- nls(y~f(x)(beta),data=data.frame(x,y),
           alg="port",
           start=list(beta=1:3),
           lower=list(beta=rep(0,3)))

(However, this works if I do not specify a lower bound)

## this one works
g <- function(x)
  function(beta1,beta2,beta3)
  beta1+ beta2*x+beta3*x^2

out <- nls(y~g(x)(beta1,beta2,beta3),data=data.frame(x,y),
           alg="port",
           start=list(beta1=1,beta2=1,beta3=1),
           lower=list(beta1=1,beta2=1,beta3=1))

Thanks in advance!

Stephen



Re: [R] Problems using imported data

2007-07-03 Thread Stephen Tucker
Actually, I believe the use of attach() and detach() is discouraged
nowadays...

x <- read.delim("Filename.txt", header=TRUE)

You can access your data by column:
x[,1]
x[,c(1,3)]

or if your first column is named Col1 and the third Col3,
x[,"Col1"]
x[,c("Col1","Col3")]

and you can do the same to access by row - by indices or rownames [which you
can set with "rownames<-", see help("rownames<-")]

Alternatively, with this type of data [created by the read.delim() function]
you can also access with the following syntax:
x$Col1
x$Col3
with(x,Col1)
with(x,cbind(Col1,Col3))
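
A self-contained version of the above, for trying the access patterns without the original file. The temporary file and its three columns are invented for the example.

```r
# Write a small tab-delimited file to a temporary location (made-up contents)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Col1\tCol2\tCol3", "1\ta\t2.5", "2\tb\t3.5"), tmp)

# The result of read.delim() must be assigned to a name to be usable later
x <- read.delim(tmp, header = TRUE)

x[, "Col1"]              # by column name
x$Col3                   # same idea with $
with(x, Col1 + Col3)     # evaluate an expression inside the data frame
```

The key point for the original question is the assignment: the data frame is `x`, not the file name.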

...hope this helps

ST


--- Susie Iredale [EMAIL PROTECTED] wrote:

 
 
 
 (Repeat of previous HTML version)
 
 Hello all,
 
 I am a new R user and I have finally imported my data using
 read.delim("Filename.txt", header=TRUE) after some difficulty, by changing
 file directories (a hint to anyone who might be stuck there).
 
 However, I am now stuck trying to use my data.  When I try to use
 data.frame(filename.txt) it tells me object not found, which makes it
 difficult to use attach() or with().  How do I get R to recognize my data? 
 
 
 Thanks,
 Susie
 PhD Student UCI
 
 
 
 
  




Re: [R] regexpr

2007-06-29 Thread Stephen Tucker
I think you are looking for paste().

And you can replace your for loop with lapply(), which will apply regexpr to
every element of 'mylist' (as the first argument, which is 'pattern'). 'text'
can be a vector also:

mylist <- c("MN","NY","FL")
lapply(paste(mylist,"$",sep=""),regexpr,text="Those from MN:")
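
To turn the regexpr() results into a simple yes/no per pattern, something like the following may help. It is a sketch with a made-up target string `txt`; note that the string here ends in "MN" with no trailing colon, since a pattern anchored with $ only matches at the very end of the text.

```r
mylist <- c("MN", "NY", "FL")
patterns <- paste(mylist, "$", sep = "")   # "MN$" "NY$" "FL$"
txt <- "Those from MN"                     # hypothetical target string

# regexpr() returns -1 when there is no match, a position >= 1 otherwise
hits <- sapply(patterns, function(p) regexpr(p, txt)[1] > 0)
hits
```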



--- runner [EMAIL PROTECTED] wrote:

 
 Hi, 
 
 I 'd like to match each member of a list to a target string, e.g.
 --
 mylist=c("MN","NY","FL")
 g=regexpr(mylist[1], "Those from MN:")
 if (g>0)
 {
 "On list"
 }
 --
 My question is:
 
 How to add an end-of-string symbol '$' to the to-match string? so that 'M'
 won't match.
 
 Of course, "MN$" will work, but i want to use it in a loop; mylist[i] is
 what i need. I tried mylist[1]$, but didn't work. So why it doesn't
 extrapolate? How to do it?
 
 Thanks a lot!
 -- 
 View this message in context:
 http://www.nabble.com/regexpr-tf4000743.html#a11363041
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Assign name to a name

2007-06-29 Thread Stephen Tucker
You can just create another variable which contains the names you want:


## let
Year <- c(rep(1999,2),rep(2000,2),rep(2001,3))

## one alternative
getYearCode1 <- function(yr) {
  # yr can be a vector; years other than 1999-2001 give NA
  ifelse(yr==1999,"Year1",
         ifelse(yr==2000,"Year2",
                ifelse(yr==2001,"Year3",NA)))
}

## another alternative
## more appropriate since you probably want
## a single value returned
getYearCode2 <- function(yr) {
  # yr is a single value
  switch(as.character(yr),
         `1999` = "Year1",
         `2000` = "Year2",
         `2001` = "Year3")
}

## Application:
## single value
getYearCode1(Year[1])
getYearCode2(Year[1])
## on a vector
dataset$YearCode <- getYearCode1(Year)
# or
dataset$YearCode <- sapply(Year,getYearCode2)

## another option is match()
df <- data.frame(Year=c(1999,2000,2001),
                 YearCode=c("Year1","Year2","Year3"))
dataset$YearCode <- df[match(Year,df[,"Year"]),"YearCode"]

##
## reading from console
subset(dataset,YearCode==scan(,what=""))
subset(dataset,
       YearCode=={x <- function() {cat("YrCode: ");readline()}; x()})

## or as a function
f <- function(x) {
  g <- function() {
    x <- function() {
      cat("YearCode: ")
      readline()
    }
    subset(dataset,YearCode==x())
  }
}
getSubset1 <- f(dataset)

## type at console. You will be prompted:
datayear <- getSubset1()

## but easier is
f <- function(x) {
  g <- function(y)
    subset(x,YearCode==y)
}
getSubset2 <- f(dataset)

## type at prompt
datayear <- getSubset2("Year1")
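
For the simplest reading of the question (change the year in one place only), a plain variable is enough. The sketch below uses an invented `dataset`; the one detail to watch is that the comparison inside subset() needs `==`, not `=`.

```r
# Hypothetical data with a Year column
dataset <- data.frame(Year  = c(1999, 1999, 2000, 2001),
                      value = c(10, 12, 9, 14))

Year1 <- 1999                                # change the year here only
datayear <- subset(dataset, Year == Year1)   # '==' compares; '=' would not
datayear
```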


--- Spilak,Jacqueline [Edm] [EMAIL PROTECTED] wrote:

 I would like to know how I can assign a name to a name.  I have a
 dataset that has different years in it.  I am writing scripts using R
 and I would like to give a month a generic name and then use the generic
 name to do different analysis.  The reason for the generic name would be
 so that I only have to change one thing if I wanted to change the year.
 For example.
 Year1 = 1999
 datayear - subset(dataset, Year = Year1)
 I would want  to subset for whatever year is in Year1.  I am not sure
 if R does this but it would be great if it does.  Is there also anyway
 for R to ask the user for the variable in the console without going into
 the script and then use whatever the user puts in. Thanks for the help.
 


Re: [R] Function call within a function.

2007-06-28 Thread Stephen Tucker
Dear John,

Perhaps I am mistaken in what you are trying to accomplish but it seems like
what is required is that you call lstfun() outside of ukn(). [and remove the
call to lstfun() in ukn()].

nts <- lstfun(myfile, aa, bb)
results <- ukn(dd1, "a", "b", nts$cda)

Alternatively, you can eliminate the fourth argument in ukn() and assign (via
'<-') the results of lstfun() to 'nam1' within ukn() instead of saving to
'nts'...
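
A runnable sketch of the first suggestion follows. It departs from the posted code in two ways that are my own assumptions: the list returned by lstfun() is given names (so that nts$cda exists), and ukn() is reduced to the one argument it actually uses.

```r
dd1 <- data.frame(id   = c("a", "b", "b", "a", "a", "b"),
                  cata = c(1, 1, 6, 1, 1, 4),
                  catb = c(1, 2, 3, 4, 5, 6))

# Return a *named* list so the pieces can be reached with $
lstfun <- function(file, alpha, beta) {
  list(cda = subset(file, file[, 1] == alpha),
       cdb = subset(file, file[, 1] == beta))
}

# Operates on one element of that list
ukn <- function(nam1) nam1[, 3] * 5

nts <- lstfun(dd1, "a", "b")   # called outside ukn()
results <- ukn(nts$cda)
results
```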

--- John Kane [EMAIL PROTECTED] wrote:

 I am trying to call a function within another function
 and I clearly am misunderstanding what I should do. 
 Below is a simple example.
 I know lstfun works on its own but I cannot seem to
 figure out how to get it to work within ukn. Basically
 I need to create the variable nts. I have probably
 missed something simple in the Intro or FAQ.
 
 Any help would be much appreciated.
 
 EXAMPLE

---
 # create data.frame
 cata <- c( 1,1,6,1,1,4)
 catb <- c( 1,2,3,4,5,6)
 id <- c('a', 'b', 'b', 'a', 'a', 'b')
 dd1  <-  data.frame(id, cata,catb)
 
 # function to create list from data.frame
 lstfun  <- function(file, alpha , beta ) {
 cda  <-  subset(file, file[,1] == alpha)
 cdb  <-  subset (file, file[,1]== beta)
 list1 <- list(cda,cdb)
 }
 
 # function to operate on list
 ukn  <-  function(file, alpha, beta, nam1){
 aa  <- alpha
 bb  <- beta
 myfile  <- file
 nts <- lstfun(myfile, aa, bb)
 mysum <- nam1[,3]*5
 return(mysum)
 }
 
 results <- ukn(dd1, "a", "b", nts$cda)
 


Re: [R] Looking for parallel functionality between Matlab and R

2007-06-27 Thread Stephen Tucker
This zooming function on the R-Wiki page was very neat:

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-misc:interactive_zooming

Also, to answer question (a), maybe these examples might help?

## add elements to plot
plot(1:10,1:10)
lines(1:10,(1:10)/2)
points(1:10,(1:10)/1.5)

## add second y-axis
par(mar=c(5,4,2,4)+0.1)
plot(1:10,1:10)
par(new=TRUE)
plot(-20:20,20:-20,col=4,
     type="l",axes=FALSE,
     xlab="",ylab="",
     xaxs="i",
     xlim=par("usr")[1:2])
axis(4,col=4,col.axis=4)
mtext("second y-axis label",4,outer=TRUE,padj=-2,col=4)
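
For question (b), a non-interactive counterpart of Matlab's 'axis' is to redraw with `xlim` and `ylim`. The sketch below uses invented data and writes to a temporary PDF so that it runs without a screen device; on an interactive device the pdf()/dev.off() lines can be dropped.

```r
x <- seq(0, 10, by = 0.1)          # hypothetical data
y <- x^2

pdf(tempfile(fileext = ".pdf"))    # temporary device, for illustration only
plot(x, y, type = "l")                                   # full range
plot(x, y, type = "l", xlim = c(0, 5), ylim = c(0, 20))  # zoomed region
usr <- par("usr")                  # actual plot limits after zooming
dev.off()
usr
```

By default the axes are extended slightly beyond the requested limits; adding xaxs="i" and yaxs="i" gives exact limits.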






--- Jim Lemon [EMAIL PROTECTED] wrote:

 El-ad David Amir wrote:
  I'm slowly moving my statistical analysis from Matlab to R, and find
 myself
  missing two features:
  
  a) How do I mimic Matlab's 'hold on'? (I want to show several plots
  together, when I type two plots one after the other the second overwrites
  the first)
  b) How do I mimic Matlab's 'axis'? (after drawing my plots I want to zoom
 on
  specific parts- for example, x=0:5, y=0:20).
  
 I think what you want for a) is par(ask=TRUE).
 
 There have been a few discussions of zooming on the help list - see:
 
 http://stats.math.uni-augsburg.de/iPlots/index.shtml
 
 for one solution.
 
 Jim
 


Re: [R] simultaneous actions of grep ???

2007-06-26 Thread Stephen Tucker
You can list them together using | (which stands for 'or'):

  c <- subset(c,!rownames(c) %in% grep(".1|.5|.6|.9",rownames(c),value=T))

but . means any character for regular expressions, so if you meant a
decimal point, you probably want to escape it with a \\:

  c <- subset(c,!rownames(c) %in%
grep("\\.1|\\.5|\\.6|\\.9", rownames(c),value=T))

Another option is

  c <- subset(c,regexpr("\\.1|\\.5|\\.6|\\.9",c) < 0)

because regexpr will return -1 for elements which do not contain a match.
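
The difference between "." and "\\." is easy to see on a small example. The data frame `m` and its row names are invented; grepl() is used here for brevity in place of the grep()/regexpr() calls above.

```r
# Hypothetical data frame whose row names carry the suffixes of interest
m <- data.frame(v = 1:5, row.names = c("a.1", "a.2", "b.5", "b11", "c.7"))

# Unescaped dot: "b11" also matches ".1", because "." means any character
loose <- grepl(".1|.5|.6|.9", rownames(m))

# Escaped dot: only a literal ".1", ".5", ".6" or ".9" matches
strict <- grepl("\\.1|\\.5|\\.6|\\.9", rownames(m))

kept <- m[!strict, , drop = FALSE]   # drop the matching rows
rownames(kept)
```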


--- Ana Patricia Martins [EMAIL PROTECTED] wrote:

 Hello R-users and developers,
 
 Once again, I'm asking for your help.
 
 Is there an easier way to do the same by applying the grep calls
 simultaneously?
   
 c <- subset(c,!rownames(c) %in% grep(".1",rownames(c),value=T))
 c <- subset(c,!rownames(c) %in% grep(".5",rownames(c),value=T))
 c <- subset(c,!rownames(c) %in% grep(".6",rownames(c),value=T))
 c <- subset(c,!rownames(c) %in% grep(".9",rownames(c),value=T))
 
 Thanks in advance for helping me.
 
 Atenciosamente,
 Ana Patricia Martins
 ---
 Serviço Métodos Estatísticos
 Departamento de Metodologia Estatística
 INE - Portugal
 Telef:  218 426 100 - Ext: 3210
 E-mail: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] 
 
 


Re: [R] simultaneous actions of grep ???

2007-06-26 Thread Stephen Tucker
My mistake... last alternative should be:

   c <- subset(c,regexpr("\\.1|\\.5|\\.6|\\.9",rownames(c)) < 0)



Re: [R] changing the position of the y label (ylab)

2007-06-26 Thread Stephen Tucker
If by 'position' you mean the distance from the axes, I think 'mgp' is the
argument you are looking for (see ?par)-

You can set this in par(), plot() [which will affect both x and y labels], or
title():

par(mar=rep(6,4))
plot(NA,NA,xlim=0:1,ylim=0:1,xlab="X",ylab="")
title(ylab="Y2",mgp=c(4,1,0))

If you want to change the 'position' parallel to the axis, then you probably
have to do
plot(...,xlab="",ylab="")

and set the labels using mtext(), playing around with the 'adj' argument.
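
A small sketch of the mtext() route: `adj` runs from 0 to 1 along the axis and `line` sets the distance from the plot region. The margin sizes, label text and temporary PDF device are choices made for this illustration only.

```r
out <- tempfile(fileext = ".pdf")
pdf(out)                            # temporary device, for illustration only
par(mar = c(5, 6, 4, 2))            # wider left margin
plot(1:10, xlab = "", ylab = "")    # suppress the default labels

# side 2 = left axis; adj = 1 pushes the label to the top end of the axis
mtext("Y label near the top", side = 2, line = 4, adj = 1)
# side 1 = bottom axis; adj = 0 puts the label at the left end
mtext("X label at the left", side = 1, line = 3, adj = 0)
dev.off()
```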

Btw, you can use '\n' to denote new line:
title(ylab="Onset/Withdrawl\nDate",mgp=c(4,1,0))


--- Etienne [EMAIL PROTECTED] wrote:

 How can I change the position of the ylab, after
 enlarging the margins with par(mar=...)? 
 
 Here is the relevant code snippet
 
 
 par(mar=c(5.1,5.1,4.1,2.1))
 plot(c(1979,2003),c(40,50),ylim=c(1,73),lab=c(20,10,1),pch=21,col='blue',
      bg='blue',axes=FALSE,xlab="Years",ylab="Onset/Withdrawl Date",font.lab=2)
 box()
 axis(1,las=2)
 axis(2,las=2,labels=c('JAN','FEB','MAR','APR','MAY','JUN','JUL','AUG','SEP',
      'OCT','NOV','DEC','JAN'),at=seq(from=1,to=73,by=6))
 axis(3,labels=FALSE)
 axis(4,labels=FALSE,at=seq(from=1,to=73,by=6))
 
 
 Thanks
 


Re: [R] ylab at the right hand of a plot with two y axes

2007-06-26 Thread Stephen Tucker
Here are two ways:

## method 1
plot(1:100,y1)
par(new=TRUE)
plot(1:100,y2,xlab="",ylab="",col=2,axes=FALSE)
axis(4,col=2,col.axis=2)

## method 2
plot.new()
plot.window(xlim=c(1,100),ylim=range(y1))
points(1:100,y1)
axis(1)
axis(2)
title(xlab="x",ylab="y1")
plot.window(xlim=c(1,100),ylim=range(y2))
points(1:100,y2)
axis(4,col=2,col.axis=2)
box()
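
For the second question (a label on the right-hand axis), mtext() with side=4 is one possibility. The sketch repeats method 1 with the data from the question; the widened right margin and the temporary PDF device are my additions.

```r
y1 <- 1:100
y2 <- (100:1) * 10

out <- tempfile(fileext = ".pdf")
pdf(out)                                  # temporary device, for illustration only
par(mar = c(5, 4, 4, 5) + 0.1)            # leave room on the right for the label
plot(1:100, y1, xlab = "x", ylab = "y1")
par(new = TRUE)
plot(1:100, y2, xlab = "", ylab = "", col = 2, axes = FALSE)
axis(4, col = 2, col.axis = 2)
mtext("y2", side = 4, line = 3, col = 2)  # right-hand axis label
dev.off()
```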



--- Young Cho [EMAIL PROTECTED] wrote:

 When I try to plot two lines ( or scatterplots) with different scales, this
 is what I have been doing:
 
 Suppose: I have y1 and y2 in a very different scale
 
 y1 = 1:100
 y2 = c(100:1)*10
 
 To plot them on top of each other  and denote by different colors: I have
 to
 figure out the correct scale '10'  and corresponding tick.vector and
 lables.
 Then do:
 
 plot(1:100, y1)   # I can have 'ylab' here for the left-hand side y axis.
 points(1:100, y2/10,col=2)
 ytick.vector = seq(from=0,to=100,by=20)
 ytick.label = as.character(seq(from=0,to=1000,by=200))
 axis(4,at = ytick.vector,label = ytick.label,col=2,col.axis=2)
 
 Two questions.
 
 1. Are there easier ways to plot the y1, y2 w/o figuring out the correct
 scaler, tick vectors, and labels in order to put them in one figure?
 2. How to add additional 'ylab' to the right hand side y-axis of the plot?
 Thanks a lot!
 
 -Young
 


Re: [R] R-excel

2007-06-25 Thread Stephen Tucker
There are also some notes about this in the R Data Import/Export manual: 
http://cran.r-project.org/doc/manuals/R-data.html#Reading-Excel-spreadsheets

But I've gathered the following examples from the R-help mailing list
archives [in addition to the option of saving the spreadsheet as a .csv file
and reading it in with read.csv()]. Personally, I use option 4 regularly (I
happened to have Perl installed on my Windows XP machine already) and have
had good luck with it.

Hope this helps.

= Option 1 =
# SIMPLEST OPTION
install.packages("xlsReadWrite")
library(xlsReadWrite)
data = read.xls("sampledata.xls",sheet=1)

= Option 2 =
# ALSO SIMPLE BUT MORE MANUAL WORK EACH TIME
# (1) highlight region in Excel you want to import and copy it, then
data = read.delim(file="clipboard",header=TRUE)
# or, if you don't have a header,
data = read.delim(file="clipboard",header=FALSE)

= Option 3 =
# RODBC IS A BIG APPLICATION, FOR INTERFACING
# WITH MANY TYPES OF FILES/SERVERS
install.packages("RODBC")
library(RODBC)
fid <- odbcConnectExcel("sampledata.xls")
data <- sqlFetch(fid,"Sheet1")
close(fid)

= Option 4 =
# REQUIRES CONCURRENT INSTALLATION OF PERL
install.packages("gdata")
library(gdata)
data = read.xls("sampledata.xls",sheet=1)
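
The .csv route mentioned at the top needs no extra packages. In this sketch the file is created on the fly with invented contents, standing in for a sheet saved from Excel via "Save As... CSV".

```r
# Stand-in for a sheet exported from Excel as CSV (made-up contents)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,weight", "1,3.2", "2,4.1"), csv_path)

data <- read.csv(csv_path, header = TRUE)
str(data)   # check column names and types after import
```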

 



--- Erika Frigo [EMAIL PROTECTED] wrote:

 
 Good morning to everybody,
 I have a problem : how can I import excel files in R???
 
 thank you very much
 
 
 Dr.sa. Erika Frigo
 Università degli Studi di Milano
 Facoltà di Medicina Veterinaria
 Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza
 Alimentare (VSA)
  
 Via Grasselli, 7
 20137 Milano
 Tel. 02/50318515
 Fax 02/50318501


Re: [R] Computing time differences

2007-06-20 Thread Stephen Tucker
Here is one way:

Vector1 <- c("20080621.00","20080623.00")
Vector2 <- c("20080620.00","20080622.00")
do.call(difftime,
        c(apply(cbind(time1=Vector1,time2=Vector2),2,
                function(x) strptime(x,format="%Y%m%d.00")),
          units="hours"))

see ?strptime, ?difftime and
http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf
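
A more step-by-step variant of the same idea, as a sketch. It assumes the dates are held as character strings in the "20080620.00" form and fixes the time zone to UTC so that daylight-saving changes cannot affect the differences; both are my assumptions, not part of the question.

```r
Vector1 <- c("20080621.00", "20080623.00")
Vector2 <- c("20080620.00", "20080622.00")

# Parse each vector separately; ".00" in the format is matched literally
t1 <- as.POSIXct(strptime(Vector1, format = "%Y%m%d.00", tz = "UTC"))
t2 <- as.POSIXct(strptime(Vector2, format = "%Y%m%d.00", tz = "UTC"))

d <- difftime(t1, t2, units = "hours")
as.numeric(d)   # differences in hours, one per pair
```

If the values are stored as numbers instead, sprintf("%.0f", v) gives strings such as "20080620" that can be parsed with format "%Y%m%d".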



--- [EMAIL PROTECTED] wrote:

 Dear R users, 
 
 I have a problem computing time differences using R. 
 
 I have a date that are given using the following format: 20080620.00, where
 the 4 first digits represent the year, the next 2 ones the month and the
 last
 2 ones the day. I would need to compute time differences between two
 vectors
 of this given format. 
 
 I tried around trying to change this format into any type of time serie
 without any succes. 
 
 Could some one provide me with some useful suggestion and/or tip to know
 where to look?
 
 I am using R-2.4.0 under Windows XP
 
 Thanks for your help, 
 
 Vincent
 
 


Re: [R] data type for block data?

2007-06-19 Thread Stephen Tucker
Hi Paul,

Hope this is what you're looking for:

## reading in text (the first 13 rows of cc from your posting)
## and using smaller indices [(3,8) instead of (10,40)]
## for this example
> cc <- "mode<-"(do.call(rbind,
+        strsplit(readLines(textConnection(txt))[-1],"[ ]{2,}"))[,-1],
+        "numeric")
> index <- c(3,8)

## (1) convert cc to data frame
## (2) split according to factors produced by cut()
## (3) apply data.matrix() to each element of list
## produced by split() to convert back to numeric matrix
> s <- lapply(split(as.data.frame(cc),
+                   f=cut(1:nrow(cc),breaks=c(-Inf,index,Inf))),
+             data.matrix)

## return result. now s[[1]] contains the first block,
## s[[2]] contains the second block, and so on.
> s
$`(-Inf,3]`
  V1 V2
1  1 26
2  2 27
3  3 28

$`(3,8]`
  V1 V2
4  4 29
5  5 30
6  6 31
7  7 32
8  8 33

$`(8, Inf]`
   V1 V2
9   9 34
10  1 27
11  1 28
12  2 30
13  3 34
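
The same steps in a form that can be pasted directly, building `cc` from the 13 rows shown instead of parsing the text of the posting. The name `blocks` is mine; everything else follows the recipe above.

```r
# The first 13 rows of the example matrix
cc <- cbind(c(1:9, 1, 1, 2, 3), c(26:34, 27, 28, 30, 34))
index <- c(3, 8)                    # last row of each block except the final one

# cut() labels each row number with the block it belongs to
block_id <- cut(seq_len(nrow(cc)), breaks = c(-Inf, index, Inf))

# split() needs a data frame to split by rows; data.matrix() converts back
blocks <- lapply(split(as.data.frame(cc), block_id), data.matrix)
sapply(blocks, nrow)                # number of rows in each block
```

With the real data, index <- c(10, 40) gives rows 1:10, 11:40 and the remainder.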


--- H. Paul Benton [EMAIL PROTECTED] wrote:

 Dear All,
 
 
 I have a matrix with data that is not organised. I would like to go
 through this and extract it. Each feature has 2 vectors which express
 the data. I also have an index of the places where the data should be cut.
 eg.
 > class(cc)
 [1] "matrix"
 > cc
       [,1] [,2]
  [1,]    1   26
  [2,]    2   27
  [3,]    3   28
  [4,]    4   29
  [5,]    5   30
  [6,]    6   31
  [7,]    7   32
  [8,]    8   33
  [9,]    9   34
 [10,]    1   27
 [11,]    1   28
 [12,]    2   30
 [13,]    3   34
 etc.
 > index
 [1] 10 40
 
 
 Is there a way to take cc[i:index[i-1],] to another format as to where
 each block could be worked on separately. ie so in one block would be
 rows1:10 the next block would be rows11:40 and so on.
 
 Thanks,
 
 Paul
 
 
 
 -- 
 Research Technician
 Mass Spectrometry
o The
   /
 o Scripps
   \
o Research
   /
 o Institute
 


Re: [R] : create a PDF file (text (print list) and grafics)

2007-06-19 Thread Stephen Tucker
Hi Ana,

There are two ways in which I imagine this can be done:
(1) create a layout [using layout()] and printing the text on a blank plot;
(2) using Sweave.

## === Method 1 example... ===

pdf()
layout(matrix(c(1,1,2,3),ncol=2,byrow=TRUE),widths=c(1,1),heights=c(3,2))
par(mar=c(0,0,5,0))
plot.new(); plot.window(xlim=c(0,1),ylim=c(0,1))
title("Title",cex.main=1.5)
text(0.4,0.5,adj=c(0,0),lab=
"> print(myList)
$a
[1] 1

$b
[1] 2")
par(mar=c(5,4,1,1))
boxplot(1:10)
hist(1:10)
dev.off()

## (there must be a more elegant way than pasting the output
## of print(myList) as a character string in text() but I can't
## think of it at the moment...)
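
One candidate for the "more elegant way" is capture.output(), which returns what print() would have shown as a character vector. This is a sketch under my own choices of layout, coordinates and a temporary output file, not a tested replacement for the code above.

```r
myList <- list(a = 1, b = 2)

# Capture the printed form and join the lines with newlines
txt <- paste(capture.output(print(myList)), collapse = "\n")

out <- tempfile(fileext = ".pdf")
pdf(out)                                    # temporary file, for illustration only
layout(matrix(c(1, 1, 2, 3), ncol = 2, byrow = TRUE), heights = c(3, 2))
par(mar = c(0, 0, 5, 0))
plot.new(); plot.window(xlim = c(0, 1), ylim = c(0, 1))
title("Title", cex.main = 1.5)
text(0.4, 0.5, adj = c(0, 0.5), labels = txt)   # printed list as plot text
par(mar = c(5, 4, 1, 1))
boxplot(1:10)
hist(1:10)
dev.off()
```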

## === Method 2 (warning: I am not too familiar with Sweave
## but I understand that this is how it *should* work; this
## Sweave document will create a '.tex' file which you can then
## compile with latex - this site was helpful:
## http://www.stat.umn.edu/~charlie/Sweave/)  ===

\documentclass[a4paper]{article}
\title{Sweave Document}
\author{}
\date{}
\begin{document}
\maketitle

Text field here

<<echo=FALSE>>=
## computations to build 'myList' here (but not for printing)
## such as
myList <- list(a=1,b=2)
@
<<reg>>=
## this is for output
print(myList)
@

\begin{center}
<<fig=TRUE, echo=FALSE>>=
par(mfrow=c(1,2), oma=c(0,0,3,0),cex=0.5)
#Image
hist(controlo$quope,axes=T,plot=T,col="gray",xlab="Quope",
     main="Histograma",lwd=2)
boxplot(controlo$quope,col="bisque",lty=3,medlty=1,medlwd=2.5,
        main="Boxplot")
mtext(regiao,cex=1.5,col="blue",adj=0.5,side=3,outer=TRUE)
@
\end{center}
\end{document}

##



--- Ana Patricia Martins [EMAIL PROTECTED] wrote:

 Dear helpers,
 
 I need help to create a PDF file like the example
 
  -----------------------
  |        Title        |
  -----------------------
  |                     |
  | Text (print a list) |
  |                     |
  -----------------------
  |          |          |
  |          |          |
  |  image   |  image   |
  |          |          |
  |          |          |
  -----------------------
 

 pdf(paste(getwd(),"/Output/Controlo_Pesos",regiao,trimestre,substr(ano,3,4),
           ".pdf",sep=""),height=13.7, paper="special")
 par(mfrow=c(1,2), oma=c(0,0,3,0),cex=0.5)
 
 #Text field ()
 #print(qual_pesos)# is a list
 
 #Image
 hist(controlo$quope,axes=T,plot=T,col="gray",xlab="Quope",
      main="Histograma",lwd=2)
 boxplot(controlo$quope,col="bisque",lty=3,medlty=1,medlwd=2.5,
         main="Boxplot")
 mtext(regiao,cex=1.5,col="blue",adj=0.5,side=3,outer=TRUE)
 dev.off()
 
 
 
 Is there another, easier way to do the same?
 Thanks in advance for helping me.
 Best regards.
 
 Atenciosamente,
 Ana Patricia Martins
 ---
 Serviço Métodos Estatísticos
 Departamento de Metodologia Estatística
 INE - Portugal
 Telef:  218 426 100 - Ext: 3210
 E-mail: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] 
 
 


[R] passing (or obtaining) index or element name of list to FUN in lapply()

2007-06-13 Thread Stephen Tucker
Hello everyone,

I wonder if there is a way to pass the index or name of a list to a
user-specified function in lapply(). For instance, my desired effect is
something like the output of 

> L <- list(jack=4098,sape=4139)
> lapply(seq(along=L),function(i,x) if(i==1) "jack" else "sape",x=L)
[[1]]
[1] "jack"

[[2]]
[1] "sape"

> lapply(seq(along=L),function(i,x) if(names(x)[i]=="jack") 1 else 2,x=L)
[[1]]
[1] 1

[[2]]
[1] 2

But by passing L as the first argument of lapply(). I thought there was a
tangentially related post on this mailing list in the past, but I don't recall
that it was ever addressed directly (and I can't seem to find it now). The
examples above are perfectly good alternatives, especially if I wrap each of
the lines in "names<-"() to return lists with appropriate names assigned, but
it feels like I am essentially writing a FOR-LOOP, though I was surprised to
find that speed-wise it doesn't seem to make much of a difference (unless I
have not selected a rigorous test):

> N <- 1
> y <- runif(N)
## looping through elements of y
> system.time(lapply(y,
+                    function(x) {
+                      set.seed(222)
+                      mean(rnorm(1e4,x,1))
+                    }))
[1] 21.00  0.17 21.29    NA    NA
## looping through indices
> system.time(lapply(1:N,
+                    function(x,y) {
+                      set.seed(222)
+                      mean(rnorm(1e4,y[x],1))
+                    },y=y))
[1] 21.09  0.14 21.26    NA    NA

In Python, there are methods for Lists and Dictionaries called enumerate(),
and iteritems(), respectively. Example applications:

## a list
L = ['a','b','c']
[x for x in enumerate(L)]
## returns index of list along with the list element
[(0, 'a'), (1, 'b'), (2, 'c')]

## a dictionary
D = {'jack': 4098, 'sape': 4139}
[x for x in D.iteritems()]
## returns element key (name) along with element contents
[('sape', 4139), ('jack', 4098)]

And this is something of the effect I was looking for...
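
One way to get close to this effect in current versions of R is Map() (a thin wrapper around mapply()), which walks several vectors in parallel. The sketch below is only an illustration: the helper name describe() is made up, and the example list is the one from the question.

```r
# Example list from the question
L <- list(jack = 4098, sape = 4139)

# Map() calls the function with corresponding elements of each argument:
# here the element name, its position, and the element itself.
# describe() is a hypothetical helper, not part of base R.
describe <- function(name, index, value) {
  paste(index, name, value, sep = ":")
}
res <- Map(describe, names(L), seq_along(L), L)
```

Because the first vector passed to Map() is a character vector, the result is a list named after the elements of L, so no separate "names<-"() call is needed.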

Thanks to all,

Stephen



Re: [R] Viewing a data object

2007-06-13 Thread Stephen Tucker
Hi Horace,

I have also thought that it may be useful but I don't know of any Object
Explorer available for R.

However (you may already know this):
(1) you can view your list of objects in R with objects(), 
(2) you can view objects in a spreadsheet-like table (if they are matrices or
data frames) with invisible(edit(objectName)) [which isn't easy on the fingers].
fix(objectName) is a shorter option, but it has the side effect of possibly
changing your object when you close the data editor. For instance, this can
happen if you mistakenly type something into a cell; it can also change your
column classes even when you don't, for example:

> options(stringsAsFactors=TRUE)
> x <- data.frame(letters[1:5],1:5)
> sapply(x,class)
letters.1.5.         X1.5 
    "factor"    "integer" 
> fix(x) # no user-changes made
> sapply(x,class)
letters.1.5.         X1.5 
    "factor"    "numeric" 

(3) I believe Deepayan Sarkar contributed the tab-completion capability at
the command line. So unless you have a lot of objects beginning with
'AuroraStoch...' you should be able to type a few letters and let the
auto-completion handle the rest.
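
For glancing at the middle of a long data frame, a small helper can also save keystrokes. This is only a sketch: peek() is a made-up name, and the data frame below is invented for illustration.

```r
# Hypothetical helper: show n rows centred on the middle of a data frame
peek <- function(x, n = 10) {
  mid <- ceiling(nrow(x) / 2)
  rows <- seq(max(1, mid - n %/% 2), length.out = n)
  x[rows[rows <= nrow(x)], , drop = FALSE]
}

# Invented data standing in for a long-named object
d <- data.frame(id = 1:1000, value = sqrt(1:1000))
middle <- peek(d)
```

head() and tail() do the same job for the two ends of an object.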

Best regards,

ST


--- Horace Tso [EMAIL PROTECTED] wrote:

 Dear list,
 
 First apologize that this is trivial and just betrays my slothfulness at
 the keyboard. I'm sick of having to type a long name just to get a glimpse
 of something. For example, if my data frame is named
 'AuroraStochasticRunsJune1.df' and I want to see what the middle looks
 like, I have to type
 
 AuroraStochasticRunsJune1.df[ 400:500, ]
 
 And often I'm not even sure rows 400 to 500 are what I want to see.  I
 might have to type the same line many times.
 
 Is there sort of a R-equivalence of the Object Explorer, like in Splus,
 where I could mouse-click an object in a list and a window pops up?  Short
 of that, is there any trick of saving a couple of keystrokes here and
 there?
 
 Thanks for tolerating this kind of annoying questions.
 
 H.
 


Re: [R] passing (or obtaining) index or element name of list to FUN in lapply()

2007-06-13 Thread Stephen Tucker
Hi Professor Ripley,

Thanks for the response. I apologize, my examples were not too real (though
your solutions are indeed clever)... I was trying to ask more generally
whether the element name or index of 'listObj' could be obtained by the
user-function 'myfunction' when used in lapply(X=listObj,FUN=myfunction);
below I illustrate two cases in which I have come across this desire:
(1) In 'Example 1' I essentially take the list element and do some
transformations (optionally some number-crunching), and then plot it with the
element name of the list for the title.
(2) In 'Example 2' I want to read in data from the list element and write the
contents to a file; writing a header line only when operating on the first
element of the list.

## data specification
data1 <- "var1 var2
-0.44 0.17
1.03 0.93
0.85 0.39"
data2 <- "var1 var2
-0.16 0.97
0.93 0.23
0.80 0.42"
L <- list(data1=data1,data2=data2)

##=== Example 1 (want element name) ===
## function definition
plottingfunc <- function(i,x) {
  plot(read.table(textConnection(x[[i]]),header=TRUE),main=names(x)[i])
}
## function application
par(mfrow=c(2,1))
lapply(seq(along=L),plottingfunc,x=L)

##=== Example 2 (want element index) ===
## function definition
readwritefunc <- function(i,x,fout) {
  data <- read.table(textConnection(x[[i]]),header=TRUE)
  if(i==1) cat(paste(colnames(data),collapse=","),"\n",file=fout)
  write.table(data,file=fout,sep=",",col=FALSE,
              row=FALSE,quote=FALSE,append=TRUE)
}
## function application
fout <- file("out.dat",open="w")
lapply(seq(along=L),readwritefunc,x=L,fout=fout)
close(fout)

Since the above code works, I suppose this is more of a question of
aesthetics since I thought the spirit of lapply() was to operate on the
elements of a list and not its indices - I thought perhaps there is a way to
get the index number and element name from within the user-function.

Also, I recall a lesson on 'loop avoidance' from an earlier version of MASS;
this was in the days of S-PLUS dominance and perhaps less applicable now to R
as you mentioned... But old habits die hard; my amygdala still invokes a fear
response at the thought of a loop... (and as of recently, I have been
infatuated with the notion of adhering, albeit loosely, to the 'functional
programming' paradigm which makes me doubly fearful of loops)

Thanks and best regards,

Stephen

--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

 On Tue, 12 Jun 2007, Stephen Tucker wrote:
 
  Hello everyone,
 
  I wonder if there is a way to pass the index or name of a list to a
  user-specified function in lapply(). For instance, my desired effect is
  something like the output of
 
  L <- list(jack=4098,sape=4139)
  lapply(seq(along=L),function(i,x) if(i==1) "jack" else "sape",x=L)
  [[1]]
  [1] "jack"
 
  [[2]]
  [1] "sape"
 
 as.list(names(L))
 
  lapply(seq(along=L),function(i,x) if(names(x)[i]=="jack") 1 else 2,x=L)
  [[1]]
  [1] 1
 
  [[2]]
  [1] 2
 
 as.list(seq_along(L))
 
 lapply() can be faster than a for-loop, but usually not by much: its main 
 advantage is clarity of code.
 
 I think we need a real-life example to see what you are trying to do.
 
  But by passing L as the first argument of lapply(). I thought there was a
  tangentially-related post on this mailing list in the past but I don't
 recall
  that it was ever addressed directly (and I can't seem to find it now).
 The
  examples above are perfectly good alternatives especially if I wrap each
 of
  the lines in names-() to return lists with appropriate names assigned,
 but
 
 Try something like
 
 L[] <- lapply(seq_along(L),function(i,x) if(i==1) "jack" else "sape",x=L)
 
  it feels like I am essentially writing a FOR-LOOP - though I was
 surprised to
  find that speed-wise, it doesn't seem to make much of a difference
 (unless I
  have not selected a rigorous test):
 
  N - 1
  y - runif(N)
  ## looping through elements of y
  system.time(lapply(y,
  +function(x) {
  +  set.seed(222)
  +  mean(rnorm(1e4,x,1))
  +}))
  [1] 21.00  0.17 21.29NANA
  ## looping through indices
  system.time(lapply(1:N,
  +function(x,y) {
  +  set.seed(222)
  +  mean(rnorm(1e4,y[x],1))
  +  },y=y))
  [1] 21.09  0.14 21.26NANA
 
  In Python, there are methods for Lists and Dictionaries called
 enumerate(),
  and iteritems(), respectively. Example applications:
 
  ## a list
  L = ['a','b','c']
  [x for x in enumerate(L)]
  ## returns index of list along with the list element
  [(0, 'a'), (1, 'b'), (2, 'c')]
 
  ## a dictionary
  D = {'jack': 4098, 'sape': 4139}
  [x for x in D.iteritems()]
  ## returns element key (name) along with element contents
  [('sape', 4139), ('jack', 4098)]
 
  And this is something of the effect I was looking for...
 
  Thanks to all,
 
  Stephen
 

Re: [R] selecting characters from a line of text

2007-06-11 Thread Stephen Tucker
Maybe substring() is what you're looking for? Some examples:

> substring("textstring",1,5)
[1] "texts"
> substring("textstring",3)
[1] "xtstring"
> substring("textstring",3,nchar("textstring"))
[1] "xtstring"
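
Since the question is about pulling fields from fixed positions into a table, here is a minimal sketch of how that might look. The input lines, the column positions and the helper names left() and right() are all invented for illustration; real files would be read with readLines() first.

```r
# Hypothetical fixed-width lines, standing in for lines read with readLines()
lines <- c("ID001 2007-06-11 12.5",
           "ID002 2007-06-12 13.1")

# substring() is vectorised over the text, so one call per field is enough
tab <- data.frame(id    = substring(lines, 1, 5),
                  date  = substring(lines, 7, 16),
                  value = as.numeric(substring(lines, 18, 21)),
                  stringsAsFactors = FALSE)

# Rough equivalents of Excel's LEFT() and RIGHT() (helper names are made up)
left  <- function(s, n) substring(s, 1, n)
right <- function(s, n) substring(s, nchar(s) - n + 1, nchar(s))
```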


--- Tim Holland [EMAIL PROTECTED] wrote:

 Is there a way in R to select certain characters from a line of text?  I
 have some data that is presently in a large number of text files, and I
 would like to be able to select elements of each text file (elements are
 always on the same line, in the same position) and organize them into a
 table.  Is there a tool to select text in this way in R?  What I am looking
 for would be somewhat similar to the left() and right() functions in Excel.
 I have looked at the parse() and scan() functions, but don't think they can
 do what I want (although I could be wrong).
 Thank you,
 Tim
 


Re: [R] How to specify the start position using plot

2007-06-10 Thread Stephen Tucker

plot(x=1:10,y=1:10,xlim=c(0,5),ylim=c(6,10))

Many of the argument descriptions for plot() are found in ?par; xlim and
ylim themselves are documented in ?plot.default.

--- Patrick Wang [EMAIL PROTECTED] wrote:

 Hi,
 
 How to specify the start position of Y in plot command, hopefully I can
 specify the range of X and Y axis. I checked the ?plot, it didnot mention
 I can setup the range.
 
 
 Thanks
 Pat
 


Re: [R] Tools For Preparing Data For Analysis

2007-06-10 Thread Stephen Tucker

Since R is supposed to be a complete programming language, I wonder
why these tools couldn't be implemented in R (unless speed is the
issue). Of course, it's a naive desire to have a single language that
does everything, but it seems that R currently has most of the
functions necessary to do the type of data cleaning described.

For instance, Gabor and Peter showed some snippets of ways to do this
elegantly; my [physical science] data is often not as horrendously
structured so usually I can get away with a program containing this
type of code

txtin <- scan(filename,what="",sep="\n")
filteredList <- lapply(strsplit(txtin,delimiter),FUN=filterfunction)
   # filterfunction() returns selected (and possibly transformed)
   # elements if present and NULL otherwise;
   # may include calls to grep(), regexpr(), gsub(), substring(),
   # nchar(), sscanf(), type.convert(), paste(), etc.
mydataframe <- do.call(rbind,filteredList)
   # then match(), subset(), aggregate(), etc.

In the case that the file is large, I open a file connection and scan
a single line + apply filterfunction() successively in a FOR-LOOP
instead of using lapply(). Of course, the devil is in the details of
the filtering function, but I believe most of the required text
processing facilities are already provided by R.
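
To make the outline above concrete, here is a small runnable sketch. Everything specific in it is an assumption made for illustration: the input lines, the ";" delimiter, and the rule inside filterfunction() (keep records with exactly three fields).

```r
# Hypothetical raw input; in practice this would come from scan() or readLines()
txtin <- c("# comment line",
           "siteA;2007-06-01;12.5",
           "malformed line",
           "siteB;2007-06-02;13.1")

# Made-up filter: keep records with exactly three fields, drop everything else
filterfunction <- function(fields) {
  if (length(fields) != 3) return(NULL)
  data.frame(site = fields[1],
             date = as.Date(fields[2]),
             value = as.numeric(fields[3]),
             stringsAsFactors = FALSE)
}

filteredList <- lapply(strsplit(txtin, ";"), filterfunction)
# Drop the NULL entries explicitly before binding the rows together
mydataframe <- do.call(rbind, Filter(Negate(is.null), filteredList))
```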

I often have tasks that involve a combination of shell-scripting and
text processing to construct the data frame for analysis; I started
out using Python+NumPy to do the front-end work but have been using R
progressively more (frankly, all of it) to take over that portion
since I generally prefer the data structures and methods in R.


--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 Douglas Bates wrote:
  Frank Harrell indicated that it is possible to do a lot of difficult
  data transformation within R itself if you try hard enough but that
  sometimes means working against the S language and its whole object
  view to accomplish what you want and it can require knowledge of
  subtle aspects of the S language.

 Actually, I think Frank's point was subtly different: It is *because* of 
 the differences in view that it sometimes seems difficult to find the 
 way to do something in R that  is apparently straightforward in SAS. 
 I.e. the solutions exist and are often elegant, but may require some 
 lateral thinking.
 
 Case in point: Finding the first or the last observation for each 
 subject when there are multiple records for each subject. The SAS way 
 would be a datastep with IF-THEN-DELETE, and a RETAIN statement so that 
 you can compare the subject ID with the one from the previous record, 
 working with data that are sorted appropriately.
 
 You can do the same thing in R with a for loop, but there are better 
 ways, e.g.
 subset(df,!duplicated(ID)), and subset(df, rev(!duplicated(rev(ID)))), or 
 maybe
 do.call(rbind,lapply(split(df,df$ID), head, 1)), resp. tail. Or 
 something involving aggregate(). (The latter approaches generalize 
 better to other within-subject functionals like cumulative doses, etc.).
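
 A small sketch of these idioms on invented data (the column names and
 values are assumptions made for illustration; the fromLast argument of
 duplicated() is available in current versions of R):

```r
# Hypothetical long-format data: several records per subject, sorted by visit
df <- data.frame(ID    = c(1, 1, 1, 2, 2, 3),
                 visit = c(1, 2, 3, 1, 2, 1),
                 dose  = c(10, 20, 30, 5, 15, 40))

# First and last record for each subject
first_obs <- df[!duplicated(df$ID), ]
last_obs  <- df[!duplicated(df$ID, fromLast = TRUE), ]

# A within-subject functional: cumulative dose per subject
df$cumdose <- ave(df$dose, df$ID, FUN = cumsum)
```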
 
 The hardest cases that I know of are the ones where you need to turn one 
 record into many, such as occurs in survival analysis with 
 time-dependent, piecewise constant covariates. This may require 
 transposing the problem, i.e. for each  interval you find out which 
 subjects contribute and with what, whereas the SAS way would be a 
 within-subject loop over intervals containing an OUTPUT statement.
 
 Also, there are some really weird data formats, where e.g. the input 
 format is different in different records. Back in the 80's where 
 punched-card input was still common, it was quite popular to have one 
 card with background information on a patient plus several cards 
 detailing visits, and you'd get a stack of cards containing both kinds. 
 In R you would most likely split on the card type using grep() and then 
 read the two kinds separately and merge() them later.
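
 The split-and-merge approach might look roughly like this. The record
 layout ("P" for patient lines, "V" for visit lines) and all column names
 are invented for the sketch; read.table(text = ) requires a reasonably
 recent version of R.

```r
# Hypothetical mixed records: "P" lines hold patient data, "V" lines hold visits
cards <- c("P 001 F 34",
           "V 001 2007-01-10 120",
           "V 001 2007-02-14 118",
           "P 002 M 51",
           "V 002 2007-01-12 135")

is_patient <- grepl("^P", cards)

# Read each kind of record with its own layout; character classes keep
# "001" from becoming 1 and "F" from being read as FALSE
patients <- read.table(text = cards[is_patient],
                       col.names = c("type", "id", "sex", "age"),
                       colClasses = c("character", "character", "character", "numeric"))
visits <- read.table(text = cards[!is_patient],
                     col.names = c("type", "id", "date", "sbp"),
                     colClasses = c("character", "character", "character", "numeric"))

# Drop the record-type column and join visits to patient background data
merged <- merge(patients[-1], visits[-1], by = "id")
```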
 


Re: [R] Tools For Preparing Data For Analysis

2007-06-10 Thread Stephen Tucker
Embarrassingly, I don't know awk or sed, but R code seems to be
shorter than Python for most tasks, which is my basis for comparison.

It's true that R's more powerful data structures usually aren't
necessary for the data cleaning, but sometimes in the filtering
process I will pick out lines that contain certain data, in which case
I have to convert text to numbers and perform operations like
which.min(), order(), etc., so in that sense I like to have R's
vectorized notation and the objects/functions that support it.

As far as some of the tasks you described, I've tried transcribing
them to R. I know you provided only the simplest examples, but even in
these cases I think R's functions for handling these situations
exemplify their usefulness in this step of the analysis. But perhaps
you would argue that this code is too long... In any event it will
still save the trouble of keeping track of an extra (intermediate)
file passed between awk and R.

(1) the numbers of fields in each line, equivalent to 
cat datafile.csv | awk 'BEGIN{FS=","}{n=NF;print n}'
in awk

# R equivalent:
nFields <- count.fields("datafile.csv",sep=",")
# or 
nFields <- sapply(strsplit(readLines("datafile.csv"),","),length)

(2) which lines have the wrong number of fields, and how many fields
they have. You can similarly count how many lines there are (e.g. pipe
into wc -l).

# number of lines with wrong number of fields
nWrongFields <- length(nFields[nFields != 10])

# select only first ten fields from each line
# and return a matrix
firstTenFields <- 
  do.call(rbind,
          lapply(strsplit(readLines("datafile.csv"),","),
                 function(x) x[1:10]))

# select only those lines which contain ten fields
# and return a matrix
onlyTenFields <- 
  do.call(rbind,
          lapply(strsplit(readLines("datafile.csv"),","),
                 function(x) if(length(x) == 10) x else NULL))
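
For anyone who wants to try this without a real data file, here is a self-contained sketch that writes a deliberately ragged CSV to a temporary file first. The file contents and the expected field count of 3 are made up for the example.

```r
# Write a small, deliberately ragged CSV to a temporary file (made-up data)
csvfile <- tempfile(fileext = ".csv")
writeLines(c("a,b,c",
             "1,2,3",
             "4,5",
             "6,7,8,9"), csvfile)

expected <- 3
nFields <- count.fields(csvfile, sep = ",")

# Line numbers with the wrong number of fields, and how many fields they have
bad <- which(nFields != expected)
report <- data.frame(line = bad, fields = nFields[bad])

# Keep only the well-formed lines and bind them into a character matrix
fields <- strsplit(readLines(csvfile), ",")
good <- do.call(rbind, fields[lengths(fields) == expected])
unlink(csvfile)
```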

(3)
If for instance you try to
read the following CSV into R as a dataframe:
 
1,2,.,4
2,.,4,5
3,4,.,6
 

txtC <- textConnection(
"1,2,.,4
2,.,4,5
3,4,.,6")
# using read.csv() specifying na.string argument:
> read.csv(txtC,header=FALSE,na.string=".")
  V1 V2 V3 V4
1  1  2 NA  4
2  2 NA  4  5
3  3  4 NA  6

# Of course, read.csv will work only if data is formatted correctly.
# More generally, using readLines(), strsplit(), etc., which are more
# flexible:

> do.call(rbind,
+         lapply(strsplit(readLines(txtC),","),
+                type.convert,na.string="."))
     [,1] [,2] [,3] [,4]
[1,]    1    2   NA    4
[2,]    2   NA    4    5
[3,]    3    4   NA    6

(4) Situations where people mix ",," and ",.,"!

# type.convert (and read.csv) will still work when missing values are ",,"
# and ",.," (automatically recognizes "" as NA and through
# specification of 'na.string', can recognize "." as NA)

# If it is desired to convert "." to "" first, this is simple as
# well:

m <- do.call(rbind,
             lapply(strsplit(readLines(txtC),","),
                    function(x) gsub("^\\.$","",x)))
> m
     [,1] [,2] [,3] [,4]
[1,] "1"  "2"  ""   "4" 
[2,] "2"  ""   "4"  "5" 
[3,] "3"  "4"  ""   "6" 

# then
mode(m) <- "numeric"
# or
m <- apply(m,2,type.convert)
# will give
> m
     [,1] [,2] [,3] [,4]
[1,]    1    2   NA    4
[2,]    2   NA    4    5
[3,]    3    4   NA    6


--- [EMAIL PROTECTED] wrote:

 On 10-Jun-07 19:27:50, Stephen Tucker wrote:
  
  Since R is supposed to be a complete programming language,
  I wonder why these tools couldn't be implemented in R
  (unless speed is the issue). Of course, it's a naive desire
  to have a single language that does everything, but it seems
  that R currently has most of the functions necessary to do
  the type of data cleaning described.
 
 In principle that is certainly true. A couple of comments,
 though.
 
 1. R's rich data structures are likely to be superfluous.
Mostly, at the sanitisation stage, one is working with
flat files (row  column). This straightforward format
is often easier to handle using simple programs for the
kind of basic filtering needed, rather then getting into
the heavier programming constructs of R.
 
 2. As follow-on and contrast at the same time, very often
what should be a nice flat file with no rough edges is not.
If there are variable numbers of fields per line, R will
not handle it straightforwardly (you can force it in,
but it's more elaborate). There are related issues as well.
 
 a) If someone entering data into an Excel table lets their
cursor wander outside the row/col range of the table,
this can cause invisible entities to be planted in the
extraneous cells. When saved as a CSV, this file then
has variable numbers of fields per line, and possibly
also extra lines with arbitrary blank fields.
 
cat datafile.csv | awk 'BEGIN{FS=","}{n=NF;print n}'
 
will give you the numbers of fields in each line.
 
If you further pipe it into | sort -nu you will get
the distinct field-numbers. If you know (by now) how many
fields there should be (e.g. 10), then
 
cat

Re: [R] looking for the na.omit equivalent for a matrix of characters

2007-05-28 Thread Stephen Tucker
You can also use type.convert() if you did want to convert your characters to
numbers and NA's to NA's so that you can use na.omit().

> x <- matrix("0",5,5)
> x[1,3] <- x[4,4] <- "NA"
> newx <- apply(x,2,type.convert)
> newx
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0   NA    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0   NA    0
[5,]    0    0    0    0    0
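
Carrying this through to the na.omit() step might look like the sketch below. The as.is = TRUE argument is an addition not in the snippet above; it stops type.convert() from turning non-numeric text into factors. The last line shows an alternative that leaves the matrix as character.

```r
x <- matrix("0", 5, 5)
x[1, 3] <- x[4, 4] <- "NA"

# as.is = TRUE keeps type.convert() from creating factors for non-numeric text
newx <- apply(x, 2, type.convert, as.is = TRUE)
cleaned <- na.omit(newx)

# Alternative that leaves the matrix as character: drop rows containing "NA"
kept <- x[rowSums(x == "NA") == 0, , drop = FALSE]
```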

--- jim holtman [EMAIL PROTECTED] wrote:

 Since they are characters you can just compare for them.  You did not show
 what your data looks like, or what you want to do if there are NA.  Do
 you
 want the row removed?  You can use 'apply' to test a row for NA:
 
  x <- matrix("0",5,5)
  x[1,3] <- x[4,4] <- "NA"
  x
      [,1] [,2] [,3] [,4] [,5]
 [1,] "0"  "0"  "NA" "0"  "0" 
 [2,] "0"  "0"  "0"  "0"  "0" 
 [3,] "0"  "0"  "0"  "0"  "0" 
 [4,] "0"  "0"  "0"  "NA" "0" 
 [5,] "0"  "0"  "0"  "0"  "0" 
  apply(x, 1, function(z) any(z == "NA"))
 [1]  TRUE FALSE FALSE  TRUE FALSE
  x[!apply(x, 1, function(z) any(z == "NA")),]
      [,1] [,2] [,3] [,4] [,5]
 [1,] "0"  "0"  "0"  "0"  "0" 
 [2,] "0"  "0"  "0"  "0"  "0" 
 [3,] "0"  "0"  "0"  "0"  "0" 
 
 
 
 
 On 5/28/07, Andrew Yee [EMAIL PROTECTED] wrote:
 
  I have a matrix of characters (actually numbers that have been read in as
  numbers), and I'd like to remove the NA.
 
  I'm familiar with na.omit, but is there an equivalent of na.omit when the
  NA
  are the actual characters NA?
 
  Thanks,
  Andrew
 
 
 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem you are trying to solve?
 


Re: [R] loop in function

2007-05-05 Thread Stephen Tucker
Actually I am not sure what you want exactly, but is it

df1 <- data.frame(b=c(1,2,3,4,5,5,6,7,8,9,10))
df2 <- data.frame(x=c(1,2,3,4,5), y=c(2,5,4,6,5), z=c(10, 8, 7, 9, 3))
df1 <- cbind(df1,
             "colnames<-"(sapply(with(df2,(x+y)/z),
                                 function(a,b) a/b,b=df1$b),
                          paste("goal",seq(nrow(df2)),sep="")))

> round(df1,2)
b goal1 goal2 goal3 goal4 goal5
1   1  0.30  0.88  1.00  1.11  3.33
2   2  0.15  0.44  0.50  0.56  1.67
3   3  0.10  0.29  0.33  0.37  1.11
4   4  0.07  0.22  0.25  0.28  0.83
5   5  0.06  0.17  0.20  0.22  0.67
6   5  0.06  0.17  0.20  0.22  0.67
7   6  0.05  0.15  0.17  0.19  0.56
8   7  0.04  0.12  0.14  0.16  0.48
9   8  0.04  0.11  0.12  0.14  0.42
10  9  0.03  0.10  0.11  0.12  0.37
11 10  0.03  0.09  0.10  0.11  0.33

Each "goal" column corresponds to a row of df2. Alternatively, the sapply()
call can be rewritten with apply():

apply(df2,1,
      function(a,b) (a["x"]+a["y"])/(a["z"]*b),
      b=df1$b)
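
Another possibility, offered only as a sketch, is outer(), which forms every combination of the per-row ratio from df2 with 1/b from df1 in one step. The intermediate names ratio, goal and result are made up here.

```r
df1 <- data.frame(b = c(1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10))
df2 <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 5, 4, 6, 5), z = c(10, 8, 7, 9, 3))

# One ratio per row of df2, then every combination with 1/b via outer()
ratio <- with(df2, (x + y) / z)
goal <- outer(1 / df1$b, ratio)
colnames(goal) <- paste0("goal", seq_along(ratio))
result <- cbind(df1, goal)
```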

Hope this answered your question...

--- [EMAIL PROTECTED] wrote:

 Dear Mailing-List,
 I think this is a newbie question. However, i would like to integrate a
 loop in the function below. So that the script calculates for each
 variable within the dataframe df1 the connecting data in df2. Actually it
 takes only the first row. I hope that's clear. My goal is to apply the
 function for each data in df1. Many thanks in advance. An example is as
 follows:
 
 df1 <- data.frame(b=c(1,2,3,4,5,5,6,7,8,9,10))
 df2 <- data.frame(x=c(1,2,3,4,5), y=c(2,5,4,6,5), z=c(10, 8, 7, 9, 3))
 attach(df2)
 myfun = function(yxz) (x + y)/(z * df1$b)
 df1$goal <- apply(df2, 1, myfun)
 df1$goal
 
 regards,
 
 kay
 


Re: [R] Single Title for the Multiple plot page

2007-05-03 Thread Stephen Tucker
Sometimes I just overlay a blank plot and annotate with text.

par(mfrow=c(1,2), oma=c(2,0,2,0))
plot(1:10)
plot(1:10)
oldpar <- par(no.readonly=TRUE)
par(mfrow=c(1,1),new=TRUE,mar=rep(0,4),oma=rep(0,4))
plot.window(xlim=c(0,1),ylim=c(0,1),mar=rep(0,4))
text(0.5,c(0.98,0.02),c("Centered Overall Title","Centered Subtitle"),
     cex=c(1.4,1))
par(oldpar)


--- Chuck Cleland [EMAIL PROTECTED] wrote:

 Mohammad Ehsanul Karim wrote:
  Dear List, 
  
  In R we can plot multiple graphs in same page using
  par(mfrow = c(*,*)). In each plot we can set title
  using main and sub commands. 
  
  However, is there any way that we can place an
  universal title above the set of plots placed in the
  same page (not individual plot titles, all i need is a
  title of the whole graph page) as well as sub-titles?
  Do I need any package to do so?
  
  Thank you for your time.
 
   This is covered in a number of places in the archives and
 RSiteSearch("Overall Title") points you to relevant posts. I'm not sure
 there is an example of having both a title and subtitle, but that is
 easy enough:
 
  par(mfrow=c(1,2), oma=c(2,0,2,0))
  plot(1:10)
  plot(1:10)
  title("Centered Overall Title", outer=TRUE)
  mtext(side=1, "Centered Subtitle", outer=TRUE)
 
   You might consider upgrading to a more recent version of R.
 
  Mohammad Ehsanul Karim (R - 2.3.1 on windows)
  Institute of Statistical Research and Training
  University of Dhaka 
 
 -- 
 Chuck Cleland, Ph.D.
 NDRI, Inc.
 71 West 23rd Street, 8th floor
 New York, NY 10010
 tel: (212) 845-4495 (Tu, Th)
 tel: (732) 512-0171 (M, W, F)
 fax: (917) 438-0894
 


[R] regular expressions with grep() and negative indexing

2007-04-25 Thread Stephen Tucker
Dear R-helpers,

Does anyone know how to use regular expressions to return vector elements
that don't contain a word? For instance, if I have a vector
  x <- c("seal.0","seal.1-exclude")
I'd like to get back the elements which do not contain the word "exclude",
using something like (I know this doesn't work):
  grep("[^(exclude)]",x)

I can use 
  x[-grep("exclude",x)]
for this case, but if I use this expression in a recursive function, it
will not work for instances in which the vector contains no elements with
that word. For instance, if I have
  x2 <- c("dolphin.0","dolphin.1")
then
  x2[-grep("exclude",x2)]
will give me 'character(0)'

I know I can accomplish this in several steps, for instance:
  myfunc <- function(x) {
    iexclude <- grep("exclude",x)
    if(length(iexclude) > 0) x2 <- x[-iexclude] else x2 <- x
    # do stuff with x2 ...?
  }

But this is embedded in a much larger function and I am trying to minimize
intermediate variable assignment (perhaps a futile effort). But if anyone
knows of an easy solution, I'd appreciate a tip.
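
In current versions of R, two one-line alternatives avoid the empty-index problem altogether: logical indexing with grepl(), and the invert argument of grep(). A minimal sketch using the vectors from above (the names keep1 to keep3 are made up):

```r
x  <- c("seal.0", "seal.1-exclude")
x2 <- c("dolphin.0", "dolphin.1")

# Logical indexing: an all-FALSE match vector negates to all-TRUE
keep1 <- x[!grepl("exclude", x)]
keep2 <- x2[!grepl("exclude", x2)]

# grep() can also return the non-matching elements directly
keep3 <- grep("exclude", x2, invert = TRUE, value = TRUE)
```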

Thanks very much!

Stephen



Re: [R] regular expressions with grep() and negative indexing

2007-04-25 Thread Stephen Tucker
Thanks, guys, for the suggestions. I come across this problem a lot, but
now I have many solutions.

Thank you,

Stephen


--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 Peter Dalgaard wrote:
  Stephen Tucker wrote:

  Dear R-helpers,
 
  Does anyone know how to use regular expressions to return vector
 elements
  that don't contain a word? For instance, if I have a vector
x - c(seal.0,seal.1-exclude)
  I'd like to get back the elements which do not contain the word
 exclude,
  using something like (I know this doesn't work) but:
grep([^(exclude)],x)
 
  I can use 
x[-grep(exclude,x)]
  for this case but then if I use this expression in a recursive function,
 it
  will not work for instances in which the vector contains no elements
 with
  that word. For instance, if I have
x2 - c(dolphin.0,dolphin.1)
  then
x2[-grep(exclude,x2)]
  will give me 'character(0)'
 
  I know I can accomplish this in several steps, for instance:
myfunc - function(x) {
  iexclude - grep(exclude,x)
  if(length(iexclude)  0) x2 - x[-iexclude] else x2 - x
  # do stuff with x2 ...?
}
 
  But this is embedded in a much larger function and I am trying to
 minimize
  intermediate variable assignment (perhaps a futile effort). But if
 anyone
  knows of an easy solution, I'd appreciate a tip.

  
  It has come up a couple of times before, and yes, it is a bit of a pain.
 
  Probably the quickest way out is
 
  negIndex <- function(i) 
     if(length(i))
         -i 
     else 
         TRUE

 ... which of course needs braces if typed on the command line
 
 negIndex <- function(i) 
 {
    if(length(i))
        -i 
    else 
        TRUE
 }
 
 And I should probably also have said that it works like this:
 
  x2 <- c("dolphin.0","dolphin.1")
  x2[-grep("exclude",x2)]
 character(0)
  x2[negIndex(grep("exclude",x2))]
 [1] "dolphin.0" "dolphin.1"
 
 
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907
 




Re: [R] queries

2007-04-22 Thread Stephen Tucker
hist(rnorm(100),xlab="Data",ylab="Count",main="")
title(main="Histogram of ...",cex=0.5)

see ?par for details on xlab, ylab, main, and cex arguments.
You can call these from title() or include them in hist().
I called title(main=..) separately to control its size separately
from the rest of the text (axis and tick labels).



--- Nima Tehrani [EMAIL PROTECTED] wrote:

 Dear Help Desk,

   Is there any way to change some of the labels on R diagrams? 

   Specifically in histograms, I would like to: 

   1. change the word frequency to count. 
   2. Make the font of the title (Histogram of …) smaller.
   3. Have a different word below the histogram than the one
 occurring in the title (right now if you choose X for your variable, it
 comes both above the histogram (in the phrase Histogram of X) and below
 it).

   Thanks for your time,
   Nima
 



Re: [R] queries

2007-04-22 Thread Stephen Tucker
My apologies. The second line should be
title(main="Histogram of ...",cex.main=0.5)

Actually, I just realized you can also do
hist(rnorm(100),xlab="Data",ylab="Count",cex.main=0.5)

...this way you don't have to call title() separately.
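
Putting the three requests together, a single call along these lines should
be enough. This is only a sketch: the title text is a made-up placeholder,
and pdf(NULL) is there just so it runs without opening a window.

```r
set.seed(1)                 # only so the sketch is reproducible
x <- rnorm(100)

pdf(NULL)                   # null device; drop this line to draw on screen
h <- hist(x,
          xlab = "Data",                     # label below the plot, independent of the title
          ylab = "Count",                    # replaces the default "Frequency"
          main = "Histogram of my sample",   # hypothetical title text
          cex.main = 0.5)                    # shrinks only the title
invisible(dev.off())
```

cex.main scales the title alone, so the axis and tick labels keep their
normal size.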




Re: [R] Opinion on R plots: connecting X and Y

2007-04-20 Thread Stephen Tucker
Edward Tufte has some opinions on this topic.

In "The Visual Display of Quantitative Information" (Chapter 6: Data-Ink
Maximization and Graphical Design - Redesign of the Scatterplot), he
presents several alternatives:

(1) the non-data-bearing frame of conventional scatterplots (equivalent to
R's bty="l"), which he argues is the common but less informative method.

(2) removing a little ink from (1) so that the axes span only the range of
the data (the range-frame).

(3) or, with a slight modification of (2), axes that mark the five-number
summary, as given by fivenum().

(4) or even dot-dash-plots, in which the marginal frequency distributions
are displayed along the axes using dots and dashes.

I don't know that bty="n" with xaxs,yaxs equal to "i" or "r" meets any of
these criteria (and bty="l" is apparently less informative than his other
suggestions)...
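
For what it's worth, something close to (2) and (3) can be approximated in
base graphics. The following is only a sketch of my reading of the idea, not
Tufte's exact design: the axes are suppressed and redrawn with ticks at the
five-number summary, so each axis line runs only from the minimum to the
maximum of the data.

```r
set.seed(42)                # only so the sketch is reproducible
x <- rnorm(100)
y <- rnorm(100)

x_ticks <- fivenum(x)       # min, lower hinge, median, upper hinge, max
y_ticks <- fivenum(y)

pdf(NULL)                   # null device; drop this line to draw on screen
plot(x, y, axes = FALSE)    # no box and no default axes
# axis() draws its line between the outermost tick positions only,
# which gives the range-frame effect
axis(1, at = x_ticks, labels = round(x_ticks, 1))
axis(2, at = y_ticks, labels = round(y_ticks, 1))
invisible(dev.off())
```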


--- Inman, Brant A.   M.D. [EMAIL PROTECTED] wrote:

 
 Attention R users, especially those that are experienced enough to be
 opinionated, I need your input.
 
 Consider the following simple plot:
 
 x <- rnorm(100)
 y <- rnorm(100)
 plot(x, y, bty='n')
 
 A colleague (and dreaded SAS user) commented that she thought that my
 plots could be cleaned up by connecting the X and Y axes.  I know that
 I can do that with bty='l' but I don't want to, I find that the plots
 look less cluttered with disjoint axes.
 
 However, I was intrigued enough by her comments that I decided to
 solicit the opinions of others on this issue.  Are there principled
 reasons why one should prefer joined axes or disjoint axes?
 
 Brant Inman
 


Re: [R] Using mean if two values are identical

2007-04-19 Thread Stephen Tucker
## making data up
# make matrix with some equal values
> mat <- cbind(x=rnorm(10),y=rnorm(10),z=rnorm(10))
> mat[c(8,9),"y"] <- mat[c(1,7),"x"]
> mat
                x          y           z
 [1,]  0.26116849  0.5823529 -0.96924020
 [2,] -0.21415406  0.1085396  2.00542549
 [3,]  0.56890081 -1.2526322  0.08539552
 [4,] -1.09096693 -1.9369088  0.03079587
 [5,] -1.31749886 -1.1437411 -0.29125624
 [6,] -0.45941172  0.2997472  0.10329381
 [7,]  0.39586456 -0.2587432 -1.29788184
 [8,] -0.05066363  0.2611685 -0.47942195
 [9,] -0.87602919  0.3958646 -0.53205231
[10,]  0.30059621 -1.9531231  0.22398194
 

## find the index of y which corresponds to an equivalent value of
## x and find the mean. the following function searches, for each x,
## for matching values of y and returns the value of x, the index
## of y, and the mean value
> t(apply(mat[,c("x","z")], MARGIN=1, FUN=function(v,w) {
+   yindex <- which(abs(v["x"]-w[,"y"]) < .Machine$double.eps^0.5)
+   if(length(yindex) > 0) {
+     c(xVal=v["x"],indexOfy=yindex,meanVal=mean(c(v["z"],w[yindex,"z"])))
+   } else {
+     c(xVal=v["x"],indexOfy=NA,meanVal=NA)
+   }
+ },w=mat[,c("y","z")]))
                x indexOfy   meanVal
 [1,]  0.26116849        8 -0.724331
 [2,] -0.21415406       NA        NA
 [3,]  0.56890081       NA        NA
 [4,] -1.09096693       NA        NA
 [5,] -1.31749886       NA        NA
 [6,] -0.45941172       NA        NA
 [7,]  0.39586456        9 -0.914967
 [8,] -0.05066363       NA        NA
 [9,] -0.87602919       NA        NA
[10,]  0.30059621       NA        NA

Hope this helps.
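
If what is wanted is the mean of z over rows that share the same (x, y)
pair, aggregate() may be a more direct route. The sketch below uses a few
rows copied from the dat.dat listing in the question; it is an alternative
to the code above, not a correction of it, and it assumes the coordinates
can be compared exactly (for computed coordinates a tolerance, as above, is
safer).

```r
# a subset of the rows from dat.dat, typed in by hand
INPUT <- data.frame(
  x = c(29, 29, 29, 30, 29, 29, 30),
  y = c(4.5, 4.6, 4.7, 0.1, 4.5, 4.6, 0.1),
  z = c(1.505713, 1.580402, 1.656875, 0.00096108,
        1.495148, 1.568961, 0.00093699)
)

# one row per distinct (x, y) pair, with z replaced by its group mean
zMEAN <- aggregate(z ~ x + y, data = INPUT, FUN = mean)

# the pair from the question: x = 29, y = 4.5
z_29_45 <- zMEAN$z[zMEAN$x == 29 & zMEAN$y == 4.5]
```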

--- Felix Wave [EMAIL PROTECTED] wrote:

 Hello,
 I have got a question. 
 I've got a matrix (at the end of this mail) with the colnames x, y, z. This
 matrix holds different measurements. x and y are rising coordinates.
 
 My question: whenever the x AND y coordinates are the same, I want to
 get the mean of their z values.
 
 
 e.g. 
 x AND y in line 1 and line 8 are identical: 
 29 4.5 -- mean of 1.505713 and 1.495148
 
 
 Thanks a lot.
 Felix
 
 
 
 
 ###
 ## My R Code ##
 ###
 INPUT   <- readLines("dat.dat")
 INPUT   <- gsub("^ ", "", INPUT)
 INPUT   <- t( sapply( strsplit(INPUT, split=" "), as.numeric ) )
 colnames(INPUT) <- c("x", "y", "z" )
 
 
 # HERE START's my PROBLEM #
 if (duplicated(INPUT[,1]  INPUT[,2] ))
   zMEAN   <- mean(INPUT[,3] )
 
 # MATRIX with the mean-z values #
 zMATRIX <- matrix(c(INPUT[,1], INPUT[,2], INPUT[,3] ), ncol=3, byrow=FALSE)
 
 
 
 
 #
 ## dat.dat ##
 #
 29 4.5 1.505713
 29 4.6 1.580402
 29 4.7 1.656875
 29 4.8 1.735054
 30 0 0
 30 0.1 0.00096108
 30 0.2 0.00323831
 29 4.5 1.495148
 29 4.6 1.568961
 29 4.7 1.644467
 30 0 0
 30 0.1 0.00093699
 30 0.2 0.00319411
 30 0.3 0.00676619
 


Re: [R] Error with strptime

2007-04-19 Thread Stephen Tucker
You have to use the POSIXct class to store date-time objects in data
frames. strptime() returns an object of class POSIXlt, which is internally
a list of nine components (sec, min, hour, ...); that is why the assignment
complains that the replacement has 9 rows. When you do the cbind(), test2
is automatically converted to POSIXct.

You probably want
bsamp$spltime<-as.POSIXct(strptime(test,format="%d-%B-%y %H:%M"))

(but please be aware of time-zone issues when using POSIXct classes). These
documents may help:

http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf

=== Also ===
Changes in behavior of POSIXt classes since the aforementioned R News
publication:

http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13626.html
http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13632.html
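
A minimal sketch of the conversion, under a few assumptions of mine: the
stamps are made up, the month is numeric (%m) rather than a French month
name (%B) so that the result does not depend on the locale, and the small
data frame is a hypothetical stand-in for bsamp.

```r
test <- c("01-03-06 11:40", "15-03-06 11:30")   # made-up date-time strings

lt <- strptime(test, format = "%d-%m-%y %H:%M", tz = "GMT")  # class POSIXlt (list-based)
ct <- as.POSIXct(lt)                                         # class POSIXct (one number per stamp)

bsamp <- data.frame(id = 1:2)   # hypothetical stand-in for the real data
bsamp$spltime <- ct             # one value per row, so the assignment is unambiguous

first_stamp <- format(bsamp$spltime[1], "%Y-%m-%d %H:%M", tz = "GMT")
```

Fixing tz explicitly, as here, is one way to sidestep the time-zone issues
mentioned above when the data are all in one zone.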



--- Jean-Louis Abitbol [EMAIL PROTECTED] wrote:

 Dear All,
 
 I am trying to convert to POSIXct after pasting a date and a time in
 character format with strptime.
 
 It is probably obvious but I don't understand why I get an error message
 after
 
 bsamp$spltime<-strptime(test,format="%d-%B-%y %H:%M")
 
 whereas I can get what I want if I do it in 2 steps and rbinding ?
 
 Thanks and best regards, Jean-Louis
 
 This is the R console output
 
  > bsamp<-read.table("bsampl2.csv",header=T,sep=";")
  > names(bsamp)<-tolower(names(bsamp))
  > bsamp<-upData(bsamp,drop=c("study"))
 Input object size:   23896 bytes;15 variables
 Dropped variable study
 New object size: 23016 bytes;14 variables
  > bsamp$visitdat<-as.character(bsamp$visitdat)
  > bsamp$samtime<-as.character(bsamp$samtime)
  > bsamp$admtime<-as.character(bsamp$admtime)
  > bsamp$delay<-as.character(bsamp$delay)
  > test<-paste(bsamp$visitdat,bsamp$samtime)
  > test
   [1] "01-mars-06 11:40" "15-mars-06 11:30" "15-mars-06 15:00"
   [4] "29-mars-06 11:40" "01-mars-06 11:45" "15-mars-06 11:15"
   [7] "15-mars-06 14:45" "29-mars-06 12:50" "01-mars-06 11:16"
  [10] "15-mars-06 11:10" "15-mars-06 14:30" "29-mars-06 11:50"
  [13] "01-mars-06 11:50" "15-mars-06 11:25" "15-mars-06 14:55"
  [16] "29-mars-06 11:30" "01-mars-06 11:55" "15-mars-06 11:35"
  [19] "15-mars-06 "      "29-mars-06 11:45" "01-mars-06 11:09"
  .
 
 
  > bsamp$spltime<-strptime(test,format="%d-%B-%y %H:%M")
 Error in `$<-.data.frame`(`*tmp*`, spltime, value = list(sec =
 numeric(0),  :
 replacement has 9 rows, data has 140
 
 
  > test2<-strptime(test,format="%d-%B-%y %H:%M")
  > bsamp<-cbind(bsamp,test2)
  > bsamp$test2
   [1] 2006-03-01 11:40:00 Centre de l'Europe
   [2] 2006-03-15 11:30:00 Centre de l'Europe
   [3] 2006-03-15 15:00:00 Centre de l'Europe
   [4] 2006-03-29 11:40:00 Centre de l'Europe (heure d'été
   [5] 2006-03-01 11:45:00 Centre de l'Europe
   [6] 2006-03-15 11:15:00 Centre de l'Europe
   [7] 2006-03-15 14:45:00 Centre de l'Europe
   [8] 2006-03-29 12:50:00 Centre de l'Europe (heure d'été
   [9] 2006-03-01 11:16:00 Centre de l'Europe
  [10] 2006-03-15 11:10:00 Centre de l'Europe
  [11] 2006-03-15 14:30:00 Centre de l'Europe
  [12] 2006-03-29 11:50:00 Centre de l'Europe (heure d'été
  [13] 2006-03-01 11:50:00 Centre de l'Europe
  [14] 2006-03-15 11:25:00 Centre de l'Europe
  [15] 2006-03-15 14:55:00 Centre de l'Europe
  [16] 2006-03-29 11:30:00 Centre de l'Europe (heure d'été
  [17] 2006-03-01 11:55:00 Centre de l'Europe
  [18] 2006-03-15 11:35:00 Centre de l'Europe
  [19] NA
  ..
  sessionInfo()
 R version 2.4.1 (2006-12-18)
 i386-pc-mingw32
 
 locale:

LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets
 [6] methods   base
 
 other attached packages:
  car RColorBrewer   gplotsgdata   gtools
  1.2-1  0.2-3  2.3.2  2.3.1  2.3.1
  latticeHmisc  acepack  RWinEdt
0.14-17  3.3-11.3-2.2  1.7-5
 
 


Re: [R] Character coerced to factor and I cannot get it back

2007-04-19 Thread Stephen Tucker
You can also set this option globally with options(stringsAsFactors = FALSE)

I believe this was added in R 2.4.0.
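
A small sketch of what the setting changes, using the per-call argument
rather than the global option so that it does not alter anyone's session.
The object names are illustrative only.

```r
# character column stays character
DF_chr <- data.frame(let = letters[1:3], num = 1:3, stringsAsFactors = FALSE)

# character column is converted to a factor (integer codes plus levels)
DF_fac <- data.frame(let = letters[1:3], num = 1:3, stringsAsFactors = TRUE)

# getting the strings back from a factor column
back <- as.character(DF_fac$let)
```

typeof() on the factor column reports "integer" because a factor stores
integer codes; class() is what reports "factor".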



--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Try this:
 
 DF <- data.frame(let = letters[1:3], num = 1:3, stringsAsFactors = FALSE)
 str(DF)
 
 
 On 4/19/07, John Kane [EMAIL PROTECTED] wrote:
 
  --- Tyler Smith [EMAIL PROTECTED] wrote:
 
   I really need to sit down with the manual and sort
   factors and classes
   properly. In your case, I think the problem has
   something to do with
   the way a list behaves?  I'm not sure, but if you
   convert your list to
   a dataframe it seems to work ok:
  
   > dd3 <- as.data.frame(dd1)
   > typeof(dd3$st)
   [1] "integer"
   > class(dd3$st)
   [1] "factor"
   > dd3$st <- as.character(dd3$st)
   > typeof(dd3$st)
   [1] "character"
   > class(dd3$st)
   [1] "character"
  
   HTH,
  
   Tyler
 
  Seems to work nicely. I had forgotten about
  'as.data.frame'.
 
  I originally thought that it might be a list problem
  too but I don't think so. I set up the example as a
  list since that is the way my real data is being
  imported from csv. However after my original posting I
  went back and tried it with just a dataframe and I'm
  getting the same results. See below.
 
  I even shut down R , reloaded it and detached the two
  extra packages I usually load. Everything is working
  fine but I am doing some things with factors that I
  have never done before and this just makes me a bit
  paranoid.
 
  Thanks very much for the help.
 
 
  EXAMPLE
  dd  <- data.frame(aa <- 1:4, bb <-  letters[1:4],
  cc <- c(12345, 123456, 45678, 456789))
 
  id  <-  as.character(dd[,3]) ; id
 
  st  <- substring(id, 1,nchar(id)-4 ) ; st
  typeof (st)  ; class(st)
 
  dd1  <-  cbind(dd, st)
 names(dd1)  <- c("aa","bb","cc","st")
 dd1
 typeof(dd1$st); class(dd1$st)
 
  dd2  <-  cbind(dd, as.character(st))
 names(dd2)  <- c("aa","bb","cc","st")
 dd2
 typeof(dd2$st) ;   class(dd2$st)
 


Re: [R] importing excel-file

2007-04-18 Thread Stephen Tucker
I use gdata and it works quite well for me. It's as easy as

install.packages("gdata")
library(gdata)
data = read.xls("mydata.xls",sheet=1) 

[read.xls() can take other arguments]

It requires concurrent installation of Perl, but installing Perl is also
simple. For Windows, you can get it here:
http://www.activestate.com/Products/ActivePerl/


--- Gabor Csardi [EMAIL PROTECTED] wrote:

 There is also a read.xls command in package gdata; it seems that it uses
 a perl script called 'xls2csv'. I have no idea how good this is,
 never tried it.
 
 Btw, xlsReadWrite is Windows-only, so you can use it only if 
 you use windows.
 
 Gabor
 
 ps. Corinna, to be honest, I've no idea what kind of online help you've
 read, there is plenty. Next time try to be more specific please. 
 
 On Wed, Apr 18, 2007 at 03:07:51PM -0200, Alberto Monteiro wrote:
  Corinna Schmitt wrote:
   
   It is a quite stupid question but please help me. I am very 
   confused. I am able to import normal txt and mat-files to R but 
   unable to import .xls-file
   
  I've tried two ways to import excel files, but none of them
  seems perfect.
 [...]
 
 -- 
 Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK
 


Re: [R] Data Manipulation using R

2007-04-18 Thread Stephen Tucker
...is this what you're looking for?

donedat <- subset(data,ID < 6000 | ID >= 7000)
findat <- donedat[-unique(rapply(donedat,function(x)
 which( x < 0 ))),,drop=FALSE]

The second line looks through each column and finds the indices of negative
values - rapply() returns all of them as a vector; unique() removes
duplicated elements, and with negative indexing you remove these rows from
donedat.
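
One caveat with negative indexing: if no value is negative, the index is
empty and every row is dropped. A logical index does not have that problem
and also makes it easy to restrict the check to columns 3, 4, 6 and 8 as
asked. The sketch below uses a small made-up data frame (the column names
v2 to v8 are hypothetical) and assumes there are no NA values in the
checked columns.

```r
# made-up data: 7 rows, 8 columns, ID in the first column
dat <- data.frame(
  ID = c(1000, 5999, 6000, 6999, 7000, 8000, 9000),
  v2 = c(-1, 1, 1, 1, 1, 1, 1),    # negative here must NOT remove the row
  v3 = c(1, 1, 1, 1, -4, 1, 1),
  v4 = 1, v5 = 1, v6 = 1, v7 = 1,
  v8 = c(1, 1, 1, 1, 1, 1, -2)
)

# step 1: drop IDs from 6000 through 6999
donedat <- dat[dat$ID < 6000 | dat$ID >= 7000, , drop = FALSE]

# step 2: drop rows with a negative value in columns 3, 4, 6 or 8
check_cols <- c(3, 4, 6, 8)
has_neg <- rowSums(donedat[, check_cols] < 0) > 0
findat  <- donedat[!has_neg, , drop = FALSE]
```

With this data, the rows with ID 7000 and 9000 are removed in step 2, while
ID 1000 survives because its only negative value sits in column 2.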

--- Anup Nandialath [EMAIL PROTECTED] wrote:

 Dear Friends,
 
 I have data set with around 220,000 rows and 17 columns. One of the columns
 is an id variable which is grouped from 1000 through 9000. I need to
 perform the following operations. 
 
 1) Remove all the observations with id's between 6000 and 6999
 
 I tried using this method. 
 
 remdat1 <- subset(data, ID<6000)
 remdat2 <- subset(data, ID>=7000)
 donedat <- rbind(remdat1, remdat2)
 
 I check the last and first entry and found that it did not have ID values
 6000. Therefore I think that this might be correct, but is this the most
 efficient way of doing this?
 
 2) I need to remove observations within columns 3, 4, 6 and 8 when they are
 negative. For instance if the number in column 3 is -4, then I need to
 delete the entire observation. Can somebody help me with this too.
 
 Thank and Regards
 
 Anup
 



Re: [R] apply problem

2007-04-13 Thread Stephen Tucker
?apply says

"If X is not an array but has a dimension attribute, apply attempts to coerce
it to an array via as.matrix if it is two-dimensional (e.g., data frames) ..."

So yes, your data frame is converted to a matrix, and since it contains
factors the matrix is of mode character. It would probably be easiest with a
FOR-LOOP, but you could also try something like the code below (and insert
your operations in #...).

myfunc <- function(x,classOfX) {
  x <- as.data.frame(t(x))
  factvars <- which(classOfX=="factor")
  x[,factvars] <- lapply(x[,factvars],factor)
  for( i in seq(along=x) ) x[,i] <- as(x[,i],Class=classOfX[i])
  # ...
  return(x)
}
x <- data.frame(a=as.integer(1:10),b=factor(letters[1:10]),c=runif(10))
Fold <- function(f,x,L) { for(e in L) x <- f(x,e); x }
y <- Fold(rbind,vector(),apply(x,1,myfunc,rapply(x,class)))

> rapply(x,class)
        a         b         c 
"integer"  "factor" "numeric" 
> rapply(y,class)
        a         b         c 
"integer"  "factor" "numeric" 
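
If the goal is only to round the numeric columns, working column by column
with lapply() avoids the matrix coercion entirely. This is a sketch of an
alternative, not the approach above; the small data frame and the choice of
two decimal places are made up for illustration.

```r
x <- data.frame(a = 1:3,
                b = factor(c("u", "v", "w")),
                c = c(0.1234, 2.5678, 3.14159))

# what apply() sees: the whole data frame coerced to a character matrix
coerced <- apply(x, 2, class)

# column-wise alternative: each column keeps its own class
rounded <- x
rounded[] <- lapply(x, function(col) if (is.double(col)) round(col, 2) else col)
classes <- sapply(rounded, class)
```

Assigning into rounded[] keeps the data frame structure, so only the
contents of the double columns change.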



--- aedin culhane [EMAIL PROTECTED] wrote:

 Dear R-Help
 I am running apply on a data.frame containing factors and numeric 
 columns.  It appears to convert all columns to character. Does it 
 convert the data.frame into a matrix? Is this expected? I wish it to recognise 
 numerical columns and round numbers.  Can I use another function instead 
 of apply, or should I use a for loop in this case?
 
   > summary(xmat)
        A               B             C             D
  Min.   :  1.0   414    :  1   Stage 2:  5   Min.   :-0.075369
  1st Qu.:113.8   422    :  1   Stage 3:  6   1st Qu.:-0.018102
  Median :226.5   426    :  1   Stage 4:441   Median :-0.003033
  Mean   :226.5   436    :  1                 Mean   : 0.008007
  3rd Qu.:339.2   460    :  1                 3rd Qu.: 0.015499
  Max.   :452.0   462    :  1                 Max.   : 0.400578
                  (Other):446
        E                F                G
  Min.   :0.2345   Min.   :0.9808   Min.   :0.01558
  1st Qu.:0.2840   1st Qu.:0.9899   1st Qu.:0.02352
  Median :0.3265   Median :0.9965   Median :0.02966
  Mean   :0.3690   Mean   :1.0079   Mean   :0.03580
  3rd Qu.:0.3859   3rd Qu.:1.0129   3rd Qu.:0.03980
  Max.   :2.0422   Max.   :1.3742   Max.   :0.20062
 
   > for(i in 1:7) print(class(xmat[,i]))
 [1] "integer"
 [1] "factor"
 [1] "factor"
 [1] "numeric"
 [1] "numeric"
 [1] "numeric"
 [1] "numeric"
 
   > apply(xmat, 2, class)
           A           B           C           D           E           F 
 "character" "character" "character" "character" "character" "character" 
           G 
 "character" 
 
 
 
 Thanks for your help
 Aedin
 


Re: [R] creating a data frame from a list

2007-04-06 Thread Stephen Tucker
Hi Dimitri,

You can try this one if you'd like:

lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0))
# get unique names
nms <- unique(rapply(lst,function(x) names(x)))
# create a vector of NA's and then fill it according
# to matching names for each element of list
doit <- function(x,nms) {
  y <- rep(NA,length(nms)); names(y) <- nms
  y[match(names(x),names(y))] <- x
  return(y)
}
# apply it to the data
dtf <- as.data.frame(sapply(lst,doit,nms))
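
A more compact variant of the same idea, offered only as a sketch: indexing
a named vector by a name it does not have returns NA, so each list element
can simply be indexed by the full set of names.

```r
lst <- list(a = c(A = 1, B = 8), b = c(A = 2, B = 3, C = 0), c = c(B = 2, D = 0))

# all names that occur anywhere in the list, in order of first appearance
nms <- unique(unlist(lapply(lst, names)))

# v[nms] yields NA wherever a name is absent from v
mat <- sapply(lst, function(v) unname(v[nms]))
rownames(mat) <- nms

dtf <- as.data.frame(mat)
```

This relies on the names within each vector being unique; with duplicated
names only the first match would be picked up.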
  


--- Dimitri Szerman [EMAIL PROTECTED] wrote:

 Dear all,
 
 A few months ago, I asked for your help on the following problem:
 
 I have a list with three (named) numeric vectors:
 
  lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) )
  lst
 $a
 A B
 1 8
 
 $b
 A B C
 2 3 0
 
 $c
 B D
 2 0
 
 Now, I'd love to use this list to create the following data frame:
 
  dtf = data.frame(a=c(A=1,B=8,C=NA,D=NA),
 +  b=c(A=2,B=3,C=0,D=NA),
 +  c=c(A=NA,B=2,C=NA,D=0) )
 
  dtf
     a  b  c
 A   1  2 NA
 B   8  3  2
 C  NA  0 NA
 D  NA NA  0
 
 That is, I wish to merge the three vectors in the list into a data frame
 by their (row)names.
 
 And I got the following answer:
 
 library(zoo)
 z <- do.call(merge, lapply(lst, function(x) zoo(x, names(x))))
 rownames(z) <- time(z)
 coredata(z)
 
 However, it does not seem to be working. Here's what I get when I try it:
 
  > lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) )
  > library(zoo)
  > z <- do.call(merge, lapply(lst, function(x) zoo(x, names(x))))
 Error in if (freq > 1 && identical(all.equal(freq, round(freq)),
 TRUE)) freq <- round(freq) :
 missing value where TRUE/FALSE needed
 In addition: Warning message:
 NAs introduced by coercion
 
 and z was not created.
 
 Any ideas on what is going on here?
 Thank you,
 Dimitri
 


Re: [R] Reasons to Use R

2007-04-06 Thread Stephen Tucker
Hi Lorenzo,

I don't think I'm qualified to provide solid information on the first
three questions, but I'd like to drop a few thoughts on (4). While
there is no shortage of language advocates out there, I'd like to
join in just this once. My background is in chemical engineering and
atmospheric science; I've done simulation on a smaller scale but spend
much of my time analyzing large sets of experimental data. I am
comfortable programming in Matlab, R, Python, C, Fortran, Igor Pro,
and I also know a little IDL but have not programmed in it
extensively.

As you are probably aware, among these I would count Matlab, R,
Python, and IDL as good candidates for processing large data sets, as
they are high-level languages and can communicate with netCDF files
(which I imagine will be used to transfer data).

Each language boasts an impressive array of libraries, but what I
think gives R the advantage for analyzing data is the level of
abstraction in the language. I am extremely impressed with the objects
available to represent data sets, and the functions support them very
well - I need to carry around fewer objects to hold information about
my data (and I don't have to unpack them to feed them into
functions). The language is also very expressive in
that it lets you write a procedure in many different ways, some
shorter, some more readable, depending on what your situation
requires. System commands and text processing are integrated into the
language, and the input/output facilities are excellent, in terms of
data and graphics. Once I have my data object I am only a few
keystrokes away from splitting, sorting, and visualizing multivariate
data; even after several years I keep discovering new functions for
basic things like manipulation of data objects, descriptive
statistics, and plotting - truly, an analyst's needs have been well
anticipated.

And this is a recent obsession of mine, which I was introduced to
through Python, but the functional programming support for R is
amazing. By using higher-order functions like lapply(), I infrequently
rely on FOR-LOOPS, which have often caused me trouble in the past
because I had forgotten to re-initialize a variable, or incremented
the wrong variable, etc. Though I'm definitely not militant about
functional programming, in general I try to write functions and then
apply them to the data (if the functions don't exist in R already),
often through higher-order functions such as lapply(). This approach
keeps most variables out of the global namespace and so I am less
likely to reassign a value to a variable that I had intended to
keep. It also makes my code more modular so that I can re-use bits of
my code as my analysis inevitably grows much larger than I had
originally intended.
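
To make the contrast concrete, here is a tiny sketch with made-up data: the
same per-group means computed once with an explicit loop and once with a
higher-order function.

```r
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# loop version: the result vector and the counter are managed by hand
means_loop <- numeric(length(samples))
for (i in seq_along(samples)) means_loop[i] <- mean(samples[[i]])
names(means_loop) <- names(samples)

# functional version: no counter and nothing to re-initialize
means_fp <- sapply(samples, mean)
```

Both give the same named vector; the second simply leaves fewer places to
increment or reset the wrong variable.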

Furthermore, my code in R ends up being much, much shorter than code I
imagine writing in other languages to accomplish the same task; I
believe this leads to fewer places for errors to occur, and the nature
of the code is immediately comprehensible (though a series of nested
functions can get pretty hard to read at times), not to mention it
takes less effort to write. This also makes it easier to interact with
the data, I think, because after making a plot I can set up for the
next plot with only a few function calls instead of setting out to
write a block of code with loops, etc.

I have actually recommended R to colleagues who needed to analyze the
information from large-scale air quality/ global climate simulations,
and they are extremely pleased. I think the capability for statistics
and graphics is well-established enough that I don't need to do a
hard-sell on that so much, but R's language is something I get very
excited about. I do appreciate all the contributors who have made this
available.

Best regards,
ST


--- Lorenzo Isella [EMAIL PROTECTED] wrote:

 Dear All,
 The institute I work for is organizing an internal workshop for High
 Performance Computing (HPC).
 I am planning to attend it and talk a bit about fluid dynamics, but
 there is also quite a lot of interest devoted to data post-processing
 and management of huge data sets.
 A lot of people are interested in image processing/pattern recognition
 and statistic applied to geography/ecology, but I would like not to
 post this on too many lists.
 The final aim of the workshop is  understanding hardware requirements
 and drafting a list of the equipment we would like to buy. I think
 this could be the venue to talk about R as well.
 Therefore, even if it is not exactly a typical mailing list question,
 I would like to have suggestions about where to collect info about:
 (1)Institutions (not only academia) using R
 (2)Hardware requirements, possibly benchmarks
 (3)R & clusters, R & multiple CPU machines, R performance on different
 hardware.
 (4)finally, a list of the advantages for using R over commercial
 statistical packages. The money-saving in itself is not a reason good
 enough and some people are scared by the lack of 

Re: [R] Reasons to Use R

2007-04-06 Thread Stephen Tucker
Regarding (2),

I wonder if this information is too outdated or not relevant when scaled up
to larger problems...

http://www.sciviews.org/benchmark/index.html




--- Ramon Diaz-Uriarte [EMAIL PROTECTED] wrote:

 Dear Lorenzo,
 
 I'll try not to repeat what others have answered before.
 
 On 4/5/07, Lorenzo Isella [EMAIL PROTECTED] wrote:
  The institute I work for is organizing an internal workshop for High
  Performance Computing (HPC).
 (...)
 
  (1)Institutions (not only academia) using R
 
 You can count my institution too. Several groups. (I can provide more
 details off-list if you want).
 
  (2)Hardware requirements, possibly benchmarks
  (3)R & clusters, R & multiple CPU machines, R performance on different
 hardware.
 
 We do use R in commodity off-the shelf clusters; our two clusters are
 running Debian GNU/Linux; both 32-bit machines ---Xeons--- and 64-bit
 machines ---dual-core AMD Opterons. We use parallelization quite a
 bit, with MPI (via Rmpi and papply packages mainly). One convenient
 feature is that (once the lam universe is up and running) whether we
 are using the 4 cores in a single box, or the max available 120, is
 completely transparent. Using R and MPI is, really, a piece of cake.
 That said, there are things that I miss; in particular, oftentimes I
 wish R were Erlang or Oz because of the straightforward fault-tolerant
 distributed computing and the built-in abstractions for distribution
 and concurrency. The issue of multithreading has come up several times
 in this list and is something that some people miss.
 
 I am not sure how much R is used in the usual HPC realms. It is my
 understanding that the traditional HPC is still dominated by things
 such as HPF, and C with MPI, OpenMP, or UPC or Cilk. The usual answer
 to "but R is too slow" is "but you can write Fortran or C code for the
 bottlenecks and call it from R". I guess you could use, say, UPC in
 that C that is linked to R, but I have no experience. And I think this
 code can become a pain to write and maintain (especially if you want to
 play around with what you try to parallelize, etc). My feeling (based
 on no information or documentation whatsoever) is that how far R can
 be stretched or extended into HPC is still an open question.
 
 
  (4)finally, a list of the advantages for using R over commercial
  statistical packages. The money-saving in itself is not a reason good
  enough and some people are scared by the lack of professional support,
  though this mailing list is simply wonderful.
 
 
 (In addition to all the already mentioned answers)
 Complete source code availability. Being able to look at the C source
 code for a few things has been invaluable for me.
 And, of course, an extremely active, responsive, and vibrant
 community that, among other things, has contributed packages and code
 for an incredible range of problems.
 
 
 Best,
 
 R.
 
 P.S. I'd be interested in hearing about the responses you get to your
 presentation.
 
 
  Kind Regards
 
  Lorenzo Isella
 
 
 
 
 -- 
 Ramon Diaz-Uriarte
 Statistical Computing Team
 Structural Biology and Biocomputing Programme
 Spanish National Cancer Centre (CNIO)
 http://ligarto.org/rdiaz
 


Re: [R] time zone problems

2007-04-04 Thread Stephen Tucker
Hi Marc,

This R Help Desk article was very helpful for me:

http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf
Gabor Grothendieck and Thomas Petzoldt

And Gabor noted subsequent changes in behavior of POSIXt classes since the
aforementioned R News publication:

(1) http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13626.html
(2) http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13632.html

Basically, strptime() converts your character object to a POSIXlt object,
which may have a tzone attribute, but it is ignored by everything, so you have
to convert it to POSIXct by as.POSIXct(format(myDateTimeObject),"GMT"). But
plot() will still try to make your figures according to system time ("").

Here are two possible solutions (if all of your work is in GMT):

1.
Sys.putenv(TZ = "GMT")
## your work
## (shouldn't have to use tzone.err)
Sys.putenv(TZ = "")

2.
Use 'chron' objects as suggested in the R newsletter. This object class does
not include any time-zone information, so simplifies things.
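
For readers on a current R release, the first option can be sketched as
follows. This is only an illustration under stated assumptions: it uses
Sys.setenv(), which took over from Sys.putenv() in later R versions, and the
year/day-of-year values are made up rather than taken from the thread.

```r
# Sketch only: Sys.setenv() is assumed in place of the older Sys.putenv().
old_tz <- Sys.getenv("TZ", unset = NA)   # remember the current setting

Sys.setenv(TZ = "GMT")                   # do all date-time work in GMT

# year, day of year, hour, minute, as in the question (values are made up)
stamp <- as.POSIXct(strptime("2007 094 00 00", format = "%Y %j %H %M"))
gmt_label <- format(stamp, "%Y-%m-%d %H:%M")

# restore whatever TZ was set before
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

Saving and restoring the previous TZ value, rather than resetting it to an
empty string, keeps the sketch from changing a time zone the user had set
deliberately.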

Best regards,

ST


--- Marc Fischer [EMAIL PROTECTED] wrote:

 Folks,
 
 I'm having  trouble with how datetime objects with time zones are set 
 and plotted.   This may be the result of my running R (2.4.0) on a 
 Windoze XP box.  Perhaps not.  Here are two example problems I need 
 advice on if you have time:
 
 1)  I collect data with dates (often as a fractional day of year) in 
 UTC.  Using strptime to create date time objects appears to force the 
 data into the local time zone (including daylight time) of my 
 machine.  Setting tz="UTC" or "GMT" inside strptime seems to be 
 ignored.  I made the following cumbersome work around:
 
 foo$date=strptime(paste(yr,DOY, 0, 0), format="%Y %j %H %M",tz="") + 0
 tzone.err=as.numeric(as.POSIXct(format(Sys.time(),tz="GMT"))-as.POSIXct(format(Sys.time(),tz="")))*3600
 foo$date=foo$date-tzone.err
 attributes(foo$date)$tzone="GMT"
 
 Am I missing something obvious or is this a problem with Windoze?
 
 2) Once I have the data in GMT, I try to plot it using the standard 
 plot command but it converts the data back to local time before 
 making the plot.  Now the work around is:
 
 plot(foo$date+tzone.err,foo$var, xlab="Date (GMT)")
 
   Very frustrating... I've read the help pages but can't find answers 
 to these issues.  Any help?
 
 Best regards,
 
 Marc
 


Re: [R] converting a list to a data.frame

2007-04-03 Thread Stephen Tucker
You can concatenate a series of NA's to match the length of your longest
element.

(1) exampDat is example data
(2) max(rapply(exampDat,length)) is length of longest element
(3) function(x,m) will do the concatenation
(4) sapply() will return each padded list element as a column of a matrix
(5) t() will transpose it so that each list element becomes a row

Then you can use write() or write.table() to export your data to a text file.

exampDat <- list(x=1:2,y=1:3,z=1:4)
mat <- t(sapply(exampDat,
                function(x,m) c(x,rep(NA,m-length(x))),
                max(rapply(exampDat,length))))

## use write.table()
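
The export step could look something like the sketch below. It is an
illustration only: the output file is a temporary file chosen here for the
example, and lengths() (available in more recent R versions) stands in for
the rapply() call.

```r
# Pad each list element with NA to the longest length, one row per element
exampDat <- list(x = 1:2, y = 1:3, z = 1:4)
m <- max(lengths(exampDat))
mat <- t(sapply(exampDat, function(x) c(x, rep(NA, m - length(x)))))

# Hypothetical output location; replace with a real path as needed
out <- tempfile(fileext = ".txt")
write.table(mat, file = out, quote = FALSE, col.names = FALSE, na = "NA")

written <- readLines(out)   # one line of text per list element
```

The na argument of write.table() controls how the padding is written, so a
different missing-value code can be substituted there.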

Hope this helps.

--- Biscarini, Filippo [EMAIL PROTECTED] wrote:

 Hello,
  
 I have a list with n numerical components of different length (3, 4 or 5
 values in each component of the list); I need to export this as a text
 file where each component of the list will be a row and where missing
 values should fill in the blanks due to the different lengths of the
 components of the list.
 I think that as a first step I should convert my list to a data frame,
 but this is not such a simple task to accomplish: I was thinking of the
 following for loop:
  
 X<-data.frame(1,1,1,1,1);
  
 for (i in 1:length(list)) {
  
 X[i,]<-unlist(list[[i]]);
  
 }
  
 Unfortunately, when the number of elements in the components of the list
 are lower than 5 (maximum), I get errors or undesired results. I also
 tried with rbind(), but again I couldn't manage to make it accept rows
 of different length.
  
 Does anybody have any suggestions? Working with lists is very nice, but
 I still have to learn how to transfer them to text files for external
 use.
  
 Thank you,
  
 Filippo Biscarini
 Wageningen University
 
 



 



[R] plotting POSIXct objects {was Re: Using split() several times in a row?}

2007-03-31 Thread Stephen Tucker
Hi Gabor and Martin,

Thanks very much for the information. (and Gabor for the Fold() routine
included in original reply)

Regarding changes, I wonder if the behavior of plot() on POSIXct objects
changed also. According to Rnews Vol. 4/1, p. 31,

=
dp <- seq(Sys.time(),len=10,by="day")
plot(dp,1:10)

This does not use the current wall clock time
for plotting today and the next 9 days since
plot treats the datetimes as relative to GMT.
The x values that result will be off from the wall
clock time by a number of hours equal to the
difference between the current time zone and
GMT.
=

In R 2.4.0, the x-values match up with the times I put in it (on Pacific
Daylight Time, which is what my system is on). However, when I convert the
times to GMT, they are shifted behind by 7hrs. Seems plot() is treating
datetimes relative to the system time and not GMT? Please see three examples
below... I tried to look at version update notes but could I have missed this
one?

Stephen

#=== Plot 1 ===
# system TZ is in Pacific Daylight Time
# POSIXct object is in Pacific Daylight Time
dp <- seq(Sys.time(),len=10,by="day")
plot(dp,1:length(dp))
# Looks okay

#=== Plot 2 ===
# system TZ is in Pacific Daylight Time
# POSIXct object is in GMT
dp <- seq(Sys.time(),len=10,by="day")
plot(as.POSIXct(format(dp),"GMT"),1:length(dp))
# Shifted by 7 hours

#=== Plot 3 ===
# system TZ is in GMT
# POSIXct object is in GMT
Sys.putenv(TZ = "GMT")
dp <- seq(Sys.time(),len=10,by="day")
plot(dp,1:length(dp))
Sys.putenv(TZ = "")
# Looks okay
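
One possible reading of the 7-hour shift in Plot 2, offered as a sketch rather
than a diagnosis of plot() itself: as.POSIXct(format(dp), "GMT") keeps the
clock reading but moves the instant, whereas changing only the tzone
attribute keeps the instant and changes the label. The date and the
"America/Los_Angeles" zone name below are assumptions chosen to mimic Pacific
Daylight Time.

```r
# A fixed instant in Pacific Daylight Time (UTC-7 on this date)
dp <- as.POSIXct("2007-03-31 12:00:00", tz = "America/Los_Angeles")

# Same clock reading, reinterpreted as GMT: a different instant
relabelled <- as.POSIXct(format(dp), tz = "GMT")

# Same instant, only displayed in GMT
same_instant <- dp
attr(same_instant, "tzone") <- "GMT"

shift_hours <- as.numeric(difftime(dp, relabelled, units = "hours"))
gmt_reading <- format(same_instant)
```

Here shift_hours measures how far the reinterpreted value has moved from the
original instant, which is the offset seen on the plot axis.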


--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 On 3/31/07, Martin Maechler [EMAIL PROTECTED] wrote:
   SteT == Stephen Tucker [EMAIL PROTECTED]
   on Fri, 30 Mar 2007 18:41:39 -0700 (PDT) writes:
 
   [..]
 
 SteT For dates, I usually store them as POSIXct classes
 SteT in data frames, but according to Gabor Grothendieck
 SteT and Thomas Petzoldt's R Help Desk article
 SteT http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf,
 SteT I should probably be using chron date and times...
 
  I don't think you should (and I doubt Gabor and Thomas would
  recommend this in every case):
 
  POSIXct (and 'POSIXlt', 'POSIXt', 'Date') are part of standard R,
  and whereas they may not seem as convenient in all cases as chron
  etc, I'd rather recommend sticking to them in such a case.
 
 There is one change that has occurred since the article that in my
 mind would let you safely use POSIX, but it's pretty drastic.  At the time
 of the article you could not set the time zone to GMT in the R process
 on Windows but now you can do this:
 
 Sys.putenv(TZ = "GMT")
 
 and you can also change it back like this:
 
 Sys.putenv(TZ = "")
 
 Since the problem is that you never can be sure which time zone the
 time is interpreted in within various functions (although you can be pretty
 sure it's either the local time zone or GMT), by setting the process to
 GMT you make the two alternatives the same so it no longer matters.
 
 Short of the above, the recommendations of the article should be followed.
 It's not a matter of convenience.  It's a matter of being error prone and
 introducing subtle time-zone related errors into your code which are very
 hard to track down or, worse, even realize that you have.
 
 Those who claim that it's not a problem simply have not used dates and times
 enough or they would not say that.  I have seen posters make such comments
 on this list only later to run into subtle time zone problems that they
 never would have had had they followed the advice in the article.
 
 I've used R and dates a lot and therefore have made a lot of programming
 errors, and these recommendations come from bitter experience looking back
 to see how I could have avoided them.
 
 



 



Re: [R] Using split() several times in a row?

2007-03-30 Thread Stephen Tucker
Hi Sergey,

I believe the code below should get you close to what you want.

For dates, I usually store them as POSIXct classes in data frames, but
according to Gabor Grothendieck and Thomas Petzoldt's R Help Desk article
http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf, I should probably be
using chron date and times...

Nonetheless, POSIXct classes are what I know, so I can show you that to get the
month out of your column (replace "8.29.97" with your variable), you can do
the following:

month = format(strptime("8.29.97",format="%m.%d.%y"),format="%m")

Or,
month = as.data.frame(strsplit("8.29.97","\\."))[1,]

In any case, here is some code in which I follow a series of function
definitions and applications (which effectively include successive
applications of split() and lapply()).

Best regards,

ST

# define data (I just made this up)
df <-
data.frame(month=as.character(rep(1:3,each=30)),fac=factor(rep(1:2,each=15)),
           data1=round(runif(90),2),
           data2=round(runif(90),2))

# define functions to split the data and another
# to get statistics
doSplits <- function(df) {
  unlist(lapply(split(df,df$month),function(x)
                split(x,x$fac)),recursive=FALSE)
}
getStats <- function(x,f) {
  return(as.data.frame(lapply(x[unlist(lapply(x,mode))=="numeric" &
                                unlist(lapply(x,class))!="factor"],f)))
}
# create a matrix of data, means, and standard deviations
listMatrix <- cbind(Data=doSplits(df),
                    Means=lapply(doSplits(df),getStats,mean),
                    SDs=lapply(doSplits(df),getStats,sd))

# function to subtract means and divide by standard deviations
transformData <- function(x) {
  newdata <- x$Data
  matchedNames <- match(names(x$Means),names(x$Data))
  newdata[matchedNames] <-
    sweep(sweep(data.matrix(x$Data[matchedNames]),2,unlist(x$Means),"-"),
          2,unlist(x$SDs),"/")
  return(newdata)
}
# apply to data
newDF <- lapply(as.data.frame(t(listMatrix)),transformData)

# Define Fold function (returns the accumulated value after the loop)
Fold <- function(f, x, L) { for(e in L) x <- f(x, e); x }
# Apply this to the data
finalData <- Fold(rbind,vector(),newDF)
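
The same per-group centring and scaling can also be sketched more compactly
with interaction() and ave(). This is an alternative technique, not the
approach used in the post, and the data frame below is made up in the same
spirit as the example data.

```r
# Made-up data: 3 months, 2 factor levels per month, 2 numeric columns
set.seed(1)
df <- data.frame(month = rep(as.character(1:3), each = 30),
                 fac   = factor(rep(rep(1:2, each = 15), times = 3)),
                 data1 = round(runif(90), 2),
                 data2 = round(runif(90), 2))

# One grouping factor for every month-by-fac combination
grp <- interaction(df$month, df$fac, drop = TRUE)

# Subtract the group mean and divide by the group standard deviation
num_cols <- c("data1", "data2")
scaled <- df
scaled[num_cols] <- lapply(df[num_cols], function(v)
  ave(v, grp, FUN = function(z) (z - mean(z)) / sd(z)))
```

Because ave() returns values in the original row order, the result keeps the
shape of df and no rbind step is needed afterwards.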






--- Sergey Goriatchev [EMAIL PROTECTED] wrote:

 Hi, fellow R users.
 
 I have a question about sapply and split combination.
 
 I have a big dataframe (4 observations, 21 variables). First
 variable (factor) is date and it is in format 8.29.97, that is, I
 have monthly data. Second variable (also factor) has levels 1 to 6
 (fractiles 1 to 5 and missing value with code 6). The other 19
 variables are numeric.
 For each month I have several hundred observations of 19 numeric and 1
 factor.
 
 I am normalizing the numeric variables by dividing val1 by val2, where:
 
 val1: (for each month, for each numeric variable) difference between
 mean of ith numeric variable in fractile 1, and mean of ith numeric
 variable in fractile 5.
 
 val2: (for each month, for each numeric variable) standard deviation
 for ith numeric variable.
 
 Basically, as far as I understand, I need to use split() function several
 times.
 To calculate val1 I need to use split() twice - first to split by
 month and then split by fractile. Is this even possible to do (since
 after first application of split() I get a list)??
 
 Is there a smart way to perform this normalization computation?
 
 My knowledge of R is not so advanced, but I need to know an efficient
 way to perform calculations of this kind.
 
 Would really appreciate some help from experienced R users!
 
 Regards,
 S
 
 -- 
 Laziness is nothing more than the habit of resting before you get tired.
 - Jules Renard (writer)
 
 Experience is one thing you can't get for nothing.
 - Oscar Wilde (writer)
 
 When you are finished changing, you're finished.
 - Benjamin Franklin (Diplomat)
 


Re: [R] objects of class matrix and mode list? [Broadcast]

2007-03-24 Thread Stephen Tucker
Hi Andy,

I hadn't realized such objects (list matrices, list arrays, data frames with
nested lists) existed before, but now that I do I am seeing the documentation
with new eyes. I see that the pages for sapply(), lapply(), and class
coercion functions are true to their word.

Thanks,

Stephen


--- Liaw, Andy [EMAIL PROTECTED] wrote:

 It may help to (re-)read ?sapply a bit more in detail.  Simplification
 is done only if it's possible, and what possible means is defined
 there.
 
 A list is a vector whose elements can be different objects, but a vector
 nonetheless.  Thus a list can have dimensions.  E.g.,
 
 R> a <- list(1, 1:2, 3, c("abc", "def"))
 R> dim(a) <- c(2, 2)
 R> a
      [,1]      [,2]       
 [1,] 1         3          
 [2,] Integer,2 Character,2
 
 That sometimes can be extremely useful (not like the example above!).
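
A short sketch of how such a list matrix can be inspected and indexed. It
reuses the object from the example above and only relies on base R; the
variable names are chosen here for illustration.

```r
a <- list(1, 1:2, 3, c("abc", "def"))
dim(a) <- c(2, 2)

# It is a list and a matrix at the same time
both <- c(is.list(a), is.matrix(a))

# [[i, j]] extracts the object stored in one cell
cell <- a[[2, 2]]

# [i, ] returns a plain list holding the cells of that row
second_row <- a[2, ]
```

Single-bracket indexing keeps the list wrapper, so double brackets are the
way to get at the stored object itself.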
 
 Andy 
 
 From: Stephen Tucker
  
  Hello everyone,
  
  I cannot seem to find information about objects of class matrix and mode
  list, and how to handle them (apart from flattening the list). I get this
  type of object from using sapply(). Sorry for the long example, but the
  code below illustrates how I get this type of object. Is anyone aware of
  documentation regarding this object?
  
  Thanks very much,
  
  Stephen
  
  ===== begin example =====

  # I am just making up a fake data set
  df <- data.frame(Day=rep(1:3,each=24),Hour=rep(1:24,times=3),
                   Name1=rnorm(24*3),Name2=rnorm(24*3))

  # define a function to get a set of descriptive statistics
  tmp <- function(x) {
    # this function will accept a data frame
    # and return a 1-row data frame of
    # max value, colname of max, min value, and colname of min
    return(data.frame(maxval=max(apply(x,2,max)),
                      maxloc=names(x)[which.max(apply(x,2,max))],
                      minval=min(apply(x,2,min)),
                      minloc=names(x)[which.min(apply(x,2,min))]))
  }

  # Now applying function to data:
  # (1) split the data table by Day with split()
  # (2) apply the tmp function defined above to each data frame from (1)
  # using lapply()
  # (3) transpose the final matrix and convert it to a data frame
  # with mixed characters and numbers
  # using as.data.frame(), lapply(), and type.convert()

  > final <- as.data.frame(lapply(as.data.frame(t(sapply(split(df[,-c(1:2)],
  +                                                    f=df$Day),tmp))),
  +                               type.convert,as.is=TRUE))
  Error in type.convert(x, na.strings, as.is, dec) : 
  the first argument must be of mode character

  I thought sapply() would give me a data frame or matrix, which I would
  transpose into a character matrix, to which I can apply type.convert()
  and get the same matrix as what I would get from these two lines (Fold
  function taken from Gabor's post on R-help a few years ago):

  Fold <- function(f, x, L) { for(e in L) x <- f(x, e); x }
  final2 <- Fold(rbind,vector(),lapply(split(df[,-c(1:2)],f=df$Day),tmp))

  > print(c(class(final2),mode(final2)))
  [1] "data.frame" "list"      

  However, by my original method, sapply() gives me a matrix with mode
  "list":

  intermediate1 <- sapply(split(df[,-c(1:2)],f=df$Day),tmp)
  > print(c(class(intermediate1),mode(intermediate1)))
  [1] "matrix" "list"  

  Transposing, still a matrix with mode list, not character:

  intermediate2 <- t(sapply(split(df[,-c(1:2)],f=df$Day),tmp))
  > print(c(class(intermediate2),mode(intermediate2)))
  [1] "matrix" "list"  

  Unclassing gives me the same thing...

  > print(c(class(unclass(intermediate2)),mode(unclass(intermediate2))))
  [1] "matrix" "list"  
  
  
  
  
   
  
  
  
 
 



Re: [R] Truncated x-axis values

2007-03-23 Thread Stephen Tucker
You can try playing around with oma, omi, mai, mar, etc. in par():

myMai <- par("mai")
myMai[1] <- max(nchar(y))*par("cin")[1]
par(mfrow=c(2, 1),mai=myMai,oma=rep(0,4),las = 2)
# your plot
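
A self-contained sketch of the same margin idea, with made-up labels and
grouping. Estimating the label length as the number of characters times the
character width is only a rough approximation, so the margin may need tuning
for other fonts or devices.

```r
# Hypothetical long category labels (13 characters each)
labs <- c("category_aaaa", "category_bbbb", "category_cccc", "category_dddd")
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
g <- factor(rep(labs, each = 2))

# Rough label length in inches: characters times character width
bottom_in <- max(nchar(labs)) * par("cin")[1]

mai <- par("mai")
mai[1] <- bottom_in                    # first element is the bottom margin

# par() returns the previous values, so they can be restored afterwards
op <- par(mfrow = c(2, 1), mai = mai, las = 2)
boxplot(x ~ g, xlab = "")
boxplot(x ~ g, xlab = "")
par(op)
```

Restoring the saved settings at the end keeps later plots from inheriting the
enlarged margin.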

--- Urban, Alexander [EMAIL PROTECTED] wrote:

 John
 Thanks for your reply and sorry for not being specific enough
 Try this piece of code (it's an abstraction of my application...)
 
 par(las = 2)
 par(mfrow=c(2, 1))
 x =  c(1,2,3,4,5,6,7,8)
 y = c("abcdccefghijk", "bcssdefghijkl", "abddessfghijk",
 "bddessfghijkl","abcdssedghijk","bcdedghijkl","assbcdefghidk","bcdedssghijkl")
 
 boxplot(x~y)
 boxplot(x~y) 
 
 In the lower chart the labels are truncated...(not totally visible)
 Is this clearer now?
 
 Thanks
 Alex
 
 -Original Message-
 From: John Kane [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, March 22, 2007 19:19
 To: Urban, Alexander; r-help@stat.math.ethz.ch
 Subject: Re: [R] Truncated x-axis values
 
 You really have not told us much about what you're actually doing.  A
 simple self-contained example of what you're trying to do might let
 someone help. 
 
 Have you read the posting guide?
 
 --- Urban, Alexander [EMAIL PROTECTED] wrote:
 
  Hello
  
  I'm new to this group. I looked for the last two hours in the help files
  and in the archives of this group, but didn't find anything.
  I hope my question is not too dumb:
  I'm printing a graph with vertical labels on the x-axis (necessary due
  to many labels). Unfortunately the png truncates the labels halfway
  through, so that you can only read the last 7 digits of the label.
  Since I'm already asking :-): Is there a possibility to tell R: If
  there are so many labels that you write them on top of each other,
  take only e.g. every 2nd...
  
  Sorry for bothering and thanks
  Alex
  
  
 
 
 



 



[R] objects of class matrix and mode list?

2007-03-23 Thread Stephen Tucker
Hello everyone,

I cannot seem to find information about objects of class matrix and mode
list, and how to handle them (apart from flattening the list). I get this
type of object from using sapply(). Sorry for the long example, but the code
below illustrates how I get this type of object. Is anyone aware of
documentation regarding this object?

Thanks very much,

Stephen

===== begin example =====

# I am just making up a fake data set
df <- data.frame(Day=rep(1:3,each=24),Hour=rep(1:24,times=3),
                 Name1=rnorm(24*3),Name2=rnorm(24*3))

# define a function to get a set of descriptive statistics
tmp <- function(x) {
  # this function will accept a data frame
  # and return a 1-row data frame of
  # max value, colname of max, min value, and colname of min
  return(data.frame(maxval=max(apply(x,2,max)),
                    maxloc=names(x)[which.max(apply(x,2,max))],
                    minval=min(apply(x,2,min)),
                    minloc=names(x)[which.min(apply(x,2,min))]))
}

# Now applying function to data:
# (1) split the data table by Day with split()
# (2) apply the tmp function defined above to each data frame from (1)
# using lapply()
# (3) transpose the final matrix and convert it to a data frame
# with mixed characters and numbers
# using as.data.frame(), lapply(), and type.convert()

> final <- as.data.frame(lapply(as.data.frame(t(sapply(split(df[,-c(1:2)],
+                                                    f=df$Day),tmp))),
+                               type.convert,as.is=TRUE))
Error in type.convert(x, na.strings, as.is, dec) : 
the first argument must be of mode character

I thought sapply() would give me a data frame or matrix, which I would
transpose into a character matrix, to which I can apply type.convert()
and get the same matrix as what I would get from these two lines (Fold
function taken from Gabor's post on R-help a few years ago):

Fold <- function(f, x, L) { for(e in L) x <- f(x, e); x }
final2 <- Fold(rbind,vector(),lapply(split(df[,-c(1:2)],f=df$Day),tmp))

> print(c(class(final2),mode(final2)))
[1] "data.frame" "list"      


However, by my original method, sapply() gives me a matrix with mode "list":

intermediate1 <- sapply(split(df[,-c(1:2)],f=df$Day),tmp)
> print(c(class(intermediate1),mode(intermediate1)))
[1] "matrix" "list"  

Transposing, still a matrix with mode list, not character:

intermediate2 <- t(sapply(split(df[,-c(1:2)],f=df$Day),tmp))
> print(c(class(intermediate2),mode(intermediate2)))
[1] "matrix" "list"  

Unclassing gives me the same thing...

> print(c(class(unclass(intermediate2)),mode(unclass(intermediate2))))
[1] "matrix" "list"  
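
One way to sidestep the list matrix altogether, sketched under the assumption
that a one-row data frame per day is the desired intermediate: keep the pieces
in a list with lapply() and bind them with do.call(rbind, ...). The summary
function below is a lightly rewritten stand-in for tmp(), not the original.

```r
set.seed(42)
df <- data.frame(Day = rep(1:3, each = 24), Hour = rep(1:24, times = 3),
                 Name1 = rnorm(72), Name2 = rnorm(72))

# One-row summary: largest and smallest value and the columns they occur in
tmp <- function(x) {
  col_max <- sapply(x, max)
  col_min <- sapply(x, min)
  data.frame(maxval = max(col_max), maxloc = names(x)[which.max(col_max)],
             minval = min(col_min), minloc = names(x)[which.min(col_min)])
}

# lapply() keeps each result as a data frame; rbind stacks them by row
pieces <- lapply(split(df[, -(1:2)], df$Day), tmp)
final <- do.call(rbind, pieces)

# For comparison: sapply() simplifies the same results into a list matrix
m <- sapply(split(df[, -(1:2)], df$Day), tmp)
```

The comparison object m has one column per day and one row per summary field,
with every cell holding a length-one object.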




 



Re: [R] simple bar plot question

2007-03-23 Thread Stephen Tucker
I think you can set legend=FALSE in barplot() and add your own legend, in
which you have a lot more control:

barplot(#your arguments#,legend=FALSE)
legend(x="topleft",cex=yourCex)

etc.




--- Janet [EMAIL PROTECTED] wrote:

 Literally 60 seconds after I sent my question, I found the cex.names  
 parameter to barplot.
 I haven't found a parameter for the size of the text in the legend.
 
 Apologies for clogging inboxes semi-unnecessarily.
 
 Janet
 
 



 



Re: [R] simple bar plot question

2007-03-23 Thread Stephen Tucker
Sorry, that was a bit premature - you probably want other arguments in legend
as well; in particular 'fill' is an argument you'd probably be interested in 
for legends of barplots:

legend(x="topleft",cex=yourCex,fill=yourColors,legend=yourText,
       lty=NA,pch=NA,#and so on#)
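
A runnable sketch of the idea with made-up counts and colours. Note that the
label text goes in the legend argument of legend(); legend.text belongs to
barplot().

```r
# Made-up stacked data: two series across three categories
counts <- matrix(c(3, 5, 2, 4, 6, 1), nrow = 2,
                 dimnames = list(c("a", "b"), c("x", "y", "z")))
cols <- c("grey30", "grey80")

# Draw the bars without barplot's built-in legend
mp <- barplot(counts, col = cols, cex.names = 0.8)

# Add a separate legend whose text size is controlled by cex
legend("topleft", legend = rownames(counts), fill = cols,
       cex = 0.7, bty = "n")
```

The cex value passed to legend() is independent of cex.names, so the axis
labels and the legend text can be sized separately.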

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 I think you can set legend=FALSE in barplot() and add your own legend, in
 which you have a lot more control:
 
 barplot(#your arguments#,legend=FALSE)
 legend(x="topleft",cex=yourCex)
 
 etc.
 
 
 
 
 --- Janet [EMAIL PROTECTED] wrote:
 
  Literally 60 seconds after I sent my question, I found the cex.names  
  parameter to barplot.
  I haven't found a parameter for the size of the text in the legend.
  
  Apologies for clogging inboxes semi-unnecessarily.
  
  Janet
  
  
 
 
 
  


 



 



Re: [R] Ticks on barplots

2007-03-21 Thread Stephen Tucker
Hi Mike, you can try using axTicks as in this example (you can also use
pretty instead).

Below, instead of a barplot() I have used plot() but with type="h" and lend=3
(see ?par for details on lend) which I have found to be useful at times when
making barchart-like plots with time-series. In any case, barplot() or
plot(), I hope this illustration will be helpful:


# define function to convert numeric to POSIXct
# from http://tolstoy.newcastle.edu.au/R/help/04/05/0980.html
numToPOSIXct <- function(v) {
  now <- Sys.time()
  Epoch <- now - as.numeric(now)
  Epoch + v
}

# define example data
x <- seq(as.POSIXct("1950-01-01 00:00:00"),
         as.POSIXct("2000-01-01 00:00:00"),by="year")
y <- runif(length(x))

# plot and label axis with axTicks()
plot(x,y,type="h",lwd=3,col=8,xaxt="n",lend=3)
axis(1,at=axTicks(1),lab=format(numToPOSIXct(axTicks(1)),"%Y"))
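
For the barplot() case, the label interval can be chosen without looking at
the plot by letting pretty() pick round values, as mentioned above. The
sketch below uses made-up yearly data; the variable names are illustrative
only.

```r
# Made-up yearly series
years <- 1950:2000
vals <- seq_along(years)

mp <- barplot(vals, axisnames = FALSE)   # mp holds the bar midpoints

# Let pretty() choose round years, keeping only those that have a bar
lab_years <- pretty(years)
lab_years <- lab_years[lab_years %in% years]
idx <- match(lab_years, years)

# Short unlabeled ticks at every bar, longer labeled ticks at chosen years
axis(1, at = mp, labels = FALSE, tcl = -0.25)
axis(1, at = mp[idx], labels = lab_years, tcl = -0.75, las = 2)
```

Because the labeled positions are taken from mp, the long ticks line up with
the bars even though barplot() spaces them on its own scale.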



--- Mike Prager [EMAIL PROTECTED] wrote:

 Marc Schwartz [EMAIL PROTECTED] wrote:
 
  On Tue, 2007-03-20 at 18:04 -0400, Michael H. Prager wrote:
   I am generating stacked barplots of age-composition of fish populations
   (Y) over time (X).  As there are many years, not every bar is labeled.
   When looking at the plot, it becomes difficult to associate labels with
   their bars.
   
   We have improved this a bit by using axis() to add a tickmark below each
   bar.  Can anyone suggest a way to draw ticks ONLY at bars where a tick
   label is drawn?  Or to make such ticks longer than those where there is
   no label?
   
   This is going into a function, so I'm hoping for a method that doesn't 
   require looking at the plot first.
  
   # sample code (simplified) #
   mp <- barplot(t(N.age), xlab = "Year", axisnames = FALSE)
   axis(side = 1, at = mp, labels = rownames(N.age), tcl = -0.75)
   
   Thanks!
   
   Mike Prager
   NOAA, Beaufort, NC
  
  Mike,
  
  How about something like this:
  
    mp <- barplot(1:50, axisnames = FALSE)
  
    # Create short tick marks at each bar
    axis(1, at = mp, labels = rep("", 50), tcl = -0.25)
  
# Create longer tick marks every 5 years with labels
axis(1, at = mp[seq(1, 50, 5)], 
 labels = 1900 + seq(0, 45, 5), tcl = -0.75, las = 2, 
 cex.axis = 0.75)
  
  
  Just pick which labels you want to be shown (eg. every 5 years) and
  synchronize the values of those with the 'at' argument in axis().
  
  HTH,
  
  Marc Schwartz
  
 
 Thanks, Marc, for this solution and thanks equally to Jim Lemon
 for a similar idea.  This seems promising.  Since this is to go
 into a function (and should work without intervention), I'll
 need to devise an algorithm to decide at what interval the
 labels should be plotted.  Clearly axis() has such an
 algorithm.  Unfortunately, it reports its result only by placing
 the labels.
 
 Mike
 
 -- 
 Mike Prager, NOAA, Beaufort, NC
 * Opinions expressed are personal and not represented otherwise.
 * Any use of tradenames does not constitute a NOAA endorsement.
 
 



 


