Re: [R] Details of subassignment (for vectors and data frames)

2011-08-12 Thread Jeff Newmiller
My mental model is that the left-hand side forms a sort of virtual vector 
where each element really points to an element in the vector being modified. 
Then the right-hand side scalar is recycled in the usual way (if 
necessary) until it is a vector just as long as the virtual vector on the 
left. Then the right vector is assigned to the left vector, which leaves the 
designated elements in the destination vector changed.

You could work out an equivalent for loop structure, but I think it would be 
tricky to keep the behavior for different length source and destination 
assignments straight.
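
Something like this rough sketch of the idea (not a transcription of R's 
actual internals):

vec <- 1:25
idx <- c(1, 5, 25)
value <- c(101, 102, 103)

# recycle the right-hand side to the length of the "virtual vector" ...
rhs <- rep(value, length.out = length(idx))
# ... then copy element by element into the designated positions
for (k in seq_along(idx)) vec[idx[k]] <- rhs[k]
vec[idx]   # 101 102 103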
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
--- 
Sent from my phone. Please excuse my brevity.

Al Roark hrbuil...@hotmail.com wrote:


Hi All: 
I'm looking to find out a bit more about how subassignment actually works and 
am hoping someone with knowledge of the details can fill me in (I've looked at 
the source code, but my knowledge of C is lacking).
In the case of vectors, my reading of ?"[" would indicate that for a vector,

vec <- 1:25
vec[c(1,5,25)] <- c(101,102,103)

is functionally the same as

indx <- c(1,5,25)
for (i in 1:length(indx)) vec[indx[i]] <- c(101,102,103)[i]
And in the case of a data frame,

df <- data.frame(d1=1:10, d2=11:20, d3=21:30)
df[c(1,5,10), c(1,3)] <- data.frame(a=101:103, b=104:106)

is functionally the same as

rowindx <- c(1,5,10)
colindx <- c(1,3)
for (i in 1:length(rowindx)) {
  for (j in 1:length(colindx)) df[rowindx[i], colindx[j]] <- data.frame(a=101:103, b=104:106)[i,j]
}

Obviously I've verified that these examples work, and I realize that my loops 
also contain subassignments; what I'm really after is to understand the 
mechanics of replacing multiple elements. Is a for loop the proper way to 
understand the sequential nature of subassignments here (even if it is not 
actually implemented using a loop)?
Cheers,
HR



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] current.panel.limits() of lattice returning NaN limits - why?

2011-08-12 Thread Fredrik Karlsson
Hi,

I need a custom axis function for a plot, but it seems
that current.panel.limits() sometimes returns NaN limits for the plot, which
makes it much harder to calculate anything sensible.
An illustration:

Given this axis function:


vs.axis <- function(...){
   xlim <- current.panel.limits()$xlim
   ylim <- current.panel.limits()$ylim

   # Debug code
   print(list(ylim=ylim, xlim=xlim))

   xat <- pretty(seq(xlim[1], xlim[2], 100), n=5)
   yat <- pretty(seq(ylim[1], ylim[2], 100), n=4)
   xlab <- sub("-", "", as.character(xat))
   ylab <- sub("-", "", as.character(yat))
   panel.axis(side="top", at=xat, labels=xlab)
   panel.axis(side="right", at=yat, labels=ylab)
}

and the attached data set, I get this output:

> xyplot(F1 ~ F2, data=pb, axis=vs.axis)
$ylim
[1] NaN NaN

$xlim
[1]  346.5 3823.5

Error in if (del == 0 && to == 0) return(to) :
  missing value where TRUE/FALSE needed

What's wrong? Is there a more robust way of getting the x- and y-limits?

/Fredrik

-- 
Life is like a trumpet - if you don't put anything into it, you don't get
anything out of it.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] value.labels

2011-08-12 Thread David Winsemius


On Aug 11, 2011, at 10:27 AM, Uwe Ligges wrote:


On 11.08.2011 19:22, David Winsemius wrote:

Hmm, when you want to add a level without changing the class of the  
factor object, you will have to add the level first and then  
assign the level to the elements of the object. I'd probably rebuild  
the whole thing in that case.




All I wanted to do was construct a "not done" label for the NAs in a  
factor created with cut so they would show up in tabulations. I think  
the answer to my question is to use `addNA` and then use `levels<-`  
to change the NA level to "not done". I suppose RTFM is one way to  
answer the question and that was how I figured out what I now know.  
Looks like I can use either factor(x, exclude=NULL) or addNA(x.c).


This builds a test factor:

x <- rnorm(100)
is.na(x) <- sample(c(TRUE, FALSE), 100, c(.1, .9), replace=TRUE)
x.c <- cut(x, seq(-3, 3, by=0.5))

# I think these do the same thing:
x.cE <- factor(x, exclude=NULL)
x.cNA <- addNA(x.c)

# Relabel the NA level
levels(x.cNA) <- c(levels(x.cNA)[-length(levels(x.cNA))], "NotDone")

> table(x.cNA)
x.cNA
(-3,-2.5] (-2.5,-2] (-2,-1.5] (-1.5,-1] (-1,-0.5]  (-0.5,0]   (0,0.5]   (0.5,1]
        0         0         3         5        16        17        14        14
  (1,1.5]   (1.5,2]   (2,2.5]   (2.5,3]   NotDone
       15         4         2         1         9

That seems a bit less baroque than converting to numeric, changing  
NAs to 0 and then adjusting the factor labels, although I still want  
to move the not-done level so it is first.


x.cN <- factor(x.cNA, levels=c("NotDone", levels(x.cNA)[-length(levels(x.cNA))]))

> table(x.cN)
x.cN
  NotDone (-3,-2.5] (-2.5,-2] (-2,-1.5] (-1.5,-1] (-1,-0.5]  (-0.5,0]   (0,0.5]
        9         0         0         3         5        16        17        14
  (0.5,1]   (1,1.5]   (1.5,2]   (2,2.5]   (2.5,3]
       14        15         4         2         1

As I understood it, the original question was just how to rename the  
levels.


Yes, that was how I understood it as well. I was asking what I thought  
was a related question.




Uwe




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract named regions from an Excel file using XLConnect?

2011-08-12 Thread christiaan pauw
Hi Everybody

In R, the XLConnect package can read and write named regions to and from
Excel. In order to read a named region with the readNamedRegion function you
need to know its name. You can check whether a name exists with existsName, but
you still have to know the name. Is there a way to actually get a list of
the named regions in XLConnect, similar to getRanges in the xlsx package?

On that point: what is the difference between XLConnect and xlsx, and when
should you use which?

regards
Christiaan


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] current.panel.limits() of lattice returning NaN limits - why?

2011-08-12 Thread Deepayan Sarkar
On Fri, Aug 12, 2011 at 12:15 PM, Fredrik Karlsson dargo...@gmail.com wrote:
 Hi,

 I need a custom axis function for a plot, but it seems
 that current.panel.limits() sometimes returns NaN limits for the plot, which
 makes it much harder to calculate anything sensible.
 An illustration:

 Given this axis function:


 vs.axis <- function(...){
   xlim <- current.panel.limits()$xlim
   ylim <- current.panel.limits()$ylim

   # Debug code
   print(list(ylim=ylim, xlim=xlim))

   xat <- pretty(seq(xlim[1], xlim[2], 100), n=5)
   yat <- pretty(seq(ylim[1], ylim[2], 100), n=4)
   xlab <- sub("-", "", as.character(xat))
   ylab <- sub("-", "", as.character(yat))
   panel.axis(side="top", at=xat, labels=xlab)
   panel.axis(side="right", at=yat, labels=ylab)
 }

 and the attached data set, I get this output:

(The attachment didn't come through.)

 xyplot(F1 ~F2,data=pb,axis=vs.axis)
 $ylim
 [1] NaN NaN

 $xlim
 [1]  346.5 3823.5

 Error in if (del == 0 && to == 0) return(to) :
  missing value where TRUE/FALSE needed

 What's wrong? Is there a more robust way of getting the x- and y- limits?

You are doing the equivalent of

library(grid)
pushViewport(viewport(0.5, 0.5, width = 0.8, height = 0, xscale = c(0, 10)))
current.panel.limits()

(Note the height=0).

The axis function is called four times, once for each side. On the top
and left sides, it is actually called with the strip viewports active,
because the axis annotation goes outside the strip, not the panel. In
your case there are no strips, which basically means a 0-height strip.
Your code will work (although it will not give what you want) if you try
something like

xyplot(rnorm(10) ~ 1:10 | gl(1, 10), axis=vs.axis,
   strip = TRUE, strip.left = TRUE)

In real applications of custom axis functions, one usually writes code
conditioned on the value of the 'side' argument.
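
For example, the skeleton usually looks something like this (only a sketch,
reusing the vs.axis idea; per the above, the top/left calls may still be
relative to strip viewports):

vs.axis <- function(side, ...) {
    if (side == "right") {
        ylim <- current.panel.limits()$ylim
        yat <- pretty(ylim, n = 4)
        panel.axis(side = "right", at = yat,
                   labels = sub("-", "", as.character(yat)), outside = TRUE)
    } else if (side == "bottom") {
        xlim <- current.panel.limits()$xlim
        xat <- pretty(xlim, n = 5)
        panel.axis(side = "bottom", at = xat,
                   labels = sub("-", "", as.character(xat)), outside = TRUE)
    }
    ## other sides: do nothing (or handle the strip viewports explicitly)
}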

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with loops

2011-08-12 Thread Paul Hiemstra
 On 08/11/2011 07:51 PM, William Dunlap wrote:
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of R. Michael
 Weylandt
 Sent: Thursday, August 11, 2011 10:09 AM
 To: Srinivas Iyyer
 Cc: r-help@r-project.org
 Subject: Re: [R] help with loops

 No problem,

 By the way, you can't (or at least shouldn't) use return() outside of a
 function -- that was the source of your old error message.

 If you, for whatever reason, couldn't use unlist() you would write:

 OurUnlist <- function(c, unique = FALSE) {
   if (!is.list(c)) return(c)
   z <- NULL
   for (i in seq_along(c)) {
     z <- c(z, c[[i]])
   }
   if (unique) return(unique(z))
   return(z)
 }

 or some such. Still, I suggest you stick with built in functions whenever
 possible.
 I tend to encourage people to write functions.

In addition, writing functions yourself is a good way to exercise your R
skills. On the other hand, built-in functions often solve the problem
faster and are more generic. In my experience it takes quite a bit of R
knowledge before one is good enough to beat a general built-in function.
Often a lengthy self-written function can be replaced by one call to a
built-in function, and using the functions that are already there saves a
lot of time. Several times I wanted something done in R, only to find out
that it had already been done. This meant getting the job done in one hour
instead of two days of programming.

In general I tend to agree with Michael and encourage people to stick
with the built-in functions.
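
(For reference, with the built-in function the quoted example below reduces
to a one-liner:)

c <- list(fruit = c("apple", "orange", "grape"),
          vehicle = c("car", "truck", "jeep"))
z <- unlist(c, use.names = FALSE)
z
# [1] "apple"  "orange" "grape"  "car"    "truck"  "jeep"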

my 2 cts ;),

regards,
Paul

 I suppose you may end up reinventing the wheel,
 but once you get used to writing functions it
 is often faster to write a specialized one than
 to find one that meets your needs.  When you
 discover a new idiom for your task (e.g., calling
 unlist() instead of the for loop), you just edit
 one function (OurUnlist) instead of editing all
 your scripts that used the old idiom.

 Once you get used to writing functions (and using
 them), you are ready to document them and package
 them up for others to use.

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com 

 Michael Weylandt

 PS -- Can you email me (off list) and let me know what this is for? We've
 been asked this question a couple of times over the last few days and I'm
 just wondering why it's a calculation of interest to so many.

 On Thu, Aug 11, 2011 at 1:00 PM, Srinivas Iyyer
 srini_iyyer_...@yahoo.comwrote:

 Thank you. that was very easy.
 -srini

 --- On Thu, 8/11/11, R. Michael Weylandt michael.weyla...@gmail.com wrote:


 From: R. Michael Weylandt michael.weyla...@gmail.com
 Subject: Re: [R] help with loops
 To: Srinivas Iyyer srini_iyyer_...@yahoo.com
 Cc: r-help@r-project.org
 Date: Thursday, August 11, 2011, 12:49 PM


 unlist()

 Michael Weylandt

 On Thu, Aug 11, 2011 at 12:46 PM, Srinivas Iyyer 
 srini_iyyer_...@yahoo.com
 wrote:
 hi I need help with list object.

 I have a list object

 a <- c('apple','orange','grape')
 b <- c('car','truck','jeep')
 c <- list(a,b)
 names(c) <- c('fruit','vehicle')
 c
 $fruit
 [1] "apple"  "orange" "grape"

 $vehicle
 [1] "car"   "truck" "jeep"


 I want to write all the elements of this list in one object 'z'.

 z
 [1] "apple"  "orange" "grape"  "car"    "truck"  "jeep"

 How can I write the elements of c to z
 I tried using a for loop. Could any one help me please. thanks


 z <- ''
 for (i in 1:length(c)){
 + k <- c[[i]]
 + z <- c(z,k)
 + return(z)}
 Error: no function to return from, jumping to top level

 Thank you.
 srini



-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting data

2011-08-12 Thread Marina de Wolff

Thank you for your reply,
 
I used this code on my test data, but did not get the same p-values.

I think I know where the difference lies; when the data is split in 4 parts I
want to compare the two left groups (groups 1 and 2) with each other and the two
right groups (groups 3 and 4) with each other. It seems that with this code
groups 1 and 3 are compared with each other, and groups 2 and 4; I have not yet
succeeded in changing this.
 
About the unequal data sizes, I thought I could 'correct' this by using round. 
For example, when my data consists of 17 data points I would use
 
m <- length(data)/2
x <- data[1:round(m)]
y <- data[(round(m)+1):length(data)]
 
x has size 9 and y has size 8. 
 
Sincerely,

Marina de Wolff
 



From: michael.weyla...@gmail.com
Date: Thu, 11 Aug 2011 11:54:11 -0400
Subject: Re: [R] Splitting data
To: marinadewo...@hotmail.com
CC: r-help@r-project.org

This sounds very much like a recursive problem: something like this seems to 
get the gist of what you want. 

DataSplits <- function(Data, alpha = 0.05) {
    DataSplitsCore <- function(Data, alpha, level) {
        tt <- t.test(Data[,1], Data[,2])
        print(tt)
        if (tt$p.value > alpha) {
            print(paste("Stopped at level", level))
            return(invisible(TRUE))
        } else {
            nr = floor(NROW(Data)/2)
            if (nr == 1) {print(paste("Reached Samples of Size 1")); stop()}
            d1 = DataSplitsCore(Data[(1:nr),], alpha = alpha, level = level + 1)
            if (d1) return(invisible(TRUE))
            d2 = DataSplitsCore(Data[-(1:nr),], alpha = alpha, level = level + 1)
            if (d2) return(invisible(TRUE))
            return(invisible(FALSE))
        }
    }
    DataSplitsCore(Data, alpha = alpha, level = 1)
}
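
A hypothetical call, just to show the interface (columns 1 and 2 of Data are
the two groups compared at each level):

Data <- cbind(rnorm(16), rnorm(16))   # two columns with the same mean
DataSplits(Data, alpha = 0.05)        # will typically report "Stopped at level 1"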

Your description wasn't the clearest about what to do when the data sizes 
didn't match, but this should give you a start. Let me know if this doesn't do 
as desired and I can help tweak it. 

Hope this can be of help, 

Michael Weylandt 

PS -- You might as well use R's built in t.test function. 


On Thu, Aug 11, 2011 at 5:17 AM, Marina de Wolff marinadewo...@hotmail.com 
wrote:


I want to implement the following algorithm in R:

I want to split my data and use a t test to compare the means of the two groups to 
see if they significantly differ from each other. If this is a yes (p < alpha) 
I want to split again (into 4 groups) and do the same procedure twice, and 
stop otherwise (here the problem arises). As a final result I would have 
different groups of data.

I made some code where the data is split until no further splitting is possible. So 
for 16 data points, we can split 4 times with a final result of 16 groups (p is 
NA for the 4th split since sd cannot be calculated).

The code calculated all p-values, but I don't want this. I want it to stop when 
p > alpha. I tried while, but didn't succeed.

I hope someone can help me to achieve my goal.

This is what I tried so far with test data:

a = rnorm(9,0,0.1)
b = rnorm(7,1,0.1)
data = c(a,b)
plot(data)

# Want to calculate max of groups/split for the data
d = seq(1,100,1)
n = 2^d
m <- which(n <= length(data))
n = n[m[1]:m[length(m)]]

# All groups
i=0
j=0
dx = 0
dy = 0
for (i in 1:length(n)){
split <- length(data)/(n[i])
for (j in 1:(n[i]/2)){
x = data[(1 + (j-1)*(2*split)):(round(split) + (j-1)*(2*split))]
dx = cbind(dx,x)
y = data[((round(split)+1) + (j-1)*(2*split)):(2*j*split)]
dy = cbind(dy,y)
}}

dx = dx[,2:dim(dx)[2]]
dy = dy[,2:dim(dy)[2]]

k=0
meanx=0
meany=0
sdx=0
sdy=0
nx=0
ny=0
for (k in 1:dim(dx)[2]) {
meanx[k] = mean(unique(dx[,k]))
meany[k] = mean(unique(dy[,k]))
sdx[k] = sd(unique(dx[,k]))
sdy[k] = sd(unique(dy[,k]))
nx[k] = length(unique(dx[,k]))
ny[k] = length(unique(dy[,k]))
}

t = (meanx-meany)/sqrt((sdx^2/nx) + (sdy^2/ny))
df = ((sdx^2/nx) + (sdy^2/ny))^2/((sdx^2/nx)^2/(nx-1) + (sdy^2/ny)^2/(ny-1))
p = 2*pt(-abs(t),df=df)
alpha = 0.05

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] UNC windows path beginning with backslashes: normalizePath bug??

2011-08-12 Thread Keith Jewell
Hi,

Back in June I posted the message below, but had no replies. I've made a 
little progress since then so this is to update anyone interested (!) and to 
ask for comments.

Brief problem statement:
Under Windows, some parts of R don't handle UNC paths beginning with 
backslashes. Specifically
a) Sys.glob() fails to find some files breaking (e.g.) Rcmdr plugins
   Sys.glob(file.path(.libPaths(), "*/etc/menus.txt"))
   fails to find files which are there

b) update.packages(ask='graphics') fails when copying the updates into the 
destination folders

In Renviron.site I define the site library with forward slashes, not 
backslashes thus...
   R_LIBS_SITE=//campden/shares/workgroup/stats/R/library/%v
... but the startup process seems to replace them with backslashes.
I guess because .libPaths with a 'new' argument calls normalizePath, which 
changes leading slashes to backslashes, even with winslash="/":
> normalizePath("//campden/shares/workgroup/stats/R/library", winslash="/")
[1] "\\\\campden/shares/workgroup/Stats/R/library"

I've corrected (??) this by inserting a line into Rprofile.site
  assign(".lib.loc", gsub("\\", "/", .libPaths(), fixed=TRUE), 
env=environment(.libPaths))
That seems to fix problem (a) above, which was affecting a number of users.
But have I broken anything else?

I'm still experiencing problem (b).
I'm the only person on site who updates packages so I've mapped a drive 
letter (L:) and in my own .Rprofile I have a line
   assign(".lib.loc", sub("//campden/shares/workgroup/Stats", "L:", 
.libPaths(), ignore.case = TRUE), env=environment(.libPaths))

So that's OK as far as it goes, but it's all a bit messy!
If .libPaths is called with a 'new' argument it will break things again.
normalizePath seems to produce paths that don't work with Sys.glob.

I have the feeling I'm being silly and making hard work of all this.

Any comments? Suggestions?

Best regards, and thanks in advance/

Keith Jewell

Keith Jewell k.jew...@campden.co.uk wrote in message news:...
 Hi,

 Back in 2010 I had a problem with 'update.packages()', which I worked 
 around by mapping a drive letter to a UNC path [described in 
 http://finzi.psych.upenn.edu/Rhelp10/2010-February/229820.html but my 
 current workaround is
 assign(".lib.loc", sub("Server02/stats", "L:", .libPaths(), 
 ignore.case = TRUE), env=environment(.libPaths))].

 More recently a colleague had problems using the 'FactoMineR' plug in for 
 the Rcmdr package;
 a) directly loading 'RcmdrPlugin.FactoMineR' opened and crashed R 
 Commander;
 b) opening R Commander without FactoMiner, the Tools option 'Load Rcmdr 
 plug-in(s)...' was greyed out.

 It transpired that in .libPaths() the path to the library holding 
 'RcmdrPlugin.FactoMineR' was specified as a UNC address: 
 \\Server02/stats/R/library/2.13. Mapping a virtual drive letter (e.g. 
 "L:") and specifying the path in .libPaths() as a 'local file system' (LFS) 
 address "L:/R/library/2.13" fixed the problem.

 I contacted Professor Fox (maintainer of Rcmdr) who told me that Rcmdr 
 finds plug-in packages via the command
  plugins <- unlist(lapply(.libPaths(), function(x) Sys.glob(file.path(x, 
 "*/etc/menus.txt"))))
 Because file.path and Sys.glob are both vectorised I think (but am not 
 certain) that this could be simplified to:
  plugins <- Sys.glob(file.path(.libPaths(), "*/etc/menus.txt"))
 but that's by the way, the problem seems to lie in Sys.glob under Windows 
 operating systems.

 I note that 'help(Sys.glob)' on my Windows system differs from 
 http://finzi.psych.upenn.edu/R/library/base/html/Sys.glob.html.
 The latter says "For precise details, see your system's documentation on 
 the glob system call.  There is a POSIX 1003.2 standard" <snip> "The rest 
 of these details are indicative (and based on the POSIX standard)."
 On Windows "The glob system call is not part of Windows, and we supply a 
 partial emulation." <snip> "An attempt is made to handle UNC paths starting 
 with a double backslash" which doesn't really inspire confidence.

 This was discussed in a 2009 R-devel thread starting here 
 https://stat.ethz.ch/pipermail/r-devel/2009-June/053879.html, but the 
 patch proposed in that thread seems not to have been implemented (??).

 Trying to avoid Sys.glob in the Rcmdr application I came up with this:
  list.files(path=file.path(list.files(path=.libPaths(), 
 full.names=TRUE), "etc"), pattern="^menus\\.txt$", full.names=TRUE)
 It seems to give identical results to Sys.glob for mapped drives, works 
 with UNC paths in Windows, and seems quite fast.

 So my questions relate to diagnosis, prognosis, and prescription (cure?).

 1) Diagnosis: Am I correct that my problem(s) originate in the partial 
 emulation of glob in Windows.

 2) Prognosis: If so, is there any likelihood that the emulation will 
 improve in the near future?

 3) Prescription: If not:

  a) is assign(".lib.loc", sub("Server02/stats", "L:", .libPaths(), 
 ignore.case = TRUE), env=environment(.libPaths))
 a reasonable workaround in a 

Re: [R] value.labels

2011-08-12 Thread Jim Lemon

On 08/12/2011 12:10 AM, zcatav wrote:

Hello R people,

I have a data.frame. The status variable has 3 values: 0 = alive, 1 = dead and
2 = missed.
Status as a factor has the correct levels. The levels and labels output as follows:

> levels(Adbf$status); labels(Adbf$status)
[1] "0" "1" "2"
  [1] "1"   "2"   "3"   "4"   "5"   ...   "644"

Can I add value.labels to the status variable? If yes, how? Can I see these
value.labels on results or graphics?


Hi zcatav,
There is a convenience function in the prettyR package named 
add.value.labels that does just that. It is there mostly so that 
someone with data not converted from SPSS can make them look like data 
that has been converted from SPSS. I don't know whether the method used 
will work with functions from any other package, though.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] improve formatting of HTML table

2011-08-12 Thread Jim Lemon

On 08/12/2011 02:04 AM, Juliet Hannah wrote:

I am trying to improve the look of an HTML table for a report (that
needs to be pasted into Word).

Here is an example.

table2 <- structure(c(26L, 0L, 40L, 0L, 10L, 0L, 0L, 188L, 0L, 281L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 4L), .Dim = c(6L, 3L), .Dimnames = structure(list(
 myvar = c("Don't know", "Somewhat likely", "Somewhat unlikely",
 "Very likely", "Very unlikely", NA), var_recode = c("0", "1",
 NA)), .Names = c("myvar", "var_recode")), class = "table")


library(R2HTML)
.HTML.file = paste(getwd(), "/example.html", sep = "")
HTML(table2)


In the output, I would like to improve the justification of the
numbers (or any other suggestion to make
the HTML look nicer). The columns are a little hard to read.


Hi Juliet,
The example below, when copied from an HTML browser (Konqueror) and 
pasted into a word processor (OpenOffice Writer), produces a table with no 
borders, a left-justified first column and the other columns centered. You can 
easily do other justifications if you wish.


library(prettyR)
delim.table(table2, filename="example.html",
 tabegin="<table border=0>", bor="<tr><td>",
 delim="<td align=center>", html=TRUE)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] UNC Windows path beginning with backslashes: normalizePath bug??

2011-08-12 Thread Keith Jewell
Thanks Uwe.

I'm aware (and have been forcefully reminded) that using a mapped drive 
avoids these problems. But there is no single drive letter which I can use 
site-wide, so I have problems with things like R_LIBS_SITE. As I've outlined 
I'm exploring a range of solutions, including mapping a drive where I can.

I posted in the hope of learning from and perhaps helping those with similar 
problems. I hope that it is permissible to discuss non-canonical use of R on 
this list, I certainly did not intend disrespect for the R developers (or to 
make typing errors).

Best regards

Keith Jewell

Uwe Ligges lig...@statistik.tu-dortmund.de wrote in message 
news:4e44091e.7090...@statistik.tu-dortmund.de...
 This is extremely tricky since Windows does not always accept // rather 
 than \\. Additionally, the glob system call is not implemented in Windows, 
 hence ?Sys.glob tells us a partial emulation is provided and "An attempt 
 is made to handle UNC paths starting with a double backslash."

 As you have seen, this does not work everywhere, therefore it is advisable 
 to run R from mapped drives - as I have been doing in the network of our 
 university for 13 years without problems now.

 Best,
 Uwe Ligges


 On 11.08.2011 18:29, Keith Jewell wrote:
 Hi,

 Back in June I posted the message below, but had no replies. I've made a
 little progress since then so this is to update anyone interested (!) and 
 to
 ask for comments.

 Brief problem statement:
 Under Windows, some parts of R don't handle UNC paths beginning with
 backslashes. Specifically
 a) Sys.glob() fails to find some files breaking (e.g.) Rcmdr plugins
 Sys.glob(file.path(.libPaths(), "*/etc/menus.txt"))
 fails to find files which are there

 b) update.packages(ask='graphics') fails when copying the updates into 
 the
 destination folders

 In Renviron.site I define the site library with forward slashes, not
 backslashes thus...
 R_LIBS_SITE=//campden/shares/workgroup/stats/R/library/%v
 ... but the startup process seems to replace them with backslashes.
 I guess because .libPaths with a 'new' argument calls normalizePath, 
 which
 changes leading slashes to backslashes, even with winslash="/"
 > normalizePath("//campden/shares/workgroup/stats/R/library", 
 winslash="/")
 [1] "\\\\campden/shares/workgroup/Stats/R/library"

 I've corrected (??) this by inserting a line into Rprofile.site
   assign(".lib.loc", gsub("\\", "/", .libPaths(), fixed=TRUE),
 env=environment(.libPaths))
 That seems to fix problem (a) above, which was affecting a number of 
 users.
 But have I broken anything else?

 I'm still experiencing problem (b).
 I'm the only person on site who updates packages so I've mapped a drive
 letter (L:) and in my own .Rprofile I have a line
 assign(".lib.loc", sub("//campden/shares/workgroup/Stats", "L:",
 .libPaths(), ignore.case = TRUE), env=environment(.libPaths))

 So that's OK as far as it goes, but it's all a bit messy!
 If .libPaths is called with a 'new' argument it will break things again.
 normalizePath seems to produce paths that don't work with Sys.glob.

 I have the feeling I'm being silly and making hard work of all this.

 Any comments? Suggestions?

 Best regards, and thanks in advance/

 Keith Jewell

 Keith Jewell k.jew...@campden.co.uk wrote in message news:...
 Hi,

 Back in 2010 I had a problem with 'update.packages()', which I worked
 around by mapping a drive letter to a UNC path [described in
 http://finzi.psych.upenn.edu/Rhelp10/2010-February/229820.html  but my
 current workaround is
 assign(".lib.loc", sub("Server02/stats", "L:", .libPaths(),
 ignore.case = TRUE), env=environment(.libPaths))].

 More recently a colleague had problems using the 'FactoMineR' plug in 
 for
 the Rcmdr package;
 a) directly loading 'RcmdrPlugin.FactoMineR' opened and crashed R
 Commander;
 b) opening R Commander without FactoMiner, the Tools option 'Load Rcmdr
 plug-in(s)...' was greyed out.

 It transpired that in .libPaths() the path to the library holding
 'RcmdrPlugin.FactoMineR' was specified as a UNC address:
 \\Server02/stats/R/library/2.13. Mapping a virtual drive letter (e.g.
 "L:") and specifying the path in .libPaths() as a 'local file system' 
 (LFS)
 address "L:/R/library/2.13" fixed the problem.

 I contacted Professor Fox (maintainer of Rcmdr) who told me that Rcmdr
 finds plug-in packages via the command
   plugins <- unlist(lapply(.libPaths(), function(x) Sys.glob(file.path(x,
 "*/etc/menus.txt"))))
 Because file.path and Sys.glob are both vectorised I think (but am not
 certain) that this could be simplified to:
   plugins <- Sys.glob(file.path(.libPaths(), "*/etc/menus.txt"))
 but that's by the way, the problem seems to lie in Sys.glob under 
 Windows
 operating systems.

 I note that 'help(Sys.glob)' on my Windows system  differs from
 http://finzi.psych.upenn.edu/R/library/base/html/Sys.glob.html.
 The latter says "For precise details, see your system's documentation on
 the glob system call.  There is a POSIX 1003.2 standard" <snip>

[R] odfWeave repeats output

2011-08-12 Thread Chris Beeley
Hello all-

I'm having a problem with odfWeave. I'm still testing it out, and have
used both of these code chunks, which I copied off a blog:

Number 1:

A sample document last processed
\Sexpr{Sys.time()}.
This simply illustrates the output from an
R command inserted into our document.
This is using \Sexpr{version$version.string}.

Number 2:

<<Sample1>>=

summary(iris)

@

Both do the same thing, which is generate the document using this code:

odfWeave("/media/Windows7/temp/GCAMT_in.odt",
"/media/Windows7/temp/GCAMT_out2.odt")

But the output repeats over and over in the document in a bizarre way,
stretching out over about 9 pages, like this (abbreviated):

A sample document last processed
2011-08-12 09:55:51.
This simply illustrates the output from an
R command inserted into our document.
This is using R version 2.12.1 (2010-12-16).

...

A sample document last processedA sample document last processed
2011-08-12 09:55:51.2011-08-12 09:55:51.
This simply illustrates the output from anThis simply illustrates the
output from an
R command inserted into our document.R command inserted into our document.
This is using R version 2.12.1 (2010-12-16).This is using R version
2.12.1 (2010-12-16).

...

etc.

The really weird thing is that I have replicated the problem across
two operating systems (dual boot on the same computer): Windows 7
64-bit and Linux Mint 11 (which is Ubuntu-based; not sure which version,
I'm afraid). I've been unable to find anyone on any forums or anything
with the same problem.

Using R 2.13 on Windows and 2.12 on Linux; I was using RStudio but just
tested it without (just in case) and it does the same thing.

Any suggestions gratefully received.

Chris Beeley
Institute of Mental Health, UK.

Output of the operation is below.

With this output:

 odfWeave("/media/Windows7/temp/GCAMT_in.odt", 
 "/media/Windows7/temp/GCAMT_out2.odt")
  Copying  /media/Windows7/temp/GCAMT_in.odt
  Setting wd to  /tmp/RtmpAwd1Bm/odfWeave12095551677
  Unzipping ODF file using unzip -o GCAMT_in.odt
Archive:  GCAMT_in.odt
 extracting: mimetype
   creating: Configurations2/statusbar/
  inflating: Configurations2/accelerator/current.xml
   creating: Configurations2/floater/
   creating: Configurations2/popupmenu/
   creating: Configurations2/progressbar/
   creating: Configurations2/toolpanel/
   creating: Configurations2/menubar/
   creating: Configurations2/toolbar/
   creating: Configurations2/images/Bitmaps/
  inflating: content.xml
  inflating: manifest.rdf
  inflating: styles.xml
 extracting: meta.xml
  inflating: Thumbnails/thumbnail.png
  inflating: settings.xml
  inflating: META-INF/manifest.xml

  Removing  GCAMT_in.odt
  Creating a Pictures directory

  Pre-processing the contents
  Sweaving  content.Rnw

  Writing to file content_1.xml
  Processing code chunks ...

  'content_1.xml' has been Sweaved

  Removing content.xml

  Post-processing the contents
  Removing content.Rnw
  Removing styles.xml
  Renaming styles_2.xml to styles.xml
  Removing manifest.xml
  Renaming manifest_2.xml to manifest.xml
  Removing extra files

  Packaging file using zip -r GCAMT_in.odt .
  adding: mimetype (stored 0%)
  adding: content.xml (deflated 98%)
  adding: settings.xml (deflated 84%)
  adding: meta.xml (deflated 57%)
  adding: META-INF/ (stored 0%)
  adding: META-INF/manifest.xml (deflated 83%)
  adding: styles.xml (deflated 93%)
  adding: manifest.rdf (deflated 54%)
  adding: Pictures/ (stored 0%)
  adding: Thumbnails/ (stored 0%)
  adding: Thumbnails/thumbnail.png (deflated 23%)
  adding: Configurations2/ (stored 0%)
  adding: Configurations2/progressbar/ (stored 0%)
  adding: Configurations2/images/ (stored 0%)
  adding: Configurations2/images/Bitmaps/ (stored 0%)
  adding: Configurations2/toolbar/ (stored 0%)
  adding: Configurations2/menubar/ (stored 0%)
  adding: Configurations2/statusbar/ (stored 0%)
  adding: Configurations2/popupmenu/ (stored 0%)
  adding: Configurations2/accelerator/ (stored 0%)
  adding: Configurations2/accelerator/current.xml (stored 0%)
  adding: Configurations2/floater/ (stored 0%)
  adding: Configurations2/toolpanel/ (stored 0%)
  Copying  GCAMT_in.odt
  Resetting wd
  Removing  /tmp/RtmpAwd1Bm/odfWeave12095551677

  Done

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 5 arguments passed to .Internal(matrix) which requires 7

2011-08-12 Thread Uwe Ligges
You obviously have a combination of R code from package base and 
compiled C code of base R that do not match.


The point is that the R code of matrix() passes 7 arguments to the underlying 
.Internal code, which should also expect 7 arguments. If just 5 are passed, 
the R code is from an ancient version of R.


Try to clean up completely and reinstall.

Best,
Uwe Ligges


On 01.08.2011 17:32, Robert Pfister wrote:

Hello,

I am having a problem with the function matrix. Specifically, when I pass
three arguments (two more being instantiated in the function), I get the
following error message:

Error in matrix(0, 30, 10) :
   5 arguments passed to .Internal(matrix) which requires 7


I looked into it, and someone has suggested that this may be the function
from an old version of R. I recently changed my source path from the lucid
version to the maverick version and installed all of the R packages I need
like so, but why would this change the matrix() function? Also, how does R
know that I passed five arguments (only three being given) if the matrix()
function is supposed to take seven arguments?

Thank you,

Robert



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] odfWeave repeats output

2011-08-12 Thread Rainer Stuetz
On Fri, Aug 12, 2011 at 11:10 AM, Chris Beeley chris.bee...@gmail.com wrote:

 I'm having a problem with odfWeave. I'm still testing it out, and have
 used both of these code chunks, which I copied off a blog:

 ...

 The really weird thing is that I have replicated the problem across
 two operating systems (dual boot on the same computer), windows 7
 64bit and Linux Mint 11 (which is Ubuntu, not sure which version I'm
 afraid). I've been unable to find anyone on any forums or anything
 with the same problem.

 Using R v2.13 on Windows, v 2.12 on Linux, was using RStudio but just
 tested it without (just in case) and it does the same thing.

 Any suggestions gratefully received.

You might try downgrading to an earlier version of the XML package
(e.g. version 3.2.0).
See this thread https://stat.ethz.ch/pipermail/r-help/2011-May/278068.html
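
One way to do that (a sketch; the exact file name under the CRAN archive is
an assumption, and on Windows building from source needs the usual toolchain):

url <- "http://cran.r-project.org/src/contrib/Archive/XML/XML_3.2-0.tar.gz"
download.file(url, destfile = "XML_3.2-0.tar.gz")
install.packages("XML_3.2-0.tar.gz", repos = NULL, type = "source")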

HTH,
Rainer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] value.labels

2011-08-12 Thread Bert Gunter
Doesn't:

 x <- sample(0:2, 100, TRUE)

y <- structure(factor(x, labels=c("alive","dead","missed")), orig.labs=x)

do what you want?
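
(A quick check of what that gives: the labels show up in tabulations while
the original 0/1/2 codes stay retrievable.)

table(y)                      # tabulates as alive / dead / missed
head(attr(y, "orig.labs"))    # the original numeric codes are still there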

-- Bert


 The OP wanted to associate the label "alive" with the value 0, but didn't 
 want the underlying value in x to change.  The construction

 y <- factor(x, labels=c("alive","dead","missed"))

 creates a factor, y, where the underlying factor value is 1 and the label is 
 "alive" where x had the value of 0.  Without retaining the vector x, or 
 knowing how y was created, one can't get back to the original value of 0.  I 
 am agnostic about whether that is good or bad, but it seems that your 
 approach does not meet the OP's original request.  Am I missing something?

 Dan

 Daniel Nordlund
 Bothell, WA USA





-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract named regions from an Excel file using XLConnect?

2011-08-12 Thread Gabor Grothendieck
On Fri, Aug 12, 2011 at 3:15 AM, christiaan pauw cjp...@gmail.com wrote:
 Hi Everybody

 In R, the XLConnect package can read and write named regions to and from
 Excel. In order to read a named region with the readNamedRegion function you
 need to know its name. You can check whether a name exists with existsName, but
 you still have to know the name. Is there a way to actually get a list of
 the named regions in XLConnect, similar to getRanges in the xlsx package?

See ?getDefinedNames

This lists various R packages that read and/or write Excel files:
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel
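
For example (a sketch; the file name is made up):

library(XLConnect)
wb <- loadWorkbook("myfile.xlsx")
nms <- getDefinedNames(wb)          # names of all named regions in the workbook
regions <- lapply(nms, function(n) readNamedRegion(wb, name = n))
names(regions) <- nms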

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attribute name argument for getAttrib()

2011-08-12 Thread Duncan Murdoch

On 11-08-11 5:46 PM, Wei Hao wrote:

Hi all:

Having browsed Rinternals.h, it's a little unclear to me how to get
attributes that aren't in the "Symbol Table Shortcuts" section of that
file. In particular, I would like to extract the attribute
"scaled:center" from a SEXP.


This is an R-devel question.  If the following isn't clear, follow up 
there:  use install("scaled:center").


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Splitting data

2011-08-12 Thread R. Michael Weylandt
Yes, that likely is the source of the difference: I'm happy to help fix it
up (won't be hard), but I want to clarify exactly how you want the data
done:

say we have 20 variables x = 1:20 if there's a split we go to 1:10, 11:20;
then 1:5, 6:10, 11:15,16:20 etc

but what about situations with very different data sets:

x = cbind(1:20, 1:7)
one split takes us to where exactly: cbind( c(1:10, 11:20), c(1:3,1:4)) or
cbind( c(1:10,11:20), c(1:4,5:7)) and then what of the next iteration?

More generally, what exactly are you comparing? It seems odd to have two
different categories/samples and to compare their means and then to switch
gears entirely to compare subsamples of the categories independently. It
seems that they are just different inferences: comparing the average of cats
vs dogs and then comparing boy cats vs girl cats and boy dogs vs girl dogs.
That winds up highlighting different independent variables. (Iteration one:
species -- iteration two: gender)

If you could speak a little more about your data, it'd be easier to do the
splits in a meaningful way.

As currently implemented, my code takes a 2d data frame and simply divides
it into the top and bottom halves, which in most applications would
correspond to doing a mean-comparison calculation for different
statistics of the same observation. The subsetting then keeps
"corresponding" data together -- I put corresponding in quotes because
we aren't doing paired t-tests.

Looking forward to your reply,

Michael

PS -- I did the splits basically the same way (other than the direction) but
I just used floor() instead of round().


On Fri, Aug 12, 2011 at 3:45 AM, Marina de Wolff
marinadewo...@hotmail.comwrote:

  Thank you for your reply,

 I used this code on my test data, but did not get the same p-values.

 I think I know where the difference lies; when the data is split in 4 parts
 I want to compare the two left groups (groups 1 and 2) with each other and
 the two right groups (groups 3 and 4) with each other. It seems that with
 this code groups 1 and 3 are compared with each other, and groups 2 and 4; I
 have not yet succeeded in changing this.

 About the unequal data sizes, I thought I could 'correct' this by using
 round. For example, when my data consists of 17 data points I would use

 m <- length(data)/2
 x <- data[1:round(m)]
 y <- data[(round(m)+1):length(data)]

 x has size 9 and y has size 8.


 Sincerely,

 Marina de Wolff

  --
 From: michael.weyla...@gmail.com
 Date: Thu, 11 Aug 2011 11:54:11 -0400
 Subject: Re: [R] Splitting data
 To: marinadewo...@hotmail.com
 CC: r-help@r-project.org


 This sounds very much like a recursive problem: something like this seems
 to get the gist of what you want.

 DataSplits <- function(Data, alpha = 0.05) {
 DataSplitsCore <- function(Data, alpha, level) {
 tt <- t.test(Data[,1], Data[,2])
 print(tt)
 if (tt$p.value > alpha) {
 print(paste("Stopped at level", level))
 return(invisible(TRUE))
 } else {
 nr = floor(NROW(Data)/2)
 if (nr == 1) {print(paste("Reached Samples of Size 1")); stop()}
 d1 = DataSplitsCore(Data[(1:nr),], alpha = alpha, level = level + 1)
 if (d1) return(invisible(TRUE))
 d2 = DataSplitsCore(Data[-(1:nr),], alpha = alpha, level = level + 1)
 if (d2) return(invisible(TRUE))
 return(invisible(FALSE))
 }
 }
 DataSplitsCore(Data, alpha = alpha, level = 1)
 }

 Your description wasn't the clearest about what to do when the data sizes
 didn't match, but this should give you a start. Let me know if this doesn't
 do as desired and I can help tweak it.

 Hope this can be of help,

 Michael Weylandt

 PS -- You might as well use R's built in t.test function.

 On Thu, Aug 11, 2011 at 5:17 AM, Marina de Wolff 
 marinadewo...@hotmail.com wrote:


 I want to implement the following algorithm in R:

 I want to split my data and use a t test to compare the means of the two groups
 to see if they significantly differ from each other. If this is a yes (p <
 alpha) I want to split again (into 4 groups) and do the same procedure
 twice, and stop otherwise (here the problem arises). As a final result I
 would have different groups of data.

 I made some code where the data is split until no further splitting is
 possible. So for 16 data points, we can split 4 times with a final result of
 16 groups (p is NA for the 4th split since sd cannot be calculated).

 The code calculated all p-values, but I don't want this. I want it to stop
 when p > alpha. I tried while, but didn't succeed.

 I hope someone can help me to achieve my goal.

 This is what I tried so far with test data:

 a = rnorm(9,0,0.1)
 b = rnorm(7,1,0.1)
 data = c(a,b)
 plot(data)

 # Want to calculate max of groups/split for the data
 d = seq(1,100,1)
 n = 2^d
 m <- which(n <= length(data))
 n = n[m[1]:m[length(m)]]

 # All groups
 i=0
 j=0
 dx = 0
 dy = 0
 for (i in 

Re: [R] Extract values from a data frame

2011-08-12 Thread Ista Zahn
On Fri, Aug 12, 2011 at 3:54 AM, Lali laur...@gmail.com wrote:
 Hi Ista,
 Thanks for your suggestion, I am still trying to wrap my head around the
 functions you used, as I am not familiar with any of them, but it works
 perfectly!
 I do want to understand the code, if you don't mind I would like to ask a
 few questions
 In this line:
 dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
 What does the id=1 do? The variables are already specified in df[c(1, 2, 5,
 8)], right?

It retains the first variable in df in dfm. The remaining variables
are collapsed into a single variable named "value".

 What does this line do:
 dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S",
 1:length(name), sep = ""))

It adds a variable named "index" with values of "Si" for each level of
"name", where i = 1:length(name).

HTH,
Ista

 Thanks again for the help.
 Laura



 On Thu, Aug 11, 2011 at 5:57 PM, Ista Zahn iz...@psych.rochester.edu
 wrote:

 Hi Laura,

 On Thu, Aug 11, 2011 at 7:01 AM, Lali laur...@gmail.com wrote:
  Hi everyone,
  I have a data frame that looks *sort of* like this:
 
  name <- letters[1:5]
  signal.1 <- c(12, "bad signal", "noise", 10, "X")
  length.signal.1 <- 5:9
  intensity.signal.1 <- 3:7
  signal.2 <- c(13, "noise", 19.2, "X", "V")
  length.signal.2 <- 2:6
  intensity.signal.2 <- 1:5
  signal.3 <- c(NA, 15.4, "error", NA, 17)
  length.signal.3 <- c(NA, 2, 3, NA, 4)
  intensity.signal.3 <- c(NA, 4, 5, NA, 5)
 
  #(there are actually up to 16 signals and 50 names, but I made this
  short
  for the example)
 
  df <- data.frame(cbind(name, signal.1, length.signal.1, intensity.signal.1,
                         signal.2, length.signal.2, intensity.signal.2,
                         signal.3, length.signal.3, intensity.signal.3))
 
 
 
  I need to fish out some values and have them in a new data frame.
 
  I am only interested in values in columns 2, 5 and 8 (actually seq(2,
  50, 3)
  in my real df)
  I want the values that are not:
  "bad signal"
  "noise"
  "error"
  NA
  "V"
 
  This is the output I want (the name column is unimportant for my
  purposes,
  its just there as a reference for the example).
 
  (name)  S1       S2
  A        12        13
  B        15.4     (another value found in the other signals 3 not shown
  on
  example)
  C        19.2     (another value found in the other signals 3 not shown
  on
  example)
  D        10        X
  E        X         17
 
  I do know that there will always be 2 values exactly that do not match
  the
  exclusions named above, or none at all
 
  I have tried different approaches: grep, matching, %nin%... But as I am
  not an advanced user, I am very likely doing something wrong, because I
  either get a vector, or I get a matrix of TRUE/FALSE, and usually I get
  the whole rows, and I don't want that :(
  I have also being searching the list for answers without avail.
  Any suggestions? Examples including syntax are appreciated (syntax is a
  major weak point for me).

 Here is a solution using the reshape and plyr packages:

 library(reshape)
 library(plyr)
 dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
 dfm.r <- dfm[!dfm$value %in% c("bad signal", "noise", "error", NA, "V"), ]
 dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S",
 1:length(name), sep = ""))
 cast(dfm.r, name ~ index)

 Best,
 Ista
 
 
  Laura
 
 



 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Getting bootstrap statistic to work

2011-08-12 Thread Alex Olssen
Hi R-help,

I am trying to implement a nonparametric bootstrap to find the
standard errors of a simple statistic - the ratio of two scalars.  I
am having difficulty getting boot() to work correctly.  I coded a
function, theta(data, i), to create the ratio of the relevant scalars.
When I call the function for my data with every observation appearing
once, theta(test, c(1)), I get the correct statistic for my original
data.  However, when I use boot(test, theta, 200) the original
statistic is incorrect.

My code and data are below.  The code is very short.

Any help will be appreciated.

Cheers,
Alex

library(boot)
test <- read.csv("test.csv")
test.mean <- mean(test)
test.cov <- cov(test)*87/88 ## use the biased version as I am
reproducing a result from a book
test.eigen <- eigen(test.cov)
theta <- function(data, i) {
  data <- data * i
  data.cov <- cov(data)
  data.eigen <- eigen(data.cov)
  data.eigen$values[1]/sum(data.eigen$values)
}
test.boot <- boot(test, theta, 200)

test
   mec vec alg ana sta
1   77  82  67  67  81
2   63  78  80  70  81
3   75  73  71  66  81
4   55  72  63  70  68
5   63  63  65  70  63
6   53  61  72  64  73
7   51  67  65  65  68
8   59  70  68  62  56
9   62  60  58  62  70
10  64  72  60  62  45
11  52  64  60  63  54
12  55  67  59  62  44
13  50  50  64  55  63
14  65  63  58  56  37
15  31  55  60  57  73
16  60  64  56  54  40
17  44  69  53  53  53
18  42  69  61  55  45
19  62  46  61  57  45
20  31  49  62  63  62
21  44  61  52  62  46
22  49  41  61  49  64
23  12  58  61  63  67
24  49  53  49  62  47
25  54  49  56  47  53
26  54  53  46  59  44
27  44  56  55  61  36
28  18  44  50  57  81
29  46  52  65  50  35
30  32  45  49  57  64
31  30  69  50  52  45
32  46  49  53  59  37
33  40  27  54  61  61
34  31  42  48  54  68
35  36  59  51  45  51
36  56  40  56  54  35
37  46  56  57  49  32
38  45  42  55  56  40
39  42  60  54  49  33
40  40  63  53  54  25
41  23  55  59  53  44
42  48  48  49  51  37
43  41  63  49  46  34
44  46  52  53  41  40
45  46  61  46  38  41
46  40  57  51  52  31
47  49  49  45  48  39
48  22  58  53  56  41
49  35  60  47  54  33
50  48  56  49  42  32
51  31  57  50  54  34
52  17  53  57  43  51
53  49  57  47  39  26
54  59  50  47  15  46
55  37  56  49  28  45
56  40  43  48  21  61
57  35  35  41  51  50
58  38  44  54  47  24
59  43  43  38  34  49
60  39  46  46  32  43
61  62  44  36  22  42
62  48  38  41  44  33
63  34  42  50  47  29
64  18  51  40  56  30
65  35  36  46  48  29
66  59  53  37  22  19
67  41  41  43  30  33
68  31  52  37  27  40
69  17  51  52  35  31
70  34  30  50  47  36
71  46  40  47  29  17
72  10  46  36  47  39
73  46  37  45  15  30
74  30  34  43  46  18
75  13  51  50  25  31
76  49  50  38  23   9
77  18  32  31  45  40
78   8  42  48  26  40
79  23  38  36  48  15
80  30  24  43  33  25
81   3   9  51  47  40
82   7  51  43  17  22
83  15  40  43  23  18
84  15  38  39  28  17
85   5  30  44  36  18
86  12  30  32  35  21
87   5  26  15  20  20
88   0  40  21   9  14

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting bootstrap statistic to work

2011-08-12 Thread Ken Hutchison
Hi,
 It's kind of weird for me to not see a return() statement in the
function. Maybe try rolling your own, nonparametric bootstraps aren't all
that bad.
i.e.
reps=1000
stat.holder=rep(NA,reps) ###Rep of NA's so you can pick up errors easily
for(count in 1:reps)
{
bootsample=sample(data, n.data.to.compute.from,replace=T)
##Insert code here to compute statistic
stat.holder[count]=computed.statistic
}
   It may be overtly simplistic, but you may find it helpful.
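
For what it's worth, with boot() itself the usual pattern is to treat the
second argument of the statistic function as a row-index vector and subset
with it rather than multiply by it; a sketch:

theta <- function(data, i) {
  d <- data[i, ]                     # resample the rows picked by boot()
  ev <- eigen(cov(d))$values
  ev[1] / sum(ev)
}
test.boot <- boot(test, theta, R = 200)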
Thanks,
   Ken

On Fri, Aug 12, 2011 at 10:10 AM, Alex Olssen alex.ols...@gmail.com wrote:

 Hi R-help,

 I am trying to implement a nonparametric bootstrap to find the
 standard errors of a simple statistics - the ratio of two scalars.  I
 am having difficulty getting boot() to work correctly.  I code a
 function to create the ratio of the relevant scalars. theta(data, i).
 When I call the function for my data and every observation appearing
 once, theta(test, c(1)), I get the correct statistic for my original
 data.  However when I use boot(test, theta, 200) the original
 statistic is incorrect.

 My code and data are below.  The code is very short.

 Any help will be appreciated.

 Cheers,
 Alex

 library(boot)
 test <- read.csv("test.csv")
 test.mean <- mean(test)
 test.cov <- cov(test)*87/88 ## use the biased version as I am
 reproducing a result from a book
 test.eigen <- eigen(test.cov)
 theta <- function(data, i) {
  data <- data * i
  data.cov <- cov(data)
  data.eigen <- eigen(data.cov)
  data.eigen$values[1]/sum(data.eigen$values)
 }
 test.boot <- boot(test, theta, 200)

 test
   mec vec alg ana sta
 1 77 82 67 67 81
 2   63  78  80  70  81
 3   75 73 71 66 81
 4   55  72  63  70  68
 5   63 63 65 70 63
 6   53  61  72  64  73
 7   51 67 65 65 68
 8   59  70  68  62  56
 9   62  60  58  62  70
 10 64 72 60 62 45
 11  52  64  60  63  54
 12  55  67  59  62  44
 13  50  50  64  55  63
 14  65  63  58  56  37
 15  31 55 60 57 73
 16  60 64 56 54 40
 17  44  69  53  53  53
 18  42  69  61  55  45
 19  62  46  61  57  45
 20  31 49 62 63 62
 21 44 61 52 62  46
 22 49 41 61 49  64
 23 12 58 61 63  67
 24  49  53  49  62  47
 25  54 49 56 47 53
 26  54  53  46  59  44
 27  44 56 55 61 36
 28 18 44 50 57  81
 29  46  52  65  50  35
 30 32 45 49 57  64
 31  30 69 50 52 45
 32  46  49  53  59  37
 33  40 27 54 61 61
 34  31 42 48 54 68
 35  36  59  51  45  51
 36  56  40  56  54  35
 37  46  56  57  49  32
 38  45  42  55  56  40
 39  42  60  54  49  33
 40  40 63 53 54 25
 41 23 55 59 53  44
 42  48 48 49 51 37
 43  41 63 49 46 34
 44  46  52  53  41  40
 45  46  61  46  38  41
 46  40 57 51 52 31
 47  49  49  45  48  39
 48  22 58 53 56 41
 49  35  60  47  54  33
 50 48 56 49 42  32
 51  31 57 50 54 34
 52  17  53  57  43  51
 53 49 57 47 39  26
 54  59  50  47  15  46
 55  37  56  49  28  45
 56  40  43  48  21  61
 57  35  35  41  51  50
 58  38  44  54  47  24
 59  43  43  38  34  49
 60  39  46  46  32  43
 61  62  44  36  22  42
 62  48  38  41  44  33
 63  34  42  50  47  29
 64  18  51  40  56  30
 65  35  36  46  48  29
 66  59  53  37  22  19
 67  41  41  43  30  33
 68  31  52  37  27  40
 69  17  51  52  35  31
 70  34  30  50  47  36
 71  46  40  47  29  17
 72  10  46  36  47  39
 73  46  37  45  15  30
 74  30  34  43  46  18
 75  13  51  50  25  31
 76  49  50  38  23   9
 77  18  32  31  45  40
 78   8  42  48  26  40
 79  23  38  36  48  15
 80  30  24  43  33  25
 81   3   9  51  47  40
 82   7  51  43  17  22
 83  15  40  43  23  18
 84  15  38  39  28  17
 85   5  30  44  36  18
 86  12  30  32  35  21
 87   5  26  15  20  20
 88   0  40  21   9  14

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reduce the space under legend title

2011-08-12 Thread Uwe Ligges



On 12.08.2011 01:28, Amelia McNamara wrote:

I would like to know if there is a way to reduce the space between the
legend title and the first line of the legend, without reducing all
the vertical space using y.intersp. Because of lack of space, I would
like to differentiate my title by bolding it, and reduce the vertical
space under the title to the space typically used between categories
in a legend.

plot(c(1.1, 2.3, 4.6), c(2.0, 1.6, 3.2), pch=c(1,2,3))
legend(x="bottomright", legend=c("Category 1", "Category 2", "Category 3"),
       title=expression(bold("Fairly long title")), pch=c(1,2,3),
       bty="o")


That is hardcoded in the legend() function. Hence a more or less dirty 
hack or maybe better adapting the legend function for your needs is the 
way to go.
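
One possible flavour of such a hack, sketched under the assumption that an
empty title still reserves a single line inside the legend box, and using the
coordinates that legend() returns invisibly:

plot(c(1.1, 2.3, 4.6), c(2.0, 1.6, 3.2), pch = c(1, 2, 3))
lg <- legend("bottomright",
             legend = c("Category 1", "Category 2", "Category 3"),
             pch = c(1, 2, 3), bty = "o", title = "")
## draw the bold title ourselves, one ordinary item-spacing above the
## first legend entry
text(lg$rect$left + lg$rect$w / 2,
     lg$text$y[1] + (lg$text$y[1] - lg$text$y[2]),
     expression(bold("Fairly long title")))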


Best,
Uwe Ligges






~Amelia McNamara
Statistics PhD student, UCLA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] install packages from intranet

2011-08-12 Thread Peter Aberline

Hi,
 
I'm new to R. Apologies if this is a simple query, I've searched the mailing 
lists and docs but can't find a solution to my problem. 
 
I'm trying to make some packages available on our intranet. During development 
the 'intranet' is a webserver running on localhost. 
 
* When I call install.packages I get a message about not being able to access 
the 'index for repository'. 
* The directories are viewable and can be seen through a web browser. 
* The web server is IIS running on Windows 7. 
* I've used the same paths on the web server as where the packages are located 
in the CRAN mirrors.
* I've tried setting setInternet2(TRUE), which was already set.
 
 
 r <- getOption("repos");
 r["CRAN"] = "http://localhost";
 r["CRANextra"] = "http://localhost/pub/RWin";
 options(repos=r)
 r
                       CRAN                   CRANextra
        "http://localhost"  "http://localhost/pub/RWin"
 
 install.packages("abind")
Warning: unable to access index for repository http://localhost/bin/windows/cont
rib/2.13
Warning: unable to access index for repository http://localhost/pub/RWin/bin/win
dows/contrib/2.13
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package 'abind' is not available (for R version 2.13.1)

 
OR
 
 install.packages("abind", 
 contriburl="http://localhost/bin/windows/contrib/2.13")
Warning: unable to access index for repository 
http://localhost/bin/windows/contrib/2.13
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘abind’ is not available (for R version 2.13.1)
 
OR 
 
  install.packages("abind", 
contriburl="http://localhost/bin/windows/contrib/2.13/abind_1.3-0.zip")
Warning: unable to access index for repository 
http://localhost/bin/windows/contrib/2.13/abind_1.3-0.zip
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘abind’ is not available (for R version 2.13.1)
 
 
Can anyone give me any clues as to what I'm doing wrong? Does R need some kind 
of index file to map between the name 'abind' and the zip filename?
 
Many thanks
Peter.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] deSolve output

2011-08-12 Thread bosse
Hi, 

I've solved a simple differential equation describing the degradation of
amino acid carbon (THAA-C) using deSolve. 

Code is a follows: 

# Input of model parameters, a and b describes form of curve, i is apparent
initial age of Org. C.
parameters <- c(a = a, b = b, i = i)  


# Initial value of the model, G
state = c(G = G)


# specifies the function 'degradation' as a function of time (t), initial state
# (state) and parameters (the given parameters of the model)
degradation = function (t, state, parameters) {


# Specifies that state and parameters are passed as lists, hence the order in
# the lists is important
with(as.list(c(state, parameters)), 


#differential equation to be solved
{dG = (-a*(t+i)^b)*(G)  


list(c(dG))
})
}

# Makes the sequence of times to be solved for: start at t0 and proceed to
# iteno in steps of 1 day
times = seq(t0, iteno, by = 1)  


# calls the solver to solve the differential equation and saves the result as
# the variable THAAC
THAAC = ode (y = state, times = times, func  = degradation, parms =
parameters) 


This all works fine, and the output THAAC contains the carbon concentration G
at all times t. However, I would like the model to also give me the rates,
i.e. dG for all times, and possibly also the value of -a*(t+i)^b for all
times. Does anyone know if this is possible?
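
A minimal sketch of one way to get this, relying on deSolve's documented
convention that any further named elements returned after the derivatives are
appended as extra columns of the ode() output:

degradation = function (t, state, parameters) {
    with(as.list(c(state, parameters)), {
        coef <- -a*(t+i)^b        # time-dependent rate coefficient
        dG   <- coef*G            # differential equation to be solved
        ## first element: the derivatives; the named elements that follow are
        ## returned by ode() as additional output columns at every time step
        list(c(dG), rate = dG, coef = coef)
    })
}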

Kristoffer Piil

--
View this message in context: 
http://r.789695.n4.nabble.com/deSolve-output-tp3738970p3738970.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing all duplicate row except by one

2011-08-12 Thread m.marcinmichal
Hi,
thanks for the response, it works perfectly. Yes, 't' is not a good variable
name; it will be 'test'. My mistake.

Best 

Marcin M.

--
View this message in context: 
http://r.789695.n4.nabble.com/Removing-all-duplicate-row-except-by-one-tp3736949p3738132.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sapply to bind columns, with repeat?

2011-08-12 Thread Katrina Bennett
Hi R-help,

I am working with US COOP network station data and the files are
concatenated in single rows for all years, but I need to pull these
apart into rows for each day. To do this, I need to extract part of
each row such as station id, year, mo, and repeat this against other
variables in the row (days). My problem is that there are repeated
values for each day, and the files are fixed width field without
order.

Here is an example of just one line of data.

coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062
B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407
00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055
20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007
00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048
B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507
00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067
21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107
00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048
B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607
00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057
22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
coop.dat <- read.fwf("coop.tmp", widths =
c(c(3,8,4,2,4,2,4,3),rep(c(2,2,1,5,1,1),62)), na.strings=c(""),
skip=1, as.is=T)
rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
rep.count <- rep(c(1:62), each=6, 1)
names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo",
"fill", "numval", paste(rep.name, rep.count, sep="_"))

I would like to generate output that contains, in one row, the columns
"id", "elem", "unt", "year", "mo", and "numval". Bound to these
initial columns, I would like only "day_1", "hr_1", "met_1", "dat_1",
"fl1_1", and "fl2_1". Then, in the next row I would like the initial
columns "id", "elem", "unt", "year", "mo", and "numval" repeated and
then "day_2", "hr_2", "met_2", "dat_2", "fl1_2", and "fl2_2" bound on, and
so on until all the data in the row has been allocated. Then, move
onto the next row and repeat.

I think I should be able to do this with some sort of sapply or lapply
function, but I'm struggling with the format for repeating the initial
columns, and then skipping through the next columns.
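
A hedged sketch of that idea (not from the original post): build one block of
rows per day with lapply() and stack the blocks, reusing the column names
constructed above.

fixed <- c("id", "elem", "unt", "year", "mo", "numval")
vars  <- c("day", "hr", "met", "dat", "fl1", "fl2")
coop.long <- do.call(rbind, lapply(1:62, function(i) {
    day.cols <- paste(vars, i, sep = "_")
    block <- cbind(coop.dat[fixed], coop.dat[day.cols])
    names(block) <- c(fixed, vars)   # same names in every block, so rbind works
    block
}))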

Thank you,

Katrina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] EXPORTING VERY LARGE DATASETS FROM R - PLEASE HELP!

2011-08-12 Thread BeckyHolt
Hi, 

I'm trying to export a very large dataset from R after I have performed a
predict function.
I have tried saving the data as an object and then exporting via write.table,
but it is too big to fit into either a txt file or an Excel .csv file... the
following error occurs: "clipboard buffer is full and output lost"

Can anyone help please? 
I need to export all of the data asap; does anyone know the code, or how to
maybe export it to a database like Access?

Any help with this problem would be much appreciated!

Thank you

Becky 

--
View this message in context: 
http://r.789695.n4.nabble.com/EXPORTING-VERY-LARGE-DATASETS-FROM-R-PLEASE-HELP-tp3738718p3738718.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract values from a data frame

2011-08-12 Thread Lali
Hi Ista,
Thanks for your suggestion, I am still trying to wrap my head around the
functions you used, as I am not familiar with any of them, but it works
perfectly!
I do want to understand the code, if you don't mind I would like to ask a
few questions

In this line:
dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
What does the id=1 do? The variables are already specified in df[c(1, 2, 5,
8)], right?
What does this line do:
dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S",
1:length(name), sep = ""))

Thanks again for the help.

Laura




On Thu, Aug 11, 2011 at 5:57 PM, Ista Zahn iz...@psych.rochester.eduwrote:

 Hi Laura,

 On Thu, Aug 11, 2011 at 7:01 AM, Lali laur...@gmail.com wrote:
  Hi everyone,
  I have a data frame that looks *sort of* like this:
 
  name <- letters[1:5]
  signal.1 <- c(12, "bad signal", "noise", 10, "X")
  length.signal.1 <- 5:9
  intensity.signal.1 <- 3:7
  signal.2 <- c(13, "noise", 19.2, "X", "V")
  length.signal.2 <- 2:6
  intensity.signal.2 <- 1:5
  signal.3 <- c(NA, 15.4, "error", NA, 17)
  length.signal.3 <- c(NA, 2, 3, NA, 4)
  intensity.signal.3 <- c(NA, 4, 5, NA, 5)
 
  #(there are actually up to 16 signals and 50 names, but I made this short
  for the example)
 
  df <- data.frame(cbind(name, signal.1, length.signal.1, intensity.signal.1,
                         signal.2, length.signal.2, intensity.signal.2,
                         signal.3, length.signal.3, intensity.signal.3))
 
 
 
  I need to fish out some values and have them in a new data frame.
 
  I am only interested in values in columns 2, 5 and 8 (actually seq(2, 50,
 3)
  in my real df)
  I want the values that are not:
  bad signal
  noise
  error
  NA
  V
 
  This is the output I want (the name column is unimportant for my
 purposes,
  its just there as a reference for the example).
 
   (name)  S1     S2
   A       12     13
   B       15.4   (another value found in the other signals > 3, not shown in
                   the example)
   C       19.2   (another value found in the other signals > 3, not shown in
                   the example)
   D       10     X
   E       X      17
 
  I do know that there will always be 2 values exactly that do not match
 the
  exclusions named above, or none at all
 
  I have tried different approaches, grep, matching,%nin%... But as I am
 not
  an advanced used, I am very likely doing something wrong, because I
 either
  get a vector, or I get a matrix with TRUE FALSE, and usually I get the
 whole
  rows, and I don't want that :(
  I have also being searching the list for answers without avail.
  Any suggestions? Examples including syntax are appreciated (syntax is a
  major weak point for me).

 Here is a solution using the reshape and plyr packages

 library(reshape)
 library(plyr)   # for ddply()
 dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
 dfm.r <- dfm[!dfm$value %in% c("bad signal", "noise", "error", NA, "V"), ]
 dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S",
 1:length(name), sep = ""))
 cast(dfm.r, name ~ index)

 Best,
 Ista
 
 
  Laura
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] value.labels

2011-08-12 Thread zcatav
Hi,
My data.frame as follows;

a   b   c   d
1   58009   2010-11-02  0   NA
2   114761  2011-07-28  1   2008-11-05
3   184440  2011-07-28  1   2009-12-08
4   189372  2011-07-28  0   NA
5   105286  NA  2   NA
6   186717  2011-07-28  0   NA
7   189106  2011-07-28  0   NA
8   127306  2011-07-28  0   NA
9   157342  2011-04-25  0   NA

 library(prettyR)
 add.value.labels(test2$c, "alive", "dead", "missed")
Error in add.value.labels(test2$c, "alive", "dead", "missed") : 
  unused argument(s) ("dead", "missed")

 add.value.labels(test2$c, "alive")
[1] 0 1 1 0 2 0 0 0 0
attr(,"value.labels")
alive    NA    NA 
    0     1     2 

This function allows only one label. How can add second and other labels?

Thanks.
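
The error above suggests that the labels go in as a single vector, so a hedged
guess at the intended call would be:

test2$c <- add.value.labels(test2$c, c("alive", "dead", "missed"))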

--
View this message in context: 
http://r.789695.n4.nabble.com/value-labels-tp3735947p3738244.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] value.labels

2011-08-12 Thread zcatav

Jim Lemon wrote:
 
 On 08/12/2011 12:10 AM, zcatav wrote:
 Hello R people,

 I have a data.frame. Status variable has 3 values. 0-alive, 1-dead
 and
 2-missed
 Status as a factor have correct levels. Levels and labels output as
 follows;

 levels(Adbf$status); labels(Adbf$status)
 [1] 0 1 2
[1] 1   2   3   4   5   6   7   8   9   10
   [11] 11  12  13  14  15  16  17  18  19  20
   [21] 21  22  23  24  25  26  27  28  29  30
   [31] 31  32  33  34  35  36  37  38  39  40
   [41] 41  42  43  44  45  46  47  48  49  50

 644

 Can i add value.labels to status variable? If yes how? Can i see these
 value.labels on results or graphics?

 Hi zcatav,
 There is a convenience function in the prettyR package named 
 add.value.labels that does just that. It is there mostly so that 
 someone with data not converted from SPSS can make them look like data 
 that has been converted from SPSS. I don't know whether the method used 
 will work with functions from any other package, though.
 
 Jim
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Hi,
My data.frame as follows;

   a  b c d
1 58009   2010-11-02 0 NA
2 114761 2011-07-28 1 2008-11-05
3 184440 2011-07-28 1 2009-12-08
4 189372 2011-07-28 0 NA
5 105286 NA  2 NA
6 186717 2011-07-28 0 NA
7 189106 2011-07-28 0 NA
8 127306 2011-07-28 0 NA
9 157342 2011-04-25 0 NA

 library(prettyR)
 add.value.labels(test2$c, "alive", "dead", "missed")
Error in add.value.labels(test2$c, "alive", "dead", "missed") :
  unused argument(s) ("dead", "missed")

 add.value.labels(test2$c, "alive")
[1] 0 1 1 0 2 0 0 0 0
attr(,"value.labels")
alive    NA    NA 
    0     1     2

This function allows only one label. How can add second and other labels?

Thanks. 

--
View this message in context: 
http://r.789695.n4.nabble.com/value-labels-tp3735947p3738370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can the environment of a function be specified when the function is defined?

2011-08-12 Thread Frederic F
Hello, 

Is there a syntax to set the environment of a function when this function is
defined? 

The best I could come up with so far is using a wrapper function:

foo_internals <- function(x) {Code of the function}

foo <- function(x) {
  environment(foo_internals) <- as.environment(target_environment)
  foo_internals(x)
}

But I would like to know if there is a cleaner syntax.

There are two reasons why I would like to have a function defined with an
environment different from the local one:
The first is to develop a function that will go in a package: there might be
more things loaded in my R_GlobalEnv than what will be available to the
function when its environment will be the namespace of the package, so I
would like to restrict right away the environment of the function that I
develop in my local environment to what will be their environment in 'real
life'.  
The other reason would be to have a function written in a package act as if
it was from another package by giving it the environment of this other
package. 

Thanks for your suggestions and comments,

Frederic

--
View this message in context: 
http://r.789695.n4.nabble.com/Can-the-environment-of-a-function-be-specified-when-the-function-is-defined-tp3739482p3739482.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding dependancies?

2011-08-12 Thread Uwe Ligges



On 03.08.2011 09:03, Rainer M Krug wrote:

Hi

although the background is that it happened on an hpc cluster, this question
does *not* concern hpc computing with R.

I was using R on a cluster and had to install several packages in my home
directory. Now the head node was migrated to new hardware (new install as
well) and many dependencies for my in $HOME installed packages, which were
present on the old head, are missing on the new head. Now I could simply try
to run my script, wait for the error message, mail the administrator to
install the package, try again, ... . But this is a tedious process - and I
cannot go to him directly, as I am sitting on a different continent.

So my question: is there an easy way (like ldd for programs), to identify
the dependencies which are not met, and how could I use that on the
packages?

Thanks,

Rainer



1. Identify the packages you are using in your code. I.e. watch out for 
library() and require() calls.



2.

Way A)

Tell the administrator to run
 install.packages(c(packageA, packageB, .), dependencies=TRUE)
and he or she will install all the dependencies (including suggests) in 
one pass.


Way B)

You can identify the dependencies by function package.dependencies() in 
package tools.


Way C)

Same as Way A) but just do it yourself into a private library, if you 
have the permissions.
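
A hedged sketch of Way B, assuming the packages of interest are already
installed in a private library so that their DESCRIPTION metadata can be read
locally (the library path below is made up):

library(tools)
ip   <- installed.packages(lib.loc = "~/R/library")
deps <- package.dependencies(ip, check = FALSE, depLevel = "Depends")
deps[[1]]   # the parsed Depends entries of the first package in 'ip'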




Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting bootstrap statistic to work

2011-08-12 Thread Jean V Adams
Shouldn't the i in your theta() function refer to the selected rows (a 
vector of indices as referred to in the help file for boot) of the data 
used by boot()?

theta <- function(data, i) {
  data <- data[i, ]
  data.cov <- cov(data)
  data.eigen <- eigen(data.cov)
  data.eigen$values[1]/sum(data.eigen$values)
}

Jean

r-help-boun...@r-project.org wrote on 08/12/2011 09:10:52 AM:

 [image removed] 
 
 [R] Getting bootstrap statistic to work
 
 Alex Olssen 
 
 to:
 
 r-help
 
 08/12/2011 09:14 AM
 
 Sent by:
 
 r-help-boun...@r-project.org
 
 Hi R-help,
 
 I am trying to implement a nonparametric bootstrap to find the
 standard errors of a simple statistics - the ratio of two scalars.  I
 am having difficulty getting boot() to work correctly.  I code a
 function to create the ratio of the relevant scalars. theta(data, i).
 When I call the function for my data and every observation appearing
 once, theta(test, c(1)), I get the correct statistic for my original
 data.  However when I use boot(test, theta, 200) the original
 statistic is incorrect.
 
 My code and data are below.  The code is very short.
 
 Any help will be appreciated.
 
 Cheers,
 Alex
 
  library(boot)
  test <- read.csv("test.csv")
  test.mean <- mean(test)
  test.cov <- cov(test)*87/88 ## use the biased version as I am
  reproducing a result from a book
  test.eigen <- eigen(test.cov)
  theta <- function(data, i) {
    data <- data * i
    data.cov <- cov(data)
    data.eigen <- eigen(data.cov)
    data.eigen$values[1]/sum(data.eigen$values)
  }
  test.boot <- boot(test, theta, 200)
 
 test
mec vec alg ana sta
 1   77  82  67  67  81
 2   63  78  80  70  81
 3   75  73  71  66  81
 4   55  72  63  70  68
 5   63  63  65  70  63
 6   53  61  72  64  73
 7   51  67  65  65  68
 8   59  70  68  62  56
 9   62  60  58  62  70
 10  64  72  60  62  45
 11  52  64  60  63  54
 12  55  67  59  62  44
 13  50  50  64  55  63
 14  65  63  58  56  37
 15  31  55  60  57  73
 16  60  64  56  54  40
 17  44  69  53  53  53
 18  42  69  61  55  45
 19  62  46  61  57  45
 20  31  49  62  63  62
 21  44  61  52  62  46
 22  49  41  61  49  64
 23  12  58  61  63  67
 24  49  53  49  62  47
 25  54  49  56  47  53
 26  54  53  46  59  44
 27  44  56  55  61  36
 28  18  44  50  57  81
 29  46  52  65  50  35
 30  32  45  49  57  64
 31  30  69  50  52  45
 32  46  49  53  59  37
 33  40  27  54  61  61
 34  31  42  48  54  68
 35  36  59  51  45  51
 36  56  40  56  54  35
 37  46  56  57  49  32
 38  45  42  55  56  40
 39  42  60  54  49  33
 40  40  63  53  54  25
 41  23  55  59  53  44
 42  48  48  49  51  37
 43  41  63  49  46  34
 44  46  52  53  41  40
 45  46  61  46  38  41
 46  40  57  51  52  31
 47  49  49  45  48  39
 48  22  58  53  56  41
 49  35  60  47  54  33
 50  48  56  49  42  32
 51  31  57  50  54  34
 52  17  53  57  43  51
 53  49  57  47  39  26
 54  59  50  47  15  46
 55  37  56  49  28  45
 56  40  43  48  21  61
 57  35  35  41  51  50
 58  38  44  54  47  24
 59  43  43  38  34  49
 60  39  46  46  32  43
 61  62  44  36  22  42
 62  48  38  41  44  33
 63  34  42  50  47  29
 64  18  51  40  56  30
 65  35  36  46  48  29
 66  59  53  37  22  19
 67  41  41  43  30  33
 68  31  52  37  27  40
 69  17  51  52  35  31
 70  34  30  50  47  36
 71  46  40  47  29  17
 72  10  46  36  47  39
 73  46  37  45  15  30
 74  30  34  43  46  18
 75  13  51  50  25  31
 76  49  50  38  23   9
 77  18  32  31  45  40
 78   8  42  48  26  40
 79  23  38  36  48  15
 80  30  24  43  33  25
 81   3   9  51  47  40
 82   7  51  43  17  22
 83  15  40  43  23  18
 84  15  38  39  28  17
 85   5  30  44  36  18
 86  12  30  32  35  21
 87   5  26  15  20  20
 88   0  40  21   9  14
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can the environment of a function be specified when the function is defined?

2011-08-12 Thread Uwe Ligges
I doubt what you have in mind makes too much sense, but changing the 
environment of a function foo to an environment env can be done using


environment(foo) <- env

Uwe Ligges
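
A small illustration of that assignment (the environment, the variable k and
the function foo here are made-up examples, not from the original post):

env <- new.env()
env$k <- 10
foo <- function(x) x + k      # 'k' is a free variable in foo
environment(foo) <- env       # free variables are now looked up in 'env'
foo(1)                        # returns 11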



On 12.08.2011 16:45, Frederic F wrote:

Hello,

Is there a syntax to set the environment of a function when this function is
defined?

The best I could come up with so far is using a wrapper function:

foo_internals <- function(x) {Code of the function}

foo <- function(x) {
   environment(foo_internals) <- as.environment(target_environment)
   foo_internals(x)
}

But I would like to know if there is a cleaner syntax.

There are two reasons why I would like to have a function defined with an
environment different then the local one:
The first is to develop a function that will go in a package: there might be
more things loaded in my R_GlobalEnv than what will be available to the
function when its environment will be the namespace of the package, so I
would like to restrict right away the environment of the function that I
develop in my local enviroment to what will be their environment in 'real
life'.
The other reason would be to have a function written in a package act as if
it was from another package by giving it the environment of this other
package.

Thanks for your suggestions and comments,

Frederic

--
View this message in context: 
http://r.789695.n4.nabble.com/Can-the-environment-of-a-function-be-specified-when-the-function-is-defined-tp3739482p3739482.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] install packages from intranet

2011-08-12 Thread Uwe Ligges



On 12.08.2011 11:40, Peter Aberline wrote:


Hi,

I'm new to R. Apologies if this is a simple query, I've searched the mailing 
lists and docs but can't find a solution to my problem.

I'm trying to make some packages available on our intranet. During development 
the 'intranet' is a webserver running on localhost.

* When I call install.packages I get a mesage about not being able to access 
'index for repository'.
* The directories are viewable and can be seen through a web browser.
* Web server is IIS running on Winows 7.
* I've used the same paths on the web server as where the packages are located 
in the CRAN mirror's.
* I've tried setting setInternet2(TRUE), which was already set.



r <- getOption("repos");
r["CRAN"] = "http://localhost";
r["CRANextra"] = "http://localhost/pub/RWin";
options(repos=r)
r

                      CRAN                   CRANextra
        "http://localhost"  "http://localhost/pub/RWin"


install.packages("abind")

Warning: unable to access index for repository http://localhost/bin/windows/cont
rib/2.13
Warning: unable to access index for repository http://localhost/pub/RWin/bin/win
dows/contrib/2.13
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
   package 'abind' is not available (for R version 2.13.1)




OR


install.packages("abind", 
contriburl="http://localhost/bin/windows/contrib/2.13")

Warning: unable to access index for repository 
http://localhost/bin/windows/contrib/2.13
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
   package ‘abind’ is not available (for R version 2.13.1)

OR


   install.packages("abind", 
contriburl="http://localhost/bin/windows/contrib/2.13/abind_1.3-0.zip")

Warning: unable to access index for repository 
http://localhost/bin/windows/contrib/2.13/abind_1.3-0.zip
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
   package ‘abind’ is not available (for R version 2.13.1)




Can anyone give me any clues as to what I'm doing wrong? Does R need some kind 
of index file to map between the name 'abind' and the zip filename?



Right, the PACKAGES file. You will find it in any repository.
If you only have a selection of packages on your local server, you can 
use write_PACKAGES() from the tools package to generate your own.
Anyway, if you are going to make a local repository available: does it perhaps 
make more sense to just install all of the packages into a library that can be 
accessed from all the machines you were aiming at with the local repository?
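
For example, a sketch of generating the index on the machine that hosts the
IIS webroot, assuming the .zip files already sit in the 2.13 contrib folder
(the path below is made up):

library(tools)
write_PACKAGES("C:/inetpub/wwwroot/bin/windows/contrib/2.13",
               type = "win.binary")   # writes PACKAGES and PACKAGES.gz there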


Uwe Ligges






Many thanks
Peter.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] EXPORTING VERY LARGE DATASETS FROM R - PLEASE HELP!

2011-08-12 Thread Uwe Ligges



On 12.08.2011 12:23, BeckyHolt wrote:

Hi,

Im trying to export a very large dataset from R after i have perfomed a
predict function.
I have tried saving the data as an object and then exporting via write.table
but it is too big to fit into both a txt file and an excel.csv file...the
following error occyrs : clipboard buffer is full and output lost

Can anyone help please???
I need to export all of the data asap, does anyone know the code or how to
maybe export it to a database like access??



For database access see the R Data Import/Export manual.
Please do read the posting guide and do not shout.

Uwe Ligges




Any help with this problem would be much appreciated!

Thank you

Becky

--
View this message in context: 
http://r.789695.n4.nabble.com/EXPORTING-VERY-LARGE-DATASETS-FROM-R-PLEASE-HELP-tp3738718p3738718.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lattice panel.abline use

2011-08-12 Thread jjap
Dear R-users,

I am unsuccessful in trying to add an horizontal line to all graphs in the
example below:

library(lattice)
val <- runif(15)
x <- rep(seq(1:5),3)
type <- c(rep("a",5), rep("b",5), rep("c",5))

xyplot(val ~ x | type, panel.abline(h=.6))

Any hints are appreciated.
Best regards,

---Jean



--
View this message in context: 
http://r.789695.n4.nabble.com/lattice-panel-abline-use-tp3739693p3739693.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)

2011-08-12 Thread Dimitri Liakhovitski
Hello!

I have a data frame mysample (sorry for a long way of creating it
below - but I need it in this form, and it works). I regress Y onto X1
through X11 - first without weights, then with weights:

regtest1 <- lm(Y~., data=mysample[-13])
regtest2 <- lm(Y~., data=mysample[-13], weights=mysample$weight)
summary(regtest1)
summary(regtest2)

Then I calculate Durbin-Watson for both regressions using 2 different packages:

library(car)
library(lmtest)

durbinWatsonTest(regtest1)[2]
dwtest(regtest1)$stat

durbinWatsonTest(regtest2)[2]
dwtest(regtest2)$stat

When there are no weights, the Durbin-Watson statistic is the same.
But when there are weights, the two packages give different Durbin-Watson
statistics. Does anyone know why?
Also, it's interesting that both of them are also different from what
SPSS spits out...

Thank you!
Dimitri



### Run the whole code below to create mysample:

intercor <- 0.3   # intercorrelation among all predictors
k <- 10           # number of predictors
sigma <- matrix(intercor,nrow=k,ncol=k) # matrix of intercorrelations among predictors
diag(sigma) <- 1

require(mvtnorm)
set.seed(123)
mypop <- as.data.frame(rmvnorm(n=10, mean=rep(0,k), sigma=sigma,
method="chol"))
names(mypop) <- paste("x",1:k,sep="")
set.seed(123)
mypop$x11 <- sample(c(0,1),10,replace=T)

set.seed(123)
betas <- round(abs(rnorm(k+1)),2) # desired betas
Y <- as.matrix(mypop) %*% betas
mypop <- cbind(mypop, Y)
rSQR <- .5
VARofY <- mean(apply(as.data.frame(mypop$Y),2,function(x){x^2})) -
mean(mypop$Y)^2
mypop$Y <- mypop$Y + rnorm(10, mean=0, sd=sqrt(VARofY/rSQR-VARofY))

n <- 200
set.seed(123)
cases.for.sample <- sample(10,n,replace=F)
mysample <- mypop[cases.for.sample,]
mysample <- cbind(mysample[k+2],mysample[1:(k+1)])  #dim(sample)
weight <- rep(1:10,20); weight <- weight[order(weight)]
mysample$weight <- weight



-- 
Dimitri Liakhovitski
marketfusionanalytics.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Splitting data

2011-08-12 Thread R. Michael Weylandt
Hmmm, interesting problem. I now see your motivation for not using a data
frame set up in the split code: try this --

DataSplits <- function(Data, alpha = 0.05) {
    DataSplitsCore <- function(Data1, Data2, alpha, level) {
        tt <- t.test(Data1, Data2)
        print(tt)
        if (tt$p.value > alpha) {
            print(paste("Stopped at level", level))
            return(invisible(TRUE))
        } else {
            nr1 = floor(NROW(Data1)/2); nr2 = floor(NROW(Data2)/2)
            if (nr == 1) {print(paste("Reached Samples of Size 1")); stop}
            d1 = DataSplitsCore(Data1[(1:nr1)], Data2[(1:nr2)], alpha =
                alpha, level = level + 1)
            if (d1) return(invisible(TRUE))
            d2 = DataSplitsCore(Data1[-(1:nr1)], Data2[-(1:nr2)], alpha =
                alpha, level = level + 1)
            if (d2) return(invisible(TRUE))
            return(invisible(FALSE))
        }
    }
    DataSplitsCore(Data, alpha = alpha, level = 1)
}

By the way, would you rather this returned the data rather than the depth
the search got to?

I don't know much about the subject (gas pressures or outlier
identification), so I'll just spit-ball some other filtering techniques:

1) Since it's a time series that changes over time, do some sort of rolling
z-score filter: i.e., if you are more than 2 sigma out, toss the data until
it stabilizes. Can also do this non-parametrically with means and MADs. (Not
wild about this one, but like I said: spitballing)

2) Subsample at random and throw out outliers until you have a fairly stable
data set.

3) Use a trimmed mean or median instead of a mean for robustness in your
analysis.

4) Filter between some a priori bands: 2e5 to 5e5?

5) Check ROC between points and make sure that's small.

Would these work for this sort of reading? None would be hard in R, and most
would probably be faster than the splitting technique.
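
A hedged sketch of idea 1) above, a rolling z-score filter; 'pres' stands for
the raw pressure vector and the window width 'w' is a placeholder that would
need tuning:

library(zoo)
p  <- zoo(pres)
w  <- 50
mu <- rollapply(p, w, mean, align = "right", fill = NA)
sg <- rollapply(p, w, sd,   align = "right", fill = NA)
stable <- abs(p - mu) / sg < 2     # within 2 rolling standard deviations
p.stable <- p[which(stable)]       # keep only the quiet stretches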

Looking at the data, I'm assuming that we more or less hope to isolate 5
stable levels: the first one with the widest variance, 2 and 3 where it gets
higher and then comes back down, the long 4 and then 5 after that. I.e.,
just to confirm those little drops should be discarded in their entirety.

Michael Weylandt


On Fri, Aug 12, 2011 at 11:19 AM, Marina de Wolff marinadewo...@hotmail.com
 wrote:

  I understand the confusion; I hope I can clarify this. The problem you
 are refering to will not apply for my data I think.

 My data consists of 40.000 points, of pressure at a gaswell (vs time). I
 included a picture.
 The problem with this data set is that only datapoints in a 'stable'
 situation are reliable.
 Therefore the dataset needs to be filtered before it is useable. I'm trying
 different ideas to fulfill that goal.

 I already used breakpoints and some sort of steady state detection with
 moving variance.
 An other idea would be to split the data in half, compare with each other
 and if both groups (first half of the data and second half of the data)
 significantly differ split again and compare both left groups with each
 other and both right groups with each other etc.
 As a final result I would have different groups with different lengths (I
 hope), and only use the groups with a minimum size of m.

 Many thanks in advance for your assistance in this.

 Sincerely,

 Marina de Wolff


  --
 From: michael.weyla...@gmail.com
 Date: Fri, 12 Aug 2011 09:35:00 -0400

 Subject: Re: [R] Splitting data
 To: marinadewo...@hotmail.com
 CC: r-help@r-project.org

 Yes, that likely is the source of the difference: I'm happy to help fix it
 up (won't be hard), but I want to clarify exactly how you want the data
 done:

 say we have 20 variables x = 1:20 if there's a split we go to 1:10, 11:20;
 then 1:5, 6:10, 11:15,16:20 etc

 but what about situations with very different data sets:

 x = cbind(1:20, 1:7)
 one split takes us to where exactly: cbind( c(1:10, 11:20), c(1:3,1:4)) or
 cbind( c(1:10,11:20), c(1:4,5:7)) and then what of the next iteration?

 More generally, what exactly are you comparing? It seems odd to have two
 different categories/samples and to compare their means and then to switch
 gears entirely to compare subsamples of the categories independently. It
 seems that they are just different inferences: comparing the average of cats
 vs dogs and then comparing boy cats vs girl cats and boy dogs vs girl dogs.
 That winds up highlighting different independent variables. (Iteration one:
 species -- iteration two: gender)

 If you could speak a little more about your data, it'd be easier to do the
 splits in a meaningful way.

 As currently implemented, my code takes a 2d data frame and simply divides
 it into the top and bottom halves, which in most applications would
 corresponding to doing a mean-comparison calculation for different
 statistics of the same observation. The subsetting then keeps
 corresponding data together -- I put corresponding in parentheses because
 we aren't doing paired t-tests.

 Looking forward to your reply,

 

Re: [R] Splitting data

2011-08-12 Thread R. Michael Weylandt
Sorry -- missed a tweak. The call to DataSplitsCore should read

nr = floor(NROW(Data)/2); Data1 = Data[(1:nr)]; Data2 = Data[-(1:nr)];
DataSplitsCore(Data1,Data2,alpha,level)

On Fri, Aug 12, 2011 at 11:46 AM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Hmmm, interesting problem. I now see your motivation for not using a data
 frame set up in the split code: try this --

 DataSplits <- function(Data, alpha = 0.05) {
     DataSplitsCore <- function(Data1, Data2, alpha, level) {
         tt <- t.test(Data1, Data2)
         print(tt)
         if (tt$p.value > alpha) {
             print(paste("Stopped at level", level))
             return(invisible(TRUE))
         } else {
             nr1 = floor(NROW(Data1)/2); nr2 = floor(NROW(Data2)/2)
             if (nr == 1) {print(paste("Reached Samples of Size 1")); stop}
             d1 = DataSplitsCore(Data1[(1:nr1)], Data2[(1:nr2)], alpha =
                 alpha, level = level + 1)
             if (d1) return(invisible(TRUE))
             d2 = DataSplitsCore(Data1[-(1:nr1)], Data2[-(1:nr2)], alpha =
                 alpha, level = level + 1)
             if (d2) return(invisible(TRUE))
             return(invisible(FALSE))
         }
     }
     DataSplitsCore(Data, alpha = alpha, level = 1)
 }

 By the way, would you rather this returned the data rather than the depth
 the search got to?

 I don't know much about the subject (gas pressures or outlier
 identification), so I'll just spit-ball some other filtering techniques:

 1) Since it's a time series that changes over time, do some sort of rolling
 z-score filter: i.e., if you are more than 2 sigma out, toss the data until
 it stabilizes. Can also do this non-parametrically with means and MADs. (Not
 wild about this one, but like I said: spitballing)

 2) Subsample at random and throw out outliers until you have a fairly
 stable data set.

 3) Use a trimmed mean or median instead of a mean for robustness in your
 analysis.

 4) Filter between some a priori bands: 2e5 to 5e5?

 5) Check ROC between points and make sure that's small.

 Would these work for this sort of reading? None would be hard in R, and
 most would probably be faster than the splitting technique.

 Looking at the data, I'm assuming that we more or less hope to isolate 5
 stable levels: the first one with the widest variance, 2 and 3 where it gets
 higher and then comes back down, the long 4 and then 5 after that. I.e.,
 just to confirm those little drops should be discarded in their entirety.

 Michael Weylandt



 On Fri, Aug 12, 2011 at 11:19 AM, Marina de Wolff 
 marinadewo...@hotmail.com wrote:

  I understand the confusion, I hope I can clearify this. The problem you
 are refering to will not apply for my data I think.

 My data consists of 40.000 points, of pressure at a gaswell (vs time). I
 included a picture.
 The problem with this data set is that only datapoints in a 'stable'
 situation are reliable.
 Therefore the dataset needs to be filtered before it is useable. I'm
 trying different ideas to fulfill that goal.

 I already used breakpoints and some sort of steady state detection with
 moving variance.
 An other idea would be to split the data in half, compare with each other
 and if both groups (first half of the data and second half of the data)
 significantly differ split again and compare both left groups with each
 other and both right groups with each other etc.
 As a final result I would have different groups with different lengths (I
 hope), and only use the groups with a minimum size of m.

 Many thanks in advance for your assitance in this.

 Sincerely,

 Marina de Wolff


  --
 From: michael.weyla...@gmail.com
 Date: Fri, 12 Aug 2011 09:35:00 -0400

 Subject: Re: [R] Splitting data
 To: marinadewo...@hotmail.com
 CC: r-help@r-project.org

 Yes, that likely is the source of the difference: I'm happy to help fix it
 up (won't be hard), but I want to clarify exactly how you want the data
 done:

 say we have 20 variables x = 1:20 if there's a split we go to 1:10, 11:20;
 then 1:5, 6:10, 11:15,16:20 etc

 but what about situations with very different data sets:

 x = cbind(1:20, 1:7)
 one split takes us to where exactly: cbind( c(1:10, 11:20), c(1:3,1:4)) or
 cbind( c(1:10,11:20), c(1:4,5:7)) and then what of the next iteration?

 More generally, what exactly are you comparing? It seems odd to have two
 different categories/samples and to compare their means and then to switch
 gears entirely to compare subsamples of the categories independently. It
 seems that they are just different inferences: comparing the average of cats
 vs dogs and then comparing boy cats vs girl cats and boy dogs vs girl dogs.
 That winds up highlighting different independent variables. (Iteration one:
 species -- iteration two: gender)

 If you could speak a little more about your data, it'd be easier to do the
 splits in a meaningful way.

 As currently implemented, my code takes a 2d data frame and simply divides
 it into the top and bottom 

Re: [R] lattice panel.abline use

2011-08-12 Thread Mikhail Titov
I hope this helps:

xyplot(val ~ x | type,
   panel=function(...) {
   panel.xyplot(...)
   panel.abline(h=.6)
   })

Mikhail

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
 Behalf Of jjap
 Sent: Friday, August 12, 2011 10:41 AM
 To: r-help@r-project.org
 Subject: [R] lattice panel.abline use
 
 Dear R-users,
 
 I am unsuccessful in trying to add an horizontal line to all graphs in the
 example below:
 
 library(lattice)
  val <- runif(15)
  x <- rep(seq(1:5),3)
  type <- c(rep("a",5), rep("b",5), rep("c",5))
 
 xyplot(val ~ x | type, panel.abline(h=.6))
 
 Any hints are appreciated.
 Best regards,
 
 ---Jean
 
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/lattice-panel-
 abline-use-tp3739693p3739693.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] recode Variable in dependence of values of two other variables

2011-08-12 Thread Julia Moeller

Hi,

as an R-beginner, I have a recoding problem and hope you can help me:

I am working on an SPSS dataset, which I loaded into R with load("C:/...").

I have 2 existing variables, ID and X,
and one variable to be computed: meanX.dependID (= the mean of X over all rows 
in which ID has the same value).


ID = subject ID.  Since it is a longitudinal dataset, there are repeated 
measurement points for each subject, each of which appears in a new row. 
So, each ID value appears in many rows (e.g. ID == 1 in rows 1:5, ID == 2 
in rows 6:8, etc.).



Now: for all rows in which ID has a certain value, meanX.dependID shall 
be the mean of X for these rows. How can I automate that, without 
having to specify the row numbers each time?


e.g.


ID   X   meanX.dependID
1    2   2.25
1    3   2.25
1    1   2.25
1    3   2.25
2    5   3.3
2    2   3.3
2    3   3.3
3    4   3
3    1   3
3    2   3
3    3   3
3    4   3
3    5   3


Thanks a lot! Hope this is the right place to post, if not, please tell me!
best,
Julia
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Standard error bars on bar plots

2011-08-12 Thread Christopher Crooks
Hi,
I know there have been numerous posts about this but I am unable to find one, 
or at least carry out one, that gives me the plot I want. 
I have managed to add the error bars to the plot, but they end up not aligned 
with the centre of the bars themselves. 

Here is my script: 


means <- c(0.13528, 0.082829167, 0.2757625)
SE <- c(0.036364714, 0.039582896, 0.06867608)
halfSE <- c(0.018182357, 0.019791448, 0.03433804)

barx <- barplot(means, main="Proportion of time spent in AMU sector",
                xlab="Treatments", ylab="Proportion of time",
                names.arg=c("Solitary", "Size-matched conspecific",
                            "Sub-adult conspecific"),
                cex.names=0.85, axis.lty=1, ylim=c(0,0.4))
library(gplots)
plotCI(x=means, uiw=halfSE, lty=1, gap=0, add=TRUE)

Thanks for any suggestions or help you may have,
CJ
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rollapply.zoo() with na.rm=TRUE

2011-08-12 Thread Giles
Hi.

I'm comparing output from rollapply.zoo, as produced by two versions
of R and package zoo.  I'm illustrating with an example from a R-help
posting 'Zoo - bug ???' dated 2010-07-13.

My question is not about the first version, or the questions raised in
that posting, because the behaviour is as documented.  I'm puzzled as
to why na.rm no longer is passed to mean, i.e. why element 2 is NA and
not 1.5 when na.rm=TRUE, as it was before.

The first example, where na.rm is not specified, and which now behaves
more as one might expect prior to carefully reading the documentation,
is also different from before.

This is not specific to mean(), similar behaviour is shown for e.g. sum().

Have I misunderstood the documentation?  Is there a way to reproduce
the old behaviour with na.rm=TRUE?

Thanks.

Giles

Version 1 --

R version 2.12.0 (2010-10-15)
Platform: i386-pc-mingw32/i386 (32-bit)
[27] zoo_1.6-4

 a <- zoo(c(NA,1:9),1:10)

 rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA NA NA NA NA NA NA NA
 rollapply(a,FUN=mean,width=3, na.rm = FALSE)
 2  3  4  5  6  7  8  9
NA  2  3  4  5  6  7  8
 rollapply(a,FUN=mean,width=3, na.rm = TRUE)
  2   3   4   5   6   7   8   9
1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0


Version 2 --

R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)
[25] zoo_1.7-2

 a <- zoo(c(NA,1:9),1:10)

 rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA  2  3  4  5  6  7  8
 rollapply(a,FUN=mean,width=3, na.rm = FALSE)
 2  3  4  5  6  7  8  9
NA  2  3  4  5  6  7  8
 rollapply(a,FUN=mean,width=3, na.rm = TRUE)
 2  3  4  5  6  7  8  9
NA  2  3  4  5  6  7  8
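
A hedged workaround (not part of the original post): passing an anonymous
function, so that na.rm is applied inside the function rather than relying on
it being forwarded through '...':

library(zoo)
a <- zoo(c(NA, 1:9), 1:10)
rollapply(a, width = 3, FUN = function(x) mean(x, na.rm = TRUE))
##   2   3   4   5   6   7   8   9
## 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0   (the old na.rm = TRUE result)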

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plotting and quantiles

2011-08-12 Thread Mark D.
Dear R users,

This is most likely a very basic question, but I am new to R and would really 
appreciate some tips on these two problems.


1) I need to plot variables from a data frame. Because of a few high numbers 
my graph looks really strange. How could I plot a fraction of the samples 
(like 0.1 (10%), 0.2, up to for example 0.6) on the x axis and value 'boundaries' 
(like any value '< 100', '101-200' and '> 201') on the y axis? This needs to 
be a simple line plot. The values would come from one column.


2) I have a data frame with values and need to subset the rows based on the 
values. I wanted to order them (by increasing value) and divide them into 3-4 
groups. I thought about using quantile, but I want the groups to be something like 
'1-25', '26-50', '51-75', '76-100'. I could just look for the median, divide into 
two and then again (or use quantiles 0.25, 0.5, 0.75 and 1 and then get rid of 
all rows in 0.25 that are in 0.5, etc.), but surely there must be a faster and 
simpler way to do that (I need to do this a lot on different columns)?
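
A sketch for question 2, assuming 'dat$v' stands for the column in question:
cut() with quantile() breakpoints does the ordering and grouping in one step.

qs <- quantile(dat$v, probs = seq(0, 1, 0.25), na.rm = TRUE)
dat$grp <- cut(dat$v, breaks = qs, include.lowest = TRUE,
               labels = c("1-25", "26-50", "51-75", "76-100"))
groups <- split(dat, dat$grp)   # a list with one data frame per quartile group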

Thanks for your help,
Mark

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem in asreml function in wgaim package

2011-08-12 Thread ram basnet
Dear R users,
 
I am trying to use the wgaim package for QTL analysis using a mixed model approach, 
but I am stuck with the asreml function while using the wgaim package.
Do I need a separate package to activate the asreml function besides the wgaim 
package?
 
If so, I tried to download the ASReml-R package (I guess this is the right package 
for the asreml function) but I could not find it. Where can I find this package to 
download?
Is this a licensed package?
 
I believe some of you can suggest a way to use the wgaim package or recommend a 
similar package?
 
Thanks in advance.
 
Regards,
Ram Kumar Basnet
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Standard error bars on bar plots

2011-08-12 Thread Uwe Ligges



On 12.08.2011 18:43, Christopher Crooks wrote:

Hi,
I know there have been numerous posts about this but I am unable to find one, 
or at least carry out one, that gives me the plot I want.
I have managed to add the error bars to the plot, but they end up not aligned 
with the centre of the bars themselves.

Here is my script:


means <- c(0.13528, 0.082829167, 0.2757625)
SE <- c(0.036364714, 0.039582896, 0.06867608)
halfSE <- c(0.018182357, 0.019791448, 0.03433804)

barx <- barplot(means, main="Proportion of time spent in AMU sector",
                xlab="Treatments", ylab="Proportion of time",
                names.arg=c("Solitary", "Size-matched conspecific",
                            "Sub-adult conspecific"),
                cex.names=0.85, axis.lty=1, ylim=c(0,0.4))
library(gplots)
plotCI(x=means, uiw=halfSE, lty=1, gap=0, add=TRUE)



plotCI(x=barx, y=means,uiw=halfSE,lty=1,gap=0,add=TRUE)

Uwe Ligges


Thanks for any suggestions or help you may have,
CJ
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode Variable in dependence of values of two other variables

2011-08-12 Thread Mikhail Titov
?aggregate

aggregate(X~ID, your.data.frame.goes.here, mean)

Mikhail
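
If the per-row column from the example (the group mean repeated on every row
with the same ID) is what is wanted, ave() gives that directly; 'dat' below is
a placeholder for the data frame:

dat$meanX.dependID <- ave(dat$X, dat$ID, FUN = mean)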


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
 Behalf Of Julia Moeller
 Sent: Friday, August 12, 2011 10:10 AM
 To: r-help@r-project.org
 Subject: [R] recode Variable in dependence of values of two other
variables
 
 Hi,
 
 as an R-beginner, I have a recoding problem and hope you can help me:
 
 I am working on a SPSS dataset, which I loaded into R (load(C:/...)
 
 I have  2 existing Variables: ID and X , and one variable to be
 computed: meanX.dependID (=mean of X for all rows in which ID has the same
 value)
 
 ID = subject ID.  Since it is a longitudinal dataset, there are repeated
 measurement points for each subject, each of which appears in a new row.
 So, each ID value appears in many rows. (e.g. ID ==1 in row 1:5; ID ==2 in
 rows 6:8 etc).
 
 
 Now: For all rows, in which ID has a certain value, meanX.dependID shall
be
 the mean of X in for these rows. How can I automatisize that, without
 having to specify the number of the rows each time?
 
 e.g.
 
 
 IDXmeanX.dependID
 122.25
 132.25
 112.25
 132.25
 253.3
 223.3
 233.3
 343
 313
 323
 333
 343
 353
 
 
 Thanks a lot! Hope this is the right place to post, if not, please tell
me!
 best,
 Julia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem in asreml function in wgaim package

2011-08-12 Thread Uwe Ligges



On 12.08.2011 18:58, ram basnet wrote:

Dear R users,

I am trying to use wgaim package for QTL analysis using mixed model approach. But i am stuck with 
asreml function while using wgaim package.
Do i need a separate package to activate asreml function beside wgaim 
package ?

If so, i tried to download Asreml-R package (i guess this is right package for 
asreml funtion) but i could not find it. Where can i find this package to download ?
Is this licensed package ?

I believe some of you can suggest way to use wgaim package or, recommend 
similar package?



When loading the package it tells us

 library(wgaim)
Loading required package: qtl
Loading required package: lattice
ASReml-R needs to be installed before this package can be used.

Please visit http://www.vsni.co.uk/products/asreml/ for more information.

So go ahead.

Best,
Uwe Ligges





Thanks in advance.

Regards,
Ram Kumar Basnet
[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Standard error bars on bar plots

2011-08-12 Thread Mikhail Titov
Or ?barplot2

barplot2(means, main="Proportion of time spent in AMU sector", xlab="Treatments",
         ylab="Proportion of time",
         names.arg=c("Solitary","Size-matched conspecific","Sub-adult conspecific"),
         cex.names=0.85, axis.lty=1, ylim=c(0,0.4),
         plot.ci=TRUE, ci.l=means, ci.u=means+halfSE)

Mikhail
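A self-contained sketch combining the two suggestions in this thread: barplot()
returns the bar midpoints, which is what plotCI() needs for x. The numbers are the
ones posted below.

library(gplots)
means  <- c(0.13528, 0.082829167, 0.2757625)
halfSE <- c(0.018182357, 0.019791448, 0.03433804)
barx <- barplot(means, ylim=c(0, 0.4))                 # barx holds the bar centres
plotCI(x=barx, y=means, uiw=halfSE, lty=1, gap=0, add=TRUE)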


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
 Behalf Of Uwe Ligges
 Sent: Friday, August 12, 2011 12:04 PM
 To: Christopher Crooks
 Cc: r-help@r-project.org
 Subject: Re: [R] Standard error bars on bar plots
 
 
 
 On 12.08.2011 18:43, Christopher Crooks wrote:
  Hi,
  I know there have been numerous posts about this but I am unable to find
 one, or at least carry out one, that gives me the plot I want.
  I have managed to add the error bars to the plot, but they end up not
 aligned with the centre of the bars themselves.
 
  Here is my script:
 
 
   means <- c(0.13528, 0.082829167, 0.2757625)
   SE <- c(0.036364714, 0.039582896, 0.06867608)
   halfSE <- c(0.018182357, 0.019791448, 0.03433804)
  
   barx <- barplot(means, main="Proportion of time spent in AMU sector", xlab="Treatments",
                   ylab="Proportion of time",
                   names.arg=c("Solitary","Size-matched conspecific","Sub-adult conspecific"),
                   cex.names=0.85, axis.lty=1, ylim=c(0,0.4))
   library(gplots)
   plotCI(x=means, uiw=halfSE, lty=1, gap=0, add=TRUE)
  
  
  plotCI(x=barx, y=means, uiw=halfSE, lty=1, gap=0, add=TRUE)
 
 Uwe Ligges
 
  Thanks for any suggestions or help you may have,
  CJ
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Applying Weights In RCommander

2011-08-12 Thread Simon Kiss
Dear Colleagues,
Do any R plugins handle complex sampling procedures? I know that the survey
package is probably the best option from the command line, and the standard linear
model can handle it in R Commander, but I'd also like to be able to show students
how to apply weights when doing simple descriptive statistics in R Commander.
Yours, Simon Kiss
*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem in asreml function in wgaim package

2011-08-12 Thread ram basnet
Dear Uwe Ligges,
 
Thanks for your prompt response.
I visited website link that you mentioned. It seems it needs license though it 
is R package.
Am i correct ?
 
Thanks
 
Regards,
 
Ram Kumar Basnet
 

From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: ram basnet basnet...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Friday, August 12, 2011 7:13 PM
Subject: Re: [R] problem in asreml function in wgaim package



On 12.08.2011 18:58, ram basnet wrote:
 Dear R users,

 I am trying to use wgaim package for QTL analysis using mixed model 
 approach. But i am stuck with asreml function while using wgaim package.
 Do i need a separate package to activate asreml function beside wgaim 
 package ?

 If so, i tried to download Asreml-R package (i guess this is right package 
 for asreml funtion) but i could not find it. Where can i find this package 
 to download ?
 Is this licensed package ?

 I believe some of you can suggest way to use wgaim package or, recommend 
 similar package?


When loading the package it tells us

 library(wgaim)
Loading required package: qtl
Loading required package: lattice
ASReml-R needs to be installed before this package can be used.

Please visit http://www.vsni.co.uk/products/asreml/ for more information.

So go ahead.

Best,
Uwe Ligges




 Thanks in advance.

 Regards,
 Ram Kumar Basnet
     [[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting and quantiles

2011-08-12 Thread Daniel Malter
Q1 is very opaque because you are not even saying what kind of plot you want.
For a regular scatterplot, you have multiple options.

a.) select only the data in the given intervals and plot the data

b.) plot the entire data, but restrict the graph region to the intervals you
are interested in, or

c.) winsorize the data (i.e., set values below the lower cutoff and above
the upper cutoff to the cutoff values)

Which one you want to do depends on which one makes the most sense given the
purpose of your analysis.

Say:

x <- rnorm(100)
y <- x + rnorm(100)

Then

a.) plot(y~x, data=data.frame(x,y)[x < 2 & x > -2, ])
#plots y against x only for xs between -2 and 2

b.) plot(y~x,xlim=c(-2,2))

#plots all y against x, but restricts the plotting region to -2 to 2 on the
x-axis

c.)

x <- replace(x, x > 2, 2)
x <- replace(x, x < -2, -2)
plot(y~x)

#sets all x-values below -2 and above 2 to these cutoffs



Q2: look at the cut() function.

?cut
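A minimal sketch of what cut() gives you for Q2, using quantile() breaks for four
equal-count groups (the column v and the group labels are hypothetical):

set.seed(1)
v <- rexp(20)
grp <- cut(v, breaks = quantile(v, probs = seq(0, 1, 0.25)),
           include.lowest = TRUE, labels = c("1-25", "26-50", "51-75", "76-100"))
table(grp)        # group sizes
split(v, grp)     # rows of a data frame can be split the same way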

HTH,
Daniel



Mark D. wrote:
 
 Dear R users,
 
 This is most likely very basic question but I am new to R and would really
 appreciate some tips on those two problems.
 
 
 1) I need to plot variables from a data frame. Because of some few high
 numbers my graph is really strange looking. How could I plot a fraction of
 the samples (like 0.1 (10%), 0.2 up to for example 0.6) on x axis and
 values 'boundaries' (like any value '< 100', '101-200' and '> 201') on
 the y axis? This needs to be a simple line plot. The values would come
 from one column.
 
 
 2) I have a data frame with values and need to subset the rows based on
 the values. I wanted to order them (with increasing values) and divide
 into 3-4 groups. I though about using quantile but I want the group to be
 something like '1-25', '26-50', '51-75', '75-100'. I could just look for a
 median divide into two and then again (or use quantiles 0.25, 0.5, 0.7 and
 1 and then get rid of all rows in 0.25 that are in 0.5 etc) but surely
 there must by a faster and simpler way to do that (I need to do this a lot
 on different columns)?
 
 Thanks for your help,
 Mark
 
   [[alternative HTML version deleted]]
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-and-quantiles-tp3739905p3739958.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rollapply.zoo() with na.rm=TRUE

2011-08-12 Thread Gabor Grothendieck
On Fri, Aug 12, 2011 at 11:47 AM, Giles giles.heyw...@cantab.net wrote:
 Hi.

 I'm comparing output from rollapply.zoo, as produced by two versions
 of R and package zoo.  I'm illustrating with an example from a R-help
 posting 'Zoo - bug ???' dated 2010-07-13.

 My question is not about the first version, or the questions raised in
 that posting, because the behaviour is as documented.  I'm puzzled as
 to why na.rm no longer is passed to mean, i.e. why element 2 is NA and
 not 1.5 when na.rm=TRUE, as it was before.

 The first example, where na.rm is not specified, and which now behaves
 more as one might expect prior to carefully reading the documentation,
 is also different from before.

 This is not specific to mean(), similar behaviour is shown for e.g. sum().

 Have I misunderstood the documentation?  Is there a way to reproduce
 the old behaviour with na.rm=TRUE?

This is a bug. Its fixed in the development version.

Get the entire development version or just that one file:

library(zoo)
source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?root=zoo")
rollapply(a, FUN = mean, width = 3, na.rm = TRUE)

or use this workaround:
rollapply(a, FUN = function(x) mean(x, na.rm = TRUE),  width = 3)
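A self-contained illustration of the workaround; the series `a` below is only a
stand-in for the one in the 2010 posting, which is not reproduced here:

library(zoo)
a <- zoo(c(1, NA, 2, 3, 4, 5))   # small series with one NA
rollapply(a, FUN = function(x) mean(x, na.rm = TRUE), width = 3)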


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem in asreml function in wgaim package

2011-08-12 Thread Uwe Ligges



On 12.08.2011 19:21, ram basnet wrote:

Dear Uwe Ligges,
Thanks for your prompt response.
I visited website link that you mentioned. It seems it needs license
though it is R package.


Don't know about ASREML-R - ask the authors.
The wgaim package itself is declared to be under license GPL-2.
Note that an R package always has a license, most come under some sort 
of GPL license, but there are several others ...


Uwe Ligges






Am i correct ?



Thanks
Regards,
Ram Kumar Basnet

*From:* Uwe Ligges lig...@statistik.tu-dortmund.de
*To:* ram basnet basnet...@yahoo.com
*Cc:* R help r-help@r-project.org
*Sent:* Friday, August 12, 2011 7:13 PM
*Subject:* Re: [R] problem in asreml function in wgaim package



On 12.08.2011 18:58, ram basnet wrote:
  Dear R users,
 
  I am trying to use wgaim package for QTL analysis using mixed model
approach. But i am stuck with asreml function while using wgaim package.
  Do i need a separate package to activate asreml function beside
wgaim package ?
 
  If so, i tried to download Asreml-R package (i guess this is right
package for asreml funtion) but i could not find it. Where can i find
this package to download ?
  Is this licensed package ?
 
  I believe some of you can suggest way to use wgaim package or,
recommend similar package?


When loading the package it tells us

  library(wgaim)
Loading required package: qtl
Loading required package: lattice
ASReml-R needs to be installed before this package can be used.

Please visit http://www.vsni.co.uk/products/asreml/ for more information.

So go ahead.

Best,
Uwe Ligges




  Thanks in advance.
 
  Regards,
  Ram Kumar Basnet
  [[alternative HTML version deleted]]
 
 
 
 
  __
  R-help@r-project.org mailto:R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
http://www.r-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem in asreml function in wgaim package

2011-08-12 Thread ram basnet
Dear Uwe Ligges,

Thanks for response.
I mean this R package is not free package.
 
Thanks
 
Ram Kumar Basnet
 

From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: ram basnet basnet...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Friday, August 12, 2011 7:28 PM
Subject: Re: [R] problem in asreml function in wgaim package



On 12.08.2011 19:21, ram basnet wrote:
 Dear Uwe Ligges,
 Thanks for your prompt response.
 I visited website link that you mentioned. It seems it needs license
 though it is R package.

Don't know about ASREML-R - ask the authors.
The wgaim package itself is declared to be under license GPL-2.
Note that an R package always has a license, most come under some sort 
of GPL license, but there are several others ...

Uwe Ligges





 Am i correct ?

 Thanks
 Regards,
 Ram Kumar Basnet

 *From:* Uwe Ligges lig...@statistik.tu-dortmund.de
 *To:* ram basnet basnet...@yahoo.com
 *Cc:* R help r-help@r-project.org
 *Sent:* Friday, August 12, 2011 7:13 PM
 *Subject:* Re: [R] problem in asreml function in wgaim package



 On 12.08.2011 18:58, ram basnet wrote:
   Dear R users,
  
   I am trying to use wgaim package for QTL analysis using mixed model
 approach. But i am stuck with asreml function while using wgaim package.
   Do i need a separate package to activate asreml function beside
 wgaim package ?
  
   If so, i tried to download Asreml-R package (i guess this is right
 package for asreml funtion) but i could not find it. Where can i find
 this package to download ?
   Is this licensed package ?
  
   I believe some of you can suggest way to use wgaim package or,
 recommend similar package?


 When loading the package it tells us

   library(wgaim)
 Loading required package: qtl
 Loading required package: lattice
 ASReml-R needs to be installed before this package can be used.

 Please visit http://www.vsni.co.uk/products/asreml/ for more information.

 So go ahead.

 Best,
 Uwe Ligges




   Thanks in advance.
  
   Regards,
   Ram Kumar Basnet
   [[alternative HTML version deleted]]
  
  
  
  
   __
   R-help@r-project.org mailto:R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 http://www.r-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testEquatingData (part of ltm package) changes order of columns

2011-08-12 Thread Alexander Schwall
Dear R community,

I hope someone is out there who has some insights into the ltm package.
Specifically, I am seeking help for the testEquatingData function which is
part of this package.

Here is an example of my data:

#install.packages("ltm", dependencies = TRUE)
library(ltm)

anchor <- as.data.frame(cbind(c(NA, NA, NA, NA, 1), c(NA, NA, NA, 1, NA),
c(1,1,NA,NA,NA)))
names(anchor) <- c("A1", "A2", "A3")

items <- list(as.data.frame(cbind(c(NA, NA, NA, NA, 1), c(NA, NA, NA, 1,
NA), c(1,1,NA,NA,NA)))) # let's assume these are the items that I want to
estimate

datAll <- testEquatingData(items, AnchoringItems = anchor)

Now, after I run the testEquatingData() function, the order of the variables
has been changed to A3, V3, A1, A2, V1, V2, instead of, for example A1, A2,
A3, V1, V2,V3, or V1, V2, V3, A1, A2, A3. Since I am using the constraint
function of the three parameter estimation function [e.g. tpm(datAll,
contraint) I would like Anchor items not to be interspersed with the items
to be estimated (V1, V2, etc).  I can arrange the order of my constraint
parameters to match the order of the anchor items, but also including
regular items gets tricky. Of particular concern is the second part of the
datAll list (which can be seen when printing datAll). The stars indicate that
an item is an anchoring item. However, their order does not match the order
of the actual items. I hope this makes sense.
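For readers of the archive, a small sketch (my addition) of putting a plain data
frame's columns back into a fixed order by name; the names and the scrambled order
come from the example above, and this does not touch ltm's internal bookkeeping of
which items are anchors:

d <- data.frame(A3 = 1, V3 = 2, A1 = 3, A2 = 4, V1 = 5, V2 = 6)   # scrambled order
d[, c("A1", "A2", "A3", "V1", "V2", "V3")]                        # desired order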

Any help would be appreciated.
Thanks,
Alexander

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sapply to bind columns, with repeat?

2011-08-12 Thread Weidong Gu
Katrina,

try this.

reorg <- function(x){
  mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
  rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
  return(data.frame(cbind(rem.col, mat)))
}

co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))

You may need to tweak a bit to fit exactly what you want.

Weidong Gu

On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett kebenn...@alaska.edu wrote:
 Hi R-help,

 I am working with US COOP network station data and the files are
 concatenated in single rows for all years, but I need to pull these
 apart into rows for each day. To do this, I need to extract part of
 each row such as station id, year, mo, and repeat this against other
 variables in the row (days). My problem is that there are repeated
 values for each day, and the files are fixed width field without
 order.

 Here is an example of just one line of data.

 coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062
 B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407
 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055
 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007
 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048
 B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507
 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067
 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107
 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048
 B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607
 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057
 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
 write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
 coop.dat <- read.fwf("coop.tmp", widths =
 c(c(3,8,4,2,4,2,4,3),rep(c(2,2,1,5,1,1),62)), na.strings=c(""),
 skip=1, as.is=T)
 rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
 rep.count <- rep(c(1:62), each=6, 1)
 names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo",
 "fill", "numval", paste(rep.name, rep.count, sep="_"))

 I would like to generate output that contains in one row, the columns
 id, elem, unt, year, mo, and numval. Binded to these
 initial columns, I would like only day_1, hr_1, met_1, dat_1,
 fl1_1, and fl2_1. Then, in the next row I would like repeated the
 initial columns id, elem, unt, year, mo, and numval and
 then binded day_2, hr_2, met_2, dat_2, fl1_2, and f2_2 and
 so on until all the data for all rows has been allocated. Then, move
 onto the next row and repeat.

 I think I should be able to do this with some sort of sapply or lapply
 function, but I'm struggling with the format for repeating the initial
 columns, and then skipping through the next columns.

 Thank you,

 Katrina

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem in asreml function in wgaim package

2011-08-12 Thread Kevin Wright
You are correct.  asreml is a commercial package which requires a license.

See the download site for more information.

Kevin Wright


On Fri, Aug 12, 2011 at 12:52 PM, ram basnet basnet...@yahoo.com wrote:

 Dear Uwe Ligges,

 Thanks for response.
 I mean this R package is not free package.

 Thanks

 Ram Kumar Basnet


 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: ram basnet basnet...@yahoo.com
 Cc: R help r-help@r-project.org
 Sent: Friday, August 12, 2011 7:28 PM
 Subject: Re: [R] problem in asreml function in wgaim package



 On 12.08.2011 19:21, ram basnet wrote:
  Dear Uwe Ligges,
  Thanks for your prompt response.
  I visited website link that you mentioned. It seems it needs license
  though it is R package.

 Don't know about ASREML-R - ask the authors.
 The wgaim package itself is declared to be under license GPL-2.
 Note that an R package always has a license, most come under some sort
 of GPL license, but there are several others ...

 Uwe Ligges





  Am i correct ?
 
  Thanks
  Regards,
  Ram Kumar Basnet
 
  *From:* Uwe Ligges lig...@statistik.tu-dortmund.de
  *To:* ram basnet basnet...@yahoo.com
  *Cc:* R help r-help@r-project.org
  *Sent:* Friday, August 12, 2011 7:13 PM
  *Subject:* Re: [R] problem in asreml function in wgaim package
 
 
 
  On 12.08.2011 18:58, ram basnet wrote:
Dear R users,
   
I am trying to use wgaim package for QTL analysis using mixed model
  approach. But i am stuck with asreml function while using wgaim
 package.
Do i need a separate package to activate asreml function beside
  wgaim package ?
   
If so, i tried to download Asreml-R package (i guess this is right
  package for asreml funtion) but i could not find it. Where can i find
  this package to download ?
Is this licensed package ?
   
I believe some of you can suggest way to use wgaim package or,
  recommend similar package?
 
 
  When loading the package it tells us
 
library(wgaim)
  Loading required package: qtl
  Loading required package: lattice
  ASReml-R needs to be installed before this package can be used.
 
  Please visit http://www.vsni.co.uk/products/asreml/ for more
 information.
 
  So go ahead.
 
  Best,
  Uwe Ligges
 
 
 
 
Thanks in advance.
   
Regards,
Ram Kumar Basnet
[[alternative HTML version deleted]]
   
   
   
   
__
R-help@r-project.org mailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
 
 
 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Any alternatives to draw.colorkey from lattice package?

2011-08-12 Thread Mikhail Titov
Hello!

I’d like to have a continuous color bar on my lattice xyplot with colors, let's
say from topo.colors, such that it has ticks & labels at a few specific points
only.

Right now I use do.breaks & level.colors with a somewhat large number of steps.
The problem is that a color change point doesn’t necessarily correspond to the
value I’d like to label. Since I have many color steps and I don’t need high
precision I generate labels like this

labels <- ifelse( sapply(at, function(x) any(abs(att-x) < .03)),
                  sprintf("depth= %s ft", at), "")

, where `att` has my points of interest on the color scale bar and `at`
corresponds to the color change points used with level.colors. It is a bit
inconvenient as I have to adjust the threshold `.03` and the number of color
steps so that it labels only the adjacent color change point with my labels.

Q: Are there any ready to use functions that would generate some kind of 
GRaphical OBject with continuous color scale bar/key with custom at/labels such 
that it would work with `legend` argument of xyplot from lattice?

Mikhail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)

2011-08-12 Thread Achim Zeileis

On Fri, 12 Aug 2011, Dimitri Liakhovitski wrote:


Hello!

I have a data frame mysample (sorry for a long way of creating it
below - but I need it in this form, and it works). I regress Y onto X1
through X11 - first without weights, then with weights:

regtest1 <- lm(Y~., data=mysample[-13])
regtest2 <- lm(Y~., data=mysample[-13], weights=mysample$weight)
summary(regtest1)
summary(regtest2)

Then I calculate Durbin-Watson for both regressions using 2 different packages:

library(car)
library(lmtest)

durbinWatsonTest(regtest1)[2]
dwtest(regtest1)$stat

durbinWatsonTest(regtest2)[2]
dwtest(regtest2)$stat

When there are no weights, the Durbin-Watson statistic is the same.
But when there are weights, 2 packages give Durbin-Watson different
statistics. Anyone knows why?


The result of dwtest() is wrong. Internally, dwtest() extracts the model 
matrix and response (but no weights) and does all processing based on 
these. Thus, it computes the DW statistic for regtest1 not regtest2.


I've just added a patch to my source code which catches this problem and 
throws a meaningful error message. It will be part of the next release 
(0.9-29) in due course.


Of course, this doesn't help you with computing the DW statistic for the 
weighted regression but hopefully it reduces the confusion about the 
different behaviors...
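For reference (my addition), the Durbin-Watson statistic itself is just
sum(diff(e)^2)/sum(e^2) over the residuals, so it can be computed directly as a
sanity check; which residuals are appropriate for a weighted fit is a separate
question.

dw <- function(e) sum(diff(e)^2) / sum(e^2)
dw(residuals(regtest1))   # reproduces the unweighted statistic reported above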

Z


Also, it's interesting that both of them are also different from what
SPSS spits out...

Thank you!
Dimitri



### Run the whole code below to create mysample:

intercor <- 0.3  # intercorrelation among all predictors
k <- 10          # number of predictors
sigma <- matrix(intercor, nrow=k, ncol=k) # matrix of intercorrelations
                                          # among predictors
diag(sigma) <- 1

require(mvtnorm)
set.seed(123)
mypop <- as.data.frame(rmvnorm(n=10, mean=rep(0,k), sigma=sigma,
method="chol"))
names(mypop) <- paste("x", 1:k, sep="")
set.seed(123)
mypop$x11 <- sample(c(0,1), 10, replace=T)

set.seed(123)
betas <- round(abs(rnorm(k+1)), 2) # desired betas
Y <- as.matrix(mypop) %*% betas
mypop <- cbind(mypop, Y)
rSQR <- .5
VARofY <- mean(apply(as.data.frame(mypop$Y), 2, function(x){x^2})) -
mean(mypop$Y)^2
mypop$Y <- mypop$Y + rnorm(10, mean=0, sd=sqrt(VARofY/rSQR-VARofY))

n <- 200
set.seed(123)
cases.for.sample <- sample(10, n, replace=F)
mysample <- mypop[cases.for.sample,]
mysample <- cbind(mysample[k+2], mysample[1:(k+1)])  #dim(sample)
weight <- rep(1:10, 20); weight <- weight[order(weight)]
mysample$weight <- weight



--
Dimitri Liakhovitski
marketfusionanalytics.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CairoPDF

2011-08-12 Thread ivo welch
Regarding Simon Urbanek's excellent CairoPDF package, I noticed two items:

[1] looking at the april 6, 2011 documentation, on page 6, there are
some painful non-line breaks.  One of them makes it impossible to
understand how the process would work.  (see my example post earlier
on the charter font, though.)

[2] pdf("testfile") creates a file without a .pdf extension.
CairoPDF("testfile") creates testfile.pdf. Not a big deal---just a
small inconsistency if someone is switching from pdf to CairoPDF.

regards,

/iaw


Ivo Welch (ivo.we...@gmail.com)

PS: Tried to email Simon directly, but it bounced.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Getting data from an *.RData file into a data.frame object.

2011-08-12 Thread Ed Heaton
Hi, all.

I'm new to R.  I've been a SAS programmer for 20 years.

I seem to be having trouble with the most basic task - bringing a table in
an *.RData file into a data.frame object.

Here's how I created the *.RData file.

library(RODBC)
db <- odbcConnect("***")
df <- sqlQuery(
    db
  , "select * from schema.table where year(someDate)=2006"
)
save(
    df
  , file="C:/Documents and Settings/userName/My Documents/table2006.RData"
)
dim(df)
remove(df)
odbcClose(db)
remove(db)
detach(package:RODBC)

Next, I moved that data file (table2006.RData) to another workstation - not
at the client site.

Now, I need to get that data file into a data.frame object.  I know this
should be simple, but I can't seem to find out how to do that.  I tried the
following.  First, after opening R without doing anything, RGui used 35,008
KB of memory.  I submitted the following.

 debt2006 <- load("T:/R.Data/table2006.RData")

Memory used by RGui jumped to 191,512 KB.  So, it looks like the data
loaded.  However, debt2005 is of type character instead of data.frame.

 ls()
[1] "debt2005"
 class(debt2005)
[1] "character"


Help, please.

Ed

Ed Heaton
Project Manager, Sr. SAS Developer
Data and Analytic Solutions, Inc.
10318 Yearling Drive
Rockville, MD 20850
Office: 301-520-7414
ehea...@dasconsultants.com
www.dasconsultants.com http://www.dasconsultants.com/ 
CMMI ML-2, SBA 8(a)  SDB, WBE (WBENC), MBE (VA  MD)

e...@heaton.name

(Re: http://www.r-project.org/posting-guide.html)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] value.labels

2011-08-12 Thread zcatav

Jim Lemon wrote:
 
 Hi Zeki,
 
  fgh <- read.table("fgh.tab", header=TRUE, sep="\t")
  fgh$c <- add.value.labels(fgh$c, c("alive","dead","zombie"))
  fgh$c
 
 Jim
 

Thanks Jim,
This is what i want.

Result as follows;

 test2$c <- add.value.labels(test2$c, c("alive","dead","zombie"))
 test2$c
[1] 0 1 1 0 2 0 0 0 0
attr(,"value.labels")
 alive   dead zombie 
     0      1      2 
 table(test2$c)

0 1 2 
6 2 1 
 str(test2)
'data.frame':   9 obs. of  4 variables:
 $ a: int  58009 114761 184440 189372 105286 186717 189106 127306 157342
 $ b: Date, format: 2011-07-28 2008-11-05 ...
 $ c: atomic  0 1 1 0 2 0 0 0 0
  ..- attr(*, "value.labels")= Named num  0 1 2
  .. ..- attr(*, "names")= chr  "alive" "dead" "zombie"
 $ d: Date, format: NA 2008-11-05 ...


--
View this message in context: 
http://r.789695.n4.nabble.com/value-labels-tp3735947p374.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Automating an R function call

2011-08-12 Thread RobertJK
Any way to run an R function every 5 minutes from the R terminal? Searched
around but couldn't find any answers. Thanks!!
Robert

--
View this message in context: 
http://r.789695.n4.nabble.com/Automating-an-R-function-call-tp3740070p3740070.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automating an R function call

2011-08-12 Thread Ken
Hey,
  Sys.sleep(300)
   ?Sys.sleep
Ken Hutchison
On Aug 12, 2554 BE, at 2:03 PM, RobertJK rkind...@gmail.com wrote:

 Any way to run an R function every 5 minutes from the R terminal? Searched
 around but couldn't find any answers. Thanks!!
 Robert
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Automating-an-R-function-call-tp3740070p3740070.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting data from an *.RData file into a data.frame object.

2011-08-12 Thread Duncan Murdoch

On 12/08/2011 3:12 PM, Ed Heaton wrote:

Hi, all.

I'm new to R.  I've been a SAS programmer for 20 years.

I seem to be having trouble with the most basic task - bringing a table in
an *.RData file into a data.frame object.

Here's how I created the *.RData file.

library(RODBC)
db <- odbcConnect("***")
df <- sqlQuery(
     db
   , "select * from schema.table where year(someDate)=2006"
)
save(
     df
   , file="C:/Documents and Settings/userName/My Documents/table2006.RData"
)
dim(df)
remove(df)
odbcClose(db)
remove(db)
detach(package:RODBC)

Next, I moved that data file (table2006.RData) to another workstation - not
at the client site.

Now, I need to get that data file into a data.frame object.  I know this
should be simple, but I can't seem to find out how to do that.  I tried the
following.  First, after opening R without doing anything, RGui used 35,008
KB of memory.  I submitted the following.

  debt2006 <- load("T:/R.Data/table2006.RData")

Memory used by RGui jumped to 191,512 KB.  So, it looks like the data
loaded.  However, debt2005 is of type character instead of data.frame.

  ls()
[1] "debt2005"
  class(debt2005)
[1] "character"


Help, please.


save() and load() work with multiple objects, and the objects keep their 
names.  So your object would be recreated as df after the load.


If you just want to save the data from one object without its name, use 
saveRDS() and readRDS().
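A minimal sketch of that alternative (the .rds file name is just an example):

saveRDS(df, file = "table2006.rds")      # stores the object without its name
debt2006 <- readRDS("table2006.rds")     # comes back under whatever name you assign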


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Grid unit width and font face

2011-08-12 Thread Sébastien Bihorel
Dear R-users,

When one defines a grid unit object using the 'strwidth' dimension, it
seems that the default plain font is assumed as the following example
illustrates. Is there a way to either make use of a font option when
creating a unit object or to know the factor that exists between the
width of the same text printed in plain and in bold? This might be
dependent on the font, though...

require(grid)

grid.rect(width=unit(1,'strwidth','Some text'),draw=T)

grid.text('Some text',draw=T)   # fits
nicely in the box
grid.text('Some text',y=0.4,gp=gpar(font=2),draw=T)  # partially outside the box

Thank you in advance for your input on this issue.

Sebastien

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automating an R function call

2011-08-12 Thread Duncan Murdoch

On 12/08/2011 2:03 PM, RobertJK wrote:

Any way to run an R function every 5 minutes from the R terminal? Searched
around but couldn't find any answers. Thanks!!


Yes, but not in the background:

repeat {
  f()
  Sys.sleep(300)
}

If you want it run in the background, get your OS to run R to do it.  
(It's possible the tcltk package will give you access to some tcl way to 
set a background process; I don't know.  Similarly, you could call out 
to C and start up another thread, etc., but R itself is single-threaded.)


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting data from an *.RData file into a data.frame object.

2011-08-12 Thread peter dalgaard

On Aug 12, 2011, at 21:12 , Ed Heaton wrote:

 Hi, all.
 
 I'm new to R.  I've been a SAS programmer for 20 years.
 
 I seem to be having trouble with the most basic task - bringing a table in
 an *.RData file into a data.frame object.
 
 Here's how I created the *.RData file.
 
 library(RODBC)
 db <- odbcConnect("***")
 df <- sqlQuery(
     db
   , "select * from schema.table where year(someDate)=2006"
 )
 save(
     df
   , file="C:/Documents and Settings/userName/My Documents/table2006.RData"
 )
 dim(df)
 remove(df)
 odbcClose(db)
 remove(db)
 detach(package:RODBC)
 
 Next, I moved that data file (table2006.RData) to another workstation - not
 at the client site.
 
 Now, I need to get that data file into a data.frame object.  I know this
 should be simple, but I can't seem to find out how to do that.  I tried the
 following.  First, after opening R without doing anything, RGui used 35,008
 KB of memory.  I submitted the following.
 
  debt2006 <- load("T:/R.Data/table2006.RData")
 
 Memory used by RGui jumped to 191,512 KB.  So, it looks like the data
 loaded.  However, debt2005 is of type character instead of data.frame.
 
 ls()
 [1] "debt2005"
 class(debt2005)
 [1] "character"
 
 
 Help, please.
 

The load/save mechanism handles multiple objects. You can't really assign the 
result of load(), its return value is just a vector of object names (df, 
right?).

So look in your workspace for an object called df. Then assign it to 
debt2005. AFAIK, this won't duplicate quite as much as it sounds it might.

For a single-step variation, try something like

debt2005 <- local({load(myfile); df})


 Ed
 
 Ed Heaton
 Project Manager, Sr. SAS Developer
 Data and Analytic Solutions, Inc.
 10318 Yearling Drive
 Rockville, MD 20850
 Office: 301-520-7414
 ehea...@dasconsultants.com
 www.dasconsultants.com http://www.dasconsultants.com/ 
 CMMI ML-2, SBA 8(a)  SDB, WBE (WBENC), MBE (VA  MD)
 
 e...@heaton.name
 
 (Re: http://www.r-project.org/posting-guide.html)
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automating an R function call

2011-08-12 Thread Greg Snow
Do you want to be doing other things with the R terminal in the meantime?

Are you OK with the terminal being locked up between runs and just want to see 
the output updated?

Is it OK to have a new instance of R run the function?


If the last one is doable then you can have your OS scheduled to run a script 
every 5 minutes and the script runs your function.  How to do this depends on 
your OS, fairly easy with chron on linux and related OS's (you can see the 
updated file using the tail program, some versions will wait for new info to be 
appended and show it automatically).  I have done this in windows before, but 
it takes several clicks and I need to rediscover the correct sequence each time.

If you are happy with the 2nd set of conditions then you can just use a while 
loop to repeatedly run the function and include a call to Sys.sleep to wait the 
required time.  Note that if your function takes more than a couple of seconds 
then you will want to reduce the amount of time accordingly.  If the amount of 
time your function takes varies then you will not be running exactly every 5 
minutes.
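A sketch of that loop, adjusting the sleep for however long the function took; f()
and the 5-minute period are placeholders:

period <- 300                               # seconds between starts
while (TRUE) {
  elapsed <- system.time(f())["elapsed"]
  Sys.sleep(max(0, period - elapsed))       # keep roughly a constant cadence
}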

If you need the 1st set of conditions, then you could load the tcltk2 package 
and use the tclTaskSchedule function.  This is the most dangerous situation, 
you could end up with messed up data if you are trying to edit something at the 
same time the function runs and also tries to use/edit it.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of RobertJK
 Sent: Friday, August 12, 2011 12:03 PM
 To: r-help@r-project.org
 Subject: [R] Automating an R function call
 
 Any way to run an R function every 5 minutes from the R terminal?
 Searched
 around but couldn't find any answers. Thanks!!
 Robert
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Automating-
 an-R-function-call-tp3740070p3740070.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automating an R function call

2011-08-12 Thread peter dalgaard

On Aug 12, 2011, at 21:26 , Duncan Murdoch wrote:

 On 12/08/2011 2:03 PM, RobertJK wrote:
 Any way to run an R function every 5 minutes from the R terminal? Searched
 around but couldn't find any answers. Thanks!!
 
 Yes, but not in the background:
 
 repeat {
  f()
  Sys.sleep(300)
 }
 
 If you want it run in the background, get your OS to run R to do it.  (It's 
 possible the tcltk package will give you access to some tcl way to set a 
 background process; I don't know.  Similarly, you could call out to C and 
 start up another thread, etc., but R itself is single-threaded.)
 

For asynchronuos within-R possibilities check out tkafter in the tcltk package.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode Variable in dependence of values of two other variables

2011-08-12 Thread Dennis Murphy
Hi:

Here are several equivalent ways to produce your desired output:

# Base package: transform()

df <- transform(df, mean = ave(x, id, FUN = mean))

# plyr package
library('plyr')
ddply(df, .(id), transform, mean = mean(x))

# data.table package
library('data.table')
dt <- data.table(df, key = 'id')
dt[, list(x, mean = mean(x)), by = 'id']

# doBy package
library('doBy')
transformBy(~ id, data = df, mean = mean(x))

HTH,
Dennis

On Fri, Aug 12, 2011 at 8:10 AM, Julia Moeller
julia.moel...@uni-erfurt.de wrote:
 Hi,

 as an R-beginner, I have a recoding problem and hope you can help me:

 I am working on an SPSS dataset, which I loaded into R (load("C:/..."))

 I have  2 existing Variables: ID and X ,
 and one variable to be computed: meanX.dependID (=mean of X for all rows in
 which ID has the same value)

 ID = subject ID.  Since it is a longitudinal dataset, there are repeated
 measurement points for each subject, each of which appears in a new row. So,
 each ID value appears in many rows. (e.g. ID ==1 in row 1:5; ID ==2 in rows
 6:8 etc).


 Now: For all rows, in which ID has a certain value, meanX.dependID shall be
 the mean of X in for these rows. How can I automatisize that, without having
 to specify the number of the rows each time?

 e.g.


 ID    X    meanX.dependID
 1    2    2.25
 1    3    2.25
 1    1    2.25
 1    3    2.25
 2    5    3.3
 2    2    3.3
 2    3    3.3
 3    4    3
 3    1    3
 3    2    3
 3    3    3
 3    4    3
 3    5    3


 Thanks a lot! Hope this is the right place to post, if not, please tell me!
 best,
 Julia

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)

2011-08-12 Thread Dimitri Liakhovitski
Thank you, Achim.
So, then a follow up question to the community: is there a package out
there that allows one to calculate a correct DW for a regression with
weights?
Dimitri

On Fri, Aug 12, 2011 at 2:42 PM, Achim Zeileis achim.zeil...@uibk.ac.at wrote:
 On Fri, 12 Aug 2011, Dimitri Liakhovitski wrote:

 Hello!

 I have a data frame mysample (sorry for a long way of creating it
 below - but I need it in this form, and it works). I regress Y onto X1
 through X11 - first without weights, then with weights:

  regtest1 <- lm(Y~., data=mysample[-13])
  regtest2 <- lm(Y~., data=mysample[-13], weights=mysample$weight)
 summary(regtest1)
 summary(regtest2)

 Then I calculate Durbin-Watson for both regressions using 2 different
 packages:

 library(car)
 library(lmtest)

 durbinWatsonTest(regtest1)[2]
 dwtest(regtest1)$stat

 durbinWatsonTest(regtest2)[2]
 dwtest(regtest2)$stat

 When there are no weights, the Durbin-Watson statistic is the same.
 But when there are weights, 2 packages give Durbin-Watson different
 statistics. Anyone knows why?

 The result of dwtest() is wrong. Internally, dwtest() extracts the model
 matrix and response (but no weights) and does all processing based on these.
 Thus, it computes the DW statistic for regtest1 not regtest2.

 I've just added a patch to my source code which catches this problem and
 throws a meaningful error message. It will be part of the next release
 (0.9-29) in due course.

 Of course, this doesn't help you with computing the DW statistic for the
 weighted regression but hopefully it reduces the confusion about the
 different behaviors...
 Z

 Also, it's interesting that both of them are also different from what
 SPSS spits out...

 Thank you!
 Dimitri


 
 ### Run the whole code below to create mysample:

 intercor <- 0.3   # intercorrelation among all predictors
 k <- 10           # number of predictors
 sigma <- matrix(intercor, nrow=k, ncol=k) # matrix of intercorrelations
                                           # among predictors
 diag(sigma) <- 1

 require(mvtnorm)
 set.seed(123)
 mypop <- as.data.frame(rmvnorm(n=10, mean=rep(0,k), sigma=sigma,
 method="chol"))
 names(mypop) <- paste("x", 1:k, sep="")
 set.seed(123)
 mypop$x11 <- sample(c(0,1), 10, replace=T)

 set.seed(123)
 betas <- round(abs(rnorm(k+1)), 2) # desired betas
 Y <- as.matrix(mypop) %*% betas
 mypop <- cbind(mypop, Y)
 rSQR <- .5
 VARofY <- mean(apply(as.data.frame(mypop$Y), 2, function(x){x^2})) -
 mean(mypop$Y)^2
 mypop$Y <- mypop$Y + rnorm(10, mean=0, sd=sqrt(VARofY/rSQR-VARofY))

 n <- 200
 set.seed(123)
 cases.for.sample <- sample(10, n, replace=F)
 mysample <- mypop[cases.for.sample,]
 mysample <- cbind(mysample[k+2], mysample[1:(k+1)])  #dim(sample)
 weight <- rep(1:10, 20); weight <- weight[order(weight)]
 mysample$weight <- weight



 --
 Dimitri Liakhovitski
 marketfusionanalytics.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Dimitri Liakhovitski
marketfusionanalytics.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Automating an R function call

2011-08-12 Thread Ken Hutchison
Hey,
 Also, if your function varies in time (less than 5 minutes) then you could
use
total.time.difference = system.time(some.function())[1]
# (for user time), and you could plug this into your Sys.sleep() (which
accepts decimal seconds) if you can accept an error on the order of
hundredths of seconds (and don't need to run another process in said
console)
Ken



On Fri, Aug 12, 2011 at 3:41 PM, peter dalgaard pda...@gmail.com wrote:


 On Aug 12, 2011, at 21:26 , Duncan Murdoch wrote:

  On 12/08/2011 2:03 PM, RobertJK wrote:
  Any way to run an R function every 5 minutes from the R terminal?
 Searched
  around but couldn't find any answers. Thanks!!
 
  Yes, but not in the background:
 
  repeat {
   f()
   Sys.sleep(300)
  }
 
  If you want it run in the background, get your OS to run R to do it.
  (It's possible the tcltk package will give you access to some tcl way to
 set a background process; I don't know.  Similarly, you could call out to C
 and start up another thread, etc., but R itself is single-threaded.)
 

 For asynchronuos within-R possibilities check out tkafter in the tcltk
 package.

 --
 Peter Dalgaard, Professor,
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com
 Døden skal tape! --- Nordahl Grieg

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] deSolve output

2011-08-12 Thread Sébastien Bihorel
Hi,

Try with the following ODE function. This should give you an extra
column with the derivative of G in your THAAC matrix.

degradation = function (t, state, parameters) {
 with(as.list(c(state, parameters)),
   {dG = (-a*(t+i)^b)*(G)
   list(c(dG),dG=dG)
   })
}
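A hypothetical call just to show the shape of the result (the parameter values and
the initial value of G are made up for illustration):

library(deSolve)
THAAC <- ode(y = c(G = 100), times = seq(0, 10, by = 0.1),
             func = degradation, parms = c(a = 0.1, b = 0.5, i = 1))
head(THAAC)   # columns: time, G, dG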

Any additional variables that you want to output from your ODE system
function need to be a separate level of the output list. For instance:

# Example taken from the deSolve vignette
parameters <- c(a = -8/3, b = -10, c = 28)

state <- c(X = 1, Y = 1, Z = 1)

Lorenz <- function(t, state, parameters) {
 with(as.list(c(state, parameters)), {
 # rate of change
 dX <- a*X + Y*Z
 dY <- b * (Y-Z)
 dZ <- -X*Y + c*Y - Z

 # return the rate of change
 list(c(dX, dY, dZ), dX=dX, dY=dY, Dummy=-X/Y)
 }) # end with(as.list ...
}

times <- seq(0, 100, by = 0.01)
library(deSolve)
out <- ode(y = state, times = times, func = Lorenz, parms = parameters)
head(out)

Sebastien

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Grid unit width and font face

2011-08-12 Thread Prof Brian Ripley
I think you are doing this in the wrong order.  You need to set the 
gpar on the viewport, then compute the grid.rect.



grid.rect(width=unit(1,'strwidth','Some text'),draw=T, gp=gpar(font=2))
grid.text('Some text',y=0.4,gp=gpar(font=2),draw=T)


is one way to do it: pushing a viewport is another.

The factor does depend on the font, family, pointsize, but see e.g.


strwidth('Some text', units='in')

[1] 0.7503255

strwidth('Some text', units='in', font = 2)

[1] 0.7965495


On Fri, 12 Aug 2011, Sébastien Bihorel wrote:


Dear R-users,

When one defines a grid unit object using the 'strwidth' dimension, it
seems that the default plain font is assumed as the following example
illustrates. Is there a way to either make use of a font option when
creating a unit object or to know the factor that exists between the
width of the same text printed in plain and in bold? This might be
dependent on the font, though...

require(grid)

grid.rect(width=unit(1,'strwidth','Some text'),draw=T)

grid.text('Some text',draw=T)   # fits
nicely in the box
grid.text('Some text',y=0.4,gp=gpar(font=2),draw=T)  # partially outside the box

Thank you in advance for your input on this issue.

Sebastien

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sapply to bind columns, with repeat?

2011-08-12 Thread Weidong Gu
On Fri, Aug 12, 2011 at 5:08 PM, Katrina Bennett kebenn...@alaska.edu wrote:
 Hi Weidong Gu,

 This works! For my clarity, and so I can repeat this process if need be:

 The 'mat' generates a matrix using whatever is supplied to x (i.e.
 coop.dat) using the columns from position 9:length(x) of 6 columns (by
 row).
x is passed one row of coop.dat at a time.
Then matrix() splits part of that row (positions 9:length(x)) into a
matrix of 6 columns; note the byrow parameter.

 The 'rem.col' generates a matrix of the first 1:8 columns of 8 columns.
this generates the first 8 values repeated, one copy per row; note the use of
nrow(mat) to match the number of repetitions to the rows of mat.

 The 'return' statement calls the function to cbind together rem.col and mat.
reorg() returns a data.frame for each row; apply() collects these into a list.

 Then 'apply' this all to coop.dat, by rows, using function reorg.

do.call('rbind', returned list of df) would combine the list into one
data frame

you can see what is going on
 temp <- apply(coop.dat, 1, function(x) reorg(x))
str(temp)
 do.call('rbind',temp)


HTH

Weidong Gu


 Is this correct?

 Thank you very much,

 Katrina


 On Fri, Aug 12, 2011 at 10:28 AM, Weidong Gu anopheles...@gmail.com wrote:
 Katrina,

 try this.

 reorg <- function(x){
   mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
   rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
   return(data.frame(cbind(rem.col, mat)))
 }

 co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))

 You may need to tweak a bit to fit exactly what you want.

 Weidong Gu

 On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett kebenn...@alaska.edu 
 wrote:
 Hi R-help,

 I am working with US COOP network station data and the files are
 concatenated in single rows for all years, but I need to pull these
 apart into rows for each day. To do this, I need to extract part of
 each row such as station id, year, mo, and repeat this against other
 variables in the row (days). My problem is that there are repeated
 values for each day, and the files are fixed width field without
 order.

 Here is an example of just one line of data.

 coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062
 B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407
 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055
 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007
 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048
 B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507
 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067
 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107
 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048
 B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607
 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057
 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
 write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
 coop.dat <- read.fwf("coop.tmp", widths =
 c(c(3,8,4,2,4,2,4,3),rep(c(2,2,1,5,1,1),62)), na.strings=c(""),
 skip=1, as.is=T)
 rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
 rep.count <- rep(c(1:62), each=6, 1)
 names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo",
 "fill", "numval", paste(rep.name, rep.count, sep="_"))
 I would like to generate output that contains, in one row, the columns
 "id", "elem", "unt", "year", "mo", and "numval". Bound to these
 initial columns, I would like only "day_1", "hr_1", "met_1", "dat_1",
 "fl1_1", and "fl2_1". Then, in the next row I would like the initial
 columns "id", "elem", "unt", "year", "mo", and "numval" repeated, with
 "day_2", "hr_2", "met_2", "dat_2", "fl1_2", and "fl2_2" bound on, and
 so on until all the data for the row has been allocated. Then, move
 on to the next row and repeat.

 I think I should be able to do this with some sort of sapply or lapply
 function, but I'm struggling with the format for repeating the initial
 columns, and then skipping through the next columns.

 Thank you,

 Katrina




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sample size AUC for ROC curves

2011-08-12 Thread David Winsemius


On Aug 11, 2011, at 5:50 AM, Karl Knoblick wrote:


Thanks. Actually I thought of something like
Hanley JA, McNeil BJ. A method of comparing the areas under receiver
operating characteristic curves derived from the same cases. Radiology.
1983; 148: 839–843.
http://radiology.rsna.org/content/148/3/839.full.pdf+html

Does anybody have R code for this, or for something similar but newer?

The question is quite simple: how many subjects do I need if I want to
show that my diagnostic test is not just a game of dice? Inputs are the
expected AUC, alpha, and beta.


If you want the binomial choice situation then the AUC is not the  
right place to start. You should be looking at sample size  
calculations for logistic regression (or maybe even binom.test if you  
have no covariates that matter).
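
For the simulation route Greg suggests further down in this thread, a
rough sketch might look like the following (purely illustrative, not a
canned routine, and the function and argument names are made up for the
example; it assumes a binormal marker with unit variances, so a mean
shift of sqrt(2)*qnorm(auc) between groups yields the target AUC, and it
uses wilcox.test, whose statistic divided by n0*n1 is the empirical AUC):

sim.power <- function(n.per.group, auc = 0.75, nsim = 2000, alpha = 0.05) {
  ## mean shift giving the target AUC under the binormal assumption
  delta <- sqrt(2) * qnorm(auc)
  pvals <- replicate(nsim, {
    x0 <- rnorm(n.per.group)                 # marker values, "negative" group
    x1 <- rnorm(n.per.group, mean = delta)   # marker values, "positive" group
    wilcox.test(x1, x0)$p.value              # Mann-Whitney test of AUC = 0.5
  })
  mean(pvals < alpha)                        # estimated power
}

## e.g. scan a few sample sizes and keep the smallest with power >= 1 - beta:
## sapply(c(20, 30, 40, 50), sim.power, auc = 0.70)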


--
David Winsemius

--


Would be great if somebody has a solution!

Karl



- Original Message -
From: Greg Snow greg.s...@imail.org
To: Karl Knoblick karlknobl...@yahoo.de; r-h...@stat.math.ethz.ch
r-h...@stat.math.ethz.ch
Sent: Tuesday, 9 August 2011, 19:45:12
Subject: RE: [R] Sample size AUC for ROC curves

If you know how to generate random data that represents your null
hypothesis (chance, AUC = 0.5) and how to do your analysis, then you can
do this by simulation: simulate a dataset at a given sample size, analyze
it, repeat a bunch of times, and see whether that sample size is about
right. If not, do it again with a different sample size until you find
one that works for you.


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
On Behalf Of Karl Knoblick
Sent: Monday, August 08, 2011 3:29 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] Sample size AUC for ROC curves

Hallo!

Does anybody know a way to calculate the sample size for comparing the
AUC of a ROC curve against 'by chance' (AUC=0.5), and/or against another AUC?

Thanks!
Karl

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] define variables from a matrix

2011-08-12 Thread gallon li
I have the following matrix and wish to define two variables based on it:

 A=matrix(0,5,5)
A[1,]=c(30,20,100,120,90)
A[2,]=c(40,30,20,50,100)
A[3,]=c(50,50,40,30,30)
A[4,]=c(30,20,40,50,50)
A[5,]=c(30,50,NA,NA,100)
 A
 [,1] [,2] [,3] [,4] [,5]
[1,]   30   20  100  120   90
[2,]   40   30   20   50  100
[3,]   50   50   40   30   30
[4,]   30   20   40   50   50
[5,]   30   50   NA   NA  100

I want to define two variables:

X is the index of the first column in each row that is equal to 20. For
example, for the first row I need X=2; 2nd row, X=3; 3rd row, X=NA; 4th
row, X=2; 5th row, X=NA.

Y is the index of the first column in each row that is equal to 100,
provided a 20 has been reached before it. For example, for the first row,
Y=3; 2nd row, Y=5; 3rd row, Y=NA; 4th row, Y=NA; 5th row, Y=NA.

The matrix may contain NAs as well.

How can I define these two variables quickly?
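
One possible sketch (my own illustration, not from a reply in the thread;
first.match is a helper name made up for the example), using the matrix A
defined above:

first.match <- function(row, value, from = 1) {
  hit <- which(!is.na(row) & row == value)  # columns equal to 'value', NAs skipped
  hit <- hit[hit >= from]                   # only consider column 'from' onwards
  if (length(hit)) hit[1] else NA_integer_
}

X <- apply(A, 1, first.match, value = 20)   # first column equal to 20 in each row
Y <- mapply(function(i, x)
              if (is.na(x)) NA_integer_ else first.match(A[i, ], 100, from = x + 1),
            seq_len(nrow(A)), X)            # first 100 strictly after that 20
X   # 2 3 NA 2 NA
Y   # 3 5 NA NA NA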


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.