date:20110719


P1-tapply(P1,Experiment,mean)[Experiment]

HTH,
Daniel


ronny wrote:
 
 Hi,
 
 I would like to center P1 and P2 of the following data frame by the factor
 Experiment, i.e. subtract from each value the average of its
 experiment, and keep the original data structure, i.e. the experiment and
 the group of each value. 
 
 RAW=
 data.frame(Experiment=c(2,2,2,1,1,1),Group=c("A","A","B","A","A","B"),P1=c(10,12,14,5,3,4),P2=c(8,12,16,2,3,4))
 
 Desired result:
 
 NORMALIZED=
 data.frame(Experiment=c(2,2,2,1,1,1),Group=c("A","A","B","A","A","B"),P1=c(-2,0,2,1,-1,0),P2=c(-4,0,4,-1,0,1))
 
 I tried using by, but then I lose the original order, and the Group
 variable. Can you help?
 
 RAW 
   Experiment Group P1 P2
  2 A 10  8
  2 A 12 12
  2 B 14 16
  1 A  5  2
  1 A  3  3
  1 B  4  4
 
 NOT.OK <- within(RAW,
 {P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})
 
 NOT.OK
   Experiment Group P1 P2
   2 A  1  8
   2 A -1 12
   2 B  0 16
   1 A -2  2
   1 A  0  3
   1 B  2  4
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677620.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to convert number (matlab) to date

2011-07-19 Thread Prof Brian Ripley


but even this is dubious, since there is no year 0 AD. In Gregorian
and Julian calendars, 1 BC continues directly into 1 AD.


True, but these days we are ruled by ISO 8601:2004, which does define 
a year 0 (the year before 1CE aka 1AD). See

http://en.wikipedia.org/wiki/0_(year) .

It seems also to redefine the meaning of 'Gregorian calendar' calling 
what you are referring to the 'BC/AD calendar system'.  (Those who 
prefer BCE/CE to BC/AD might note the usage of the latter in the 
definitive international standard.)



On Mon, 18 Jul 2011, peter dalgaard wrote:



On Jul 18, 2011, at 14:08 , Gabor Grothendieck wrote:


On Sat, Jul 16, 2011 at 11:50 PM, Eduardo M. A. M. Mendes
emammen...@gmail.com wrote:

Hello

I am new to R and I need to convert some dates (numeric format by matlab) to 
actual dates in R.

For instance,

Matlab - 730456 -  datestr(730456)

ans =

02-Dec-1999



Set the origin to Matlab's origin like this.  Be sure you are using
the indicated version of zoo or later:


library(zoo)
packageVersion("zoo")

[1] ‘1.7.1’

as.Date(730456, origin = "0000-00-00")

[1] 1999-12-02


Doesn't work on a Mac, and in general, I think it depends on a quirk in your 
OS's date conversion utilities. What does work for me is


as.Date(730456-1, origin='0000-01-01')

[1] 1999-12-02

but even this is dubious, since there is no year 0 AD. In Gregorian 
and Julian calendars, 1 BC continues directly into 1 AD.


So, to be sure, try


as.Date(730456-367, origin='0001-01-01')

[1] 1999-12-02

(from which it transpires that the non-existing year 0 is a leap year...).

Or, of course, just use the appropriate magic constant of 719529 and begone 
with it:


as.Date(730456-719529)

[1] 1999-12-02


I fail to see what zoo has to do with this at all!
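
The magic-constant approach above can be wrapped in a small helper. This is only a minimal sketch: it assumes the input is a Matlab serial date number for which 719529 corresponds to 1970-01-01, as in the message above, and the names matlab_to_date and matlab_to_datetime are made up for illustration.

```r
# Offset between Matlab serial date numbers and R's Date origin (1970-01-01)
MATLAB_EPOCH_OFFSET <- 719529

# Whole days -> Date (hypothetical helper name)
matlab_to_date <- function(datenum) {
  as.Date(floor(datenum) - MATLAB_EPOCH_OFFSET, origin = "1970-01-01")
}

# Fractional days carry the time of day; convert to seconds since the epoch
matlab_to_datetime <- function(datenum) {
  as.POSIXct((datenum - MATLAB_EPOCH_OFFSET) * 86400,
             origin = "1970-01-01", tz = "UTC")
}

matlab_to_date(730456)        # the value from the original question
matlab_to_datetime(730456.5)  # noon on the same day, in UTC
```

Supplying an explicit origin avoids relying on how a particular platform parses year-0 dates.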


--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

[R] tm: Read a single text file into a corpus as single document?

2011-07-19 Thread Alexander James Rickett

Hello everyone,

I'm doing some JGR (a gui frontend for R) development, specifically adding 
functionality from tm.  In order to enable users to select some text files from 
a file dialog, and turn them into a corpus, I need to be able to generate a 
corpus using a *SINGLE* text file as a single document, and to append a new 
document to an existing corpus.  I know if I could read files into single 
character vectors I'd be in business, but I can't find how to do this either.  
This seems like a no-brainer, so I'm at my wits' end.

Here's pseudo code of what I'd like to be able to do:

##
 # read in 1 text doc as a 1-document corpus
 corp1doc <- Corpus(singleTextDocSource("path/to/doc"))
 corp1doc
A corpus with 1 text document

 # append a second document to the same corpus
 corp1doc[[2]] <- AnotherSingleTextDoc("path/to/doc")
 corp1doc
A corpus with 2 text documents
##

I can almost do this with DirSource, by setting pattern='filename', but this 
requires me also to separate out the path to the enclosing directory, which 
shouldn't be necessary.  

Thanks for taking a look!


[R] Centering data frame by factor

2011-07-19 Thread ronny

Hi,

I would like to center P1 and P2 of the following data frame by the factor
Experiment, i.e. subtract from each value the average of its experiment,
and keep the original data structure, i.e. the experiment and the group of
each value. 

RAW=
data.frame(Experiment=c(2,2,2,1,1,1),Group=c("A","A","B","A","A","B"),P1=c(10,12,14,5,3,4),P2=c(8,12,16,2,3,4))

Desired result:

NORMALIZED=
data.frame(Experiment=c(2,2,2,1,1,1),Group=c("A","A","B","A","A","B"),P1=c(-2,0,2,1,-1,0),P2=c(-4,0,4,-1,0,1))

I tried using by, but then I lose the original order, and the Group
variable. Can you help?

 RAW 
  Experiment Group P1 P2
 2 A 10  8
 2 A 12 12
 2 B 14 16
 1 A  5  2
 1 A  3  3
 1 B  4  4

NOT.OK <- within(RAW,
{P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})

 NOT.OK
  Experiment Group P1 P2
  2 A  1  8
  2 A -1 12
  2 B  0 16
  1 A -2  2
  1 A  0  3
  1 B  2  4


--
View this message in context: 
http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677609.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] read.csv help

2011-07-19 Thread psombe

Well yeah it works fine for small data but when i tried the exact same
command with a large data set (abt 167 rows and 4000 columns) it gave me a
different data frame.
 either i get the first column as row names and so when i put data[1,1] i
get the first row second column data (from the original data) as the
first column became row names.
or 
if i explicitly put row.names = NULL i get my columns shifted.

this is how the data should look
 tdata[1,1:3]
   timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803   1104233  0
 

and this is how i'm able to load the data

   row.names timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803   1104233 0  0

notice the shift in the first column
i hope this makes my problem clearer

--
View this message in context: 
http://r.789695.n4.nabble.com/read-csv-help-tp3677454p3677586.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] read.csv help

2011-07-19 Thread Peter Ehlers


On 2011-07-19 01:27, psombe wrote:

Well yeah it works fine for small data but when i tried the exact same
command with a large data set (abt 167 rows and 4000 columns) it gave me a
different data frame.
  either i get the first column as row names and so when i put data[1,1] i
get the first row second column data (from the original data) as the
first column became row names.
or
if i explicitly put row.names = NULL i get my columns shifted.

this is how the data should look

tdata[1,1:3]

timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803   1104233  0


and this is how i'm able to load the data

row.names timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803   1104233 0  0

notice the shift in the first column
i hope this makes my problem clearer


This has nothing to do with the size of your data set.
Try count.fields() on your data file and do take note of the
description of the row.names argument to read.csv function:
If there is a header and the first row contains one fewer field than
the number of columns, the first column in the input is used for the
row names.
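
A small reproduction of the rule quoted above may make the symptom easier to recognise. This is a sketch on made-up data (the file contents and column names are assumptions, not the poster's file): the header has three fields while every data row has four.

```r
# Write a small file whose header has one fewer field than the data rows
# (contents invented to mimic the symptom described in the post)
tmp <- tempfile(fileext = ".csv")
writeLines(c("timestamp,nfs_ops,cifs_ops",
             "1299376803,1104233,0,0",
             "1299376804,1104240,0,0"), tmp)

# Number of fields on each line: the header is the odd one out
fields <- count.fields(tmp, sep = ",")
fields

# Default behaviour: the first column silently becomes the row names
d1 <- read.csv(tmp)
rownames(d1)
names(d1)

# row.names = NULL keeps that column but labels it "row.names",
# so every header name ends up one column too far to the right
d2 <- read.csv(tmp, row.names = NULL)
names(d2)
```

If count.fields() reports more fields in the data lines than in the header of the real file, the usual suspects are a missing header entry or an extra separator at the end of each data line.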

Peter Ehlers




Re: [R] tm: Read a single text file into a corpus as single document?

2011-07-19 Thread Juan Carlos Borrás

Some hints:
list.files() will return the list of files in a directory
readLines() will allow you to load text files as vectors of lines
strsplit() will allow you to break lines into words
c(x,y) concatenates vectors x and y ; x <- c(x,y) appends vector y to x
unique() will allow you to get rid of repeats
And the Map/Reduce family of functions will allow you to write what
you want in about 15 lines of concise R code with no loops.
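
A few of the hints above combined, as a minimal sketch for the "one file, one character string" part of the question. The helper name read_as_one_string is hypothetical, and the tm call in the final comment is an untested assumption about how the result might be used.

```r
# Read a whole text file into ONE character string (a length-1 vector).
# read_as_one_string is a hypothetical helper name.
read_as_one_string <- function(path) {
  paste(readLines(path, warn = FALSE), collapse = "\n")
}

# Demo on a throwaway file
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), tmp)

doc <- read_as_one_string(tmp)
length(doc)            # one element holding the whole file

# Appending a second document is plain vector concatenation
doc2 <- read_as_one_string(tmp)
docs <- c(doc, doc2)

# Assumption: with the tm package installed, a vector like `docs` could
# then be wrapped as a corpus with one document per element, e.g.
#   corp <- tm::Corpus(tm::VectorSource(docs))
```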

Hope it helps,
Cheers,
jcb!


Re: [R] urgent Help needed

2011-07-19 Thread Paul Hiemstra

 On 07/19/2011 04:40 AM, Ana-Maria Pistea wrote:
Hi Ana-Maria,

A quick google for your error message shows that there can be quite a
number of causes for this problem. Therefore it is impossible for us to
help you. Please read the R-help posting guide to improve your question
[1]. The most important thing you need to provide is a commented,
minimal, self-contained, reproducible example of this problem that we
can run on our own computers.

regards,
Paul

[1] http://www.r-project.org/posting-guide.html

-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


Re: [R] dead code removal

2011-07-19 Thread Juan Carlos Borrás

Ideally you'd have the next two items available:
- tests that ensure that your code carries out what it should and as it should.
- a coverage analysis tool that reports what parts of your code have
been and have not been executed by your tests above.
Neither of those is mandatory, but they will save you from
nightmares later on.
While there are a few test harnesses for doing the former (e.g.
RUnit, testthat and so on...), I am not aware of any code coverage
tools for R, sadly.
Cheers,
jcb!

On Tue, Jul 19, 2011 at 9:51 AM, Alex Bird sund...@gmail.com wrote:
 Hi there,

  I have some unused code in my project but have no idea how to clean
 it up in some kind of automatic way.
  Maybe there are some tools/ways to identify and remove/mark dead
 parts of the code?

 Thanks in advance!

 Kind regards,
 Alex


Re: [R] dead code removal

2011-07-19 Thread Paul Hiemstra

 On 07/19/2011 09:57 AM, Juan Carlos Borrás wrote:
 Ideally you'd have the next two items available:
 - tests that ensure that your code carries out what it should and as it 
 should.
 - a coverage analysis tool that reports what parts of your code have
 been and have not been executed by your tests above.
 Neither of those is mandatory, but they will save you from
 nightmares later on.
 While there are a few test harnesses for doing the former (e.g.
 RUnit, testthat and so on...), I am not aware of any code coverage
 tools for R, sadly.
 Cheers,
 jcb!

Hi,

Maybe you could profile your code and record which functions are used.
For an example of how to do this see [1]. Cross-referencing them with
your code should indicate if there are any functions that are unused.
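
The cross-referencing step can also be roughed out statically, without profiling. Note that this swaps the profiling idea for a plain name check: the sketch below lists the functions defined in an environment and flags those whose names never appear in any other function body. All names are invented for the example, and the check is only a heuristic (it cannot see calls made through do.call("name"), match.fun, or code in other files).

```r
# Toy "project": three functions, one of which nothing calls
# (all names here are invented for the example)
project <- new.env()
project$used_helper <- function(x) x + 1
project$dead_helper <- function(x) x * 2
project$main        <- function(x) used_helper(x)

defined <- ls(project)

# Every symbol mentioned in any function body
referenced <- unique(unlist(lapply(defined, function(nm) {
  all.names(body(get(nm, envir = project)))
})))

# Functions that are called from outside must be listed by hand
entry_points <- "main"

# Defined but never mentioned anywhere: candidates for dead code
candidates <- setdiff(defined, c(referenced, entry_points))
candidates
```

Anything this reports is a candidate to inspect by hand, not something to delete automatically.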

cheers,
Paul

[1]
http://www.stat.berkeley.edu/~nolan/stat133/Fall05/lectures/profilingEx.html

 On Tue, Jul 19, 2011 at 9:51 AM, Alex Bird sund...@gmail.com wrote:
 Hi there,

  I have some unused code in my project but have no idea how to clean
 it up in some kind of automatic way.
  Maybe there are some tools/ways to identify and remove/mark dead
 parts of the code?

 Thanks in advance!

 Kind regards,
 Alex


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


Re: [R] Understanding R's Environment concept

2011-07-19 Thread Duncan Murdoch


On 11-07-18 2:16 PM, Nipesh Bajaj wrote:

Hi all, I am trying to understand the R's environment concept
however the underlying help files look quite technical to me. Can
experts here provide me some more intuitive ideas behind this concept
like, why it is there, what exactly it is doing in R's architecture
etc.?

I mainly need some non-technical intuitive explanation.



There are three characteristics that describe environments:

1.  They are a collection of named objects.  Much of the time when you 
ask for something by name, you're looking in an environment to find it.


2.  They have a child-parent relationship to another environment.  Some 
of the time, when you look up a name and it is not found, it goes to the 
parent to look.  (And then the grandparent ...)  This means most of 
the time when you specify a name, R just looks in one environment and 
its ancestors to find the object.


3.  They don't get copied when you make an assignment.  So you can say
env <- globalenv(), and your env is another name for the global 
environment, which is where most user objects are created.  Saying


env$z <- 3

will create a new variable named z in the global environment.  This 
differs from most other R objects, where assignment makes an independent 
copy.


And one thing that says how they are used:

1.  Things like functions need to look up names all the time.  Those 
things generally have an associated environment which is where they'll 
look.  (Functions are a little complicated in that they get a new one 
every time you call them, but they also have an associated environment 
which is the parent of the new one.)
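
The points above can be tried out directly. A minimal sketch, using throwaway environments rather than the global one; every object name in it is made up for illustration.

```r
# 1. An environment is a collection of named objects
e <- new.env()
assign("z", 3, envir = e)
ls(e)

# 2. Lookup falls back to the parent when a name is missing
parent <- new.env()
child  <- new.env(parent = parent)
assign("y", "from parent", envir = parent)
found_via_parent <- get("y", envir = child)
in_child_itself  <- exists("y", envir = child, inherits = FALSE)

# 3. Assignment does not copy an environment ...
alias <- e
alias$z <- 10
z_in_e <- e$z        # changed through the alias

# ... whereas a list is copied when modified
l1 <- list(z = 3)
l2 <- l1
l2$z <- 10
z_in_l1 <- l1$z      # unchanged

# Functions: each call gets a fresh environment whose parent is the
# environment associated with the function
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates `count` in the enclosing environment
    count
  }
}
counter <- make_counter()
counter()
second_call <- counter()
```

The counter keeps its state between calls because the inner function's associated environment is the one created by the call to make_counter().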


Duncan Murdoch


[R] Lattice plot problem outputting to jpeg

2011-07-19 Thread creamers

Hi. I am relatively new to R but was quite pleased with myself at having
generated a series of lattice plots as PDFs. I was very surprised when
plotting these out as jpegs (or png or tiff) that the strip title
information above each lattice plot vanished. The pdf was fine. Has anybody
any ideas? I can't add an image as the information is sensitive.

Many Thanks
Steve Creamer

Here is the code snippet 

#jpeg("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.jpg",height=600,width=600)
pdf("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.pdf")
#png("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.png")
#tiff("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.tiff")
while (irec <= nrec)
{
   con_new<-consultants[irec]
   spec_new<-specialty[irec]
   spec_cons_new<-spec_cons[irec]

   if (spec_cons_new!=spec_cons_old || irec==nrec )
   {
      strip_name[con_count]<-paste(spec_old,'\n',con_old)
      if (spec_new!=spec_old || irec==nrec)
      {
         spec_old<-spec_new
      }
      con_count<-con_count+1
      if (con_count > 36 || irec==nrec )
      {

#        ----------------------------------------------------------------
#        Use xyplot for the lattice - plot from iStartRec to irec each time
#        - this will be 36 consultants/specialties per device plot.
#        ----------------------------------------------------------------

         tplot<-xyplot(DAtotals[irecStart:(irec-1)]~dates[irecStart:(irec-1)]
                       |spec_cons[irecStart:(irec-1)],group=nf[irecStart:(irec-1)],layout=c(6,6),
                       type='b',as.table=TRUE,
                       main=paste("Monthly Appts Provided After 1 DNA\n by Specialties/Consultants - ",iSuffix),
                       auto.key = list(cex=0.5,lines=TRUE,
                                       points=FALSE,border = TRUE, x=0.05,y=0.90,corner=c(0,0)),
                       ylab = "Number of Monthly Attended Appts Following a DNA ",xlab="Date",
                       scales = list(x = list(rot = 90,format="%b-%y",
                                              cex=0.6)),xaxt="n",
                       strip = function(which.panel,...)
                       {
                          panel.fill(trellis.par.get("strip.background")$col[1])
                          type <- strip_name[which.panel]
                          grid::grid.text(label = type,x = 0.5, y = 0.5,gp=grid::gpar(fontsize=5))
                          grid::grid.rect()
                       }
                      )
         print(tplot) # plot the lattice plot
         con_count<-1
         irecStart<-irec
         iSuffix<-iSuffix+1 # create suffix
         if (irec != nrec)
         {

#           Open new device and direct to jpeg with new name

            dev.off()
            graphics.off() # turn graphics off to clear memory
            dev.new()

            #jpeg(paste("Z:\\My Documents\\PROJECTS\\AccessPolicy\\AccessDAPlots_",toString(iSuffix),".jpg",sep=""),
            #  height=600,width=600)
            pdf(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".pdf",sep=""))
            #png(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".png",sep=""))
            #tiff(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".tiff",sep=""))
         }
      }
      con_old<-con_new
      spec_cons_old<-spec_cons_new
   }
   irec<-irec+1
}


--
View this message in context: 
http://r.789695.n4.nabble.com/Lattice-plot-problem-outputting-to-jpeg-tp3677705p3677705.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] line jump in plot legend title

2011-07-19 Thread loic

As suggested by David, I applied some modifications to the code of the legend
function so that the legend box size adapts to the number of line jumps in
the legend title.

Attached the modified code of the function.

Thanks to David

Regards,

Loïc
Wageningen University

http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r 

--
View this message in context: 
http://r.789695.n4.nabble.com/line-jump-in-plot-legend-title-tp3676157p3677996.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Centering data frame by factor

2011-07-19 Thread ronny

Perfect! Made my day!

--
View this message in context: 
http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677665.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Lattice plot problem outputting to jpeg



On Jul 19, 2011, at 5:40 AM, creamers wrote:

Hi. I am relatively new to R but was quite pleased with myself at having
generated a series of lattice plots as PDFs. I was very surprised when
plotting these out as jpegs (or png or tiff) that the strip title
information above each lattice plot vanished. The pdf was fine. Has anybody
any ideas? I can't add an image as the information is sensitive.


You have obviously advanced far in your understanding of the  
underpinnings of lattice plots, farther than I in many respects. I was  
surprised, therefore, to see that you were directly accessing elements  
of your data in the global environment. Generally lattice functions  
work best when they are given data.frames as arguments.


The other (more specific to your problem) comment is that replacement  
strip functions are generally constructed with the function  
strip.custom(). I get the impression from the docs that this may be  
required, but apparently you succeeded with single plot testing and it  
may be a device issue, so I may be off base. You could still take a  
look at the examples in the help page and see if using that wrapper  
gets you better delivery of arguments to the operative code.
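
For reference, the two suggestions above (pass a data.frame, build the strip with strip.custom()) might look roughly like this. It is a sketch on invented data with invented column names, and it is not claimed to cure the jpeg/png problem; it only shows the more conventional way of supplying custom strip labels.

```r
library(lattice)

# Made-up data standing in for the sensitive originals
set.seed(42)
dat <- data.frame(
  month     = rep(1:6, times = 4),
  total     = rpois(24, lambda = 20),
  spec_cons = rep(c("Cardiology Smith", "Cardiology Jones",
                    "Urology Patel", "Urology Brown"), each = 6)
)

# Two-line strip labels, one per level of the conditioning factor
strip_labels <- sub(" ", "\n", levels(factor(dat$spec_cons)))

p <- xyplot(total ~ month | spec_cons, data = dat,
            type = "b", as.table = TRUE, layout = c(2, 2),
            par.strip.text = list(cex = 0.6, lines = 2),
            strip = strip.custom(factor.levels = strip_labels))

# print(p) between a device call such as pdf() or png() and dev.off()
# draws the plot on that device
```

Here par.strip.text with lines = 2 asks for strips tall enough for two-line labels, which is one thing worth checking when labels seem to vanish on a raster device.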


Obviously not able to do any testing, since you have not constructed a  
minimal test dataframe. Device issues often require knowing OS and  
other information that the Posting Guide requests you provide with  
sessionInfo().


--
David.



Re: [R] Drawing a histogram from a massive dataset

2011-07-19 Thread Paul Smith

On Tue, Jul 19, 2011 at 12:30 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 [snip] I guess that I must have a data frame to plot a histogram.

 Not at all!

 ## a *vector* of 100 million observations
 x <- rnorm(10^8)
 ## a histogram for it (see attached for the result from my system)
 hist(x)

 No data frame required.  I would not try this straight in anything but
 traditional graphics for a 100 million observation vector, but if you
 wanted it made in ggplot2 or something, you could prebin the data and
 THEN plot bars corresponding to the bins.

 Thanks, Joshua, for your answer.

 True: A vector is enough to supply data for hist(). But my point is:
 Can a histogram be drawn without having all data on the computer
 memory? You partially answer this question by suggesting to prebin
 the data. Can this prebinning process be done transparently but chunk
 by chunk of data underneath?

 Sure, as long as you can figure out some basic details about the full
 dataset.  Just define your breaks, and then for chunks of the data at
 a time, count how many fall into any particular bin.  Once you are
 done, add up all the counts for each bin, and voila.

 ## Get these values from the full data (using SQL)
 x <- rnorm(1000)
 n <- length(x)
 minx <- min(x)
 maxx <- max(x)

 ## Sturges style breaks
 breaks <- pretty(c(minx, maxx), n = ceiling(log2(n) + 1))
 nB <- length(breaks)

 fuzz <- rep(1e-07 * median(diff(breaks)), nB)
 fuzz[1] <- fuzz[1] * -1
 fuzzybreaks <- breaks + fuzz

 chunks <- 10

 counts <- matrix(NA, nrow = chunks, ncol = nB - 1,
  dimnames = list(paste("Sec", 1:chunks, sep = ''),
    as.character(fuzzybreaks[-1])))

 for(i in 1:chunks) {
  index <- seq(1, n/chunks) + (n/chunks * (i - 1))
  counts[i, ] <- hist(x[index], breaks = fuzzybreaks)$counts
 }

 ## The heights of your bars
 colSums(counts)
 ## results using hist() on x all at once
 hist(x)$counts

 You would not even need to know the number of chunks you were going to
 split your data into beforehand, I just did it for convenience and to
 instantiate a full sized matrix to hold the results.  If you are
 selecting subsets of your data using SQL rather than R, it becomes
 even simpler.  Once you have your fuzzybreaks, you just keep calling
 hist on your new data using the predefined breaks and saving the
 results.  Still, I do not exceed about 4.5 GB of memory used to just
 plot a histogram on a 100 million observation vector, and it is
 difficult to imagine the shape of the distribution changing
 appreciably using a random sample of 100 million observations.  It
 also takes less than 10 seconds to calculate and draw the histogram on
 my computer.  The point being, I suspect you will spend more time
 getting everything set up and working than seems worth it because you
 can easily and quickly create a histogram on such large vectors
 already, and the distribution is unlikely to vary anyway.  Whatever floats
 your boat, though.
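
The "keep calling hist with predefined breaks" idea, without fixing the number of chunks in advance, can be sketched as a running total. This is an illustration only: the in-memory vector x stands in for data that would really arrive in pieces (for example from successive SQL queries), and chunk_size is an arbitrary assumption.

```r
set.seed(1)
x <- rnorm(1000)   # stand-in for data that would be fetched chunk by chunk

# The breaks must come from the full data's range (e.g. MIN/MAX via SQL)
breaks <- pretty(range(x), n = nclass.Sturges(x))

# Running bin counts, updated one chunk at a time
counts <- numeric(length(breaks) - 1)
chunk_size <- 100  # arbitrary; the last chunk may be shorter

for (start in seq(1, length(x), by = chunk_size)) {
  chunk  <- x[start:min(start + chunk_size - 1, length(x))]
  counts <- counts + hist(chunk, breaks = breaks, plot = FALSE)$counts
}

# Same bin counts as a single pass over all the data
full_counts <- hist(x, breaks = breaks, plot = FALSE)$counts
```

The accumulated counts can then be drawn with barplot(counts), or placed back into a "histogram" object if the usual hist() appearance is wanted.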

Thanks again, Joshua. Your approach is quite interesting.

Paul


Re: [R] dead code removal

2011-07-19 Thread Alex Bird

Grand merci! Will try!
Kind regards,
Alex



Re: [R] Centering data frame by factor



On Jul 19, 2011, at 4:50 AM, Daniel Malter wrote:



P1-tapply(P1,Experiment,mean)[Experiment]


Another way would be with ave(), but I discovered that it does not  
accept subsidiary arguments and does not issue warnings either, so  
this works:


  with(dfrm, ave(P1, Experiment, FUN=function(x) scale(x, scale=FALSE) ) )

[1] -2  0  2  1 -1  0


But this doesn't behave as directed ... by my pre-operational R-brain.

with(dfrm, ave(P1, Experiment, FUN=scale,  scale=FALSE) )
[1] -1  0  1  1 -1  0

(It applies both default arguments and issues no warning about unused  
argument. Most (well, some anyway) functions like this accept  
subsidiary arguments with ..., but `ave` uses that construction to  
gather its factor arguments rather than expecting them to be in a list  
or vector, as do tapply, aggregate, and by. Some functions like mapply  
and many others give you a MoreArgs option, but not ave.)
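
Putting the ave() approach together for both columns, as a minimal sketch. It assumes the Group values in the original post were character strings, and the helper name center is made up. Using a plain subtraction instead of scale() sidesteps the argument-passing surprise described above.

```r
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group      = c("A", "A", "B", "A", "A", "B"),
                  P1         = c(10, 12, 14, 5, 3, 4),
                  P2         = c(8, 12, 16, 2, 3, 4))

# Subtract the mean of whatever vector is passed in (hypothetical helper)
center <- function(x) x - mean(x)

# ave() applies FUN within each Experiment and returns the values in the
# original row order, so Experiment and Group stay untouched
NORMALIZED <- RAW
NORMALIZED[c("P1", "P2")] <- lapply(RAW[c("P1", "P2")], function(col) {
  ave(col, RAW$Experiment, FUN = center)
})
NORMALIZED
```

Adding further columns only means extending the c("P1", "P2") selection.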


--
David.



HTH,
Daniel


ronny wrote:


Hi,

I would like to center P1 and P2 of the following data frame by the factor
Experiment, i.e. subtract from each value the average of its
experiment, and keep the original data structure, i.e. the experiment and
the group of each value.

RAW=
data.frame(Experiment=c(2,2,2,1,1,1),Group=c("A","A","B","A","A","B"),P1=c(10,12,14,5,3,4),P2=c(8,12,16,2,3,4))

Desired result:

NORMALIZED=
data.frame(Experiment=c(2,2,2,1,1,1),Group=c("B","A","B","B","A","B"),P1=c(-2,0,2,1,-1,0),P2=c(-4,0,4,-1,0,1))

I tried using by, but then I lose the original order, and the Group
variable. Can you help?

RAW
  Experiment Group P1 P2
           2     A 10  8
           2     A 12 12
           2     B 14 16
           1     A  5  2
           1     A  3  3
           1     B  4  4

NOT.OK <- within(RAW,
{P1 <- do.call(rbind,by(RAW$P1,RAW$Experiment,scale,scale=F))})

NOT.OK
  Experiment Group P1 P2
           2     A  1  8
           2     A -1 12
           2     B  0 16
           1     A -2  2
           1     A  0  3
           1     B  2  4



--
View this message in context: 
http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677620.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT


Re: [R] Sweave in 2.13.1

2011-07-19 Thread John Minter

Duncan, thanks for your work. I could not get the workaround you suggested

Rterm.exe --no-restore --slave -e utils::Sweave(file.Rnw)

to work under 2.13.1 and so gave up and installed the 2.14.0 development
build. I can verify that

R CMD Sweave file.Rnw

works properly on the 2.14.0 development build and not on the 2.13.1 release
build. For now I will stick with the development build

--
View this message in context: 
http://r.789695.n4.nabble.com/Sweave-in-2-13-1-tp3672267p3678162.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] line jump in plot legend title

2011-07-19 Thread Duncan Murdoch


On 11-07-19 8:16 AM, loic wrote:

As suggested by David, I applied some modifications to the code of the legend
function so that the legend box size adapts to the number of line jumps in
the legend title.

Attached the modified code of the function.


No code was attached.

In case you didn't, could you make sure that your modification is to the 
version of the code that's in the svn repository?  Then it will be 
straightforward to import your change into R.


That version is online at

https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R

if your changes only affect the legend() function.

Duncan Murdoch



Thanks to David

Regards,

Loïc
Wageningen University

http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r

--
View this message in context: 
http://r.789695.n4.nabble.com/line-jump-in-plot-legend-title-tp3676157p3677996.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Sweave in 2.13.1

2011-07-19 Thread Duncan Murdoch


On 11-07-19 9:42 AM, John Minter wrote:

Duncan, thanks for your work. I could not get the workaround you suggested

Rterm.exe --no-restore --slave -e utils::Sweave(file.Rnw)

to work under 2.13.1 and so gave up and installed the 2.14.0 development
build. I can verify that

R CMD Sweave file.Rnw

works properly on the 2.14.0 development build and not on the 2.13.1 release
build. For now I will stick with the development build



You might prefer R-patched instead:  R-devel is going through a lot of 
changes right now, and there may be bugs in the nightly build. 
R-patched is very stable at the moment.


Duncan Murdoch


Re: [R] Lattice plot problem outputting to jpeg

2011-07-19 Thread creamers

Thanks David...I am trying to plot out data for various consultants by
specialty - each specialty has a varying number of consultants - each
consultant a varying number of data points...I found direct access of the
elements of the dataframe was the only way to plot this type of variation,
otherwise xyplot seemed to assume that there were a similar number of
consultants per specialty. As for the strip.custom, I did try this, it's just
that I ended up embedding the function inline...I'm not sure there is any
difference, is there? Sorry I didn't follow protocol ...it is my first time!
Steve

--
View this message in context: 
http://r.789695.n4.nabble.com/Lattice-plot-problem-outputting-to-jpeg-tp3677705p3678288.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Multiple comparison test on selected contrasts

2011-07-19 Thread B Jessop

Dear Help-list,

I have solved the problem by simply deleting the erroneous 0 in the
CR core - CR EC contrast and deleting the unnecessary command
test=adjusted(summarytype = single-step).

Regards,
B. Jessop
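
For reference, a sketch of why the extra 0 mattered, using only base R: rbind() needs every row to have the same length (6 in the poster's vectors). The matrix below is the poster's own, with the seventh entry of the third row removed and the row names quoted; passing it on to glht() is not shown here.

```r
# Contrast matrix from the original post, with the "CR core - CR EC" row
# shortened from seven entries to six so that all rows have equal length
contr <- rbind(
  "CR core - MH core" = c(1,  0,  0, -1,  0,  0),
  "CR core - CR edge" = c(1,  0, -1,  0,  0,  0),
  "CR core - CR EC"   = c(1, -1,  0,  0,  0,  0),
  "CR edge - MH edge" = c(0,  0,  1,  0,  0, -1),
  "CR edge - CR EC"   = c(0, -1,  1,  0,  0,  0),
  "CR EC - MH EC"     = c(0,  1,  0,  0, -1,  0),
  "MH core - MH edge" = c(0,  0,  0,  1,  0, -1),
  "MH core - MH EC"   = c(0,  0,  0,  1, -1,  0),
  "MH edge - MH EC"   = c(0,  0,  0,  0, -1,  1)
)

dim(contr)       # one row per contrast, one column per coefficient
rowSums(contr)   # each pairwise contrast should sum to zero
```

Checking dim() and rowSums() before handing the matrix to a testing function is a cheap way to catch a stray entry like the one described above.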
  From: deel...@hotmail.com
 To: r-help@r-project.org
 Date: Sun, 17 Jul 2011 21:04:51 -0300
 Subject: [R] Multiple comparison test on selected contrasts

 Dear Help-list,

 How can I do a multiple comparison test (mct) on selected contrasts from a
 linear model while using packages lme4 and multcomp?  I am running R 2.13.0
 under Windows 7.  The following linear model and mct produce a global mct of
 15 paired contrasts of the combined (Site, Position) factor SitePos, of which
 only 9 are of interest.

 Model.G = lmer(log10(SrCa) ~ SitePos + (1 | Eel), data = Data1)
 Model.G.mct = glht(Model.G, linfct = mcp(SitePos = "Tukey"))
 summary(Model.G.mct)

 The following code creates the desired reduced set of contrasts, but I have
 been unable to apply it correctly to the mct.

 contr = rbind("CR core - MH core" = c(1,0,0,-1,0,0),
               "CR core - CR edge" = c(1,0,-1,0,0,0),
               "CR core - CR EC" = c(1,-1,0,0,0,0,0),
               "CR edge - MH edge" = c(0,0,1,0,0,-1),
               "CR edge - CR EC" = c(0,-1,1,0,0,0),
               "CR EC - MH EC" = c(0,1,0,0,-1,0),
               "MH core - MH edge" = c(0,0,0,1,0,-1),
               "MH core - MH EC" = c(0,0,0,1,-1,0),
               "MH edge - MH EC" = c(0,0,0,0,-1,1))

 Execution of this code produces the error message: In rbind ('CR core - MH
 core = c(1,0,0,-1,0,0), etc.: number of columns of results is not a multiple
 of vector length (arg 1)'.

 Model.G.mct2 = glht(Model.G, linfct = mcp(SitePos = contr))
 # execution produces Error in linfct{[nm]} %*% c: non-conformable argument,
 # as a consequence of the previous error
 summary(Model.G.mct2, test = adjusted(summarytype = "single-step"))

 Clearly, this approach is incorrect (and I have tried others).  How can I
 introduce the selected set of contrasts into the mct?  Thanks for any help
 provided.

 Regards,
 B. Jessop
   [[alternative HTML version deleted]]


Re: [R] line jump in plot legend title



On Jul 19, 2011, at 10:38 AM, Duncan Murdoch wrote:


On 11-07-19 8:16 AM, loic wrote:
As suggested by David, I applied some modifications to the code of the
legend function so that the legend box size adapts to the number of line
jumps in the legend title.

Attached the modified code of the function.


No code was attached.


But there was code at the link. It does behave as hoped with defaults  
unchanged, but when cex is altered, the size of the box does not seem  
to change appropriately.




In case you didn't, could you make sure that your modification is to  
the version of the code that's in the svn repository?  Then it will  
be straightforward to import your change into R.


That version is online at

https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R

if your changes only affect the legend() function.

Duncan Murdoch



Thanks to David

Regards,

Loïc
Wageningen University

http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r

--
View this message in context: 
http://r.789695.n4.nabble.com/line-jump-in-plot-legend-title-tp3676157p3677996.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT


Re: [R] Lattice plot problem outputting to jpeg

2011-07-19 Thread Justin

creamers stephen.creamer at rdeft.nhs.uk writes:

 
 Thanks David...I am trying to plot out data for various consultants by
 specialty - each specialty has a varying number of consultants - each
 consultant a varying number of data points...I found direct access of the
 elements of the dataframe was the only way to plot this type of variation,
 otherwise xyplot seemed to assume that there were a similar number of
 consultants per specialty. As for the strip.custom, I did try this, it's just
 that I ended up embedding the function inline...I'm not sure there is any
 difference, is there? Sorry I didn't follow protocol ...it is my first time!
 Steve

It would probably be worth looking at Hadley's ggplot2 package also, ?melt
?reshape.  Maybe you could make one faceted plot instead?  


 
 --
 View this message in context:
http://r.789695.n4.nabble.com/Lattice-plot-problem-outputting-to-jpeg-tp3677705p3678288.html
 Sent from the R help mailing list archive at Nabble.com.
 



Re: [R] barplot question

2011-07-19 Thread Robert Baer

As Sarah requested, could you at least read the posting guide and provide us 
with some sample data?





--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965
-Original Message- 
From: Sally_roman

Sent: Monday, July 18, 2011 12:15 PM
To: r-help@r-project.org
Subject: Re: [R] barplot question

I would like to make stacked barplots, but with two stacked columns per x
value.  For cod - I have kept and discard values for 2 nets.  I would like
to have one stacked column for the control net with the kept and discard
value and then another column with the kept and discard values for the
experimental net.  I can make a stacked barplots in R that will plot one net
by species, but my boss would like to have both nets on the same graph.  I
have done it in excel, but was hoping to do the same thing in R.  All of the
R barplots I have found including the ones that you included as links do not
demonstrate what I would like to do.  Is is possible in R?

--
View this message in context: 
http://r.789695.n4.nabble.com/barplot-question-tp3670861p3675912.html

Sent from the R help mailing list archive at Nabble.com.


[R] Writing the output of a regression object to a file

2011-07-19 Thread Hanlie Pretorius

Hi,

I'm using R 2.12.0 on Windows XP.

I've used the e1071 package to tune a Support Vector Regression object
and I've created the SVR object:

 epsilon.svr <- svm(C8R004 ~ ., data = rain_flow.train, scale = T,
     type = "eps-regression", kernel = "radial", cost = 0.9, epsilon = 0.55,
     tolerance = 0.001, shrinking = T, gamma = 0.18, fitted = T)
 esvr.pred <- predict(epsilon.svr, newdata = rain_flow.test)

I would like to export the esvr.pred object to a file so that I can
draw a graph of it against my original data in other software that I'm
using.

I've tried the write.svm command, but that outputs the scaled data
instead of something that I can directly compare to my original data.

Does anyone know of an easy way to get the result in such a format?

Alternatively, how can I use the scale values to generate such a format?

Thanks
Hanlie


Re: [R] line jump in plot legend title

2011-07-19 Thread Dutrieux , Loïc

I also noticed, after sending the code, that the modification only works 
when the legend is positioned at the bottom of the figure region, and that the 
function crashes when no title is provided.
The modification I applied to the code was intended more as a fix for a 
particular problem than as a real contribution to the function. Unfortunately, my 
programming skills are too limited to really contribute to the project.
However, if one of you manages to make the change work in all cases, I believe 
it would be worth including in a future version of R.

Regards,

Loïc

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Tuesday, 19 July 2011 16:56
To: Duncan Murdoch
Cc: Dutrieux, Loïc; r-help@r-project.org
Subject: Re: [R] line jump in plot legend title


On Jul 19, 2011, at 10:38 AM, Duncan Murdoch wrote:

 On 11-07-19 8:16 AM, loic wrote:
 As suggested by David, I applied some modifications to the code of 
 the legend function so that the legend box size adapts to the number 
 of line jumps in the legend title.

 Attached the modified code of the function.

 No code was attached.

But there was code at the link. It does behave as hoped with defaults 
unchanged, but when cex is altered, the size of the box does not seem to change 
appropriately.


 In case you didn't, could you make sure that your modification is to 
 the version of the code that's in the svn repository?  Then it will be 
 straightforward to import your change into R.

 That version is online at

 https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R

 if your changes only affect the legend() function.

 Duncan Murdoch


 Thanks to David

 Regards,

 Loïc
 Wageningen University

 http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/line-jump-in-plot-legend-title-tp367615
 7p3677996.html Sent from the R help mailing list archive at 
 Nabble.com.


David Winsemius, MD
West Hartford, CT


Re: [R] Writing the output of a regression object to a file

If I understand you correctly,


 I would like to export the esvr.pred object to a file so that I can
 draw a graph of it against my original data in other software that I'm
 using.


you cannot do this. You can export **data**, but of course any R
object is either a binary or text (via dput) representation of an R
structure, which can only be understood by R, not another software
system.

See ?write, ?write.table, or the R import/export manual for how to
export data (as text) to be imported by other software.
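
A small sketch of exporting data as text, assuming the predictions are an ordinary numeric vector. The two vectors and the file name below are made up for illustration; in the thread they would correspond to the observed C8R004 values of the test set and to esvr.pred.

```r
# Made-up stand-ins for the observed values and the predictions
observed  <- c(1.2, 3.4, 2.8)
predicted <- c(1.0, 3.1, 3.0)

# Put both side by side so other software can plot one against the other
out <- data.frame(observed = observed, predicted = as.numeric(predicted))

# Write a plain CSV file (a temporary location is used here)
path <- file.path(tempdir(), "esvr_pred.csv")
write.csv(out, path, row.names = FALSE)

# Reading the file back shows what the other program will see
back <- read.csv(path)
back
```

write.table() with sep = "\t" would work the same way if the receiving software prefers tab-delimited text.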


Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics


Re: [R] Centering data frame by factor

2011-07-19 Thread William Dunlap


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Daniel Malter
 Sent: Tuesday, July 19, 2011 1:51 AM
 To: r-help@r-project.org
 Subject: Re: [R] Centering data frame by factor
 
 
 P1-tapply(P1,Experiment,mean)[Experiment]

Note that the above solution works in this example
because Experiment takes the values 1 and 2.  If
Experiment were coded as, say, 101 and 102 the above
would not work.  This is a case where converting
Experiment to a factor would avoid problems.  E.g.,
   RAW <- data.frame(Experiment=c(2,2,2,1,1,1),Group=c("B","A","B","B","A","B"),P1=c(-2,0,2,1,-1,0),P2=c(-4,0,4,-1,0,1))
   RAW$E <- RAW$Experiment + 100 # relabeled Experiment
   with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
    2  2  2  1  1  1 
   -2  0  2  1 -1  0 
   with(RAW, P1-tapply(P1,E,mean)[E]) # bad
   <NA> <NA> <NA> <NA> <NA> <NA> 
     NA   NA   NA   NA   NA   NA 
   RAW$E <- factor(RAW$E) # convert to factor
   with(RAW, P1-tapply(P1,E,mean)[E]) # good
   102 102 102 101 101 101 
    -2   0   2   1  -1   0

Another way to approach the problem is to think of
your normalized data as the residuals from a linear model:
   residuals(lm(data=RAW, cbind(P1,P2) ~ E))
                P1            P2
   1     -2.00e+00     -4.00e+00
   2  4.385598e-17  8.771196e-17
   3      2.00e+00      4.00e+00
   4      1.00e+00     -1.00e+00
   5     -1.00e+00  8.771196e-17
   6  4.385598e-17      1.00e+00
   zapsmall(.Last.value) # make reading easier
     P1 P2
   1 -2 -4
   2  0  0
   3  2  4
   4  1 -1
   5 -1  0
   6  0  1
That approach can make generalizations to more factors
or to smoothing approaches easier.
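
A sketch of both ideas on the data from the original question, assuming base R only. Looking the group means up by name with as.character() is a variant that is not from the thread; it is shown as one more way to avoid positional indexing when the group codes are numbers such as 101 and 102.

```r
# Raw data from the original question, plus the relabeled grouping column
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group = c("A", "A", "B", "A", "A", "B"),
                  P1 = c(10, 12, 14, 5, 3, 4),
                  P2 = c(8, 12, 16, 2, 3, 4))
RAW$E <- RAW$Experiment + 100

# tapply() names its result by group ("101", "102"); indexing with a
# character vector looks the means up by name instead of by position
means <- tapply(RAW$P1, RAW$E, mean)
by_name <- RAW$P1 - means[as.character(RAW$E)]

# Residuals of a one-factor linear model are the group-centered values
res <- residuals(lm(cbind(P1, P2) ~ factor(E), data = RAW))
zapsmall(res)
```

The linear-model form extends naturally: adding another factor to the right-hand side of the formula centers on the combination of effects instead of on a single grouping.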

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

 
 HTH,
 Daniel
 
 


Re: [R] grey colored lines and overwriting labels i qqplot2

2011-07-19 Thread Brian Diggs


On 7/18/2011 9:23 PM, Sigrid wrote:

Hi
I apologize for not providing reproducible code more clearly, and I hope
this will be more understandable.

I have 14 lines (7 per facet) that I would like to add. I will provide you
with six of the lines from the data, as that should be enough data to work
with and also results in less plotting for all of us. These values are from a
previously conducted ANCOVA, so they are not based on simple linear regression.

Line #  Country  Treatment  Intercept  Slope
1       Low      A          81.47      47.267
2       Low      B          31.809     20.234
3       Low      C          69.892     33.717
4       High     A          67.024     47.267
5       High     B          17.357     20.234
6       High     C          105.107    33.717


Is this (above) a data.frame?  If not, can you get it into one?  If so, 
then adding all the lines at once is easy.  Let's say that the data.frame 
is named lines. (Note that I changed the capitalization of country and 
treatment to match what was in test.)


 lines
  Line # country treatment Intercept  Slope
1      1     Low         A    81.470 47.267
2      2     Low         B    31.809 20.234
3      3     Low         C    69.892 33.717
4      4    High         A    67.024 47.267
5      5    High         B    17.357 20.234
6      6    High         C   105.107 33.717
 dput(lines)
structure(list(`Line #` = 1:6, country = structure(c(2L, 2L,
2L, 1L, 1L, 1L), .Label = c("High", "Low"), class = "factor"),
treatment = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A",
"B", "C"), class = "factor"), Intercept = c(81.47, 31.809,
69.892, 67.024, 17.357, 105.107), Slope = c(47.267, 20.234,
33.717, 47.267, 20.234, 33.717)), .Names = c("Line #", "country",
"treatment", "Intercept", "Slope"), class = "data.frame", row.names = c(NA,
-6L))



From the help that I got here, i was able to make the plot I wanted.

ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
  geom_point(aes(shape = treatment)) +
  facet_wrap(~country) +
  scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
    labels=c('label A','label B','label C','label D',
    'label E','label F','label G')) +
  scale_shape_manual(breaks=c('A','B','C','D','E','F','G'),
    labels=c('label A','label B','label C','label D',
    'label E','label F','label G'),
    values = c(0, 1, 2, 3, 4, 5, 6)) +
  scale_y_continuous("number of votes") +
  scale_x_continuous("Years", breaks=1:4) +
  theme_bw()+


You can just add

geom_abline(aes(intercept = Intercept, slope = Slope, colour = 
treatment), data = lines)


This says to use the data from the lines data.frame, plotting a line for 
each row of the data set.  The line will be colored based on the value 
of the treatment variable (with the mapping defined the same as for the 
points). The lines will also be faceted according to country (the 
facet_wrap affects all geoms).



And I added line #1 and # 4 using the abline command.

+geom_abline(intercept = 81.47, slope=47.267, colour = "black", size = 0.5,
subset = .(country == 'low'))+ geom_abline(intercept = 67.024, slope=47.267,
colour = "grey", size = 0.5, subset = .(country== 'high'))

How can I make the lines correspond with the descriptions on the right side
of the graph more clearly?

--
View this message in context: 
http://r.789695.n4.nabble.com/grey-colored-lines-and-overwriting-labels-i-qqplot2-tp3657119p3677248.html
Sent from the R help mailing list archive at Nabble.com.




--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


Re: [R] Centering data frame by factor



On Jul 19, 2011, at 11:58 AM, William Dunlap wrote:




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of Daniel Malter

Sent: Tuesday, July 19, 2011 1:51 AM
To: r-help@r-project.org
Subject: Re: [R] Centering data frame by factor


P1-tapply(P1,Experiment,mean)[Experiment]


Note that the above solution works in this example
because Experiment takes the values 1 and 2.  If
Experiment were coded as, say, 101 and 102 the above
would not work.  This is a case where converting
Experiment to a factor would avoid problems.


I checked to see if my ave solution was subject to the same caveats  
and it is not. The help page is less categorical about what the  
grouping variables' structure should be, saying only that they are  
typically factors.



 E.g.,
RAW <- data.frame(Experiment=c(2,2,2,1,1,1),Group=c("B","A","B","B","A","B"),P1=c(-2,0,2,1,-1,0),P2=c(-4,0,4,-1,0,1))

RAW$E <- RAW$Experiment + 100 # relabeled Experiment
with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good

  2  2  2  1  1  1
 -2  0  2  1 -1  0

with(RAW, P1-tapply(P1,E,mean)[E]) # bad

 <NA> <NA> <NA> <NA> <NA> <NA>
   NA   NA   NA   NA   NA   NA


with(RAW, ave(P1, E, FUN=function(x) scale(x,  scale=FALSE) ) )
# [1] -2  0  2  1 -1  0   good



RAW$E <- factor(RAW$E) # convert to factor
with(RAW, P1-tapply(P1,E,mean)[E]) # good

 102 102 102 101 101 101
  -2   0   2   1  -1   0


And take note that Bill made his variable a factor outside the tapply  
environment. If he had just used it in the tapply function (as I often  
do ...possibly unwisely in light of this gotcha)  it would fail:


 with(RAW, P1-tapply(P1, factor(E), mean)[E])
<NA> <NA> <NA> <NA> <NA> <NA>
  NA   NA   NA   NA   NA   NA

... that is unless you also use factor(E) as the index:

 with(RAW, P1-tapply(P1, factor(E), mean)[factor(E)])
102 102 102 101 101 101
 -2   0   2   1  -1   0

Thanks, Bill. I've learned a lot of R from you.

--
David.



Another way to approach the problem is to think of
your normalized data as the residuals from a linear model:

residuals(lm(data=RAW, cbind(P1,P2) ~ E))

             P1            P2
1     -2.00e+00     -4.00e+00
2  4.385598e-17  8.771196e-17
3      2.00e+00      4.00e+00
4      1.00e+00     -1.00e+00
5     -1.00e+00  8.771196e-17
6  4.385598e-17      1.00e+00

zapsmall(.Last.value) # make reading easier

  P1 P2
1 -2 -4
2  0  0
3  2  4
4  1 -1
5 -1  0
6  0  1
That approach can make generalizations to more factors
or to smoothing approaches easier.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



HTH,
Daniel






David Winsemius, MD
West Hartford, CT


Re: [R] comparing SAS and R survival analysis with time-dependent covariates

2011-07-19 Thread AO_Statistics


Terry Therneau-2 wrote:
 
 This query of why do SAS and S give different answers for Cox models
 comes 
 up every so often.  The two most common reasons are that
   a. they are using different options for the ties
   b. the SAS and S data sets are slightly different.
 You have both errors.
 
 First, make sure I have the same data set by reading a common file, and
 then
 compare the results.
 
 tmt54% more sdata.txt
  1   0.0  0.5 0   0
  1   0.5  3.0 1   1
  2   0.0  1.0 0   0
  2   1.0  1.5 1   1
  3   0.0  6.0 0   0
  4   0.0  8.0 0   1
  5   0.0  1.0 0   0
  5   1.0  8.0 1   0
  6   0.0 21.0 0   1
  7   0.0  3.0 0   0
  7   3.0 11.0 1   1
 
 tmt55% more test.sas
 options linesize=80;
 
 data trythis;
 infile 'sdata.txt';
 input id start end delir outcome;
 
 proc phreg data=trythis;
   model (start, end)*outcome(0)=delir/ ties=discrete;
 
 proc phreg data=trythis;
   model (start, end)*outcome(0)=delir/ ties=efron;
 
 
 tmt56% more test.r
 trythis <- read.table('sdata.txt',
   col.names=c("id", "start", "end", "delir", "outcome"))
 
 coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
 coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')
 
 -
  I now get comparable answers.  Note that Cox's exact partial likelihood
 is 
 the correct form to use for discrete time data.  I labeled this as the
 'exact' 
 method and SAS as the 'discrete' method.  The exact marginal likelihood
 of 
 Prentice et al, which SAS calls the 'exact' method is not implemented in
 S.
  
   As to which package is more reliable, I can only point to a set of
 formal test 
 cases that are found in Appendix E of the book by Therneau and Grambsch.  
 
 [...]
 
 


I am estimating regression parameters in the Cox model for
recurrent event data with time-dependent covariates. As my data sets contain
a lot of ties, I use the "discrete" method of SAS (PHREG procedure) and the
"exact" option in R (coxph function of the survival package).

Despite the high computation time (up to 45s), I always get estimates
without any error or warning message from the PHREG procedure.
On the other hand, when I use R (latest version 2.13.11 on 32 or 64
bits), I sometimes get estimates that differ from those obtained with SAS,
along with various warnings. At other times I get no result at all: R
freezes and does not respond.

To understand this, I ran some tests based on your examples. It turns
out that the problems appear when the proportion of ties becomes large:

 Test1 

With R :

 (trythis <- read.table('***\\sdata4.txt',
   col.names=c("id", "start", "end", "delir", "outcome")))
   id start end delir outcome
1   1   0.5 3.0 1   1
2   1   0.5 3.0 1   1
3   1   0.5 3.0 1   1
4   1   0.5 3.0 1   1
5   1   0.5 3.0 1   1
6   2   1.0 1.5 1   1
7   2   1.0 1.5 1   1
8   2   1.0 1.5 1   1
9   2   1.0 1.5 1   1
10  2   1.0 1.5 1   1
11  2   1.0 1.5 1   1
12  2   1.0 1.5 1   1
13  2   1.0 1.5 1   1
14  4   0.0 8.0 0   1
15  4   0.0 8.0 0   1
16  4   0.0 8.0 0   1
17  4   0.0 8.0 0   1
18  5   0.0 1.0 0   0
19  5   0.0 1.0 0   0
20  5   0.0 1.0 0   0
21  5   0.0 1.0 0   0
> coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact')
Call:
coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, 
    method = "exact")


      coef exp(coef) se(coef)       z p
delir 22.5  6.06e+09    15460 0.00146 1

Likelihood ratio test=15.6  on 1 df, p=8.04e-05  n= 21, number of events= 17 
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Ran out of iterations and did not converge

With SAS :

data trythis ;
input id start end delir outcome;
datalines;
 1   0.5  3.0 1   1
 1   0.5  3.0 1   1
 1   0.5  3.0 1   1
 1   0.5  3.0 1   1
 1   0.5  3.0 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 2   1.0  1.5 1   1
 4   0.0  8.0 0   1
 4   0.0  8.0 0   1
 4   0.0  8.0 0   1
 4   0.0  8.0 0   1
 5   0.0  1.0 0   0
 5   0.0  1.0 0   0
 5   0.0  1.0 0   0
 5   0.0  1.0 0   0
run;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
RUN;

No error message, results :

estimate delir : 20.52466
se : 5689
Pr > ChiSq : 0.9971
convergence status : Convergence criterion (GCONV=1E-8) satisfied.

 Test2 

With R :

> (trythis <- read.table('***\\montest.txt',
+   col.names=c("id", "start", "end", "delir",

Re: [R] barplot question

2011-07-19 Thread Sally_roman

In my first post is example data.

--
View this message in context: 
http://r.789695.n4.nabble.com/barplot-question-tp3670861p3678402.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] why I could not reproduce the Mandelbrot plot demonstrated on R wiki

2011-07-19 Thread belisario

I had the same problem with the Wikipedia code on 64-bit Windows 7.

The GIF comes out fine if I use the executable located in bin\x64
instead of the one in bin, which produces the ugly GIF,

e.g., in a cmd window: the_R_path\bin\x64\Rscript.exe example.r

I found the solution by using the --verbose flag.

Sorry about my English.


--
View this message in context: 
http://r.789695.n4.nabble.com/why-I-could-not-reproduce-the-Mandelbrot-plot-demonstrated-on-R-wiki-tp2591429p3678522.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Plotting intraday data in quantmod

2011-07-19 Thread kev946

I'm using this to plot the data with success, but am unable to figure out how
to get the H:M:S timestamp included with their respective Dates. Any
suggestions?

xts(Dataset[,-1],as.Date(Dataset[,1],"%Y-%m-%d"))

--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-intraday-data-in-quantmod-tp3677268p3678573.html
Sent from the R help mailing list archive at Nabble.com.


[R] How to get predicted values of y for different x values?

2011-07-19 Thread halptekin

Here is my model with interaction terms and control variables (I changed
variables names for easy read): 

reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6; x2 from 0 to 5; and x3 from 0 to 4. All three are
discrete ordinal variables; but I will treat them as continuous variables. 

(a) How can I see the predicted values of y for each of these scenarios (210
scenarios I guess)? 
(b) How can I see the predicted value of y for the minimum and maximum
values of x1, x2, and x3 (8 scenarios)? 
(c) How can I see the predicted value of y for x1=6; x2=5; and x3=4 (1
scenario)? 

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-get-predicted-values-of-y-for-different-x-values-tp3678662p3678662.html
Sent from the R help mailing list archive at Nabble.com.


[R] Stacked Bar Plot in ggplot2

2011-07-19 Thread Abraham Mathew

I'm trying to develop a stacked bar plot in R with ggplot2.

My data:

conv = c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33)
click = c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30)
date = c("July 7", "July 8", "July 9", "July 10", "July 11", "July 12",
"July 13", "July 14", "July 15", "July 16", "July 17")

dat <- data.frame(date=c(date), click=c(click), conv=c(conv),
stringsAsFactors = FALSE)
dat


I'm trying to create a stacked bar plot with the values for clicks in the
background and the values for conversions in the foreground. I tried the
following, but because the values aren't factors, it doesn't produce the
right result.

p3 = ggplot(dat, aes(as.character(date))) +
geom_bar(aes(fill=as.factor(conv))) + ylim(c(0,70)) +
geom_bar(aes(fill = conv), position = 'fill')
p3

Help!


Re: [R] Plotting intraday data in quantmod

2011-07-19 Thread Joshua Ulrich

On Tue, Jul 19, 2011 at 11:31 AM, kev946 klee...@gmail.com wrote:
 I'm using this to plot the data with success, but am unable to figure out how
 to get the H:M:S timestamp included with their respective Dates. Any
 suggestions?

 xts(Dataset[,-1],as.Date(Dataset[,1],"%Y-%m-%d"))

This is not a plot command.  What plot command are you using?

Dates don't have H:M:S; they're all zero.  You don't provide a
sample of Dataset, so we don't know what your data look like.

Best,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com
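
To illustrate the point that a Date carries no time of day, here is a minimal base-R sketch. The timestamp string and variable names are made up for illustration; the actual Dataset is not shown in the thread, so the format string is an assumption.

```r
# Hypothetical timestamp; the real Dataset column may be formatted differently.
stamp <- "2011-07-19 09:30:15"

# as.Date() keeps only the calendar date; the time of day is discarded.
d <- as.Date(stamp, format = "%Y-%m-%d")

# as.POSIXct() keeps the full date-time (time zone fixed to UTC here).
p <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(format(d))
print(format(p, "%Y-%m-%d %H:%M:%S"))
```

If the first column does contain times, an index built with as.POSIXct() instead of as.Date() could be passed as the second argument to xts().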


Re: [R] barplot question

2011-07-19 Thread Sarah Goslee

People who participate in this list via email are unlikely to have
your example data, or even to have any idea what you are currently
referring to.

Please leave enough of the previous messages in your reply to the list
to provide context, and include all necessary information. Don't
assume that everyone has all previous messages in a thread immediately
to hand.

Sarah

On Tue, Jul 19, 2011 at 11:24 AM, Sally_roman sro...@umassd.edu wrote:
 In my first post is example data.

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/barplot-question-tp3670861p3678402.html
 Sent from the R help mailing list archive at Nabble.com.

Nabble.com is not the main way in which people participate in this list.

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] How to get predicted values of y for different x values?

2011-07-19 Thread Jeff Newmiller

?predict

Use data.frame() to generate input vectors (newdata) for which you want 
predicted values.
---
Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

halptekin halpte...@gmail.com wrote:

Here is my model with interaction terms and control variables (I changed
variables names for easy read): 

reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6; x2 from 0 to 5; and x3 from 0 to 4. All three are
discrete ordinal variables; but I will treat them as continuous variables. 

(a) How can I see the predicted values of y for each of these scenarios (210
scenarios I guess)? 
(b) How can I see the predicted value of y for the minimum and maximum
values of x1, x2, and x3 (8 scenarios)? 
(c) How can I see the predicted value of y for x1=6; x2=5; and x3=4 (1
scenario)? 


Re: [R] How to get predicted values of y for different x values?



On Jul 19, 2011, at 2:36 PM, Jeff Newmiller wrote:


?predict

Use data.frame() to generate input vectors (newdata) for which you  
want predicted values.


The OP probably needs to use expand.grid to generate the spanning  
combinations of x values. He will in addition need to include values  
for control variables.


dfrm <- expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4)
dfrm$control1 <- ...  # some value of the same class as the original control1 data
# repeat for the other two control variables

--
David.



David Winsemius, MD
West Hartford, CT
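
The two suggestions above (predict() with newdata, and expand.grid() for the combinations) can be put together roughly as follows. This is only a sketch: the data are simulated because the original data are not shown, the helper add_controls() is hypothetical, and holding the controls at their sample means is an assumption, not something stated in the thread.

```r
# Simulated stand-in data; the original poster's data are not available.
set.seed(1)
n <- 200
dat <- data.frame(
  x1 = sample(0:6, n, replace = TRUE),
  x2 = sample(0:5, n, replace = TRUE),
  x3 = sample(0:4, n, replace = TRUE),
  control1 = rnorm(n),
  control2 = rnorm(n),
  control3 = rnorm(n)
)
dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x2 + 0.2 * dat$x3 + dat$control1 + rnorm(n)

reg1 <- lm(y ~ x1 * x2 * x3 + control1 + control2 + control3, data = dat)

# Hypothetical helper: hold the control variables at their sample means.
add_controls <- function(grid) {
  grid$control1 <- mean(dat$control1)
  grid$control2 <- mean(dat$control2)
  grid$control3 <- mean(dat$control3)
  grid
}

# (a) all 7 * 6 * 5 = 210 combinations
grid_all <- add_controls(expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4))
grid_all$pred <- unname(predict(reg1, newdata = grid_all))

# (b) the 2 * 2 * 2 = 8 combinations of minimum and maximum values
grid_ext <- add_controls(expand.grid(x1 = c(0, 6), x2 = c(0, 5), x3 = c(0, 4)))
grid_ext$pred <- unname(predict(reg1, newdata = grid_ext))

# (c) the single scenario x1 = 6, x2 = 5, x3 = 4
one <- add_controls(data.frame(x1 = 6, x2 = 5, x3 = 4))
one$pred <- unname(predict(reg1, newdata = one))
```

Any other fixed values for the controls would work the same way, as long as every variable in the model formula appears in newdata.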


[R] calculating the mean of a random matrix (by row) and some general questions

2011-07-19 Thread RichardLang

Hi everyone!

I'm trying to teach myself R in order to do some data analysis. I'm a
mathematics student and (only) familiar with Matlab and LaTeX. I'm working
through the official introduction to R at the moment, while simultaneously
solving some exercises I found on the web. Before I post my (probably
stupid) question, I'd like to ask you for some general advice. How do you
work with R? Is it like in Matlab, where you write your functions with a lot
of loops etc. in a text file and then run it? Or do you just prepare your
data and then use the functions provided by R (plot, mean etc.) to get some
analysis? I'd be very thankful for some of your thoughts about
approaches.

Now the question: I'm trying to build a vector with n entries, each
consisting of the mean of m random numbers (exponentially distributed, for
example). My approach was to construct an n x m random matrix and then to
somehow take the mean of each row. But the mean function has no
parameter to do this, so the intended approach in R is probably different.
Any ideas? =)

Richard

--
View this message in context:
http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3678964.html
Sent from the R help mailing list archive at Nabble.com.


[R] negative binomial regression with spatial weights matrix (not locations)

2011-07-19 Thread ChristyM

Hello, 
I have some data that exhibits a negative binomial distribution and also
spatial structure. I would like to create a model that accounts for both.
However, instead of locations, I have a distance matrix (cost matrix)
describing the spatial relationships among the locations. I have tried using
various methods (spBayes,geoRglm,geoR), but none incorporate both a negative
binomial distribution and distance matrix. Many methods use the geodata
object, which uses coordinates. 

Please provide suggestions as to what approach I can use to analyze this
data. 

Thank you!!

Christy M. 

--
View this message in context: 
http://r.789695.n4.nabble.com/negative-binomial-regression-with-spatial-weights-matrix-not-locations-tp3679071p3679071.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Stacked Bar Plot in ggplot2

2011-07-19 Thread Justin

Abraham Mathew abraham at thisorthat.com writes:

 
 I'm trying to develop a stacked bar plot in R with ggplot2.
 
 My data:
 
 conv = c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33)
 click = c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30)
 date = c("July 7", "July 8", "July 9", "July 10", "July 11", "July 12",
 "July 13", "July 14", "July 15", "July 16", "July 17")
 
 dat <- data.frame(date=c(date), click=c(click), conv=c(conv),
 stringsAsFactors = FALSE)


Is: 
 
ggplot(dat,aes(x=date))+geom_bar(aes(y=click),fill='red')+geom_bar(aes(y=conv),fill='blue')

what you're looking for?

or 

dat.melt <- melt(dat,'date')
ggplot(dat.melt,aes(x=date,y=value,fill=variable))+geom_bar()


Justin
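
For readers unfamiliar with melt(): it is not part of base R (it comes from the reshape and reshape2 packages) and turns the wide data frame into "long" form, one row per date and variable. The sketch below builds an equivalent long table by hand in base R, just to show the shape that the second ggplot call expects; the name dat.long is made up for illustration.

```r
conv  <- c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33)
click <- c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30)
date  <- paste("July", 7:17)

dat <- data.frame(date = date, click = click, conv = conv,
                  stringsAsFactors = FALSE)

# Stack the two measure columns into one 'value' column;
# 'variable' records which column each value came from.
dat.long <- data.frame(
  date     = rep(dat$date, times = 2),
  variable = rep(c("click", "conv"), each = nrow(dat)),
  value    = c(dat$click, dat$conv),
  stringsAsFactors = FALSE
)

print(head(dat.long, 3))
```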


[R] notation question

2011-07-19 Thread Carson Farmer

Dear list, I am currently writing up some of my R models in a more
formal sense for a paper, and I am having trouble with the notation.
Although this isn't really an 'R' question, it should help me to
understand a bit better what I am actually doing when fitting my
models!

Using the analysis of covariance example from MASS (fourth edition, p.
142), what is the correct notation for the formula "Gas ~ Insul/Temp
- 1"? Obviously, if we fit it as two separate models (as in the
example above it), we would have something like y_i = \beta x_i for
each of the two models. So my question is, when we have a single model
with a k-level factor interaction term as in the formula above, what
is the correct/standard statistical (LaTeX style) notation?

Cheers,

Carson
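
One possible way to write this down, offered as a sketch rather than as the standard notation: in R's formula language, Insul/Temp - 1 expands to Insul + Insul:Temp - 1, that is, a separate intercept and a separate Temp slope for each level of Insul, with a single common error variance. The symbols \alpha, \beta and the index function j(i) are notation chosen here for illustration; y_i is Gas, x_i is Temp, and j(i) is the level of Insul for observation i. The normal error term reflects the usual linear-model assumptions.

```latex
% Gas ~ Insul/Temp - 1  is equivalent to  Insul + Insul:Temp - 1:
% one intercept and one slope per level of the k-level factor.
y_i = \alpha_{j(i)} + \beta_{j(i)}\, x_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2),
\qquad j(i) \in \{1, \dots, k\}

% The same model written with indicator functions:
y_i = \sum_{j=1}^{k} \mathbf{1}[\, j(i) = j \,]\,
      \bigl( \alpha_j + \beta_j x_i \bigr) + \varepsilon_i
```

The difference from fitting k separate regressions is that the single model shares one \sigma^2 across all levels, whereas separate fits estimate one variance per level.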


Re: [R] calculating the mean of a random matrix (by row) and some general questions

2011-07-19 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of RichardLang
 Sent: Tuesday, July 19, 2011 11:44 AM
 To: r-help@r-project.org
 Subject: [R] calculating the mean of a random matrix (by row) and some
 general questions

 Hi everyone!

 I'm trying to teach myself R in order to do some data analysis. I'm a
 mathematics student and (only) familiar with Matlab and LaTeX. I'm
 working through the official introduction to R at the moment, while
 simultaneously solving some exercises I found on the web. Before I post
 my (probably stupid) question, I'd like to ask you for some general
 advice. How do you work with R? Is it like in Matlab, where you write
 your functions with a lot of loops etc. in a text file and then run it?
 Or do you just prepare your data and then use the functions provided by
 R (plot, mean etc.) to get some analysis? I'd be very thankful for some
 of your thoughts about approaches.

 Now the question: I'm trying to build a vector with n entries, each
 consisting of the mean of m random numbers (exponentially distributed,
 for example). My approach was to construct an n x m random matrix and
 then to somehow take the mean of each row. But the mean function has no
 parameter to do this, so the intended approach in R is probably
 different. Any ideas? =)

 Richard

Richard,

If you have a matrix, M, with n rows and m columns, you can use the apply() 
function to get either row or column means

> n <- 10
> m <- 3
> M <- matrix(rnorm(m*n),n,m)
> M
            [,1]        [,2]       [,3]
 [1,]  0.6239267 -0.70546496  0.3682918
 [2,] -0.7326689 -1.86571052 -0.2899552
 [3,]  0.7778313 -1.01227191  0.7735718
 [4,]  0.8336683 -0.07755214 -0.1375798
 [5,] -1.6134414  0.12088648 -0.4064939
 [6,] -0.2578007  0.45142456 -1.0197297
 [7,]  1.0108260 -0.24933408 -0.4083304
 [8,] -0.7936603 -0.67286769 -0.8666802
 [9,]  1.0054039  2.52498995  1.0915742
[10,] -0.1610073  0.43504924  2.4288474
> rowMeans <- apply(M,1,mean)
> rowMeans
 [1]  0.09558452 -0.96277820  0.17971042  0.20617876 -0.63301628 -0.27536860
 [7]  0.11772050 -0.77773606  1.54065601  0.90096312
> colMeans <- apply(M,2,mean)
> colMeans
[1]  0.06930777 -0.10508511  0.15335160

I will let others describe how they use R.

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: [R] calculating the mean of a random matrix (by row) and some general questions

On Jul 19, 2011, at 2:43 PM, RichardLang wrote:

Hi everyone!

solving some exercises I found on the web. Before I post my (probably
stupid) question, I'd like to ask you for some general advice. How
do you work with R? Is it like in Matlab, where you write your functions
with a lot of loops etc. in a text file and then run it? Or do you just
prepare your data and then use the functions provided by R (plot, mean
etc.) to get some analysis? I'd be very thankful for some of your
thoughts about approaches.

In R one generally tries to avoid loops. There are many functions that
accept vector and matrix arguments and return similarly structured
results. My impression was that this was also true in Matlab and that
many of the loop-less (the R term is "vectorized") constructs were
very similar. Matrix math and all that jazz.

Now the question: I'm trying to build a vector with n entries, each
consisting of the mean of m random numbers (exponentially distributed,
for example). My approach was to construct an n x m random matrix and
then to somehow take the mean of each row. But the mean function has
no parameter to do this, so the intended approach in R is probably
different. Any ideas? =)

Generally, when performing a repeated task involving random simulation,
one turns to the replicate function. To generate 5 means of random,
exponentially distributed vectors of length 10:

replicate(5, mean( rexp(10, rate= 1.5) ) )
[1] 1.0786732 0.7196179 0.7179612 0.5024858 0.4624592

--
David.

Richard

David Winsemius, MD
West Hartford, CT

[R] Questions about DCC-GARCH Model

2011-07-19 Thread zoe_zhang

Dear list members, 
I'm trying to use a DCC-GARCH model to estimate correlations. I have
downloaded the ccgarch package but can't understand one argument of the
estimation function,

dcc.estimation(inia, iniA, iniB, ini.dcc, dvar, model, method="BFGS",
gradient=1, message=1)

as shown in the R help page.
I understand the others except ini.dcc, which is described as "a vector of
initial values for the DCC parameters (2 × 1)".
Does anyone know what that means?
In the example it is set to c(0.01,0.98), but I have no idea where these
numbers come from.

Another question is how to estimate a DCC model with a known structural
break. Let's say we already know the break point A and set up a restricted
model, use this model to estimate the DCC, and then do a likelihood ratio
test against the unrestricted model (no structural break) to see whether
the break is significant. But I have no idea how to set up the restricted
model. I know that in the restricted model I have to use different rho
(rho(1~A) and rho(A~n) instead of rho(1~n)), but how do I specify this
difference in R?

I can't do it by myself. Does anyone have any idea? I appreciate it! 
Thank you in advance! 

Zoe


--
View this message in context: 
http://r.789695.n4.nabble.com/Questions-about-DCC-GARCH-Model-tp3679130p3679130.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] calculating the mean of a random matrix (by row) and some general questions

2011-07-19 Thread RichardLang

Thanks for your advice!

I found another method to solve my problem (on page 20 of the manual =) ).
Here's my code

# Set n and m to whatever you want
n = 25
m = 10

# Build a random vector (here with exponential distribution) of length n*m
x = rexp(n*m,1)

# Set up a factor in order to group x into pieces each of size ten
y = factor( rep(1:n, each = m) )

# tapply will apply mean to each group..
result = tapply(x,y,mean)

--
View this message in context: 
http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3679186.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] calculating the mean of a random matrix (by row) and some general questions



On Jul 19, 2011, at 4:03 PM, RichardLang wrote:


Thanks for your advice!

I found another method to solve my problem (on page 20 of the manual  
=) ).

Here's my code


On r-help, most of the readers and most of the regular responders are
reading this as a mailing list posting, not on Nabble, and it looks
like you are responding to yourself. I remember what I wrote, but the
rest of the readers probably don't, and no one can tell whether you are
responding to me or to Nordlund. The Posting Guide asks you to
provide context, and that's not hard to do in Nabble.




# Set n and m to whatever you want
n = 25
m = 10

# Build a random vector (here with exponential distribution) of length n*m
x = rexp(n*m,1)

# Set up a factor in order to group x into pieces each of size ten
y = factor( rep(1:n, each = m) )

# tapply will apply mean to each group..
result = tapply(x,y,mean)



There are many ways to skin that cat:

Try this:

rowMeans( matrix( rexp(n*m,1), ncol=10) )

This will be much faster than Nordlund's answer on bigger tasks,  
because the row/colMeans and row/colSums functions are tweaked to be  
fast, whereas there is a lot of overhead with the apply functions.


--

David Winsemius, MD
West Hartford, CT
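
As a quick check that the three approaches mentioned in this thread (apply, tapply on a grouping factor, and rowMeans) agree, here is a small sketch. The matrix is filled row by row so that row i of M holds exactly the values that the factor assigns to group i; the seed and variable names are arbitrary choices for illustration.

```r
set.seed(42)
n <- 25
m <- 10

# Row i of M holds draws ((i - 1) * m + 1):(i * m) of x.
x <- rexp(n * m, rate = 1)
M <- matrix(x, nrow = n, ncol = m, byrow = TRUE)

by_apply    <- apply(M, 1, mean)
by_rowMeans <- rowMeans(M)
by_tapply   <- tapply(x, factor(rep(1:n, each = m)), mean)

print(all.equal(by_apply, by_rowMeans))
print(all.equal(by_rowMeans, as.vector(by_tapply)))
```

Which of the three is fastest depends on the size of the problem; system.time() can be used to compare them on your own data.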


Re: [R] calculating the mean of a random matrix (by row) and some general questions

2011-07-19 Thread Peter Lomas

Hi Richard,

As others have said, try to use the apply functions rather than loops.
There is also an apply function for lists, see ?lapply.  This is much more
efficient.  I also like writing my own functions.  For example:

f <- function(x) {
   x^2
}

Which can then be used by:
> f(2)
[1] 4

This is very useful if you're getting into maximum likelihood programming,
or want to use the optim function (for multivariate functions) or
optimize (for univariate functions).

Lastly, check out the R reference card.
http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Regards,
Peter
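
A minimal sketch of passing your own function to optimize() and optim(), as mentioned above. The two objective functions are hypothetical and chosen only because their minima are known in advance.

```r
# Hypothetical univariate objective with its minimum at x = 2.
f <- function(x) (x - 2)^2

# One-dimensional minimisation over the interval [0, 5].
res1 <- optimize(f, interval = c(0, 5))

# Hypothetical two-parameter objective with its minimum at (1, -3).
g <- function(p) (p[1] - 1)^2 + (p[2] + 3)^2

# optim() uses the Nelder-Mead method by default.
res2 <- optim(par = c(0, 0), fn = g)

print(res1$minimum)
print(res2$par)
```

For maximum likelihood, the usual approach is to pass the negative log-likelihood as the function to be minimised.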

On Tue, Jul 19, 2011 at 12:43, RichardLang l...@zedat.fu-berlin.de wrote:

 Hi everyone!

 I'm trying to teach myself R in order to do some data analysis. I'm a
 mathematics student and (only) familiar with Matlab and LaTeX. I'm working
 through the official introduction to R at the moment, while simultaneously
 solving some exercises I found on the web. Before I post my (probably
 stupid) question, I'd like to ask you for some general advice. How do you
 work with R? Is it like in Matlab, where you write your functions with a lot
 of loops etc. in a text file and then run it? Or do you just prepare your
 data and then use the functions provided by R (plot, mean etc.) to get some
 analysis? I'd be very thankful for some of your thoughts about
 approaches.

 Now the question: I'm trying to build a vector with n entries, each
 consisting of the mean of m random numbers (exponentially distributed, for
 example). My approach was to construct an n x m random matrix and then to
 somehow take the mean of each row. But the mean function has no
 parameter to do this, so the intended approach in R is probably different.
 Any ideas? =)

 Richard


Re: [R] R on a server (Windows Server 2008)

Thanks a lot, Roger!
Dimitri

On Tue, Jul 19, 2011 at 11:00 AM, Bos, Roger roger@rothschild.com wrote:
 Yes. I have it running on a win server 2008 machine with no problems.

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Dimitri Liakhovitski
 Sent: Monday, July 18, 2011 7:19 PM
 To: r-help
 Subject: [R] R on a server (Windows Server 2008)

 Apologies for a naive question: Can R be installed and run on a server
 (operating system Windows Server 2008)?
 Thank you!

 --
 Dimitri Liakhovitski





-- 
Dimitri Liakhovitski
marketfusionanalytics.com


[R] calculating mean excluding zeros

Sorry if it's been discussed before - don't seem to find it.
I'd like to calculate a mean while ignoring zeros.
mean doesn't seem to have an option for that.
Any other function/package that could do it?

Thanks for a pointer!

-- 
Dimitri Liakhovitski
marketfusionanalytics.com


[R] timeDate with month designated by three letters.

2011-07-19 Thread mdkzone

Dear R Experts:


   I am trying to convert a date and time character field to timeDate where the 
month is presented as three letters, such as JUN for June, etc. 


This is an example of the full character field:


04-MAY-11 1428
What is the proper format syntax?


I've tried

timeDate("04-MAY-11 1428",format="%d-%m-%y %H%M")
but only get

GMT
[1] [NA]
If I change the month to a number as below, then it works, but that would 
require recoding of the data field.



timeDate("04-05-11 1428",format="%d-%m-%y %H%M")

gives

GMT
[1] [2011-05-04 14:28:00]
which is correct.  How do I get R to recognize the month as a 3 letter 
designator.


Any recommendations you can provide would be greatly appreciated.


Michael
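
A sketch of the likely cause, using base R only: in strptime-style format strings, %m expects a month number, while %b matches an abbreviated month name (case-insensitively on input, in the current locale). The example below sets the time locale to "C" so that the English abbreviations are used. Whether timeDate() accepts the same format string has not been checked here; that part is an assumption.

```r
# Use the C locale so that month abbreviations are the English ones.
Sys.setlocale("LC_TIME", "C")

x <- "04-MAY-11 1428"

# %b = abbreviated month name; %m would expect a month number instead.
parsed <- as.POSIXct(x, format = "%d-%b-%y %H%M", tz = "GMT")

print(format(parsed, "%Y-%m-%d %H:%M:%S"))
```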



Re: [R] calculating mean excluding zeros

2011-07-19 Thread Weidong Gu

You can do it by subsetting or indexing

> r <- c(0,0,0,rnorm(10,10,5))
> mean(r)
[1] 8.052215
> mean(r[r!=0])
[1] 10.46788

Weidong Gu

On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Sorry if it's been discussed before - don't seem to find it.
 I'd like to calculate a mean while ignoring zeros.
 mean doesn't seem to have an option for that.
 Any other function/package that could do it?

 Thanks for a pointer!

 --
 Dimitri Liakhovitski
 marketfusionanalytics.com


Re: [R] calculating mean excluding zeros

2011-07-19 Thread Sarah Goslee

In the more general case, that approach is prone to machine precision
error (FAQ 7.31).

Here's a clunky but safer alternative:

> set.seed(1234)
> testvec <- sample(0:10, 100, replace=TRUE)
> mean(testvec)
[1] 4.31
> mean(testvec[testvec != 0])
[1] 4.842697
> mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
[1] 4.842697


(Is there an elementwise equivalent to all.equal() that I'm missing?)

Sarah
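
On the elementwise question above: one common idiom (a sketch, not something proposed in this thread) is to compare absolute values against a tolerance, which is vectorised and avoids the sapply() call. The tolerance below is an assumption chosen to mirror the all.equal() default of about 1.5e-8, and the vector x is made up for illustration.

```r
# Made-up data: the second element is a floating-point residue, not an exact zero
x <- c(0, 0.1 + 0.2 - 0.3, 2, 4)

# Assumed tolerance, comparable to the all.equal() default (about 1.5e-8)
tol <- sqrt(.Machine$double.eps)

# Exact comparison treats the tiny residue as a non-zero value
mean_exact <- mean(x[x != 0])

# Tolerance-based comparison, vectorised over the whole vector
mean_tol <- mean(x[abs(x) > tol])
```

Which of the two is appropriate depends on whether near-zero values in the data are meant to count as zeros.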
On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:
 You can do it by subsetting or indexing

  r <- c(0,0,0,rnorm(10,10,5))
 mean(r)
 [1] 8.052215
 mean(r[r!=0])
 [1] 10.46788

 Weidong Gu

 On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Sorry if it's been discussed before - don't seem to find it.
 I'd like to calculate a mean while ignoring zeros.
 mean doesn't seem to have an option for that.
 Any other function/package that could do it?

 Thanks for a pointer!

 --
 Dimitri Liakhovitski
 marketfusionanalytics.com




-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] hold position of vertices constant in network {statnet}?

2011-07-19 Thread Matt Bakker

I am a novice with network functions! I have been exploring the network
function in the statnet package, but haven't been able to figure out
how to hold vertices in position while varying edge features. Can
anyone advise on whether this is possible, and if so, how to do it?
Thanks!
-- 
Matthew Bakker, Ph.D.
Department of Plant Pathology
University of Minnesota
495 Borlaug Hall
1991 Upper Buford Circle
Saint Paul, MN  55108 USA


Re: [R] How to get predicted values of y for different x values?

Please read the posting guide (requires a self-contained example of code) and
consult the help pages before posting. If you type ?predict.lm the help page
clearly states that the argument 'newdata' takes "[a]n optional data frame
in which to look for variables with which to predict..."

x1<-rnorm(100)
x2<-rnorm(100)
e<-rnorm(100)
y<-x1+x2+x1*x2+e
reg<-lm(y~x1*x2)
summary(reg)
predict(reg,newdata=data.frame(x1=2,x2=2))

HTH,
Daniel
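
The same 'newdata' idea extends to the three scenarios asked about below. This is only a sketch under stated assumptions: the data are simulated stand-ins, the control variables are left out (in the real model each control would also need a column in newdata, for example held at its mean), and the object names grid_all, grid_ext and pred_one are hypothetical.

```r
# Simulated stand-in data; the real data set is not shown in the thread
set.seed(42)
d <- data.frame(x1 = sample(0:6, 200, replace = TRUE),
                x2 = sample(0:5, 200, replace = TRUE),
                x3 = sample(0:4, 200, replace = TRUE))
d$y <- with(d, x1 + x2 + x3 + rnorm(200))
reg1 <- lm(y ~ x1 * x2 * x3, data = d)

# (a) every combination of x1, x2, x3: 7 * 6 * 5 = 210 rows
grid_all <- expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4)
pred_all <- predict(reg1, newdata = grid_all)

# (b) minimum and maximum of each variable: 2 * 2 * 2 = 8 rows
grid_ext <- expand.grid(x1 = c(0, 6), x2 = c(0, 5), x3 = c(0, 4))
pred_ext <- predict(reg1, newdata = grid_ext)

# (c) the single scenario x1 = 6, x2 = 5, x3 = 4
pred_one <- predict(reg1, newdata = data.frame(x1 = 6, x2 = 5, x3 = 4))
```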


halptekin wrote:
 
 Here is my model with interaction terms and control variables (I changed
 variables names for easy read): 
 
 reg1 <- lm(y ~ x1*x2*x3 +control1 + control2 + control3) 
 
 x1 ranges from 0 to 6; x2 from 0 to 5; and x3 from 0 to 4. All three are
 discrete ordinal variables; but I will treat them as continuous variables. 
 (a) How can I see the predicted values of y for each of these scenarios
 (210 y values I guess)? 
 (b) How can I see the predicted value of y for the minimum and maximum
 values of x1, x2, and x3 (8 y values)? 
 (c) How can I see the predicted value of y for x1=6; x2=5; and x3=4 (only
 one y value)?
 

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-get-predicted-values-of-y-for-different-x-values-tp3678658p3679351.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] timeDate with month designated by three letters.

2011-07-19 Thread Peter Alspach

Tena koe Michael

The help file for strptime suggests you should be using %b (three letter month) 
rather than %m (decimal number month).

HTH 

Peter Alspach
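
A minimal sketch of the %b suggestion, using base R's strptime() rather than timeDate. It assumes an English (or C) locale, since month-name matching is locale dependent; the tz = "GMT" argument and the variable names are choices made for this illustration.

```r
# Example string from the original question
x <- "04-MAY-11 1428"

# %b matches the abbreviated month name (case-insensitively on input)
parsed <- strptime(x, format = "%d-%b-%y %H%M", tz = "GMT")

# Render in ISO style to check the result
iso <- format(parsed, "%Y-%m-%d %H:%M:%S")
```

The same format string should be usable wherever a strptime-style format is accepted, though that has not been checked here for timeDate itself.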

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of mdkz...@aol.com
 Sent: Wednesday, 20 July 2011 8:39 a.m.
 To: R-help@r-project.org
 Subject: [R] timeDate with month designated by three letters.
 
 Dear R Experts:
 
 
I am trying to convert a date and time character field to timeDate
 where the month is presented as three letters, such as JUN for June,
 etc.
 
 
 This is an example of the full character field:
 
 
 04-MAY-11 1428
 What is the proper format syntax?
 
 
 I've tried
 
 timeDate("04-MAY-11 1428",format="%d-%m-%y %H%M")
 but only get
 
 GMT
 [1] [NA]
 If I change the month to a number as below, then it works, but that
 would require recoding of the data field.
 
 
 
 timeDate("04-05-11 1428",format="%d-%m-%y %H%M")
 
 gives
 
 GMT
 [1] [2011-05-04 14:28:00]
 which is correct.  How do I get R to recognize the month as a 3 letter
 designator.
 
 
 Any recommendations you can provide would be greatly appreciated.
 
 
 Michael
 
 

Re: [R] calculating mean excluding zeros

Thanks a lot, Sarah.
I assume, if the values against which I am comparing are REALLY zero
(0) - then even the first one (mean(testvec[testvec != 0])) should
work, right?
Dimitri

On Tue, Jul 19, 2011 at 4:56 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 In the more general case, that approach is prone to machine precision
 error (FAQ 7.31).

 Here's a clunky but safer alternative:

 set.seed(1234)
 testvec <- sample(0:10, 100, replace=TRUE)
 mean(testvec)
 [1] 4.31
 mean(testvec[testvec != 0])
 [1] 4.842697
 mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
 [1] 4.842697


 (Is there an elementwise equivalent to all.equal() that I'm missing?)

 Sarah
 On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:
 You can do it by subsetting or indexing

  r <- c(0,0,0,rnorm(10,10,5))
 mean(r)
 [1] 8.052215
 mean(r[r!=0])
 [1] 10.46788

 Weidong Gu

 On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Sorry if it's been discussed before - don't seem to find it.
 I'd like to calculate a mean while ignoring zeros.
 mean doesn't seem to have an option for that.
 Any other function/package that could do it?

 Thanks for a pointer!

 --
 Dimitri Liakhovitski
 marketfusionanalytics.com




 --
 Sarah Goslee
 http://www.functionaldiversity.org




-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] loops and simulation

I dare the conjecture that if you had written the code, you would know how to
do this. This suggests that you are asking us to do your homework, which is
not the purpose of this list. A simple inclusion of the code in a for or
while loop and storing the estimated parameters with the index of the
iteration at which the loop is at should be no problem if you have
programmed the code you provided.

Best,
Daniel
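
For readers looking for the general pattern rather than the homework answer, a sketch of "repeat, store, average" follows. The function one_run() is hypothetical: it stands in for "generate one censored sample and estimate the two parameters" and is assumed to return a numeric vector of length two. Its body here is a placeholder, not the Lomax EM code from the question.

```r
# Hypothetical stand-in for one simulation run; replace the body with the
# data generation and estimation steps, returning c(theta, beta)
one_run <- function() {
  x <- rexp(20, rate = 3)                 # placeholder data, not Lomax
  c(theta = 1 / mean(x), beta = var(x))   # placeholder "estimates"
}

set.seed(1)
n_sim <- 1000                             # the question asks for 10,000

# replicate() stores each run as one column of a 2 x n_sim matrix
est <- replicate(n_sim, one_run())

# Average of each parameter over all runs
avg <- rowMeans(est)
```

An explicit for loop that fills a preallocated matrix row by row would do the same job.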


Majdi wrote:
 
 Hi
 
 My code generates data using (progressive censoring)
 then the program use this data to estimate the parameters (theta.t and
 beta.t) using EM algorithm
 
 My question is that I have to do simulation, that is I have to run my code
 10,000 times, for each time I have to store the output and at the end find
 the average of theta.t and beta.t over these 10,000 simulations
 
 Is this possible at all  OR I should go back to SAS and try to do it
 over there 
 I learned that R doesn't like loops
 
 ##
 
 ## Lomax distribution ##
 ## Use EM algorithm to estimate parameters of Lomax based on progressive
 censored data
   
 n=20;m=5;R<-c(0,0,0,0,15)
 theta=0.4986; beta=3
 
 # generate data using progressive censoring algorithm#
 ##
 
 W<-matrix((runif(m,0,1)),m,1)
 E<-matrix(NA,m,1);V<-matrix(NA,m,1);U<-matrix(NA,m,1)
 
 i<-1
 while (i<=m) {
  
 E[i]<- 1/( i+ sum(R[(m+1-i):m])) 
 i<-i+1
 }
 
 V<-W^E
 
 i<-1
 while (i<=m) {
  
 U[i]<- (1- prod( V[(m+1-i):m])) 
 i<-i+1
 }
 
 x<-( (1-U)^(-1/theta)-1 )/beta   # simulation ends
 
 Yobs<-x
 
 em.lomax <- function(Y){
 r <- length(Yobs)
 
 ##initial value
 
 theta.t=theta
 beta.t=beta
 
 # Define log-likelihood function
 ll <- function(y, k, c){
 n*log(c*k)-(k+1)*( sum(log(1+c*Y))+sum( R*log(1+c*Y)+1/k ) )
 }
 
 # Compute the log-likelihood for the initial values, and ignoring the
 missing data mechanism
 lltm1 <- ll(Yobs, theta.t, beta.t)
 
 repeat{
 # E-step
 
 Bbar<- function(theta.t,beta.t){
 sum(R*(1+beta.t*(theta.t+1)*Yobs)/(beta.t*(theta.t+1)*(1+beta.t*Yobs)))
 }
 
 Abar<- function(theta.t,beta.t){
 sum( R* ( log(1+beta.t*Yobs)+ 1/theta.t ))
 }
 
 theta.t1<- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
 
 # M-step
 beta.t1<-((theta.t1+ 1)/n
 *(sum(Yobs/(1+beta.t*Yobs))+Bbar(theta.t,beta.t)))^(-1)
 theta.t1<- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
 
 beta.t <-beta.t1
 theta.t<-theta.t1
 
 # compute log-likelihood using current estimates, and ignoring the missing
 data mechanism
 llt <- ll(Yobs, theta.t, beta.t)
 
 # Print current parameter values and likelihood
 cat(theta.t, beta.t, llt, "\n")
 
 
 # Stop if converged
 if ( abs(lltm1 - llt) < 0.0001) break
 lltm1 <- llt
 }
 return(c(theta.t,beta.t))
 }
 
 em.lomax(Yobs)
 
 
 
 please help me with this 
 
 majdi
 

--
View this message in context: 
http://r.789695.n4.nabble.com/loops-and-simulation-tp3678640p3679377.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] calculating the mean of a random matrix (by row) and some general questions



On Jul 19, 2011, at 4:18 PM, Peter Lomas wrote:


Hi Richard,

As others have said, try to use the apply functions rather than loops.
There is also an apply function for lists, see ?lapply.  This is much more
efficient.


Actually the apply functions are not more efficient in the usual
meaning of time of execution. And sometimes they are rather
inefficient. Prior discussions of this topic in the archives should be
easy to find. The economy is in expression and the advantage is in
code creation and maintenance.


Doubters of this proposition should consider these results:

library(rbenchmark)  # help page has a more compact version of these tests

means.rep = function(n, m) {res1 <- vector(length=100, mode="numeric")
  res1 <- replicate(n, mean( rexp(m)))}
means.colMn = function(n, m) {res2 <- vector(length=100, mode="numeric")
   res2 <- colMeans(matrix( rexp(n*m), c(m, n)))}
means.tapply = function(n,m) {res3 <- vector(length=100, mode="numeric")
 res3 <- tapply( rexp(n*m), rep(1:n, each = m), mean)}
means.apply =function(n,m) { res4 <- vector(length=100, mode="numeric")
res4 <-apply( matrix(rexp(m*n),n,m), 1, mean) }
means.forloop =function(n, m) {res5 <- vector(length=100, mode="numeric")
 for (i in n) {res5[i] <-mean(rexp(m))} }
benchmark(
   repl = means.rep(100, 100),
   tappl = means.tapply(100, 100),
   appl = means.apply(100, 100),
   pat = means.pat(100, 100),
   forloop =  means.forloop(100,100),
   replications=100, columns=c("test","replications","elapsed"),
   order='elapsed' )

###
Results:
     test replications relative elapsed
5 forloop          100     1.00   0.004
4     pat          100    20.25   0.081
1    repl          100    77.00   0.308
3    appl          100    89.75   0.359
2   tappl          100   264.50   1.058

I admit that I was rather surprised to see the for-loop beating  
colMeans by such a wide margin,  and this is making me wonder if I  
reversed some index or coded the for-loop test wrong. So would  
appreciate some auditing and improvement of this test.  (But I don't  
see how I could have reversed the order since the n and m are both  
100. And I tried adding assignments to see if there were only promises  
being made with no calculations. The relative efficiencies stays the  
same.)


--
David.
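
On the audit request: in the code as posted the loop header is written for (i in n), and with n = 100 that iterates exactly once (with i equal to 100) rather than 100 times, which may account for much of the for-loop's apparent speed. The sketch below only demonstrates that loop behaviour and, separately, the rowMeans() route for the original question; the counter and result names are made up for this illustration and no timings are claimed.

```r
n <- 100
m <- 100

# for (i in n) visits the single value n
iterations_single <- 0
for (i in n) iterations_single <- iterations_single + 1

# for (i in seq_len(n)) visits 1, 2, ..., n
iterations_full <- 0
for (i in seq_len(n)) iterations_full <- iterations_full + 1

# Row means of an n x m matrix of exponential draws, without any loop
set.seed(1)
row_means <- rowMeans(matrix(rexp(n * m), nrow = n, ncol = m))
```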



 I also like writing my own functions.  For example:

f <- function(x) {
  x^2
}

Which can then be used by:

f(2)

[1] 4

This is very useful if you're getting into maximum likelihood programming,
or want to use the optim function (for multivariate functions) or
optimize (for univariate functions).

Lastly, check out the R reference card.
http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Regards,
Peter

On Tue, Jul 19, 2011 at 12:43, RichardLang l...@zedat.fu-berlin.de  
wrote:



Hi everyone!

I'm trying to teach myself R in order to do some data analysis. I'm a
mathematics student and (only) familiar with matlab and latex. I'm working
through the official introduction to R at the moment, while simultaneously
solving some exercises I found on the web. Before I post my (probably
stupid) question, I'd like to ask you for some general advice. How do you
work with R? Is it like in matlab, that you write your functions with a lot
of loops etc. in a text file and then run it? Or do you just prepare your
data and then use the functions provided by R (plot, mean etc.) to get some
analysis? I'd be very thankful for some of your thoughts about
approaches.

Now the question: I'm trying to build a vector with n entries, each
consisting of the mean of m random numbers (exponentially distributed, for
example). My approach was to construct an n x m random matrix and then to
somehow take the mean of each row. But in the mean function there is no
parameter to do this, so the intended approach of R is probably different.

Any ideas? =)

Richard





David Winsemius, MD
West Hartford, CT


Re: [R] comparing SAS and R survival analysis with time-dependent covariates

2011-07-19 Thread Thomas Lumley

On Wed, Jul 20, 2011 at 5:42 AM, AO_Statistics abouesl...@gmail.com wrote:

 Terry Therneau-2 wrote:

 This query of why do SAS and S give different answers for Cox models
 comes
 up every so often.  The two most common reasons are that
       a. they are using different options for the ties
       b. the SAS and S data sets are slightly different.
 You have both errors.

 First, make sure I have the same data set by reading a common file, and
 then
 compare the results.

 tmt54% more sdata.txt
  1   0.0  0.5     0       0
  1   0.5  3.0     1       1
  2   0.0  1.0     0       0
  2   1.0  1.5     1       1
  3   0.0  6.0     0       0
  4   0.0  8.0     0       1
  5   0.0  1.0     0       0
  5   1.0  8.0     1       0
  6   0.0 21.0     0       1
  7   0.0  3.0     0       0
  7   3.0 11.0     1       1

 tmt55% more test.sas
 options linesize=80;

 data trythis;
     infile 'sdata.txt';
     input id start end delir outcome;

 proc phreg data=trythis;
   model (start, end)*outcome(0)=delir/ ties=discrete;

 proc phreg data=trythis;
   model (start, end)*outcome(0)=delir/ ties=efron;


 tmt56% more test.r
 trythis <- read.table('sdata.txt',
                       col.names=c("id", "start", "end", "delir",
 "outcome"))

 coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
 coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

 -
  I now get comparable answers.  Note that Cox's exact partial likelihood
 is
 the correct form to use for discrete time data.  I labeled this as the
 'exact'
 method and SAS as the 'discrete' method.  The exact marginal likelihood
 of
 Prentice et al, which SAS calls the 'exact' method is not implemented in
 S.

   As to which package is more reliable, I can only point to a set of
 formal test
 cases that are found in Appendix E of the book by Therneau and Grambsch.

 [...]




 I am processing estimations of regression parameters in the Cox model for
 recurrent event data with time-dependent covariates. As my data sets contain
 a lot of ties, I use the discrete method of SAS (PHREG procedure) and
 exact option in R (coxph function of survival package).

 Despite the high computation time (up to 45s), I always get estimations
 without error or warning message with the PHREG procedure.
 On the other hand, when I use R software (latest version 2.13.11 on 32 or 64
 bits), I sometimes get different estimates from those obtained with SAS and
 I get various warnings. And some other time I don't get any result, R
 freezes and does not respond.

 In order to understand, I have tried some tests from your examples. It turns
 out that dysfunctions appear when the proportion of ties become important :

Edited down to results:
R
      coef exp(coef) se(coef)       z p
 delir 22.5  6.06e+09    15460 0.00146 1
SAS
 estimate delir : 20.52466
 se : 5689
R

       coef exp(coef) se(coef)         z p
 delir -20.8  9.42e-10    42054 -0.000494 1
SAS
 estimate delir : -17.78257
 se : 9383
 Pr > Khi 2 : 0.9985
 convergence status : Convergence criterion (GCONV=1E-8) satisfied.

The warning and error messages are correct here.  Look at the point
estimate. It's a log hazard ratio of about 20 in one case and about
-20 in the other case.  The true partial maximum likelihood estimator
is infinite. The estimated standard errors are meaningless, since the
partial likelihood isn't close to quadratic at the maximum.


-thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


[R] Assigning colors to cells

2011-07-19 Thread Sumukh Sathnur


Hi everyone,

I was wondering if there was a simple way to assign a color to a cell 
based on value in a data frame and return the cell as an image? For 
example, if the value is 1, then blue, if between 1 and 2, green, if 
between 2 and 3, yellow, etc.


I tried using a heatmap function but I was hoping there was an easier 
way to do this.


Thanks,
Sumukh


Re: [R] calculating mean excluding zeros

Sarah et. al:

On Tue, Jul 19, 2011 at 1:56 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 In the more general case, that approach is prone to machine precision
 error (FAQ 7.31).

 Here's a clunky but safer alternative:

Perhaps ?zapsmall  .

However, I would agree with your sentiments that it may depend on
context. Finite precision can be a tricky thing.

-- Bert


 set.seed(1234)
 testvec <- sample(0:10, 100, replace=TRUE)
 mean(testvec)
 [1] 4.31
 mean(testvec[testvec != 0])
 [1] 4.842697
 mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
 [1] 4.842697


 (Is there an elementwise equivalent to all.equal() that I'm missing?)

 Sarah
 On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:
 You can do it by subsetting or indexing

  r <- c(0,0,0,rnorm(10,10,5))
 mean(r)
 [1] 8.052215
 mean(r[r!=0])
 [1] 10.46788

 Weidong Gu

 On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Sorry if it's been discussed before - don't seem to find it.
 I'd like to calculate a mean while ignoring zeros.
 mean doesn't seem to have an option for that.
 Any other function/package that could do it?

 Thanks for a pointer!

 --
 Dimitri Liakhovitski
 marketfusionanalytics.com




 --
 Sarah Goslee
 http://www.functionaldiversity.org





-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics


Re: [R] timeDate with month designated by three letters.

2011-07-19 Thread Clint Bowman


> strptime("04-MAY-11 1428",format="%d-%b-%y %H%M")

[1] "2011-05-04 14:28:00"


--
Clint Bowman            INTERNET:   cl...@ecy.wa.gov
Air Quality Modeler     INTERNET:   cl...@math.utah.edu
Department of Ecology   VOICE:      (360) 407-6815
PO Box 47600            FAX:        (360) 407-7534
Olympia, WA 98504-7600


USPS:       PO Box 47600, Olympia, WA 98504-7600
Parcels:    300 Desmond Drive, Lacey, WA 98503-1274


On Tue, 19 Jul 2011, mdkz...@aol.com wrote:


Dear R Experts:


  I am trying to convert a date and time character field to timeDate where the month is 
presented as three letters, such as JUN for June, etc.


This is an example of the full character field:


04-MAY-11 1428
What is the proper format syntax?


I've tried

timeDate("04-MAY-11 1428",format="%d-%m-%y %H%M")
but only get

GMT
[1] [NA]
If I change the month to a number as below, then it works, but that would 
require recoding of the data field.



timeDate("04-05-11 1428",format="%d-%m-%y %H%M")

gives

GMT
[1] [2011-05-04 14:28:00]
which is correct.  How do I get R to recognize the month as a 3 letter 
designator.


Any recommendations you can provide would be greatly appreciated.


Michael



[R] Error in vector(double, length) : cannot allocate vector of length 1010723280 - need help

2011-07-19 Thread kinnari

Hi All,
I am working on CGH datasets.
When I am running clustering, it shows me this error.
> d <- dist(mydata, method = "euclidean")
Error in vector("double", length) : 
  cannot allocate vector of length 1010723280

If anyone can help me, I will really appreciate.

Thanks,
Kinnari
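
The length in the error message can be decoded by hand. dist() stores one distance per pair of rows, that is n(n-1)/2 doubles for n rows, so the requested length implies both the row count and the memory needed. The sketch below is plain arithmetic on the number from the error; the variable names are made up for this illustration.

```r
# Length of the vector that dist() tried to allocate (from the error message)
len <- 1010723280

# Solve n * (n - 1) / 2 = len for n, the number of rows in mydata
n_rows <- (1 + sqrt(1 + 8 * len)) / 2

# Memory for that many doubles (8 bytes each), in GiB
gib <- len * 8 / 2^30
```

This works out to 44961 rows and roughly 7.5 GiB for the distance object alone, so the call can only succeed on a 64-bit build with enough RAM; clustering fewer rows, or a method that does not need the full distance matrix, would avoid the allocation.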

--
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-vector-double-length-cannot-allocate-vector-of-length-1010723280-need-help-tp3679326p3679326.html
Sent from the R help mailing list archive at Nabble.com.


[R] Incorrect degrees of freedom for splines using GAMM4?

2011-07-19 Thread Melinda Power

Hello,

I'm running mixed models in GAMM4 with 2 (non-nested) random intercepts and
I want to include a spline term for one of my exposure variables.  However,
when I include a spline term, I always get reported degrees of freedom of
less than 1, even when I know that my spline is using more than 1 degree of
freedom.  For example, here is the code for my model:

global.gamm4 <- gamm4(zcog ~ s(adjpatx, fx=TRUE, k=5) + int234 + cogagec + cogagesq +
    oldfran + newus + alc2 + alc3 + alc4 + alcmiss + smk2 + smk3 +
    mdinc10c + mdinc10sq + pwhtc + pwhtsq + edu2 + edu3 + husbgs +
    husbcol + husbmiss +
    currpmh + pastpmh + neverpmh, random = ~(1|id) + (1|cogtest),
    data=global)

Using  summary(global.gamm4$mer), I get the following output for my spline
term, indicating that I use the expected 4 degrees of freedom.

Xs(adjpatx)Fx1  0.1018943  0.1073225   0.949
Xs(adjpatx)Fx2 -0.0708114  0.1123845  -0.630
Xs(adjpatx)Fx3  0.7459511  0.6836413   1.091
Xs(adjpatx)Fx4 -0.2062321  0.0923569  -2.233

However, when I use summary(global.gamm4$gam), I get an estimate of
degrees of freedom that is not 4:

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(adjpatx) 0.7588 0.7588 1.346   0.234

This degree of freedom = 0.76 also shows up on my plot.

Ultimately, I would like to use a cubic regression penalized spline,
allowing R to choose the degrees of freedom for me using GCV.  However, when
I use the correct code for this or variants of it using mgcv, I also get
degrees of freedom less than 1.  For example, in the following code provides
a degree of freedom of less than 1 as well:


global.gamm4 <- gamm4(zcog ~ s(adjpatx, fx=FALSE) + int234 + cogagec + cogagesq +
    oldfran + newus + alc2 + alc3 + alc4 + alcmiss + smk2 + smk3 +
    mdinc10c + mdinc10sq + pwhtc + pwhtsq + edu2 + edu3 + husbgs +
    husbcol + husbmiss +
    currpmh + pastpmh + neverpmh, random = ~(1|id) + (1|cogtest),
    data=global)

Output indicating that this spline should probably look linear:

 summary(global.gamm4$mer)
Random effects:
 Groups   Name        Variance  Std.Dev.
 id       (Intercept) 0.1823454 0.427019
 cogtest  (Intercept) 0.0025498 0.050496
 Xr.1     s(adjtibx)  0.000     0.00
 Residual             0.7782969 0.882211

Xs(adjtibx)Fx1 -0.0387360  0.0215596  -1.797


Output getting a df for this spline of 0.20.

 summary(global.gamm4$gam)

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(adjtibx) 0.2009 0.2009 16.07      NA

The plot looks linear, but reports a df =0.20.


So...to summarize my questions:

1.  Are the splines produced by s(exp, fx=FALSE) or s(exp, fx=TRUE, k=k)
correct even though the reported degrees of freedom appears to be wrong?

2.  Can I believe my plot?

3.  How can I get the true df used when I use s(exp, fx=FALSE)?





Thanks for any and all help you can provide!

Melinda


[R] Different result of multiple regression in R and SPSS

2011-07-19 Thread J.

Hi, I am trying to do a simple multiple regression analysis that has one
nominal variable (gender) and three numeric variables as independent
variables and one numeric variable as dependent variable.

So, I got a formula like this:
summary(out.3 <- lm(scale(DV) ~  gender + scale(IV.1) + scale(IV.2) +
scale(IV.3)))

I tried to compare the outcome in R with the outcome in SPSS and found the
results are different!
I found that R and SPSS have the exact same outcome when every variable is
numeric; however, whenever I included gender (0/1) variable in the
equation, the result become different.

I guess that SPSS automatically treat gender as a numeric variable and
standardize it when running analysis. So, I tried to change gender to a
numeric variable and ran analysis but the results were still not identical.

What is the problem here and what is the right way to do this analysis?
Thanks,

Jay Yang

--
View this message in context: 
http://r.789695.n4.nabble.com/Different-result-of-multiple-regression-in-R-and-SPSS-tp3679423p3679423.html
Sent from the R help mailing list archive at Nabble.com.


[R] Taking all complete diagonals of a matrix

2011-07-19 Thread Peter Lomas

Hi R-Help!

I am trying to find a nicer way of extracting all the complete diagonals
of a matrix.  I am working with very large matrices that have many more rows
than columns.  I want to be able to extract each of the diagonals that are
as long as the number of columns in the matrix.  I have written a rather
ugly function that presently does the job.  It illustrates what I am trying
to do, but I feel like there must be a cleaner (and faster) way.  Does
anybody have any ideas?  Here is what I've done so far:

diagonals <- function(mat){
output <- matrix(0,(dim(mat)[1]-dim(mat)[2]+1),NCOL(mat))
for(i in 1:NROW(output)){
   G <- c()
   for(j in 1:NCOL(mat)){
  G  <-  c(G,mat[(i+j-1),j])
  }
   output[i,]  <-  G
  }
 return(output)
}

example <- rbind(rep(1,3),rep(2,3),rep(3,3),rep(4,3),rep(5,3))

> example
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
[3,]    3    3    3
[4,]    4    4    4
[5,]    5    5    5

> diagonals(example)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    3    4
[3,]    3    4    5

Many thanks,
Peter
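
One loop-free possibility, offered as a sketch rather than as the list's answer: R allows a matrix to be indexed by a two-column matrix of (row, column) pairs, so all the wanted cells can be pulled out in a single subscript. The function name diagonals_vec is hypothetical, and the code assumes the matrix has at least as many rows as columns.

```r
diagonals_vec <- function(mat) {
  p <- ncol(mat)
  k <- nrow(mat) - p + 1                 # number of complete diagonals
  # rows[i, j] = i + j - 1: the row of mat feeding output cell (i, j)
  rows <- outer(seq_len(k), seq_len(p) - 1, "+")
  # column j of the output always reads from column j of mat
  cols <- rep(seq_len(p), each = k)
  # index by (row, col) pairs, then reshape column-wise into k x p
  matrix(mat[cbind(c(rows), cols)], nrow = k, ncol = p)
}

example <- rbind(rep(1,3), rep(2,3), rep(3,3), rep(4,3), rep(5,3))
result <- diagonals_vec(example)
```

On the 5 x 3 example this gives the same 3 x 3 result as the double loop; whether it is faster on very large matrices would need to be measured.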


Re: [R] Different result of multiple regression in R and SPSS

Answer: Contrasts, i.e. the parameterization of the categorical variable(s) df.

?contrasts may be of some help, but you really need to do some
background studying of the linear models principles involved. Googling
may provide tutorials. Also searching the mail archives, e.g.:

https://stat.ethz.ch/pipermail/r-help/2009-February/187479.html

-- Bert
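
To illustrate what "contrasts" means here, the sketch below fits the same model under two codings of a two-level factor. The data are simulated and the names (fit_treat, fit_sum, d) are made up; whether contrast coding is what actually drives the R/SPSS difference in the original post cannot be determined from the information given.

```r
# Simulated data with a two-level factor and one numeric predictor
set.seed(1)
n <- 50
gender <- factor(rep(c("f", "m"), each = n / 2))
x <- rnorm(n)
y <- 1 + 0.5 * (gender == "m") + 2 * x + rnorm(n)
d <- data.frame(y, x, gender)

# Default treatment (dummy) coding: coefficient is "m" minus "f"
fit_treat <- lm(y ~ gender + x, data = d)

# Sum-to-zero coding: coefficient is the deviation of "f" from the midpoint
fit_sum <- lm(y ~ gender + x, data = d,
              contrasts = list(gender = "contr.sum"))

b_treat <- unname(coef(fit_treat)["genderm"])
b_sum   <- unname(coef(fit_sum)["gender1"])
```

The two fits have identical fitted values and the same coefficient for x; only the gender coefficient and the intercept change, with b_sum equal to -b_treat / 2. Neither result is "wrong": they are the same model written in different parameterizations.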

On Tue, Jul 19, 2011 at 2:39 PM, J. seoulseoulse...@gmail.com wrote:
Hi, I am trying to do a simple multiple regression analysis that has one
nominal variable (gender) and three numeric variables as independent
variables and one numeric variable as dependent variable.

So, I got a formula like this:
summary(out.3 <- lm(scale(DV) ~ gender + scale(IV.1) + scale(IV.2) +
scale(IV.3)))

I tried to compare the outcome in R with the outcome in SPSS and found the
results are different!
I found that R and SPSS have the exact same outcome when every variable is
numeric; however, whenever I included gender (0/1) variable in the
equation, the result become different.

I guess that SPSS automatically treat gender as a numeric variable and
standardize it when running analysis. So, I tried to change gender to a
numeric variable and ran analysis but the results were still not identical.

What is the problem here and what is the right way to do this analysis?
Thanks,

Jay Yang

--
View this message in context:
http://r.789695.n4.nabble.com/Different-result-of-multiple-regression-in-R-and-SPSS-tp3679423p3679423.html
Sent from the R help mailing list archive at Nabble.com.

--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

Re: [R] Different result of multiple regression in R and SPSS

2011-07-19 Thread J.

Thanks for the answer.

However, I am still curious about which result I should use? The result from
R or the one from SPSS?
Why are the results from the two programs different?

Jay

--
View this message in context: 
http://r.789695.n4.nabble.com/Different-result-of-multiple-regression-in-R-and-SPSS-tp3679423p3679511.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Different result of multiple regression in R and SPSS

I don't think SPSS does anything with the variables you enter there.
Have you entered it as numeric?
Have you entered gender as numeric in R?


--
Dimitri Liakhovitski
marketfusionanalytics.com

[R] list.files recursively to find files in a specific way...

2011-07-19 Thread JIA Pei

Hi, all:

My folders are organized in such a way:


root

branch1
---A
---file1.txt
---file2.txt
---B
---file1.txt
---file2.txt

branch2
---A
---file1.txt
---file2.txt
---B
---file1.txt
---file2.txt

...

branch100
---A
---file1.txt
---file2.txt
---B
---file1.txt
---file2.txt



I'd love to list all file2.txt from all subdirectories Bs but not from
As, how to do that?

I tried the following two

a) allResults <- list.files(path = routeStr, pattern = "file2.txt",
all.files = TRUE, full.names = TRUE, recursive = TRUE);
gives me 200 files in allResults, which is wrong. There should be only 100
files in allResults.

b) allResults <- list.files(path = routeStr, pattern = "B/file2.txt",
all.files = TRUE, full.names = TRUE, recursive = TRUE);
still wrong. It gives me nothing, namely, 0 file(s) in allResults.


Can anybody help to solve this problem?


Best Regards
Pei

-- 

Pei JIA

Email: jp4w...@gmail.com
cell:+1 604-362-5816

Welcome to Vision Open
http://www.visionopen.com


Re: [R] Different result of multiple regression in R and SPSS



On Jul 19, 2011, at 6:29 PM, J. wrote:


Thanks for the answer.

However, I am still curious about which result I should use? The  
result from

R or the one from SPSS?


It is becoming apparent that you do not know how to use the results  
from either system. The progress of science would be safer if you get  
some advice from a person that knows what they are doing.



Why the results from two programs are different?


Different parametrizations. If I had to guess I would bet that the
gender coefficient in R is exactly twice that of the one from SPSS.
They are probably both correct in the context of their respective
codings.


--
David Winsemius, MD
West Hartford, CT
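
A minimal sketch of what "different parametrizations" can mean in practice, using simulated data (the variable names and numbers are made up, and nothing here claims to reproduce what SPSS did with the original poster's data). The same factor is fitted once with R's default treatment (0/1) coding and once with sum-to-zero (+1/-1) coding:

```r
# Simulated data; all names and numbers are illustrative assumptions.
set.seed(1)
n <- 40
gender <- factor(rep(c("F", "M"), each = n / 2))
iv <- rnorm(n)
dv <- 1 + 0.8 * (gender == "M") + 0.5 * iv + rnorm(n)

# R's default for unordered factors: treatment (0/1 dummy) coding.
fit_treatment <- lm(dv ~ gender + iv)

# Sum-to-zero (+1/-1 effect) coding of the same factor.
fit_sum <- lm(dv ~ gender + iv, contrasts = list(gender = "contr.sum"))

b_treatment <- unname(coef(fit_treatment)[2])  # level M minus level F
b_sum <- unname(coef(fit_sum)[2])              # level F minus the midpoint of F and M

print(c(treatment = b_treatment, sum = b_sum))

# Both fits describe the same model; only the meaning of the coefficient differs.
same_fit <- isTRUE(all.equal(fitted(fit_treatment), fitted(fit_sum)))
```

Under these two codings the gender coefficients differ by a factor of 2 (and in sign), while the fitted values are identical, which is the kind of discrepancy described above.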


Re: [R] list.files recursively to find files in a specific way...

2011-07-19 Thread Phil Spector


Pei -
   A file pattern can't contain a directory separator, but it's 
easy to search for one outside the context of list.files.  I think


grep('B/file2.txt',list.files(path = routeStr, all.files = TRUE,
  full.names = TRUE, recursive = TRUE),value=TRUE)

should give you what you want.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu
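
A small self-contained sketch of the approach above, run against a throwaway directory tree built under tempfile() (the two-branch layout is a scaled-down stand-in for the poster's 100 branches). The leading slash, escaped dot and trailing `$` in the pattern are additions here to make the match stricter; they are not part of the original suggestion:

```r
# Build a miniature version of the described layout in a temporary directory.
root <- tempfile("root")
for (branch in c("branch1", "branch2")) {
  for (sub in c("A", "B")) {
    d <- file.path(root, branch, sub)
    dir.create(d, recursive = TRUE)
    file.create(file.path(d, c("file1.txt", "file2.txt")))
  }
}

# list.files() matches `pattern` against file names only, so list everything
# first and then filter the full paths with grep().
all_files <- list.files(path = root, full.names = TRUE, recursive = TRUE)
b_files <- grep("/B/file2\\.txt$", all_files, value = TRUE)

print(b_files)
```

With two branches this keeps exactly the two file2.txt files that sit in a B directory.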



Re: [R] Different result of multiple regression in R and SPSS

On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Jul 19, 2011, at 6:29 PM, J. wrote:

 Thanks for the answer.


#
 However, I am still curious about which result I should use? The result
 from
 R or the one from SPSS?

 It is becoming apparent that you do not know how to use the results from
 either system. The progress of science would be safer if you get some advice
 from a person that knows what they are doing.
##
I nominate this for an R fortune.

-- Bert

 Why the results from two programs are different?

 Different parametrizations. If I had to guess I would bet that the gender
 coefficient in R is exactly twice that of the one from SPSS. They are
 probably both correct in the context of their respective codings.

 --
 David Winsemius, MD
 West Hartford, CT


Re: [R] Taking all complete diagonals of a matrix

2011-07-19 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Peter Lomas
 Sent: Tuesday, July 19, 2011 2:16 PM
 To: r-help@r-project.org
 Subject: [R] Taking all complete diagonals of a matrix

 Hi R-Help!

 I am trying to find a nicer way of extracting all the complete
 diagonals
 of a matrix.  I am working with very large matrices that have many more
 rows
 than columns.  I want to be able to extract each of the diagonals that
 are
 as long as the number of columns in the matrix.  I have written a
 rather
 ugly function that presently does the job.  It illustrates what I am
 trying
 to do, but I feel like there must be a cleaner (and faster) way.  Does
 anybody have any ideas?  Here is what I've done so far:

 diagonals <- function(mat){
   output <- matrix(0,(dim(mat)[1]-dim(mat)[2]+1),NCOL(mat))
   for(i in 1:NROW(output)){
     G <- c()
     for(j in 1:NCOL(mat)){
       G <- c(G,mat[(i+j-1),j])
     }
     output[i,] <- G
   }
   return(output)
 }

 example <- rbind(rep(1,3),rep(2,3),rep(3,3),rep(4,3),rep(5,3))

 example
      [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    2    2    2
 [3,]    3    3    3
 [4,]    4    4    4
 [5,]    5    5    5

  diagonals(example)
      [,1] [,2] [,3]
 [1,]    1    2    3
 [2,]    2    3    4
 [3,]    3    4    5

 Many thanks,
 Peter

Peter,

Here are two possibilities.  I leave it up to you to determine whether they are 
cleaner or faster.

diagonals1 <- function(mat){
  #setup
  R <- dim(mat)[1]
  C <- dim(mat)[2]
  output <- matrix(0,(R-C+1),C)
  #get diagonals: row i is the diagonal of the C x C block starting at row i
  for(i in 1:(R-C+1)) output[i,] <- diag(mat[i:(i+C-1),])
  return(output)
}

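diagonals2 <- function(mat){
  #setup
  R <- dim(mat)[1]
  C <- dim(mat)[2]
  output <- matrix(0,(R-C+1),C)
  #get diagonals: column i of output is rows i..(i+R-C) of column i of mat
  #(looping over the C columns keeps this correct when R-C+1 differs from C)
  for(i in 1:C) output[,i] <- mat[i:(i+R-C),i]
  return(output)
}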

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
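
For comparison, a loop-free sketch of the same extraction using matrix indexing. The function name diagonals_vec is made up for this illustration, and whether it is actually faster on very large matrices has not been measured here:

```r
# Extract all complete diagonals (length = number of columns) without a loop.
# `diagonals_vec` is a hypothetical name, not from the thread.
diagonals_vec <- function(mat) {
  mat <- as.matrix(mat)            # also accepts a data frame
  R <- nrow(mat)
  C <- ncol(mat)
  stopifnot(R >= C)
  n <- R - C + 1                   # number of complete diagonals
  # Entry [k, j] of the result is mat[k + j - 1, j].
  rows <- outer(seq_len(n), seq_len(C) - 1L, "+")
  cols <- matrix(seq_len(C), nrow = n, ncol = C, byrow = TRUE)
  matrix(mat[cbind(as.vector(rows), as.vector(cols))], nrow = n, ncol = C)
}

example <- rbind(rep(1, 3), rep(2, 3), rep(3, 3), rep(4, 3), rep(5, 3))
res <- diagonals_vec(example)
print(res)

# A shape where the number of diagonals differs from the number of columns.
tall <- matrix(1:12, nrow = 6, ncol = 2)
res_tall <- diagonals_vec(tall)
```

Indexing with a two-column matrix of (row, column) pairs picks out every needed cell in one call, so the only large intermediate objects are the two index matrices.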


Re: [R] Different result of multiple regression in R and SPSS

2011-07-19 Thread Mike Marchywka

 From: dwinsem...@comcast.net
 To: seoulseoulse...@gmail.com
 Date: Tue, 19 Jul 2011 18:45:47 -0400
 CC: r-help@r-project.org
 Subject: Re: [R] Different result of multiple regression in R and SPSS

 On Jul 19, 2011, at 6:29 PM, J. wrote:

  Thanks for the answer.

  However, I am still curious about which result I should use? The
  result from
  R or the one from SPSS?

 It is becoming apparent that you do not know how to use the results
 from either system. The progress of science would be safer if you get
 some advice from a person that knows what they are doing.

  Why the results from two programs are different?

 Different parametrizations. If I had to guess I would bet that the
 gender coefficient in R is exactly twice that of the one from SPSS.
 They are probably both correct in the context of their respective
 codings.

I guess I would also suggest, again, run some samples with known data sets
and see what you get (RSSWKDSASWYG). You would want to do this anyway if you
want to ensure your real data is being used reasonably. You still need to have
some way to check your opinion from the expert mentioned above, and known data
will help there too. A factor of 2 often shows up from just looking at pictures
once you have some intuition. I've often been wrong on intuition, but chasing
it down and proving it helps you learn a lot :)

 --
 David Winsemius, MD
 West Hartford, CT


Re: [R] Assigning colors to cells

2011-07-19 Thread Dennis Murphy

Use the cut() function to produce the interval categories and color
names in the data frame and then pass that variable to the col =
argument in the appropriate plot function. Something like

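# five break points define four intervals, so four labels are needed;
# 'red' is only a placeholder for values above 3
mydata$mycolors <- cut(mydata$value, c(-Inf, 1, 2, 3, Inf), labels =
c('blue', 'green', 'yellow', 'red'))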

Ask an abstract question, get an abstract answer...

HTH,
Dennis
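
A short sketch of the cut() idea on a made-up vector. The values and the fourth colour ('red', for anything above 3) are assumptions for illustration only. Note that cut() returns a factor, so it is converted to character before being used as colour names:

```r
# Made-up values; the original question gave no data.
value <- c(0.5, 1, 1.5, 2.5, 3.7)

# Intervals are right-closed by default: (-Inf,1], (1,2], (2,3], (3,Inf].
color_factor <- cut(value,
                    breaks = c(-Inf, 1, 2, 3, Inf),
                    labels = c("blue", "green", "yellow", "red"))

# Convert to character so plotting functions see colour names,
# not the factor's integer codes.
cell_colors <- as.character(color_factor)
print(data.frame(value, cell_colors))
```

The character vector can then be passed to the col = argument of a plotting function.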

On Tue, Jul 19, 2011 at 2:30 PM, Sumukh Sathnur
sumukh.sath...@gmail.com wrote:
 Hi everyone,

 I was wondering if there was a simple way to assign a color to a cell based
 on value in a data frame and return the cell as an image? For example, if
 the value is 1, then blue, if between 1 and 2, green, if between 2 and 3,
 yellow, etc.

 I tried using a heatmap function but I was hoping there was an easier way to
 do this.

 Thanks,
 Sumukh


Re: [R] calculating mean excluding zeros

2011-07-19 Thread Sarah Goslee

On Tue, Jul 19, 2011 at 5:20 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Thanks a lot, Sarah.
 I assume, if the values against which I am comparing are REALLY zero
 (0) - then even the first one (mean(testvec[testvec != 0])) should
 work, right?
 Dimitri

Well, yes. But what's really zero?

> ((.2 + .1) - .3) == 0
[1] FALSE
> all.equal(((.2 + .1) - .3), 0)
[1] TRUE

"0" is a string, and string comparison is a different issue, not
subject to machine precision.

Sarah
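
One possible elementwise alternative, sketched under the assumption that "zero" means "within a small tolerance of zero". The tolerance used here, sqrt(.Machine$double.eps), is the default used by all.equal(), but the right value depends on the scale of the data:

```r
# A vector where one entry is zero only up to floating-point error.
x <- c(0, 2, (0.1 + 0.2) - 0.3, 4)

# Exact comparison keeps the tiny residue as a "non-zero" value.
mean_exact <- mean(x[x != 0])

# Tolerance-based comparison treats it as zero.
tol <- sqrt(.Machine$double.eps)
mean_tol <- mean(x[abs(x) > tol])

print(c(exact = mean_exact, tolerant = mean_tol))
```

The exact version averages three values (2, a residue of about 5.6e-17, and 4), while the tolerant version averages only 2 and 4.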

 On Tue, Jul 19, 2011 at 4:56 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 In the more general case, that approach is prone to machine precision
 error (FAQ 7.31).

 Here's a clunky but safer alternative:

 set.seed(1234)
 testvec <- sample(0:10, 100, replace=TRUE)
 mean(testvec)
 [1] 4.31
 mean(testvec[testvec != 0])
 [1] 4.842697
 mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
 [1] 4.842697


 (Is there an elementwise equivalent to all.equal() that I'm missing?)

 Sarah
 On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:
 You can do it by subsetting or indexing

  r <- c(0,0,0,rnorm(10,10,5))
 mean(r)
 [1] 8.052215
 mean(r[r!=0])
 [1] 10.46788

 Weidong Gu

 On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Sorry if it's been discussed before - don't seem to find it.
 I'd like to calculate a mean while ignoring zeros.
 mean doesn't seem to have an option for that.
 Any other function/package that could do it?

 Thanks for a pointer!

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Taking all complete diagonals of a matrix

2011-07-19 Thread Dennis Murphy

Hi:

Does this work for you?

mydiags <- function(mat) diag(mat[seq_len(ncol(mat)), ])

# Example:
set.seed(103)
u <- matrix(rpois(200, 10), ncol = 10)
#  dim(u)
# [1] 20 10
mydiags(u)
# [1]  7 12  6 13 12  6  5  6 14  6
u[1:10, ] # as a double check

HTH,
Dennis


Re: [R] notation question

2011-07-19 Thread Rolf Turner


On 20/07/11 07:24, Carson Farmer wrote:

> Dear list, I am currently writing up some of my R models in a more
> formal sense for a paper, and I am having trouble with the notation.
> Although this isn't really an 'R' question, it should help me to
> understand a bit better what I am actually doing when fitting my
> models!
>
> Using the analysis of co-variance example from MASS (fourth edition, p
> 142), what is the correct notation for the formula Gas, ~ Insul/Temp

There shouldn't be a comma after ``Gas'' in that formula.

> - 1? Obviously, if we fit it as two separate models (as in the
> example above it), we would have something like y_i = \beta x_i for
> each of the two models.

No.  y_i = alpha + beta x_i.  Clearly you need an intercept.
Do you really expect gas consumption to be nil when the average
external temperature is 0 degrees C?

> So my question is, when we have a single model
> with a k-level factor interaction term as in the equation above, what
> is the correct/standard statistical (LaTeX style) notation?

The model is simply

y_ij = alpha_i + beta_i x_ij

where i = 1 (before) or 2 (after).  I.e. you are allowing a different
slope and intercept for each of the scenarios (before and after).

But this is the ``deterministic'' part of the model.  You should really
include the random part:

y_ij = alpha_i + beta_i x_ij + E_ij

where the E_ij are independent random variables with mean 0 and common
variance sigma^2.  (Often the E_ij are assumed to be Gaussian, mainly
because if all you have is a hammer, then everything looks like a nail).

cheers,

Rolf Turner
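
Since the question asked for LaTeX-style notation, the model above could be typeset roughly as follows. This is only one possible rendering: \varepsilon_{ij} stands in for E_ij, and the observation index range j = 1, ..., n_i is an assumption added here.

```latex
\begin{equation}
  y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij},
  \qquad i = 1, 2, \quad j = 1, \dots, n_i ,
\end{equation}
where the $\varepsilon_{ij}$ are independent with
$\mathrm{E}(\varepsilon_{ij}) = 0$ and
$\mathrm{Var}(\varepsilon_{ij}) = \sigma^2$.
```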


[R] SSOAP chemspider

2011-07-19 Thread Benton, Paul

Dear all,

I've been trying on and off for the past few months to get SSOAP to work with 
chemspider. First I tried the WSDL file:

cs <- processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL")
Error in parse(text = paste(txt, collapse = "\n")) : 
  text:1:29: unexpected input
1: function(x, ..., obj = new( ‚
   ^
In addition: Warning message:
In processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL") :
  Ignoring additional serviceport ... elements

Next I've tried using just the pure .SOAP to call the database. 

s <- SOAPServer("http://www.chemspider.com/MassSpecAPI.asmx")
csid <- .SOAP(s, "SearchByMass2", mass=89.04767, range=0.01,
action = I("http://www.chemspider.com/SearchByMass2"),
xmlns = c("http://www.chemspider.com"), .opts = list(verbose = TRUE))

This seems to work and gives back a result. However, this result isn't the
right result. It seems to have converted the mass into 0. When I run the
similar program in perl I get the correct ids. So this isn't a server side
problem but SSOAP. Any thoughts or suggestions on other packages to use?
Further information about the SearchByMass2 method and the XML that it's
expecting:
http://www.chemspider.com/MassSpecAPI.asmx?op=SearchByMass2

Cheers,


Paul


PS Placing a fake error in the .SOAP code I can look at the xml it's sending to 
the server:
Browse[1]> doc
<?xml version="1.0"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns:SearchByMass2 xmlns:ns="http://www.chemspider.com">
      <ns:mass>89.04767</ns:mass>
      <ns:range>0.01</ns:range>
    </ns:SearchByMass2>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Re: [R] notation question

2011-07-19 Thread Carson Farmer

Thank you Rolf,
 Using the analysis of co-variance example from MASS (fourth edition, p
 142), what is the correct notation for the formula Gas, ~ Insul/Temp
    There shouldn't be a comma after ``Gas'' in that formula.
 - 1? Obviously, if we fit it as two separate models (as in the
 example above it), we would have something like y_i = \beta x_i for
 each of the two models.
    No.  y_i = alpha + beta x_i  .  Clearly you need an intercept.
    Do you really expect gas consumption to be nil when the average
    external temperature is 0 degrees C ?
Right, of course... both silly mistakes, my apologies!
 So my question is, when we have a single model
 with a k-level factor interaction term as in the equation above, what
 is the correct/standard statistical (LaTeX style) notation?
 The model is simply
    y_ij = alpha_i + beta_i x_ij
 where i = 1 (before) or 2 (after).  I.e. you are allowing a different slope
 and intercept
 for each of the scenarios (before and after).
 But this is the ``deterministic'' part of the model.  You should really
 include the random part:
    y_ij = alpha_i + beta_i x_ij + E_ij
 where the E_ij are independent random variables with mean 0 and common
 variance sigma^2.  (Often the E_ij are assumed to be Gaussian, mainly
 because if all you have is a hammer, then everything looks like a nail).
Perfect! Thanks for the clarification. I think I was previously trying
to be a bit more clever than necessary (and as a result not being very
clever at all :-p)

Carson
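
To connect the notation back to R, here is a brief sketch assuming the example in question is the whiteside gas-consumption data shipped with the MASS package. The nested formula estimates one intercept (alpha_i) and one slope (beta_i) per level of Insul, and these agree with fitting each level separately:

```r
library(MASS)  # provides the whiteside data frame (Insul, Temp, Gas)

# Single model: separate intercept and Temp slope for each level of Insul.
fit_nested <- lm(Gas ~ Insul/Temp - 1, data = whiteside)
print(coef(fit_nested))

# The same "Before" line from a separate regression on that subset.
fit_before <- lm(Gas ~ Temp, data = whiteside, subset = Insul == "Before")

nested_before <- unname(coef(fit_nested)[c("InsulBefore", "InsulBefore:Temp")])
separate_before <- unname(coef(fit_before))
```

The coefficient estimates coincide; what the single model adds is the common error variance sigma^2 estimated from both groups together.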


Re: [R] Taking all complete diagonals of a matrix

2011-07-19 Thread Peter Lomas

Thanks very much to everyone who replied.  Peter got me on my way with
the hint to use diag(), and I came up with a less pretty version of Dan's
first option almost at the same time as I got that email.  Seems I
can't avoid one for loop, but one is better than two.

Just as a note, with this code you have to make sure that you are in
fact giving it a matrix, or diag() will error.  I fed it a data frame
unaware, but using as.matrix() works just fine.

diagonals <- function(mat){
  R <- dim(mat)[1]
  C <- dim(mat)[2]
  output <- matrix(NA,(R-C+1),C)
  for(i in 1:(R-C+1))
    output[i,] <- diag(mat[i:(i+C-1),])
  return(output)
}
example <- rbind(rep(1,3),rep(2,3),rep(3,3),rep(4,3),rep(5,3))
diagonals(as.data.frame(example))

Error in output[i, ] <- diag(mat[i:(i + C - 1), ]) :
  number of items to replace is not a multiple of replacement length

Thanks again,
Peter



Re: [R] Different result of multiple regression in R and SPSS

2011-07-19 Thread Spencer Graves


On 7/19/2011 4:04 PM, Bert Gunter wrote:

On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius dwinsem...@comcast.net wrote:

On Jul 19, 2011, at 6:29 PM, J. wrote:


Thanks for the answer.


#

However, I am still curious about which result I should use? The result
from
R or the one from SPSS?

It is becoming apparent that you do not know how to use the results from
either system. The progress of science would be safer if you get some advice
from a person that knows what they are doing.

##
I nominate this for an R fortune.

-- Bert


None of us ever know what we're doing at some level.  We often think we 
do, and sometimes we get results more in spite of what we've done than 
because of it.  That of course increases our confidence and encourages 
us to repeat mistakes in contexts where we might not be so lucky.



Spencer


Why the results from two programs are different?

Different parametrizations. If I had to guess I would bet that the gender
coefficient in R is exactly twice that of the one from SPSS. They are
probably both correct in the context of their respective codings.

--
David Winsemius, MD
West Hartford, CT


--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com


Re: [R] Different result of multiple regression in R and SPSS

First, it would have helped if you had posted the actual results for us to
see how far they are off (and, more specifically, by which factor).

Second, given your epiphany, you will find that that's exactly what David
(and others before him) said or suggested. It is not about standardizing a
nominal variable, which you theoretically cannot. It is about how the
programs encode nominal variables by default.

Daniel
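
A small sketch, on simulated data with made-up names, of what standardizing a 0/1 dummy does to its coefficient. It does not reproduce the poster's analysis or any SPSS output; it only shows the arithmetic relation between the two parameterizations:

```r
# Simulated data; names and numbers are illustrative assumptions.
set.seed(2)
n <- 50
gender01 <- rep(c(0, 1), times = c(20, 30))
iv <- rnorm(n)
dv <- 0.5 * gender01 + 0.3 * iv + rnorm(n)

# Gender left as a raw 0/1 dummy.
fit_dummy <- lm(scale(dv) ~ gender01 + scale(iv))

# Gender converted to a z-score like the other predictors.
fit_std <- lm(scale(dv) ~ scale(gender01) + scale(iv))

b_dummy <- unname(coef(fit_dummy)[2])
b_std <- unname(coef(fit_std)[2])
ratio <- b_std / b_dummy

print(c(dummy = b_dummy, standardized = b_std, ratio = ratio, sd = sd(gender01)))
```

The two gender coefficients differ by exactly the standard deviation of the dummy: the standardized one is the change in the outcome per standard deviation of gender rather than the difference between the two groups.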


J. wrote:
 
 I finally got the same result by converting the gender variable to numeric
 and standardizing it.
 I guess SPSS automatically does the same thing when doing the analysis.
 But it is still not clear to me how I can interpret a standardized
 categorical (dummy coded) variable.
 I'd rather stick to using R.
 Thanks for all the comments and advice.
 
 Jay
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Different-result-of-multiple-regression-in-R-and-SPSS-tp3679423p3679861.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] PCA - princomp can only be used with more units than variables

2011-07-19 Thread Joshua Wiley

On Mon, Jul 18, 2011 at 10:48 AM, a.me...@yahoo.co.uk
a.me...@yahoo.co.uk wrote:
 Ok thank you Josh.

 Basically I have a matrix A with 7 rows and 18 columns.

If i < j (where i is the number of rows in your matrix and j is the
number of columns), then the determinant of the covariance (or
correlation) matrix |Sigma_A| will be 0 (or very near zero; you can
easily convince yourself of this by running det(cov(matrix(rnorm(90),
9))) as many times as you need).  From Cramer's Rule, if the
determinant of the matrix is 0, there is not a unique solution
(clarifications/corrections are welcome if any of this is wrong).
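
A sketch of the point above on a random 7 x 18 matrix (simulated, standing in for the poster's matrix A). It also shows that prcomp() stores its scores in the `x` component rather than in `scores`, which is the name princomp() uses. The varimax step is only one possible convention (rotating the first three raw eigenvectors), not necessarily what the poster was asked to compute:

```r
# Simulated stand-in for a matrix with fewer rows (units) than columns (variables).
set.seed(42)
A <- matrix(rnorm(7 * 18), nrow = 7, ncol = 18)

# The 18 x 18 covariance matrix is rank deficient, so its determinant is ~0.
S <- cov(A)
det_S <- det(S)
rank_S <- qr(S)$rank

# prcomp() works via the SVD and does not need more units than variables.
pc <- prcomp(A)
no_scores_element <- is.null(pc$scores)  # TRUE: prcomp has no `scores` component
scores <- pc$x                           # the component scores live here

# One convention for varimax-rotated scores from the first three components.
rot <- varimax(pc$rotation[, 1:3])
scores_rot <- pc$x[, 1:3] %*% rot$rotmat

print(c(det = det_S, rank = rank_S))
print(dim(scores_rot))
```

With 7 units at most 6 components carry any variance, so asking for 7 components would include one with essentially zero variance.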



 What I am told is I need the 'varimax rotated scores from the PCA analysis
 of matrix A'

Who told you that?  Is this homework?  You could look at the
?principal function in package psych.  That said, if this is
homework I would talk with your instructor more, and if this is
anything beyond an exercise (i.e., has real world implications), I
would seek the advice/help of a local statistician.


 I can choose from 3 up to 7 components. My problem is how to carry out the
 above.

 Have you any ideas?

 Would appreciate your help!

 Armin

 On 18/07/2011 18:07, Joshua Wiley wrote:

 Hi,

 You need to explain what you want to do.  This is not a software
 issue, you simply cannot create more uncorrelated variables than you
 have observations.

 Josh

 On Mon, Jul 18, 2011 at 8:53 AM, a.me...@yahoo.co.uk
 a.me...@yahoo.co.uk  wrote:

 Hi,

 May I ask a question about a thread
 https://stat.ethz.ch/pipermail/r-help/2005-March/068365.html?

 I understand I need to use prcomp instead of princomp when i have less
 units than variables.

 However, when I use prcomp the scores is NULL. How can I overcome this?

 Regards,

 Armin

 --
 Kind Regards,

 Armin Mewes
 Groundesign
 10 Jerusalem street
 Belfast
 BT7 1QN

 Tel.    (0044)(0)2890280887
 Email.  enquir...@groundesign.net
 www.    www.groundesign.net



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] How to convert number (matlab) to date

2011-07-19 Thread John

On Monday, July 18, 2011 05:56:14 peter dalgaard wrote:
 but even this is dubious, since there is no year 0 AD. In Gregorian and
 Julian calendars, 1 BC continues directly into 1 AD. 
 

Although this seems to be a widely recognized problem, I would argue it is
an entirely specious one.  It makes no more sense to recognize a year 0 than
it would to worry about the lack of a zeroth inch.  We name centuries and
decades without issues, and smaller time intervals with no problem.  The
entire concern comes down to a naming issue.  Whenever a fraction from the first
inch on a ruler is named, it is written as 0.* or */** - with either a
preceding 0, or as a common fraction written without any preceding whole
number.  Every year in the 19th century starts with 18 yet few people are
confused.  Why has this ever been regarded as an issue?

JWDougherty


Re: [R] loops and simulation