Re: [R] Equality of multiple vectors

2012-05-04 Thread Jan van der Laan

or

identical(vec1, vec2) && identical(vec2, vec3)

Jan



Petr Savicky savi...@cs.cas.cz wrote:


On Fri, May 04, 2012 at 12:53:12AM -0700, aaurouss wrote:

Hello,

I'm writing a piece of code where I need to compare multiple same length
vectors.

I've gone through the basic functions like identical() or all(), but they
only work for comparing 2 vectors. From 3 vectors on, it doesn't work .

Example: Assuming
vec1 <- c(1,2,3,4,5)
vec2 <- c(1,2,3,4,5)
vec3 <- c(1,2,3,4,4)

identical (vec1,vec2,vec3) returns TRUE, since the 2 first vectors are
equal. I need a function that returns FALSE if one of the vectors is
different.


Hi.

Try the following.

  length(unique(list(vec1, vec2, vec3))) == 1

  [1] FALSE

  length(unique(list(vec1, vec2, vec1))) == 1

  [1] TRUE

Hope this helps.

Petr Savicky.
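
Both suggestions in this thread can be wrapped into a helper that takes any number of vectors. The following is only a sketch; the name all_identical is made up for illustration and is not part of base R.

```r
# Hypothetical helper: TRUE only if every argument is identical to the first.
all_identical <- function(...) {
  vecs <- list(...)
  if (length(vecs) < 2L) return(TRUE)
  # compare each remaining vector against the first one
  all(vapply(vecs[-1L], identical, logical(1), vecs[[1L]]))
}

vec1 <- c(1, 2, 3, 4, 5)
vec2 <- c(1, 2, 3, 4, 5)
vec3 <- c(1, 2, 3, 4, 4)

all_identical(vec1, vec2, vec3)
all_identical(vec1, vec2, vec1)
```

Like identical() itself, this also compares types and attributes, so c(1, 2) and c(1L, 2L) count as different.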

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Can't import this 4GB DATASET

2012-05-04 Thread Jan van der Laan


OK, not all, but most lines have the same length. Perhaps you could  
write the lines with a different line size to a separate file to have  
a closer look at those lines. Modifying the previous code (again not  
tested):


con <- file("dataset.txt", "rt")
out <- file("strangelines.txt", "wt")
# skip first 5 lines
lines <- readLines(con, n=5)
# read the rest in blocks of 100,000 lines
while (TRUE) {
   lines <- readLines(con, n=1E5)
   if (length(lines) == 0) break
   # keep only the lines whose length differs from the usual 97 characters
   strangelines <- lines[nchar(lines) != 97]
   writeLines(strangelines, con=out)
}
close(con)
close(out)
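
The line-length table quoted further down can be built up block by block in the same way. This is a minimal sketch on a small temporary file; the file contents and the block size of 2 are made up for illustration.

```r
# Build a tiny test file: one empty line, one short line, three data lines.
tmp <- tempfile(fileext = ".txt")
writeLines(c("", "header", strrep("x", 97), strrep("y", 97), strrep("z", 256)), tmp)

con <- file(tmp, "rt")
counts <- integer(0)   # named vector: line length -> number of lines
repeat {
  lines <- readLines(con, n = 2)
  if (length(lines) == 0) break
  tab <- table(nchar(lines))
  # add the counts of this block to the running totals
  for (nm in names(tab)) {
    old <- if (is.na(counts[nm])) 0L else counts[[nm]]
    counts[nm] <- old + tab[[nm]]
  }
}
close(con)
counts
```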

Jan



Quoting iliketurtles isaacm...@gmail.com:


Jan, thank you.


table(line_sizes)

line_sizes
       0        1       97      256
    1430     2860 46869069     1430

-


Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context:   
http://r.789695.n4.nabble.com/Can-t-import-this-4GB-DATASET-tp4607862p4608172.html

Sent from the R help mailing list archive at Nabble.com.






Re: [R] Handling 8GB .txt file in R?

2012-03-25 Thread Jan van der Laan


What you could try to do is skip the first 5 lines. After that the file 
seems to be 'normal'. With read.table.ffdf you could try something like


library(ff)
# open a connection to the file
con <- file('yourfile', 'rt')
# skip first 5 lines
tmp <- readLines(con, n=5)
# read the remainder using read.table.ffdf
ffdf <- read.table.ffdf(file=con)
# close connection
close(con)
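
The same trick of skipping lines on an open connection can be tried with base R alone. This is a sketch using read.table instead of read.table.ffdf, on a made-up temporary file with five junk lines followed by two data rows.

```r
# Five junk lines, then two data rows (contents are illustrative only).
tmp <- tempfile()
writeLines(c("", "", "", "junk header", "", "1 2.5", "2 3.5"), tmp)

con <- file(tmp, "rt")
invisible(readLines(con, n = 5))   # consume the first 5 lines
# read.table continues from the current position of an open connection
dat <- read.table(con, col.names = c("id", "value"))
close(con)
dat
```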

HTH

Jan

On 03/25/2012 06:20 AM, iliketurtles wrote:

Thanks to all the suggestions. To the first individual that replied, I can't
do any stuff with unix or perl. All I know is R.

@KEN:
I'm using Windows 7, 64 bit.

@Steve:
Here's the readLines output.. As we can see, lines 1-3 are empty and line 5
is empty, and there's also empty elements after line 5!.

  [1]  
   [2] 

   [3]  
   [4]   PERMNO  DATE  TICKER  PERMCO  PRC  VOL  NUMTRD  vwretd  ewretd
   [5] 
   [6]106/01/19867952  .
. . -0.000138  0.001926
   [7]107/01/1986OMFGA   7952-2.56250
1000 .  0.013809  0.011061
   [8]108/01/1986OMFGA   7952-2.5
12800 . -0.020744 -0.005117
   [9]109/01/1986OMFGA   7952-2.5
1400 . -0.011219 -0.011588
  [10]110/01/1986OMFGA   7952-2.5
8500 .  0.83  0.003651
  [11]113/01/1986OMFGA   7952-2.62500
5450 .  0.002749  0.002433

-


Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context: 
http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4502706.html
Sent from the R help mailing list archive at Nabble.com.





Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan


Your question is not completely clear. read.csv.ffdf automatically
reads in the data in chunks. You don't have to do anything for that. You
can specify the size of the chunks using the next.rows option.


Jan


On 03/24/2012 09:29 PM, Mav wrote:

Hello!
A question about reading large CSV files

I need to analyse several files with  sizes larger than 3 GB. Those files
have more than 10million rows (and up to 25 million) and 9 columns. Since I
don't have a large RAM memory, I think that the ff package can really help
me. I am trying to use read.csv.ffdf but I have some questions:

How can I read the files in several chunks…with an automatic way of
calculating the number of rows to include in each chunk? (my problem is that
the files have different number of rows)

For instance…. I have used
read.csv.ffdf(NULL, "file.csv", sep="|", dec=".", header = T, row.names =
NULL, colClasses = c(rep("integer", 3), rep("integer", 10), rep("integer",
6)))
  But this way I am reading the whole file... I would prefer to read it
in chunks... but I don't know how to read it in chunks

I have read the ff documentation but I am not good with R!

Thanks in advance!

--
View this message in context: 
http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4502070.html
Sent from the R help mailing list archive at Nabble.com.





Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan
The 'normal' way of doing that with ff is to first convert your csv
file completely to an ffdf object (which stores its data on disk so
shouldn't give any memory problems). You can then use the chunk routine
(see ?chunk) to divide your data into the required chunks.


Untested so may contain errors:

library(ff)

yourffdf <- read.table.ffdf(...)

chnks <- chunk(from=1, to=nrow(yourffdf), by=5E6, method='seq')

for (chnk in chnks) {
  # read the rows of this chunk into memory
  data <- yourffdf[chnk, ]
  # do your thing with the data
  # clean up
  rm(data)
  gc()
}


If you want to process your csv file directly in chunks, you could
also have a look at the LaF package, especially the process_blocks
routine, which does exactly that. The manual vignette
(http://cran.r-project.org/web/packages/LaF/vignettes/LaF-manual.pdf)
contains some examples of how to do that.

Jan



Quoting Mav mastorvar...@gmail.com:


Thank you Jan

My problem is the following:
For instance, I have 2 files with different number of rows (15 million and 8
million of rows each).
I would like to read the first one in chunks of 5 million each. However
between the first and second chunk, I would like to analyze those first 5
million of rows, write the analysis in a new csv and then proceed to read
and analyze the second chunk and so on until the third chunk. With the
second file, I would like to do the same...read the first chunk, analyze it
and continue to read the second and analyze it.

Basically my problem is that I manage to read the files... but with so many
rows... I cannot do any analyses (even filtering the rows) because of the RAM
restrictions.

Sorry if it is still not clear.

Thank you

--
View this message in context:   
http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4503642.html

Sent from the R help mailing list archive at Nabble.com.






Re: [R] Singleton pattern

2012-03-16 Thread Jan T. Kim
Using the singleton pattern in R has never occurred to me so far, as
I think it applies to languages that support multiple references to
one instance. R doesn't do that, at least not in ways that would be
required for applying the singleton pattern as described in the GoF book,
anyway. One would have to use closures and / or environments to
approximate references, I suppose.

When passed around as parameters, R objects don't get copied unless
the called function starts modifying them, so if the primary concern
is to prevent unnecessary / costly copying of bulky objects, creating
the thing once and then passing it around as necessary, taking care
that called functions don't change it, is perhaps good enough.
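
As a small sketch of the environment approach mentioned above: an environment is passed by reference, so two names can point at one object without copying it. The function name make_store is made up for illustration.

```r
# Hypothetical constructor: wrap a bulky object in an environment.
make_store <- function(values) {
  store <- new.env()
  store$values <- values
  store
}

big <- make_store(1:10)
alias <- big                 # no copy: both names refer to the same environment
alias$values[1] <- 100L      # modify through one name ...
big$values[1]                # ... and the change is visible through the other
identical(big, alias)
```

This does not stop anyone from creating a second store, so it only approximates a singleton; it mainly avoids accidental copies of the wrapped data.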

Best regards, Jan

On Fri, Mar 16, 2012 at 12:15:27PM -0400, Bryan Hanson wrote:
 Since no one else has bit, I'll take a stab.  I'm an experienced R person, 
 but I've recently been teaching myself objective-c and I've been using 
 singletons quite a bit (and mis-using them quite a bit!).  Not a computer 
 scientist at all.  You've been warned.
 
 I don't think there is a comparable concept in R.  You do have a choice of S3 
 or S4 classes for your object orientation in R.  S3 is very loose in that you 
 can add to S3 objects readily and abuse them a lot.  There really is no 
 checking of them unless you implement it manually.  S4 objects are much 
 tighter and they are less readily modified and are self-checking (I know 
 some will complain about this characterization but  it's approximately 
 correct).  So perhaps you want an S4 object so it's less likely to get 
 mangled, but I doubt there is a way to prevent users from copying it, which 
 would be more along the lines of a singleton.
 
 You can google the archives for some great discussions of S3 vs S4 if that 
 sounds interesting.
 
 Bryan
 
 ***
 Bryan Hanson
 Professor of Chemistry & Biochemistry
 DePauw University
 
 On Mar 16, 2012, at 7:47 AM, David Cassany wrote:
 
  Hi all,
  
  I know it may not have much sense thinking about a Singleton Pattern in an
  R application which doesn't use any OOP facilities, however I'm curious to
  know if anybody faced the same issue. I've been googling but using
  singleton pattern as a key word leads to typical OOP languages like Java
  or C++ among others.
  
  So my problem is that I'd like to ensure some very big objects aren't
  copied again and again in some other variables. In the worst case I'll
  check all code by myself to ensure it but in this case the application
  won't force programmers to take it in consideration which is what I am
  really looking for.
  
  Any advice will be highly appreciated :P
  
  Thanks!
  -- 
  David Cassany Viladomat
  Software Developer
  Transmural Biotech S.L
  

-- 
 +- Jan T. Kim ---+
 | email: jtt...@gmail.com|
 | WWW:   http://www.jtkim.dreamhosters.com/  |
 *-=  hierarchical systems are for files, not for humans  =-*



[R] JSS support

2012-03-15 Thread Jan de Leeuw
We received a generous gift to support the Journal of Statistical Software 
(www.jstatsoft.org) from the DC Area R Users Group. If you think the Journal is 
a worthy cause, then support it through the Statistics Computing Support Fund at

https://giving.ucla.edu/Standard/NetDonate.aspx?SiteNum=107


===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550; fax (310)-206-5658; email: dele...@stat.ucla.edu
.mac: jdeleeuw ++ aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
-
No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au
-





Re: [R] check for data in a data.frame and return correspondent number

2012-03-14 Thread Jan van der Laan

Marianna,

You can use merge for that (or match). Using merge:

MyData <- data.frame(
  V1=c("red-j", "red-j", "red-j", "red-j", "red-j", "red-j"),
  V4=c(10.5032, 9.3749, 10.2167, 10.8200, 9.2831, 8.2838),
  redNew=c("appearance blood-n", "appearance ground-n", "appearance sea-n",
    "appearance sky-n", "area chicken-n", "area color-n")
)

MyVector <- data.frame(
  V1 = c("appearance blood-n", "appearance ground-n", "appearance sea-n",
    "as_adj_as fire-n", "as_adj_as carrot-n", "appearance sky-n",
    "area chicken-n", "area color-n")
)


merge(MyVector, MyData[, c("V4", "redNew")], by.x="V1", by.y="redNew",
  all.x=TRUE)



Btw I saw some spaces in some of your strings (I have removed these in
the example above). Be aware that the character string "  appearance
ground-n" is not equal to "appearance ground-n".
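
For the match alternative mentioned above, a minimal sketch could look as follows. It uses a shortened, made-up version of the data, keeps the order of the lookup vector, and fills in 0 for entries that are not found, as asked in the original question (merge would give NA there).

```r
# Shortened illustrative data: keys with their values, and the entries to look up.
keys   <- c("appearance blood-n", "appearance ground-n")
values <- c(10.5032, 9.3749)
wanted <- c("appearance blood-n", "as_adj_as fire-n", "appearance ground-n")

# position of each wanted entry in keys, NA when it is not present
idx <- match(wanted, keys)

result <- data.frame(
  V1 = wanted,
  V4 = ifelse(is.na(idx), 0, values[idx])
)
result
```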


HTH
Jan





On 03/14/2012 06:49 PM, mari681 wrote:

Dear R-ers,

still the newbie. With a question about coordinates of a vector appearing or
not in a data.frame.
I have a data.frame (MyData) with 3 columns which looks like this:

     V1       V4               redNew
  red-j  10.5032   appearance blood-n
  red-j   9.3749  appearance ground-n
  red-j  10.2167     appearance sea-n
  red-j  10.8200     appearance sky-n
  red-j   9.2831       area chicken-n
  red-j   8.2838         area color-n

and a MyVector  which includes also (but not only) the data in the 3rd
column:

   appearance blood-n
   appearance ground-n
   appearance sea-n
   as_adj_as fire-n
  as_adj_as carrot-n
  appearance sky-n
 area chicken-n
 area color-n

I would like to get a data.frame of 2 columns where in the first column
there is all MyVector, and in the second column  there is either the
correspondent number found in MyData (shown in column 2) or a 0 if the
entrance is not found.

I've tried some options, among which a loop:

out <- for(x in MyVector) if (x %in% MyData) print (MyData[,2])

but obviously doesn't work.
How can I select the correspondent element on column 2 for each x found in
column 3?

Suggestions in general?
Thank you for consideration!!!

Have a nice day,
Marianna


--
View this message in context: 
http://r.789695.n4.nabble.com/check-for-data-in-a-data-frame-and-return-correspondent-number-tp4472634p4472634.html
Sent from the R help mailing list archive at Nabble.com.





Re: [R] Reading in 9.6GB .DAT File - OK with 64-bit R?

2012-03-09 Thread Jan van der Laan


You could also have a look at the LaF package which is written to  
handle large text files:


http://cran.r-project.org/web/packages/LaF/index.html

Under the vignettes you'll find a manual.

Note: LaF does not help you to fit 9GB of data in 4GB of memory, but  
it could help you reading your file block by block and filtering it.
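
Reading block by block and filtering can also be sketched with base R connections only (not LaF). In this example the file layout, the "|" separator, the key values and the block size are all made up for illustration.

```r
# Small illustrative input file with a header and four records.
src <- tempfile()
dst <- tempfile()
writeLines(c("id|value", "a|1", "b|2", "c|3", "a|4"), src)

keep_ids <- c("a", "c")   # hypothetical set of keys to keep

con <- file(src, "rt")
out <- file(dst, "wt")
writeLines(readLines(con, n = 1), out)   # copy the header line
repeat {
  block <- readLines(con, n = 2)         # use a much larger n for real files
  if (length(block) == 0) break
  ids <- sub("\\|.*$", "", block)        # first field of every line
  writeLines(block[ids %in% keep_ids], out)
}
close(con)
close(out)

# the filtered file is small enough to read in the usual way
filtered <- read.table(dst, header = TRUE, sep = "|")
filtered
```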


Jan






RHelpPlease rrum...@trghcsolutions.com wrote:


Hi Barry,

You could do a similar thing in R by opening a text connection to
your file and reading one line at a time, writing the modified or
selected lines to a new file.

Great!  I'm aware of this existing, but don't know the commands for R.  I
have a variable [560,1] to use to pare down the incoming large data set (I'm
sure of millions of rows).  With other data sets they've been small enough
where I've been able to use the merge function after data has been read in.
Obviously I'm having trouble reading in this large data set in in the first
place.

Any additional help would be great!


--
View this message in context:  
http://r.789695.n4.nabble.com/Reading-in-9-6GB-DAT-File-OK-with-64-bit-R-tp4457220p4458074.html

Sent from the R help mailing list archive at Nabble.com.





Re: [R] Novice Alert!: odfWeave help!

2012-03-08 Thread Jan van der Laan


Step by step:

1. Create a new document in Open/LibreOffice
2. Copy/paste the following text into the document (as an example)

<<helloworld>>=
cat("Hello, world")
@

3. Save the file (e.g. hello.odt)
4. Start R (if not already running); it shouldn't matter whether it's plain R or RStudio
5. Change working directory to the folder in which your odt-document resides

setwd("/path/to/your/file")

6. Load odfWeave

library(odfWeave)

7. odfWeave your document. All code chunks are taken from your
document, executed in R and the output of the R commands is inserted
into the resulting odt-document.


odfWeave("hello.odt", "hello_out.odt")

You can now open hello_out.odt (or whatever you named it) and see
the resulting output.



HTH,

Jan







metatarsals sjcast...@gmail.com wrote:


Hello world,
I'm pretty new to computer code: for example, I consider it a small
victory that I (all by myself!) managed to ssh into the server at my
lab from home and copy a file onto my desktop. Be gentle. I have
primarily used R for running some pretty mid-level statistics
(creating distance matrices, manipulating graphs for pretty figures,
etc).

I'm working through Bolker's Ecological Models and Data in R (which is
a great book for ecologists/life sciences types who want to learn how
to just barely get by in R, with no previous knowledge of R code
presupposed). My advisor wants me to explore odfWeave to stream-line
my notes. This is important because I will inevitably be his TA in his
R stat course, and I will need to be proficient with the software. So
far I have been copy-pasting my codes into a word processor (both open
office and word) and inserting my plots after saving them.

I do not understand how to use odfWeave. The way it was explained to
me initially sounded like it was some kind of Open Office add-on I
could install and my chunks of code would be automatically translated.
Six hours of research later, I realize this is not the case, and that
I need outside help. I'm on a Mac OSx 10.7.3 Lion, I normally use
RStudios, but I have R and R64 and I operate at about, oh, let's say
the level of a 2- or 3-year-old does with language and walking.

So, what exactly does odfWeave do? Do I stick my chunks of code (I
know I need to use <<>>= to start and @ to end to bracket off the
sections of code) in the .odf document, then do the file.in/file.out
commands, which then reads the code and pops out a pretty little graph
to my specified parameters? Or do I use the file.in/file.out commands
to paste code I've created in R into an existing .odf doc?

Any baby steps or example code you could give me would warm my little heart.

If the first scenario (write the code into an .odf document, set off
as mentioned above, and then tell R to do stuff to it) is the
scenario, I'd be happy to send an example.

Thanks! I can offer a cute picture of a cat as payment, if desired!


--
View this message in context:  
http://r.789695.n4.nabble.com/Novice-Alert-odfWeave-help-tp4455481p4455481.html

Sent from the R help mailing list archive at Nabble.com.




Re: [R] Week number from a date

2012-02-22 Thread Jan van der Laan


The suggestion below gives you week numbers with week 1 being the week
containing the first Monday of the year and weeks going from Monday to
Sunday. There are other conventions. The ISO convention is that week 1
is the first week containing at least 4 days in the new year (week 1
of 2012 starts on January 2nd 2012; week 1 of 2009 starts on December 29th
2008).


http://www.r-bloggers.com/iso-week/

gives a function for that type of week numbers (not tested by me).
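
As a sketch of the ISO rule in base R (this is not the function from the link above; the name iso_week is made up): the ISO week of a date is determined by the Thursday of the same Monday-to-Sunday week.

```r
# Hypothetical helper: ISO 8601 week number for a vector of dates.
iso_week <- function(d) {
  d <- as.Date(d)
  # day of the week with Monday = 0, ..., Sunday = 6
  wd <- (as.POSIXlt(d)$wday + 6) %% 7
  # the Thursday of the same week decides which year the week belongs to
  thursday <- d - wd + 3
  jan1 <- as.Date(paste0(format(thursday, "%Y"), "-01-01"))
  as.integer(as.numeric(thursday - jan1) %/% 7 + 1)
}

iso_week(c("2012-01-02", "2008-12-29", "2008-10-01", "2008-12-01"))
```

On platforms where the "%V" conversion of ?strptime is supported, format(as.Date(x), "%V") should give the same week numbers.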

Jan



Patrick Breheny patrick.breh...@uky.edu wrote:

To give a little more detail, you can convert your character strings  
into POSIX objects, then extract from it virtually anything you  
would want using strftime.  In particular, %W is how you get the  
week number:



dateRange <- c("2008-10-01","2008-12-01")
x <- as.POSIXlt(dateRange)
strftime(x,format="%W")

[1] "39" "48"

--Patrick

On 02/22/2012 08:37 AM, Ingmar Visser wrote:

?strptime is a good place to start
hth, Ingmar

On Wed, Feb 22, 2012 at 2:09 PM, arunkumarakpbond...@gmail.com  wrote:


Hi

My data looks like this

startDate="2008-06-01"

dateRange =c( "2008-10-01","2008-12-01")
Is there any method to find the week number from the startDate range






[R] Triangular Test

2012-02-20 Thread kende jan
Hello,

I would like to perform triangular test for clinical trial with R.
can you help me please ?

Jan



Re: [R] Problems reading tab-delim files using read.table and read.delim

2012-02-08 Thread Jan van der Laan



I don't know if this completely solves your problem, but here are some  
arguments to read.table/read.delim you might try:

row.names=NULL
fill=TRUE

The details section also suggests using the col.names argument, as the
number of columns is determined from the first 5 rows, which may not be
correct.
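
A small sketch of how these arguments can be combined, assuming a tab-delimited file whose header names only three of six columns. The file contents and the column names x4 to x6 are made up; the header line is skipped and all six names are supplied through col.names.

```r
# Illustrative file: 3 header names, but 6 tab-separated fields per data row.
tmp <- tempfile(fileext = ".txt")
writeLines(c("c1\tc2\tc3",
             "123\t\t213\ta\tb\tc",
             "234\tasd\t\t\t\td"), tmp)

dat <- read.delim(tmp,
                  header = FALSE, skip = 1,   # skip the incomplete header line
                  col.names = c("c1", "c2", "c3", "x4", "x5", "x6"),
                  row.names = NULL,           # force automatic row numbering
                  fill = TRUE)                # pad short rows if there are any
dat
```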


HTH

Jan



mails mails00...@gmail.com wrote:


Hello,

I used read.xlsx to read in Excel files but for large files it turned out to
be not very efficient.
For that reason I use a programme which writes each sheet in an Excel file
into tab-delim txt files.
After that I tried using read.table and read.delim to read in those txt
files. Unfortunately, the results
are not as expected. To show you what I mean I created a tiny Excel sheet
with some rows and columns and
read it in using read.xlsx. I also used my script to write that sheet to a
tab-delim txt file and read that one it with
read.table and read.delim. Here is the R output:




(test <- read.table("Sheet1.txt", header=TRUE, sep="\t"))

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
:
  line 1 did not have 5 elements


(test <- read.delim("Sheet1.txt", header=TRUE, sep="\t"))

 c1 c2 c3  X
123 213 NA NA NA
234 asd NA NA NA


(test <- read.xlsx(file.path(data), "Sheet1"))

   c1   c2  c3   NA. NA..1 NA..2
1 123 NA 213NA  NA
2 234  asd  NA  NA


The last output is what I would expect the file to be read in. Columns 4 to
6 do not have any header rows. In R1C4 I added some white spaces as well as
into R2C5 and R2C6, which are read in correctly by the read.xlsx function.

read.table and read.delim seem not to be able to handle such files. Is there
any workaround for that?


Cheers

--
View this message in context:  
http://r.789695.n4.nabble.com/Problems-reading-tab-delim-files-using-read-table-and-read-delim-tp4369195p4369195.html

Sent from the R help mailing list archive at Nabble.com.





[R] 2011 Journal of Statistical Software

2012-02-04 Thread Jan de Leeuw
The Journal of Statistical Software published eight volumes in 2011, five of 
them as special volumes.

V38: Special Volume: Competing Risks and Multi-State Models
V39: Regular Volume
V40: Regular Volume
V41: Special Volume: Statistical Software for State Space Methods
V42: Special Volume: Political Methodology
V43: Regular Volume
V44: Special Volume: Magnetic Resonance Imaging in R
V45: Special Volume: Multiple Imputation



Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan

Devarayalu,

This is FAQ 7.22:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

use print(qplot())

Regards,
Jan


Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:


Hi All,


Can you please help me, why this code in not generating line chart?



library(ggplot2)
par(mfrow=c(1,3))

#qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"),
#      colour= ACTTRT)

unique(Orange1$REFID) -> refid
for (i in refid)
{
Orange2 <- Orange1[i == Orange1$REFID, ]
pdf('PGA.pdf')
qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT)
dev.off()
}
Regards,
Devarayalu





Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan

Devarayalu,

Please reply to the list.

And it would have been easier if you had output your data using
dput (in your case dput(Orange1)) so that I and other r-help members
can just copy the data into R. Not everybody has Excel available (I,
for example, don't). The easier you make it for people to look into
your problem, the higher the probability that you will get a useful
answer. In your case your data is quite small, so using dput is no
problem.


To answer your question. Except for the probable error

refid <- unique(Orange2$REFID)

which should probably be

refid <- unique(Orange1$REFID)

and the fact that you overwrite your files in the loop, I have no problem
generating the graphs. On my system the following code runs and
generates two graphs:



library(ggplot2)

Orange1 <- structure(list(REFID = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 9L,
9L, 9L, 9L), ARM = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L,
2L, 2L), SUBARM = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L), ACTTRT = structure(c(3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L,
1L, 2L, 2L), .Label = c("ABC", "DEF", "LCD", "Vehicle"), class = "factor"),
TIME1 = c(0L, 2L, 6L, 12L, 0L, 2L, 6L, 12L, 0L, 12L, 0L,
12L), ENDPOINT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "PGA", class = "factor"), BASCHGA = c(0L,
-39L, -47L, -31L, 0L, -34L, -25L, -12L, 0L, -30L, 0L, -40L
), STATANAL = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = "UNK", class = "factor"), X = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("",
"Dansinger_2010_20687812"), class = "factor")), .Names = c("REFID",
"ARM", "SUBARM", "ACTTRT", "TIME1", "ENDPOINT", "BASCHGA", "STATANAL",
"X"), class = "data.frame", row.names = c(NA, -12L))

refid <- unique(Orange1$REFID)
for (i in refid)
{
  Orange2 <- Orange1[i == Orange1$REFID, ]
  # one file per REFID, so earlier plots are not overwritten
  pdf(paste('PGA', i, '.pdf', sep=''))
  print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
  dev.off()
}



Regards,
Jan



Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:


Jan

Thank you, for your valuable reply. But...

Sorry still I am not getting by using print() with the following  
modified code. I am also attaching the raw datafile.



par(mfrow=c(1,3))

#qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"),
#      colour= ACTTRT)

unique(Orange1$REFID) -> refid
for (i in refid)
{
Orange2 <- Orange1[i == Orange1$REFID, ]
pdf('PGA.pdf')
print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
dev.off()
}
Regards
Devarayalu







Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan
As I mentioned in my previous reply: do not email only me
personally but also include the mailing list. This gives other members
the opportunity to answer your question and lets members
who might have a similar question see the answer.


As for your first question: put the pdf(...) and dev.off() calls outside of
the loop. I am not a ggplot2 expert, but you could also have a look
at the facets option of qplot.
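
A minimal sketch of that idea, assuming a small made-up data set and a hypothetical output file name. Base graphics are used here so that it runs without ggplot2; with qplot() the plot call inside the loop would be print(qplot(...)) instead:

```r
# Made-up data: two REFIDs with three time points each
dat <- data.frame(id = rep(c(7, 9), each = 3),
                  x  = rep(c(0, 6, 12), 2),
                  y  = c(0, -47, -31, 0, -20, -30))

out_file <- file.path(tempdir(), "PGA_all.pdf")  # hypothetical file name

pdf(out_file)                     # open the device once, before the loop
for (i in unique(dat$id)) {
  part <- dat[dat$id == i, ]
  # every new plot starts a new page in the same PDF
  plot(part$x, part$y, type = "l", main = paste("REFID", i))
}
dev.off()                         # close the device once, after the loop
```

Because the device is opened only once, each plot ends up on its own page of the same file instead of overwriting the previous one.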


As for your second question: have a look at
levels(Orange1$ACTTRT)
and
?factor
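
What you will probably see there is that a subset of a factor keeps all the levels of the original, which is why unused treatments can still show up in a legend. A small sketch with made-up treatment values, using droplevels() (or factor()) to drop the unused levels:

```r
trt  <- factor(c("LCD", "LCD", "Vehicle", "Vehicle", "ABC", "DEF"))
part <- trt[5:6]                 # keep only the rows of one REFID

levels(part)                     # still lists all four treatments
part_clean <- droplevels(part)   # or equivalently: factor(part)
levels(part_clean)               # only the levels that are actually used
```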

Regards,
Jan


Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:


Jan,

Thank you very much for the solution. I have one
more question.


I want both graphs in a single PDF, and the legend should contain only the
ACTTRT values of the individual REFID (only two lines in the legend).

Can you solve it?

Devarayalu


-Original Message-
From: Jan van der Laan [mailto:rh...@eoos.dds.nl]
Sent: Thursday, January 19, 2012 5:09 PM
To: Sri krishna Devarayalu Balanagu
Cc: r-help@r-project.org
Subject: Re: [R] Not generating line chart

Devarayalu,

Please reply to the list.

It would have been easier if you had output your data using
dput (in your case dput(Orange1)) so that I and other r-help members
can just copy the data into R. Not everybody has Excel available (I,
for example, don't). The easier you make it for people to look into
your problem, the higher the probability that you will get a useful
answer. In your case your data is quite small, so using dput is no
problem.
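
As an illustration of what that looks like, here is a sketch with a small made-up data frame (the object names small, txt and rebuilt are only for this example):

```r
small <- data.frame(REFID   = c(7L, 7L, 9L),
                    ACTTRT  = factor(c("LCD", "Vehicle", "ABC")),
                    BASCHGA = c(0L, -39L, -30L))

# dput() writes R code that recreates the object; paste that into the email
txt <- capture.output(dput(small))
cat(txt, sep = "\n")

# A reader can rebuild the data by evaluating the pasted text
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```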

To answer your question: except for the probable error

refid <- unique(Orange2$REFID)

which should probably be

refid <- unique(Orange1$REFID)

and the fact that you overwrite your file in the loop, I have no problem
generating the graphs. On my system the following code runs and
generates two graphs:


library(ggplot2)

Orange1 <- structure(list(REFID = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 9L,
9L, 9L, 9L), ARM = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L,
2L, 2L), SUBARM = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L), ACTTRT = structure(c(3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L,
1L, 2L, 2L), .Label = c("ABC", "DEF", "LCD", "Vehicle"), class = "factor"),
 TIME1 = c(0L, 2L, 6L, 12L, 0L, 2L, 6L, 12L, 0L, 12L, 0L,
 12L), ENDPOINT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
 1L, 1L, 1L, 1L, 1L), .Label = "PGA", class = "factor"), BASCHGA = c(0L,
 -39L, -47L, -31L, 0L, -34L, -25L, -12L, 0L, -30L, 0L, -40L
 ), STATANAL = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
 1L, 1L, 1L, 1L), .Label = "UNK", class = "factor"), X = structure(c(1L,
 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("",
 "Dansinger_2010_20687812"), class = "factor")), .Names = c("REFID",
"ARM", "SUBARM", "ACTTRT", "TIME1", "ENDPOINT", "BASCHGA", "STATANAL",
"X"), class = "data.frame", row.names = c(NA, -12L))

refid <- unique(Orange1$REFID)
for (i in refid)
{
   Orange2 <- Orange1[i == Orange1$REFID, ]
   pdf(paste('PGA', i, '.pdf', sep=''))
   print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"),
               colour= ACTTRT))

   dev.off()
}



Regards,
Jan



Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:


Jan

Thank you for your valuable reply. But...

Sorry, I am still not getting the chart when using print() with the following
modified code. I am also attaching the raw data file.


par(mfrow=c(1,3))

#qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"),
#      colour= ACTTRT)
unique(Orange1$REFID) -> refid
for (i in refid)
{
Orange2 <- Orange1[i == Orange1$REFID, ]
pdf('PGA.pdf')
print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
dev.off()
}
Regards
Devarayalu





-Original Message-
From: Jan van der Laan [mailto:rh...@eoos.dds.nl]
Sent: Thursday, January 19, 2012 4:25 PM
To: Sri krishna Devarayalu Balanagu
Cc: r-help@r-project.org
Subject: Re: [R] Not generating line chart

Devarayalu,

This is FAQ 7.22:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

use print(qplot())

Regards,
Jan


Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:


Hi All,


Can you please help me understand why this code is not generating a line chart?



library(ggplot2)
par(mfrow=c(1,3))

#qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"),
#      colour= ACTTRT)
unique(Orange1$REFID) -> refid
for (i in refid)
{
Orange2 <- Orange1[i == Orange1$REFID, ]
pdf('PGA.pdf')
qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT)
dev.off()
}
Regards,
Devarayalu


[R] compare means

2012-01-12 Thread kende jan
Dear all,
 
I would like to compare two means between cases and controls, taking
into account that each case is matched
to two controls. How can I do this in R?
Thanks in advance 
 Jan


[R] Web analytics / Customer Analytics book recommendation

2012-01-03 Thread Jan Hornych
 Hi,


I am curious whether you know of any book that deals with web
analytics / customer analytics and uses R as the main
statistical tool. I am particularly interested in using R in a real
production environment and not only as an analytical tool.

Thank you

Jan



[R] simple ggplot2 question

2011-12-23 Thread Albert-Jan Roskam
Hello,

I am trying to make a plot using the code below. The plot is divided into two 
parts, using facet_grid. I would like the vertical axis (labelled 'place') to 
be different for each location (=part). So in the upper part, only places 'n' 
through 'z' are shown, while in the lower part, only places 'a' through 'm' are 
shown. I thought 'free_y' would do the trick. I also tried converting variable 
place into class 'factor'.


require(ggplot2)
DF <- data.frame(place=letters, value=runif(26),
                 location=c(rep(1, 13), rep(0, 13)))
qplot(data=DF, x=place, y=value, geom="bar", stat="identity") +
  coord_flip() +
  geom_abline(intercept=35, slope=0, colour="red") +
  facet_grid(location ~ ., scales="free_y")
R.version.string # R version 2.10.1 (2009-12-14)


Thank you in advance & merry xmas!
 
Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


[R] Journal of Statistical Software 2011

2011-12-20 Thread Jan de Leeuw
This year JSS, at www.jstatsoft.org, published eight volumes V38-V45. Five of 
them were special volumes:

V38 -  Competing Risks and Multi-State Models (guest editor Putter)
V41 - Statistical Software for State Space Methods (Guest editors Commandeur, 
Koopman, Ooms)
V42 - Political Methodology (Guest editors Altman, Fox, Jackman, Zeileis)
V44 - Magnetic Resonance Imaging in R (Guest editors Tabelow, Whitcher)
V45 - Multiple Imputation (Guest editor Yucel)

The Thomson/Reuters Impact Factors for the last three years for computational 
statistics journals are

Comp Stat 0.500 - 0.731 - 0.628
CSDA 0.226 - 1.281 - 1.089
JCGS 1.505 - 1.258 - 1.206
JSS 1.033 - 2.320 - 2.647



You may be interested in our success in Computer Science

http://www.sciencewatch.com/inter/jou/2011/11decJofStatSoft/

You can follow and befriend us at

http://www.facebook.com/jstatsoft

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: dele...@stat.ucla.edu
.mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
 
-
  No matter where you go, there you are. --- Buckaroo Banzai
   http://gifi.stat.ucla.edu/sounds/nomatter.au



[R] maptools/spatial analysis question

2011-12-18 Thread Albert-Jan Roskam
Hi,

I am using maptools to plot air quality data on a map. Each measurement point 
is mapped to a postal code area. This yields pictures with discrete borders, 
like so:
http://dl.dropbox.com/u/27415200/baincome.png
The problem is that the size of a postal code area doesn't mean much in this 
context. Moreover, only a small minority of all the postal code areas has a 
measurement station. Are there any ways/tools to interpolate between the various 
(strategically chosen) measurement stations? I am looking for sensible ways to 
create plots like this:

http://matplotlib.github.com/basemap/_images/etopo5.png

 sessionInfo()
R version 2.11.1 (2010-05-31) 
i686-pc-linux-gnu 
...

 
Thank you in advance!


Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


[R] slight documentation error in stats package arima

2011-12-15 Thread Jan Theodore Galkowski
The documentation for the arima function in the package stats has
a slight error. It references:

Ripley, B. D. (2002) Time series in R 1.5.0. R News, 2/1,
2–7. [1]http://www.r-project.org/doc/Rnews/Rnews_2002-1.pdf

This should be:

Ripley, B. D. (2002) Time series in R 1.5.0. R News, 2/2,
2–7. [2]http://www.r-project.org/doc/Rnews/Rnews_2002-2.pdf

Anyone know who I should tell about this?

Thanks!

 - Jan

References

1. http://www.r-project.org/doc/Rnews/Rnews_2002-1.pdf
2. http://www.r-project.org/doc/Rnews/Rnews_2002-2.pdf
--

 Jan Theodore Galkowski
 Senior Systems Software Engineer
 Akamai Technologies
 Cambridge, MA 02142

 jgalk...@akamai.com
 bayesianlo...@acm.org

 607.239.1834 (m)
 607.239.1834 (h)
 617.444.4995 (w)






Re: [R] Generating input population for microsimulation

2011-12-14 Thread Jan van der Laan

Emma,

If, as you say, each unit is the same you can just repeat the units to  
obtain the required number of units. For example,



  unit_size <- 10
  n_units <- 10

  unit_id <- rep(1:n_units, each=unit_size)
  pid <- rep(1:unit_size, n_units)
  senior  <- ifelse(pid <= 2, 1, 0)  # the first two persons of each unit are senior

  pop <- data.frame(unit_id, pid, senior)
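
A variant of the same idea, sketched under the assumption that the person ID should be unique across the whole population and that the last two persons of each unit are the seniors, as in the example matrix in the question (the helper variable pos is only for this sketch):

```r
unit_size <- 10
n_units   <- 10

unit_id <- rep(1:n_units, each = unit_size)
pos     <- rep(1:unit_size, times = n_units)    # position inside the unit
pid     <- seq_along(unit_id)                   # 1..100, unique per person
senior  <- ifelse(pos > unit_size - 2, 1, 0)    # last two of each unit

pop <- data.frame(unit_id, pid, senior)
head(pop, 10)
```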


If you want more flexibility in generating the units, I would first  
generate the units (without the persons) and then generate the persons  
for each unit. In the example below I use the plyr package; you could  
probably also use lapply/sapply, or simply a loop over the units.


  library(plyr)

  generate_unit <- function(unit) {
      pid <- 1:unit$size
      senior <- rep(0, unit$size)
      senior[sample(unit$size, 2)] <- 1
      return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
  }

  units <- data.frame(id=1:n_units, size=unit_size)

  ddply(units, .(id), generate_unit)
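
For completeness, a sketch of the lapply alternative mentioned above, using only base R. It assumes the same generate_unit function and one row per unit; split() plus do.call(rbind, ...) take over the role of ddply here:

```r
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1    # two randomly chosen seniors
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

units <- data.frame(id = 1:10, size = 10)

# Split into one-row data frames, generate each unit, stack the results
pieces <- lapply(split(units, units$id), generate_unit)
pop <- do.call(rbind, pieces)
rownames(pop) <- NULL
```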


HTH,

Jan




Emma Thomas thomas...@yahoo.com wrote:


Hi all,

I've been struggling with some code and was wondering if you all could help.

I am trying to generate a theoretical population of P people who are  
housed within X different units. Each unit follows the same  
structure- 10 people per unit, 8 of whom are junior and two of whom  
are senior. I'd like to create a unit ID and a unique identifier for  
each person (person ID, PID) in the population so that I have a  
matrix that looks like:


      unit_id pid senior
 [1,]       1   1      0
 [2,]       1   2      0
 [3,]       1   3      0
 [4,]       1   4      0
 [5,]       1   5      0
 [6,]       1   6      0
 [7,]       1   7      0
 [8,]       1   8      0
 [9,]       1   9      1
[10,]       1  10      1
...

I came up with the following code, but am having some trouble  
getting it to populate my matrix the way I'd like.


world <- function(units, pop_size, unit_size){
    pid <- rep(0,pop_size) #person ID
    senior <- rep(0,pop_size) #senior in charge
    unit_id <- rep(0,pop_size) #unit ID

    for (i in 1:pop_size){
        for (f in 1:units) {
            senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
            pid[i] = sample(c(1:10), 1, replace = FALSE)
            unit_id[i] <- f
        }
    }
    data <- cbind(unit_id, pid, senior)

    return(data)
}

world(units = 10,pop_size = 100, unit_size = 10) #call the function



The output looks like:
     unit_id pid senior
[1,]      10   7      0
[2,]      10   4      0
[3,]      10  10      0
[4,]      10   9      1
[5,]      10  10      0
[6,]      10   1      1
...

but what I really want to generate is 10 different units with two
seniors per unit, and with each person in the population having a
unique identifier.


I thought a nested for loop was one way to go about creating my data  
set of people and families, but obviously I'm doing something (or  
many things) wrong. Any suggestions on how to fix this? I had been  
focusing on creating a person and assigning them to a unit, but  
perhaps I should create the units and then populate the units with  
people?


Thanks so much in advance.

Emma



Re: [R] Generating input population for microsimulation

2011-12-14 Thread Jan van der Laan

Emma,

That is because generate_unit expects a data.frame with one row and  
columns id and size:


generate_unit(data.frame(id=1, size=10))
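
A small sketch of the difference, repeating the function from my earlier message so that it is self-contained. The tryCatch() wrapper is only there to show the error without stopping the script:

```r
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# A bare number has no $size element, so `$` fails on it
res <- tryCatch(generate_unit(10), error = function(e) conditionMessage(e))

# A one-row data frame provides both id and size
ok <- generate_unit(data.frame(id = 1, size = 10))
```

In the ddply call this is taken care of automatically, because ddply passes each one-row piece of units to the function.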

Jan




Emma Thomas thomas...@yahoo.com wrote:


Dear Jan,

Thanks for your reply.

The first solution works well for my needs for now, but I have a  
question about the second. If I run your code and then call the  
function:


generate_unit(10)

I get an error that

Error in unit$size : $ operator is invalid for atomic vectors


Did you experience the same thing?

In any case, I will definitely take a look at the plyr package,  
which I'm sure will be useful in the future.


Thanks again!

Emma



- Original Message -
From: Jan van der Laan rh...@eoos.dds.nl
To: r-help@r-project.org r-help@r-project.org
Cc: Emma Thomas thomas...@yahoo.com
Sent: Wednesday, December 14, 2011 6:18 AM
Subject: Re: [R] Generating input population for microsimulation

Emma,

If, as you say, each unit is the same you can just repeat the units  
to obtain the required number of units. For example,



  unit_size <- 10
  n_units <- 10

  unit_id <- rep(1:n_units, each=unit_size)
  pid     <- rep(1:unit_size, n_units)
  senior  <- ifelse(pid <= 2, 1, 0)  # the first two persons of each unit are senior

  pop <- data.frame(unit_id, pid, senior)


If you want more flexibility in generating the units, I would first  
generate the units (without the persons) and then generate the  
persons for each unit. In the example below I use the plyr package;  
you could probably also use lapply/sapply, or simply a loop over the  
units.


  library(plyr)

  generate_unit <- function(unit) {
      pid <- 1:unit$size
      senior <- rep(0, unit$size)
      senior[sample(unit$size, 2)] <- 1
      return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
  }

  units <- data.frame(id=1:n_units, size=unit_size)

  ddply(units, .(id), generate_unit)


HTH,

Jan




Emma Thomas thomas...@yahoo.com wrote:


Hi all,

I've been struggling with some code and was wondering if you all could help.

I am trying to generate a theoretical population of P people who  
are housed within X different units. Each unit follows the same  
structure- 10 people per unit, 8 of whom are junior and two of whom  
are senior. I'd like to create a unit ID and a unique identifier  
for each person (person ID, PID) in the population so that I have a  
matrix that looks like:


      unit_id pid senior
 [1,]       1   1      0
 [2,]       1   2      0
 [3,]       1   3      0
 [4,]       1   4      0
 [5,]       1   5      0
 [6,]       1   6      0
 [7,]       1   7      0
 [8,]       1   8      0
 [9,]       1   9      1
[10,]       1  10      1
...

I came up with the following code, but am having some trouble  
getting it to populate my matrix the way I'd like.


world <- function(units, pop_size, unit_size){
    pid <- rep(0,pop_size) #person ID
    senior <- rep(0,pop_size) #senior in charge
    unit_id <- rep(0,pop_size) #unit ID

    for (i in 1:pop_size){
        for (f in 1:units) {
            senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
            pid[i] = sample(c(1:10), 1, replace = FALSE)
            unit_id[i] <- f
        }
    }
    data <- cbind(unit_id, pid, senior)

    return(data)
}

world(units = 10,pop_size = 100, unit_size = 10) #call the function



The output looks like:
     unit_id pid senior
[1,]      10   7      0
[2,]      10   4      0
[3,]      10  10      0
[4,]      10   9      1
[5,]      10  10      0
[6,]      10   1      1
...

but what I really want to generate is 10 different units with
two seniors per unit, and with each person in the population having
a unique identifier.


I thought a nested for loop was one way to go about creating my  
data set of people and families, but obviously I'm doing something  
(or many things) wrong. Any suggestions on how to fix this? I had  
been focusing on creating a person and assigning them to a unit,  
but perhaps I should create the units and then populate the units  
with people?


Thanks so much in advance.

Emma



Re: [R] R - Linux_SSH

2011-12-14 Thread Jan van der Laan
What I did in the past (not with R scripts) is to start my jobs using
at (start the job at a specified time, e.g. now) or batch (start the
job when the system load drops below a threshold).


echo "R CMD BATCH yourscript.R" | at now

or

echo "R CMD BATCH yourscript.R" | batch

Something like that; you'll have to look at the man pages for at
and/or batch. You probably need something like atd running. I do not
know if current Linux distributions have that running by default.
You'll get an email when the job is finished.


HTH
Jan



R CMD BATCH [options] my_script.R [outfile]


Chris Mcowen chrismco...@gmail.com wrote:


Dear List,



I am unsure if this is the correct list to post to, if it isn't I apologise.



I am using SSH to access a Linux version of R on a remote computer as it
offers more memory and processing power. The model will take 1-2 days to
run, I am accessing R through Putty and when I close the connection and open
R again, I am faced with a new session.



As a Linux newbie, I was wondering if anybody here knew how to keep R
running and interactive and return to it on a later date?



Thanks



Chris




[R] CART with rpart

2011-12-02 Thread kende jan
dear all, 


I want to keep the terminal node (group) that the CART analysis assigns to each
observation in my data file, so that I can perform further statistical analyses by group.

Can you help me please?

thanks.

jan.



Re: [R] Read TXT file with variable separation

2011-11-29 Thread Jan van der Laan


Raphael,

This looks like fixed width format which you can read with read.fwf.

In fixed width format the columns are not separated by white space (or  
other characters), but are identified by the positition in the file.  
So in your file, for example the first field looks to contained in the  
first 2 columns of your file (the first 2 characters of every line),  
the second field in the next five columns, etc.
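
A minimal sketch of read.fwf on a made-up file. The widths (2, 5 and 10 characters) and the column names are assumptions for this example only; for your files the real widths have to come from the file layout:

```r
# Write two made-up fixed width lines to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("3100104RUA A     ",
             "3200207AVENIDA B "), tmp)

# widths gives the number of characters of each field, in order
dat <- read.fwf(tmp, widths = c(2, 5, 10),
                col.names = c("state", "city", "street"),
                colClasses = "character", strip.white = TRUE)
dat
```

Reading everything as character keeps leading zeros in the codes; the columns can be converted with as.numeric() afterwards where that makes sense.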


Regards,
Jan


Quoting Raphael Saldanha saldanha.plan...@gmail.com:


Hi!

I have to import some TXT files into R. The columns are separated by
varying numbers of blank spaces, but each file uses the
same layout. Example:

31  104 5 0   11RUA SAO
SEBASTIAO 25



 BAIRRO FILETO
  01
0020033854

The pattern is the same on each file.

There is two sample files attached to this message.

I would like to figure out how to import a single file, and then use
some code to import several files (like this
http://www.ats.ucla.edu/stat/r/code/read_multiple.htm)

When I try read.table, I receive this:

cnefe <- read.table("sample1.txt", header=FALSE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 1 did not have 17 elements


Information about my session:

sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

--
Kind regards,

Raphael Saldanha
saldanha.plan...@gmail.com




[R] Survival curves for case control and control

2011-11-21 Thread kende jan
Hi,

I want to produce survival curves for case and control subjects in
a propensity score-matched cohort, accounting
for the clustering of matched pairs. How can I do this in R?
Thanks for your help,
Jan


Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems

2011-11-17 Thread Jan van der Laan


I assume you use a command window to build your packages. One possible 
solution might be to leave out the path variables set by Rtools from 
your global path and to create a separate shortcut to cmd for building 
r-packages where you set your path as needed by R CMD build/check


Something like

cmd /K PATH 
c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files 
(x86)\MiKTeX 2.9\miktex\bin


(I haven't tried this so it might need some tinkering to get it to 
actually work)


HTH

Jan



On 17-11-2011 9:54, Rubén Roa wrote:



From: Rubén Roa
Sent: Thursday, 17 November 2011 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems



Hi,



I have to work under Windows, it's a company policy.



I've just found that there is a conflict between tools used to build R packages 
(Rtools) and ADMB due to the need to put Rtools compiler's location in the PATH 
environmental variable to make Rtools work.



On a Windows 7 64bit  with Rtools installed I installed ADMB-IDE latest version 
and although I could translate ADMB code to cpp code I could not build the cpp 
code into an executable via ADMB-IDE's compiler.



On another Windows machine, a Windows Vista 32bits with Rtools installed I also 
installed the latest ADMB-IDE and this time it was not possible to create the 
.obj file on the way to build the executable when building with ADMB-IDE. On 
this machine I also have a previous ADMB version (6.0.1) that I used to run 
from the DOS shell. This ADMB also failed to build the .obj file.



Now, going to PATH, the location info needed to make Rtools work is:

c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files 
(x86)\MiKTeX 2.9\miktex\bin;

If from this list I remove the reference to the compiler

c:\Rtools\MinGW\bin

then ADMB works again.



So beware of this conflict. Suggestion of a solution will be appreciated. 
Meanwhile, I run ADMB code in one computer and build R packages with Rtools in 
another computer.



Best



Ruben

--

Dr. Ruben H. Roa-Ureta

Senior Researcher, AZTI Tecnalia,

Marine Research Division,

Txatxarramendi Ugartea z/g, 48395, Sukarrieta,

Bizkaia, Spain






Re: [R] hierachical code system

2011-11-17 Thread Albert-Jan Roskam
Hi,
 
Thanks for your reply. Based on your suggestions, I managed to simplify the 
code, but only a little. I don't see how I could do without a loop, given the 
nestedness of the hierarchy. See the code below, which is working, but I'd like 
to simplify it further.
 
# sample data
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 
'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 
'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 
'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 
'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 
'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- as.numeric(c(NA, NA, 15074.23366, 4882.942034, 1619.59628, 
1801.722877, 1019.973666, NA, 503.9239317, 917.2189347, 6018.830465, 
1944.11311, 1427.575402, 1965.725428, NA, 5857.293612, 5933.770263, NA, 
6077.089518, 1427.180073, 455.9387993, 859.766603, 1002.983331, 2225.328211))
# data.frame() keeps value numeric; cbind() would first turn everything into text
df <- data.frame(code=theCodes, value=theValues, stringsAsFactors=FALSE)
 
# actual code
getDepth <- function(df) {
    # depth ("diepte") = number of dot-separated parts minus one
    df$diepte <- sapply(strsplit(df$code, "\\."), length) - 1
    return(df)
    }
getParents <- function(df) {
    # the parent code is the code without its last ".nn" part
    df$parent <- substr(df$code, 1, 4 + (df$diepte - 1) * 3)
    return(df)
    }
getTotals <- function(df, depth) {
    # sum the values at this depth per parent and write them into the parent rows
    s <- subset(df, diepte==depth)
    if(!"parent" %in% names(df)) s <- getParents(s)
    agg <- aggregate(s["value"], s["parent"], FUN=sum, na.rm=TRUE)
    merged <- merge(df, agg, by.x="code", by.y="parent", all=TRUE, 
                    suffixes=c("", "_summed"))
    isSum <- !is.na(merged$value_summed)
    merged[isSum, "value"] <- merged[isSum, "value_summed"]
    merged$value_summed <- merged$parent <- NULL
    return(merged)
    }
#library(debug)
#mtrace(getTotals)
df <- getDepth(df)
# work from the deepest level up to level 2
for( depth in max(df$diepte):2 ) {
    if (depth == max(df$diepte)) {
    x <- getTotals(df, depth)
    } else {
    x <- getTotals(x, depth)
    }
    }

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~



From: ONKELINX, Thierry thierry.onkel...@inbo.be
To: Albert-Jan Roskam fo...@yahoo.com; R Mailing List r-help@r-project.org
Sent: Wednesday, November 16, 2011 2:34 PM
Subject: RE: [R] hierachical code system

Dear Albert-Jan,

The easiest way is to create extra variables with the corresponding 
aggregation level. substr() and strsplit() can be your friends. Once you have 
those variables you can use aggregate() or any other aggregating function. You 
don't need loops.
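
One possible way to avoid the loop over depth levels, as a sketch on a small made-up example. Note that this is not the aggregate() approach described above but a variation on it: it sums, for every code, all leaf values at or below that code. It assumes that a row is a leaf exactly when it already has a value:

```r
codes  <- c("STAT.01", "STAT.01.01", "STAT.01.01.01", "STAT.01.01.02",
            "STAT.01.02")
values <- c(NA, NA, 10, 5, 7)

# Assumption: a row is a leaf exactly when it already has a value
is_leaf <- !is.na(values)

# For every code, add up the leaves that are the code itself or lie below it
# (i.e. whose code starts with "<code>.")
total <- sapply(codes, function(cd) {
  below <- codes == cd | substr(codes, 1, nchar(cd) + 1) == paste0(cd, ".")
  sum(values[is_leaf & below])
})

result <- data.frame(code = codes, value = unname(total))
result
```

Comparing with "<code>." rather than just "<code>" avoids matching a sibling such as STAT.010 when looking for the children of STAT.01.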

Best regards,

Thierry

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On behalf of Albert-Jan Roskam
 Sent: Wednesday, 16 November 2011 14:28
 To: R Mailing List
 Subject: [R] hierachical code system
 
 Hi,
 
 I have a hierarchical code system such as the example below (the printed data
 are easiest to read). I would like to write a function that returns an
 'imputed' data frame, i.e. where the parent values are calculated as the sum
 of the child values. So, for instance, STAT.01.01.06 is the sum of
 STAT.01.01.06.01 through STAT.01.01.06.06. The code I have written uses two
 for loops, and, moreover, does not work as intended. My starting point was to
 determine the code depth by counting the dots in the variable 'code' (using
 strsplit), then iterate over the tree from deep to shallow. Does anybody have
 a good idea as to how to approach this in R?
 
 theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02',
 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06',
 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03',
 'STAT.01.01.06.04',
 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01',
 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02',
 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
 theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628',
 '1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347',
 '6018.830465', '1944.11311', '1427.575402', '1965.725428', 'NA',
 '5857.293612',
 '5933.770263', '6077.089518', 'NA', '1427.180073', '455.9387993',
 '859.766603',
 '1002.983331', '2225.328211')
 df <- as.data.frame(cbind(code=theCodes, value=theValues))
 print(df)
                code       value
 1           STAT.01          NA
 2        STAT.01.01          NA
 3     STAT.01.01.01 15074.23366
 4     STAT.01.01.02 4882.942034
 5     STAT.01.01.03  1619.59628
 6     STAT.01.01.04 1801.722877
 7     STAT.01.01.05 1019.973666
 8     STAT.01.01.06          NA
 9  STAT.01.01.06.01 503.9239317
 10 STAT.01.01.06.02

[R] hierachical code system

2011-11-16 Thread Albert-Jan Roskam
Hi,
 
I have a hierarchical code system such as the example below (the printed data 
are easiest to read). I would like to write a function that returns an 
'imputed' data frame, i.e. one where the parent values are calculated as the sum 
of the child values. So, for instance, STAT.01.01.06 is the sum of 
STAT.01.01.06.01 through STAT.01.01.06.06. The code I have written uses two for 
loops and, moreover, does not work as intended. My starting point was to 
determine the code depth by counting the dots in the variable 'code' (using 
strsplit), then iterate over the tree from deep to shallow. Does anybody have a 
good idea as to how to approach this in R?
 
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 
'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 
'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 
'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 
'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 
'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628', 
'1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347', 
'6018.830465', '1944.11311', '1427.575402', '1965.725428', 'NA', '5857.293612', 
'5933.770263', '6077.089518', 'NA', '1427.180073', '455.9387993', '859.766603', 
'1002.983331', '2225.328211')
df <- as.data.frame(cbind(code=theCodes, value=theValues))
print(df)
   code   value
1   STAT.01  NA
2    STAT.01.01  NA
3 STAT.01.01.01 15074.23366
4 STAT.01.01.02 4882.942034
5 STAT.01.01.03  1619.59628
6 STAT.01.01.04 1801.722877
7 STAT.01.01.05 1019.973666
8 STAT.01.01.06  NA
9  STAT.01.01.06.01 503.9239317
10 STAT.01.01.06.02 917.2189347
11 STAT.01.01.06.03 6018.830465
12 STAT.01.01.06.04  1944.11311
13 STAT.01.01.06.05 1427.575402
14 STAT.01.01.06.06 1965.725428
15   STAT.01.02  NA
16    STAT.01.02.01 5857.293612
17    STAT.01.02.02 5933.770263
18    STAT.01.02.03 6077.089518
19 STAT.01.02.03.01  NA
20 STAT.01.02.03.02 1427.180073
21 STAT.01.02.03.03 455.9387993
22 STAT.01.02.03.04  859.766603
23 STAT.01.02.03.05 1002.983331
24   STAT.01.03 2225.328211
 

Thank you in advance!

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


Re: [R] Reading a specific column of a csv file in a loop

2011-11-15 Thread Jan van der Laan

Yet another solution. This time using the LaF package:

library(LaF)
d <- c(1, 4, 7, 8)
P1 <- laf_open_csv("M1.csv", column_types=rep("double", 10), skip=1)
P2 <- laf_open_csv("M2.csv", column_types=rep("double", 10), skip=1)
for (i in d) {
  M <- data.frame(P1[, i], P2[, i])
}

(The skip=1 is needed as laf_open_csv doesn't read headers)

Jan



On 11/08/2011 11:04 AM, Sergio René Araujo Enciso wrote:

Dear all:

I have two large files with 2000 columns. For each file I am
performing a loop to extract the ith element of each file and create
a data frame with both ith elements in order to perform further
analysis. I am not extracting all the ith elements but only certain
which I am indicating on a vector called d.

See  an example of my  code below

### generate an example for the CSV files, the original files contain
more than 2000 columns, here for the sake of simplicity they have only
10 columns
M1 <- matrix(rnorm(1000), nrow=100, ncol=10,
dimnames=list(seq(1:100),letters[1:10]))
M2 <- matrix(rnorm(1000), nrow=100, ncol=10,
dimnames=list(seq(1:100),letters[1:10]))
write.table(M1, file="M1.csv", sep=",")
write.table(M2, file="M2.csv", sep=",")

### the vector containing the i elements to be read
d <- c(1,4,7,8)
P1 <- read.table("M1.csv", header=TRUE, sep=",")
P2 <- read.table("M2.csv", header=TRUE, sep=",")
for (i in d) {
M <- data.frame(P1[i],P2[i])
rm(list=setdiff(ls(),d))
}

As the files are quite large, I want to include read.table within
the loop so as it only read the ith element. I know that there is
the option colClasses for which I have to create a vector with zeros
for all the columns I do not want to load. Nonetheless I have no idea
how to make this vector to change in the loop, so as the only element
with no zeros is the ith element following the vector d. Any ideas
how to do this? Or is there any other approach to load only a
specific element?
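
A minimal base R sketch of the colClasses idea, assuming a plain CSV file without row names. The file, the helper read_column and its arguments are made up for illustration. Entries of colClasses set to the string "NULL" make read.csv skip that column, so the vector can be rebuilt for each i inside the loop:

```r
# Write a small hypothetical CSV file to a temporary location.
tmp <- tempfile(fileext = ".csv")
M1 <- matrix(rnorm(50), nrow = 5, ncol = 10,
             dimnames = list(NULL, letters[1:10]))
write.csv(M1, tmp, row.names = FALSE)

# Read only column i of a CSV file with ncols columns.
read_column <- function(path, i, ncols) {
  classes <- rep("NULL", ncols)  # "NULL" means: skip this column
  classes[i] <- "numeric"        # keep only the i-th column
  read.csv(path, colClasses = classes)
}

d <- c(1, 4, 7, 8)
results <- lapply(d, function(i) read_column(tmp, i, ncols = 10))
```

Each call still scans the whole file, so for very wide files this saves memory rather than reading time.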

best regards,

Sergio René



[R] [R-pkgs] LaF 0.3: fast access to large ASCII files

2011-11-14 Thread Jan van der Laan
The LaF package provides methods for fast access to large ASCII files. 
Currently the following file formats are supported:


* comma separated format (csv) and other separated formats and
* fixed width format.

It is assumed that the files are too large to fit into memory, although 
the package can also be used to efficiently access files that do fit 
into memory.


In order to process files that are too large to fit into memory, methods 
are provided to access and process files blockwise. Furthermore, an 
opened file can be indexed as one would a data.frame. In this way 
subsets or specific columns can be read into memory. For example, 
assuming that an object laf has been created using one of the functions 
laf_open_csv or laf_open_fwf, the third column from the file can be read 
into memory using:


 col <- laf[, 3]


The LaF-manual vignette contains a description of all functionality 
provided:


  http://laf-r.googlecode.com/files/LaF-manual_0.3.pdf

The LaF-benchmark vignette compares the performance of LaF to the 
standard R-routines read.table and read.fwf:


  http://laf-r.googlecode.com/files/LaF-benchmark_0.3.pdf

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] JSS Special Volumes for 2011

2011-11-14 Thread Jan de Leeuw
So far in 2011 JSS has published 4 (four !) special volumes. If you have 
additional suggestions for special
volumes, let us know. Also, submit your JSS-adapted package vignettes. 

If you like what you see, friend us at 

http://www.facebook.com/jstatsoft

Tabelow and Whitcher, Guest Editors 
Volume 44: Magnetic Resonance Imaging in R
http://www.jstatsoft.org/v44

Altman, Fox, Jackman and Zeileis , Guest Editors
Volume 42: Political Methodology
http://www.jstatsoft.org/v42

Commandeur, Koopman, and Ooms, Guest Editors
Volume 41: Statistical Software for State Space Methods
http://www.jstatsoft.org/v41

Putter, Guest Editor
Volume 38: Competing Risks and Multi-State Models
http://www.jstatsoft.org/v38

Additional regular volumes, of course, at http://www.jstatsoft.org.

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: dele...@stat.ucla.edu
.mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
 
-
  No matter where you go, there you are. --- Buckaroo Banzai
   http://gifi.stat.ucla.edu/sounds/nomatter.au



[R] overloading + operator for chars

2011-11-02 Thread Albert-Jan Roskam
Hello,
 
I would like to overload the + operator so that it can be used to concatenate 
two strings, e.g. "John" + "Doe" = "JohnDoe".
How can I 'unseal' the + method?
> setMethod("+", signature(e1="character", e2="character"), function(e1, e2) 
  paste(e1, e2, sep="") )
Error in setMethod("+", signature(e1 = "character", e2 = "character"),  : 
  the method for function '+' and signature e1="character", e2="character" is 
sealed and cannot be re-defined
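
Since the + method for character is sealed, one common workaround (not part of the question above, and only one option) is to define a new infix operator instead. The name %+% is an arbitrary choice for this sketch; some packages define an operator with the same name, so a different name may be safer in practice:

```r
# A user-defined infix operator that concatenates strings without a separator.
`%+%` <- function(e1, e2) paste0(e1, e2)

full_name <- "John" %+% "Doe"
full_name

# paste0 is vectorised, so the operator is as well.
c("a", "b") %+% c("1", "2")
```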
 

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


Re: [R] Chi-Square test and survey results

2011-10-12 Thread Jan van der Laan

George,

Perhaps the site of the RISQ project (Representativity indicators for  
Survey Quality) might be of use: http://www.risq-project.eu/ . They  
also provide R-code to calculate their indicators.


HTH,
Jan



Quoting ghe...@mathnmaps.com:


An organization has asked me to comment on the validity of their
recent all-employee survey.  Survey responses, by geographic region, compared
with the total number of employees in each region, were as follows:


> ByRegion

          All.Employees Survey.Respondents
Region_1            735                142
Region_2            500                 83
Region_3            897                 78
Region_4            717                133
Region_5            167                 48
Region_6            309                  0
Region_7            806                125
Region_8            627                122
Region_9            858                177
Region_10           851                160
Region_11           336                 52
Region_12          1823                312
Region_13            80                  9
Region_14           774                121
Region_15           561                 24
Region_16           834                134

How well does the survey represent the employee population?
Chi-square test says, not very well:


> chisq.test(ByRegion)


Pearson's Chi-squared test

data:  ByRegion
X-squared = 163.6869, df = 15, p-value < 2.2e-16

By striking three under-represented regions (3,6, and 15), we get
a more reasonable, although still not convincing, result:


chisq.test(ByRegion[setdiff(1:16,c(3,6,15)),])


Pearson's Chi-squared test

data:  ByRegion[setdiff(1:16, c(3, 6, 15)), ]
X-squared = 22.5643, df = 12, p-value = 0.03166

This poses several questions:

1)  Looking at a side-by-side barchart (proportion of responses vs.
proportion of employees, per region), the pattern of survey responses
appears, visually, to match fairly well the pattern of employees.  Is
this a case where we trust the numbers and not the picture?

2) Part of the problem, ironically, is that there were too many responses
to the survey.  If we had only one-tenth the responses, but in the same
proportions by region, the chi-square statistic would look much better,
(though with a warning about possible inaccuracy):

data:  data.frame(ByRegion$All.Employees, 0.1 *   
(ByRegion$Survey.Respondents))

X-squared = 17.5912, df = 15, p-value = 0.2848

Is there a way of reconciling a large response rate with an unrepresentative
response profile?  Or is the bad news that the survey will give very precise
results about a very ill-specified sub-population?

(Of course, I would put it in softer terms, like "you need to assess the degree
of homogeneity across different regions".)

3) Is Chi-squared really the right measure of how representative is the
survey?
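
A hedged sketch of one alternative way to look at this, not taken from the thread: since the respondents are a subset of the employees, one can tabulate respondents against non-respondents per region and inspect the response rates directly. The counts are read from the table above; the variable names are made up for illustration:

```r
# Counts per region (Region_1 ... Region_16), as read from the table above.
employees   <- c(735, 500, 897, 717, 167, 309, 806, 627,
                 858, 851, 336, 1823, 80, 774, 561, 834)
respondents <- c(142, 83, 78, 133, 48, 0, 125, 122,
                 177, 160, 52, 312, 9, 121, 24, 134)

# Response rate per region.
rate <- respondents / employees
round(rate, 3)

# Respondents vs. non-respondents: a test of equal response rates across regions.
tab <- cbind(responded = respondents, not_responded = employees - respondents)
res <- chisq.test(tab)
res
```

The response rates show at a glance which regions are under-represented, which a single test statistic does not.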

 

Thanks for any help you can give - hope these questions make sense -

George H.



Re: [R] Applying function to only numeric variable (plyr package?)

2011-10-12 Thread Jan van der Laan


plyr isn't necessary in this case. You can use the following:

cols <- sapply(df, is.numeric)
df[, cols] <- pct(df[, cols])


round (and therefore pct) accepts a data.frame and returns a  
data.frame with the same dimensions. If that hadn't been the case  
colwise might have been of help:


library(plyr)
pct.colwise <- colwise(pct)
df[, cols] <- pct.colwise(df[, cols])
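
For reference, a self-contained base R version of the first approach, using the data from the question below. It is written with lapply() so that it would also work for a function that does not accept a data.frame; this is a sketch, not the only way to do it:

```r
# Data and function as in the question.
df <- data.frame(
  c1 = c("A", "B", "C", "C"),
  c2 = factor(c(1, 1, 2, 2), labels = c("Y", "N")),
  x  = c(0.5234, 0.6919, 0.2307, 0.1160),
  y  = c(0.9251, 0.7616, 0.3624, 0.4462)
)
pct <- function(x) round(100 * x, 1)

# Apply pct to the numeric columns only; all other columns are left untouched.
cols <- sapply(df, is.numeric)
df[cols] <- lapply(df[cols], pct)
df
```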

HTH,

Jan



Quoting michael.laviole...@dhhs.state.nh.us:



My data frame consists of character variables, factors, and proportions,
something like

c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100*x, 1)

I want to apply the pct function to only the numeric variables so that the
proportions are converted to percentages, and retain all the columns:

  c1 c2   x1   x2
1  A  Y 52.3 92.5
2  B  Y 69.2 76.2
3  C  N 23.1 36.2
4  C  N 11.6 44.6

I've been approaching it with the ddply and colwise functions from the plyr
package, but in that case I need each row to be its own group and
retain all columns. Am I on the right track? If not, what's the best way to
do this?

Thanks in advance,
M. L.



Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan
An obvious reason might be that your second argument should be a  
pointer to int.


As others have mentioned, you might want to have a look at Rcpp and/or  
inline. The documentation is good and I find it much easier to work  
with.


For example, your example could be written as:

library(Rcpp)
library(inline)

test <- cxxfunction(signature(x = "numeric"), '
Rcpp::NumericVector v(x);
Rcpp::NumericVector result(v.length());
for (int i = 0; i < v.length(); ++i) {
result[i] = v[i] + i;
}
return(result);
', plugin = "Rcpp")


HTH,

Jan


Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de:


Hello,

first thank you for your answers.

I did not read the whole "Writing R Extensions" PDF, but I read this
strongly shortened introduction to this subject:

http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf

I get the same error with this C-function:

void test(double * b, int l)
{
 int i;
 for(i=0; i < l ; i++) b[i] +=i;
}



I call it from R like this:

parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3);
.C("test", as.double(parameter), as.integer(11))

The program crashes even in this simple case.
Where can be the error?

Thanks
Grigory Alexandrovich







Answer 1
Without knowing that C code, we cannot know. Have you read Writing   
R Extensions carefully? I.e. take care with memory allocation and   
printing as mentioned in the manual.


Uwe Ligges



Answer 2
This looks like a classic case of not reading the manual, and then   
compounding it by not reading the posting guide. The manual would   
be the Writing R Extensions pdf that comes with R or you can   
google it. The posting guide is referenced at the bottom of this   
and every other posting on this mailing list.
There are nearly an infinite variety of errors that can lead to a   
crash, so it is really unreasonable of you to pose this question   
this way and expect constructive assistance.

---
Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---
Sent from my phone. Please excuse my brevity.



Answer 3


It's impossible to say, with such minimal information, but a reasonable
guess is that there is a problem with the declaration of x and y in
foo.c.  These would (I think) need to be declared as double *, not double,
when foo is called from .C().

   cheers,

   Rolf Turner



Answer 4


Hi,

As other have said, it's very difficult to help you without an example
+ code to know what you are talking about.

That having been said, it seems as if you are just getting your feet
wet in this R -- C bridge, and I'd recommend you checkout the Rcpp
and inline package to help make your life a lot easier ...

-steve










On 04.10.2011 14:04, Grigory Alexandrovich wrote:

Hello,

I wrote a function in C, which works fine if called from the
main-function in C.

But as soon as I try to call this function from R like .C('foo',
as.double(x), as.integer(y)), the program crashes.

I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and
loaded it into R with dyn.load().

What can be the cause of such behaviour?
Again, the C function itself works, but not if called from R.

Thanks
Grigory Alexandrovich



Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan

Quoting Uwe Ligges lig...@statistik.tu-dortmund.de:




I don't agree that it's overkill -- you get to sidestep the whole `R
CMD SHLIB ...` and `dyn.load` dance this way while you experiment with
C(++) code 'live' using the inline package.



You need two additional packages now where you have to rely on the fact
those are available. Moreover, you have to get used to that syntax, and
part of it seems to be C++ now? At least I do not know why the above
should work at all, while I know the simple C function does.


OK, I agree that switching to Rcpp/C++ might be a bit of overkill in
this example, although in a lot of other examples I find the Rcpp syntax
much more readable than the C code when dealing with .Call.

The example could also have been written in C using inline, removing the
need for Rcpp and looking more like the original example:

library(inline)

test <- cfunction(signature(b = "numeric", l = "integer"), '
 for(int i=0; i < *l; i++) b[i] += i;
 ', convention=".C")

I find that the advantage of using inline (especially in case of
simple functions like this) is that
1. I no longer need to compile and load the shared library manually,
which can sometimes be frustrating when Windows locks the DLL.
2. inline performs type checking and casts variables to the right type.  
You can now type test(1:10, 10) without needing as.numeric or  
as.integer, which reduces the amount of R code and the probability of  
screwing things up by passing the wrong type.



Jan




Uwe



It's really handy.


Just make the original source


void test(double *b, int *l)
{
int i;
for(i=0; i < *l ; i++) b[i] += i;
}


which you would have known after reading the Writing R Extensions manual.


I agree that this step is unavoidable no matter which avenue (Rcpp or
otherwise) one decides to take.

-steve





Re: [R] help with regexp

2011-10-05 Thread Albert-Jan Roskam
Hello!
 
library(gsubfn)
test <- c('filename_1_def.pdf', 'filename_2_abc.pdf')
gsubfn("(.+_)([a-z]+)(\\.pdf)", "\\2", test)

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~



From: Jannis bt_jan...@yahoo.de
To: r-h...@stat.math.ethz.ch
Sent: Wednesday, October 5, 2011 1:56 PM
Subject: [R] help with regexp

Dear list members, 


I am stuck with using regular expressions.


Imagine I have a vector of character strings like:

test - c('filename_1_def.pdf', 'filename_2_abc.pdf')

How could I use regular expressions to extract only the 'def'/'abc' parts of 
these strings?


Some try from my side yielded no results:

testresults <- grep('(?<=filename_[[:digit:]]_).{1,3}(?=.pdf)', perl = TRUE, 
value = TRUE)

Somehow I seem to miss some important concept here. Until now I always used 
nested sub expressions like:

testresults <- sub('.pdf$', '', sub('^filename_[[:digit:]]_', '' , test))


but this tends to become cumbersome and I was wondering whether there is a 
more elegant way to do this?
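
Two base R sketches for this, given as illustrations rather than the only options. The first uses a single sub() call with a capture group; the second keeps the lookaround pattern but extracts the match with regmatches(), since grep() only reports which elements match and does not return the matched part. The result names are made up:

```r
test <- c("filename_1_def.pdf", "filename_2_abc.pdf")

# Variant 1: one sub() call, keeping only the captured group.
by_sub <- sub("^filename_[[:digit:]]+_(.*)\\.pdf$", "\\1", test)

# Variant 2: lookbehind/lookahead, extracting the matched text itself.
m <- regexpr("(?<=filename_[[:digit:]]_).{1,3}(?=\\.pdf)", test, perl = TRUE)
by_lookaround <- regmatches(test, m)

by_sub
by_lookaround
```

The lookbehind must have a fixed length in PCRE, which is why the second variant allows only a single digit.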



Thanks for any help

Jannis





Re: [R] optimize R code: replace for loop

2011-10-05 Thread Albert-Jan Roskam
Hello,
 
I'd do:
ave(testvec, FUN=cumsum)+1

But in R everything can be done in a trillion different ways. ;-)

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~



From: ONKELINX, Thierry thierry.onkel...@inbo.be
To: Chris82 rubenba...@gmx.de; r-help@r-project.org r-help@r-project.org
Sent: Wednesday, October 5, 2011 11:54 AM
Subject: Re: [R] optimize R code: replace for loop

You can vectorize it using cumsum.

cumsum(c(1, testvec))

all.equal(final.sum, cumsum(c(1, testvec)))
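
A self-contained check of this, using the loop from the original question (quoted below) next to the vectorised version. The name vectorised is introduced here only for the comparison:

```r
testvec <- c(0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0)

# Loop version, as in the question: running total starting at 1.
sum.testvec <- numeric(length(testvec))
tempsum <- 1
for (e in seq_along(testvec)) {
  sum.testvec[e] <- tempsum + testvec[e]
  tempsum <- sum.testvec[e]
}
final.sum <- c(1, sum.testvec)

# Vectorised version: prepend the starting value and take the cumulative sum.
vectorised <- cumsum(c(1, testvec))

all.equal(final.sum, vectorised)
```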

 -----Original message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On behalf of Chris82
 Sent: Wednesday 5 October 2011 11:50
 To: r-help@r-project.org
 Subject: [R] optimize R code: replace for loop
 
 Dear R Users,
 
 at the moment I am trying to optimize an R script.
 
 testvec <- c(0,1,0,1,1,1,1,0,0,1,0,1,0)
 
 
 sum.testvec <- vector()
 tempsum <- 1
 for (e in 1:length(testvec)){
 sum.testvec[e] <- tempsum+testvec[e]
 tempsum <- sum.testvec[e]
 
 }
 
 final.sum <- c(1,sum.testvec)
 
 
 Is there an option to do something with apply? Unfortunately I am not so
 familiar with the apply functions.
 
 Thanks.
 
 --
 View this message in context: http://r.789695.n4.nabble.com/optimize-R-code-
 replace-for-loop-tp3873945p3873945.html
 Sent from the R help mailing list archive at Nabble.com.
 


[R] plot: how to fix the ratio of the plot box?

2011-10-02 Thread Hofert Jan Marius
Dear all,

this should be trivial, but I couldn't figure out how to solve it... I would 
like to have a plot with fixed aspect ratio of 1. Whenever I resize the Quartz 
window, the axes are extended so that the plot fills the whole window. However, 
if you have different extensions for the different axes, the plot does not look 
 like a square anymore (i.e., aspect ratio 1). The same of course happens if 
you print it to .pdf (ultimate goal). How can I fix the plot box (formed by the 
axes) ratio to be 1, meaning that the plot box is a square no matter how I 
resize the Quartz window?

I searched for this and found: 
http://tolstoy.newcastle.edu.au/R/help/05/04/2888.html
It is more or less recommended to use lattice's xyplot for that. Is there no 
solution for base graphics?
[I know that the extension is by default 4% and that's great, but the size 
of the Quartz window should not change this (which it does if you resize the 
window accordingly)].

Cheers,

Marius

Minimal example:
u <- runif(10)
pdf(width=5, height=5)
plot(u, u, asp=1, xlim=c(0,1), ylim=c(0,1), main="My title")
dev.off()



Re: [R] plot: how to fix the ratio of the plot box?

2011-10-02 Thread Hofert Jan Marius
ahh, perfect, thanks.

Cheers,

Marius

On 2011-10-02, at 13:08 , Jim Lemon wrote:

 On 10/02/2011 07:20 PM, Hofert Jan Marius wrote:
 Dear all,
 
 this should be trivial, but I couldn't figure out how to solve it... I would 
 like to have a plot with fixed aspect ratio of 1. Whenever I resize the 
 Quartz window, the axes are extended so that the plot fills the whole 
 window. However, if you have different extensions for the different axes, 
 the plot does not look  like a square anymore (i.e., aspect ratio 1). The 
 same of course happens if you print it to .pdf (ultimate goal). How can I 
 fix the plot box (formed by the axes) ratio to be 1, meaning that the plot 
 box is a square no matter how I resize the Quartz window?
 
 I searched for this and found: 
 http://tolstoy.newcastle.edu.au/R/help/05/04/2888.html
 It is more or less recommended to use lattice's xyplot for that. Is there no 
 solution for base graphics?
 [I know that the extension is by default 4% and that's great, but the 
 size of the Quartz window should not change this (which it does if you 
 resize the window accordingly)].
 
 Cheers,
 
 Marius
 
 Minimal example:
 u <- runif(10)
 pdf(width=5, height=5)
 plot(u, u, asp=1, xlim=c(0,1), ylim=c(0,1), main="My title")
 dev.off()
 
 Hi Marius,
 Have you tried:
 
 par(pty="s")
 
 after you open the device and before plotting?
 
 Jim
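
Put together with the minimal example, the suggestion looks roughly as follows. The temporary output file is an assumption made so that the sketch runs anywhere; asp=1 is dropped here because pty="s" already forces a square plot region and both axes have the same limits:

```r
u <- runif(10)

# Write to a temporary PDF file (hypothetical location for this sketch).
out <- tempfile(fileext = ".pdf")
pdf(out, width = 5, height = 5)

# pty = "s" requests a square plotting region, whatever the device size.
par(pty = "s")
plot(u, u, xlim = c(0, 1), ylim = c(0, 1), main = "My title")

dev.off()
```

par() has to be called after the device is opened, because graphical parameters are stored per device.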
 



[R] last observation carried forward +1

2011-09-30 Thread Jan Wijffels
Hi R-helpers

I'm looking for a vectorised function which does missing value replacement
as in last observation carried forward in the zoo package but instead of a
locf, I would like the locf function to add +1 to each time a missing value
occurred. See below for an example.

 > require(zoo)
 > x <- 5:15
 > x[4:7] <- NA
 > coredata(na.locf(zoo(x)))
 [1]  5  6  7  7  7  7  7 12 13 14 15
But what I need is
5  6  7  7+1  7+1+1  7+1+1+1  7+1+1+1+1 12 13 14 15
to obtain
[1]  5  6  7  8  9 10 11 12 13 14 15
I could program this in C but if anyone has already done this I would be
interested in seeing their vectorized solution.
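
One possible vectorised sketch in base R, assuming the first element of x is not missing (the names i, last and filled are made up for illustration). It finds, for every position, the index of the last observed value and adds the distance to that index:

```r
x <- 5:15
x[4:7] <- NA

i <- seq_along(x)

# Index of the most recent non-missing value at each position.
last <- cummax(ifelse(is.na(x), 0L, i))

# Last observed value plus the number of steps since it was observed.
filled <- x[last] + (i - last)
filled
```

For observed positions the distance is zero, so those values are returned unchanged.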

thanks,
Jan

-- 
groeten/kind regards,
Jan

Jan Wijffels
Statistical Data Miner
www.bnosac.be  | +32 486 611708



Re: [R] Data import

2011-09-26 Thread Jan van der Laan
You can with the routines in the memisc package. You can open a file 
using spss.system.file and then import a subset using subset. Look in 
the help pages of spss.system.file for examples.


HTH

Jan


On 09/25/2011 11:56 PM, sassorauk wrote:

Is it possible to import only certain variables from an SPSS file?

I know that read.spss in the foreign library will bring the data into R but
can I choose to import only chosen variables from the SPSS dataset to R?

Thanks for your help.

R

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-import-tp3842196p3842196.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R help on write.csv

2011-09-22 Thread Jan van der Laan
Rowwise is easy. The example code I gave does this: it appends the new 
data /below/ the old. I'll repeat the example below:


con <- file("d:test2.csv", "wt")
write.table(data, file=con, sep=";", dec=",", row.names=FALSE, 
col.names=TRUE)
write.table(data, file=con, sep=";", dec=",", row.names=FALSE, 
col.names=FALSE, append=TRUE)

close(con)

Or do you mean columnwise where you append columns? This would be very 
difficult in CSV. If you would like to do this you might have a look at 
the various options for exporting to Excel directly.  See for example 
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows . I have 
no experience in this.


Regards,
Jan

PS I am sorry for my previous triple post. I had a little fight with my 
webmail client.



On 09/22/2011 06:14 AM, Ashish Kumar wrote:


Is there a way we can append row wise, so that it all stacks up 
horizontally, the way you do it in xlswrite in matlab, where you can 
even specify the cell number from where you want to write.


-Ashish

*From:*R. Michael Weylandt [mailto:michael.weyla...@gmail.com]
*Sent:* Thursday, September 22, 2011 12:03 AM
*To:* Jan van der Laan
*Cc:* r-help@r-project.org; ashish.ku...@esteeadvisors.com
*Subject:* Re: [R] R help on write.csv

Oh darn, I had that line and then when I copied it to gmail I thought 
I'd be all slick and clean up my code: oh well...just not my day/thread...


It's possible to work around the repeated headers business (change to 
something like Call$col.names <- !append) but yeah, at this point 
I'm thinking it's perhaps better practice to direct the OP to the 
various connection methods: sink() is nice, but he'll probably have to 
do something to convert his object to a CSV-like string before printing:


apply(OBJ, 1, paste, collapse=",")

Michael Weylandt

On Wed, Sep 21, 2011 at 11:20 AM, Jan van der Laan e...@dds.nl wrote:


Michael,

Your example doesn't seem to work. append isn't passed on to the 
write.table call. You will need to add a


 Call$append <- append

to the function. And even then there will be a problem with the 
headers that are repeated when appending.



An easier solution is to use write.table directly (I am using 
Dutch/European csv format):


data <- data.frame(a=1:10, b=1, c=letters[1:10])
write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE,
col.names=TRUE)
write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE,
col.names=FALSE, append=TRUE)


When first opening a file connection and passing that to write.csv
or write.table, data is also appended. The problem with write.csv is
that writing the column names cannot be suppressed, which will result
in repeated column names:


con <- file("d:\\test2.csv", "wt")
write.csv2(data, file=con, row.names=FALSE)
write.csv2(data, file=con, row.names=FALSE)
close(con)

So one will still have to use write.table to avoid this:

con <- file("d:\\test2.csv", "wt")
write.table(data, file=con, sep=";", dec=",", row.names=FALSE,
col.names=TRUE)
write.table(data, file=con, sep=";", dec=",", row.names=FALSE,
col.names=FALSE, append=TRUE)
close(con)

Using a file connection is probably also more efficient when doing a 
large number of appends.


Jan







Quoting R. Michael Weylandt michael.weyla...@gmail.com:


Touche -- perhaps we could make one though?

write.csv.append <- function(..., append = TRUE)
{
   Call <- match.call(expand.dots = TRUE)
   for (argname in c("col.names", "sep", "dec", "qmethod"))
      if (!is.null(Call[[argname]]))
         warning(gettextf("attempt to set '%s' ignored", argname),
                 domain = NA)
   rn <- eval.parent(Call$row.names)
   Call$col.names <- if (is.logical(rn) && !rn)
      TRUE
   else NA
   Call$sep <- ","
   Call$dec <- "."
   Call$qmethod <- "double"
   Call[[1L]] <- as.name("write.table")
   eval.parent(Call)
}
write.csv.append(1:5, "test.csv", append = FALSE)
write.csv.append(1:15, "test.csv")

Output seems a little sloppy, but might work for the OP.

Michael Weylandt

On Wed, Sep 21, 2011 at 9:03 AM, Ivan Calandra
ivan.calan...@uni-hamburg.de wrote:

I don't think there is an append argument to write.csv()
(well, actually
there is one, but set to FALSE).
There is however one to write.table()
Ivan

On 9/21/2011 14:54, R. Michael Weylandt
michael.weyla...@gmail.com wrote:

 The append argument of write.csv()?


Michael

On Sep 21, 2011, at 8:01 AM, Ashish Kumar
ashish.ku...@esteeadvisors.com wrote:

 Hi,




I wanted to write the data created using R  on existing
csv file. However
everytime I use write.csv, it overwrites the values

Re: [R] R help on write.csv

2011-09-21 Thread Jan van der Laan


Michael,

Your example doesn't seem to work. append isn't passed on to the
write.table call. You will need to add a


 Call$append <- append

to the function. And even then there will be a problem with the
headers that are repeated when appending.



An easier solution is to use write.table directly (I am using  
Dutch/European csv format):


data <- data.frame(a=1:10, b=1, c=letters[1:10])
write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE,
col.names=TRUE)
write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE,
col.names=FALSE, append=TRUE)



When first opening a file connection and passing that to write.csv
or write.table, data is also appended. The problem with write.csv is
that writing the column names cannot be suppressed, which will result
in repeated column names:


con <- file("d:\\test2.csv", "wt")
write.csv2(data, file=con, row.names=FALSE)
write.csv2(data, file=con, row.names=FALSE)
close(con)

So one will still have to use write.table to avoid this:

con <- file("d:\\test2.csv", "wt")
write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=TRUE)
write.table(data, file=con, sep=";", dec=",", row.names=FALSE,
col.names=FALSE, append=TRUE)

close(con)

Using a file connection is probably also more efficient when doing a  
large number of appends.
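
A minimal sketch of the connection approach for many appends, assuming the chunks all have the same columns. The temporary file and the three-chunk loop are invented for illustration; the header is written only for the first chunk.

```r
# Hypothetical data and file name; only write.table and file() come from the thread.
path <- tempfile(fileext = ".csv")
chunk <- data.frame(a = 1:10, b = 1, c = letters[1:10])

con <- file(path, "wt")              # open once, write many times
for (k in 1:3) {
  write.table(chunk, file = con, sep = ";", dec = ",",
              row.names = FALSE,
              col.names = (k == 1))  # header only before the first chunk
}
close(con)

back <- read.csv2(path)
print(dim(back))
```

Because the connection stays open, each write.table call continues where the previous one stopped, so no append argument is needed.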


Jan






Quoting R. Michael Weylandt michael.weyla...@gmail.com:


Touche -- perhaps we could make one though?

write.csv.append <- function(..., append = TRUE)
{
   Call <- match.call(expand.dots = TRUE)
   for (argname in c("col.names", "sep", "dec", "qmethod"))
      if (!is.null(Call[[argname]]))
         warning(gettextf("attempt to set '%s' ignored", argname),
                 domain = NA)
   rn <- eval.parent(Call$row.names)
   Call$col.names <- if (is.logical(rn) && !rn)
      TRUE
   else NA
   Call$sep <- ","
   Call$dec <- "."
   Call$qmethod <- "double"
   Call[[1L]] <- as.name("write.table")
   eval.parent(Call)
}
write.csv.append(1:5, "test.csv", append = FALSE)
write.csv.append(1:15, "test.csv")

Output seems a little sloppy, but might work for the OP.
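
A simpler alternative to the wrapper above, sketched under the assumption that "append" should mean "write the header only when the file does not exist yet". The helper name append_csv is hypothetical and not part of base R; it only calls write.table and file.exists.

```r
# append_csv() is a made-up helper for illustration, not an existing R function.
append_csv <- function(x, path, sep = ",", dec = ".") {
  new_file <- !file.exists(path)
  write.table(x, file = path, sep = sep, dec = dec,
              row.names = FALSE,
              col.names = new_file,   # header only for a fresh file
              append = !new_file,     # otherwise add rows below the old data
              qmethod = "double")
  invisible(path)
}

path <- tempfile(fileext = ".csv")    # tempfile() returns a name, creates no file
append_csv(data.frame(x = 1:5), path)
append_csv(data.frame(x = 6:20), path)

back <- read.csv(path)
print(nrow(back))
```

The helper does not check that the new rows have the same columns as the existing file; that is left to the caller.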

Michael Weylandt

On Wed, Sep 21, 2011 at 9:03 AM, Ivan Calandra ivan.calan...@uni-hamburg.de

wrote:



I don't think there is an append argument to write.csv() (well, actually
there is one, but set to FALSE).
There is however one to write.table()
Ivan

On 9/21/2011 14:54, R. Michael Weylandt michael.weyla...@gmail.com
wrote:

 The append argument of write.csv()?


Michael

On Sep 21, 2011, at 8:01 AM, Ashish Kumar
ashish.ku...@esteeadvisors.com wrote:

 Hi,




I wanted to write data created in R to an existing csv file. However,
every time I use write.csv, it overwrites the values already there in the
existing csv file. Is there any workaround for this?



Thanks for your help



Ashish Kumar



Estee Advisors Pvt. Ltd.

Email: ashish.ku...@esteeadvisors.com

Cell: +91-9654072144

Direct: +91-124-4637-713







--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Dept. Mammalogy
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php




[R] Possible or not possible: serif axis labels with plotmath [but everything else sans serif]?

2011-09-19 Thread Hofert Jan Marius
Dear expeRts,

Is it possible to have serif labels in the following plot?

x <- 1:10
y <- x
plot(x, y, type="b", xlab=expression(x[1]), ylab=expression(x[2]))

I know that one can use pdf(, family="serif"), but then the axis tick
marks are also printed in a serif font. Apart from the fact that it may
not look nice, I'm just interested in whether one can have serif axis
labels but everything else in sans serif (default).

Cheers,

Marius
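
One possible approach, sketched here and not taken from the thread: suppress the labels in plot() and draw them afterwards with mtext(), which accepts a family argument. The output file is a temporary PDF chosen for illustration, and whether the serif face is applied to every plotmath glyph may depend on the device, so treat this as an assumption to check.

```r
# Assumption: mtext(family = "serif") changes only the text it draws itself.
out <- tempfile(fileext = ".pdf")
pdf(out)                                        # device default stays sans serif

x <- 1:10
y <- x
plot(x, y, type = "b", xlab = "", ylab = "")    # tick labels in the default font
mtext(expression(x[1]), side = 1, line = 3, family = "serif")  # x-axis label
mtext(expression(x[2]), side = 2, line = 3, family = "serif")  # y-axis label

dev.off()
```

The same idea should work with title(xlab = ..., family = "serif") on devices that pass graphical parameters through, but that variant is not shown here.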


Re: [R] Possible or not possible: serif axis labels with plotmath [but everything else sans serif]?

2011-09-19 Thread Hofert Jan Marius
Dear Eik,

although possible in this case, tikzDevice is certainly not a general solution 
to all kinds of problems :-) I used it for quite some time before I gave up: I 
had a simple bar plot, the bars being black. This already caused errors like 
TeX capacity exceeded ... and I obtained these a lot. In fact, enlarging the 
TeX capacity (not trivial but possible) did not solve these issues. That's why 
I gave up on this package [although, clearly, the idea of full TeX support is 
totally appealing -- that's why I looked at the package in the first place].

Cheers,

Marius

On 2011-09-19, at 14:38 , Eik Vettorazzi wrote:

 Hi Jan Marius,
 using the tikzDevice-package, nearly everything is possible (at least,
 all what can be done in LaTeX).
 
 cheers
 
 On 19.09.2011 11:58, Hofert Jan Marius wrote:
 Dear expeRts,
 
 Is it possible to have serif labels in the following plot?
 
 x <- 1:10
 y <- x
 plot(x, y, type="b", xlab=expression(x[1]), ylab=expression(x[2]))
 
 I know that one can use pdf(, family="serif"), but then the axis tick
 marks are also printed in a serif font. Apart from the fact that it may
 not look nice, I'm just interested in whether one can have serif axis
 labels but everything else in sans serif (default).
 
 Cheers,
 
 Marius
 
 
 -- 
 Eik Vettorazzi
 Institut für Medizinische Biometrie und Epidemiologie
 Universitätsklinikum Hamburg-Eppendorf
 
 Martinistr. 52
 20246 Hamburg
 
 T ++49/40/7410-58243
 F ++49/40/7410-57790
 
 

ETH Zurich
Dr. Marius Hofert
RiskLab, Department of Mathematics
HG E 65.2
Rämistrasse 101
8092 Zurich
Switzerland

Phone +41 44 632 2423
marius.hof...@math.ethz.ch
http://www.math.ethz.ch/~hofertj



Re: [R] Where to put tryCatch or similar in a very big for loop

2011-09-16 Thread Jan van der Laan

Laura,

Perhaps the following example helps:

nbstr <- 100
result <- numeric(nbstr)
for (i in seq_len(nbstr)) {
  # set the default value for when the current bootstrap fails
  result[i] <- NA
  try({
    # estimate your cox model here
    if (runif(1) < 0.1) stop("ERROR")
    result[i] <- i
  }, silent=TRUE)
}
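
A variant of the same loop using tryCatch, sketched under the assumption that the failures should also be counted and their messages kept. The random stop() is again only a stand-in for a model fit that sometimes fails.

```r
set.seed(1)
nbstr <- 100
result <- rep(NA_real_, nbstr)     # NA marks a failed bootstrap sample
errors <- character(nbstr)         # error message per iteration, "" if none

for (i in seq_len(nbstr)) {
  fit <- tryCatch({
    # stand-in for fitting the Cox model on bootstrap sample i
    if (runif(1) < 0.1) stop("simulated fitting error")
    i
  }, error = function(e) e)        # return the condition instead of stopping

  if (inherits(fit, "error")) {
    errors[i] <- conditionMessage(fit)
    next                           # move on to the next sample
  }
  result[i] <- fit
}

n_failed <- sum(nzchar(errors))
cat("failed iterations:", n_failed, "\n")
```

Returning the condition object from the handler avoids modifying variables from inside the handler and keeps the message for later inspection.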

Regards,
Jan




Quoting Bonnett, Laura l.j.bonn...@liverpool.ac.uk:


Hi,

The simulation occasionally generates either a rare event, meaning
that the Cox model is not appropriate, or a covariate with most
responses being the same, which means that the Cox model cannot be
fit.


At bootstrap sample number 10, the variable c11 is considered   
singular by model cox1.


Thanks,
Laura

-Original Message-
From: Ken [mailto:vicvoncas...@gmail.com]
Sent: 15 September 2011 21:43
To: Bonnett, Laura
Cc: Steve Lianoglou; r-help@r-project.org
Subject: Re: [R] Where to put tryCatch or similar in a very big for loop

What type of singularity exactly, if you're working with counts is   
it a special case? If using a Monte Carlo generation scheme, there   
are various workarounds such as while(sum(vec)!=0) {sample} for   
example. More info on the error circumstances would help.


   Good luck!
Ken Hutchison

On Sep 15, 2554 BE, at 11:41 AM, Bonnett, Laura   
l.j.bonn...@liverpool.ac.uk wrote:



Hi Steve,

Thanks for your response.  The slight issue is that I need to use a  
 different starting seed for each simulation.  If I use 'lapply'   
then I end up using the same seed each time.  (By contrast, I need   
to be able to specify which starting seed I am using).
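
A minimal sketch of how lapply can still use a different, known seed per simulation, assuming the seed can be set at the start of each run. The function one_run and the seed values are hypothetical stand-ins for the real bootrs() call.

```r
# one_run() is a made-up placeholder for a single simulation.
one_run <- function(seed) {
  set.seed(seed)            # each run starts from its own, reproducible seed
  mean(rnorm(10))           # stand-in for the real computation
}

seeds <- 1001:1010          # hypothetical starting seeds, one per simulation
results <- lapply(seeds, function(s) {
  tryCatch(one_run(s), error = function(e) NULL)   # NULL marks a failed run
})

failed <- vapply(results, is.null, logical(1))
print(sum(failed))
```

Looping over the seeds instead of over 1:n keeps a record of which seed produced which result, so a failing run can be repeated in isolation.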




Thanks,
Laura

-Original Message-
From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com]
Sent: 15 September 2011 16:17
To: Bonnett, Laura
Cc: r-help@r-project.org
Subject: Re: [R] Where to put tryCatch or similar in a very big for loop

Hi Laura,

On Thu, Sep 15, 2011 at 10:53 AM, Bonnett, Laura
l.j.bonn...@liverpool.ac.uk wrote:

Dear all,

I am running a simulation study to test variable imputation   
methods for Cox models using R 2.9.0 and Windows XP.  The code I   
have written (which is rather long) works (if I set nsim = 9) with  
 the following starting values.



bootrs(nsim=9,lendevdat=1500,lenvaldat=855,ac1=-0.19122,bc1=-0.18355,cc1=-0.51982,cc2=-0.49628,eprop1=0.98,eprop2=0.28,lda=0.003)


I need to run the code 1400 times in total (bootstrap resampling)   
however, occasionally the random numbers generated lead to a   
singularity and hence the code crashes as one of the Cox model   
cannot be fitted (the 10th iteration is the first time this   
happens).


I've been trawling the internet for ideas and it seems that there
are several options in the form of try() or tryCatch() or next.
I'm not sure, however, how to include them in my code (attached).
Ideally I'd like it to run every simulation from 1 to 1400 and, if
there is an error at some point, get an error message returned (I need
to count how many there are) but move on to the next number in the loop.


I've tried putting try(,silent=TRUE) around each cox model (cph
statement) but that hasn't worked, and I've also tried putting try
around the whole for loop without any success.


Let's imagine you are using an `lapply` instead of `for`, only because
I guess you want to store the results of `bootrs` somewhere, you can
adapt this to your `for` solution. I typically return NULL when an
error is caught, then filter those out from my results, or whatever
you like:

results <- lapply(1:1400, function(i) {
  tryCatch(bootrs(...whatever...), error=function(e) NULL)
})
went.south <- sapply(results, is.null)

The `went.south` vector will be TRUE where an error occurred in your
bootrs call.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] odfWeave: Combining multiple output statements in a function

2011-09-16 Thread Jan van der Laan


Page 7 in my version of formatting.odt (to be sure I have the right  
version I downloaded the latest odfWeave from CRAN) discusses  
registering style definitions and Examples of Changing Styles for  
Tables, Paragraphs, Bullets and Pages which has nothing to do with my  
question (as far as I can tell).  Could you perhaps just tell me how I  
should combine the output of multiple odf* calls inside a function?


Thanks again.

Jan


Quoting Max Kuhn mxk...@gmail.com:


formatting.odf, page 7. The results are in formattingOut.odt

On Thu, Sep 15, 2011 at 2:44 PM, Jan van der Laan rh...@eoos.dds.nl wrote:

Max,

Thank you for your answer. I have had another look at the examples (I
already had before mailing the list), but could not find the example you
mention. Could you perhaps tell me which example I should have a look at?

Regards,
Jan



On 09/15/2011 04:47 PM, Max Kuhn wrote:


There are examples in the package directory that explain this.

On Thu, Sep 15, 2011 at 8:16 AM, Jan van der Laan rh...@eoos.dds.nl wrote:


What is the correct way to combine multiple calls to odfCat, odfItemize,
odfTable etc. inside a function?

As an example let's say I have a function that needs to write two
paragraphs of text and a list to the resulting odf-document (the real
function has much more complex logic, but I don't think that's relevant).
My first guess would be:

exampleOutput <- function() {
  odfCat("This is the first paragraph")
  odfCat("This is the second paragraph")
  odfItemize(letters[1:5])
}

However, calling this function in my odf-document only generates the last
list, as only the output of the odfItemize function is returned by
exampleOutput. How do I combine the three results into one to be returned
by exampleOutput?

I tried to wrap the calls to the odf* functions into a print statement:

exampleOutput2 <- function() {
  print(odfCat("This is the first paragraph"))
  print(odfCat("This is the second paragraph"))
  print(odfItemize(letters[1:5]))
}

In another document this seemed to work, but in my current document
strange odf-output is generated.
Regards,

Jan












--

Max





[R] odfWeave: Combining multiple output statements in a function

2011-09-15 Thread Jan van der Laan


What is the correct way to combine multiple calls to odfCat,  
odfItemize, odfTable etc. inside a function?


As an example let's say I have a function that needs to write two
paragraphs of text and a list to the resulting odf-document (the real
function has much more complex logic, but I don't think that's
relevant). My first guess would be:


exampleOutput <- function() {
   odfCat("This is the first paragraph")
   odfCat("This is the second paragraph")
   odfItemize(letters[1:5])
}

However, calling this function in my odf-document only generates the  
last list as only the output of the odfItemize function is returned by  
exampleOutput. How do I combine the three results into one to be  
returned by exampleOutput?


I tried to wrap the calls to the odf* functions into a print statement:

exampleOutput2 <- function() {
   print(odfCat("This is the first paragraph"))
   print(odfCat("This is the second paragraph"))
   print(odfItemize(letters[1:5]))
}

In another document this seemed to work, but in my current document  
strange odf-output is generated.


Regards,

Jan



Re: [R] odfWeave: Combining multiple output statements in a function

2011-09-15 Thread Jan van der Laan

Max,

Thank you for your answer. I have had another look at the examples (I
already had before mailing the list), but could not find the example you
mention. Could you perhaps tell me which example I should have a look at?


Regards,
Jan



On 09/15/2011 04:47 PM, Max Kuhn wrote:

There are examples in the package directory that explain this.

On Thu, Sep 15, 2011 at 8:16 AM, Jan van der Laan rh...@eoos.dds.nl wrote:

What is the correct way to combine multiple calls to odfCat, odfItemize,
odfTable etc. inside a function?

As an example let's say I have a function that needs to write two paragraphs
of text and a list to the resulting odf-document (the real function has much
more complex logic, but I don't think that's relevant). My first guess would
be:

exampleOutput <- function() {
   odfCat("This is the first paragraph")
   odfCat("This is the second paragraph")
   odfItemize(letters[1:5])
}

However, calling this function in my odf-document only generates the last
list as only the output of the odfItemize function is returned by
exampleOutput. How do I combine the three results into one to be returned by
exampleOutput?

I tried to wrap the calls to the odf* functions into a print statement:

exampleOutput2 <- function() {
   print(odfCat("This is the first paragraph"))
   print(odfCat("This is the second paragraph"))
   print(odfItemize(letters[1:5]))
}

In another document this seemed to work, but in my current document strange
odf-output is generated.

Regards,

Jan



[R] list of all methods within an S4 class

2011-09-06 Thread Albert-Jan Roskam
Hello,
 
How can I generate an overview/vector of all the methods within an S4 class?
Similar to dir() in this Python code:
>>> class SomeClass():
...     def some_method_1(self):
...         pass
...     def some_method_2(self):
...         pass
...
>>> dir(SomeClass)
['__doc__', '__module__', 'some_method_1', 'some_method_2']
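
A rough R counterpart, sketched with a made-up class: in S4, methods belong to generic functions and not to the class, so the closest equivalent of dir() is to ask which generics have a method for the class. The class "Account" and the generics "deposit" and "describe" are hypothetical; showMethods() is in the methods package, and methods(class = ) is assumed to list S4 methods as well (R 3.2.0 or later).

```r
library(methods)

# Hypothetical class and generics, for illustration only.
setClass("Account", representation(balance = "numeric"))

setGeneric("deposit", function(obj, amount) standardGeneric("deposit"))
setMethod("deposit", "Account", function(obj, amount) {
  obj@balance <- obj@balance + amount
  obj
})

setGeneric("describe", function(obj) standardGeneric("describe"))
setMethod("describe", "Account", function(obj) {
  sprintf("balance: %.2f", obj@balance)
})

# Print every method whose signature mentions the class
showMethods(classes = "Account")

# The same information as an object that can be processed further
found <- methods(class = "Account")
print(found)
```

showMethods() only prints; the result of methods() is a character-like object and is therefore the closer match to the vector asked for.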
 
 
Thanks in advance!

Cheers!!
Albert-Jan


~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


[R] Question about object permanence/marshalling

2011-08-25 Thread Albert-Jan Roskam
Hello,
 
I am trying to write some code that dumps R objects to the harddisk in a binary 
format so they can be quickly re-used later. Goal is to save time. The objects 
may be quite large (e.g. classes for a GUI). I was thinking that save() and 
load() would be suitable for this (until now I only thought it could be used 
for 'real' data, e.g. matrices, data.frames etc), but I am hoping any object 
can be 'marshalled' using these functions. Probably I am doing something wrong 
in the unmarshal() function, perhaps with assign().
 
Thank you in advance!
 
AJ
 
 
#
# Creation of test data
#
setClass(
  Class="Test",
  representation=representation(
    amounts="data.frame"
  )
)
setMethod(
  f="initialize",
  signature="Test",
  definition=function(.Object, amounts){
    .Object@amounts <- amounts
    return(.Object)
  }
)
setGeneric(
  name="doStuff",
  def=function(.Object){standardGeneric("doStuff")}
)
setMethod(
  f="doStuff",
  signature="Test",
  definition=function(.Object) {
    return(mean(.Object@amounts, na.rm=TRUE))
  }
)

print( objects() )
instance <- new(Class="Test", data.frame(amount=runif(10, 0, 10)))
doStuff(instance)

#
# actual code (incomplete)
#
marshal <- function(object) {
    fn <- file.path(Sys.getenv()["TEMP"], paste(object, ".xdr", sep=""))
    save(object, file=fn, compress=FALSE)
    print(sprintf("Saving %s", fn))
    }
unmarshal <- function(xdr) {
    object <- strsplit(strsplit(xdr, "\\.")[[1]][[1]], "/")
    object <- object[[1]][length(object[[1]])]
    assign(object, load(xdr))
    print(sprintf("Loading %s", xdr))
    }
print(objects())
lapply(c("doStuff", "instance"), marshal)
rm(list=c("doStuff", "instance"))

xdrs <- Sys.glob(file.path(Sys.getenv()["TEMP"], "*.xdr"))
lapply(xdrs, unmarshal)
print(objects())  ## doStuff and instance do not appear! :-(


Cheers!!
Albert-Jan




Re: [R] Question about object permanence/marshalling

2011-08-25 Thread Albert-Jan Roskam
Hi Jim,
 
Thanks for your reply. It seems that save and load can only be used for 
datasets (as the title in ?load suggests).
I'd be very glad if I'm mistaken though! 

Cheers!!
Albert-Jan



From: jim holtman jholt...@gmail.com
To: Albert-Jan Roskam fo...@yahoo.com
Cc: R Mailing List r-help@r-project.org
Sent: Thursday, August 25, 2011 4:07 PM
Subject: Re: [R] Question about object permanence/marshalling

The problem, I think, is in your unmarshal.  'load' will load the object
into the local environment, not the global.  You have to explicitly
return it, which means you have to know the name that it was 'save'd
by:

 unmarshal <- function(xdr) {
     object <- strsplit(strsplit(xdr, "\\.")[[1]][[1]], "/")
     object <- object[[1]][length(object[[1]])]
     load(xdr)
     print(sprintf("Loading %s", xdr))
     ?  # return the object that was just loaded; don't know if they all have the same name.
     }
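
A minimal sketch of the point above, assuming the objects are saved by name with save(list=) and restored into an explicitly chosen environment with load(envir=). The envir and dir arguments, the two environments src and dst, and the example object amounts are choices made for this illustration, not part of the original code.

```r
# Save the object called `name` (looked up in `envir`) to <dir>/<name>.xdr.
marshal <- function(name, envir, dir = tempdir()) {
  fn <- file.path(dir, paste(name, ".xdr", sep = ""))
  save(list = name, file = fn, envir = envir)
  fn
}

# Restore into `envir`; load() returns the names of the restored objects.
unmarshal <- function(fn, envir) {
  load(fn, envir = envir)
}

src <- new.env()
assign("amounts", data.frame(amount = c(1, 2, 3)), envir = src)
fn <- marshal("amounts", envir = src)

dst <- new.env()
restored <- unmarshal(fn, envir = dst)
print(restored)                      # names of the objects that were loaded
print(get("amounts", envir = dst))   # the restored data frame
```

Passing globalenv() as envir would put the restored objects in the workspace, which is where objects() looks when called at top level.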









-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?





[R] How to colour specific edges in a dendrogram

2011-08-01 Thread Jan Teichmann
Dear Mailing-list

I used hclust to make a dendrogram of 2613 leaves. I also have a list
with the names of certain labels which are of interest, and I would like
to visualize where they appear within the dendrogram. I found an example
of how to use dendrapply to colour the labels, but the problem is that with
2613 leaves I cannot plot the labels, as it gets super messy.

I then tried to write a function using dendrapply() to colour the edges
of the leaves of interest red. Unfortunately, I failed to write this
function. Could someone help me out with the stub of a function for
colouring edges?

I have
  the dendrogram
  a list of labels whose edges should be coloured

I would like to colour the edges between each final leaf node and its
parent node.
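
A possible stub, offered only as a sketch: it assumes that setting the "edgePar" attribute on a leaf makes plot() draw the edge leading to that leaf in the given colour. The data (the first ten rows of USArrests) and the labels in wanted are stand-ins for the real dendrogram and label list.

```r
# Small stand-in for the real 2613-leaf dendrogram.
hc <- hclust(dist(USArrests[1:10, ]))
dend <- as.dendrogram(hc)

# Stand-in for the list of labels of interest.
wanted <- c("Alaska", "Florida")

# Mark the edge leading to each leaf of interest.
colourLeafEdge <- function(node) {
  if (is.leaf(node) && attr(node, "label") %in% wanted) {
    attr(node, "edgePar") <- list(col = "red")
  }
  node
}

dend2 <- dendrapply(dend, colourLeafEdge)

# leaflab = "none" suppresses the leaf labels, which would be unreadable
# with thousands of leaves.
plot(dend2, leaflab = "none")
```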

Thank you very much for your help!
Jan




[R] time series question

2011-06-28 Thread Albert-Jan Roskam
Hi,

I have a long data file with time data that I change to wide format using 
reshape. The data contain Values and Factors. Some values are missing but can 
be obtained by multiplying the value of year T-1 by the Factor of year T. 
Sometimes multiple successive years have no values, so the missing values need 
to be calculated sequentially. 


# sample data.
library(reshape)
DF <- data.frame(Var=rep(letters, 10), 
                 Fac=rep(runif(26), 10), 
                 Val=rep(runif(26), 10), 
                 Year=rep(2000:2009, 26))
DF[as.numeric(rownames(DF)) %% 3 == 0, "Val"] <- NA # make some holes
DF2 <- cast(melt(DF, id=c("Var", "Year")), ... ~ variable + Year, fun=mean, 
            na.rm=T)

# my attempt
prev <- grep("Val_", names(DF2)) - 1
this <- grep("Fac_", names(DF2))
DF3 <- DF2
DF3[, prev] <- mapply("*", DF2[, this], DF2[, prev])

 
This doesn't work. Another option would be to use two loops for cols and rows, 
but I didn't get that to work either :-(
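
The sequential rule itself (a missing value equals the previous year's value times this year's factor) can be sketched for a single series in year order as below. fillForward is a hypothetical helper name and the numbers are made up; applying it per variable to the long data is left open here.

```r
# Fill missing values in year order: val[t] = val[t - 1] * fac[t].
# A gap at the very start of a series cannot be filled and stays NA.
fillForward <- function(val, fac) {
  for (t in seq_along(val)[-1]) {
    if (is.na(val[t]) && !is.na(val[t - 1])) {
      val[t] <- val[t - 1] * fac[t]
    }
  }
  val
}

val <- c(10, NA, NA, 5)     # two successive missing years
fac <- c(1, 2, 0.5, 3)
filled <- fillForward(val, fac)
print(filled)               # 10 20 10 5
```

Because each filled value feeds the next one, a plain loop over years is the simplest correct form; a single vectorised multiplication cannot handle runs of missing years.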

Suggestions for clean code, anyone?

Thank you in advance!

Cheers!!
Albert-Jan




[R] Standards for delivery of GPL software in CRAN packages

2011-06-27 Thread Galkowski, Jan
I wondered if there were standard practices in CRAN for delivery of R source 
implementing functions in R packages. I have encountered a couple of packages 
where the gzipped version of the source contains very little, primarily the Help 
files describing the functions in the package. In some cases I can find the 
source as the value of the function name.

Given that these packages are released as GPL, oughtn't the unoptimized source 
be freely available, hopefully with comments? Am I missing something? Is there 
a central place other than mirrors where such source is retained? Sourceforge?
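
For reference, "the source as the value of the function name" can be seen with any installed function. The snippet below is only an illustration using stats::sd, which is not one of the packages discussed here; comments are usually absent from what is printed this way.

```r
# Typing a function's name at the prompt prints its definition;
# deparse() returns the same text as a character vector.
src <- deparse(stats::sd)
cat(src, sep = "\n")

# The argument list and body can also be inspected separately.
print(names(formals(stats::sd)))
print(body(stats::sd))
```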

  - Jan



Re: [R] Standards for delivery of GPL software in CRAN packages

2011-06-27 Thread Galkowski, Jan
Fine.  Attached. It's waved.

All it has is *.Rd files. Apparently the functions are collected in 
functionINIT.R. But 00Index and DESCRIPTION are not helpful.

 - j

-Original Message-
From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On 
Behalf Of Barry Rowlingson
Sent: Monday, June 27, 2011 10:18 AM
To: Galkowski, Jan
Cc: r-help@r-project.org
Subject: Re: [R] Standards for delivery of GPL software in CRAN packages

On Mon, Jun 27, 2011 at 1:24 PM, Galkowski, Jan jgalk...@akamai.com wrote:
 I wondered if there were standard practices in CRAN for delivery of R source 
 implementing functions in R packages. I has encountered a couple of packages 
 where the gzipped version of source contains very little, primarily the Help 
 files describing the functions in the package. In some cases I can find the 
 source as the value of the function name.

 Given that these packages are released as GPL, oughtn't the unoptimized 
 source be freely available, hopefully with comments? Am I missing something? 
 Is there a central place other than mirrors where such source is retained? 
 Sourceforge?


The 'package source' link on CRAN should point you to a tar.gz file
that contains the source code. For example, for splancs off the heanet
mirror it is:

http://ftp.heanet.ie/mirrors/cran.r-project.org/src/contrib/splancs_2.01-27.tar.gz

 .tar.gz files from those links should have full R, C and Fortran source code.

I think we need counter-examples...

Barry


Re: [R] Standards for delivery of GPL software in CRAN packages

2011-06-27 Thread Galkowski, Jan
No, you are correct. It meets the letter of the GPL. It took me a while to find 
functionINIT.R, though. As I wrote in the original, the source is available, if 
only by keying in function names and seeing their value. I was hoping for 
greater clarity.

I of course have the paper and the help. I was trying to understand what this 
eta parameter was and how to interpret it.

As mentioned, not the first package this has been an issue for.

Thanks,

   - Jan


- Original Message -
From: Gavin Simpson gavin.simp...@ucl.ac.uk
To: Galkowski, Jan
Cc: Barry Rowlingson b.rowling...@lancaster.ac.uk; r-help@r-project.org 
r-help@r-project.org
Sent: Mon Jun 27 11:36:57 2011
Subject: Re: [R] Standards for delivery of GPL software in CRAN packages

On Mon, 2011-06-27 at 11:14 -0400, Galkowski, Jan wrote:
 Fine.  Attached. It's waved.
 
 All it has is *.Rd files. Apparently the functions are collected in
 functionINIT.R. But 00Index and DESCRIPTION are not helpful.
 
  - j

The Rd files are the help or manual pages for the functions defined in
the package. In this particular case, the package author has decided to
put all the R code for their package into a single R source file -
functionINIT.R. The other two files are R-specific files, the latter of
which is used to describe the package and, incidentally, points you
to a peer-reviewed paper that the package supports.

I don't recall the GPL mentioning anything requiring that the source
code be helpful. The authors have most certainly fulfilled their
requirements under GPL, as has CRAN in distributing the package sources.

Or am I being obtuse and completely missing your point?

G


-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Standards for delivery of GPL software in CRAN packages

2011-06-27 Thread Galkowski, Jan
Regarding the subject, I want to thank the many respondents for clarifying the 
nature of the relationship between R and the GPL, as well as giving help with 
the structure of R-delivered source.

I want to emphasize I meant nothing at all harsh or accusatory in my email. I 
did say I had access to source, as function name values, just that my 
expectation was to see functions called out individually. 

In this particular case I sought information about a maxiset threshold 
parameter in the package waved, for the function WaveD, and what it meant, 
while trying to understand the related algorithms.  I'm now convinced that 
I'll need to understand the original papers, Cavalier and Raimondo (2007) and 
Donoho and Raimondo (2004), as well as Johnstone, Kerkyacharian, Picard, and 
Raimondo (2004), in order to obtain a satisfactory answer.

I think my reaction was in response to being rather spoiled by some of the 
really excellent, world class, and mature packages and their documentation 
elsewhere in the R contributions library, some backed up by whole textbooks. I 
realize all package authors do their best and the packages are thoroughly 
tested.  I never had any question the package was correct, merely trying to 
understand how it worked. When I wasn't satisfied by the documentation in the 
package, I turned to the source.

Again, I meant no offense to anyone. I thank you all for your responses and 
efforts, and am grateful.

 - Jan



Re: [R] Ranking submodels by AIC (more general question)

2011-06-23 Thread Jan van der Laan

Alexandra,

Have a look at add1 and drop1.
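
A small sketch of what that might look like, using mtcars and an arbitrary model chosen only for illustration. drop1() ranks single-term deletions; the second half enumerates all non-empty subsets of the predictors by hand, which is one possible way to get a full ranking and is an addition, not something add1/drop1 do themselves. The AIC column of drop1() comes from extractAIC() and differs from AIC() by a constant, so the two tables should not be mixed.

```r
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Single-term deletions from the full model, ranked by AIC.
tab <- drop1(fit)
ranked <- tab[order(tab$AIC), ]
print(ranked)

# All non-empty subsets of the predictors, ranked by AIC().
predictors <- c("wt", "hp", "qsec")
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)
aics <- sapply(subsets, function(s) {
  AIC(lm(reformulate(s, response = "mpg"), data = mtcars))
})
ranking <- data.frame(model = sapply(subsets, paste, collapse = " + "),
                      AIC = aics)
ranking <- ranking[order(ranking$AIC), ]
print(ranking)
```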

Regards,
Jan


On 06/23/2011 07:32 PM, Alexandra Thorn wrote:

Here's a more general question following up on the specific question I
asked earlier:

Can anybody recommend an R command other than mle.aic() (from the wle
package) that will give back a ranked list of submodels?  It seems like
a pretty basic piece of functionality, but the closest I've been able to
find is stepAIC(), which as far as I can tell only gives back the best
submodel, not a ranking of all submodels.

Thanks in advance,
Alexandra



Re: [R] Documenting variables, dataframes and files?

2011-06-22 Thread Jan van der Laan

The memisc package also offers functionality for documenting data.
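
Independently of memisc, base R's comment() can attach a free-text note to a data frame or to a single column; this is a minimal sketch with made-up data and descriptions, not a demonstration of the memisc functions.

```r
df <- data.frame(id = 1:3, income = c(210, 340, 180))

# Attach notes to the data frame and to one of its columns.
comment(df) <- "Example extract; one row per respondent"
comment(df$income) <- "Income, hypothetical units"

# The notes travel with the object (e.g. through save() and load())
# but are not shown when the data frame is printed.
print(comment(df))
print(comment(df$income))
```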

Jan

On 06/22/2011 04:57 PM, Robert Lundqvist wrote:

Every now and then I realize that my attempts to document what all my dataframes 
consist of are insufficient. So far, I have been writing notes in an external 
file. Are there any better ways to do this within R? One possibility could be 
to set up the data as packages, but I would like to have a solution on a lower 
level, closer to the data. I can't find any pointers in the standard manuals. 
Suggestions are most welcome.

Robert

**
Robert Lundqvist
Norrbotten regional council
Sweden




Re: [R] omitting columns from a data frame

2011-06-21 Thread Albert-Jan Roskam
But isn't this version of which() typo-proof?
 x <- iris[-which(names(iris) %in% c("Sepal.Length", "SSSepal.Width"))]
Btw, I prefer the following, i.e. simply assigning NULL. Much easier notation.
 y <- iris
 y$Sepal.Width <- y$SSSepal.Width <- NULL


 Cheers!!
Albert-Jan







From: Joshua Wiley jwiley.ps...@gmail.com
To: Ista Zahn iz...@psych.rochester.edu
Cc: R-Help r-h...@stat.math.ethz.ch
Sent: Tue, June 21, 2011 5:05:27 PM
Subject: Re: [R] omitting columns from a data frame

On Tue, Jun 21, 2011 at 6:57 AM, Ista Zahn iz...@psych.rochester.edu wrote:
 I would caution people not to use the -which strategy because entering
 a value that doesn't exist as a column name returns a zero-column
 data.frame, without so much as a warning. This can be a problem if you
 don't know if a column exists but just want to make sure it doesn't,
 or if you make a typo. Compare

Good point.  In some ways, I am a little unsettled by setdiff()
because if you make a typo, you may *think* you have omitted it, and
you will have a sensible data frame, but it will actually still be
there.  I am particularly thinking of the case where you are omitting
several variables at once:

mtcars[setdiff(names(mtcars), c("disp", "jp"))]

which is why my current preference has been match().  The default for
no match fails spectacularly if the variable does not exist:

mtcars[-match(c("disp", "jp"), names(mtcars))]

of course, this would not work for your example of a variable you just
want to make sure is deleted.  Anyone have thoughts on pitfalls of
match?
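
The three behaviours discussed in this thread can be compared side by side in a short sketch; "jp" is the deliberate typo from the example above, and the variable names are chosen only for this illustration.

```r
drop <- c("disp", "jp")   # "jp" is a typo for "hp"

# setdiff(): the typo is silently ignored, only "disp" is removed.
viaSetdiff <- mtcars[setdiff(names(mtcars), drop)]
print(ncol(viaSetdiff))   # 10 of the original 11 columns remain

# match(): the NA produced by the typo makes the subscript invalid.
viaMatch <- tryCatch(mtcars[-match(drop, names(mtcars))],
                     error = function(e) "error")
print(viaMatch)

# -which(): a name that matches nothing gives a zero-column data frame.
viaWhich <- mtcars[, -which(names(mtcars) == "make.sure.to.delete")]
print(ncol(viaWhich))     # 0
```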

Josh

 head(mtcars[, -which(names(mtcars) == "make.sure.to.delete")])

 to

 head(mtcars[, setdiff(names(mtcars), "make.sure.to.delete")])

 Best,
 Ista

 On Tue, Jun 21, 2011 at 12:22 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 On Mon, Jun 20, 2011 at 8:55 PM, Erin Hodgess erinm.hodg...@gmail.com 
wrote:
 Too funny!

 how about subset?

 Sure, that is one option.  Each of the following will also work.  The
 ones wrapped with c() can easily omit more than one at a time.

 mtcars[, -which(names(mtcars) == "drat")]
 mtcars[, names(mtcars) != "drat"]
 mtcars[, !names(mtcars) %in% c("drat")]
 mtcars[, -match(c("drat"), names(mtcars))]


 On Mon, Jun 20, 2011 at 10:52 PM, Joshua Wiley jwiley.ps...@gmail.com 
wrote:
 Hi Erin,

 See inline.

 On Mon, Jun 20, 2011 at 8:45 PM, Erin Hodgess erinm.hodg...@gmail.com 
wrote:
 Dear R People:

 I have a data frame, xm1, which has 12 rows and 4 columns.

 If I put in xm1[,-4], I get all rows, and columns 1 - 3, which is as
 it should be.

 Okay, so you know how to use the column number to omit columns.


 Now, is there a way to use the names of the columns to omit them, please?

 You have all the pieces (the column names, and the knowledge that you
 can omit columns by their index).

 Homework: find a way to return the column numbers given the column names 
(hint).

 Cheers,

 Josh





 Sincerely,
 Erin


 --
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: erinm.hodg...@gmail.com





 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/




 --
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: erinm.hodg...@gmail.com




 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/





 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org


Re: [R] is this a bug?

2011-06-18 Thread Albert-Jan Roskam
Thanks a lot to all who responded. This is a little less confusing now, 
although it's hard for me to fathom the (practical) use of a dataframe within 
a dataframe. If one mixes different notations or, put differently, different 
underlying classes (data.frame vs. numeric), these rather unintuitive results 
appear.
So I'll use either of these:
df$pct <- df$weight / ave(df$weight, df$sex, FUN=sum)*100
df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100

Using str() is very insightful, as is using class().

I'd prefer it if R simply generated an error when one attempts to nest a 
data.frame within a data.frame.

Thanks again!

 Cheers!!
Albert-Jan







From: Brian Diggs dig...@ohsu.edu
To: R-help@r-project.org
Sent: Fri, June 17, 2011 11:58:44 PM
Subject: Re: [R] is this a bug?

On 6/17/2011 2:24 PM, (Ted Harding) wrote:
 And the extra twist in the tale is exemplified by this
 mini-version of Albert-Jan's first example:

DF <- data.frame(A=c(1,2,3))
DF$B <- c(4,5,6)
DF$C <- c(7,8,9)
DF
#   A B C
# 1 1 4 7
# 2 2 5 8
# 3 3 6 9

DF$D <- DF["A"]/DF["B"]
DF
#   A B C    A
# 1 1 4 7 0.25
# 2 2 5 8 0.40
# 3 3 6 9 0.50

 ##And why:

DF["A"]/DF["B"]
#      A
# 1 0.25
# 2 0.40
# 3 0.50

 ##So the ratio DF["A"]/DF["B"] comes out with the name of
 ##the numerator, "A". This is then the name given to DF$D
It's even slightly weirder than that:

str(DF)
#'data.frame':   3 obs. of  4 variables:
# $ A: num  1 2 3
# $ B: num  4 5 6
# $ C: num  7 8 9
# $ D:'data.frame':  3 obs. of  1 variable:
#  ..$ A: num  0.25 0.4 0.5

There is a column D in DF which is itself a data frame with a single 
column whose name is A (because of what Ted said).  When formatted for 
printing out, the column name of the inner data frame is used (as a 
result of how data.frame() itself handles named arguments when the 
argument is itself a data.frame: "If a list or data frame or matrix is 
passed to data.frame it is as if each component or column had been 
passed as a separate argument...").

So not a bug, but a convoluted set of circumstances that can happen when 
non-atomic vectors are assigned to columns of a data.frame.  That's one 
of those /you shouldn't do that even though it is technically legal or 
at least you shouldn't be surprised when things don't work the way you 
thought they would/ things.

 Thus Albert-Jan's
df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
 comes through with name "weight".

 Ted.


 On 17-Jun-11 21:06:42, William Dunlap wrote:
 df$varname is a column of df.

 df["varname"] is a one-column df containing that column.

 df[["varname"]] is a column of df (same as df$varname).

 df[,"varname"] is a column of df (same as df$varname).

 df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).

 df$newVarname <- df["varname"] inserts a new component
 into df, the component being a one-column data.frame,
 not the column in that data.frame.
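
The distinctions listed above can be checked directly with class(). This is a small sketch on made-up data; the column names nested and pct are chosen only for the illustration.

```r
df <- data.frame(weight = c(86, 19, 32), sex = c(0, 1, 0))

print(class(df$weight))                      # "numeric"
print(class(df[["weight"]]))                 # "numeric"
print(class(df["weight"]))                   # "data.frame"
print(class(df[, "weight", drop = FALSE]))   # "data.frame"

# Assigning a one-column data frame with $<- nests it inside df.
df$nested <- df["weight"]
print(class(df$nested))                      # "data.frame"

# Using [[ ]] on both sides keeps everything a plain numeric column.
df$pct <- df[["weight"]] / ave(df[["weight"]], df[["sex"]], FUN = sum) * 100
str(df)
```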

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com


[R] is this a bug?

2011-06-17 Thread Albert-Jan Roskam
Hello,

Is the following a bug? I always thought that df$varname <- does the same as 
df["varname"] <-

> df <- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
> df$pct <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> names(df)
[1] "weight" "sex"    "pct"    ### <-- ok
> head(df)
  weight sex    weight  ### <-- huh!?!
1     86   0 2.4002233
2     19   1 0.5643006
3     32   0 0.8931063
4     87   0 2.4281328
5     45   0 1.2559308
6     95   0 2.6514094
> rm(df)
> df <- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
> df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100 ### <-- this does work
> names(df)
[1] "weight" "sex"    "pct"
> head(df)
  weight sex       pct
1     15   0 0.5246590
2     43   0 1.5040224
3     17   1 0.9284544
4     44   1 2.4030584
5     76   1 4.1507373
6     59   0 2.0636586
> do.call(c, R.Version())
                       platform                            arch
            "i686-pc-linux-gnu"                          "i686"
                             os                          system
                    "linux-gnu"               "i686, linux-gnu"
                         status                           major
                             ""                             "2"
                          minor                            year
                         "11.1"                          "2010"
                          month                             day
                           "05"                            "31"
                        svn rev                        language
                        "52157"                             "R"
                 version.string
"R version 2.11.1 (2010-05-31)"
> # Thanks!

Cheers!!
Albert-Jan




Re: [R] Running a GMM Estimation on dynamic Panel Model using plm-Package

2011-06-12 Thread Jan Schulz

Hi!

On 12.06.2011 21:43, bstudent wrote:

Error in solve.default(Reduce("+", A2)) :
   System ist für den Rechner singulär: reziproke Konditionszahl =
4.08048e-22

Error in solve.default(Reduce("+", A2)) :
   system is computationally singular: reciprocal condition number =
4.08048e-22


Just for the record: I had the same error with my data and finally gave 
up and used Stata.
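
For readers hitting the same message: it comes from solve() being handed a matrix that is singular, or so close to singular that it cannot be inverted reliably. The toy matrix below is made up and has nothing to do with the plm model; it only shows how rcond() can be used to inspect a matrix before inverting it.

```r
# The second column is exactly twice the first, so A has no inverse.
A <- matrix(c(1, 2, 2, 4), nrow = 2)

# Reciprocal condition number: values near zero signal trouble.
print(rcond(A))

# solve() refuses to invert such a matrix; capture the message.
res <- tryCatch(solve(A), error = function(e) conditionMessage(e))
print(res)

# A well-conditioned matrix inverts without complaint.
B <- diag(2)
print(rcond(B))   # 1
print(solve(B))
```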


Kind regards and good luck!

Jan



[R] reshape::cast: invalid 'yinds' argument

2011-05-31 Thread Albert-Jan Roskam
Hi,

I'm using reshape to cast molten data. When I use the following command, R 
either crashes (when I use Notepad++) or gives an error (when I use Rgui or 
source()), BUT the error does not always occur, maybe only on half the attempts:
w <- cast(v, id + code + productname + year + begin + end + specificDesc + 
          specificDesc2 ~ type)
Error in merge.data.frame(data, all.combinations, by = unlist(vars), sort = FALSE,  : 
  invalid 'yinds' argument

What does this message mean, and how can I get rid of the error? I tried 
changing the colvars from character to factor, but that didn't help.

I'm using R 2.10.1 and either WinXP or Win2000.

 Thanks in advance,
Albert-Jan




Re: [R] NaN, Inf to NA

2011-05-27 Thread Albert-Jan Roskam
Aha! Thank you very much for that clarification! It would be much more 
user-friendly if R raised a NotImplementedError or something similar. The 
'garbage results' are pretty misleading, especially to a novice.

I wanted to recode every NaN and Inf value of an entire data.frame to NA. The 
data.frame also includes character variables. So the following might work (?) 
(I can't test it here):

ditch <- function(x) ifelse(is.infinite(x) | is.nan(x), NA, x)
df <- apply(df, 2, ditch)
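
A hedged note on the snippet above: apply() first converts the data.frame to a matrix, and with character columns present that matrix is character, so is.infinite() and is.nan() would find nothing to recode. Below is a minimal sketch of a column-wise alternative using base R only. The helper name recode_nonfinite and the example columns are made up for illustration.

```r
# Example data with a numeric and a character column (illustrative only)
df <- data.frame(a = c(NA, NaN, Inf, 1:3),
                 b = letters[1:6],
                 stringsAsFactors = FALSE)

# Hypothetical helper: set NaN, Inf and -Inf to NA in numeric columns,
# and return every other column unchanged
recode_nonfinite <- function(x) {
  if (is.numeric(x)) x[!is.finite(x)] <- NA
  x
}

# Assigning into df[] keeps the data.frame structure and column types
df[] <- lapply(df, recode_nonfinite)
str(df)
```

Working column by column avoids the matrix conversion, so the character column stays character and the numeric column stays numeric.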






From: William Dunlap wdun...@tibco.com

Cc: R Mailing List r-help@r-project.org
Sent: Fri, May 27, 2011 12:57:01 AM
Subject: RE: [R] NaN, Inf to NA

I think the source of the OP's problem is that
while things like df<30 and is.na(df) return
a logical matrix with the dimensions of the
data.frame df, both is.infinite(df) and is.nan(df)
return a logical vector as long as the number
of columns of df.  (`<` and is.na have data.frame
methods but is.infinite and is.nan do not: the latter
give garbage results for data.frames.)
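
To illustrate the point above, here is a small sketch (base R only; the name `bad` and the second column are assumptions for illustration) that builds the logical matrix column by column, since is.infinite() and is.nan() do work on plain vectors:

```r
df <- data.frame(a = c(NA, NaN, Inf, 1:3),
                 b = c(1, -Inf, 2, NaN, 3, 4))

# Apply the vector versions to each column; sapply() binds the results
# into a logical matrix with the same dimensions as df
bad <- sapply(df, function(col) is.infinite(col) | is.nan(col))
dim(bad)

# A logical matrix of matching dimensions can index the data.frame
df[bad] <- NA
df
```

This assumes all columns are numeric; with mixed column types a per-column check such as is.numeric() would be needed first.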

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Marc Schwartz
 Sent: Thursday, May 26, 2011 2:15 PM
 To: Albert-Jan Roskam
 Cc: R Mailing List
 Subject: Re: [R] NaN, Inf to NA
 
 On May 26, 2011, at 3:18 PM, Albert-Jan Roskam wrote:
 
  Hi,
  
  I want to recode all Inf and NaN values to NA, but I'm surprised to see 
  the result of the following code. Could anybody enlighten me about this? 
  
  df <- data.frame(a=c(NA, NaN, Inf, 1:3))
  df[is.infinite(df) | is.nan(df)] <- NA
  df
 a
  1  NA
  2 NaN
  3 Inf
  4   1
  5   2
  6   3
  
  
  
  Thanks!
  
  Cheers!!
  Albert-Jan
 
 
 The canonical way is to use is.na() to assign the NA value 
 based upon a condition. See ?is.na for more information.
 
 is.na(df$a) <- !is.finite(df$a)
 
  df
a
 1 NA
 2 NA
 3 NA
 4  1
 5  2
 6  3
 
 
 HTH,
 
 Marc Schwartz
 


[R] NaN, Inf to NA

2011-05-26 Thread Albert-Jan Roskam
Hi,

I want to recode all Inf and NaN values to NA, but I'm surprised to see the 
result of the following code. Could anybody enlighten me about this? 

 df <- data.frame(a=c(NA, NaN, Inf, 1:3))
 df[is.infinite(df) | is.nan(df)] <- NA
 df
a
1  NA
2 NaN
3 Inf
4   1
5   2
6   3
 

 
Thanks!

Cheers!!
Albert-Jan




[R] Gui immediately closes when started from command-line

2011-05-19 Thread Albert-Jan Roskam
Hello,

I want to run an R script that contains code for a GUI (rgtk) on the command 
line (Windows 2000, 32-bit) using R 2.10.1, but the GUI disappears a few 
milliseconds after I start the program. Which switch should I use to prevent 
this? I tried r.exe, rterm.exe and rscript.exe with various combinations of 
switches, but none of them works.
 
TIA
Cheers!!
Albert-Jan 




Re: [R] Gui immediately closes when started from command-line

2011-05-19 Thread Albert-Jan Roskam
Thanks, we tried it, but it didn't solve the problem. Some more info (mostly 
strings of ) was shown in the Dos box, but that was all. 

 Cheers!!
Albert-Jan 







From: Jonathan Gabris jonat...@k-m-p.nl
To: r-help@r-project.org
Sent: Thu, May 19, 2011 9:54:13 AM
Subject: Re: [R] Gui immediately closes when started from command-line

I think I had a similar problem, though I cannot remember the exact 
symptoms. It had something to do with the lack of possible interaction with 
the console, as I was using R as a backend to a Qt interface.

To solve the problem I used the flag '--ess'
(using '--vanilla' is also a good idea).

(cf. Appendix B, "Invoking R", in one of the R manuals)

Hope this helps.

Jonathan.



Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Jan van der Laan

Santosh, Ivan,

This is also what I was looking for. Thanks. Looking at the source of 
dataFrame.default, it seems that it uses the same approach as I did: 
first create a list, then a data.frame from that list. I think I'll stick 
with the code I already had, as I don't want another dependency (multiple, 
actually, for R.utils). But thanks again for pointing it out.


Jan

On 05/16/2011 10:42 AM, Santosh Srinivas wrote:

Hi Ivan, take a look at dataFrame in R.utils ... is that what you want?

From the help file:

Examples

   df <- dataFrame(colClasses=c(a="integer", b="double"), nrow=10)
   df[,1] <- sample(1:nrow(df))
   df[,2] <- rnorm(nrow(df))
   print(df)

Thanks,
Santosh

On Mon, May 16, 2011 at 1:42 PM, Ivan Calandra
ivan.calan...@uni-hamburg.de  wrote:

I feel like I'm always asking this type of questions, but is it possible to
add a base function that allows creating an empty data.frame, as matrix()
does?

What I mean would be something like:
create.data.frame(number_of_columns, mode_of_columns).
I think it would make things easier than creating one or several matrices
and then combining them

Is it possible; does it make sense?

Ivan

Le 5/15/2011 22:17, Bert Gunter a écrit :

Inline below.

On Sun, May 15, 2011 at 11:11 AM, Jan van der Laanrh...@eoos.dds.nl
  wrote:

Thanks. I also noticed myself minutes after sending my message to the
list.
My 'please ignore my question it was just a stupid typo' message was sent
with the wrong account and is now awaiting moderation.

However, my other question still stands: what is the
preferred/fastest/simplest way to create a data.frame with given column
types and dimensions?

I do not know, but why is simply

data.frame(numeric(10), character(10), integer(10),
           stringsAsFactors=FALSE)

not acceptable? Note that if you had, say, 500 numeric (= double) and
100 character columns to add, you might do something like:


z <- matrix(numeric(5000), nr=10)
u <- matrix(character(1000), nr=10)
frm <- data.frame(z, u, stringsAsFactors = FALSE) ## 600 columns

While this might save some typing, it may not be much more efficient
than typing it all out -- maybe just some parsing time is saved. You
can experiment and see.

However, since a data.frame **is** a list with added attributes and a
great deal of the work of the constructor is in constructing and
checking these attributes (e.g. row and column names), I see nothing
terribly inefficient with what you did. It's just a bit obscure.  But
maybe someone with greater expertise will set us both straight.
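
For what it is worth, the list-then-data.frame approach discussed in this thread can be wrapped in a small helper along the lines of what Ivan asked for. This is only a sketch under stated assumptions: the function name empty_df, its arguments, and the default column names V1, V2, ... are hypothetical and not part of base R.

```r
# Hypothetical helper (not a base R function): create a data.frame with
# `nrow` rows whose columns have the requested types.
# `types` is a character vector such as c("integer", "character", "double");
# if it has names, they become the column names.
empty_df <- function(types, nrow) {
  # vector("integer", n), vector("character", n), ... return vectors of
  # zeros or empty strings of the requested type
  cols <- lapply(types, vector, length = nrow)
  names(cols) <- if (is.null(names(types))) {
    paste0("V", seq_along(types))
  } else {
    names(types)
  }
  as.data.frame(cols, stringsAsFactors = FALSE)
}

d <- empty_df(c(id = "integer", label = "character", value = "double"), 10)
str(d)
```

Naming the list elements before the conversion also avoids the long deparsed column names seen in the str() output elsewhere in this thread.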

Cheers,
Bert



Regards,
Jan


On 05/15/2011 04:43 PM, Bert Gunter wrote:

In your post, you're missing the final s on the stringsAsFactors
argument in the d1 assignment. When I typed it correctly, it works as
expected.

-- Bert





--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King

Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan
Forget I asked. There was a typo in my example (stringsAsFactor  
instead of stringsAsFactors), which explained the difference. My  
apologies.


My second question, however, still stands: how does one create a  
data.frame with given column types and given dimensions? Thanks.


Regards,
Jan




[R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan

I use the following code to create two data.frames d1 and d2 from a list:

types  <- c("integer", "character", "double")
nlines <- 10
d1 <- as.data.frame(lapply(types, do.call, list(nlines)),
                    stringsAsFactor=FALSE)

l2 <- lapply(types, do.call, list(nlines))
d2 <- as.data.frame(l2, stringsAsFactors=FALSE)

I would expect d1 and d2 to be the same, however, in d1 the second  
column is a factor while in d2 it is a character (which I would expect):



str(d1)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: Factor w/ 1 level : 1 1 1 1 1 1 1 1 1 1
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0

str(d2)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: chr  ...
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0


A different but related question: I use the commands above to create  
an 'empty' data.frame with specified column types and dimensions. I  
need this data.frame to pass on to my C++ routines. Is there a  
simpler or more elegant way of creating this data.frame?


Regards,

Jan


PS:
I am running R on 64 bit Ubuntu 11.04:


sessionInfo()

R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan
Thanks. I also noticed myself minutes after sending my message to the 
list. My 'please ignore my question it was just a stupid typo' message 
was sent with the wrong account and is now awaiting moderation.


However, my other question still stands: what is the 
preferred/fastest/simplest way to create a data.frame with given column 
types and dimensions?


Regards,
Jan


On 05/15/2011 04:43 PM, Bert Gunter wrote:

In your post, you're missing the final s on the stringsAsFactors
argument in the d1 assignment. When I typed it correctly, it works as
expected.

-- Bert



[R] first occurrence of a value?

2011-05-04 Thread Albert-Jan Roskam
Hello,

A simple question perhaps, but how do I, within each row, find the first 
occurrence of the number 1 in the df below? I want to use this position to 
programmatically create the variable 'year'. I've come up with a solution, but 
I find it downright ugly. Is there a simpler way? I was hoping for a useful 
built-in function that I don't yet know about.

df <- data.frame(j1999=c(0,0,0,0,1,0), j2000=c(NA, 1, 1, 1, 0, 0),
                 j2001=c(1, 0, 1, 0, 0, 0),
                 year=c(2001, 2000, 2000, 2000, 1999, NA))
library(gsubfn)
x <- apply(df==1, 1, which)
giveYear <- function(df) { return( as.numeric(gsubfn("^[^0-9]+", "", 
names(df)[1])) ) }
df$year2 <- sapply(x, giveYear)

Thanks in advance!

 Cheers!!
Albert-Jan




Re: [R] first occurrence of a value?

2011-05-04 Thread Albert-Jan Roskam
Hi Patrick, Dimitri,

Thank you! Yes, 'match' was exactly what I was looking for. I like it as it 
doesn't require too many functions to be nested.

 Cheers!!
Albert-Jan







From: Patrick Breheny patrick.breh...@uky.edu

Cc: R Mailing List r-help@r-project.org
Sent: Wed, May 4, 2011 2:17:25 PM
Subject: Re: [R] first occurrence of a value?

You may want to look into the function 'match', which finds the first 
occurrence of a value.  In your example,

df <- data.frame(j1999=c(0,0,0,0,1,0), j2000=c(NA, 1, 1, 1, 0, 0),
                 j2001=c(1, 0, 1, 0, 0, 0),
                 year=c(2001, 2000, 2000, 2000, 1999, NA))

apply(df,1,match,x=1)

[1]  3  2  2  2  1 NA
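
As a possible follow-up to the match() call above, the column position can be turned into the year by indexing the column names. This is a sketch using base R only; the variable name pos is an assumption, and the example leaves out the pre-filled 'year' column so that only the indicator columns are searched.

```r
df <- data.frame(j1999 = c(0, 0, 0, 0, 1, 0),
                 j2000 = c(NA, 1, 1, 1, 0, 0),
                 j2001 = c(1, 0, 1, 0, 0, 0))

# Position of the first 1 in each row (NA when a row has no 1)
pos <- apply(df, 1, match, x = 1)

# Look up the column name at that position and strip the non-digit prefix;
# an NA position gives an NA year
df$year2 <- as.numeric(sub("^[^0-9]+", "", names(df)[pos]))
df
```

Because match() returns NA for rows without a 1, no special handling of the last row is needed.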

___
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky




[R] Rodbc quesion: how to reliably determine the data type?

2011-05-03 Thread Albert-Jan Roskam
Hello,

How can I tell RODBC to scan all the records of an xls file to determine the 
data type? If the first n records happen to be empty, RODBC assumes character, 
and any numbers are made NA. And if, for instance, the first n records contain 
numbers and later records contain characters, those characters become NA.

 Cheers!!
Albert-Jan




Re: [R] Rodbc quesion: how to reliably determine the data type?

2011-05-03 Thread Albert-Jan Roskam
Hi Jeff,

Ah, thanks a lot! Yes, meanwhile I also switched to csv. This still requires 
knowledge about the regional settings (Sys.getlocale), but it's a lot more 
transparent.

I'm quite new to R and I must say that stuff like this is eating up a LOT of my 
time. All those invisible data type conversions are driving me nuts. 
stringsAsFactors=FALSE should be the default, for instance.
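
One way to keep the type conversions visible when reading csv is to state the column types explicitly. The sketch below uses base R only; the file contents and column names are invented for illustration, and a temporary file is written first so the example is self-contained.

```r
# Write a tiny CSV to a temporary file (made-up data for illustration)
path <- tempfile(fileext = ".csv")
writeLines(c("id,code,amount",
             "1,007,1.5",
             "2,A12,",
             "3,042,3.25"), path)

# Declare the type of every column instead of letting read.csv guess;
# empty fields are read as NA
dat <- read.csv(path,
                colClasses = c("integer", "character", "numeric"),
                na.strings = "",
                stringsAsFactors = FALSE)
str(dat)
unlink(path)
```

With colClasses given, a code such as 007 keeps its leading zeros because the column is read as character rather than guessed to be numeric.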

Cheers!!
Albert-Jan







From: Jeff Newmiller jdnew...@dcn.davis.ca.us

Sent: Tue, May 3, 2011 10:21:02 AM
Subject: Re: [R] Rodbc quesion: how to reliably determine the data type?

This is not a decision being made by RODBC... it is made by the Microsoft ODBC 
driver for Excel. If you really want to know more, you can read 
http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-types/ ...

but the best solution is to take your data out of Excel and only use xls/xlsx 
formats for data output (if at all).
---
Jeff Newmiller                     The     .   .  Go Live...
DCN: jdnew...@dcn.davis.ca.us      Basics: ##.#.   ##.#.  Live Go...
                                   Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries         O.O#.   #.O#.  with
/Software/Embedded Controllers)            .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.





Re: [R] blank space escape sequence in R?

2011-04-25 Thread Jan van der Laan

There exists a non-breaking space:

http://en.wikipedia.org/wiki/Non-breaking_space

Perhaps you could use this. In R on Linux under gnome-terminal I can 
enter it with CTRL+SHIFT+U00A0. This seems to work: it prints as a 
space, but is not equal to ' '. I don't know if there are any 
difficulties using, for example, utf8 encoding in source files (which 
you'll probably need).
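
A small sketch of the idea above, assuming the non-breaking space is written with the Unicode escape "\u00A0" rather than typed directly (the variable names are made up for illustration):

```r
nbsp <- "\u00A0"                      # non-breaking space, code point 160
s <- paste0("some text", nbsp, "and some more text")

# It is a different character from an ordinary blank
nbsp == " "

# Splitting on ordinary blanks leaves the non-breaking space alone,
# so "text" and "and" stay together in one piece
parts <- strsplit(s, " ", fixed = TRUE)[[1]]
length(parts)
```

Using the escape sequence avoids having to save the source file in a particular encoding, although how the character is displayed still depends on the terminal or device.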


Jan



On 04/25/2011 03:28 PM, Duncan Murdoch wrote:

On 25/04/2011 9:13 AM, Mark Heckmann wrote:
I use a function that inserts line breaks ("\n" as escape sequence) 
according to some criterion when there are blanks in the string.

e.g. "some text \nand some more text".

What I want now is another form of a blank, so my function will not 
insert a "\n" at that point.

e.g. "some text\spaceand some more text"

Here \space stands for some escape sequence for a blank, which is 
what I am looking for.
So what I need is something that will appear as a blank when printed 
but not in the string itself.


I don't think R has anything like that built in.   You'll need to 
attach a class to your vector of strings, and write a print method for 
it that does the substitution before printing.
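
A minimal sketch of the class-plus-print-method idea described above. The class name "nbtext" and the placeholder \space are assumptions taken from the example in the question, not anything built into R.

```r
# A character vector carrying a hypothetical class "nbtext"; the literal
# placeholder \space (written "\\space" in R source) marks a protected blank
x <- structure("some text\\spaceand some more text", class = "nbtext")

# Print method: replace the placeholder by a real blank only for display
print.nbtext <- function(x, ...) {
  out <- gsub("\\space", " ", unclass(x), fixed = TRUE)
  print(out, ...)
  invisible(x)
}

print(x)      # displays the text with an ordinary blank
unclass(x)    # the stored string still contains the placeholder
```

The stored string is never modified, so a line-breaking function that looks for ordinary blanks will skip the protected position.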


Duncan Murdoch


TIA

Am 25.04.2011 um 15:05 schrieb Duncan Murdoch:

  On 25/04/2011 9:01 AM, Mark Heckmann wrote:
  Is there a blank space escape sequence in R, i.e. something like 
\sp etc. to produce a blank space?


  You need to give some context.  A blank in a character vector will 
be printed as a blank, so you are probably talking about something 
else, but what?


  Duncan Murdoch

–––
Mark Heckmann
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com










[R] Automatic splitting/combining nested categorical variable in glm

2011-04-14 Thread Jan van der Laan
I have a categorical variable with a nested structure. For example,  
region: a country is split into parts, which in turn contain  
provinces, which contain municipalities:


Part - Province - Municipality

North
   Province A
      Municipality 1
      Municipality 2
      Municipality 3
      ...
   Province B
      Municipality 1
      ...
   ...
West
   Province A
      ...
   Province B
      ...
   ...
...

What I would like to do is to automatically split/combine regions in a  
forward (starting with parts and then splitting) or backward (starting  
with municipalities and collapsing) manner. Do methods exist for this  
in R? Googling, I couldn't find anything, but perhaps I have  
been using the wrong terms.


Please note that I do not want to choose between using Part as the  
covariate OR, e.g., Province. I want to allow for different levels in one  
covariate, e.g. West split into Provinces and the remaining parts not.  
Also: I am using logistic regression (glm).
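
To make the idea of "different levels in one covariate" concrete, here is a hedged sketch of a single forward-splitting step done by hand on simulated data. Everything in it (the data, the variable names, the choice of West as the part to split) is invented for illustration; it is not an automatic search procedure.

```r
set.seed(42)
n <- 400
d <- data.frame(
  part     = sample(c("North", "West", "South"), n, replace = TRUE),
  province = sample(c("A", "B"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
d$y <- rbinom(n, 1, 0.4)   # simulated binary outcome

# One covariate with mixed granularity: West is split into its provinces,
# the other parts stay whole
d$region <- factor(ifelse(d$part == "West",
                          paste(d$part, d$province, sep = "/"),
                          d$part))

fit_part  <- glm(y ~ part,   family = binomial, data = d)
fit_mixed <- glm(y ~ region, family = binomial, data = d)

# The part-level model is nested in the mixed model, so a likelihood-ratio
# test indicates whether splitting West improves the fit
anova(fit_part, fit_mixed, test = "Chisq")
```

A forward procedure would repeat this comparison for each candidate split and keep the best one; the sketch shows only the comparison itself.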


Thank you for your help.

With regards,

Jan



[R] How to reference a package in academical paper

2011-03-07 Thread Jan Hornych
Dear,

I am now writing a more formal academic paper and would like to reference
an R package. Do you have any recommendation on how to do it?

Taking the RODBC package as an example, what would the reference
look like?
http://cran.r-project.org/web/packages/RODBC/index.html
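
One possible starting point is R's own citation() function, which returns the reference the authors ask to be used. The sketch below cites R itself so that it runs without extra packages; the variable names are assumptions for illustration.

```r
# Suggested reference for R itself
cit <- citation()
print(cit)

# The same entry in BibTeX form, e.g. for a LaTeX paper
bib <- toBibtex(cit)
print(bib)

# For an installed add-on package, pass its name instead,
# e.g. citation("RODBC") (this requires RODBC to be installed)
```

The printed text can then be adapted to the reference style of the journal.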

Thank you
Jan



[R] df.residual for rlm()

2011-03-01 Thread Jan
Hello,

for testing coefficients of lm(), I wrote the following function (with
the kind support of this mailing list):

# See Verzani, simpleR (pdf), p. 80
coeff.test <- function(lm.result, idx, value) {
  # idx = 1 is the intercept, idx > 1 the other coefficients
  # null hypothesis: coeff = value
  # alternative hypothesis: coeff != value
  coeff <- coefficients(lm.result)[idx]
  SE <- coefficients(summary(lm.result))[idx, "Std. Error"]
  n <- df.residual(lm.result)  # residual degrees of freedom
  t <- (coeff - value) / SE
  2 * pt(-abs(t), n) # times two because problem is two-sided
}

This works fine for lm() objects, but fails for rlm() because
df.residual() is NA.

Can I get the degrees of freedom instead by calculating the following?

n = length(lm.result) - length(coefficients(lm.result))

Thanks for any help!
Jan
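
A hedged sketch of the idea in the question: count observations minus estimated coefficients. Note that length() of a fitted model counts the components of the model list, not the observations, so the sketch uses the residuals instead. The helper name resid.df is made up, and whether a t distribution with these degrees of freedom is appropriate for a robust fit is a separate statistical question.

```r
# Residual degrees of freedom as (number of observations used in the fit)
# minus (number of estimated coefficients). Hypothetical helper.
resid.df <- function(fit) {
  n <- length(residuals(fit))    # observations that entered the fit
  p <- sum(!is.na(coef(fit)))    # coefficients actually estimated
  n - p
}

fit <- lm(dist ~ speed, data = cars)
resid.df(fit)      # same value as df.residual(fit) for an lm object
df.residual(fit)

# For a robust fit the call would be, assuming MASS is installed:
# library(MASS); resid.df(rlm(dist ~ speed, data = cars))
```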



[R] Transitions probability comparison

2011-02-27 Thread kende jan
Hello, 

I am learning to use the changeLOS package. Using the data provided in this
package (los.data), I want to compare the transition probabilities P01 and
P03, as one would with the Kaplan-Meier method. Can someone help me?

Thank you. 
Jan

data(los.data)
my.observ <- prepare.los.data(x=los.data)
my.model <- msmodel(c(0,1,2,3),cens.name="cens")
my.trans <- trans(model=my.model,observ=my.observ)
my.aj <- aj(my.trans, s=0, t=80)
plot(my.aj,c(0,0,0,0),c(0,1,2,3))


  


[R] changeLOS package use

2011-02-24 Thread kende jan
Hello, 

I am learning to use the changeLOS package. Using the data provided in this
package (los.data), I want to generate a new plot overlaying the 2 curves of
transition probability P01 and P03, and also to compare the two curves
statistically, as one would with the Kaplan-Meier method. Can someone help me?

Thank you. 
Jan

data(los.data)
my.observ <- prepare.los.data(x=los.data)
my.model <- msmodel(c(0,1,2,3),cens.name="cens")
my.trans <- trans(model=my.model,observ=my.observ)
my.aj <- aj(my.trans, s=0, t=80)
plot(my.aj,c(0,0,0,0),c(0,1,2,3))



  


[R] error with 'hash' library

2011-02-22 Thread Albert-Jan Roskam
Hello,

I'm using R 2.10 on Windows 2000 and I'm having trouble installing the 'hash'
library. This is the error I get:
> library(hash)
    _   _    
  ___  _ __   ___ _ __   __| | __ _| |_ __ _ 
 / _ \| '_ \ / _ \ '_ \ / _` |/ _' | __/ _' |
| (_) | |_) |  __/ | | | (_| | (_| | || (_| |
 \___/| .__/ \___|_| |_|\__,_|\__,_|\__\__,_|
  |_|   http://www.opendatagroup.com
Error in cat(\n  , pkgname, -, utils::installed.packages()[pkgname,  : 
  subscript out of bounds
Error : .onLoad failed in 'loadNamespace' for 'hash'
Error: package/namespace load failed for 'hash'

Can anybody tell how to solve this? Thanks in advance!

 Cheers!!
Albert-Jan 


~~
All right, but apart from the sanitation, the medicine, education, wine, public
order, irrigation, roads, a fresh water system, and public health, what have
the Romans ever done for us?
~~ 


  


Re: [R] RGtk2 on Debian Testing

2011-02-18 Thread Jan van der Laan


It was a while ago, but I believe I had to install libgtk2.0-dev
(that was on Ubuntu).


You could also try to install the r-cran-rgtk2 Debian package using
dpkg, aptitude, or whatever you use as package manager. This makes
RGtk2 available for all users.


HTH,

Jan



Quoting Lorenzo Isella lorenzo.ise...@gmail.com:


Dear All,
I am running Debian testing on my system for the amd64 architecture,
When trying to install the RGtk package I get this error



install.packages('RGtk2')

Installing package(s) into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL
'http://rm.mirror.garr.it/mirrors/CRAN/src/contrib/RGtk2_2.20.8.tar.gz'
Content type 'application/x-gzip' length 2637806 bytes (2.5 Mb)
opened URL
==
downloaded 2.5 Mb

* installing *source* package ‘RGtk2’ ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for INTROSPECTION... no
checking for GTK... no
configure: error: GTK version 2.8.0 required
ERROR: configuration failed for package ‘RGtk2’
* removing ‘/usr/local/lib/R/site-library/RGtk2’

The downloaded packages are in
‘/tmp/RtmpMTHLGF/downloaded_packages’
Warning message:
In install.packages(RGtk2) :
  installation of package 'RGtk2' had non-zero exit status

Does anyone know why there is a mismatch between my GTK and the one
required by R?
Should I enable some particular R repositories (I know that the
previous Debian testing was released a few days ago, but I do not know
if this is relevant).
Any suggestion is welcome.
Cheers

Lorenzo




[R] lm without intercept

2011-02-18 Thread Jan
Hi,

I am not a statistics expert, so I have this question. A linear model
gives me the following summary:

Call:
lm(formula = N ~ N_alt)

Residuals:
    Min      1Q  Median      3Q     Max
-110.30  -35.80  -22.77   38.07  122.76

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.5177   229.0764   0.059   0.9535
N_alt         0.2832     0.1501   1.886   0.0739 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 56.77 on 20 degrees of freedom
  (16 observations deleted due to missingness)
Multiple R-squared: 0.151, Adjusted R-squared: 0.1086 
F-statistic: 3.558 on 1 and 20 DF,  p-value: 0.07386 

The regression is not very good (high p-value, low R-squared). 
The Pr value for the intercept seems to indicate that it is zero with a
very high probability (95.35%). So I repeat the regression forcing the
intercept to zero:

Call:
lm(formula = N ~ N_alt - 1)

Residuals:
    Min      1Q  Median      3Q     Max
-110.11  -36.35  -22.13   38.59  123.23

Coefficients:
      Estimate Std. Error t value Pr(>|t|)
N_alt 0.292046   0.007742   37.72   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 55.41 on 21 degrees of freedom
  (16 observations deleted due to missingness)
Multiple R-squared: 0.9855, Adjusted R-squared: 0.9848
F-statistic:  1423 on 1 and 21 DF,  p-value: < 2.2e-16

1. Is my interpretation correct?
2. Is it possible that just by forcing the intercept to become zero, a
bad regression becomes an extremely good one?
3. Why doesn't lm suggest a value of zero (or near zero) by itself if
the regression is so much better with it?

Please excuse my ignorance.

Jan Rheinländer
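
On question 2, a minimal sketch with simulated data (x, y and the seed are made up, this is not the data behind the summaries above) of why the two R-squared values cannot be compared directly: for a model without intercept, summary.lm measures the explained variation relative to zero rather than relative to mean(y).

```r
# Simulated data: a straight line through the origin plus noise.
set.seed(1)
x <- runif(22, 1000, 2000)
y <- 0.29 * x + rnorm(22, sd = 55)

fit.with    <- lm(y ~ x)
fit.without <- lm(y ~ x - 1)

r2.with    <- summary(fit.with)$r.squared
r2.without <- summary(fit.without)$r.squared

# R-squared of the no-intercept model recomputed by hand, once against
# zero (what summary.lm reports) and once against mean(y).
rss0         <- sum(residuals(fit.without)^2)
r2.uncentred <- 1 - rss0 / sum(y^2)
r2.centred   <- 1 - rss0 / sum((y - mean(y))^2)

c(with = r2.with, without = r2.without,
  uncentred = r2.uncentred, centred = r2.centred)
```

Measured against mean(y), the model without intercept can never explain more variation than the model with intercept, which matches the near-identical residual standard errors in the two summaries.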



Re: [R] lm without intercept

2011-02-18 Thread Jan
Hello Achim,

> Not quite. Consult your statistics textbook for the correct interpretation
> of p-values. Under the null hypothesis of a true intercept of zero, it is
> very likely to observe an intercept as large as 13.52 or larger.
Thank you for that help. I suppose the net doesn't have a detailed
explanation of the output of summary.lm for someone with very little
knowledge about statistics? I worked through J. Verzani's simpleR but
it does assume some prior knowledge.

> > So I repeat the regression forcing the intercept to zero:
>
> Do you have a good interpretation for that?
In this case, my knowledge of the physical reality behind the numbers
tells me that the intercept should be zero.

> The model without intercept needs to be interpreted differently. The
> p-value pertains to a regression with intercept zero and slope 0.292
> against a model with both intercept zero and slope zero.
In other words, of course the slope of 0.292 is almost infinitely better
than a zero slope? But the same would be true for most slopes > 0, I
suppose.
So what is the correct way to compare the quality of the regression with
and without intercept? Assuming that I don't know from the physical
reality that the intercept should be zero, what can I say to support one
model against the other?

Thanks,
Jan
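
One common way to put the two models on the same scale is to compare them as nested models. A minimal sketch on simulated data (x, y and the seed are made up, not the data from this thread):

```r
# Simulated data: a straight line through the origin plus noise.
set.seed(2)
x <- runif(22, 1000, 2000)
y <- 0.29 * x + rnorm(22, sd = 55)

fit.with    <- lm(y ~ x)
fit.without <- lm(y ~ x - 1)

# F test of the smaller model against the larger one.
cmp <- anova(fit.without, fit.with)
cmp

# Residual standard errors and AIC are also directly comparable.
c(with = summary(fit.with)$sigma, without = summary(fit.without)$sigma)
AIC(fit.without, fit.with)
```

With a single extra parameter, the F test here gives the same p-value as the t test on the intercept in the summary of the full model, so a large p-value means the data do not contradict a zero intercept. It does not prove that the intercept is zero.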



Re: [R] lm without intercept

2011-02-18 Thread Jan
Hi,

thanks for your help. I'm beginning to understand things better.

> If you plotted your data, you would realize that whether you fit the
> 'best' least squares model or one with a zero intercept, the fit is
> not going to be very good
> Do the data cluster tightly around the dashed line?
No, and that is why I asked the question. The plotted fit doesn't look
any better with or without intercept, so I was surprised that the
R-squared value etc. indicated an excellent regression (which I now
understand is the wrong interpretation).

One of the references you googled suggests that intercepts should never
be omitted. Is this true even if I know that the physical reality behind
the numbers suggests an intercept of zero?

Thanks,
Jan



Re: [R] monitor variable change

2011-02-16 Thread Jan van der Laan


One possible solution is to use something like:

a <- 0
for (i in 1:1E6) {
    old.a <- a

    # do something e.g.
    a <- runif(1) < 1E-6

    # stop in the debugger only when the value has changed
    if (a != old.a) browser()
}


Another solution is to write your output to file (using sink for  
example) and to watch this file using a tool like tail.


Jan
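
A non-interactive variant of the same idea, as a rough sketch: report the variable only when its value changes, either on the console or in a log file. The helper make.watcher and its path argument are made up for illustration.

```r
# Returns a function that reports a value only when it differs from the
# value seen on the previous call. path = "" prints to the console.
make.watcher <- function(path = "") {
  last  <- NULL
  first <- TRUE
  function(name, value) {
    changed <- first || !identical(value, last)
    if (changed) {
      cat(sprintf("%s is now %s\n", name, format(value)),
          file = path, append = TRUE)
      last  <<- value
      first <<- FALSE
    }
    invisible(changed)
  }
}

watch <- make.watcher()
a <- 0
for (i in 1:20) {
  a <- i %/% 5        # stand-in for real work; changes every 5th step
  watch("a", a)
}
```

Passing a file name to make.watcher() instead of the default writes the messages to that file, which can then be followed from another terminal with tail.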






Quoting Alaios ala...@yahoo.com:


I think we are both talking about watchpoints/breakpoints

--- On Wed, 2/16/11, Rainer M Krug r.m.k...@gmail.com wrote:


From: Rainer M Krug r.m.k...@gmail.com
Subject: Re: [R] monitor variable change
To: Alaios ala...@yahoo.com
Cc: R-help@r-project.org
Date: Wednesday, February 16, 2011, 9:54 AM

On 02/16/2011 10:38 AM, Alaios wrote:
 Dear all, I would like to ask whether there is a way in R to monitor
 when a value changes.

 Right now I use sprintf('my variables is %d \n', j) to print the value
 of the variable.

 Is it possible, when a 'big' for loop executes, to open a new window
 to dynamically check only the variable I want to?

I don't think that this functionality is implemented.

But I guess you can implement it - would it be possible to
re-define the <- operator to check if a certain variable is to be changed,
and then print it?

Might be tricky and would slow everything down considerably.

Just a thought,

Rainer


 If I put all the sprintf statements inside my loop
then I get flooded with so many messages that makes it
useless. 

 Best Regards
 Alex



- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc
(Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Tel:        +33 - (0)9 53 10 27 44
Cell:       +27 - (0)8 39 47 90 42
Fax (SA):   +27 - (0)8 65 16 27 82
Fax (D) :   +49 - (0)3 21 21 25 22 44
Fax (FR):   +33 - (0)9 58 10 27 44
email:      rai...@krugs.de

Skype:      RMkrug








Re: [R] help with aggregate()

2011-02-15 Thread Jan van der Laan

The fact that the column names in your aggregate result contain multiple
numbers suggests that something has gone wrong with reading your data in from
file. Have you had a look at your data.frame 'all'? Are BAR and X etc. numeric?
Judging from the 'c. etc' they aren't.



> So, how do I aggregate the data frame?


aggregate() accepts either a data.frame or a vector as its first argument
(actually anything that can be coerced into a data.frame). In the case of a
data.frame it applies the aggregation function to each column. So your first
aggregate call should be OK (except that your input might be wrong, see
above). However, you didn't use named arguments in your list(), so R
generates names for you. Hence the strange names.

aggregate() returns a data.frame. So if you want to combine more than one
aggregate call, you can use merge to merge the results:

Count <- aggregate(all$FOO, by = list(FOO = all$FOO), FUN = length)
byFOO <- merge(byFOO, Count, by = "FOO")

If you want to have a vector you could use tapply.


> How do I rename a column?


?names

e.g.
names(all) <- c("column1", "column2", ...)


> How do I check that two vectors are the same?


?all

all(vector1 == vector2)

but first have a look at:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f


HTH,
Jan
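
Putting the three answers together, a small self-contained sketch. The data frame dat and its values are made up (the original 'all' is not available), and the column names are only illustrative.

```r
# Made-up stand-in for the data frame read from the csv file.
dat <- data.frame(FOO   = c("a", "a", "b", "b", "b"),
                  BAR   = c(1, 2, 3, 4, 5),
                  Value = c(10, 20, 30, 40, 50),
                  stringsAsFactors = FALSE)

# Named list elements become the column names of the result.
means  <- aggregate(list(BAR = dat$BAR, Value = dat$Value),
                    by = list(FOO = dat$FOO), FUN = mean)
counts <- aggregate(list(Count = dat$FOO),
                    by = list(FOO = dat$FOO), FUN = length)

# Combine the two results on the grouping column.
byFOO <- merge(means, counts, by = "FOO")

# Rename a single column.
names(byFOO)[names(byFOO) == "Value"] <- "MeanValue"
byFOO

# Compare two vectors: exact comparison, and comparison up to
# floating-point tolerance.
identical(means$FOO, counts$FOO)
isTRUE(all.equal(0.1 + 0.2, 0.3))
```

identical() and all.equal() return a single value, so there is no vector of TRUE/FALSE to inspect by hand.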







On 02/15/2011 12:42 AM, Sam Steingold wrote:

Hi,

I am trying to aggregate some data and I am confused by the results.
I load a data frame all from a csv file, and then I do:
(FOO,BAR,X,Y come from the header line in the csv file,
BTW, how do I rename a column?)

byFOO <- aggregate(list(all$BAR,all$QUUX,all$X/all$Y),
  by = list(FOO=all$FOO),
  FUN = mean);

I expect a data frame with 4 columns: FOO,BAR,QUUX and X/Y with all FOO
being different (they are character strings, do I need a special
incantation to turn them into factors?)
what I get is indeed a data frame but with names

[1] FOO
[2] c.1.78e.11..4.38e.09..1.461e.11..4.3186e.10..1.1181e.10..5.5389e.10..
[3] c.33879300..3713870..190963000..7042170..4590010..91569200..12108200..
[4] c.1.37087599544937..1.72690992018244..1.82034830430797..1.70338983050847..

why? how do I fix the column names?

then I am trying to add to that same frame byFOO some other columns:

byFOO$Count <- aggregate(all$FOO, by = list(all$FOO), FUN = length);
byFOO$Mean <- aggregate(all$Value, by = list(all$FOO), FUN = mean);
byFOO$Total <- aggregate(all$Value, by = list(all$FOO), FUN = sum);

however, byFOO$Count et al are not columns in byFOO with the appropriate
names (Count etc.) but data frames with columns Group.1 and x.
Luckily, at least it appears that byFOO$Count$Group.1 is the same as
byFOO$FOO, as they should be, although I don't see any function which
would check that two vectors are the same (== returns a vector which I
have to manually inspect for presence of FALSE).

So, how do I aggregate the data frame?
How do I rename a column?
How do I check that two vectors are the same?

thanks a lot!

PS. I have not used R for a few years, so please be gentle...
PPS. Please do not tell me to RTFM - I did. At least tell me what to
search for.





[R] Proportions comparison

2011-02-08 Thread kende jan
Dear all,
I want to compare two proportions of disease in two populations: group 1
(1200/15000) and group 2 (26/650). However, I would like to take into account
the number of physicians involved in each group: G1 (1600 physicians) and G2
(1.6 million). Can someone please help me?
 
Thanks 
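
As a starting point only, a minimal sketch of the plain two-sample comparison of the proportions with prop.test() from base R. It deliberately ignores the number of physicians, since how those should enter the analysis depends on the study design and is not specified in the question.

```r
# Cases and population sizes from the question: group 1, then group 2.
res <- prop.test(x = c(1200, 26), n = c(15000, 650))

res$estimate   # observed proportions in the two groups
res$conf.int   # confidence interval for the difference
res$p.value    # test of equal proportions
```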


  