[R] dead code removal
Hi there, I have some unused code in my project but have no idea how to clean it up in some kind of automatic way. Maybe there are some tools/ways to identify and remove/mark dead parts of the code? Thanks in advance! Kind regards, Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read.csv help
Hi,

I'm a new R user and I'm having trouble with the read.csv command. It somehow treats the first column as a row-name field even though it is not one. There are no missing columns or entries, and I'm not sure how to resolve this.

The format of my data is

A, B, C, D, ... (3984 columns)
12, 13, 41, ... (all numeric)

It either treats column A as row names or, if I explicitly disable row names with row.names = NULL, it shifts all the columns to the right, like this:

   rowno.  A   B   C  ...  Last column
1  12      13  41     ...  NA

Srinivas

-- View this message in context: http://r.789695.n4.nabble.com/read-csv-help-tp3677454p3677454.html Sent from the R help mailing list archive at Nabble.com.
[R] urgent Help needed
Dear Sir,

Please help me to open my .rda files. I urgently need to see their contents, and although I looked for solutions I didn't find one. When I try to load my .rda file I receive this message:

> load(file = "SpecPhantomExp2WaterSupp Set2.rda")
Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file 'SpecPhantomExp2WaterSupp Set2.rda' has magic number ' B'
   Use of save versions prior to 2 is deprecated

Please, could you help me to repair this error?

Yours sincerely,
Re: [R] venn diagram in percentage
Hi Jim,

Thanks for the reply, but I was looking for percentages in the Venn diagram.

Regards,
Poornima
Re: [R] Doesnt' winedt 6 version work as Rwinedt?
I am very happy using the R-Sweave plugin that works with WinEdt 6.0. It understands not only R but, as the name suggests, Sweave as well. http://www.winedt.org/Config/modes/R-Sweave.php
Re: [R] read.csv help
Can you explain a little more? I have created a small CSV file following your pattern, which looks like this in a text editor:

A,B,C,D,E
65,68,71,74,77
67,71,75,79,83
69,73,77,81,85
71,77,83,89,95

When I load it into R with

> x <- read.csv("a.csv")

I get this, which I think is what you would expect:

> x
   A  B  C  D  E
1 65 68 71 74 77
2 67 71 75 79 83
3 69 73 77 81 85
4 71 77 83 89 95

with

> rownames(x)
[1] "1" "2" "3" "4"

Here is the dput version:

> dput(x)
structure(list(A = c(65L, 67L, 69L, 71L), B = c(68L, 71L, 73L, 77L), C = c(71L, 75L, 77L, 83L), D = c(74L, 79L, 81L, 89L), E = c(77L, 83L, 85L, 95L)), .Names = c("A", "B", "C", "D", "E"), class = "data.frame", row.names = c(NA, -4L))

What is different with your data?

Rgds,
Rainer

-------- Original Message --------
Date: Tue, 19 Jul 2011 00:05:30 -0700 (PDT)
From: psombe srinivas.es...@gmail.com
To: r-help@r-project.org
Subject: [R] read.csv help

> Hi, I'm a new R user and I'm having trouble with the read.csv command. It somehow treats the first column as a row name field even though it's not a row name.
> [...]

--
Windows: Just say No.
Re: [R] write merged data frame to a file
Very sorry for the instinctive bad comment... I didn't express correctly what I meant. I meant that I would never have expected such behaviour by default (but maybe I am wrong and most people need quoting, and hence it makes sense to have it as the default).

Problem solved; Philipp guessed right. It was just a quoting problem in the gene description. The problem could be solved by simply adding the quote="" parameter to the read.table command. The quoting problem turned into that strange behaviour when I merged the two tables (I don't know why).

Thank you very much for the assistance. Now that I have realized that quoting exists, I can enjoy using R (which as a tool perfectly suits my needs at the moment).

Have a nice day,
Andrea

On Mon, Jul 18, 2011 at 7:36 PM, David Winsemius dwinsem...@comcast.net wrote:
> On Jul 18, 2011, at 11:10 AM, Andrea Franceschini wrote:
>> Dear Philipp,
>> You were right, thank you very much. Effectively the second read.table didn't work (probably because of the strange characters). From my point of view this is a very bad R bug.
>
> When I read this I thought it was an example of a young person just starting their education exhibiting Piaget's preoperational stage when applied to the task of learning a computer language.
>
> "the child begins to use symbols to represent objects. Early in this stage he also personifies objects. [snipped] His thinking is influenced by fantasy -- the way he'd like things to be -- and he assumes that others see situations from his viewpoint."
> [Copied from a webpage for grade school teachers: http://www2.honolulu.hawaii.edu/facdev/guidebk/teachtip/piaget.htm ]
>
> But when I do a search in PubMed, I find that you may be an established researcher, so I think the onus is now on you to demonstrate how R's behavior is different from the documented design goals for the read.table function.
>
>> I precisely inserted the \t separator in the read.table command, hence I would expect that everything is working. How can I make it work ???
>
> I thought Pagel already gave you the answer, but your response is too general to be sure. What exactly did you find, and can you provide a minimal code and data example that illustrates how read.table is acting in any way different from what the documentation describes?
>
> -- David.
>
>> Effectively there are several ' (i.e. apostrophe) characters in the file. I guess those are what confuse R.
>> Thank you very much,
>> Best Regards,
>> Andrea
>>
>> On Mon, Jul 18, 2011 at 4:32 PM, Philipp Pagel p.pa...@wzw.tum.de wrote:
>>> On Mon, Jul 18, 2011 at 04:00:29PM +0200, Andrea Franceschini wrote:
>>>> I use version 13 of R in OSX (downloaded and installed less than 1 year ago).
>>>
>>> Probably 2.13 ...
>>>
>>> [...] code omitted
>>>
>>>> The first lines are OK (i.e. 14 columns, like the dataframe), while at a certain point I get lines with only 3 columns !!! The bad lines that contain only 3 columns have the name and the description of the gene (i.e. the content of the file that I merged with). Besides, these strange lines also get repeated (see the bottom).
>>>
>>> I haven't carefully analyzed your code so I may be wrong, but my guess for all weird behaviour of gene-related data.frames is this: gene descriptions love to contain things like "Foo 5' obfuscation factor". Note the ' in the description, which read.table will happily interpret as a quotation mark and eat lots of rows until it happens to encounter a closing counterpart. This leads to all kinds of funny results.
>>>
>>> So I bet your problem is not in write.table but in reading the data. Have a closer look at your data frame: are you really getting the expected number of observations in the merged data.frame? Are the rows in question really OK in the data frame?
>>>
>>> If my guess is correct you should be able to fix your problem by including quote="" in both your read.table commands. If that doesn't help, also try comment.char="" - another popular source of problems.
>>>
>>> cu
>>> Philipp
>>>
>>> --
>>> Dr. Philipp Pagel
>>> Lehrstuhl für Genomorientierte Bioinformatik
>>> Technische Universität München
>>> Wissenschaftszentrum Weihenstephan
>>> 85350 Freising, Germany
>>> http://webclu.bio.wzw.tum.de/~pagel/
>
> David Winsemius, MD
> West Hartford, CT
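The quoting behaviour discussed in this thread can be reproduced on a small scale. The following is only a sketch under stated assumptions: the file name, the column names and the gene descriptions are made up for illustration, and the exact result of the default read may differ between R versions, which is why it is wrapped in tryCatch() and no output is claimed for it.

```r
# Hypothetical tab-separated gene table; the apostrophes mimic 5'/3' in descriptions.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "gene\tdescription",
  "g1\tFoo 5' obfuscation factor",
  "g2\tplain description",
  "g3\tBar 3' end"
), tsv_path)

# With the default quote characters, ' is treated as a quotation mark, so the
# rows between the two apostrophes may be swallowed into a single field.
rows_default <- tryCatch(
  nrow(read.table(tsv_path, header = TRUE, sep = "\t")),
  error = function(e) NA_integer_
)

# Disabling quote and comment handling reads every line as its own row.
genes <- read.table(tsv_path, header = TRUE, sep = "\t",
                    quote = "", comment.char = "", stringsAsFactors = FALSE)
nrow(genes)
```

Comparing rows_default with nrow(genes) is a quick way to see whether quoting is eating rows in a real file.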
Re: [R] Centering data frame by factor
P1 - tapply(P1, Experiment, mean)[Experiment]

HTH,
Daniel

ronny wrote:
> Hi, I would like to center P1 and P2 of the following data frame by the factor Experiment, i.e. subtract from each value the average of its experiment, and keep the original data structure, i.e. the experiment and the group of each value.
> [...]
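For reference, the same per-experiment centering can be sketched with base R's ave(), which returns the group mean aligned with each row and therefore keeps the original row order and the Group column. This is an alternative to the one-liner above, not the poster's method; the name NORMALIZED is taken from the question, and the loop over column names is an assumption about how one might apply it to both P1 and P2.

```r
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group = c("A", "A", "B", "A", "A", "B"),
                  P1 = c(10, 12, 14, 5, 3, 4),
                  P2 = c(8, 12, 16, 2, 3, 4))

NORMALIZED <- RAW
for (col in c("P1", "P2")) {
  # ave() computes the mean within each Experiment and repeats it per row
  NORMALIZED[[col]] <- RAW[[col]] - ave(RAW[[col]], RAW$Experiment)
}
NORMALIZED
```

Unlike indexing the tapply() result with a numeric Experiment column, which selects by position, ave() does not depend on the experiments being coded 1, 2, ...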
Re: [R] How to convert number (matlab) to date
> but even this is dubious, since there is no year 0 AD. In Gregorian and Julian calendars, 1 BC continues directly into 1 AD.

True, but these days we are ruled by ISO 8601:2004, which does define a year 0 (the year before 1 CE, aka 1 AD). See http://en.wikipedia.org/wiki/0_(year) . It seems also to redefine the meaning of 'Gregorian calendar', calling what you are referring to the 'BC/AD calendar system'. (Those who prefer BCE/CE to BC/AD might note the usage of the latter in the definitive international standard.)

On Mon, 18 Jul 2011, peter dalgaard wrote:
> On Jul 18, 2011, at 14:08 , Gabor Grothendieck wrote:
>> On Sat, Jul 16, 2011 at 11:50 PM, Eduardo M. A. M. Mendes emammen...@gmail.com wrote:
>>> Hello
>>> I am new to R and I need to convert some dates (numeric format by matlab) to actual dates in R. For instance,
>>> Matlab - 730456 -
>>> datestr(730456)
>>> ans = 02-Dec-1999
>>
>> Set the origin to Matlab's origin like this. Be sure you are using the indicated version of zoo or later:
>>
>> > library(zoo)
>> > packageVersion("zoo")
>> [1] ‘1.7.1’
>> > as.Date(730456, origin = "0000-00-00")
>> [1] "1999-12-02"
>
> Doesn't work on a Mac, and in general, I think it depends on a quirk in your OS's date conversion utilities. What does work for me is
>
> > as.Date(730456-1, origin='0000-01-01')
> [1] "1999-12-02"
>
> but even this is dubious, since there is no year 0 AD. In Gregorian and Julian calendars, 1 BC continues directly into 1 AD. So, to be sure, try
>
> > as.Date(730456-367, origin='0001-01-01')
> [1] "1999-12-02"
>
> (from which it transpires that the non-existing year 0 is a leap year...).
>
> Or, of course, just use the appropriate magic constant of 719529 and be done with it:
>
> > as.Date(730456-719529)
> [1] "1999-12-02"
>
> I fail to see what zoo has to do with this at all!
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
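The "magic constant" route from the quoted message can be wrapped in a small helper. This is a sketch only: the function name matlab_to_date is made up here, and the explicit origin argument is added because as.Date() on a bare number needs one in older versions of base R.

```r
# 719529 is the Matlab datenum of 1970-01-01, the origin of R's Date class.
matlab_epoch_offset <- 719529

matlab_to_date <- function(datenum) {
  # Subtracting the offset leaves days since 1970-01-01
  as.Date(datenum - matlab_epoch_offset, origin = "1970-01-01")
}

matlab_to_date(730456)
```

This avoids parsing any year-0 origin string, so it should not depend on how the operating system treats such dates.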
[R] tm: Read a single text file into a corpus as single document?
Hello everyone,

I'm doing some JGR (a GUI frontend for R) development, specifically adding functionality from tm. In order to enable users to select some text files from a file dialog and turn them into a corpus, I need to be able to generate a corpus using a *SINGLE* text file as a single document, and to append a new document to an existing corpus. I know that if I could read files into single character vectors I'd be in business, but I can't find how to do this either. This seems like a no-brainer, so I'm at my wits' end. Here's pseudo code of what I'd like to be able to do:

##
corp1doc <- Corpus(singleTextDocSource("path/to/doc"))  # read in 1 text doc as a 1-document corpus
corp1doc
A corpus with 1 text document
corp1doc[[2]] <- AnotherSingleTextDoc("path/to/doc")  # append a second document to the same corpus
corp1doc
A corpus with 2 text documents
##

I can almost do this with DirSource, by setting pattern='filename', but this requires me to also separate out the path to the enclosing directory, which shouldn't be necessary.

Thanks for taking a look!
[R] Centering data frame by factor
Hi,

I would like to center P1 and P2 of the following data frame by the factor Experiment, i.e. subtract from each value the average of its experiment, and keep the original data structure, i.e. the experiment and the group of each value.

RAW = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("A","A","B","A","A","B"), P1=c(10,12,14,5,3,4), P2=c(8,12,16,2,3,4))

Desired result:

NORMALIZED = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("A","A","B","A","A","B"), P1=c(-2,0,2,1,-1,0), P2=c(-4,0,4,-1,0,1))

I tried using by, but then I lose the original order and the Group variable. Can you help?

> RAW
Experiment Group P1 P2
         2     A 10  8
         2     A 12 12
         2     B 14 16
         1     A  5  2
         1     A  3  3
         1     B  4  4

> NOT.OK <- within(RAW, {P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})
> NOT.OK
Experiment Group P1 P2
         2     A  1  8
         2     A -1 12
         2     B  0 16
         1     A -2  2
         1     A  0  3
         1     B  2  4
Re: [R] read.csv help
Well yeah, it works fine for small data, but when I tried the exact same command with a large data set (about 167 rows and 4000 columns) it gave me a different data frame. Either I get the first column as row names, so that data[1,1] returns the first-row, second-column value of the original data (because the first column became row names), or, if I explicitly put row.names = NULL, I get my columns shifted.

This is how the data should look:

> tdata[1,1:3]
   timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803               1104233                      0

and this is how I'm able to load the data:

   row.names timestamp system.system.nfs_ops system.system.cifs_ops
1 1299376803   1104233                     0                      0

Notice the shift in the first column. I hope this makes my problem clearer.
Re: [R] read.csv help
On 2011-07-19 01:27, psombe wrote:
> Well yeah it works fine for small data but when i tried the exact same command with a large data set (abt 167 rows and 4000 columns) it gave me a different data frame.
> [...]
> notice the shift in the first column i hope this makes my problem clearer

This has nothing to do with the size of your data set. Try count.fields() on your data file, and do take note of the description of the row.names argument to the read.csv function:

"If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names."

Peter Ehlers
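The diagnosis suggested above can be tried on a toy file. The sketch below assumes a made-up three-column header followed by data rows that carry four fields each, which is the situation the quoted documentation describes; the file contents and variable names are illustrative only, and the workaround shown at the end is one possible approach rather than the only one.

```r
csv_path <- tempfile(fileext = ".csv")
# Header has 3 fields, every data row has 4 (hypothetical data)
writeLines(c("A,B,C", "12,13,41,0", "14,15,42,0"), csv_path)

# Number of fields on each line; a header one short of the data rows stands out
fields_per_line <- count.fields(csv_path, sep = ",")
fields_per_line

# Default behaviour: the first data column silently becomes the row names
shifted <- read.csv(csv_path)
rownames(shifted)

# One possible workaround: skip the header and name the columns afterwards
body <- read.csv(csv_path, header = FALSE, skip = 1)
dim(body)
```

If count.fields() reports one more field in the data rows than in the header, a trailing separator at the end of each data line is worth checking for.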
Re: [R] tm: Read a single text file into a corpus as single document?
Some hints:

- list.files() will return the list of files in a directory
- readLines() will allow you to load text files as vectors of lines
- strsplit() will allow you to break lines into words
- c(x, y) concatenates vectors x and y; x <- c(x, y) appends vector y to x
- unique() will allow you to get rid of repeats

And the Map/Reduce family of functions will allow you to write what you want in about 15 lines of concise R code with no loops.

Hope it helps,
Cheers,
jcb!

On Tue, Jul 19, 2011 at 11:11 AM, Alexander James Rickett ack.van...@gmail.com wrote:
> Hello everyone, I'm doing some JGR (a gui frontend for R) development, specifically adding functionality from tm.
> [...]
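Building on the readLines() hint, a base-R sketch of "one file in, one character string out" could look like the following. The helper name read_as_single_string and the temporary files are assumptions made for the example; nothing here calls tm itself.

```r
# Read a whole text file into a single character string
read_as_single_string <- function(path) {
  paste(readLines(path, warn = FALSE), collapse = "\n")
}

# Two throwaway files standing in for user-selected documents
doc_a <- tempfile(fileext = ".txt")
doc_b <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), doc_a)
writeLines("another document", doc_b)

# Start with one document, then append a second one
docs <- character(0)
docs <- c(docs, read_as_single_string(doc_a))
docs <- c(docs, read_as_single_string(doc_b))
length(docs)
```

A character vector like docs, with one element per document, could then be handed to tm, for instance through VectorSource(), which treats each element as a separate document.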
Re: [R] urgent Help needed
On 07/19/2011 04:40 AM, Ana-Maria Pistea wrote:
> [...]

Hi Ana-Maria,

A quick Google search for your error message shows that there can be quite a number of causes for this problem, so it is impossible for us to help you as things stand. Please read the R-help posting guide to improve your question [1]. The most important thing you need to provide is a commented, minimal, self-contained, reproducible example of this problem that we can run on our own computers.

regards,
Paul

[1] http://www.r-project.org/posting-guide.html

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
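One way to narrow down a "bad restore file magic number" error is to look at the first bytes of the file and compare them with a file known to be good. The sketch below creates such a known-good file; the object and file names are made up, and the expectation that an uncompressed binary save file starts with "RDX" followed by a version digit is an assumption about R's binary save format. Files written with the default compression need to be decompressed before the header is visible.

```r
rda_path <- tempfile(fileext = ".rda")
measurements <- c(1.5, 2.5, 3.5)

# compress = FALSE keeps the header readable as plain bytes
save(measurements, file = rda_path, compress = FALSE)

# First four bytes of the file, shown as text
magic <- rawToChar(readBin(rda_path, what = "raw", n = 4))
magic

# Loading into a fresh environment shows what the file contains
restored <- new.env()
load(rda_path, envir = restored)
ls(restored)
```

If the suspect file starts with something else entirely, it may not have been written by save() at all, or it may have been altered in transfer. Note also that the file name passed to load() must be a quoted string.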
Re: [R] dead code removal
Ideally you'd have the following two items available:

- tests that ensure that your code does what it should, as it should;
- a coverage analysis tool that reports which parts of your code have and have not been executed by the tests above.

Neither of these is mandatory, but they will save you from nightmares later on. While there are a few test harnesses for the former (e.g. RUnit, testthat and so on), I am not aware of any code coverage tools for R, sadly.

Cheers,
jcb!

On Tue, Jul 19, 2011 at 9:51 AM, Alex Bird sund...@gmail.com wrote:
> Hi there, I have some unused code in my project but have no idea how to clean it up in some kind of automatic way.
> [...]
Re: [R] dead code removal
On 07/19/2011 09:57 AM, Juan Carlos Borrás wrote:
> Ideally you'd have the next two items available:
> [...]

Hi,

Maybe you could profile your code and record which functions are used. For an example of how to do this see [1]. Cross-referencing them with your code should indicate whether there are any functions that are unused.

cheers,
Paul

[1] http://www.stat.berkeley.edu/~nolan/stat133/Fall05/lectures/profilingEx.html

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
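As a complement to profiling, a rough static cross-reference can be sketched with codetools::findGlobals() from the codetools package that ships with R. This is not what either reply proposes: the toy "project" environment, the function names and the choice of "main" as the entry point are all assumptions for illustration, and a static check like this will miss functions that are only called indirectly (for example via do.call() with a string, or through S3 dispatch).

```r
library(codetools)

# A toy "project": three functions, one of which is never called
project <- new.env()
evalq({
  used_helper   <- function(x) x + 1
  unused_helper <- function(x) x * 2
  main          <- function(x) used_helper(x)
}, project)

fun_names <- ls(project)

# Every function name referenced from the body of any project function
called <- unique(unlist(lapply(fun_names, function(nm) {
  findGlobals(get(nm, envir = project), merge = FALSE)$functions
})))

# Defined but never referenced, ignoring the assumed entry point "main"
never_called <- setdiff(fun_names, c(called, "main"))
never_called
```

Anything reported this way is only a candidate for removal; running the tests after deleting it is still the real check.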
Re: [R] Understanding R's Environment concept
On 11-07-18 2:16 PM, Nipesh Bajaj wrote:
> Hi all, I am trying to understand the R's environment concept however the underlying help files look quite technical to me. Can experts here provide me some more intuitive ideas behind this concept like, why it is there, what exactly it is doing in R's architecture etc.? I mainly need some non-technical intuitive explanation.

There are three characteristics that describe environments:

1. They are a collection of named objects. Much of the time when you ask for something by name, you're looking in an environment to find it.

2. They have a child-parent relationship to another environment. Some of the time, when you look up a name and it is not found, it goes to the parent to look. (And then the grandparent.) This means most of the time when you specify a name, R just looks in one environment and its ancestors to find the object.

3. They don't get copied when you make an assignment. So you can say env <- globalenv(), and your env is another name for the global environment, which is where most user objects are created. Saying env$z <- 3 will create a new variable named z in the global environment. This differs from most other R objects, where assignment makes an independent copy.

And one thing that says how they are used:

1. Things like functions need to look up names all the time. Those things generally have an associated environment, which is where they'll look. (Functions are a little complicated in that they get a new one every time you call them, but they also have an associated environment which is the parent of the new one.)

Duncan Murdoch
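The points above can be illustrated with a few lines of R. This is a sketch, not part of the original reply: it uses a fresh environment from new.env() instead of the global environment so that it leaves the workspace alone, and the names env, alias and make_counter are made up for the example.

```r
# Point 3: assigning an environment does not copy it
env <- new.env()
alias <- env          # both names now refer to the same environment
alias$z <- 3
env$z                 # the value set through 'alias' is visible through 'env'

# Contrast: ordinary vectors are copied on assignment
v <- c(1, 2)
w <- v
w[1] <- 99            # v is unchanged

# Functions and their associated environment: each call to make_counter()
# creates a new environment that the returned function keeps looking into
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # found in the enclosing (parent) environment
    count
  }
}
counter <- make_counter()
counter()
counter()
```

The counter works because the inner function's associated environment is the one created by the call to make_counter(), which is where count lives.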
[R] Lattice plot problem outputting to jpeg
Hi. I am relatively new to R but was quite pleased with myself at having generated a series of lattice plots as PDFs. I was very surprised when plotting these out as jpegs (or png or tiff) that the strip title information above each lattice plot vanished. The pdf was fine. Has anybody any ideas? I can't add an image as the information is sensitive.

Many Thanks
Steve Creamer

Here is the code snippet:

#jpeg("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.jpg",height=600,width=600)
pdf("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.pdf")
#png("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.png")
#tiff("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.tiff")
while (irec <= nrec)
{
  con_new <- consultants[irec]
  spec_new <- specialty[irec]
  spec_cons_new <- spec_cons[irec]
  if (spec_cons_new != spec_cons_old || irec == nrec)
  {
    strip_name[con_count] <- paste(spec_old, '\n', con_old)
    if (spec_new != spec_old || irec == nrec)
    {
      spec_old <- spec_new
    }
    con_count <- con_count + 1
    if (con_count > 36 || irec == nrec)
    {
      # -
      # Use xyplot for the lattice - plot from iStartRec to irec each time - this will be 36 consultants
      # /specialties per device plot.
      # -
      tplot <- xyplot(DAtotals[irecStart:(irec-1)] ~ dates[irecStart:(irec-1)] | spec_cons[irecStart:(irec-1)],
                      group = nf[irecStart:(irec-1)], layout = c(6,6),
                      type = 'b', as.table = TRUE,
                      main = paste("Monthly Appts Provided After 1 DNA\n by Specialties/Consultants - ", iSuffix),
                      auto.key = list(cex=0.5, lines=TRUE, points=FALSE, border=TRUE, x=0.05, y=0.90, corner=c(0,0)),
                      ylab = "Number of Monthly Attended Appts Following a DNA ", xlab = "Date",
                      scales = list(x = list(rot = 90, format = "%b-%y", cex = 0.6)), xaxt = "n",
                      strip = function(which.panel, ...)
                      {
                        panel.fill(trellis.par.get("strip.background")$col[1])
                        type <- strip_name[which.panel]
                        grid::grid.text(label = type, x = 0.5, y = 0.5, gp = grid::gpar(fontsize=5))
                        grid::grid.rect()
                      }
      )
      print(tplot)            # plot the lattice plot
      con_count <- 1
      irecStart <- irec
      iSuffix <- iSuffix + 1  # create suffix
      if (irec != nrec)
      {
        #
        # Open new device and direct to jpeg with new name
        #
        dev.off()
        graphics.off()        # turn graphics off to clear memory
        dev.new()
        #jpeg(paste("Z:\\My Documents\\PROJECTS\\AccessPolicy\\AccessDAPlots_",toString(iSuffix),".jpg",sep=""),
        #     height=600,width=600)
        pdf(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".pdf",sep=""))
        #png(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".png",sep=""))
        #tiff(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_",toString(iSuffix),".tiff",sep=""))
      }
    }
    con_old <- con_new
    spec_cons_old <- spec_cons_new
  }
  irec <- irec + 1
}
Re: [R] line jump in plot legend title
As suggested by David, I applied some modifications to the code of the legend function so that the size of the legend box adapts to the number of line breaks in the legend title. The modified code of the function is attached.

Thanks to David.

Regards,
Loïc
Wageningen University

http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r
Re: [R] Centering data frame by factor
Perfect! Made my day!
Re: [R] Lattice plot problem outputting to jpeg
On Jul 19, 2011, at 5:40 AM, creamers wrote:
> Hi.I am relatively new to R but was quite pleased with myself at having generated a series of lattice plots as PDFs. I was very surprised when plotting these out as jpegs (or png or tiff) that the strip title information above each lattice plot vanished. The pdf was fine. Has anybody any ideas? I can't add an image as the information is sensitive.

You have obviously advanced far in your understanding of the underpinnings of lattice plots, farther than I have in many respects. I was surprised, therefore, to see that you were directly accessing elements of your data in the global environment. Generally lattice functions work best when they are given data.frames as arguments.

The other comment (more specific to your problem) is that replacement strip functions are generally constructed with the function strip.custom(). I get the impression from the docs that this may be required, but apparently you succeeded with single-plot testing and it may be a device issue, so I may be off base. You could still take a look at the examples in the help page and see if using that wrapper gets you better delivery of arguments to the operative code.

Obviously I am not able to do any testing, since you have not constructed a minimal test data frame. Device issues often require knowing the OS and other information that the Posting Guide requests you provide with sessionInfo().

-- David.
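The two suggestions in this reply, passing a data frame and building the strip with strip.custom(), might be sketched as follows. Everything in the example is hypothetical: the data frame appts, its column names and the strip labels are invented, and the sketch writes to a temporary PDF only to show the device workflow. It is not claimed to resolve the jpeg issue from the original post.

```r
library(lattice)

# Made-up stand-in for the appointment data, one row per month and panel
appts <- data.frame(
  month = rep(1:4, times = 2),
  total = c(5, 7, 6, 9, 3, 4, 6, 5),
  spec_cons = factor(rep(c("cons1", "cons2"), each = 4))
)

# Hypothetical two-line strip labels, one per level of spec_cons
strip_labels <- c("Cardiology\nDr A", "Surgery\nDr B")

tplot <- xyplot(total ~ month | spec_cons, data = appts,
                type = "b", as.table = TRUE,
                # strip.custom() wraps the default strip with new labels
                strip = strip.custom(factor.levels = strip_labels),
                # smaller text and a taller strip to fit two lines
                par.strip.text = list(cex = 0.6, lines = 2))

out_path <- tempfile(fileext = ".pdf")
pdf(out_path)
print(tplot)   # lattice objects must be printed to be drawn on the device
dev.off()
```

Swapping pdf() for jpeg() or png() with the same plot object is then a direct way to compare devices, since the strip text size is controlled by par.strip.text rather than by a hand-written strip function.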
Many thanks, Steve Creamer
Here is the code snippet:
  #jpeg("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.jpg", height=600, width=600)
  pdf("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.pdf")
  #png("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.png")
  #tiff("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_1.tiff")
  while (irec <= nrec) {
    con_new <- consultants[irec]
    spec_new <- specialty[irec]
    spec_cons_new <- spec_cons[irec]
    if (spec_cons_new != spec_cons_old || irec == nrec) {
      strip_name[con_count] <- paste(spec_old, '\n', con_old)
      if (spec_new != spec_old || irec == nrec) {
        spec_old <- spec_new
      }
      con_count <- con_count + 1
      if (con_count > 36 || irec == nrec) {
        #
        # Use xyplot for the lattice - plot from irecStart to irec each time -
        # this will be 36 consultants/specialties per device plot.
        #
        tplot <- xyplot(DAtotals[irecStart:(irec-1)] ~ dates[irecStart:(irec-1)] | spec_cons[irecStart:(irec-1)],
                        group = nf[irecStart:(irec-1)], layout = c(6,6), type = 'b', as.table = TRUE,
                        main = paste("Monthly Appts Provided After 1 DNA \n by Specialties/Consultants - ", iSuffix),
                        auto.key = list(cex=0.5, lines=TRUE, points=FALSE, border=TRUE, x=0.05, y=0.90, corner=c(0,0)),
                        ylab = "Number of Monthly Attended Appts Following a DNA", xlab = "Date",
                        scales = list(x = list(rot = 90, format = "%b-%y", cex = 0.6)), xaxt = "n",
                        strip = function(which.panel, ...) {
                          panel.fill(trellis.par.get("strip.background")$col[1])
                          type <- strip_name[which.panel]
                          grid::grid.text(label = type, x = 0.5, y = 0.5, gp = grid::gpar(fontsize=5))
                          grid::grid.rect()
                        })
        print(tplot)             # plot the lattice plot
        con_count <- 1
        irecStart <- irec
        iSuffix <- iSuffix + 1   # create suffix
        if (irec != nrec) {
          #
          # Open new device and direct to jpeg with new name
          #
          dev.off()
          graphics.off()         # turn graphics off to clear memory
          dev.new()
          #jpeg(paste("Z:\\My Documents\\PROJECTS\\AccessPolicy\\AccessDAPlots_", toString(iSuffix), ".jpg", sep=""),
          #     height=600, width=600)
          pdf(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_", toString(iSuffix), ".pdf", sep=""))
          #png(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_", toString(iSuffix), ".png", sep=""))
          #tiff(paste("Z:\\My Documents\\PROJECTS\\Access Policy\\AccessDAPlots_", toString(iSuffix), ".tiff", sep=""))
        }
      }
      con_old <- con_new
      spec_cons_old <- spec_cons_new
    }
    irec <- irec + 1
  }
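For what it's worth, here is a minimal sketch of the strip.custom() route suggested in the reply, with the data passed as a data.frame. The data frame `dfrm`, its columns, and the consultant names are all made up for illustration (the real data were not posted), and whether this cures the jpeg problem is untested; it only shows the wrapper being used instead of a hand-written strip function.

```r
library(lattice)  # recommended package shipped with R

# Hypothetical stand-in data; the original data are not available
set.seed(42)
dfrm <- data.frame(
  month     = rep(1:12, times = 4),
  appts     = rpois(48, lambda = 20),
  spec_cons = factor(rep(c("Cardio.Smith", "Cardio.Jones",
                           "Derm.Brown", "Derm.Green"), each = 12))
)

# Two-line strip labels, one per level of the conditioning factor
strip_labels <- sub(".", "\n", levels(dfrm$spec_cons), fixed = TRUE)

tplot <- xyplot(appts ~ month | spec_cons, data = dfrm,
                type = "b", as.table = TRUE,
                # lines = 2 makes the strips tall enough for two text lines
                par.strip.text = list(cex = 0.6, lines = 2),
                strip = strip.custom(factor.levels = strip_labels))

# To write it out, open the device, print the object, and close the device:
# jpeg("AccessDAPlots_1.jpg", height = 600, width = 600); print(tplot); dev.off()
```

Because the labels go through strip.custom(), lattice itself handles the strip drawing on whichever device is open.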
Re: [R] Drawing a histogram from a massive dataset
On Tue, Jul 19, 2011 at 12:30 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
[snip]
I guess that I must have a data frame to plot a histogram.
Not at all!
  ## a *vector* of 100 million observations
  x <- rnorm(10^8)
  ## a histogram for it (see attached for the result from my system)
  hist(x)
No data frame required. I would not try this straight in anything but traditional graphics for a 100 million observation vector, but if you wanted it made in ggplot2 or something, you could prebin the data and THEN plot bars corresponding to the bins.
Thanks, Joshua, for your answer. True: a vector is enough to supply data for hist(). But my point is: can a histogram be drawn without having all the data in computer memory? You partially answer this question by suggesting to prebin the data. Can this prebinning process be done transparently, chunk by chunk, underneath?
Sure, as long as you can figure out some basic details about the full dataset. Just define your breaks, and then for chunks of the data at a time, count how many fall into any particular bin. Once you are done, add up all the counts for each bin, and voila.
  ## Get these values from the full data (using SQL)
  x <- rnorm(1000)
  n <- length(x)
  minx <- min(x)
  maxx <- max(x)
  ## Sturges style breaks
  breaks <- pretty(c(minx, maxx), n = ceiling(log2(n) + 1))
  nB <- length(breaks)
  fuzz <- rep(1e-07 * median(diff(breaks)), nB)
  fuzz[1] <- fuzz[1] * -1
  fuzzybreaks <- breaks + fuzz
  chunks <- 10
  counts <- matrix(NA, nrow = chunks, ncol = nB - 1,
                   dimnames = list(paste("Sec", 1:chunks, sep = ''), as.character(fuzzybreaks[-1])))
  for(i in 1:chunks) {
    index <- seq(1, n/chunks) + (n/chunks * (i - 1))
    counts[i, ] <- hist(x[index], breaks = fuzzybreaks)$counts
  }
  ## The heights of your bars
  colSums(counts)
  ## results using hist() on x all at once
  hist(x)$counts
You would not even need to know beforehand the number of chunks you were going to split your data into; I just did it for convenience and to instantiate a full-sized matrix to hold the results. If you are selecting subsets of your data using SQL rather than R, it becomes even simpler. Once you have your fuzzybreaks, you just keep calling hist on your new data using the predefined breaks and saving the results.
Still, I do not exceed about 4.5 GB of memory just plotting a histogram on a 100 million observation vector, and it is difficult to imagine the shape of the distribution changing appreciably using a random sample of 100 million observations. It also takes less than 10 seconds to calculate and draw the histogram on my computer. The point being, I suspect you will spend more time getting everything set up and working than seems worth it, because you can already create a histogram easily and quickly on such large vectors, and the distribution is unlikely to vary anyway. Whatever floats your boat, though.
Thanks again, Joshua. Your approach is quite interesting.
Paul
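The chunk-by-chunk idea above can also be written without preallocating a matrix. The following is only a sketch under assumptions: `chunk_counts` is a made-up helper name, and the list `chunks` stands in for pieces that would really be fetched one at a time (for example by successive SQL queries). It uses plot = FALSE so that hist() only does the counting.

```r
# Sketch: accumulate histogram counts one chunk at a time.
# `chunk_counts` is a hypothetical helper, not part of any package.
chunk_counts <- function(chunks, breaks) {
  total <- numeric(length(breaks) - 1)
  for (chunk in chunks) {
    # plot = FALSE returns the bin counts without drawing anything
    total <- total + hist(chunk, breaks = breaks, plot = FALSE)$counts
  }
  total
}

set.seed(1)
x <- rnorm(1000)

# In practice the range would come from the full data, e.g. SQL MIN()/MAX()
breaks <- pretty(range(x), n = ceiling(log2(length(x)) + 1))

# Stand-in for 10 separate reads of 100 observations each
chunks <- split(x, rep(1:10, each = 100))

counts      <- chunk_counts(chunks, breaks)
all_at_once <- hist(x, breaks = breaks, plot = FALSE)$counts

# Draw the bars from the accumulated counts:
# barplot(counts, space = 0, names.arg = head(breaks, -1))
```

Only one chunk plus the running totals needs to be in memory at any time, which is the point of the exercise.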
Re: [R] dead code removal
Many thanks! Will try! Kind regards, Alex
2011/7/19 Paul Hiemstra paul.hiems...@knmi.nl:
Re: [R] Centering data frame by factor
On Jul 19, 2011, at 4:50 AM, Daniel Malter wrote:
  P1-tapply(P1,Experiment,mean)[Experiment]
Another way would be with ave(), but I discovered that it does not accept subsidiary arguments and does not issue warnings either, so this works:
  with(dfrm, ave(P1, Experiment, FUN=function(x) scale(x, scale=FALSE) ) )
  [1] -2  0  2  1 -1  0
But this doesn't behave as directed ... by my pre-operational R-brain.
  with(dfrm, ave(P1, Experiment, FUN=scale, scale=FALSE) )
  [1] -1  0  1  1 -1  0
(It applies both default arguments and issues no warning about an unused argument. Most (well, some anyway) functions like this accept subsidiary arguments with ..., but `ave` uses that construction to gather its factor arguments rather than expecting them to be in a list or vector, as do tapply, aggregate, and by. Some functions like mapply and many others give you a MoreArgs option, but not ave.)
-- David.
HTH, Daniel
ronny wrote:
Hi, I would like to center P1 and P2 of the following data frame by the factor Experiment, i.e. subtract from each value the average of its experiment, and keep the original data structure, i.e. the experiment and the group of each value.
  RAW = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("A","A","B","A","A","B"), P1=c(10,12,14,5,3,4), P2=c(8,12,16,2,3,4))
Desired result:
  NORMALIZED = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("B","A","B","B","A","B"), P1=c(-2,0,2,1,-1,0), P2=c(-4,0,4,-1,0,1))
I tried using by, but then I lose the original order and the Group variable. Can you help?
  RAW
  Experiment Group P1 P2
           2     A 10  8
           2     A 12 12
           2     B 14 16
           1     A  5  2
           1     A  3  3
           1     B  4  4
  NOT.OK <- within(RAW, {P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})
  NOT.OK
  Experiment Group P1 P2
           2     A  1  8
           2     A -1 12
           2     B  0 16
           1     A -2  2
           1     A  0  3
           1     B  2  4
David Winsemius, MD West Hartford, CT
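As a small worked sketch of the ave() approach on the data posted in this thread: the helper name `center` is made up here, and wrapping the centering in a one-argument function sidesteps the issue that ave() treats extra arguments as grouping variables.

```r
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group      = c("A", "A", "B", "A", "A", "B"),
                  P1         = c(10, 12, 14, 5, 3, 4),
                  P2         = c(8, 12, 16, 2, 3, 4))

# Hypothetical helper: subtract the group mean, no scaling
center <- function(x) x - mean(x)

NORMALIZED <- RAW
# ave() returns a vector in the original row order, so Experiment and
# Group stay aligned with the centered values
NORMALIZED[c("P1", "P2")] <- lapply(RAW[c("P1", "P2")], function(col) {
  ave(col, RAW$Experiment, FUN = center)
})

NORMALIZED
```

The row order and the Group column are untouched; only P1 and P2 are replaced by their within-experiment deviations.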
Re: [R] Sweave in 2.13.1
Duncan, thanks for your work. I could not get the workaround you suggested,
  Rterm.exe --no-restore --slave -e "utils::Sweave('file.Rnw')"
to work under 2.13.1, and so gave up and installed the 2.14.0 development build. I can verify that R CMD Sweave file.Rnw works properly on the 2.14.0 development build and not on the 2.13.1 release build. For now I will stick with the development build.
Re: [R] line jump in plot legend title
On 11-07-19 8:16 AM, loic wrote:
As suggested by David, I applied some modifications to the code of the legend function so that the legend box size adapts to the number of line jumps in the legend title. Attached the modified code of the function.
No code was attached. In case you didn't, could you make sure that your modification is to the version of the code that's in the svn repository? Then it will be straightforward to import your change into R. That version is online at https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R if your changes only affect the legend() function.
Duncan Murdoch
Thanks to David
Regards, Loïc
Wageningen University
http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r
Re: [R] Sweave in 2.13.1
On 11-07-19 9:42 AM, John Minter wrote:
Duncan, thanks for your work. I could not get the workaround you suggested Rterm.exe --no-restore --slave -e utils::Sweave(file.Rnw) to work under 2.13.1 and so gave up and installed the 2.14.0 development build. I can verify that R CMD Sweave file.Rnw works properly on the 2.14.0 development build and not on the 2.13.1 release build. For now I will stick with the development build
You might prefer R-patched instead: R-devel is going through a lot of changes right now, and there may be bugs in the nightly build. R-patched is very stable at the moment.
Duncan Murdoch
Re: [R] Lattice plot problem outputting to jpeg
Thanks David... I am trying to plot out data for various consultants by specialty - each specialty has a varying number of consultants, and each consultant a varying number of data points. I found direct access of the elements of the dataframe was the only way to plot this type of variation; otherwise xyplot seemed to assume that there were a similar number of consultants per specialty. As for strip.custom, I did try this; it's just that I ended up embedding the function inline... I'm not sure there is any difference, is there? Sorry I didn't follow protocol... it is my first time!
Steve
Re: [R] Multiple comparison test on selected contrasts
Dear Help-list, I have solved the problem by simply deleting the erroneous 0 in the "CR core - CR EC" contrast and deleting the unnecessary argument test = adjusted(summarytype = "single-step").
Regards, B. Jessop
From: deel...@hotmail.com
To: r-help@r-project.org
Date: Sun, 17 Jul 2011 21:04:51 -0300
Subject: [R] Multiple comparison test on selected contrasts
Dear Help-list, How can I do a multiple comparison test (mct) on selected contrasts from a linear model while using packages lme4 and multcomp? I am running R 2.13.0 under Windows 7. The following linear model and mct produce a global mct of 15 paired contrasts of the combined (Site, Position) factor SitePos, of which only 9 are of interest.
  Model.G = lmer(log10(SrCa) ~ SitePos + (1 | Eel), data = Data1)
  Model.G.mct = glht(Model.G, linfct = mcp(SitePos = "Tukey"))
  summary(Model.G.mct)
The following code creates the desired reduced set of contrasts, but I have been unable to apply it correctly to the mct.
  contr = rbind("CR core - MH core" = c(1,0,0,-1,0,0),
                "CR core - CR edge" = c(1,0,-1,0,0,0),
                "CR core - CR EC" = c(1,-1,0,0,0,0,0),
                "CR edge - MH edge" = c(0,0,1,0,0,-1),
                "CR edge - CR EC" = c(0,-1,1,0,0,0),
                "CR EC - MH EC" = c(0,1,0,0,-1,0),
                "MH core - MH edge" = c(0,0,0,1,0,-1),
                "MH core - MH EC" = c(0,0,0,1,-1,0),
                "MH edge - MH EC" = c(0,0,0,0,-1,1))
Execution of this code produces the error message: In rbind ('CR core - MH core = c(1,0,0,-1,0,0), etc.: number of columns of results is not a multiple of vector length (arg 1)'.
  Model.G.mct2 = glht(Model.G, linfct = mcp(SitePos = contr))
  # execution produces Error in linfct{[nm]} %*% c: non-conformable argument, as a consequence of the previous error
  summary(Model.G.mct2, test = adjusted(summarytype = "single-step"))
Clearly, this approach is incorrect (and I have tried others). How can I introduce the selected set of contrasts into the mct? Thanks for any help provided.
Regards, B. Jessop
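For readers hitting the same rbind() warning, here is a sketch of the corrected contrast matrix with a quick sanity check before it is handed to glht(). The column order (CR core, CR EC, CR edge, MH core, MH EC, MH edge) is an assumption inferred from the contrast vectors in the post, and the final glht() call is left commented out because the model and data were not posted.

```r
# Corrected contrasts: every row now has 6 entries, one per SitePos level.
# Assumed level order: CR core, CR EC, CR edge, MH core, MH EC, MH edge
contr <- rbind("CR core - MH core" = c(1,  0,  0, -1,  0,  0),
               "CR core - CR edge" = c(1,  0, -1,  0,  0,  0),
               "CR core - CR EC"   = c(1, -1,  0,  0,  0,  0),  # the extra 0 removed
               "CR edge - MH edge" = c(0,  0,  1,  0,  0, -1),
               "CR edge - CR EC"   = c(0, -1,  1,  0,  0,  0),
               "CR EC - MH EC"     = c(0,  1,  0,  0, -1,  0),
               "MH core - MH edge" = c(0,  0,  0,  1,  0, -1),
               "MH core - MH EC"   = c(0,  0,  0,  1, -1,  0),
               "MH edge - MH EC"   = c(0,  0,  0,  0, -1,  1))

# Sanity checks: one column per factor level, each row a pairwise difference
stopifnot(ncol(contr) == 6, all(rowSums(contr) == 0))

# With the model from the post (requires lme4, multcomp and the poster's data):
# Model.G.mct2 <- multcomp::glht(Model.G, linfct = multcomp::mcp(SitePos = contr))
# summary(Model.G.mct2)
```

Checking ncol() and the row sums up front catches a stray element before it surfaces as a non-conformable error inside glht().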
Re: [R] line jump in plot legend title
On Jul 19, 2011, at 10:38 AM, Duncan Murdoch wrote:
On 11-07-19 8:16 AM, loic wrote:
As suggested by David, I applied some modifications to the code of the legend function so that the legend box size adapts to the number of line jumps in the legend title. Attached the modified code of the function.
No code was attached.
But there was code at the link. It does behave as hoped with defaults unchanged, but when cex is altered, the size of the box does not seem to change appropriately.
In case you didn't, could you make sure that your modification is to the version of the code that's in the svn repository? Then it will be straightforward to import your change into R. That version is online at https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R if your changes only affect the legend() function.
Duncan Murdoch
Thanks to David
Regards, Loïc
Wageningen University
http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r
David Winsemius, MD West Hartford, CT
Re: [R] Lattice plot problem outputting to jpeg
creamers stephen.creamer at rdeft.nhs.uk writes:
Thanks David... I am trying to plot out data for various consultants by specialty - each specialty has a varying number of consultants, and each consultant a varying number of data points. I found direct access of the elements of the dataframe was the only way to plot this type of variation; otherwise xyplot seemed to assume that there were a similar number of consultants per specialty. As for strip.custom, I did try this; it's just that I ended up embedding the function inline... I'm not sure there is any difference, is there? Sorry I didn't follow protocol... it is my first time! Steve
It would probably be worth looking at Hadley's ggplot2 package as well; see also ?melt and ?reshape. Maybe you could make one faceted plot instead?
Re: [R] barplot question
As Sarah requested, could you at least read the posting guide and provide us with some sample data?
--
Robert W. Baer, Ph.D. Professor of Physiology Kirksville College of Osteopathic Medicine A. T. Still University of Health Sciences 800 W. Jefferson St. Kirksville, MO 63501 660-626-2322 FAX 660-626-2965
-----Original Message-----
From: Sally_roman
Sent: Monday, July 18, 2011 12:15 PM
To: r-help@r-project.org
Subject: Re: [R] barplot question
I would like to make stacked barplots, but with two stacked columns per x value. For cod, I have kept and discard values for 2 nets. I would like to have one stacked column for the control net with the kept and discard values, and then another column with the kept and discard values for the experimental net. I can make stacked barplots in R that plot one net by species, but my boss would like to have both nets on the same graph. I have done it in Excel, but was hoping to do the same thing in R. None of the R barplots I have found, including the ones that you included as links, demonstrate what I would like to do. Is it possible in R?
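Since no sample data were posted, the following is only a sketch with invented numbers. It shows one way base graphics can draw two stacked columns (control and experimental net) side by side for each species: put each species-by-net combination in its own matrix column and use the `space` argument of barplot() to pull each pair together.

```r
# Invented counts for illustration only: rows = fate, columns = species x net
catch <- matrix(c(30, 10,   # cod, control:       kept, discard
                  22, 18,   # cod, experimental
                  15,  5,   # haddock, control
                  12,  9),  # haddock, experimental
                nrow = 2,
                dimnames = list(c("kept", "discard"),
                                c("cod.control", "cod.experimental",
                                  "haddock.control", "haddock.experimental")))

# Space before each bar: a wide gap starts a species, a narrow gap
# separates the two nets within a species
gaps <- rep(c(1, 0.1), times = 2)

mids <- barplot(catch, space = gaps, legend.text = TRUE,
                names.arg = rep(c("control", "exp."), times = 2),
                ylab = "Number of fish")

# Species labels centred under each pair of bars
mtext(c("cod", "haddock"), side = 1, line = 2.5,
      at = c(mean(mids[1:2]), mean(mids[3:4])))
```

barplot() stacks the rows of a matrix by default, so each column becomes one kept/discard stack, and the midpoints it returns make it easy to place the species labels.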
[R] Writing the output of a regression object to a file
Hi, I'm using R 2.12.0 on Windows XP. I've used the e1071 package to tune a Support Vector Regression object and I've created the SVR object:
  epsilon.svr <- svm(C8R004 ~ ., data = rain_flow.train, scale = T, type = "eps-regression",
                     kernel = "radial", cost = 0.9, epsilon = 0.55, tolerance = 0.001,
                     shrinking = T, gamma = 0.18, fitted = T)
  esvr.pred <- predict(epsilon.svr, newdata = rain_flow.test)
I would like to export the esvr.pred object to a file so that I can draw a graph of it against my original data in other software that I'm using. I've tried the write.svm command, but that outputs the scaled data instead of something that I can directly compare to my original data. Does anyone know of an easy way to get the result in such a format? Alternatively, how can I use the scale values to generate such a format?
Thanks
Hanlie
Re: [R] line jump in plot legend title
I also noticed, after sending the code, that the modification only works when the legend is positioned at the bottom of the figure region, and the function crashes when no title is provided. The modification I applied to the code was intended more as a fix for a particular problem than as a real contribution to the function. Unfortunately my programming skills are too limited to really contribute to the project. However, if one of you manages to make the change work in all cases, I believe including it in a future version of R would be worthwhile.
Regards, Loïc
-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Tuesday 19 July 2011 16:56
To: Duncan Murdoch
Cc: Dutrieux, Loïc; r-help@r-project.org
Subject: Re: [R] line jump in plot legend title
On Jul 19, 2011, at 10:38 AM, Duncan Murdoch wrote:
On 11-07-19 8:16 AM, loic wrote:
As suggested by David, I applied some modifications to the code of the legend function so that the legend box size adapts to the number of line jumps in the legend title. Attached the modified code of the function.
No code was attached.
But there was code at the link. It does behave as hoped with defaults unchanged, but when cex is altered, the size of the box does not seem to change appropriately.
In case you didn't, could you make sure that your modification is to the version of the code that's in the svn repository? Then it will be straightforward to import your change into R. That version is online at https://svn.r-project.org/R/trunk/src/library/graphics/R/legend.R if your changes only affect the legend() function.
Duncan Murdoch
Thanks to David
Regards, Loïc
Wageningen University
http://r.789695.n4.nabble.com/file/n3677996/legend.r legend.r
David Winsemius, MD West Hartford, CT
Re: [R] Writing the output of a regression object to a file
If I understand you correctly,
  "I would like to export the esvr.pred object to a file so that I can draw a graph of it against my original data in other software that I'm using."
you cannot do this. You can export **data**, but of course any R object is either a binary or text (via dput) representation of an R structure, which can only be understood by R, not another software system. See ?write, ?write.table, or the R import/export manual for how to export data (as text) to be imported by other software.
Cheers, Bert
Bert Gunter
Genentech Nonclinical Biostatistics
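Following the pointer to ?write.table, here is a minimal sketch of exporting predictions as plain text next to the observed values. It makes two loud assumptions: an lm() fit on the built-in `cars` data stands in for the poster's svm() fit (so that the example runs without e1071 or the rainfall data), and the file name is arbitrary.

```r
# Stand-in for the poster's model: lm() on the built-in `cars` data
# replaces svm(); the export step is the same for any predict() output
# that is a plain numeric vector.
train <- cars[1:40, ]
test  <- cars[41:50, ]

fit  <- lm(dist ~ speed, data = train)
pred <- predict(fit, newdata = test)

# Put observed and predicted values side by side as ordinary data
out <- data.frame(observed  = test$dist,
                  predicted = as.numeric(pred))

# Write a CSV that other software can read
out_file <- file.path(tempdir(), "predictions.csv")
write.csv(out, out_file, row.names = FALSE)

# Read it back to confirm the round trip
back <- read.csv(out_file)
```

The resulting file is ordinary comma-separated text with a header row, so a spreadsheet or plotting program can open it directly.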
Re: [R] Centering data frame by factor
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Malter
Sent: Tuesday, July 19, 2011 1:51 AM
To: r-help@r-project.org
Subject: Re: [R] Centering data frame by factor
  P1-tapply(P1,Experiment,mean)[Experiment]
Note that the above solution works in this example because Experiment takes the values 1 and 2. If Experiment were coded as, say, 101 and 102, the above would not work. This is a case where converting Experiment to a factor would avoid problems. E.g.,
  RAW <- data.frame(Experiment=c(2,2,2,1,1,1), Group=c("B","A","B","B","A","B"), P1=c(-2,0,2,1,-1,0), P2=c(-4,0,4,-1,0,1))
  RAW$E <- RAW$Experiment + 100 # relabeled Experiment
  with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
   2  2  2  1  1  1
  -2  0  2  1 -1  0
  with(RAW, P1-tapply(P1,E,mean)[E]) # bad
  NA NA NA NA NA NA
  NA NA NA NA NA NA
  RAW$E <- factor(RAW$E) # convert to factor
  with(RAW, P1-tapply(P1,E,mean)[E]) # good
  102 102 102 101 101 101
   -2   0   2   1  -1   0
Another way to approach the problem is to think of your normalized data as the residuals from a linear model:
  residuals(lm(data=RAW, cbind(P1,P2) ~ E))
               P1            P2
  1 -2.00e+00     -4.00e+00
  2  4.385598e-17  8.771196e-17
  3  2.00e+00      4.00e+00
  4  1.00e+00     -1.00e+00
  5 -1.00e+00      8.771196e-17
  6  4.385598e-17  1.00e+00
  zapsmall(.Last.value) # make reading easier
    P1 P2
  1 -2 -4
  2  0  0
  3  2  4
  4  1 -1
  5 -1  0
  6  0  1
That approach can make generalizations to more factors or to smoothing approaches easier.
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
HTH, Daniel
ronny wrote:
Hi, I would like to center P1 and P2 of the following data frame by the factor Experiment, i.e. subtract from each value the average of its experiment, and keep the original data structure, i.e. the experiment and the group of each value.
  RAW = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("A","A","B","A","A","B"), P1=c(10,12,14,5,3,4), P2=c(8,12,16,2,3,4))
Desired result:
  NORMALIZED = data.frame(Experiment=c(2,2,2,1,1,1), Group=c("B","A","B","B","A","B"), P1=c(-2,0,2,1,-1,0), P2=c(-4,0,4,-1,0,1))
I tried using by, but then I lose the original order and the Group variable. Can you help?
  RAW
  Experiment Group P1 P2
           2     A 10  8
           2     A 12 12
           2     B 14 16
           1     A  5  2
           1     A  3  3
           1     B  4  4
  NOT.OK <- within(RAW, {P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})
  NOT.OK
  Experiment Group P1 P2
           2     A  1  8
           2     A -1 12
           2     B  0 16
           1     A -2  2
           1     A  0  3
           1     B  2  4
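The indexing pitfall described above comes down to positional versus name-based subsetting of the tapply() result. A short sketch, using the P1 values from the original question and a relabeled experiment code of 101/102 (the object names `means`, `bad` and `good` are just for illustration):

```r
P1 <- c(10, 12, 14, 5, 3, 4)
E  <- c(102, 102, 102, 101, 101, 101)  # numeric codes, not 1 and 2

means <- tapply(P1, E, mean)   # length 2, names "101" and "102"

# Numeric index: asks for elements number 101 and 102, which do not exist
bad <- P1 - means[E]

# Character index: looks the group means up by name
good <- P1 - means[as.character(E)]
```

Indexing by as.character(E) works whatever the codes are, and gives the same answer as first converting E to a factor.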
Re: [R] grey colored lines and overwriting labels i qqplot2
On 7/18/2011 9:23 PM, Sigrid wrote:
Hi, I apologize for not providing reproducible code more clearly, and I hope this will be more understandable. I have 14 lines (7 per facet) that I would like to add. I will provide you with six of the lines from the data, as that should be enough data to work with and also results in less plotting for all of us. These values are from a previously conducted ancova, so they are not based on simple linear regression.
  Line #  Country  Treatment  Intercept  Slope
  1       Low      A           81.47     47.267
  2       Low      B           31.809    20.234
  3       Low      C           69.892    33.717
  4       High     A           67.024    47.267
  5       High     B           17.357    20.234
  6       High     C          105.107    33.717
Is this (above) a data.frame? If not, can you get it into one? If so, then adding all the lines at once is easy. Let's say that the data.frame is named lines. (Note that I changed the capitalization of country and treatment to match what was in test.)
  lines
    Line # country treatment Intercept  Slope
  1      1     Low         A    81.470 47.267
  2      2     Low         B    31.809 20.234
  3      3     Low         C    69.892 33.717
  4      4    High         A    67.024 47.267
  5      5    High         B    17.357 20.234
  6      6    High         C   105.107 33.717
  dput(lines)
  structure(list(`Line #` = 1:6,
      country = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("High", "Low"), class = "factor"),
      treatment = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A", "B", "C"), class = "factor"),
      Intercept = c(81.47, 31.809, 69.892, 67.024, 17.357, 105.107),
      Slope = c(47.267, 20.234, 33.717, 47.267, 20.234, 33.717)),
      .Names = c("Line #", "country", "treatment", "Intercept", "Slope"),
      class = "data.frame", row.names = c(NA, -6L))
From the help that I got here, I was able to make the plot I wanted.
  ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
    geom_point(aes(shape = treatment)) +
    facet_wrap(~country) +
    scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
                      labels=c('label A','label B','label C','label D', 'label E','label F','label G')) +
    scale_shape_manual(breaks=c('A','B','C','D','E','F','G'),
                       labels=c('label A','label B','label C','label D', 'label E','label F','label G'),
                       values = c(0, 1, 2, 3, 4, 5, 6)) +
    scale_y_continuous("number of votes") +
    scale_x_continuous("Years", breaks=1:4) +
    theme_bw()+
You can just add
  geom_abline(aes(intercept = Intercept, slope = Slope, colour = treatment), data = lines)
This says to use the data from the lines data.frame, plotting a line for each row of the data set. The line will be colored based on the value of the treatment variable (with the mapping defined the same as for the points). The lines will also be faceted according to country (the facet_wrap affects all geoms).
And I added line #1 and #4 using the abline command.
    +geom_abline(intercept = 81.47, slope=47.267, colour = "black", size = 0.5, subset = .(country == 'low'))+
    geom_abline(intercept = 67.024, slope=47.267, colour = "grey", size = 0.5, subset = .(country== 'high'))
How can I make the lines correspond with the descriptions on the right side of the graph more clearly?
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health Science University
Re: [R] Centering data frame by factor
On Jul 19, 2011, at 11:58 AM, William Dunlap wrote:
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Malter
Sent: Tuesday, July 19, 2011 1:51 AM
To: r-help@r-project.org
Subject: Re: [R] Centering data frame by factor
  P1-tapply(P1,Experiment,mean)[Experiment]
Note that the above solution works in this example because Experiment takes the values 1 and 2. If Experiment were coded as, say, 101 and 102, the above would not work. This is a case where converting Experiment to a factor would avoid problems.
I checked to see if my ave solution was subject to the same caveats, and it is not. The help page is less categorical about what the grouping variables' structure should be, saying only that they are typically factors.
E.g.,
  RAW <- data.frame(Experiment=c(2,2,2,1,1,1), Group=c("B","A","B","B","A","B"), P1=c(-2,0,2,1,-1,0), P2=c(-4,0,4,-1,0,1))
  RAW$E <- RAW$Experiment + 100 # relabeled Experiment
  with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
   2  2  2  1  1  1
  -2  0  2  1 -1  0
  with(RAW, P1-tapply(P1,E,mean)[E]) # bad
  NA NA NA NA NA NA
  NA NA NA NA NA NA
  with(RAW, ave(P1, E, FUN=function(x) scale(x, scale=FALSE) ) )
  # [1] -2  0  2  1 -1  0   good
  RAW$E <- factor(RAW$E) # convert to factor
  with(RAW, P1-tapply(P1,E,mean)[E]) # good
  102 102 102 101 101 101
   -2   0   2   1  -1   0
And take note that Bill made his variable a factor outside the tapply environment. If he had just used it in the tapply function (as I often do ... possibly unwisely in light of this gotcha) it would fail:
  with(RAW, P1-tapply(P1, factor(E), mean)[E])
  NA NA NA NA NA NA
  NA NA NA NA NA NA
... that is, unless you also use factor(E) as the index:
  with(RAW, P1-tapply(P1, factor(E), mean)[factor(E)])
  102 102 102 101 101 101
   -2   0   2   1  -1   0
Thanks, Bill. I've learned a lot of R from you.
-- David.
Another way to approach the problem is to think of your normalized data as the residuals from a linear model:

    residuals(lm(data=RAW, cbind(P1,P2) ~ E))
                 P1            P2
    1 -2.000000e+00 -4.000000e+00
    2  4.385598e-17  8.771196e-17
    3  2.000000e+00  4.000000e+00
    4  1.000000e+00 -1.000000e+00
    5 -1.000000e+00  8.771196e-17
    6  4.385598e-17  1.000000e+00
    zapsmall(.Last.value) # make reading easier
      P1 P2
    1 -2 -4
    2  0  0
    3  2  4
    4  1 -1
    5 -1  0
    6  0  1

That approach can make generalizations to more factors or to smoothing approaches easier.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

HTH, Daniel

ronny wrote:

Hi, I would like to center P1 and P2 of the following data frame by the factor Experiment, i.e. subtract from each value the average of its experiment, and keep the original data structure, i.e. the experiment and the group of each value.

    RAW = data.frame(Experiment = c(2,2,2,1,1,1), Group = c("A","A","B","A","A","B"), P1 = c(10,12,14,5,3,4), P2 = c(8,12,16,2,3,4))

Desired result:

    NORMALIZED = data.frame(Experiment = c(2,2,2,1,1,1), Group = c("B","A","B","B","A","B"), P1 = c(-2,0,2,1,-1,0), P2 = c(-4,0,4,-1,0,1))

I tried using by, but then I lose the original order and the Group variable. Can you help?

    RAW
    Experiment Group P1 P2
             2     A 10  8
             2     A 12 12
             2     B 14 16
             1     A  5  2
             1     A  3  3
             1     B  4  4

    NOT.OK <- within(RAW, {P1 <- do.call(rbind, by(RAW$P1, RAW$Experiment, scale, scale=F))})
    NOT.OK
    Experiment Group P1 P2
             2     A  1  8
             2     A -1 12
             2     B  0 16
             1     A -2  2
             1     A  0  3
             1     B  2  4

-- View this message in context: http://r.789695.n4.nabble.com/Centering-data-frame-by-factor-tp3677609p3677620.html
David Winsemius, MD West Hartford, CT
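The centering approaches discussed in this thread can be sketched in base R as follows. This is a minimal illustration using the data from the original question; the helper name `center` and the result name `by_name` are made up for the example and do not come from the thread. Indexing the `tapply()` result by `as.character(Experiment)` is one way to sidestep the numeric-index pitfall described above.

```r
# Data from the original question.
RAW <- data.frame(
  Experiment = c(2, 2, 2, 1, 1, 1),
  Group      = c("A", "A", "B", "A", "A", "B"),
  P1         = c(10, 12, 14, 5, 3, 4),
  P2         = c(8, 12, 16, 2, 3, 4)
)

# Hypothetical helper: subtract the group mean from each value.
# ave() returns the group means in the original row order.
center <- function(x, g) x - ave(x, g)

NORMALIZED <- transform(RAW,
                        P1 = center(P1, Experiment),
                        P2 = center(P2, Experiment))

# Same idea with tapply(), looking the means up by NAME rather than by
# numeric position, so it would also work for codes such as 101 and 102.
by_name <- with(RAW,
                P1 - tapply(P1, Experiment, mean)[as.character(Experiment)])

NORMALIZED$P1  # -2  0  2  1 -1  0
NORMALIZED$P2  # -4  0  4 -1  0  1
```

Because `ave()` never indexes by the grouping values, it behaves the same whether the grouping variable is numeric, character, or a factor.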
Re: [R] comparing SAS and R survival analysis with time-dependent covariates
Terry Therneau-2 wrote:

This query of "why do SAS and S give different answers for Cox models" comes up every so often. The two most common reasons are that
a. they are using different options for the ties
b. the SAS and S data sets are slightly different.
You have both errors. First, make sure I have the same data set by reading a common file, and then compare the results.

tmt54% more sdata.txt
1 0.0  0.5 0 0
1 0.5  3.0 1 1
2 0.0  1.0 0 0
2 1.0  1.5 1 1
3 0.0  6.0 0 0
4 0.0  8.0 0 1
5 0.0  1.0 0 0
5 1.0  8.0 1 0
6 0.0 21.0 0 1
7 0.0  3.0 0 0
7 3.0 11.0 1 1

tmt55% more test.sas
options linesize=80;
data trythis;
  infile 'sdata.txt';
  input id start end delir outcome;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=efron;

tmt56% more test.r
trythis <- read.table('sdata.txt', col.names=c("id", "start", "end", "delir", "outcome"))
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

I now get comparable answers. Note that Cox's exact partial likelihood is the correct form to use for discrete time data. I labeled this as the 'exact' method and SAS as the 'discrete' method. The exact marginal likelihood of Prentice et al, which SAS calls the 'exact' method, is not implemented in S. As to which package is more reliable, I can only point to a set of formal test cases that are found in Appendix E of the book by Therneau and Grambsch. [...]

I am estimating regression parameters in the Cox model for recurrent event data with time-dependent covariates. As my data sets contain a lot of ties, I use the discrete method of SAS (PHREG procedure) and the exact option in R (coxph function of the survival package). Despite the high computation time (up to 45s), I always get estimates without any error or warning message with the PHREG procedure.
On the other hand, when I use R (latest version 2.13.11 on 32 or 64 bits), I sometimes get estimates that differ from those obtained with SAS, and I get various warnings. At other times I don't get any result at all: R freezes and does not respond. In order to understand this, I have run some tests based on your examples. It turns out that the problems appear when the proportion of ties becomes large:

Test1

With R:

(trythis <- read.table('***\\sdata4.txt',
+ col.names=c("id", "start", "end", "delir", "outcome")))
   id start end delir outcome
1   1   0.5 3.0     1       1
2   1   0.5 3.0     1       1
3   1   0.5 3.0     1       1
4   1   0.5 3.0     1       1
5   1   0.5 3.0     1       1
6   2   1.0 1.5     1       1
7   2   1.0 1.5     1       1
8   2   1.0 1.5     1       1
9   2   1.0 1.5     1       1
10  2   1.0 1.5     1       1
11  2   1.0 1.5     1       1
12  2   1.0 1.5     1       1
13  2   1.0 1.5     1       1
14  4   0.0 8.0     0       1
15  4   0.0 8.0     0       1
16  4   0.0 8.0     0       1
17  4   0.0 8.0     0       1
18  5   0.0 1.0     0       0
19  5   0.0 1.0     0       0
20  5   0.0 1.0     0       0
21  5   0.0 1.0     0       0

coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact')
Call:
coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, method = "exact")

      coef exp(coef) se(coef)       z p
delir 22.5  6.06e+09    15460 0.00146 1

Likelihood ratio test=15.6 on 1 df, p=8.04e-05 n= 21, number of events= 17
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights, :
  Ran out of iterations and did not converge

With SAS:

data trythis;
input id start end delir outcome;
datalines;
1 0.5 3.0 1 1
1 0.5 3.0 1 1
1 0.5 3.0 1 1
1 0.5 3.0 1 1
1 0.5 3.0 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
2 1.0 1.5 1 1
4 0.0 8.0 0 1
4 0.0 8.0 0 1
4 0.0 8.0 0 1
4 0.0 8.0 0 1
5 0.0 1.0 0 0
5 0.0 1.0 0 0
5 0.0 1.0 0 0
5 0.0 1.0 0 0
run;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
RUN;

No error message. Results:
estimate delir: 20.52466
se: 5689
Pr > ChiSq: 0.9971
convergence status: Convergence criterion (GCONV=1E-8) satisfied.
Test2 With R : (trythis - read.table('***\\montest.txt', + col.names=c(id, start, end, delir,
Re: [R] barplot question
In my first post is example data. -- View this message in context: http://r.789695.n4.nabble.com/barplot-question-tp3670861p3678402.html
Re: [R] why I could not reproduce the Mandelbrot plot demonstrated on R wiki
I had the same problem with the code from Wikipedia on 64-bit Windows 7. The GIF comes out fine if I use the executable located in bin\x64 instead of the one in bin, which produces the ugly GIF. E.g., in a cmd window:

    the_R_path\bin\x64\Rscript.exe example.r

I found the solution by using the --verbose flag. Sorry about my English.

-- View this message in context: http://r.789695.n4.nabble.com/why-I-could-not-reproduce-the-Mandelbrot-plot-demonstrated-on-R-wiki-tp2591429p3678522.html
Re: [R] Plotting intraday data in quantmod
I'm using this to plot the data successfully, but I am unable to figure out how to get the H:M:S timestamps included with their respective dates. Any suggestions?

    xts(Dataset[,-1], as.Date(Dataset[,1], "%Y-%m-%d"))

-- View this message in context: http://r.789695.n4.nabble.com/Plotting-intraday-data-in-quantmod-tp3677268p3678573.html
[R] How to get predicted values of y for different x values?
Here is my model with interaction terms and control variables (I changed the variable names for easier reading):

    reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6, x2 from 0 to 5, and x3 from 0 to 4. All three are discrete ordinal variables, but I will treat them as continuous variables.

(a) How can I see the predicted values of y for each of these scenarios (210 scenarios, I guess)?
(b) How can I see the predicted value of y for the minimum and maximum values of x1, x2, and x3 (8 scenarios)?
(c) How can I see the predicted value of y for x1=6, x2=5, and x3=4 (1 scenario)?

-- View this message in context: http://r.789695.n4.nabble.com/How-to-get-predicted-values-of-y-for-different-x-values-tp3678662p3678662.html
[R] Stacked Bar Plot in ggplot2
I'm trying to develop a stacked bar plot in R with ggplot2. My data:

    conv = c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33)
    click = c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30)
    date = c("July 7", "July 8", "July 9", "July 10", "July 11", "July 12", "July 13", "July 14", "July 15", "July 16", "July 17")
    dat <- data.frame(date=c(date), click=c(click), conv=c(conv), stringsAsFactors = FALSE)
    dat

I'm trying to create a stacked bar plot with the values for clicks in the background and the values for conversions in the foreground. I tried the following, but because the values aren't factors, it doesn't produce the right result.

    p3 = ggplot(dat, aes(as.character(date))) + geom_bar(aes(fill=as.factor(conv))) + ylim(c(0,70)) + geom_bar(aes(fill = conv), position = 'fill')
    p3

Help!
Re: [R] Plotting intraday data in quantmod
On Tue, Jul 19, 2011 at 11:31 AM, kev946 klee...@gmail.com wrote:

I'm using this to plot the data with success, but am unable to figure out how to get the H:M:S timestamp included with their respective Dates. Any suggestions?

    xts(Dataset[,-1], as.Date(Dataset[,1], "%Y-%m-%d"))

This is not a plot command. What plot command are you using? Dates don't have H:M:S; they're all zero. You don't provide a sample of Dataset, so we don't know what your data look like.

Best,
-- Joshua Ulrich | FOSS Trading: www.fosstrading.com
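The point that Dates carry no time of day can be illustrated with base R alone. The sketch below uses two made-up timestamps (the original Dataset was never posted, so the strings and the "UTC" time zone are assumptions); it shows that `as.Date()` discards the clock time while `as.POSIXct()` keeps it.

```r
# Hypothetical timestamps standing in for the first column of Dataset.
stamps <- c("2011-07-19 09:30:00", "2011-07-19 09:31:00")

# as.Date() parses only the date part; the time of day is discarded.
d <- as.Date(stamps, "%Y-%m-%d")

# as.POSIXct() keeps the full date-time (time zone assumed to be UTC here).
p <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

format(d, "%H:%M:%S")  # "00:00:00" "00:00:00"
format(p, "%H:%M:%S")  # "09:30:00" "09:31:00"

# With intraday data, an index like `p` (rather than `d`) would be the
# natural candidate to pass as the time index when building the xts object.
```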
Re: [R] barplot question
People who participate in this list via email are unlikely to have your example data, or even to have any idea what you are currently referring to. Please leave enough of the previous messages in your reply to the list to provide context, and include all necessary information. Don't assume that everyone has all previous messages in a thread immediately to hand.

Sarah

On Tue, Jul 19, 2011 at 11:24 AM, Sally_roman sro...@umassd.edu wrote:

In my first post is example data. -- View this message in context: http://r.789695.n4.nabble.com/barplot-question-tp3670861p3678402.html Sent from the R help mailing list archive at Nabble.com.

Nabble.com is not the main way in which people participate in this list.

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] How to get predicted values of y for different x values?
?predict

Use data.frame() to generate input vectors (newdata) for which you want predicted values.

---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

halptekin halpte...@gmail.com wrote:

Here is my model with interaction terms and control variables (I changed the variable names for easier reading):

    reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6, x2 from 0 to 5, and x3 from 0 to 4. All three are discrete ordinal variables, but I will treat them as continuous variables.

(a) How can I see the predicted values of y for each of these scenarios (210 scenarios, I guess)?
(b) How can I see the predicted value of y for the minimum and maximum values of x1, x2, and x3 (8 scenarios)?
(c) How can I see the predicted value of y for x1=6, x2=5, and x3=4 (1 scenario)?

-- View this message in context: http://r.789695.n4.nabble.com/How-to-get-predicted-values-of-y-for-different-x-values-tp3678662p3678662.html
Re: [R] How to get predicted values of y for different x values?
On Jul 19, 2011, at 2:36 PM, Jeff Newmiller wrote:

?predict

Use data.frame() to generate input vectors (newdata) for which you want predicted values.

The OP probably needs to use expand.grid to generate the spanning combinations of x values. He will in addition need to include values for the control variables.

    dfrm <- expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4)
    dfrm$control1 <- some value of same class as original control1 data
    # repeat x 2

-- David.

---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

halptekin halpte...@gmail.com wrote:

Here is my model with interaction terms and control variables (I changed the variable names for easier reading):

    reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6, x2 from 0 to 5, and x3 from 0 to 4. All three are discrete ordinal variables, but I will treat them as continuous variables.

(a) How can I see the predicted values of y for each of these scenarios (210 scenarios, I guess)?
(b) How can I see the predicted value of y for the minimum and maximum values of x1, x2, and x3 (8 scenarios)?
(c) How can I see the predicted value of y for x1=6, x2=5, and x3=4 (1 scenario)?

-- View this message in context: http://r.789695.n4.nabble.com/How-to-get-predicted-values-of-y-for-different-x-values-tp3678662p3678662.html
David Winsemius, MD West Hartford, CT
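The expand.grid/predict suggestion can be sketched end to end as follows. The original data were not posted, so the data frame `dat` below is simulated purely for illustration, and holding the controls at their sample means is an assumption (one common choice, not something the thread prescribes). The helper `add_controls` is likewise a made-up name.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the poster's data (assumption: not the real data).
dat <- data.frame(
  x1 = sample(0:6, n, replace = TRUE),
  x2 = sample(0:5, n, replace = TRUE),
  x3 = sample(0:4, n, replace = TRUE),
  control1 = rnorm(n), control2 = rnorm(n), control3 = rnorm(n)
)
dat$y <- with(dat, 1 + x1 + 0.5 * x2 * x3 + control1 + rnorm(n))

reg1 <- lm(y ~ x1 * x2 * x3 + control1 + control2 + control3, data = dat)

# Hypothetical helper: fix each control at its sample mean.
add_controls <- function(grid) {
  for (v in c("control1", "control2", "control3")) grid[[v]] <- mean(dat[[v]])
  grid
}

# (a) every combination of x1, x2, x3: 7 * 6 * 5 = 210 rows
grid_all <- add_controls(expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4))
grid_all$pred <- predict(reg1, newdata = grid_all)

# (b) only the minimum and maximum of each variable: 2^3 = 8 rows
grid_ext <- add_controls(expand.grid(x1 = c(0, 6), x2 = c(0, 5), x3 = c(0, 4)))
grid_ext$pred <- predict(reg1, newdata = grid_ext)

# (c) the single scenario x1 = 6, x2 = 5, x3 = 4
one <- add_controls(data.frame(x1 = 6, x2 = 5, x3 = 4))
pred_one <- predict(reg1, newdata = one)
```

The column names in `newdata` must match the variable names used in the model formula; otherwise `predict()` cannot find them.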
[R] calculating the mean of a random matrix (by row) and some general questions
Hi everyone! I'm trying to teach myself R in order to do some data analysis. I'm a mathematics student and (only) familiar with MATLAB and LaTeX. I'm working through the official introduction to R at the moment, while simultaneously solving some exercises I found on the web.

Before I post my (probably stupid) question, I'd like to ask you for some general advice. How do you work with R? Is it like in MATLAB, where you write your functions with a lot of loops etc. in a text file and then run it? Or do you just prepare your data and then use the functions provided by R (plot, mean etc.) to get some analysis? I'd be very thankful for some of your thoughts about approaches.

Now the question: I'm trying to build a vector with n entries, each consisting of the mean of m random numbers (exponentially distributed, for example). My approach was to construct an n x m random matrix and then to somehow take the mean of each row. But the mean function has no parameter to do this, so the intended approach in R is probably different. Any ideas? =)

Richard

-- View this message in context: http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3678964.html
[R] negative binomial regression with spatial weights matrix (not locations)
Hello, I have some data that exhibits a negative binomial distribution and also spatial structure. I would like to create a model that accounts for both. However, instead of locations, I have a distance matrix (cost matrix) describing the spatial relationships among the locations. I have tried using various methods (spBayes, geoRglm, geoR), but none incorporate both a negative binomial distribution and a distance matrix. Many methods use the geodata object, which uses coordinates. Please provide suggestions as to what approach I can use to analyze this data. Thank you!! Christy M. -- View this message in context: http://r.789695.n4.nabble.com/negative-binomial-regression-with-spatial-weights-matrix-not-locations-tp3679071p3679071.html
Re: [R] Stacked Bar Plot in ggplot2
Abraham Mathew abraham at thisorthat.com writes:

I'm trying to develop a stacked bar plot in R with ggplot2. My data:

    conv = c(10, 4.76, 17.14, 25, 26.47, 37.5, 20.83, 25.53, 32.5, 16.7, 27.33)
    click = c(20, 42, 35, 28, 34, 48, 48, 47, 40, 30, 30)
    date = c("July 7", "July 8", "July 9", "July 10", "July 11", "July 12", "July 13", "July 14", "July 15", "July 16", "July 17")
    dat <- data.frame(date=c(date), click=c(click), conv=c(conv), stringsAsFactors = FALSE)

Is:

    ggplot(dat,aes(x=date))+geom_bar(aes(y=click),fill='red')+geom_bar(aes(y=conv),fill='blue')

what you're looking for? Or:

    dat.melt <- melt(dat,'date')
    ggplot(dat.melt,aes(x=date,y=value,fill=variable))+geom_bar()

Justin
[R] notation question
Dear list, I am currently writing up some of my R models in a more formal sense for a paper, and I am having trouble with the notation. Although this isn't really an 'R' question, it should help me to understand a bit better what I am actually doing when fitting my models! Using the analysis of covariance example from MASS (fourth edition, p. 142), what is the correct notation for the formula Gas ~ Insul/Temp - 1? Obviously, if we fit it as two separate models (as in the example above it), we would have something like y_i = \beta x_i for each of the two models. So my question is: when we have a single model with a k-level factor interaction term as in the formula above, what is the correct/standard statistical (LaTeX style) notation? Cheers, Carson
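One common way to write such a model is sketched below, offered as a suggestion rather than as the notation MASS itself uses. In R's formula language, Insul/Temp - 1 expands to Insul + Insul:Temp - 1, i.e. a separate intercept and a separate Temp slope for each level of Insul, fitted jointly with a single error variance. The symbols \alpha_i, \beta_i, and \varepsilon_{ij} are chosen here for illustration.

```latex
% i indexes the k levels of the factor (here Insul: Before, After),
% j indexes the observations within level i.
\begin{equation}
  y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2),
  \qquad i = 1, \dots, k, \quad j = 1, \dots, n_i ,
\end{equation}
% where y is Gas, x is Temp, \alpha_i is the intercept for level i,
% and \beta_i is the Temp slope for level i.
```

Compared with fitting k separate regressions, the coefficient estimates are the same, but the pooled model assumes a common \sigma^2 across levels.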
Re: [R] calculating the mean of a random matrix (by row) and some general questions
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of RichardLang
Sent: Tuesday, July 19, 2011 11:44 AM
To: r-help@r-project.org
Subject: [R] calculating the mean of a random matrix (by row) and some general questions

Hi everyone! I'm trying to teach myself R in order to do some data analysis. I'm a mathematics student and (only) familiar with MATLAB and LaTeX. I'm working through the official introduction to R at the moment, while simultaneously solving some exercises I found on the web.

Before I post my (probably stupid) question, I'd like to ask you for some general advice. How do you work with R? Is it like in MATLAB, where you write your functions with a lot of loops etc. in a text file and then run it? Or do you just prepare your data and then use the functions provided by R (plot, mean etc.) to get some analysis? I'd be very thankful for some of your thoughts about approaches.

Now the question: I'm trying to build a vector with n entries, each consisting of the mean of m random numbers (exponentially distributed, for example). My approach was to construct an n x m random matrix and then to somehow take the mean of each row. But the mean function has no parameter to do this, so the intended approach in R is probably different. Any ideas? =)

Richard

Richard,

If you have a matrix, M, with n rows and m columns, you can use the apply() function to get either row or column means:

    n <- 10
    m <- 3
    M <- matrix(rnorm(m*n),n,m)
    M
                [,1]        [,2]       [,3]
     [1,]  0.6239267 -0.70546496  0.3682918
     [2,] -0.7326689 -1.86571052 -0.2899552
     [3,]  0.7778313 -1.01227191  0.7735718
     [4,]  0.8336683 -0.07755214 -0.1375798
     [5,] -1.6134414  0.12088648 -0.4064939
     [6,] -0.2578007  0.45142456 -1.0197297
     [7,]  1.0108260 -0.24933408 -0.4083304
     [8,] -0.7936603 -0.67286769 -0.8666802
     [9,]  1.0054039  2.52498995  1.0915742
    [10,] -0.1610073  0.43504924  2.4288474
    rowMeans <- apply(M,1,mean)
    rowMeans
     [1]  0.09558452 -0.96277820  0.17971042  0.20617876 -0.63301628 -0.27536860
     [7]  0.11772050 -0.77773605  1.54065601  0.90096312
    colMeans <- apply(M,2,mean)
    colMeans
    [1]  0.06930777 -0.10508511  0.15335160

I will let others describe how they use R.

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
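A small sketch of the apply() approach applied to the question as asked, i.e. with exponentially distributed entries. The seed and the rate of 1 are arbitrary choices made for the example, and the results are stored under names that do not mask the built-in rowMeans() and colMeans() functions.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
n <- 10        # number of rows (means to compute)
m <- 3         # numbers averaged per row

# n x m matrix of exponential random numbers (rate = 1 is an assumption).
M <- matrix(rexp(n * m, rate = 1), nrow = n, ncol = m)

# MARGIN = 1 applies the function over rows, MARGIN = 2 over columns.
row_means <- apply(M, 1, mean)
col_means <- apply(M, 2, mean)

# The dedicated functions give the same values.
all.equal(row_means, rowMeans(M))  # TRUE
all.equal(col_means, colMeans(M))  # TRUE
```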
Re: [R] calculating the mean of a random matrix (by row) and some general questions
On Jul 19, 2011, at 2:43 PM, RichardLang wrote:

Hi everyone! I'm trying to teach myself R in order to do some data analysis. I'm a mathematics student and (only) familiar with MATLAB and LaTeX. I'm working through the official introduction to R at the moment, while simultaneously solving some exercises I found on the web. Before I post my (probably stupid) question, I'd like to ask you for some general advice. How do you work with R? Is it like in MATLAB, where you write your functions with a lot of loops etc. in a text file and then run it? Or do you just prepare your data and then use the functions provided by R (plot, mean etc.) to get some analysis? I'd be very thankful for some of your thoughts about approaches.

In R one generally tries to avoid loops. There are many functions that accept vector and matrix arguments and return similarly structured results. My impression was that this was also true in MATLAB and that many of the loop-less constructs (the R term is "vectorized") were very similar. Matrix math and all that jazz.

Now the question: I'm trying to build a vector with n entries, each consisting of the mean of m random numbers (exponentially distributed, for example). My approach was to construct an n x m random matrix and then to somehow take the mean of each row. But the mean function has no parameter to do this, so the intended approach in R is probably different. Any ideas? =)

Generally, when performing a repeated task involving random simulation, one turns to the replicate function. To generate 5 means of random, exponentially distributed vectors of length 10:

    replicate(5, mean( rexp(10, rate= 1.5) ) )
    [1] 1.0786732 0.7196179 0.7179612 0.5024858 0.4624592

-- David.

Richard

-- View this message in context: http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3678964.html
David Winsemius, MD West Hartford, CT
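The replicate() idiom generalizes directly to the n and m of the original question. In the sketch below the seed is arbitrary and the values of n, m, and the rate are illustrative choices only.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
n <- 5         # how many means to produce
m <- 10        # how many exponential draws go into each mean

# replicate() evaluates the expression n times and collects the results
# into a vector, so no explicit loop or matrix is needed.
means <- replicate(n, mean(rexp(m, rate = 1.5)))

length(means)  # 5
# Each entry estimates the theoretical mean 1 / rate = 1 / 1.5.
```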
[R] Questions about DCC-GARCH Model
Dear list members,

I'm trying to use a DCC-GARCH model to estimate correlations. I have downloaded the ccgarch package but can't understand one of the arguments in this usage from its R help page:

    dcc.estimation(inia, iniA, iniB, ini.dcc, dvar, model, method="BFGS", gradient=1, message=1)

I understand the others, except ini.dcc, which is described as "a vector of initial values for the DCC parameters (2 x 1)". Does anyone know what that means? In the example it is set to c(0.01, 0.98), but I have no idea where the numbers come from.

Another question is how to estimate DCC with a known structural break. Let's say we already know the break point A and set up a restricted model, use this model to estimate DCC, and then do a likelihood ratio test against the unrestricted model (no structural break) to see if the break is significant. But I have no idea how to set up the restricted model. I know that in the restricted model I have to use different rho (rho(1~A) and rho(A~n) instead of rho(1~n)), but how do I specify this difference in R? I can't work it out by myself. Does anyone have any idea? I appreciate it! Thank you in advance!

Zoe

-- View this message in context: http://r.789695.n4.nabble.com/Questions-about-DCC-GARCH-Model-tp3679130p3679130.html
Re: [R] calculating the mean of a random matrix (by row) and some general questions
Thanks for your advice! I found another method to solve my problem (on page 20 of the manual =) ). Here's my code:

    # Set n and m to whatever you want
    n = 25
    m = 10
    # Build a random vector (here with exponential distribution) of length n*m
    x = rexp(n*m,1)
    # Set up a factor in order to group x into n pieces, each of size m (here ten)
    y = factor( rep(1:n, each = m) )
    # tapply will apply mean to each group
    result = tapply(x,y,mean)

-- View this message in context: http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3679186.html
Re: [R] calculating the mean of a random matrix (by row) and some general questions
On Jul 19, 2011, at 4:03 PM, RichardLang wrote:

Thanks for your advice! I found another method to solve my problem (on page 20 of the manual =) ). Here's my code

On r-help, most of the readers and most of the regular responders are reading this as a mailing list posting, not on Nabble, and it looks like you are responding to yourself. I remember what I wrote, but the rest of the readers probably don't, and no one can tell if you are responding to me or to Nordlund. You are asked in the Posting Guide to provide context, and it's not that hard in Nabble.

    # Set n and m to whatever you want
    n = 25
    m = 10
    # Build a random vector (here with exponential distribution) with length of n*m
    x = rexp(n*m,1)
    # Set up a factor in order to group x into pieces each of size ten
    y = factor( rep(1:n, each = m) )
    # tapply will apply mean to each group..
    result = tapply(x,y,mean)

There are many ways to skin that cat. Try this:

    rowMeans( matrix( rexp(n*m,1), ncol=10) )

This will be much faster than Nordlund's answer on bigger tasks, because the row/colMeans and row/colSums functions are tweaked to be fast, whereas there is a lot of overhead with the apply functions.

-- David Winsemius, MD West Hartford, CT
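To see how the tapply() and rowMeans() solutions relate, the sketch below computes both from the same random vector. One detail is an addition not stated in the thread: matrix() fills column by column by default, so `byrow = TRUE` is needed for each row to hold the same m consecutive values that the factor groups together. Without it, the means are equally valid as a simulation but are taken over different subsets. The seed and the names `by_tapply` and `by_rowMeans` are arbitrary.

```r
set.seed(2011)  # arbitrary seed, only for reproducibility
n <- 25
m <- 10
x <- rexp(n * m, rate = 1)

# Grouping factor: the first m values form group 1, the next m group 2, ...
g <- factor(rep(1:n, each = m))
by_tapply <- tapply(x, g, mean)

# Same groups laid out as rows: fill the matrix row by row.
by_rowMeans <- rowMeans(matrix(x, nrow = n, byrow = TRUE))

# tapply() returns a named 1-d array; drop the names to compare values.
all.equal(as.vector(by_tapply), by_rowMeans)  # TRUE
```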
Re: [R] calculating the mean of a random matrix (by row) and some general questions
Hi Richard,

As others have said, try to use the apply functions rather than loops. There is also an apply function for lists; see ?lapply. This is much more efficient.

I also like writing my own functions. For example:

    f <- function(x) { x^2 }

which can then be used by:

    f(2)
    [1] 4

This is very useful if you're getting into maximum likelihood programming, or want to use the optim function (for multivariate functions) or optimize (for univariate functions).

Lastly, check out the R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Regards,
Peter

On Tue, Jul 19, 2011 at 12:43, RichardLang l...@zedat.fu-berlin.de wrote:

Hi everyone! I'm trying to teach myself R in order to do some data analysis. I'm a mathematics student and (only) familiar with MATLAB and LaTeX. I'm working through the official introduction to R at the moment, while simultaneously solving some exercises I found on the web. Before I post my (probably stupid) question, I'd like to ask you for some general advice. How do you work with R? Is it like in MATLAB, where you write your functions with a lot of loops etc. in a text file and then run it? Or do you just prepare your data and then use the functions provided by R (plot, mean etc.) to get some analysis? I'd be very thankful for some of your thoughts about approaches.

Now the question: I'm trying to build a vector with n entries, each consisting of the mean of m random numbers (exponentially distributed, for example). My approach was to construct an n x m random matrix and then to somehow take the mean of each row. But the mean function has no parameter to do this, so the intended approach in R is probably different. Any ideas? =)

Richard

-- View this message in context: http://r.789695.n4.nabble.com/calculating-the-mean-of-a-random-matrix-by-row-and-some-general-questions-tp3678964p3678964.html
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
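For the row-mean question in this thread, a minimal base-R sketch follows. The values of n and m, the seed, and the rate-1 exponential are arbitrary choices made for illustration; they are not taken from the thread.

```r
# Build an n x m matrix of exponential draws and take the mean of each row.
set.seed(42)                      # arbitrary seed, only for reproducibility
n <- 5                            # number of means wanted (illustrative)
m <- 1000                         # draws per mean (illustrative)
x <- matrix(rexp(n * m, rate = 1), nrow = n, ncol = m)

row_means <- rowMeans(x)          # dedicated function for row means
row_means_apply <- apply(x, 1, mean)  # same result via apply over rows

row_means
```

rowMeans() and apply(x, 1, mean) should agree up to floating-point rounding; replicate(n, mean(rexp(m))) is a third way to get a vector of the same kind without building the matrix.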
Re: [R] R on a server (Windows Server 2008)
Thanks a lot, Roger! Dimitri On Tue, Jul 19, 2011 at 11:00 AM, Bos, Roger roger@rothschild.com wrote: Yes. I have it running on a win server 2008 machine with no problems. -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dimitri Liakhovitski Sent: Monday, July 18, 2011 7:19 PM To: r-help Subject: [R] R on a server (Windows Server 2008) Apologies for a naive question: Can R be installed and run on a server (operating system Windows Server 2008)? Thank you! -- Dimitri Liakhovitski marketfusionanalytics.com
[R] calculating mean excluding zeros
Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com
[R] timeDate with month designated by three letters.
Dear R Experts:

I am trying to convert a date and time character field to timeDate where the month is presented as three letters, such as JUN for June, etc. This is an example of the full character field: 04-MAY-11 1428

What is the proper format syntax? I've tried

timeDate("04-MAY-11 1428", format="%d-%m-%y %H%M")

but only get

GMT
[1] [NA]

If I change the month to a number as below, then it works, but that would require recoding of the data field.

timeDate("04-05-11 1428", format="%d-%m-%y %H%M")

gives

GMT
[1] [2011-05-04 14:28:00]

which is correct. How do I get R to recognize the month as a 3 letter designator? Any recommendations you can provide would be greatly appreciated.

Michael
Re: [R] calculating mean excluding zeros
You can do it by subsetting or indexing:

r <- c(0,0,0,rnorm(10,10,5))
mean(r)
[1] 8.052215
mean(r[r!=0])
[1] 10.46788

Weidong Gu

On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com
Re: [R] calculating mean excluding zeros
In the more general case, that approach is prone to machine precision error (FAQ 7.31). Here's a clunky but safer alternative:

set.seed(1234)
testvec <- sample(0:10, 100, replace=TRUE)
mean(testvec)
[1] 4.31
mean(testvec[testvec != 0])
[1] 4.842697
mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
[1] 4.842697

(Is there an elementwise equivalent to all.equal() that I'm missing?)

Sarah

On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:

You can do it by subsetting or indexing:

r <- c(0,0,0,rnorm(10,10,5))
mean(r)
[1] 8.052215
mean(r[r!=0])
[1] 10.46788

Weidong Gu

On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com

-- Sarah Goslee http://www.functionaldiversity.org
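One vectorized way to express the "safer" comparison above is an explicit tolerance on the absolute value. This is only a sketch: the helper name mean_nonzero and the default tolerance sqrt(.Machine$double.eps) are assumptions chosen for illustration, and the right tolerance depends on the scale of the data.

```r
# Mean of the elements whose absolute value exceeds a tolerance.
# mean_nonzero and its default tol are illustrative, not from the thread.
mean_nonzero <- function(x, tol = sqrt(.Machine$double.eps)) {
  keep <- abs(x) > tol            # elementwise "not (nearly) zero"
  mean(x[keep])
}

x <- c(0, 0, 1e-17, 2, 4, 6)      # 1e-17 stands in for rounding noise

mean(x[x != 0])                   # exact test keeps 1e-17: four values averaged
mean_nonzero(x)                   # tolerance test drops it: mean of 2, 4, 6
```

When the zeros are exact, as with integer data, the two calls give the same answer.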
[R] hold position of vertices constant in network {statnet}?
I am a novice with network functions! I have been exploring the network function in the statnet package, but haven't been able to figure out how to hold vertices in position while varying edge features. Can anyone advise on whether this is possible, and if so, how to do it? Thanks! -- Matthew Bakker, Ph.D. Department of Plant Pathology University of Minnesota 495 Borlaug Hall 1991 Upper Buford Circle Saint Paul, MN 55108 USA
Re: [R] How to get predicted values of y for different x values?
Please read the posting guide (requires a self-contained example of code) and consult the help pages before posting. If you type ?predict.lm the help page clearly states that the argument 'newdata' takes "[a]n optional data frame in which to look for variables with which to predict..."

x1 <- rnorm(100)
x2 <- rnorm(100)
e <- rnorm(100)
y <- x1+x2+x1*x2+e
reg <- lm(y~x1*x2)
summary(reg)
predict(reg, newdata=data.frame(x1=2,x2=2))

HTH, Daniel

halptekin wrote:

Here is my model with interaction terms and control variables (I changed variable names for easy reading):

reg1 <- lm(y ~ x1*x2*x3 + control1 + control2 + control3)

x1 ranges from 0 to 6; x2 from 0 to 5; and x3 from 0 to 4. All three are discrete ordinal variables, but I will treat them as continuous variables.

(a) How can I see the predicted values of y for each of these scenarios (210 y values I guess)?
(b) How can I see the predicted value of y for the minimum and maximum values of x1, x2, and x3 (8 y values)?
(c) How can I see the predicted value of y for x1=6; x2=5; and x3=4 (only one y value)?
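Building on the newdata advice above, expand.grid() can generate every combination asked about in (a) to (c). The sketch below uses simulated data and leaves out the control variables, both of which are assumptions made to keep it self-contained; with controls in the model, newdata would also need a column for each of them (for example, fixed at its mean).

```r
# Simulated stand-in for the poster's data; coefficients are arbitrary.
set.seed(1)
d <- data.frame(x1 = sample(0:6, 200, replace = TRUE),
                x2 = sample(0:5, 200, replace = TRUE),
                x3 = sample(0:4, 200, replace = TRUE))
d$y <- with(d, x1 + x2 + x3 + 0.5 * x1 * x2 + rnorm(200))

fit <- lm(y ~ x1 * x2 * x3, data = d)

# (a) all 7 * 6 * 5 = 210 combinations
grid_all <- expand.grid(x1 = 0:6, x2 = 0:5, x3 = 0:4)
grid_all$pred <- predict(fit, newdata = grid_all)

# (b) the 2 * 2 * 2 = 8 combinations of minimum and maximum values
grid_ext <- expand.grid(x1 = c(0, 6), x2 = c(0, 5), x3 = c(0, 4))
pred_ext <- predict(fit, newdata = grid_ext)

# (c) a single scenario
pred_one <- predict(fit, newdata = data.frame(x1 = 6, x2 = 5, x3 = 4))
```

The column names in newdata must match the variable names used in the model formula.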
Re: [R] timeDate with month designated by three letters.
Tena koe Michael

The help file for strptime suggests you should be using %b (three letter month) rather than %m (decimal number month).

HTH

Peter Alspach

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of mdkz...@aol.com
Sent: Wednesday, 20 July 2011 8:39 a.m.
To: R-help@r-project.org
Subject: [R] timeDate with month designated by three letters.

Dear R Experts:

I am trying to convert a date and time character field to timeDate where the month is presented as three letters, such as JUN for June, etc. This is an example of the full character field: 04-MAY-11 1428

What is the proper format syntax? I've tried

timeDate("04-MAY-11 1428", format="%d-%m-%y %H%M")

but only get

GMT
[1] [NA]

If I change the month to a number as below, then it works, but that would require recoding of the data field.

timeDate("04-05-11 1428", format="%d-%m-%y %H%M")

gives

GMT
[1] [2011-05-04 14:28:00]

which is correct. How do I get R to recognize the month as a 3 letter designator? Any recommendations you can provide would be greatly appreciated.

Michael
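The %b suggestion can be tried in base R with strptime(), as sketched below. Month-name matching depends on the LC_TIME locale, so the sketch switches to the "C" locale for the duration of the call; that switch, and the choice of tz = "GMT", are assumptions made so the example behaves the same on any machine.

```r
# %b matches an abbreviated month name; names are locale-dependent.
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))   # English month names

stamp <- strptime("04-MAY-11 1428", format = "%d-%b-%y %H%M", tz = "GMT")
formatted <- format(stamp, "%Y-%m-%d %H:%M:%S")
formatted

invisible(Sys.setlocale("LC_TIME", old_locale))  # restore the user's locale
```

The same format string is what the thread proposes passing to timeDate(); that call is not run here because it needs the timeDate package.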
Re: [R] calculating mean excluding zeros
Thanks a lot, Sarah. I assume, if the values against which I am comparing are REALLY zero (0) - then even the first one (mean(testvec[testvec != 0])) should work, right?

Dimitri

On Tue, Jul 19, 2011 at 4:56 PM, Sarah Goslee sarah.gos...@gmail.com wrote:

In the more general case, that approach is prone to machine precision error (FAQ 7.31). Here's a clunky but safer alternative:

set.seed(1234)
testvec <- sample(0:10, 100, replace=TRUE)
mean(testvec)
[1] 4.31
mean(testvec[testvec != 0])
[1] 4.842697
mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
[1] 4.842697

(Is there an elementwise equivalent to all.equal() that I'm missing?)

Sarah

On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:

You can do it by subsetting or indexing:

r <- c(0,0,0,rnorm(10,10,5))
mean(r)
[1] 8.052215
mean(r[r!=0])
[1] 10.46788

Weidong Gu

On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com

-- Sarah Goslee http://www.functionaldiversity.org

-- Dimitri Liakhovitski marketfusionanalytics.com
Re: [R] loops and simulation
I dare the conjecture that if you had written the code, you would know how to do this. This suggests that you are asking us to do your homework, which is not the purpose of this list. Simply wrapping the code in a for or while loop, and storing the estimated parameters under the index of the current iteration, should be no problem if you have programmed the code you provided.

Best, Daniel

Majdi wrote:

Hi

My code generates data using progressive censoring, then the program uses this data to estimate the parameters (theta.t and beta.t) using the EM algorithm.

My question is that I have to do a simulation, that is, I have to run my code 10,000 times; each time I have to store the output, and at the end find the average of theta.t and beta.t over these 10,000 simulations.

Is this possible at all, OR should I go back to SAS and try to do it over there? I learned that R doesn't like loops.

## Lomax distribution
## Use EM algorithm to estimate parameters of Lomax based on progressive censored data
n=20; m=5; R <- c(0,0,0,0,15)
theta=0.4986; beta=3

## generate data using progressive censoring algorithm
W <- matrix((runif(m,0,1)),m,1)
E <- matrix(NA,m,1); V <- matrix(NA,m,1); U <- matrix(NA,m,1)
i <- 1
while (i <= m) {
  E[i] <- 1/( i + sum(R[(m+1-i):m]))
  i <- i+1
}
V <- W^E
i <- 1
while (i <= m) {
  U[i] <- (1 - prod( V[(m+1-i):m]))
  i <- i+1
}
x <- ( (1-U)^(-1/theta)-1 )/beta
## simulation ends

Yobs <- x

em.lomax <- function(Y){
  r <- length(Yobs)
  ## initial value
  theta.t=theta
  beta.t=beta
  # Define log-likelihood function
  ll <- function(y, k, c){
    n*log(c*k)-(k+1)*( sum(log(1+c*Y))+sum( R*log(1+c*Y)+1/k ) )
  }
  # Compute the log-likelihood for the initial values, ignoring the missing data mechanism
  lltm1 <- ll(Yobs, theta.t, beta.t)
  repeat{
    # E-step
    Bbar <- function(theta.t,beta.t){
      sum(R*(1+beta.t*(theta.t+1)*Yobs)/(beta.t*(theta.t+1)*(1+beta.t*Yobs)))
    }
    Abar <- function(theta.t,beta.t){
      sum( R* ( log(1+beta.t*Yobs)+ 1/theta.t ))
    }
    theta.t1 <- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
    # M-step
    beta.t1 <- ((theta.t1+1)/n *(sum(Yobs/(1+beta.t*Yobs))+Bbar(theta.t,beta.t)))^(-1)
    theta.t1 <- n/( sum(log(1+beta.t*Yobs)) + Abar(theta.t,beta.t))
    beta.t <- beta.t1
    theta.t <- theta.t1
    # compute log-likelihood using current estimates, ignoring the missing data mechanism
    llt <- ll(Yobs, theta.t, beta.t)
    # Print current parameter values and likelihood
    cat(theta.t, beta.t, llt, "\n")
    # Stop if converged
    if ( abs(lltm1 - llt) < 0.0001) break
    lltm1 <- llt
  }
  # return() accepts a single value, so the two estimates are combined
  return(c(theta.t, beta.t))
}
em.lomax(Yobs)

please help me with this

majdi
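The "run it many times and store each result" pattern described above can be sketched with replicate(). The function one_run() below is a hypothetical stand-in that simulates data and returns two named numbers; it is not the poster's EM routine, and its estimators, rate, and sample size are placeholders. The number of repetitions is kept small here so the sketch runs quickly.

```r
# one_run() is a placeholder: generate one data set, return two estimates.
one_run <- function(n = 20) {
  x <- rexp(n, rate = 3)                     # placeholder data generation
  c(theta.t = 1 / mean(x), beta.t = var(x))  # placeholder "estimates"
}

set.seed(123)
n_sim <- 1000                                # the thread asks for 10,000
est <- replicate(n_sim, one_run())           # 2 x n_sim matrix, one column per run
avg <- rowMeans(est)                         # average of each estimate over runs
avg
```

An explicit for loop that fills a preallocated n_sim x 2 matrix would do the same job; replicate() just handles the bookkeeping.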
Re: [R] calculating the mean of a random matrix (by row) and some general questions
On Jul 19, 2011, at 4:18 PM, Peter Lomas wrote:

Hi Richard, As others have said, try to use the apply functions rather than loops. There is also an apply function for lists, see ?lapply. This is much more efficient.

Actually the apply functions are not more efficient in the usual meaning of time of execution. And sometimes they are rather inefficient. Prior discussions of this topic in the archives should be easy to find. The economy is in expression, and the advantage is in code creation and maintenance. Doubters of this proposition should consider these results:

library(rbenchmark)
# help page has a more compact version of these tests
means.rep = function(n, m) {
  res1 <- vector(length=100, mode="numeric")
  res1 <- replicate(n, mean( rexp(m)))}
means.colMn = function(n, m) {
  res2 <- vector(length=100, mode="numeric")
  res2 <- colMeans(matrix( rexp(n*m), c(m, n)))}
means.tapply = function(n,m) {
  res3 <- vector(length=100, mode="numeric")
  res3 <- tapply( rexp(n*m), rep(1:n, each = m), mean)}
means.apply = function(n,m) {
  res4 <- vector(length=100, mode="numeric")
  res4 <- apply( matrix(rexp(m*n),n,m), 1, mean) }
means.forloop = function(n, m) {
  res5 <- vector(length=100, mode="numeric")
  for (i in n) {res5[i] <- mean(rexp(m))} }
benchmark( repl = means.rep(100, 100),
           tappl = means.tapply(100, 100),
           appl = means.apply(100, 100),
           pat = means.pat(100, 100),
           forloop = means.forloop(100,100),
           replications=100,
           columns=c("test","replications","elapsed"),
           order='elapsed' )

### Results:
     test replications relative elapsed
5 forloop          100     1.00   0.004
4     pat          100    20.25   0.081
1    repl          100    77.00   0.308
3    appl          100    89.75   0.359
2   tappl          100   264.50   1.058

I admit that I was rather surprised to see the for-loop beating colMeans by such a wide margin, and this is making me wonder if I reversed some index or coded the for-loop test wrong. So I would appreciate some auditing and improvement of this test. (But I don't see how I could have reversed the order since n and m are both 100. And I tried adding assignments to see if there were only promises being made with no calculations. The relative efficiencies stay the same.)

-- David.

I also like writing my own functions. For example:

f <- function(x) { x^2 }

which can then be used by:

f(2)
[1] 4

This is very useful if you're getting into maximum likelihood programming, or want to use the optim function (for multivariate functions) or optimize (for univariate functions).

Lastly, check out the R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Regards, Peter

On Tue, Jul 19, 2011 at 12:43, RichardLang l...@zedat.fu-berlin.de wrote:

Hi everyone! I'm trying to teach myself R in order to do some data analysis. I'm a mathematics student and (only) familiar with matlab and latex. I'm working through the official introduction to R at the moment, while simultaneously solving some exercises I found on the web.

Before I post my (probably stupid) question, I'd like to ask you for some general advice. How do you work with R? Is it like in matlab, that you write your functions with a lot of loops etc. in a text file and then run it? Or do you just prepare your data and then use the functions provided by R (plot, mean etc.) to get some analysis? I'd be very thankful for some of your thoughts about approaches.

Now the question: I'm trying to build a vector with n entries, each consisting of the mean of m random numbers (exponentially distributed, for example). My approach was to construct an n x m random matrix and then to somehow take the mean of each row. But in the mean function there is no parameter to do this, so the intended approach of R is probably different. Any ideas? =)

Richard

David Winsemius, MD West Hartford, CT
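On the request for auditing: one reading of the posted for-loop test, offered as a possible explanation rather than a confirmed diagnosis, is that for (i in n) with n = 100 iterates over the single value 100, so the loop body runs once instead of n times. The sketch below shows the iteration count and a loop over seq_len(n); the function names are made up for this illustration.

```r
# Count how many times the body of `for (i in n)` runs for a scalar n.
count_iterations <- function(n) {
  k <- 0
  for (i in n) k <- k + 1      # iterates over the elements of n, i.e. once
  k
}
count_iterations(100)

# A loop that visits every index 1..n and returns the filled vector.
means_forloop <- function(n, m) {
  res <- numeric(n)            # preallocate the result
  for (i in seq_len(n)) {
    res[i] <- mean(rexp(m))
  }
  res
}

set.seed(1)
out <- means_forloop(100, 100)
length(out)
```

With the loop running n times, the comparison against colMeans, replicate and apply would need to be re-timed before drawing conclusions.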
Re: [R] comparing SAS and R survival analysis with time-dependent covariates
On Wed, Jul 20, 2011 at 5:42 AM, AO_Statistics abouesl...@gmail.com wrote:

Terry Therneau-2 wrote:

This query of "why do SAS and S give different answers for Cox models" comes up every so often. The two most common reasons are that
a. they are using different options for the ties
b. the SAS and S data sets are slightly different.
You have both errors. First, make sure I have the same data set by reading a common file, and then compare the results.

tmt54% more sdata.txt
1 0.0  0.5 0 0
1 0.5  3.0 1 1
2 0.0  1.0 0 0
2 1.0  1.5 1 1
3 0.0  6.0 0 0
4 0.0  8.0 0 1
5 0.0  1.0 0 0
5 1.0  8.0 1 0
6 0.0 21.0 0 1
7 0.0  3.0 0 0
7 3.0 11.0 1 1

tmt55% more test.sas
options linesize=80;
data trythis;
  infile 'sdata.txt';
  input id start end delir outcome;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=efron;

tmt56% more test.r
trythis <- read.table('sdata.txt', col.names=c("id", "start", "end", "delir", "outcome"))
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')

I now get comparable answers. Note that Cox's exact partial likelihood is the correct form to use for discrete time data. I labeled this as the 'exact' method and SAS as the 'discrete' method. The exact marginal likelihood of Prentice et al, which SAS calls the 'exact' method, is not implemented in S. As to which package is more reliable, I can only point to a set of formal test cases that are found in Appendix E of the book by Therneau and Grambsch. [...]

I am processing estimations of regression parameters in the Cox model for recurrent event data with time-dependent covariates. As my data sets contain a lot of ties, I use the discrete method of SAS (PHREG procedure) and the exact option in R (coxph function of the survival package). Despite the high computation time (up to 45s), I always get estimates without any error or warning message with the PHREG procedure. On the other hand, when I use R (latest version 2.13.11 on 32 or 64 bits), I sometimes get different estimates from those obtained with SAS and I get various warnings. And at other times I don't get any result; R freezes and does not respond. In order to understand, I have tried some tests from your examples. It turns out that the problems appear when the proportion of ties becomes large:

Edited down to results:

R
       coef exp(coef) se(coef)       z p
delir  22.5  6.06e+09    15460 0.00146 1
SAS estimate delir : 20.52466 se : 5689

R
       coef exp(coef) se(coef)         z p
delir -20.8  9.42e-10    42054 -0.000494 1
SAS estimate delir : -17.78257 se : 9383 Pr > Khi 2 : 0.9985
convergence status : Convergence criterion (GCONV=1E-8) satisfied.

The warning and error messages are correct here. Look at the point estimate. It's a log hazard ratio of about 20 in one case and about -20 in the other case. The true partial maximum likelihood estimator is infinite. The estimated standard errors are meaningless, since the partial likelihood isn't close to quadratic at the maximum.

-thomas

-- Thomas Lumley Professor of Biostatistics University of Auckland
[R] Assigning colors to cells
Hi everyone, I was wondering if there was a simple way to assign a color to a cell based on value in a data frame and return the cell as an image? For example, if the value is 1, then blue, if between 1 and 2, green, if between 2 and 3, yellow, etc. I tried using a heatmap function but I was hoping there was an easier way to do this. Thanks, Sumukh
Re: [R] calculating mean excluding zeros
Sarah et al.:

On Tue, Jul 19, 2011 at 1:56 PM, Sarah Goslee sarah.gos...@gmail.com wrote:

In the more general case, that approach is prone to machine precision error (FAQ 7.31). Here's a clunky but safer alternative:

Perhaps ?zapsmall. However, I would agree with your sentiments that it may depend on context. Finite precision can be a tricky thing.

-- Bert

set.seed(1234)
testvec <- sample(0:10, 100, replace=TRUE)
mean(testvec)
[1] 4.31
mean(testvec[testvec != 0])
[1] 4.842697
mean(testvec[!sapply(testvec, function(x)isTRUE(all.equal(x, 0)))])
[1] 4.842697

(Is there an elementwise equivalent to all.equal() that I'm missing?)

Sarah

On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu anopheles...@gmail.com wrote:

You can do it by subsetting or indexing:

r <- c(0,0,0,rnorm(10,10,5))
mean(r)
[1] 8.052215
mean(r[r!=0])
[1] 10.46788

Weidong Gu

On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer! -- Dimitri Liakhovitski marketfusionanalytics.com

-- Sarah Goslee http://www.functionaldiversity.org

-- Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions. -- Maimonides (1135-1204)

Bert Gunter Genentech Nonclinical Biostatistics
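A small sketch of the zapsmall() idea mentioned above, using made-up input values. zapsmall() rounds relative to the largest absolute value in the vector, so whether a given entry is "zapped" depends on the other entries; that is the context dependence the reply warns about.

```r
# 1e-20 stands in for rounding noise that should count as zero.
x <- c(1e-20, 0, 1, 2)

z <- zapsmall(x)          # tiny values become exactly 0
nonzero_mean <- mean(z[z != 0])
nonzero_mean              # mean of 1 and 2

# Caution: rounding is relative to the largest element, so a small but
# genuine value can also be zapped when the vector contains large values.
zapsmall(c(1e-3, 1e6))
```

If the vector mixes very different magnitudes, an explicit tolerance on abs(x) may be easier to reason about.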
Re: [R] timeDate with month designated by three letters.
strptime("04-MAY-11 1428", format="%d-%b-%y %H%M")
[1] "2011-05-04 14:28:00"

--
Clint Bowman              INTERNET: cl...@ecy.wa.gov
Air Quality Modeler       INTERNET: cl...@math.utah.edu
Department of Ecology     VOICE: (360) 407-6815
PO Box 47600              FAX: (360) 407-7534
Olympia, WA 98504-7600
USPS: PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274

On Tue, 19 Jul 2011, mdkz...@aol.com wrote:

Dear R Experts:

I am trying to convert a date and time character field to timeDate where the month is presented as three letters, such as JUN for June, etc. This is an example of the full character field: 04-MAY-11 1428

What is the proper format syntax? I've tried

timeDate("04-MAY-11 1428", format="%d-%m-%y %H%M")

but only get

GMT
[1] [NA]

If I change the month to a number as below, then it works, but that would require recoding of the data field.

timeDate("04-05-11 1428", format="%d-%m-%y %H%M")

gives

GMT
[1] [2011-05-04 14:28:00]

which is correct. How do I get R to recognize the month as a 3 letter designator? Any recommendations you can provide would be greatly appreciated.

Michael
[R] Error in vector(double, length) : cannot allocate vector of length 1010723280 - need help
Hi All,

I am working on CGH datasets. When I am running clustering, it shows me this error.

d <- dist(mydata, method = "euclidean")
Error in vector("double", length) : cannot allocate vector of length 1010723280

If anyone can help me, I will really appreciate it.

Thanks, Kinnari
[R] Incorrect degrees of freedom for splines using GAMM4?
Hello,

I'm running mixed models in GAMM4 with 2 (non-nested) random intercepts and I want to include a spline term for one of my exposure variables. However, when I include a spline term, I always get reported degrees of freedom of less than 1, even when I know that my spline is using more than 1 degree of freedom. For example, here is the code for my model:

global.gamm4 <- gamm4(zcog ~ s(adjpatx, fx=TRUE, k=5) + int234 + cogagec + cogagesq
    + oldfran + newus + alc2 + alc3 + alc4 + alcmiss + smk2 + smk3
    + mdinc10c + mdinc10sq + pwhtc + pwhtsq + edu2 + edu3 + husbgs + husbcol + husbmiss
    + currpmh + pastpmh + neverpmh,
    random = ~(1|id) + (1|cogtest), data=global)

Using summary(global.gamm4$mer), I get the following output for my spline term, indicating that I use the expected 4 degrees of freedom.

Xs(adjpatx)Fx1  0.1018943 0.1073225  0.949
Xs(adjpatx)Fx2 -0.0708114 0.1123845 -0.630
Xs(adjpatx)Fx3  0.7459511 0.6836413  1.091
Xs(adjpatx)Fx4 -0.2062321 0.0923569 -2.233

However, when I use summary(global.gamm4$gam), I get an estimate of degrees of freedom that is not 4:

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(adjpatx) 0.7588 0.7588 1.346   0.234

This degree of freedom = 0.76 also shows up on my plot.

Ultimately, I would like to use a cubic regression penalized spline, allowing R to choose the degrees of freedom for me using GCV. However, when I use the correct code for this or variants of it using mgcv, I also get degrees of freedom less than 1. For example, the following code also gives a degree of freedom of less than 1:

global.gamm4 <- gamm4(zcog ~ s(adjpatx, fx=FALSE) + int234 + cogagec + cogagesq
    + oldfran + newus + alc2 + alc3 + alc4 + alcmiss + smk2 + smk3
    + mdinc10c + mdinc10sq + pwhtc + pwhtsq + edu2 + edu3 + husbgs + husbcol + husbmiss
    + currpmh + pastpmh + neverpmh,
    random = ~(1|id) + (1|cogtest), data=global)

Output indicating that this spline should probably look linear:

summary(global.gamm4$mer)
Random effects:
 Groups   Name        Variance  Std.Dev.
 id       (Intercept) 0.1823454 0.427019
 cogtest  (Intercept) 0.0025498 0.050496
 Xr.1     s(adjtibx)  0.000     0.00
 Residual             0.7782969 0.882211

Xs(adjtibx)Fx1 -0.0387360 0.0215596 -1.797

Output getting a df for this spline of 0.20:

summary(global.gamm4$gam)
Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(adjtibx) 0.2009 0.2009 16.07      NA

The plot looks linear, but reports a df = 0.20.

So, to summarize my questions:
1. Are the splines produced by s(exp, fx=FALSE) or s(exp, fx=TRUE, k=k) correct even though the reported degrees of freedom appear to be wrong?
2. Can I believe my plot?
3. How can I get the true df used when I use s(exp, fx=FALSE)?

Thanks for any and all help you can provide!

Melinda
[R] Different result of multiple regression in R and SPSS
Hi,

I am trying to do a simple multiple regression analysis that has one nominal variable (gender) and three numeric variables as independent variables and one numeric variable as dependent variable. So, I got a formula like this:

summary(out.3 <- lm(scale(DV) ~ gender + scale(IV.1) + scale(IV.2) + scale(IV.3)))

I tried to compare the outcome in R with the outcome in SPSS and found the results are different! I found that R and SPSS have the exact same outcome when every variable is numeric; however, whenever I included the gender (0/1) variable in the equation, the results became different. I guess that SPSS automatically treats gender as a numeric variable and standardizes it when running the analysis. So, I tried to change gender to a numeric variable and ran the analysis, but the results were still not identical. What is the problem here and what is the right way to do this analysis?

Thanks, Jay Yang
[R] Taking all complete diagonals of a matrix
Hi R-Help!

I am trying to find a nicer way of extracting all the complete diagonals of a matrix. I am working with very large matrices that have many more rows than columns. I want to be able to extract each of the diagonals that are as long as the number of columns in the matrix. I have written a rather ugly function that presently does the job. It illustrates what I am trying to do, but I feel like there must be a cleaner (and faster) way. Does anybody have any ideas?

Here is what I've done so far:

    diagonals <- function(mat){
      output <- matrix(0, (dim(mat)[1] - dim(mat)[2] + 1), NCOL(mat))
      for(i in 1:NROW(output)){
        G <- c()
        for(j in 1:NCOL(mat)){
          G <- c(G, mat[(i+j-1), j])
        }
        output[i,] <- G
      }
      return(output)
    }

    example <- rbind(rep(1,3), rep(2,3), rep(3,3), rep(4,3), rep(5,3))

    example
         [,1] [,2] [,3]
    [1,]    1    1    1
    [2,]    2    2    2
    [3,]    3    3    3
    [4,]    4    4    4
    [5,]    5    5    5

    diagonals(example)
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    2    3    4
    [3,]    3    4    5

Many thanks,
Peter
Re: [R] Different result of multiple regression in R and SPSS
Answer: Contrasts, i.e. the parameterization of the categorical variable(s) df.

?contrasts may be of some help, but you really need to do some background studying of the linear models principles involved. Googling may provide tutorials. Searching the mail archives may also help, e.g.:

https://stat.ethz.ch/pipermail/r-help/2009-February/187479.html

-- Bert

--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
Re: [R] Different result of multiple regression in R and SPSS
Thanks for the answer. However, I am still curious about which result I should use: the result from R or the one from SPSS? Why are the results from the two programs different?

Jay
Re: [R] Different result of multiple regression in R and SPSS
I don't think SPSS does anything with the variables you enter there. Have you entered it as numeric? Have you entered gender as numeric in R?

--
Dimitri Liakhovitski
marketfusionanalytics.com
[R] list.files recursively to find files in a specific way...
Hi, all:

My folders are organized in such a way:

    root
      branch1
        ---A
           ---file1.txt
           ---file2.txt
        ---B
           ---file1.txt
           ---file2.txt
      branch2
        ---A
           ---file1.txt
           ---file2.txt
        ---B
           ---file1.txt
           ---file2.txt
      ...
      branch100
        ---A
           ---file1.txt
           ---file2.txt
        ---B
           ---file1.txt
           ---file2.txt

I'd love to list all file2.txt from all subdirectories B but not from A. How do I do that? I tried the following two:

a)

    allResults <- list.files(path = routeStr, pattern = "file2.txt", all.files = TRUE, full.names = TRUE, recursive = TRUE)

gives me 200 files in allResults, which is wrong. There should be only 100 files in allResults.

b)

    allResults <- list.files(path = routeStr, pattern = "B/file2.txt", all.files = TRUE, full.names = TRUE, recursive = TRUE)

is still wrong. It gives me nothing, namely 0 file(s) in allResults.

Can anybody help to solve this problem?

Best Regards
Pei
Re: [R] Different result of multiple regression in R and SPSS
On Jul 19, 2011, at 6:29 PM, J. wrote:

> Thanks for the answer. However, I am still curious about which result I should use? The result from R or the one from SPSS?

It is becoming apparent that you do not know how to use the results from either system. The progress of science would be safer if you get some advice from a person that knows what they are doing.

> Why the results from two programs are different?

Different parametrizations. If I had to guess, I would bet that the gender coefficient in R is exactly twice the one from SPSS. They are probably both correct in the context of their respective codings.

--
David Winsemius, MD
West Hartford, CT
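To make the "different parametrizations" point concrete, here is a minimal sketch on simulated data (the variable names and the data are made up for illustration; nothing here is known about Jay's data or about what SPSS actually did). It fits the same model twice in R, once with treatment (0/1) coding and once with sum (+1/-1) coding of a two-level factor. The fitted values are identical, and the two gender coefficients differ by a factor of two (and a sign):

```r
set.seed(1)
n <- 200
gender <- factor(sample(c("f", "m"), n, replace = TRUE))
iv1 <- rnorm(n)
dv <- 0.5 * (gender == "m") + 0.8 * iv1 + rnorm(n)

# Treatment coding: f = 0, m = 1 (R's usual default for unordered factors)
fit_treat <- lm(dv ~ gender + iv1,
                contrasts = list(gender = "contr.treatment"))

# Sum coding: f = +1, m = -1
fit_sum <- lm(dv ~ gender + iv1,
              contrasts = list(gender = "contr.sum"))

b_treat <- unname(coef(fit_treat)["genderm"])
b_sum   <- unname(coef(fit_sum)["gender1"])

# Same model, different parametrization
c(treatment = b_treat, sum = b_sum, ratio = b_treat / b_sum)
all.equal(fitted(fit_treat), fitted(fit_sum))
```

Neither coefficient is "the right one"; each is correct for its own coding, which is why the coding has to be known before the two programs' outputs can be compared.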
Re: [R] list.files recursively to find files in a specific way...
Pei -

A file pattern can't contain a directory separator, but it's easy to search for one outside the context of list.files. I think

    grep('B/file2.txt', list.files(path = routeStr, all.files = TRUE, full.names = TRUE, recursive = TRUE), value = TRUE)

should give you what you want.

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
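A small self-contained sketch of the approach above, using a throwaway directory tree under tempdir() (three branches instead of 100; the directory name list_files_demo is just an illustrative choice). It also anchors the regular expression and escapes the dot, which is a slightly stricter variant of the pattern in the reply rather than a correction of it:

```r
# Build a small tree that mimics the layout in the question
root <- file.path(tempdir(), "list_files_demo")
unlink(root, recursive = TRUE)
for (branch in paste0("branch", 1:3)) {
  for (sub in c("A", "B")) {
    d <- file.path(root, branch, sub)
    dir.create(d, recursive = TRUE)
    file.create(file.path(d, c("file1.txt", "file2.txt")))
  }
}

# The pattern argument is matched against file names only, so a pattern
# containing a directory separator finds nothing
n_direct <- length(list.files(root, pattern = "B/file2.txt", recursive = TRUE))

# List everything, then filter on the full path
all_files <- list.files(path = root, full.names = TRUE, recursive = TRUE)
b_file2 <- grep("/B/file2\\.txt$", all_files, value = TRUE)
b_file2
```

With the real layout, replacing root by routeStr should leave one file2.txt per branch, all from the B subdirectories.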
Re: [R] Different result of multiple regression in R and SPSS
On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> On Jul 19, 2011, at 6:29 PM, J. wrote:
>> Thanks for the answer. However, I am still curious about which result I should use? The result from R or the one from SPSS?
>
> It is becoming apparent that you do not know how to use the results from either system. The progress of science would be safer if you get some advice from a person that knows what they are doing.

I nominate this for an R fortune.

-- Bert
Re: [R] Taking all complete diagonals of a matrix
Peter,

Here are two possibilities. I leave it up to you to determine whether they are cleaner or faster.

    diagonals1 <- function(mat){
      # setup
      R <- dim(mat)[1]
      C <- dim(mat)[2]
      output <- matrix(0, (R-C+1), C)
      # get diagonals: row i of the output is the diagonal of the
      # C x C block that starts at row i
      for(i in 1:(R-C+1)) output[i,] <- diag(mat[i:(i+C-1),])
      return(output)
    }

    diagonals2 <- function(mat){
      # setup
      R <- dim(mat)[1]
      C <- dim(mat)[2]
      output <- matrix(0, (R-C+1), C)
      # get diagonals: column j of the output is rows j..(j+R-C) of
      # column j of mat, so the loop runs over the C columns
      for(j in 1:C) output[,j] <- mat[j:(j+R-C), j]
      return(output)
    }

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
Re: [R] Different result of multiple regression in R and SPSS
From: dwinsem...@comcast.net
Date: Tue, 19 Jul 2011 18:45:47 -0400

> Different parametrizations. If I had to guess I would bet that the gender coefficient is R is exactly twice that of the one from SPSS. They are probably both correct in the context of their respective codings.

I would also suggest, again, running some samples with known data sets and seeing what you get (RSSWKDSASWYG). You would want to do this anyway to ensure your real data is being used reasonably. You still need some way to check the opinion you get from the expert mentioned above, and known data will help there too. A factor of 2 often shows up just from looking at pictures once you have some intuition. I've often been wrong on intuition, but chasing it down and proving it helps you learn a lot :)
Re: [R] Assigning colors to cells
Use the cut() function to produce the interval categories and color names in the data frame, and then pass that variable to the col = argument of the appropriate plot function. Something like

    # one label per interval: (-Inf,1], (1,2], (2,3]; add breaks and labels as needed
    # as.character() keeps the color names rather than the factor codes
    mydata$mycolors <- as.character(cut(mydata$value, c(-Inf, 1, 2, 3), labels = c('blue', 'green', 'yellow')))

Ask an abstract question, get an abstract answer...

HTH,
Dennis

On Tue, Jul 19, 2011 at 2:30 PM, Sumukh Sathnur <sumukh.sath...@gmail.com> wrote:
> Hi everyone,
>
> I was wondering if there was a simple way to assign a color to a cell based on value in a data frame and return the cell as an image? For example, if the value is 1, then blue, if between 1 and 2, green, if between 2 and 3, yellow, etc. I tried using a heatmap function but I was hoping there was an easier way to do this.
>
> Thanks,
> Sumukh
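A minimal sketch of the cut() idea applied to a whole data frame. The data, the helper names value_to_color and plot_cells, and the choice to draw the cells with rect() are all assumptions made for illustration; the breaks and colors follow the example in the question and stop at 3:

```r
# Map values to color names: (-Inf,1] blue, (1,2] green, (2,3] yellow.
# Values above 3 become NA until more breaks and labels are added.
value_to_color <- function(x) {
  as.character(cut(x,
                   breaks = c(-Inf, 1, 2, 3),
                   labels = c("blue", "green", "yellow")))
}

cells <- data.frame(a = c(0.5, 1.5, 2.5), b = c(1, 2, 3))

# Character matrix of color names, same shape as the data frame
cell_colors <- sapply(cells, value_to_color)
cell_colors

# Draw one filled rectangle per cell, with row 1 at the top
plot_cells <- function(colors) {
  nr <- nrow(colors)
  nc <- ncol(colors)
  plot(c(0, nc), c(0, nr), type = "n", axes = FALSE,
       xlab = "", ylab = "", asp = 1)
  rect(xleft = col(colors) - 1, ybottom = nr - row(colors),
       xright = col(colors), ytop = nr - row(colors) + 1,
       col = colors, border = "white")
}
```

Calling plot_cells(cell_colors) on an open graphics device then draws the colored grid.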
Re: [R] calculating mean excluding zeros
On Tue, Jul 19, 2011 at 5:20 PM, Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com> wrote:
> Thanks a lot, Sarah. I assume, if the values against which I am comparing are REALLY zero ("0") - then even the first one (mean(testvec[testvec != 0])) should work, right?
> Dimitri

Well, yes. But what's really zero?

    ((.2 + .1) - .3) == 0
    [1] FALSE
    all.equal(((.2 + .1) - .3), 0)
    [1] TRUE

"0" is a string, and string comparison is a different issue, not subject to machine precision.

Sarah

> On Tue, Jul 19, 2011 at 4:56 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:
>> In the more general case, that approach is prone to machine precision error (FAQ 7.31). Here's a clunky but safer alternative:
>>
>>     set.seed(1234)
>>     testvec <- sample(0:10, 100, replace=TRUE)
>>     mean(testvec)
>>     [1] 4.31
>>     mean(testvec[testvec != 0])
>>     [1] 4.842697
>>     mean(testvec[!sapply(testvec, function(x) isTRUE(all.equal(x, 0)))])
>>     [1] 4.842697
>>
>> (Is there an elementwise equivalent to all.equal() that I'm missing?)
>>
>> Sarah
>>
>> On Tue, Jul 19, 2011 at 4:48 PM, Weidong Gu <anopheles...@gmail.com> wrote:
>>> You can do it by subsetting or indexing
>>>
>>>     r <- c(0,0,0,rnorm(10,10,5))
>>>     mean(r)
>>>     [1] 8.052215
>>>     mean(r[r!=0])
>>>     [1] 10.46788
>>>
>>> Weidong Gu
>>>
>>> On Tue, Jul 19, 2011 at 4:36 PM, Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com> wrote:
>>>> Sorry if it's been discussed before - don't seem to find it. I'd like to calculate a mean while ignoring zeros. mean doesn't seem to have an option for that. Any other function/package that could do it? Thanks for a pointer!

--
Sarah Goslee
http://www.functionaldiversity.org
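One possible elementwise alternative to the sapply()/all.equal() construction is a plain tolerance comparison. This is a sketch, not a drop-in equivalent of all.equal(): the helper name mean_nonzero is made up, and the tolerance sqrt(.Machine$double.eps) is chosen to match the default that all.equal() documents for numeric comparisons:

```r
# Mean of the elements that are not zero up to a tolerance
mean_nonzero <- function(x, tol = sqrt(.Machine$double.eps)) {
  mean(x[abs(x) > tol])
}

# The second element is zero mathematically, but not in floating point
x <- c(0, (0.2 + 0.1) - 0.3, 1, 2, 3)

x[2] == 0          # exact comparison treats it as non-zero
mean(x[x != 0])    # so it is counted as a value here
mean_nonzero(x)    # and dropped here
```

Because abs() and > are vectorized, this avoids the per-element function call; missing values would still need separate handling (for example by subsetting with which()).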
Re: [R] Taking all complete diagonals of a matrix
Hi:

Does this work for you?

    mydiags <- function(mat) diag(mat[seq_len(ncol(mat)), ])

    # Example:
    set.seed(103)
    u <- matrix(rpois(200, 10), ncol = 10)
    # dim(u)
    # [1] 20 10
    mydiags(u)
    # [1] 7 12 6 13 12 6 5 6 14 6
    u[1:10, ]   # as a double check

HTH,
Dennis
Re: [R] notation question
On 20/07/11 07:24, Carson Farmer wrote:
> Dear list,
>
> I am currently writing up some of my R models in a more formal sense for a paper, and I am having trouble with the notation. Although this isn't really an 'R' question, it should help me to understand a bit better what I am actually doing when fitting my models! Using the analysis of co-variance example from MASS (fourth edition, p 142), what is the correct notation for the formula Gas, ~ Insul/Temp

There shouldn't be a comma after ``Gas'' in that formula.

> - 1? Obviously, if we fit it as two separate models (as in the example above it), we would have something like y_i = \beta x_i for each of the two models.

No.

    y_i = alpha + beta x_i

Clearly you need an intercept. Do you really expect gas consumption to be nil when the average external temperature is 0 degrees C?

> So my question is, when we have a single model with a k-level factor interaction term as in the equation above, what is the correct/standard statistical (LaTeX style) notation?

The model is simply

    y_ij = alpha_i + beta_i x_ij

where i = 1 (before) or 2 (after). I.e. you are allowing a different slope and intercept for each of the scenarios (before and after). But this is the ``deterministic'' part of the model. You should really include the random part:

    y_ij = alpha_i + beta_i x_ij + E_ij

where the E_ij are independent random variables with mean 0 and common variance sigma^2. (Often the E_ij are assumed to be Gaussian, mainly because if all you have is a hammer, then everything looks like a nail.)

cheers,

Rolf Turner
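The correspondence between the formula and the notation y_ij = alpha_i + beta_i x_ij + E_ij can be checked directly. This sketch assumes the MASS package and its whiteside data (the data set behind the example in the MASS book) are available; the object names fit, before and after are arbitrary:

```r
library(MASS)  # provides the whiteside data (Insul, Temp, Gas)

# One model, separate intercept (alpha_i) and slope (beta_i) per Insul level
fit <- lm(Gas ~ Insul/Temp - 1, data = whiteside)
coef(fit)

# The same point estimates come from fitting each scenario on its own
before <- lm(Gas ~ Temp, data = whiteside, subset = Insul == "Before")
after  <- lm(Gas ~ Temp, data = whiteside, subset = Insul == "After")
rbind(before = coef(before), after = coef(after))
```

The coefficients named for each Insul level are the alpha_i and the corresponding Insul:Temp terms are the beta_i. The difference from two separate fits is that the single model estimates one common sigma^2, which affects standard errors but not the coefficients.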
[R] SSOAP chemspider
Dear all,

I've been trying on and off for the past few months to get SSOAP to work with chemspider. First I tried the WSDL file:

    cs <- processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL")
    Error in parse(text = paste(txt, collapse = "\n")) :
      text:1:29: unexpected input
    1: function(x, ..., obj = new( ‚
                                   ^
    In addition: Warning message:
    In processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL") :
      Ignoring additional serviceport ... elements

Next I tried using just the pure .SOAP to call the database.

    s <- SOAPServer("http://www.chemspider.com/MassSpecAPI.asmx")
    csid <- .SOAP(s, "SearchByMass2", mass = 89.04767, range = 0.01,
                  action = I("http://www.chemspider.com/SearchByMass2"),
                  xmlns = c("http://www.chemspider.com"),
                  .opts = list(verbose = TRUE))

This seems to work and gives back a result. However, it isn't the right result: it seems to have converted the mass into 0. When I run a similar program in Perl I get the correct ids, so this isn't a server-side problem but an SSOAP one. Any thoughts, or suggestions on other packages to use?

Further information about the SearchByMass2 method and the XML it expects:

http://www.chemspider.com/MassSpecAPI.asmx?op=SearchByMass2

Cheers,
Paul

PS Placing a fake error in the .SOAP code, I can look at the XML it's sending to the server:

    Browse[1]> doc
    <?xml version="1.0"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
                       xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
                       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                       SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP-ENV:Body>
        <ns:SearchByMass2 xmlns:ns="http://www.chemspider.com">
          <ns:mass>89.04767</ns:mass>
          <ns:range>0.01</ns:range>
        </ns:SearchByMass2>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
Re: [R] notation question
Thank you Rolf,

>> Using the analysis of co-variance example from MASS (fourth edition, p 142), what is the correct notation for the formula Gas, ~ Insul/Temp
>
> There shouldn't be a comma after ``Gas'' in that formula.
>
>> - 1? Obviously, if we fit it as two separate models (as in the example above it), we would have something like y_i = \beta x_i for each of the two models.
>
> No. y_i = alpha + beta x_i . Clearly you need an intercept. Do you really expect gas consumption to be nil when the average external temperature is 0 degrees C ?

Right, of course... both silly mistakes, my apologies!

>> So my question is, when we have a single model with a k-level factor interaction term as in the equation above, what is the correct/standard statistical (LaTeX style) notation?
>
> The model is simply y_ij = alpha_i + beta_i x_ij where i = 1 (before) or 2 (after). I.e. you are allowing a different slope and intercept for each of the scenarios (before and after). But this is the ``deterministic'' part of the model. You should really include the random part: y_ij = alpha_i + beta_i x_ij + E_ij where the E_ij are independent random variables with mean 0 and common variance sigma^2. (Often the E_ij are assumed to be Gaussian, mainly because if all you have is a hammer, then everything looks like a nail).

Perfect! Thanks for the clarification. I think I was previously trying to be a bit more clever than necessary (and as a result not being very clever at all :-p)

Carson
Re: [R] Taking all complete diagonals of a matrix
Thanks very much to everyone who replied. Peter got me on my way with the use diag() hint, and I came up with a less pretty version of Dan's first option at almost the same time as I got that email. It seems I can't avoid one for loop, but one is better than two.

Just as a note, with this code you have to make sure that you are in fact giving it a matrix, or diag() will error. I fed it a data frame unaware, but using as.matrix() works just fine.

    diagonals <- function(mat){
      R <- dim(mat)[1]
      C <- dim(mat)[2]
      output <- matrix(NA, (R-C+1), C)
      for(i in 1:(R-C+1)) output[i,] <- diag(mat[i:(i+C-1),])
      return(output)
    }

    example <- rbind(rep(1,3), rep(2,3), rep(3,3), rep(4,3), rep(5,3))

    diagonals(as.data.frame(example))
    Error in output[i, ] <- diag(mat[i:(i + C - 1), ]) :
      number of items to replace is not a multiple of replacement length

Thanks again,
Peter
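For what it is worth, the remaining loop can also be avoided with matrix indexing. The following is a sketch rather than a tested-at-scale replacement; the name diagonals_vec is made up, and the as.matrix() call is there only to sidestep the data frame problem described above:

```r
diagonals_vec <- function(mat) {
  mat <- as.matrix(mat)            # also accepts a data frame
  R <- nrow(mat)
  C <- ncol(mat)
  stopifnot(R >= C)                # otherwise there is no complete diagonal
  n <- R - C + 1                   # number of complete diagonals
  i <- rep(seq_len(n), times = C)  # output row index, in column-major order
  j <- rep(seq_len(C), each = n)   # output column index
  # output[i, j] is mat[i + j - 1, j]; a two-column index matrix picks
  # all of these elements in one step
  matrix(mat[cbind(i + j - 1, j)], nrow = n, ncol = C)
}

example <- rbind(rep(1,3), rep(2,3), rep(3,3), rep(4,3), rep(5,3))
diagonals_vec(example)
diagonals_vec(as.data.frame(example))
```

The index vectors have n * C elements, so for very large matrices this trades the loop for some extra memory.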
Re: [R] Different result of multiple regression in R and SPSS
On 7/19/2011 4:04 PM, Bert Gunter wrote:
> On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>> It is becoming apparent that you do not know how to use the results from either system. The progress of science would be safer if you get some advice from a person that knows what they are doing.
>
> I nominate this for an R fortune.
> -- Bert

None of us ever know what we're doing at some level. We often think we do, and sometimes we get results more in spite of what we've done than because of it. That of course increases our confidence and encourages us to repeat mistakes in contexts where we might not be so lucky.

Spencer

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com
Re: [R] Different result of multiple regression in R and SPSS
First, it would have helped if you had posted the actual results so that we could see how far apart they are (and, more specifically, by what factor).

Second, given your epiphany, you will find that this is exactly what David (and others before him) said or suggested. It is not about standardizing a nominal variable, which in theory you cannot do. It is about how the programs encode nominal variables by default.

Daniel

J. wrote:
> I finally got the same result by converting the gender variable to numeric and standardizing it. I guess SPSS automatically does the same thing when running the analysis. But it is still not clear to me how I can interpret a standardized categorical (dummy-coded) variable. I'd rather stick to using R. Thanks for all the comments and advice.
> Jay

--
View this message in context: http://r.789695.n4.nabble.com/Different-result-of-multiple-regression-in-R-and-SPSS-tp3679423p3679861.html
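On Jay's question of how to read a standardized dummy variable, a short sketch may help. The data and names (g, x, y) are made up, and nothing here is claimed about what SPSS does internally; it only shows the arithmetic of rescaling a 0/1 predictor.

```r
set.seed(2)
n <- 200
g <- rep(c(0, 1), length.out = n)      # hypothetical 0/1 gender dummy
x <- rnorm(n)
y <- 2 + 1.5 * g + 0.8 * x + rnorm(n)

fit_raw <- lm(y ~ g + x)               # coefficient = difference between groups

g_std <- as.numeric(scale(g))          # (g - mean(g)) / sd(g)
fit_std <- lm(y ~ g_std + x)           # coefficient = change in y per 1 SD of g

b_raw <- coef(fit_raw)[["g"]]
b_std <- coef(fit_std)[["g_std"]]

# The standardized coefficient is just the raw one multiplied by sd(g)
c(raw = b_raw, standardized = b_std, raw_times_sd = b_raw * sd(g))
```

The coefficient on the standardized dummy is a change per standard deviation of a variable that only takes two values, which is why the raw group difference is usually the easier number to interpret.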
Re: [R] PCA - princomp can only be used with more units than variables
On Mon, Jul 18, 2011 at 10:48 AM, a.me...@yahoo.co.uk wrote:
> Ok thank you Josh. Basically I have a matrix A with 7 rows and 18 columns.

If i < j (where i is the number of rows in your matrix and j is the number of columns), then the determinant of the covariance (or correlation) matrix |Sigma_A| will be 0 (or very near zero; you can easily convince yourself of this by running det(cov(matrix(rnorm(90), 9))) as many times as you need). From Cramer's Rule, if the determinant of the matrix is 0, there is not a unique solution (clarifications/corrections are welcome if any of this is wrong).

> What I am told is I need the 'varimax rotated scores from the PCA analysis of matrix A'

Who told you that? Is this homework? You could look at the ?principal function in package psych. That said, if this is homework I would talk with your instructor more, and if this is anything beyond an exercise (i.e., has real world implications), I would seek the advice/help of a local statistician.

> I can choose from 3 up to 7 components. My problem is how to carry out the above. Have you any ideas? Would appreciate your help!
> Armin
>
> On 18/07/2011 18:07, Joshua Wiley wrote:
>> Hi,
>> You need to explain what you want to do. This is not a software issue; you simply cannot create more uncorrelated variables than you have observations.
>> Josh
>>
>> On Mon, Jul 18, 2011 at 8:53 AM, a.me...@yahoo.co.uk wrote:
>>> Hi,
>>> May I ask a question about a thread https://stat.ethz.ch/pipermail/r-help/2005-March/068365.html? I understand I need to use prcomp instead of princomp when I have fewer units than variables. However, when I use prcomp the scores component is NULL. How can I overcome this?
>>> Regards,
>>> Armin
>
> --
> Kind Regards,
> Armin Mewes
> Groundesign
> 10 Jerusalem street
> Belfast BT7 1QN
> Tel. (0044)(0)2890280887
> Email. enquir...@groundesign.net
> www.groundesign.net

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/
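For the NULL scores in the original question, the following sketch may be useful. It uses a simulated 7 x 18 matrix in place of Armin's data. prcomp() returns its scores in the component named x (princomp() is the one that has a scores component), so pc$scores is NULL by construction. Rotating the first k eigenvectors with stats::varimax() is only one possible reading of "varimax rotated scores", and that reading is an assumption here; psych::principal, mentioned above, works from scaled loadings and may give different numbers.

```r
set.seed(3)
A <- matrix(rnorm(7 * 18), nrow = 7, ncol = 18)  # simulated: 7 units, 18 variables

# prcomp() uses the SVD, so it works with fewer units than variables;
# princomp(A) would stop with the error quoted in the subject line.
pc <- prcomp(A)

is.null(pc$scores)  # TRUE: prcomp objects have no 'scores' component
dim(pc$x)           # the component scores live in pc$x

k <- 3                                   # number of components to keep (3 to 7 here)
vm <- varimax(pc$rotation[, 1:k])        # rotate the first k loading vectors
rot_scores <- pc$x[, 1:k] %*% vm$rotmat  # scores in the rotated basis
dim(rot_scores)
```

With 7 units the centered data have at most 6 non-degenerate components, so the last column of pc$x is numerically zero regardless of how many components are requested.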
Re: [R] How to convert number (matlab) to date
On Monday, July 18, 2011 05:56:14 peter dalgaard wrote:
> but even this is dubious, since there is no year 0 AD. In Gregorian and Julian calendars, 1 BC continues directly into 1 AD.

Although this seems to be a widely recognized problem, I would argue it is an entirely specious one. It makes no more sense to recognize a year 0 than it would to worry about the lack of a zeroth inch. We name centuries and decades without issues, and we handle smaller time intervals with no problem. The entire concern comes down to a naming issue. Whenever a fraction of the first inch on a ruler is named, it is written as 0.* or */**, that is, either with a preceding 0 or as a common fraction without any preceding whole number. Every year in the 19th century starts with 18, yet few people are confused. Why has this ever been regarded as an issue?

JWDougherty
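For the conversion in the subject line, the year-0 question can be sidestepped by working relative to a modern date instead of the nominal origin. A minimal sketch, assuming the numbers are MATLAB serial date numbers (datenum), for which 719529 corresponds to 1970-01-01; the function names are hypothetical.

```r
# Offset between MATLAB's datenum scale and days since 1970-01-01
matlab_epoch_offset <- 719529

# Whole days only
matlab_to_date <- function(x) {
  as.Date(floor(x) - matlab_epoch_offset, origin = "1970-01-01")
}

# Keeps the fractional day as a time of day (assumed to be UTC here)
matlab_to_posix <- function(x, tz = "UTC") {
  as.POSIXct((x - matlab_epoch_offset) * 86400, origin = "1970-01-01", tz = tz)
}

matlab_to_date(734702)     # 2011-07-18
matlab_to_posix(734702.5)  # noon on the same day
```

Because only the difference from a known modern date is used, the result does not depend on whether one believes in a year 0.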
Re: [R] loops and simulation
Yes, there are, in Europe. And there are summer classes in the US as well. And no, this list is not so much about helping beginners to learn R. For that, there is a myriad of online sources. Rather, this list is for people who have exhausted their ability to (elegantly) solve a problem.

Also, it so happens that students post their homework, and people on this list are adamant about not contributing to academic dishonesty. Typically these posts ask very basic questions and read like desperate, last-minute outcries for help. Your post was similar enough to one of these requests, except that you were referring to SAS and had heard that R does not like loops. (This is factually wrong, although loops in R can be slow.)

Your post looked like homework because your code uses while loops and indexing, which would imply that you know how to use them had you written the code you posted. This knowledge would be sufficient to do what you asked. In particular, indexing is very basic functionality in any language, be it R, SAS, or any other. Indexing is dealt with starting on page 10 of the introductory R manual: http://cran.r-project.org/doc/manuals/R-intro.pdf. Loops are dealt with starting on page 40 of the same manual. I suggest you take a look.

Inspector Daniel

Majdi wrote:
> Dear Daniel,
> with all my respect to you, you are wrong. I was done with homework a long, long time ago. By the way, are there schools in JULY???
> I used to work with SAS; this summer I decided to learn R. I read everything and try to learn. I believe this is what this list is about: to help beginners learn R, and not to be challenged by someone as smart as yourself, Detective Malter.
> Part of the code (the EM algorithm part) was posted to solve a problem for the *Normal distribution*. I studied it, learned it, and then applied it to my Pareto distribution. I guess there is no problem with this; again, this is what this site is about.
> Thank you for your help.

--
View this message in context: http://r.789695.n4.nabble.com/loops-and-simulation-tp3678640p3680036.html
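The indexing and looping referred to above amount to very little code. A minimal sketch with a made-up vector, not taken from the original poster's program:

```r
x <- c(3, 8, 1, 9, 4)   # made-up data

x[2]        # indexing by position
x[x > 3]    # indexing by a logical condition

# A while loop that accumulates a running total by index
i <- 1
total <- 0
while (i <= length(x)) {
  total <- total + x[i]
  i <- i + 1
}

# The vectorized equivalent, usually faster and shorter in R
total == sum(x)
```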
[R] Problem with RODBC
I have been trying to read some data from an Excel workbook without success. The workbook is in .xls format and has multiple sheets, one with the sheet name Data, which is the sheet I wish to read from. One complication is that the header row of this sheet is comprised of dropdown boxes.

I tried what I normally would do plus some variations. Here is the output.

> require(RODBC)
> options(stringsAsFactors = FALSE)
> fileName <- paste(getwd(),
+                   "/../Data/10_11 Quality Threshold Calculations v3.xls",
+                   sep = "")
> channel <- odbcConnectExcel(fileName)
> sqlTables(channel)$TABLE_NAME
 [1] "Data$"
 [2] "PBC$"
 [3] "SQL$"
 [4] "'10_11 Summary$'"
 [5] "'10_11 Summary$'Print_Area"
 [6] "'Cust Nos$'"
 [7] "Data$_"
 [8] "'Diagnostic Pivot$'"
 [9] "'Historic summary$'"
[10] "'MED Supporting Evidence$'"
[11] "'MED Supporting Evidence$'Print_Area"
> faults <- sqlFetch(channel, sqtable = 'Data',
+                    colnames = FALSE, as.is = TRUE)
> faults
[1] "HY001 -1040 [Microsoft][ODBC Excel Driver] Too many fields defined."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * FROM [Data$]'"
> faults <- sqlFetch(channel, sqtable = 'Data$',
+                    colnames = FALSE, as.is = TRUE)
> faults
[1] "HY001 -1040 [Microsoft][ODBC Excel Driver] Too many fields defined."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * FROM [Data$]'"
> faults <- sqlFetch(channel, sqtable = 'Data$_',
+                    colnames = FALSE, as.is = TRUE)
> faults
[1] "42S02 -1305 [Microsoft][ODBC Excel Driver] The Microsoft Jet database engine could not find the object 'Data$_'. Make sure the object exists and that you spell its name and the path name correctly."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT * FROM [Data$_]'"
> odbcCloseAll()

I was able to read the data in using xlsReadWrite by skipping the header row and specifying the sheet name, so I have a workaround. I would like to hear any advice on what might be wrong, though, since RODBC has usually been extremely reliable.

The data is confidential (and in a 14Mb file), so I can't provide it.

My session info is:

> sessionInfo()
R version 2.13.0 Patched (2011-06-09 r56106)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_New Zealand.1252  LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] RODBC_1.3-2 djsmisc_1.0-1

loaded via a namespace (and not attached):
[1] tools_2.13.0

David Scott

--
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018
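One thing that might be worth trying, offered as an untested sketch: the "Too many fields defined" message is often associated with a sheet that presents more columns to the Excel ODBC driver than it accepts (255, as far as I know), for example when formatting or dropdowns extend far to the right. Whether that is the cause here is an assumption. If it is, querying an explicit cell range instead of the whole sheet may get around it. The helper name and the range below are hypothetical; sqlQuery() and odbcConnectExcel() are the RODBC functions already used above.

```r
# Hypothetical helper: build a query that reads only a cell range of a sheet,
# using the [Sheet$A1:C10] range syntax of the Excel driver.
excel_range_query <- function(sheet, range) {
  paste0("SELECT * FROM [", sheet, "$", range, "]")
}

# Made-up range: columns A to IU are the first 255 columns
qry <- excel_range_query("Data", "A1:IU5000")
qry

# With an open channel on Windows (not run here):
# channel <- odbcConnectExcel(fileName)
# faults <- sqlQuery(channel, qry, as.is = TRUE)
# odbcCloseAll()
```

The range would need adjusting to the real extent of the data, and starting it at row 2 would skip the dropdown header row in the same way as the xlsReadWrite workaround.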