Re: [R] Error: unexpected '<' in when modifying existing functions
Thank you both.

1) As Duncan said, if I leave <environment: namespace:stats> out, it will not work, since it uses the .C and .Fortran functions that kmeans calls.
2) I don't know how to use as.environment() (I did not understand it from reading the help).
3) Setting environment(kmeansnew) <- environment(stats::kmeans) does not work either.
4) Using fix() works, but then I don't know how to store just the function in an external file, to use it on another computer, for example. If I use save(myfunc, "myFile.R", ASCII=TRUE) it doesn't work when I try to load it again using myfunc = load("myFile.R").

Rui

On Sat, Jan 14, 2012 at 3:22 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

> On 12-01-13 8:05 PM, Peter Langfelder wrote:
>> On Fri, Jan 13, 2012 at 4:57 PM, Rui Esteves ruimax...@gmail.com wrote:
>>> Hi. I am trying to modify the kmeans function. It seems I am failing at something obvious with the workspace. I am a newbie and here is my code:
>>>   <environment: namespace:stats>
>>>   Error: unexpected '<' in "<"
>>
>> Do not include the last line <environment: namespace:stats> -- it is not part of the function definition. Simply leave it out and your function will be defined in the user workspace (a.k.a. global environment).
>
> That's only partly right. Leaving it off will define the function in the global environment, but the definition might not work, because that's where it will look up variables, and the original function would look them up in the stats namespace. I don't know if that will matter, but it might lead to tricky bugs.
>
> What you should do when modifying a function from a package is set the environment to the same environment a function in the package would normally get, i.e. to the stats namespace. I think the as.environment() function can do this, but I always forget the syntax; an easier way is the following:
>
> Create the new function:
>   kmeansnew <- function (...) ...
> Set its environment the same as the old one:
>   environment(kmeansnew) <- environment(stats::kmeans)
>
> BTW, if you use the fix() function to get a copy for editing, it will do this for you automatically.
>
> Duncan Murdoch
Re: [R] Problems with plotCI
>>>>> "JL" == Jim Lemon j...@bitwrit.com.au
>>>>>     on Sat, 14 Jan 2012 18:52:50 +1100 writes:

  JL> On 01/14/2012 06:35 PM, Jim Lemon wrote:
  >> On 01/13/2012 11:09 PM, Lasse DSR-mail wrote:
  >>> Got problems with plotCI (plotrix) ...

  JL> Whoops - looks like the R help list doesn't accept R
  JL> source code as attachments any more.

Nonsense, sorry. As I say every few months: whether an attachment is accepted or not does *not* depend on its content proper, but on its MIME type, and R-help, e.g., accepts text/plain. If your e-mail software has changed, and no longer uses (or allows you to use) text/plain for plain text such as R source code, then you should blame the provider of your e-mail software ... or, alas, provide the text inline, as you did.

Martin Maechler, ETH Zurich (and R-help maintainer)
Re: [R] plotting regression line with lattice
Weidong, thanks for the suggestion, but I also need to show which trt each point belongs to. I had my problem solved, by the way: I've been told to add a group subscript object within the panel function, and then use panel.points to plot the original data as data points and panel.lines to draw the predicted regression line of the model.
cheers
m.

On 13 Jan 2012, at 19:57, Weidong Gu wrote:

> Hi,
> Since trt is a factor, you can use it for indexing. Try just this in the code:
>   fill <- my.fill[combined$trt[subscripts]]
> Weidong Gu
>
> On Fri, Jan 13, 2012 at 11:30 AM, matteo dossena m.doss...@qmul.ac.uk wrote:

#Dear All,
#I'm having a bit of trouble here, please help me...
#I have this data
set.seed(4)
mydata <- data.frame(var = rnorm(100),
                     temp = rnorm(100),
                     subj = as.factor(rep(c(1:10), 5)),
                     trt = rep(c("A", "B"), 50))
#and this model that fits them
lm <- lm(var ~ temp * subj, data = mydata)
#I want to plot the results with lattice and fit the regression line, predicted with my model, through them.
#To do so, I'm using the approach outlined in "Lattice Tricks for the power useR" by D. Sarkar:
temp_rng <- range(mydata$temp, finite = TRUE)
grid <- expand.grid(temp = do.breaks(temp_rng, 30),
                    subj = unique(mydata$subj),
                    trt = unique(mydata$trt))
model <- cbind(grid, var = predict(lm, newdata = grid))
orig <- mydata[c("var", "temp", "subj", "trt")]
combined <- make.groups(original = orig, model = model)
xyplot(var ~ temp | subj, data = combined,
       groups = which,
       type = c("p", "l"),
       distribute.type = TRUE)
# So far everything is fine, but I also want to assign a fill to the data points for the two treatments trt=1 and trt=2,
# so I have written this piece of code. It works fine, but when it comes to plotting the regression line, it seems that type is not recognized by the panel function...
my.fill <- c("black", "grey")
plot <- with(combined,
             xyplot(var ~ temp | subj, data = combined,
                    group = combined$which,
                    type = c("p", "l"),
                    distribute.type = TRUE,
                    panel = function(x, y, ..., subscripts){
                      fill <- my.fill[combined$trt[subscripts]]
                      panel.xyplot(x, y, pch = 21, fill = my.fill, col = "black")
                    },
                    key = list(space = "right",
                               text = list(c("trt1", "trt2"), cex = 0.8),
                               points = list(pch = c(21), fill = c("black", "grey")),
                               rep = FALSE)))
plot
#I've also tried to move type and distribute.type within panel.xyplot, as well as subsetting the data in panel.xyplot, like this:
plot <- with(combined,
             xyplot(var ~ temp | subj, data = combined,
                    panel = function(x, y, ..., subscripts){
                      fill <- my.fill[combined$trt[subscripts]]
                      panel.xyplot(x[combined$which == "original"], y[combined$which == "original"],
                                   pch = 21, fill = my.fill, col = "black")
                      panel.xyplot(x[combined$which == "model"], y[combined$which == "model"],
                                   type = "l", col = "black")
                    },
                    key = list(space = "right",
                               text = list(c("trt1", "trt2"), cex = 0.8),
                               points = list(pch = c(21), fill = c("black", "grey")),
                               rep = FALSE)))
plot
#but no success with that either...
#Can anyone help me get the predicted values plotted as a line instead of as points?
#really appreciate it
#matteo
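A sketch of the fix matteo describes in his follow-up (variable names are taken from his code above; this illustrates the panel.points/panel.lines idea rather than reproducing his exact final code):

xyplot(var ~ temp | subj, data = combined,
       panel = function(x, y, ..., subscripts) {
         isorig <- combined$which[subscripts] == "original"
         # raw data as filled points, with fill keyed to treatment
         panel.points(x[isorig], y[isorig], pch = 21, col = "black",
                      fill = my.fill[combined$trt[subscripts][isorig]])
         # model predictions drawn as a line
         panel.lines(x[!isorig], y[!isorig], col = "black")
       })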
Re: [R] Problems with plotCI
On 01/14/2012 09:22 PM, Martin Maechler wrote:
>   JL> Whoops - looks like the R help list doesn't accept R
>   JL> source code as attachments any more.
>
> Nonsense, sorry. As I say every few months: whether an attachment is accepted or not does *not* depend on its content proper, but on its MIME type, and R-help, e.g., accepts text/plain.
> [snip]
> Martin Maechler, ETH Zurich (and R-help maintainer)

Hmm, I send messages in plain text by default, so I've listed r-project.org as a plain-text-only domain (I use the Thunderbird email client). I'll see if this fixes it.

Jim
Re: [R] Converting .Rout file to pdf via Sweave automatically
On 12-01-13 10:21 PM, Parag Magunia wrote:
> The R documentation mentions that to create a PDF or DVI file from an Rnw template, the Sweave command can be used. However, is there any way to go from a .Rout file straight to PDF with an Rnw template? What I'm trying to avoid is adding the Sweave markup to the .tex file manually. What I think I'm missing is the exact arguments to the Sweave command. I tried numerous forms of:
>
>   Sweave("batch.Rout", RweaveLatex(), "myR.Rnw")
>
> but without any success.

You misunderstand Sweave. You don't add markup to the .Rout file or the .tex file, you add markup to the input file. Usually you name it with extension .Rnw. For a case where you want to print everything in the batch file and there are no graphs, you could just put markup at the very beginning and at the very end. Figures are slightly more complicated, but it sounds like you don't need them.

Duncan Murdoch
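A minimal sketch of what Duncan describes -- an input file (say, report.Rnw; the file name and the source() call are illustrative assumptions) whose only markup sits at the beginning and the end:

\documentclass{article}
\begin{document}
<<batch, echo=TRUE>>=
source("mybatch.R", echo = TRUE)  # runs and echoes the batch code
@
\end{document}

Running Sweave("report.Rnw") then produces report.tex, which can be compiled to PDF with pdflatex.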
[R] Determining if an object name does not exist
Is there a way to tell whether an object name
1. is valid?
2. is not going to cause a collision with an existing object by the same name?

Thanks.
Re: [R] Error: unexpected '<' in when modifying existing functions
On 12-01-14 3:58 AM, Rui Esteves wrote:
> Thank you both.
> 1) As Duncan said, if I leave <environment: namespace:stats> out, it will not work, since it uses the .C and .Fortran functions that kmeans calls.
> 2) I don't know how to use as.environment() (I did not understand it from reading the help).
> 3) Setting environment(kmeansnew) <- environment(stats::kmeans) does not work either.

I think you need to explain what "does not work" means. What did you do, and how do you know it didn't work?

> 4) Using fix() works, but then I don't know how to store just the function in an external file, to use it on another computer, for example. If I use save(myfunc, "myFile.R", ASCII=TRUE) it doesn't work when I try to load it again using myfunc = load("myFile.R").

Don't use load() on a source file. Use load() on a binary file produced by save(). You could save() your working function, but then you can't edit it outside of R. To produce a .R file that you can use in another session, you're going to need to produce the function, then modify the environment, using 2 or 3 above.

Duncan Murdoch
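A minimal sketch of that last suggestion (an illustration, assuming kmeansnew already works in the current session; dump() writes plain-text R source):

# write an editable .R file that can be moved to another machine
dump("kmeansnew", file = "kmeansnew.R")
# ... later, in a fresh session or on another computer:
source("kmeansnew.R")
environment(kmeansnew) <- environment(stats::kmeans)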
Re: [R] Determining if an object name does not exist
On 12-01-14 5:47 AM, Ajay Askoolum wrote:
> Is there a way to tell whether an object name
> 1. is valid
> 2. is not going to cause a collision with an existing object by the same name?

For 1, you could put your names in a character vector x, then check whether x and make.names(x) are identical; if so, x contains syntactically valid names. (Do remember that almost anything can be a name if you put it in back quotes.) For 2, you could check exists(x) to find out whether objects with those names exist.

Duncan Murdoch
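For example (a small illustration of both checks; the names in x are made up):

x <- c("goodName", "not valid", ".2bad")
identical(x, make.names(x))  # FALSE here: not all of them are syntactically valid
x == make.names(x)           # per-name check
sapply(x, exists)            # TRUE wherever an object of that name already exists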
Re: [R] Error: unexpected '<' in when modifying existing functions
All of these tries lead to the same result:

1) First I defined kmeansnew with the content of kmeans, but leaving the <environment: namespace:stats> out. Then I ran at the command line:
   environment(kmeansnew) <- environment(stats::kmeans)

2) kmeansnew <- kmeans() {
     environment(kmeansnew) <- environment(stats::kmeans)
   }

3) kmeansnew <- kmeans() {}
   environment(kmeansnew) <- environment(stats::kmeans)

When I do kmeansnew(iris[-5], 4) it returns:

   Error in do_one(nmeth) : object 'R_kmns' not found

'R_kmns' is a Fortran routine that is called by the original kmeans(). It is the same error as if I had just left <environment: namespace:stats> out.

On Sat, Jan 14, 2012 at 11:50 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
> I think you need to explain what "does not work" means. What did you do, and how do you know it didn't work?
> [snip]
> Don't use load() on a source file. Use load() on a binary file produced by save(). You could save() your working function, but then you can't edit it outside of R. To produce a .R file that you can use in another session, you're going to need to produce the function, then modify the environment, using 2 or 3 above.
> Duncan Murdoch
Re: [R] multidimensional array calculation
Dear Jean,
Thank you, expand.grid was the function I needed.
/johannes

> See ?expand.grid. For example,
>
>   df <- expand.grid(L = L, AR = AR, SO = SO, T = T)
>   df$y <- fun(df$L, df$AR, df$SO, df$T)
>
> Jean
>
> Johannes Radinger wrote on 01/13/2012 12:28:46 PM:
>
>> Hello,
>> probably it is quite easy but I can't get it: I have multiple numeric vectors and a function using all of them to calculate a new value:
>>
>>   L <- c(200, 400, 600)
>>   AR <- c(1.5)
>>   SO <- c(1, 3, 5)
>>   T <- c(30, 365)
>>   fun <- function(L, AR, SO, T){
>>     exp(L * AR + sqrt(SO) * log(T))
>>   }
>>
>> How can I get an array or data frame where all possible combinations of the factors are listed and the new value is calculated? I thought about an array like
>>
>>   array(NA, dim = c(3, 1, 3, 2),
>>         dimnames = list(c("200", "400", "600"), c("1.5"),
>>                         c("1", "3", "5"), c("30", "365")))
>>
>> but how can I get the array populated according to the function? As I want to end up with a 2D data frame, I will probably use the melt.array() function from the reshape package -- or is there another way to simply get such a full-factorial data frame with all possible combinations?
>>
>> Best regards,
>> Johannes
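Assembled into one runnable piece (the same code as above, just put together):

L <- c(200, 400, 600); AR <- 1.5; SO <- c(1, 3, 5); T <- c(30, 365)
fun <- function(L, AR, SO, T) exp(L * AR + sqrt(SO) * log(T))
df <- expand.grid(L = L, AR = AR, SO = SO, T = T)  # all combinations
df$y <- fun(df$L, df$AR, df$SO, df$T)              # value for each combination
head(df)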
Re: [R] fUtilities removed -- use fBasics
>>>>> David Winsemius dwinsem...@comcast.net
>>>>>     on Fri, 13 Jan 2012 13:52:57 -0500 writes:

  > On Jan 13, 2012, at 12:33 PM, Dominic Comtois wrote:
  >> When setting up my new machine, I had the surprise to see that package 'fUtilities' was removed from the CRAN repository.

  > https://stat.ethz.ch/pipermail/rmetrics-core/2012-January/000554.html
  > https://stat.ethz.ch/pipermail/rmetrics-core/2011-November/000549.html

Indeed. Thank you, David (and Google, I presume ..)

  >> This is problematic for my work. I use many of its functions, and it will complicate things a lot if other programmers want to use my previous code in the future. Plus, nowhere can I find the justification for its removal.

For a longer time, the Rmetrics management had planned to deprecate fUtilities (and fSeries and fCalendar), basically refactoring the functionality ``approximately'' along these lines:

  old package   replacement package
  -----------   -------------------
  fUtilities    fBasics
  fSeries       timeSeries
  fCalendar     timeDate

but clearly not as a 1:1 replacement -- a refactoring, as said above. fBasics indeed 'Depends' on both timeSeries and timeDate, so I think it is safe to say that you should replace fUtilities by fBasics everywhere ... and things should work.

Yes, the communication about these plans was not put out the way it should have been; and indeed the deprecation should not necessarily have meant that the package be dropped without proper notice. One excuse has been the lack of resources and health on the side of Rmetrics.

Disclaimer: I am one of rmetrics-c...@r-project.org, having been an active co-maintainer of some parts of the Rmetrics collection, but I have not been part of the management nor the foundation.

Martin Maechler, ETH Zurich

  > You need to send your questions to the maintainers. They apparently did not respond to the requests to fix the errors.

  >> Thanks for any info on this

  > You should perhaps subscribe to the list that is established for discussion on this and related packages.

  > -- David Winsemius, MD, West Hartford, CT
Re: [R] Error: unexpected '<' in when modifying existing functions
On 12-01-14 6:08 AM, Rui Esteves wrote:
> All of these tries lead to the same result:
> 1) First I defined kmeansnew with the content of kmeans, but leaving the <environment: namespace:stats> out. Then I ran at the command line:
>    environment(kmeansnew) <- environment(stats::kmeans)
> 2) kmeansnew <- kmeans() { environment(kmeansnew) <- environment(stats::kmeans) }
> 3) kmeansnew <- kmeans() {}
>    environment(kmeansnew) <- environment(stats::kmeans)
> When I do kmeansnew(iris[-5], 4) it returns:
>    Error in do_one(nmeth) : object 'R_kmns' not found
> 'R_kmns' is a Fortran routine that is called by the original kmeans(). It is the same error as if I had just left <environment: namespace:stats> out.

Number 1 is what you should do. When you do that and print kmeansnew in the console, does it list the environment at the end? What does environment(kmeansnew) print?

Duncan Murdoch

> [earlier messages snipped]
[R] Date/time
Hey guys,
I have been trying for some time to nicely plot some of my data. I have around 1800 values for some light intensity, taken every hour of the day over almost 2 months. My data file looks like:

     Date      Time. GMT.02.00   Intensity
  1  06.10.11  11:00:00 AM       x
  2  06.10.11  12:00:00 PM       x
  3  06.10.11  01:00:00 PM       x
  4  06.10.11  02:00:00 PM       x

As I am pretty new to R, I am totally struggling with this issue. Does anyone have an idea of how I could plot the data nicely, and whether I need to change my data file?

Thanks a lot for your help
Re: [R] The Future of R | API to Public Databases
Spencer

I highly appreciate your input. What we need is a standard for statistics. That may reinvent the way we see data. The recent crisis is the best proof that we are lost in our own generated information overload. The traditional approach is not working anymore. Finding the right members for the initial committee would be the hardest but most important part.

Another point is that I am only a student of 21 years, with limited financial capabilities with respect to what I can commit to such a kind of work. But I have my motivation, which is the *real* engine to advance an idea. I am open to working on it in my spare time. Over time I would become an expert in my own field; that is implicit in such a decision. I don't have the background of a statistician, but I know what the relevance of data is. It may be a solution that a newcomer gives a new perspective. Starting from scratch is at some point beneficial. It will be even harder for a person like me to convince the experienced professionals to overcome their own conventional schemes and procedures, because my approach would pay no respect to the established ones. Why the hell should I know it better than the experts? I respect single solutions; they might work in a specific situation, but they make it impossible to put everything together into the big picture which is finally required.

I am really interested in leading the initiative for such a new standard. My problem is how to start. Would a scientific paper which proposes the development of a standard be a starting point?

Benjamin

On 14 January 2012 08:19, Spencer Graves spencer.gra...@structuremonitoring.com wrote:
> A traditional way to exit a chaotic situation as you describe is to try to establish a standards committee, invite participation from suppliers and users of whatever (data in this case), apply for registration with the International Standards Organization, organize meetings, and draft and circulate a proposed standard, etc. A statistician who had published maybe 100 papers and 3 books told me that his work on ISO 9000 (I think) made a larger contribution to humanity than anything else he had done. Work on standards is one of the most boring, tedious activities I can imagine -- and can potentially be the most impactful thing one does in this life: If you have an ISO standard number for something, people who are starting something new may find it and follow it. People who are working to upgrade something may tell their management, "Let's follow this standard." Customers sometimes ask their suppliers to follow it; if you follow the standard, you might get more customers.
>
> I think you could get support for such a standards effort from the American Association for the Advancement of Science, the American Economics Association, the American Statistical Association, and many other organizations, including many on-line science journals that today pressure authors to put the data behind their published papers in the public domain, downloadable from their web site, etc.
>
> IMHO. Spencer
>
> On 1/13/2012 3:39 PM, Benjamin Weber wrote:
>> The whole issue is related to the mismatch of (1) the publisher of the data and (2) the user at the rendezvous point. Both the publisher and the user don't know anything about the rendezvous point. Both want to meet but don't meet in reality. The user wastes time finding the rendezvous point defined by the publisher. The publisher assumes any rendezvous point.
>>
>> Given the number of publishers, the variety of the fields and the flavor of each expert, we end up in today's data world. Everyone has to waste his precious time to find out the rendezvous point. Only experts know which corner to focus their search on -- but even they need time to find what they want. However, each expert (of each profession) believes that his approach is the best one in the world. Finally we have a state of total confusion, where only experts can handle the information and non-experts cannot even access the data without diving fully into the flood of data and its specialities. That's my point: data is not accessible.
>>
>> The discussion should follow a strategic approach:
>> - Is the classical csv file (in all its varieties) the simplest and best way?
>> - Isn't it the responsibility of the R community to recommend standards for different kinds of data?
>>
>> With the existence of this rendezvous point the publisher would know a specific point which is favorable from the user's point of view. That is missing. Only a rendezvous point defined by the community can be a 'known' rendezvous point for all stakeholders, globally. I do believe that the publisher's greatest interest is data accessibility. Where is the toolkit we provide them to enable them to serve us the data exactly as we want it? No, we just try to build even more packages to be lost in the noise of information. I disagree with a proposed solution to
Re: [R] tm package, custom reader
On Friday 13 January 2012 at 09:00 -0800, pl.r...@gmail.com wrote:
> I need help with creating a custom XML reader for use with the tm package. The objective is to create a corpus for analysis. The files that I'm working with come from solr and are in a funky XML format; nevertheless I'm able to parse the XML files using the solrDocs.R function provided by Duncan Temple Lang. The problem I'm having is that once I parse the document, I need to create a custom reader that would be compatible with the tm package. If someone has built a custom reader for the tm package, or has some ideas of how to go about this, I would greatly appreciate the help.

I've just written a custom XML source for tm a few days ago, so I guess I can help. First, tm has a document explaining how to write an XML reader [1], and it's relatively easy. However, I think you shouldn't base your tm reader on the functions in solrDocs.R, since they don't share the same structure as what tm expects. But you can probably adapt the code from there.

To sum up how tm extensions work, you should have one function parsing the XML file and returning one XML string for each document in a corpus: this is the source. And one function parsing these per-document XML strings, and filling the document's body and meta-data from the XML tags. I think your code can be simpler than solrDocs.R since you probably know beforehand which tags are useful for you, which aren't, and what their types are.

Feel free to ask for help on specific issues you may have. But please provide a short XML example (and possibly code). Also, when you're done, please consider making this available, either from tm itself or from a new package, if it can be useful to others.

Regards

1: http://cran.r-project.org/web/packages/tm/vignettes/extensions.pdf
Re: [R] Date/time
What exactly is your problem? People quite like the zoo package for handling and plotting time series: perhaps that will work for you?

Michael

On Sat, Jan 14, 2012 at 4:35 AM, claire5 claire.moran...@free.fr wrote:
> Hey guys,
> I have been trying for some time to nicely plot some of my data. I have around 1800 values for some light intensity, taken every hour of the day over almost 2 months. My data file looks like:
>
>      Date      Time. GMT.02.00   Intensity
>   1  06.10.11  11:00:00 AM       x
>   2  06.10.11  12:00:00 PM       x
>   3  06.10.11  01:00:00 PM       x
>   4  06.10.11  02:00:00 PM       x
>
> As I am pretty new to R, I am totally struggling with this issue. Does anyone have an idea of how I could plot the data nicely, and whether I need to change my data file?
> Thanks a lot for your help
Re: [R] Date/time
Well, I am not sure how to use the zoo package, to be honest. I am trying to plot all 1800 data points in the same graph, but of course it looks super messy, and R does not really recognize the time data input. So I just want to plot the time series, kind of. The problem is that the x value is date and time, and I don't know how to tell R that yet. I would like it to be a line then, no points -- I guess a very long line. y would have the light data and x the time. And, though it may sound unrealistic, it would be great to have just the days on the x axis, not each value for every hour. I hope I am clear; somehow it is really unclear in my head as well :)
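One way to get there with base R (a sketch; the data frame and column names here -- mydata, Date, Time, AMPM, Intensity -- are assumptions about how the file was read in):

# combine the three date/time columns into a POSIXct time stamp
mydata$when <- as.POSIXct(paste(mydata$Date, mydata$Time, mydata$AMPM),
                          format = "%m.%d.%y %I:%M:%S %p")
# a line, no points, and no default hourly tick labels
plot(Intensity ~ when, data = mydata, type = "l", xaxt = "n")
# label the x axis by day only (use by = "week" if that is too crowded)
axis.POSIXct(1, at = seq(min(mydata$when), max(mydata$when), by = "day"),
             format = "%d %b")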
Re: [R] The Future of R | API to Public Databases
Web services are only part of the problem. In essence, there are at least two facets:

1. downloading the data using some protocol
2. mapping the data to a common model

Having #1 makes the import/download easier, but it really becomes useful when both are included. I think #2 is the harder problem to address. Software can usually be written to handle #1 by making a useful abstraction layer. #2 means that data has consistent names and meanings, and this requires people to agree on common definitions and a common naming convention. RDF (Resource Description Framework) and its related technologies (SPARQL, OWL, etc.) are one of the many attempts to address this. While this effort would benefit R, I think it's best if it's part of a larger effort. Services such as DBpedia and Freebase are trying to unify many data sets using RDF.

The task view and package ideas are great ideas. I'm just adding another perspective.

Jason

On 01/13/2012 05:18 PM, Roy Mendelssohn wrote:
> Hi Benjamin:
>
> What would make this easier is if these sites used standardized web services, so it would only require writing once. data.gov is the worst example; they spun their own, weak service. There is a lot of environmental data available through OPeNDAP, and that is supported in the ncdf4 package. My own group has a service called ERDDAP that is entirely RESTful; see:
>
> http://coastwatch.pfel.noaa.gov/erddap
> and
> http://upwell.pfeg.noaa.gov/erddap
>
> We provide R (and Matlab) scripts that automate the extract for certain cases; see:
>
> http://coastwatch.pfeg.noaa.gov/xtracto/
>
> We also have a tool called the Environmental Data Connector (EDC) that provides a GUI from within R (and ArcGIS, Matlab and Excel) that allows you to subset data served by OPeNDAP, ERDDAP, and certain Sensor Observation Service (SOS) servers, and have it read directly into R. It is freely available at:
>
> http://www.pfeg.noaa.gov/products/EDC/
>
> We can write such tools because the service is either standardized (OPeNDAP, SOS) or is easy to implement (ERDDAP).
>
> -Roy
>
> On Jan 13, 2012, at 1:14 PM, Benjamin Weber wrote:
>> Dear R Users -
>> R is a wonderful software package. CRAN provides a variety of tools to work on your data. But R is not apt to utilize all the public databases in an efficient manner. I observed that the most tedious part with R is searching and downloading the data from public databases and putting it into the right format. I could not find a package on CRAN which offers exactly this fundamental capability. Imagine R as the unified interface to access (and analyze) all public data in the easiest way possible. That would create a real impact, would put R a big leap forward, and would enable us to see the world with different eyes. There is a lack of a direct connection to the APIs of these databases, to name a few:
>> - Eurostat
>> - OECD
>> - IMF
>> - Worldbank
>> - UN
>> - FAO
>> - data.gov
>> - ...
>> The ease of access to the data is the key to information processing with R. How can we handle the flow of information noise? R has to give an answer to that with an extensive API to public databases. I would love your comments and ideas as a contribution to a vital discussion.
>> Benjamin
>
> ** The contents of this message do not reflect any position of the U.S. Government or NOAA.
>
> ** Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS Environmental Research Division
> Southwest Fisheries Science Center
> 1352 Lighthouse Avenue, Pacific Grove, CA 93950-2097
> e-mail: roy.mendelss...@noaa.gov (Note new e-mail address)
> voice: (831)-648-9029  fax: (831)-648-8440  www: http://www.pfeg.noaa.gov/
>
> "Old age and treachery will overcome youth and skill."
> "From those who have been given much, much will be expected."
> "The arc of the moral universe is long, but it bends toward justice." - MLK Jr.
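Services like ERDDAP typically return plain-text formats such as CSV over HTTP, which R can read directly without any special client; a generic sketch (the URL below is a placeholder, not a real endpoint):

# read a CSV response from a RESTful data service straight into R
dat <- read.csv("http://example.org/erddap/tabledap/someDataset.csv?time,temperature",
                stringsAsFactors = FALSE)
str(dat)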
Re: [R] Date/time
On Sat, Jan 14, 2012 at 4:35 AM, claire5 claire.moran...@free.fr wrote:
> I have been trying for some time to nicely plot some of my data. I have around 1800 values for some light intensity, taken every hour of the day over almost 2 months. My data file looks like:
>
>      Date      Time. GMT.02.00   Intensity
>   1  06.10.11  11:00:00 AM       x
> [snip]
>
> As I am pretty new to R, I am totally struggling with this issue. Does anyone have an idea of how I could plot the data nicely, and whether I need to change my data file?

With the zoo package it's as follows. For the actual data, which resides in a file rather than in a character string Lines, we would replace text = Lines with something like "myfile.dat".

# sample data
Lines <- "Date Time. GMT.02.00 Intensity
1 06.10.11 11:00:00 AM 1
2 06.10.11 12:00:00 PM 2
3 06.10.11 01:00:00 PM 3
4 06.10.11 02:00:00 PM 4"

library(zoo)
z <- read.zoo(text = Lines, index = 1:3, tz = "", format = "%m.%d.%y %r")
plot(z)

We might alternatively want to use chron dates/times to avoid time zone problems later (as per R News 4/1). In that case it would be:

library(zoo)
library(chron)
toChron <- function(d, t, p) as.chron(paste(d, t, p), format = "%m.%d.%y %r")
z <- read.zoo(text = Lines, index = 1:3, FUN = toChron)
plot(z)

Note that in both cases we could omit header = TRUE because there is one more data column than header columns, so it can deduce the correct header= value. Read the 5 zoo vignettes, particularly the one on read.zoo, as well as the help files, for more info.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] HOW TO CHANGE THE FONT TYPE OF THE NUMBERS IN THE X-Y AXIS in the (barplot) GRAPH?
Dear all,
I have trouble where I have to make all the fonts in my graphs Times New Roman. I now know how to change fonts for the x-axis and y-axis labels (from http://www.statmethods.net/advgraphs/parameters.html), but HOW CAN I ALSO CHANGE THE FONT OF THE NUMBERS INTO Times New Roman?

Thank you very much in advance,
Kind regards,
YAKAMU
Re: [R] Change state names to abbreviations in an irregular list of names, abbreviations, null values, and foreign provinces
David Kikuchi dkikuchi at email.unc.edu writes:

> I'm trying to create maps of reptile abundance in different states' counties using data from Herp.net, which provides lists of specimens with the places where they were found. First I would like to parse the list by state using 2-letter abbreviations, since I'm focusing on certain regions. To do this, I've been trying to create a vector (state2) that gives all state names as 2-letter abbreviations, using advice given in the thread: http://tolstoy.newcastle.edu.au/R/help/05/09/12136.html
>
> [snip]
>
> state2 <- rep(NA, length(tener$State.Province))
> for(i in 1:length(tener$Institution)){
>   if(tener$State.Province[i] != ''){
>     if(grep(tener$State.Province[i], state.name) > 0){
>       state2[i] <- state.abb[grep(tener$State.Province[i], state.name)]
>     } else{
>       state2[i] <- NA
>     }
>   } else{
>     state2[i] <- NA
>   }
> }

I think you might be looking for length(grep(...)) > 0, but is this an easier way?

state.province <- c("Massachusetts", "Ontario", "Cuba", "", "Pennsylvania")
myabbr <- state.abb[match(state.province, state.name)]
myabbr
## [1] "MA" NA   NA   NA   "PA"

(You described your problem pretty clearly, but a reproducible example would have been nice.)
Re: [R] Quantiles in boxplot
On Jan 14, 2012, at 16:07, René Brinkhuis wrote:

> Based on your information I created a custom function for calculating the first and third quartile according to the 'boxplot logic'.

A more compact (though not as readable) version is afforded by stats:::fivenum. A convenient description is (I believe) that the hinges are the medians of the bottom and top halves of the sorted observations, with the middle observation counting in both halves if n is odd.

x <- rnorm(121)
fivenum(x)
## [1] -2.4596038 -0.6034689  0.1105829  0.6686026  2.2580863
median(sort(x)[1:floor((length(x)+1)/2)])
## [1] -0.6034689
median(sort(x)[ceiling((length(x)+1)/2):length(x)])
## [1] 0.6686026

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Re: [R] The Future of R | API to Public Databases
LOL, I remember posting about this in the past. The US gov agencies vary, but most are quite good. The big problem appears to be people who push proprietary or commercial standards for which only one effective source exists. Some formats, like Excel and PDF, come to mind, and there is a disturbing trend towards their adoption in some places where raw data is needed by many. The best thing to do is contact the information provider and let them know you want raw data, not images or stuff that works only in limited commercial software packages. Often data sources are valuable, and the revenue model impacts availability. If you are just arguing over different open formats, it is usually easy for someone to write some conversion code and publish it -- CSV to JSON would not be a problem, for example. Data of course are quite variable, and there is nothing wrong with giving the provider his choice.

> Date: Sat, 14 Jan 2012 10:21:23 -0500
> From: ja...@rampaginggeek.com
> To: r-help@r-project.org
> Subject: Re: [R] The Future of R | API to Public Databases
>
> Web services are only part of the problem. In essence, there are at least two facets:
> 1. downloading the data using some protocol
> 2. mapping the data to a common model
> [snip]
>
> On 01/13/2012 05:18 PM, Roy Mendelssohn wrote:
>> What would make this easier is if these sites used standardized web services, so it would only require writing once.
>> [snip]
>>
>> On Jan 13, 2012, at 1:14 PM, Benjamin Weber wrote:
>>> Dear R Users -
>>> R is a wonderful software package. CRAN provides a variety of tools to work on your data. But R is not apt to utilize all the public databases in an efficient manner.
>>> [snip]
>>
>> ** The contents of this message do not reflect any position of the U.S. Government or NOAA.
>> ** Roy Mendelssohn, Supervisory Operations Research Analyst, NOAA/NMFS Environmental Research Division, Southwest Fisheries Science Center
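The CSV-to-JSON conversion Mike mentions is indeed only a few lines of R; a sketch using the RJSONIO package (one of several JSON packages; the file names are placeholders):

library(RJSONIO)
dat <- read.csv("input.csv", stringsAsFactors = FALSE)  # any rectangular CSV
cat(toJSON(dat), file = "output.json")                  # write column-wise JSON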
Re: [R] Averaging within a range of values
On Fri, Jan 13, 2012 at 6:34 AM, doggysaywhat chwh...@ucsd.edu wrote:

> Hello all. I have two data frames.
>
>   Group Start  End
>   G1      200  700
>   G2      500 1000
>   G3     2000 3000
>   G4     4000 6000
>   G5     7000 8000
>
> and
>
>    Pos  C0  C1
>    200 0.9 0.6
>    500 0.8 0.8
>    800 0.9 0.7
>   1000 0.7 0.6
>   2000 0.6 0.4
>   2500 1.2 0.8
>   3000 0.6 1.5
>   3500 0.7 0.7
>   4000 0.8 0.8
>   4500 0.6 0.6
>   5000 0.9 0.9
>   5500 0.7 0.8
>   6000 0.8 0.7
>   6500 0.4 0.4
>   7000 0.5 0.8
>   7500 0.7 0.9
>   8000 0.9 0.5
>   8500 0.8 0.6
>   9000 0.9 0.8
>
> I need to conditionally average all values in columns C0 and C1 based upon the bins I defined in the first data frame. For example, for the bin G1 in the first data frame, the values are 200 to 700, so I would average the value at Pos 200 (0.9) and 500 (0.8) for C0 and then perform the same thing for C1. I can do this in Excel with array formulas, but I'm relatively new to R and would like to know if there is a function that will perform the same action. I don't know if this will help, but the Excel array formula I used was AVERAGE(IF((range>=start)*(range<=end), range)), where the range is the entire Pos column. Initially I looked at the aggregate function. I can use aggregate when I give a single vector to be used for grouping, such as (A,B,C), but I'm not sure how to define grouping as the bin 200-700, the second bin as 500-1000, etc., and use that as my grouping vector.

Here is an sqldf solution where the two input data frames are d1 and d2 (as in Jeff's post). Note that Group is quoted since it's an SQL keyword:

library(sqldf)
sqldf("select d1.'Group', avg(d2.C0), avg(d2.C1)
       from d1, d2
       where d2.Pos between d1.Start and d1.End
       group by d1.'Group'")

The result is:

  Group avg(d2.C0) avg(d2.C1)
1    G1       0.85      0.700
2    G2       0.80      0.700
3    G3       0.80      0.900
4    G4       0.76      0.760
5    G5       0.70      0.733

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
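The same aggregation can also be done in base R (a sketch, not taken from the thread; it assumes the same d1 and d2, and it handles the overlapping bins by looping over the rows of d1):

# for each bin, average the C0/C1 values whose Pos falls inside it
avgs <- t(sapply(seq_len(nrow(d1)), function(i) {
  sel <- d2$Pos >= d1$Start[i] & d2$Pos <= d1$End[i]
  colMeans(d2[sel, c("C0", "C1")])
}))
data.frame(Group = d1$Group, avgs)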
Re: [R] The Future of R | API to Public Databases
Mike

We see that the publishers are aware of the problem. They don't think that the raw data is usable for the user; consequently, they acknowledge this fact with the proprietary formats. Yes, they have resigned in the face of the information overload. That's pathetic.

It is not a question of *which* data format; it is a question about the general concept. Where do publisher and user meet? There has to be one *defined* point which all parties agree on. I disagree with your statement that the publisher should just publish csv or cook his own API. That leads to fragmentation and inaccessibility of data. We want data to be accessible. A more pragmatic approach is needed to revolutionize the way we go about raw data.

Benjamin

On 14 January 2012 22:17, Mike Marchywka marchy...@hotmail.com wrote:
> LOL, I remember posting about this in the past. The US gov agencies vary, but most are quite good. The big problem appears to be people who push proprietary or commercial standards for which only one effective source exists. Some formats, like Excel and PDF, come to mind, and there is a disturbing trend towards their adoption in some places where raw data is needed by many. The best thing to do is contact the information provider and let them know you want raw data, not images or stuff that works only in limited commercial software packages.
> [snip]
>
>> Date: Sat, 14 Jan 2012 10:21:23 -0500
>> From: ja...@rampaginggeek.com
>> Subject: Re: [R] The Future of R | API to Public Databases
>>
>> Web services are only part of the problem. In essence, there are at least two facets:
>> 1. downloading the data using some protocol
>> 2. mapping the data to a common model
>> [snip]
>>
>> On 01/13/2012 05:18 PM, Roy Mendelssohn wrote:
>>> What would make this easier is if these sites used standardized web services, so it would only require writing once. My own group has a service called ERDDAP that is entirely RESTful.
>>> [snip]
>>>
>>> On Jan 13, 2012, at 1:14 PM, Benjamin Weber wrote:
>>>> Dear R Users -
>>>> R is a wonderful software package. CRAN provides a variety of tools to work on your data. But R is not apt to utilize all the public databases in an efficient manner. I could not find a package on CRAN which offers exactly this fundamental capability. Imagine R as the unified interface to access (and analyze) all public data in the easiest way possible. There is a lack of a direct connection to the APIs of these databases, to name a few:
>>>> - Eurostat
>>>> - OECD
>>>> - IMF
>>>> - Worldbank
>>>> - UN
>>>> - FAO
>>>> - data.gov
>>>> - ...
>>>> The ease of access to the data is the key to information processing with
Re: [R] The Future of R | API to Public Databases
I have been following this thread, but there are many aspects of it which are unclear to me. Who are the publishers? Who are the users? What is the problem? I have a vague sense for some of these, but it seems to me that one valuable starting place would be creating a document that clarifies everything. It is easier to tackle a concrete problem (e.g., agree on a standard numerical representation of dates and times a la ISO 8601) than something diffuse (e.g., information overload).

Good luck,
Josh

On Sat, Jan 14, 2012 at 10:02 AM, Benjamin Weber m...@bwe.im wrote:
> Mike
>
> We see that the publishers are aware of the problem. They don't think that the raw data is usable for the user; consequently, they acknowledge this fact with the proprietary formats. Yes, they have resigned in the face of the information overload. That's pathetic.
>
> It is not a question of *which* data format; it is a question about the general concept. Where do publisher and user meet? There has to be one *defined* point which all parties agree on. I disagree with your statement that the publisher should just publish csv or cook his own API. That leads to fragmentation and inaccessibility of data. We want data to be accessible. A more pragmatic approach is needed to revolutionize the way we go about raw data.
>
> Benjamin
>
> On 14 January 2012 22:17, Mike Marchywka marchy...@hotmail.com wrote:
>> LOL, I remember posting about this in the past. The US gov agencies vary, but most are quite good. The big problem appears to be people who push proprietary or commercial standards for which only one effective source exists.
>> [snip]
>>
>>> On 01/13/2012 05:18 PM, Roy Mendelssohn wrote:
>>>> What would make this easier is if these sites used standardized web services, so it would only require writing once. data.gov is the worst example; they spun their own, weak service. There is a lot of environmental data available through OPeNDAP, and that is supported in the ncdf4 package. My own group has a service called ERDDAP that is entirely RESTful.
>>>> [snip]
>>>>
>>>> On Jan 13, 2012, at 1:14 PM, Benjamin Weber wrote:
>>>>> Dear R Users -
>>>>> R is a wonderful software package. CRAN provides a variety of tools to work on your data. But R is not apt to utilize all the public databases in an efficient manner. I observed that the most tedious part with R is searching and downloading the data from public databases and putting it into the
[R] How can I change font type in graph (including all the text in legend, and the numbers in x-y axis)
Dear all,
I would like to make a survival analysis graph line with all fonts in Times New Roman, including all the numbers on the x-y axes and the legend explanation. I know how to change fonts for the x-y axis labels (from http://www.statmethods.net/advgraphs/parameters.html) and this is what I did:

# SURVIVAL PLOT
colsurvival <- c("black", "black", "black", "black")
windowsFonts(A = windowsFont("Times New Roman"))
plot(fit1, lty = c(2, 1, 4, 3), lwd = 2, col = colsurvival,
     yscale = 100, frame.plot = FALSE)
title(xlab = "results", cex.lab = 1.3, cex.axis = 1.3,
      ylab = "percentage survival", family = "A")
legend("bottomleft", ...etc...)

I have the titles all in Times New Roman, but not the numbers on the x-y axes. Is there anyone who can help me here?

Thank you very much in advance,
Kind regards,
Yakamu
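One hedged possibility (a sketch, assuming a Windows graphics device and the fit1 object above): setting the family via par() before plotting makes it the default for the whole plot, including the axis annotation, e.g.

windowsFonts(A = windowsFont("Times New Roman"))
par(family = "A")  # default font family, which also covers the axis numbers
plot(fit1, lty = c(2, 1, 4, 3), lwd = 2, yscale = 100, frame.plot = FALSE)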
Re: [R] fUtilities removed -- use fBasics
Thank you for your prompt and useful replies. I will be using fBasics from now on. Regards, Dominic Comtois, Montréal -----Original Message----- From: Martin Maechler [mailto:maech...@stat.math.ethz.ch] Sent: 14 January 2012 06:39 To: David Winsemius Cc: Dominic Comtois; r-help@r-project.org; rmetrics-c...@r-project.org Subject: Re: [R] fUtilities removed -- use fBasics David Winsemius dwinsem...@comcast.net on Fri, 13 Jan 2012 13:52:57 -0500 writes: On Jan 13, 2012, at 12:33 PM, Dominic Comtois wrote: When setting up my new machine, I had the surprise to see that package 'fUtilities' was removed from the CRAN repository. https://stat.ethz.ch/pipermail/rmetrics-core/2012-January/000554.html https://stat.ethz.ch/pipermail/rmetrics-core/2011-November/000549.html indeed. thank you David (and Google, I presume ..) This is problematic for my work. I use many of its functions, and it will complicate things a lot if other programmers want to use my previous code in the future. Plus, nowhere can I find the justification for its removal. For a long time, the Rmetrics management had planned to deprecate fUtilities (and fSeries and fCalendar), basically refactoring the functionality ``approximately'' along the lines of

old package   replacement package
fUtilities    fBasics
fSeries       timeSeries
fCalendar     timeDate

but clearly not a 1:1 replacement, rather a refactoring as said above. fBasics indeed 'Depends' on both timeSeries and timeDate, so I think it is safe to say that you should replace fUtilities by fBasics everywhere ... and things should work. Yes, the communication about these plans was not put out the way it should have been; and indeed the deprecation would not necessarily have meant that the package be dropped without proper notice. One excuse has been the lack of resources and health on the side of Rmetrics. Disclaimer: I am one of rmetrics-c...@r-project.org, having been an active co-maintainer of some parts of the Rmetrics collection, but I have not been part of the management nor the foundation. Martin Maechler, ETH Zurich You need to send your questions to the maintainers. They apparently did not respond to the requests to fix the errors. Thanks for any info on this You should perhaps subscribe to the list that is established for discussion on this and related packages. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] metafor: weights computation in Mantel-Haenszel method
Dear R users, In metafor 1.6-0, the Mantel-Haenszel method is implemented by the rma.mh() function. I have observed that the sum of the weights computed by weights(x) doesn't add up to 100% when x is an object of class "rma.mh". The consequences of this fact can be clearly seen when a forest diagram is drawn with forest(x), which calls weights(x) (or more precisely, the method weights.rma.mh() defined in the package). Is this, as I suppose, a bug? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
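For anyone wanting to reproduce the observation, a minimal sketch using the BCG vaccine data shipped with metafor (assuming the rma.mh() interface of metafor 1.6-0 as described in the post; the exact numbers depend on the data and version):

library(metafor)
res <- rma.mh(measure = "OR", ai = tpos, bi = tneg,
              ci = cpos, di = cneg, data = dat.bcg)
w <- weights(res)  # dispatches to weights.rma.mh()
sum(w)             # per the report, this need not equal 100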
Re: [R] Nabble? Was Re: function to replace values doesn't work on vectors
On 12-01-13 11:48 AM, Sarah Goslee wrote: ... I hope that it was a momentary glitch; greater disagreement between Nabble and the email list will cause all sorts of fun. If the interface, whatever it is, starts stripping out code? I'll have to quit answering Nabble queries entirely. You know, that's a great suggestion. I'm now filtering out all messages with nabble.com in the Message-ID. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tm package, custom reader
On Sat, Jan 14, 2012 at 12:41 PM, Milan Bouchet-Valat nalimi...@club.fr wrote: On Saturday, 14 January 2012 at 12:24 -0600, Andy Adamiec wrote: Hi Milan, The xml solr files are not in a typical format, here is an example http://www.omegahat.org/RSXML/solr.xml I'm not sure how to parse the documents without using the solrDocs.R function, and how to make the function compatible with the tm package. Indeed, this doesn't seem to be easy to parse using the generic XML source from tm. So it will be easier for you to create your own custom source from scratch. Have a look at the source.R and reader.R files in the tm source: you need to replicate the behavior of one of the sources. The code should include the following functions:

readSorl <- FunctionGenerator(function(...) {
    function(elem, language, id) {
        # Use elem$content, which contains an item set by SorlSource() below,
        # and create a PlainTextDocument() from it,
        # putting the data where appropriate (text, meta-data)
    }
})

SorlSource <- function(x) {
    # Parse the XML file using functions from solrDocs.R, and
    # create 'content', which is a list with one item for each document,
    # to pass to readSorl() one by one
    s <- tm:::.Source(readSorl, "UTF-8", length(content), FALSE,
                      seq(1, length(content)), 0, FALSE)
    s$Content <- content
    s$URI <- match.call()$x
    class(s) <- c("SorlSource", "Source")
    s
}

getElem <- function(x) UseMethod("getElem", x)
getElem.SorlSource <- function(x) {
    list(content = x$Content[[x$Position]], uri = match.call()$x)
}
eoi <- function(x) UseMethod("eoi", x)
eoi.SorlSource <- function(x) length(x$Content) <= x$Position

Hope this helps [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
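If the skeleton above were filled in, usage would presumably look like the following (hedged: this assumes the tm source/reader API of that era, and solr.xml is the example file linked above):

# Hypothetical usage of the custom source sketched above
library(tm)
src <- SorlSource("solr.xml")
corpus <- Corpus(src, readerControl = list(reader = readSorl, language = "en"))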
Re: [R] How can I change font type in graph (including all the text in legend, and the number in x-y axis)
On Jan 14, 2012, at 2:19 PM, Yakamu Yakamu wrote: Dear all, I would like to make a survival analysis graph line with all fonts in Times New Roman, including all the numbers in the x-y axis and the legend explanation. I know how to change fonts for the x-y axis labels (from http://www.statmethods.net/advgraphs/parameters.html ) and this is what I did: # SURVIVAL PLOT colsurvival <- c("black", "black", "black", "black") windowsFonts(A=windowsFont("Times New Roman")) plot(fit1, lty=c(2, 1, 4, 3), lwd=2, col=colsurvival, yscale=100, frame.plot=FALSE) title(xlab="results", cex.lab=1.3, cex.axis=1.3, ylab="percentage survival", family="A") legend("bottomleft", ...etc...) I have the titles all in Times New Roman, but not the numbers in the x-y axis. (Since you only passed "A" as an argument to `title`, why would this be expected to bleed over into the axis? I doubt that cex.axis is having any effect, either.) Is there anyone who can help me here? Thank you very much in advance, You may want to see if passing a family argument to `plot` has an effect on what is eventually a call to `axis`. That's also (probably) where you should be inserting the cex.axis. Cannot test since I don't use Windows (and you didn't include a reproducible sample, anyway.) (In other situations the font argument is often a number rather than the result of a call to a font function. See the par help page.) Kind regards, Yakamu [[alternative HTML version deleted]] You should learn to post in plain text and PLEASE stop replying to existing threads when you are submitting a new question. It screws up the threading. PLEASE do read the posting guide http://www.R-project.org/posting-guide.html AND provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
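One way to test David's suggestion is to set the family for the whole device before plotting, so the axis annotation inherits it. A minimal sketch (Windows-only because of windowsFonts(); fit1 stands in for the poster's fitted survival object, and the legend labels are placeholders):

windowsFonts(A = windowsFont("Times New Roman"))
par(family = "A", cex.axis = 1.3)  # tick labels drawn by axis() now use the mapped font
plot(fit1, lty = c(2, 1, 4, 3), lwd = 2, yscale = 100, frame.plot = FALSE)
title(xlab = "results", ylab = "percentage survival", cex.lab = 1.3)
legend("bottomleft", legend = c("group 1", "group 2"), lty = c(2, 1))  # placeholder labels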
Re: [R] The Future of R | API to Public Databases
The situation for this kind of interface is much more advanced (for economic time series data) than has been suggested in other postings. Several of the organizations you mention support SDMX, and I believe there is a working R interface to SDMX which has not yet been made public. A more complete list of organizations that I think already have working server-side support for SDMX is: the OECD, Eurostat, the ECB, the IMF, the UN, the BIS, the Federal Reserve Board, the World Bank, the Italian Statistics agency, and to a small extent the Bank of Canada. I have a working API to several time series databases (TS* packages on CRAN), and a partially working interface to SDMX, but have postponed further development of that in the hope that the already working code will be made available. Please see http://tsdbi.r-forge.r-project.org/ for more details. I would, of course, be happy to have other developers involved in this project. If you think you can contribute then see r-forge.r-project.org for details on how to join projects. Paul On 12-01-14 06:00 AM, r-help-requ...@r-project.org wrote: Date: Sat, 14 Jan 2012 02:44:07 +0530 From: Benjamin Weber m...@bwe.im To: r-help@r-project.org Subject: [R] The Future of R | API to Public Databases Dear R Users - R is a wonderful software package. CRAN provides a variety of tools to work on your data. But R is not apt to utilize all the public databases in an efficient manner. I observed the most tedious part with R is searching and downloading the data from public databases and putting it into the right format. I could not find a package on CRAN which offers exactly this fundamental capability. Imagine R is the unified interface to access (and analyze) all public data in the easiest way possible. That would create a real impact, would put R a big leap forward and would enable us to see the world with different eyes. There is a lack of a direct connection to the API of these databases, to name a few: - Eurostat - OECD - IMF - Worldbank - UN - FAO - data.gov - ... The ease of access to the data is the key of information processing with R. How can we handle the flow of information noise? R has to give an answer to that with an extensive API to public databases. I would love your comments and ideas as a contribution in a vital discussion. Benjamin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] HOW To CHANGE THE TYPE OF NUMBER IN THE X-Y AXIS in the (barplot) GRAPH?
On 01/15/2012 02:16 AM, Yakamu Yakamu wrote: Dear all, I have troubles where I have to make all the fonts in my graphs into Times New Roman. I know now how to change fonts for the x-axis/y-axis labels (from http://www.statmethods.net/advgraphs/parameters.html ) but HOW CAN I ALSO CHANGE THE TYPE OF FONT FOR THE NUMBERS INTO Times New Roman? Hi Yakamu, Try this: par(family="times") plot(...) This changes the tick labels (which is what I think you want) to Times. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] simulating stable VAR process
Mark, statquant2 As I understand the question, it is not to test if a VAR is stable but how to construct a VAR that is stable and automatically satisfies the condition Mark has taken from Lutkepohl. The algorithm that I have set out will automatically satisfy that condition. The matrix that should be estimated by the algorithm is A on the last line of page 15 of Lutkepohl. Incidentally, the corresponding matrix for the example on page 15 is singular. The algorithm that I have set out will only lead to systems with a non-singular matrix. I still don't see how a matrix generated in this way corresponds to a real economic system. Of course you may have some other constraints in mind that would make the generated system correspond to something more real. John On Saturday, 14 January 2012, Mark Leeds marklee...@gmail.com wrote: Hi statquant2 and john: In the first chapter of Lutkepohl, it is shown that stability of a VAR(p) is the same as det(I_k - A1*z - ... - Ap*z^p) not equal to zero for |z| <= 1, where I_k - A1*z - ... - Ap*z^p is referred to as the reverse characteristic polynomial. So, statquant2, given your A's, one way to do it would be to check the roots of the polynomial implied by taking the determinant of your polynomial. There's an example on pg 17 of Lutkepohl if you have it. If you don't, I can fax it to you over the weekend if you want it. On Fri, Jan 13, 2012 at 8:34 PM, John C Frain fra...@gmail.com wrote: I think that you must approach this in a different way. 1 Draw a set of random eigenvalues with modulus less than 1. 2 Draw a set of random eigenvectors. 3 From these you can, with some matrix manipulations, derive the corresponding VAR coefficients. If your original coefficients were drawn at random I suspect that the VAR would not be stable. I am curious about what you are trying to do. John On Friday, 13 January 2012, statquant2 statqu...@gmail.com wrote: Hello Paul Thanks for the answer but my point is not how to simulate a VAR(p) process and check that it is stable. My question is more: how can I generate a VAR(p) such that I already know that it is stable? We know a condition that assures that it is stable (see first message) but this is not a condition on coefficients etc... What I want is to generate, say, 1000 random VAR(3) processes over, say, 500 time periods that will be STABLE (meaning if I run stability() all will pass the test). When I try to do that it seems that none of the VARs I am generating pass this test, so I assume that the class of stable VAR(p) processes is very small compared to the whole VAR(p) class. -- View this message in context: http://r.789695.n4.nabble.com/simulating-stable-VAR-process-tp4261177p4291835.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:fra...@tcd.ie mailto:fra...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
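To make Mark's criterion operational: the reverse characteristic polynomial condition is equivalent to all eigenvalues of the companion matrix having modulus less than 1, which is easy to check in R. A minimal sketch for a bivariate VAR(2) with arbitrary example coefficients (A1 and A2 are made up for illustration):

# Stability check for a VAR(p) via the companion matrix:
# the VAR is stable iff all companion eigenvalues have modulus < 1.
k <- 2
A1 <- matrix(c(0.5, 0.1, 0.2, 0.4), k, k)  # example coefficient matrices
A2 <- matrix(c(0.1, 0.0, 0.0, 0.1), k, k)
companion <- rbind(cbind(A1, A2),
                   cbind(diag(k), matrix(0, k, k)))
max(Mod(eigen(companion)$values)) < 1  # TRUE => stable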
Re: [R] par.plot() for repeated measurements
I am using the package gamlss in R to plot repeated measurements. The command I am using is par.plot(). It works great except for one thing about the labels of the axes. I tried to label both the x and y axes using the ylab and xlab options, but the plot only shows the variable names; the labels did not show up. Below is the code I used. Any comments are appreciated! Thanks.

library(gamlss)
enable2r <- read.csv("D:\\lzg\\jointmodel\\enable2r.csv", header = TRUE)
enable2r$ID <- factor(enable2r$ID)
par.plot(factpal ~ timetodeath2, data = enable2r, sub = ID, ylim = c(45, 184), ylab = 'FACIT-PAL', xlab = 'Time to death', color = FALSE, lwd = 1)

I cannot use your example, as I have no enable2r.csv, but perhaps you will have luck if you change xlab='Time to death' to xlab="Time to death". Kind regards, -- Jonas Stein n...@jonasstein.de https://github.com/jonasstein/R-Reference-Card __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The Future of R | API to Public Databases
HI, Happy to oblige: Background: All data is dirty to some degree. A large portion of the time spent in doing data analysis is spent doing data cleaning (removing invalid data, transforming data columns into something useful). When data from multiple sources are used, then some time must be spent in making the data able to be merged. Publishers: anyone who provides data in large quantities, usually governments and public organizations. Users: anyone who wants to use the data. This could be journalists, scientists, a concerned citizen, other organizations, etc... Problems: 1. users of data have a hard time finding data, if they can find it at all. This is the rendezvous point. There should be a common service or place to publicize the data and allow people to find it. Data markets such as Infochimps can help with this. 2. data is often published using different protocols. Some data sets are so big that the data is accessed using a custom API. Many of these services use web services, but method names vary. This is a technical problem that can be worked around using libraries to translate from one protocol to another. A 3rd party may also help here by aggregating data sets. Publisher-specific libraries have been proposed to help address this, but I think those are also a compromise. 3. data sets rarely use common data fields/columns, and what they measure may vary slightly. Having common names and definitions for often-used columns allows for confidence in merging the data, and more accurate insights may be made. If these issues can be solved, then a large amount of data analysts' time can be freed up by reducing the data cleansing phase. On top of that, if the data can be merged in an automated way, then even laymen can do their own analysis. This problem is similar, if not identical, to the one being addressed by the semantic web movement. These problems can't be solved just by using ISO-formatted dates; part of the problem is getting people to use common meanings for the fields. Here is an example to illustrate: Public universities publish data such as the number of students enrolled. This number is often broken down by undergraduate and graduate students, but you have to know how that is measured. Are post-baccalaureates counted as graduate students? Were the students counted by head count or by full-time equivalent (FTE) (sum of total enrolled credit hours / credit hours for a full-time student)? Even the definition of FTE varies by university or by university system. Jason On 01/14/2012 01:51 PM, Joshua Wiley wrote: I have been following this thread, but there are many aspects of it which are unclear to me. Who are the publishers? Who are the users? What is the problem? I have a vague sense for some of these, but it seems to me like one valuable starting place would be creating a document that clarifies everything. It is easier to tackle a concrete problem (e.g., agree on a standard numerical representation of dates and times a la ISO 8601) than something diffuse (e.g., information overload). Good luck, Josh On Sat, Jan 14, 2012 at 10:02 AM, Benjamin Weber m...@bwe.im wrote: Mike We see that the publishers are aware of the problem. They don't think that the raw data is usable for the user. Consequently they recognize this fact with the proprietary formats. Yes, they resign themselves to the information overload. That's pathetic. It is not a question of *which* data format, it is a question about the general concept. Where do publisher and user meet? 
There has to be one *defined* point which all parties agree on. I disagree with your statement that the publisher should just publish csv or cook his own API. That leads to fragmentation and inaccessibility of data. We want data to be accessible. A more pragmatic approach is needed to revolutionize the way we go about raw data. Benjamin On 14 January 2012 22:17, Mike Marchywka marchy...@hotmail.com wrote: LOL, I remember posting about this in the past. The US gov agencies vary but most are quite good. The big problem appears to be people who push proprietary or commercial standards for which only one effective source exists. Some formats, like Excel and PDF, come to mind and there is a disturbing trend towards their adoption in some places where raw data is needed by many. The best thing to do is contact the information provider and let them know you want raw data, not images or stuff that works in limited commercial software packages. Often data sources are valuable and the revenue model impacts availability. If you are just arguing over different open formats, it is usually easy for someone to write some conversion code and publish it - CSV to JSON would not be a problem, for example. Data of course are quite variable and there is nothing wrong with giving the provider his choice. Date: Sat, 14 Jan 2012 10:21:23 -0500 From:
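On the ISO 8601 example that keeps coming up in this thread: base R already parses that representation, which is part of why it makes a sensible rendezvous format between publishers and users. A small sketch:

# ISO 8601 date and date-time strings parsed with base R
as.Date("2012-01-14")
as.POSIXct("2012-01-14T10:21:23", format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")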
Re: [R] simulating stable VAR process
Mark This should be reasonably straightforward. In the simplest case you wish to draw a random complex number in the unit circle. This is best done in polar coordinates. If r is a random number on (0,1) and theta a random number on (0, 2*Pi), then with x = r*cos(theta) and y = r*sin(theta), x + i*y is inside the unit circle. Since such roots come in conjugate pairs, a second is x - i*y. If you then need an odd number of roots, the final one can simply be a random number on (0,1). You do not need to use a uniform distribution but can use any distribution on the required intervals, or constrain more of the eigenvalues to be real. John On Sunday, 15 January 2012, Mark Leeds marklee...@gmail.com wrote: hi john. I think I follow you. but, in your algorithm, is it straightforward to generate a set of eigenvalues with modulus less than 1? thanks. On Sat, Jan 14, 2012 at 5:31 PM, John C Frain fra...@gmail.com wrote: Mark, statquant2 As I understand the question, it is not to test if a VAR is stable but how to construct a VAR that is stable and automatically satisfies the condition Mark has taken from Lutkepohl. The algorithm that I have set out will automatically satisfy that condition. The matrix that should be estimated by the algorithm is A on the last line of page 15 of Lutkepohl. Incidentally, the corresponding matrix for the example on page 15 is singular. The algorithm that I have set out will only lead to systems with a non-singular matrix. I still don't see how a matrix generated in this way corresponds to a real economic system. Of course you may have some other constraints in mind that would make the generated system correspond to something more real. John On Saturday, 14 January 2012, Mark Leeds marklee...@gmail.com wrote: Hi statquant2 and john: In the first chapter of Lutkepohl, it is shown that stability of a VAR(p) is the same as det(I_k - A1*z - ... - Ap*z^p) not equal to zero for |z| <= 1, where I_k - A1*z - ... - Ap*z^p is referred to as the reverse characteristic polynomial. So, statquant2, given your A's, one way to do it would be to check the roots of the polynomial implied by taking the determinant of your polynomial. There's an example on pg 17 of Lutkepohl if you have it. If you don't, I can fax it to you over the weekend if you want it. On Fri, Jan 13, 2012 at 8:34 PM, John C Frain fra...@gmail.com wrote: I think that you must approach this in a different way. 1 Draw a set of random eigenvalues with modulus less than 1. 2 Draw a set of random eigenvectors. 3 From these you can, with some matrix manipulations, derive the corresponding VAR coefficients. If your original coefficients were drawn at random I suspect that the VAR would not be stable. I am curious about what you are trying to do. John On Friday, 13 January 2012, statquant2 statqu...@gmail.com wrote: Hello Paul Thanks for the answer but my point is not how to simulate a VAR(p) process and check that it is stable. My question is more: how can I generate a VAR(p) such that I already know that it is stable? We know a condition that assures that it is stable (see first message) but this is not a condition on coefficients etc... What I want is to generate, say, 1000 random VAR(3) processes over, say, 500 time periods that will be STABLE (meaning if I run stability() all will pass the test). When I try to do that it seems that none of the VARs I am generating pass this test, so I assume that the class of stable VAR(p) processes is very small compared to the whole VAR(p) class. 
-- View this message in context: http://r.789695.n4.nabble.com/simulating-stable-VAR-process-tp4261177p4291835.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland -- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:fra...@tcd.ie mailto:fra...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
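John's recipe in minimal R form (uniform draws for the modulus and argument, as he describes; turning the eigenvalues into VAR coefficient matrices is the further matrix work he mentions):

set.seed(42)
r <- runif(1)                 # modulus in (0,1)
theta <- runif(1, 0, 2 * pi)  # argument in (0, 2*pi)
lambda <- complex(modulus = r, argument = theta)  # point inside the unit circle
eigs <- c(lambda, Conj(lambda), runif(1))  # conjugate pair plus one real eigenvalue
Mod(eigs)  # all moduli < 1, as required for stability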
Re: [R] simulating stable VAR process
gotcha john. thanks. On Sat, Jan 14, 2012 at 9:28 PM, John C Frain fra...@gmail.com wrote: Mark This should be reasonably straightforward. In the simplest case you wish to draw a random complex number in the unit circle. This is best done in polar coordinates. If r is a random number on (0,1) and theta a random number on (0, 2*Pi), then with x = r*cos(theta) and y = r*sin(theta), x + i*y is inside the unit circle. Since such roots come in conjugate pairs, a second is x - i*y. If you then need an odd number of roots, the final one can simply be a random number on (0,1). You do not need to use a uniform distribution but can use any distribution on the required intervals, or constrain more of the eigenvalues to be real. John On Sunday, 15 January 2012, Mark Leeds marklee...@gmail.com wrote: hi john. I think I follow you. but, in your algorithm, is it straightforward to generate a set of eigenvalues with modulus less than 1? thanks. On Sat, Jan 14, 2012 at 5:31 PM, John C Frain fra...@gmail.com wrote: Mark, statquant2 As I understand the question, it is not to test if a VAR is stable but how to construct a VAR that is stable and automatically satisfies the condition Mark has taken from Lutkepohl. The algorithm that I have set out will automatically satisfy that condition. The matrix that should be estimated by the algorithm is A on the last line of page 15 of Lutkepohl. Incidentally, the corresponding matrix for the example on page 15 is singular. The algorithm that I have set out will only lead to systems with a non-singular matrix. I still don't see how a matrix generated in this way corresponds to a real economic system. Of course you may have some other constraints in mind that would make the generated system correspond to something more real. John On Saturday, 14 January 2012, Mark Leeds marklee...@gmail.com wrote: Hi statquant2 and john: In the first chapter of Lutkepohl, it is shown that stability of a VAR(p) is the same as det(I_k - A1*z - ... - Ap*z^p) not equal to zero for |z| <= 1, where I_k - A1*z - ... - Ap*z^p is referred to as the reverse characteristic polynomial. So, statquant2, given your A's, one way to do it would be to check the roots of the polynomial implied by taking the determinant of your polynomial. There's an example on pg 17 of Lutkepohl if you have it. If you don't, I can fax it to you over the weekend if you want it. On Fri, Jan 13, 2012 at 8:34 PM, John C Frain fra...@gmail.com wrote: I think that you must approach this in a different way. 1 Draw a set of random eigenvalues with modulus less than 1. 2 Draw a set of random eigenvectors. 3 From these you can, with some matrix manipulations, derive the corresponding VAR coefficients. If your original coefficients were drawn at random I suspect that the VAR would not be stable. I am curious about what you are trying to do. John On Friday, 13 January 2012, statquant2 statqu...@gmail.com wrote: Hello Paul Thanks for the answer but my point is not how to simulate a VAR(p) process and check that it is stable. My question is more: how can I generate a VAR(p) such that I already know that it is stable? We know a condition that assures that it is stable (see first message) but this is not a condition on coefficients etc... 
What I want is to generate, say, 1000 random VAR(3) processes over, say, 500 time periods that will be STABLE (meaning if I run stability() all will pass the test). When I try to do that it seems that none of the VARs I am generating pass this test, so I assume that the class of stable VAR(p) processes is very small compared to the whole VAR(p) class. -- View this message in context: http://r.789695.n4.nabble.com/simulating-stable-VAR-process-tp4261177p4291835.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland -- John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:fra...@tcd.ie mailto:fra...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] GUI preferences are not saved
I'm running R version 2.14.1 (2011-12-22) on a 32-bit Windows machine. I've edited the GUI preferences to increase the font size, saving my preferences after doing so, but the next time I start an R session, my changes to the GUI preferences are lost. Is there a way to make the GUI preference changes permanent? Phillip __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Averaging within a range of values
doggysaywhat chwh...@ucsd.edu writes: My apologies for the context problem. I'll explain. df1 is a matrix of genes labeled g1 through g5 with start positions in the START column and end positions in the END column. df2 is a matrix of chromatin modification values at positions along the DNA. I want to average chromatin modification values for each gene from the start to the end position. So this would involve pulling out all values for column C0 that are between pos 200 and 700 for the first gene and averaging them. Then, I would pull all values from 500 to 1000, and continue for each gene. This type of operation is what the IRanges and GenomicRanges packages were developed for. Suggest you install both (from bioconductor.org), then review http://www.bioconductor.org/help/course-materials/2011/CSAMA/Tuesday/Morning%20Talks/IRangesLecture.pdf and the vignettes for those packages and the help page for 'findOverlaps'. If that doesn't solve your problem, post to the bioconductor list. HTH, Chuck The example I gave previously was a short one, but I will be doing this for around 1000 genes with different positions. This is why just removing one group won't work. This was something I tried to come up with that allowed me to use start and end positions. Your advice to use cut is working.

start <- df1[,2]
end <- df1[,3]
i <- 0  # (the loop counter must start at 0)
while (i < length(start)) {
    i <- i + 1
    print(cut(df2[,1], c(start[i], end[i])))
}

These were the results: [1] <NA> (200,700] <NA> <NA> <NA> <NA> <NA> [8] <NA> <NA> <NA> <NA> <NA> <NA> <NA> [15] <NA> <NA> <NA> <NA> <NA> Levels: (200,700] [1] <NA> <NA> (500,1e+03] (500,1e+03] <NA> <NA> [7] <NA> <NA> <NA> <NA> <NA> <NA> [13] <NA> <NA> <NA> <NA> <NA> <NA> [19] <NA> Levels: (500,1e+03] [1] <NA> <NA> <NA> <NA> <NA> [6] (2e+03,3e+03] (2e+03,3e+03] <NA> <NA> <NA> [11] <NA> <NA> <NA> <NA> <NA> [16] <NA> <NA> <NA> <NA> Levels: (2e+03,3e+03] [1] <NA> <NA> <NA> <NA> <NA> [6] <NA> <NA> <NA> <NA> (4e+03,6e+03] [11] (4e+03,6e+03] (4e+03,6e+03] (4e+03,6e+03] <NA> <NA> [16] <NA> <NA> <NA> <NA> Levels: (4e+03,6e+03] [1] <NA> <NA> <NA> <NA> <NA> [6] <NA> <NA> <NA> <NA> <NA> [11] <NA> <NA> <NA> <NA> <NA> [16] (7e+03,8e+03] (7e+03,8e+03] <NA> <NA> Levels: (7e+03,8e+03] This is producing the right bins for each of the results, but I'm not sure how to put this into a data frame. When I did this:

start <- df1[,2]
end <- df1[,3]
i <- 0
while (i < length(start)) {
    i <- i + 1
    bins <- cut(df2[,1], c(start[i], end[i]))
}

the bins variable was just the last level. Is there a way to assign the results of the while statement to a data frame? Many thanks -- View this message in context: http://r.789695.n4.nabble.com/Averaging-within-a-range-of-values-tp4291958p4294061.html Sent from the R help mailing list archive at Nabble.com. -- Charles C. Berry Dept of Family/Preventive Medicine cberry at ucsd edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
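A rough sketch of the IRanges approach Chuck recommends, under the assumptions that df1 holds the genes' START/END columns and that column 1 of df2 is the position, with the scores in C0 (the column names are guesses from the post):

library(IRanges)  # Bioconductor package
genes <- IRanges(start = df1$START, end = df1$END)
positions <- IRanges(start = df2[, 1], width = 1)
hits <- findOverlaps(positions, genes)
# mean C0 per gene, averaging the positions that fall inside each gene
tapply(df2$C0[queryHits(hits)], subjectHits(hits), mean)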
[R] add column with values found in another data frame
I am positive this problem has a very simple solution, but I have been unable to find it, so I am asking for your help. I need to know how to look something up in one data frame and add it as a column in another. If I have a data frame that looks like this:

frame1
    ID score test age
1 Guy1    10    1  20
2 Guy1    13    2  20
3 Guy2     9    1  33
4 Guy2    11    2  33

and another frame that looks like this:

frame2
    ID
1 Guy1
2 Guy2

How do I add a column to frame2 so it looks like this:

    ID age
1 Guy1  20
2 Guy2  33

I know this must be simple, but I couldn't find the solution by searching. thanks so much Jeremy -- View this message in context: http://r.789695.n4.nabble.com/add-column-with-values-found-in-another-data-frame-tp4295626p4295626.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Estimate the average abundance using Poisson regression with a log link.
Hello, please excuse the simplicity of this question as I am not very good with stats. I am taking a class, using R, which I am learning at the same time, and the question asks us to "Estimate the average abundance using Poisson regression with a log link." I can estimate the abundance from x, but I can't seem to figure out how to get the average abundance with this method. Any suggestions would be welcome, as I have spent about 4 hours trying to figure this one out. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Estimate-the-average-abundance-using-Poisson-regression-with-a-log-link-tp4295863p4295863.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
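Without doing the homework outright, "Poisson regression with a log link" for an average abundance usually reduces to an intercept-only glm(), with the mean recovered by back-transforming the intercept. A sketch with made-up counts:

counts <- c(0, 2, 1, 3, 0, 1, 2)  # made-up abundance counts
fit <- glm(counts ~ 1, family = poisson(link = "log"))
exp(coef(fit))  # back-transformed intercept = estimated average abundance
mean(counts)    # identical for the intercept-only model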
[R] How can I doing Quality adjusted survival analysis in R?
Hi R users, I need to estimate, with Kaplan-Meier methodology, a quality-adjusted survival analysis. Is it possible to do this in R? Thanks in advance. Best Regards Pedro Mota Veiga -- View this message in context: http://r.789695.n4.nabble.com/How-can-I-doing-Quality-adjusted-survival-analysis-in-R-tp4295868p4295868.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
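Not an answer to the Kaplan-Meier estimation question itself, but the "quality adjustment" is just arithmetic: time spent in each health state is weighted by a utility before the survival analysis. A sketch with made-up states and utilities:

# Quality-adjusted survival time for one subject (made-up numbers)
time_in_state <- c(toxicity = 2, remission = 10, relapse = 4)  # months
utility <- c(toxicity = 0.5, remission = 1.0, relapse = 0.3)
sum(time_in_state * utility)  # 2*0.5 + 10*1 + 4*0.3 = 12.2 quality-adjusted months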
Re: [R] Estimate the average abundance using Poisson regression with a log link.
P.S. I don't understand what you mean by log link but if it's the use of a log-normal to get improved confidence intervals, package 'SPECIES' implements it, unlike 'Rcapture' that only gives point estimates. Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/Estimate-the-average-abundance-using-Poisson-regression-with-a-log-link-tp4295863p4296096.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimate the average abundance using Poisson regression with a log link.
There is a no-homework rule, but: 1. abundance + Poisson seems to be an animal species abundance problem in a capture-recapture framework. 2. If so, check out packages 'Rcapture' and 'SPECIES'. They both implement several estimators, such as Burnham and Overton's jackknife or Chao's estimators. (The Poisson model is a natural one). 3. Personally, I prefer the first, but this is because I'm more used to it and have never worked with 'SPECIES', just took a look at it. Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/Estimate-the-average-abundance-using-Poisson-regression-with-a-log-link-tp4295863p4295930.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] add column with values found in another data frame
Hi Jeremy, Try:

frame1 <- structure(list(ID = structure(c(1L, 1L, 2L, 2L), .Label = c("Guy1", "Guy2"), class = "factor"), score = c(10L, 13L, 9L, 11L), test = c(1L, 2L, 1L, 2L), age = c(20L, 20L, 33L, 33L)), .Names = c("ID", "score", "test", "age"), class = "data.frame", row.names = c("1", "2", "3", "4"))
frame2 <- structure(list(ID = structure(1:2, .Label = c("Guy1", "Guy2"), class = "factor")), .Names = "ID", class = "data.frame", row.names = c("1", "2"))

merge(frame1, frame2, by = "ID")
    ID score test age
1 Guy1    10    1  20
2 Guy1    13    2  20
3 Guy2     9    1  33
4 Guy2    11    2  33

subset(frame1, ID %in% frame2$ID, select = c(ID, age))
    ID age
1 Guy1  20
2 Guy1  20
3 Guy2  33
4 Guy2  33

See ?subset and ?merge for more information. HTH, Jorge.- On Sat, Jan 14, 2012 at 3:51 PM, jdog76 wrote: I am positive this problem has a very simple solution, but I have been unable to find it, so I am asking for your help. I need to know how to look something up in one data frame and add it as a column in another. If I have a data frame that looks like this:

frame1
    ID score test age
1 Guy1    10    1  20
2 Guy1    13    2  20
3 Guy2     9    1  33
4 Guy2    11    2  33

and another frame that looks like this:

frame2
    ID
1 Guy1
2 Guy2

How do I add a column to frame2 so it looks like this:

    ID age
1 Guy1  20
2 Guy2  33

I know this must be simple, but I couldn't find the solution by searching. thanks so much Jeremy -- View this message in context: http://r.789695.n4.nabble.com/add-column-with-values-found-in-another-data-frame-tp4295626p4295626.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
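Note that neither call above collapses frame1 to one row per ID, which is what the question asked for. A small variation that does, dropping the duplicated ID/age pairs before merging:

merge(frame2, unique(frame1[, c("ID", "age")]), by = "ID")
#     ID age
# 1 Guy1  20
# 2 Guy2  33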
Re: [R] add column with values found in another data frame
jdog76 wrote: I am positive this problem has a very simple solution, but I have been unable to find it, so I am asking for your help. I need to know how to look something up in one data frame and add it as a column in another. If I have a data frame that looks like this:

frame1
    ID score test age
1 Guy1    10    1  20
2 Guy1    13    2  20
3 Guy2     9    1  33
4 Guy2    11    2  33

and another frame that looks like this:

frame2
    ID
1 Guy1
2 Guy2

How do I add a column to frame2 so it looks like this:

    ID age
1 Guy1  20
2 Guy2  33

I know this must be simple, but I couldn't find the solution by searching. thanks so much Jeremy

How about:

frame2$age <- frame1[match(frame2$ID, frame1$ID), "age"]
print(frame2)
    ID age
1 Guy1  20
2 Guy2  33

HTH Pete -- View this message in context: http://r.789695.n4.nabble.com/add-column-with-values-found-in-another-data-frame-tp4295626p4296307.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Coloring counties on a full US map based on a certain criterion
On 14/01/2012 10:33 a.m., Dimitri Liakhovitski wrote: Somewhat related question out of curiosity: Does anyone know how often the list of the counties and county names is updated in this package? Or is it done centrally for all packages that deal with US counties? Thanks! Dimitri Well, I would hazard a guess that the package maintainer would know :-) The answer to your first question is: as and when the package maintainer is informed of errors or changes. The answer to your second question is: no. Ray On Fri, Jan 13, 2012 at 3:41 PM, Ray Brownrigg ray.brownr...@ecs.vuw.ac.nz wrote: On 14/01/2012 8:04 a.m., Sarah Goslee wrote: On Fri, Jan 13, 2012 at 1:52 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Just to clarify, according to help about the fill argument: logical flag that says whether to draw lines or fill areas. If FALSE, the lines bounding each region will be drawn (but only once, for interior lines). If TRUE, each region will be filled using colors from the col = argument, and bounding lines will not be drawn. We have fill=TRUE - so why are the county borders still drawn? Thank you! Dimitri This prompted me to check the code: if fill=TRUE, map() calls polygon(); if fill=FALSE, map() calls lines(). But polygon() draws borders by default:

plot(c(0,1), c(0,1), type="n")
polygon(c(0,0,1,1), c(0,1,1,0), col="yellow")

To not draw borders, the border argument is provided:

plot(c(0,1), c(0,1), type="n")
polygon(c(0,0,1,1), c(0,1,1,0), col="yellow", border=NA)

But that fails in map():

map('county', 'iowa', fill=TRUE, col=rainbow(20), border=NA)
Error in par(pin = p) : invalid value specified for graphical parameter "pin"

because border is already used as a named argument in map(), for setting the size of the plot area, so there's no way to alter the border argument to polygon(). Coincidentally, I became aware of this just recently. When the maps package was created (way back in the 'new' S era), polygon() didn't add borders, and that is why ?map states that fill does not add borders. A workaround is to change the map() option border= to myborder= (it is then used twice in map()). The work-around I suggested previously (lty=0) seems to be the only way to deal with the problem. In fact, I believe there is another workaround if you don't want to modify the code: use the option resolution=0 in the map() call. I.e. try (in Sarah's original Iowa example):

map('county', 'iowa', fill=TRUE, col=classcolors[countycol], resolution=0, lty=0)

This ensures that the polygon boundaries match up. I'll fix the border issue in the next version of maps (*not* the one just uploaded to CRAN, which was to add Cibola County to NM). Ray Brownrigg Sarah __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The Future of R | API to Public Databases
On 14/01/2012 18:51, Joshua Wiley wrote: I have been following this thread, but there are many aspects of it which are unclear to me. Who are the publishers? Who are the users? What is the problem? I have a vague sense for some of these, but it seems to me like one valuable starting place would be creating a document that clarifies everything. It is easier to tackle a concrete problem (e.g., agree on a standard numerical representation of dates and times a la ISO 8601) than something diffuse (e.g., information overload). Let alone something as vague as 'the future of R' (for which the R-devel list is the appropriate one). I believe the original poster is being egocentric: as someone said earlier, she has never had need of this concept, and I believe that is true of the vast majority of R users. The development of R per se is primarily driven by the needs of the core developers and those around them. Other R communities have set up their own special-interest groups and sets of packages, and that would seem the way forward here. Good luck, Josh On Sat, Jan 14, 2012 at 10:02 AM, Benjamin Weber m...@bwe.im wrote: Mike We see that the publishers are aware of the problem. They don't think that the raw data is usable for the user. Consequently they recognize this fact with the proprietary formats. Yes, they resign themselves to the information overload. That's pathetic. It is not a question of *which* data format, it is a question about the general concept. Where do publisher and user meet? There has to be one *defined* point which all parties agree on. I disagree with your statement that the publisher should just publish csv or cook his own API. That leads to fragmentation and inaccessibility of data. We want data to be accessible. A more pragmatic approach is needed to revolutionize the way we go about raw data. Benjamin On 14 January 2012 22:17, Mike Marchywka marchy...@hotmail.com wrote: LOL, I remember posting about this in the past. The US gov agencies vary but most are quite good. The big problem appears to be people who push proprietary or commercial standards for which only one effective source exists. Some formats, like Excel and PDF, come to mind and there is a disturbing trend towards their adoption in some places where raw data is needed by many. The best thing to do is contact the information provider and let them know you want raw data, not images or stuff that works in limited commercial software packages. Often data sources are valuable and the revenue model impacts availability. If you are just arguing over different open formats, it is usually easy for someone to write some conversion code and publish it - CSV to JSON would not be a problem, for example. Data of course are quite variable and there is nothing wrong with giving the provider his choice. Date: Sat, 14 Jan 2012 10:21:23 -0500 From: ja...@rampaginggeek.com To: r-help@r-project.org Subject: Re: [R] The Future of R | API to Public Databases Web services are only part of the problem. In essence, there are at least two facets: 1. downloading the data using some protocol 2. mapping the data to a common model Having #1 makes the import/download easier, but it really becomes useful when both are included. I think #2 is the harder problem to address. Software can usually be written to handle #1 by making a useful abstraction layer. #2 means that data has consistent names and meanings, and this requires people to agree on common definitions and a common naming convention. 
RDF (Resource Description Framework) and its related technologies (SPARQL, OWL, etc) are one of the many attempts to address this. While this effort would benefit R, I think it's best if it's part of a larger effort. Services such as DBpedia and Freebase are trying to unify many data sets using RDF. The task view and package ideas are great ideas. I'm just adding another perspective. Jason On 01/13/2012 05:18 PM, Roy Mendelssohn wrote: Hi Benjamin: What would make this easier is if these sites used standardized web services, so it would only require writing once. data.gov is the worst example, they spun their own, weak service. There is a lot of environmental data available through OPeNDAP, and that is supported in the ncdf4 package. My own group has a service called ERDDAP that is entirely RESTful, see: http://coastwatch.pfel.noaa.gov/erddap and http://upwell.pfeg.noaa.gov/erddap We provide R (and Matlab) scripts that automate the extract for certain cases, see: http://coastwatch.pfeg.noaa.gov/xtracto/ We also have a tool called the Environmental Data Connector (EDC) that provides a GUI from within R (and ArcGIS, Matlab and Excel) that allows you to subset data that is served by OPeNDAP, ERDDAP, certain Sensor Observation Service (SOS) servers, and have it read directly into R. It is freely available at: http://www.pfeg.noaa.gov/products/EDC/ We can write such tools because
Re: [R] GUI preferences are not saved
On 15/01/2012 01:57, Phillip Feldman wrote: I'm running R version 2.14.1 (2011-12-22) on a 32-bit Windows machine. I've edited the GUI preferences to increase the font size, saving my preferences after doing so, but the next time I start an R session, my changes to the GUI preferences are lost. Is there a way to make the GUI preference changes permanent? Save them in the right place. See ?Rconsole for where it is looking (and a file you can edit directly). -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
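A quick way to locate the files ?Rconsole describes (assuming the usual Windows layout: a per-user copy under R_USER, plus a site-wide default under R_HOME\etc):

# Where a per-user Rconsole file would live (Windows; see ?Rconsole)
file.path(Sys.getenv("R_USER"), "Rconsole")
# The site-wide default, which can be copied and edited
file.path(R.home("etc"), "Rconsole")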