Re: [R] How numerical data is stored inside ts time series objects
On Apr 21, 2015, at 9:39 PM, Paul paul.domas...@gmail.com wrote:

> ... I rummaged around the help files for str, summary, dput, args. This seems like a more complicated language than Matlab, VBA, or even C++'s STL of old (which was pretty thoroughly documented). A function like str() returns an object description, and I'm guessing the conventions with which the object is described depend a lot on the person who wrote the handling code for the class. The description for the variable y seems particularly elaborate. Would I be right in assuming that the notation is ad hoc and not documented? For example, the two invocations str(x) and str(y) show a Time-Series and a ts. And there are many lines of output for str(y) that are heavy in punctuation.

The details of how str() represents your x and y variables are within the utils::stl.default() function. You can hunt this down and see the code with:

    methods(class=class(x))     # Find the class-specific handlers -- no str()
    methods(str)                # Find the methods for the generic
    getAnywhere(str.default)    # or getFromNamespace('str.default','utils')

Within the utils::str.default code, this 'Time-Series' specific code only triggers if the object doesn't match a long list of other items (for example: is.function(), is.list(), is.vector(object) || (is.array(object) && is.atomic(object)) ...):

    else if (stats::is.ts(object)) {
        tsp.a <- stats::tsp(object)
        str1 <- paste0(" Time-Series ", le.str, " from ", format(tsp.a[1L]),
                       " to ", format(tsp.a[2L]), ":")
        std.attr <- c("tsp", "class")
    }

This handling is not dependent on who wrote the ts class, but on who wrote the str.default function.
A more explicit way to look at the difference without the str() summarization is with dput(x) and dput(y):

    dput(x)
    structure(c(464L, 675L, 703L, 887L, 1139L, 1077L, 1318L, 1260L, 1120L, 963L, 996L, 960L, 530L, 883L, 894L, 1045L, 1199L, 1287L, 1565L, 1577L, 1076L, 918L, 1008L, 1063L, 544L, 635L, 804L, 980L, 1018L, 1064L, 1404L, 1286L, 1104L, 999L, 996L, 1015L), .Tsp = c(1, 3.916667, 12), class = "ts")

    dput(y)
    structure(c(464L, 675L, 703L, 887L, 1139L, 1077L, 1318L, 1260L, 1120L, 963L, 996L, 960L, 530L, 883L, 894L, 1045L, 1199L, 1287L, 1565L, 1577L, 1076L, 918L, 1008L, 1063L, 544L, 635L, 804L, 980L, 1018L, 1064L, 1404L, 1286L, 1104L, 999L, 996L, 1015L), .Dim = c(36L, 1L), .Dimnames = list(NULL, "V1"), .Tsp = c(1, 3.916667, 12), class = "ts")

Also, Matlab sometimes needs a squeeze() to drop degenerate dimensions, and R's drop() is similar, and is less black-magic looking than the [[1]] code:

    str(drop(x))
     Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...
    str(drop(y))
     Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...
    stl(drop(x),s.window='per')
    stl(drop(y),s.window='per')

Maybe str.default() should do Time-Series interpretation of is.ts() objects for matrices as well as vectors.

Dave

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
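The difference discussed in this message can be reproduced without the poster's data. The following is a minimal sketch in base R; the vector v and the column name V1 are made up for illustration, and only ts(), dim(), drop() and stl() are used.

```r
# Hypothetical monthly data: 36 made-up integer observations.
v <- as.integer(round(1000 + 300 * sin(2 * pi * (1:36) / 12) + 5 * (1:36)))

x <- ts(v, frequency = 12)                    # ts built from a plain vector
y <- ts(data.frame(V1 = v), frequency = 12)   # ts built from a 1-column data frame

dim(x)         # NULL: no dim attribute, str() reports "Time-Series"
dim(y)         # 36 1: a one-column matrix, str() reports "ts"
dim(drop(y))   # NULL: drop() removes the degenerate dimension

fit_x <- stl(x, s.window = "periodic")
fit_y <- stl(drop(y), s.window = "periodic")  # stl(y, ...) itself was reported to fail in this thread
```

The point of the sketch is only that the two objects hold the same numbers and differ in their dim attribute; whether drop() or y[, 1] is preferable is a matter of taste.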
Re: [R] How numerical data is stored inside ts time series objects
Paul paul.domas...@gmail.com on Wed, 22 Apr 2015 01:39:16 + writes:

> William Dunlap wdunlap at tibco.com writes:
>> Use the str() function to see the internal structure of most objects. In your case it would show something like:
>>
>>     Data <- data.frame(theData=round(sin(1:38),1))
>>     x <- ts(Data[[1]], frequency=12) # or Data[,1]
>>     y <- ts(Data, frequency=12)
>>     str(x)
>>      Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>>     str(y)
>>      ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>>      - attr(*, "dimnames")=List of 2
>>       ..$ : NULL
>>       ..$ : chr "theData"
>>      - attr(*, "tsp")= num [1:3] 1 4.08 12
>>
>> 'x' contains a vector of data and 'y' contains a 1-column matrix of data. stl(x,"per") and stl(y, "per") give similar results as you got.
>>
>> Evidently, stl() does not know that 1-column matrices can be treated much the same as vectors and gives an error message. Thus you must extract the one column into a vector: stl(y[,1], "per").
>
> Thanks, William. Interesting that a 2D matrix of size Nx1 is treated as a different animal from a length N vector. It's a departure from math convention, and from what I'm accustomed to in Matlab.

Ha -- Not at all! The above is exactly the misconception I have been fighting -- mostly in vain -- for years. Matlab's convention of treating a vector as an N x 1 matrix is a BIG confusion to much of math teaching: The vector space |R^n is not at all the same space as the space |R^{n x 1}, even though of course there's a trivial mapping between the objects (and the metrics) of the two. A vector *is NOT* a matrix -- but in some matrix calculus notations there is a convention to *treat* n-vectors as (n x 1) matrices.

Good linear algebra teaching does distinguish vectors from one-column or one-row matrices -- I'm sure still the case in all good math departments around the globe -- but maybe not in math teaching to engineers and others who only need applied math.
Yes, linear algebra teaching will also make a point that in the usual matrix product notations, it is convenient and useful to treat vectors as if they were 1-column matrices.

> That R's vector seems more akin to a list, where the notion of orientation doesn't apply.

Sorry, but again: not at all in the sense 'list's are used in R.

Fortunately, well thought out languages such as S, R, Julia, Python, all do make a good distinction between vectors and matrices i.e. 1D and 2D arrays. If Matlab still does not do that, it's just another sign that Matlab users should flee and start using julia or R or python.

{and well yes, we could start bitchering about S' and hence R's distinction between a 1D array and a vector ... which I think has been a clear design error... but that's not the topic here}
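The distinctions drawn in this message (plain vector, one-column matrix, and 1D array) can be checked directly in base R. This is a small illustrative sketch, not part of the original post; the object names are arbitrary.

```r
v  <- c(1, 2, 3)            # plain vector: no dim attribute
m  <- matrix(v, ncol = 1)   # 3 x 1 matrix
a1 <- array(v, dim = 3)     # 1D array: dim attribute of length 1

is.null(dim(v))   # TRUE
dim(m)            # 3 1
dim(a1)           # 3
is.matrix(v)      # FALSE
is.matrix(m)      # TRUE
is.matrix(a1)     # FALSE
is.array(a1)      # TRUE

# In a matrix product R treats a plain vector as a row or a column as needed.
m2 <- diag(3)
dim(m2 %*% v)     # 3 1  (v used as a column)
dim(v %*% m2)     # 1 3  (v used as a row)
```

The last two lines show the "treat vectors as if they were 1-column matrices" convention applied only at the point of multiplication, while the vector itself stays without orientation.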
Re: [R] modifications using “stat_summary” in ggplot
It would really help to have some sample data to see what is happening.

The best way to supply data to the help group is to use dput(). Type ?dput for some basic information on using it.

Have a look at and/or http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example for some hints. For this last link it is a good idea to follow the reproducible example link for more concrete suggestions.

By the way, calling a data.frame "data" is not a good idea: data is a predefined function in R. Type ?data to see what I mean.

John Kane
Kingston ON Canada

-----Original Message-----
From: michael.eisenr...@agroscope.admin.ch
Sent: Wed, 22 Apr 2015 09:32:11 +
To: r-help@r-project.org
Subject: [R] modifications using "stat_summary" in ggplot

> Dear R-list members
>
> I am using stat_summary in ggplot to plot an error bar graph comparing three treatments (damage, see code below). I would like to change the shape of the three symbols displaying the mean values (e.g. one symbol should be a point (default), one should be a triangle and one should be a square). Furthermore, I would like the outlines of my error bars to be black (and to be able to fill them with whatever color I want; I used white, black and gray65). Does anyone of you know how to solve these problems?
>
> I use the following code:
>
>     line <- ggplot(data,aes(leaf,cor_average,fill=damage, colour=damage)) #define x (leaf) and y (cor_average) variables within aes() and that they should be colored according to damage type
>     line+stat_summary(fun.y=mean, geom="point", size=3)+ #add mean as point symbol
>       stat_summary(fun.data=mean_cl_boot,geom="errorbar",width=0.3, size=0.75)+
>       scale_colour_manual(values=c("white","black","gray65"))+ #add CI : width=width of CI whiskers, size=width of the CI bar
>       labs(x="Leaf",y="Average nr. glands corrected for leaf sz.")
>
> Thank you very much,
> Michael
>
> Eisenring Michael, Msc.
> PhD Student
> Federal Department of Economic Affairs, Education and Research EAER
> Institute of Sustainability Sciences ISS
> Biosafety
> Reckenholzstrasse 191, CH-8046 Zürich
> Tel. +41 44 37 77181
> Fax +41 44 37 77201
> michael.eisenr...@agroscope.admin.ch
> www.agroscope.ch
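One possible way to get distinct symbols with black outlines is sketched below. It is not from the thread: the data frame glands is made up (only the column names come from the post), mean_se replaces mean_cl_boot so that Hmisc is not needed, and the fun argument assumes ggplot2 3.3.0 or later (older versions used fun.y). Error bars are lines and have no fill, so the fill colours are applied to the point symbols (shapes 21 to 25 have both an outline colour and a fill).

```r
library(ggplot2)

# Made-up data standing in for the poster's data frame.
set.seed(42)
glands <- data.frame(
  leaf        = factor(rep(1:4, each = 15)),
  damage      = factor(rep(c("control", "mechanical", "herbivore"), times = 20)),
  cor_average = rnorm(60, mean = 10, sd = 2)
)

pd <- position_dodge(width = 0.5)   # keep the three treatments side by side

p <- ggplot(glands, aes(leaf, cor_average, fill = damage, shape = damage)) +
  stat_summary(fun.data = mean_se, geom = "errorbar", colour = "black",
               width = 0.3, position = pd) +
  stat_summary(fun = mean, geom = "point", colour = "black", size = 3,
               position = pd) +
  scale_shape_manual(values = c(21, 24, 22)) +               # circle, triangle, square
  scale_fill_manual(values = c("white", "black", "gray65")) +
  labs(x = "Leaf", y = "Average nr. glands corrected for leaf sz.")
```

Printing p draws the plot. Shapes and fills are assigned in the order of the factor levels of damage, so reordering the levels changes which treatment gets which symbol.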
Re: [R] Suggest method
That does not sound like a clustering problem at all since you already know the desired characteristics and are not trying to discover structure in your data. Simply define a score as a suitably weighted sum of individual features, order your passengers by that score, and pick the top few, or any that exceed a threshold etc.

Not really an R problem at this point though.

B.

On Apr 22, 2015, at 12:54 AM, Lalitha Kristipati lalitha.kristip...@techmahindra.com wrote:

> Hi,
>
> I want to do a use case in R language. My problem statement is to upgrade the passengers from one membership level to another membership level in airlines based on their characteristics. It is like customer profiling based on their usage characteristics. Suggest a method that intakes a large amount of data and cluster them based on their characteristics and helps in knowing the passengers who are upgraded to another level. Any help is appreciated.
>
> Regards,
> Lalitha Kristipati
> Associate Software Engineer
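The scoring approach suggested in the reply can be sketched in a few lines of base R. Everything here is hypothetical: the passenger features, the weights, and the choice of standardising the features before weighting are assumptions made only for illustration.

```r
# Hypothetical passenger features; names and weights are illustrative only.
passengers <- data.frame(
  id          = 1:6,
  miles_flown = c(12000, 85000, 40000, 3000, 60000, 95000),
  flights     = c(4, 30, 12, 1, 25, 40),
  spend       = c(900, 7000, 2500, 200, 5200, 8800)
)

features <- c("miles_flown", "flights", "spend")
weights  <- c(miles_flown = 0.5, flights = 0.2, spend = 0.3)

# Standardise so that no feature dominates purely because of its units.
z <- scale(passengers[, features])
passengers$score <- as.vector(z %*% weights[features])

# Order by score and take the top few (a score threshold would work the same way).
ranked   <- passengers[order(passengers$score, decreasing = TRUE), ]
upgrades <- head(ranked, 2)
```

How the weights are chosen is the real modelling question and is outside what this sketch covers.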
Re: [R] rgdal installation with two versions of GDAL
On 22/04/2015 05:43, Rob Skelly wrote:

> Hello,
>
> I am using GDAL 2.0 for most of my work, but rgdal depends on GDAL < 2. I have built and installed GDAL 1.11.2 in /opt/gdal-1.11.2, and rgdal compiles and installs into R, using the following command:
>
>     sudo R CMD INSTALL --configure-args=--with-gdal-config=/opt/gdal.1.11.2/bin/gdal-config rgdal-0.9-2.tar.gz
>
> However when R attempts to load rgdal at the end of the installation, it fails with the error,
>
>     Error in dyn.load(file, DLLpath = DLLpath, ...) :
>       unable to load shared object '/usr/local/lib/R/site-library/rgdal/libs/rgdal.so':
>       /usr/local/lib/R/site-library/rgdal/libs/rgdal.so: undefined symbol: _ZN10OGRPolygon7addRingEP13OGRLinearRing
>
> The addRing method on OGRPolygon seems to be a relic of GDAL 1.11.2 and no longer exists in 2.0, so R is loading libgdal from /usr/local/lib, not /opt/gdal-1.11.2/lib. I have confirmed this by temporarily moving the 1.11.2 libs into /usr/local, where it works fine.

You can *really* confirm where things are found via R CMD ldd ... see the manual.

> So, the question is, how do I convince R to use the new library search path? I'm on xubuntu, and R is installed using apt.

You either use the ld options when building rgdal.so (e.g. for your OS -Wl,-rpath=/opt/gdal-1.11.2/lib) or you link statically. The latter is easier and safer ... just build a static GDAL.

Many projects set -R/-rpath flags in their config scripts: it is something you could suggest to the GDAL maintainers (and finding them is part of libtool which GDAL uses).

But (see the posting guide) the generic question belonged on R-devel and questions about rgdal on R-sig-geo.

> Thanks,
> Rob

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK
Re: [R] rgdal installation with two versions of GDAL
> You can *really* confirm where things are found via R CMD ldd ... see the manual.

Yes of course. I had done that, and received the hoped-for response:

    libgdal.so.1 => /opt/gdal-1.11.2/lib/libgdal.so.1 (0x7fd9e0175000)

But I realize now that this was with LD_LIBRARY_PATH set. Of course R modifies LD_LIBRARY_PATH, erasing any paths set by the user, so that while the installation succeeds, the load doesn't. I've temporarily modified ldpaths and it works. So there's the answer for now.

> But (see the posting guide) the generic question belonged on R-devel and questions about rgdal on R-sig-geo.

Apologies. I'll ask there next time.

Rob
Re: [R] Exhaustive CHAID package
On Wed, 22 Apr 2015, Michael Grant wrote:

> Many thanks for your response, sir. Here are two of the references to which I referred. I've also personally explored several data sets in which the outcomes are 'known' and have seen high variability in the topology of the trees being produced but, typically Exhaustive CHAID predictions match the 'known' results better than any of the others, using default settings.
>
> http://www.hindawi.com/journals/jam/2014/929768/
> http://interstat.statjournals.net/YEAR/2010/articles/1007001.pdf

Thanks for the references, I wasn't aware of these. Both of these appear to use the SPSS implementations of CHAID, exhaustive CHAID, CART, and QUEST with default settings (as you did above). As I have never used these myself in SPSS, I cannot say how the implementations compare but it's well possible that these are different from other implementations. E.g., for CART the pruning rule may make a difference (with or without crossvalidation; with 1-SE or 0-SE rule etc.). Similarly, for QUEST I think that Loh's own implementation uses somewhat different default settings.

So it may be advisable to go beyond defaults.

> By inference, many research papers are choosing Exhaustive CHAID.

My experience is that this is determined to a good degree by the software available and what others in the same literature use.

> My concern is not that these procedures produce mildly variant trees but dramatically variant, with not even the same set of variables included.

Yes, instability of the tree structure is one of the drawbacks of tree-based procedures. Of course, the tree structure can be very different while producing very similar predictions.

> Is CHAID available for use as an R package?

Yes.

> I thought R-FORGE was solely for developers?

See: https://R-Forge.R-project.org/R/?group_id=343

You can easily install the package from R-Forge and also check out the entire source code anonymously.

> Again, many thanks.
>
> MCG
>
> -----Original Message-----
> From: Achim Zeileis [mailto:achim.zeil...@uibk.ac.at]
> Sent: Wednesday, April 22, 2015 3:30 AM
> To: Michael Grant
> Cc: r-help@R-project.org
> Subject: Re: [R] Exhaustive CHAID package
>
> On Tue, 21 Apr 2015, Michael Grant wrote:
>
>> Dear R-Help:
>>
>> From multiple sources comparing methods of tree classification and tree regressions on various data sets, it seems that Exhaustive CHAID (distinct from CHAID), most commonly generates the most useful tree results and, in particular, is more effective than ctree or rpart which are implemented in R.
>
> I searched a bit on the web for exhaustive CHAID and didn't find any convincing evidence that this method is most commonly the most useful. I doubt that such evidence exists because the methods are applicable to so many different situations that uniformly better results are essentially never obtained. Nevertheless, if you have references of comparison studies, I would still be interested. Possibly these provide insight in which situations exhaustive CHAID performs particularly well.
>
>> I see that CHAID, but not Exhaustive CHAID, is in the R-forge, and I write to ask if there are plans to create a package which employs the Exhaustive CHAID strategy.
>
> I wouldn't know of any such plans. But if you want to adapt/extend the code from the CHAID package, this is freely available.
>
>> Right now the best source I can find is in SPSS-IBM and I feel a bit disloyal to R using it.
>
> I wouldn't be concerned about disloyalty. If you feel that exhaustive CHAID is the most appropriate tool for your problem and you have access to it in SPSS, why not use it? Possibly you can also export it from SPSS and import it into R using PMML. The partykit package has an example with an imported QUEST tree from SPSS.
>
>> Michael Grant
>> Professor
>> University of Colorado Boulder
Re: [R] Exhaustive CHAID package
Many thanks for your response, sir. Here are two of the references to which I referred. I've also personally explored several data sets in which the outcomes are 'known' and have seen high variability in the topology of the trees being produced but, typically Exhaustive CHAID predictions match the 'known' results better than any of the others, using default settings.

http://www.hindawi.com/journals/jam/2014/929768/
http://interstat.statjournals.net/YEAR/2010/articles/1007001.pdf

By inference, many research papers are choosing Exhaustive CHAID. My concern is not that these procedures produce mildly variant trees but dramatically variant, with not even the same set of variables included.

Is CHAID available for use as an R package? I thought R-FORGE was solely for developers?

Again, many thanks.

MCG

-----Original Message-----
From: Achim Zeileis [mailto:achim.zeil...@uibk.ac.at]
Sent: Wednesday, April 22, 2015 3:30 AM
To: Michael Grant
Cc: r-help@R-project.org
Subject: Re: [R] Exhaustive CHAID package

On Tue, 21 Apr 2015, Michael Grant wrote:

> Dear R-Help:
>
> From multiple sources comparing methods of tree classification and tree regressions on various data sets, it seems that Exhaustive CHAID (distinct from CHAID), most commonly generates the most useful tree results and, in particular, is more effective than ctree or rpart which are implemented in R.

I searched a bit on the web for exhaustive CHAID and didn't find any convincing evidence that this method is most commonly the most useful. I doubt that such evidence exists because the methods are applicable to so many different situations that uniformly better results are essentially never obtained. Nevertheless, if you have references of comparison studies, I would still be interested. Possibly these provide insight in which situations exhaustive CHAID performs particularly well.

> I see that CHAID, but not Exhaustive CHAID, is in the R-forge, and I write to ask if there are plans to create a package which employs the Exhaustive CHAID strategy.

I wouldn't know of any such plans. But if you want to adapt/extend the code from the CHAID package, this is freely available.

> Right now the best source I can find is in SPSS-IBM and I feel a bit disloyal to R using it.

I wouldn't be concerned about disloyalty. If you feel that exhaustive CHAID is the most appropriate tool for your problem and you have access to it in SPSS, why not use it? Possibly you can also export it from SPSS and import it into R using PMML. The partykit package has an example with an imported QUEST tree from SPSS.

> Michael Grant
> Professor
> University of Colorado Boulder
[R] Basic structural time series
Hello,

I am trying to forecast with a basic structural model (BSM). My data is in the annex. I don't understand why my forecast looks the way it does; you can see it in the annex (result.png). I expected the forecast to be increasing.

My code:

    (fit <- StructTS(log10(x), type = "BSM"))
    plot(cbind(fitted(fit), resids=resid(fit)), main = "Data")
    library(forecast)
    plot(forecast(fit))

Is there any idea?

Thanks
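Since the poster's series x is not included, the workflow can only be illustrated on other data. The sketch below uses the UKgas series that ships with R as a stand-in (an assumption, not the poster's data) and base predict() rather than the forecast package; inspecting the fitted variance components is one possible starting point when a forecast does not follow the expected trend.

```r
# UKgas ships with R; it stands in for the poster's series x, which is not shown.
fit <- StructTS(log10(UKgas), type = "BSM")
fit$coef                          # fitted variances: level, slope, seas, epsilon

fc <- predict(fit, n.ahead = 8)   # list with components pred and se (log10 scale)
10^fc$pred                        # back-transform the point forecasts
```

Because the model is fitted to log10(x), forecasts are on the log10 scale until they are back-transformed.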
Re: [R] Predict in glmnet for Cox family
Dear Terry,

Thank you for your reply. I understand that it is difficult to predict survival time in general. I have tried another approach and I would like to know whether it is correct.

I clustered my dataset based on some similarity and reduced the number of variables using LASSO and some expert opinion. Then I fitted an accelerated failure time model with a Weibull distribution (survreg from the survival package) and predicted the survival time. The accuracy is somewhat low due to the uncertainty and complexity in the survival time of individual observations. I checked the 5% and 95% quantiles, and almost 95% of the observations fall inside the interval, even if the interval is a little wide.

       Actual  Predicted      Lower      Upper
    1      91   83.01901  10.497993  178.65750
    2      90   62.66257   7.923863  134.85030
    3     115   57.59236   7.282720  123.93918
    4      20   50.72860   6.414777  109.16830
    5      81   83.42176  10.548922  179.52423
    6     113   57.10106   7.220593  122.88188
    7       8   58.29399   7.371442  125.44907
    8      88   53.19866   6.727124  114.48390
    9      17   34.80713   4.401461   74.90518
    10      5   45.90169   5.804401   98.78076
    11     20   58.99832   7.460507  126.96480
    12     34   64.05572   8.100031  137.84837
    13     27   39.25003   4.963279   84.46635
    14     56   41.03611   5.189134   88.31000
    15     60   69.70944   8.814959  150.01520

Is my approach correct? Can I say this model is good? Will I be able to do some more testing so that I can get a probability survival curve?

Sincerely,

--
View this message in context: http://r.789695.n4.nabble.com/Predict-in-glmnet-for-cox-family-tp4706070p4706248.html
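The steps described in this message (Weibull AFT fit, predicted quantiles, and a survival curve) can be sketched with the survival package. This is not the poster's code: the lung data set and the covariates age and sex are stand-ins chosen only because they ship with the package. Note that the 5% and 95% columns are quantiles of each subject's fitted survival-time distribution, not a confidence interval for the prediction.

```r
library(survival)   # recommended package; lung is one of its example data sets

fit <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "weibull")

# 5%, 50% and 95% quantiles of the fitted survival-time distribution per subject
q <- predict(fit, type = "quantile", p = c(0.05, 0.5, 0.95))
colnames(q) <- c("Lower", "Median", "Upper")
head(q)

# Fitted survival curve for the first subject: S(t) = 1 - F(t)
tt <- seq(1, 1000, by = 10)
lp <- predict(fit, type = "lp")[1]
S  <- 1 - psurvreg(tt, mean = lp, scale = fit$scale, distribution = "weibull")
```

Plotting S against tt gives the fitted survival probability over time for that subject; whether the model is "good" still has to be judged on held-out data or with residual diagnostics, which this sketch does not attempt.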
Re: [R] R_Calculating Thiessen weights for an area with irregular boundary
1. Apologies for the lousy presentation of the data and thank you for your feedback. I promise it will not happen again.
2. Thank you
3. Yes, exactly!
4. Exactly what I wanted without much hassle. Thank you very much. Time to explore the package 'spatstat'. How can I show the Dirichlet tile names (i.e. 1,2,3,...,8) in the plot?

Cheers,
Mano

On Tue, Apr 21, 2015 at 11:33 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

> (1) The manner in which you presented your data was a total mess. If you ask for help, please have the courtesy to present your data in such a manner that a potential helper can access it without needing to do a great deal of editing and processing. Like so:
>
>     pts <- as.data.frame(matrix(c(415720,432795,415513,432834,415325,
>                                   432740,415356,432847,415374,432858,
>                                   415426,432774,415395,432811,415626,
>                                   432762),ncol=2,byrow=TRUE))
>     names(pts) <- c("x","y")
>     bdry <- as.data.frame(matrix(c(415491,432947,415269,432919,415211,
>                                    432776,415247,432657,415533,432657,
>                                    415781,432677,415795,432836,415746,
>                                    432937),ncol=2,byrow=TRUE))
>     names(bdry) <- c("x","y")
>
> (2) Well, at least you presented a usable data set (even though the presentation was lousy) which is better than what most posters do. And you asked a partially clear question.
>
> (3) I do not know what you mean by "Thiessen weights". I am guessing that these are the areas of the Dirichlet tiles (Thiessen polygons), intersected with the boundary polygon (i.e. observation window).
>
> (4) If my guess is correct, the following should accomplish the desired task:
>
>     require(spatstat) # You will (probably) need to install spatstat first.
>     W <- owin(poly=bdry)
>     X <- as.ppp(pts,W=W)
>     plot(X) # Just to make sure it looks right.
>     dX <- dirichlet(X)
>     plot(dX) # Just to make sure .
>     sapply(tiles(dX),area.owin)
>
> HTH
>
> cheers,
>
> Rolf Turner
>
> On 22/04/15 02:50, Manoranjan Muthusamy wrote:
>
>> Hi R users,
>>
>> I want to calculate Thiessen weights to compute areal rainfall from a number of point measurements. I am using R and, thanks to a previous question on the same topic, I got to know that I can use deldir. But the problem is that my boundary polygon is not a rectangle; it's an irregular polygon (a catchment boundary derived using ArcGIS), and in deldir the boundary can only be a rectangle. Are there any other packages where I can calculate Thiessen weights of an area covered by an irregular boundary?
>>
>> Given below are my measurement points (meas_points) and the coordinates of a (simplified) boundary polygon (boundary):
>>
>>     meas_points
>>               X      Y
>>     [1,] 415720 432795
>>     [2,] 415513 432834
>>     [3,] 415325 432740
>>     [4,] 415356 432847
>>     [5,] 415374 432858
>>     [6,] 415426 432774
>>     [7,] 415395 432811
>>     [8,] 415626 432762
>>
>>     boundary
>>               x      y
>>     [1,] 415491 432947
>>     [2,] 415269 432919
>>     [3,] 415211 432776
>>     [4,] 415247 432657
>>     [5,] 415533 432657
>>     [6,] 415781 432677
>>     [7,] 415795 432836
>>     [8,] 415746 432937
>>
>> Any help is really appreciated. Thanks.
>
> --
> Rolf Turner
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> Home phone: +64-9-480-4619
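The tile areas from the quoted reply can be turned into weights by normalising them. The sketch below reuses the coordinates and spatstat calls from the thread; the final normalisation step is an assumption about what "Thiessen weights" means here (each gauge's share of the catchment area), consistent with the guess made in the reply.

```r
library(spatstat)   # assumed to be installed, as in the reply quoted above

pts  <- data.frame(x = c(415720, 415513, 415325, 415356, 415374, 415426, 415395, 415626),
                   y = c(432795, 432834, 432740, 432847, 432858, 432774, 432811, 432762))
bdry <- data.frame(x = c(415491, 415269, 415211, 415247, 415533, 415781, 415795, 415746),
                   y = c(432947, 432919, 432776, 432657, 432657, 432677, 432836, 432937))

W  <- owin(poly = bdry)    # irregular catchment boundary as the observation window
X  <- as.ppp(pts, W = W)   # gauges as a point pattern inside W
dX <- dirichlet(X)         # Dirichlet (Thiessen) tessellation clipped to W

areas   <- sapply(tiles(dX), area.owin)
weights <- areas / sum(areas)   # assumed definition: share of total catchment area
```

With a vector of gauge readings in the same order as pts, the areal rainfall would then be the weighted sum of the readings.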
Re: [R] How numerical data is stored inside ts time series objects
William Dunlap wdunlap at tibco.com writes:
> I think we can call this a bug in stl().

Using what I learned from the responses to this thread, I looked at the code for stl. As they say in Microsoft, this is expected behaviour according to the code. And it doesn't look like an inadvertent coding oversight.

---

Martin Maechler maechler at lynne.stat.math.ethz.ch writes:
> Paul Paul.Domaskis at gmail.com
>> Interesting that a 2D matrix of size Nx1 is treated as a different animal from a length N vector. It's a departure from math convention, and from what I'm accustomed to in Matlab.
>
> The vector space |R^n is not all the same space as the space |R^{n x 1} even though of course there's a trivial mapping between the objects (and the metrics) of the two. A vector *is NOT* a matrix -- but in some matrix calculus notations there is a convention to *treat* n-vectors as (n x 1) matrices.
>
> Good linear algebra teaching does distinguish vectors from one-column or one-row matrices -- I'm sure still the case in all good math departments around the globe -- but maybe not in math teaching to engineers and others who only need applied math. Yes, linear algebra teaching will also make a point that in the usual matrix product notations, it is convenient and useful to treat vectors as if they were 1-column matrices.

The distinction in math is new to me, with academic training in engineering, even at the post grad level. I haven't seen the distinction in the math for Comp. Sci., either, and that's in the meat grinder of Canada. Admittedly, it's not quite as geeky as some meat grinders in other countries. And admittedly, I only took C.S. courses that were geared to applications.

So I had always considered such a distinction to be a practicality in the coding implementation of vector/matrix classes, e.g., in C, a vector being a single pointer to a number, while a 2D array is a pointer to a vector and hence a different type.

>> That R's vector seems more akin to a list, where the notion of orientation doesn't apply.
>
> Sorry, but again: not at all in the sense 'list's are used in R.

No need to apologize. To clarify, being new to R, I was referring to the general use of the term list. Specifically, I was referring to an ordered collection without orientation, so it is consistent with what you say above about distinguishing between length N vectors vs. 2D matrices of size Nx1 or 1xN.

> Fortunately, well thought out languages such as S, R, Julia, Python, all do make a good distinction between vectors and matrices i.e. 1D and 2D arrays. If Matlab still does not do that, it's just another sign that Matlab users should flee and start using julia or R or python.

Matlab pretty well only deals with 2D arrays, some of which have size Nx1 or 1xN. I haven't seen an example of a 1-D data structure that doesn't have an orientation, implied or otherwise. Though of course, if someone proves me wrong, then I stand corrected (and smarter because of it).

> {and well yes, we could start bitchering about S' and hence R's distinction between a 1D array and a vector ... which I think has been a clear design error... but that's not the topic here}

Big fan of python's readability, though I've only dabbled. And I won't start bitchering about R and S cuz I'm a newcomer and it's all an eye popping wonderland.

---

David R Forrest drf at vims.edu writes:
> The details of how str() represents your x and y variables is within the utils::stl.default() function. You can hunt this down and see

I'm assuming that you meant utils::str.default() above. The rest of your post makes sense if I make that assumption.

I snipped the majority of your response because I'm not responding to anything specific. However, it was an extremely educational post. Thank you for that.

> Also, Matlab sometimes needs a squeeze() to drop degenerate dimensions, and R's drop() is similar, and is less-black-magic looking than the [[1]] code:
>
>     str(drop(x))
>      Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...
>     str(drop(y))
>      Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...
>     stl(drop(x),s.window='per')
>     stl(drop(y),s.window='per')
>
> Maybe str.default() should do Time-Series interpretation of is.ts() objects for matrices as well as vectors.

I'm assuming that you mean stl(), since str() already works on both?

Maybe it's the version I have, but I find that the R code for stl() doesn't have a section for is.ts(). Instead, it seems to run through a series of checks for pathological input, with the check for matrix data consisting of is.matrix(na.action(as.ts(x))), where x is the time series. Somehow, the fact that the na.action(time series argument) returns a matrix implies that the time series data is a matrix rather than a vector. In attempting to get insight, I found
Re: [R] R lattice bwplot: Fill boxplots with specific color depending on factor level
Hi Pablo

    library(lattice)   # provides bwplot(), panel.superpose, panel.bwplot

    set.seed(1)  # for reproducibility of data.frame
    mydata <- rbind(data.frame(Col1 = rnorm(2*1000),
                               Col2 = rep(c("A", "C"), each = 1000),
                               Col3 = factor(rep(c("YY", "NN"), 1000))),
                    data.frame(Col1 = rnorm(1000),
                               Col2 = rep(c("B")),
                               Col3 = factor(rep(c("YY", "YN"), 500))))
    mydata$Col2 <- factor(mydata$Col2)

In future please do not put * at the end; it makes the code harder to copy.

    red   = rgb(249/255, 21/255, 47/255)
    amber = rgb(255/255, 200/255, 0/255)  # amended to reveal a colour difference
    green = rgb(39/255, 232/255, 51/255)

    # As Deepayan Sarkar said bwplot is different to others
    bwplot(mydata$Col1 ~ mydata$Col3 | mydata$Col2, data = mydata,
           groups = Col3,
           as.table = TRUE,   # added to make it easier for factor levels
           layout = c(3,1),   # looks nicer and easier to read
           panel = panel.superpose,
           panel.groups = panel.bwplot,
           fill = c(red, amber, green))

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
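A note on why a positional colour vector can end up on the "wrong" level in this thread: the post reports the level order of Col3 as NN, YY, YN, while the colours are listed as red, amber, green. A minimal base R sketch of looking colours up by level name instead of by position follows. The names col3, level_cols and fill_by_level are hypothetical, plain colour names replace the rgb() values, and the sketch does not reproduce lattice's own grouping logic.

```r
# Stand-in for mydata$Col3, using the level order reported in the post
col3 <- factor(c("YY", "NN", "YY", "YN"), levels = c("NN", "YY", "YN"))

# Colours keyed by level name rather than by position
level_cols <- c(NN = "red", YN = "orange", YY = "green")

# Reorder the colours to follow levels(col3); a vector like this is
# what would be handed to a fill = argument
fill_by_level <- unname(level_cols[levels(col3)])
fill_by_level   # "red" "green" "orange"
```

With the positional vector c(red, amber, green), the second level (YY) would receive amber, which matches the symptom described in the question.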
Re: [R] Script to workflow conversion
Santosh,

I know nothing about this personally, but I found this site with an internet search.

http://www.ef-prime.com/products/ranalyticflow_en/features.html

Jean

On Wed, Apr 22, 2015 at 5:20 PM, Santosh santosh2...@gmail.com wrote:

> Dear Rxperts..
>
> Sorry.. i don't have data for my query..
>
> Is there a way that an R script can be converted to a workflow? or if
> not a workflow, converted into a flowchart or anything close to that
> effect.
>
> Regards,
> Santosh
[R] Having trouble with twitter API
Hello one more time!

I'm switching over one of my packages from Ubuntu to Windows. Part of it uses Twitter data, as gathered by the twitteR package. So this is what I have so far:

    library(twitteR)
    library(ROAuth)
    requestURL <- "https://api.twitter.com/oauth/request_token"
    accessURL  <- "https://api.twitter.com/oauth/access_token"
    authURL    <- "https://api.twitter.com/oauth/authorize"
    library(RCurl)

I got my consumerKey and my consumerSecret from my Twitter developer's account. So I have:

    Cred <- OAuthFactory$new(consumerKey = consumerKey,
                             consumerSecret = consumerSecret,
                             requestURL = requestURL,
                             authURL = authURL)

Now the next thing is where everything goes wrong.

    > Cred$handshake(cainfo = "cacert.pem")
    To enable the connection, please direct your web browser to:
    https://api.twitter.com/oauth/authorize?oauth_token=TgGQ1xBifkqusiYyyz34oVKYMcybb0rq
    When complete, record the PIN given to you and provide it here: 0172575
    Error in function (type, msg, asError = TRUE)  :
      Could not resolve host:

I have done this many times, but to no avail. Sometimes I leave the quotes off. Sometimes I leave a space. Nothing works.

Has anyone run into this before, please? This is on R version 3.1.3, Windows 8.

thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
Re: [R] R lattice bwplot: Fill boxplots with specific color depending on factor level
Pablo,

I would do it similarly. I would also draw the box and whiskers in the specified colors.

    ## install.packages("HH")  ## if you don't have it
    library(HH)
    bwplot(mydata$Col1 ~ mydata$Col3 | mydata$Col2, data = mydata,
           groups = Col3,
           as.table = TRUE,   # added to make it easier for factor levels
           layout = c(3,1),   # looks nicer and easier to read
           panel = panel.bwplot.superpose,
           col  = c(red, "darkorange", green),
           fill = c(red, "darkorange", green),
           fill.alpha = .6)

I changed your amber to darkorange as the amber lines are almost invisible.

Rich
Re: [R] high density plots using lattice dotplot()
Hi Luigi

Try

    library(lattice)   # xyplot(), panel.superpose, panel.number()
    library(grid)      # grid.text(), gpar()

    set.seed(1)
    PLATE <- data.frame(Delta.Rn = rnorm(500),
                        Cycle = rnorm(500),
                        Delta2 = rnorm(500) + 1,
                        Well = rep(1:50, each = 10))
    head(PLATE, 10)

    xyplot(Delta.Rn + Delta2 ~ Cycle | Well,
           data = subset(PLATE, Well %in% 1:49),
           allow.multiple = TRUE,
           ylab = "Fluorescence (Delta Rn)",
           xlab = "Cycles",
           main = "TITLE",
           scales = list(x = list(draw = FALSE),
                         y = list(draw = FALSE),
                         relation = "same",
                         alternating = TRUE),
           as.table = TRUE,
           layout = c(10,5),
           par.settings = list(strip.background = list(col = "white"),
                               # layout.heights = list(strip = 0.8),
                               axis.text = list(cex = 0.6),
                               par.xlab.text = list(cex = 0.75),
                               par.ylab.text = list(cex = 0.75),
                               par.main.text = list(cex = 0.8),
                               superpose.symbol = list(pch = ".", cex = 2,
                                                       col = c(2,4))),
           strip = FALSE,
           type = "p",
           key = list(text = list(label = c("Delta.Rn", "Delta2")),
                      points = list(cex = 0.6, pch = 16, col = c(2,4)),
                      cex = 0.6, x = 0.9, y = 0.1),
           panel = panel.superpose,
           panel.groups = function(x, y, ...){
               panel.xyplot(x, y, ...)
               # text argument can be a vector of values not
               # necessarily the group name
               pnl = panel.number()  # needed as group.number if added is now either 1 or 2
               grid.text(c(LETTERS, letters)[pnl],
                         y = 0.93, x = 0.5,
                         default.units = "npc",
                         just = c("left", "bottom"),
                         gp = gpar(fontsize = 7))
           })

Remember to delete the groups argument (I forgot to at first), as the groups are now Delta.Rn and Delta2.

You may have one or more empty panels, so put the legend there. Wherever it is, just amend the x and y of the key or fine-tune them.

You can keep pch = "." and increase cex, but it will become a square with large cex.

Duncan

-----Original Message-----
From: Luigi Marongiu [mailto:marongiu.lu...@gmail.com]
Sent: Thursday, 23 April 2015 10:05
To: Duncan Mackay
Subject: Re: [R] high density plots using lattice dotplot()

> Dear Duncan,
> sorry to come back so soon, but I wanted to ask you whether it would
> be possible to plot two sets of lines within each box, let's say a
> main value A and a secondary value B. In normal plots I could use a
> plot() followed by points(); what would be the strategy here?
> Thank you again,
> best regards,
> Luigi
>
> On Wed, Apr 22, 2015 at 6:46 AM, Duncan Mackay dulca...@bigpond.com wrote:
>
>> Hi Luigi
>>
>> I should have made up an example to make things easier when I
>> replied today. This should get you going:
>>
>>     library(lattice)
>>     library(grid)
>>
>>     set.seed(1)
>>     PLATE <- data.frame(Delta.Rn = rnorm(500),
>>                         Cycle = rnorm(500),
>>                         Well = rep(1:50, each = 10))
>>     head(PLATE)
>>
>>     xyplot(Delta.Rn ~ Cycle | Well,
>>            data = PLATE,
>>            groups = Well,
>>            ylab = "Fluorescence (Delta Rn)",
>>            xlab = "Cycles",
>>            main = "TITLE",
>>            scales = list(x = list(draw = FALSE),
>>                          y = list(draw = FALSE),
>>                          relation = "same",
>>                          alternating = TRUE),
>>            as.table = TRUE,
>>            layout = c(10,5),
>>            par.settings = list(strip.background = list(col = "white"),
>>                                # layout.heights = list(strip = 0.8),
>>                                axis.text = list(cex = 0.6),
>>                                par.xlab.text = list(cex = 0.75),
>>                                par.ylab.text = list(cex = 0.75),
>>                                par.main.text = list(cex = 0.8),
>>                                superpose.symbol = list(pch = ".", cex = 2)),
>>            strip = FALSE,
>>            type = "p",
>>            col = 1,
>>            panel = panel.superpose,
>>            panel.groups = function(x, y, ..., group.number){
>>                panel.xyplot(x, y, ...)
>>                # text argument can be a vector of values not
>>                # necessarily the group name
>>                grid.text(c(LETTERS, letters)[group.number],
>>                          y = 0.93, x = 0.5,
>>                          default.units = "npc",
>>                          just = c("left", "bottom"),
>>                          gp = gpar(fontsize = 7))
>>            })
>>
>> You could use panel.text instead of grid.text.
>>
>> Duncan
>>
>> -----Original Message-----
>> From: Luigi Marongiu [mailto:marongiu.lu...@gmail.com]
>> Sent: Wednesday, 22 April 2015 08:24
>> To: Duncan Mackay
>> Subject: Re: [R] high density plots using lattice dotplot()
>>
>>> Dear Duncan,
>>> thank you for your reply. I tried to implement your suggestions,
>>> but the code as given in your reply did not work (actually R
>>> crashed), and a slight elaboration returned the figure attached,
>>> which is essentially still displaying text and not drawing the
>>> data. Here is what I wrote:
>>>
>>>     xyplot(Delta.Rn ~ Cycle |
[R] vegan-help
Hi,

Greetings from Nepal. I am using R version 3.1.3 (2015-03-09) -- "Smooth Sidewalk" and vegan 2.2-1.

I have been using R, and teaching it to my students, for some years. I am more used to simple vegetation data sets. Now I am having difficulty handling data with temporal and site-wise variation. One of my colleagues has fish abundance and water chemistry data from five sites over 4 years, with each site sampled monthly: 5 sites x 4 years x 12 months.

I have been searching for this kind of data and analysis in vegan, but have not found it yet. If anyone on this list has an idea, please point me to a link with an example; I will be happy to search and learn from it.

Thank you in advance.

Chitra Baniya
Nepal
[R] Script to workflow conversion
Dear Rxperts..

Sorry.. i don't have data for my query..

Is there a way that an R script can be converted to a workflow? or if not a workflow, converted into a flowchart or anything close to that effect.

Regards,
Santosh
Re: [R] Question on CCA and RDA analysis
You didn't look that hard...

Start with the Environmetrics Task View:

http://cran.r-project.org/web/views/Environmetrics.html

That might point you to the **vegan** pkg:

http://cran.r-project.org/web/packages/vegan/index.html

And that has a vignette on ordination which you might find useful.

I have been wondering if a blog post on such basic topics might be worthwhile. Your email suggests perhaps it is. Once I'm done teaching I will try to find some time to write some posts on these basic analyses.

HTH

G

On 10 April 2015 at 20:38, Luis Fernando García luysgar...@gmail.com wrote:

> Yeah,
>
> The most useful example I found was this:
> https://gist.github.com/perrygeo/7572735
>
> I always had the idea that this kind of forum was meant to provide
> sources not so obvious on the web. If you have something better it
> would be great.
>
> 2015-04-10 18:36 GMT-03:00 Ben Bolker bbol...@gmail.com:
>
>> Luis Fernando García luysgarcia at gmail.com writes:
>>
>>> Dear R experts,
>>>
>>> I wanted to know if you can suggest any website or tutorial to
>>> learn how to do an RDA or CCA in R.
>>>
>>> Thanks in advance!
>>
>> I hate to ask, but did you try Googling
>> "canonical correspondence analysis R" ... ?

--
Gavin Simpson, PhD
[R] Possible bug in CRAN mirror selection.
Hi all.

Using R 3.2.0 on a WinXP machine, I attempted to update my installed packages. After selecting the nearest mirror, I was again prompted to select a mirror. This process repeated itself 2 additional times before I cancelled the most recent mirror selection, thereby allowing the packages to download and install. This seems to reflect a bug in the most recent version of R.

Thanks,
~Caitlin
Re: [R] R_Calculating Thiessen weights for an area with irregular boundary
On 22/04/15 22:43, Manoranjan Muthusamy wrote:

<SNIP>

> 4. <SNIP> How can I show the Dirichlet tile names (i.e. 1, 2, 3, ..., 8)
> in the plot?

There's no built-in way at the moment as far as I can tell. One way to get the tiles to be labelled/numbered in the plot would be:

    plot(dX)
    text(X, labels = 1:npoints(X))

Slightly sexier:

    cents <- as.data.frame(t(sapply(tiles(dX), centroid.owin)))
    plot(dX)
    text(cents, labels = 1:nrow(cents))

Is this satisfactory?

cheers,

Rolf Turner

--
Rolf Turner
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Home phone: +64-9-480-4619
Re: [R] Multinomial Fitting Distrbution
The mixtools package has mixture-of-Gaussians fitting; maybe that might help?
Re: [R] Random Forest in Caret
Can you post your memory profile and code?
[R] R and S+ Courses: Sydney, Melbourne, Brisbane in May 2015
Hi,

(apologies for cross-posting)

SolutionMetrics is presenting R and S+ courses in Sydney, Melbourne and Brisbane in May 2015.

To book, please email enquir...@solutionmetrics.com.au or call +61 2 9233 6888.

Getting Started with R (1 Day)
Confidently use R for data analysis, graphics and reporting. Content includes an introduction to R, data objects and classes, data import/export, data manipulation, graphics, basic statistical models, avoiding repetitive typing/clicking, and file management.
More info: http://bit.ly/11qFxpO
Dates:
  5 May, 2015 - Sydney (Tue)
  7 May, 2015 - Melbourne (Thu)
  12 May, 2015 - Brisbane (Tue)

Intermediate R (1 Day)
Efficient use of R language functions and objects, Big Data, advanced visualisations, data mining - logistic regression/tree models.
More info: http://bit.ly/YBsT5b
Dates:
  19 May, 2015 - Sydney (Tue)
  21 May, 2015 - Melbourne (Thu)
  26 May, 2015 - Brisbane (Tue)

Getting Started with S+ (1 Day)
The course provides users with the knowledge to perform all day-to-day data analysis and graphics tasks with just a click of a mouse (no programming required). Data objects and classes, data import/export, data manipulation, graphics, and basic statistical models - regression.
Course outline: http://bit.ly/Xf8im1
Date:
  28 May, 2015 - Sydney (Thu)

For more information, please email enquir...@solutionmetrics.com.au or call +61 2 9233 6888.

Cheers

Kris Angelovski | Director | SolutionMetrics
T +61 2 9233 6888 | F +61 2 9233 4099
Suite 44, Level 9, 88 Pitt Street, Sydney NSW 2000
http://www.solutionmetrics.com.au/
[R] Why is removeSparseTerms() not doing anything?
Here's the code and results. The corpus is the text version of a single book. (R version 3.2)

    > docs <- tm_map(docs, stemDocument)
    > dtm <- DocumentTermMatrix(docs)
    > freq <- colSums(as.matrix(dtm))
    > ord <- order(freq)
    > freq[tail(ord)]
       one experi   will    can  lucid  dream
       287    312    363    452   1018   2413
    > freq[head(ord)]
      abbey abdomin    abdu abraham  absent    abus
          1       1       1       1       1       1
    > dim(dtm)
    [1]    1 5265
    > dtms <- removeSparseTerms(dtm, 0.1)
    > dim(dtms)
    [1]    1 5265
    > dtms <- removeSparseTerms(dtm, 0.001)
    > dim(dtms)
    [1]    1 5265
    > dtms <- removeSparseTerms(dtm, 0.9)
    > dim(dtms)
    [1]    1 5265
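One possible reading of the output above, offered as a sketch rather than a diagnosis: dim(dtm) shows a single document, and if sparsity is measured per term as the share of documents in which the term is absent, then with one document every term has sparsity 0 and no threshold removes anything. The code below uses plain matrices and a hypothetical helper term_sparsity(); it does not call tm, and the assumption that removeSparseTerms() keeps terms with sparsity below the threshold is my reading of its documentation.

```r
# Hypothetical helper: share of documents (rows) in which each term (column) is absent
term_sparsity <- function(m) colMeans(m == 0)

# One document, as in the post: every term occurs at least once in it
one_doc <- matrix(c(2413, 1018, 1), nrow = 1,
                  dimnames = list("book", c("dream", "lucid", "abbey")))
term_sparsity(one_doc)            # 0 for every term, so nothing is ever dropped

# Three toy documents: now terms differ in sparsity
three_docs <- rbind(doc1 = c(5, 0, 1),
                    doc2 = c(3, 0, 0),
                    doc3 = c(4, 2, 0))
colnames(three_docs) <- c("dream", "lucid", "abbey")

s <- term_sparsity(three_docs)    # dream 0, lucid 2/3, abbey 2/3
kept <- names(s)[s < 0.5]         # assumed rule: keep terms with sparsity below the threshold
kept                              # "dream"
```

If this reading is right, splitting the book into several documents (chapters or pages, say) before building the matrix would give the threshold something to act on.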
[R] R lattice bwplot: Fill boxplots with specific color depending on factor level
Hi,

I thoroughly looked for an answer to this problem with no luck.

I have a data frame with 3 factor levels: YY, NN, YN

    mydata <- rbind(data.frame(Col1 = rnorm(2*1000),
                               Col2 = rep(c("A", "C"), each = 1000),
                               Col3 = factor(rep(c("YY", "NN"), 1000))),
                    data.frame(Col1 = rnorm(1000),
                               Col2 = rep(c("B")),
                               Col3 = factor(rep(c("YY", "YN"), 500))))

Col3 is of factor type with 3 levels: NN YY YN

I want to make a boxplot using lattice bwplot and assign to each level a specific color:

    # NN:
    red = rgb(249/255, 21/255, 47/255)
    # YN:
    amber = rgb(255/255, 126/255, 0/255)
    # YY:
    green = rgb(39/255, 232/255, 51/255)

Using the bwplot function:

    pl <- bwplot(mydata$Col1 ~ mydata$Col3 | mydata$Col2, data = mydata,
                 ylab = expression(italic(R)),
                 panel = function(...) {
                     panel.bwplot(..., groups = mydata$Col3,
                                  fill = c(red, amber, green))
                 })

If you reproduce the example you will see that the colors are not related to the levels in my data frame, as the YY box is not always green.

Is there a way to assign YY:green, NN:red, YN:amber?

You can see the resulting figure in:
http://stackoverflow.com/questions/29802129/r-lattice-bwplot-fill-boxplots-with-specific-color-depending-on-factor-level

Thank you in advance!

Pablo
[R] R installation issues
Hi,

Please help me with this. I have installed R version 3.1.2 (64 bit) and receive the following error message when launching the R x64 desktop icon:

    R for Windows GUI front-end has stopped working. Close the program.
    The application was unable to start correctly (0xc005)

I tried the latest version 3.2 (64 bit R) too and receive the same error message. (I do not have any issues with the 32 bit installations. They work perfectly fine.)

Here is the info from the event log:

    Faulting application name: Rgui.exe
    Faulting module name: ntdll.dll
    Exception code: 0xc005
    Faulting process id: 0x1bf4
    Faulting application path: C:\Programfiles\R\R-3.1.3\bin\x64\rgui.exe
    Faulting module path: C:\Windows\System32\ntdll.dll

Please let me know what is causing this issue and how to fix it.

--
View this message in context: http://r.789695.n4.nabble.com/R-64-Bit-installation-Issues-tp4706259.html
[R] converting Twitter data from txt file
Hello!

Someone gave me a text file of Twitter data to look at. Until now I've used the twitteR package to do the actual downloading and to get the data into nice R form.

Is anyone familiar with a function to convert the Twitter text into that good form, please?

Thanks,
Sincerely,
Erin

--
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
Re: [R] converting Twitter data from txt file
On 22 April 2015 at 13:05, Erin Hodgess erinm.hodg...@gmail.com wrote:

> Hello!
>
> Someone gave me a text file of Twitter data to look at. I've used the
> twitter package to do the actual downloading and getting the data
> into nice R form.
>
> Is anyone familiar with a function to convert the twitter text into
> that good form please?

Would it be possible to post a sample of this data?

-- H

--
OpenPGP: https://hasan.d8u.us/gpg.key
[R] weird behavior when starting R-3.2.0 in Windows 8
Hello again!

First this afternoon, I attempted to install R-3.2.0 from source on Windows 8. I thought it went fine. Then when I tried to start it, it started for a second and closed.

Ok. I figured that I had done something wrong with the source installation. Now I have just installed the binary. But the exact same thing happens!

Has anyone else run into this, please?

For what it's worth, all is well on my Ubuntu 14.04.

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
[R] Random Forest in Caret
Dear All,

I am a bit concerned about the memory consumption of randomForest in caret. This seems to be due to the fact that the option keep.forest=FALSE does not work in caret. Does anybody know a workaround for that?

Many thanks

Lorenzo
Re: [R-es] distribucion de IRWIN HALL
    N <- 1000   # Monte Carlo simulation size
    n <- 5      # number of uniforms
    # Monte Carlo estimate of P(U_1 + ... + U_n <= x)
    cdf.IH <- function(x, n, N) mean(replicate(N, sum(runif(n))) <= x)
    x <- seq(0, 5, .1)
    y <- sapply(x, FUN = cdf.IH, n = n, N = N)
    plot(x, y, type = "l")

----- Original message -----
From: Genaro Llusco gellu...@gmail.com
To: r-help-es@r-project.org
Sent: Tuesday, 21 April 2015 23:07:14
Subject: [R-es] distribucion de IRWIN HALL

> Dear all,
>
> I am trying to program the distribution function of the Irwin-Hall
> distribution. Unfortunately I have not succeeded, and I would be
> grateful if someone could help me with it. Thank you in advance.
>
> --
> Regards,
> Lic. Genaro Llusco Silvestre
> gellu...@gmail.com
> Tel: 74028671
> personal blog: http://www.cientificest.blogspot.com

___
R-help-es mailing list R-help-es@r-project.org https://stat.ethz.ch/mailman/listinfo/r-help-es
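As a complement to the Monte Carlo estimate, the Irwin-Hall distribution function also has a closed form, F(x) = (1/n!) * sum over k = 0..floor(x) of (-1)^k * choose(n, k) * (x - k)^n for 0 <= x <= n. A minimal sketch in base R follows; the function name irwin_hall_cdf is made up for illustration and is not part of any package.

```r
# Exact CDF of the sum of n independent Uniform(0, 1) variables.
# irwin_hall_cdf is a hypothetical helper name, not an existing function.
irwin_hall_cdf <- function(x, n) {
  vapply(x, function(xi) {
    if (xi <= 0) return(0)          # no mass below 0
    if (xi >= n) return(1)          # all mass is below n
    k <- 0:floor(xi)
    sum((-1)^k * choose(n, k) * (xi - k)^n) / factorial(n)
  }, numeric(1))
}

x <- seq(0, 5, 0.1)
y_exact <- irwin_hall_cdf(x, n = 5)
# lines(x, y_exact, col = 2)   # overlay on the Monte Carlo plot, if one is open
```

For large n the alternating sum can lose precision, so the simulation (or a normal approximation) may be the safer choice there.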
[R] - dunn.test gives strange output
Dear all,

I have a question concerning the output of the dunn.test function in R. I would like to compare different datasets which are not normally distributed, so I use dunn.test to do pairwise comparisons. I have 2 questions concerning the output:

- Why are my group names changed? The output gives "swirskii" 4 times, although my group names are longer (see my dataset).
- Secondly, when I check the p-values, I see something very odd: p-values where the same groups are compared sometimes indicate that they are different. Why are not all those p-values = 1?

This is my dataset:

    structure(list(controle = c(111, 88, 216, 169),
                   chemie = c(47, 31, 35, 30),
                   IPM = c(0, 0, 0, 1),
                   gallicus = c(102, 152, 102, 75),
                   swirskii3 = c(1, 0, 0, 0),
                   swirskiiA = c(0, 0, 1, 3),
                   swirskiiP = c(0, 0, 1, 6),
                   swirskii1x = c(12, 2, 75, 46)),
              .Names = c("controle", "chemie", "IPM", "gallicus",
                         "swirskii3", "swirskiiA", "swirskiiP", "swirskii1x"),
              row.names = c(NA, 4L), class = "data.frame")

I get the following output:

    > DUNN <- dunn.test(value, variable, method = "none", kw = TRUE)
      Kruskal-Wallis rank sum test

    data: value and variable
    Kruskal-Wallis chi-squared = 26.8894, df = 7, p-value = 0

                   Comparison of value by variable
                            (No adjustment)
    Col Mean-|
    Row Mean |   controle     chemie        IPM   gallicus   swirskii   swirskii
    ---------+------------------------------------------------------------------
      chemie |  -1.341044  -3.410083  -2.069039  -0.325682   1.015361   3.084401
             |     0.0900     0.0003     0.0193     0.3723     0.1550     0.0010
             |
         IPM |  -3.410083  -2.069039  -0.325682   1.015361   3.084401  -3.410083
             |     0.0003     0.0193     0.3723     0.1550     0.0010     0.0003
             |
    gallicus |  -0.325682   1.015361   3.084401  -3.410083  -2.069039       0.00
             |     0.3723     0.1550     0.0010     0.0003     0.0193     0.5000
             |
    swirskii |  -3.410083  -2.069039       0.00  -3.084401  -3.007770  -1.666726
             |     0.0003     0.0193     0.5000     0.0010     0.0013     0.0478
             |
    swirskii |  -3.007770  -1.666726   0.402313  -2.682088   0.402313  -2.969454
             |     0.0013     0.0478     0.3437     0.0037     0.3437     0.0015
             |
    swirskii |  -2.969454  -1.628410   0.440628  -2.643772   0.440628   0.038315
             |     0.0015     0.0517     0.3297     0.0041     0.3297     0.4847
             |
    swirskii |  -1.475148  -0.134104   1.934935  -1.149466   1.934935   1.532621
             |     0.0701     0.4467     0.0265     0.1252     0.0265     0.0627

    Col Mean-|
    Row Mean |   swirskii
    ---------+-----------
      chemie |  -3.410083
             |     0.0003
             |
         IPM |  -2.069039
             |     0.0193
             |
    gallicus |  -3.084401
             |     0.0010
             |
    swirskii |   0.402313
             |     0.3437
             |
    swirskii |  -1.628410
             |     0.0517
             |
    swirskii |  -1.475148
             |     0.0701
             |
    swirskii |   1.494306
             |     0.0675

With kind regards,

Joachim Audenaert
crop protection researcher
PCS | ornamental plant research
Schaessestraat 18, 9070 Destelbergen, Belgium
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be
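Two observations on the output above can be checked in base R without calling dunn.test at all. First, every printed label is at most eight characters long, which suggests (an inference from the printout, not from the package documentation) that labels are cut to eight characters, so the four rows labelled "swirskii" would be four different groups rather than one group compared with itself. Second, the data are stored wide (one column per treatment), whereas the call uses one vector of values and one grouping vector. The sketch below re-enters the posted data; the names counts, long and short are made up for illustration.

```r
# The data frame from the post, re-entered by hand
counts <- data.frame(controle   = c(111, 88, 216, 169),
                     chemie     = c(47, 31, 35, 30),
                     IPM        = c(0, 0, 0, 1),
                     gallicus   = c(102, 152, 102, 75),
                     swirskii3  = c(1, 0, 0, 0),
                     swirskiiA  = c(0, 0, 1, 3),
                     swirskiiP  = c(0, 0, 1, 6),
                     swirskii1x = c(12, 2, 75, 46))

# Wide -> long: stack() gives a 'values' column and an 'ind' grouping factor
long <- stack(counts)
nrow(long)                      # 32
nlevels(long$ind)               # 8

# Cutting the group names to eight characters makes four of them collide
short <- substr(names(counts), 1, 8)
sum(short == "swirskii")        # 4
```

Shorter, distinct group names (for example s3, sA, sP, s1x) would make it easy to tell which comparison each cell refers to.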
Re: [R] rgdal installation with two versions of GDAL
Rob Skelly <rob at dijital.ca> writes:

> Hello,
>
> I am using GDAL 2.0 for most of my work, but rgdal depends on GDAL 1. I have built and installed GDAL 1.11.2 in /opt/gdal-1.11.2, and rgdal compiles and installs into R, using the following command:
>
> sudo R CMD INSTALL --configure-args=--with-gdal-config=/opt/gdal.1.11.2/bin/gdal-config rgdal-0.9-2.tar.gz

There are several questions here. Firstly, R-sig-geo is the appropriate list for questions of this kind, if we ignore the need to understand library search path management under a particular Linux distribution.

GDAL 2.0 has not been released yet, and is not backward compatible with GDAL 1 (they use different object models). It is true that a release date at the end of this month was proposed, but it has not yet been confirmed.

Given the uncertain status of GDAL 2, there is no pressing reason to divert maintainer assets to implementing changes in rgdal that may need frequent revision to track changes in GDAL 2, while also keeping a parallel object structure for released GDAL 1.*.

You do not say why GDAL 2 is essential for your work - maybe you could simply use GDAL 1.* until GDAL 2 stabilises? Neither your motivation nor your affiliation is very convincing.

Please never post HTML-formatted mail.

Roger

> So, the question is, how do I convince R to use the new library search path? I'm on xubuntu, and R is installed using apt.
>
> Thanks,
> Rob
Re: [R] Exhaustive CHAID package
On Tue, 21 Apr 2015, Michael Grant wrote:

> Dear R-Help:
>
> From multiple sources comparing methods of tree classification and tree regressions on various data sets, it seems that Exhaustive CHAID (distinct from CHAID), most commonly generates the most useful tree results and, in particular, is more effective than ctree or rpart which are implemented in R.

I searched a bit on the web for exhaustive CHAID and didn't find any convincing evidence that this method is most commonly the most useful. I doubt that such evidence exists because the methods are applicable to so many different situations that uniformly better results are essentially never obtained.

Nevertheless, if you have references of comparison studies, I would still be interested. Possibly these provide insight in which situations exhaustive CHAID performs particularly well.

> I see that CHAID, but not Exhaustive CHAID, is in the R-forge, and I write to ask if there are plans to create a package which employs the Exhaustive CHAID strategy.

I wouldn't know of any such plans. But if you want to adapt/extend the code from the CHAID package, this is freely available.

> Right now the best source I can find is in SPSS-IBM and I feel a bit disloyal to R using it.

I wouldn't be concerned about disloyalty. If you feel that exhaustive CHAID is the most appropriate tool for your problem and you have access to it in SPSS, why not use it? Possibly you can also export it from SPSS and import it into R using PMML. The partykit package has an example with an imported QUEST tree from SPSS.

> Michael Grant
> Professor
> University of Colorado Boulder
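The PMML route mentioned above might look roughly like the following. This is only a sketch: it assumes the partykit and XML packages are installed, and it assumes that the example PMML file shipped with partykit is named "ttnc.xml" in the package's "pmml" directory (the file name is not given in this thread). A tree exported from SPSS would be read the same way by passing the path of the exported file instead.

```r
# Result stays NULL when partykit/XML or the example file are unavailable
imported <- NULL

if (requireNamespace("partykit", quietly = TRUE) &&
    requireNamespace("XML", quietly = TRUE)) {
  # Assumed location of the PMML example bundled with partykit;
  # system.file() returns "" if no such file exists
  pmml_file <- system.file("pmml", "ttnc.xml", package = "partykit")

  if (nzchar(pmml_file)) {
    # pmmlTreeModel() reads a PMML tree model and returns a 'party' object
    imported <- partykit::pmmlTreeModel(pmml_file)
    print(imported)
  }
}
```

Once imported, the tree is an ordinary party object, so the usual print() and plot() methods from partykit should apply to it.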
[R] modifications using “stat_summary” in ggplot
Dear R-list members,

I am using "stat_summary" in ggplot to plot an error bar graph comparing three treatments (damage, see code below). I would like to change the shape of the three symbols displaying the mean values (e.g. one symbol should be a point (the default), one a triangle and one a square). Furthermore, I would like the outlines of my error bars to be black, and to be able to fill them with whatever color I want (I used white, black and gray65).

Does anyone know how to solve these problems? I use the following code:

line <- ggplot(data, aes(leaf, cor_average, fill=damage, colour=damage)) # define x (leaf) and y (cor_average) variables within aes() and that they should be colored according to damage type
line + stat_summary(fun.y=mean, geom="point", size=3) + # add mean as point symbol
  stat_summary(fun.data=mean_cl_boot, geom="errorbar", width=0.3, size=0.75) + # add CI: width = width of CI whiskers, size = width of the CI bar
  scale_colour_manual(values=c("white", "black", "gray65")) +
  labs(x="Leaf", y="Average nr. glands corrected for leaf sz.")

Thank you very much,
Michael

Eisenring Michael, Msc.
PhD Student
Federal Department of Economic Affairs, Education and Research EAER
Institute of Sustainability Sciences ISS
Biosafety
Reckenholzstrasse 191, CH-8046 Zürich
Tel. +41 44 37 77181
Fax +41 44 37 77201
michael.eisenr...@agroscope.admin.ch
www.agroscope.ch
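One possible way to approach the question above, as a hedged sketch rather than a definitive answer: map both shape and fill to the treatment, use the fillable point shapes 21, 24 and 22 (circle, triangle, square), and set colour = "black" outside aes() so that all outlines and error bars are black. The data below are simulated and the treatment names are made up. mean_se is swapped in for mean_cl_boot only to avoid the extra Hmisc dependency, and the `fun` argument assumes ggplot2 3.3.0 or later (older versions use `fun.y` as in the message).

```r
# Simulated stand-in for the poster's data; treatment names are hypothetical
set.seed(42)
dat <- data.frame(
  leaf        = factor(rep(c("L1", "L2", "L3"), each = 30)),
  damage      = factor(rep(c("control", "herbivore", "mechanical"), times = 30)),
  cor_average = rnorm(90, mean = 10, sd = 2)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Shift the three treatments sideways so they do not overlap
  dodge <- position_dodge(width = 0.5)

  p <- ggplot(dat, aes(leaf, cor_average, shape = damage, fill = damage)) +
    # Error bars drawn first, always black
    stat_summary(fun.data = mean_se, geom = "errorbar", colour = "black",
                 width = 0.3, position = dodge) +
    # Means on top: black outline, fill colour taken from 'damage'
    stat_summary(fun = mean, geom = "point", colour = "black", size = 3,
                 position = dodge) +
    # 21 = circle, 24 = triangle, 22 = square (all accept a fill colour)
    scale_shape_manual(values = c(21, 24, 22)) +
    scale_fill_manual(values = c("white", "black", "gray65")) +
    labs(x = "Leaf", y = "Average nr. glands corrected for leaf sz.")

  # print(p) draws the plot on the active graphics device
}
```

Error bars drawn with geom "errorbar" are lines and have no fill, so in this sketch the fill colours apply to the mean symbols only.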