Re: [R] Tools to modify highlighted areas in pdf documents?
Dear Ivan, Thank you very much for the hint. I have started to test it. - it offers more colours and types of highlighting than MS Edge; - it seems to have better word-boundary detection than MS Edge (but I haven't tested all the cases yet); There are some nit-picks: - I wish it had a better default color-pallet; - the vertical positioning continues to be sometimes sub-optimal: editing manually the coordinates may be still useful; Returning to R: 1. Some of the pdf-packages could implement some of the annotation-functionality as well. 2. It would be useful to be able to export the annotations and import/merge them in another document. I have spotted errors in various articles; such a functionality would be handy, if a new version of those articles gets published. Sincerely, Leonard From: Ivan Krylov Sent: Sunday, June 2, 2024 8:02 PM To: Leo Mada via R-help Cc: Leo Mada Subject: Re: [R] Tools to modify highlighted areas in pdf documents? � Sat, 1 Jun 2024 16:16:23 + Leo Mada via R-help �: > When highlighting pdf-documents with Microsoft Edge, the bounding box > is sometimes misplaced, and quite ugly so. It also lacks the ability > to draw lines or arrows. > > On the other hand, I did not get used to Acrobat Reader: it usually > involves much more effort to add specific highlights. Lines can be > drawn, but are NOT straight! Sorry for answering a different question, but have you considered using a different PDF viewer + annotation application? Okular <https://eu01.z.antigena.com/l/iZmB5crNT773HTSUSt2S6McW6mdP5phHyzgXromFHINsN6Uo6BMnSdZK0kmSK~aLduXw-YpIAWy-DV9bac-5U3grBsLdYxuX7aMmbxQjKSLGCqTyJG54WQ2W7oaR2NEPTWiFotbkB4_eQTzHI3L-cAgOgsS4exJ4ie4BHLt > is free and available on Windows (including from outside Microsoft store). Its annotation features include all kinds of highlights, arrows and lines, both straight and arbitrarily-shaped, quickly available from the "annotations" panel. -- Best regards, Ivan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R code for overlapping variables -- count
Correcting a small glitch - see new code. From: Leo Mada Sent: Sunday, June 2, 2024 8:34 PM To: Shadee Ashtari Cc: r-help@r-project.org Subject: [R] R code for overlapping variables -- count Dear Shadee, If you have a data.frame with the following columns: n = 100; # population size x = data.frame( Sex = sample(c("M","F"), n, T), Country = sample(c("AA", "BB", "US"), n, T), Income = as.factor(sample(1:3, n, T)) ) # Dummy variable ONE = rep(1, nrow(x)) # corrected r = aggregate(ONE ~ Sex + Income + Country, length, data = x) r = r[, c("Country", "Income", "Sex", "ONE")] names(r)[4] = "Count" print(r) It is possible to write more simple code, if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach. Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category. Hope this helps, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R code for overlapping variables -- count
Dear Shadee, If you have a data.frame with the following columns: n = 100; # population size x = data.frame( Sex = sample(c("M","F"), n, T), Country = sample(c("AA", "BB", "US"), n, T), Income = as.factor(sample(1:3, n, T)) ) # Dummy variable ONE = rep(1, nrow(x)) r = aggregate(ONE ~ Sex + Income + Country, length, data = x) r = r[, c("Country", "Income", "Sex")] print(r) It is possible to write more simple code, if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach. Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category. Hope this helps, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Tools to modify highlighted areas in pdf documents?
Dear Bert, Thank you very much for the response. I was aware of pdftools - but did not recall any such functionality. I have checked again (both pdftools, qpdf and the 3rd one): unfortunately, they do not implement such functionality. There might be other packages, which I missed. However, the functionality is feasible. I will add a few more details - maybe someone picks up the task. It is possible to edit manually the pdf-file, though it is quite cumbersome to find the right annotation. 1. One needs to edit the values both in the \QuadPoints and the \Rect in the \AP object. 2. Modifying the color is trickier: \C() encodes the color and \CA the alpha channel (= 1): but neither Acrobat, nor MIcrosoft Edge update the color. The value of the color encoded in the stream is used instead. It is possible to "trick" Edge: modify the \C color and set "\ca 1" (in the stream block) to a lower value (e.g. "\ca 0.99"). MS Edge will then accept the modified color (but Acrobat ignores it). Changing the value in the stream is the actual solution. Note: non-rectangular shapes can be specified as well. I hope that some of the referenced packages pick up this task. Sincerely, Leonard From: Bert Gunter Sent: Saturday, June 1, 2024 9:23 PM To: Leo Mada Cc: r-help@r-project.org Subject: Re: [R] Tools to modify highlighted areas in pdf documents? Search! on rseek.org<http://rseek.org>, the query "modify pdf documents in R" brought up the staplr package. A quick web search with the same query brought up the pdftools package. These were cursory efforts, so you may well find more. You will have to determine whether and to what degree any meet your needs. -- Bert On Sat, Jun 1, 2024 at 9:16 AM Leo Mada via R-help mailto:r-help@r-project.org>> wrote: Dear R-Users, Are there any packages that enable the modifications of highlighted areas / annotations in pdf documents? It seems feasible - I have explored some R code (see below). However, I would rather avoid to reinvent the wheel. The problem: When highlighting pdf-documents with Microsoft Edge, the bounding box is sometimes misplaced, and quite ugly so. It also lacks the ability to draw lines or arrows. On the other hand, I did not get used to Acrobat Reader: it usually involves much more effort to add specific highlights. Lines can be drawn, but are NOT straight! Are there tools to change the size/position of highlights? Or to add highlights and underline words? Changing position/size manually by editing the data in the pdf-document is possible. Changing the color is more trickier (somehow possible in Microsoft Edger; though the direct approach to rewrite the actual stream is better). Maybe there are some tools to do it? Some R code is below. Sincerely, Leonard # library(zip) con = file("_some_pdf_.pdf", "rb") NL = 0 # - very dirty hack; # - assumes Annotations are in the last fragment/chunk; while(TRUE) { tmp = readBin(con, "raw", 1024*128 + 515); if(length(tmp) == 0) break; x = tmp; # isNL = (x == 10) | (x == 13); isNL = (x == 13); isNL = isNL & (x[which(isNL) + 1] == 10); NL = NL + sum(isNL); } close(con) idP = which(isNL) idS = 935; # will vary with pdf and Annotations and ...; nLast = 4; # usually 2 chunks idx = idP[seq(idS, length.out = nLast)] # Check: Right position? # tmp = x[seq(idx[1] + 2, idx[1 + 2] - 1)] # intToUtf8(tmp) tmp = inflate(x[seq(idx[1] + 2, idx[nLast] - 1)]) intToUtf8(tmp$output) # Output of inflate: an Example # "/GS gs .56078434 .87058824 .97647059 rg\n # 337.298 183.836 m 364.322 183.836 l 364.322 171.83 l 337.298 171.83 l h f\n" # Note: /BBox[ 337.298 171.83 364.322 183.836] The raw pdf data: 1948 0 obj <>/C[ 0.560784 0.870588 0.976471]/CA 1/F 4/PDFIUM_HasGeneratedAP true/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 364.322 174.6]/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>> endobj 1949 0 obj <>>>>>/Subtype/Form/Type/XObject>>stream xœE˱ €0 Àž)~ “ä Û™€ Ø P@ ûKˆ"Оtó²¢ß jÉC© ðT#ŠBš›zª WŸH—Ò 9(Aà š Kùäøų _ iÀŽmz dR ² endstream endobj [[alternative HTML version deleted]] __ R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Tools to modify highlighted areas in pdf documents?
Dear R-Users, Are there any packages that enable the modifications of highlighted areas / annotations in pdf documents? It seems feasible - I have explored some R code (see below). However, I would rather avoid to reinvent the wheel. The problem: When highlighting pdf-documents with Microsoft Edge, the bounding box is sometimes misplaced, and quite ugly so. It also lacks the ability to draw lines or arrows. On the other hand, I did not get used to Acrobat Reader: it usually involves much more effort to add specific highlights. Lines can be drawn, but are NOT straight! Are there tools to change the size/position of highlights? Or to add highlights and underline words? Changing position/size manually by editing the data in the pdf-document is possible. Changing the color is more trickier (somehow possible in Microsoft Edger; though the direct approach to rewrite the actual stream is better). Maybe there are some tools to do it? Some R code is below. Sincerely, Leonard # library(zip) con = file("_some_pdf_.pdf", "rb") NL = 0 # - very dirty hack; # - assumes Annotations are in the last fragment/chunk; while(TRUE) { tmp = readBin(con, "raw", 1024*128 + 515); if(length(tmp) == 0) break; x = tmp; # isNL = (x == 10) | (x == 13); isNL = (x == 13); isNL = isNL & (x[which(isNL) + 1] == 10); NL = NL + sum(isNL); } close(con) idP = which(isNL) idS = 935; # will vary with pdf and Annotations and ...; nLast = 4; # usually 2 chunks idx = idP[seq(idS, length.out = nLast)] # Check: Right position? # tmp = x[seq(idx[1] + 2, idx[1 + 2] - 1)] # intToUtf8(tmp) tmp = inflate(x[seq(idx[1] + 2, idx[nLast] - 1)]) intToUtf8(tmp$output) # Output of inflate: an Example # "/GS gs .56078434 .87058824 .97647059 rg\n # 337.298 183.836 m 364.322 183.836 l 364.322 171.83 l 337.298 171.83 l h f\n" # Note: /BBox[ 337.298 171.83 364.322 183.836] The raw pdf data: 1948 0 obj <>/C[ 0.560784 0.870588 0.976471]/CA 1/F 4/PDFIUM_HasGeneratedAP true/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 364.322 174.6]/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>> endobj 1949 0 obj <>/Subtype/Form/Type/XObject>>stream xœE˱ €0 Àž)~“äÛ™€ØP@ûKˆ"Оtó²¢ßjÉC©ðT#ŠBš›zª WŸH—Ò9(Aà š Kùäøų_iÀŽmz dR² endstream endobj [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Clustering Functions used by Reverse-Dependencies
Dear Ivan, Thank you very much for this interesting information. Regarding: "For well-behaved packages that declare their dependencies correctly, parsing the NAMESPACE for importFrom() and import() calls should give you the explicit imports." I did learn something new (I am not very experienced in package writing). Unfortunately, Roxygen2 as of the current version still suggests to use the pkg::fname approach: "If you are using just a few functions from another package, we recommending adding the package to the Imports: field of the DESCRIPTION file and calling the functions explicitly using ::, e.g., pkg::fun()." https://roxygen2.r-lib.org/articles/namespace.html Regarding analysing the actual code: it is good to know that CMD check has also some functionality. I will look into it, when I find some free time. tools:::.check_packages_used is a few pages of code. On the other hand, the help page for codetools::checkUsage is quite cryptic. But it's good to know at least where to look. Sincerely, Leonard From: Ivan Krylov Sent: Wednesday, February 28, 2024 10:36 AM To: Leo Mada via R-help Cc: Leo Mada Subject: Re: [R] Clustering Functions used by Reverse-Dependencies � Sat, 24 Feb 2024 03:08:26 + Leo Mada via R-help �: > Are there any tools to extract the function names called by > reverse-dependencies? For well-behaved packages that declare their dependencies correctly, parsing the NAMESPACE for importFrom() and import() calls should give you the explicit imports. (What if the package imports the whole dependency? The safe assumption is that all functions are used, but it comes with false positives. You could also walk the package code looking for function names that may belong to the imported package, but that may involve both false positives and false negatives.) For the rest of the imports and uses of weak dependencies, you'll have to walk the package code looking for the uses of the `::` operator. See how R CMD check walks the package code in functions tools:::.check_packages_used and codetools::checkUsage. A less-well-behaved package can always load a namespace during runtime and choose the functions to call depending on the phase of the moon or weather on Jupiter. For these, like for the halting problem, there's no general solution: the package could be written to say, "if Leonard's function says I'm about to call foo::bar, I won't do it, otherwise I will". -- Best regards, Ivan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Clustering Functions used by Reverse-Dependencies
Dear R Users, Are there any tools to extract the function names called by reverse-dependencies? I would like to group these functions using clustering methods based on the co-occurrence in the reverse-dependencies. Utility: It may be possible to split complex packages into modules with fewer reverse-dependencies. Package pkgdepR may offer some of the functionality; but I did not have time to fully explore it: https://cran.r-project.org/web/packages/pkgdepR/index.html If such tools are not yet available, I have opened an issue on GitHub and would like to collect any ideas: https://github.com/discoleo/PackageBrowser/issues/1 There are some model packages that could be used to test the clustering: 1) Rcpp, xml: the core-functionality will always be needed; splitting into modules may not be possible/useful; 2) pracma: most reverse-dependencies are likely using only a small subset of the functions in pracma; (there was some discussion on R-package-devel about reverse dependencies a few weeks ago) 3) survival: I have no idea in which category it falls - but it has a lot of reverse-dependencies; Note: I missed some important functionality from the pkgsearch package. I have started therefore work on a new package (PackageBrowser) - although it is not yet published on CRAN: https://github.com/discoleo/PackageBrowser It does not yet process recursively the reverse-dependencies; although this does not seem a big challenge. The real challenge is to parse the code and extract function names. I did some work in the past, but never had time for a full implementation. Many thanks, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Optimal use of optim?
Dear R-Users, I am interested in the optimal strategy for optim: Q: How to formulate the optimization problem? Q1: Are there benefits for abs(f(x)) vs (f(x))^2? Q2: Are there any limitations for using abs(...)? Regarding point 1: my feeling is that the gradients should be more robust with the abs(...) method; but I did not test it. Regarding point 2: Obviously, abs(f(x)) is not differentiable. This is a limitation from a mathematical point of view, but the gradient should be still computable using a finite difference method. Are there any other limitations? The question is based on the following example: solving gamma(x) = y for some given y. Function multiroot in package rootSolve fails when y < 1. The version using optim fares much better. However, I had to tweak somewhat the parameters in order to get a higher precision. This made me curious about the optimal strategy; unfortunately, I do not have much experience with optimizations. The example is also available on GitHub: https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R Sincerely, Leonard ### Example: gamma.invN = function(x, x0, lim, ..., ndeps = 1E-5, rtol = 1E-9, gtol = 0.0001, digits = 10) { v = x; if(length(lim) == 1) lim = c(lim - 1 + gtol, lim - gtol); cntr = c(list(...), ndeps = ndeps, pgtol = rtol); # abs(): should provide greater precision than ()^2 when computing the gradients; # param ndeps: needed to improve accuracy; r = optim(x0, \(x) { abs(gamma(x) - v); }, lower = lim[1], upper =lim[2], method = "L-BFGS-B", control = cntr); if( ! is.null(digits)) print(r$par, digits); return(r$par); } ### Example Euler = 0.57721566490153286060651209008240243079; x = gamma.invN(Euler, -3.5, lim = -3) gamma(x) - Euler x = gamma.invN(Euler, -3.9, lim = -3) gamma(x) - Euler [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Skip jumps in curve
Dear Duncan, Thank you very much for the response. I suspected that such an option has not been implemented yet. The plot was very cluttered due to those vertical lines. Fortunately, the gamma function is easy to handle. But the feature remains on my wishlist as useful more in general. Sincerely, Leonard From: Duncan Murdoch Sent: Tuesday, February 13, 2024 6:05 PM To: Leo Mada ; r-help@r-project.org Subject: Re: [R] Skip jumps in curve On 13/02/2024 10:29 a.m., Leo Mada via R-help wrote: > Dear R-Users, > > Is there a way to skip over without plotting the jumps/discontinuities in > curve()? > > I have not seen such an option, but maybe I am missing something. > > plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n = > 1000) { > curve(gamma(x), from = xlim[1], to = xlim[2], ylim=ylim, n=n); > if( ! is.null(hline)) abline(h = hline, col = "green"); > } > > Euler = 0.57721566490153286060651209008240243079; > plot.gamma(hline = Euler) > > Adding an option to the function curve may be useful: > options = c("warn", "silent", "unconnected") > > This is part of some experiments in math; but that's another topic. For > latest version: > https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R If you know where the discontinuities are, plot multiple times with the discontinuities as endpoints: plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n = 1000) { start <- floor(xlim[1]):floor(xlim[2]) end <- start + 1 start[1] <- xlim[1] end[length(end)] <- xlim[2] n <- round(n/length(start)) curve(gamma(x), from = start[1], to = end[1], ylim=ylim, n=n, xlim = xlim) for (i in seq_along(start)[-1]) curve(gamma(x), from = start[i], to = end[i], add = TRUE, n) if( ! is.null(hline)) abline(h = hline, col = "green"); } Euler = 0.57721566490153286060651209008240243079; plot.gamma(hline = Euler) If you don't know where the discontinuities are, it would be much harder, because discontinuities can be hard to detect unless the jumps are really big. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Skip jumps in curve
Dear R-Users, Is there a way to skip over without plotting the jumps/discontinuities in curve()? I have not seen such an option, but maybe I am missing something. plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n = 1000) { curve(gamma(x), from = xlim[1], to = xlim[2], ylim=ylim, n=n); if( ! is.null(hline)) abline(h = hline, col = "green"); } Euler = 0.57721566490153286060651209008240243079; plot.gamma(hline = Euler) Adding an option to the function curve may be useful: options = c("warn", "silent", "unconnected") This is part of some experiments in math; but that's another topic. For latest version: https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R Sincerely, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic astronomy package recommendation wanted.
Dear Richard, I wrote s small shiny app to improve the search for CRAN-packages. It can be installed from GitHub: https://github.com/discoleo/PackageBrowser Searching for "(?https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use of geometric mean for geochemical concentrations
Dear Rich, It depends how the data is generated. Although I am not an expert in ecology, I can explain it based on a biomedical example. Certain variables are generated geometrically (exponentially), e.g. MIC or Titer. MIC = Minimum Inhibitory Concentration for bacterial resistance Titer = dilution which still has an effect, e.g. serially diluting blood samples; Obviously, diluting the samples will generate the following concentrations: 1, 1/2, 1.4, 1/8, 1/16, ... (or the reciprocal: 1, 2, 4, 8, 16, ...) It makes no sense to compute the arithmetic mean. Results are usually reported as some quantile (median or 90%); alternatively, one computes the geometric mean. ### Ecology /Environmental Chemistry I suppose that certain chemicals may be generated/released in the environment through a non-linear process. The LLOD may also play a role, but may NOT be the main reason. If the generating process is exponential, then the arithmetic mean would strongly skew the results (also inconsistently based on season, particular year, etc - the generating processes may differ). ### Harmonic Mean Did not encounter it often: maybe because of the problematic handling of 0. I do have in the meantime a nice workaround for 0 (which also works with the geometric mean), see also (unfortunately not well documented): https://github.com/discoleo/R/blob/master/Stat/Moments.Stat.R v0 = 1; # some initial "skew" 1 /(xharm + v0) = sum( 1 / (x + v0) ) / length(x) xgeom = prod(x + v0)^(1/length(x)) - v0; I apologize for the late reply; I did not have much time to read messages during the past weeks. Sincerely, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fit NLE - was: computer algebra in R
Fit NLE - was: [R] computer algebra in R Original post: https://stat.ethz.ch/pipermail/r-help/2023-November/478619.html Dear Kornad, I think I have started to understand what you try to achieve. The problem is to fit a NLE and compute the parameters of the NL-Eq. I have included the R Help-list back in the loop, as I am not an expert in optimization. Goal: y ~ I0 + IHD * hd + ID * d; where: y = given vector of measurements; x = given vector of values; I0, IHD, ID, kd = parameters to optimize; hd = satisfies a polynomial of order 3; As d = d0 - hd, the previous formula can be written: y ~ I0 + ID * d0 + (IHD - ID) * hd; f(x, hd, kd) = 0, where f = a polynomial of order 3 in hd and order 2 in kd; d0 (and other components of the polynomial) = given constants; 1) First Approach I would back-substitute hd into the polynomial: hd = (y - I0 - ID*d0) / (IHD - ID); f(x, hd, kd) becomes then f(x, y, kd, I0, ID, IHD) = 0; - f is order 3 in y; You could fit: (y^3) ~ f(x, y, kd, I0, ID, IHD) - y^3, where you subtract the y^3 term from the function f, and add the (y^3) values as a new columng to the data.frame: data.frame(y3 = y^3, y=y, x=x) If the values of y are versy small (abs(y) << 1), then it may be wiser to fit: y ~ f(x, y^2, y^3, kd, I0, ID, IHD) - (y-term); But I am not an expert in these problems. Other R-users may be more helpful. 2.) Approach 2: Math I feel that the problem can be solved quasy-exactly as well. It is much harder with 4 parameters to optimize: - one needs to compute the 4 partial derivatives; - solve the resulting system of 4 polynomial equations; The system is polynomial; although it looks ugly and I am not inclined to do such calculations myself. I hope that you can get more useful answers from the R help-list. Sincerely, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] computer algebra in R
Dear Konrad, I presume that the system can be written as follows, where h0, d0, ga0, kga and kd are given: err1 = h + hd + hga - h0; err2 = d + hd - d0; err3 = ga + hga - ga0; err4 = hga - kga*h*ga; err5 = hd - kd*h*d; All error terms should be zero. Do you need (a) the symbolic solution or (b) is a numeric solution fine? I do not have any experience with yacas or caracas. But see below. ### Case (a) I did write a (very rudimentary) symbolic solver for systems of polynomial equations in pure R. It should be able to solve this one. Please let me know if this is of interest and I will provide further details. ### Case (b) Again, it should be possible to solve, see some examples: https://github.com/discoleo/R/blob/master/Math/Systems.NLE.Tutorial.md https://github.com/discoleo/R/blob/master/Math/Systems.Lambert.R I did solve much more complex ones, for a monster see: https://github.com/discoleo/R/blob/master/Math/Poly.System.S5.Ht.Formulas.Derivation.HS0.R [and the remaining Poly.System.S5.Ht... scripts] I presume that only the real roots are required (it's possible to solve systems with complex roots as well). I can provide further details as well. Sincerely, Leonard [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extract cells and their adjacent cells that may appear anywhere in a dataframe.
Hello, I have a dataframe that contains information about vegetation cover and percent coverage, collected using a quadrat. The dataframe is set up so that each row represents a single quadrat. If there are multiple species within one quadrat, they are all listed within the same row with respective % coverage always following in next column. Here is an example, species are represented as 4 letter codes: Quadrat_# Date Species_1 Species_2 covrage2 Species_3 covrage3 1 5/2/2017 unk1 2 bial 6 stgu 95 2 5/2/2017 bope 75 stja 4 ficy 9 3 5/2/2017 bope 100 stja 5 4 5/2/2017 stgu 87 stja 6 bope 20 5 5/2/2017 bg 13 stja 2 ficy 10 6 5/2/2017 bope 8 sirh 3 stgu 2 The problem is that the species were not recorded in any particular order, and not all species occur in every quadrat. There can also be any number of species per quadrat. I need to be able to extract each species AND it’s respective coverage, and place them in into another dataframe for further analysis So the result for “bope” from above example data would look like this: Quadrat# % coverage bope 1 0 2 75 3 100 4 20 5 0 6 8 Any help greatly appreciated. Brian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Forging an R enthusiast meeting at EGU 2017
At the European Geosciences Union Meeting (Spring 2017 in Vienna) there will be an interactive sessionthat aims to bring together scientists that share a common indispensable tool: the statistic software R. http://meetingorganizer.copernicus.org/EGU2017/session/24971 We have the feeling that R users are like little boats on an ocean at night, unaware of how close other people are that have already worked on the same problem earlier. This is the main motivation for this session in spring 2017. If you feel interested to contribute your work we would be delighted to see you enriching this session. If you don't think you will have time to come to the EGU 2017 we would be pleased if you could share this message among your colleagues to linking these small boats on the ocean at night. Kind regards and a pleasant autumn week, Micha Dietze [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Order of formula terms in model.matrix
Is it really the same model? Following the example provided by Lars: set.seed(1) x1 <- rnorm(100) f1 <- factor(sample(letters[1:3], 100, replace = TRUE)) trt <- sample(c(-1,1), 100, replace = TRUE) y <- factor(sample(c(0,1), 100, T)) df <- data.frame(y=y, x1=x1, f1=f1, trt=trt) fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial) coef(fit1) fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) coef(fit2) identical(fitted(fit1), fitted(fit2)) [1] FALSE ___ If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference. Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation pour les fins de reference future. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Order of formula terms in model.matrix
Thanks Peter. That make sense. Nevertheless, what comes at a surprise to me (and maybe to others) is that one can potentially get different fits by simply swapping the terms in the model formula. Best, Leo. -Original Message- From: peter dalgaard [mailto:pda...@gmail.com] Sent: 2016, January, 18 11:16 AM To: Guelman, Leo Cc: r-help@r-project.org; Charles C. Berry Subject: Re: [R] Order of formula terms in model.matrix On 18 Jan 2016, at 16:49 , Guelman, Leo <leo.guel...@rbc.com> wrote: > Is it really the same model? No, and I didn't say that they would be. I did say that they would be in the all-factor case, which does seem to be right: > df$trt <- factor(df$trt) > fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial) > fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) > plot(fitted(fit1), fitted(fit2)) # still differs > df$x1 <- factor(sample(c(-1,1), 100, replace = TRUE)) > fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial) > fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) > plot(fitted(fit1), fitted(fit2)) # looks like it's on diagonal > identical(fitted(fit1), fitted(fit2)) # wrong check [1] FALSE > all.equal(fitted(fit1), fitted(fit2)) # better [1] TRUE -pd > > Following the example provided by Lars: > > set.seed(1) > x1 <- rnorm(100) > f1 <- factor(sample(letters[1:3], 100, replace = TRUE)) trt <- > sample(c(-1,1), 100, replace = TRUE) y <- factor(sample(c(0,1), 100, > T)) df <- data.frame(y=y, x1=x1, f1=f1, trt=trt) > > fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial) > coef(fit1) > > fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) > coef(fit2) > > identical(fitted(fit1), fitted(fit2)) > [1] FALSE > > > > __ > _ > > If you received this email in error, please advise the sender (by return > email or otherwise) immediately. You have consented to receive the attached > electronically at the above-noted email address; please retain a copy of this > confirmation for future reference. > > Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur > immédiatement, par retour de courriel ou par un autre moyen. Vous avez > accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à > l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette > confirmation pour les fins de reference future. > -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com ___ If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference. Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation pour les fins de reference future. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Order of factor levels
Dear list, What is a better way relative to the one below to keep the order of factor levels created from cut()? Notice, I'm simply pasting letters to levels before converting to character so to keep the desired order of levels. This is not very elegant... I'm converting to character so I can call the helper fun with vapply() from the main fun. Removing this line of code " levels(xc) <- paste(letters[1:nlevels(xc)], levels(xc), sep=":")" would result in factor levels that are not ordered according to x1. set.seed(1) df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000)) main_fun <- function(data) { data.frame(vapply(data, helper_fun, character(nrow(df } helper_fun <- function(x) { xc <- cut(x, breaks = unique(quantile(x, seq(0, 1, 1/10), na.rm = TRUE)), include.lowest = TRUE) levels(xc) <- paste(letters[1:nlevels(xc)], levels(xc), sep=":") as.character(xc) } res <- main_fun(df) levels(res$x1) levels(res$x1) [1] "a:[-3.01,-1.34]""b:(-1.34,-0.882]" "c:(-0.882,-0.511]" "d:(-0.511,-0.296]" "e:(-0.296,-0.0353]" [6] "f:(-0.0353,0.245]" "g:(0.245,0.536]""h:(0.536,0.854]" "i:(0.854,1.32]" "j:(1.32,3.81]" > Thanks Leo. ___ If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference. Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation pour les fins de reference future. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How best to model these datasets
Hello, I am a stats novice and I was wondering what kind of model would work best with the two datasets that are displayed. The graphs show an incremental area analysis for two feral cat home ranges where area increases with increasing number of GPS fixes. Thanks, Brian -- Brian Leo School of Aquatic and Fishery Sciences University of Washington 1122 Boat St NE Box 355020 Seattle, WA 98195 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reading specific rows and columns xlsx file: Error in strsplit(names(res), \\.) : non-character argument
Hi, I am trying to read specific rows and columns from a xlsx file using the following commands: # Read rows 18-23 and columns 7-15 into R and assign the result to a variable # called dat colIndex - 18:23 rowIndex - 7:15 dat - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE, colIndex = colIndex, rowIndex = rowIndex) But I get this error: Error in strsplit(names(res), \\.) : non-character argument I don't understand where that non-character comes from. When I read the whole file with: naturalGas - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE) I don't get any problem. Should you be willing to try to reproduce this behavior, the file comes from: https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx sessionInfo()R version 3.1.1 (2014-07-10) Platform: x86_64-pc-linux-gnu (64-bit) I'm on debian Jessie. Disclaimer: this is a homework. Thank you in advance -- Best regards, Dr. Margherita DI LEO Scientific / technical project officer European Commission - DG JRC Institute for Environment and Sustainability (IES) Via Fermi, 2749 I-21027 Ispra (VA) - Italy - TP 261 Tel. +39 0332 78 3600 margherita.di-...@jrc.ec.europa.eu Disclaimer: The views expressed are purely those of the writer and may not in any circumstance be regarded as stating an official position of the European Commission. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading specific rows and columns xlsx file: Error in strsplit(names(res), \\.) : non-character argument
Ahem.. solved. I had switched between rows and columns.. Sorry for the noise. Best, m On Sun, Aug 3, 2014 at 6:33 PM, Margherita Di Leo direg...@gmail.com wrote: Hi, I am trying to read specific rows and columns from a xlsx file using the following commands: # Read rows 18-23 and columns 7-15 into R and assign the result to a variable # called dat colIndex - 18:23 rowIndex - 7:15 dat - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE, colIndex = colIndex, rowIndex = rowIndex) But I get this error: Error in strsplit(names(res), \\.) : non-character argument I don't understand where that non-character comes from. When I read the whole file with: naturalGas - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE) I don't get any problem. Should you be willing to try to reproduce this behavior, the file comes from: https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx sessionInfo()R version 3.1.1 (2014-07-10) Platform: x86_64-pc-linux-gnu (64-bit) I'm on debian Jessie. Disclaimer: this is a homework. Thank you in advance -- Best regards, Dr. Margherita DI LEO Scientific / technical project officer European Commission - DG JRC Institute for Environment and Sustainability (IES) Via Fermi, 2749 I-21027 Ispra (VA) - Italy - TP 261 Tel. +39 0332 78 3600 margherita.di-...@jrc.ec.europa.eu Disclaimer: The views expressed are purely those of the writer and may not in any circumstance be regarded as stating an official position of the European Commission. -- Best regards, Dr. Margherita DI LEO Scientific / technical project officer European Commission - DG JRC Institute for Environment and Sustainability (IES) Via Fermi, 2749 I-21027 Ispra (VA) - Italy - TP 261 Tel. +39 0332 78 3600 margherita.di-...@jrc.ec.europa.eu Disclaimer: The views expressed are purely those of the writer and may not in any circumstance be regarded as stating an official position of the European Commission. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (Seismic) signal deconvolution with R
Has anybody (successfully) tried to deconvolve a seismic signal (time series in general) using R and package signal? I have the Matlab routines: [bb,aa] = zp2tf(zero,pole',gain); [bb,aa] = bilinear(bb,aa,df); transfer = freqz(bb,aa,f,df); But the (apparent) equivalents of R produce quite different results: ZPG - Zpg(zero = zero, pole = pole, gain = gain) B - bilinear(ZPG, T = 1/df) transfer - freqz(h = B, Fs = 1/df, n = f) In Matlab, one complex vector (transfer) is returned, in R a different complex vector and a numeric vector are returned. Are the problems in the specification of zeros and poles (Hz vs. rad/s) or in parameter setting of freqz()? Or does anyone know someone who uses R for signal processing? Best and Thanks, Micha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Call for papers: Geospatial Semantic Array Programming
** Apologies for any cross posting ** Earthzine http://www.earthzine.org/, an IEEE-sponsored online scientific journal, is soliciting articles of 800-3,000 words for its second quarter theme of 2014 on *Geospatial Semantic Array Programming *(GeoSemAP). We seek contributions from all regions of the globe, addressing environmental transdisciplinary research in which a concise integration of array-based semantics and array programming, geospatial tools and a modular composition of data-transformation models are exploited for geospatial problems within the paradigm of Semantic Array Programming. This theme specifically focuses on wide-scale transdisciplinary modelling for environment (WSTMe) as a scientific challenge with an increasingly important role in allowing strategic policy-making to be effectively discussed and programmed with the support of robust science. See the call for paper at http://www.earthzine.org/2013/12/18/call-for-papers-geospatial-semantic-array-programming/ and download it in PDF.http://www.earthzine.org/wp-content/uploads/2013/12/Call-For-Papers-GeoSemAP.pdf http://www.earthzine.org/wp-content/uploads/2013/12/Call-For-Papers-GeoSemAP.pdf -- Best regards, Dr. Margherita DI LEO Scientific / technical project officer European Commission - DG JRC Institute for Environment and Sustainability (IES) Via Fermi, 2749 I-21027 Ispra (VA) - Italy - TP 261 Tel. +39 0332 78 3600 margherita.di-...@jrc.ec.europa.eu Disclaimer: The views expressed are purely those of the writer and may not in any circumstance be regarded as stating an official position of the European Commission. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] forecasting accuracy problem in R
Hi, A few weeks back I used the following command: accuracy(train,test) where train and test are training and test data respectively. Last night I updated R and the forecast package and used the same command and I got error. After trying a little I used the following command accuracy(train,test[1:30]) and it worked. (I was checking the accuracy of 30 forecasted values). Is there some change in the forecast package or did I do something wrong? regards Leo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] forecasting accuracy problem in R
Thanks. Could be the following reason given in the changelog. accuracy() can now figure out overlapping times for x and f. best regards Leo On Mon, Dec 17, 2012 at 10:11 AM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: Please read the posting guide. Packages have their own developers, as well as their own change logs [in this case http://cran.r-project.org/web/packages/forecast/ChangeLog]. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Leo spee...@gmail.com wrote: Hi, A few weeks back I used the following command: accuracy(train,test) where train and test are training and test data respectively. Last night I updated R and the forecast package and used the same command and I got error. After trying a little I used the following command accuracy(train,test[1:30]) and it worked. (I was checking the accuracy of 30 forecasted values). Is there some change in the forecast package or did I do something wrong? regards Leo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Forecast Package: Draw two lines on the same plot
Hi, How is it possible to draw to different data on the same graph using forecast package? The first is the observed data and the second set is the fitted values. I want the observed data to show as solid line while the fitted values to show as dashed or dotted line. regards Leo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] formula method with special characters
Dear List, I'm trying to create a formula method, allowing for a special character in the formula (i.e., similar to for example the gam package with the character s in y ~ s(x)). I've checked and it seems this is done through attr(,specials). However, the section of code below (as an example extracted from the gam package) gives me an error as shown at the end. Evidently I'm missing something here...any help is much appreciated. Thanks, Leo. gam2.slist - c(s, lo, random) gam2 - function(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, control = gam.control(...), model = FALSE, method=glm.fit, x = FALSE, y = TRUE, ...) { call - match.call() mf - match.call(expand.dots = FALSE) m - match(c(formula, data, subset, weights, na.action, etastart, mustart, offset), names(mf), 0) mf - mf[c(1, m)] mf$drop.unused.levels - TRUE mf[[1]] - as.name(model.frame) mt - if(missing(data)) terms(formula, gam2.slist) else terms(formula,gam2.slist,data = data) mf$formula-mt mf - eval(mf, parent.frame()) } set.seed(1) x - runif(1000) y - runif(1000) dd - data.frame(y, x) g - gam2(y ~ s(x), data =dd) Error in eval(expr, envir, enclos) : could not find function s ___ This email may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this email or the information it contains by other than an intended recipient is unauthorized. If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference. Ce courriel est confidentiel et protégé. L'expéditeur ne renonce pas aux droits et obligations qui s'y rapportent. Toute diffusion, utilisation ou copie de ce courriel ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) désigné(s) est interdite. Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation pour les fins de reference future. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] smooth scatterplot and geo map
check smooth.ppp{spatstat} which performs spatial smoothing based on a Gaussian kernel...it has a plot method that may do what you are looking for hope this helps, Leo. On Thu, Jul 28, 2011 at 4:09 PM, marco marco.mile...@aim.uzh.ch wrote: Hello everybody, I'm trying to understand how to draw a smoothed scatterplot on a geographic map with R. Have a dataframe with point locations (long, lat) and was able to simply plot these points on a shp map by using the maptools package. However, instead of having simply the raw points on the map, I would like to have a smoothed scatterplot of the same superimposed on the map. Tried with the smoothScatter function, but really have no idea how to combine the map with the resulting graph. tx in adv marco -- View this message in context: http://r.789695.n4.nabble.com/smooth-scatterplot-and-geo-map-tp3702374p3702374.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple example of using a closure in R to manage bank accounts?
something like this maybe? account - function (balance=0) { function (d = 0,w = 0) { newbal - balance + d - w balance - newbal newbal } } John - account(100) John John(d=100,w=50) John() Leo - account(1000) Leo Leo(d=1000,w=50) Leo() Leo(d=100,w=500) Leo() Leo. On Mon, Jul 25, 2011 at 6:00 PM, Michael Hannon jm_han...@yahoo.com wrote: Greetings. I once ran across a simple (toy) example of using a closure in R to manage bank accounts. I've got a use for it now but can no longer find it. If you have it (or a similar example), will you please send it to me? (Unfortunately, a web search that includes the terms bank and closure leads into a whole pile of unrelated stuff.) Thanks, -- Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with contrasts
Or alternatively (though very similar to Peter's idea) you can do ci - contrasts formals(ci)$contrasts - FALSE dd - data.frame(a = gl(3,4), b = gl(4,1,12)) mm - model.matrix(~ a + b, dd, contrasts = list(a=ci(dd$a), b=ci(dd$b))) Best, Leo. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard Sent: 2011, May, 11 7:31 AM To: Lars Bishop Cc: R-help@r-project.org Help Subject: Re: [R] Help with contrasts On May 11, 2011, at 12:51 , Lars Bishop wrote: Hi, I need to build a function to generate one column for each level of a factor in the model matrix created on an arbitrary formula (instead of using the available contrasts options such as contr.treatment, contr.SAS, etc). My approach to this was first to use the built-in function for contr.treatment but changing the default value of the contrasts argument to FALSE (I named this function contr.identity and it shown at the bottom of the email for reference). So this function works fine, contr.identity(4) 1 2 3 4 1 1 0 0 0 2 0 1 0 0 3 0 0 1 0 4 0 0 0 1 contr.treatment(4) 2 3 4 1 0 0 0 2 1 0 0 3 0 1 0 4 0 0 1 However, when I try to create a model matrix using contr.identity specified in options(), it actually uses the contr.treatment option. Why is that? Any hint on how can I do this? It's not actually using contr.treatment, it's just calling contr.identity with contrasts=TRUE... I don't think there's a painless way to avoid this. The closest I can think of is cI - contr.treatment formals(cI)$contrasts - FALSE dd - data.frame(a = gl(3,4), b = gl(4,1,12)) model.matrix(~ C(a,cI,3) + C(b,cI,4), dd) -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ___ This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately. Ce courriel peut contenir des renseignements protégés et confidentiels. Lexpéditeur ne renonce pas aux droits et obligations qui sy rapportent. Toute diffusion, utilisation ou copie de ce courriel ou des renseignements quil contient par une personne autre que le destinataire désigné est interdite. Si vous recevez ce courriel par erreur, veuillez men aviser immédiatement, par retour de courriel ou par un autre moyen. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reproducibility issue in gbm (32 vs 64 bit)
I'm guessing this has something to do with numerical precision on the two platforms. Leo. - Original Message - From: Joshua Wiley jwiley.ps...@gmail.com To: Axel Urbiz axel.ur...@gmail.com Cc: R-help@r-project.org R-help@r-project.org; Ridgeway, Greg Sent: Fri Feb 25 22:16:02 2011 Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit) Hi Axel, I do not have a nice explanation why the results differ off the top of my head. I can say I can replicate what you get on 32/64 (both Windows 7) bit with the development version of R and gbm_1.6-3.1. Here is an even simpler example that shows the difference: gbmfit - gbm(1:50 ~ I(50:1) + I(60:11), distribution = gaussian) summary(gbmfit) I copied that package maintainer. Cheers, Josh On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz axel.ur...@gmail.com wrote: Dear List, The gbm package on Win 7 produces different results for the relative importance of input variables in R 32-bit relative to R 64-bit. Any idea why? Any idea which one is correct? Based on this example, it looks like the relative importance of 2 perfectly correlated predictors is diluted by half in 32-bit, whereas in 64-bit, one of these predictors gets all the importance and the other gets none. I found this interesting. ### Sample code library(gbm) set.seed(12345) xc=matrix(rnorm(100*20),100,20) y=sample(1:2,100,replace=TRUE) xc[,2] - xc[,1] gbmfit - gbm(y~xc[,1]+xc[,2] +xc[,3], distribution=gaussian) summary(gbmfit) ### Results on R 2.12.0 (32-bit) var rel.inf 1 xc[, 3] 49.76143 2 xc[, 1] 27.27432 3 xc[, 2] 22.96425 ### Results on R 2.12.0 (64-bit) summary(gbmfit) var rel.inf 1 xc[, 1] 50.23857 2 xc[, 3] 49.76143 3 xc[, 2] 0.0 Thanks, Axel. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ This email message is for the sole use of the intended r...{{dropped:30}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New R User Group in Toronto
Dear R users, I'm pleased to announce that the Greater Toronto Area (GTA) user's group is now active on meetup.com, and taking suggestions for the first meeting. If you are on the region, you can sign-up now at the link below http://www.meetup.com/Greater-Toronto-Area-GTA-R-Users-Group This group is aimed to bring together practitioners (from industry and academia alike) in order to exchange knowledge and experience on using R for data analysis, statistical modeling, visualization, data mining, predictive analytics, statistical computing, exploratory data analysis, etc. Users from all levels are welcome. Special thanks to Revolution Analytics to help us get started and sponsor the group. Kind Regards, Leo Guelman. ___ This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately. Ce courriel peut contenir des renseignements protégés et confidentiels. LÂexpéditeur ne renonce pas aux droits et obligations qui sÂy rapportent. Toute diffusion, utilisation ou copie de ce courriel ou des renseignements quÂil contient par une personne autre que le destinataire désigné est interdite. Si vous recevez ce courriel par erreur, veuillez mÂen aviser immédiatement, par retour de courriel ou par un autre moyen. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can this code be written more efficiently?
Dear users, I'm working on binary classification problem using Support Vector Machines (SVM). My objective is to train a series of SVM models on a grid of hyperparameters and then select those that maximize the AUC based on an independent validation sample. My attempted code is shown below. It runs well on small data sets but when I use it on a slightly larger sample (e.g., my train data is composed of about 8,000 observations on each class and 21 inputs), it takes forever to run (more than 1 day already and still running). I'm wondering if there's any way I can optimize this code. Thanks in advance for any help. I'm using 64-bit R 2.11.1 on Win 7. Start Code library(e1071) library(ROCR) ### Create grid of hyperparameters Gseq - seq(-15,3,2); G - rep(2, length(Gseq)); G - G^Gseq Cseq - seq(-5,13,2); C - rep(2, length(Cseq)); C - C^Cseq mygrid - expand.grid(C=C, G=G) ### Train models svm.models - lapply(1:nrow(mygrid), function(i) { svm(churn.form, data = mytraindata, method = C-classification, kernel = radial, cost = mygrid[i,1], gamma = mygrid[i,2], probability=TRUE) }) ### Predict on test set pred.step3 - numeric(length(svm.models)) for (i in 1:length(svm.models)) { pred.step1 - predict(svm.models[[i]], myvaliddata, decision.values = F, probability=T) pred.step2 - prediction(predictions=attr(pred.step1,probabilities)[,1], labels=myvaliddata$churn) pred.step3[i] - performance(pred.step2, auc)@y.values[[1]] } pred.step3 End Code Thanks, Leo. ___ This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately. Ce courriel peut contenir des renseignements protégés et confidentiels. LÂexpéditeur ne renonce pas aux droits et obligations qui sÂy rapportent. Toute diffusion, utilisation ou copie de ce courriel ou des renseignements quÂil contient par une personne autre que le destinataire désigné est interdite. Si vous recevez ce courriel par erreur, veuillez mÂen aviser immédiatement, par retour de courriel ou par un autre moyen. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] TM Package - Corpus function - Memory Allocation Problems
I'm using R 2.11.1 on Win XP (32-bit) with 3 GB of RAM. My data has (only) 16.0 MB. I want to create a VCorpus object using the Corpus function in the tm package but I'm running into Memory allocation issues: Error: cannot allocate vector of size 372 Kb. My data is stored in a csv file which I've imported with read.csv and then used the following to create the Corpus (but it failed with the error message above) txt - Corpus(DataframeSource(txt)) I've even tried to subset ~ 10% of my data but I run into the same error. What is a the best way to solve this memory problem other than increasing a physical RAM? Thanks in advance for any help, Leo. ___ This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately. Ce courriel peut contenir des renseignements protégés et confidentiels. LÂexpéditeur ne renonce pas aux droits et obligations qui sÂy rapportent. Toute diffusion, utilisation ou copie de ce courriel ou des renseignements quÂil contient par une personne autre que le destinataire désigné est interdite. Si vous recevez ce courriel par erreur, veuillez mÂen aviser immédiatement, par retour de courriel ou par un autre moyen. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] different outcomes of P values in SPSS and R
I have been using generalized linear models in SPSS 18, in order to build models and to calculate the P values. When I was building models in Excel (using the intercept and Bs from SPSS), I noticed that the graphs differed from my expectations. When I ran the dataset again in R, I got totally different outcomes for both the P values as well as the Bs and the intercepts. The outcomes of R seem much more likely to be the correct ones, but I really cannot explain the differences. Does anyone have experience with these differences? I'm using Generalized linear models with Binary Logistics in SPSS and glm(formula = EPO_YN ~ frequ_ind + frequ_ind2 + frequ_preFDS, family = binomial(link = logit), data = w) in R. -- View this message in context: http://r.789695.n4.nabble.com/different-outcomes-of-P-values-in-SPSS-and-R-tp2324181p2324181.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] fit data with y = x^-1
Dear list, I am getting weired with fitting data with a 1/x-polynomial. Suggest I have the following data: x - c(1,2,3,4,5,6,7) y - c(100,20,4,2,1,.3,.1) I may fit this with a linear model fit1 = lm(y ~ I(x)) Getting plot out of this model I applied library(polynom) pol1 = polynomial(fit1$coefficients) f1 = as.function(pol1) plot(x,y) lines(x, f1(x), col = 2) Clearly, this model appears insufficient. Therefore I intended to aplly fit2 = lm(y ~ I(1/x)) pol2 = polynomial(fit2$coefficients) f2 = as.function(pol2) lines(x, f2(x), col = 3) as this would be much more pleasant. However, the second model does not work at all. Does anyone know about how to correctly fit an x^-1-polynomial to my data cloud? Many thanks in advance, behelfsadresse [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Linear Contrasts in GLM - Query
Hi, Is there a way I can specify linear contrasts in glm? I'm looking for something equivalent to SAS' contrast statement. I'd like to do the following, suppose I have a categorical input with 4 levels (a,b,c,d), I'd like to test something like: (i) a+b=c+d, (ii) a=b, (iii) a=b+d, etc... Thanks in advance for your help! Leo. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Grouping data in a data frame: is there an efficient way to do it?
I have a data frame with about 10^6 rows; I want to group the data according to entries in one of the columns and do something with it. For instance, suppose I want to count up the number of elements in each group. I tried something like aggregate(my.df$my.field, list(my.df$my.field), length) but it seems to be very slow. Likewise, the split() function was slow (I killed it before it completed). Is there a way to efficiently accomplish this in R?.. I am almost tempted to write an external Perl/Python script entering every row into a hashtable keyed by my.field and iterating over the keys... Might this be faster?.. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Grouping data in a data frame: is there an efficient way to do it?
Thanks everyone for the useful suggestions. The bottleneck might be memory limitations of my machine (3.2GHz, 2 GB) and the fact that I am aggregating on a field that is a string. Using the suggested as.data.frame(table(my.df$my.field)) I do get a speedup, but the computation still takes 30 seconds. For the sake of comparison, I did write the counting up rows with common values function using a Perl hash (it's only 5 lines of Perl) and it takes 15 seconds to run -- a 2x speedup. Not yet sure if it's worth the hassle. --Leo On Wed, Sep 2, 2009 at 4:28 PM, David M Smithda...@revolution-computing.com wrote: You may want to try using isplit (from the iterators package). Combined with foreach, it's an efficient way of iterating through a data frame by groups of rows defined by common values of a columns (which I think is what you're after). You can speed things up further if you have a multiprocessor system with the doMC package to run iterations in parallel. There's an example here: http://blog.revolution-computing.com/2009/08/blockprocessing-a-data-frame-with-isplit.html Hope this helps, # David Smith On Wed, Sep 2, 2009 at 3:39 PM, Leo Alekseyev dnqu...@gmail.com wrote: I have a data frame with about 10^6 rows; I want to group the data according to entries in one of the columns and do something with it. For instance, suppose I want to count up the number of elements in each group. I tried something like aggregate(my.df$my.field, list(my.df$my.field), length) but it seems to be very slow. Likewise, the split() function was slow (I killed it before it completed). Is there a way to efficiently accomplish this in R?.. I am almost tempted to write an external Perl/Python script entering every row into a hashtable keyed by my.field and iterating over the keys... Might this be faster?.. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- David M Smith da...@revolution-computing.com Director of Community, REvolution Computing www.revolution-computing.com Tel: +1 (206) 577-4778 x3203 (San Francisco, USA) Check out our upcoming events schedule at www.revolution-computing.com/events __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read floats from file into array
hi all, i have a simple question. instead of defining my measurements in a static way like ... x - c(-0.475, -1.553, -0.434, -1.019, 0.395) ... i'd like them to be read from a file ... x - read.table(07a673ac0cb1f7f8fa293860566f633c/1/raw0.txt, header=FALSE) d1 - density(x, kernel = gaussian) with a formatting that looks like: 4.284000e-01 6.758333e-01 8.292000e-01 7.856667e-01 6.633667e-01 5.408000e-01 4.728333e-01 4.377000e-01 4.374333e-01 4.102667e-01 3.628333e-01 3.277000e-01 4.909667e-01 [...] R quits and says: d1 - density(x, kernel = gaussian) Error in density.default(x, kernel = gaussian) : argument 'x' must be numeric Calls: density - density.default Execution halted is there any possibility to convert this / make this work? big thx for help __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read floats from file into array
big thanks to all, the x[[1]] worked fine! :) 2009/7/21 Duncan Murdoch murd...@stats.uwo.ca: On 21/07/2009 6:09 AM, leo mueller wrote: hi all, i have a simple question. instead of defining my measurements in a static way like ... x - c(-0.475, -1.553, -0.434, -1.019, 0.395) ... i'd like them to be read from a file ... x - read.table(07a673ac0cb1f7f8fa293860566f633c/1/raw0.txt, header=FALSE) d1 - density(x, kernel = gaussian) with a formatting that looks like: 4.284000e-01 6.758333e-01 8.292000e-01 7.856667e-01 6.633667e-01 5.408000e-01 4.728333e-01 4.377000e-01 4.374333e-01 4.102667e-01 3.628333e-01 3.277000e-01 4.909667e-01 [...] R quits and says: d1 - density(x, kernel = gaussian) Error in density.default(x, kernel = gaussian) : argument 'x' must be numeric Calls: density - density.default Execution halted is there any possibility to convert this / make this work? read.table returns a dataframe, i.e. a list of vectors. density wants a vector. So you will probably get what you want using d1 - density(x[[1]], kernel=gaussian) You can use names(x) to find the name of the 1st column for a nicer syntax; it is probably V1 (for variable 1), so you could do y - x$V1 density(y, ...) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R Courses in Toronto
Dear R users, Does anybody know if there any good R courses within the Toronto area? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vim R plugin-2
Any alternative ways of sending info both ways from R to any open process (vim) in windows? On windows, I'd rather use ole automation. A few years ago I successfully used this plugin: http://www.vim.org/scripts/script.php?script_id=889 I haven't used it since though. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Rdonlp2 -Query
Hi, Did anyone used this package? Could you please share your thought on it? Thanks! L. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] GMM estimation
Hello there!!! Sorry to bother you all with such question and difficulties that I have been facing on. Recently I have been searching for packages to run GMM estimatives with R. I have been searching for such packages for a while, but since I am a new user of R system, my quest so far was unsucessful. That´s why I had decided to ask to this forum. Hope that anyone could help me! I know that such emails might bother you guys at all... but I dont know where to search for help about this issue. Many thanks!!! hope to hear from you soon! Leo! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with simulation of heteroskedasticity
Hello guys! Sorry to bother with such a question I was trying to generate a monte carlo simulation with heteroskedasticity errors. but I am not sure if the command line that I had wrote is quite correct. the type of heteroskedasticity that I want to create is such as var(e) = var(x^4) I began my work with this x- rnorm (100, 2,0.4) # generating an indepedent random variable e- rnorm(100,0,x^2) # generating the error term y.fitted- 0.5*(x) +3 y- y.fitted+e And then I wrote an for loop code which could give me the results of a t-test. It seems that the simulation worked good because the main result is that the t-test was no longer meanfull (as it is expected) But my main doubt is if this code was capable to generate the type of heteroskedasticity that I wanted. I tried to understand how the rnorm function generates the numbers but i could not grab much So could anyone help me telling if this kind of work is okay? or is there any moree inteligent way to do the job? thanks!!! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.