from:"Leo"

Re: [R] Tools to modify highlighted areas in pdf documents?

2024-06-03 Thread Leo Mada via R-help

Dear Ivan,

Thank you very much for the hint. I have started to test it.

- it offers more colours and types of highlighting than MS Edge;
- it seems to have better word-boundary detection than MS Edge (but I haven't 
tested all the cases yet);


There are some nit-picks:
- I wish it had a better default color-pallet;
- the vertical positioning continues to be sometimes sub-optimal: editing 
manually the coordinates may be still useful;

Returning to R:
1. Some of the pdf-packages could implement some of the 
annotation-functionality as well.

2. It would be useful to be able to export the annotations and import/merge 
them in another document. I have spotted errors in various articles; such a 
functionality would be handy, if a new version of those articles gets published.

Sincerely,

Leonard

From: Ivan Krylov 
Sent: Sunday, June 2, 2024 8:02 PM
To: Leo Mada via R-help 
Cc: Leo Mada 
Subject: Re: [R] Tools to modify highlighted areas in pdf documents?

� Sat, 1 Jun 2024 16:16:23 +
Leo Mada via R-help  �:

> When highlighting pdf-documents with Microsoft Edge, the bounding box
> is sometimes misplaced, and quite ugly so. It also lacks the ability
> to draw lines or arrows.
>
> On the other hand, I did not get used to Acrobat Reader: it usually
> involves much more effort to add specific highlights. Lines can be
> drawn, but are NOT straight!

Sorry for answering a different question, but have you considered using
a different PDF viewer + annotation application? Okular
<https://eu01.z.antigena.com/l/iZmB5crNT773HTSUSt2S6McW6mdP5phHyzgXromFHINsN6Uo6BMnSdZK0kmSK~aLduXw-YpIAWy-DV9bac-5U3grBsLdYxuX7aMmbxQjKSLGCqTyJG54WQ2W7oaR2NEPTWiFotbkB4_eQTzHI3L-cAgOgsS4exJ4ie4BHLt
 > is free and available on Windows (including
from outside Microsoft store). Its annotation features include all
kinds of highlights, arrows and lines, both straight and
arbitrarily-shaped, quickly available from the "annotations" panel.

--
Best regards,
Ivan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R code for overlapping variables -- count

2024-06-02 Thread Leo Mada via R-help

Correcting a small glitch - see new code.

From: Leo Mada 
Sent: Sunday, June 2, 2024 8:34 PM
To: Shadee Ashtari 
Cc: r-help@r-project.org 
Subject: [R] R code for overlapping variables -- count

Dear Shadee,

If you have a data.frame with the following columns:

n = 100; # population size
x = data.frame(
  Sex = sample(c("M","F"), n, T),
  Country = sample(c("AA", "BB", "US"), n, T),
  Income  = as.factor(sample(1:3, n, T))
)

# Dummy variable
ONE = rep(1, nrow(x))

# corrected
r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
r = r[, c("Country", "Income", "Sex", "ONE")]
names(r)[4] = "Count"
print(r)

It is possible to write more simple code, if you need only the particular 
combination of variables (which you specified in your mail). But this is the 
more general approach.

Note: you may want to use "sum" instead of "length", e.g. if you have a column 
specifying the number of individuals in that category.

Hope this helps,

Leonard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R code for overlapping variables -- count

2024-06-02 Thread Leo Mada via R-help

Dear Shadee,

If you have a data.frame with the following columns:

n = 100; # population size
x = data.frame(
  Sex = sample(c("M","F"), n, T),
  Country = sample(c("AA", "BB", "US"), n, T),
  Income  = as.factor(sample(1:3, n, T))
)

# Dummy variable
ONE = rep(1, nrow(x))

r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
r = r[, c("Country", "Income", "Sex")]
print(r)

It is possible to write more simple code, if you need only the particular 
combination of variables (which you specified in your mail). But this is the 
more general approach.

Note: you may want to use "sum" instead of "length", e.g. if you have a column 
specifying the number of individuals in that category.


Hope this helps,

Leonard


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tools to modify highlighted areas in pdf documents?

2024-06-01 Thread Leo Mada via R-help

Dear Bert,

Thank you very much for the response.

I was aware of pdftools - but did not recall any such functionality. I have 
checked again (both pdftools, qpdf and the 3rd one): unfortunately, they do not 
implement such functionality. There might be other packages, which I missed.

However, the functionality is feasible. I will add a few more details - maybe 
someone picks up the task.

It is possible to edit manually the pdf-file, though it is quite cumbersome to 
find the right annotation.

1. One needs to edit the values both in the \QuadPoints and the \Rect in the 
\AP object.

2. Modifying the color is trickier:
\C() encodes the color and \CA the alpha channel (= 1): but neither Acrobat, 
nor MIcrosoft Edge update the color. The value of the color encoded in the 
stream is used instead.

It is possible to "trick" Edge: modify the \C color and set "\ca 1" (in the 
stream block) to a lower value (e.g. "\ca 0.99"). MS Edge will then accept the 
modified color (but Acrobat ignores it). Changing the value in the stream is 
the actual solution.

Note: non-rectangular shapes can be specified as well.

I hope that some of the referenced packages pick up this task.

Sincerely,

Leonard


From: Bert Gunter 
Sent: Saturday, June 1, 2024 9:23 PM
To: Leo Mada 
Cc: r-help@r-project.org 
Subject: Re: [R] Tools to modify highlighted areas in pdf documents?

Search!

on rseek.org<http://rseek.org>, the query "modify pdf documents in R" brought 
up the staplr package. A quick web search with the same query brought up the 
pdftools package.

These were cursory efforts, so you may well find more. You will have to 
determine whether and to what degree any meet your needs.

-- Bert

On Sat, Jun 1, 2024 at 9:16 AM Leo Mada via R-help 
mailto:r-help@r-project.org>> wrote:
Dear R-Users,

Are there any packages that enable the modifications of highlighted areas / 
annotations in pdf documents?

It seems feasible - I have explored some R code (see below). However, I would 
rather avoid to reinvent the wheel.

The problem:
When highlighting pdf-documents with Microsoft Edge, the bounding box is 
sometimes misplaced, and quite ugly so. It also lacks the ability to draw lines 
or arrows.

On the other hand, I did not get used to Acrobat Reader: it usually involves 
much more effort to add specific highlights. Lines can be drawn, but are NOT 
straight!

Are there tools to change the size/position of highlights?
Or to add highlights and underline words?
 Changing position/size manually by editing the data in the pdf-document is 
possible. Changing the color is more trickier (somehow possible in Microsoft 
Edger; though the direct approach to rewrite the actual stream is better). 
Maybe there are some tools to do it?

Some R code is below.

Sincerely,

Leonard
#

library(zip)

con = file("_some_pdf_.pdf", "rb")

NL = 0
# - very dirty hack;
# - assumes Annotations are in the last fragment/chunk;
while(TRUE) {
tmp = readBin(con, "raw", 1024*128 + 515);
if(length(tmp) == 0) break;
x = tmp;
# isNL = (x == 10) | (x == 13);
isNL = (x == 13);
isNL = isNL & (x[which(isNL) + 1] == 10);
NL = NL + sum(isNL);
}

close(con)

idP = which(isNL)

idS = 935; # will vary with pdf and Annotations and ...;
nLast = 4; # usually 2 chunks
idx = idP[seq(idS, length.out = nLast)]

# Check: Right position?
# tmp = x[seq(idx[1] + 2, idx[1 + 2] - 1)]
# intToUtf8(tmp)

tmp = inflate(x[seq(idx[1] + 2, idx[nLast] - 1)])
intToUtf8(tmp$output)

# Output of inflate: an Example
# "/GS gs .56078434 .87058824 .97647059 rg\n
# 337.298 183.836 m 364.322 183.836 l 364.322 171.83 l 337.298 171.83 l h f\n"

# Note: /BBox[ 337.298 171.83 364.322 183.836]

The raw pdf data:

1948 0 obj
<>/C[ 0.560784 0.870588 0.976471]/CA 1/F 
4/PDFIUM_HasGeneratedAP true/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 
364.322 174.6]/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>>
endobj
1949 0 obj
<>>>>>/Subtype/Form/Type/XObject>>stream
xœEË±
€0  Àž)~ “ä Û™€ Ø P@ ûKˆ"Ðžtó²¢ß jÉC© ðT#ŠBš›zª
WŸH—Ò 9(AÃ  š
KùäøÅ³ _ iÀŽmz dR ²
endstream
endobj


[[alternative HTML version deleted]]

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Tools to modify highlighted areas in pdf documents?

2024-06-01 Thread Leo Mada via R-help

Dear R-Users,

Are there any packages that enable the modifications of highlighted areas / 
annotations in pdf documents?

It seems feasible - I have explored some R code (see below). However, I would 
rather avoid to reinvent the wheel.

The problem:
When highlighting pdf-documents with Microsoft Edge, the bounding box is 
sometimes misplaced, and quite ugly so. It also lacks the ability to draw lines 
or arrows.

On the other hand, I did not get used to Acrobat Reader: it usually involves 
much more effort to add specific highlights. Lines can be drawn, but are NOT 
straight!

Are there tools to change the size/position of highlights?
Or to add highlights and underline words?
 Changing position/size manually by editing the data in the pdf-document is 
possible. Changing the color is more trickier (somehow possible in Microsoft 
Edger; though the direct approach to rewrite the actual stream is better). 
Maybe there are some tools to do it?

Some R code is below.

Sincerely,

Leonard
#

library(zip)

con = file("_some_pdf_.pdf", "rb")

NL = 0
# - very dirty hack;
# - assumes Annotations are in the last fragment/chunk;
while(TRUE) {
tmp = readBin(con, "raw", 1024*128 + 515);
  if(length(tmp) == 0) break;
  x = tmp;
  # isNL = (x == 10) | (x == 13);
  isNL = (x == 13);
  isNL = isNL & (x[which(isNL) + 1] == 10);
NL = NL + sum(isNL);
}

close(con)

idP = which(isNL)

idS = 935; # will vary with pdf and Annotations and ...;
nLast = 4; # usually 2 chunks
idx = idP[seq(idS, length.out = nLast)]

# Check: Right position?
# tmp = x[seq(idx[1] + 2, idx[1 + 2] - 1)]
# intToUtf8(tmp)

tmp = inflate(x[seq(idx[1] + 2, idx[nLast] - 1)])
intToUtf8(tmp$output)

# Output of inflate: an Example
# "/GS gs .56078434 .87058824 .97647059 rg\n
# 337.298 183.836 m 364.322 183.836 l 364.322 171.83 l 337.298 171.83 l h f\n"

# Note: /BBox[ 337.298 171.83 364.322 183.836]

The raw pdf data:

1948 0 obj
<>/C[ 0.560784 0.870588 0.976471]/CA 1/F 
4/PDFIUM_HasGeneratedAP true/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 
364.322 174.6]/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>>
endobj
1949 0 obj
<>/Subtype/Form/Type/XObject>>stream
xœEË±
€0 Àž)~“äÛ™€ØP@ûKˆ"Ðžtó²¢ßjÉC©ðT#ŠBš›zª
WŸH—Ò9(AÃ  š
KùäøÅ³_iÀŽmz dR²
endstream
endobj


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Clustering Functions used by Reverse-Dependencies

2024-02-29 Thread Leo Mada via R-help

Dear Ivan,

Thank you very much for this interesting information.

Regarding:
"For well-behaved packages that declare their dependencies correctly,
parsing the NAMESPACE for importFrom() and import() calls should give
you the explicit imports."

I did learn something new (I am not very experienced in package writing). 
Unfortunately, Roxygen2 as of the current version still suggests to use the 
pkg::fname approach:
"If you are using just a few functions from another package, we recommending 
adding the package to the Imports: field of the DESCRIPTION file and calling 
the functions explicitly using ::, e.g., pkg::fun()."
https://roxygen2.r-lib.org/articles/namespace.html

Regarding analysing the actual code: it is good to know that CMD check has also 
some functionality. I will look into it, when I find some free time.

tools:::.check_packages_used is a few pages of code. On the other hand, the 
help page for codetools::checkUsage is quite cryptic. But it's good to know at 
least where to look.

Sincerely,

Leonard


From: Ivan Krylov 
Sent: Wednesday, February 28, 2024 10:36 AM
To: Leo Mada via R-help 
Cc: Leo Mada 
Subject: Re: [R] Clustering Functions used by Reverse-Dependencies

� Sat, 24 Feb 2024 03:08:26 +
Leo Mada via R-help  �:

> Are there any tools to extract the function names called by
> reverse-dependencies?

For well-behaved packages that declare their dependencies correctly,
parsing the NAMESPACE for importFrom() and import() calls should give
you the explicit imports. (What if the package imports the whole
dependency? The safe assumption is that all functions are used, but it
comes with false positives. You could also walk the package code
looking for function names that may belong to the imported package, but
that may involve both false positives and false negatives.)

For the rest of the imports and uses of weak dependencies, you'll have
to walk the package code looking for the uses of the `::` operator. See
how R CMD check walks the package code in functions
tools:::.check_packages_used and codetools::checkUsage.

A less-well-behaved package can always load a namespace during runtime
and choose the functions to call depending on the phase of the moon or
weather on Jupiter. For these, like for the halting problem, there's no
general solution: the package could be written to say, "if Leonard's
function says I'm about to call foo::bar, I won't do it, otherwise I
will".

--
Best regards,
Ivan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Clustering Functions used by Reverse-Dependencies

2024-02-23 Thread Leo Mada via R-help

Dear R Users,

Are there any tools to extract the function names called by 
reverse-dependencies?

I would like to group these functions using clustering methods based on the 
co-occurrence in the reverse-dependencies.

Utility: It may be possible to split complex packages into modules with fewer 
reverse-dependencies.

Package pkgdepR may offer some of the functionality; but I did not have time to 
fully explore it:
https://cran.r-project.org/web/packages/pkgdepR/index.html

If such tools are not yet available, I have opened an issue on GitHub and would 
like to collect any ideas:
https://github.com/discoleo/PackageBrowser/issues/1

There are some model packages that could be used to test the clustering:
1) Rcpp, xml: the core-functionality will always be needed; splitting into 
modules may not be possible/useful;

2) pracma: most reverse-dependencies are likely using only a small subset of 
the functions in pracma;
(there was some discussion on R-package-devel about reverse dependencies a few 
weeks ago)

3) survival: I have no idea in which category it falls - but it has a lot of 
reverse-dependencies;

Note:
I missed some important functionality from the pkgsearch package. I have 
started therefore work on a new package (PackageBrowser) - although it is not 
yet published on CRAN:
https://github.com/discoleo/PackageBrowser

It does not yet process recursively the reverse-dependencies; although this 
does not seem a big challenge. The real challenge is to parse the code and 
extract function names. I did some work in the past, but never had time for a 
full implementation.

Many thanks,

Leonard


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Optimal use of optim?

2024-02-13 Thread Leo Mada via R-help

Dear R-Users,

I am interested in the optimal strategy for optim:

Q: How to formulate the optimization problem?
Q1: Are there benefits for abs(f(x)) vs (f(x))^2?
Q2: Are there any limitations for using abs(...)?

Regarding point 1: my feeling is that the gradients should be more robust with 
the abs(...) method; but I did not test it.

Regarding point 2: Obviously, abs(f(x)) is not differentiable. This is a 
limitation from a mathematical point of view, but the gradient should be still 
computable using a finite difference method. Are there any other limitations?

The question is based on the following example:
solving gamma(x) = y for some given y.

Function multiroot in package rootSolve fails when y < 1. The version using 
optim fares much better. However, I had to tweak somewhat the parameters in 
order to get a higher precision. This made me curious about the optimal 
strategy; unfortunately, I do not have much experience with optimizations.

The example is also available on GitHub:
https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R

Sincerely,

Leonard

### Example:
gamma.invN = function(x, x0, lim, ...,
ndeps = 1E-5, rtol = 1E-9, gtol = 0.0001, digits = 10) {
  v = x;
  if(length(lim) == 1) lim = c(lim - 1 + gtol, lim - gtol);
  cntr = c(list(...), ndeps = ndeps, pgtol = rtol);
  # abs(): should provide greater precision than ()^2 when computing the 
gradients;
  # param ndeps: needed to improve accuracy;
  r = optim(x0, \(x) { abs(gamma(x) - v); }, lower = lim[1], upper =lim[2],
method = "L-BFGS-B", control = cntr);
  if( ! is.null(digits)) print(r$par, digits);
  return(r$par);
}

### Example
Euler   = 0.57721566490153286060651209008240243079;
x = gamma.invN(Euler, -3.5, lim = -3)
gamma(x) - Euler
x = gamma.invN(Euler, -3.9, lim = -3)
gamma(x) - Euler


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Skip jumps in curve

2024-02-13 Thread Leo Mada via R-help

Dear Duncan,

Thank you very much for the response. I suspected that such an option has not 
been implemented yet.

The plot was very cluttered due to those vertical lines. Fortunately, the gamma 
function is easy to handle. But the feature remains on my wishlist as useful 
more in general.

Sincerely,

Leonard

From: Duncan Murdoch 
Sent: Tuesday, February 13, 2024 6:05 PM
To: Leo Mada ; r-help@r-project.org 
Subject: Re: [R] Skip jumps in curve

On 13/02/2024 10:29 a.m., Leo Mada via R-help wrote:
> Dear R-Users,
>
> Is there a way to skip over without plotting the jumps/discontinuities in 
> curve()?
>
> I have not seen such an option, but maybe I am missing something.
>
> plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n = 
> 1000) {
> curve(gamma(x), from = xlim[1], to = xlim[2], ylim=ylim, n=n);
> if( ! is.null(hline)) abline(h = hline, col = "green");
> }
>
> Euler = 0.57721566490153286060651209008240243079;
> plot.gamma(hline = Euler)
>
> Adding an option to the function curve may be useful:
> options = c("warn", "silent", "unconnected")
>
> This is part of some experiments in math; but that's another topic. For 
> latest version:
> https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R

If you know where the discontinuities are, plot multiple times with the
discontinuities as endpoints:

plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n
= 1000) {
   start <- floor(xlim[1]):floor(xlim[2])
   end <- start + 1

   start[1] <- xlim[1]
   end[length(end)] <- xlim[2]

   n <- round(n/length(start))

   curve(gamma(x), from = start[1], to = end[1], ylim=ylim, n=n, xlim =
xlim)
   for (i in seq_along(start)[-1])
 curve(gamma(x), from = start[i], to = end[i], add = TRUE, n)
   if( ! is.null(hline)) abline(h = hline, col = "green");
}

Euler = 0.57721566490153286060651209008240243079;
plot.gamma(hline = Euler)

If you don't know where the discontinuities are, it would be much
harder, because discontinuities can be hard to detect unless the jumps
are really big.

Duncan Murdoch

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Skip jumps in curve

2024-02-13 Thread Leo Mada via R-help

Dear R-Users,

Is there a way to skip over without plotting the jumps/discontinuities in 
curve()?

I have not seen such an option, but maybe I am missing something.

plot.gamma = function(xlim = c(-6, -1), ylim = c(-1,3), hline = NULL, n = 1000) 
{
   curve(gamma(x), from = xlim[1], to = xlim[2], ylim=ylim, n=n);
   if( ! is.null(hline)) abline(h = hline, col = "green");
}

Euler = 0.57721566490153286060651209008240243079;
plot.gamma(hline = Euler)

Adding an option to the function curve may be useful:
options = c("warn", "silent", "unconnected")

This is part of some experiments in math; but that's another topic. For latest 
version:
https://github.com/discoleo/R/blob/master/Math/Integrals.Gamma.Inv.R

Sincerely,

Leonard


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Basic astronomy package recommendation wanted.

2024-01-31 Thread Leo Mada via R-help

Dear Richard,

I wrote s small shiny app to improve the search for CRAN-packages. It can be 
installed from GitHub:
https://github.com/discoleo/PackageBrowser

Searching for "(?https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Use of geometric mean for geochemical concentrations

2024-01-30 Thread Leo Mada via R-help

Dear Rich,

It depends how the data is generated.

Although I am not an expert in ecology, I can explain it based on a biomedical 
example.

Certain variables are generated geometrically (exponentially), e.g. MIC or 
Titer.

MIC = Minimum Inhibitory Concentration for bacterial resistance
Titer = dilution which still has an effect, e.g. serially diluting blood 
samples;

Obviously, diluting the samples will generate the following concentrations:
1, 1/2, 1.4, 1/8, 1/16, ...
(or the reciprocal: 1, 2, 4, 8, 16, ...)

It makes no sense to compute the arithmetic mean. Results are usually reported 
as some quantile (median or 90%); alternatively, one computes the geometric 
mean.

### Ecology /Environmental Chemistry
I suppose that certain chemicals may be generated/released in the environment 
through a non-linear process. The LLOD may also play a role, but may NOT be the 
main reason. If the generating process is exponential, then the arithmetic mean 
would strongly skew the results (also inconsistently based on season, 
particular year, etc - the generating processes may differ).

### Harmonic Mean
Did not encounter it often: maybe because of the problematic handling of 0.

I do have in the meantime a nice workaround for 0 (which also works with the 
geometric mean), see also (unfortunately not well documented):
https://github.com/discoleo/R/blob/master/Stat/Moments.Stat.R
v0 = 1; # some initial "skew"
1 /(xharm + v0) = sum( 1 / (x + v0) ) / length(x)
xgeom = prod(x + v0)^(1/length(x)) - v0;

I apologize for the late reply; I did not have much time to read messages 
during the past weeks.

Sincerely,

Leonard


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fit NLE - was: computer algebra in R

2023-12-04 Thread Leo Mada via R-help

Fit NLE - was: [R] computer algebra in R
Original post:
https://stat.ethz.ch/pipermail/r-help/2023-November/478619.html

Dear Kornad,

I think I have started to understand what you try to achieve. The problem is to 
fit a NLE and compute the parameters of the NL-Eq. I have included the R 
Help-list back in the loop, as I am not an expert in optimization.

Goal:
y ~ I0 +  IHD * hd + ID * d;
where:
y = given vector of measurements;
x = given vector of values;
I0, IHD, ID, kd = parameters to optimize;
hd = satisfies a polynomial of order 3;

As d = d0 - hd, the previous formula can be written:
y ~ I0 + ID * d0 +  (IHD - ID) * hd;

f(x, hd, kd) = 0,
where f = a polynomial of order 3 in hd and order 2 in kd;
d0 (and other components of the polynomial) = given constants;

1) First Approach
I would back-substitute hd into the polynomial:
hd = (y - I0 - ID*d0) / (IHD - ID);

f(x, hd, kd) becomes then f(x, y, kd, I0, ID, IHD) = 0;
- f is order 3 in y;

You could fit:
(y^3) ~ f(x, y, kd, I0, ID, IHD) - y^3,
where you subtract the y^3 term from the function f, and add the (y^3) values 
as a new columng to the data.frame:
data.frame(y3 = y^3, y=y, x=x)

If the values of y are versy small (abs(y) << 1), then it may be wiser to fit:
y ~ f(x, y^2, y^3, kd, I0, ID, IHD) - (y-term);

But I am not an expert in these problems. Other R-users may be more helpful.

2.) Approach 2: Math
I feel that the problem can be solved quasy-exactly as well. It is much harder 
with 4 parameters to optimize:
- one needs to compute the 4 partial derivatives;
- solve the resulting system of 4 polynomial equations;

The system is polynomial; although it looks ugly and I am not inclined to do 
such calculations myself.

I hope that you can get more useful answers from the R help-list.

Sincerely,

Leonard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] computer algebra in R

2023-11-29 Thread Leo Mada via R-help

Dear Konrad,

I presume that the system can be written as follows, where h0, d0, ga0, kga and 
kd are given:

err1 = h + hd + hga - h0;
err2 = d + hd - d0;
err3 = ga + hga - ga0;
err4 = hga - kga*h*ga;
err5 = hd - kd*h*d;

All error terms should be zero.

Do you need (a) the symbolic solution or (b) is a numeric solution fine?

I do not have any experience with yacas or caracas. But see below.

### Case (a)
I did write a (very rudimentary) symbolic solver for systems of polynomial 
equations in pure R. It should be able to solve this one. Please let me know if 
this is of interest and I will provide further details.

### Case (b)
Again, it should be possible to solve, see some examples:
https://github.com/discoleo/R/blob/master/Math/Systems.NLE.Tutorial.md
https://github.com/discoleo/R/blob/master/Math/Systems.Lambert.R
I did solve much more complex ones, for a monster see:
https://github.com/discoleo/R/blob/master/Math/Poly.System.S5.Ht.Formulas.Derivation.HS0.R
[and the remaining Poly.System.S5.Ht... scripts]

I presume that only the real roots are required (it's possible to solve systems 
with complex roots as well). I can provide further details as well.

Sincerely,

Leonard


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Extract cells and their adjacent cells that may appear anywhere in a dataframe.

2017-05-08 Thread Brian Leo

Hello,

I have a dataframe that contains information about vegetation cover and
percent coverage, collected using a quadrat.  The dataframe is set up so
that each row represents a single quadrat.  If there are multiple species
within one quadrat, they are all listed within the same row with respective
% coverage always following in next column.  Here is an example, species
are represented as 4 letter codes:

Quadrat_#

Date

Species_1

Species_2

covrage2

Species_3

covrage3

1

5/2/2017

unk1

2

bial

6

stgu

95

2

5/2/2017

bope

75

stja

4

ficy

9

3

5/2/2017

bope

100

stja

5



4

5/2/2017

stgu

87

stja

6

bope

20

5

5/2/2017

bg

13

stja

2

ficy

10

6

5/2/2017

bope

8

sirh

3

stgu

2



The problem is that the species were not recorded in any particular order,
and not all species occur in every quadrat.  There can also be any number
of species per quadrat.  I need to be able to extract each species AND it’s
respective coverage, and place them in into another dataframe for further
analysis

So the result for “bope” from above example data would look like this:

Quadrat#

% coverage bope

1

0

2

75

3

100

4

20

5

0

6

8



Any help greatly appreciated.

Brian

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Forging an R enthusiast meeting at EGU 2017

2016-11-02 Thread Leo Leo via R-help

At the European Geosciences Union Meeting (Spring 2017 in Vienna) there
will be an interactive sessionthat aims to bring together scientists that
share a common indispensable tool: the
statistic software R.

http://meetingorganizer.copernicus.org/EGU2017/session/24971

We have the feeling that R users are like little boats on an ocean at
night, unaware of how close other people are that have already worked on
the same problem earlier. This is the main motivation for this session in
spring 2017.

If you feel interested to contribute your work we would be delighted to see
you enriching this session.

If you don't think you will have time to come to the EGU 2017 we would be
pleased if you could share this message among your colleagues to linking
these small boats on the ocean at night.

Kind regards and a pleasant autumn week,

Micha Dietze

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Order of formula terms in model.matrix

2016-01-18 Thread Guelman, Leo

Is it really the same model?

Following the example provided by Lars:

set.seed(1)
x1 <- rnorm(100)
f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1,1), 100, replace = TRUE)
y <- factor(sample(c(0,1), 100, T))
df <- data.frame(y=y, x1=x1, f1=f1, trt=trt)

fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
coef(fit1)

fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial)
coef(fit2)

identical(fitted(fit1), fitted(fit2))
[1] FALSE



___
If you received this email in error, please advise the sender (by return email 
or otherwise) immediately. You have consented to receive the attached 
electronically at the above-noted email address; please retain a copy of this 
confirmation for future reference.  

Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur 
immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté 
de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse 
courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation 
pour les fins de reference future.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Order of formula terms in model.matrix

2016-01-18 Thread Guelman, Leo

Thanks Peter. That make sense. Nevertheless, what comes at a surprise to me 
(and maybe to others) is that one can potentially get different fits by simply 
swapping the terms in the model formula. 

Best,
Leo. 

-Original Message-
From: peter dalgaard [mailto:pda...@gmail.com] 
Sent: 2016, January, 18 11:16 AM
To: Guelman, Leo
Cc: r-help@r-project.org; Charles C. Berry
Subject: Re: [R] Order of formula terms in model.matrix

On 18 Jan 2016, at 16:49 , Guelman, Leo <leo.guel...@rbc.com> wrote:

> Is it really the same model?

No, and I didn't say that they would be. I did say that they would be in the 
all-factor case, which does seem to be right:

> df$trt <- factor(df$trt)
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) 
> plot(fitted(fit1), fitted(fit2)) # still differs
> df$x1 <- factor(sample(c(-1,1), 100, replace = TRUE))
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) 
> plot(fitted(fit1), fitted(fit2)) # looks like it's on diagonal 
> identical(fitted(fit1), fitted(fit2)) # wrong check
[1] FALSE
> all.equal(fitted(fit1), fitted(fit2)) # better
[1] TRUE

-pd

>  
> Following the example provided by Lars:
>  
> set.seed(1)
> x1 <- rnorm(100)
> f1 <- factor(sample(letters[1:3], 100, replace = TRUE)) trt <- 
> sample(c(-1,1), 100, replace = TRUE) y <- factor(sample(c(0,1), 100, 
> T)) df <- data.frame(y=y, x1=x1, f1=f1, trt=trt)
>  
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> coef(fit1)
>  
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial)
> coef(fit2)
>  
> identical(fitted(fit1), fitted(fit2))
> [1] FALSE
>  
>  
>  
> __
> _
> 
> If you received this email in error, please advise the sender (by return 
> email or otherwise) immediately. You have consented to receive the attached 
> electronically at the above-noted email address; please retain a copy of this 
> confirmation for future reference.
> 
> Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur 
> immédiatement, par retour de courriel ou par un autre moyen. Vous avez 
> accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à 
> l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette 
> confirmation pour les fins de reference future.
> 

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 
Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

___
If you received this email in error, please advise the sender (by return email 
or otherwise) immediately. You have consented to receive the attached 
electronically at the above-noted email address; please retain a copy of this 
confirmation for future reference.  

Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur 
immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté 
de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse 
courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation 
pour les fins de reference future.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Order of factor levels

2016-01-11 Thread Guelman, Leo

Dear list,

What is a better way relative to the one below to keep the order of factor 
levels created from cut()? Notice, I'm simply pasting letters to levels before 
converting to character so to keep the desired order of levels. This is not 
very elegant... I'm converting to character so I can call the helper fun with 
vapply() from the main fun.

Removing this line of code "  levels(xc) <- paste(letters[1:nlevels(xc)], 
levels(xc), sep=":")" would result in factor levels that are not ordered 
according to x1.

set.seed(1)
df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))

main_fun <- function(data) {
  data.frame(vapply(data, helper_fun, character(nrow(df
}

helper_fun <- function(x) {
  xc <-  cut(x, breaks = unique(quantile(x, seq(0, 1, 1/10), na.rm = TRUE)),
 include.lowest = TRUE)
  levels(xc) <- paste(letters[1:nlevels(xc)], levels(xc), sep=":")
  as.character(xc)

}


res <- main_fun(df)
levels(res$x1)
levels(res$x1)
 [1] "a:[-3.01,-1.34]""b:(-1.34,-0.882]"   "c:(-0.882,-0.511]"  
"d:(-0.511,-0.296]"  "e:(-0.296,-0.0353]"
 [6] "f:(-0.0353,0.245]"  "g:(0.245,0.536]""h:(0.536,0.854]"
"i:(0.854,1.32]" "j:(1.32,3.81]"
>

Thanks
Leo.

___
If you received this email in error, please advise the sender (by return email 
or otherwise) immediately. You have consented to receive the attached 
electronically at the above-noted email address; please retain a copy of this 
confirmation for future reference.  

Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur 
immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté 
de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse 
courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation 
pour les fins de reference future.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How best to model these datasets

2014-10-12 Thread Brian Leo

Hello,

I am a stats novice and I was wondering what kind of model would work best
with the two datasets that are displayed.  The graphs show an incremental
area analysis for two feral cat home ranges where area increases with
increasing number of GPS fixes.

Thanks,

Brian


-- 
Brian Leo
School of Aquatic and Fishery Sciences
University of Washington
1122 Boat St NE
Box 355020
Seattle, WA 98195
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Reading specific rows and columns xlsx file: Error in strsplit(names(res), \\.) : non-character argument

2014-08-03 Thread Margherita Di Leo

Hi,

I am trying to read specific rows and columns from a xlsx file using the
following commands:

# Read rows 18-23 and columns 7-15 into R and assign the result to a
variable
# called dat

colIndex - 18:23
rowIndex - 7:15
dat - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE,
 colIndex = colIndex, rowIndex = rowIndex)

But I get this error:

Error in strsplit(names(res), \\.) : non-character argument


I don't understand where that non-character comes from.
When I read the whole file with:

naturalGas - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header =
TRUE)

I don't get any problem.

Should you be willing to try to reproduce this behavior, the file comes
from:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx

sessionInfo()R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)


I'm on debian Jessie.

Disclaimer: this is a homework.

Thank you in advance


-- 
Best regards,

Dr. Margherita DI LEO
Scientific / technical project officer

European Commission - DG JRC
Institute for Environment and Sustainability (IES)
Via Fermi, 2749
I-21027 Ispra (VA) - Italy - TP 261

Tel. +39 0332 78 3600
margherita.di-...@jrc.ec.europa.eu

Disclaimer: The views expressed are purely those of the writer and may not
in any circumstance be regarded as stating an official position of the
European Commission.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading specific rows and columns xlsx file: Error in strsplit(names(res), \\.) : non-character argument

2014-08-03 Thread Margherita Di Leo

Ahem.. solved.
I had switched between rows and columns..

Sorry for the noise.

Best,
m


On Sun, Aug 3, 2014 at 6:33 PM, Margherita Di Leo direg...@gmail.com
wrote:

 Hi,

 I am trying to read specific rows and columns from a xlsx file using the
 following commands:

 # Read rows 18-23 and columns 7-15 into R and assign the result to a
 variable
 # called dat

 colIndex - 18:23
 rowIndex - 7:15
 dat - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header = TRUE,
  colIndex = colIndex, rowIndex = rowIndex)

 But I get this error:

 Error in strsplit(names(res), \\.) : non-character argument


 I don't understand where that non-character comes from.
 When I read the whole file with:

 naturalGas - read.xlsx(./data/natural_gas.xlsx, sheetIndex = 1, header
 = TRUE)

 I don't get any problem.

 Should you be willing to try to reproduce this behavior, the file comes
 from:
 https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx

 sessionInfo()R version 3.1.1 (2014-07-10)
 Platform: x86_64-pc-linux-gnu (64-bit)


 I'm on debian Jessie.

 Disclaimer: this is a homework.

 Thank you in advance


 --
 Best regards,

 Dr. Margherita DI LEO
 Scientific / technical project officer

 European Commission - DG JRC
 Institute for Environment and Sustainability (IES)
 Via Fermi, 2749
 I-21027 Ispra (VA) - Italy - TP 261

 Tel. +39 0332 78 3600
 margherita.di-...@jrc.ec.europa.eu

 Disclaimer: The views expressed are purely those of the writer and may not
 in any circumstance be regarded as stating an official position of the
 European Commission.




-- 
Best regards,

Dr. Margherita DI LEO
Scientific / technical project officer

European Commission - DG JRC
Institute for Environment and Sustainability (IES)
Via Fermi, 2749
I-21027 Ispra (VA) - Italy - TP 261

Tel. +39 0332 78 3600
margherita.di-...@jrc.ec.europa.eu

Disclaimer: The views expressed are purely those of the writer and may not
in any circumstance be regarded as stating an official position of the
European Commission.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (Seismic) signal deconvolution with R

2014-03-24 Thread Leo Leo

Has anybody (successfully) tried to deconvolve a seismic signal (time
series in general) using R and package signal? I have the Matlab routines:

[bb,aa] = zp2tf(zero,pole',gain);
[bb,aa] = bilinear(bb,aa,df);
transfer = freqz(bb,aa,f,df);

But the (apparent) equivalents of R produce quite different results:

ZPG - Zpg(zero = zero, pole = pole, gain = gain)
B - bilinear(ZPG, T = 1/df)
transfer - freqz(h = B, Fs = 1/df, n = f)

In Matlab, one complex vector (transfer) is returned, in R a different
complex vector and a numeric vector are returned. Are the problems in the
specification of zeros and poles (Hz vs. rad/s) or in parameter setting of
freqz()?

Or does anyone know someone who uses R for signal processing?

Best and Thanks,

Micha

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Call for papers: Geospatial Semantic Array Programming

2014-01-09 Thread Margherita Di Leo

** Apologies for any cross posting **

Earthzine http://www.earthzine.org/, an IEEE-sponsored online scientific
journal, is soliciting articles of 800-3,000 words for its second quarter
theme of 2014 on *Geospatial Semantic Array Programming *(GeoSemAP). We
seek contributions from all regions of the globe, addressing environmental
transdisciplinary research in which a concise integration of array-based
semantics and array programming, geospatial tools and a modular composition
of data-transformation models are exploited for geospatial problems within
the paradigm of Semantic Array Programming. This theme specifically focuses
on wide-scale transdisciplinary modelling for environment (WSTMe) as a
scientific challenge with an increasingly important role in allowing
strategic policy-making to be effectively discussed and programmed with the
support of robust science.
See the call for paper at
http://www.earthzine.org/2013/12/18/call-for-papers-geospatial-semantic-array-programming/
 and download it in
PDF.http://www.earthzine.org/wp-content/uploads/2013/12/Call-For-Papers-GeoSemAP.pdf

http://www.earthzine.org/wp-content/uploads/2013/12/Call-For-Papers-GeoSemAP.pdf


-- 
Best regards,

Dr. Margherita DI LEO
Scientific / technical project officer

European Commission - DG JRC
Institute for Environment and Sustainability (IES)
Via Fermi, 2749
I-21027 Ispra (VA) - Italy - TP 261

Tel. +39 0332 78 3600
margherita.di-...@jrc.ec.europa.eu

Disclaimer: The views expressed are purely those of the writer and may not
in any circumstance be regarded as stating an official position of the
European Commission.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] forecasting accuracy problem in R

2012-12-16 Thread Leo

Hi,

A few weeks back I used the following command:

accuracy(train,test)

where train and test are training and test data respectively. Last
night I updated R and the forecast package and used the same command
and I got error. After trying a little I used the following command

accuracy(train,test[1:30])

and it worked. (I was checking the accuracy of 30 forecasted values).

Is there some change in the forecast package or did I do something wrong?

regards Leo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] forecasting accuracy problem in R

2012-12-16 Thread Leo

Thanks. Could be the following reason given in the changelog.
 accuracy() can now figure out overlapping times for x and f.

best regards
Leo

On Mon, Dec 17, 2012 at 10:11 AM, Jeff Newmiller
jdnew...@dcn.davis.ca.us wrote:
 Please read the posting guide. Packages have their own developers, as well as 
 their own change logs [in this case 
 http://cran.r-project.org/web/packages/forecast/ChangeLog].
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.

 Leo spee...@gmail.com wrote:

Hi,

A few weeks back I used the following command:

accuracy(train,test)

where train and test are training and test data respectively. Last
night I updated R and the forecast package and used the same command
and I got error. After trying a little I used the following command

accuracy(train,test[1:30])

and it worked. (I was checking the accuracy of 30 forecasted values).

Is there some change in the forecast package or did I do something
wrong?

regards Leo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Forecast Package: Draw two lines on the same plot

2012-11-13 Thread Leo

Hi,

How is it possible to draw to different data on the same graph using
forecast package? The first is the observed data and the second set is the
fitted values. I want the observed data to show as solid line while the
fitted values to show as dashed or dotted line.

regards
Leo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] formula method with special characters

2012-06-20 Thread Guelman, Leo

Dear List,

I'm trying to create a formula method, allowing for a special
character in the formula (i.e., similar to for example the gam package
with the character s in y ~ s(x)). I've checked and it seems this is
done through attr(,specials). However, the section of code below (as
an example extracted from the gam package) gives me an error as shown at
the end. Evidently I'm missing something here...any help is much
appreciated. 

Thanks,
Leo.

gam2.slist -
c(s, lo, random)

gam2 -
  function(formula, family = gaussian, data, 
   weights, subset, na.action, start = NULL, etastart, mustart,
control = gam.control(...),
   model = FALSE, method=glm.fit, x = FALSE, y = TRUE, ...)
{
  call - match.call()
  mf - match.call(expand.dots = FALSE)
  m - match(c(formula, data, subset, weights, na.action, 
   etastart, mustart, offset), names(mf), 0)
  mf - mf[c(1, m)]
  mf$drop.unused.levels - TRUE
  mf[[1]] - as.name(model.frame)
  mt - if(missing(data)) terms(formula, gam2.slist) else
terms(formula,gam2.slist,data = data)
  mf$formula-mt

  mf - eval(mf, parent.frame())
  
  }

set.seed(1)
x - runif(1000)
y - runif(1000)
dd - data.frame(y, x)

g - gam2(y ~ s(x), data =dd)
Error in eval(expr, envir, enclos) : could not find function s
___

This email may be privileged and/or confidential, and the
sender does not waive any related rights and obligations.
Any distribution, use or copying of this email or the
information it contains by other than an intended recipient
is unauthorized. If you received this email in error,
please advise the sender (by return email or otherwise)
immediately. You have consented to receive the attached
electronically at the above-noted email address; please retain a
copy of this confirmation for future reference.

Ce courriel est confidentiel et protÃ©gÃ©. L'expÃ©diteur ne renonce
pas aux droits et obligations qui s'y rapportent. Toute diffusion,
utilisation ou copie de ce courriel ou des renseignements qu'il
contient par une personne autre que le (les) destinataire(s)
dÃ©signÃ©(s) est interdite. Si vous recevez ce courriel par erreur,
veuillez en aviser l'expÃ©diteur immÃ©diatement, par retour de courriel
ou par un autre moyen. Vous avez acceptÃ© de recevoir le(s) document(s)
ci-joint(s) par voie Ã©lectronique Ã  l'adresse courriel indiquÃ©e ci-dessus;
veuillez conserver une copie de cette confirmation pour les fins de reference 
future.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] smooth scatterplot and geo map

2011-07-28 Thread Leo Guelman

check smooth.ppp{spatstat}  which performs spatial smoothing based on a
Gaussian kernel...it has a plot method that may do what you are looking for

hope this helps,
Leo.

On Thu, Jul 28, 2011 at 4:09 PM, marco marco.mile...@aim.uzh.ch wrote:

 Hello everybody,
 I'm trying to understand how to draw a smoothed scatterplot on a geographic
 map with R.
 Have a dataframe with point locations (long, lat) and was able to simply
 plot these points on a shp map by using the maptools package. However,
 instead of having simply the raw points on the map, I would like to have a
 smoothed scatterplot of the same superimposed on the map. Tried with the
 smoothScatter function, but really have no idea how to combine the map with
 the resulting graph.
 tx in adv
 marco

 --
 View this message in context:
 http://r.789695.n4.nabble.com/smooth-scatterplot-and-geo-map-tp3702374p3702374.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple example of using a closure in R to manage bank accounts?

2011-07-25 Thread Leo Guelman

something like this maybe?

account - function (balance=0) {
 function (d = 0,w = 0) {
newbal - balance + d - w
   balance - newbal
   newbal
   }
}
John - account(100)
John
John(d=100,w=50)
John()

Leo - account(1000)
Leo
Leo(d=1000,w=50)
Leo()
Leo(d=100,w=500)
Leo()
Leo.

On Mon, Jul 25, 2011 at 6:00 PM, Michael Hannon jm_han...@yahoo.com wrote:

 Greetings.  I once ran across a simple (toy) example of using a closure in
 R to
 manage bank accounts.  I've got a use for it now but can no longer find it.
  If
 you have it (or a similar example), will you please send it to me?
 (Unfortunately, a web search that includes the terms bank and closure
 leads
 into a whole pile of unrelated stuff.)

 Thanks,

 -- Mike

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with contrasts

2011-05-11 Thread Guelman, Leo

Or alternatively (though very similar to Peter's idea) you can do

ci - contrasts
formals(ci)$contrasts - FALSE
dd - data.frame(a = gl(3,4), b = gl(4,1,12))
mm - model.matrix(~ a + b, dd, contrasts = list(a=ci(dd$a),
b=ci(dd$b))) 

Best,
Leo.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of peter dalgaard
Sent: 2011, May, 11 7:31 AM
To: Lars Bishop
Cc: R-help@r-project.org Help
Subject: Re: [R] Help with contrasts


On May 11, 2011, at 12:51 , Lars Bishop wrote:

 Hi,
 
 I need to build a function to generate one column for each level of a 
 factor in the model matrix created on an arbitrary formula (instead of

 using the available contrasts options such as contr.treatment,
contr.SAS, etc).
 
 My approach to this was first to use the built-in function for 
 contr.treatment but changing the default value of the contrasts 
 argument to FALSE (I named this function contr.identity and it shown

 at the bottom of the email for reference).
 
 So this function works fine,
 
 contr.identity(4)
 
  1 2 3 4
 1 1 0 0 0
 2 0 1 0 0
 3 0 0 1 0
 4 0 0 0 1
 
 contr.treatment(4)
  2 3 4
 1 0 0 0
 2 1 0 0
 3 0 1 0
 4 0 0 1
 
 However, when I try to create a model matrix using contr.identity 
 specified in options(), it actually uses the contr.treatment option. 
 Why is that? Any hint on how can I do this?


It's not actually using contr.treatment, it's just calling
contr.identity with contrasts=TRUE...

I don't think there's a painless way to avoid this. The closest I can
think of is

cI - contr.treatment
formals(cI)$contrasts - FALSE
dd - data.frame(a = gl(3,4), b = gl(4,1,12)) model.matrix(~ C(a,cI,3) +
C(b,cI,4), dd)



--
Peter Dalgaard
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000
Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
___

This e-mail may be privileged and/or confidential, and the sender does not waive
any related rights and obligations. Any distribution, use or copying of this 
e-mail or the information
it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or 
otherwise) immediately.

Ce courriel peut contenir des renseignements protégés et confidentiels.
Lexpéditeur ne renonce pas aux droits et obligations qui sy rapportent.
Toute diffusion, utilisation ou copie de ce courriel ou des renseignements 
quil contient
par une personne autre que le destinataire désigné est interdite.
Si vous recevez ce courriel par erreur, veuillez men aviser immédiatement, 
par retour de courriel ou par un autre moyen.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-28 Thread Guelman, Leo

I'm guessing this has something to do with numerical precision on the two 
platforms.

Leo.

- Original Message -
From: Joshua Wiley jwiley.ps...@gmail.com
To: Axel Urbiz axel.ur...@gmail.com
Cc: R-help@r-project.org R-help@r-project.org; Ridgeway, Greg
Sent: Fri Feb 25 22:16:02 2011
Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

Hi Axel,

I do not have a nice explanation why the results differ off the top of
my head.  I can say I can replicate what you get on 32/64 (both
Windows 7) bit with the development version of R and gbm_1.6-3.1.

Here is an even simpler example that shows the difference:

gbmfit - gbm(1:50 ~ I(50:1) + I(60:11), distribution = gaussian)
summary(gbmfit)

I copied that package maintainer.

Cheers,

Josh

On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz axel.ur...@gmail.com wrote:
 Dear List,

 The gbm package on Win 7 produces different results for the
 relative importance of input variables in R 32-bit relative to R 64-bit. Any
 idea why? Any idea which one is correct?

 Based on this example, it looks like the relative importance of 2 perfectly
 correlated predictors is diluted by half in 32-bit, whereas in 64-bit, one
 of these predictors gets all the importance and the other gets none. I found
 this interesting.

 ### Sample code

 library(gbm)
 set.seed(12345)
 xc=matrix(rnorm(100*20),100,20)
 y=sample(1:2,100,replace=TRUE)
 xc[,2] - xc[,1]
 gbmfit - gbm(y~xc[,1]+xc[,2] +xc[,3], distribution=gaussian)
 summary(gbmfit)

 ### Results on R 2.12.0 (32-bit)

      var  rel.inf
 1 xc[, 3] 49.76143
 2 xc[, 1] 27.27432
 3 xc[, 2] 22.96425

 ### Results on R 2.12.0 (64-bit)
 summary(gbmfit)
      var  rel.inf
 1 xc[, 1] 50.23857
 2 xc[, 3] 49.76143
 3 xc[, 2]  0.0

 Thanks,
 Axel.

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__

This email message is for the sole use of the intended r...{{dropped:30}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] New R User Group in Toronto

2010-10-12 Thread Guelman, Leo

Dear R users, 

I'm pleased to announce that the Greater Toronto Area (GTA) user's group
is now active on meetup.com, and taking suggestions for the first
meeting. If you are on the region, you can sign-up now at the link below

http://www.meetup.com/Greater-Toronto-Area-GTA-R-Users-Group

This group is aimed to bring together practitioners (from industry and
academia alike) in order to exchange knowledge and experience on using R
for data analysis, statistical modeling, visualization, data mining,
predictive analytics, statistical computing, exploratory data analysis,
etc. Users from all levels are welcome.

Special thanks to Revolution Analytics to help us get started and
sponsor the group.

Kind Regards, 

Leo Guelman.
___

This e-mail may be privileged and/or confidential, and the sender does not waive
any related rights and obligations. Any distribution, use or copying of this 
e-mail or the information
it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or 
otherwise) immediately.

Ce courriel peut contenir des renseignements protÃ©gÃ©s et confidentiels.
LÂexpÃ©diteur ne renonce pas aux droits et obligations qui sÂy rapportent.
Toute diffusion, utilisation ou copie de ce courriel ou des renseignements 
quÂil contient
par une personne autre que le destinataire dÃ©signÃ© est interdite.
Si vous recevez ce courriel par erreur, veuillez mÂen aviser immÃ©diatement, 
par retour de courriel ou par un autre moyen.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Can this code be written more efficiently?

2010-09-30 Thread Guelman, Leo

Dear users,

I'm working on binary classification problem using Support Vector
Machines (SVM). My objective is to train a series of SVM models on a
grid of hyperparameters and then select those that maximize the AUC
based on an independent validation sample. 

My attempted code is shown below. It runs well on small data sets but
when I use it on a slightly larger sample (e.g., my train data is
composed of about 8,000 observations on each class and 21 inputs), it
takes forever to run (more than 1 day already and still running). I'm
wondering if there's any way I can optimize this code. Thanks in advance
for any help.

I'm using 64-bit R 2.11.1 on Win 7. 

Start Code

library(e1071)
library(ROCR)

### Create grid of hyperparameters

Gseq - seq(-15,3,2); G - rep(2, length(Gseq)); G - G^Gseq
Cseq - seq(-5,13,2); C - rep(2, length(Cseq)); C - C^Cseq
mygrid - expand.grid(C=C, G=G)

### Train models

svm.models -  lapply(1:nrow(mygrid), function(i) {
svm(churn.form, data = mytraindata,
method = C-classification, kernel = radial,
cost = mygrid[i,1], gamma = mygrid[i,2],
probability=TRUE)
})

### Predict on test set 

pred.step3 - numeric(length(svm.models))

for (i in 1:length(svm.models)) {

pred.step1 - predict(svm.models[[i]], myvaliddata, decision.values = F,

  probability=T)

pred.step2 -
prediction(predictions=attr(pred.step1,probabilities)[,1],
labels=myvaliddata$churn)

pred.step3[i] - performance(pred.step2, auc)@y.values[[1]]

}

pred.step3

End Code


Thanks,
Leo.

___

This e-mail may be privileged and/or confidential, and the sender does not waive
any related rights and obligations. Any distribution, use or copying of this 
e-mail or the information
it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or 
otherwise) immediately.

Ce courriel peut contenir des renseignements protÃ©gÃ©s et confidentiels.
LÂexpÃ©diteur ne renonce pas aux droits et obligations qui sÂy rapportent.
Toute diffusion, utilisation ou copie de ce courriel ou des renseignements 
quÂil contient
par une personne autre que le destinataire dÃ©signÃ© est interdite.
Si vous recevez ce courriel par erreur, veuillez mÂen aviser immÃ©diatement, 
par retour de courriel ou par un autre moyen.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] TM Package - Corpus function - Memory Allocation Problems

2010-08-17 Thread Guelman, Leo


I'm using R 2.11.1 on Win XP (32-bit) with 3 GB of RAM. My data has
(only) 16.0 MB. 

I want to create a VCorpus object using the Corpus function in the tm
package but I'm running into Memory allocation issues: Error: cannot
allocate vector of size 372 Kb. 

My data is stored in a csv file which I've imported with read.csv and
then used the following to create the Corpus (but it failed with the
error message above)

txt - Corpus(DataframeSource(txt))

I've even tried to subset ~ 10% of my data but I run into the same
error. What is a the best way to solve this memory problem other than
increasing a 
physical RAM? 

Thanks in advance for any help,

Leo.

___

This e-mail may be privileged and/or confidential, and the sender does not waive
any related rights and obligations. Any distribution, use or copying of this 
e-mail or the information
it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or 
otherwise) immediately.

Ce courriel peut contenir des renseignements protÃ©gÃ©s et confidentiels.
LÂexpÃ©diteur ne renonce pas aux droits et obligations qui sÂy rapportent.
Toute diffusion, utilisation ou copie de ce courriel ou des renseignements 
quÂil contient
par une personne autre que le destinataire dÃ©signÃ© est interdite.
Si vous recevez ce courriel par erreur, veuillez mÂen aviser immÃ©diatement, 
par retour de courriel ou par un autre moyen.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] different outcomes of P values in SPSS and R

2010-08-13 Thread Leo Vorthoren


I have been using generalized linear models in SPSS 18, in order to build
models and to calculate the P values. When I was building models in Excel
(using the intercept and Bs from SPSS), I noticed that the graphs differed
from my expectations. When I ran the dataset again in R, I got totally
different outcomes for both the P values as well as the Bs and the
intercepts. The outcomes of R seem much more likely to be the correct ones,
but I really cannot explain the differences.

Does anyone have experience with these differences? 

I'm using Generalized linear models with Binary Logistics in SPSS and 

glm(formula = EPO_YN ~ frequ_ind + frequ_ind2 + frequ_preFDS, family =
binomial(link = logit), data = w) in R.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/different-outcomes-of-P-values-in-SPSS-and-R-tp2324181p2324181.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fit data with y = x^-1

2010-06-07 Thread Leo Leo

Dear list,

I am getting weired with fitting data with a 1/x-polynomial. Suggest I have
the following data:

x - c(1,2,3,4,5,6,7)
y - c(100,20,4,2,1,.3,.1)

I may fit this with a linear model

fit1 = lm(y ~ I(x))

Getting plot out of this model I applied
library(polynom)
pol1 = polynomial(fit1$coefficients)
f1 = as.function(pol1)

plot(x,y)
lines(x, f1(x), col = 2)

Clearly, this model appears insufficient. Therefore I intended to aplly
fit2 = lm(y ~ I(1/x))
pol2 = polynomial(fit2$coefficients)
f2 = as.function(pol2)
lines(x, f2(x), col = 3)

as this would be much more pleasant. However, the second model does not work
at all.


Does anyone know about how to correctly fit an x^-1-polynomial to my data
cloud?


Many thanks in advance,


behelfsadresse

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Linear Contrasts in GLM - Query

2009-09-04 Thread Leo Guelman

Hi,

Is there a way I can specify linear contrasts in glm? I'm looking for
something equivalent to SAS' contrast statement.

I'd like to do the following, suppose I have a categorical input with 4
levels (a,b,c,d), I'd like to test something like: (i)  a+b=c+d, (ii) a=b,
(iii) a=b+d, etc...

Thanks in advance for your help!

Leo.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Grouping data in a data frame: is there an efficient way to do it?

2009-09-02 Thread Leo Alekseyev

I have a data frame with about 10^6 rows; I want to group the data
according to entries in one of the columns and do something with it.
For instance, suppose I want to count up the number of elements in
each group.  I tried something like aggregate(my.df$my.field,
list(my.df$my.field), length) but it seems to be very slow.  Likewise,
the split() function was slow (I killed it before it completed).  Is
there a way to efficiently accomplish this in R?..  I am almost
tempted to write an external Perl/Python script entering every row
into a hashtable keyed by my.field and iterating over the keys...
Might this be faster?..

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Grouping data in a data frame: is there an efficient way to do it?

2009-09-02 Thread Leo Alekseyev

Thanks everyone for the useful suggestions.  The bottleneck might be
memory limitations of my machine (3.2GHz, 2 GB) and the fact that I am
aggregating on a field that is a string.  Using the suggested
as.data.frame(table(my.df$my.field)) I do get a speedup, but the
computation still takes 30 seconds.  For the sake of comparison, I did
write the counting up rows with common values function using a Perl
hash (it's only 5 lines of Perl) and it takes 15 seconds to run -- a
2x speedup.  Not yet sure if it's worth the hassle.

--Leo

On Wed, Sep 2, 2009 at 4:28 PM, David M
Smithda...@revolution-computing.com wrote:
 You may want to try using isplit (from the iterators package). Combined with
 foreach, it's an efficient way of iterating through a data frame by groups
 of rows defined by common values of a columns (which I think is what you're
 after). You can speed things up further if you have a multiprocessor system
 with the doMC package to run iterations in parallel. There's an example
 here:
 http://blog.revolution-computing.com/2009/08/blockprocessing-a-data-frame-with-isplit.html
 Hope this helps,
 # David Smith
 On Wed, Sep 2, 2009 at 3:39 PM, Leo Alekseyev dnqu...@gmail.com wrote:

 I have a data frame with about 10^6 rows; I want to group the data
 according to entries in one of the columns and do something with it.
 For instance, suppose I want to count up the number of elements in
 each group.  I tried something like aggregate(my.df$my.field,
 list(my.df$my.field), length) but it seems to be very slow.  Likewise,
 the split() function was slow (I killed it before it completed).  Is
 there a way to efficiently accomplish this in R?..  I am almost
 tempted to write an external Perl/Python script entering every row
 into a hashtable keyed by my.field and iterating over the keys...
 Might this be faster?..

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 David M Smith da...@revolution-computing.com
 Director of Community, REvolution Computing www.revolution-computing.com
 Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

 Check out our upcoming events schedule at
 www.revolution-computing.com/events



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] read floats from file into array

2009-07-21 Thread leo mueller

hi all,

i have a simple question. instead of defining my measurements in a
static way like ...

x - c(-0.475, -1.553, -0.434, -1.019, 0.395)

... i'd like them to be read from a file ...

x - read.table(07a673ac0cb1f7f8fa293860566f633c/1/raw0.txt, header=FALSE)
d1 - density(x, kernel = gaussian)

with a formatting that looks like:

4.284000e-01
6.758333e-01
8.292000e-01
7.856667e-01
6.633667e-01
5.408000e-01
4.728333e-01
4.377000e-01
4.374333e-01
4.102667e-01
3.628333e-01
3.277000e-01
4.909667e-01
[...]

R quits and says:

 d1 - density(x, kernel = gaussian)
Error in density.default(x, kernel = gaussian) :
  argument 'x' must be numeric
Calls: density - density.default
Execution halted

is there any possibility to convert this / make this work?

big thx for help

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read floats from file into array

2009-07-21 Thread leo mueller

big thanks to all, the x[[1]] worked fine! :)

2009/7/21 Duncan Murdoch murd...@stats.uwo.ca:
 On 21/07/2009 6:09 AM, leo mueller wrote:

 hi all,

 i have a simple question. instead of defining my measurements in a
 static way like ...

 x - c(-0.475, -1.553, -0.434, -1.019, 0.395)

 ... i'd like them to be read from a file ...

 x - read.table(07a673ac0cb1f7f8fa293860566f633c/1/raw0.txt,
 header=FALSE)
 d1 - density(x, kernel = gaussian)

 with a formatting that looks like:

 4.284000e-01
 6.758333e-01
 8.292000e-01
 7.856667e-01
 6.633667e-01
 5.408000e-01
 4.728333e-01
 4.377000e-01
 4.374333e-01
 4.102667e-01
 3.628333e-01
 3.277000e-01
 4.909667e-01
 [...]

 R quits and says:

 d1 - density(x, kernel = gaussian)

 Error in density.default(x, kernel = gaussian) :
  argument 'x' must be numeric
 Calls: density - density.default
 Execution halted

 is there any possibility to convert this / make this work?

 read.table returns a dataframe, i.e. a list of vectors.  density wants a
 vector.  So you will probably get what you want using

 d1 - density(x[[1]], kernel=gaussian)

 You can use names(x) to find the name of the 1st column for a nicer syntax;
 it is probably V1 (for variable 1), so you could do

 y - x$V1
 density(y, ...)

 Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Courses in Toronto

2009-06-11 Thread Leo Guelman

Dear R users,

Does anybody know if there any good R courses within the Toronto area?

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vim R plugin-2

2009-05-09 Thread Leo


 Any alternative ways of sending info both ways from R to any open
 process (vim)
 in windows?

On windows, I'd rather use ole automation. A few years ago I
successfully used this plugin:
http://www.vim.org/scripts/script.php?script_id=889

I haven't used it since though.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Rdonlp2 -Query

2009-03-07 Thread Leo Guelman

Hi,

Did anyone used this package? Could you please share your thought on it?

Thanks!

L.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] GMM estimation

2008-05-27 Thread Leo Correia

Hello there!!!

Sorry to bother you all with such question and difficulties that I have been
facing on.

Recently I have been searching for packages to run GMM estimatives with R.

I have been searching for such packages for a while, but since I am a new
user of R system,
my quest so far was unsucessful.

That´s why I had decided to ask to this forum.  Hope that anyone could help
me!

I know that such emails might bother you guys at all... but I dont know
where to search for help about this issue.

Many thanks!!!
hope to hear from you soon!

Leo!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help with simulation of heteroskedasticity

2008-04-26 Thread Leo Correia

Hello guys!
Sorry to bother with such a question

I was trying to generate a monte carlo simulation with heteroskedasticity
errors. but I am not sure if the command line that I had
wrote is quite correct.
the type of heteroskedasticity that I want to create is such as var(e) =
var(x^4)

I began my work with this

x- rnorm (100, 2,0.4)  # generating an indepedent random variable
e- rnorm(100,0,x^2) # generating the error term
y.fitted- 0.5*(x) +3
y- y.fitted+e

And then I wrote an for loop code which could give me the results of a
t-test.
It seems that the simulation worked good because the main result is that the
t-test was no longer meanfull (as it is expected)

But my main doubt is if this code was capable to generate the type of
heteroskedasticity
that I wanted.
I tried to understand how the rnorm function generates the numbers but i
could not grab much

So could anyone help me telling if this kind of work is okay? or is there
any moree inteligent way to do the job?

thanks!!!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

47 matches

Mail list logo