Re: [sc-dev] R + Calc Update

Wojciech Gryc Tue, 01 May 2007 11:39:50 -0700

Thank you both for the replies.

I didn't realize those functions existed in R, and they could make life a
lot easier. They even work on a relatively complex function like glm(). Now
that I feel comfortable with making R packages, I'm learning how to call R
from C. Putting all of this into a nice R package should be doable in a few
days, assuming I get things working.
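
As a minimal sketch of the glm() case: the mtcars data set and the model
formula below are only illustrative assumptions, not part of the project.
The same names() and [[ ]] approach seems to expose the pieces of a fitted
model:

```r
# Illustrative only: mtcars ships with R; the model is an arbitrary example.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# The fitted model is a list, so names() and [[ ]] work on it as well.
fit_names <- names(fit)
coefs <- fit[["coefficients"]]   # named numeric vector: intercept and slope

# summary() returns another list; its coefficient table is a numeric matrix
# with one row per term, which maps naturally onto a block of cells.
coef_table <- coef(summary(fit))
print(coef_table)
```

A matrix like coef_table could presumably be handed back to Calc row by row.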


With regard to the implementation on Calc's end: the instructions provided
were good, and after modifying a few .hxx files I got my own functions
working. I've written a rough draft for the wiki, and am now compiling the
code from my wiki tutorial to make sure it is correct. I'll put it up tonight.

Finally, I don't mind making a Calc add-in instead. I did notice the
"scaddins" folder and actually toyed with the idea of using that instead. I
just have a few questions about this:

  1. How do we expect the user to install this stuff, in general? If I
  make a link between C and R, we can't ship it with OO due to licensing
  issues, so should we build some sort of built-in installer, or just have a
  link somewhere on the OO website?

  2. If I do make an add-in, is there anywhere I can put it in the
  source code, or will it be an external package, as discussed in #1? If it is
  an external package, then can I just code a link to R directly in the
  add-in?

Thank you,
Wojciech

On 5/1/07, Leonard Mada <[EMAIL PROTECTED]> wrote:


I have some additional comments to my previous post.

If we perform some calculation inside R, e.g.
> x <- fisher.test(matrix(c(40,60,30,70),2))

We assigned the results to the object 'x'. Now we can access various
elements of x as follows:

1. How many elements are stored inside x?
> length(x)
[1]  7

2. What are the names of those variables?
> y <- names(x)
> y
[1] "p.value"     "conf.int"    "estimate"    "null.value"  "alternative"
[6] "method"      "data.name"

This command has created a character vector (= y) with the names of the
variables! We can get an individual name with y[[1]] (y[1] works as well).

3. How do we access the stored output by variable?
  - EITHER iterate i = 1 to length(x) and access by position, with
  > x[[i]]
  - OR iterate i = 1 to length(x) and access by name, with
  > x[[y[[i]]]]

  - for subarrays, like x[[2]], we can apply the same thing one more time:
> length(x[[2]])
[1] 2
(x[[2]] is the 95% confidence interval for the odds ratio, so it has a
lower and an upper limit)
> x[[2]][1]        # first value = lower limit
[1] 0.8317144

> x[[2]][2]        # second value = upper limit
[1] 2.918995
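
The access patterns above can be put together in a short sketch. It uses the
2x2 table from the earlier message and walks over the result by name; the
variable names (nm, value, p, ci) are only illustrative.

```r
x <- fisher.test(matrix(c(40, 60, 30, 70), 2))

# Walk over the result by name rather than by position, so the code does not
# depend on the order in which a test happens to store its elements.
for (nm in names(x)) {
  value <- x[[nm]]
  # Elements such as conf.int hold more than one value; join them for display.
  cat(nm, ":", paste(format(value), collapse = ", "), "\n")
}

# Single elements can also be picked out directly by name.
p  <- x[["p.value"]]
ci <- x[["conf.int"]]   # ci[1] = lower limit, ci[2] = upper limit
```

Access by name is probably the safer choice if the position of an element
differs from one test function to another.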

CONCLUSIONS
============
A.) For statistical functions/techniques that return primarily a
p-value, we can easily detect this, as one of the names will be
"p.value" (it is usually the first element in the array, aka x[[1]]).

B.) We can import the data from R back into Calc, and create a list with
the names of the variables from the output (aka names(x) ) AND let the
more advanced user choose which variable from this list to enter into a
Calc cell.
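
A rough sketch of both conclusions, under the assumption that the result is
an ordinary named list. The helpers result_to_rows() and has_p_value() are
hypothetical names made up for this example; they are not existing R or Calc
functions.

```r
# Hypothetical helper: flatten a test result into name/value rows that could
# be offered to the user as a pick list.
result_to_rows <- function(res) {
  rows <- lapply(names(res), function(nm) {
    value <- res[[nm]]
    if (length(value) == 0) return(NULL)   # skip empty elements
    # Multi-valued elements get an index suffix, e.g. conf.int[1], conf.int[2].
    label <- if (length(value) > 1) {
      paste0(nm, "[", seq_along(value), "]")
    } else {
      nm
    }
    data.frame(name = label, value = as.character(value),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Conclusion A: a result reports a p-value if "p.value" is among its names.
has_p_value <- function(res) "p.value" %in% names(res)

x <- fisher.test(matrix(c(40, 60, 30, 70), 2))
rows <- result_to_rows(x)
print(rows)
```

Each row of this table would then correspond to one candidate value for a
Calc cell, with has_p_value() deciding whether the simple "p-value only"
path applies.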

Well, hope this helps to overcome some of the obstacles.

Sincerely,

Leonard


Leonard Mada wrote:
> Hi Wojciech,
>
> I just read (http://www.utsc.utoronto.ca/~04grycwo/overview.pdf) and
> began thinking on your ideas.
>
> Basically you are right, we need both methods: for advanced users and
> for beginners. That's my point, too.
>
> Regarding your concerns with importing the R-output back into Calc, I
> do admit that there are some problems and issues to discuss. I will
> try to think of the best solution.
>
> Until then, I can give you some useful tips:
>
> 1. Let's say we perform some calculations in R and store the output in
> a new variable, e.g.
>    x <- fisher.test(matrix(c(40,60,30,70),2))
>    then we can get the output by typing at the prompt:> x
>     OR
>    we can get the length of the return object:
>     :> length(x)
>     <output> 7
>    and get every element individually from this output: (iterate
> through x[[1]] -> x[[7]] )
>    :> x[[1]]
>    <output> [1] 0.1819324 (this is the p-value)
>
> 2. I imagine statistical functions as belonging to 2 large groups
> (this is NOT necessarily accurate BUT useful here):
>    a.) those that report a p-value as the main result
>         - this is usually the first value (aka x[[1]])
>    b.) those that perform more complex actions, like a multivariate
> model, or a resampling, or graphic
>
> These latter functions will be more difficult to deal with. But let's
> stick to the first group for now.
>
> Hope this is helpful. I will try to work up a solution for the rest.
>
> Sincerely,
>
> Leonard
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>




--

Five Minutes to Midnight:
Youth on human rights and current affairs
http://www.fiveminutestomidnight.org/
