Re: [R] abline outside of plot region

2011-04-29 Thread Rolf Turner

On 30/04/11 01:14, Nick Sabbe wrote:

Hi R people.



I ran into this problem: I created a plot with errbars, like this:


errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),
       yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:


abline(v=2)



In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?



I think what has happened is not that *you* changed some graphical parameter,
but rather that the package Hmisc did.  In a rather strange way.  (I *presume*
that you are using the errbar() function out of Hmisc rather than out of the
sfsmisc package --- you didn't say).

For a while I thought that the problem was associated with the *installation*
of Hmisc, because it seemed to happen only on the first occasion after I
did the install, and on later occasions (quit from R, restart, load Hmisc,
try errbar and abline) the problem did not occur.

But then, about the third time I tried the re-install, the problem never
happened at all.  But it ***did*** happen, a couple of times.  So you're not
imagining it, you'll be pleased to know.

I think that if it *does* happen to you again, you can fix it by setting

par(xpd=FALSE)

I checked on par()$xpd a couple of times when the problem occurred,
and got NA, which is consistent with the observed problem.
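
A minimal sketch of that check and fix, using base graphics only. The
pdf(NULL) device is an assumption made so the snippet runs without opening
a window, and xpd = NA is set by hand here to mimic the symptom, since the
real trigger is unknown:

```r
# Throwaway graphics device (assumption: no on-screen window is needed).
pdf(NULL)

# Mimic the symptom: with xpd = NA, drawing is clipped only to the device
# region, so abline() can run past the plot box.
par(xpd = NA)
plot(c(1, 2, 3, 4), c(2, 1, 3, 3))
xpd_before <- par("xpd")   # NA

# The suggested fix: clip to the plot region again.
par(xpd = FALSE)
abline(v = 2)              # drawn inside the plot box only
xpd_after <- par("xpd")    # FALSE

dev.off()
```

Checking par("xpd") before plotting is a cheap way to tell whether a
session is in the odd state.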

The whole thing is weird, but.  Gremlins?

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
That works like a charm!  Thanks so much Duncan.

On Fri, Apr 29, 2011 at 6:37 PM, Duncan Murdoch wrote:

> On 29/04/2011 9:34 PM, Miao wrote:
>
>> Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
>> kinds of special characters.  In R's gsub(), what regular expression
>> should I use to handle all these situations?
>>
>
> I don't know.  This might work:
>
> gsub("[\x01-\x1f\x7f-\xff]", "", x)
>
> (i.e. the range from character 1 to character 31, and 127 to 255) but I
> don't know if our regular expression matcher will accept those characters.
>
> Duncan Murdoch
>
>
>>
>> On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:
>>
>>On 29/04/2011 7:41 PM, Miao wrote:
>>
>>Hello,
>>
>>Can anyone help on gsub() in R?  I have a string like something
>>below, and
>>wanted to delete all the strings with leading backslash,
>>including "\xa0On",
>>"\023, "\xab", and many others.   How should I write a regular
>>expression
>>pattern in gsub()?  I don't care how many characters following
>>backslash.
>>
>>
>>
>>If those are R strings, none of them contain a backslash.  In R, a
>>backslash would always be printed as \\.
>>
>>\x is the introduction to a hexadecimal encoding for a character;
>>the next two characters show the hex digits.  So your first string
>>contains a single character \xa0, the third one contains \xab, and
>>so on.
>>
>>The \023 is an octal encoding for a single character.
>>
>>Duncan Murdoch
>>
>>
>>
>>txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for
>>people \xab
>>and be patient :"
>>
>>Thanks in advance,
>>Miao
>>
>>
>>
>>
>>
>>
>> --
>> proceed everyday
>>
>
>


-- 
proceed everyday



Re: [R] Best Practices for submitting packages to CRAN

2011-04-29 Thread Spencer Graves

Two additions to Michael Friendly's comments:


  1.  Have you considered putting your package on R-Forge 
(r-forge.r-project.org)?  They have a facility (which has been broken 
for several months now but will likely be fixed again at some time in 
the future) to test your package more or less daily on 7 different 
combinations of operating system and versions of R.  This has 
occasionally found errors that I could not replicate -- but could 
nevertheless fix!  It also makes it easy for you to have people test a 
new version before you want to release it, e.g., via 
'install.packages("fda", repos="http://R-Forge.R-project.org")' for the 
"fda" package.



  2.  In addition to "\dontrun", you can also put tests you DO want 
to perform but don't want to show the user in "\dontshow".  I use this 
latter routinely to do unit tests.  There is an RUnit package, which I 
probably should learn to use but haven't, and there are other unit 
testing facilities with R that others can discuss that can help you 
produce "trustworthy software" (John Chambers 2008 Software for Data 
Analysis, Springer).  Rather than learn these other facilities, I 
routinely program unit tests in the examples section of my help files, 
often using a construct like the following:



myAnswer <- myFun()

correctAnswer <- list(whatever)

\dontshow{stopifnot(}
all.equal(myAnswer, correctAnswer)
\dontshow{)}


  The code won't pass "R CMD check" until "myAnswer" and 
"correctAnswer" are essentially equal.



  When I find or someone reports a bug, I first program another 
check like this into the help file.  Then I fix the bug.  After that, I 
have confidence that if some other change introduces a problem like 
that, I'll know it before it reaches users other than me.
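
Outside an Rd file, the same kind of check can be sketched in plain R. The
function myFun and its expected value below are hypothetical stand-ins, not
code from any real package:

```r
# Hypothetical function under test (stands in for `myFun` above).
myFun <- function() list(mean = mean(c(1, 2, 3)), n = 3L)

myAnswer <- myFun()
correctAnswer <- list(mean = 2, n = 3L)

# all.equal() returns TRUE or a character description of the differences,
# so wrap it in isTRUE() before handing it to stopifnot().
stopifnot(isTRUE(all.equal(myAnswer, correctAnswer)))

# A deliberately wrong expectation shows that the check really fails.
wrongAnswer <- list(mean = 3, n = 3L)
failed <- inherits(
  try(stopifnot(isTRUE(all.equal(myAnswer, wrongAnswer))), silent = TRUE),
  "try-error"
)
```

Wrapping all.equal() in isTRUE() is a small variation on the construct
above; it keeps the failure message from stopifnot() uniform.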



  sg


On 4/29/2011 8:12 PM, Michael Friendly wrote:

On 4/28/2011 8:43 AM, Singmann wrote:

Dear all,

I (and a colleague) have been working on our first package (for
fitting a
certain type of cognitive models:
http://www.psychologie.uni-freiburg.de/Members/singmann/R/mptinr) for
quite
a while now and have the feeling, that it is "good to go". That is,
we want
to submit it to CRAN. However, I still have two questions before
doing so
and would appreciate any comments.

1. How often is it appropriate to submit updated versions of your
package?
Background for this question: Although we think we have tested the
package
thoroughly, there are some errors that only pop up in daily use. That
is, it
could happen that, especially shortly after the release, fixes need
to be
released rather frequently (say, every 2 weeks). On the other hand, I
know
that humans have to operate CRAN and their time is limited.
Therefore, any
update will consume their time.

You'll have to work out your own workflow for this, but a general
practice might be to release a new version (with an incremented
version number)  whenever you feel there are significant changes,
particularly those that are user-visible or change usage.

This assumes that your package passes R CMD check and R CMD build
without errors or warnings, with the current version of R.
If so, most of what happens on CRAN is automatic.  Otherwise
you may provoke one of the trolls under the CRAN bridge or even
a human.

You might find it easier to register the project on R-Forge and
do updates from there.



2. Is it necessary to put examples that take a considerable amount of
time
to run (>  1 hour) into a \dontrun block? Background: We have a
really slow
MCMC function. Some of the examples take ~1 hour to finish. If these
examples are run each time the package is checked, it will significantly
prolong the checking time. On the other hand, this check will ensure
that
all changes to the function do not corrupt it.


Yes, do use \dontrun{ ... } for stuff in examples that is inconvenient
or difficult to actually run each time during checking.  Sometimes
people include the output from such dontrun examples as comments
inside the example as in

\dontrun{
1+1
## 2
}

But then the responsibility is yours to make sure that the examples
still work after updates.







--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567


Re: [R] Best Practices for submitting packages to CRAN

2011-04-29 Thread Michael Friendly

On 4/28/2011 8:43 AM, Singmann wrote:

Dear all,

I (and a colleague) have been working on our first package (for fitting a
certain type of cognitive models:
http://www.psychologie.uni-freiburg.de/Members/singmann/R/mptinr) for quite
a while now and have the feeling, that it is "good to go". That is, we want
to submit it to CRAN. However, I still have two questions before doing so
and would appreciate any comments.

1. How often is it appropriate to submit updated versions of your package?
Background for this question: Although we think we have tested the package
thoroughly, there are some errors that only pop up in daily use. That is, it
could happen that, especially shortly after the release, fixes need to be
released rather frequently (say, every 2 weeks). On the other hand, I know
that humans have to operate CRAN and their time is limited. Therefore, any
update will consume their time.

You'll have to work out your own workflow for this, but a general
practice might be to release a new version (with an incremented version
number) whenever you feel there are significant changes,
particularly those that are user-visible or change usage.

This assumes that your package passes R CMD check and R CMD build
without errors or warnings, with the current version of R.
If so, most of what happens on CRAN is automatic.  Otherwise
you may provoke one of the trolls under the CRAN bridge or even
a human.

You might find it easier to register the project on R-Forge and
do updates from there.



2. Is it necessary to put examples that take a considerable amount of time
to run (>  1 hour) into a \dontrun block? Background: We have a really slow
MCMC function. Some of the examples take ~1 hour to finish. If these
examples are run each time the package is checked, it will significantly
prolong the checking time. On the other hand, this check will ensure that
all changes to the function do not corrupt it.


Yes, do use \dontrun{ ... } for stuff in examples that is inconvenient
or difficult to actually run each time during checking.  Sometimes 
people include the output from such dontrun examples as comments inside 
the example as in


\dontrun{
1+1
## 2
}

But then the responsibility is yours to make sure that the examples 
still work after updates.






Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, Duncan Murdoch wrote:


On 29/04/2011 7:41 PM, Miao wrote:


Can anyone help on gsub() in R?  I have a string like something below, and
wanted to delete all the strings with leading backslash, including "\xa0On",
"\023", "\xab", and many others.   How should I write a regular expression
pattern in gsub()?  I don't care how many characters follow the backslash.



If those are R strings, none of them contain a backslash.  In R, a backslash 
would always be printed as \\.


\x is the introduction to a hexadecimal encoding for a character; the next 
two characters show the hex digits.  So your first string contains a single 
character \xa0, the third one contains \xab, and so on.


The \023 is an octal encoding for a single character.



If we were dealing with a leading backslash, I guess this would do it:

gsub("^\\\\.*", "", txt)

R would display a double backslash, but I believe that represents a single 
backslash.  So if the string were saved using write.table, say, only a 
single backslash would be stored.



> a <- "\\This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\", "", a)
[1] "This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\.*", "", a)
[1] ""
> gsub("^\\\\.*", "", c(a,"Another string","\\more"))
[1] ""               "Another string" ""
> write.table(a, file="a.txt", quote=F, row.names=F, col.names=F)


$ cat a.txt
\This is a string.
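
The single-versus-double backslash point can also be checked without
writing a file. This is a small sketch; the variable names are only
illustrative:

```r
a <- "\\This is a string."   # the literal "\\" is ONE backslash in the string

n_first <- nchar("\\")       # number of characters in a lone backslash
cat(a, "\n")                 # cat() shows the string as stored, one backslash

# In a regex the backslash must be escaped again, hence four in the source.
stripped <- sub("^\\\\", "", a)

# Without a regex: test for a leading backslash literally and drop it.
stripped2 <- if (startsWith(a, "\\")) substring(a, 2) else a
```

print() shows the escaped form, while cat() and nchar() reflect what is
actually stored.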

Apparently this is not what the OP really wanted.  The OP probably wanted 
to remove characters that were not from the regular ASCII set.



Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota



Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 9:34 PM, Miao wrote:

Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
kinds of special characters.  In R's gsub(), what regular expression
should I use to handle all these situations?


I don't know.  This might work:

gsub("[\x01-\x1f\x7f-\xff]", "", x)

(i.e. the range from character 1 to character 31, and 127 to 255) but I 
don't know if our regular expression matcher will accept those characters.
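
If the byte ranges in a character class turn out to be a problem, one
alternative route (not the pattern above) is to convert to ASCII first and
then strip control characters. This is only a sketch: it assumes the high
bytes are Latin-1, and iconv() behaviour can vary between platforms:

```r
# Example string from the question (two high bytes and one control character).
txt <- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab and be patient :"

# Step 1 (assumption: input is Latin-1): drop everything that has no ASCII
# equivalent; sub = "" deletes bytes that cannot be converted.
ascii_only <- iconv(txt, from = "latin1", to = "ASCII", sub = "")

# Step 2: remove the remaining control characters such as \023.
clean <- gsub("[[:cntrl:]]", "", ascii_only)
clean
```

After step 1 the string is plain ASCII, so the gsub() call in step 2 does
not depend on the session's locale.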


Duncan Murdoch




On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:

On 29/04/2011 7:41 PM, Miao wrote:

Hello,

Can anyone help on gsub() in R?  I have a string like something
below, and
wanted to delete all the strings with leading backslash,
including "\xa0On",
"\023, "\xab", and many others.   How should I write a regular
expression
pattern in gsub()?  I don't care how many characters following
backslash.



If those are R strings, none of them contain a backslash.  In R, a
backslash would always be printed as \\.

\x is the introduction to a hexadecimal encoding for a character;
the next two characters show the hex digits.  So your first string
contains a single character \xa0, the third one contains \xab, and
so on.

The \023 is an octal encoding for a single character.

Duncan Murdoch



txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for
people \xab
and be patient :"

Thanks in advance,
Miao






--
proceed everyday




Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
Thanks Duncan for clarifying this.  I'm pretty much a newbie with these
kinds of special characters.  In R's gsub(), what regular expression
should I use to handle all these situations?


On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote:

> On 29/04/2011 7:41 PM, Miao wrote:
>
>> Hello,
>>
>> Can anyone help on gsub() in R?  I have a string like something below, and
>> wanted to delete all the strings with leading backslash, including
>> "\xa0On",
>> "\023, "\xab", and many others.   How should I write a regular expression
>> pattern in gsub()?  I don't care how many characters following backslash.
>>
>
>
> If those are R strings, none of them contain a backslash.  In R, a
> backslash would always be printed as \\.
>
> \x is the introduction to a hexadecimal encoding for a character; the next
> two characters show the hex digits.  So your first string contains a single
> character \xa0, the third one contains \xab, and so on.
>
> The \023 is an octal encoding for a single character.
>
> Duncan Murdoch
>
>
>
>> txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people
>> \xab
>> and be patient :"
>>
>> Thanks in advance,
>> Miao
>>
>>
>
>


-- 
proceed everyday



Re: [R] threshold matrix

2011-04-29 Thread Rolf Turner

On 30/04/11 02:44, Alaios wrote:

Thanks a lot.
I finally used

M2 <- M
M2[M <  thresh] <- 0
M2[M >= thresh] <- 1

as I noticed that this one line

M2 <- as.numeric(M[] < thresh)

vectorizes my matrix.

One more question I have two matrices that only differ slightly. What will be 
the easiest way to compare and find the cells that are not the same?


(1) M2 <- 0 +(M >= thresh)

is one line (and is sexier! :-) ).

(2) which(A !=B, arr.ind = TRUE)

or (possibly more sensibly)

which (abs(A-B) > sqrt(.Machine$double.eps), arr.ind = TRUE)

should solve your ``One more question''.
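
A small worked sketch of both answers, with made-up matrices (the values
of M, A, B and thresh are assumptions for illustration only):

```r
# (1) Thresholding: 0 + logical keeps the matrix shape and gives 0/1 values.
M <- matrix(c(0.1, 0.5, 0.9, 0.3), nrow = 2)
thresh <- 0.4
M2 <- 0 + (M >= thresh)

# (2) Finding cells that differ by more than floating-point noise.
A <- matrix((1:6) / 10, nrow = 2)
B <- A
B[2, 3] <- B[2, 3] + 1e-3      # perturb exactly one cell
diffs <- which(abs(A - B) > sqrt(.Machine$double.eps), arr.ind = TRUE)
diffs                          # one row giving the row and column index
```

The tolerance version avoids flagging cells that differ only by rounding
error, which a plain A != B comparison would report.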

cheers,

Rolf Turner



Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 7:41 PM, Miao wrote:

Hello,

Can anyone help on gsub() in R?  I have a string like something below, and
wanted to delete all the strings with leading backslash, including "\xa0On",
"\023, "\xab", and many others.   How should I write a regular expression
pattern in gsub()?  I don't care how many characters following backslash.



If those are R strings, none of them contain a backslash.  In R, a 
backslash would always be printed as \\.


\x is the introduction to a hexadecimal encoding for a character; the 
next two characters show the hex digits.  So your first string contains 
a single character \xa0, the third one contains \xab, and so on.


The \023 is an octal encoding for a single character.

Duncan Murdoch




txt<- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
and be patient :"

Thanks in advance,
Miao



Re: [R] Analysis and graphics by groups

2011-04-29 Thread Bilonick, Richard A
Unless I misunderstand, I think you might want the nlsList function from the 
nlme package. It will let you fit a nonlinear model for grouped data. You use 
"|" to separate the model formula from the grouping factor.

Rick
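
A minimal sketch of the nlsList idea. The Orange data set and the
self-starting SSlogis model are stand-ins chosen for illustration; they
are not the poster's perieph data or hand-written logistic formula:

```r
library(nlme)  # recommended package that ships with standard R installations

# Orange (from the datasets package) has one growth curve per Tree.
# SSlogis is a self-starting logistic model, so no start values are supplied.
fits <- nlsList(circumference ~ SSlogis(age, Asym, xmid, scal) | Tree,
                data = Orange)

# One row of estimated parameters (Asym, xmid, scal) per tree.
group_coefs <- coef(fits)
group_coefs
```

With a hand-written model formula, as in the original question, start
values would presumably have to be passed through the start argument.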

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Andrew Robinson [a.robin...@ms.unimelb.edu.au]
Sent: Friday, April 29, 2011 8:05 PM
To: Cristiano Yuji Sasada Sato
Cc: r-help@r-project.org
Subject: Re: [R] Analysis and graphics by groups

Hi Cristiano,

The error is that FUN is not a function.  That is true: what you are
passing to FUN is an nls() call, not a function.  Instead, wrap your
model code below in a function of the data that you want to split by
Cerca, and pass that function as FUN.

You might try to construct a minimal, reproducible, commented example
to help us explain what you need to do.

I don't have the gapply function and I don't know which package it is
in (perhaps you could provide that  information next time) so I can't
help more than that.

Andrew

On Fri, Apr 29, 2011 at 04:31:51PM -0300, Cristiano Yuji Sasada Sato wrote:
> Hello,
>
> This is my first post in this e-mail list and I hope it's enough to justify
> calling for help. In case it's not, sorry.
>
> I'm trying to do analysis and graphics using a factor as a criteria to split
> data and do the analysis/graphics for each subset of data.
>
> Right now what I'm trying to do is to fit and plot the following logistic
> model, according to a third variable named "Cerca":
> dm_fit_T<-nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)
>
> I've found a function called gapply which seems to be what I need, but it
> doesn't seem to work. This is the argument I've used:
> gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups="Cerca")
>
> But I get this error message returned:
> > Error in get(as.character(FUN), mode = "function", envir = envir) :
> object 'FUN' of mode 'function' was not found
>
> Can you help me doing this non-linear regression by groups work?
>
> Also, after I manage making the regression, I'd also need fitting a line to
> the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model above. I've used
> plotfit to do that with one nlm data set. Is it possible to fit each group
> trend line and data with different colours/symbols  in one same graphic?
>
> Thank you,
> Cristiano
>
> --
> Cristiano Yuji Sasada Sato
> Doutorando
> Programa de Pós-Graduação em Ecologia e Evolução - IBRAG / UERJ
> Laboratório de Ecologia de Rios e Córregos
> Departamento de Ecologia - Universidade do Estado do Rio de Janeiro
>
>   [[alternative HTML version deleted]]
>



--
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/



Re: [R] For loop and sqldf

2011-04-29 Thread Jeff Newmiller
Putting SQL columns/variables into square brackets is valid syntax for sqlite. 
Expecting sqlite to share variables with R is not, so there was only one 
necessary change.
--
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)

Sent from my phone. Please excuse my brevity.

David Winsemius wrote:

On Apr 29, 2011, at 4:27 PM, mathijsdevaan wrote:

> Hi list,
>
> Can anyone tell me why the following does not work? Thanks a lot! Your help
> is very much appreciated.
>
> DF = data.frame(read.table(textConnection(" B C D E F G
> 8025 1995 0 4 1 2
> 8025 1997 1 1 3 4
> 8026 1995 0 7 0 0
> 8026 1996 1 2 3 0
> 8026 1997 1 2 3 1
> 8026 1998 6 0 0 4
> 8026 1999 3 7 0 3
> 8027 1997 1 2 3 9
> 8027 1998 1 2 3 1
> 8027 1999 6 0 0 2
> 8028 1999 3 7 0 0
> 8029 1995 0 2 3 3
> 8029 1998 1 2 3 2
> 8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))

list<-sort(unique(DF$C)) ; require(sqldf); data <-list() # added inits

> for (t in 1:length(list))
> {
> year = as.character(list[t])
> data[year]<-sqldf('select * from DF where C = [year]')

# I see you have already gotten a workable answer, but thought you might
# want to see if this would work:

data[year]<-sqldf(paste('select * from DF where C = ', year, sep="") )

# Two changes ... let `year` get evaluated and don't put `year` in brackets.

> }
>
> data
$`1995`
[1] 8025 8026 8029

$`1996`
[1] 8026

$`1997`
[1] 8025 8026 8027

$`1998`
[1] 8026 8027 8029

$`1999`
[1] 8026 8027 8028 8029

> I am trying to split up the data.frame into 5 new ones, one for
> every year.

--
David Winsemius, MD
West Hartford, CT




Re: [R] Analysis and graphics by groups

2011-04-29 Thread Andrew Robinson
Hi Cristiano,

The error is that FUN is not a function.  That is true: what you are
passing to FUN is an nls() call, not a function.  Instead, wrap your
model code below in a function of the data that you want to split by
Cerca, and pass that function as FUN.
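
A rough sketch of that idea with base R only. The data are simulated, and
lm() stands in for the nls() model purely to keep the example short and
certain to converge; split() plus lapply() is used here instead of gapply:

```r
set.seed(1)

# Simulated data: two groups of Cerca with different slopes (assumed values).
d <- data.frame(Cerca = rep(c("a", "b"), each = 10), x = rep(1:10, times = 2))
d$y <- ifelse(d$Cerca == "a", 2, 5) * d$x + rnorm(20, sd = 0.1)

# FUN must be a function of one group's data, not an already-written fitting call.
fit_one <- function(df) lm(y ~ x, data = df)

# Fit the model separately within each level of Cerca.
fits <- lapply(split(d, d$Cerca), fit_one)
slopes <- sapply(fits, function(f) coef(f)[["x"]])
slopes
```

The same fit_one pattern should carry over to nls(), with the model
formula and start values placed inside the function body.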

You might try to construct a minimal, reproducible, commented example
to help us explain what you need to do.

I don't have the gapply function and I don't know which package it is
in (perhaps you could provide that  information next time) so I can't
help more than that.

Andrew
 
On Fri, Apr 29, 2011 at 04:31:51PM -0300, Cristiano Yuji Sasada Sato wrote:
> Hello,
> 
> This is my first post in this e-mail list and I hope it's enough to justify
> calling for help. In case it's not, sorry.
> 
> I'm trying to do analysis and graphics using a factor as a criteria to split
> data and do the analysis/graphics for each subset of data.
> 
> Right now what I'm trying to do is to fit and plot the following logistic
> model, according to a third variable named "Cerca":
> dm_fit_T<-nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)
> 
> I've found a function called gapply which seems to be what I need, but it
> doesn't seem to work. This is the argument I've used:
> gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups="Cerca")
> 
> But I get this error message returned:
> > Error in get(as.character(FUN), mode = "function", envir = envir) :
> object 'FUN' of mode 'function' was not found
> 
> Can you help me doing this non-linear regression by groups work?
> 
> Also, after I manage making the regression, I'd also need fitting a line to
> the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model above. I've used
> plotfit to do that with one nlm data set. Is it possible to fit each group
> trend line and data with different colours/symbols  in one same graphic?
> 
> Thank you,
> Cristiano
> 
> -- 
> Cristiano Yuji Sasada Sato
> Doutorando
> Programa de Pós-Graduação em Ecologia e Evolução - IBRAG / UERJ
> Laboratório de Ecologia de Rios e Córregos
> Departamento de Ecologia - Universidade do Estado do Rio de Janeiro
> 
>   [[alternative HTML version deleted]]
> 



-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/



[R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
Hello,

Can anyone help with gsub() in R?  I have a string like the one below, and
want to delete all the substrings with a leading backslash, including "\xa0On",
"\023", "\xab", and many others.   How should I write a regular expression
pattern in gsub()?  I don't care how many characters follow the backslash.

txt <- "Is This Thing\xa0On? http://bit.ly/jAbKem  wait \023 for people \xab
and be patient :"

Thanks in advance,
Miao



Re: [R] For loop and sqldf

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 4:27 PM, mathijsdevaan wrote:


Hi list,

Can anyone tell me why the following does not work? Thanks a lot! Your help
is very much appreciated.

DF = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))


list<-sort(unique(DF$C))  ; require(sqldf); data <-list() # added inits


for (t in 1:length(list))
{
year = as.character(list[t])
data[year]<-sqldf('select * from DF where C = [year]')


# I see you have already gotten a workable answer, but thought you
# might want to see if this would work:

data[year]<-sqldf(paste('select * from DF where C = ', year,  sep="") )

# Two changes ... let `year` get evaluated and don't put `year` in
# brackets.



}



> data
$`1995`
[1] 8025 8026 8029

$`1996`
[1] 8026

$`1997`
[1] 8025 8026 8027

$`1998`
[1] 8026 8027 8029

$`1999`
[1] 8026 8027 8028 8029
I am trying to split up the data.frame into 5 new ones, one for  
every year.
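
The two changes suggested above can be sketched as follows. The printed `data`
shows only column B for each year because assigning a data frame with single
brackets, data[year] <- ..., keeps just its first column; double brackets store
the whole data frame. This is a sketch under assumptions: a base-R subset
stands in for the sqldf() call so that it runs without the package, the data
are a shortened version of the posted table, and the names years, queries and
by_year are illustrative.

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8029 1995 0 2 3 3", header = TRUE)

years <- sort(unique(DF$C))

# Build one query string per year; the value of each year is pasted in,
# rather than the literal text "[year]".
queries <- paste("select * from DF where C = ", years, sep = "")

by_year <- list()
for (year in as.character(years)) {
  # With sqldf installed, the right-hand side would be the sqldf() call
  # on the matching query string. Base-R stand-in used here:
  by_year[[year]] <- DF[DF$C == as.numeric(year), ]
}

print(queries)
print(names(by_year))
```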





--

David Winsemius, MD
West Hartford, CT



Re: [R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Jun Shen
Hi, Dennis,

Thanks for the reply. I tried upgrading to R 2.13.0. When I then tried to
load the package with library(nparcomp), I got an error:

Error: package 'mvtnorm' is not installed for 'arch=i386'

What does that mean? Thanks.

Jun
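
This message typically means that the installed copy of mvtnorm was built for
a different architecture than the running R (for example, only a 64-bit build
is present while 32-bit R is running), which can happen after an upgrade. A
minimal sketch, assuming that reinstalling the affected packages under the new
R is acceptable; the helper name not_loadable is hypothetical, and the install
calls are left commented out because they need a CRAN mirror and network
access.

```r
# Hypothetical helper: report which packages cannot be loaded in this session.
not_loadable <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

needed <- c("mvtnorm", "multcomp", "nparcomp")
missing_pkgs <- not_loadable(needed)
print(missing_pkgs)

# To reinstall them for the running R:
# install.packages(missing_pkgs)
# To rebuild everything that was installed under an older R:
# update.packages(checkBuilt = TRUE, ask = FALSE)
```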


Re: [R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Dennis Murphy
Hi:

Is this the function nparcomp() in the nparcomp package or the one
from the mutoss package? When using functions from packages, it is
useful to indicate the package name. I'm assuming you're using the
nparcomp package, because your code worked for me when that package
was loaded:

> library(nparcomp)
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
> nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')

  Nonparametric Multiple Comparison Procedure based on relative
contrast effects , Type of Contrast : Dunnett
 NOTE:
 *---Weight Matrix--*
 - Weight matrix for choosen contrast based on all-pairs comparisons

 *---Analysis of relative effects---*
 - Simultaneous Confidence Intervals for relative effects p(i,j)
  with confidence level 0.95
 - Method = Multivariate Delta-Method (Logit)
 - p-Values for  H_0: p(i,j)=1/2

 *Interpretation*
 p(a,b) > 1/2 : b tends to be larger than a
 *--Mult.Distribution---*
 - Equicoordinate Quantile
 - Global p-Value
 *--*
$weight.matrix

< snipped for brevity - all zeros >

$Data.Info
       Sample Size
1     Duoderm   24
2     Fibrase   24
3 Kollagenase   24
4 Non-treated   24
5    Stimulen   24
6     Vehicle   24

$Analysis.of.relative.effects
                  Comparison rel.effect confidence.interval t.value
1     p(Non-treated,Duoderm)        0.5   [ 0.499 ; 0.501 ]       0
2     p(Non-treated,Fibrase)        0.5   [ 0.499 ; 0.501 ]       0
3 p(Non-treated,Kollagenase)        0.5   [ 0.499 ; 0.501 ]       0
4    p(Non-treated,Stimulen)        0.5   [ 0.499 ; 0.501 ]       0
5     p(Non-treated,Vehicle)        0.5   [ 0.499 ; 0.501 ]       0
  p.value.adjusted p.value.unadjusted
1                1                  1
2                1                  1
3                1                  1
4                1                  1
5                1                  1

$Mult.Distribution
  Quantile p.Value.global
1 2.568766  1

$Correlation
[1] NA

A graphic also appears indicating zero effect, which is what one would
expect since Ulceration = 5 for every observation in the data frame.
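
A quick way to spot a constant response before running the comparison is to
count distinct values per group. The sketch below uses a small made-up data
frame standing in for `test` (the values are illustrative, not the posted
data), and the helper name distinct_by_group is hypothetical.

```r
# Made-up stand-in for the posted data: constant Ulceration in every group.
test <- data.frame(
  Group = rep(c("Duoderm", "Non-treated", "Vehicle"), each = 4),
  Ulceration = 5,
  Inflamation = c(3, 4, 3, 2, 3, 3, 4, 4, 2, 2, 3, 3)
)

# Number of distinct values of a response within each group.
distinct_by_group <- function(response, group) {
  tapply(response, group, function(x) length(unique(x)))
}

print(distinct_by_group(test$Ulceration, test$Group))
print(length(unique(test$Ulceration)) == 1)
```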

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] nparcomp_1.0-1  multcomp_1.2-5  survival_2.36-9 mvtnorm_0.9-999
[5] sos_1.3-0       brew_1.0-6      plyr_1.5.2

loaded via a namespace (and not attached):
[1] tcltk_2.13.0 tools_2.13.0

Check your version of R and the nparcomp package against this. If you
have an older version of R or nparcomp, perhaps an upgrade is
sufficient to fix the problem.

HTH,
Dennis


Re: [R] RCurl and postForm()

2011-04-29 Thread Duncan Temple Lang

Hi Ryan

 postForm() is using a different style (specifically, a different Content-Type) of
submitting the form than the curl -d command.
Switching to style = 'POST' uses the same type, but at a quick guess, the
parameter name 'a' is causing confusion
and the result is the empty JSON array - "[]".

A quick workaround is to use curlPerform() directly rather than postForm():

 r = dynCurlReader()
 curlPerform(postfields = 'Archbishop Huxley',
             url = 'http://www.datasciencetoolkit.org/text2people',
             verbose = TRUE, post = 1L, writefunction = r$update)
 r$value()

This yields

[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":0,\"end_index\":17,\"matched_string\":\"Archbishop Huxley\"}]"

and you can use fromJSON() to transform it into data in R.

  D.

On 4/29/11 12:14 PM, Elmore, Ryan wrote:
> Hi everybody,
>
> I think that I am missing something fundamental in how strings are passed
> from a postForm() call in R to the curl or libcurl functions underneath.  For
> example, I can do the following using curl from the command line:
>
> $ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
> [{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]
>
> Trying the same thing, or what I *think* is the same thing (obviously not), in R
> (Mac OS 10.6.7, R 2.13.0) produces:
>
>> library(RCurl)
> Loading required package: bitops
>> api <- "http://www.datasciencetoolkit.org/text2people"
>> postForm(api, a="Archbishop Huxley")
> [1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
> attr(,"Content-Type")
>                 charset
> "text/html"     "utf-8"
>
> I can match the result given on the DSTK API's website by using system(), but
> that doesn't seem like the R-like way of doing something.
>
>> system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
> 158   141  141   141
> 0[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop
>  Huxley"}]17599 72 --:--:-- --:--:-- --:--:--   670
> 
> If you want to see some additional information related to this question, I 
> posted on StackOverflow a few days ago:
> http://stackoverflow.com/questions/5797688/post-request-using-rcurl
> 
> I am working on this R wrapper for the data science toolkit as a way of 
> illustrating how to make an R package for the Denver RUG and ran into this 
> problem.  Any help to this problem will be greatly appreciated by the Denver 
> RUG!
> 
> Cheers,
> Ryan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Duncan Temple Lang
Hi Tal

You can add

  ssl.verifypeer = FALSE

in the .opts list so that the certificate is simply accepted.

Alternatively, you can tell libcurl where to find the certification
authority file containing signatures. This can be done via the cainfo
option, e.g.

   cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),

Often such a collection of certificates is installed with the ssl library.
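
Put together with the getForm() call used elsewhere in this thread, the two
alternatives might look like the sketch below. It assumes RCurl is installed;
the names opts_insecure, opts_cainfo and fetch_sheet are hypothetical, and the
wrapper is only defined here, not called, because it needs network access.

```r
# Option 1: accept the server certificate without verification.
opts_insecure <- list(followlocation = TRUE, ssl.verifypeer = FALSE)

# Option 2: point libcurl at the CA bundle shipped with RCurl.
# system.file() returns "" if RCurl (or the file) is not installed.
opts_cainfo <- list(
  followlocation = TRUE,
  cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")
)

# Hypothetical wrapper around the getForm() call from this thread.
fetch_sheet <- function(opts) {
  tt <- RCurl::getForm("http://spreadsheets0.google.com/spreadsheet/pub",
                       hl = "en",
                       key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
                       single = "true", gid = "0", output = "csv",
                       .opts = opts)
  read.csv(textConnection(tt))
}
```

Verifying against a CA bundle (option 2) is generally the safer of the two,
since option 1 skips the certificate check altogether.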

  D.


Re: [R] For loop and sqldf

2011-04-29 Thread Dennis Murphy
Hi:

Try

split(DF, DF$C)

Does that work?
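
For reference, a small sketch of what split() returns here, using a shortened
version of the posted data: a named list with one data frame per distinct
value of C, with all columns kept.

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8027 1997 1 2 3 9", header = TRUE)

by_year <- split(DF, DF$C)

print(names(by_year))      # one element per year
print(by_year[["1995"]])   # all columns are kept
```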

Dennis

On Fri, Apr 29, 2011 at 1:27 PM, mathijsdevaan wrote:
> Hi list,
>
> Can anyone tell me why the following does not work? Thanks a lot! Your help
> is very much appreciated.
>
> DF = data.frame(read.table(textConnection("    B  C  D  E  F  G
> 8025  1995  0  4  1  2
> 8025  1997  1  1  3  4
> 8026  1995  0  7  0  0
> 8026  1996  1  2  3  0
> 8026  1997  1  2  3  1
> 8026  1998  6  0  0  4
> 8026  1999  3  7  0  3
> 8027  1997  1  2  3  9
> 8027  1998  1  2  3  1
> 8027  1999  6  0  0  2
> 8028  1999  3  7  0  0
> 8029  1995  0  2  3  3
> 8029  1998  1  2  3  2
> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
> list<-sort(unique(DF$C))
> for (t in 1:length(list))
>        {
>        year = as.character(list[t])
>        data[year]<-sqldf('select * from DF where C = [year]')
>        }
>
> I am trying to split up the data.frame into 5 new ones, one for every year.
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484559.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Jun Shen
Dear list,

I tried to use nparcomp to run some post hoc non-parametric comparisons
and got an error.

Error in uniroot(pfct, interval = interval) :
  f() values at end points not of opposite sign

 Appreciate any comments.

the command line:

>nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')


Jun
===
data as follows

structure(list(Group = c("Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle"
), Ulceration = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5), Inflamation = c(3, 4, 3, 2, 3, 3, 4, 4, 2, 2, 3, 3,
3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 3, 3, 2, 3, 3, 4, 3, 3, 2, 4, 4,
4, 4, 4, 4, 4, 3, 3, 4, 3, 5, 3, 3, 4, 4, 3, 3, 2, 4, 2, 3, 3,
4, 3, 4, 3, 3, 4, 3, 4, 2, 3, 3, 4, 2, 3, 4, 3, 2, 3, 3, 3, 2,
3, 2, 2, 2, 2, 4, 3, 2, 3, 3, 4, 3, 3, 4, 3, 4, 2, 4, 3, 4, 2,
4, 3, 4, 3, 2, 2, 2, 2, 3, 2, 3, 2, 4, 3, 2, 4, 4, 4, 2, 2, 3,
3, 2, 4, 3, 2, 3, 2, 2, 2, 4, 2, 3, 2, 3, 2, 3, 3, 3, 4, 3, 3,
4, 4, 2, 3, 2, 3), Fibroplasia = c(4, 4, 4, 4, 4, 3, 4, 4, 4,
3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 2, 4, 4, 3,
2, 4, 4, 4, 4, 4, 4, 3, 3, 3, 4, 3, 3, 3, 4, 3, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 3, 4, 4, 4, 3, 4,
4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4,
4, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 3, 3, 4, 4, 3, 3, 3, 4, 3,
3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4, 4, 3, 4,
3, 4, 4, 4, 4, 4, 4, 4, 4), Fibrosis.and.Adexnal.Atrophy = c(4,
4, 4, 3, 4, 4, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 3, 4, 3, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 4, 3, 4,
4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 3, 3,
4, 4, 3, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4,
4, 4, 3, 4, 4, 4, 3, 4, 3, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 4,
3, 4, 4, 3, 3, 3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 4,
3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), Inflammation = c(2,
2, 2, 1, 1, 1, 2, 3, 1, 2, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1,
1, 1, 1, 1, 1, 1, 1, 2, NA, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2,
1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, NA, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1,
2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2), Fibroplasia.1 =
c(4,
4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3,
4, 4, 4, 4, 3, 3, 4, 3, NA, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3,
3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 3,
3, 4, 4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4,
4, 3, 3, 3, NA, 4, 4, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 2, 4, 3,
4, 4, 3, 4, 4, 2, 3, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3,
4, 3, 3, 4, 4, 4, 4, 3, 3, 4, 3, 3, 4, 4, 4, 4, 3, 4, 4), Fibrosis = c(3,
3, 3, 3, 3, 3, 3, 3, 3, 3

Re: [R] setting options only inside functions

2011-04-29 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of 
> luke-tier...@uiowa.edu
> Sent: Friday, April 29, 2011 9:35 AM
> To: Jonathan Daily
> Cc: r-help@r-project.org; Hadley Wickham; Barry Rowlingson
> Subject: Re: [R] setting options only inside functions
> 
> The Python solution does not extend, at least not cleanly, to things
> like dev on/ dev off or to Hadley's locale example.  In any case if I
> am reading the Python source correctly on how they handle user
> interrupts this solution has the same non-robustness to user interrupts
> issue that Bill's initial solution had.
> 
> As a basis I believe what we need is a mechanism that handles a
> setup, an action, and a cleanup, with setup and cleanup occurring with
> interrupts disabled and the action with interrupts enabled. Scheme's
> dynamic wind is similar, though I don't believe the scheme standard
> addresses interrupts and we don't need to worry about continuations,
> but some of the issues are similar.  Probably we would want two
> flavors, one in which the action has to be a function that takes as a
> single argument the result produced by the setup code, and one in
> which the action can be an argument expression that is then evaluated
> at the appropriate place by lazy evaluation.
> 
> This can be done at the R level except for the controlling of
> interrupts (and possibly other asynchronous stuff)-- that would need a
> new pair of primitives (suspendInterrupts/enableInterupts or something
> like that).  There is something in the Haskell literature on this that
> I have looked at a while back -- probably time to have another look.

Luke,

  A similar problem is that if optionList contains an illegal
option then setting options(optionList) will commit changes
to .Options as it works its way down the optionList until it
hits the illegal option, when it throws an error.  Then the
following on.exit is never called (it wouldn't have the output
of options(optionList) to work on if it were called) and the
initial settings in optionList stick around forever.  E.g.,

  > withOptions <- function(optionList, expr) {
  + oldOpt <- options(optionList)
  + on.exit(options(oldOpt))
  + expr
  + }
  > getOption("height")
  NULL
  > getOption("width")
  [1] 80
  > withOptions(list(height=10, width=-2), 666)
  Error in options(optionList) :
invalid 'width' parameter, allowed 10...1
  > getOption("height")
  [1] 10
  > getOption("width")
  [1] 80

I haven't checked to see if par() works in the same way - it
does in S+.

An ignoreInterrupts(expr) function would not help in that case.
Making options() (and par()) atomic operations would help, but that
may be a lot of work.  options() might also warn but not change
.Options if there were an attempt to set an illegal option.
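
One way to sidestep the partial-commit problem at the R level is to record the
current values before calling options() and to register the cleanup first.
This is only a sketch under that assumption: the name withOptionsSafe and the
option name demo.height are hypothetical, and it does nothing about the
interrupt issue discussed above.

```r
withOptionsSafe <- function(optionList, expr) {
  # Record current values by name; options that are unset come back as NULL,
  # and restoring a NULL value removes the option again.
  oldOpt <- lapply(names(optionList), getOption)
  names(oldOpt) <- names(optionList)
  # Register the cleanup before any option is changed.
  on.exit(options(oldOpt))
  options(optionList)
  expr
}

# A failing call no longer leaves the first option behind.
res <- try(withOptionsSafe(list(demo.height = 10, width = -2), 666),
           silent = TRUE)
print(inherits(res, "try-error"))
print(getOption("demo.height"))
```

The design choice is simply to stop relying on the return value of options(),
which is never produced when the call fails part-way through.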

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
>
> On Thu, 28 Apr 2011, Jonathan Daily wrote:
>
> > I would also love to see this implemented in R, as my current solution
> > to the issue of doing tons of open/close, dev/dev.off, etc. is to use
> > snippets in my IDE, and in the end I feel like it is a hack job. A
> > pythonic "with" function would also solve most of the situations where
> > I have had to use awkward try or tryCatch calls. I would be willing to
> > help with this project, even if it is just testing.
> >
> > On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
> >  wrote:
> >>> but it's a little clumsy, because
> >>>
> >>> with_connection(file("myfile.txt"), {do stuff...})
> >>>
> >>> isn't very useful because you have no way to reference the connection
> >>> that you're using. Ruby's blocks have arguments which would require
> >>> big changes to R's syntax.  One option would be to use pronouns:
> >>
> >>  Looking very much like python 'with' statements:
> >>
> >> http://effbot.org/zone/python-with-statement.htm
> >>
> >>  Implemented via the 'with' statement which can operate on anything
> >> that has a __enter__ and an __exit__ method. Very neat.
> >>
> >> Barry
> >>
> >
> >
> >
> >
> 
> -- 
> Luke Tierney
> Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:  l...@stat.uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> 


Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Tal Galili
Hello Duncan,
Thank you for having a look at this.

I tried the code you provided, but it failed at the getForm stage.  Running
this:

> tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
+              hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
+              single = "true", gid = "0",
+              output = "csv",
+              .opts = list(followlocation = TRUE, verbose = TRUE))

resulted in the following error:

Error in curlPerform(url = url, headerfunction = header$update, curl = curl,  :
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed


Did I miss some step?





Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Fri, Apr 29, 2011 at 9:18 PM, Duncan Temple Lang wrote:

>
> Thanks David for fixing the early issues.
>
> The reason for the failure is that the response
> from the Web server is to redirect the requester
> to another page, specifically
>
>
> https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
>
> Note that this is https, not http, and the built-in URL reading facilities
> in R don't support https.
>
>
> One way to see this is to look at the headers in your browser (e.g.
> Live HTTP Headers),
> or to use curl, or the RCurl package:
>
> tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
>              hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
>              single = "true", gid = "0",
>              output = "csv",
>              .opts = list(followlocation = TRUE, verbose = TRUE))
>
>
> The verbose option shows the entire dialog, and tt contains the
> text of the CSV document.
>
>  read.csv(textConnection(tt))
>
> then yields the data frame
>
>  D.
>
>
> On 4/29/11 10:36 AM, David Winsemius wrote:
> >
> > On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
> >
> >> Hello all,
> >> I wish to use read.csv to read a google doc spreadsheet.
> >>
> >> I try using the following code:
> >>
> >> data_url <- "
> >>
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> >>
> >> "
> >> read.csv(data_url)
> >>
> >> Which results in the following error:
> >>
> >> Error in file(file, "rt") : cannot open the connection
> >>
> >>
> >> I'm on windows 7.  And the code was tried on R 2.12 and 2.13
> >>
> >> I remember trying this a few months ago and it worked fine.
> >
> > I am always amused at such claims. Occasionally they are correct, but
> more often a crucial step has been omitted. In
> > this case you have at a minimum embedded line-feeds in your URL string
> and have not established a connection, so it
> > could not possibly have succeeded as presented.
> >
> > But now it's time to admit I do not know why it is not succeeding when I
> correct those flaws.
> >
> >> closeAllConnections()
> >> data_url <-
> > url("
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> ")
> >
> >> read.csv(data_url)
> > Error in open.connection(file, "rt") : cannot open the connection
> >
> >> closeAllConnections()
> >> dd <- read.csv(con <-
> > url("
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> "))
> >
> > Error in open.connection(file, "rt") : cannot open the connection
> >
> >
> > So, I guess I'm not reading the help pages for `url` and `read.csv` as
> > well as I thought I was.
> >
> >
> >> Any suggestion what might be causing this or how to solve it?
> >
> >
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
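
A note on the read.csv(textConnection(tt)) step quoted above. The following is a minimal offline sketch, assuming only that tt holds CSV text; the string used here is made up and stands in for whatever getForm() would return. It also shows one detail that is easy to miss: textConnection() returns an already-open connection, which read.csv() leaves open.

```r
# Hypothetical stand-in for the CSV text returned by getForm().
tt <- "id,value\n1,3.5\n2,4.25\n"

# textConnection() hands back an already-open connection, so close it
# explicitly once read.csv() is done with it.
con <- textConnection(tt)
D <- read.csv(con)
close(con)

# Equivalent shortcut available in more recent versions of R.
D2 <- read.csv(text = tt)
```

Either form gives the same data frame; the second avoids managing the connection by hand.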


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] strange fluctuations in system.time with kernapply

2011-04-29 Thread Alexander Senger

Hello expeRts,


here is something which strikes me as kind of odd and I would like to ask for 
some enlightenment:

First let's do this:

tkern <- kernel("modified.daniell", c(5,5))
test <- rep(1,100)
system.time(kernapply(test,tkern))
   User  System verstrichen
  1.100   0.040   1.136

That was easy. Now this:

test <- rep(1,110)
system.time(kernapply(test,tkern))
   User  System verstrichen
   1.40    0.02    1.43

Still fine. Now this:

test <- rep(1,111)
system.time(kernapply(test,tkern))
   User  System verstrichen
  1.390   0.020   1.409

Ok, by now it seems boring. But wait:

test <- rep(1,1110300)
system.time(kernapply(test,tkern))
   User  System verstrichen
 12.270   0.030  12.319

There is a sudden - and repeatable! - jump in the time needed to execute kernapply. At least from a 
naive point of view there should not be much difference between applying a kernel to a vector 
111 or 1110300 entries long. But maybe there is some limit here?


So I tried this:

test <- rep(1,1110400)
system.time(kernapply(test,tkern))
   User  System verstrichen
   1.96    0.01    1.97

which doesn't fit into the pattern. But the best thing is still to come. When I 
try this

test <- rep(1,1110308)
system.time(kernapply(test,tkern))

then the computer starts to run and keeps going for more than 15 minutes, at which point I normally 
kill the process. As noted above, this behaviour is repeatable and occurs every time I issue these commands.


I really would like to know if there is some magic to the number 1110308 I'm 
not aware of.
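
A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: for plain vectors, kernapply() appears to compute the convolution with fft(), and ?fft notes that the transform is fastest when the series length is highly composite and may take a long time otherwise. Under that assumption, the timing would depend on the prime factorization of the vector length rather than on the length alone. The helper largest_prime_factor() below is a made-up name for illustration; nextn() is the stats function that returns the next length built only from the factors 2, 3 and 5.

```r
# Largest prime factor of n by trial division (fine for lengths of this size).
largest_prime_factor <- function(n) {
  p <- 2
  largest <- 1
  while (p * p <= n) {
    while (n %% p == 0) {
      largest <- p
      n <- n / p
    }
    p <- p + 1
  }
  if (n > 1) largest <- n
  largest
}

n <- 1110308
largest_prime_factor(n)   # a large value here would point to a slow fft()
n_fast <- nextn(n)        # next length with only 2, 3 and 5 as factors
largest_prime_factor(n_fast)
```

If the slow lengths turn out to have a large prime factor, comparing system.time() at n and at nextn(n) would be one way to check the idea.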


Last but not least, here is my

sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=de_DE.utf8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.utf8        LC_COLLATE=de_DE.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=de_DE.utf8
 [7] LC_PAPER=de_DE.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1


Thank you,

Alex

--
Dipl.-Phys. Alexander Senger      Tel   : +49 30 2093 4941
Humboldt-Universitaet zu Berlin   Fax   : +49 30 2093 4718
AG Quantenoptik und Metrologie
Hausvogteiplatz 5-7               Email : sen...@physik.hu-berlin.de
10117 Berlin, Germany



Re: [R] Kolmogorov-Smirnov test

2011-04-29 Thread Greg Snow
The general idea of the KS test (and others) can be applied to discrete data, 
but the implementation in R assumes continuous data (it does not have the needed 
adjustments to deal with ties).  The chi-square and other tests suffer from the 
same problems in your case.  In all cases the null hypothesis is that the data 
come from the stated distribution (Poisson in your case); failing to reject 
the null hypothesis does not prove that the data come from that distribution, 
it only shows that we cannot disprove it.  With large sample sizes, your data 
could come from a true distribution that for all practical purposes is 
equivalent to the Poisson, but that due to slight rounding or other errors has 
slightly different probabilities for some values (a difference that no one 
would reasonably care about), and these tests can still show a significant 
difference.

Usually it is better to just show that your data and the theoretical 
distribution are close enough to each other rather than depending on a formal 
test.  The plots and diagnostics in the vcd package are a good choice here, you 
could also use the KS test statistic (ignoring the p-value and warnings) as 
another measure, but plot the empirical and theoretical distributions to see 
what the value means and how close they are.

Another option is the vis.test function in TeachingDemos, it lets you plot data 
from the theoretical distribution and the actual data, then see if you can 
visually tell the difference.
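
As a rough illustration of the "compare and plot" idea above, here is a minimal sketch in base R. The counts are simulated, not the poster's data, and the names lambda_hat and ks_distance are chosen here for illustration. It computes the largest gap between the empirical and fitted Poisson distribution functions as a descriptive distance only, with no p-value attached.

```r
set.seed(1)
x <- rpois(200, lambda = 3)      # hypothetical count data, not the poster's
lambda_hat <- mean(x)            # maximum-likelihood estimate of the Poisson mean

k <- 0:max(x)                    # support points where both step functions jump
emp_cdf  <- ecdf(x)(k)           # empirical distribution function
theo_cdf <- ppois(k, lambda_hat) # fitted Poisson distribution function

# KS-type distance, used purely as a measure of closeness.
ks_distance <- max(abs(emp_cdf - theo_cdf))

if (interactive()) {
  plot(k, emp_cdf, type = "s", ylim = c(0, 1),
       xlab = "count", ylab = "cumulative probability")
  lines(k, theo_cdf, type = "s", lty = 2)
  legend("bottomright", legend = c("empirical", "fitted Poisson"), lty = 1:2)
}
```

Looking at the plot next to the number makes it easier to judge whether a given distance matters in practice.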

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of m.marcinmichal
> Sent: Thursday, April 28, 2011 3:54 PM
> To: r-help@r-project.org
> Subject: Re: [R] Kolmogorov-Smirnov test
> 
> Hi,
> thanks for the response.
> 
> >> The Kolmogorov-Smirnov test is designed for distributions on
> >> continuous variables, not discrete ones like the Poisson.  That is
> >> why you are getting some of your warnings.
> 
> I read in "Fitting distributions with R" by Vito Ricci, page 19, that
> "... Kolmogorov-Smirnov test is used to decide if a sample comes from a
> population with a specific distribution. It can be applied both for
> discrete (count) data and continuous binned (even if some Authors do
> not agree on this point) and both for continuous variables", but on
> page 16 I read that "... while the Kolmogorov-Smirnov and
> Anderson-Darling tests are restricted to continuous distribution". I
> was a little confused, but tried this test on my discrete data anyway.
> 
> Generally, as a first step, I try to fit my data to a discrete or
> continuous distribution (task: find a distribution for the empirical
> data). Question: can I approximate my discrete data by a continuous
> distribution? I know that sometimes the Poisson distribution can be
> approximated by the normal distribution. But what happens if I use
> another distribution, like the log-normal or gamma?
> 
> I did three more tests, all chi-square tests, but they return three
> different results. Suppose that we have the same data, i.e.
> vectorSentence. Tests:
> 1. One
> param <- fitdistr(vectorSentence, "poisson")
> chisq.test(table(vectorSentence), p = dpois(1:9, lambda=param[[1]][1]),
> rescale.p = TRUE)
> 
> X-squared = 272.8958, df = 8, p-value < 2.2e-16
> 
> 2. Two
> library(vcd)
> gf <- goodfit(vectorSentence, type="poisson", method="MinChisq")
> summary(gf)
> 
>  X^2 df P(> X^2)
> Pearson 404.3607  8 2.186332e-82
> 
> 3. Three
> fdistc <- fitdist(vectorSentence, "pois")
> g<-gofstat(fdistc, print.test = TRUE)
> 
> Chi-squared statistic:  535.344
> Degree of freedom of the Chi-squared distribution:  8
> Chi-squared p-value:  1.824112e-110
> 
> Question: which result is correct?
> 
> I know that I can reject the null hypothesis: the data do not come from
> a Poisson distribution. But which result is correct?
> 
> On the other hand, I am trying to solve another problem:
> 1. Suppose that we have reference data (dr) from some process (pr),
> saved in vectorSentence.
> 2. Suppose that we have two other data samples, d1 and d2, from two
> other processes, p1 and p2.
> 3. We know that all the data are discrete.
> 
> Task:
> One: check whether the data d1, d2 are equal to the reference data (dr).
> This is not a problem: I use a cdf, a histogram, other measures etc.,
> and a chi-square test. But can I use Kolmogorov-Smirnov to test the
> cumulative distribution function hypothesis, i.e. F(d1) = F(d), for my
> data?
> Two: find the distribution of dr, discrete or, if possible, continuous.
> 
> Best
> 
> Marcin M.
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-
> Smirnov-test-tp3479506p3482349.html
> Sent from the R help mailing list archive at Nabble.com.
> 

Re: [R] fisher exact for > 2x2 table

2011-04-29 Thread Jeremy Miles
On 29 April 2011 08:43, viostorm  wrote:
>
> After I shared comments from the forum yesterday with the biostatistician he
> indicated this:
>
> "Fisher's exact test is the non-parametric analog for the Chi-square
> test for 2x2 comparisons. A version (or extension) of the Fisher's Exact
> test, known as the Freeman-Halton test applies to comparisons for tables
> greater than 2x2. SAS can calculate both statistics using the following
> instructions.
>
>  proc freq; tables a * b / fisher;"
>


SAS documentation says:

"Fisher's exact test was extended to general R×C tables by Freeman and
Halton (1951), and this test is *also* known as the Freeman-Halton
test."

Emphasis mine.

Jeremy



-- 
Jeremy Miles
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com



Re: [R] fisher exact for > 2x2 table

2011-04-29 Thread Mike Miller

Rob--

Your biostatistician has not disagreed with the rest of us about anything 
except for his preferred name for the test.  He wants to call it the 
Freeman-Halton test, some people call it the Fisher-Freeman-Halton test, 
but most people call it the Fisher Exact test -- all are the same test. 
When he was "adamant you could not do > 2x2", what he was being adamant 
about was the name you should use when referring to the test for tables 
larger than 2x2.  Why he was doing that, I don't know, but I think it is 
silly -- he confused you and the rest of us.


He goes on to tell you that to get the Freeman-Halton test in SAS, you use 
"tables a * b / fisher".  In other words, SAS calls the test "Fisher" 
instead of calling it Freeman-Halton.  R also calls it "Fisher" and not 
Freeman-Halton.  I'm like R and SAS and unlike your biostatistician, but 
to each his own.


You say that he is "exceptionally clear on this point," which may be true, 
but what is the point?  The point is that he prefers a different *name* 
for the test than the rest of us.  Everyone agrees on the math/stat.
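
To make the point concrete, here is a minimal sketch showing that R's fisher.test() accepts a table larger than 2x2 directly. The 4x2 counts and the dimension names are invented for illustration and are not Rob's data.

```r
# Hypothetical 4 x 2 table of counts with several small cells.
tab <- matrix(c(3, 1,
                2, 4,
                5, 0,
                1, 6),
              nrow = 4, byrow = TRUE,
              dimnames = list(group = c("g1", "g2", "g3", "g4"),
                              outcome = c("yes", "no")))

res <- fisher.test(tab)   # same function as for 2 x 2 tables
res$method
res$p.value
```

Whatever name one prefers for the test, the call is the same one used for the 2x2 case.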


Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota


On Fri, 29 Apr 2011, viostorm wrote:



After I shared comments from the forum yesterday with the biostatistician he
indicated this:

"Fisher's exact test is the non-parametric analog for the Chi-square
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact
test, known as the Freeman-Halton test applies to comparisons for tables
greater than 2x2. SAS can calculate both statistics using the following
instructions.

 proc freq; tables a * b / fisher;"

Do people here still stand by the position that Fisher's exact test can be
used for RxC contingency tables?  Sorry to bother you all so much; it is just
important for a paper I am writing and planning to submit soon. (I have a 4x2
table, but it does not meet the expected-frequency requirements for
chi-squared.)

I guess people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but whose titles at
least indicate that the test is extended to RxC:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.

Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again,

-Rob




Re: [R] Conditonal Rank

2011-04-29 Thread Dennis Murphy
Hi:

Does this work?

library(plyr)
ddply(tmp, .(trial, Gender), transform, rankscore = rank(score))
  score trial Gender rankscore
1     1     1      M         1
2     2     1      M         2
3     3     1      F         1
4     4     1      F         2
5     4     2      M         2
6     3     2      M         1
7     2     2      F         2
8     1     2      F         1

Alternatively, you could get the 'wide form' with

aggregate(score ~ trial + Gender, data = tmp, FUN = rank)
  trial Gender score.1 score.2
1     1      M       1       2
2     2      M       2       1
3     1      F       1       2
4     2      F       2       1

HTH,
Dennis
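
For completeness, a base R alternative that needs no extra package: ave() applies a function within groups and returns the results in the original row order. This is only a sketch of another route to the same ranks, using the tmp data frame from the original question.

```r
tmp <- data.frame(score  = c(1, 2, 3, 4, 4, 3, 2, 1),
                  trial  = gl(2, 4),
                  Gender = gl(2, 2, 8, labels = c("M", "F")))

# Rank score within each trial-by-Gender cell; rows stay in place.
tmp$rankscore <- with(tmp, ave(score, trial, Gender, FUN = rank))
tmp
```

Because the result is aligned with the rows of tmp, no merging step is needed afterwards.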


On Fri, Apr 29, 2011 at 12:26 PM, Doran, Harold  wrote:
> Suppose I have data such as
>
> tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender = 
> gl(2,2,8, labels=c('M', 'F')))
>
> Now I would like to compute a rank on the variable score conditional on trial 
> and gender. I could do
>
> res <- with(tmp, tapply(score, list(Gender, trial), rank))
> res[,1]
> res[,2]
>
> and then finagle a way to create a new variable in the dataframe tmp that has 
> these ranks associated with the correct rows. But, perhaps there is a better 
> way. Any suggestions?
>
>



Re: [R] Problem with qualitative variables in anova

2011-04-29 Thread katerinaaa
Yes, I also posted in the other forum because I didn't get an answer here.

Thanks for your reply

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3484599.html
Sent from the R help mailing list archive at Nabble.com.



[R] --mem-vsize in R

2011-04-29 Thread kparamas
Hi,

I am calculating pairwise correlation coefficients for a matrix of 234 X 3.
I am getting the following error:
Error in cbind(as.vector(row(cl)), as.vector(col(cl)), as.vector(cl)) : 
  allocMatrix: too many elements specified
In addition: There were 50 or more warnings (use warnings() to see the first
50)

The function used is,
corGraphPearson = function(cData, COR) #COR is threshold 0.5,0.7, etc
{

cl = unname(cor(cData, use="pairwise.complete.obs", method="pearson"))

result = cbind(as.vector(row(cl)),as.vector(col(cl)),as.vector(cl))
result = result[result[,1] != result[,2],]

corm = result

# remove low cor pairs
corm =corm[abs(corm[,3]) >= COR, ]
# the network
net <- network(corm, directed = F)
}


I am running this in a cluster with 4 machines with 24 GB memory each.

How should I start R so that I make maximum use of the memory available?
Or how to overcome this issue?
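
A hedged note on the error: "allocMatrix: too many elements specified" suggests that the matrix requested by cbind() has more cells than R allows in a single matrix, in which case starting R with more memory would not help. One way around it, sketched below on small made-up data, is to pick out only the pairs above the threshold with which(..., arr.ind = TRUE) instead of building the full table of all pairs first. Using upper.tri() keeps each undirected pair once, which is an assumption about what the network needs.

```r
set.seed(42)
cData <- matrix(rnorm(200 * 6), nrow = 200)   # hypothetical data: 200 rows, 6 variables
COR <- 0.05                                   # hypothetical threshold

cl <- unname(cor(cData, use = "pairwise.complete.obs", method = "pearson"))

# Row/column indices of the pairs that pass the threshold, upper triangle only.
keep <- which(abs(cl) >= COR & upper.tri(cl), arr.ind = TRUE)

# Three-column edge list: row index, column index, correlation.
corm <- cbind(keep, cor = cl[keep])
corm
```

The resulting corm has one row per retained pair, so its size grows with the number of strong correlations rather than with the square of the number of variables.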

--
View this message in context: 
http://r.789695.n4.nabble.com/mem-vsize-in-R-tp3484541p3484541.html
Sent from the R help mailing list archive at Nabble.com.



[R] For loop and sqldf

2011-04-29 Thread mathijsdevaan
Hi list,

Can anyone tell me why the following does not work? Thanks a lot! Your help
is very much appreciated.

DF = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
list<-sort(unique(DF$C))
for (t in 1:length(list))
{
year = as.character(list[t])
data[year]<-sqldf('select * from DF where C = [year]')
}

I am trying to split up the data.frame into 5 new ones, one for every year. 
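
Some likely reasons the loop fails, stated as assumptions since no error message was posted: the text [year] inside the quoted SQL string is not replaced by the value of the R variable year, data was never created as a list before data[year] is assigned, and naming a vector list masks the base function. If the goal is simply one data frame per year, base R's split() does that without a loop or SQL. A minimal sketch with the posted data (the name by_year is made up here):

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE)

# One data frame per distinct value of C, stored in a named list.
by_year <- split(DF, DF$C)
names(by_year)
by_year[["1999"]]
```

Each element can then be used on its own, for example by_year[["1995"]], or processed in turn with lapply().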


--
View this message in context: 
http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484559.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] fisher exact for > 2x2 table

2011-04-29 Thread Brian S Cade
Rob:   Fisher's exact test is conceptually possible for any r x c 
contingency table problem and uses the observed multinomial table 
probability as the test statistic.   Other tests for r x c contingency 
tables use a different test statistic (Chi-squared, likelihood ratio, 
Zelterman's).  It is possible that the probabilities for any of these 
procedures may differ slightly for the same table configuration even if 
the probabilities for each test are calculated by enumerating all possible 
permutations (hypergeometric) under the null hypothesis.   See Mielke and 
Berry 2007 (Permutation Methods:  A distance function approach) Chps 6 
and 7.   Mielke has provided efficient Fortran algorithms for enumerating 
the exact probabilities for 2x2, 3x2, 4x2, 5x2, 6x2, 3x3, and even 2x2x2 
tables for Fisher's exact and Chi-square statistics.   I don't remember 
whether Cyrus Mehta's algorithms for Fisher's exact can do more.  But the 
important point to keep in mind is that it is possible to use different 
statistics for evaluating the same null hypothesis for r x c tables 
(Fisher's exact uses one form, Chi-square uses another, etc.) and the 
probabilities can be computed by exact enumeration of all permutations 
(what people expect Fisher's exact to do but also possible for Chi-square 
statistic) or by some approximation (asymptotic distribution, Monte Carlo 
resampling).  The complete enumeration of test statistics under the null 
becomes computationally intractable for large dimension r x c problems 
whether using the observed table probability (like Fisher's exact) as a 
test statistic or other like Chi-square statistic.

So in short, yes you can use Fisher's exact on your 4 x 2 problem, and the 
result might differ from using a Chi-square statistic even if you compute 
the P-value for the Chi-square test by complete enumeration.   Note that 
the minimum expected cell size for the Chi-square test is related to 
whether the Chi-square distributional approximation (an asymptotic 
argument) for evaluating the Chi-square statistic will be reasonable and 
is irrelevant if  you calculate your probabilities by exact enumeration of 
all permutations.
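
The distinction between the test statistic and the way its P-value is obtained can be sketched in R as follows. The 4 x 2 counts are invented for illustration. Note that chisq.test() with simulate.p.value = TRUE uses Monte Carlo resampling, so it only approximates the complete enumeration described above; it is used here as a stand-in.

```r
# Hypothetical 4 x 2 table with small expected counts.
tab <- matrix(c(3, 1,
                2, 4,
                5, 0,
                1, 6), nrow = 4, byrow = TRUE)

set.seed(2011)

# Table probability as the statistic, exact P-value.
p_fisher <- fisher.test(tab)$p.value

# Chi-square statistic, P-value from Monte Carlo resampling of tables.
p_chisq_mc <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)$p.value

# Chi-square statistic, P-value from the asymptotic approximation
# (R warns here because the expected counts are small).
p_chisq_asym <- suppressWarnings(chisq.test(tab)$p.value)

c(fisher = p_fisher, chisq_mc = p_chisq_mc, chisq_asym = p_chisq_asym)
```

The three values need not agree, which is the point: they differ in the statistic, in the reference distribution, or in both.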

Brian
 

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
viostorm 
To:
r-help@r-project.org
Date:
04/29/2011 01:23 PM
Subject:
Re: [R] fisher exact for > 2x2 table
Sent by:
r-help-boun...@r-project.org




After I shared comments from the forum yesterday with the biostatistician 
he indicated this:

"Fisher's exact test is the non-parametric analog for the Chi-square 
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact 
test, known as the Freeman-Halton test applies to comparisons for tables 
greater than 2x2. SAS can calculate both statistics using the following 
instructions.

  proc freq; tables a * b / fisher;"

Do people here still stand by the position that Fisher's exact test can be
used for RxC contingency tables?  Sorry to bother you all so much; it is just
important for a paper I am writing and planning to submit soon. (I have a 4x2
table, but it does not meet the expected-frequency requirements for
chi-squared.)

I guess people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but whose titles at
least indicate that the test is extended to RxC:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.
 
Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for 
Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again, 

-Rob



viostorm wrote:
> 
> Thank you all very kindly for your help.
> 
> -Rob
> 
> 
> Robert Schutt III, MD, MCS 
> Resident - Department of Internal Medicine
> University of Virginia, Charlottesville, Virginia
> 



--
View this message in context: 
http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html

Sent from the R help mailing list archive at Nabble.com.


Re: [R] Speed up code with for() loop

2011-04-29 Thread hck
Barth sent me some very good code and I modified it a bit. Have a look:

Error<-rnorm(1000, mean=0, sd=0.05)
estimate<-(log(1+0.10)+Error)

DCF_korrigiert<-(1/(exp(1/(exp(0.5*(-estimate)^2/(0.05^2))*sqrt(2*pi/(0.05^2
))*(1-pnorm(0,((-estimate)/(0.05^2)),sqrt(1/(0.05^2))-1))
DCF_verzerrt<-(1/(exp(estimate)-1))

S <- 1000   # total sample size
D <- 1  # number of subsamples
Subset <- 1  # number in each subsample
Select <- matrix(sample(S,D*Subset,replace=TRUE),nrow=Subset,ncol=D)

DCF_korrigiert_select <- matrix(DCF_korrigiert[Select],nrow=Subset,ncol=D)
Delta_ln <-(log(colMeans(DCF_korrigiert_select, na.rm=T)/(1/0.10)))



The only problem I discovered is that R cannot handle more than
2.147.483.647 integers, so the number of cells in the matrix is bounded by
this limit. (R shows the maximum when you type: .Machine$integer.max). And if
you want to save the workspace, the file for 10.000 times 10.000 becomes
around 2 GB, compared to the original of "just" 300 MB. 

So I cannot perform my previous bootstrap with 1.000.000 times 100.000. But
nevertheless 10.000 times 10.000 seems to be sufficient; I have to say it's
amazing how fast the idea works.

Does anybody have a suggestion for how to make it work for the 1.000.000
times 100.000 bootstrap?
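
One possible approach, sketched under the assumption that only the column means are needed in the end: build the index matrix in blocks of columns, so that no single matrix ever holds Subset times D cells. The sizes are deliberately small and x is a simulated stand-in for DCF_korrigiert; the names chunk and boot_means are made up for this example.

```r
set.seed(1)
x <- rnorm(1000)      # stand-in for DCF_korrigiert
S <- length(x)        # total sample size
D <- 200              # number of subsamples
Subset <- 50          # number of draws in each subsample
chunk <- 40           # number of subsamples processed per block

boot_means <- numeric(D)
for (start in seq(1, D, by = chunk)) {
  cols <- start:min(start + chunk - 1, D)
  # Index matrix for this block only: Subset rows, length(cols) columns.
  idx <- matrix(sample(S, Subset * length(cols), replace = TRUE), nrow = Subset)
  boot_means[cols] <- colMeans(matrix(x[idx], nrow = Subset), na.rm = TRUE)
}
summary(boot_means)
```

Memory use is then governed by Subset times chunk rather than Subset times D, at the cost of a short loop over blocks.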


--
View this message in context: 
http://r.789695.n4.nabble.com/Speed-up-code-with-for-loop-tp3481680p3484548.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem with qualitative variables in anova

2011-04-29 Thread David Winsemius

Are you working on the same homework problem as user494766?

http://stackoverflow.com/questions/5835605/1-way-anova-in-r-help

--  
David.


On Apr 29, 2011, at 10:43 AM, katerinaaa wrote:


Hi,
I am a newbie in R programming and I need some help.

I have two columns: the first has 1000 values Y/N/U and the other has  
f/m.

Like this:

q7  sex
==
U   m
U   f
U   m
N   f

I want to do a one-way ANOVA, both parametric and non-parametric.
But I have some problems.

Code:

frameq7 <- data.frame(q7,sex)
frameq7

r <- aov(q7 ~ sex, data = frameq7)
summary(r)

I get
Error in storage.mode(y) <- "double" :
invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
using type="numeric" with a factor response will be ignored

Could you please help me to make it right?

And finally, how can I present this analysis? With a boxplot?
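
A hedged note on the error: aov() needs a numeric response, and q7 is a factor, which is what the message about storage.mode is complaining about. When both variables are categorical, a contingency-table analysis is the usual alternative to ANOVA. The sketch below uses simulated values, not the poster's data, and a mosaic plot in place of a boxplot, since a boxplot also needs a numeric variable.

```r
set.seed(10)
q7  <- factor(sample(c("Y", "N", "U"), 1000, replace = TRUE))  # hypothetical answers
sex <- factor(sample(c("f", "m"), 1000, replace = TRUE))       # hypothetical sexes

tab <- table(q7, sex)     # cross-tabulation of the two factors
res <- chisq.test(tab)    # test of independence between q7 and sex
res

if (interactive()) {
  mosaicplot(tab, main = "q7 by sex")
}
```

Whether this matches the intended analysis depends on what the assignment actually asks for.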

Thanks a lot

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3483845.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, David Winsemius wrote:


On Apr 29, 2011, at 1:29 PM, Mike Miller wrote:


On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as "expected min 
and max", in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He seemed 
to like the truncated normal answers, so we'll let those be his answers.


It is possible to choose parameters for a normal distribution with 500 
observations such that the expected value of the maximum is .8 and the 
expected value of the minimum is .2.  Obviously, the mean would be .5, 
not 1, but what would the variance then have to be to provide the 
correct expected max and min?  That's another legitimate question.


You would need to specify an N since the expected first and last order 
statistic would decrease/increase with increasing N.


Right -- I chose N=500, as did the OP.  I think the order statistics for 
the normal are pretty complex, but it wouldn't be hard to use the density 
for order statistics for the uniform to compute the appropriate values for 
a standard normal, then rescale.


http://en.wikipedia.org/wiki/Order_statistic#The_order_statistics_of_the_uniform_distribution

You'd have to multiply the beta density times the inverse normal cdf and 
get the weighted average for a set of points.  It doesn't sound terribly 
difficult but I don't want to do it!  ;-)
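
For anyone who does want to do it, here is a minimal sketch of that calculation. It relies on the fact that the largest of n independent uniforms has a Beta(n, 1) density, and takes the weighted average of the inverse normal cdf over an even grid of points. The function name, the grid size, and the use of a simple midpoint grid are choices made here for illustration.

```r
# Expected maximum of n standard normal draws, as a weighted average of
# qnorm(u) with weights from the Beta(n, 1) density of the largest uniform.
expected_max_std_normal <- function(n, grid_size = 1e6) {
  u <- (seq_len(grid_size) - 0.5) / grid_size   # midpoints of an even grid on (0, 1)
  w <- dbeta(u, n, 1)                           # density of the largest of n uniforms
  sum(qnorm(u) * w) / sum(w)
}

n <- 500
e_max <- expected_max_std_normal(n)

# By symmetry the expected minimum is -e_max, so the mean sits halfway
# between the targets and the standard deviation rescales e_max to 0.3.
mu    <- (0.8 + 0.2) / 2
sigma <- (0.8 - mu) / e_max
c(mean = mu, sd = sigma)
```

A quick simulation with rnorm(500, mu, sigma), repeated many times and averaging the maxima and minima, would be one way to check the result.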


Mike



Re: [R] Using Java methods in R

2011-04-29 Thread Robert Baer

-- snip --

It clogs up my email, takes a long
time to delete, and is hard to be selective enough to not
delete some of my other important email.

-- snip --

If you don't care about contributing to the R listserve community, it's hard 
to imagine why that community should care about you.


Some people (not me) seem to use nabble [ http://www.nabble.com/ ] to 
monitor the list.  See "R" under "what is cool".  Another option is to set 
up "rules" in your email client to direct your mail to appropriate 
folders, or if you use gmail I guess we would say to "label" your R listserve 
email.


You can search mail archives for a topic of interest with the R command line 
command

RSiteSearch().  To learn more type
?RSiteSearch

For fun I put rJava rectangular arrays into this search engine (having no 
idea what  that means) and one of the things that came out was:

http://finzi.psych.upenn.edu/R/library/rJava/html/jrectRef-class.html

Hopefully, this or one of the other things can be useful to you.

Finally for the third time, try joining/looking at:
stats-rosuda-devel:
  http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel
or the archive:
 http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/

--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965



[R] Bigining with a Program of SVR

2011-04-29 Thread ypriverol
Hi:
  I'm starting research on Support Vector Regression. I want to obtain a
  model to predict a property A from a set of properties B, C, D, ...  This
  problem is very common, for example in QSAR models. I would like to know of
  some examples and packages that could help me with this. I know about caret
  and e1071, but I don't know whether these packages can work with continuous
  variables.
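
On the last point, a minimal sketch: svm() in e1071 supports regression on a continuous response through type = "eps-regression". The data below are simulated, and the column names A, B, C, D simply mirror the properties named in the question. The fit is skipped if e1071 is not installed.

```r
set.seed(7)
dat <- data.frame(B = runif(100), C = runif(100), D = runif(100))
dat$A <- 2 * dat$B - dat$C + 0.5 * dat$D + rnorm(100, sd = 0.1)  # hypothetical property A

if (requireNamespace("e1071", quietly = TRUE)) {
  # Epsilon-regression SVM with the default radial kernel.
  fit  <- e1071::svm(A ~ B + C + D, data = dat, type = "eps-regression")
  pred <- predict(fit, newdata = dat)
  rmse <- sqrt(mean((pred - dat$A)^2))   # in-sample error, for illustration only
  print(rmse)
} else {
  message("e1071 is not installed; skipping the SVR fit")
}
```

For a real QSAR model the error should be estimated on held-out data or by cross-validation rather than in-sample.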

Thanks in advance

--
View this message in context: 
http://r.789695.n4.nabble.com/Bigining-with-a-Program-of-SVR-tp3484476p3484476.html
Sent from the R help mailing list archive at Nabble.com.



[R] using R in C#

2011-04-29 Thread Lodha, Akhil
Hi,

I've been able to use R in C# on my machine by following the steps at 
http://www.codeproject.com/KB/cs/RtoCSharp.aspx.

This works locally, i.e. if R is running on my box. I was wondering if it's 
possible to change it so that I can connect to another machine that is running 
R (and has rscproxy installed).

This way a lot of people can use R in C# without having to first install it on 
their boxes, as long as it's installed on another box that they can connect to.

Thanks,
Akhil



[R] RCurl and postForm()

2011-04-29 Thread Elmore, Ryan
Hi everybody,

I think that I am missing something fundamental in how strings are passed from 
a postForm() call in R to the curl or libcurl functions underneath.  For 
example, I can do the following using curl from the command line:

$ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]

Trying the same thing, or what I *think* is the same thing (obviously not), in R 
(Mac OS 10.6.7, R 2.13.0) produces:

> library(RCurl)
Loading required package: bitops
> api <- "http://www.datasciencetoolkit.org/text2people"
> postForm(api, a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
                charset 
"text/html"     "utf-8" 

I can match the result given on the DSTK API's website by using system(), but 
that doesn't seem like the R-like way of doing things.

> system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]

If you want to see some additional information related to this question, I 
posted on StackOverflow a few days ago:
http://stackoverflow.com/questions/5797688/post-request-using-rcurl

I am working on this R wrapper for the data science toolkit as a way of 
illustrating how to make an R package for the Denver RUG and ran into this 
problem.  Any help with this problem will be greatly appreciated by the Denver 
RUG!
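
A possible explanation, offered as a sketch rather than a confirmed diagnosis: `curl -d` sends the string as the raw request body, whereas postForm(api, a = ...) sends it as a named form field, so the service may be parsing extra form text around the name (which would fit the shifted start_index values). Assuming that is the cause, the raw body can be sent with RCurl's curlPerform() and its postfields option. The helper name post_raw_text is hypothetical, and the call needs network access and a reachable service.

```r
# Sketch only: send a string as the raw POST body, as `curl -d` does.
# post_raw_text() is a hypothetical helper name, not part of RCurl.
post_raw_text <- function(url, text) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required")
  }
  reader <- RCurl::basicTextGatherer()      # collects the response body
  RCurl::curlPerform(url = url,
                     postfields = text,     # raw body, no field name
                     writefunction = reader$update)
  reader$value()
}

# Usage (not run here; requires network access):
# api <- "http://www.datasciencetoolkit.org/text2people"
# post_raw_text(api, "Archbishop Huxley")
```

Keeping the request inside R avoids the system() call and returns the response as an ordinary character string.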

Cheers,
Ryan



Re: [R] Analysis and graphics by groups

2011-04-29 Thread Cristiano Yuji Sasada Sato
Hello,

This is my first post in this e-mail list and I hope it's enough to justify
calling for help. In case it's not, sorry.

I'm trying to do analysis and graphics using a factor as a criterion to split
data and do the analysis/graphics for each subset of data.

Right now what I'm trying to do is to fit and plot the following logistic
model, according to a third variable named "Cerca":
dm_fit_T<-nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)

I've found a function called gapply which seems to be what I need, but it
doesn't seem to work. This is the argument I've used:
gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups="Cerca")

But I get this error message returned:
> Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'FUN' of mode 'function' was not found

Can you help me get this non-linear regression by groups to work?

Also, once the regression works, I need to fit a line to
the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model above. I've used
plotfit to do that with one nlm data set. Is it possible to plot each group's
trend line and data with different colours/symbols in the same graphic?
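
The error message most likely arises because FUN receives an already evaluated nls() call instead of a function. A minimal base-R sketch of the by-group fit follows, using split() and lapply() rather than gapply. The data are simulated, and the names toy, n_prev, n_next, logistic_step, make_group and fit_one are made up for illustration; the start values are assumptions that suit the simulated data only.

```r
set.seed(1)

# The model from the post: next density as a function of previous density.
logistic_step <- function(n_prev, K, r) {
  K / (1 + ((K - n_prev) / n_prev) * exp(-r))
}

# Simulated stand-in for 'perieph' (all names here are illustrative).
make_group <- function(label, K, r, n = 40) {
  n_prev <- runif(n, 0.5, 1.2 * K)
  data.frame(Cerca  = label,
             n_prev = n_prev,
             n_next = logistic_step(n_prev, K, r) + rnorm(n, sd = 0.05))
}
toy <- rbind(make_group("A", K = 5, r = 0.5),
             make_group("B", K = 8, r = 0.5))

# FUN must be a function of one subset, not an already evaluated nls() call.
fit_one <- function(d) {
  nls(n_next ~ K / (1 + ((K - n_prev) / n_prev) * exp(-r)),
      data = d, start = list(K = 6, r = 0.3))
}
fits  <- lapply(split(toy, toy$Cerca), fit_one)
coefs <- t(sapply(fits, coef))   # one row of (K, r) per group

# One plot, one colour and symbol per group, with each fitted curve.
if (interactive()) {
  grp <- factor(toy$Cerca)
  plot(n_next ~ n_prev, data = toy,
       col = as.integer(grp), pch = as.integer(grp))
  xs <- seq(min(toy$n_prev), max(toy$n_prev), length.out = 100)
  for (i in seq_along(fits)) {
    lines(xs, predict(fits[[i]], newdata = data.frame(n_prev = xs)), col = i)
  }
}
```

The same fit_one function could presumably be handed to a grouping helper that expects a function of a data frame; the split/lapply form is used here only because it needs no extra packages.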

Thank you,
Cristiano

-- 
Cristiano Yuji Sasada Sato
Doutorando
Programa de Pós-Graduação em Ecologia e Evolução - IBRAG / UERJ
Laboratório de Ecologia de Rios e Córregos
Departamento de Ecologia - Universidade do Estado do Rio de Janeiro



Re: [R] using lme4 with three nested random effects

2011-04-29 Thread Benjamin Caldwell
Thierry,
The first suggestion worked. Thank you very much.
*Ben Caldwell*

University of California, Berkeley
137 Mulford Hall #3114
Berkeley, CA 94720
Office 223 Mulford Hall
(510)859-3358
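
For readers wondering why the first suggestion quoted in this message helps: if transect and plot labels are reused within each site, interaction() builds ids that are unique across the whole data set, so the grouping factors line up row by row. A small sketch with made-up data (the data frame toy and the column transect_id are illustrative only):

```r
# Transect labels 1 and 2 are reused in both sites.
toy <- data.frame(site = c("A", "A", "B", "B"),
                  transect = c(1, 2, 1, 2))
toy$site <- factor(toy$site)

# Combine site and transect into one id per physical transect.
toy$transect_id <- interaction(toy$site, toy$transect, drop = TRUE)

nlevels(factor(toy$transect))  # number of reused labels
nlevels(toy$transect_id)       # number of distinct site-by-transect units
```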



On Fri, Apr 29, 2011 at 1:52 AM, ONKELINX, Thierry  wrote:

> Dear Ben,
>
> Are site, transect and plot factors? And do they have unique id's?
>
> You could try this
>
> rws30.UL$site <- factor(rws30.UL$site)
> rws30.UL$transect <- interaction(rws30.UL$site, rws30.UL$transect, drop = TRUE)
> rws30.UL$plot <- interaction(rws30.UL$site, rws30.UL$transect,
>                              rws30.UL$plot, drop = TRUE)
> modelincrBS<-glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
> +(1|site/transect/plot),
>  data=rws30.UL, family=gaussian, na.action=na.omit)
>
> Or
>
> modelincrBS<-glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
> +(1|site) + (1|transect) + (1|plot),
>  data=rws30.UL, family=gaussian, na.action=na.omit)
>
> Best regards,
>
> Thierry
>
>
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> thierry.onkel...@inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Caldwell
> > Sent: Friday 29 April 2011 0:37
> > To: r-help
> > Subject: [R] using lme4 with three nested random effects
> >
> > Hi all,
> > I'm trying to fit models for data with three levels of nested random
> > effects: site/transect/plot. For example,
> >
> > modelincrBS<-glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
> > +(1|site/transect/plot),
> > data=rws30.UL, family=gaussian, na.action=na.omit)
> >
> > but I get the following error:
> >
> > Error: length(f1) == length(f2) is not TRUE In addition:
> > Warning messages:
> > 1: In plot:(transect:site) :
> >   numerical expression has 92 elements: only the first used
> > 2: In plot:(transect:site) :
> >   numerical expression has 92 elements: only the first used
> >
> > The formulation works for two nested effects (e.g. 1|site/transect)
> >
> > I can get it to run in lme
> > modelincrBS<-lme(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num,
> > data=rws30.UL, random=(~1| site/transect/plot),na.action=na.omit)
> >
> > but I can't specify a distribution family in that package.
> >
> > Any help much appreciated.
> >
> > Ben Caldwell
> >


[R] Conditonal Rank

2011-04-29 Thread Doran, Harold
Suppose I have data such as

tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender = 
gl(2,2,8, labels=c('M', 'F')))

Now I would like to compute a rank on the variable score conditional on trial 
and gender. I could do

res <- with(tmp, tapply(score, list(Gender, trial), rank))
res[,1]
res[,2]

and then finagle a way to create a new variable in the dataframe tmp that has 
these ranks associated with the correct rows. But, perhaps there is a better 
way. Any suggestions?
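
One option, sketched here with the example data from this message, is base R's ave(), which applies a function within groups and returns the results in the original row order. The column name rank is an illustrative choice.

```r
tmp <- data.frame(score = c(1, 2, 3, 4, 4, 3, 2, 1),
                  trial = gl(2, 4),
                  Gender = gl(2, 2, 8, labels = c("M", "F")))

# Rank score within each Gender-by-trial cell; the result is aligned
# with the rows of tmp, so it can be stored directly as a new column.
tmp$rank <- with(tmp, ave(score, Gender, trial, FUN = rank))
tmp
```

Because the result is already aligned with the rows, no step is needed to match ranks back to the data frame.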



Re: [R] Regression Summary for a List

2011-04-29 Thread Phil Spector

Ryan -
   summary expects an lm object, and fit is a list.  So
you need to use something like

   lapply(fit,summary)

to pass each list element to the summary function.
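
As a sketch of how this extends to the R-squared values asked about, the summaries can be kept in a list and one component pulled from each. The matrices below are simulated stand-ins for monret and janret; the names summaries and r_squared are illustrative.

```r
set.seed(42)

# Simulated stand-ins for the poster's two 10-column matrices.
janret <- matrix(rnorm(50 * 10), ncol = 10)
monret <- 0.5 * janret + matrix(rnorm(50 * 10), ncol = 10)

# One regression per column, as in the original loop.
fit <- lapply(1:10, function(i) lm(monret[, i] ~ janret[, i]))

# Full summary for each fit, then just the R-squared from each summary.
summaries <- lapply(fit, summary)
r_squared <- sapply(summaries, function(s) s$r.squared)
r_squared
```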

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



On Fri, 29 Apr 2011, Ryan J. McGuigan wrote:


Hi,

I am trying to run a regression on two matrices with 10 columns.  I have
been able to run the regression with the following code:

fit=list()
for(i in 1:10) {
fit[[i]]=lm(monret[,i]~janret[,i])
}

However, I can't get the regression to spit out more than the coefficients
(summary(fit) does not work).  I really need the full summary for each of
the 10 regressions, including the R-squared values.  I'm sure there's a
simple way to do this I just can't seem to figure it out.

Thanks.

-Ryan



Re: [R] Convert filogenetic tree to binary matrix

2011-04-29 Thread vanderlei52
Hi Ben,

Thank you for your help.

I asked the same question on the r-sig-phylo mailing list. Liam Revell gave
the following solution: 

temp<-prop.part(tree)
X<-matrix(0,nrow=length(tree$tip.label),ncol=length(temp),dimnames=list(tree$tip.label,tree$node.label))
for(i in 1:ncol(X)) X[temp[[i]],i]<-1

Vanderlei




Re: [R] fisher exact for > 2x2 table

2011-04-29 Thread viostorm

After I shared comments from the forum yesterday with the biostatistician he
indicated this:

"Fisher's exact test is the non-parametric analog for the Chi-square 
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact 
test, known as the Freeman-Halton test applies to comparisons for tables 
greater than 2x2. SAS can calculate both statistics using the following 
instructions.

  proc freq; tables a * b / fisher;"

Do people here still stand by the position that Fisher's exact test can be used
for RxC contingency tables?  Sorry to bother you all so much; it is just
important for a paper I am writing and planning to submit soon. (I have a 4x2
table that does not meet the expected-frequency requirements for chi-squared.)

I gather people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but whose titles at
least indicate an extension to RxC:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.
 
Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.
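
For what it is worth, R's fisher.test() accepts tables larger than 2x2, and its help page cites the Mehta and Patel network algorithm (FEXACT) listed above; "Freeman-Halton" appears to be another name for the same r x c extension. A minimal sketch on a made-up 4x2 table with small counts (the counts and dimnames are invented for illustration):

```r
# Hypothetical 4x2 table with small expected frequencies.
counts <- matrix(c(3, 1,
                   2, 4,
                   5, 0,
                   1, 6),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(group = paste0("g", 1:4),
                                 outcome = c("yes", "no")))

res <- fisher.test(counts)   # exact test on the full r x c table
res$p.value
```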

Thanks again, 

-Rob



viostorm wrote:
> 
> Thank you all very kindly for your help.
> 
> -Rob
> 
> 
> Robert Schutt III, MD, MCS 
> Resident - Department of Internal Medicine
> University of Virginia, Charlottesville, Virginia
> 



[R] importing and filtering time series data

2011-04-29 Thread Joel Reymont
Folks,

I'm new to R and would like to use it to analyze web server performance data. 

I collect the data in this CSV format:

1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007

The first column is a timestamp, rows with N instead of Y 
need to be skipped, and the last column has the same numeric format as the first 
column, except that it holds the request duration (latency).

I would like to calculate the average number of requests per second, mean latency, 
variance, and the 5th and 95th percentiles.

What is the best way to accomplish this, starting with importing of time series?
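
A minimal base-R sketch of one way to start, assuming the file has no header row. The first two rows are the ones shown above; the last two rows, the column names, and the choice to average only over seconds that saw at least one request are assumptions made for illustration.

```r
# Inline sample; with a real file, pass its path instead of text =.
csv_text <- "1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007
1304083105.10,N,120.5
1304083106.73,Y,410.0"

requests <- read.csv(text = csv_text, header = FALSE,
                     col.names = c("timestamp", "ok", "latency"),
                     colClasses = c("numeric", "character", "numeric"))

# Keep only the rows flagged Y.
kept <- requests[requests$ok == "Y", ]

# Requests counted per whole second (seconds with no requests are absent).
per_second <- table(floor(kept$timestamp))

stats <- list(
  mean_requests_per_second = mean(as.vector(per_second)),
  mean_latency             = mean(kept$latency),
  var_latency              = var(kept$latency),
  latency_quantiles        = quantile(kept$latency, c(0.05, 0.95))
)
stats
```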

Thanks, Joel

--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont


Re: [R] Questions about lrm, validate, pentrace

2011-04-29 Thread khosoda

(11/04/29 22:09), Frank Harrell wrote:

Yes I would select that as the final model.


Thank you for your comment. I am able to be confident about my model now.

The difference you saw is caused by different treatment of penalization of
factor variables, related to the use of the sum squared differences between
the estimate at one category from the average over all categories.  I think
that as long as you code it one way consistently and pick the penalty using
that coding you are OK.  But if the coefficients of the non-factor variables
depend on how the binary predictor is coded, there is a bit more concern.


A lot of previous studies have demonstrated that poor outcome is more 
frequent in treat2 than in treat 1. So, I coded treat1 as 0, and treat2 
as 1 in the first mail. Then, I came back to the original coding of 
treat1 and treat2 in the newer mail. According to your answer, I guess I 
am OK. :-)


Prof Harrell, your book (Regression Modeling Strategies) and many kind 
comments helped me a lot. Thank you very much again.


--
KH



Frank


細田弘吉 wrote:


Thank you for your quick reply, Prof. Harrell.
According to your advice, I ran pentrace using a very wide range.

  >  pentrace.x6factor<- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
  >  plot(pentrace.x6factor)

I attached this figure. Then,

  >  pentrace.x6factor<- pentrace(x6factor.lrm, seq(0, 10, by=0.05))

It seems reasonable that the best penalty is 2.55.

  >  x6factor.lrm.pen<- update(x6factor.lrm, penalty=2.55)
  >  cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen),
abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))
                     [,1]        [,2]        [,3]
Intercept     -4.32434556 -3.86816460 0.456180958
stenosis      -0.01496757 -0.01091755 0.004050025
T1             3.04248257  2.42443034 0.618052225
T2            -0.75335619 -0.57194342 0.181412767
procedure     -1.20847252 -0.82589263 0.382579892
ClinicalScore  0.37623189  0.30524628 0.070985611

  >  validate(x6factor.lrm, bw=F, B=200)
          index.orig training    test optimism index.corrected   n
Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
D             0.2716   0.3229  0.2339   0.0890          0.1826 200
U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
g             1.6328   1.9879  1.4940   0.4939          1.1389 200
gp            0.2367   0.2502  0.2216   0.0286          0.2080 200


  >  validate(x6factor.lrm.pen, bw=F, B=200)
          index.orig training    test optimism index.corrected   n
Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
D             0.2612   0.2571  0.2370   0.0201          0.2411 200
U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200

In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64 and the
bias-corrected Dxy is 0.55. The maximum absolute error is estimated to
be 0.034, smaller than in the non-penalized model (0.0915 in x6factor.lrm). The
changes in slope and intercept are substantially reduced in the penalized
model. I think overfitting is improved, at least to some extent. Should I
select this as the final model?

I have one more question. The "procedure" variable was defined as a 0/1
value in the previous mail. For graphical reasons, I redefined it as a
treat1/treat2 value. Then the best penalty value changed from 3.05
to 2.55. I guess the change from numeric to factor caused this reduction
in penalty. Which setup should I select?

I appreciate your help in advance.

--
KH

(11/04/26 0:21), Frank Harrell wrote:

You've done a lot of good work on this.  Yes I would say you have moderate
overfitting with the first model.  The only thing that saved you from having
severe overfitting is that there seems to be a signal present [I am assuming
this model is truly pre-specified and was not developed at all by looking at
patterns of responses Y.]

The use of backwards stepdown demonstrated much worse overfitting.  This is
in line with what we know about the damage of stepwise selection methods
that do not incorporate shrinkage.  I would throw away the
[R] Problem with qualitative variables in anova

2011-04-29 Thread katerinaaa
Hi,
I am a newbie in R programming and I need some help.

I have two columns: the first has 1000 values Y/N/U and the other has f/m,
like this:

q7  sex
==
U   m
U   f
U   m
N   f

I want to do a one-way ANOVA, both parametric and non-parametric,
but I have some problems.

Code:

frameq7 <- data.frame(q7,sex)
frameq7

r <- aov(q7 ~ sex, data = frameq7)
summary(r)

I get
Error in storage.mode(y) <- "double" :
invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
using type="numeric" with a factor response will be ignored

Could you please help me get it right?

And finally, how can I present this analysis? With a boxplot?
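
A note on the error: aov() needs a numeric response, and here both q7 and sex are categorical. One common alternative, sketched below on simulated data, is a test of association on the contingency table; a mosaic plot then takes the place of a boxplot, which also needs a numeric variable. Whether this matches the actual research question is an assumption, and the data are random stand-ins.

```r
set.seed(7)

# Simulated stand-in for the poster's two columns.
frameq7 <- data.frame(
  q7  = factor(sample(c("Y", "N", "U"), 1000, replace = TRUE)),
  sex = factor(sample(c("f", "m"), 1000, replace = TRUE))
)

# Cross-tabulate the two factors and test for association.
tab <- table(frameq7$q7, frameq7$sex)
res <- chisq.test(tab)
res$p.value

# A plot suited to two categorical variables.
if (interactive()) mosaicplot(tab, main = "q7 by sex")
```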

Thanks a lot 



[R] Trying to get RWeka/Snowball to work

2011-04-29 Thread Peter Holme
Hi!

I was trying to install RWeka to be able to use SnowballStemmer in a Mac OS
X 10.6.7 environment... but couldn't do it... I get error messages after:

> library(RWeka);
> install(Snowball);
> ## Test the supplied vocabulary for the default stemmer ('porter'):
> source <- readLines(system.file("words", "porter","voc.txt",
+ package = "Snowball"))
> result <- SnowballStemmer(source)
Error in .jnew(name) :
  java.lang.InternalError: Can't start the AWT because Java was started on
the first thread.  Make sure StartOnFirstThread is not specified in your
application's Info.plist or on the command line
> target <- readLines(system.file("words", "porter", "output.txt",
+ package = "Snowball"))
> ## Any differences?
> any(result != target)
Error: object 'result' not found
Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not
in CLASSPATH?
Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): org.hsqldb.jdbcDriver - Warning, not
in CLASSPATH?

Well – after searching around, I decided to take the matter into my own
hands – not ideal, but it fits my small purpose for now... will possibly
expand it later..:
http://holme.se/stem/

:)

Peter
-- 
+47 920 42 782



[R] Regression Summary for a List

2011-04-29 Thread Ryan J. McGuigan
Hi,

I am trying to run a regression on two matrices with 10 columns.  I have
been able to run the regression with the following code:

fit=list()
for(i in 1:10) {
fit[[i]]=lm(monret[,i]~janret[,i])
}

However, I can't get the regression to spit out more than the coefficients
(summary(fit) does not work).  I really need the full summary for each of
the 10 regressions, including the R-squared values.  I'm sure there's a
simple way to do this I just can't seem to figure it out.

Thanks.

-Ryan



Re: [R] replace non numeric with "NA"

2011-04-29 Thread Nandini B

Thanks a lot Duncan, this is what I was looking for!!
Thank you,
Nandini




> Date: Fri, 29 Apr 2011 09:53:06 -0400
> From: murdoch.dun...@gmail.com
> To: nandini...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] replace non numeric with "NA"
> 
> On 29/04/2011 6:45 AM, Nandini B wrote:
> >   Hello,
> > I have a sample data frame which looks like this
> >day  od   month
> > 1   1 0.12
> > 2   3 #VALUE! 1
> > 3   5 0.4 12
> > 4   7 0.8 10
> > 5  11   -  3
> > 6  14   s 7
> > 7  18  -- 12
> > 8  27  197
> >
> >
> > Now i wish to filter all the non numeric values and replace it with "NA". 
> > The data frame is actually huge and the non numeric characters vary from 
> > "-" to a string to absolutely anything!!!
> > Can anyone please help ?
> 
> You don't tell us the types of the columns, so I'll assume they are 
> factors.  If so, call
> 
> as.numeric(as.character())
> 
> on each of them to convert the number-like values to numbers, the others 
> to NA.  For example,
> 
> df$day <- as.numeric(as.character(df$day))
> 
> Duncan Murdoch
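
For a large data frame, the same conversion can be applied to every column at once. A minimal sketch on made-up data (the helper name to_number and the toy values are illustrative); suppressWarnings() only hides the expected "NAs introduced by coercion" warning.

```r
# Toy data with non-numeric entries, stored as factors.
df <- data.frame(day = c("1", "3", "5", "7"),
                 od  = c("0.1", "#VALUE!", "0.4", "--"),
                 stringsAsFactors = TRUE)

# Convert via character so factor level codes are not used by mistake.
to_number <- function(col) suppressWarnings(as.numeric(as.character(col)))

# df[] keeps the data frame structure while replacing every column.
df[] <- lapply(df, to_number)
df
```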
  


[R] logistic regression with glm: cooks distance and dfbetas are different compared to SPSS output

2011-04-29 Thread Biedermann, Jürgen

Hi there,

My problem is that I'm not able to reproduce the SPSS residual 
statistics (dfbeta and Cook's distance) with a simple binary logistic 
regression model obtained in R via the glm function.


I tried the following:

fit <- glm(y ~ x1 + x2 + x3, data, family=binomial)

cooks.distance(fit)
dfbetas(fit)

When I compare the returned values with the values that I get in SPSS, 
they are different, although the same model is calculated (the 
coefficients are the same etc.)


It seems that different calculation-formulas are used for cooks.distance 
and dfbetas in SPSS compared to R.


Unfortunately I couldn't find out what the difference in the 
calculation is, or how I could get R to calculate the same statistics 
that SPSS uses.

Or is this an unknown SPSS bug?
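
The SPSS formulas cannot be confirmed from this thread, so the following is only a sketch of what R itself computes, which may help in locating the difference. Two things worth checking: R's cooks.distance() for a glm divides by the number of coefficients, and dfbetas() is standardised while dfbeta() (no "s") gives the raw change in each coefficient. The data are simulated and the names dat, cook_manual, raw_change and scaled_change are illustrative.

```r
set.seed(3)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 - 0.4 * dat$x2))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# Cook's distance as R computes it for a glm:
# (Pearson residual / (1 - h))^2 * h / (dispersion * number of coefficients)
h       <- hatvalues(fit)
pearson <- residuals(fit, type = "pearson")
p       <- fit$rank
phi     <- summary(fit)$dispersion      # fixed at 1 for the binomial family
cook_manual <- (pearson / (1 - h))^2 * h / (phi * p)

# Raw versus standardised influence on the coefficients.
raw_change    <- dfbeta(fit)    # change in each coefficient, original units
scaled_change <- dfbetas(fit)   # the same change, scaled by a standard error
```

If the SPSS values turn out to differ from cook_manual by a constant factor, or to match dfbeta() rather than dfbetas(), that would point to a difference in scaling conventions rather than a bug in either program.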

Greetings
Jürgen



[R] mlogit package, "Error in X[omitlines, ] <- NA : subscript out of bounds"

2011-04-29 Thread Yong Wang
I am using the mlogit package and have run into a data problem for which I
can't find any clue in the R archives.

The code below shows everything up to the error:

#---
mydata <- data.frame(dependent,x,y,z)

mydata$dependent<-as.factor(mydata$dependent)

mldata<-mlogit.data(mydata, varying=NULL, choice="dependent", shape="wide")

summary(mlogit.1<- mlogit(dependent~1|x+y+z, data = mldata, reflevel="0"))

"Error in X[omitlines, ] <- NA : subscript out of bounds" ,
#---

Could anybody kindly suggest how I might solve this problem?

Thank you

yong



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Philipp Pagel
On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote:
>
> data_url <- "
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> "
> read.csv(data_url)
> Error in file(file, "rt") : cannot open the connection

I get the same error (R 2.11.1, Debian Linux) and don't have a
solution. But I did some tests and found the origin of the problem.

I can download the file from Google with wget, but I get some interesting
information in the process:


$ wget -v 
'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
--2011-04-29 20:07:40--  
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 
209.85.148.138, ...
Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: 
https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 [following]
--2011-04-29 20:07:41--  
https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: 
“pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1”

[ <=>   
] 41  --.-K/s   in 0s  

2011-04-29 20:07:42 (342 KB/s) - 
“pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1”
 saved [41]


The message that caught my attention was the http redirection: "302 Moved
Temporarily".

If you try again with the new url you get this:

> read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") : unsupported URL scheme

?url told me "Note that ‘https://’ connections are not supported."
Case closed, problem unsolved...

Dirty workaround: use system() and wget or whatever command is available on
Windows for this.
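
A sketch of that workaround that stays inside R, using download.file() with method = "wget" instead of a hand-built system() call. It assumes a wget binary is on the PATH and was built with SSL support; the helper name fetch_csv_with_wget is hypothetical.

```r
# Download a CSV through an external wget (which follows the redirect
# to https) and read it from a temporary file.
fetch_csv_with_wget <- function(url) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))                      # clean up the temporary file
  status <- download.file(url, destfile = tmp,
                          method = "wget", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(tmp)
}

# Usage (not run here; requires network access and wget):
# dat <- fetch_csv_with_wget(data_url)
```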

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Duncan Temple Lang

Thanks David for fixing the early issues.

The reason for the failure is that the response
from the Web server is to redirect the requester
to another page, specifically

 
https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv

Note that this is https, not http, and the built-in URL reading facilities in R 
don't support https.


One way to see this is to look at the headers in your browser (e.g. Live 
HTTP Headers),
or to use curl, or the RCurl package:

library(RCurl)
tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
  hl ="en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
  single = "true", gid ="0",
  output = "csv",
 .opts = list(followlocation = TRUE, verbose = TRUE))


The verbose option shows the entire dialog, and tt contains the
text of the CSV document.

 read.csv(textConnection(tt))

then yields the data frame

  D.


On 4/29/11 10:36 AM, David Winsemius wrote:
> 
> On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
> 
>> Hello all,
>> I wish to use read.csv to read a google doc spreadsheet.
>>
>> I try using the following code:
>>
>> data_url <- "
>> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
>>
>> "
>> read.csv(data_url)
>>
>> Which results in the following error:
>>
>> Error in file(file, "rt") : cannot open the connection
>>
>>
>> I'm on windows 7.  And the code was tried on R 2.12 and 2.13
>>
>> I remember trying this a few months ago and it worked fine.
> 
> I am always amused at such claims. Occasionally they are correct, but more 
> often a crucial step has been omitted. In
> this case you have at a minimum embedded line-feeds in your URL string and 
> have not established a connection, so it
> could not possibly have succeeded as presented.
> 
> But now it's time to admit I do not know why it is not succeeding when I 
> correct those flaws.
> 
>> closeAllConnections()
>> data_url <-
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
> 
>> read.csv(data_url)
> Error in open.connection(file, "rt") : cannot open the connection
> 
>> closeAllConnections()
>> dd <- read.csv(con <- 
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
> 
> Error in open.connection(file, "rt") : cannot open the connection
> 
> 
> So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I 
> thought I was.
> 
> 
>> Any suggestion what might be causing this or how to solve it?
> 
>



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
> Sent: Friday, April 29, 2011 10:36 AM
> To: Tal Galili
> Cc: r-help@r-project.org
> Subject: Re: [R] read.csv fails to read a CSV file from google docs
> 
> 
> On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
> 
> > Hello all,
> > I wish to use read.csv to read a google doc spreadsheet.
> >
> > I try using the following code:
> >
> > data_url <- "
> > 
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&ke
> y=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid
> =0&output=csv
> > "
> > read.csv(data_url)
> >
> > Which results in the following error:
> >
> > Error in file(file, "rt") : cannot open the connection

With S+ I get:
 S+> download.file("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv", destfile="e:/temp/splus")
 Problem in download.file("http://spreadsheets0.google.com/spreadsheet/pu..: Could not get url: unsupported protocol, libcurl was built with SSL disabled, https: not supported!
and with cygwin's wget I get
 E:\temp\jnk>wget "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
 --2011-04-29 11:00:10--  http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 Resolving spreadsheets0.google.com... 74.125.224.73, 74.125.224.71, 74.125.224.64, ...
 Connecting to spreadsheets0.google.com|74.125.224.73|:80... connected.
 HTTP request sent, awaiting response... 302 Moved Temporarily
 Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
 --2011-04-29 11:00:11--  https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 Connecting to spreadsheets0.google.com|74.125.224.73|:443... connected.
 ERROR: cannot verify spreadsheets0.google.com's certificate, issued by `/C=US/O=Google Inc/CN=Google Internet Authority':
   Unable to locally verify the issuer's authority.
 To connect to spreadsheets0.google.com insecurely, use `--no-check-certificate'.
 Unable to establish SSL connection.

so I suspect that the SSL/certificate business may also be the problem
when using R to get the document.  The R error message is not very
illuminating.
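
To make the two suspected issues concrete (stray line feeds in the URL string, and the redirect from http to https), here is a minimal sketch in R. The helper names clean_sheet_url and read_sheet_csv are made up for illustration. Whether the actual download succeeds is an assumption: it depends on an R build whose url() connections can speak https and on the sheet still being published, so read_sheet_csv is defined but not called here.

```r
# Hypothetical helper: remove embedded whitespace (including line feeds)
# and request the https address that the server redirects to anyway.
clean_sheet_url <- function(u) {
  u <- gsub("[[:space:]]+", "", u)
  sub("^http://", "https://", u)
}

# Hypothetical wrapper; needs https support in url(), which is an assumption
# about the R build in use.
read_sheet_csv <- function(u, ...) {
  read.csv(url(clean_sheet_url(u)), ...)
}

# The string as originally posted, with a line feed on each side of the URL
raw_url <- "
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
"
cleaned <- clean_sheet_url(raw_url)
print(cleaned)
```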

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> >
> >
> > I'm on windows 7.  And the code was tried on R 2.12 and 2.13
> >
> > I remember trying this a few months ago and it worked fine.
> 
> I am always amused at such claims. Occasionally they are 
> correct, but  
> more often a crucial step has been omitted. In this case you 
> have at a  
> minimum embedded line-feeds in your URL string and have not  
> established a connection, so it could not possibly have succeeded as  
> presented.
> 
> But now it's time to admit I do not know why it is not 
> succeeding when  
> I correct those flaws.
> 
>  > closeAllConnections()
>  > data_url <- 
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=
> en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=tru
> e&gid=0&output=csv 
> ")
>  > read.csv(data_url)
> Error in open.connection(file, "rt") : cannot open the connection
> 
>  > closeAllConnections()
>  > dd <- read.csv(con <-  
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=
> en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=tru
> e&gid=0&output=csv 
> "))
> Error in open.connection(file, "rt") : cannot open the connection
> 
> 
> So, I guess I'm not reading the help pages for `url` and 
> `read.csv` as well as I thought I was.
> 
> 
> > Any suggestion what might be causing this or how to solve it?
> 
> 
> -- 
> David Winsemius, MD
> West Hartford, CT
> 
> 



[R] The bin/R file - hardcoded paths

2011-04-29 Thread Saptarshi Guha
Hello,

I notice that e.g. /home/sguha/lib64 is hard-coded into the bin/R file.
I installed R with ./configure --prefix=$HOME ...

What I need to do is ship the entire R distribution to remote nodes
and run R there. The files are shipped to ephemeral directories,
so I don't know the path ahead of time.

Setting R_HOME doesn't change things either.

So I guess one can't run R on a system unless it has been installed there?

1. I can't install R on the compute nodes using ./configure 
2. All nodes do have the same architecture
3. I would like to stick to the 'shipping' approach.


Thanks
Saptarshi



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 1:29 PM, Mike Miller wrote:


On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as  
"expected min and max", in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He  
seemed to like the truncated normal answers, so we'll let those be  
his answers.


It is possible to choose parameters for a normal distribution with  
500 observations such that the expected value of the maximum is .8  
and the expected value of the minimum is .2.  Obviously, the mean  
would be .5, not 1, but what would the variance then have to be to  
provide the correct expected max and min?  That's another legitimate  
question.


You would need to specify an N since the expected first and last order  
statistic would decrease/increase with increasing N.
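
A rough numerical sketch of that dependence on N, in R. It assumes the mean sits at the midpoint 0.5 and uses the standard density of the largest of n independent standard normal draws, n * pnorm(x)^(n - 1) * dnorm(x). The function names are made up for illustration.

```r
# Expected value of the largest of n independent standard normal draws:
# E[Z_(n)] = integral of x * n * pnorm(x)^(n - 1) * dnorm(x) dx
expected_max_z <- function(n) {
  integrand <- function(x) x * n * pnorm(x)^(n - 1) * dnorm(x)
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# By symmetry E[min] = mean - sd * E[Z_(n)] and E[max] = mean + sd * E[Z_(n)],
# so the sd that gives the requested expected range is:
sd_for_expected_range <- function(n, lo = 0.2, hi = 0.8) {
  (hi - lo) / (2 * expected_max_z(n))
}

sd_500 <- sd_for_expected_range(500)                       # close to 0.1
sd_by_n <- sapply(c(100, 500, 1000), sd_for_expected_range)
print(sd_500)
print(sd_by_n)   # the required sd shrinks as n grows
```

So for 500 observations the answer is a standard deviation of roughly 0.1, and a different N gives a different answer.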


--
David.



Mike



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of Mao Jianfeng

Sent: Thursday, April 28, 2011 12:02 PM
To: r-help@r-project.org
Subject: [R] how to generate a normal distribution with mean=1,  
min=0.2, max=0.8


Dear all,

This is a simple probability problem. I want to know, How to  
generate a normal distribution with mean=1, min=0.2 and max=0.8?


I know how to generate a normal distribution with mean = 1 and sd  
= 1 and with 500 data points.


rnorm(n=500, m=1, sd=1)

But I am confused about how to generate a normal distribution  
with an expected min and max. I look forward to your directions.


Thanks in advance.

Best,
Jian-Feng,


David Winsemius, MD
West Hartford, CT



Re: [R] sub-matrix block size

2011-04-29 Thread Santosh
Dear Rxperts,

Can the "Jordan decomposition" of submatrices be useful for determining the size of
sub-blocks? See http://en.wikipedia.org/wiki/Jordan_normal_form

Thanks for the ideas/suggestions.
.
I have another similar situation, where at least one of the off diagonal
elements of the lower triangle submatrices (as mentioned in the previous
example) may be zero.. and based on the visual inspection, the block size of
those square submatrices should be the same as in the previous example. How
do I resolve this one?

m1 <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(11L,
11L))
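
One possible way to attack this case, offered only as a sketch and not as something proposed earlier in the thread: start a new block at index i whenever no nonzero entry links the indices i and above with the indices below i. A missing off-diagonal element inside a block then does not split it, as long as some other entry still ties the block together. The name block_sizes is made up, and m1 is rebuilt here from its nonzero positions so that the example runs on its own.

```r
# Hypothetical helper: sizes of contiguous diagonal blocks of a square matrix.
block_sizes <- function(m) {
  n <- nrow(m)
  stopifnot(n == ncol(m))
  starts <- vapply(seq_len(n), function(i) {
    if (i == 1) return(TRUE)
    lo <- seq_len(i - 1)
    hi <- i:n
    # a block boundary before i means nothing couples hi with lo
    !any(m[hi, lo] != 0) && !any(m[lo, hi] != 0)
  }, logical(1))
  diff(c(which(starts), n + 1))
}

# m1 from above, written as (row, column) positions of its nonzero entries
m1 <- matrix(0, 11, 11)
nz <- cbind(row = c(1, 2, 2, 3, 5, 6, 6, 7, 7, 8, 10, 9, 10, 10),
            col = c(1, 1, 2, 3, 5, 5, 6, 6, 7, 8,  8, 9,  9, 10))
m1[nz] <- 1

sizes <- block_sizes(m1)
print(sizes)
```

Because the result lists one size per block rather than one value per row, two neighbouring blocks of the same size stay separate without an inserted zero.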

Also, in the vector below is there a simple way to separate out contiguous
blocks (for identification purposes)? Please see the inserted "0" in the
vector below to identify the next block  ...

> rowSums(m) + colSums(m) - 1
 [1]  2  2  1 -1  3  3  3  0  3  3  3 -1
# The elements in this vector are the TRUE sizes of the submatrices
# (a zero is inserted to separate contiguous blocks of the same size).

Regards,
Santosh

On Wed, Apr 27, 2011 at 6:41 AM, Santosh  wrote:

> Thanks, David! That is another interesting perspective to (sub/super)
> diagonal story! For now I was looking only at block sizes of lower triangle
> submatrices as Dennis suggested.
>
> Regards,
> Santosh
>
>
> On Wed, Apr 27, 2011 at 5:57 AM, David Winsemius 
> wrote:
>
>>
>> On Apr 27, 2011, at 12:07 AM, Dennis Murphy wrote:
>>
>>  Hi:
>>>
>>> Maybe this can help get you started. Reading your data into a matrix m,
>>>
>>> m <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
>>> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
>>> 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
>>> 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
>>> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim =
>>> c(11L,
>>> 11L))
>>>
>>> rowSums(m) + colSums(m) - 1
>>> [1]  2  2  1 -1  3  3  3  3  3  3 -1
>>>
>>> The pair of 2's => a 2 x 2 block, 1 => a 1 x 1 matrix with value 1, -1
>>> => a 1 x 1 matrix with entry 0, a triplet of 3's => a 3 x 3 subblock,
>>> etc. You should be able to figure out the rows and columns for each
>>> submatrix from the indices of the vector above; the values provide an
>>> indication of matrix size as well as position.
>>>
>>>
>> If we are in the stage of providing potentially useful but incomplete
>> ideas, this would be my notion. Use the row and col functions with "[" to
>> locate non-zero elements in the diagonal and subdiagonal:
>>
>> Diagonal:  (My matrix was named `mm`)
>> > mm[row(mm)==col(mm)]
>>  [1] 1 1 1 0 1 1 1 1 1 1 0
>> First subdiagonal:
>> > mm[row(mm)==col(mm)+1]
>>
>>  [1] 0 0 0 0 0 0 0 0 0 0
>> First superdiagonal:
>> > mm[row(mm)==col(mm)-1]
>>  [1] 1 0 0 0 1 1 0 1 1 0
>>
>> Perhaps a combination of the two? It seems as though the rowSums/colSums
>> approach might be insensitive to whether triangular blocks were sub or super
>> diagonal:
>>
>> > rowSums(mm) + colSums(mm) - 1
>>
>>  [1]  2  2  1 -1  3  3  3  3  3  3 -1
>> > mm[1,2]<-0
>> > mm[2,1]<-1
>> > rowSums(mm) + colSums(mm) - 1
>>
>>  [1]  2  2  1 -1  3  3  3  3  3  3 -1
>>
>>  HTH,
>>> Dennis
>>>
>>>
>>>
>>> On Tue, Apr 26, 2011 at 5:13 PM, Santosh  wrote:
>>>
 Dear Rxperts

 Below is a small vector of values of zeros and non-zeros... was
 wondering if
 there is an efficient way to get the block sizes of submatrices of a big
 matrix similar to the one shown below? diagonal elements can be zero
 too.
 Rows with only a diagonal element may be considered as a unit block
 size.

 c(1,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,
  0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,0,0,0,0,0,
  0,0,0,0,1,1,0,0,0,0,0,
  0,0,0,0,1,1,1,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,
  0,0,0,0,0,0,0,1,1,0,0,
  0,0,0,0,0,0,0,1,1,1,0,
  0,0,0,0,0,0,0,0,0,0,0)

 Thanks much!
 Santosh



>>>
>>
>> Davi

Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:


Hello all,
I wish to use read.csv to read a google doc spreadsheet.

I try using the following code:

data_url <- "
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
"
read.csv(data_url)

Which results in the following error:

Error in file(file, "rt") : cannot open the connection


I'm on windows 7.  And the code was tried on R 2.12 and 2.13

I remember trying this a few months ago and it worked fine.


I am always amused at such claims. Occasionally they are correct, but  
more often a crucial step has been omitted. In this case you have at a  
minimum embedded line-feeds in your URL string and have not  
established a connection, so it could not possibly have succeeded as  
presented.


But now it's time to admit I do not know why it is not succeeding when  
I correct those flaws.


> closeAllConnections()
> data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv 
")

> read.csv(data_url)
Error in open.connection(file, "rt") : cannot open the connection

> closeAllConnections()
> dd <- read.csv(con <-  url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv 
"))

Error in open.connection(file, "rt") : cannot open the connection


So, I guess I'm not reading the help pages for `url` and `read.csv` as  
well as I thought I was.




Any suggestion what might be causing this or how to solve it?



--
David Winsemius, MD
West Hartford, CT



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as "expected 
min and max", in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He seemed 
to like the truncated normal answers, so we'll let those be his answers.
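
For readers who missed those answers, a truncated normal can be sampled in base R by inverting the CDF. This is only a sketch of the idea, not the code posted earlier in the thread; rtruncnorm_sketch is a made-up name. Note that mean = 1 is the mean of the untruncated distribution, so the sample mean necessarily lands between 0.2 and 0.8.

```r
# Hypothetical helper: draw from a normal restricted to [lower, upper]
# by sampling uniformly between the CDF values at the two bounds.
rtruncnorm_sketch <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean, sd)
  p_hi <- pnorm(upper, mean, sd)
  qnorm(runif(n, p_lo, p_hi), mean, sd)
}

set.seed(1)
x <- rtruncnorm_sketch(500, mean = 1, sd = 1, lower = 0.2, upper = 0.8)
print(range(x))   # stays inside the requested bounds
print(mean(x))    # not 1: the truncation pulls it into the interval
```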


It is possible to choose parameters for a normal distribution with 500 
observations such that the expected value of the maximum is .8 and the 
expected value of the minimum is .2.  Obviously, the mean would be .5, not 
1, but what would the variance then have to be to provide the correct 
expected max and min?  That's another legitimate question.


Mike



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mao Jianfeng
Sent: Thursday, April 28, 2011 12:02 PM
To: r-help@r-project.org
Subject: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

Dear all,

This is a simple probability problem. I want to know, How to generate 
a normal distribution with mean=1, min=0.2 and max=0.8?


I know how to generate a normal distribution with mean = 1 and sd = 1 
and with 500 data points.


rnorm(n=500, m=1, sd=1)

But I am confused about how to generate a normal distribution with 
an expected min and max. I look forward to your directions.


Thanks in advance.

Best,
Jian-Feng,




Re: [R] setting options only inside functions

2011-04-29 Thread Jonathan Daily
In python, opening a connection using with allows for a temporary
assignment using "as". So:

with file("/path/to/file") as con:
permanent_object = function(con)

would provide the return of function(con) globally, but close con. If
function(con) causes an error, con is still closed.

I agree with your description of what the function would need to do.
Would it make sense to make it generic and define default methods for
different setups? e.g. Using the current with/within when it is a
data.frame/environment, evaluating it when it is a function, etc.

On Fri, Apr 29, 2011 at 12:34 PM,   wrote:
> The Python solution does not extend, at least not cleanly, to things
> like dev on/ dev off or to Hadley's locale example.  In any case if I
> am reading the Python source correctly on how they handle user
> interrupts this solution has the same non-robustness to user interrupts
> issue that Bill's initial solution had.
>
> As a basis I believe what we need is a mechanism that handles a
> setup, an action, and a cleanup, with setup and cleanup occurring with
> interrupts disabled and the action with interrupts enabled. Scheme's
> dynamic wind is similar, though I don't believe the scheme standard
> addresses interrupts and we don't need to worry about continuations,
> but some of the issues are similar.  Probably we would want two
> flavors, one in which the action has to be a function that takes as a
> single argument the result produced by the setup code, and one in
> which the action can be an argument expression that is then evaluated
> at the appropriate place by lazy evaluation.
>
> This can be done at the R level except for the controlling of
> interrupts (and possibly other asynchronous stuff)-- that would need a
> new pair of primitives (suspendInterrupts/enableInterrupts or something
> like that).  There is something in the Haskell literature on this that
> I have looked at a while back -- probably time to have another look.
>
>
> On Thu, 28 Apr 2011, Jonathan Daily wrote:
>
>> I would also love to see this implemented in R, as my current solution
>> to the issue of doing tons of open/close, dev/dev.off, etc. is to use
>> snippets in my IDE, and in the end I feel like it is a hack job. A
>> pythonic "with" function would also solve most of the situations where
>> I have had to use awkward try or tryCatch calls. I would be willing to
>> help with this project, even if it is just testing.
>>
>> On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
>>  wrote:

 but it's a little clumsy, because

 with_connection(file("myfile.txt"), {do stuff...})

 isn't very useful because you have no way to reference the connection
 that you're using. Ruby's blocks have arguments which would require
 big changes to R's syntax.  One option would to use pronouns:
>>>
>>>  Looking very much like python 'with' statements:
>>>
>>> http://effbot.org/zone/python-with-statement.htm
>>>
>>>  Implemented via the 'with' statement which can operate on anything
>>> that has a __enter__ and an __exit__ method. Very neat.
>>>
>>> Barry
>>>
>>>
>>
>>
>>
>>
>
> --
> Luke Tierney
> Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>   Actuarial Science
> 241 Schaeffer Hall                  email:      l...@stat.uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu



-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.



Re: [R] threshold matrix

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 10:44 AM, Alaios wrote:


Thanks a lot.
I finally used

M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

as I noticed that this one line

M2 <- as.numeric( M[] < thresh )
vectorizes my matrix.

One more question I have two matrices that only differ slightly.  
What will be the easiest way to compare and find the cells that are  
not the same?


M[!M==N]
N[!M==N]




Best Regards
Alex

--- On Fri, 4/29/11, David Winsemius  wrote:


From: David Winsemius 
Subject: Re: [R] threshold matrix
To: "Alaios" 
Cc: R-help@r-project.org
Date: Friday, April 29, 2011, 2:57 PM

On Apr 29, 2011, at 9:37 AM, Alaios wrote:


Dear all,
I have a quite big matrix which I would like to threshold.
If the value is below threshold the cell should be zero and
if the value is over threshold the cell should be one


M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

or perhaps simply:

M2 <- as.numeric( M[] < thresh )


One really simple way to do that is to have a nested loop and check cell by cell.
The problem is that this seems to be really time consuming and inefficient.
What do you suggest I try?


--
David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT



Re: [R] setting options only inside functions

2011-04-29 Thread luke-tierney

The Python solution does not extend, at least not cleanly, to things
like dev on/ dev off or to Hadley's locale example.  In any case if I
am reading the Python source correctly on how they handle user
interrupts this solution has the same non-robustness to user interrupts
issue that Bill's initial solution had.

As a basis I believe what we need is a mechanism that handles a
setup, an action, and a cleanup, with setup and cleanup occurring with
interrupts disabled and the action with interrupts enabled. Scheme's
dynamic wind is similar, though I don't believe the scheme standard
addresses interrupts and we don't need to worry about continuations,
but some of the issues are similar.  Probably we would want two
flavors, one in which the action has to be a function that takes as a
single argument the result produced by the setup code, and one in
which the action can be an argument expression that is then evaluated
at the appropriate place by lazy evaluation.

This can be done at the R level except for the controlling of
interrupts (and possibly other asynchronous stuff)-- that would need a
new pair of primitives (suspendInterrupts/enableInterrupts or something
like that).  There is something in the Haskell literature on this that
I have looked at a while back -- probably time to have another look.
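
The R-level part of this could look roughly like the sketch below, in the first flavor (the action is a function of whatever the setup produced). The name with_resource is made up for illustration. It relies only on on.exit() and does nothing about interrupts, so a user interrupt arriving between setup and the registration of the cleanup would still leak the resource; that gap is exactly the part that needs new primitives.

```r
# Hypothetical helper: run setup, then action, and always run cleanup.
with_resource <- function(setup, action, cleanup) {
  resource <- setup()
  # on.exit() fires whether action() returns normally or signals an error
  on.exit(cleanup(resource), add = TRUE)
  action(resource)
}

tmp <- tempfile()
writeLines(c("alpha", "beta"), tmp)
state <- new.env()
state$closed <- 0   # counts how often the cleanup ran

lines_read <- with_resource(
  setup   = function() file(tmp, open = "r"),
  action  = function(con) readLines(con),
  cleanup = function(con) { close(con); state$closed <- state$closed + 1 }
)

# The cleanup also runs when the action fails
failure <- tryCatch(
  with_resource(
    setup   = function() file(tmp, open = "r"),
    action  = function(con) stop("action failed"),
    cleanup = function(con) { close(con); state$closed <- state$closed + 1 }
  ),
  error = function(e) conditionMessage(e)
)
print(lines_read)
print(state$closed)
```

The second flavor, with the action passed as an unevaluated expression, would differ only in how the action is evaluated.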


On Thu, 28 Apr 2011, Jonathan Daily wrote:


I would also love to see this implemented in R, as my current solution
to the issue of doing tons of open/close, dev/dev.off, etc. is to use
snippets in my IDE, and in the end I feel like it is a hack job. A
pythonic "with" function would also solve most of the situations where
I have had to use awkward try or tryCatch calls. I would be willing to
help with this project, even if it is just testing.

On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
 wrote:

but it's a little clumsy, because

with_connection(file("myfile.txt"), {do stuff...})

isn't very useful because you have no way to reference the connection
that you're using. Ruby's blocks have arguments which would require
big changes to R's syntax.  One option would to use pronouns:


 Looking very much like python 'with' statements:

http://effbot.org/zone/python-with-statement.htm

 Implemented via the 'with' statement which can operate on anything
that has a __enter__ and an __exit__ method. Very neat.

Barry









--
Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:      l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


Re: [R] replace non numeric with "NA"

2011-04-29 Thread Nandini B

Thanks a lot Jim, this is perfect!!

Thank you,
Nandini Badarinarayan




> Date: Fri, 29 Apr 2011 09:49:26 -0400
> From: jmac...@med.umich.edu
> To: nandini...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] replace non numeric with "NA"
> 
> Hi Nandini,
> 
> On 4/29/2011 6:45 AM, Nandini B wrote:
> >
> >   Hello,
> > I have a sample data frame which looks like this
> >day  od   month
> > 1   1 0.12
> > 2   3 #VALUE! 1
> > 3   5 0.4 12
> > 4   7 0.8 10
> > 5  11   -  3
> > 6  14   s 7
> > 7  18  -- 12
> > 8  27  197
> >
> 
>  > x <- data.frame(day=1:8, od = 
> c(0.1,"#VALUE!",0.4,0.8,"-","s","--",19), month = c(2,1,12,10,3,7,12,7))
>  > x
>day  od month
> 1   1 0.1 2
> 2   2 #VALUE! 1
> 3   3 0.412
> 4   4 0.810
> 5   5   - 3
> 6   6   s 7
> 7   7  --12
> 8   8  19 7
>  > x$od <- as.numeric(as.character(x$od))
> Warning message:
> NAs introduced by coercion
>  > x
>day   od month
> 1   1  0.1 2
> 2   2   NA 1
> 3   3  0.412
> 4   4  0.810
> 5   5   NA 3
> 6   6   NA 7
> 7   7   NA12
> 8   8 19.0 7
> 
> 
> Best,
> 
> Jim
> 
> 
> >
> > Now i wish to filter all the non numeric values and replace it with "NA". 
> > The data frame is actually huge and the non numeric characters vary from 
> > "-" to a string to absolutely anything!!!
> > Can anyone please help ?
> >
> >
> >
> >
> > Thank you,
> > Warm Regards,
> >
> > Nandini
> >
> >
> > 
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> **
> Electronic Mail is not secure, may not be read every day, and should not be 
> used for urgent or sensitive issues 
> 
  


[R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Tal Galili
Hello all,
I wish to use read.csv to read a google doc spreadsheet.

I try using the following code:

data_url <- "
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
"
read.csv(data_url)

Which results in the following error:

Error in file(file, "rt") : cannot open the connection


I'm on windows 7.  And the code was tried on R 2.12 and 2.13

I remember trying this a few months ago and it worked fine.
Any suggestion what might be causing this or how to solve it?


Thanks.



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--



Re: [R] Still confused about classes

2011-04-29 Thread Russ Abbott
Thanks to all. I didn't know about either methods() or the package
lubridate, which seems like a very nice Date package.
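
For the archive, the year can also be extracted with base R alone. A minimal sketch follows; year_of is a made-up name, not a function from base R or lubridate.

```r
# Hypothetical helper: the calendar year of a Date as an integer
year_of <- function(d) as.integer(format(d, "%Y"))

today <- Sys.Date()
print(year_of(today))

# The same number by another route: POSIXlt stores years since 1900
print(as.POSIXlt(today)$year + 1900)

# What the current session knows how to do with a Date (S3 methods only)
print(methods(class = "Date"))
```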

*-- Russ *



On Fri, Apr 29, 2011 at 1:35 AM, Kenn Konstabel  wrote:

> The function for getting the year from date  is there in package
> lubridate (as well as many other convenient functions to work with
> dates).
>
> More generally, finding "all" methods for a given class may be a
> little tricky. If "all" means everything you have installed and
> currently attached to your search path then methods(class="Date") will
> do it (for S3 classes). (but "The functions listed are  those which
> _are named like methods_ and may not actually be   methods (known
> exceptions are discarded in the code). ") The result depends on which
> packages you have loaded: in my currently open R session,
> methods("Date") lists 36 "possible methods" but after library(zoo) I
> get two more ( "as.yearmon.Date" and "as.yearqtr.Date").
>
> Regards,
> Kenn
>
>
> On Fri, Apr 29, 2011 at 9:05 AM, Russ Abbott 
> wrote:
> > Hi,
> >
> > I'm still confused about how to find out what methods are defined for a
> > given class.  For example, I know that
> >
> >> today <- Sys.Date()
> >
> > will produce an object of type Date. But I'm not sure what I can do with
> > Date objects or how I can find out.
> >
> >> ?Date
> >
> >
> > refers me to the Date documentation page. But it doesn't tell me how, for
> > example, to extract the current year from a date object.
> >
> > I tried
> >
> >> year(today)
> > Error: could not find function "year"
> >
> >
> > Is there some other function that does the job? I want a function f such
> > that
> >> f(today)
> > will return 2011. Perhaps there is no such function.
> > But in general I don't have any confidence that I would know how to find it
> > if it existed or that I would know how to assure myself that there was no
> > such function.
> >
> > Thanks.
> >
> > *-- Russ *
> >
> >
>



[R] Specify custom par(mfrow()) layout for defined plot()

2011-04-29 Thread Michael Bach
Dear R Users,

I am doing stats::decompose() on 4 different time series.  When I issue

csdA <- decompose(tsA)
plot(csdA)

I get a summary plot for observed, trend, seasonal and random components
of decomposed time series tsA.  As I understand it, the object returned
by decompose() has its own plot method, which sets up a 4 x 1 layout
(something like par(mfrow = c(4, 1))).  Now suppose I wanted to wrap those
4 x 1 plots into my own par(mfrow = c(2, 2)) layout.  How could I achieve
this?  Is there a general way to handle these cases?  Something like a
"meta" par(mfrow())?
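
To illustrate the kind of result I am after, here is a sketch of a workaround rather than the general mechanism I am asking about: skip the plot method, pull one component out of each decomposition, and draw the four series in a 2 x 2 grid by hand. The name plot_component_grid is made up, and the four built-in datasets are only stand-ins for my tsA to tsD.

```r
# Hypothetical helper: one decompose() component per series, in a 2 x 2 grid
plot_component_grid <- function(series, component = "trend") {
  old <- par(mfrow = c(2, 2))
  on.exit(par(old), add = TRUE)   # restore the previous layout afterwards
  parts <- lapply(series, function(s) decompose(s)[[component]])
  for (nm in names(parts)) {
    plot(parts[[nm]], main = nm, ylab = component)
  }
  invisible(parts)
}

# Stand-in data: four seasonal series shipped with R
series <- list(co2 = co2, AirPassengers = AirPassengers,
               nottem = nottem, UKgas = UKgas)
trends <- plot_component_grid(series, "trend")
```

This loses the four-panel summary per series, which is why I would still like to know whether nesting layouts is possible.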

Best Regards,
Michael Bach



Re: [R] threshold matrix

2011-04-29 Thread Petr Savicky
On Fri, Apr 29, 2011 at 07:44:59AM -0700, Alaios wrote:
> Thanks a lot.
> I finally used
> 
> M2 <- M
> M2[M < thresh] <- 0
> M2[M >= thresh] <- 1
> 
> as I noticed that this one line
> 
> M2 <- as.numeric( M[] < thresh )
> vectorizes my matrix.

Hi.

This may be avoided, for example

  M2 <- M
  M2[, ] <- as.numeric(M >= thresh)

or

  array(as.numeric(M >= thresh), dim=dim(M))

> One more question I have two matrices that only differ slightly. What will be 
> the easiest way to compare and find the cells that are not the same?

If A and B are matrices of the same dimension, then

  A == B

is a logical matrix with TRUE entires for positions, where 
A and B match exactly.

  abs(A - B) <= eps

is a logical matrix with TRUE entries for positions where
A and B differ by at most eps.

If you want to get only one logical result, then use

  all(A == B)

for exact equality and

  all(abs(A - B) <= eps)

for approximate equality of all entries.

See also ?all.equal, which uses the relative error, not absolute
difference.
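
To list the positions of the differing cells rather than just test for them, which() with arr.ind = TRUE can be applied to the negated comparisons. A small sketch with made-up matrices A and B and an arbitrary tolerance eps:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
B <- A
B[2, 3] <- 6 + 1e-9   # a tiny numerical difference
B[1, 2] <- 30         # a real difference

# Row and column indices of cells that are not exactly equal
exact_diff <- which(A != B, arr.ind = TRUE)
print(exact_diff)

# Same, but ignoring differences of at most eps
eps <- 1e-6
approx_diff <- which(abs(A - B) > eps, arr.ind = TRUE)
print(approx_diff)

# The differing values themselves, side by side
print(cbind(A = A[approx_diff], B = B[approx_diff]))
```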

Hope this helps.

Petr Savicky.



Re: [R] Using Java methods in R

2011-04-29 Thread hill0093
How do I obtain a strictly rectangular
type-double array (converted to an R 2-dimensional array) from a Java class? 
I can obtain a 1-dimensional type-double array (vector) or a scalar, 
but I cannot figure out the two-dimensional from the instructions.
Is .jevalArray also involved?
My simple Java test class and R test code follows:

import java.lang.reflect.Array;
public class RJavTest { 
  public static void main(String[]args) { RJavTest rJavTest=new RJavTest();
} 
  public final static String conStg="testString"; 
  public final static double con0dbl=10001; 
  public final static double[]con1Arr=new double[] {
10001,10002,10003,10004,10005,10006 }; 
  public final static double[][]con2Arr=new double[][] { {
10001,10002,10003,10004 },{ 20001,20002,20003,20004 },{
30001,30002,30003,30004 } }; 
  public final static String retConStg() { return(conStg); } 
  public final static double retCon0dbl() { return(con0dbl); } 
  public final static double[] retCon1Arr() { return(con1Arr); } 
  public final static double[][] retCon2Arr() { return(con2Arr); } 
}

library(rJava)
.jinit()
.jaddClassPath("C:/ad/j")
print(.jclassPath())
rJavaTst <- .jnew("RJavTest")
conn1Arr <- .jfield(rJavaTst,sig="[D","con1Arr")
print(conn1Arr)
print(conn1Arr[2])
conn1ArrRet <- .jcall(rJavaTst,returnSig="[D","retCon1Arr")
print(conn1ArrRet)
print(conn1ArrRet[2])
conn0dbl <- .jfield(rJavaTst,sig="D","con0dbl")
print(conn0dbl)
##The above works, but not the following
conn2Arr <- .jfield(rJavaTst,sig="[[D","con2Arr")
print(conn2Arr[2])
print(conn2Arr[2,3])
print(conn2Arr)
arj34Ret <- .jcall(rJavaTst,returnSig="[[D","arReturnTEST")
print(arj34Ret)

The latter 2-dim stuff doesn't work



--
View this message in context: 
http://r.789695.n4.nabble.com/Using-Java-methods-in-R-tp3469299p3483862.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] threshold matrix

2011-04-29 Thread Alaios
Thanks a lot.
I finally used

M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

as I noticed that this one line

M2 <- as.numeric( M[] < thresh )
vectorizes my matrix.

One more question I have two matrices that only differ slightly. What will be 
the easiest way to compare and find the cells that are not the same?

Best Regards
Alex

--- On Fri, 4/29/11, David Winsemius  wrote:

> From: David Winsemius 
> Subject: Re: [R] threshold matrix
> To: "Alaios" 
> Cc: R-help@r-project.org
> Date: Friday, April 29, 2011, 2:57 PM
> 
> On Apr 29, 2011, at 9:37 AM, Alaios wrote:
> 
> > Dear all,
> > I have a quite big matrix which I would like to
> threshold.
> > If the value is below threshold the cell should be
> zero
> > and
> > if the value is over threshold the cell should be one
> 
> M2 <- M
> M2[M < thresh] <- 0
> M2[M >= thresh] <- 1
> 
> or perhaps simply:
> 
> M2 <- as.numeric( M[] < thresh )
> > 
> > One really simple way to do that is to have a nested loop and
> > check cell by cell.
> > 
> > The problem is that this seems to be really time consuming and
> > inefficient.
> > 
> > What do you suggest me to try out?
> 
> --
> David Winsemius, MD
> West Hartford, CT
> 
>



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Giovanni Petris
Well, but the original poster also refers to 0.2 and 0.8 as "expected
 min and max", in which case we are back to a joke...

Giovanni
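If the intent really is a normal with location 1 and scale 1 restricted to [0.2, 0.8], as in David's reading quoted below, one way to sample it without discarding draws is the inverse-CDF method. This is only a sketch: the function name rtruncnorm_icdf is made up here, and note that the mean of the truncated sample necessarily lies inside [0.2, 0.8], so it cannot equal 1.

```r
# Inverse-CDF sampler for a normal truncated to [lower, upper].
# rtruncnorm_icdf is a hypothetical helper name, not a base R function.
rtruncnorm_icdf <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean, sd)           # CDF value at the lower bound
  p_hi <- pnorm(upper, mean, sd)           # CDF value at the upper bound
  qnorm(runif(n, p_lo, p_hi), mean, sd)    # map uniforms back through the quantile function
}

set.seed(567)
x <- rtruncnorm_icdf(500, mean = 1, sd = 1, lower = 0.2, upper = 0.8)
```

Every draw is kept, so the sample size is exactly n, unlike the reject-and-subset approach.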


On Thu, 2011-04-28 at 13:06 -0400, David Winsemius wrote:
> On Apr 28, 2011, at 12:09 PM, Ravi Varadhan wrote:
> 
> > Surely you must be joking, Mr. Jianfeng.
> >
> 
> Perhaps not joking and perhaps not with correct statistical  
> specification.
> 
> A truncated Normal could be simulated with:
> 
> set.seed(567)
> x <- rnorm(n=5, m=1, sd=1)
> xtrunc <- x[x>=0.2 & x <=0.8]
> require(logspline)
> plot(logspline(xtrunc, lbound=0.2, ubound=0.8, nknots=7))
> 
> -- 
> David.
> 
> > ---
> > Ravi Varadhan, Ph.D.
> > Assistant Professor,
> > Division of Geriatric Medicine and Gerontology School of Medicine  
> > Johns Hopkins University
> >
> > Ph. (410) 502-2619
> > email: rvarad...@jhmi.edu
> >
> >
> > -Original Message-
> > From:
> r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
> > ] On Behalf Of Mao Jianfeng
> > Sent: Thursday, April 28, 2011 12:02 PM
> > To: r-help@r-project.org
> > Subject: [R] how to generate a normal distribution with mean=1,  
> > min=0.2, max=0.8
> >
> > Dear all,
> >
> > This is a simple probability problem. I want to know, How to  
> > generate a
> > normal distribution with mean=1, min=0.2 and max=0.8?
> >
> > I know how the generate a normal distribution of mean = 1 and sd =
> 1  
> > and
> > with 500 data point.
> >
> > rnorm(n=500, m=1, sd=1)
> >
> > But, I am confusing with how to generate a normal distribution
> with  
> > expected
> > min and max. I expect to hear your directions.
> >
> > Thanks in advance.
> >
> > Best,
> > Jian-Feng,
> >
> 
> David Winsemius, MD
> West Hartford, CT
> 
> 
> 
-- 

Giovanni Petris  
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] matrix evaluation using if function

2011-04-29 Thread Berend Hasselman

David Winsemius wrote:
> 
> On Apr 29, 2011, at 4:27 AM, ivan wrote:
> 
>> Hi All,
>>
>> I am trying to create a function which evaluates whether the values  
>> (which
>> are equal to one) of a matrix are the same as their mirror values.  
>> Consider
>> the following matrix:
>>
>>> n<-matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
>>> colnames(n)<-cbind("A","B","C");rownames(n)<-cbind("A","B","C")
>>> n
>>  A B C
>> A 0 1 0
>> B 1 0 1
>> C 1 0 0
>>
>> Hence, since n[2,1] and n[1,2] are 1 and the same, the function should
>> return the name of the row of n[2,1]. I used the following function:
>>
>> for (i in length(rownames(n))) {
>>
>> for (j in length(colnames(n))){
>>
>> if(n[i,j]==n[j,i]){
>>
>> rownames(n)[[i]]->output} else {}
>>
>> }
>>
>> }
>>
>>> output
>> NULL
>>
>> The right answer would have been "B", though.
> 
> Can you explain why "A" would not be an equally good answer to satisfy  
> your problem set up?
> 
>  > which(n == t(n) & col(n) != row(n) , arr.ind=TRUE)
>row col
> B   2   1
> A   1   2
>  > rownames(which(n == t(n) & col(n) != row(n) , arr.ind=TRUE) )
> [1] "B" "A"
> 
> # Which would seem to be the correct answer, but
> # This adds an additional constraint and also insures no diagonal  
> elements
> 
>  > rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n),  
> arr.ind=TRUE) )
> [1] "B"
> 

Wouldn't this do it too (since the diagonal is set to FALSE by lower.tri)?:

rownames(which(n == t(n) & lower.tri(n),  arr.ind=TRUE))
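A self-contained sketch of the same idea on ivan's matrix. It additionally requires both mirrored cells to equal 1, which is my reading of the original question (n == t(n) alone would also match mirrored zeros):

```r
# ivan's matrix, built column by column
n <- matrix(c(0, 1, 1,  1, 0, 0,  0, 1, 0), 3, 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# cells below the diagonal that are 1 and whose mirror cell is also 1
idx <- which(n == 1 & t(n) == 1 & lower.tri(n), arr.ind = TRUE)
mirrored <- rownames(n)[idx[, "row"]]
```

Using lower.tri() reports each mirrored pair once, by the row name of its lower-triangle cell.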

Berend



--
View this message in context: 
http://r.789695.n4.nabble.com/matrix-evaluation-using-if-function-tp3483188p3483785.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] threshold matrix

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 9:37 AM, Alaios wrote:


Dear all,
I have a quite big matrix which I would like to threshold.
If the value is below threshold the cell should be zero
and
if the value is over threshold the cell should be one


M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

or perhaps simply:

M2 <- as.numeric( M[] < thresh )


One really simple way to do that is to have a nested loop and check  
cell by cell.


The problem is that this seems to be really time consuming and  
inefficient.


What do you suggest me to try out?


--

David Winsemius, MD
West Hartford, CT



Re: [R] abline outside of plot region

2011-04-29 Thread Peter Ehlers

On 2011-04-29 06:14, Nick Sabbe wrote:

Hi R people.



I ran into this problem: I created a plot with errbars, like this:


errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),

yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:


abline(v=2)




In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?



As an addendum: I also want to add a few specific axis ticks besides the
standard ones in my graph. I used axis for this, and it works. I set
col.ticks to match the color of my abline (in the nonsimplified code), and
this works too, but unfortunately, the label below the tick is not in this
color, and a parameter for this is not present in axis.



Suggestions for either? Note: I'm on windows 7 with R 2.13.


  plot(1:4, xaxt='n')
  axis(1, at=2:3, lab=c('a', 'b'),
   col.ticks=3, col.axis=2, lwd=0, lwd.ticks=1)
  par(xpd = TRUE)
  abline(v = 4)
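Whether abline() is clipped to the plot region is controlled by par("xpd"), so a session that misbehaves can be checked and reset. A small sketch; the pdf(NULL) call is only there so it runs without opening a window, and the xpd = NA line merely mimics a session in which clipping had been switched off:

```r
pdf(NULL)                  # off-screen device, used here only for the sketch
old <- par(xpd = NA)       # mimic a session where clipping was switched off
plot(1:4)
par(xpd = FALSE)           # clip drawing to the plot region again
abline(v = 2)              # the line now stops at the plot region
clip_state <- par("xpd")   # inspect the current setting
par(old)                   # restore whatever was set before
invisible(dev.off())
```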

Peter Ehlers





Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36



-- Do Not Disapprove






Re: [R] Speed up plotting to MSWindows graphics window

2011-04-29 Thread jim holtman
If you are plotting that many data points, you might want to look at
'hexbin' as a way of aggregating the values to a different
presentation.  It is especially nice if you are doing a scatter plot
with a lot of data points and trying to make sense out of it.
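For time-series traces, a different option from hexbin is to thin each series before plotting: keep only the minimum and maximum of every bucket of consecutive samples, so spikes stay visible while the number of points drawn drops sharply. This is a base-R sketch, not something hexbin does; the function name decimate_minmax and the bucket count are assumptions made for illustration.

```r
# Keep the min and max sample of each of n_buckets consecutive chunks.
# decimate_minmax is a hypothetical helper, not part of any package.
decimate_minmax <- function(y, n_buckets = 2000) {
  n <- length(y)
  if (n <= 2 * n_buckets) return(data.frame(i = seq_len(n), y = y))
  bucket <- ceiling(seq_len(n) * n_buckets / n)   # bucket id, 1..n_buckets
  lo <- tapply(seq_len(n), bucket, function(ix) ix[which.min(y[ix])])
  hi <- tapply(seq_len(n), bucket, function(ix) ix[which.max(y[ix])])
  keep <- sort(unique(c(lo, hi)))                 # indices to draw, in time order
  data.frame(i = keep, y = y[keep])
}

set.seed(1)
y <- cumsum(rnorm(1e5))          # stand-in for one logged signal
d <- decimate_minmax(y, 1000)    # at most 2000 points remain
```

The thinned trace can then be drawn with plot(d$i, d$y, type = "l"), which has far fewer segments to render on each redraw.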

On Wed, Apr 27, 2011 at 5:16 AM, Jonathan Gabris  wrote:
> Hello,
>
> I am working on a project analysing the performance of motor-vehicles
> through messages logged over a CAN bus.
>
> I am using R 2.12 on Windows XP and 7
>
> I am currently plotting the data in R, overlaying 5 or more plots of data,
> logged at 1kHz, (using plot.ts() and par(new = TRUE)).
> The aim is to be able to pan, zoom in and out and get values from the
> plotted graph using a custom Qt interface that is used as a front end to
> R.exe (all this works).
> The plot is drawn by R directly to the windows graphic device.
>
> The data is imported from a .csv file (typically around 100MB) to a matrix.
> (timestamp, message ID, byte0, byte1, ..., byte7)
> I then separate this matrix into several by message ID (dimensions are in
> the order of 8cols, 10^6 rows)
>
> The panning is done by redrawing the plots, shifted by a small amount. So as
> to view a window of data from a second to a minute long that can travel the
> length of the logged data.
>
> My problem is that, the redrawing of the plots whilst panning is too slow
> when dealing with this much data.
> i.e.: I can see the last graphs being drawn to the screen in the half-second
> following the view change.
> I need a fluid change from one view to the next.
>
> My question is this:
> Are there ways to speed up the plotting on the MSWindows display?
> By reducing plotted point densities to *sensible* values?
> Using something other than plot.ts() - is the lattice package faster?
> I don't need publication quality plots, they can be rougher...
>
> I have tried:
> -Using matrices instead of dataframes - (works for calculations but not
> enough for plots)
> -increasing the max usable memory (max-mem-size) - (no change)
> -increasing the size of the pointer protection stack (max-ppsize) - (no
> change)
> -deleting the unnecessary leftover matrices - (no change)
> -I can't use lines() instead of plot() because of the very  different scales
> (rpm-1, flags -1to3)
>
> I am going to do some resampling of the logged data to reduce the vector
> sizes.
> (removal of *less* important data and use of window.ts())
>
> But I am currently running out of ideas...
> So if somebody could point out something, I would be grateful.
>
> Thanks,
>
> Jonathan Gabris
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] replace non numeric with "NA"

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 6:45 AM, Nandini B wrote:

  Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7


Now i wish to filter all the non numeric values and replace it with "NA". The data frame 
is actually huge and the non numeric characters vary from "-" to a string to absolutely 
anything!!!
Can anyone please help ?


You don't tell us the types of the columns, so I'll assume they are 
factors.  If so, call


as.numeric(as.character())

on each of them to convert the number-like values to numbers, the others 
to NA.  For example,


df$day <- as.numeric(as.character(df$day))
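For a large data frame the same conversion can be applied to every column in one step. A sketch on a tiny made-up data frame; the helper name to_numeric is hypothetical, and suppressWarnings() is used only to silence the expected "NAs introduced by coercion" message:

```r
# Convert anything number-like to numeric; everything else becomes NA.
to_numeric <- function(col) suppressWarnings(as.numeric(as.character(col)))

df <- data.frame(day = c("1", "3", "5", "7"),
                 od  = c("0.1", "#VALUE!", "0.4", "-"),
                 stringsAsFactors = TRUE)   # factor columns, as assumed above

df[] <- lapply(df, to_numeric)              # df[] keeps the data frame structure
```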

Duncan Murdoch



Re: [R] 3-way contingency table

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:


Hi,

I have large data frame with many columns. A short example is given  
below:



dataH

     host ms01 ms31 ms33 ms34
1  cattle    4   20    9    6
2   sheep    4    3    4    5
3  cattle    4    3    4    5
4  cattle    4    3    4    5
5   sheep    4    3    5    5
6    goat    4    3    4    5
7   sheep    4    3    5    5
8    goat    4    3    4    5
9    goat    4    3    4    5
10 cattle    4    3    4    5

Now I want to determine the frequencies of every unique value in
every column depending on the host column.

It is quite easy to determine the frequencies in total with the
following command:


dataH2 <- dataH[,c(2,3,4,5)]
table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)],  
useNA="ifany")


     ms01 ms31 ms33 ms34
  3     0    9    0    0
  4    10    0    7    0
  5     0    0    2    9
  6     0    0    0    1
  9     0    0    1    0
  20    0    1    0    0

But I cannot manage to get it dependent on the host.

I tried


xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)


and many other ways, but I was not successful.

I can get it for each column individually with


with(dataH, table(host, ms33))


        ms33
host     4 5 9
  cattle 3 0 1
  deer   0 0 0
  goat   3 0 0
  human  0 0 0
  sheep  1 2 0
  tick   0 0 0

But I do not want to repeat the command for every column. I need a
single table which can be plotted as a balloon plot, for instance.


You have obviously not given us the full data from which your "correct  
answer" was drawn, but see if this is going  the right direction:


require(reshape)
> dataHm <- melt(dataH)
Using host as id variables
> xtabs(~host+value+variable, dataHm)
, , variable = ms01

value
host 3 4 5 6 9 20
  cattle 0 4 0 0 0  0
  goat   0 3 0 0 0  0
  sheep  0 3 0 0 0  0

, , variable = ms31

value
host 3 4 5 6 9 20
  cattle 3 0 0 0 0  1
  goat   3 0 0 0 0  0
  sheep  3 0 0 0 0  0

, , variable = ms33

value
host 3 4 5 6 9 20
  cattle 0 3 0 0 1  0
  goat   0 3 0 0 0  0
  sheep  0 1 2 0 0  0

, , variable = ms34

value
host 3 4 5 6 9 20
  cattle 0 0 3 1 0  0
  goat   0 0 3 0 0  0
  sheep  0 0 3 0 0  0
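The same three-way table can be built without the reshape package by using stack() from base R. This is a sketch under assumptions: the data frame below is my reading of the ten example rows posted above, and ftable() is just one way to flatten the result into a single two-dimensional table.

```r
# Ten example rows as read from the post (an assumption about the garbled table)
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, 3, 3, 3, 3, 3, 3, 3, 3, 3),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, 5, 5, 5, 5, 5, 5, 5, 5, 5)
)

# Long format: one row per (host, marker, value); stack() names the columns values/ind
long <- cbind(host = dataH$host, stack(dataH[-1]))
tab  <- xtabs(~ host + values + ind, data = long)    # host x value x marker
flat <- ftable(tab, row.vars = c("ind", "host"))     # single flat table
```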



Does anybody know how to achieve this?

--
Kind regards,
Mathias



David Winsemius, MD
West Hartford, CT



Re: [R] replace non numeric with "NA"

2011-04-29 Thread James W. MacDonald

Hi Nandini,

On 4/29/2011 6:45 AM, Nandini B wrote:


  Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7



> x <- data.frame(day=1:8, od = 
c(0.1,"#VALUE!",0.4,0.8,"-","s","--",19), month = c(2,1,12,10,3,7,12,7))

> x
  day      od month
1   1     0.1     2
2   2 #VALUE!     1
3   3     0.4    12
4   4     0.8    10
5   5       -     3
6   6       s     7
7   7      --    12
8   8      19     7
> x$od <- as.numeric(as.character(x$od))
Warning message:
NAs introduced by coercion
> x
  day   od month
1   1  0.1     2
2   2   NA     1
3   3  0.4    12
4   4  0.8    10
5   5   NA     3
6   6   NA     7
7   7   NA    12
8   8 19.0     7


Best,

Jim




Now i wish to filter all the non numeric values and replace it with "NA". The data frame 
is actually huge and the non numeric characters vary from "-" to a string to absolutely 
anything!!!
Can anyone please help ?




Thank you,
Warm Regards,

Nandini





--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 




[R] threshold matrix

2011-04-29 Thread Alaios
Dear all,
I have a quite big matrix which I would like to threshold.
If the value is below threshold the cell should be zero
and 
if the value is over threshold the cell should be one

One really simple way to do that is to have a nested loop and check cell by 
cell.

The problem is that this seems to be really time consuming and inefficient.

What do you suggest me to try out?

I would like to thank you in advance for your help


Best Regards
Alex



Re: [R] Questions about lrm, validate, pentrace

2011-04-29 Thread Frank Harrell
Yes I would select that as the final model.  The difference you saw is caused
by different treatment of penalization of factor variables, related to the
use of the sum of squared differences between the estimate at one category and
the average over all categories.  I think that as long as you code it one
way consistently and pick the penalty using that coding you are OK.  But if
the coefficients of the non-factor variables depend on how the binary
predictor is coded, there is a bit more concern.

Frank


細田弘吉 wrote:
> 
> Thank you for your quick reply, Prof. Harrell.
> According to your advice, I ran pentrace using a very wide range.
> 
>  > pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
>  > plot(pentrace.x6factor)
> 
> I attached this figure. Then,
> 
>  > pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 10, by=0.05))
> 
> It seems reasonable that the best penalty is 2.55.
> 
>  > x6factor.lrm.pen <- update(x6factor.lrm, penalty=2.55)
>  > cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen), 
> abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))
>                      [,1]        [,2]        [,3]
> Intercept     -4.32434556 -3.86816460 0.456180958
> stenosis      -0.01496757 -0.01091755 0.004050025
> T1             3.04248257  2.42443034 0.618052225
> T2            -0.75335619 -0.57194342 0.181412767
> procedure     -1.20847252 -0.82589263 0.382579892
> ClinicalScore  0.37623189  0.30524628 0.070985611
> 
>  > validate(x6factor.lrm, bw=F, B=200)
>           index.orig training    test optimism index.corrected   n
> Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
> R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
> Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
> Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
> Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
> D             0.2716   0.3229  0.2339   0.0890          0.1826 200
> U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
> Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
> B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
> g             1.6328   1.9879  1.4940   0.4939          1.1389 200
> gp            0.2367   0.2502  0.2216   0.0286          0.2080 200
> 
> 
>  > validate(x6factor.lrm.pen, bw=F, B=200)
>           index.orig training    test optimism index.corrected   n
> Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
> R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
> Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
> Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
> Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
> D             0.2612   0.2571  0.2370   0.0201          0.2411 200
> U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
> Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
> B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
> g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
> gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200
> 
> In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64, and 
> bias-corrected Dxy is 0.55. The maximum absolute error is estimated to 
> be 0.034, smaller than non-penalized model (0.0915 in x6factor.lrm) The 
> changes in slope and intercept are substantially reduced in penalized 
> model. I think overfitting is improved at least to some extent. Should I 
> select this as a final model?
> 
> I have one more question. The "procedure" variable was defined as 0/1 
> value in the previous mail. For some graphical reason, I redefined it as 
> treat1/treat2 value. Then, the best penalty value was changed from 3.05 
> to 2.55. I guess the change from numeric to factor caused this reduction 
> in penalty. Which set up should I select?
> 
> I appreciate your help in advance.
> 
> -- 
> KH
> 
> (11/04/26 0:21), Frank Harrell wrote:
>> You've done a lot of good work on this.  Yes I would say you have moderate
>> overfitting with the first model.  The only thing that saved you from having
>> severe overfitting is that there seems to be a signal present [I am assuming
>> this model is truly pre-specified and was not developed at all by looking at
>> patterns of responses Y.]
>>
>> The use of backwards stepdown demonstrated much worse overfitting.  This is
>> in line with what we know about the damage of stepwise selection methods
>> that do not incorporate shrinkage.  I would throw away the stepwise
>> regression model.  You'll find that the model selected is entirely
>> arbitrary.  And you can't use the "selected" variables in any re-fit of the
>> model, i.e., you can't use lrm pretending that the two remaining variables
>> were pre-specified.  Stepwise regression methods only seem to help.  When
>> assessed properly we see that is an illusion.
>>
>> Y

Re: [R] How to define specially nested functions

2011-04-29 Thread Chee Chen
Hi, Jerome and Phil,
Thank you for your solutions. I have studied your code carefully, but I have 
further questions, since I suspect these simple lines of code may not do the real 
job I am about to describe. (Please forgive me for my shallowness!)

I guess I over-simplified my question: basically I need such a function as the 
integrand for estimating an expectation by Monte Carlo methods.

Please allow me to state the problem in more details:

I have to define a function for Monte Carlo computation of conditional 
expectation and solve for the argument for which the expectation equals a 
pre-specified value. Say, the integrand function is f(x,y,z), where x, z are 
deterministic, y probabilistic and follows a distribution F.
I will have to feed x=x0 to f, then I sample from F for y and evaluate 
f(x0,y,z), and use Monte Carlo method to get the expectation, which gives a 
function of z; now that the expectation is a function of z only, say, E(z); 
finally to solve for z such that E(z) = 0.5, for example.
The function f itself is very complicated and has high dimensional vectors as 
arguments except z, which is a real number. 
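The workflow described here does not need symbolic computation: a closure can fix x0 and a set of draws of y, giving an ordinary R function E(z) that uniroot() can solve numerically. The sketch below is only an illustration under stated assumptions: f is a made-up stand-in for the real integrand, F is taken to be standard normal, and the helper name make_E is hypothetical. Reusing the same draws for every z makes E(z) a deterministic function, which is what a root finder needs.

```r
# Build E(z) = average of f(x0, y, z) over fixed Monte Carlo draws of y.
# make_E is a hypothetical helper name.
make_E <- function(f, x0, y_draws) {
  function(z) mean(vapply(y_draws, function(y) f(x0, y, z), numeric(1)))
}

f <- function(x, y, z) plogis(x + y - z)   # stand-in integrand, assumed for illustration

set.seed(42)
y_draws <- rnorm(5000)                     # draws from F (assumed standard normal)
E <- make_E(f, x0 = 1, y_draws)

# Solve E(z) = 0.5 for z on an interval where E(z) - 0.5 changes sign
root <- uniroot(function(z) E(z) - 0.5, interval = c(-10, 10))$root
```

For a vector-valued x or y the same pattern applies; only the body of f and the way the draws are stored (for example as rows of a matrix) would change.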

I am new to R and unexpectedly ran into this apparent lack of symbolic capability 
just as I had almost finished programming all major computations in R. I am skilled 
in Matlab and Mathematica (where this is very easy to do), but as I am 
now in statistics I would like to continue in R unless it really is not able to 
do it (in which case I will have to recode in Mathematica).

Any of your further help is much appreciated!
Best regards,
-Chee

 
From: Jerome Asselin 
Sent: Friday, April 29, 2011 12:25 AM
To: Chee Chen 
Cc: R -Help 
Subject: Re: [R] How to define specially nested functions


On Thu, 2011-04-28 at 23:08 -0400, Chee Chen wrote:
> Dear All,
> I would like to define a function: f(x,y,z) with three arguments x,y,z, such 
> that: given values for x,y,  f(x,y,z) is still a function of z and that I am 
> still allowed to find the root in terms of z when x,y are given.
> For example: f(x,y,z) =  x+y + (x^2-z),  given x=1,y=3, f(1,3,z)= 1+3+1-z is 
> a function of z, and then I can use R to find the root z=5.
> 
> Thank you.
> -Chee

Interesting exercise.

I've got this function, which I think does what you're asking.

f <- function(x,y,z)
{
    fcall <- match.call()
    fargs <- NULL
    if(fcall$x == "x")
        fargs <- c(fargs, "x")
    if(fcall$y == "y")
        fargs <- c(fargs, "y")
    if(fcall$z == "z")
        fargs <- c(fargs, "z")

    ffunargs <- as.list(fargs)
    names(ffunargs) <- fargs

    argslist <- list(fcall)
    ffun <- append(argslist, substitute( x+y + (x^2-z) ), after=0)[[1]]
    as.function(append(ffunargs, ffun))
}

This yields.

> f(3, 2, z)
function (z = "z") 
3 + 2 + (3^2 - z)

> f(3, 2, z)(3)
[1] 11

I haven't figured out how to get rid of the default argument value shown
here as 'z = "z"'. That doesn't prevent it to work, but it's less
pretty.  If you find a better way, let me know.
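One simpler alternative, offered as a sketch rather than a replacement for the function above, is a closure: a function of x and y that returns a function of z. It avoids the default-argument issue altogether. The names make_f and g are chosen here for illustration only.

```r
# f(x, y, z) = x + y + (x^2 - z), with x and y fixed first
make_f <- function(x, y) function(z) x + y + (x^2 - z)

g <- make_f(1, 3)                                    # g(z) = 1 + 3 + 1 - z
root <- uniroot(g, interval = c(-100, 100), tol = 1e-9)$root
```

With x = 1 and y = 3 this recovers the root z = 5 from the original question.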

HTH,
Jerome





Re: [R] Problem installing package "sp" in R 2.13.0

2011-04-29 Thread Roger Bivand
The rgdal package is not a dependency of sp, only suggested. In addition, you
are trying to install source packages, but should (probably) be installing
binaries, with type="mac.binary.leopard" the most likely. If you need the
OSX rgdal binary, make sure that CRAN extras is on your repository path -
see ?setRepositories, for example by 
setRepositories(ind=1:2). If you really want to install source packages
under OSX, be sure to read up on this on 

http://cran.r-project.org/bin/macosx/

looking for the FAQ, and links to tools. If you can manage with binary
packages, stay with them.

Roger


Arnaud Catherine wrote:
> 
> Hi,
> 
> I am having troubles trying to install package "sp" in R (2.13.0) on mac
> OSX.
> I have tried installing the package using GUi or function install.packages
> but it didn't work.
> 
> Here is the error message I get:
> 
> 
> also installing the dependency ‘rgdal’
> 
> trying URL 'http://cran.univ-lyon1.fr/src/contrib/rgdal_0.6-33.tar.gz'
> Content type 'application/x-gzip' length 1422992 bytes (1.4 Mb)
> opened URL
> ==
> downloaded 1.4 Mb
> 
> trying URL 'http://cran.univ-lyon1.fr/src/contrib/sp_0.9-80.tar.gz'
> Content type 'application/x-gzip' length 738569 bytes (721 Kb)
> opened URL
> ==
> downloaded 721 Kb
> 
> * installing *source* package ‘sp’ ...
> ** libs
> *** arch - i386
> sh: make: command not found
> ERROR: compilation failed for package ‘sp’
> * removing
> ‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library/sp’
> ERROR: dependency ‘sp’ is not available for package ‘rgdal’
> 
> The downloaded packages are in
> 
> ‘/private/var/folders/8P/8P9oV0FHFI83GKIm2cPUOk+++TM/-Tmp-/RtmppsxaRa/downloaded_packages’
> * removing
> ‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library/rgdal’
> 
> 
> 
> Any help would be much appreciated!
> 
> 
> Best regards.
> 
> 
> 
> 
> 
> Dr. Arnaud CATHERINE
> Post-Doctorant
> 
> UMR 7245 CNRS/MNHN "Molécules de Communication et Adaptation des
> Micro-organismes"
> Equipe "Cyanobactéries, Cyanotoxines et Environnement"
> Muséum National d'Histoire Naturelle
> 12, rue Buffon , Case 39
> 75231 Paris Cedex 05
> 
> Tel : + 33 (0)1 40 79 31 79
> Fax : +33 (0)1 40 79 35 94
> Email : arno...@mnhn.fr
> Site du Muséum National d'Histoire Naturelle : http://www.mnhn.fr
> 
> 
> 
> 
> 
> 
> 
> 
> 


-
Roger Bivand
Economic Geography Section
Department of Economics
Norwegian School of Economics and Business Administration
Helleveien 30
N-5045 Bergen, Norway

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-installing-package-sp-in-R-2-13-0-tp3481107p3483392.html
Sent from the R help mailing list archive at Nabble.com.



[R] replace non numeric with "NA"

2011-04-29 Thread Nandini B

 Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7


Now i wish to filter all the non numeric values and replace it with "NA". The 
data frame is actually huge and the non numeric characters vary from "-" to a 
string to absolutely anything!!!
Can anyone please help ?




Thank you,
Warm Regards,

Nandini 


  


[R] Plot multiple ctrees in the same figure

2011-04-29 Thread tudor
Dear all:

Is there a way one could plot two conditional inference trees (party
package, ctree) in a figure specified by layout?  My attempts failed as
plot.party seemed to take over the layout functionality and forced a single
ctree plot to be displayed.  A brief (non reproducible) example together
with the intended behavior follows below.  I hope I am not missing something
obvious.  My system: R 2.12.2 on a Windows machine with party 0.9-1 and
partykit 0.1-0.

Thanks.

Tudor


# CREATE ctrees
...
layout(matrix(c(1,2,0,2), 2, 2, byrow=TRUE), widths=c(1,2), heights=c(1,2))
plot(ctree1)# plot first ctree
plot(ctree2)# plot second ctree
...

   

--
View this message in context: 
http://r.789695.n4.nabble.com/Plot-multiple-ctrees-in-the-same-figure-tp3483231p3483231.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Change the text size of the title in a legend of a R plot.

2011-04-29 Thread Victor Gabillon

thanks everyone for the help.

I ended up copying and pasting the legend function from the R source files.
I changed it so that the title.cex is not set by default to cex and so 
that this title.cex can be given as a parameter.


It works fine for me.
Note that if you make the title too big it goes out of the border as the 
borders were not designed for the case of a big title.


Thanks again!!

Victor

On 29/04/2011 10:03, Jannis wrote:

On 04/29/2011 05:21 AM, Victor Gabillon wrote:

Horizo <- c(1,2,6,10,20)
legtext <- paste(Horizo,sep="")
legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd,
lty=lty,cex=1.1,ncol=3,title = "Horizons",title.col 
="black",title.cex=1.4) 


I am not sure, but the manual regarding legend seems to be incorrect 
(or at least misleading). There is no title.cex argument for legend 
(even though the help page mentions it). Either you set cex > 1, but 
this will resize the labels as well, or you modify the code of legend 
as follows:


change the following (near the end of the code):

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = cex, col = title.col)

to:

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = title.cex, col = title.col)

and add title.cex to the arguments of legend. It's probably easiest if 
you copy the code of legend and save the modified version as a 
different function.


I am not sure whom to contact about correcting the documentation of 
legend(). Perhaps I am wrong, but I could not find any reference 
to title.cex in the code.


HTH
Jannis




[R] question of VECM restricted regression

2011-04-29 Thread Meilan Yan
Dear Colleague

  I am trying to figure out how to use R to do OLS restricted VECM regression. 
However, there is some notation I cannot understand.

Please tell me what 'ect', 'sd' and 'LRM.dl1' are in the following example:

#OLS restricted VECM regression
data(denmark)
sjd <- denmark[, c("LRM", "LRY", "IBO", "IDE")]
sjd.vecm<- ca.jo(sjd, ecdet = "const", type="eigen", K=2, spec="longrun",
season=4)
sjd.vecm.rls<-cajorls(sjd.vecm,r=1)
summary(sjd.vecm.rls$rlm)
sjd.vecm.rls$beta

Response LRM.d :
Call:
lm(formula = substitute(LRM.d), data = data.mat)

Residuals:
      Min        1Q    Median        3Q       Max
-0.027598 -0.012836 -0.003395  0.015523  0.056034

Coefficients:
         Estimate Std. Error t value Pr(>|t|)
ect1    -0.212955   0.064354  -3.309  0.00185 **
sd1     -0.057653   0.010269  -5.614 1.16e-06 ***
sd2     -0.016305   0.009177  -1.777  0.08238 .
sd3     -0.040859   0.008767  -4.660 2.82e-05 ***
LRM.dl1  0.049816   0.191992   0.259  0.79646
LRY.dl1  0.075717   0.157902   0.480  0.63389
IBO.dl1 -1.148954   0.372745  -3.082  0.00350 **
IDE.dl1  0.227094   0.546271   0.416  0.67959

> sjd.vecm.rls$beta
            ect1
LRM.l2  1.00
LRY.l2 -1.032949
IBO.l2  5.206919
IDE.l2 -4.215879


Many thanks
Meilan







[R] 3-way contingency table

2011-04-29 Thread Mathias Walter
Hi,

I have a large data frame with many columns. A short example is given below:

> dataH
     host ms01 ms31 ms33 ms34
1  cattle    4   20    9    6
2   sheep    4    3    4    5
3  cattle    4    3    4    5
4  cattle    4    3    4    5
5   sheep    4    3    5    5
6    goat    4    3    4    5
7   sheep    4    3    5    5
8    goat    4    3    4    5
9    goat    4    3    4    5
10 cattle    4    3    4    5

Now I want to determine the frequencies of every unique value in
every column depending on the host column.

It is quite easy to determine the frequencies in total with the
following command:

> dataH2 <- dataH[,c(2,3,4,5)]
> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")

     ms01 ms31 ms33 ms34
  3     0    9    0    0
  4    10    0    7    0
  5     0    0    2    9
  6     0    0    0    1
  9     0    0    1    0
  20    0    1    0    0

But I cannot manage to get it dependent on the host.

I tried

> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)

and many other ways, but I was not successful.

I can get it for each column individually with

> with(dataH, table(host, ms33))

   ms33
host 4 5 9
 cattle 3 0 1
 deer   0 0 0
 goat   3 0 0
 human  0 0 0
 sheep  1 2 0
 tick   0 0 0

But I do not want to repeat the command for every column. I need a
single table which can be plotted as a balloon plot, for instance.

Does anybody know how to achieve this?
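
One way this might be done, sketched in base R: stack the marker columns into long format and cross-tabulate host, marker and value in a single call. The data frame is retyped from the example above; the names long, marker and tab3 are illustrative only.

```r
# Example data retyped from the question (10 rows, 4 marker columns).
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

# stack() puts all marker values in one column ("values") and the
# column names in another ("ind"), column after column, so the host
# vector has to be repeated once per marker column.
long <- data.frame(host = rep(dataH$host, times = ncol(dataH) - 1),
                   stack(dataH[, -1]))

# Three-way contingency table: host x marker x value.
tab3 <- table(host = long$host, marker = long$ind, value = long$values)

# Flattened view with host and marker as row variables.
ftable(tab3, row.vars = c("host", "marker"))
```

The resulting table object (or its ftable form) could then be handed to a plotting function; whether a particular balloon plot function accepts it directly is not checked here.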

--
Kind regards,
Mathias



[R] abline outside of plot region

2011-04-29 Thread Nick Sabbe
Hi R people.

 

I ran into this problem: I created a plot with errbars, like this:

> errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),
yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:

> abline(v=2)

 

In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?

 

As an addendum: I also want to add a few specific axis ticks besides the
standard ones in my graph. I used axis for this, and it works. I set
col.ticks to match the color of my abline (in the nonsimplified code), and
this works too, but unfortunately, the label below the tick is not in this
color, and a parameter for this is not present in axis.

 

Suggestions for either? Note: I'm on windows 7 with R 2.13.
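
A minimal sketch of both points, not a diagnosis of the original session. Clipping of abline() is governed by the graphical parameter xpd, and axis() passes col.axis on to the tick labels. Plain plot() stands in for errbar() here, and the colour and the extra tick position are made up for illustration.

```r
pdf(NULL)                    # off-screen device so the sketch runs anywhere

old <- par(xpd = FALSE)      # FALSE clips drawing to the plot region
plot(1:4, c(2, 1, 3, 3), ylim = c(0, 4), xlab = "x", ylab = "y")
abline(v = 2, col = "red")   # stays inside the plot region when xpd is FALSE

# One extra tick whose tick mark and label are both red:
# col.ticks colours the tick mark, col.axis colours the label.
axis(1, at = 2.5, labels = "2.5", col.ticks = "red", col.axis = "red")

xpd_now <- par("xpd")        # query the current setting
par(old)                     # restore the previous value
invisible(dev.off())
```

Checking par("xpd") in the misbehaving session before resetting it should show whether that parameter is really the culprit.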

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:   http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




Re: [R] Element by Element addition of the columns of a Matrix

2011-04-29 Thread Pete Brecknock
... is the apply function what you are looking for?

A=matrix(1,2,4)

apply(A,1,sum)
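
For the specific case of sums, the apply() call above can also be written with rowSums(); a small sketch comparing the two on the same matrix (the variable names are illustrative):

```r
A <- matrix(1, 2, 4)

by_apply   <- apply(A, 1, sum)  # sum across the columns, one result per row
by_rowsums <- rowSums(A)        # same result, vectorised

by_rowsums
```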

HTH

Pete




--
View this message in context: 
http://r.789695.n4.nabble.com/Element-by-Element-addition-of-the-columns-of-a-Matrix-tp3483545p3483628.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Nomograms from rms' fastbw output objects

2011-04-29 Thread Frank Harrell
Hi Rob,

fastbw does not try to produce a full fit object.  You have to re-run the
fit manually based on what you (sometimes dangerously) learn from fastbw. 
If I can find a way to add a 'formula' component to the fastbw result then
you could do something like lrm(fastbw(fit)$formula, ...).

Frank


Rob James wrote:
> 
> There is both a technical and a theoretical element to my question... 
> Should I be able to use the outputs which arise from the fastbw function 
> as inputs to nomogram().  I seem to be failing at this, -- I obtain a 
> subscript out of range error.
> 
> That I can't do this may speak to technical failings, but I suspect it 
> is because Prof Harrell thinks/knows it injudicious. However,  I can't 
> invent a reason why nomograms should be restricted to the full models, 
> if the purpose of fastbw is to generate parsimonious models with 
> appropriate standard errors.
> 
> I'd welcome comments on either the technical or the theoretical issues.
> 
> Many thanks in advance,
> 
> Rob James
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Nomograms-from-rms-fastbw-output-objects-tp3482669p3483607.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread Jim Lemon

On 04/29/2011 08:35 PM, hck wrote:

Dear all

Problem: hist()-function, scale = “percent”

I want to generate histograms for changing underlying data. In order to make
them comparable, I want to fix the y-axis (vertical-axis) to, e.g., 0%, 10%,
20%, 30% as well as to fix the spaces, too. So the y-axis in each histogram
should be identical. Currently, I have 100 histograms and the y-axis scales
changes in each.

Here is my code:

="Hist(na.exclude("&AA3&"), breaks=50, col=""seashell3"",
scale=""percent"",xlim=c(-1, 1), xlab=""Bewertungsfehler"",
ylab=""Haeufigkeit (in %)"", main=""KBV"", border=""white"")"

I tried the ylim=c(…), but unfortunately it does not work.


Hi Hans,
The "barp" function in plotrix can plot histograms (see the last example 
on the help page) and may be flexible enough to do what you want.


Jim



Re: [R] matrix evaluation using if function

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 4:27 AM, ivan wrote:


Hi All,

I am trying to create a function which evaluates whether the values (which
are equal to one) of a matrix are the same as their mirror values. Consider
the following matrix:


n<-matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
colnames(n)<-cbind("A","B","C");rownames(n)<-cbind("A","B","C")
n

 A B C
A 0 1 0
B 1 0 1
C 1 0 0

Hence, since n[2,1] and n[1,2] are 1 and the same, the function should
return the name of the row of n[2,1]. I used the following function:

for (i in length(rownames(n))) {

for (j in length(colnames(n))){

if(n[i,j]==n[j,i]){

rownames(n)[[i]]->output} else {}

}

}


output

NULL

The right answer would have been "B", though.


Can you explain why "A" would not be an equally good answer to satisfy  
your problem set up?


> which(n == t(n) & col(n) != row(n) , arr.ind=TRUE)
  row col
B   2   1
A   1   2
> rownames(which(n == t(n) & col(n) != row(n) , arr.ind=TRUE) )
[1] "B" "A"

# Which would seem to be the correct answer, but
# this adds an additional constraint and also ensures no diagonal elements

> rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n), arr.ind=TRUE) )

[1] "B"





I simply do not see my
mistake.


I would rather program a problem correctly than hash through errors in  
loop logic.
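
For reference, a sketch of how the quoted loop could be repaired. In the original, length(rownames(n)) is a single number, so each for loop visits only the last index (3) and the comparison n[3,3]==n[3,3] is the only one made. The version below iterates with seq_len(), looks at the lower triangle only, and also requires both mirrored entries to equal 1, which is an interpretation of the stated goal rather than something the original code enforced.

```r
n <- matrix(c(0, 1, 1,  1, 0, 0,  0, 1, 0), 3, 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

output <- character(0)
for (i in seq_len(nrow(n))) {
  for (j in seq_len(i - 1)) {            # j < i: lower triangle, no diagonal
    if (n[i, j] == 1 && n[j, i] == 1) {  # both mirrored entries are 1
      output <- c(output, rownames(n)[i])
    }
  }
}
output
```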

--

David Winsemius, MD
West Hartford, CT



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Michael Bach
Kenn Konstabel  writes:

> Another way (not elegant but better and shorter than the eval-parse
> way) is to use get. ?get

This one is handy for interactive use, thanks for the hint.

Kind Regards,
Michael Bach



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Michael Bach
"Nick Sabbe"  writes:

> ObjectsOfInterest<- list(one_df, two_df, three_df)
> for(namedf in ObjectsOfInterest){...}

I see. This is also more readable and traceable for others.

> or probably even better
> sapply(ObjectsOfInterest, function(namedf){...})

I like this one for its functional style.

> hth.

It did, thanks.

Kind Regards,
Michael Bach



Re: [R] Putting x-axis in opposite order

2011-04-29 Thread Jim Lemon

On 04/29/2011 04:09 AM, Bogaso Christofer wrote:

Hi all, please consider this plot:



xx<- seq(4, 0.01, by = -0.04)

yy<- rnorm(xx)

plot(xx, yy, type="l")



Here you see my original 'xx' was in decreasing order; however, R puts it in
increasing order. I understand that in any plot the x and y axes grow in
increasing order, but I am wondering whether I can manipulate this to
suit my particular problem, so that the numbers displayed on the x-axis would
be in the given order.


Hi Bogaso,
If all else fails, have a look at rev.axis in the plotrix package.
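
Before reaching for a package, it may be enough to hand plot() a reversed xlim; base graphics accept limits given from high to low. A small sketch using the data from the question (set.seed and the off-screen device are added only so it runs reproducibly anywhere):

```r
xx <- seq(4, 0.01, by = -0.04)
set.seed(42)
yy <- rnorm(length(xx))

pdf(NULL)                                  # off-screen device
# rev(range(xx)) gives c(max, min), so the x-axis runs from high to low.
plot(xx, yy, type = "l", xlim = rev(range(xx)))
usr <- par("usr")                          # user coordinates of the plot region
invisible(dev.off())
```

After this call the left edge of the plot region has the larger x coordinate, which is what reverses the axis.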

Jim



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread hck
Thanks for the note: Indeed, the function is the hist() function not Hist()
with capital letter.

I use the standard R hist() function with the lower case only. Nevertheless,
ylim does not work as it is supposed to. 

--
View this message in context: 
http://r.789695.n4.nabble.com/plot-several-histograms-with-same-y-axes-scaling-using-hist-tp3483376p3483479.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread Philipp Pagel

On Fri, Apr 29, 2011 at 03:35:41AM -0700, hck wrote:
> Problem: hist()-function, scale = “percent”
[...]
> ="Hist(na.exclude("&AA3&"), breaks=50, col=""seashell3"",
> scale=""percent"",xlim=c(-1, 1), xlab=""Bewertungsfehler"",
> ylab=""Haeufigkeit (in %)"", main=""KBV"", border=""white"")"

Before anyone can really help you'll need to let us know where your
Hist() function came from. 

hist() from package graphics does not have a scale parameter and
honours ylim without a problem.
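
To illustrate, here is a sketch of one way to get percentage histograms with an identical y-axis using only hist() from graphics: compute the histogram without plotting, rescale the counts to percentages, then plot with a fixed ylim. The data are simulated and the 0 to 30 range is an arbitrary choice for the example; the labels are taken from the original post.

```r
set.seed(1)
x <- runif(200, -1, 1)                 # stand-in for one of the data sets

h <- hist(x, breaks = 50, plot = FALSE)
h$counts <- 100 * h$counts / sum(h$counts)   # counts become percentages

pdf(NULL)                              # off-screen device
plot(h, ylim = c(0, 30), xlim = c(-1, 1),
     col = "seashell3", border = "white",
     xlab = "Bewertungsfehler", ylab = "Haeufigkeit (in %)", main = "KBV")
invisible(dev.off())
```

Using the same ylim in every call keeps the vertical axis identical across all histograms.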

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



[R] Summer student internship placement at University of York / YCCSA / SEI (paid)

2011-04-29 Thread Corrado Topi
Dear R-lings,

I did not know which list to post to: it is a studentship, so not 
really a job, and so it did not fit the r-sig-jobs list, and it is about 
developing an extension package interfaced with R. I hope I did not upset 
anyone. If so, apologies.

The Centre for Complex Systems Analysis at the University of York (YCCSA) in 
the UK, in collaboration with the Stockholm Environment Institute, is looking 
for a highly motivated student in Computer Science, Applied Mathematics, 
Applied Statistics or related fields for a 10-week paid student internship 
over summer 2011, starting in July, to collaborate in the development of an R 
package. The student will participate in research projects to develop 
prototypes for toolkits for statistical predictions of diversity and 
dissimilarity and the generation of spatial landscapes, with applications in 
the biological and environmental sciences. We require excellent development 
skills and experience in CUDA/OpenCL, and a strong foundation in Computing, 
Statistics / Applied Mathematics and Computer Graphics. We need an excellent 
problem solver, able to innovate, find solutions and work independently.

For further information on the project please contact ct...@york.ac.uk or go 
to http://www.york.ac.u...2011/201107.pdf

For further information on the studentship programme please look at 
http://www.york.ac.u...olarships.html.

Please send your application no later than 13 May to 
scholarsh...@yccsa.org as one single pdf document including:

1. Your CV (max 2 pages)
2. A brief personal statement (max 1 page) including:
* Which project(s) you are interested in (as many as you like but in 
preference order)
* Your reasons for applying
* Your academic interest
* Your future aspirations
3. A full written academic reference (not just contact details). Your 
application will not be accepted without this reference (max 1 page). 

Best,
-- 
Corrado Topi

Stockholm Environment Institute

Mob: +44 (0) 7769 601784
Tel: +44 (0) 1904 322893
Skype: corrado-eeos
Website:  sei-international.org

University of York
York YO10 5DD
UK

Fax: +44 (0) 1904 322898

EMAIL DISCLAIMER: http://www.york.ac.uk/docs/disclaimer/email.htm



[R] is there a way/library for generating colorful noise in R ??

2011-04-29 Thread Ubuntu Diego
I would like to generate some noisy time series. I know that it is possible to 
"classify" noise by looking at the exponent (beta) of the relationship between 
the spectrum of the time series and the frequencies (i.e. spectrum ~ frequency 
^ beta ).  
Is there a way to generate white (beta=0), pink (beta=-1), brown (beta=-2), 
blue (beta=1) and violet (beta=2) noise in R?
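
A rough base-R sketch of one common technique, spectral shaping: take the FFT of white noise, scale each frequency's amplitude by f^(beta/2) so that power goes as f^beta, and transform back. The function name colored_noise and its arguments are made up for this illustration, and the sign convention for beta follows the question.

```r
# Hypothetical helper: n must be even and at least 4.
colored_noise <- function(n, beta) {
  stopifnot(n >= 4, n %% 2 == 0)
  spec <- fft(rnorm(n))                 # spectrum of white noise
  k <- c(0:(n / 2), (n / 2 - 1):1)      # symmetric frequency index, length n
  amp <- c(0, k[-1]^(beta / 2))         # amplitude ~ f^(beta/2); DC set to 0
  # The scaling is symmetric in frequency, so the inverse transform is
  # real up to rounding error; Re() drops the negligible imaginary part.
  x <- Re(fft(spec * amp, inverse = TRUE)) / n
  x / sd(x)                             # rescale to unit standard deviation
}

set.seed(1)
brown <- colored_noise(1024, beta = -2)
pink  <- colored_noise(1024, beta = -1)
blue  <- colored_noise(1024, beta = 1)
```

The result can be checked by plotting spectrum(brown) on log-log axes and comparing the slope with the requested beta.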

Thanks.



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Nick Sabbe
Hi Michael.
This is a classic :-)

ObjectsOfInterest<- list(one_df, two_df, three_df)
for(namedf in ObjectsOfInterest){...}

or probably even better
sapply(ObjectsOfInterest, function(namedf){...})
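
If the objects really have to be looked up by name, mget() fetches them into a named list without eval(parse()). A small sketch with made-up data frames standing in for one_df, two_df and three_df:

```r
# Hypothetical stand-ins for the data frames created earlier in the script.
one_df   <- data.frame(x = 1:2)
two_df   <- data.frame(x = 1:3)
three_df <- data.frame(x = 1:4)

namevec <- c("one", "two", "three")

# mget() looks up several objects by name and returns a named list.
dfs <- mget(paste(namevec, "_df", sep = ""))

# Apply a function to each data frame; nrow() is just a placeholder.
row_counts <- sapply(dfs, nrow)
row_counts
```

Keeping the data frames in a named list from the start avoids the lookup altogether.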

hth.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Michael Bach
Sent: Friday 29 April 2011 12:03
To: r-help@r-project.org
Subject: [R] Reference variables by string in for loop

Dear R Users,

I am trying to get the following to work better:

namevec <- c("one", "two", "three")
for (name in namevec) {
namedf <- eval(parse(text=paste(name, "_df", sep="")))
...
...
}

The rationale behind it being that I created variables with names
one_df, two_df and three_df earlier in the same script which I want to
reference inside the for loop.  Is there a more elegant way to do this?

Best Regards,
Michael Bach


