Re: [R] ODD and EVEN numbers

2010-04-01 Thread Detlef Steuer

Just to give you a hint for the future:

If you ask Google for "odd, even, R" you get a message from 2003 as the second
match:

---
Dave Caccace wrote:
> Hi,
> I'm trying to create a function, jim(p) which varies
> depending on whether the value of p is odd or even. I
> was trying to use the if() function, but I can't work out
> a formula to work out if p is odd or even.
> Thanks,
> Dave

if(p %% 2) "odd" else "even"

Uwe Ligges 
--
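The `%%` test is vectorized, so the same idea extends directly to a whole vector of integers (a sketch; `parity` is just an illustrative name):

```r
p <- 1:6
# p %% 2 is 1 for odd numbers and 0 for even ones
parity <- ifelse(p %% 2 == 1, "odd", "even")
parity   # "odd" "even" "odd" "even" "odd" "even"
```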
(Hi Uwe!)

My guess is that using so many capitals in your e-mail has turned away about 1000
helpful souls from your future posts.

Maybe reading the posting guide, plus a one-minute attempt to solve the
problem yourself by googling, would be appropriate?
Think for a moment: Google would have given an answer (the answer!) in one
minute. You wrote an e-mail to quite a few thousand subscribers.
That took more than a minute on your side. And how many hours of
reading time did it take from your readers?

Seasonal greetings
Detlef



On Thu, 01 Apr 2010 17:27:01 -0700
girlm...@yahoo.com wrote:

> Excuse me Carl Witthoft!
> [...]
> IF YOU DON'T KNOW THE ANSWERS TO MY QUESTIONS, just keep quiet, and let the
> smart guys share their thoughts.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Exporting Nuopt from splus to R

2010-04-01 Thread Jp2010

Hi all,
Thanks for the wonderful forum with all the valuable help and comments here.

I have been an S-PLUS user for the past 7 to 8 years and am now considering
changing over to R. I have been doing a lot of reading, and one of the main
reasons is that R is open source, with all the wonderful things that come
with that.

My question, though: is it possible to export any of the functions or
libraries that come with S-PLUS to R?

For my specific situation (Windows platform): if there is a compiled s.dll,
is there a way to get it working in R? I would think that if it is a function
or source file it can probably be rewritten without much difficulty in R. But
what about the compiled code? I am not a system programmer, so I don't know
much about compiling or undoing that.

From my understanding it is going to be difficult; is that understanding
right?

Thanks

-- 
View this message in context: 
http://n4.nabble.com/Exporting-Nuopt-from-splus-to-R-tp1748681p1748681.html
Sent from the R help mailing list archive at Nabble.com.



[R] ODD and EVEN numbers

2010-04-01 Thread girlme80
Excuse me Carl Witthoft!

For your information, this is not my homework. I'm just helping my friend with "a
part" of her R code.

And every time I ask a question here, it's just a "SMALL PART" of the
2-page program that I am doing. And for your information, the answers that I
get, I still think about how to make use of them. It does not mean that when I get
answers, I use them immediately without thinking!

And you have no right to tell me that, because I don't remember you answering any of
my questions.

IF YOU DON'T KNOW THE ANSWERS TO MY QUESTIONS, just keep quiet, and let the
smart guys share their thoughts.



[R] R Help

2010-04-01 Thread Ryan Cooper
Has anyone programmed the Nonparametric Canonical Correlation method in R?



Re: [R] time series problem: time points don't match

2010-04-01 Thread Brad Patrick Schneid

Gabor:
That is not the ideal solution, but it definitely works to provide me with
the "easier alternative".  Thanks for the reply!  
-- 
View this message in context: 
http://n4.nabble.com/time-series-problem-time-points-don-t-match-tp1748387p1748706.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] reading excel into R

2010-04-01 Thread Ravi Kulkarni

Use R Commander to do this. R Commander is an R package that offers a GUI for
R. It can be installed like any other R package.

If you use R Commander, there is a menu option where you specify that you
want to read an Excel file (you can also read text, SPSS, Minitab, Stata,
Access... files). It is very easy to use!

R Commander also lets you do some statistical tests through the GUI.
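For reference, a minimal sketch of getting started with it (the package is named Rcmdr on CRAN):

```r
install.packages("Rcmdr")   # one-time download from CRAN
library(Rcmdr)              # loading the package launches the GUI
# the Excel/SPSS/Stata import options are then in the menus
# (under the Data menu in the versions I've seen)
```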



Ravi
-- 
View this message in context: 
http://n4.nabble.com/reading-excel-into-R-tp1747897p1748819.html
Sent from the R help mailing list archive at Nabble.com.



[R] roccomp

2010-04-01 Thread joann

Does anyone know of a way to compare two ROC curves in R using the same method
used by roccomp in Stata?
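Not to presume the exact mapping, but Stata's roccomp implements the DeLong et al. (1988) test for correlated ROC areas, and the pROC package offers the same method via roc.test. A sketch, assuming pROC is installed:

```r
library(pROC)                        # install.packages("pROC") if needed
data(aSAH)                           # example data shipped with pROC
r1 <- roc(aSAH$outcome, aSAH$s100b)
r2 <- roc(aSAH$outcome, aSAH$wfns)
# DeLong test for two correlated (paired) ROC curves
roc.test(r1, r2, method = "delong")
```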
-- 
View this message in context: 
http://n4.nabble.com/roccomp-tp1748818p1748818.html
Sent from the R help mailing list archive at Nabble.com.



[R] BATCH jobs taking too much resources?

2010-04-01 Thread Stuart Luppescu
Is there something unusual about the way BATCH jobs are run? I ran a job
like this:
nice R CMD BATCH program.R

It ran for a little while and then it started eating up huge amounts of
resources. Here is the entry from top:
top - 17:34:10 up 36 days,  8:10,  4 users,  load average: 13.11, 6.85, 3.70
Tasks: 173 total,   6 running, 167 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us, 99.4%sy,  0.0%ni,  0.0%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32712300k total, 32565372k used,   146928k free,      856k buffers
Swap: 34766840k total, 34766840k used,        0k free,     9812k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND   

28829 lupp  26  10 53.3g  30g  128 R  7.9 98.6 134:23.54 R  

Note that it's taking about 54GB of memory. Right after this the load
average went up to over 23 and the system started killing off processes,
including mine. And this is on a system with 32GB of physical memory and
2 quad-core Xeon processors.

However, when I run the job as an inferior process in emacs under ess,
it is very well behaved. Here is the output from top:

top - 22:22:48 up 37 days, 12:58,  1 user,  load average: 1.01, 1.04, 1.07
Tasks: 133 total,   2 running, 131 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.0%us,  0.0%sy,  0.0%ni, 75.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32712300k total, 31120472k used,  1591828k free,   466180k buffers
Swap: 34766840k total,    86160k used, 34680680k free, 10935976k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 
 3768 lupp  25   0 17.8g  17g 4060 R 96.2 56.6 154:42.43 R  
   

It only uses 17.8GB of memory and the load stays right around 1.0.
Admittedly, this is a rather big job: lmer with 2,200,000 records
crossed by 125,000 students and 10,000 teachers. But I don't understand
why it should consume resources so avariciously when run as a BATCH job.
Can anyone explain this to me?

TIA

-- 
Stuart Luppescu -*-*- slu  ccsr  uchicago  edu
CCSR in UEI at U of C



Re: [R] You are right and the problem is solved. Re: about the possible errors in Rgraphviz Package

2010-04-01 Thread Gabor Grothendieck
I finally got Rgraphviz to work.

I uninstalled Rgraphviz and graphviz, did a reboot of Vista, installed
graphviz again, making sure to get the path name found in the
Rgraphviz README (in the Rgraphviz source .tar.gz), and did another
reboot of Vista and also removed Rgraphviz, did another reboot of
Vista and reinstalled Rgraphviz and rebooted once again.

I also experimented with adding C:\Program Files\Internet Explorer
(since Dependency Walker said Rgraphviz.dll uses IESHIMS.dll which is
there) and C:\Program Files\R\R-2.10.x\bin to my permanent path
(normally I use a batch file which adds it temporarily on the fly) and
it did work; however, when I removed C:\Program Files\Internet
Explorer and C:\Program Files\R\R-2.10.x\bin from my permanent path it
still worked so I guess that had nothing to do with it.

My best guess is that the reboots cleared things up and perhaps the
instructions should include rebooting the system and also mention
Dependency Walker.  Probably not all the reboots were needed but it
would be time consuming to figure out which reboot was the key one.
Or perhaps the fact that I had C:\Program Files\Internet Explorer and
C:\Program Files\R\R-2.10.x\bin on my permanent path during the
installation of graphviz and Rgraphviz made a difference?

On Wed, Mar 31, 2010 at 3:29 PM, Martin Morgan  wrote:
> On 03/31/2010 11:44 AM, Gabor Grothendieck wrote:
>> I got an error message *AND* R becomes unusable and had to be restarted.
>>
>> Using dependency walker it seems to be complaining about R.dll so I
>> copied all of my ...\R\bin\R*.dll files to ..
>> ...\win-library\2.10\Rgraphviz\libs.
>>
>> Then it complained about IESHIMS.dll so I copied \Program
>> Files\Internet Explorer\IESHIMS.dll to
>> ...\win-library\2.10\Rgraphviz\libs.
>>
>> Looking at dependency walker these DLLs: LIBCDT-4.DLL, LIBGRAPH-4.DLL,
>> and LIBGVC-4.DLL all seem to point to correct place.
>>
>> I still have this message in dependency walker:
>> Warning: At least one module has an unresolved import due to a missing
>> export function in a delay-load dependent module.
>>
>> There is a red mark beside:
>> c:\windows\system32\IEFRAME.DLL
>> but it's not missing.
>>
>> I also tried reinstalling R using the latest R 2.10.1.
>>
>> I am still getting the same result, namely I can run the indicated
>> code up to but not including plot(g1) but if I run plot(g1) R crashes
>> and must be restarted.
>
> Wow, I didn't mean to put you through such convolutions, and am sorry
> that I can't offer an immediate solution. Martin
>
>>
>>
>>
>> On Wed, Mar 31, 2010 at 12:54 PM, Martin Morgan  wrote:
>>> On 03/31/2010 08:18 AM, Gabor Grothendieck wrote:
 By the way, just in case you did not read the entire message R crashed
>>>
>>> I think you mean that you got an error message, not that R became
>>> unusable? More below...
>>>
 when I tried to run the code from the vignette.

 On Wed, Mar 31, 2010 at 10:52 AM, Gabor Grothendieck
  wrote:
> Based on your success I thought I would try again. I am not sure why I
> had more success this time but this time I got this far even though I
> did not change my path at all or make any system changes from what I
> had before.  I did reinstall Rgraphviz but used the previously
> installed graphviz.   I tried this code taken from the Rgraphviz
> vignette:
>
>> library("Rgraphviz")
> Loading required package: graph
> Loading required package: grid
>> set.seed(123)
>> V <- letters[1:10]
>> M <- 1:4
>> g1 <- randomGraph(V, M, 0.2)
>> g1
> A graphNEL graph with undirected edges
> Number of Nodes = 10
> Number of Edges = 16
>
>
> but when I tried to plot it R crashed.
>
>> plot(g1)
>
> It did produce this message on the Windows console (not the R console):
>
> Error: Layout type: "dot" not recognized. Use one of:
>>>
>>> In the past this has come about when accessing incorrect graphviz DLLs.
>>> The gory detail (if I remember correctly) is that a FILE* gets allocated
>>> in Rgraphviz and passed to graphviz, and if there is a compiler mismatch
>>> then there are no guarantees about FILE* representation.
>>>
>>> The implication is that your Rgraphviz and graphiz are at least partly
>>> out of sync, but it could also be that I am 'getting lucky'. When you
>>> say (quoting from above)...
>>>
> did not change my path at all or make any system changes from what I
>>>
>>> the implication is that your PATH was already set to include a
>>> graphviz2.20\\bin directory; I'd encourage you to confirm that.
>>
>> Yes,  My path is:
>>
>> PATH=C:\Graphviz2.20\bin;...more stuff...
>>
>>>
>>> And recognizing that you might justifiably be willing to put this on the
>>> shelf, you might also use Dependency Walker
>>> (http://www.dependencywalker.com/) to open the Rgraphviz DLL
>>> (R_HOME/libraries/Rgraphviz/libs/Rgraphviz.dll), right click on the
>>> RGRAPHVIZ.DLL showing up in the top left panel, an

Re: [R] Confusing concept of vector and matrix in R

2010-04-01 Thread Johannes Huesing
Rolf Turner  [Tue, Mar 30, 2010 at 09:48:12PM CEST]:
[...]
> Who designed Excel?

Among others, Joel Spolsky. Is that an appeal to authority?




-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:johan...@huesing.name  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")



Re: [R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-01 Thread Peter Ehlers

Steve,

Thanks for providing an example (which does, however, need a bit of
tweaking; BTW, it's usually not a good idea to cbind your data
when what you really want is a data.frame).

Your guess about some clever way of using abline() is unfortunately
not correct - as the help page indicates, the slope and intercept
must be given as single values. So you will have to extract each
(intercept, slope) pair from the model coefficients and call abline()
on them. A convenient way to do this is to specify the model as

 mod <- lm(y ~ f/x + 0)

(which I first learned from MASS, the book).
Here f is your grouping variable.  As the book says,
this gives "separate regression models of the type 1 + x within
the levels of f".  The "+ 0" removes the usual intercept which is
replaced by individual intercepts for each level of f.

For your example this will give 12 intercepts as
the first 12 coefficients and 12 slopes as the remaining coefs.

Then you can use

 cof <- coef(mod)
 for(i in 1:12) abline(a=cof[i], b=cof[12 + i])

to plot the 12 lines.
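A self-contained sketch of the whole recipe on toy data (3 groups instead of 12, to keep it short):

```r
set.seed(1)
f <- factor(rep(1:3, each = 10))            # grouping variable
x <- rnorm(30)                              # covariate
y <- as.numeric(f) + as.numeric(f) * x + rnorm(30, sd = 0.2)

mod <- lm(y ~ f/x + 0)                      # per-level intercepts and slopes
cof <- coef(mod)                            # 3 intercepts, then 3 slopes

plot(x, y, col = as.numeric(f), pch = 19)
for (i in 1:3) abline(a = cof[i], b = cof[3 + i], col = i)
```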

 -Peter Ehlers

On 2010-04-01 16:21, Steven Worthington wrote:


[...]



--
Peter Ehlers
University of Calgary



[R] Heterogeneous bootstrap

2010-04-01 Thread Xiaoxi Gao






Hello all,

Can anybody tell me how to implement the heterogeneous bootstrap algorithm by 
the FEAR package. I only know there is a command (boot.sw98) for homogeneous 
bootstrap. Thanks a lot.

Xiaoxi


  
  
  



Re: [R] How to get the scale limits in lattice plot

2010-04-01 Thread James Rome
The key was to use grid.text() inside the panel function. It allows you
to specify things in 0-1 "npc" units.
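A small sketch of that trick, for anyone searching the archives later:

```r
library(lattice)
library(grid)

x <- rnorm(200)
histogram(~ x, type = "density", panel = function(...) {
    panel.histogram(...)
    # grid.text defaults to "npc" units: (0,0) is the lower-left corner
    # of the panel and (1,1) the upper-right, regardless of the data scale
    grid.text(sprintf("mean = %.2f", mean(x)), x = 0.85, y = 0.9)
})
```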

On 4/1/10 12:23 PM, David Winsemius wrote:
>
> On Apr 1, 2010, at 11:53 AM, James Rome wrote:
>
>> I am drawing a density histogram, and want to label the plots with the
>> mean using ltext(). But I need the x,y coordinates to feed into ltext,
>> and I can't calculate them easily from my data. Is there a way to get
>> the x and y ranges being used for the plot, so I can put the text at the
>> correct position in the panel.function?
>
> No code, so what a "density histogram" might be is still vague,
> but perhaps you are using density() and if so, have you looked at the
> "Value" section of that function's help page? (The same advice would
> apply were you using one of the (several) histogram functions.)
>



Re: [R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-01 Thread RICHARD M. HEIBERGER
## Steve,

## please use the ancova function in the HH package.

install.packages("HH")
library(HH)


## windows.options(record=TRUE)
windows.options(record=TRUE)
# hypothetical data
beak.lgth <-
c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,
  9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,
  19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <-
c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,
  84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,
  27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,
  34.7,39.3,41.7,40.5,42.7,41.8)
## Make species into a factor
species <-
factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,
 8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12))
## then construct a data.frame with the three variables and the log transforms
dataset <-  data.frame(species, beak.lgth, mass,
   logBeak=log10(beak.lgth),
   logMass=log10(mass))
## default is 7 colors, we need 12
trellis.par.set("superpose.line",
  Rows(trellis.par.get("superpose.line"), c(1:6, 1:6)))
trellis.par.set("superpose.symbol",
  Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6)))

ancova(logBeak ~ logMass   * species, data=dataset)
ancova(logBeak ~ logMass   + species, data=dataset)
ancova(logBeak ~ logMass, groups=species, data=dataset)
ancova(logBeak ~ species, x=logMass, data=dataset)
bwplot(logBeak ~ species, data=dataset)

## Rich




Re: [R] ODD and EVEN numbers

2010-04-01 Thread Carl Witthoft

There you go,  solving his homework again...


>
> Hi,
>
> anyone here who knows how to determine if an integer is "odd" or
> "even" in
> R?
> Thanks.

> 2 %% 2 == 0
[1] TRUE
> 3 %% 2 == 0
[1] FALSE

is.even <- function(x){ x %% 2 == 0 }

> is.even(2)
[1] TRUE



Re: [R] time series problem: time points don't match

2010-04-01 Thread Gabor Grothendieck
Perhaps something like this:

library(zoo)
library(chron)
# read in data

Lines1 <- "date time level temp
2009/10/01 00:01:52.0  2.8797  18.401
2009/10/01 00:16:52.0  2.8769  18.382
2009/10/01 00:31:52.0  2.8708  18.309
2009/10/01 00:46:52.0  2.8728  18.285
2009/10/01 01:01:52.0  2.8716  18.245
2009/10/01 01:16:52.0  2.8710  18.190"

Lines2 <- "date time level temp
2009/10/01 00:11:06.0  2.9507  18.673
2009/10/01 00:26:06.0  2.9473  18.630
2009/10/01 00:41:06.0  2.9470  18.593
2009/10/01 00:56:06.0  2.9471  18.562
2009/10/01 01:11:06.0  2.9451  18.518
2009/10/01 01:26:06.0  2.9471  18.480"

DF1 <- read.table(textConnection(Lines1), header = TRUE, as.is = TRUE)
DF2 <- read.table(textConnection(Lines2), header = TRUE, as.is = TRUE)

z1 <- zoo(DF1[3:4], chron(DF1[,1], DF1[,2], format=c("Y/M/D", "H:M:S")))
z2 <- zoo(DF2[3:4], chron(DF2[,1], DF2[,2], format=c("Y/M/D", "H:M:S")))

# process inputs z1 and z2
# aggregating into 15 minute intervals and merging

z1a <- aggregate(z1, trunc(time(z1), "00:15:00"), tail, n = 1)
z2a <- aggregate(z2, trunc(time(z2), "00:15:00"), tail, n = 1)

z <- merge(z1a, z2a)


On Thu, Apr 1, 2010 at 1:35 PM, Brad Patrick Schneid  wrote:
>
> Hi,
> I have a time series problem that I would like some help with if you have
> the time.  I have many data from many sites that look like this:
>
> Site.1
> date            time            level           temp
> 2009/10/01 00:01:52.0      2.8797      18.401
> 2009/10/01 00:16:52.0      2.8769      18.382
> 2009/10/01 00:31:52.0      2.8708      18.309
> 2009/10/01 00:46:52.0      2.8728      18.285
> 2009/10/01 01:01:52.0      2.8716      18.245
> 2009/10/01 01:16:52.0      2.8710      18.190
>
> Site.2
> date            time            level          temp
> 2009/10/01 00:11:06.0      2.9507      18.673
> 2009/10/01 00:26:06.0      2.9473      18.630
> 2009/10/01 00:41:06.0      2.9470      18.593
> 2009/10/01 00:56:06.0      2.9471      18.562
> 2009/10/01 01:11:06.0      2.9451      18.518
> 2009/10/01 01:26:06.0      2.9471      18.480
>
> As you can see, the times do not match up.  What I would like to do is be
> able to merge these two data sets to the nearest time stamp by creating a
> new time between the two; something like this:
>
>
> date            new.time        level.1       temp.1    level.2         temp.2
> 2009/10/01 00:01:52.0      2.8797      18.401   NA             NA
> 2009/10/01 00:13:59.0      2.8769      18.382      2.9507      18.673
> 2009/10/01 00:28:59.0      2.8708      18.309      2.9473      18.630
> 2009/10/01 00:43:59.0      2.8728      18.285      2.9470      18.593
> 2009/10/01 00:59:59.0      2.8716      18.245     2.9471      18.562
> 2009/10/01 01:13:59.0      2.8710      18.190     2.9451      18.518
> 2009/10/01 01:26:06.0       NA              NA          2.9471      18.480
>
> Note that the sites may not match in the # of observations and a return of
> NA would be necessary, but a deletion of that time point all together for
> both sites would be preferred.
>
> A possibly easier alternative would be a way to assign generic times for
> each observation according to the time interval, so that the 1st observation
> for each day would have a time = 00:00:00 and each consecutive one would be
> 15 minutes later.
>
> Thanks for any suggestions.
>
> Brad
>
> --
> View this message in context: 
> http://n4.nabble.com/time-series-problem-time-points-don-t-match-tp1748387p1748387.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



Re: [R] nlrq parameter bounds

2010-04-01 Thread AJBG

Thank you for your reply Matthew,
There are many things I could say about the myriad difficulties I have had
in progressing with R, none worth stating here.  For the record I have read
the posting guidelines, at least to the extent possible in the time
available.  

For what it's worth, I think I have found a way around this little problem by
limiting parameters outside the function using an if(any()) statement.  I'll
have a look at R-forge.

Thanks again, 
A
-- 
View this message in context: 
http://n4.nabble.com/nlrq-parameter-bounds-tp1733901p1748666.html
Sent from the R help mailing list archive at Nabble.com.



[R] A link to a collection of tutorials and videos on R.

2010-04-01 Thread datakid ..
A link to a collection of tutorials and videos on R.
Tutorials: http://www.dataminingtools.net/browsetutorials.php?tag=rdmt
Videos: http://www.dataminingtools.net/videos.php?id=8



Re: [R] Value-at-Risk Portfolio(both equity and option)

2010-04-01 Thread zhang

Thank you very much. Since I have never heard of "blotter" before, I am now
really excited. It seems to be exactly what I have been searching for. I would
be really grateful if you could share some code/examples regarding this. I did
not happen to find the help file for the package.

Thanks again. 
-- 
View this message in context: 
http://n4.nabble.com/Value-at-Risk-Portfolio-both-equity-and-option-tp1745179p1748538.html
Sent from the R help mailing list archive at Nabble.com.



[R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-01 Thread Steven Worthington

Dear R users,

I'm using a custom function to fit ANCOVA models to a dataset. The data are
divided into 12 groups, with one dependent variable and one covariate. When
plotting the data, I'd like to add separate regression lines for each group
(so, 12 lines, each with their respective individual slopes). My 'model1'
uses the group*covariate interaction term, and so the coefficients to plot
these lines should be contained within the 'model1' object (there are 25
coefficients and it looks like I need the last 12). The problem is I can't
figure out how to extract the relevant coefficients from 'model1' and add
them using abline. I imagine there's some way of using the relevant slopes

abline(model1$coef[14:25])

together with the intercept, but I can't quite get it right. Can anyone
offer a suggestion as to how to go about this? Ideally, what I'd like is to
plot each regression line in the same color as the group to which it
belongs.

I've provided an example with dummy data below.

best,

Steve


# ===
# hypothetical data
species <-
c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12)
beak.lgth <-
c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <-
c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,34.7,39.3,41.7,40.5,42.7,41.8)
dataset <- cbind(species, beak.lgth, mass)

# ANCOVA function
anc <- function(variable, covariate, group){
  # transform data
  lgVar <- log10(variable)
  lgCov <- log10(covariate)
  # separate regression lines for each group
  model1 <- lm(lgVar ~ lgCov + group + lgCov:group)
  model1.summ <- summary(model1)
  model1.anv <- anova(model1)
  # separate regression lines for each group, but with the same slope
  model2 <- lm(lgVar ~ lgCov + group)
  model2.summ <- summary(model2)
  model2.anv <- anova(model2)
  # same regression line for all groups
  model3 <- lm(lgVar ~ lgCov)
  model3.summ <- summary(model3)
  model3.anv <- anova(model3)
  compare <- anova(model1, model2, model3)  # compare all models
  # plots
  par(mfcol=c(1,2))
  boxplot(lgVar ~ group, col="darkgoldenrod1")
  # plot the variate and covariate by group
  plot(lgVar ~ lgCov, pch=as.numeric(group), col=as.numeric(group))
  legend("topleft", inset=0, legend=as.character(unique(group)),
         col=as.numeric(unique(group)),
         pch=as.numeric(unique(group)), pt.cex=1.5)
  abline(model1)  # Need separate regression lines here
  list(model_1_summary=model1.summ, model_1_ANOVA=model1.anv,
       model_2_summary=model2.summ, model_2_ANOVA=model2.anv,
       model_3_summary=model3.summ, model_3_ANOVA=model3.anv,
       model_comparison=compare)
}

# call function
anc(beak.lgth, mass, factor(species))
# ===
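[Editor's note: a minimal sketch, on toy data rather than the poster's model object, of one way to pull per-group intercepts and slopes out of an interaction fit by indexing coef() by name. The coefficient names "g2"/"x:g2" assume R's default treatment contrasts.]

```r
## Hedged sketch: per-group regression lines from an interaction model
set.seed(1)
g <- factor(rep(1:3, each = 20))
x <- runif(60)
y <- c(1, 2, 3)[g] * x + rnorm(60, sd = 0.1)
fit <- lm(y ~ x * g)              # slope and intercept vary by group
cf <- coef(fit)

plot(x, y, col = as.numeric(g), pch = as.numeric(g))
for (i in seq_along(levels(g))) {
  lev <- levels(g)[i]
  # first level is the baseline; later levels add their offsets
  a <- cf["(Intercept)"] + (if (i > 1) cf[paste0("g", lev)] else 0)
  b <- cf["x"]           + (if (i > 1) cf[paste0("x:g", lev)] else 0)
  abline(a, b, col = i)           # line drawn in the group's colour
}
```

With a full interaction, each reconstructed line matches a separate per-group least-squares fit, so this gives exactly the lines the poster is after.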

-- 
View this message in context: 
http://n4.nabble.com/Adding-regression-lines-to-each-factor-on-a-plot-when-using-ANCOVA-tp1748654p1748654.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using a string as a variable name - revisited

2010-04-01 Thread Erik Iverson

I meant to add that I'm guessing from this:

"What it seems to be doing is converting the text 
"Aulacoseira_islandica" to a number (25, for some reason"


is that foo is a factor, but we have no way of knowing without a 
reproducible example.
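[Editor's note: a tiny illustration of that guess, with made-up toy names. When used in indexing, a factor is treated as its underlying integer codes, and wrapping it in as.character() is the usual fix.]

```r
## If foo is a factor, [[ uses its integer code, not its label
counts <- data.frame(a = 1:3, b = 4:6)
foo <- factor("b", levels = c("b", "a"))  # underlying integer code is 1
counts[[foo]]                 # uses code 1 -> column "a", the wrong one
counts[[as.character(foo)]]   # uses the label -> column "b", as intended
```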


Euan Reavie wrote:

I would like to revisit a problem that was discussed previously (see
quoted discussion below). I am trying to do the same thing, using a
string to indicate a column with the same name. I am making "foo" a
string taken from a list of names. It matches the row where "item" =
5, and picks the corresponding "taxon"


foo <- list$taxon[match(5,list$item)]


Let's say this returns foo as "Aulacoseira_islandica". I have another
matrix "counts" with column headers corresponding to the taxon list.
But, when I try to access the data in the Aulacoseira_islandica
column, it instead uses the data from another column. For instance...


columndata <- counts[[foo]]


...returns the data from the wrong column. What it seems to be doing
is converting the text "Aulacoseira_islandica" to a number (25, for
some reason) and reading the count data from column number 25, instead
of from the column labelled with Aulacoseira_islandica.

If I try...


columndata <- counts$Aulacoseira_islandica


...it works fine. Any thoughts?

-Euan
NRRI-University of Minnesota Duluth


__
Jason Horn-2
Oct 20, 2006; 06:28pm
[R] Using a string as a variable name

Is it possible to use a string as a variable name?  For example:

foo <- "var1"
frame$foo   # frame is a data frame with a column titled "var1"

This does not work, unfortunately.  Am I just missing the correct
syntax to make this work?

- Jason
__
Oct 20, 2006; 06:30pm
Re: [R] Using a string as a variable name

frame[[foo]]

On 10/20/06, Jason Horn <[hidden email]> wrote:


Is it possible to use a string as a variable name?  For example:

foo <- "var1"
frame$foo   # frame is a data frame with a column titled "var1"

This does not work, unfortunately.  Am I just missing the correct
syntax to make this work?


- Jason




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using a string as a variable name - revisited

2010-04-01 Thread Erik Iverson

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Euan Reavie wrote:

I would like to revisit a problem that was discussed previously (see
quoted discussion below). I am trying to do the same thing, using a
string to indicate a column with the same name. I am making "foo" a
string taken from a list of names. It matches the row where "item" =
5, and picks the corresponding "taxon"


foo <- list$taxon[match(5,list$item)]


Let's say this returns foo as "Aulacoseira_islandica". I have another
matrix "counts" with column headers corresponding to the taxon list.
But, when I try to access the data in the Aulacoseira_islandica
column, it instead uses the data from another column. For instance...


columndata <- counts[[foo]]


...returns the data from the wrong column. What it seems to be doing
is converting the text "Aulacoseira_islandica" to a number (25, for
some reason) and reading the count data from column number 25, instead
of from the column labelled with Aulacoseira_islandica.

If I try...


columndata <- counts$Aulacoseira_islandica


...it works fine. Any thoughts?

-Euan
NRRI-University of Minnesota Duluth


__
Jason Horn-2
Oct 20, 2006; 06:28pm
[R] Using a string as a variable name

Is it possible to use a string as a variable name?  For example:

foo <- "var1"
frame$foo   # frame is a data frame with a column titled "var1"

This does not work, unfortunately.  Am I just missing the correct
syntax to make this work?

- Jason
__
Oct 20, 2006; 06:30pm
Re: [R] Using a string as a variable name

frame[[foo]]

On 10/20/06, Jason Horn <[hidden email]> wrote:


Is it possible to use a string as a variable name?  For example:

foo <- "var1"
frame$foo   # frame is a data frame with a column titled "var1"

This does not work, unfortunately.  Am I just missing the correct
syntax to make this work?


- Jason




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using a string as a variable name - revisited

2010-04-01 Thread Euan Reavie
I would like to revisit a problem that was discussed previously (see
quoted discussion below). I am trying to do the same thing, using a
string to indicate a column with the same name. I am making "foo" a
string taken from a list of names. It matches the row where "item" =
5, and picks the corresponding "taxon"

> foo <- list$taxon[match(5,list$item)]

Let's say this returns foo as "Aulacoseira_islandica". I have another
matrix "counts" with column headers corresponding to the taxon list.
But, when I try to access the data in the Aulacoseira_islandica
column, it instead uses the data from another column. For instance...

> columndata <- counts[[foo]]

...returns the data from the wrong column. What it seems to be doing
is converting the text "Aulacoseira_islandica" to a number (25, for
some reason) and reading the count data from column number 25, instead
of from the column labelled with Aulacoseira_islandica.

If I try...

> columndata <- counts$Aulacoseira_islandica

...it works fine. Any thoughts?

-Euan
NRRI-University of Minnesota Duluth


__
Jason Horn-2
Oct 20, 2006; 06:28pm
[R] Using a string as a variable name

Is it possible to use a string as a variable name?  For example:

foo <- "var1"
frame$foo   # frame is a data frame with a column titled "var1"

This does not work, unfortunately.  Am I just missing the correct
syntax to make this work?

- Jason
__
Oct 20, 2006; 06:30pm
Re: [R] Using a string as a variable name

frame[[foo]]

On 10/20/06, Jason Horn <[hidden email]> wrote:

> Is it possible to use a string as a variable name?  For example:
>
> foo <- "var1"
> frame$foo   # frame is a data frame with a column titled "var1"
>
> This does not work, unfortunately.  Am I just missing the correct
> syntax to make this work?
>
>
> - Jason

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sample size > 20K? Was: fitness of regression tree: how to measure???

2010-04-01 Thread Ravi Varadhan
The discussion of Leo Breiman's paper in Statistical Science: Statistical 
Modeling - The Two cultures, is a must read for all statisticians doing 
prediction modeling.  Especially see the exchange between Cox and Breiman (I 
call this the Cox-Breiman duel).

Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Bert Gunter 
Date: Thursday, April 1, 2010 12:55 pm
Subject: Re: [R] sample size > 20K? Was: fitness of regression tree: how to 
measure???
To: 'Frank E Harrell Jr' , 'vibha patel' 

Cc: r-help@r-project.org


> Since Frank has made this somewhat cryptic remark (sample size > 20K)
> several times now, perhaps I can add a few words of (what I hope is) further
> clarification.
> 
> Despite any claims to the contrary, **all** statistical (i.e. empirical)
> modeling procedures are just data interpolators: that is, all that 
> they can
> claim to do is produce reasonable predictions of what may be expected 
> within
> the extent of the data. The quality of the model is judged by the goodness
> of fit/prediction over this extent. Ergo the standard textbook caveats 
> about
> the dangers of extrapolation when using fitted models for prediction. 
> Note,
> btw, the contrast to "mechanistic" models, which typically **are** assessed
> by how well they **extrapolate** beyond current data. For example, Newton's
> apple to the planets. They are often "validated" by their ability to "work"
> in circumstances (or scales) much different than those from which they 
> were
> derived.
> 
> So statistical models are just fancy "prediction engines." In particular,
> there is no guarantee that they provide any meaningful assessment of
> variable importance: how predictors causally relate to the response.
> Obviously, empirical modeling can often be useful for this purpose,
> especially in well-designed studies and experiments, but there's no
> guarantee: it's an "accidental" byproduct of effective prediction.
> 
> This is particularly true for happenstance (un-designed) data and
> non-parametric models like regression/classification trees. Typically, 
> there
> are many alternative models (trees) that give essentially the same quality
> of prediction. You can see this empirically by removing a modest random
> subset of the data and re-fitting. You should not be surprised to see 
> the
> fitted model -- the tree topology -- change quite radically. HOWEVER, 
> the
> predictions of the models within the extent of the data will be quite
> similar to the original results. Frank's point is that unless the data 
> set
> is quite large and the predictive relationships quite strong -- which
> usually implies parsimony -- this is exactly what one should expect. 
> Thus it
> is critical not to over-interpret the particular model one get, i.e. to
> infer causality from the model (tree)structure.
> 
> Incidentally, there is nothing new or radical in this; indeed, John Tukey,
> Leo Breiman, George Box, and others wrote eloquently about this 
> decades ago.
> And Breiman's random forest modeling procedure explicitly abandoned efforts
> to build simply interpretable models (from which one might infer causality)
> in favor of building better interpolators, although assessment of "variable
> importance" does try to recover some of that interpretability 
> (however, no
> guarantees are given).
> 
> HTH. And contrary views welcome, as always.
> 
> Cheers to all,
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
>  
>  
> -Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Frank E Harrell Jr
> Sent: Thursday, April 01, 2010 5:02 AM
> To: vibha patel
> Cc: r-help@r-project.org
> Subject: Re: [R] fitness of regression tree: how to measure???
> 
> vibha patel wrote:
> > Hello,
> > 
> > I'm using rpart function for creating regression trees.
> > now how to measure the fitness of regression tree???
> > 
> > thanks n Regards,
> > Vibha
> 
> If the sample size is less than 20,000, assume that the tree is a 
> somewhat arbitrary representation of the relationships in the data and 
> 
> that the form of the tree will not replicate in future datasets.
> 
> Frank
> 
> -- 
> Frank E Harrell Jr   Professor and Chairman   School of Medicine
>   Department of Biostatistics   Vanderbilt University
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help

[R] Course***R Advanced Programming April 2010***R Courses*** by XLSolutions Corp Seattle, San Francisco, Salt Lake City

2010-04-01 Thread s...@xlsolutions-corp.com
XLSolutions is proud to announce our April 2010 R/S-PLUS Advanced
Programming course in USA


*** R/S Systems: Advanced Programming
*** S/R-PLUS Programming 3: Advanced Techniques and Efficiencies.

More on website

http://www.xlsolutions-corp.com/rplus.asp 

Ask for group discount and reserve your seat Now - Earlybird Rates.
Payment due after the class! Email Sue Turner:  sue at
xlsolutions-corp.com

Phone:   206-686-1578   

Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat.

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Aligning text in the call to the text function

2010-04-01 Thread baptiste auguie
Hi,

One option with Grid graphics,

m <- matrix(c(1667,    3,  459,
              2001,   45,   34,
              1996,    2, 5235),
            dimnames=list(c("Eric & Alan", "Alan", "John & David")),
            ncol=3, byrow=TRUE)

## install.packages("gridExtra", repos="http://R-Forge.R-project.org")
library(gridExtra)

grid.table(m, theme=theme.white(row.just="left",core.just="left"))

HTH,

baptiste

On 1 April 2010 21:39, Tighiouart, Hocine
 wrote:
> Hi,
>
> I have text (see below) that is aligned nicely when printed in the
> command window in R but when plotted using text function, it does not
> show alignment unless I use the family="mono" in the call to the text
> function. Is there a way to specify a different font while maintaining
> the alignment?
>
>  Eric & Alan  1667   3   459
>  Alan         2001  45    34
>  John & David 1996   2  5235
>
> Thanks for any hints
>
> Hocine
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Line Graph Labels

2010-04-01 Thread zachcp

Thanks again, Walmes, but the new problem is that not all of the peaks
are the same intensity; the top five data points from my highest peak have
greater intensity values than the highest point in the second-highest peak.

This is once again helpful, though. I found that there is a library, msProcess
(not in the base CRAN packages), made specially for mass spec data analysis.
Included in it is a peak finder that looks for local maxima. The data has
to be imported in a particular way and there's not too much documentation, but
that's what I am going to try.
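[Editor's note: in case msProcess doesn't pan out, here is a hedged base-R sketch; find.peaks is a made-up helper, not from any package. A point counts as a peak only if it exceeds both neighbours and a noise threshold, so only apexes get labeled.]

```r
## Minimal local-maximum finder for sharp peaks
find.peaks <- function(y, threshold = 0) {
  n <- length(y)
  # interior points larger than both neighbours and above the threshold
  which(y[2:(n - 1)] > y[1:(n - 2)] &
        y[2:(n - 1)] > y[3:n] &
        y[2:(n - 1)] > threshold) + 1
}
y <- c(0, 1, 5, 1, 0, 2, 8, 2, 0)
pk <- find.peaks(y, threshold = 1)   # c(3, 7): the two apexes
## text(pk, y[pk], labels = y[pk], pos = 3) would then label only the apexes
```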

thanks again for your prompt reply,

best,
zach cp
-- 
View this message in context: 
http://n4.nabble.com/Line-Graph-Labels-tp1748218p1748586.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pvals.fnc() with language R does not work with R 2.10.1

2010-04-01 Thread Ben Bolker
ozge gurcanli  cogsci.jhu.edu> writes:

> 
> 
> Hi Everyone,
> 
> I am using R 2.10.1.  lmer function works properly, however pvals.fnc 
> ()  does not despite the fact that I uploaded:
> 
> - library(lme4)
> - library(coda)
> - library(languageR)
> 
> This is the error message I get
> 
>   pvals.fnc(lexdec3.lmerE2, nsim=1)$fixed
> 
> Error in pvals.fnc(lexdec3.lmerE2, nsim = 1) :
>MCMC sampling is not yet implemented in lme4_0.999375
>for models with random correlation parameters
> 
> How can I resolve the problem?

  Neither of these packages ships with base R (lme4 and languageR are
the source of your problems, not coda). The standard advice in
this case is to "contact the package maintainer" -- you can
find this info from help(package=languageR) (or the equivalent
for lme4) -- the maintainers are R Baayen and D Bates respectively.

  However, before you do that -- mcmcsamp has not worked in
lme4 for quite a while.  You could try to retrieve sufficiently
old versions of lme4 (and probably Matrix as well, since the
two are interconnected) to find a version where mcmcsamp works
(but this could be dangerous since I believe Bates abandoned
mcmcsamp after finding several cases where it got stuck and
deciding that he should remove it until he found a way to
make it more robust).

  My suggestion would be to try posting this to r-sig-mixed-models
to see if anyone has a suggestion, and possibly contacting Baayen.
I suspect Bates will just tell you what I stated above.  You could
also look at http://glmm.wikidot.com/faq , but again it doesn't
say much more than I have already stated.

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Line Graph Labels

2010-04-01 Thread Walmes Zeviani

You can set the number of extreme points to be labeled instead of define a
cutoff. Look:

da <- data.frame(y=rnorm(50), x=1:50)
plot(y~x, data=da)
abline(h=c(-2,2), lty=3)
with(da, text(x[abs(y)>2], y[abs(y)>2], label=x[abs(y)>2], pos=2))

da <- da[order(da$y),]
plot(y~x, data=da)
# label the five smallest and five largest values
num <- 5
idx <- c(1:num, nrow(da):(nrow(da)-num+1))
with(da, points(x[idx], y[idx], pch=3, col=2))
with(da, text(x[idx], y[idx], label=x[idx],
              pos=rep(c(1,3), each=num)))

Hope that helps.
Walmes.

-
..ooo0
...
..()... 0ooo...  Walmes Zeviani
...\..(.(.)... Master in Statistics and Agricultural
Experimentation
\_). )../   walmeszevi...@hotmail.com, Lavras - MG, Brasil

(_/
-- 
View this message in context: 
http://n4.nabble.com/Line-Graph-Labels-tp1748218p1748558.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Analyzing binary data on an absolute scale and determining conditions when risks become equal between groups

2010-04-01 Thread Chaudhari, Bimal
Suppose I have a binary outcome (disease/no disease and all subjects had the 
same period of exposure) and 2 or 3 (categorical) predictors.

I can obviously build a logistic regression model which describes the data, 
possibly including interaction terms, on a relative scale:

model<-glm(disease~sex*race*prematurity,family=binomial)

1) Is there any way to extract information on the absolute scale (ie instead of 
saying male sex has an OR = 2.0, saying that, all else equal, males have a 5 
percentage point higher rate of disease, or, given certain values of 
covariates, the difference in rates of disease between boys and girls is X (95% 
CI for difference = ...))?  I know there are Mantel-Haenszel methods for 
summarizing contingency tables, but if I had several covariates I wanted to 
control for, this approach quickly loses its appeal.  A regression framework 
which allowed for inference on the absolute scale would be ideal (or perhaps 
I'm just forgetting something about logistic regression?)
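[Editor's note on (1): one hedged sketch, using toy data and invented variable names, is to keep the logistic fit but report covariate-adjusted risk differences by averaging predicted probabilities with sex set to each level in turn.]

```r
## Adjusted risk difference from a logistic fit via predicted probabilities
set.seed(42)
d <- data.frame(sex  = factor(sample(c("F", "M"), 200, replace = TRUE)),
                race = factor(sample(c("A", "B"), 200, replace = TRUE)))
d$disease <- rbinom(200, 1, plogis(-1 + 0.7 * (d$sex == "M")))
fit <- glm(disease ~ sex + race, family = binomial, data = d)

## Predict everyone as male, then everyone as female, and average
dM <- d; dM$sex <- factor("M", levels = levels(d$sex))
dF <- d; dF$sex <- factor("F", levels = levels(d$sex))
rd <- mean(predict(fit, dM, type = "response")) -
      mean(predict(fit, dF, type = "response"))
rd   # adjusted difference in disease probability, male minus female
```

A confidence interval for rd would need the delta method or a bootstrap, which this sketch omits.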

2) Now suppose that the situation is such that males are at higher risk of 
disease than females but that the magnitude of this difference varies by degree 
of prematurity (ie the interaction of sex*prematurity was significant) and 
suppose further that the effect of this interaction is to diminish the 
difference between males and females as one becomes less and less premature 
until the difference between sexes is undetectable.  Is there a procedure for 
determining at what level of the prematurity factor the impact of sex becomes 
undetectable?

My thought was to test the hypothesis that the model coefficients involving sex 
(ie a main effect and sex*prematurity interaction coefficients at each level of 
prematurity) sum to zero and taking the first level of prematurity where this 
sum was not statistically greater than zero as the level of prematurity at 
which sex ceased to alter risk.  

Does this approach make sense?


3) Suppose now that for each level of race, the level of prematurity at which 
sex ceases to increase risk is different.  Can anyone suggest an approach which 
would allow one to say that the level of prematurity at which this occurred in 
each race was statistically different?

Thanks,
bimal

Bimal P Chaudhari, MPH
MD Candidate, 2011
Boston University
MS Candidate, 2010
Washington University in St Louis


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Aligning text in the call to the text function

2010-04-01 Thread Duncan Murdoch

On 01/04/2010 3:39 PM, Tighiouart, Hocine wrote:

Hi,

I have text (see below) that is aligned nicely when printed in the
command window in R but when plotted using text function, it does not
show alignment unless I use the family="mono" in the call to the text
function. Is there a way to specify a different font while maintaining
the alignment?

 Eric & Alan 1667   3   459
 Alan 2001  45  34
 John & David 1996  2   5235

  



Plot each of the parts separately, i.e. plot 9 objects, not 3.  You can 
use strwidth() to compute how big each part is and adjust the 
positioning based on that.


You might also be able to do it using paste() and atop() with an 
expression; see ?plotmath.  But I think manual positioning is easiest.  
Here's an example:


leftalign <- function(x, y, m) {
 widths <- strwidth(m)
 dim(widths) <- dim(m)
 widths <- apply(widths, 2, max)
 widths <- widths + strwidth("   ")
 heights <- strheight(m)
 dim(heights) <- dim(m)
 heights <- apply(heights, 1, max)
 heights <- heights*1.5

 xoffsets <- c(0, cumsum(widths[-length(widths)]))
 yoffsets <- c(0, -cumsum(heights[-length(heights)]))

 text(x + rep(xoffsets, each=nrow(m)),
  y + rep(yoffsets, ncol(m)), m, adj=c(0,1))

}

plot(1)
leftalign(0.8, 1.2, matrix(c("Eric & Alan 1667", "Alan 2001",
                             "John & David 1996", 3,45,2,459,34,5235), ncol=3))


Obviously you could make it a lot more elaborate, with left alignment 
for some columns and right alignment for others, etc.  Probably some 
package has already done this.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Aligning text in the call to the text function

2010-04-01 Thread Tighiouart, Hocine
Hi,

I have text (see below) that is aligned nicely when printed in the
command window in R but when plotted using text function, it does not
show alignment unless I use the family="mono" in the call to the text
function. Is there a way to specify a different font while maintaining
the alignment?

 Eric & Alan 1667   3   459
 Alan 2001  45  34
 John & David 1996  2   5235

Thanks for any hints

Hocine

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sample size > 20K? Was: fitness of regression tree: how to measure???

2010-04-01 Thread Frank E Harrell Jr
Good comments Bert.  Just 2 points to add: people rely a lot on the tree 
structure found by recursive partitioning, so the structure needs to be 
stable.  This requires a huge sample size.  Second, recursive 
partitioning is not competitive with other methods in terms of 
predictive discrimination unless the sample size is so large that the 
tree doesn't need to be pruned upon cross-validation.


Frank


Bert Gunter wrote:

Since Frank has made this somewhat cryptic remark (sample size > 20K)
several times now, perhaps I can add a few words of (what I hope is) further
clarification.

Despite any claims to the contrary, **all** statistical (i.e. empirical)
modeling procedures are just data interpolators: that is, all that they can
claim to do is produce reasonable predictions of what may be expected within
the extent of the data. The quality of the model is judged by the goodness
of fit/prediction over this extent. Ergo the standard textbook caveats about
the dangers of extrapolation when using fitted models for prediction. Note,
btw, the contrast to "mechanistic" models, which typically **are** assessed
by how well they **extrapolate** beyond current data. For example, Newton's
apple to the planets. They are often "validated" by their ability to "work"
in circumstances (or scales) much different than those from which they were
derived.

So statistical models are just fancy "prediction engines." In particular,
there is no guarantee that they provide any meaningful assessment of
variable importance: how predictors causally relate to the response.
Obviously, empirical modeling can often be useful for this purpose,
especially in well-designed studies and experiments, but there's no
guarantee: it's an "accidental" byproduct of effective prediction.

This is particularly true for happenstance (un-designed) data and
non-parametric models like regression/classification trees. Typically, there
are many alternative models (trees) that give essentially the same quality
of prediction. You can see this empirically by removing a modest random
subset of the data and re-fitting. You should not be surprised to see the
fitted model -- the tree topology -- change quite radically. HOWEVER, the
predictions of the models within the extent of the data will be quite
similar to the original results. Frank's point is that unless the data set
is quite large and the predictive relationships quite strong -- which
usually implies parsimony -- this is exactly what one should expect. Thus it
is critical not to over-interpret the particular model one get, i.e. to
infer causality from the model (tree)structure.

Incidentally, there is nothing new or radical in this; indeed, John Tukey,
Leo Breiman, George Box, and others wrote eloquently about this decades ago.
And Breiman's random forest modeling procedure explicitly abandoned efforts
to build simply interpretable models (from which one might infer causality)
in favor of building better interpolators, although assessment of "variable
importance" does try to recover some of that interpretability (however, no
guarantees are given).

HTH. And contrary views welcome, as always.

Cheers to all,

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
-Original Message-

From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Frank E Harrell Jr
Sent: Thursday, April 01, 2010 5:02 AM
To: vibha patel
Cc: r-help@r-project.org
Subject: Re: [R] fitness of regression tree: how to measure???

vibha patel wrote:

Hello,

I'm using rpart function for creating regression trees.
now how to measure the fitness of regression tree???

thanks n Regards,
Vibha


If the sample size is less than 20,000, assume that the tree is a 
somewhat arbitrary representation of the relationships in the data and 
that the form of the tree will not replicate in future datasets.


Frank




--
Frank E Harrell Jr   Professor and Chairman   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pvals.fnc() with language R does not work with R 2.10.1

2010-04-01 Thread ozge gurcanli


Hi Everyone,

I am using R 2.10.1.  lmer function works properly, however pvals.fnc 
()  does not despite the fact that I uploaded:


- library(lme4)
- library(coda)
- library(languageR)

This is the error message I get

 pvals.fnc(lexdec3.lmerE2, nsim=1)$fixed

Error in pvals.fnc(lexdec3.lmerE2, nsim = 1) :
  MCMC sampling is not yet implemented in lme4_0.999375
  for models with random correlation parameters

How can I resolve the problem?

Thanks,
Özge
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Line Graph Labels

2010-04-01 Thread zachcp

Thanks Walmes,

That does indeed work well for labeling all points greater than some
specified value. The problem is that while my peaks are very sharp, there
is more than one point along the line as it slopes up, so this method will
label those points as well, making the data look cluttered. Ideally there
would be a command with an understanding of local maxima so that only the
points at the top of each peak are labeled.

thank you for suggestion, though; i definitely learned something new.

best
zach cp
-- 
View this message in context: 
http://n4.nabble.com/Line-Graph-Labels-tp1748218p1748497.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression

2010-04-01 Thread Erik Iverson

Bruce,

You don't tell us what class of data your y is.  Assuming y is defined 
in your environment, what does


class(y)

and

str(y)


tell you?  You'll most likely have to fine-tune your data reading 
process, or do some post-processing to make sure the y object (and x for 
that matter) are the classes you want.


And as always,
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

You can give us samples of your data with the dput function, see ?dput.

Bruce Kindseth wrote:

I am trying to learn R, and am having problems with doing a simple linear
regression.  I loaded data from a fixed width file, using wd=c(...), and
read.fwf(...) and I can read in the file ok and it comes in as vectors in
columns, which is what I expected.  The problem is when I try to do a linear
regression, lm=(y~x), I get the following error message, "Error in
model.frame.default(formula = y ~ x, drop.unused.levels = TRUE) : 


  invalid type (list) for variable 'y'

 


I tried various things, such as wd=numeric(c(..)), unlist(y) and putting x
and y in a data frame and attaching it, but nothing helps.  I have searched
through 3 online manuals, but can't seem to find an answer.  Maybe this is
so simple that nobody felt the need to address it.  

 


Thanks for your help.

 


B.Kindseth




Re: [R] Regression

2010-04-01 Thread Duncan Murdoch

On 01/04/2010 2:59 PM, Bruce Kindseth wrote:

I am trying to learn R, and am having problems with doing a simple linear
regression.  I loaded data from a fixed width file, using wd=c(...), and
read.fwf(...) and I can read in the file ok and it comes in as vectors in
columns, which is what I expected.  The problem is when I try to do a linear
regression, lm=(y~x), I get the following error message, "Error in
model.frame.default(formula = y ~ x, drop.unused.levels = TRUE) : 


  invalid type (list) for variable 'y'

  


You don't give sample code, so let's assume that you read both x and y as

mydata <- read.fwf( ... )

Then the regression call would be

lm(y ~ x, data=mydata)

If you don't specify the "data=" argument, it will look for *vectors* x 
and y in your workspace to use in the formula.  (It would also accept 
matrices, but not dataframes,
and it sounds as though that's what you gave it.  But you almost 
certainly don't want it to do what it would do with matrices.)
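A self-contained sketch of that workflow (the file layout and column widths below are made up; adjust them to the real data):

```r
## Write a tiny fixed-width file, read it back, then fit the model.
tmp <- tempfile()
writeLines(c("  54031   1",
             "  55286   2",
             "  54396   3"), tmp)

mydata <- read.fwf(tmp, widths = c(7, 4), col.names = c("y", "x"))
str(mydata)     # both columns should come back as numeric vectors, not lists

fit <- lm(y ~ x, data = mydata)   # note lm(y ~ x, ...), not lm = (y ~ x)
coef(fit)
```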


Duncan Murdoch
 


I tried various things, such as wd=numeric(c(..)), unlist(y) and putting x
and y in a data frame and attaching it, but nothing helps.  I have searched
through 3 online manuals, but can't seem to find an answer.  Maybe this is
so simple that nobody felt the need to address it.  

 


Thanks for your help.

 


B.Kindseth




Re: [R] t.test data in one column

2010-04-01 Thread Phil Spector

Marlin -
   Consider the following:


df = data.frame(x=rep(1:4,2),y=rep(c('M','F'),c(2,2)))
t.test(x~y,data=df)


Welch Two Sample t-test

data:  x by y 
t = 4.899, df = 6, p-value = 0.002714
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 1.001052 2.998948 
sample estimates:

mean in group F mean in group M
            3.5             1.5

If you're uncomfortable with the formula notation, try

with(df,t.test(x[y=='M'],x[y=='F']))

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



On Thu, 1 Apr 2010, Marlin Keith Cox wrote:


I need a two-sample t.test between M and F.  The data are arranged in one
column, x.  Can't seem to figure out how to run a two-sample t.test.  Not really
sure what this output is giving me; there should be no difference
between M and F in the example, but the summary p-value indicates one.

How can I run a two-sample t.test with data in one column?

x=rep(c(1,2,3,4),2)
y=rep(c("M","M","M","M","F","F","F","F"))
data<-cbind(x,y)
t.test(x,by=list(y))
Thank you ahead of time.
keith



--
M. Keith Cox, Ph.D.
Alaska NOAA Fisheries, National Marine Fisheries Service
Auke Bay Laboratories
17109 Pt. Lena Loop Rd.
Juneau, AK 99801
keith@noaa.gov
marlink...@gmail.com
U.S. (907) 789-6603



Re: [R] t.test data in one column

2010-04-01 Thread Erik Iverson

Hello,

Marlin Keith Cox wrote:

I need a two-sample t.test between M and F.  The data are arranged in one
column, x.  Can't seem to figure out how to run a two-sample t.test.  Not really
sure what this output is giving me; there should be no difference
between M and F in the example, but the summary p-value indicates one.

How can I run a two-sample t.test with data in one column?

x=rep(c(1,2,3,4),2)
y=rep(c("M","M","M","M","F","F","F","F"))
data<-cbind(x,y)
t.test(x,by=list(y))


Several issues:

First, your usage of cbind makes 'data' a matrix of type character, so R no 
longer sees your numeric x.  You most likely want a data.frame (which 
can contain multiple types) instead of a matrix (which has one type of 
data).  Also, "data" is a function and argument name, so let's call the 
object something else.  Replace line 3 with


df <- data.frame(x, y)

I don't see the "by" argument documented anywhere in ?t.test.

I do see the "formula" argument, documented as:

 formula: a formula of the form ‘lhs ~ rhs’ where ‘lhs’ is a numeric
  variable giving the data values and ‘rhs’ a factor with two
  levels giving the corresponding groups.


So try,

t.test(x ~ y, data = df)
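Putting those pieces together as one runnable script (the same toy data as in the post; since each group contains the same four values, the p-value should come out as 1):

```r
x <- rep(c(1, 2, 3, 4), 2)
y <- rep(c("M", "F"), each = 4)   # first four M, last four F, as in the post
df <- data.frame(x, y)            # data.frame, not cbind(): x stays numeric

t.test(x ~ y, data = df)          # two-sample Welch t-test, F vs M
```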



[R] update.packages() and install.packages() does not work more because of "Error in read.dcf"

2010-04-01 Thread R P Herrold

On Thu, 1 Apr 2010, Uwe Ligges wrote:


herrold, before:
The true fault causing the noted message seems to be that there is a
faulty compression/decompression occurring in a carried library --
wouldn't it be better not to bundle a frozen library, and rather
simply 'flag' packages needing a rebuild so that fault is cured?

I am curious here


Ah, 'flag' those packages. I am curious to learn how to do that and see the 
patches / code that enable us to tell R, packages, repositories, users and 
developers that linking against some new library is required (including 
seldom but possible API changes).


An API change [which is what we are discussing here as to 
zlib] should also be occasioned by a SOname bump.  I use the 
term ** should ** as there is not formal enforcement, other 
than discipline on the part of the maintainer, which is not 
always followed (I am remembering fighting tcl/tk in this 
regard, which was unwilling or unable to get this into their 
release process)


Alternatively, in the instant case, testing for the observed error at 
module retrieval, build, and install time, in the 'CHECK' process, and 
refusing to proceed (or at least issuing a warning) when it is seen, with 
text to the effect that an 'unknown' API has been encountered, is a way 
to do this on the 'client of the library' side


-- Russ herrold



[R] Regression

2010-04-01 Thread Bruce Kindseth
I am trying to learn R, and am having problems with doing a simple linear
regression.  I loaded data from a fixed width file, using wd=c(...), and
read.fwf(...) and I can read in the file ok and it comes in as vectors in
columns, which is what I expected.  The problem is when I try to do a linear
regression, lm=(y~x), I get the following error message, "Error in
model.frame.default(formula = y ~ x, drop.unused.levels = TRUE) : 

  invalid type (list) for variable 'y'

 

I tried various things, such as wd=numeric(c(..)), unlist(y) and putting x
and y in a data frame and attaching it, but nothing helps.  I have searched
through 3 online manuals, but can't seem to find an answer.  Maybe this is
so simple that nobody felt the need to address it.  

 

Thanks for your help.

 

B.Kindseth




Re: [R] Using GIS data in R

2010-04-01 Thread Don MacQueen
I'm currently doing a lot of simple GIS work in R, including point-in-polygon 
queries. My .Rprofile file has


   require(maptools)
   require(rgdal)

With that as a starting point, I find that the data structures play 
well together.


Define a coordinate reference system object with
 crs.ll <-  CRS('+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs')

Load a shapefile with

  my.shp <- readOGR('directoryname','filename',  p4s=CRSargs(crs.ll) )

This will give you an object of class SpatialPolygonsDataFrame. 
readOGR() is in the rgdal package.


readShapeSpatial or readShapePoly from the maptools package should 
work as well, and I used to use them, but lately I've been using 
readOGR().


Then the overlay() function in the sp package will do your #2. But I 
do think you'll need your points to be one of the SpatialPoints 
classes.
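A minimal sketch of that point-in-polygon step, assuming my.shp was loaded with readOGR() as above and that lng and lat are numeric vectors (hypothetical names; untested here -- overlay() was the sp function of the day, later superseded by over()):

```r
library(sp)

## Build a SpatialPoints object in the same coordinate reference system:
pts <- SpatialPoints(cbind(lng, lat), proj4string = crs.ll)

idx <- overlay(pts, my.shp)   # index of the polygon containing each point
inside <- !is.na(idx)         # NA means the point fell outside every polygon
```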



Omitting the p4s argument from readOGR() might work, I'm not sure.
Or, I think you might be able to just supply the text string, i.e.,
p4s='+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs'
but I have found it handy to have several projections predefined, as in
 crs.ll <-  CRS('+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs')
 crs.utm <- CRS('+init=epsg:32610')
for use in the spTransform() function.

Also, your question would go better on R-sig-geo mailing list.

A final note, some plotting functions need to have the sp package 
earlier in the search() path than maptools.


-Don

At 9:37 AM -0600 4/1/10, Scott Duke-Sylvester wrote:

I have a simple problem: I need to load a ERSI shapefile of US states
and check whether or not a set of points are within the boundary of
these states. I have the shapefile, I have the coordinates but I'm
having a great deal of difficulty bringing the two together. The
problem is the various GIS packages for R do not play well with each
other. sp, shapefiles, maptools, etc all use different data
structures. Can someone suggest a simple set of commands that will
work together that will:

1) load the shapefile data.
2) Allow me to test whether or not a (lng,lat) coordinate pair are
inside or outside the polygons defined in the shapefile.

Many thanks,
scott.

--
Scott M. Duke-Sylvester
Assistant Professor
Department of Biology

Office : 300 E. St. Mary Blvd
 Billeaud Hall, Room 141
 Lafayette, LA 70504

Mailing address : UL Lafayette
  Department of Biology
  P.O.Box 42451
  Lafayette, LA 70504-2451

Phone : 337 482 5304
Fax   : 337 482 5834
email : smd3...@louisiana.edu




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[R] t.test data in one column

2010-04-01 Thread Marlin Keith Cox
I need a two-sample t.test between M and F.  The data are arranged in one
column, x.  Can't seem to figure out how to run a two-sample t.test.  Not really
sure what this output is giving me; there should be no difference
between M and F in the example, but the summary p-value indicates one.

How can I run a two-sample t.test with data in one column?

x=rep(c(1,2,3,4),2)
y=rep(c("M","M","M","M","F","F","F","F"))
data<-cbind(x,y)
t.test(x,by=list(y))
Thank you ahead of time.
keith



-- 
M. Keith Cox, Ph.D.
Alaska NOAA Fisheries, National Marine Fisheries Service
Auke Bay Laboratories
17109 Pt. Lena Loop Rd.
Juneau, AK 99801
keith@noaa.gov
marlink...@gmail.com
U.S. (907) 789-6603



Re: [R] for loop; lm() regressions; list of vectors

2010-04-01 Thread Uwe Ligges



On 30.03.2010 20:57, David Winsemius wrote:


On Mar 30, 2010, at 12:42 PM, Driss Agramelal wrote:


## Hello everyone,
##
## I am trying to execute 150 times a lm regression using the 'for' loop,
with 150 vectors for y,
##
## and always the same vector for x.
##
## I have an object with 150 elements named "a",
##
## and a vector of 60 values named "b".
##
## Each element in "a" has 60 values plus a header.
##
## When I type:

r <- lm(i ~ b)

for(i in a) print(r)




Try instead something like this untested modification:

for(i in seq_along(a)) print(r <- lm(a[i] ~ b) )


Since a cannot be a vector, I guess you will need

for(i in seq_along(a)) print(r <- lm(a[[i]] ~ b) )

or to make the results reusable:


along <- seq_along(a)
r <- vector(mode="list", length=max(along))
for(i in along) print(r[[i]] <- lm(a[[i]] ~ b) )
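The same idea can also be written without an explicit loop; a small self-contained sketch (3 response vectors here instead of 150):

```r
b <- 1:60
a <- replicate(3, 2 * b + rnorm(60), simplify = FALSE)  # list of responses

r <- lapply(a, function(yi) lm(yi ~ b))  # one lm fit per element of 'a'
length(r)       # 3
coef(r[[1]])    # intercept and slope of the first fit
```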


Uwe Ligges







## I get 150 times the lm results of the first element of "a" regressed
with "b",
##
## whereas I would like to have 150 different regression results from
each
element in "a"...
##
## Can someone please help me with the syntax of my loop please?
##
## Many Thanks,
##
## Driss Agramelal
##
## Switzerland
##



David Winsemius, MD
West Hartford, CT



Re: [R] update.packages() and install.packages() does not work more because of "Error in read.dcf"

2010-04-01 Thread Uwe Ligges



On 30.03.2010 23:00, R P Herrold wrote:

On Tue, 30 Mar 2010, Prof Brian Ripley wrote:


A hint: I have seen this exact error message with a build configured
to use the system's zlib 1.2.4 (which has been out for about 2 weeks),
and this incompatibility is noted in the current R 2.11.0 alpha's
manuals.

We do recommend *not* using the system zlib, but if you do insist on
doing so do be aware that R may not work with future versions of
external software.


In the short term, and for a person willing to chase new versions, that
is fine -- but part of using system (shared) libraries is that unknowable
unknowns, such as latent security issues not yet known in the 'bundled
with R' library, are patched away when (for the sake of example) a new
replacement zlib with a fix is slotted in. Additionally, the benefit of a
smaller memory footprint when shared libraries are used is lost.

The true fault causing the noted message seems to be that there is a
faulty compression/decompression occurring in a carried library --
wouldn't it be better not to bundle a frozen library, and rather
simply 'flag' packages needing a rebuild so that fault is cured?

I am curious here



Ah, 'flag' those packages. I am curious to learn how to do that and see 
the patches / code that enable us to tell R, packages, repositories, 
users and developers that linking against some new library is required 
(including seldom but possible API changes).


Uwe Ligges





-- Russ herrold



Re: [R] "is" and the story of a typo

2010-04-01 Thread Uwe Ligges



On 31.03.2010 10:21, (Ted Harding) wrote:

On 30-Mar-10 23:23:09, Jim Lemon wrote:

Hi all,
The gurus may pour scorn on me for not knowing this, but I happened
to  mistype "if" as "is" in the heat of debugging a function. As I
scanned the debugged function with some satisfaction, I noticed the
error. How could this have worked?


Besides Ted's comments: You could have mistyped it as a completely different 
but valid function name, and as long as the result is syntactically valid, how 
should R be able to guess your error?


Example:

x <- 1
# You want:
if(x < 5)
   cat("YES!\n")

# works:
sin(x < 5)
   cat("YES!\n")


# and also
sin(x < 5)
{
   cat("YES!\n")
}

# but not

sin(x < 5){
   cat("YES!\n")
}

#  nor

sin(x < 5)  cat("YES!\n")

# and that is also true for is:

is(x < 5){
   cat("YES!\n")
}

is(x < 5) cat("YES!\n")



I think now we all have again a better idea of the way the R parser 
does its job.


Uwe Ligges





 I assume that "is" is a generic

function that calls one of the is.* functions to evaluate whatever
is passed. It appears that this particular typo causes "is" to work
out and report the contents of its argument. Ho hum. As I did not
test the FALSE result, I never would have noticed that it was not
the conditional statement I expected until it evaluated something
that should have been FALSE.

Jim


Hi Jim,

   is: ?is

   is(is)

   is(is(is))

   is(is(is(is)))

   identical(is(is(is)), is(is(is(is))))

Ted.


E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 31-Mar-10   Time: 09:21:29
-- XFMail --



Re: [R] Best fitted curve using AIC

2010-04-01 Thread Graham Smith
Simon

> I need an R script that compares known curves (e.g. logistic, exponential)
> with my curve. That curve was generated by fitting data of forest cover
> (hectares) measured at 10 road distances (buffers).
>
> I'd like the comparison to be done using AICc to select the best model,
> that is, the curve that best fits my data.

You could have a look at the  fitdistrplus package

and have a browse at what is available in the Probability
Distributions  Task View on Cran.

Graham



Re: [R] Sharing levels across multiple factor vectors

2010-04-01 Thread Jeff Brown

Wow, those are much more elegant.  Thanks!

Peter suggests:

df[] <- lapply(df, factor, levels=allLevels, labels=seq_along(allLevels)) 

Henrique suggests:

df[] <- as.numeric(unlist(df))

In both of those cases, why is the []  needed?  When I evaluate df vs. df[],
they both look the same, but apparently they have different meanings.  Is a
data frame internally represented as a list, and does df[] let you assign to
the elements of the list while maintaining the object's nature as a data
frame?

Moreover, Henrique, why is it that what you suggested works if done in one
line, but not in two?  That is, 

df[] <- unlist(df)
df[] <- as.numeric(df[])

gives an error at the second line: "Error: (list) object cannot be coerced
to type 'double'."


[R] barplot with error bar in lattice

2010-04-01 Thread milton ruser
Dear all,

I have a data.frame like the one below, and I need
to make a horizontal plot with an error bar only for the upper limit.
With the code below I am able to plot the bars within
groups, but I need to (1) change from a vertical to a horizontal
plot and (2) add the error bars.

Any hints are welcome.

milton


mydf<-read.table(stdin(),head=T,sep=",")
mygroup,xlabel,yvalue,error
Gr1,Bar1,1,0.5
Gr1,Bar2,2,0.7
Gr1,Bar3,2,1.0
Gr1,Bar4,2.5,0.5
Gr2,Bar1,1,2
Gr2,Bar2,1,1
Gr2,Bar3,3,1
Gr2,Bar4,4,0.5
require(lattice)
barchart(yvalue~xlabel|mygroup, data=mydf)
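One possible direction (a hedged sketch not from the thread: lattice draws horizontal bars when the factor is on the left-hand side of the formula, and the upper error bars can be drawn with a custom panel, which may need tweaking for real data):

```r
library(lattice)

## Same values as the post's data:
mydf <- data.frame(
  mygroup = rep(c("Gr1", "Gr2"), each = 4),
  xlabel  = rep(paste0("Bar", 1:4), 2),
  yvalue  = c(1, 2, 2, 2.5, 1, 1, 3, 4),
  error   = c(0.5, 0.7, 1, 0.5, 2, 1, 1, 0.5))

p <- barchart(xlabel ~ yvalue | mygroup, data = mydf,
              panel = function(x, y, ..., subscripts) {
                panel.barchart(x, y, ...)
                err <- mydf$error[subscripts]  # upper limits only
                panel.segments(x, as.numeric(y), x + err, as.numeric(y))
              })
```

Printing the stored object, e.g. `print(p)`, draws the panels.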



Re: [R] Line Graph Labels

2010-04-01 Thread Walmes Zeviani

You can adapt the following example

da <- data.frame(y=rnorm(50), x=1:50)
plot(y~x, data=da)
abline(h=c(-2,2), lty=3)
with(da, text(x[abs(y)>2], y[abs(y)>2], label=x[abs(y)>2], pos=2))

Walmes.

-
Walmes Zeviani
Master in Statistics and Agricultural Experimentation
walmeszevi...@hotmail.com, Lavras - MG, Brasil


[R] Best fitted curve using AIC

2010-04-01 Thread Simone Freitas
Dear fellows,

I need an R script that compares known curves (e.g. logistic, exponential)
with my curve. That curve was generated by fitting data of forest cover
(hectares) measured at 10 road distances (buffers).

I'd like the comparison to be done using AICc to select the best model,
that is, the curve that best fits my data.

Anyone could help me?

Best regards

Simone.
--
Simone R. Freitas
Universidade Federal do ABC (UFABC)
Centro de Ciências Naturais e Humanas (CCNH)
R. Catequese, 242
Bairro Jardim
09090-400 - Santo André - SP
Brasil
http://srfreitas.webs.com/



[R] time series problem: time points don't match

2010-04-01 Thread Brad Patrick Schneid

Hi, 
I have a time series problem that I would like some help with if you have
the time.  I have many data from many sites that look like this:  

Site.1
date       time          level   temp
2009/10/01 00:01:52.0  2.8797  18.401
2009/10/01 00:16:52.0  2.8769  18.382
2009/10/01 00:31:52.0  2.8708  18.309
2009/10/01 00:46:52.0  2.8728  18.285
2009/10/01 01:01:52.0  2.8716  18.245
2009/10/01 01:16:52.0  2.8710  18.190

Site.2
date       time          level   temp
2009/10/01 00:11:06.0  2.9507  18.673
2009/10/01 00:26:06.0  2.9473  18.630
2009/10/01 00:41:06.0  2.9470  18.593
2009/10/01 00:56:06.0  2.9471  18.562
2009/10/01 01:11:06.0  2.9451  18.518
2009/10/01 01:26:06.0  2.9471  18.480

As you can see, the times do not match up.  What I would like to do is be
able to merge these two data sets to the nearest time stamp by creating a
new time between the two; something like this:


date       new.time      level.1  temp.1  level.2  temp.2
2009/10/01 00:01:52.0  2.8797  18.401   NA NA
2009/10/01 00:13:59.0  2.8769  18.382  2.9507  18.673
2009/10/01 00:28:59.0  2.8708  18.309  2.9473  18.630
2009/10/01 00:43:59.0  2.8728  18.285  2.9470  18.593
2009/10/01 00:59:59.0  2.8716  18.245 2.9471  18.562
2009/10/01 01:13:59.0  2.8710  18.190 2.9451  18.518
2009/10/01 01:26:06.0   NA  NA  2.9471  18.480

Note that the sites may not match in the number of observations, so returning
NA would be necessary, though deleting that time point altogether for
both sites would be preferred.

A possibly easier alternative would be a way to assign generic times for
each observation according to the time interval, so that the 1st observation
for each day would have a time = 00:00:00 and each consecutive one would be
15 minutes later.   
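One way to sketch the nearest-timestamp idea: snap each time to the nearest 15-minute mark and then merge on the snapped value (toy data taken from the first rows above; a sketch under those assumptions, not a tested solution for the full files):

```r
## Round POSIXct times to the nearest 15 minutes (900 seconds):
snap15 <- function(t) as.POSIXct(round(as.numeric(t) / 900) * 900,
                                 origin = "1970-01-01", tz = "UTC")

t1 <- as.POSIXct(c("2009-10-01 00:01:52", "2009-10-01 00:16:52"), tz = "UTC")
t2 <- as.POSIXct(c("2009-10-01 00:11:06", "2009-10-01 00:26:06"), tz = "UTC")

site1 <- data.frame(time = snap15(t1), level.1 = c(2.8797, 2.8769))
site2 <- data.frame(time = snap15(t2), level.2 = c(2.9507, 2.9473))

## all = TRUE keeps unmatched rows, filled with NA, as in the desired output:
merge(site1, site2, by = "time", all = TRUE)
```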

Thanks for any suggestions.

Brad



Re: [R] trying to understand lme() results

2010-04-01 Thread array chip
Thanks Dennis for the thorough explanation and correction on the design.

John

--- On Thu, 4/1/10, Dennis Murphy  wrote:

From: Dennis Murphy 
Subject: Re: [R] trying to understand lme() results
To: "array chip" 
Cc: R-help@r-project.org
Date: Thursday, April 1, 2010, 12:33 AM

Hi:


On Wed, Mar 31, 2010 at 2:31 PM, array chip  wrote:

Hi, I have a very simple balanced randomized block design with 48 observations 
in total of a measure of the weight of a product. The product was manufactured 
at 4 sites, so each site has 12 observations. I want to use lme() from the nlme 
package to estimate the standard error of the product weight.


It's a balanced one-way design where site is assumed to be a random factor.
If you want to call it a block, fine, but if this were a genuine RCBD, there 
would be
treatments randomly assigned to 'units' within site, and that's not present 
here. 




So the data look like:



      MW site
1  54031    1
2  55286    1
3  54396    2
4  52327    2
5  55963    3
6  54893    3
7  57338    4
8  55597    4
:


The random effect model is: Y = mu + b + e where b is random block effect and e 
is model error.



so I fitted a lme model as:



obj<-lme(MW~1, random=~1|site, data=dat)



summary(obj)
Linear mixed-effects model fit by REML
Random effects:
 Formula: ~1 | site
        (Intercept) Residual
StdDev:    2064.006 1117.567

Fixed effects: MW ~ 1
               Value Std.Error DF  t-value p-value
(Intercept) 55901.31  1044.534 44 53.51796       0
:
Number of Observations: 48
Number of Groups: 4

I also did:

anova(obj)
            numDF denDF  F-value p-value
(Intercept)     1    44 2864.173  <.0001



I believe my standard error estimate is from "Residual" under "Random Effects" 
part of summary(), which is 1117.567.

Yes. 




Now my question is regarding the t test under "Fixed effects". I think it's testing 
whether the overall mean weight is 0 or not, which is not interesting anyway. But 
how is the standard error of 1044.534 calculated? I thought it should be 
sqrt(MSE)=1117.567 instead. Can anyone explain?


When the data are balanced, the population variance of \bar{y}.., the sample
grand mean, is E(MSA)/N, where MSA is the between-site mean square and N is the
total sample size (Searle, Casella and McCulloch, _Variance Components_, p. 54,
formula (37), derived for the balanced data case; the corresponding ANOVA
table, with expected mean squares, is on p. 60). The plug-in estimate of
E(MSA) is


MSA = n * s^2(Intercept) + s^2(error) = 12 * (2064.006)^2 + 1117.567^2,

where n = 12 = number of observations per site. The standard error for 
\bar{y}.. is then
sqrt(MSA/N). Doing these calculations in R,


xx <- 12 * (2064.006)^2 + (1117.567)^2
sqrt(xx/48)
[1] 1044.533

which, within rounding error, is what lme() gives you in the test for fixed 
effects.





The same goes for the F test using anova(obj). The F test statistic is equal to the 
square of the t test statistic because the numerator has 1 df. But what are the 
mean sums of squares of the numerator and denominator, and where can I find them? BTW, I 
think the denominator mean sum of squares (MSE) should be 1117.567^2, but this is 
not consistent with the standard error in the t test (1044.534).


lme() fits by ML or REML, so it doesn't output a conventional ANOVA table as
part of the output. If you want to see the sums of squares and mean squares,
use aov(). In the balanced one-way model, the observed df, SS and MS are the
same in both the fixed effects and random effects models, but the expected
mean square for treatments differs between the two models.

HTH,
Dennis




Thanks a lot for any help



John









  


Re: [R] Scope and assignment: baffling

2010-04-01 Thread Gabor Grothendieck
The code you presented is very close to object oriented in the style
of the proto package.  For example,

library(proto)

# generate a single object p
p <- proto(a = 0,
geta = function(.) .$a,
incra = function(.) .$a <- .$a + 5)

p$geta()
p$a # same
p$incra()
p$geta()


# Or if you want to be able to generate objects like that:

Account <- function() proto(a = 0,
geta = function(.) .$a,
incra = function(.) .$a <- .$a + 5)

# pp is an Account object
pp <- Account()
pp$geta()
pp$a # same
pp$incra()
pp$geta()

See http://r-proto.googlecode.com for more info.


On Wed, Mar 31, 2010 at 9:44 PM, Jeff Brown  wrote:
>
> Hi,
>
> The code below creates a value, x$a, which depending on how you access it
> evaluates to its initial value, or to what it's been changed to.  The last
> two lines should, I would have thought, evaluate to the same value, but they
> don't.
>
> f <- function () {
>        x <- NULL;
>        x$a <- 0;
>        x$get.a <- function () {
>                x$a;
>        };
>        x$increment.a <- function () {
>                x$a <<- x$a + 5;
>        };
>        x
> };
> x <- f();
> x$increment.a();
> x$get.a();
> x$a;
>
> This can be repeated; each time you call x$increment.a(), the value
> displayed by x$get.a() changes, but x$a continues to be zero.
>
> Is that totally weird, or what?
>
> Thanks,
> Jeff



Re: [R] sample size > 20K? Was: fitness of regression tree: how to measure???

2010-04-01 Thread hadley wickham
> Incidentally, there is nothing new or radical in this; indeed, John Tukey,
> Leo Breiman, George Box, and others wrote eloquently about this decades ago.
> And Breiman's random forest modeling procedure explicitly abandoned efforts
> to build simply interpretable models (from which one might infer causality)
> in favor of building better interpolators, although assessment of "variable
> importance" does try to recover some of that interpretability (however, no
> guarantees are given).

I've found the distinction between models for explanation and
models for prediction to be particularly helpful. I was first made
aware of this split by Brian Ripley's talk "Selecting amongst large
classes of models", presented at a symposium in honour of John
Nelder's 80th birthday -
http://www.stats.ox.ac.uk/~ripley/Nelder80.pdf

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scope and assignment: baffling

2010-04-01 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Jeff Brown
> Sent: Wednesday, March 31, 2010 6:45 PM
> To: r-help@r-project.org
> Subject: [R] Scope and assignment: baffling
> 
> 
> Hi,
> 
> The code below creates a value, x$a, which depending on how 
> you access it
> evaluates to its initial value, or to what it's been changed 
> to.  The last
> two lines should, I would have thought, evaluate to the same 
> value, but they
> don't.
> 
> f <- function () {
>   x <- NULL;

It would be better to say
  x <- list()
instead of x<-NULL, since the following x$a<-
coerces it to a list.

>   x$a <- 0;
>   x$get.a <- function () {
>   x$a;
>   };
>   x$increment.a <- function () {
>   x$a <<- x$a + 5;
>   };
>   x
> };
> x <- f();

Let's make things a tad clearer by naming the output
of f() 'globalX':
   globalX <- f()
globalX is a list containing a [virtual] copy of f's x, as it
stood when f() returned it.  You cannot modify globalX
without doing something like globalX$something<-xxx.
f's x remains in the environment created when you ran
f().  Each time you run f() you create a new environment
(with a new version of f's x).  A function's evaluation
environment generally goes away when the function is done,
but it remains if there are any references to the environment
from f()'s return value.  If the return value contains
functions defined in f() then they refer to f()'s evaluation
environment, so it hangs around.  When you run
globalX$increment.a()
you are modifying f()'s environment's x$a, not globalX$a.

Usually such functions are written so the state variables
are not in the output structure, as in:
f <- function() {
fA <- 0
retval <- list()
retval$get.a <- function() { fA }
retval$increment.a <- function() { fA <<- fA + 5 }
retval
}
Then you can use it as you did but it is clear that
retval itself doesn't contain any state.  You can find
the possible state variables with
> globalX <- f()
> globalX$increment.a()
> objects(environment(globalX$get.a))
[1] "fA" "retval"
> eval(quote(fA), environment(globalX$get.a))
[1] 5
> globalX$increment.a()
> eval(quote(fA), environment(globalX$get.a))
[1] 10
> globalX$get.a()
[1] 10
The output of objects() suggests you might want
to clean up the code by not naming 'retval', just
returning list(get.a=function()fA, ...).  Then
your environment won't carry around an unused
copy of retval.
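The shared-environment behaviour described above can be checked directly. A
minimal base-R sketch (standalone, not Jeff's original code):

```r
# Both closures returned by f() are created in f()'s evaluation
# environment and therefore share it; <<- in one is visible to the other.
f <- function() {
  fA <- 0
  list(get.a       = function() fA,
       increment.a = function() fA <<- fA + 5)
}
obj <- f()
identical(environment(obj$get.a), environment(obj$increment.a))  # TRUE
obj$increment.a()
obj$get.a()  # 5
```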

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> x$increment.a();
> x$get.a();
> x$a;
> 
> This can be repeated; each time you call x$increment.a(), the value
> displayed by x$get.a() changes, but x$a continues to be zero.
> 
> Is that totally weird, or what?
> 
> Thanks,
> Jeff
> -- 
> View this message in context: 
> http://n4.nabble.com/Scope-and-assignment-baffling-tp1747582p1
747582.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sample size > 20K? Was: fitness of regression tree: how to measure???

2010-04-01 Thread Bert Gunter
Since Frank has made this somewhat cryptic remark (sample size > 20K)
several times now, perhaps I can add a few words of (what I hope is) further
clarification.

Despite any claims to the contrary, **all** statistical (i.e. empirical)
modeling procedures are just data interpolators: that is, all that they can
claim to do is produce reasonable predictions of what may be expected within
the extent of the data. The quality of the model is judged by the goodness
of fit/prediction over this extent. Ergo the standard textbook caveats about
the dangers of extrapolation when using fitted models for prediction. Note,
btw, the contrast to "mechanistic" models, which typically **are** assessed
by how well they **extrapolate** beyond current data. For example, Newton's
apple to the planets. They are often "validated" by their ability to "work"
in circumstances (or scales) much different than those from which they were
derived.

So statistical models are just fancy "prediction engines." In particular,
there is no guarantee that they provide any meaningful assessment of
variable importance: how predictors causally relate to the response.
Obviously, empirical modeling can often be useful for this purpose,
especially in well-designed studies and experiments, but there's no
guarantee: it's an "accidental" byproduct of effective prediction.

This is particularly true for happenstance (un-designed) data and
non-parametric models like regression/classification trees. Typically, there
are many alternative models (trees) that give essentially the same quality
of prediction. You can see this empirically by removing a modest random
subset of the data and re-fitting. You should not be surprised to see the
fitted model -- the tree topology -- change quite radically. HOWEVER, the
predictions of the models within the extent of the data will be quite
similar to the original results. Frank's point is that unless the data set
is quite large and the predictive relationships quite strong -- which
usually implies parsimony -- this is exactly what one should expect. Thus it
is critical not to over-interpret the particular model one gets, i.e. to
infer causality from the model (tree) structure.
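The subset-and-refit check described above can be sketched as follows; the
data are simulated here, so the particular splits are illustrative
assumptions only:

```r
library(rpart)

set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- d$x1 + 0.5 * d$x2 + rnorm(n)

full <- rpart(y ~ ., data = d)                    # fit on all rows
sub  <- rpart(y ~ ., data = d[sample(n, 160), ])  # drop a random 20%, refit

# The split variables (tree topology) may change between fits ...
unique(as.character(full$frame$var))
unique(as.character(sub$frame$var))
# ... while predictions within the extent of the data stay highly correlated:
cor(predict(full, newdata = d), predict(sub, newdata = d))
```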

Incidentally, there is nothing new or radical in this; indeed, John Tukey,
Leo Breiman, George Box, and others wrote eloquently about this decades ago.
And Breiman's random forest modeling procedure explicitly abandoned efforts
to build simply interpretable models (from which one might infer causality)
in favor of building better interpolators, although assessment of "variable
importance" does try to recover some of that interpretability (however, no
guarantees are given).

HTH. And contrary views welcome, as always.

Cheers to all,

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Frank E Harrell Jr
Sent: Thursday, April 01, 2010 5:02 AM
To: vibha patel
Cc: r-help@r-project.org
Subject: Re: [R] fitness of regression tree: how to measure???

vibha patel wrote:
> Hello,
> 
> I'm using rpart function for creating regression trees.
> now how to measure the fitness of regression tree???
> 
> thanks n Regards,
> Vibha

If the sample size is less than 20,000, assume that the tree is a 
somewhat arbitrary representation of the relationships in the data and 
that the form of the tree will not replicate in future datasets.

Frank

-- 
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nlrq parameter bounds

2010-04-01 Thread Matthew Dowle
Ashley,

This appears to be your first post to this list. Welcome to R. Over 2 days
is quite a long time to wait though, so you are unlikely to get a reply now.

Feedback: since nlrq is in package quantreg, it's a question about a package
and should be sent to the package maintainer. Some packages, though (over 40
of the 664 on r-forge), have dedicated help/devel/forum lists hosted on
r-forge.

No reply from r-help often, but not always, means you haven't followed some
detail of the posting guide or haven't followed this :
http://www.catb.org/~esr/faqs/smart-questions.html.

HTH
Matthew
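For the record, when a nonlinear fitter exposes no upper/lower arguments, a
common workaround is to reparameterize so the constraint holds by
construction. A sketch using base nls (whether current nlrq accepts bounds
is not verified here; the transformation idea carries over):

```r
# Keep the rate b in y = a * exp(-b * x) positive by estimating
# lb = log(b) instead of b itself.
set.seed(42)
x <- seq(0, 5, length.out = 50)
y <- 2 * exp(-0.7 * x) + rnorm(50, sd = 0.05)

fit   <- nls(y ~ a * exp(-exp(lb) * x), start = list(a = 1, lb = 0))
b.hat <- exp(coef(fit)[["lb"]])  # back-transformed rate, positive by design
b.hat
```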


"Ashley Greenwood"  wrote in message 
news:45708.131.217.6.9.1269916052.squir...@webmail.student.unimelb.edu.au...
> Hi there,
> Can anyone please tell me if it is possible to limit parameters in nlrq()
> to 'upper' and 'lower' bounds as per nls()?  If so how??
>
> Many thanks in advance
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predicted time length differs from survfit.coxph:

2010-04-01 Thread Thomas Lumley

On Wed, 31 Mar 2010, Parminder Mankoo wrote:


Hello All,

Does anyone know why length(fit1$time) < length(fit2$n) in survfit.coxph
output? Why is the predicted time length is not the same as the number of
samples (n)?



In fact it is not true that length(fit1$time) < length(fit2$n), since length(fit2$n) 
is 1. Presumably you are asking why length(fit1$time) < fit2$n, or perhaps why 
length(fit2$time) < fit2$n.

The reason is that fit2$time has entries for the unique times in the data set.  
There are 241 records, but only 230 unique times. 10 times have two failures 
and 1 time has a failure and a censoring.

As the documentation (?survfit.object) says
   n     total number of subjects in each curve.
   time  the time points at which the curve has a step.
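The effect is easy to reproduce with made-up data containing tied times (a
sketch assuming the survival package, not the poster's data):

```r
library(survival)

time   <- c(1, 2, 2, 3, 3, 3, 4)  # 7 records, only 4 unique times
status <- c(1, 1, 0, 1, 1, 1, 1)  # 1 = event, 0 = censored

fit <- survfit(Surv(time, status) ~ 1)
fit$n             # 7  (number of subjects)
length(fit$time)  # 4  (one entry per unique time)
```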


-thomas


I tried: example(survfit.coxph).

Thanks,
parmee


fit2$n

[1] 241


fit2$time

  [1]     0    31    32    60    61   152   153   174   273   277   362   365   499   517   518   547
 [17]   566   638   700   760   791   792   809   822   845   944  1005  1077  1116  1125  1218  1369
 [33]  1392  1400  1431  1492  1625  1642  1673  1674  1706  1766  1767  1795  1815  1826  1851  1857
 [49]  2006  2010  2070  2084  2099  2121  2160  2191  2223  2230  2236  2314  2345  2400  2422  2434
 [65]  2435  2495  2556  2557  2587  2588  2678  2802  2815  2833  2844  2860  2861  2910  2922  2953
 [81]  2989  3010  3012  3014  3091  3167  3186  3226  3227  3242  3318  3346  3380  3448  3560  3561
 [97]  3590  3773  3775  3805  3837  3895  3932  3943  3962  3987  4119  4139  4201  4206  4224  4232
[113]  4249  4321  4370  4453  4536  4539  4627  4656  4758  4763  4810  4939  4959  4962  5024  5047
[129]  5068  5088  5181  5216  5236  5308  5354  5384  5550  5757  5789  5796  5824  5917  5930  5934
[145]  6008  6025  6089  6117  6126  6143  6155  6209  6256  6349  6479  6607  6626  6642  6723  6760
[161]  6763  6789  6800  6878  6931  6970  7003  7065  7085  7093  7160  7184  7198  7247  7280  7288
[177]  7301  7364  7370  7381  7397  7410  7417  7470  7479  7519  7533  7545  7555  7668  7732  7736
[193]  7758  7807  7862  7867  7875  7884  7899  7911  7954  7958  7965  8006  8009  8023  8030  8080
[209]  8100  8133  8165  8308  8327  8381  8389  8569  8600  8697  8761  8806  8887  8961  9257  9510
[225]  9560  9598  9993 10122 10359 12457

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using a Data Frame/Matrix outside a function

2010-04-01 Thread David Winsemius


On Apr 1, 2010, at 12:16 PM, Greg Gilbert wrote:



I have code that creates the same matrices, "a" and "b". The code creating
"a" is not in a function and the code creating "b" is in a function. I would
like to do operations on "b" like I can on "a", however I can't figure out
how I can return a matrix (or data frame) from a function. Thanks for your
help!


r <- 5; c <- 5
a <- matrix(NA,nrow=5, ncol=5)
  for(i in 1:r) {

+   for (j in 1:c) {
+  if(i==j) a[i, j] <- i
+   }
+}


diag(a)

[1] 1 2 3 4 5


data.out <- function(r, c) {

+b <- matrix(NA,nrow=r, ncol=c)
+for(i in 1:r) {
+   for (j in 1:c) {
+  if(i==j) b[i, j] <- i
+   }
+}
+ }


data.out(5, 5)
diag(b)

Error in diag(b) : object 'b' not found


The "b" object only existed while the function was being processed. At  
the conclusion of the function's activities the b object did not get  
returned as the result . Had you returned "b" or made it the last  
object evaluated (inside the function), the results of data.out(5,5)  
it would have still been accessible. Try this instead


data.out <- function(r, c) {
b <- matrix(NA,nrow=r, ncol=c)
for(i in 1:r) {
   for (j in 1:c) {
  if(i==j) b[i, j] <- i
   }
}
 b}  # or equivalently return(b)

> diag(data.out(5,5))
[1] 1 2 3 4 5


--
David

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding RcppFrame to RcppResultSet causes segmentation fault

2010-04-01 Thread Matthew Dowle
He could have posted in this thread at the time to say that.
Otherwise it appears as if the question is still open.

"Romain Francois"  wrote in message 
news:4bb4c4b8.2030...@dbmail.com...
The thread has been handled in Rcpp-devel. Rob posted there 7 minutes
after posting on r-help.

FWIW, I think the problem is fixed on the Rcpp 0.7.11 version (on cran
incoming)

Romain

Le 01/04/10 17:47, Matthew Dowle a écrit :
>
> Rob,
> Please look again at Romain's reply to you on 19th March.  He informed you
> then that Rcpp has its own dedicated mailing list and he gave you the 
> link.
> Matthew
>
> "R_help Help"  wrote in message
> news:ad1ead5f1003291753p68d6ed52q572940f13e1c0...@mail.gmail.com...
>> Hi,
>>
>> I'm a bit puzzled. I used exactly the same code as in the RcppExamples
>> package to try adding an RcppFrame object to an RcppResultSet. When run,
>> it gives me a segmentation fault. I'm using gcc 4.1.2 on Red Hat
>> 64-bit. I'm not sure if this is the cause of the problem. Any advice
>> would be greatly appreciated. Thank you.
>>
>> Rob.
>>
>>
>> int numCol=4;
>> std::vector  colNames(numCol);
>> colNames[0] = "alpha"; // column of strings
>> colNames[1] = "beta";  // column of reals
>> colNames[2] = "gamma"; // factor column
>> colNames[3] = "delta"; // column of Dates
>> RcppFrame frame(colNames);
>>
>> // Third column will be a factor. In the current implementation the
>> // level names are copied to every factor value (and factors
>> // in the same column must have the same level names). The level names
>> // for a particular column will be factored out (pardon the pun) in
>> // a future release.
>> int numLevels = 2;
>> std::string *levelNames = new std::string[2];
>> levelNames[0] = std::string("pass"); // level 1
>> levelNames[1] = std::string("fail"); // level 2
>>
>> // First row (this one determines column types).
>> std::vector  row1(numCol);
>> row1[0].setStringValue("a");
>> row1[1].setDoubleValue(3.14);
>> row1[2].setFactorValue(levelNames, numLevels, 1);
>> row1[3].setDateValue(RcppDate(7,4,2006));
>> frame.addRow(row1);
>>
>> // Second row.
>> std::vector  row2(numCol);
>> row2[0].setStringValue("b");
>> row2[1].setDoubleValue(6.28);
>> row2[2].setFactorValue(levelNames, numLevels, 1);
>> row2[3].setDateValue(RcppDate(12,25,2006));
>> frame.addRow(row2);
>>
>> RcppResultSet rs;
>> rs.add("PreDF", frame);


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get the scale limits in lattice plot

2010-04-01 Thread David Winsemius


On Apr 1, 2010, at 11:53 AM, James Rome wrote:


I am drawing a density histogram, and want to label the plots with the
mean using ltext(). But I need the x,y coordinates to feed into ltext,
and I can't calculate them easily from my data. Is there a way to get
the x and y ranges being used for the plot, so I can put the text at the
correct position in the panel.function?


No code was provided, so what a "density histogram" might be is still
vague, but perhaps you are using density(); if so, have you looked at the
"Value" section of that function's help page? (The same advice would
apply were you using one of the (several) histogram functions.)


--
David.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get the scale limits in lattice plot

2010-04-01 Thread Peter Ehlers

On 2010-04-01 9:53, James Rome wrote:

I am drawing a density histogram, and want to label the plots with the
mean using ltext(). But I need the x,y coordinates to feed into ltext,
and I can't calculate them easily from my data. Is there a way to get
the x and y ranges being used for the plot, so I can put the text at the
correct position in the panel.function?



You probably want current.panel.limits().
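A minimal lattice sketch of that suggestion (the placement fractions are
arbitrary choices):

```r
library(lattice)

set.seed(1)
x <- rnorm(100)

p <- histogram(~ x, type = "density",
               panel = function(...) {
                 panel.histogram(...)
                 lim <- current.panel.limits()  # this panel's x/y ranges
                 ltext(x = mean(lim$xlim),
                       y = 0.95 * lim$ylim[2],
                       labels = sprintf("mean = %.2f", mean(x)))
               })
print(p)  # the panel function runs (and the label is placed) at draw time
```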

 -Peter Ehlers


Thanks,
Jim Rome

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to construct a time series

2010-04-01 Thread David Winsemius


On Apr 1, 2010, at 9:56 AM, n.via...@libero.it wrote:


Hi,
I need to generate the time series of the production, but as I'm new to this
topic I am not able to do that. This is what the time series should be:


PROD(t)=PROD(t,T)
PROD(t-1)=PROD(t-1,T)
PROD(t-2)=PROD(t-1)*PROD(t-2,T-1)/PROD(t-1,T-1)
PROD(t-3)=PROD(t-2)*PROD(t-3,T-2)/PROD(t-2,T-2)


It would make more sense to give the LHS series a name that is
different from the (higher-dimensional) data input.



...
...
...
from PROD(t-2) onward it will get the same expression, where PROD(t,T) is
the value of the production at t for the sample of firms presented at T and
T-1. Does someone know how to get it?


You have not provided a reproducible example from which to proceed
(and have not even used correct R syntax in what you request), but it
appears you will probably have success with matrix indexing that
generates diagonal or sub-diagonal series processed with the cumprod
function.


> PROD <- matrix(1:25, nrow=5)
> PROD
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
> PROD[row(PROD)==col(PROD)+1]
[1]  2  8 14 20

> cumprod(PROD[row(PROD)==col(PROD)+1])
[1]2   16  224 4480

# cumulative product of the first sub-diagonal.

--
David Winsemius.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using a Data Frame/Matrix outside a function

2010-04-01 Thread Greg Gilbert

I have code that creates the same matrices, "a" and "b". The code creating
"a" is not in a function and the code creating "b" is in a function. I would
like to do operations on "b" like I can on "a", however I can't figure out
how I can return a matrix (or data frame) from a function. Thanks for your
help!

> r <- 5; c <- 5
> a <- matrix(NA,nrow=5, ncol=5)
>for(i in 1:r) {
+   for (j in 1:c) {
+  if(i==j) a[i, j] <- i 
+   }
+}
> 
> diag(a)
[1] 1 2 3 4 5
> 
> data.out <- function(r, c) {
+b <- matrix(NA,nrow=r, ncol=c)
+for(i in 1:r) {
+   for (j in 1:c) {
+  if(i==j) b[i, j] <- i 
+   }
+}
+ }
> 
> data.out(5, 5)
> diag(b)
Error in diag(b) : object 'b' not found
> 

-- 
View this message in context: 
http://n4.nabble.com/Using-a-Data-Frame-Matrix-outside-a-function-tp1748285p1748285.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regarding the De-bugger in R

2010-04-01 Thread Tal Galili
One way I can think of is putting your loop inside a function (let's say
"func3") and then use:

library(debug)
mtrace(func3)
func3()

# and this at the end:
mtrace.off()

And see the steps...
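A base-R alternative for the original request (clear everything in the
loop's frame except the iterator) is rm() with setdiff(); a sketch with
stand-in data:

```r
run <- function(files) {
  for (j in files) {
    table1 <- data.frame(v = 1:3)  # stand-in for read.table(j)
    result <- sum(table1$v)
    message("Result for ", j, ": ", result)
    # drop everything in this frame except the iterator and the input:
    rm(list = setdiff(ls(), c("j", "files")))
  }
  ls()  # what survived the final iteration
}
leftovers <- run(c("a", "b"))
leftovers  # "files" "j"
```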


Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Apr 1, 2010 at 7:02 PM, Ayush Raman  wrote:

> Snippet of my code:
>
> library(foreign)
> function1 <- function(y,t){
>
> ###do some matrix operations ##
> }
>
> function2 <- function(y){
>
> y1 = permute(y)
> F1 = function1(y1)
> }
>
> setwd("C:\\Results\\") ## Read Multiple Files
> files.total = list.files()
>
> for (j in files.total){
>table1 = read.table(j)
>
>### do some operations
>### Call functions1 and function2
>### get the result stored in object result
>
>  message("Result for\t\",j,"\t",result)
>   rm(table1,result) ### in short I am removing all the objects except j and
> function calls -- a crude way of getting the independent results and there
> is no dependency on ###previous results
> }
>
> Now, I would like to verify that it is calculating everything from scratch
> and it is not taking any results from the previous iteration. I am doing
> this because I am not getting the result that I want, also I have verified
> that my code works fine without any errors for the ones when I am reading it
> only once and not multiple times.
>
> Thanks.
>
> Regards,
> Ayush
>
>
>
> On Thu, Apr 1, 2010 at 11:44 AM, Tal Galili  wrote:
>
>> Hi Ayush,
>> Could you supply with a simple code to try to give an answer on ?
>>
>> Thanks,
>> Tal
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> --
>>
>>
>>
>>
>> On Thu, Apr 1, 2010 at 6:24 PM, Ayush Raman  wrote:
>>
>>> Hi,
>>>
>>> I would like to know if there is some debugger in R where I can check
>>> that I
>>> am not using or not doing calculation on my previously stored objects. I
>>> can't use rm (list = ls()) to remove all the objects since I am using a
>>> for
>>> loop for reading 500 files and making common calculations for each
>>> file, therefore I need to keep the track of my iterator. Is it possible
>>> to
>>> remove everything except the iterator and see that my answers for each
>>> iterations are not getting compiled on previous answers.
>>>
>>> Thanks.
>>>
>>> --
>>> Regards,
>>> Ayush Raman
>>>
>>>[[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
>
> --
> Regards,
> Ayush Raman
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding RcppFrame to RcppResultSet causes segmentation fault

2010-04-01 Thread Romain Francois
The thread has been handled in Rcpp-devel. Rob posted there 7 minutes 
after posting on r-help.


FWIW, I think the problem is fixed on the Rcpp 0.7.11 version (on cran 
incoming)


Romain

Le 01/04/10 17:47, Matthew Dowle a écrit :


Rob,
Please look again at Romain's reply to you on 19th March.  He informed you
then that Rcpp has its own dedicated mailing list and he gave you the link.
Matthew

"R_help Help"  wrote in message
news:ad1ead5f1003291753p68d6ed52q572940f13e1c0...@mail.gmail.com...

Hi,

I'm a bit puzzled. I used exactly the same code as in the RcppExamples
package to try adding an RcppFrame object to an RcppResultSet. When run,
it gives me a segmentation fault. I'm using gcc 4.1.2 on Red Hat
64-bit. I'm not sure if this is the cause of the problem. Any advice
would be greatly appreciated. Thank you.

Rob.


int numCol=4;
std::vector  colNames(numCol);
colNames[0] = "alpha"; // column of strings
colNames[1] = "beta";  // column of reals
colNames[2] = "gamma"; // factor column
colNames[3] = "delta"; // column of Dates
RcppFrame frame(colNames);

// Third column will be a factor. In the current implementation the
// level names are copied to every factor value (and factors
// in the same column must have the same level names). The level names
// for a particular column will be factored out (pardon the pun) in
// a future release.
int numLevels = 2;
std::string *levelNames = new std::string[2];
levelNames[0] = std::string("pass"); // level 1
levelNames[1] = std::string("fail"); // level 2

// First row (this one determines column types).
std::vector  row1(numCol);
row1[0].setStringValue("a");
row1[1].setDoubleValue(3.14);
row1[2].setFactorValue(levelNames, numLevels, 1);
row1[3].setDateValue(RcppDate(7,4,2006));
frame.addRow(row1);

// Second row.
std::vector  row2(numCol);
row2[0].setStringValue("b");
row2[1].setDoubleValue(6.28);
row2[2].setFactorValue(levelNames, numLevels, 1);
row2[3].setDateValue(RcppDate(12,25,2006));
frame.addRow(row2);

RcppResultSet rs;
rs.add("PreDF", frame);



--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GUI /IDE

2010-04-01 Thread ManInMoon

Thanks Philippe,

I will look at sciviews - I have never used it before.

I use Rcmdr, and I find I am often selecting and running the same lines of
code. I just thought that if we could name them and then easily "call" that
region, it would be useful.

Cheers
-- 
View this message in context: 
http://n4.nabble.com/GUI-IDE-tp1745858p1748178.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regarding the De-bugger in R

2010-04-01 Thread Ayush Raman
Snippet of my code:

library(foreign)
function1 <- function(y,t){

###do some matrix operations ##
}

function2 <- function(y){

y1 = permute(y)
F1 = function1(y1)
}

setwd("C:\\Results\\") ## Read Multiple Files
files.total = list.files()

for (j in files.total){
   table1 = read.table(j)

   ### do some operations
   ### Call functions1 and function2
   ### get the result stored in object result

 message("Result for\t\",j,"\t",result)
  rm(table1,result) ### in short I am removing all the objects except j and
function calls -- a crude way of getting the independent results and there
is no dependency on ###previous results
}

Now, I would like to verify that it is calculating everything from scratch
and it is not taking any results from the previous iteration. I am doing
this because I am not getting the result that I want, also I have verified
that my code works fine without any errors for the ones when I am reading it
only once and not multiple times.

Thanks.

Regards,
Ayush


On Thu, Apr 1, 2010 at 11:44 AM, Tal Galili  wrote:

> Hi Ayush,
> Could you supply with a simple code to try to give an answer on ?
>
> Thanks,
> Tal
>
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>
>
>
> On Thu, Apr 1, 2010 at 6:24 PM, Ayush Raman  wrote:
>
>> Hi,
>>
>> I would like to know if there is some debugger in R where I can check that
>> I
>> am not using or not doing calculation on my previously stored objects. I
>> can't use rm (list = ls()) to remove all the objects since I am using a
>> for
>> loop for reading 500 files and making common calculations for each
>> file, therefore I need to keep the track of my iterator. Is it possible to
>> remove everything except the iterator and see that my answers for each
>> iterations are not getting compiled on previous answers.
>>
>> Thanks.
>>
>> --
>> Regards,
>> Ayush Raman
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>


-- 
Regards,
Ayush Raman

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using GIS data in R

2010-04-01 Thread Scott Duke-Sylvester
I have a simple problem: I need to load a ERSI shapefile of US states
and check whether or not a set of points are within the boundary of
these states. I have the shapefile, I have the coordinates but I'm
having a great deal of difficulty bringing the two together. The
problem is the various GIS packages for R do not play well with each
other. sp, shapefiles, maptools, etc all use different data
structures. Can someone suggest a simple set of commands that will
work together that will:

1) load the shapefile data.
2) Allow me to test whether or not a (lng, lat) coordinate pair is
inside or outside the polygons defined in the shapefile.
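One possible workflow that stays within sp classes is sketched below. This is hedged, not definitive: the shapefile path, the lng/lat vectors, and the attribute column are assumptions, and in real use both layers must share the same coordinate reference system.

```r
## Hedged sketch: load polygons with maptools, test points with sp::over()
library(maptools)   # readShapePoly()
library(sp)         # SpatialPoints(), over()

states <- readShapePoly("states.shp")       # assumed path to the ESRI shapefile
pts    <- SpatialPoints(cbind(lng, lat))    # lng, lat: assumed numeric vectors

## over() returns, for each point, the attributes of the polygon containing
## it; an all-NA row means the point falls outside every polygon
hit    <- over(pts, states)
inside <- !is.na(hit[[1]])
```

The point of using sp classes throughout is that maptools readers already return sp objects, so no conversion between package-specific structures is needed.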

Many thanks,
scott.

-- 
Scott M. Duke-Sylvester
Assistant Professor
Department of Biology

Office : 300 E. St. Mary Blvd
 Billeaud Hall, Room 141
 Lafayette, LA 70504

Mailing address : UL Lafayette
  Department of Biology
  P.O.Box 42451
  Lafayette, LA 70504-2451

Phone : 337 482 5304
Fax   : 337 482 5834
email : smd3...@louisiana.edu



[R] bagging survival tree

2010-04-01 Thread paaventhan jeyaganth

Dear R users,

I have a problem with bagging a survival tree after finding the final tree.

f <- rpart(Surv(time, dead) ~ x1 + x2 + x3 + x4 + x5 + x6, data = crp)
f.prun <- prune(f, cp = 0.036701)
# the final pruned tree has 3 terminal nodes, involving variables x3 and x4

How can I use bagging here? I tried something like this, but it did not work:
bagging(f.prun, nbagg = 25, data = crp)
library(ipred)
sbundle <- list(list(model = coxph, predict = predict))
errorest(Surv(time, dead) ~ x1 + x2 + x3 + x4 + x5 + x6, data = crp,
         model = bagging, nbagg = 100, comb = sbundle,
         control = rpart.control(minsplit = 2, xval = 0, cp = 0.03670131))

 

Thanks very much

Paaveen
  




[R] How to get the scale limits in lattice plot

2010-04-01 Thread James Rome
I am drawing a density histogram, and want to label the plots with the
mean using ltext(). But I need the x,y coordinates to feed into ltext,
and I can't calculate them easily from my data. Is there a way to get
the x and y ranges being used for the plot, so I can put the text at the
correct position in the panel.function?
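One way is to ask lattice for the current panel's limits from inside the panel function. A hedged sketch with made-up data (the relative 5%/90% placement is just an illustration):

```r
library(lattice)

set.seed(1)
x <- rnorm(200)

densityplot(~ x, panel = function(x, ...) {
  panel.densityplot(x, ...)
  lims <- current.panel.limits()   # list with $xlim and $ylim of this panel
  # place the label 5% in from the left edge, 90% of the way up
  panel.text(lims$xlim[1] + 0.05 * diff(lims$xlim),
             lims$ylim[1] + 0.90 * diff(lims$ylim),
             labels = sprintf("mean = %.2f", mean(x)), adj = 0)
})
```

Because the position is computed from the panel's own limits, the label stays put even when the data (and hence the axis ranges) change.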

Thanks,
Jim Rome



Re: [R] Adding RcppFrame to RcppResultSet causes segmentation fault

2010-04-01 Thread Matthew Dowle
Rob,
Please look again at Romain's reply to you on 19th March.  He informed you 
then that Rcpp has its own dedicated mailing list and he gave you the link.
Matthew

"R_help Help"  wrote in message 
news:ad1ead5f1003291753p68d6ed52q572940f13e1c0...@mail.gmail.com...
> Hi,
>
> I'm a bit puzzled. I use exactly the same code as in the RcppExamples
> package to try adding an RcppFrame object to an RcppResultSet. When run,
> it gives me a segmentation fault. I'm using gcc 4.1.2 on Red Hat
> 64-bit. I'm not sure if this is the cause of the problem. Any advice
> would be greatly appreciated. Thank you.
>
> Rob.
>
>
> int numCol=4;
> std::vector<std::string> colNames(numCol);
> colNames[0] = "alpha"; // column of strings
> colNames[1] = "beta";  // column of reals
> colNames[2] = "gamma"; // factor column
> colNames[3] = "delta"; // column of Dates
> RcppFrame frame(colNames);
>
> // Third column will be a factor. In the current implementation the
> // level names are copied to every factor value (and factors
> // in the same column must have the same level names). The level names
> // for a particular column will be factored out (pardon the pun) in
> // a future release.
> int numLevels = 2;
> std::string *levelNames = new std::string[2];
> levelNames[0] = std::string("pass"); // level 1
> levelNames[1] = std::string("fail"); // level 2
>
> // First row (this one determines column types).
> std::vector<ColDatum> row1(numCol);
> row1[0].setStringValue("a");
> row1[1].setDoubleValue(3.14);
> row1[2].setFactorValue(levelNames, numLevels, 1);
> row1[3].setDateValue(RcppDate(7,4,2006));
> frame.addRow(row1);
>
> // Second row.
> std::vector<ColDatum> row2(numCol);
> row2[0].setStringValue("b");
> row2[1].setDoubleValue(6.28);
> row2[2].setFactorValue(levelNames, numLevels, 1);
> row2[3].setDateValue(RcppDate(12,25,2006));
> frame.addRow(row2);
>
> RcppResultSet rs;
> rs.add("PreDF", frame);
>



Re: [R] Regarding the De-bugger in R

2010-04-01 Thread Tal Galili
Hi Ayush,
Could you supply with a simple code to try to give an answer on ?

Thanks,
Tal



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Apr 1, 2010 at 6:24 PM, Ayush Raman  wrote:

> Hi,
>
> I would like to know if there is some debugger in R where I can check that
> I
> am not using or not doing calculation on my previously stored objects. I
> can't use rm (list = ls()) to remove all the objects since I am using a for
> loop for reading 500 files and making a common calculation for each
> file, therefore I need to keep the track of my iterator. Is it possible to
> remove everything except the iterator and see that my answers for each
> iterations are not getting compiled on previous answers.
>
> Thanks.
>
> --
> Regards,
> Ayush Raman
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




Re: [R] barplot and line

2010-04-01 Thread Tal Galili
Hi Roslina,

here is a simple example:


barplot.x.location <- barplot(c(12:17))
points(x = barplot.x.location, y = rep(10, length(barplot.x.location)), col = 2, pch = 7)




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Apr 1, 2010 at 2:51 PM, Roslina Zakaria  wrote:

> Hi Tal,
>
> I tried but didn't really understand what you mean.  Can you give me an
> example?
>
> Thank you.
>
>
>
>
> --- On *Thu, 4/1/10, Tal Galili * wrote:
>
>
> From: Tal Galili 
> Subject: Re: [R] barplot and line
> To: "Roslina Zakaria" 
> Cc: r-help@r-project.org
> Date: Thursday, April 1, 2010, 6:03 PM
>
>  Hi Roslina
>
> In order to get the X coordinates of your barplot, assign the barplot to a
> variable:
>
> x.coor <- barplot(something)
>
> Now use the x.coor for your line plot.
>
> Cheers,
> Tal
>
>
>
> Contact
> Details:---
> Contact me: 
> tal.gal...@gmail.com|
>   972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>
>
>
> On Thu, Apr 1, 2010 at 1:44 AM, Roslina Zakaria wrote:
>
>> Hi r-users,
>>
>> I have this data below and would like to plot a barplot overlap with a
>> line.
>> This is my data:
>> > hist_50
>> pdf_obs pdf_gen.50
>> 1  0.00   0.00
>> 2  0.083156   0.125366
>> 3  0.132196   0.158230
>> 4  0.126866   0.149432
>> 5  0.120469   0.127897
>> 6  0.121535   0.104096
>> 7  0.103412   0.082171
>> 8  0.082090   0.063539
>> 9  0.065032   0.048408
>> 10 0.050107   0.036470
>> 11 0.036247   0.027236
>> 12 0.031983   0.020198
>> 13 0.017058   0.014893
>> 14 0.009595   0.010928
>> 15 0.007463   0.007986
>> 16 0.006397   0.005816
>> 17 0.003198   0.004222
>> 18 0.003198   0.003057
>> 19 0.00   0.002208
>>
>> I tried
>> sq <- seq(0,900,by=50)
>> sq.50 <- as.character(sq)
>> barplot(t(hist_50[,1]), col= "blue", beside=TRUE, ylim=c(0,0.2),
>> main="Observed and generated gamma sum",
>> xlab="Rainfall totals(mm)",cex.axis=0.8,ylab="Probability")
>> legend("topright", c("observed","generated"), fill= c("blue","yellow"))
>> rownames(hist_50) <- sq.50
>> lines(spline(hist_50[,2]),lty=1)
>>
>> The problem is the x-axis label is invisible, I want 0, 50, 100,
>> Another thing is I want the line to plot on top of the barplot.  It seems
>> that it is shifted to the left.
>>
>> Thank you.
>>
>>
>>
>>
>>[[alternative HTML version deleted]]
>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>




[R] Line Graph Labels

2010-04-01 Thread zachcp

I have a line graph that is flat and has sharp peaks (mass spectrometry
data). I would like to plot my data and label the peaks with their x-axis
value.  I know it is possible to have a function to find local maxima and
then to use these values in a label command but I am hoping there is a
simpler way to do this.

An example of what i would like to be able to do is in the Cooks Distance
Plot on this webpage:
http://www.personalityresearch.net/r/r.guide.html#linear

Ideally, I could also specify a cutoff so that only values above it get labelled.
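For the simple case, local maxima can be found with the diff(sign(diff(y))) trick and only the tall ones labelled. A hedged sketch on made-up data (the toy spectrum and the cutoff of 0.2 are assumptions):

```r
# toy spectrum: flat baseline with two sharp peaks
x <- seq(0, 100, by = 0.1)
y <- dnorm(x, 30, 0.5) + 0.5 * dnorm(x, 70, 0.5)

peaks  <- which(diff(sign(diff(y))) == -2) + 1   # indices of local maxima
cutoff <- 0.2
peaks  <- peaks[y[peaks] > cutoff]               # keep only peaks above cutoff

plot(x, y, type = "l")
text(x[peaks], y[peaks], labels = round(x[peaks], 1), pos = 3)  # x-value above each peak
```

pos = 3 places the label just above the point, as in the Cook's distance example linked above.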

any suggestions are appreciated,
best,
zach cp
-- 
View this message in context: 
http://n4.nabble.com/Line-Graph-Labels-tp1748218p1748218.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] barplot and line

2010-04-01 Thread Greg Snow
Look at the 1st example on the help page for the updateusr function in the 
TeachingDemos package for a way to align a barplot and information added to it. 
 See the axis function for creating your own custom axis.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Roslina Zakaria
> Sent: Wednesday, March 31, 2010 4:44 PM
> To: r-help@r-project.org
> Subject: [R] barplot and line
> 
> Hi r-users,
> 
> I have this data below and would like to plot a barplot overlap with a
> line.
> This is my data:
> > hist_50
>     pdf_obs pdf_gen.50
> 1  0.00   0.00
> 2  0.083156   0.125366
> 3  0.132196   0.158230
> 4  0.126866   0.149432
> 5  0.120469   0.127897
> 6  0.121535   0.104096
> 7  0.103412   0.082171
> 8  0.082090   0.063539
> 9  0.065032   0.048408
> 10 0.050107   0.036470
> 11 0.036247   0.027236
> 12 0.031983   0.020198
> 13 0.017058   0.014893
> 14 0.009595   0.010928
> 15 0.007463   0.007986
> 16 0.006397   0.005816
> 17 0.003198   0.004222
> 18 0.003198   0.003057
> 19 0.00   0.002208
> 
> I tried
> sq <- seq(0,900,by=50)
> sq.50 <- as.character(sq)
> barplot(t(hist_50[,1]), col= "blue", beside=TRUE, ylim=c(0,0.2),
> main="Observed and generated gamma sum",
> xlab="Rainfall totals(mm)",cex.axis=0.8,ylab="Probability")
> legend("topright", c("observed","generated"), fill= c("blue","yellow"))
> rownames(hist_50) <- sq.50
> lines(spline(hist_50[,2]),lty=1)
> 
> The problem is the x-axis label is invisible, I want 0, 50, 100,
> Another thing is I want the line to plot on top of the barplot.  It
> seems that it is shifted to the left.
> 
> Thank you.
> 
> 
> 
> 
>   [[alternative HTML version deleted]]



Re: [R] Factorial regression with multiple features: how to remove non-significant features?

2010-04-01 Thread Ista Zahn
Hi Parthiban,
I urge you to rethink your approach, or at least proceed with extreme
caution. Lower-order terms involved in higher-order interactions may not be
what you think they are. And there are serious problems with stepwise model
selection.

I encourage you to read  a good regression modeling text or consult your
local statistician before proceeding.

-Ista

PS. Yes, the step function has options, which you would know if you had
taken the trouble to read the help file... Try ?step at the R prompt.

On Thu, Apr 1, 2010 at 4:11 PM, Vijaya Parthiban  wrote:

> Hello all,
>
> I am trying to do factorial regression using lm() like this (example):
>
> model<-lm(y ~ x1 + x2 + x3 + x4 + x1*x2*x3*x4)
>
> The final term 'x1*x2*x3*x4' adds all possible interactions between
> explanatory variables to the model. i.e. x1:x2, x1:x2:x3, etc, etc. Now,
> the
> issue is that some of the interactions are significant and some are not.
>
> I can manually remove features/interactions using 'update' like this:
>
> model1<-update(model,~. - x1:x2:x4)
>
>  one by one as long as all the explanatory variables/features or
> interactions are significant. But, this is so tedious. There must be a way
> to say to R automatically  'I want to retain only significant
> features/interactions' OR to do something to update(remove) all
> non-significant interactions.
>
> model2<-step(model)
>
> ..was not very helpful. Are there any options to it?
>
> Can someone shed light on how I can do that? Can glm() or gam() or anything
> else be more powerful to do this? Any help is greatly appreciated!
>
> Many thanks,
> Parthiban.
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org




[R] Regarding the De-bugger in R

2010-04-01 Thread Ayush Raman
Hi,

I would like to know if there is a debugger in R where I can check that I
am not doing calculations on my previously stored objects. I
can't use rm(list = ls()) to remove all the objects, since I am using a for
loop for reading 500 files and making a common calculation for each
file, so I need to keep track of my iterator. Is it possible to
remove everything except the iterator, and to ensure that my answers for each
iteration do not build on previous answers?
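One common idiom is to clear everything except a whitelist at the end of each iteration. A hedged sketch (the directory, file pattern, and per-file computation are assumptions):

```r
files   <- list.files("data", pattern = "\\.csv$", full.names = TRUE)
results <- vector("list", length(files))

for (i in seq_along(files)) {
  dat <- read.csv(files[i])
  results[[i]] <- colMeans(dat)     # stand-in for the real calculation
  # drop every object except the ones that must survive across iterations
  rm(list = setdiff(ls(), c("i", "files", "results")))
}
```

Accumulating into a pre-allocated list, rather than into ad hoc variables, also makes it much harder for one iteration's leftovers to contaminate the next.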

Thanks.

-- 
Regards,
Ayush Raman




[R] Basic Question (Real Basic)

2010-04-01 Thread Idgarad
I am having a total brain fart... complete and total. This is part R, part
Statistics, part "my brain is on vacation apparently."

Ok I have a time series I need to LOG and DIFF for ARIMA with Regressors.
Say 100 data points.
Obviously each diff() shortens the series by one point (100 becomes 99, then 98). So what is
the appropriate way to handle that?

Part B (This is where I am having a fundamental brain fart). Given that I
have Logged and Diff'ed the original Time Series and I want to get a
forecast, how do I apply that back to my original data? I am having a mental
implosion right now (Just got back from vacation.) I am just not wrapping my
head around forecasting in R against transformed data (For stationary
purposes).

Total brain meltdown ( grilled cheese sounds good...) Anyone have a
basic shell script I can look at for reference... not connecting dots
today
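A hedged sketch of the usual pattern: either let arima() difference internally via the order argument, so predict() already returns forecasts on the log scale, or difference by hand and invert with cumsum()/exp(). AirPassengers and the (0,1,1) order below are placeholders, not a recommendation:

```r
x <- log(AirPassengers)

# Option A: let arima() handle the differencing (the middle 1 in order)
fit <- arima(x, order = c(0, 1, 1))
fc  <- predict(fit, n.ahead = 12)
exp(fc$pred)                        # forecasts back on the original scale

# Option B: difference by hand (note: diff() of n points gives n - 1)
z <- diff(x)
# ... fit a model to z and produce forecasts z.hat ...
# then invert: exp(tail(x, 1) + cumsum(z.hat))
```

Option A is usually less error-prone, because the model object remembers the differencing and predict() undoes it for you; only the log has to be inverted manually.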



Idgarad
-- 
"Who is John Galt?"




[R] Factorial regression with multiple features: how to remove non-significant features?

2010-04-01 Thread Vijaya Parthiban
Hello all,

I am trying to do factorial regression using lm() like this (example):

model<-lm(y ~ x1 + x2 + x3 + x4 + x1*x2*x3*x4)

The final term 'x1*x2*x3*x4' adds all possible interactions between
explanatory variables to the model. i.e. x1:x2, x1:x2:x3, etc, etc. Now, the
issue is that some of the interactions are significant and some are not.

I can manually remove features/interactions using 'update' like this:

model1<-update(model,~. - x1:x2:x4)

 one by one as long as all the explanatory variables/features or
interactions are significant. But, this is so tedious. There must be a way
to say to R automatically  'I want to retain only significant
features/interactions' OR to do something to update(remove) all
non-significant interactions.

model2<-step(model)

..was not very helpful. Are there any options to it?

Can someone shed light on how I can do that? Can glm() or gam() or anything
else be more powerful to do this? Any help is greatly appreciated!
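With the caveats about stepwise selection raised elsewhere in this thread, step() does take options (direction, scope, trace, the penalty k), and drop1() tests each removable term against the full model while respecting marginality. A hedged sketch on simulated data:

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100),
                x3 = rnorm(100), x4 = rnorm(100))
d$y <- with(d, 1 + 2 * x1 + x2 + rnorm(100))

full <- lm(y ~ x1 * x2 * x3 * x4, data = d)    # all interactions
red  <- step(full, direction = "backward", trace = FALSE)
drop1(red, test = "F")                         # marginal tests on what remains
```

Note that drop1() will never offer to remove a main effect while one of its interactions is still in the model, which is exactly the marginality constraint the manual update() approach has to respect by hand.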

Many thanks,
Parthiban.




[R] export R data as web service

2010-04-01 Thread Johann Hibschman
I'd like to access data in my R session from elsewhere via HTTP.  My
particular use case would be R on Linux, accessing data from Windows,
all on the same intranet.

Naively, when I say this, I imagine something like:

> theData <- big.calculation.returning.data.frame()
> startHttpServer(port=8675)
(blocks until I C-c it)

Elsewhere...

  http://linuxmachine:8675/htmlDF?theData
-> HTML table version of data frame

or

  http://linuxmachine:8675/csvDF?theData
-> CSV export of data frame

Then I C-c my server and continue working in R.

Rserve has its own protocol, plus it spawns extra processes.  biocep
looks interesting, but it's far more ambitious than this, and I can't
tell if it would actually work in this case.  svSocket looks
interesting, but (if I understand it correctly) it would be a lot of
work to implement HTTP over its raw sockets.

Basically, I have a few tools (including Excel) that know how to get
data from web services, but no way to export data from R in a way that
they can see.  I'm spoiled a little by having used q (from kx.com),
which embeds a HTTP server in the interpreter and makes this kind of
access very easy; I'd like to have the same thing in R.

Thanks,

- Johann



[R] how to construct a time series

2010-04-01 Thread n.via...@libero.it
Hi,
I need to generate the time series of the production, but as I'm new to this 
topic I am not able to do that. This is what the time series should be:

PROD(t)=PROD(t,T)
PROD(t-1)=PROD(t-1,T)
PROD(t-2)=PROD(t-1)*PROD(t-2,T-1)/PROD(t-1,T-1)
PROD(t-3)=PROD(t-2)*PROD(t-3,T-2)/PROD(t-2,T-2)
...
...
...
from PROD(t-2) onward the expression keeps the same form, where PROD(t,T) is the value
of production at time t for the sample of firms present at both T and T-1.
Does anyone know how to compute this?
Thanks for your attention!!!
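If PROD(t, T) is stored as a matrix P with rows indexed by time t and columns by sample T, the recursion above can be sketched as follows (the row/column layout is an assumption):

```r
# chain the series backwards from the two most recent observations;
# assumes t >= 3 and that columns T, T-1, ... exist as needed
chain_prod <- function(P, t, T) {
  out <- numeric(t)
  out[t]     <- P[t, T]
  out[t - 1] <- P[t - 1, T]
  for (k in (t - 2):1) {
    s <- T - (t - 1 - k)      # sample column: T-1 for t-2, T-2 for t-3, ...
    out[k] <- out[k + 1] * P[k, s] / P[k + 1, s]
  }
  out
}
```

Each backward step rescales the older level by the ratio observed within a single sample, which is the usual chain-linking trick for splicing series measured on changing panels of firms.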



[R] R: Generative Topographic Map

2010-04-01 Thread mauede
Thank you. I figured that out myself last night. I always forget that 
read.table does not actually read data into a matrix.
GTM MatLab toolbox comes with  a nice guide to use the package which may as 
well become an R vignette.

Anyway, I got the singular matrix warnings myself and do not know whether I 
should be concerned about it or not.
Moreover, I do not know how to avoid that.
I will go through some other experiments keeping the data space samples and 
dimensionality fixed and changing some of the input parameters.
I stress our goal is NOT visualization. We do not know the intrinsic 
dimensionality of that data space samples. Therefore we can only proceed by 
trial-&-error. That is we vary the dimensionality of the embedding space. In 
this experiment the dimensionality of the data space is 7 so we start out 
projecting our original data to a 1D embedding space, then we try out a 2D 
embedding space, ..., all the way up to a 6D embedding space. Since we do not 
know the intrinsic dimensionality of the original data, we need a method to 
evaluate the reliability of the projection. To assess that we reconstruct the 
data back from the embedding to the data space and here we calculate the RMSD 
between the original data and the reconstructed ones. Basically, using RMSD, we 
need as many reconstructed points as the original number. Such a requirement is 
achieved by choosing as many points in the latent space as in the data space. 
Can such a choice be the cause of the matrix singularity? Furthermore, is the number of basis functions related to the number of latent-space points somehow?
Unluckily, even GTM MatLab documentation is not explicitly providing any clear 
criteria about the parameters choice and their dependence, if any.  

Thank you,
Maura


-----Original Message-----
From: Ondrej Such [mailto:ondrej.s...@gmail.com]
Sent: Thu 01/04/2010 11:16
To: mau...@alice.it
Subject: Re: Generative Topographic Map
 
Hello,

the problem that's tripping the package is that T is a data.frame and not a
matrix.

Simply replacing

 T <- read.table("DHA_TNH.txt")

with

T <- as.matrix(read.table("DHA_TNH.txt"))

makes the code run (though warnings about singular matrices remain, I'm not
sure to what degree that is worrisome). I'd be curious, as to how you'd
suggest improving the documentation.

Hope this helps,

--Ondrej

2010/3/31 

>  I tried to use the R version of the package.
> I noticed the original MatLab package is much better documented.
> I had a look at the R demo code "gtm_demo" and found that variable Y is
> used in advance of being created:
> I wrote my own few lines as follows:
>  inDir <-  "C:/Documents and Settings/Monville/Alanine Dipeptide/DBP1/DHA"
>
>  setwd(inDir)
>  T <- read.table("DHA_TNH.txt")
>  L <- 3
>  X <- matrix(nrow=nrow(T),ncol=3,byrow=TRUE)
>  MU <- matrix(nrow=round(nrow(T)/5), ncol=L)
>
>  for(i in 1:ncol(X)) {
>for(j in 1:nrow(X)) {
>   X[j,i] <- RANDU()
>}
>  }
>
>  for(i in 1:ncol(MU)) {
>   for(j in 1:nrow(MU)) {
>  MU[j,i] <- RANDU()
>   }
>  }
>  sigma <-1
>
>  FI <- gtm_gbf(MU,sigma,X)
>  W <- gtm_ri(T,FI)
>  Y= FI%*%W
>  b = gtm_bi(Y)
>  lambda <- 0.001
>  for (m in 1:15) {
> trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2,quiet = TRUE, minSing =
> 0.01)
> W = trnResult$W
> b = trnResult$beta
> Y = FI %*% W
>  }
>
> I ran the above script on my own data representing 1969 samples of 7
> dihedral angles of a folding molecule (attached.
> Upon running the 1st iteration of the training function "gtm_trn" I get the
> following error that I cannot interpret.
> Any help and/or suggestion is welcome:
>
> >  trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2,quiet = TRUE, minSing =
> 1.)
> Error in gtmGlobalR %*% T :
>   requires numeric/complex matrix/vector arguments
> In addition: Warning messages:
> 1: In chol.default(A, pivot = TRUE) : matrix not positive definite
> 2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = TRUE, minSing = 1) :
>   Using 7 out of 395 eigenvalues
>
> Thank you in advance,
> Maura
>
>
>
>





Re: [R] Sharing levels across multiple factor vectors

2010-04-01 Thread Henrique Dallazuanna
Try this also:

df[] <- as.numeric(unlist(df))
df
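A related hedged sketch that keeps the factor labels instead of converting to numeric codes: re-factor every column over the union of all levels.

```r
df <- data.frame(a = factor(c("bob", "alice", "bob")),
                 b = factor(c("kenny", "alice", "alice")))

allLevels <- sort(unique(unlist(lapply(df, as.character))))
df[] <- lapply(df, factor, levels = allLevels)

levels(df$a)   # "alice" "bob" "kenny", now shared by both columns
```

Passing an explicit levels argument to factor() is what guarantees both columns use the same integer coding, without the manual match() bookkeeping.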

On Thu, Apr 1, 2010 at 2:53 AM, Jeff Brown  wrote:
>
> Sorry for spamming.  I swear I had worked on that problem a long time before
> posting.
>
> But I just figured it out: I have to change the values, which are
> represented as integers, not strings.  So the following code will do it:
>
> df <- data.frame (
>        a = factor( c( "bob", "alice", "bob" ) ),
>        b = factor( c( "kenny", "alice", "alice" ) )
> );
> allLevels <- unique( c( levels( df$a ), levels( df$b ) ) )
> for (c in colnames(df)) {
>        df[,c] <- match( df[,c], allLevels);
>        levels( df[,c] ) <- 1:(length(allLevels))
> };
>
> --
> View this message in context: 
> http://n4.nabble.com/Sharing-levels-across-multiple-factor-vectors-tp1747714p1747722.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] RCOM Save

2010-04-01 Thread Erich Neuwirth
Please be more precise.
You seem not to use the rcom (note the lowercase name) package,
but the RDCOMClient package which is not available from CRAN
but as part of the Omegahat project.

rcom has a command comCreateObject and RDCOMClient has a command COMCreate.

The code you supplied will definitely not run in with rcom.
Furthermore, if your code had included the appropriate
library statement, things would have been much clearer.


On 4/1/2010 12:00 PM, koj wrote:
> 
> Thank you. Unfortunately this recommendation does not solve my problem and I
> don't know why. Here is my test-code:
> 
> 
> pfad<-paste("C:/...","xls",sep=".")
> xl <- COMCreate("Excel.Application")  
> xl[["Visible"]] <- FALSE 
> wkbk <- xl$Workbooks()$Open(pfad)   
> sh <- xl$ActiveSheet()
> A3R <- sh$Range("A28")
> A3R[["Formula"]] <- "=Summe(A1:A27)"
> A3RF <- A3R$Font()
> A3RF[["Bold"]] <- TRUE
> wkbk$Close(pfad)
> system("taskkill /f /im Excel.exe")
> 
> I tried a lot of possibilities (e. g. in the close statement) but without
> success. Excel always asks about the saving.
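If the goal is simply to suppress the save prompt, one approach (a hedged RDCOMClient sketch; DisplayAlerts and the SaveChanges argument of Workbook.Close are standard Excel COM members, and the path is a placeholder):

```r
library(RDCOMClient)

xl <- COMCreate("Excel.Application")
xl[["Visible"]]       <- FALSE
xl[["DisplayAlerts"]] <- FALSE              # no "save changes?" dialog boxes

wkbk <- xl$Workbooks()$Open("C:/test.xls")  # placeholder path
# ... modify the sheet ...
wkbk$Close(TRUE)                            # TRUE = save; FALSE = discard changes
xl$Quit()                                   # cleaner than taskkill on Excel.exe
```

Passing the SaveChanges flag to Close(), rather than the file path, is the key difference from the code quoted above.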

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



Re: [R] About logistic regression

2010-04-01 Thread David Winsemius


On Apr 1, 2010, at 8:19 AM, Silvano wrote:


Hi,

I have a dichotomous variable (Q1) whose answers are Yes or No.
Also I have 2 categorical explanatory variables (V1 and V2) with two  
levels each.


I used logistic regression to determine whether there is an effect  
of V1, V2 or an interaction between them.


I used the R and SAS, just for the conference. It happens that there  
is disagreement about the effect of the explanatory variables  
between the two softwares.


Not really. You are incorrectly interpreting what SAS is reporting to
you, although in your defense I think it is SAS's fault, and that what
SAS is reporting is nonsensical.




R:
q1 = glm(Q1~grau*genero, family=binomial, data=dados)
anova(q1, test="Chisq")

  Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL  202 277.82
grau 1   4.3537   201 273.46   0.03693 *
genero   1   1.4775   200 271.99   0.22417
grau:genero  1   0.0001   199 271.99   0.99031

SAS:
proc logistic data=psico;
class genero (param=ref ref='0') grau (param=ref ref='0');
model Q1 = grau genero grau*genero / expb;
run;
 Type 3 Analysis of Effects
Wald
   Effect   DFChi-Square Pr > ChiSq

   grau  11.6835 0.1945
   genero10.7789 0.3775
   genero*grau   10.0002 0.9902


I'm having difficulty figuring out how "type 3" analysis makes any
sense in this situation. Remember that "type 3" analysis supposedly
gives you an estimate for a covariate that is independent of its order
of entry. How could you sensibly add either of those "main
effects" terms to a model that already has the interaction and the
other covariate in it? The nested-model perspective offered by
R seems much more sensible.
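The contrast can be seen directly in R (a hedged sketch on simulated data): anova() tests terms sequentially as they enter the model, while drop1() respects marginality and so, with the interaction present, refuses to test the main effects at all:

```r
set.seed(42)
d <- data.frame(grau   = rbinom(200, 1, 0.5),
                genero = rbinom(200, 1, 0.5))
d$Q1 <- rbinom(200, 1, plogis(0.2 + 0.5 * d$grau - 0.3 * d$genero))

fit <- glm(Q1 ~ grau * genero, family = binomial, data = d)
anova(fit, test = "Chisq")   # sequential (Type I): each term added in order
drop1(fit, test = "Chisq")   # only grau:genero is a droppable term here
```

drop1()'s behavior is effectively the marginality argument made executable: a main effect cannot be removed, or meaningfully tested, while its interaction remains in the model.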


--
David




The parameter estimates are the same for both.
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.191055   0.310016   0.616    0.538
grau         0.562717   0.433615   1.298    0.194
genero      -0.355358   0.402650  -0.883    0.377
grau:genero  0.007052   0.580837   0.012    0.990

What am I doing wrong?

Thanks,

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346





Re: [R] About logistic regression

2010-04-01 Thread Eik Vettorazzi

Hi Silvano,
this is FAQ 7.17 
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f


hth.

Silvano schrieb:

Hi,

I have a dichotomous variable (Q1) whose answers are Yes or No.
Also I have 2 categorical explanatory variables (V1 and V2) with two 
levels each.


I used logistic regression to determine whether there is an effect of 
V1, V2 or an interaction between them.


I used the R and SAS, just for the conference. It happens that there 
is disagreement about the effect of the explanatory variables between 
the two softwares.


R:
q1 = glm(Q1~grau*genero, family=binomial, data=dados)
anova(q1, test="Chisq")

   Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL  202 277.82
grau 1   4.3537   201 273.46   0.03693 *
genero   1   1.4775   200 271.99   0.22417
grau:genero  1   0.0001   199 271.99   0.99031

SAS:
proc logistic data=psico;
class genero (param=ref ref='0') grau (param=ref ref='0');
model Q1 = grau genero grau*genero / expb;
run;
  Type 3 Analysis of Effects
 Wald
Effect   DFChi-Square Pr > ChiSq

grau  11.6835 0.1945
genero10.7789 0.3775
genero*grau   10.0002 0.9902

The parameter estimates are the same for both.
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.191055   0.310016   0.616    0.538
grau         0.562717   0.433615   1.298    0.194
genero      -0.355358   0.402650  -0.883    0.377
grau:genero  0.007052   0.580837   0.012    0.990

What am I doing wrong?

Thanks,

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346



--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790



[R] About logistic regression

2010-04-01 Thread Silvano

Hi,

I have a dichotomous variable (Q1) whose answers are Yes or No.
Also I have 2 categorical explanatory variables (V1 and V2) 
with two levels each.


I used logistic regression to determine whether there is an 
effect of V1, V2, or an interaction between them.


I used both R and SAS, just to cross-check the results. The 
two programs disagree about the effects of the 
explanatory variables.


R:
q1 = glm(Q1~grau*genero, family=binomial, data=dados)
anova(q1, test="Chisq")

            Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                          202     277.82
grau         1   4.3537       201     273.46   0.03693 *
genero       1   1.4775       200     271.99   0.22417
grau:genero  1   0.0001       199     271.99   0.99031

SAS:
proc logistic data=psico;
class genero (param=ref ref='0') grau (param=ref ref='0');
model Q1 = grau genero grau*genero / expb;
run;
          Type 3 Analysis of Effects

                             Wald
Effect        DF    Chi-Square    Pr > ChiSq

grau           1        1.6835        0.1945
genero         1        0.7789        0.3775
genero*grau    1        0.0002        0.9902


The parameter estimates are the same in both.
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.191055   0.310016   0.616    0.538
grau         0.562717   0.433615   1.298    0.194
genero      -0.355358   0.402650  -0.883    0.377
grau:genero  0.007052   0.580837   0.012    0.990

What am I doing wrong?

Thanks,

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346



Re: [R] RCOM Save

2010-04-01 Thread Henrique Dallazuanna
Try this:

On Thu, Apr 1, 2010 at 7:00 AM, koj  wrote:
>
> Thank you. Unfortunately this recommendation does not solve my problem and I
> don't know why. Here is my test-code:
>
>
> pfad<-paste("C:/...","xls",sep=".")
> xl <- COMCreate("Excel.Application")
> xl[["Visible"]] <- FALSE

xl[['DisplayAlerts']] <- FALSE

> wkbk <- xl$Workbooks()$Open(pfad)
> sh <- xl$ActiveSheet()
> A3R <- sh$Range("A28")
> A3R[["Formula"]] <- "=Summe(A1:A27)"
> A3RF <- A3R$Font()
> A3RF[["Bold"]] <- TRUE
wkbk[['Save']] <- TRUE
xl$Quit()
xl <- NULL
gc(reset = TRUE)
#system("taskkill /f /im Excel.exe")
>
> I tried a lot of possibilities (e. g. in the close statement) but without
> success. Excel always asks about the saving.
> --
> View this message in context: 
> http://n4.nabble.com/RCOM-Save-tp1746602p1747885.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
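For later readers: the usual COM-side way to avoid the save prompt is to silence alerts and save explicitly before quitting. A commented sketch only (Windows plus Excel required; assumes the same RDCOMClient-style objects `xl`, `wkbk`, and path `pfad` as in the code above):

```r
# xl[["DisplayAlerts"]] <- FALSE   # suppress the "save changes?" dialog
# wkbk$Save()                      # save in place, or wkbk$SaveAs(pfad)
# wkbk$Close(FALSE)                # FALSE = do not prompt again
# xl$Quit()
```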



Re: [R] fitness of regression tree: how to measure???

2010-04-01 Thread Frank E Harrell Jr

vibha patel wrote:

Hello,

I'm using the rpart function to create regression trees.
How can I measure the fit of a regression tree?

Thanks and regards,
Vibha


If the sample size is less than 20,000, assume that the tree is a 
somewhat arbitrary representation of the relationships in the data and 
that the form of the tree will not replicate in future datasets.
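That said, if one still wants a number, rpart's built-in cross-validation gives a concrete measure. A minimal sketch using the `car.test.frame` data that ships with rpart (not the poster's data):

```r
library(rpart)
# Fit a small regression tree on the package's bundled example data
fit <- rpart(Mileage ~ Weight, data = car.test.frame, method = "anova")
printcp(fit)   # "xerror" column: cross-validated relative error per subtree
plotcp(fit)    # visual aid for choosing the complexity parameter cp
```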


Frank

--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] regular expression help to extract specific strings from text

2010-04-01 Thread Tony B
Thank you guys, both solutions work great! Seems I have two new
packages to investigate :)

Regards,
Tony Breyal
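For the archive, a base-R route with gregexpr()/regmatches() needs no extra packages (x abbreviated from the original post):

```r
x <- c("Eve: Going to try something new today...",
       "Adam: Hey @Eve, how are you finding R? #rstats",
       "Eve: @Adam, It's awesome! @Cain & @Able disagree though :(")
mentions <- regmatches(x, gregexpr("@[A-Za-z0-9_]+", x))
hashtags <- regmatches(x, gregexpr("#[A-Za-z0-9_]+", x))
## collapse to one string per message, NA when nothing matched
collapse <- function(m) ifelse(lengths(m) == 0, NA,
                               sapply(m, paste, collapse = ", "))
data.frame(Msg = x, Mentions = collapse(mentions),
           HashTags = collapse(hashtags))
```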

On 31 Mar, 14:20, Tony B  wrote:
> Dear all,
>
> Lets say I have the following:
>
> > x <- c("Eve: Going to try something new today...", "Adam: Hey @Eve, how are 
> > you finding R? #rstats", "Eve: @Adam, It's awesome, so much better at 
> > statistics that #Excel ever was! @Cain & @Able disagree though :(", "Adam: 
> > @Eve I'm sure they'll sort it out :)", "blahblah")
> > x
>
> [1] "Eve: Going to try something new today..."
> [2] "Adam: Hey @Eve, how are you finding R? #rstats"
> [3] "Eve: @Adam, It's awesome, so much better at statistics that \n#Excel ever was! @Cain & @Able disagree though :("
> [4] "Adam: @Eve I'm sure they'll sort it out :)"
> [5] "blahblah"
>
> I would like to come up with a data frame which looks like this
> (pulling out the usernames and #tags):
>
> > data.frame(Msg = x, Source = c("Eve", "Adam", "Eve", "Adam", NA), Mentions 
> > = c(NA, "Eve", "Adam, Cain, Able", "Eve", NA), HashTags = c(NA, "rstats", 
> > "Excel", NA, NA))
>
> The best I can do so far is:
>
> source <- lapply(x, function (x) {
>    tmp <- strsplit(x, ":", fixed = TRUE)
>    if(length(tmp[[1]]) < 2) {
>      tmp <- c(NA, tmp)
>    }
>    return(tmp[[1]][1])
>  } )
> source <- unlist(source)
>
> [1] "Eve"  "Adam" "Eve"  "Adam" NA
>
> I can't work out how to extract the usernames starting with '@' or the
> #tags. I can identify them using gsub and replace them, but I don't
> know how to just extract those terms only, e.g. sort of the opposite
> of the following
>
> > gsub("@([A-Za-z0-9_]+)", "@[...]", x)
>
> [1] "Eve: Going to try something new today..."
> [2] "Adam: Hey @[...], how are you finding R? #rstats"
> [3] "Eve: @[...], It's awesome, so much better at statistics that #Excel ever was! @[...] & @[...] disagree though :("
> [4] "Adam: @[...] I'm sure they'll sort it out :)"
> [5] "blahblah"
>
> and
>
> > gsub("#([A-Za-z0-9_]+)", "#[...]", x)
>
> [1] "Eve: Going to try something new today..."
> [2] "Adam: Hey @Eve, how are you finding R? #[...]"
> [3] "Eve: @Adam, It's awesome, so much better at statistics that #[...] ever was! @Cain & @Able disagree though :("
> [4] "Adam: @Eve I'm sure they'll sort it out :)"
> [5] "blahblah"
>
> I hope that makes sense, and thank you kindly in advance for your
> time.
> Tony Breyal
>



Re: [R] RCOM Save

2010-04-01 Thread koj

Thank you. Unfortunately this recommendation does not solve my problem, and I
don't know why. Here is my test code:


pfad<-paste("C:/...","xls",sep=".")
xl <- COMCreate("Excel.Application")  
xl[["Visible"]] <- FALSE 
wkbk <- xl$Workbooks()$Open(pfad)   
sh <- xl$ActiveSheet()
A3R <- sh$Range("A28")
A3R[["Formula"]] <- "=Summe(A1:A27)"
A3RF <- A3R$Font()
A3RF[["Bold"]] <- TRUE
wkbk$Close(pfad)
system("taskkill /f /im Excel.exe")

I tried a lot of possibilities (e.g. in the Close statement), but without
success. Excel always asks whether to save.
-- 
View this message in context: 
http://n4.nabble.com/RCOM-Save-tp1746602p1747885.html
Sent from the R help mailing list archive at Nabble.com.



[R] fitness of regression tree: how to measure???

2010-04-01 Thread vibha patel
Hello,

I'm using the rpart function to create regression trees.
How can I measure the fit of a regression tree?

Thanks and regards,
Vibha



Re: [R] reading excel into R

2010-04-01 Thread Gabor Grothendieck
See this link for a number of alternatives:

http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows
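For the archive, the routes most often suggested there boil down to the following (sketch only; each needs an extra package, gdata's read.xls needs Perl, and the paths are placeholders, not the poster's actual file):

```r
# library(gdata)                                   # route 1: gdata + Perl
# dat <- read.xls("C:/path/to/Results2010.xls", sheet = 1)

# library(xlsReadWrite)                            # route 2: Windows-only package
# dat <- read.xls("C:/path/to/Results2010.xls")

# Or simply save the sheet as CSV from Excel and use:
# dat <- read.csv2("C:/path/to/Results2010.csv", header = TRUE)
```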

On Thu, Apr 1, 2010 at 6:19 AM, cheba meier  wrote:
> Dear all,
>
> I am a new R user, and I am sure this question has been asked quite often;
> I have also googled it and read about it. I understood that in order to
> read an Excel sheet into R you need to open it and save it as CSV or text.
> Is this true? Or can you use read.delim2 and read.csv2 to do this without
> the following error?
>
>> dat <- read.csv2(file="C:\\Dokumente und
> Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls",header = TRUE)
> Warnmeldung:
> In read.table(file = file, header = header, sep = sep, quote = quote,  :
>  unvollständige letzte Zeile von readTableHeader gefunden in 'C:\\Dokumente
> und Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls'
>> dat
> [1] ÐÏ.à..
> <0 Zeilen> (oder row.names mit Länge 0)
>
> I get the same error when I use read.delim and read.delim2.
>
> Is library(gdata) the solution?
>
> Sorry for any inconvenience caused.
>
> Regards,
> Cheba
>



Re: [R] Sharing levels across multiple factor vectors

2010-04-01 Thread hadley wickham
On Thu, Apr 1, 2010 at 3:05 AM, Peter Dalgaard  wrote:
> Jeff Brown wrote:
>> Sorry for spamming.  I swear I had worked on that problem a long time before
>> posting.
>>
>> But I just figured it out: I have to change the values, which are
>> represented as integers, not strings.  So the following code will do it:
>>
>> df <- data.frame (
>>       a = factor( c( "bob", "alice", "bob" ) ),
>>       b = factor( c( "kenny", "alice", "alice" ) )
>> );
>> allLevels <- unique( c( levels( df$a ), levels( df$b ) ) )
>> for (c in colnames(df)) {
>>       df[,c] <- match( df[,c], allLevels);
>>       levels( df[,c] ) <- 1:(length(allLevels))
>> };
>>
>
> Hmm, I think I'd go for something like
>
> allLevels <- unique(unlist(lapply(df,levels)))
> df[] <- lapply(df, factor,
> levels=allLevels, labels=seq_along(allLevels))

This behaviour always catches me out:

levels(f) <- l
is very different from
f <- factor(f, levels = l)
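A tiny illustration of that difference (toy f and l):

```r
f <- factor(c("a", "b"))
l <- c("b", "a")

f1 <- f
levels(f1) <- l              # relabels the existing levels: the data change meaning
f2 <- factor(f, levels = l)  # only reorders the level set: the data are unchanged

as.character(f1)             # "b" "a"
as.character(f2)             # "a" "b"
```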

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



[R] Palette color order in bwplot (lattice violin plot) vs. boxplot

2010-04-01 Thread Luigi Ponti

Hello,

I am trying to give different colors to the boxes in a violin plot obtained 
via bwplot from the lattice package, using a color palette from RColorBrewer:


> require(RColorBrewer)
> MyPalette <- brewer.pal(6, "Set3")

A call to:

> boxplot(count ~ spray, data = InsectSprays, col = MyPalette)

yields the example boxplot with each box given a different color from 
MyPalette, in the same order as the palette. See


> display.brewer.pal(6, "Set3")

However, when I do the same thing with a violin plot from the lattice 
package

> require(lattice)
> bwplot(count ~ spray, data = InsectSprays,
+        panel = function(..., box.ratio) {
+            panel.violin(..., col = "transparent",
+                         varwidth = FALSE, box.ratio = box.ratio)
+            panel.bwplot(..., fill = MyPalette, box.ratio = .1)
+        })

the boxes are colored with the right colors (each box has a different 
color) but in a different order -- a pity, because I would like to 
color-code the plot with certain pre-defined colors. The same thing 
(wrong color order) happens with a simple bwplot:


> bwplot(count ~ spray, data = InsectSprays, fill = MyPalette)

Is there a way to get the right color (i.e. same order as in MyPalette) 
in bwplot/panel.violin?


Kind regards,

Luigi



Re: [R] reading excel into R

2010-04-01 Thread milton ruser
Dear Cheba

Please install the package "xlsReadWrite".
read.csv reads CSV files, not XLS ones.

cheers

milton

On Thu, Apr 1, 2010 at 6:19 AM, cheba meier wrote:

> Dear all,
>
> I am a new R user, and I am sure this question has been asked quite often;
> I have also googled it and read about it. I understood that in order to
> read an Excel sheet into R you need to open it and save it as CSV or text.
> Is this true? Or can you use read.delim2 and read.csv2 to do this without
> the following error?
>
> > dat <- read.csv2(file="C:\\Dokumente und
> Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls",header = TRUE)
> Warnmeldung:
> In read.table(file = file, header = header, sep = sep, quote = quote,  :
>  unvollständige letzte Zeile von readTableHeader gefunden in 'C:\\Dokumente
> und Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls'
> > dat
> [1] ÐÏ.à..
> <0 Zeilen> (oder row.names mit Länge 0)
>
> I get the same error when I use read.delim and read.delim2.
>
> Is library(gdata) the solution?
>
> Sorry for any inconvenience caused.
>
> Regards,
> Cheba
>




Re: [R] Stack with factors

2010-04-01 Thread Peter Ehlers

On 2010-04-01 3:53, Ken Knoblauch wrote:

Kenneth Roy Cabrera Torres  une.net.co>  writes:

Hi R users:
I found that I cannot stack() a data.frame with factors.
db1 <- data.frame(replicate(6, factor(sample(c("A","B"), 6, replace=TRUE))))
str(db1)
db2<-stack(db1)
db2
"db2" has no rows.
How can I stack them by the variables X1, X2, ..., X6?


you can see what is happening in stack.data.frame
you have a line

x<- x[, unlist(lapply(x, is.vector)), drop = FALSE]

and

lapply(x, is.vector)

is applied to each column of the data frame but
you can verify for yourself that a factor yields FALSE here

x<- db1[[1]]
is.vector(x)
[1] FALSE

so I think that this at least explains why it doesn't work as
you expected.


db2 <- stack(lapply(db1, as.character))

will do it.
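A self-contained version of that workaround, with the factor columns built explicitly (newer R versions may also handle factor columns in stack() directly, but the as.character() route works everywhere):

```r
set.seed(1)
db1 <- data.frame(X1 = factor(sample(c("A", "B"), 6, replace = TRUE)),
                  X2 = factor(sample(c("A", "B"), 6, replace = TRUE)))
db2 <- stack(lapply(db1, as.character))  # coerce each factor column first
str(db2)  # 12 obs. of 2 variables: 'values' and 'ind'
```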

 -Peter Ehlers




Thank you for your help.

Kenneth





--
Peter Ehlers
University of Calgary



[R] reading excel into R

2010-04-01 Thread cheba meier
Dear all,

I am a new R user, and I am sure this question has been asked quite often;
I have also googled it and read about it. I understood that in order to
read an Excel sheet into R you need to open it and save it as CSV or text.
Is this true? Or can you use read.delim2 and read.csv2 to do this without
the following error?

> dat <- read.csv2(file="C:\\Dokumente und
Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls",header = TRUE)
Warnmeldung:
In read.table(file = file, header = header, sep = sep, quote = quote,  :
  unvollständige letzte Zeile von readTableHeader gefunden in 'C:\\Dokumente
und Einstellungen\\Cheba\\Desktop\\Rtemp\\ Results2010.xls'
> dat
[1] ÐÏ.à..
<0 Zeilen> (oder row.names mit Länge 0)

I get the same error when I use read.delim and read.delim2.

Is library(gdata) the solution?

Sorry for any inconvenience caused.

Regards,
Cheba

